#failsave

california@diaspora.permutationsofchaos.com

Why #Twitter Didn’t Go Down

The first big reason the caches keep running is that they are run as #Aurora jobs on #Mesos. Mesos aggregates all the servers into one pool so Aurora knows about them, and Aurora finds servers in that pool for applications to run on. Aurora also keeps applications running after they are started. If we say a #cache #cluster needs 100 servers, it will do its best to keep 100 running. If a server breaks completely for some reason, Mesos detects this and removes the #server from its aggregated pool. Aurora is then informed that only 99 caches are running and knows it needs to find a new server from Mesos to run on. It automatically finds one and brings the total back to 100. No person needs to get involved.
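To make the self-healing idea concrete, here is a minimal sketch of that "keep 100 running" loop in Python. It is not Aurora's or Mesos's real API: `Pool`, `Scheduler`, `reconcile` and the server names are hypothetical stand-ins, and the real systems are far more involved.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Pool:
    """Hypothetical stand-in for the aggregated server pool (the Mesos role)."""
    healthy: set[str] = field(default_factory=set)

    def remove(self, server: str) -> None:
        # A broken server simply disappears from the pool.
        self.healthy.discard(server)


@dataclass
class Scheduler:
    """Hypothetical stand-in for the scheduler (the Aurora role)."""
    desired: int
    running: set[str] = field(default_factory=set)

    def reconcile(self, pool: Pool) -> None:
        # Forget cache instances whose server has left the pool.
        self.running &= pool.healthy
        # Place replacements on free servers until the target is met.
        missing = max(0, self.desired - len(self.running))
        free = sorted(pool.healthy - self.running)
        self.running.update(free[:missing])


# Assumed numbers: 105 servers in the pool, 100 caches wanted.
pool = Pool({f"server-{i}" for i in range(105)})
job = Scheduler(desired=100)
job.reconcile(pool)

failed = next(iter(job.running))  # pretend this server dies
pool.remove(failed)
job.reconcile(pool)               # back to 100, no human involved
```

The point of the sketch is only that the scheduler stores the desired count and keeps comparing it with what is actually running, so a failure becomes just another difference to correct.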

Read more about it here: https://matthewtejo.substack.com/p/why-twitter-didnt-go-down-from-a

If you want to learn more about Mesos and Aurora, go to #Apache:


#software #failsave #opensource #technology #internet #framework #automation #administration