This is a collection of notes on the [talks] I have listened to. I think [talks]
are a great way to communicate concisely what someone has learned from
experience. We need to listen carefully and understand them so that we do not
repeat the same mistakes, or so that we can change our workflow and get better at what we do.
I will categorize these notes by topic and talk. Each topic will list the talk,
a link to the talk, and my notes on it.
Mastering Chaos - A Netflix Guide to Microservices
- A microservice is an abstraction. It talks to the database and does whatever computation it has
to do. If needed, it maintains a cache so that latencies don't shoot up. This entire package is
then exposed to the application.
- There can be multiple reasons why a service fails (timeouts, logical errors, etc.).
- Errors should not cascade: if one microservice fails, it should not take anything
else down with it.
- Netflix built Hystrix to solve this problem with
fallbacks and careful error handling. To make sure all of this works, they created a
framework called FIT (Fault Injection Testing), which tests changes and failures
under load. They also make sure FIT is run only for absolutely critical microservices.
- Go multi-region and make sure you have somewhere to go if one region fails.
- Trickle analysis: slowly introduce changes into production.
- Stateless service: no cache or database. It is not a problem if a node dies. Think of it as a pure function.
- Stateful service: uses a database. The cache needs to be multi-region as well; EVCache (memcached) does
this for Netflix.
- Beautiful slide on incident handling. Another beautiful slide
about how variance (new languages/frameworks) affects a company.
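
The fallback and fault-injection points above can be made concrete with a small sketch in Python. This is not Hystrix's API (Hystrix is a Java library) and not how FIT works internally. `CircuitBreaker`, `flaky_recommendations`, and `popular_titles` are hypothetical names chosen for illustration, and the thresholds are arbitrary. The idea is only that a failing dependency gets replaced by a fallback, and that after repeated failures the caller stops hitting it for a while so the error does not cascade.

```python
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Hypothetical, minimal circuit breaker (illustrative, not the Hystrix API)."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so tests can control time
        self.failures = 0
        self.opened_at: Optional[float] = None

    def is_open(self) -> bool:
        """True while calls to the dependency should be skipped."""
        if self.opened_at is None:
            return False
        # Once the cool-down has elapsed, let one trial call through.
        return self.clock() - self.opened_at < self.reset_after

    def call(self, primary: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.is_open():
            # Do not touch the failing dependency at all; answer from the fallback.
            return fallback()
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        # A success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def flaky_recommendations() -> List[str]:
    # Stand-in for a downstream microservice; the fault is injected on purpose.
    raise TimeoutError("injected fault")


def popular_titles() -> List[str]:
    # Static fallback: a less personalised but still useful answer.
    return ["popular-1", "popular-2"]


breaker = CircuitBreaker(max_failures=2)
for _ in range(3):
    # Every call returns the fallback list; the third call never reaches
    # flaky_recommendations because the circuit is already open.
    print(breaker.call(flaky_recommendations, popular_titles))
```

Passing the clock in as a parameter is a design choice made here so that the cool-down can be tested without sleeping; a real setup would also need timeouts on the primary call and per-dependency metrics, which this sketch leaves out.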