The talk was accepted to the conference program
About four years ago, when we started developing Yandex Delivery, we adopted all the main patterns for building stable and reliable applications:
- canary release
- retries and timeouts
- rate limiting
- circuit breaker
- feature toggling
Even if one of our data centers becomes unavailable, our users will not notice. We can enable, disable, and configure features in production in real time, and much more.
But all of this was not enough to prevent occasional downtime.
I'll talk about the non-obvious problems we encountered and the lessons we learned from various production incidents:
- architectural decisions that led to problems (inter-service communication, entity processing, etc.)
- problems with developing an external API
- specifics of working with mobile clients
- problems with PostgreSQL and what we did wrong
The largest professional conference for developers of high-load systems