April 16, 2015 summit

Summit 2015: Reactive by Example / Eran Harel

A cool story about the evolution of our monitoring infrastructure, from the naive approach to a super resilient system. How do we manage to handle 4M metrics per minute and over 1K concurrent connections? Which strategies did we try to apply, and where did they fail? What techniques and technologies do we use to achieve this? How do we handle errors and failures at this scale? What can we still improve?

MP3
