Chaos Engineering: System Resiliency in Practice (Paperback)
As more companies move toward microservices and other distributed technologies, the complexity of these systems increases. You can't remove the complexity, but through Chaos Engineering you can discover vulnerabilities and prevent outages before they impact your customers. This practical guide shows engineers how to navigate complex systems while optimizing to meet business goals.
Two of the field's prominent figures, Casey Rosenthal and Nora Jones, pioneered the discipline while working together at Netflix. In this book, they expound on the what, how, and why of Chaos Engineering while facilitating a conversation among practitioners across industries. Many chapters are written by contributing authors to widen the perspective across verticals within (and beyond) the software industry.
- Learn how Chaos Engineering enables your organization to navigate complexity
- Explore a methodology to avoid failures within your application, network, and infrastructure
- Move from theory to practice through real-world stories from industry experts at Google, Microsoft, Slack, and LinkedIn, among others
- Establish a framework for thinking about complexity within software systems
- Design a Chaos Engineering program around game days and move toward highly targeted, automated experiments
- Learn how to design continuous collaborative chaos experiments
About the Author
Casey Rosenthal is CEO and cofounder of Verica, and was formerly the engineering manager of the Chaos Engineering Team at Netflix. He has experience with distributed systems, artificial intelligence, translating novel algorithms and academia into working models, and selling a vision of the possible to clients and colleagues alike. His superpower is transforming misaligned teams into high-performance teams, and his personal mission is to help people see that something different, something better, is possible. For fun, he models human behavior using personality profiles in Ruby, Erlang, Elixir, and Prolog.
Nora Jones is the cofounder and CEO of Jeli. She is a dedicated and driven technology leader and software engineer with a passion for the intersection between how people and software work in practice in distributed systems. In November 2017 she keynoted at AWS re:Invent to share her experiences helping organizations large and small reach crucial availability with an audience of 40,000 people, helping kick off the Chaos Engineering movement we see today. Since then she has keynoted at several other conferences around the world, highlighting her work on topics such as Resilience Engineering, Chaos Engineering, Human Factors, Site Reliability, and more from her work at Netflix, Slack, and Jet.com. Additionally, she created and founded the www.learningfromincidents.io movement to develop and open source cross-organization learnings and analysis from reliability incidents across various organizations.