Matthew Boeckman - June 15, 2017
Ask members of a traditional DevOps team what it’s like to be on-call and you’ll likely hear a variety of answers such as, “it’s part of the job,” “it’s stressful,” or the very direct, “it sucks.”
As innovative teams bring non-traditional Ops folks, such as developers, into the fold, they need to bring a modern approach with them. This modern incident management framework leans on automation and seeks to accelerate and streamline the often slow-moving on-call process, while also keeping everyone on the same page, maintaining optimal uptime, and helping the team get more zzz’s.
The Dev and Ops Guide to Incident Management provides an actionable introduction and clear approach for redefining traditional or homegrown processes to ensure that all team members are set up for success when, not if, a system failure occurs.
In this book, readers will learn the basic concepts behind traditional monitoring systems, on-call roles, and on-call processes. It also explores more advanced topics, including team organization and ways to balance responsiveness with sleep.
Download your copy today to receive:
The latest in incident management theory and practice
An outline of the ideal after-action process
Expert guidance to ensure that alerts are actionable and that the team member who is alerted is equipped to act
Suggestions for the ideal on-call team structure and assigned roles