Showing Post-Incident Review Posts

Top Incident Management KPIs to Monitor

Proactive incident management begins with continuous improvement of processes, people, and technology. DevOps and IT teams need to track key performance indicators (KPIs) over time...

September Roundup: A Game Day Recap

Preparation is the key to effective on-call management and faster incident remediation. From our State of On-Call Report, we found that incident response, on average,...

Becoming a Reliability Engineer (SRE)

The world of defined roles for site reliability engineering (SRE) is relatively new. The principle was first defined and implemented by Ben Treynor, VP of...

Incident Preparation: Uptime Is No Guarantee

Working to prevent downtime is a never-ending battle. But no matter what you do, in today’s era of continuous deployment and integrated services, uptime is...

Checklist For Running Your Runbook Documentation

Runbooks, sometimes referred to as playbooks, are standardized documents containing information and procedures for resolving common IT or DevOps incidents. Runbooks walk through the steps...

The DevOps Incident Management Flowchart

Finding the most effective way to manage incidents in your organization is dependent on two things: 1) The maturity of your product and, 2) the...

The Incident Management Handbook

Incident management isn’t a straightforward, one-size-fits-all process. Every organization is built upon different infrastructure—technologically, culturally, and personnel-wise. And with the growing popularity of integrated systems...

The DevOps Dictionary Part Three

Finally, the last part of The DevOps Dictionary is here! We covered letters A-O in parts one and two. Today, we’ll finish the series of...

The DevOps Dictionary Part Two

We’re back with more of The DevOps Dictionary, a list of the most important terms and DevOps definitions. Part One covered letters A-I, and part...

VictorOps First Leap Into Chaos Engineering Tactics

We decided to embark on a journey to make our systems more reliable by creating intentional chaos. Our team developed the SRE Council, made up...

