There’s been a lot of discussion around post-mortems lately, both in the DevOps space and outside of it. Why do them? Should they be blameless? How often? Who runs them? SO MANY QUESTIONS.
We’ve been asking some of those same questions for the past two years now. From our 2014 & 2015 State of On-call data, here’s what we currently know about the post-mortem, or retrospective, landscape.
– 50% have a defined post-mortem process, while 50% do not (same number as 2014).
– 66% only perform post-mortems after significant outages (down from 75% in 2014).
– 81% strive for blameless post-mortems (up from 65% in 2014).
– A majority use post-mortems to share findings with their team & executive stakeholders.
– 63% report that they use post-mortems to improve their process (up from 50% in 2014).
What This Means…
Looking at the numbers above, a story starts to emerge. Teams know they should be doing post-mortems, but most (66%) still only do them after a significant outage. Teams know that post-mortems help them get better at incident management, yet only half have a defined process for running them.
It’s completely understandable. Talking about doing post-mortems and actually doing them are two different things. It’s like eating better: you know you should, but it’s not always easy to do. Pulling together all the information a successful post-mortem needs (texts, chats, phone call logs, emails, fragmented conversations) seems daunting.
But There’s Help…
Fortunately, we have a tool that makes this process much easier. We can say that because we eat our own dog food: we use our post-mortem tool to run our own post-mortems. As we go through the process, we add notes and edits about when certain tactics were tried and what we did to troubleshoot the problem as we moved through the incident timeline. Because the incident timeline captures all the actions that took place during the firefight, you have all the information you need in one place.
We gather as a group to do our post-mortems, with different people speaking to what happened and why they did what they did. This is where the blameless portion comes into play. Post-mortems aren’t meant to be a finger-pointing exercise. Ideally, you learn from your mistakes and you leave the post-mortem with action items that make your process better.
We take the hassle out of post-mortem reporting, leaving you more time to do what really matters: IMPROVE.
If you are looking for additional resources around any post-mortem topic, we’ve got you covered with a beginner’s guide, Everything You’ve Ever Wanted to Know about Post-mortems But Were Afraid to Ask, some suggestions on how to conduct blameless post-mortems, and all the ways you might be failing at the post-mortem process.
(Hint: blame is not the name of the game.)
This is a hard topic, even for the most seasoned on-call engineers, so if you have questions, shoot them our way. And remember, knowing about post-mortems is half the battle.