What you do after an incident is resolved can be as important as fighting the fire itself. If you don’t learn from the mistakes you’ve made, you’re doomed to repeat them. And when those mistakes cost the company revenue, they are expensive lessons to keep learning.
In our State of On-Call Report (it’s live! get a copy!), we dug in and asked not only how people are doing on-call but also what they’re doing once their on-call rotation is over…besides sleeping soundly.
We believe that you should do a post-mortem, or retrospective, weekly – even if you haven’t had a big outage. But if you’re in the 50% not conducting any sort of post-mortem and you have questions about how to get started doing them, we have a helpful guide with tips for the beginner.
And if you’re a VictorOps user, we make post-mortems even easier with our reporting feature, which lets you pull all the relevant incident details into a single doc where you can add notes or edit out unnecessary parts. Our system data suggests that companies using our post-mortem tools have fewer false alarms and less downtime. Yes, it is possible. You can see even more of our post-mortem reporting magic in the video below…
Finally, if you’d like to do blameless post-mortems but are just not there yet, we have some suggestions (in both printed and spoken form) on best practices and how to gently introduce them into your workplace.
You now have no excuses. Get out there and start making your team better by talking about what happened…without pointing fingers.