A snapshot of a past incident showing the specifics of how you solved the problem (data, people, order of events, etc.).
- Provides an opportunity for reflection, learning and blameless discussion about what happened during an outage
- Review the positives and deltas of your process after an incident is resolved
- Easily identify gaps in your tactical process that can be improved (e.g., creating new annotations or transformations) for faster resolution
Incident Frequency Report
Identify patterns within your incidents in order to optimize your on-call process
- See the most frequent paging events
- Easily identify false alarms
- Note thresholds that need to be modified
- Determine future resources that need to be deployed
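As a rough illustration of how the most frequent paging events can be surfaced, the sketch below tallies pages by service and alert name. The `pages` list and the `most_frequent_pages` helper are hypothetical stand-ins for exported incident data, not part of the product.

```python
from collections import Counter

# Hypothetical export: one (service, alert_name) pair per page sent.
pages = [
    ("db", "disk_full"),
    ("web", "high_latency"),
    ("db", "disk_full"),
    ("web", "high_latency"),
    ("db", "disk_full"),
    ("cache", "evictions"),
]

def most_frequent_pages(pages, top_n=3):
    """Return the top_n (event, count) pairs, most frequent first."""
    return Counter(pages).most_common(top_n)

top = most_frequent_pages(pages)
for (service, alert), count in top:
    print(f"{service}/{alert}: {count} pages")
```

An event that dominates this list is a candidate for a threshold change or a false-alarm review.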
Incident Metric Report
A powerful tool for examining your process, dealing with alert fatigue, and optimizing your team’s performance.
- Understand whether a high volume of pages means those systems are too noisy and need noise reduction, or are too fragile and need some bug cleanup
- Metrics include incidents, acknowledgements, recovery, MTTA, and MTTR
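To make the two time-based metrics concrete, here is a minimal sketch that takes MTTA as the mean time from trigger to acknowledgment and MTTR as the mean time from trigger to resolution. The incident records, the field names, and the `mean_minutes` helper are assumptions for illustration; how the report itself computes these values may differ.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records; field names are illustrative only.
incidents = [
    {
        "triggered": datetime(2024, 1, 1, 10, 0),
        "acknowledged": datetime(2024, 1, 1, 10, 5),
        "resolved": datetime(2024, 1, 1, 10, 35),
    },
    {
        "triggered": datetime(2024, 1, 2, 22, 0),
        "acknowledged": datetime(2024, 1, 2, 22, 15),
        "resolved": datetime(2024, 1, 2, 23, 25),
    },
]

def mean_minutes(incidents, start_key, end_key):
    """Average elapsed minutes between two timestamps across incidents."""
    return mean(
        (i[end_key] - i[start_key]).total_seconds() / 60 for i in incidents
    )

mtta = mean_minutes(incidents, "triggered", "acknowledged")
mttr = mean_minutes(incidents, "triggered", "resolved")
print(f"MTTA: {mtta:.1f} min, MTTR: {mttr:.1f} min")
```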
User Metric Report
Uncover statistics about the members of your team and of your organization as a whole.
- Bring to light the pain points for individual users and teams in your organization
- See if someone is getting paged too often - too many pages may affect productivity
Incident Trends Report
Graphs a running average of incidents, acknowledgments, and resolves. The report uses a 15-day average of incident metrics, which produces a smoother graph that shows long-term trends.
- Understand at a glance the basic makeup of your incidents
- See the trending of your incident lifecycle - from incoming alert to final resolution
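The smoothing effect of a 15-day running average can be sketched as a trailing mean over daily counts. The `running_average` function below is a hypothetical helper, and it assumes that the first days, before a full window is available, are averaged over however many days exist so far; the report may treat that edge differently.

```python
from collections import deque

def running_average(daily_counts, window=15):
    """Trailing mean over the most recent `window` days (fewer at the start)."""
    recent = deque(maxlen=window)
    total = 0
    averages = []
    for count in daily_counts:
        if len(recent) == window:
            total -= recent[0]  # oldest day is about to fall out of the window
        recent.append(count)
        total += count
        averages.append(total / len(recent))
    return averages

# Hypothetical daily incident counts with one noisy spike on day 4.
daily_incidents = [4, 5, 3, 20, 4, 5, 4]
smoothed = running_average(daily_incidents, window=15)
```

A single spike moves the trailing mean far less than it moves the raw daily count, which is why the averaged graph shows long-term direction more clearly.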
A look at the daily average acknowledgment and resolve rate.
- See long- and short-term trends in average response and resolution times for incidents
- Identify bottlenecks in your process and build a plan to remove them