Being on call doesn’t have to suck.
Automated scheduling can offer flexibility and cut down on alert noise. Effective on-call calendar management makes for happier employees and, ultimately, leads to less confusion and faster incident remediation. So, let’s discuss how you can manage on-call schedules to make employees and customers happier.
In a previous post on avoiding alert fatigue, we discussed how alert fatigue can cause a number of psychological effects. Poorly managed on-call rotations can create many of the same situations. Stressed, anxious, sleep-deprived employees could struggle with productivity—potentially slowing down incident remediation. Not to mention the crankiness it would cause in the office!
Talk to your teams. How do they want on-call calendars to be structured? What can you do to add flexibility to schedules while providing a high level of reliability? Work on building on-call schedules that meet customer expectations of reliability while simultaneously making employees happy.
A big part of establishing a DevOps culture is making everyone responsible for system uptime. Furthering a culture of code ownership and accountability lends itself to a higher level of system reliability. When every member of the team takes on-call responsibilities, they become more acquainted with the system, and incidents are resolved in a timely manner.
When engineers buy into on-call schedules, everybody spends less time on call. This, of course, gives every team member more time to focus on other projects. On-call calendars become more flexible because short-term schedule shifts become easier. All in all, flexibility and a sense of community at work make for a more cohesive, happy team.
At VictorOps, in the words of our Hand, we like to say DevOps is, “An approach to our work where we continuously look for methods to evaluate and improve the technology, process, and people as they relate to building, deploying, operating, securing, and supporting the value our organization provides.”
We’re constantly thinking about how we can influence technology, processes, and people to make on call suck less. Creating intelligent alert routing rules and establishing strategic escalation policies is a great start. When an alert comes in the middle of the night, it should never wake someone up if they can’t act on it.
An on-call calendar should be structured so the right person gets alerted at the right time. On top of that, adding actionable context to an alert can give engineers the insight they need to quickly remediate issues.
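To make the idea concrete, here is a minimal Python sketch of a rotation lookup combined with a time-based escalation policy. It is an illustration only, not how VictorOps implements scheduling: the names `Rotation`, `EscalationStep`, and `who_to_page`, the one-week shift length, and the 15-minute escalation delay are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Rotation:
    """Hypothetical rotation: engineers take fixed-length shifts in turn."""
    engineers: list[str]
    start: datetime  # when the first engineer's first shift began
    shift_length: timedelta = timedelta(days=7)  # assumed weekly shifts

    def on_call(self, at: datetime) -> str:
        # Count whole shifts elapsed since the rotation started,
        # then wrap around the list of engineers.
        shifts_elapsed = (at - self.start) // self.shift_length
        return self.engineers[shifts_elapsed % len(self.engineers)]


@dataclass
class EscalationStep:
    """Hypothetical policy step: page this rotation after `delay` without an ack."""
    delay: timedelta
    rotation: Rotation


def who_to_page(policy: list[EscalationStep],
                fired_at: datetime,
                now: datetime) -> list[str]:
    """Return everyone who should have been paged by `now`
    for an alert that is still unacknowledged."""
    waited = now - fired_at
    paged: list[str] = []
    for step in policy:
        if waited >= step.delay:
            person = step.rotation.on_call(now)
            if person not in paged:  # avoid paging the same person twice
                paged.append(person)
    return paged


if __name__ == "__main__":
    # Made-up team members and dates, purely for illustration.
    primary = Rotation(["ana", "ben", "chi"], start=datetime(2024, 1, 1))
    secondary = Rotation(["dev", "eli"], start=datetime(2024, 1, 1))
    policy = [
        EscalationStep(timedelta(0), primary),
        EscalationStep(timedelta(minutes=15), secondary),
    ]
    fired = datetime(2024, 1, 10, 3, 0)
    print(who_to_page(policy, fired, fired))
    print(who_to_page(policy, fired, fired + timedelta(minutes=20)))
```

In this sketch only the primary on-call engineer is paged when the alert fires; the secondary rotation is added once the alert has gone unacknowledged past the assumed delay, so nobody is woken up before they are actually needed.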
Automate and surface technology that makes calendars more visible, and build alerting processes that make your people more efficient. Optimize all three (technology, processes, and people) and you’ll create a culture of continuous improvement, system reliability, and employee and customer satisfaction.
Download our free State of On-Call report to see everything DevOps and IT teams are doing to achieve a better on-call experience.