Working remotely is at the top of everyone’s mind right now. But working remotely across distributed teams and engineering disciplines is easier than it’s ever been. And, as they say in show business, the show must go on. Organizations should strongly consider how automation and data can help engineering teams collaborate and improve transparency for remote workers.
Quick note: This post isn’t intended to pander to the technical reader. Rather, this post is meant to highlight the importance of automation and data to build resilience into an organization’s DevOps and IT workforce – giving them the ability to be effective anywhere.
An effective system of automation driven by data is required in order to have an engineering team that can work remotely from each other yet still prove effective in building applications and supporting them. Automation without data for remote employees is just a process, and data without automation is just unactionable context. Here’s how the two come together to help development teams be effective:
High-performing release automation means that delivery chains don’t need to stall. The delivery chain itself isn’t the whole benefit for remote engineers. An additional benefit of release automation is the ability to see the status of each phase of delivery and gain more control over pushing applications through the process.
In the traditional Network Operations Center (NOC) model, engineers needed to be physically in a facility to identify and triage incidents when they occurred. But the modern NOC can now be 100% virtual. With powerful incident response tooling, teams can be mobilized wherever they are. The status of incidents can be visible to the entire team, and all activity associated with those incidents is more transparent, from alert notification to acknowledgment all the way to remediation and post-incident documentation.
Mobilization is also the point when communication tools are brought into the equation. Communication tools include chat and video/audio conference software. Incident response tools integrate directly into your chat communication tools for real-time digital communication. Additionally, you can now allow teams to launch war rooms and conference bridges directly from the incident response tool – in-line with critical incident context and remediation tools.
After remote teams are mobilized, they need to get into the firefight and fix things. But being remote means it’s not always easy to get context from team members who have expertise in the affected systems. This is where an incident response tool also brings context to the on-call responders addressing problems. Incident response tools will provide data from monitoring tools, in addition to any annotations the team has set up to attach automatically to incidents of this type. These annotations could be, for example, a runbook, support articles, post-incident reviews, or similar incidents from the past.
If the on-call engineer isn’t qualified to address a specific issue, an incident response tool can recommend responders based on previous incidents they’ve worked on. So, instead of yelling down the hall for help, the virtual NOC allows you to identify people in the organization who can help with a few button clicks, no face-to-face interaction needed.
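To make the idea of recommending responders from incident history a little more concrete, here is a minimal sketch. It is not how any particular incident response product works; the incident records, service names, responder names, and the `recommend_responders` function are all hypothetical, and the ranking simply counts how often each person has handled incidents for the affected service.

```python
from collections import Counter

# Hypothetical incident history: (affected service, responder who worked on it).
# A real tool would pull this from its own incident records.
PAST_INCIDENTS = [
    ("payments-api", "alice"),
    ("payments-api", "bob"),
    ("payments-api", "alice"),
    ("search", "carol"),
    ("search", "bob"),
]


def recommend_responders(service, history, limit=2):
    """Rank responders by how many past incidents they handled for `service`."""
    counts = Counter(responder for svc, responder in history if svc == service)
    return [name for name, _ in counts.most_common(limit)]


if __name__ == "__main__":
    print(recommend_responders("payments-api", PAST_INCIDENTS))
```

A simple count like this ignores recency and who is currently available, so in practice it would likely be only one input into who gets paged.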
Once the dust has settled and issues are resolved, the team needs to learn and adapt to prevent future issues. The data and context collected during an incident feed into post-incident reviews, allowing teams to remotely build out more automation and tooling to help prevent the issues or implement quick fixes for the root cause. For example, teams can now create runbooks or support articles for future incidents the on-call staff might see.
Additionally, all data collected across the delivery chain allows organizations to understand how they’re doing long-term. Measuring release velocity and impact of incidents over a long period of time allows teams to avoid face-to-face meetings but still have a full understanding of how their processes and applications are doing.
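As one illustration of measuring incident impact over time, the sketch below computes mean time to acknowledge and mean time to resolve from a list of incident timestamps. The records, field names (`triggered`, `acknowledged`, `resolved`), and helper functions are assumptions made for this example; a real team would export this data from its own incident response and delivery tooling and would likely track release metrics alongside it.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident export with ISO 8601 timestamps.
INCIDENTS = [
    {
        "triggered": "2020-03-02T10:00:00",
        "acknowledged": "2020-03-02T10:04:00",
        "resolved": "2020-03-02T10:34:00",
    },
    {
        "triggered": "2020-03-09T22:15:00",
        "acknowledged": "2020-03-09T22:17:00",
        "resolved": "2020-03-09T23:05:00",
    },
]


def minutes_between(start, end):
    """Return the elapsed minutes between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 60


def mean_minutes(incidents, start_key, end_key):
    """Average the elapsed minutes between two lifecycle events."""
    return mean(minutes_between(i[start_key], i[end_key]) for i in incidents)


if __name__ == "__main__":
    print("Mean time to acknowledge (min):",
          mean_minutes(INCIDENTS, "triggered", "acknowledged"))
    print("Mean time to resolve (min):",
          mean_minutes(INCIDENTS, "triggered", "resolved"))
```

Tracking these averages per week or per month gives a remote team a shared, asynchronous view of whether its response process is improving.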
All of the above requires automation driven by delivery chain data. It also requires highly integrated DevOps toolchains and strong communication tools. The objective of automation and data is to give organizations the flexibility to have their engineers anywhere. In the event that organizations decide to go fully remote, they can do so with the confidence that applications will continue to be developed and supported.
Of course, for some teams, sustained disconnection can impact culture and morale. Tools can’t sustain or mend culture, but they can certainly help maintain what’s in place and help engineers work remotely without reducing efficiency or effectiveness.
Learn how DevOps and IT operations engineers can use VictorOps, remotely or otherwise, to drastically reduce incident impact and make on-call suck less. Sign up for a 14-day free trial to test it yourself or reach out to our sales team for a personalized demo.