Whether you still work within a NOC or not, IT operations, technical support and DevOps-minded engineers are constantly looking for better ways to collaborate and accurately track incident documentation. The rise of real-time incident response tools, alert automation and collaborative on-call workflows has changed the traditional approach to incident management. Because development and operations teams aren’t always built the same, ITSM and ITIL principles don’t always apply – requiring highly customizable, flexible IT alerting and ticketing practices.
Ultimately, IT ticketing and holistic incident management are improved with detailed monitoring context, intelligent real-time alert routing, collaboration tools and dynamic on-call scheduling. With VictorOps and ServiceNow, modern DevOps and IT teams are able to spend more time fixing major customer-facing problems and less time manually creating incidents, filling out fields and moving tickets through a queue.
So, we’re proud to share a number of enhancements to our ServiceNow integration. And, we’ll share some tips and tricks for making the most of your SNOW integration – not just technically but in practice, as well. Below, we’ll dive into some ways you can use automation and simple incident workflows to create a process for real-time incident response. Stop worrying about manually maintaining documentation in your IT ticketing and incident management tool and start worrying about end-user experiences.
Bidirectional, fully customizable assignment mapping and custom field mapping across VictorOps and ServiceNow allow you to build out your incident response and management system exactly how you want it.
IT practitioners and DevOps-centric teams both need the ability to manually, or automatically, create tickets and escalate incidents. Well, you can easily create incidents in either tool and pass information back and forth, allowing people to work in the place they feel most comfortable. And, you’ll be able to track tickets and historical incident documentation across both solutions. You can acknowledge and resolve incidents from both ServiceNow and VictorOps and you can easily assign incidents to the right users or teams in both tools.
No matter how you build your assignment groups or customize your fields in ServiceNow, you can directly map them to customizable VictorOps escalation policies. And, not only that, but you can share work notes between the two tools. So, your incident response and incident management software can work for you and your team, not the other way around.
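To make the mapping idea concrete, here is a minimal sketch of a two-way lookup between ServiceNow assignment groups and VictorOps escalation policies. It is not the integration's actual configuration format: the group names, the policy names and both helper functions are hypothetical and exist only for illustration.

```python
from typing import Optional

# Hypothetical names: none of these groups or policies come from a real account.
GROUP_TO_POLICY = {
    "Network Operations": "netops-primary",
    "Database Admins": "dba-oncall",
    "Service Desk": "servicedesk-business-hours",
}

# Reverse lookup, so an incident created in either tool can be mapped across.
POLICY_TO_GROUP = {policy: group for group, policy in GROUP_TO_POLICY.items()}


def policy_for_group(group: str, default: Optional[str] = None) -> Optional[str]:
    """Return the escalation policy mapped to an assignment group."""
    return GROUP_TO_POLICY.get(group, default)


def group_for_policy(policy: str, default: Optional[str] = None) -> Optional[str]:
    """Return the assignment group mapped to an escalation policy."""
    return POLICY_TO_GROUP.get(policy, default)
```

Keeping the mapping one-to-one is what makes the reverse lookup safe. If several groups shared one policy, the reverse direction would need an explicit tie-breaking rule.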
With a purpose-built solution for real-time incident response and intelligent alerting, major problems don’t get lost in a queue. The severity of incidents is more accurately documented and responders are jumping onto issues faster than ever before. As responders get immediately notified of problems by their monitoring tools, they should be focused on restoring services for customers, not tracking tickets. So, the integration of VictorOps and ServiceNow allows you to do just that – use VictorOps to rapidly remediate issues while keeping historical data in ServiceNow.
If you really want to uplevel your incident response processes, even on top of the reporting provided by ServiceNow and VictorOps, you can build Splunk dashboards to track metrics such as cost of downtime and time spent on unplanned work. These insights can help DevOps-minded teams continuously improve incident management and keep developers and IT practitioners working on the creation of new products, services and features. Straight out of the box, the VictorOps and ServiceNow integration helps you quickly take action around incidents without losing any documentation.
In the midst of a firefight, the last thing you want is to spend time context-switching by clicking back and forth between multiple applications. At the click of a button in VictorOps, you can open up the exact ServiceNow ticket you need to look at.
And, the related monitoring context and work notes can be surfaced in a single pane of glass – whether your preferred pane of glass is ServiceNow or VictorOps. With the alert details and collaboration tools integrated with VictorOps, you can communicate bidirectionally through VictorOps and chat apps like Slack or Microsoft Teams, all while keeping tickets updated in ServiceNow.
From critical infrastructure and application incidents to minor IT concerns and requests across the company (e.g. a new keyboard or a new monitor), operations teams need to plan for personnel and asset capacity. Over time, ticket tracking and incident management tools like ServiceNow can help you understand when requests come in, how often certain types of equipment are needed and the number of work hours needed from the IT staff. The VictorOps and ServiceNow integration allows these IT teams to focus on delivering the service they need while continuously tracking these important capacity planning and incident management metrics in ServiceNow.
Automation is driving the ability to improve the entire incident management process – from the way engineers manage on-call scheduling to the tracking of work over time. Not only does this help you learn about the way your system works, but it helps you better understand how your people work. We’ve stated it in the past but we’ll say it again – as machine learning and automation become more and more commonplace in DevOps and IT, the focus of this technology should always be on the way it affects people.
Combined, VictorOps and ServiceNow allow you to build out a full-scale system for better automatic alerting and notification, real-time incident collaboration and IT ticketing. More often than not, the right people will receive the appropriate alerts at the right time. Less time is spent simply trying to diagnose a problem, triaging the incident and figuring out who’s the appropriate responder. Once intelligent alert automation helps surface incident context to the right person or team, it’s just a matter of offering the right functionality that encourages deep collaboration.
Customized routing keys, escalation policies and the alert rules engine allow you to turn VictorOps into a full-fledged system for incident response. Over time, with minimal effort, alerts are continuously served up to the right users and teams – driving mean time to acknowledge and resolve (MTTA/MTTR) from hours to minutes.
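For teams that want to check those numbers themselves, MTTA and MTTR can be computed from incident timestamps. The sketch below assumes each incident is a plain dictionary with `triggered`, `acknowledged` and `resolved` times. Those field names and the sample data are made up for illustration and do not reflect either product's export format.

```python
from datetime import datetime
from statistics import mean


def mean_minutes(incidents, start_key, end_key):
    """Average elapsed minutes between two timestamps across incidents."""
    durations = [
        (incident[end_key] - incident[start_key]).total_seconds() / 60
        for incident in incidents
    ]
    return mean(durations)


# Made-up sample incidents, for illustration only.
incidents = [
    {
        "triggered": datetime(2024, 1, 1, 9, 0),
        "acknowledged": datetime(2024, 1, 1, 9, 4),
        "resolved": datetime(2024, 1, 1, 9, 34),
    },
    {
        "triggered": datetime(2024, 1, 2, 14, 0),
        "acknowledged": datetime(2024, 1, 2, 14, 6),
        "resolved": datetime(2024, 1, 2, 14, 56),
    },
]

mtta = mean_minutes(incidents, "triggered", "acknowledged")  # (4 + 6) / 2 = 5.0
mttr = mean_minutes(incidents, "triggered", "resolved")      # (34 + 56) / 2 = 45.0
```

Here MTTR is measured from the moment the alert triggered. Some teams measure it from acknowledgement instead, so it helps to agree on one definition before comparing numbers across teams.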
Even with everything you can do with the VictorOps and ServiceNow integration, it’s only as good as the data ingested from your monitoring tools. So, how can you build a highly observable system and improve the way you take action on observable problems? Well, this idea of comprehensive observability built out on top of your real-time incident response and incident management tools can help you fix issues faster – often before major incidents even occur.
Tracking key service metrics can help you determine if a problem even exists. This is often one of the hardest things to do if you simply set-and-forget basic application and infrastructure monitoring solutions. Defining important high-level metrics across your various applications and services and determining the best way to track them can help you quickly detect incidents. Monitoring tools that can help you track metrics include AWS CloudWatch, Metricly, SignalFx and Microsoft Azure Monitor.
But, once you see a problem, how do you use tracing to track it across disparate services, teams, applications, networks and servers? A common problem we see in incident response and incident management is a series of alerts across different monitoring solutions that may actually only relate to one core issue. So, tracing tools can help you quickly track the origins of a problem and understand what’s really happening with your service. Some examples of software for dynamic tracing include New Relic, Sysdig, SignalFx, AppDynamics or Nagios.
Now, you can dive into the logs to diagnose specifically what’s wrong with your application or service. With better metrics and traces, you’ll likely have narrowed down the problem enough to know which logs hold the context you need to fix it. Ideally, your logging solution has robust search and filtering capabilities to help you speed up diagnosis even more. Logging solutions include tools like Splunk Enterprise and Cloud, Loggly or Logz.io.
But, how do you mobilize the team and take action around the problem once you know exactly what’s wrong? That’s where the VictorOps and ServiceNow integration comes into play. You can correlate alerts and similar incidents, automate notifications based on alert payload data from your full suite of observability tools, and quickly execute on remediation strategies. With the right data alongside effective collaboration tools, you can turn events and major incidents into minor issues.
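As a rough illustration of correlating alerts and routing on payload data, the sketch below groups alerts that share a service and picks a routing key from the severity. It is not how the VictorOps alert rules engine is implemented: the payload fields (`source`, `service`, `severity`), the routing key names and both functions are assumptions made for the example.

```python
from collections import defaultdict


def correlate(alerts):
    """Group alerts by service so related alerts are handled as one incident."""
    grouped = defaultdict(list)
    for alert in alerts:
        grouped[alert["service"]].append(alert)
    return dict(grouped)


def routing_key(alert):
    """Pick a hypothetical routing key from fields in the alert payload."""
    if alert["severity"] == "critical":
        return f"{alert['service']}-oncall"
    return "triage-queue"


# Made-up payloads from three different monitoring sources.
alerts = [
    {"source": "metrics", "service": "checkout", "severity": "critical"},
    {"source": "tracing", "service": "checkout", "severity": "warning"},
    {"source": "logs", "service": "search", "severity": "warning"},
]

incidents = correlate(alerts)
keys = [routing_key(alert) for alert in alerts]
```

With this sample data, the two `checkout` alerts collapse into one group, and only the critical alert is routed to an on-call key. The rest land in a shared triage queue.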
The monitoring, alerting, incident response, incident management and IT ticketing flywheel shouldn’t be thought of in separate buckets. Data and process need to flow cleanly through all stages of incident detection, notification, response and remediation. Then, with a clear storage center for historical incident documentation, you can conduct more thorough post-incident reviews – endorsing a culture of DevOps and continuous improvement. And, luckily for you, that’s just what our latest integration with ServiceNow offers up.