Michael Handa, Executive Director of Managed Services at Blue Sentry IT, an AWS Managed Service Partner, uses VictorOps both internally and with clients. Blue Sentry’s NOC uses VictorOps as a platform of engagement for on-call scheduling, escalations, and real-time collaborative incident response.
With a fully distributed workforce, on-call incident management needs to offer visibility and collaboration tools to people across multiple geographic locations. Blue Sentry IT takes this a step further by managing incidents internally, as well as offering managed services to external parties. So, Michael Handa has his fair share of experience responding to and remediating issues.
Where It All Began
Michael joined Blue Sentry IT about 2 years ago, has been running help desks for 11-12 years, and has been doing IT much longer than that. As the Executive Director of Blue Sentry IT’s managed services team, Michael consults with clients who are already in AWS or who are looking to be in AWS. So, Michael isn’t only maintaining Blue Sentry’s own infrastructure, but he’s helping other companies set up and maintain their own systems as well.
From migrations to deployments, Michael helps his clients organize and facilitate their own infrastructure needs. Setting up monitoring thresholds, alerting systems, and incident management tools also falls on Michael’s plate. He assists clients with building new infrastructure, implementing continuous integration and delivery (CI/CD) processes, supplementing engineering teams, and on occasion acts as a completely outsourced engineering team.
Incident Management and the NOC at Blue Sentry IT
Blue Sentry’s managed services team operates a network operations center (NOC), a centralized location for monitoring and managing servers and networking equipment. The on-call NOC team consists of about 10 people who route incidents to the proper person or team. They manage ticketing and alert routing 24 hours a day, 7 days a week.
The on-call team breaks up scheduling and uses follow-the-sun rotations as much as possible. Blue Sentry uses New Relic to monitor their network and systems, set thresholds, and send alerts directly into their ticketing system (Freshservice). With Freshservice as a way to track their tickets, they can then use VictorOps as their platform of engagement for incident response and resolution.
With custom webhook functionality and some light usage of the VictorOps transmogrifier, each new ticket in Freshservice creates an alert in VictorOps with variable incident data. Also, when a call comes through to the NOC, a ticket is automatically created in Freshservice and an incident is created in VictorOps. Via webhook functionality, the severity set in a Freshservice ticket is automatically sent into VictorOps and kept up to date. Then, VictorOps allows members of the NOC to automatically route alerts and calls to the right person based on on-call schedules.
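As a rough illustration of the ticket-to-alert step described above, the sketch below turns a ticket payload into an alert body. The ticket field names (`id`, `subject`, `priority`, `description`), the priority labels, and the severity mapping are all assumptions made for illustration and are not Blue Sentry’s actual configuration. The alert field names are modeled on a generic REST-style alert payload and should be checked against the integration documentation before use.

```python
# Hypothetical sketch: convert a ticket webhook payload into an alert body.
# All field names and the priority mapping below are illustrative assumptions.

# Assumed ticket priority labels mapped to alert severities.
PRIORITY_TO_MESSAGE_TYPE = {
    "urgent": "CRITICAL",
    "high": "CRITICAL",
    "medium": "WARNING",
    "low": "INFO",
}


def ticket_to_alert(ticket: dict) -> dict:
    """Build an alert body from a ticket dict (hypothetical schema)."""
    priority = str(ticket.get("priority", "")).lower()
    return {
        # Unknown or missing priorities fall back to WARNING.
        "message_type": PRIORITY_TO_MESSAGE_TYPE.get(priority, "WARNING"),
        # A stable ID derived from the ticket lets later updates to the
        # same ticket refer to the same incident.
        "entity_id": f"ticket-{ticket['id']}",
        "entity_display_name": ticket.get("subject", "Untitled ticket"),
        "state_message": ticket.get("description", ""),
    }


if __name__ == "__main__":
    example = {"id": 42, "subject": "Disk full", "priority": "Urgent"}
    print(ticket_to_alert(example))
```

Deriving the alert ID from the ticket ID is the main design choice here: when the severity on a ticket changes, the follow-up webhook can reuse the same ID so the change updates the existing incident instead of opening a new one.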
Everyone at Blue Sentry takes advantage of VictorOps personal paging policies in order to respond to alerts through phone call, SMS, the web client or mobile app, Slack, and/or email. Members of the NOC can respond, route, and escalate incidents appropriately through their preferred method of notification. With our robust Slack integration, people can chat about, acknowledge, and resolve incidents directly through Slack.
When discussing Blue Sentry’s needs, Michael emphasized the importance of cost-effective live call routing. We have Twilio live call routing built directly into our platform, and because it’s one of the most heavily used features in Blue Sentry’s NOC, they loved that they didn’t need to switch context when setting up calls. Affordable live call routing proved to be one of the most important capabilities for Blue Sentry IT when managing alerts in the NOC.
Because Blue Sentry operates with a completely distributed team, robust on-call scheduling capabilities were a must. Members of the managed services team can get notified anywhere, at any time, through multiple methods. The team can push schedule changes into Slack for improved on-call calendar visibility. People can also use Take On-Call functionality to make short-term schedule adjustments with coworkers, so they won’t miss their child’s dance recital or baseball game.
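To make the idea of a short-term schedule adjustment concrete, here is a minimal sketch of how a temporary override could sit on top of a base rotation. It is not how VictorOps implements Take On-Call; the `Override` type, the function names, and the people in the example are hypothetical.

```python
# Hypothetical sketch of a base rotation with temporary overrides.
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Override:
    original: str      # person normally scheduled
    substitute: str    # coworker taking the shift
    start: datetime    # override begins (inclusive)
    end: datetime      # override ends (exclusive)


def scheduled_on_call(rotation: list[str], shift_index: int) -> str:
    """Pick the scheduled person for a shift by cycling through the rotation."""
    return rotation[shift_index % len(rotation)]


def effective_on_call(scheduled: str, when: datetime,
                      overrides: list[Override]) -> str:
    """Return the substitute if an override covers this moment, else the scheduled person."""
    for override in overrides:
        if override.original == scheduled and override.start <= when < override.end:
            return override.substitute
    return scheduled


if __name__ == "__main__":
    rotation = ["Ana", "Ben", "Chloe"]
    recital = Override("Ben", "Chloe",
                       datetime(2024, 5, 3, 17, 0), datetime(2024, 5, 3, 21, 0))
    person = scheduled_on_call(rotation, 1)  # Ben is scheduled for this shift
    print(effective_on_call(person, datetime(2024, 5, 3, 18, 30), [recital]))
```

Keeping overrides separate from the base rotation means the regular schedule never has to be edited for a one-off swap, and the override simply stops applying once its window ends.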
Blue Sentry needed an easy way to take alerts, organize them into tickets, and then get them to a place where the incidents could be addressed. At its core, VictorOps gives Blue Sentry a centralized platform for monitoring, alerting, ticketing, and incident response. A number of tools integrate with VictorOps to make this process seamless and allow for simpler workflows when a member of the NOC is on-call and receives an alert.
While VictorOps has functionality to help with CI/CD and improve visibility into the SDLC, Blue Sentry really only needed VictorOps for alerting and incident remediation. They simply needed an easy way to surface incidents, alert on the problem, and immediately get the right people involved in the firefight.
The Switch to VictorOps Incident Management
“We may have gone with VictorOps even if it didn’t have call routing functionality.”
Before Michael joined the Blue Sentry team, they were using PagerDuty for alerting workflows. In fact, Michael’s first exposure to VictorOps came while working with a client who was using VictorOps for incident response and remediation. While working with this client, Michael noticed a few things in VictorOps that would improve Blue Sentry’s own situation.
Based on Blue Sentry’s monitoring, alerting, and incident remediation setup, Michael saw a few benefits in switching from PagerDuty to VictorOps. One main benefit was that live call routing would be more affordable for them. Because their NOC relies heavily on live call routing, VictorOps proved to be more cost-effective for the functionality they needed.
Blue Sentry IT does not currently put engineers on-call, but will get them involved via Slack whenever there’s an issue they need to be aware of. Because Blue Sentry uses Slack heavily for incident workflows, this makes it easy to loop in engineers whenever they need to get involved. And the best part is, due to the strength of the VictorOps and Slack integration, any updates or messages in Slack will also automatically appear in the VictorOps incident timeline.
Both platforms offer much of the same basic on-call scheduling, alert routing, and incident escalation functionality. But Michael liked the cross-functional visibility in Slack and VictorOps. All in one place, an alert could come in from their ticketing system and get routed to the proper person with applicable alert data, and team members could get on a call and collaborate right there. We’re always proud to make on-call suck less for our customers like Michael.
And because we know how important live call routing is to Blue Sentry’s business, we’ve never been more proud than when Michael went on to say, “We may have gone with VictorOps even if it didn’t have call routing functionality.”
Blue Sentry’s Favorite VictorOps Functionality?
The VictorOps support team! Michael mentioned a few things that truly stood out when he was looking at VictorOps. We always expect someone to comment on our features, but Michael really loved his experience with our support team. In addition to capabilities such as the live call routing and on-call schedules we’ve mentioned above, we’re happy to have a top-notch support team that’s willing to go above and beyond for our customers to help fix issues, set up integrations, and create on-call rotations.
Alerting, Visibility, and Collaboration in One Place
If the NOC didn’t exist, engineers at Blue Sentry would take on more on-call responsibilities. With the way Blue Sentry is structured, the NOC helps support a culture of DevOps collaboration and SRE. Through automation and intelligent process development, Blue Sentry IT can set up monitoring tools, adjust on-call schedules, route alerts, manage tickets, and collaborate to resolve incidents in one centralized platform.
Incident management solutions need the flexibility to support various organizational structures. Download our free Incident Management Buyers Guide to learn all about incident management functionality that moves you past basic alerting and into collaborative incident response.