ChatOps is a DevOps practice that has taken hold, and it’s easy to see why. The ability to run scripts from your primary communication tool is very enticing to system operators. But you want to know what’s even better than running commands on your own? Having them run automatically!
VictorOps has allowed organizations to run webhooks as part of their escalation policy since day one. However, the data we passed wasn’t very useful, which made it difficult to base decisions on it.
While discussing a customer request for more granular auto-resolves, we realized how much power people would have if we simply sent more information with a webhook. So we decided to start including the details of the alert that started an incident.
Here’s an example of the content that the enhanced webhooks contain:
To use the new webhook, you’re going to need a simple server capable of receiving it. I wrote a simple Ruby Sinatra app that consumes the webhooks and then schedules a resolve to be fired in the future using at (yeah, Unix!).
The exact logic you use to schedule auto-resolves, or acknowledgements, is up to you, but this approach should give you much finer-grained control over auto-resolution.
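As a rough illustration of what that scheduling logic might look like, here is a minimal plain-Ruby sketch. It is not the Sinatra app itself, and it makes several assumptions: the payload field names (incident_id, alert, entity_id), the rule table, and the ./resolve_incident.sh script are all hypothetical stand-ins that you would replace with whatever your webhook and tooling actually provide.

```ruby
require 'json'
require 'shellwords'

# Hypothetical rules: minutes to wait before auto-resolving, chosen by a
# pattern matched against the alert's entity id.
AUTO_RESOLVE_RULES = [
  { pattern: /disk/i, delay_minutes: 30 },
  { pattern: /cpu/i,  delay_minutes: 10 }
].freeze

# Return the delay in minutes for an alert, or nil if no rule matches.
def resolve_delay(alert)
  entity = alert.fetch('entity_id', '') # 'entity_id' is an assumed field name
  rule = AUTO_RESOLVE_RULES.find { |r| entity =~ r[:pattern] }
  rule && rule[:delay_minutes]
end

# Build the shell pipeline that hands a resolve job to at(1).
# './resolve_incident.sh' is a hypothetical script you would write yourself.
def at_command(incident_id, delay_minutes)
  job = ['./resolve_incident.sh', incident_id.to_s].shelljoin
  "echo #{job.shellescape} | at now + #{Integer(delay_minutes)} minutes"
end

# Parse a webhook body and return the at(1) command to run, or nil when
# the alert should not be auto-resolved.
def schedule_for(body)
  payload = JSON.parse(body)
  delay = resolve_delay(payload.fetch('alert', {}))
  delay && at_command(payload.fetch('incident_id'), delay)
end
```

Inside a webhook handler you would pass the request body to schedule_for and, if it returns a command, run it with system. Keeping the command building separate from the execution makes the rules easy to test without actually queuing jobs.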
Hopefully this rather pedestrian demonstration will spark other ideas on how you could use webhooks to automate your workflow.
Other things you could do:
– If you have some well-understood error scenarios where the alert has a standard solution, you could run basic commands from your webhook server to actually fix the issue. In that world, you might delay your escalation policy by 5 minutes so that the automated fix has a chance to work before anyone is woken up.
– For incidents that signify a major outage, create a Google Hangout and send the link into the timeline so that people can join it when they log in.
– Create a matching JIRA ticket to serve as the longer-term record of an incident.
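One way to organize ideas like these is a small playbook that maps an incoming alert to the follow-up actions it should trigger. The sketch below is only an assumption about how you might structure it: the field names (entity_id, severity), the matching rules, and the action symbols are all hypothetical, and the actions themselves (restarting a service, opening a hangout, filing a ticket) are left for you to implement.

```ruby
# Hypothetical playbook: each entry pairs a matcher with a follow-up action.
PLAYBOOK = [
  { match: ->(a) { a['entity_id'].to_s.include?('nginx') }, action: :restart_service },
  { match: ->(a) { a['severity'].to_s == 'outage' },        action: :open_hangout }
].freeze

# Return the list of actions to take for an alert. Every alert also gets a
# ticket so there is always a long-term record.
def actions_for(alert)
  matched = PLAYBOOK.select { |entry| entry[:match].call(alert) }
                    .map { |entry| entry[:action] }
  matched + [:create_ticket]
end
```

Your webhook handler could then walk the returned list and call whatever code implements each action.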
If you need help configuring webhooks, take a look at our knowledge base.