Meet You at DockerCon 2017

Alex Wellock, a senior IT engineer at VictorOps, knows a thing or two about Docker. Before joining us in November, he worked at HP on the Docker/HP Enterprise partnership, setting up Containers as a Service with Docker Universal Control Plane. In February, Maggie Gourlay wrote about attending her first Game Developers Conference after spending many years working in gaming. Now, after spending over a year implementing Docker containers, I’m about to attend DockerCon 2017. I’m most excited about the presentations, seeing the vendors, and learning what…

Available is the new On-Call

As teams look to grow their DevOps practice, they face many fundamental challenges. Integrating Developer and Ops workflows provides massive lifts in efficiency, but requires focused work. Continuous Deployment offers a step function in development velocity, while requiring a sea change in the way Ops manages systems. Sharing responsibility for Applications and Infrastructure across a wider team brings experiential benefits and integrates teams with historical silos. While sharing responsibility for infrastructure is great, the ugly truth of DevOps is that most people don’t want to be on-call.…

MTTR ZERO: one weird trick solves all the problems

We all agree the MTTR (mean time to repair/resolve) metric is core to any Incident Management practice. Today, we’re pleased to announce a new solution within the VictorOps toolset: Automated Incident Resolution. This approach to resolving Incidents will, we think, forever change the landscape of Incident Management for your team. The basic problem, as has been explored in the past, is people. They have to be notified, they have to wake up, get a computer, investigate, diagnose, repair, and ultimately resolve the initial alert or…

Reducing Alert Noise: Going from 1000 Alerts to 10 Alerts Overnight

Monitoring tools are great. Here at VictorOps, we are constantly rolling out new integrations with monitoring tools; without those tools, VictorOps wouldn’t have much to work with. They enable you to check system health every few minutes and often alert you in the same way: by sending an email or notification every time a check finds a failure. If you haven’t set up alert dependencies in your monitoring systems, this can become noisy. In cases where you have configured your monitoring systems to check system health every…

Don’t Miss This Webinar: The Evolving Role of Context in Incident Management

Providing Situational Context to first responders is one of the most nuanced and critical success factors teams need as they manage and resolve incidents. It’s critical at all stages of incident management, from alert detection through postmortem. Provide no context, and you’ll materially impede resolution efforts. Overwhelm a team with data, and chaos ensues. Understanding the evolving role of context will differentiate your incident management abilities and prepare you for ongoing success. In this webinar, you’ll gain an understanding of the evolving role of situational…

U mad bro? Disaster planning for on-call

Disaster. That word gets used a lot in our circles–it’s a trigger to the deepest FUD argument a vendor or colleague can make. A disaster can be defined in any number of ways: the number of customers impacted, revenue loss, or the number of systems impacted. There are many metrics by which a disaster will be judged. For an on-call team, however, the tale of a disaster is told in the minutes and the hours. Much like a security breach, the reality of a systems disaster…

The Sweet 17: Check Out the Latest Bracket of VictorOps Integrations

Load watcher. Issue creator. Log analyzer. The power of any solution in your stack is definitionally limited by the ways in which you can integrate it with the rest of your stack. At VictorOps, we’re keenly aware of the heterogeneous environments that we all work in. To that end, we announce the March Sweet 17: a plethora of the newest integrations to empower your team. Log Analysis Splunk SaaS: Splunk indexes and makes searchable data from any app, server or network device in real time…

From Building RESTful APIs to Teaching Hip Hop: An Interview with VictorOps Developer DeAndré Carroll

DeAndré Carroll is a platform and API developer at VictorOps. He also has a multi-decade career as a street dancer, choreographer, teacher, and artistic entrepreneur as director of The FunKinetic Project. JK: Which came first, dance or coding? DC: I have had my relationship with computers far longer than I have been dancing. I started programming when I was eleven years old. My elementary school in Northeast Denver had a program where they would pair mentors from IBM with kids they identified had aptitude for…

Context through coupling: JIRA for on-call teams

Some of the most visible artifacts of organizational silos in engineering are tools. Visibility into dashboards, workflows, or documentation is cordoned off in separate and often redundant systems. Conway’s Law manifests in tool choices and integrations as much as in application development. As a team matures an Incident Management or DevOps practice, breaking these tool walls is necessary. In this post I’ll explore a low-effort, high-value way that you can extend integration between VictorOps and JIRA to break down those silos, and empower your…

Hiring Out Key Infrastructure: Is the Exit Clearly Marked?

Recent events on the Internet have produced a lot of headlines, and if you’re an Ops Manager, a lot of headaches. Yesterday’s AWS outage caused widespread issues across several industries, and many affected organizations are waking up today realizing they didn’t have a good way to respond, other than waiting for Amazon to identify and correct the issue. Outages happen to everyone; the key is knowing how to respond, and indeed knowing whether you can respond at all. Outsourced providers help achieve scale and redundancy…