Everyone remembers their firsts: first kiss, first concert, first job. A first on-call horror story is no different. Joni Klippert, VP of Product at VictorOps, vividly remembers her first nightmare on-call experience. Although Joni’s not an engineer, she got a glimpse into the life of being on-call and dealing with a stressful outage when bringing VictorOps to market for the first time. Watch the video below or read on to learn why we awarded Joni’s on-call horror story The Worst First Time.
Because Joni is not an engineer and does not work in operations, her on-call horror story is atypical. Joni was running the VictorOps alpha program, and the team had decided to go public with their offering on a set day and time. This meant releasing a publicly available beta program so anyone could sign up and evaluate the product.
VictorOps was ready to go: an email was sent to the early-access list, and articles announcing the launch were scheduled in media outlets. To make the launch a success, a website was designed to give new users access to the product. The team pushed the website live the night before. The managed WordPress hosting service the team had selected worked well in the staging environment, but it failed when it was time to go live. At 10 p.m., five people were on a Google Hangout trying to figure out what to do and how to get the website live.
Everyone on the team, from designers and web developers to highly technical people, was troubleshooting the issue.
One of the technical team members explained that the WordPress install could run on a server with a configured, up-to-date LAMP stack. But the team didn’t have the time or the resources to build one. Joni, however, had joined VictorOps after working for a cloud provider that had access to essentially every cloud provider in the world. It kept pre-configured applications at the ready, so she could simply say “I want a LAMP stack on any hosting provider. Go.” and it would be deployed for her within 15 minutes. She gave it a whirl using her old account, and it worked. The team was able to load a copy of the website onto the new server and launch it, and the website was up and running. It was a stressful seven hours of troubleshooting, but everything was fixed in time.
Joni had never been on-call with a pager, but that night she experienced the feeling firsthand. She had heard stories of people not being able to go to their kids’ recitals or sleeping in a different room than a spouse. Joni explained, “It was amazing to be on that Google Hangout and see significant others come walking into the screen from behind saying, ‘Are you coming to bed yet, what’s going on?’ It is a very real thing and it’s a very stressful experience for people who have this job.”
Joni has lived through an on-call horror story only once, but the folks doing this job every day encounter them more often. Because of her outage firefight, Joni has a newfound empathy for people on-call.