
Incident Management in a Complex Serverless Framework

Brad Griffith November 18, 2019


Serverless frameworks can lead to highly efficient, scalable systems that allow developers to build complex software faster and more reliably. They allow engineering teams to focus on individual functions across multiple applications or microservices and eliminate numerous problems with maintaining physical hardware. Serverless capabilities are also often referred to as Functions as a Service (FaaS).

The benefits of serverless include scalability, speed, flexibility and even cost-effectiveness. However, FaaS can also add complexity, which leads to less reliable applications and infrastructure and makes on-call incident management more difficult. Serverless frameworks really only work when observability becomes a first-class citizen, continuous testing becomes the norm and engineers proactively think about their deployment mechanics and incident management practices.

Serverless benefits and challenges

Serverless presents engineering teams with lots of opportunities for growth and velocity, but it also presents challenges. IT service management (ITSM) has changed over time, and new technologies continue to change the way we develop services and release them into the world. Identifying and resolving incidents in serverless systems is a totally different ballgame. Defining dependencies, building redundancies into applications and infrastructure can

We put this article together to show how serverless applications are changing IT service management and to share some methods for improving incident management when disaster strikes.

The serverless difference for IT operations

The job of IT operations has become more convoluted and confusing with the improvements being made to continuous integration and continuous delivery (CI/CD). When software development was first taking off and users ran on-prem applications on their personal computers, IT had a lot more time to review code before shipping it to production. Six-month release cycles allowed more time to run tests and QA products and services, leading to fewer negative impacts on customers.

On the flip side, end users had to wait months for even the smallest updates to their software. Then, the evolution of cloud storage and cloud service providers such as AWS, GCP and Azure led to Platforms as a Service (PaaS). Engineering and IT teams began to offload numerous hardware, server and network responsibilities onto their cloud providers, helping them focus on the development and deployment of new features. PaaS differs from serverless (FaaS) in that a PaaS application is deployed and hosted as a single unit, whereas FaaS houses autonomous functions separate from one another.

Serverless allows for more flexible applications and infrastructure and can also help teams save money on computing resources. By breaking applications into small autonomous functions and storing parts of services across multiple devices and servers, you can build a product that scales up and down in real time, only when it needs to. Serverless functions can also limit an incident’s blast radius between different parts of a service and, as long as you’ve built visibility into your systems, help you isolate and identify problems faster.
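To make the point about visibility concrete, here is a minimal Python sketch of one way to give each autonomous function its own structured record per invocation, so a failure can be traced to a single function instead of the whole service. The decorator name `observed`, the function names `resize_image` and `send_receipt`, and the in-memory `records` list are all hypothetical; a real system would send these records to whatever logging or monitoring tool it already uses.

```python
import json
import time
from functools import wraps


def observed(function_name, sink):
    """Wrap a handler so every invocation emits one structured record.

    `sink` is any callable that accepts a JSON string (hypothetical here;
    in practice it might write to a log stream or a metrics pipeline).
    """
    def decorator(handler):
        @wraps(handler)
        def wrapper(event, context=None):
            started = time.monotonic()
            record = {"function": function_name, "status": "ok"}
            try:
                return handler(event, context)
            except Exception as exc:
                # Note which function failed and how, then re-raise so the
                # platform's own retry and error handling still apply.
                record["status"] = "error"
                record["error"] = type(exc).__name__
                raise
            finally:
                elapsed_ms = (time.monotonic() - started) * 1000
                record["duration_ms"] = round(elapsed_ms, 2)
                sink(json.dumps(record))
        return wrapper
    return decorator


# Two independent, hypothetical functions that share nothing but the sink.
records = []


@observed("resize_image", records.append)
def resize_image(event, context=None):
    return {"width": event["width"] // 2}


@observed("send_receipt", records.append)
def send_receipt(event, context=None):
    raise ValueError("missing email address")
```

Because each function reports under its own name, a failure in `send_receipt` shows up as an error record for that function only, while `resize_image` keeps reporting healthy invocations.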


Agile IT and incident management

While technology evolved, so did the structure of engineering teams. DevOps adoption continues to rise, and developers and operations teams are continuously finding better ways to work together. One of the first mentions of the concept of DevOps came in 2008, when Patrick Debois (widely credited as a primary creator of DevOps) attended a meeting to discuss the concept of “Agile Infrastructure.” The intersection of Agile software development and the IT Infrastructure Library (ITIL) led people to think more about the concept of DevOps: reducing silos between IT and development, automating processes, tightening feedback loops and deploying reliable code faster.

Today, IT professionals are helping facilitate cloud-based services and architecture, changing the fundamental responsibilities of their day-to-day lives. While there’s a lot less physical work as far as putting servers into racks, there’s a lot more work involving the overall monitoring, alerting and cloud-service management strategy. Operations teams are getting more input during product development so they can proactively test and run QA without creating a release bottleneck.

This way, DevOps-minded teams are able to release code to production faster and keep customers happy. And, if something does go wrong, developers and IT professionals share accountability for incident management, meaning they collaborate in real time and get the right people involved at the right time. Developers who wrote the code that broke in production aren’t simply passing the responsibility for remediation over the wall to an IT team; they’re diving in and fixing problems themselves.

Serverless operations meet ITSM and service operations

With complex serverless applications, it’s easy for incident management responsibilities to get lost in translation. Clearly assigned service ownership and detailed mapping of the way serverless functions are being used in your applications and services will drastically improve the way you approach on-call scheduling, alerting and incident response. Once you know how work moves through the development and incident management lifecycles, you can start to implement better processes. Serverless functionality really only changes how technical applications and infrastructure interact with each other; it shouldn’t completely change the way people work together.
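As a rough illustration of assigned ownership and function mapping, the Python sketch below keeps an explicit map from each serverless function to the team and escalation policy responsible for it. Every name in it (the function names, team names, policy names and the `route_alert` and `unowned` helpers) is a hypothetical placeholder and not part of any particular on-call product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Owner:
    team: str
    escalation_policy: str


# Hypothetical ownership map: one entry per deployed function.
OWNERSHIP = {
    "checkout-create-order": Owner("payments", "payments-primary"),
    "checkout-send-receipt": Owner("payments", "payments-primary"),
    "media-resize-image": Owner("media", "media-primary"),
}

# Catch-all owner so an alert is never dropped for lack of a mapping.
DEFAULT_OWNER = Owner("platform", "platform-catch-all")


def route_alert(function_name, ownership=OWNERSHIP, default=DEFAULT_OWNER):
    """Return the owner who should be paged for an alert on this function."""
    return ownership.get(function_name, default)


def unowned(deployed_functions, ownership=OWNERSHIP):
    """List deployed functions that nobody has claimed yet, sorted by name."""
    return sorted(set(deployed_functions) - set(ownership))
```

Running a check like `unowned` against the list of deployed functions is one way to catch gaps in ownership before an incident exposes them.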

Just thinking about the complexity serverless adds to ITSM, service operations and incident management can give you a headache. But, if you take a step back and address each operation individually, you’ll see that incident management doesn’t need to look that different. Your approach to software development and overall architecture will look completely different, but the general approach to incident management doesn’t have to.

DevOps, the serverless framework and resilient applications and infrastructure

The core values of DevOps (automation, collaboration, transparency, exposure, accountability and continuous improvement) will facilitate more efficient incident management in any environment. In fact, within a complex serverless framework, the closer the relationship between developers and IT teams, the less distant functions will feel from one another. DevOps builds transparency across silos, including individual functions in a serverless framework.

See how DevOps and IT operations teams, in any environment, are improving incident management with a holistic approach to on-call scheduling, alerting and real-time incident response. Sign up for a 14-day free trial or request a free personalized demo of VictorOps to optimize incident management and make on-call suck less.
