Over the past few years, the term shift-left has been gaining traction within DevOps organizations. Shift-left refers to a development approach where application quality and security become a focal point for the development team as early as possible in the development lifecycle. It’s implemented through strategies of end-to-end automated testing and continuous integration/deployment (CI/CD).
As a result, developers often find themselves supporting every aspect of the applications they build, which includes being on call for them. This isn’t inherently bad: involving developers in on-call operations leads to better overall user experiences. However, it does create new challenges for developers who can no longer just write code and let someone else respond when a problem occurs in production.
Below, I explain in detail the challenges of being a developer in a shift-left world, how a developer can cope with being on call, and what steps they can take to quickly resolve issues that arise while on-call in an effort to keep the development process moving.
Traditionally speaking, a developer’s job has been to translate business requirements into technical requirements, and then design and build the application to function appropriately to fulfill these requirements. At the conclusion of development, the application would be handed off for testing where functionality was verified and bugs were identified. This is hardly the case anymore.
DevOps and Agile development practices, including the shift-left approach to improving application quality, have expanded the role of the developer. With the edict to test earlier comes the goal of end-to-end automated testing. This results in developers often finding themselves responsible for writing and maintaining automated test scripts in addition to developing the application itself. With the integration of these automated scripts into the CI/CD process, developers are now expected to fully understand and support the build and deployment process. All that to say, the average developer today supports many more aspects of the development pipeline than the average developer of previous eras.
In addition to the added responsibilities being thrust upon today’s developers, there is also the added pressure to increase quality while continuously delivering improvements and additional features to the application. This can rarely be done without developers supporting the development pipeline outside of normal business hours. So what strategies can on-call developers utilize to quickly resolve application issues and get back to their lives outside of work?
One of the biggest challenges for application developers is familiarizing themselves with the facets of the application that they don’t necessarily work with each day. For organizations with developers that also work on-call to support the development process, the need for team members to understand the full scope of the application is particularly important.
We’ve all heard of the term “silo” as it relates to a work environment. It describes a mentality within an organization where employees fail to communicate effectively to allow all members of the team to collaborate in the most efficient manner possible. This silo mentality cannot exist in a software development organization where developers work on-call in their off hours to help maintain the speed of delivery for the application. Some tips for overcoming this challenge include the following:
Maintain communication within the development team each day through the use of Agile practices such as Scrum. In this way, each team member will have a level of context as to what their fellow developers are working on. Therefore, when an issue arises, and a call is placed to the developer on-call, there is a much higher likelihood that they will have a point of reference in regard to the issue at hand. This will allow for a quicker resolution and a less stressful experience for those on call. They won’t have that moment where they wonder what exactly is being referred to by the reporter.
One of the common ways in which an application failure is reported in a shift-left world is through the failure of an automated test script. So it only makes sense that each developer on the team should be familiar with the different test frameworks used to support the project.
Ensuring proper access is another consideration. Developers often spend so much of their time in the development environment that granting them access to other environments is forgotten, yet that access becomes necessary when debugging requires looking at a specific issue in a specific environment. Ensure that each developer has valid credentials for the log files in every supported environment so that debugging is as seamless as possible.
While communication and collaboration amongst team members is key in providing the on-call developer with a better chance to achieve a quick resolution, the administrative side of dealing with application issues can be a frustrating experience. This is where incident management software comes into play. Take VictorOps, for example, where developers can be set up to receive alerts regarding relevant issues as a result of the monitoring tools configured to work with the product. VictorOps can help to make the life of on-call personnel easier by providing features such as:
Imagine it’s 2:00 AM and you’re alerted to an ongoing production issue that can’t wait and requires a developer’s expertise. While you may not want to start calling around randomly and waking up everyone along the way, it would be nice to communicate with team members who are available and scheduled to be on call as well. VictorOps can assist with this by providing the status of each user along with contact information that can be used in exactly this situation.
One of the crucial facets of making on-call work bearable for the developer is eliminating alerts that can wait until normal business hours. Classifying alerts accordingly makes the job far more tolerable, because on-call personnel are no longer tied to their phones for situations that do not require immediate action or remediation. Warnings and other informational alerts can be configured to show up in reporting within the VictorOps UI while also being suppressed so as not to alert on-call personnel.
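To make the idea of classifying alerts concrete, here is a minimal sketch in Python. It is not VictorOps configuration or its API. The `Severity` levels, the `route` function, the 9-to-5 weekday business hours, and the policy itself (critical alerts always page, warnings notify only during business hours, informational alerts are only recorded) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, time
from enum import Enum


class Severity(Enum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3


class Action(Enum):
    PAGE = "page"      # notify the on-call developer right away
    RECORD = "record"  # keep for reporting only, send no notification


@dataclass(frozen=True)
class Alert:
    service: str
    message: str
    severity: Severity


# Hypothetical business hours: Monday to Friday, 09:00 to 17:00 local time.
BUSINESS_START = time(9, 0)
BUSINESS_END = time(17, 0)


def is_business_hours(now: datetime) -> bool:
    """Return True on weekdays between BUSINESS_START and BUSINESS_END."""
    return now.weekday() < 5 and BUSINESS_START <= now.time() < BUSINESS_END


def route(alert: Alert, now: datetime) -> Action:
    """Decide whether an alert should interrupt someone (assumed policy)."""
    if alert.severity is Severity.CRITICAL:
        return Action.PAGE  # cannot wait, regardless of the hour
    if alert.severity is Severity.WARNING and is_business_hours(now):
        return Action.PAGE  # can wait, so only notify during the workday
    return Action.RECORD    # visible in reporting, but nobody is paged


if __name__ == "__main__":
    two_am = datetime(2024, 1, 3, 2, 0)  # a Wednesday night
    print(route(Alert("checkout", "service down", Severity.CRITICAL), two_am))
    print(route(Alert("checkout", "disk at 80%", Severity.WARNING), two_am))
```

Under this assumed policy, the 2:00 AM outage pages the on-call developer, while the disk warning raised at the same hour is only recorded and wakes nobody up.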
Being tied to your phone for on-call work is hardly considered fun for most developers. That being said, it’s a reality of the job in today’s development climate. A few key steps can be taken to ensure issues are resolved quickly, allowing those on-call responders to get back to their lives. Consider improving collaboration amongst your development team by giving developers insight into portions of the application with which they have less familiarity. In addition, utilize incident management software and take full advantage of the features in place to reduce unnecessary alerts and gain greater visibility into the availability of co-workers. This will help to facilitate effective communication, allowing for faster remediation of on-call-worthy issues.
Sign up for a 14-day free trial or get a free personalized demo of VictorOps to see how software developers can collaborate better and reduce alert noise – all while improving visibility into deployments and on-call availability.