As application development has shifted toward running containers on cloud platforms, a few technologies have moved front and center in the industry’s collective mind. The two largest are arguably Docker and Kubernetes, as a container format and an orchestration engine, respectively. While these technologies are great at what they do, they offer little help with managing the hosts and supporting components they depend on.
Docker and Kubernetes are great tools for deploying applications. But on their own, they provide just a sliver of the functionality that operations teams need to be successful. Fortunately, there are tools available to plug the operations gaps in Docker and Kubernetes. In this article, I explain how DevOps teams can use one such tool, Ansible, to streamline their work when they’re supporting Docker- and Kubernetes-based environments.
Docker containers run effectively under the control of Kubernetes. But, what happens when Docker, Kubernetes or any of the components they rely on need to be updated, restarted or otherwise maintained?
While these activities can be performed manually, manual processes are both time-consuming and error-prone. In DevOps and IT, this is where automation tools come into play and show their true value. Ansible is a leading tool in this space because it is simple to get up and running. It requires no agents to be preinstalled on hosts; instead, it connects to them over SSH. Its playbooks are written in YAML, an easy format to work with in simple text editors and version control software like Git.
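As a rough illustration of that format, here is a minimal playbook sketch for the kind of maintenance described above: updating Docker and restarting it one host at a time. The group name `docker_hosts` and the package name `docker-ce` are assumptions for this example and will differ by environment and distribution.

```yaml
---
# Hypothetical maintenance playbook: update and restart Docker host by host.
- name: Rolling Docker update
  hosts: docker_hosts          # assumed inventory group
  become: true
  serial: 1                    # work through the group one host at a time
  tasks:
    - name: Update the Docker package
      ansible.builtin.package:
        name: docker-ce        # assumed package name
        state: latest
      notify: Restart Docker   # only fires if the package actually changed

  handlers:
    - name: Restart Docker
      ansible.builtin.service:
        name: docker
        state: restarted
```

Saved as, say, `docker-update.yml`, it could be run with `ansible-playbook -i inventory.ini docker-update.yml`, where both file names are placeholders.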
Ansible runs against an inventory of hosts sorted into groups, which can be nested. The inventory can be a simple static file or a dynamic inventory, as is often the case when large cloud providers or on-premises hypervisors like VMware are involved. Ansible’s ability to dynamically grab all the hosts that meet specific criteria (using tags or labels) is almost lifesaving when working with containers and Kubernetes clusters that grow or shrink based on their capacity requirements.
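On AWS, one way to get this behavior is the `amazon.aws.aws_ec2` inventory plugin. The sketch below assumes worker nodes carry a `Role` tag with the value `k8s-worker`; the region and tag values are placeholders, not taken from a real environment.

```yaml
# k8s.aws_ec2.yml (the plugin expects a file name ending in aws_ec2.yml or aws_ec2.yaml)
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1                    # placeholder region
filters:
  tag:Role: k8s-worker           # assumed tag on worker nodes
  instance-state-name: running   # skip stopped or terminated instances
keyed_groups:
  # Build groups such as role_k8s_worker from each instance's Role tag
  - key: tags.Role
    prefix: role
```

Running `ansible-inventory -i k8s.aws_ec2.yml --graph` shows which hosts and groups the plugin currently resolves, which is a quick way to check the filters before pointing a playbook at them.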
Here’s a sample from a playbook for adding a node to a K8s cluster on AWS:
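A minimal sketch of what such a playbook might look like, assuming a kubeadm-managed cluster and an AMI that already has kubeadm and the kubelet installed. It is an illustration written for this purpose, and every ID, name, and variable in it (the AMI, subnet, key pair, security group, and the `k8s_*` join variables) is a placeholder.

```yaml
---
# Sketch only: all IDs, names, and variables below are placeholders.
- name: Launch a new worker instance on AWS
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Start an EC2 instance for the new node
      amazon.aws.ec2_instance:
        name: k8s-worker-new
        region: us-east-1
        instance_type: t3.large
        image_id: ami-00000000000000000   # placeholder AMI
        key_name: ops-key
        vpc_subnet_id: subnet-00000000
        security_group: k8s-workers
        tags:
          Role: k8s-worker
        state: running
        wait: true
      register: ec2

    - name: Add the instance to an in-memory group
      ansible.builtin.add_host:
        name: "{{ ec2.instances[0].private_ip_address }}"
        groups: new_nodes

- name: Join the new instance to the cluster
  hosts: new_nodes
  become: true
  gather_facts: false
  tasks:
    - name: Wait for SSH to become available
      ansible.builtin.wait_for_connection:
        timeout: 300

    - name: Run kubeadm join
      ansible.builtin.command:
        argv:
          - kubeadm
          - join
          - "{{ k8s_api_endpoint }}"        # e.g. host:6443, supplied by the operator
          - --token
          - "{{ k8s_join_token }}"
          - --discovery-token-ca-cert-hash
          - "{{ k8s_ca_cert_hash }}"        # expected in sha256:<hash> form
        creates: /etc/kubernetes/kubelet.conf   # skip if the node has already joined
```

The first play runs locally and talks to the AWS API; the second connects to the new instance over SSH. The `creates` guard keeps the join step from running twice on the same node.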
As an automation tool, Ansible is used to create playbooks for resolving common events and incidents that occur in the environment, from restarting applications and services to adding additional cluster nodes. This automation is used by on-call staff, from operations to development, to reduce mean time to acknowledge and recover (MTTA/MTTR), and to provide better service overall.
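A remediation playbook for one of these events can be very small. The sketch below performs a rolling restart of a Deployment and waits for it to settle; it assumes `kubectl` is installed and already configured on the control machine, and the namespace and Deployment names are made up.

```yaml
---
# Hypothetical remediation playbook for an "application unresponsive" incident.
- name: Restart a misbehaving Deployment
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    app_namespace: shop          # assumed namespace
    app_deployment: checkout     # assumed Deployment name
  tasks:
    - name: Trigger a rolling restart
      ansible.builtin.command:
        argv:
          - kubectl
          - --namespace
          - "{{ app_namespace }}"
          - rollout
          - restart
          - "deployment/{{ app_deployment }}"

    - name: Wait for the rollout to finish
      ansible.builtin.command:
        argv:
          - kubectl
          - --namespace
          - "{{ app_namespace }}"
          - rollout
          - status
          - "deployment/{{ app_deployment }}"
          - --timeout=120s
      changed_when: false        # a status check never changes anything
```

If the rollout does not complete within the timeout, the second task fails and the playbook exits non-zero, which gives the caller a clear signal that a human still needs to look.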
To go one step further, Ansible Tower, or its upstream open-source project AWX, makes it possible to integrate incident management platforms with the suite of tools used for automation. In the case of VictorOps, for example, the Webhooks capability can call Tower to launch specific playbooks at any point of the incident lifecycle, attempting automatic remediation. If the playbook fixes the problem, the incident can be resolved without waking anyone up at 4 AM. If it doesn’t, normal escalation and notifications can still occur.
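On the Tower/AWX side, launching a job template is an authenticated HTTP POST to that template's launch endpoint, which is the request an outgoing webhook would need to make. The sketch below reproduces that request with Ansible's `uri` module so it can be tried by hand first. The hostname, the template ID `42`, the `awx_token` variable, and the `incident_id` extra variable are all assumptions.

```yaml
---
- name: Launch a remediation job template through the AWX API
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: POST to the job template launch endpoint
      ansible.builtin.uri:
        url: "https://awx.example.com/api/v2/job_templates/42/launch/"   # placeholder host and ID
        method: POST
        headers:
          Authorization: "Bearer {{ awx_token }}"    # token supplied at run time
        body_format: json
        body:
          extra_vars:
            incident_id: "{{ incident_id | default('unknown') }}"
        status_code: 201                             # a successful launch creates a job
      register: launch
```

Extra variables are only honored when the job template is set to prompt for variables on launch, so a webhook that passes incident details needs that option enabled on the template.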
Documentation on Ansible is available at docs.ansible.com. There are hundreds of available modules organized into categories. These include a dedicated Kubernetes module, multiple Docker modules, and many other modules to support cloud-specific deployments from a basic AWS EC2 module to more specific modules like Azure Container (Kubernetes) Service.
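As a small taste of those modules, the sketch below uses one Docker module and the Kubernetes module side by side. It assumes the `community.docker` and `kubernetes.core` collections are installed, which is where current Ansible releases ship these modules (older releases exposed them under the short names `docker_container` and `k8s`). The group name, container name, image, and namespace are illustrative.

```yaml
---
- name: Run a container directly on Docker hosts
  hosts: docker_hosts            # assumed inventory group
  become: true
  tasks:
    - name: Ensure an nginx container is running
      community.docker.docker_container:
        name: web
        image: nginx:stable
        state: started
        restart_policy: unless-stopped
        published_ports:
          - "8080:80"

- name: Manage an object in a Kubernetes cluster
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Ensure a namespace exists
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: Namespace
          metadata:
            name: demo
```

Both tasks are declarative: running the playbook again leaves an already-running container and an existing namespace untouched.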
Beyond pure operations, development groups can use Ansible to handle the lifecycle of a containerized application within Kubernetes by creating an Ansible Operator. Operators are a more advanced packaging format that the industry is just now starting to adopt.
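In the Operator SDK, an Ansible Operator is driven by a `watches.yaml` file that maps a custom resource to the role or playbook that reconciles it. A minimal sketch, with a made-up API group, kind, and role name:

```yaml
---
# watches.yaml: run the "webapp" role whenever a WebApp resource changes.
- version: v1alpha1
  group: apps.example.com      # hypothetical API group
  kind: WebApp                 # hypothetical custom resource kind
  role: webapp                 # hypothetical role bundled in the operator image
```

The role itself is ordinary Ansible, so the same tasks a team already uses to deploy an application can be reused as the operator's reconciliation logic.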
Kubernetes administration is a great example of an environment that is both complicated (by its sheer number of moving parts) and simple (due to the repetition of many required tasks). Add Docker and its products, and the environment needs just a little automation magic. Ansible shines in such dynamic cloud environments in which a multitude of complex and routine tasks need to be automated and extended.
Vince Power is a Solution Architect with a focus on cloud adoption and technology implementations using open-source technologies. He has extensive experience with core computing and networking (IaaS), identity and access management (IAM), application platforms (PaaS), and continuous delivery.