Top Docker Monitoring and Alerting Solutions

Dan Holloran May 08, 2019


Container technologies such as Docker, together with orchestrators such as Kubernetes, have transformed the way applications are developed and rolled out. Though similar to virtualization, containers are lighter weight: they share the host kernel and put a powerful abstraction layer on top of servers, which helps organizations make the most efficient use of their hardware resources.

However, as applications become more virtualized and fragmented, keeping track of their performance issues and upgrade requirements has also become a complex exercise. Instead of monitoring individual machines, IT and DevOps teams today have to monitor thousands of containers.

The tools for monitoring traditional applications and infrastructure are not equipped to tackle the complexity of a Docker environment. Hence, a new breed of Docker monitoring tools is now grabbing the limelight. In this article, we’ll list some of the most popular Docker monitoring solutions available in the market.

Docker API (Docker Stats)

The Docker API is a simple source of the basic metrics required for monitoring Docker containers. Through its stats endpoint, and the docker stats command built on top of it, you get access to CPU usage, memory usage and limits, network I/O, block I/O and other basic data. You can use it directly from the command line or over HTTP, and it provides most of the basic data you could want. Docker's tooling is open-source, and you can feed its output into visualization and alerting tools to meet most of your basic requirements.

Unfortunately, the Docker API doesn't allow you to create charts and dashboards that unify metrics across your entire cluster. It is a simple solution for collecting the data. If you'd like to visualize dashboards for metrics like CPU usage, memory, and disk utilization for all containers running in the system, you'll have to import the data into a visualization tool. You can, however, set up basic alerts by scripting threshold checks against the data the Docker API returns.
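As a rough illustration of the kind of scripted check described above, here is a minimal Python sketch that turns one stats sample into CPU and memory percentages and compares them against thresholds. The field names follow the container stats response of the Docker Engine API as we understand it, so verify them against your Engine version. The sample values, the function names and the threshold defaults are all hypothetical, and the memory figure is simplified (it does not subtract page cache).

```python
# Hypothetical sample shaped like one container stats response.
# In practice this dict would come from the Docker Engine API.
SAMPLE_STATS = {
    "cpu_stats": {
        "cpu_usage": {"total_usage": 400_000_000},
        "system_cpu_usage": 10_000_000_000,
        "online_cpus": 2,
    },
    "precpu_stats": {
        "cpu_usage": {"total_usage": 200_000_000},
        "system_cpu_usage": 9_000_000_000,
    },
    "memory_stats": {"usage": 268_435_456, "limit": 1_073_741_824},
}


def cpu_percent(stats):
    """CPU usage between the previous and current sample, as a percentage."""
    cpu = stats["cpu_stats"]
    pre = stats["precpu_stats"]
    cpu_delta = cpu["cpu_usage"]["total_usage"] - pre["cpu_usage"]["total_usage"]
    system_delta = cpu["system_cpu_usage"] - pre["system_cpu_usage"]
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0
    online_cpus = cpu.get("online_cpus") or 1
    return cpu_delta / system_delta * online_cpus * 100.0


def memory_percent(stats):
    """Memory usage as a percentage of the limit (simplified: cache included)."""
    mem = stats["memory_stats"]
    if not mem.get("limit"):
        return 0.0
    return mem["usage"] / mem["limit"] * 100.0


def check_alerts(name, stats, cpu_limit=80.0, mem_limit=90.0):
    """Return one message per threshold that the sample exceeds."""
    alerts = []
    cpu = cpu_percent(stats)
    mem = memory_percent(stats)
    if cpu > cpu_limit:
        alerts.append(f"{name}: CPU at {cpu:.1f}% exceeds {cpu_limit:.1f}%")
    if mem > mem_limit:
        alerts.append(f"{name}: memory at {mem:.1f}% exceeds {mem_limit:.1f}%")
    return alerts
```

Calling check_alerts("web", SAMPLE_STATS) on a schedule and forwarding any returned messages to your alerting tool is one simple way to wire this up. A real setup would fetch fresh samples per container rather than use a fixed dictionary.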

Sysdig

Sysdig uses an agent-based approach to monitoring. It offers both self-hosted, open-source solutions and paid, cloud-based Docker monitoring solutions.

With the open-source version, you have to install kernel headers on the host OS, which can be complex and time-consuming. However, Sysdig Monitor can automatically discover all the containers in your Docker environment. You can also display the relevant metrics related to your clusters in an out-of-the-box configurable dashboard. It’s important to note that Sysdig offers one of the most comprehensive sets of metrics and allows you to monitor them in real-time.

Another advantage of Sysdig is that it offers native support for Linux, Kubernetes, Mesos, and Swarm. It can collect all types of data from Docker event logs, along with the metadata from any container orchestration tool you might be using. You can also record and replay your event/activity stream and troubleshoot issues by drilling down into pods, clusters, and namespaces. Finally, you can configure Sysdig alerts to stay on top of incidents in your Docker environment.


Prometheus

Prometheus is an open-source monitoring and alerting solution initially built at SoundCloud. To make monitoring your Docker environment easier, you may want to run it as a Docker service on Docker Swarm. Prometheus has an active community, and since its inception in 2012, has seen widespread adoption by many companies comfortable with open-source tools for Docker monitoring.

Unlike the agent-based approach, Prometheus has a centralized server that scrapes metrics over HTTP from the targets registered with it. This pull-based approach makes Prometheus highly scalable. For visual monitoring, you can plot queries in the built-in expression browser, while full dashboards are typically built in Grafana.
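To make the pull model more concrete, here is a minimal sketch of the target side using only the Python standard library: an HTTP endpoint that returns metrics in the Prometheus text exposition format for the server to scrape. The metric name demo_container_cpu_percent, the sample values, the port and the function names are hypothetical. In real projects you would normally use an official client library, and this sketch skips label-value escaping.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical readings; a real exporter would collect these on demand.
SAMPLES = {"web": 40.0, "db": 12.5}


def render_metrics(samples):
    """Render a gauge in the Prometheus text exposition format."""
    lines = [
        "# HELP demo_container_cpu_percent CPU usage per container (hypothetical).",
        "# TYPE demo_container_cpu_percent gauge",
    ]
    for name, value in sorted(samples.items()):
        # Note: label values are not escaped here, which a real exporter must do.
        lines.append(f'demo_container_cpu_percent{{container="{name}"}} {value}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the current samples at /metrics for a scraper to pull."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(SAMPLES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block and serve metrics until interrupted (port is an assumption)."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint. The Prometheus server would then be configured with this host and port as a scrape target and would fetch /metrics on its own schedule, which is what distinguishes the pull model from agents that push data out.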

Prometheus is preferred by many organizations because it offers client libraries in most languages, including Java, Go, Python, .NET, PHP and Ruby. However, as a project under active development, Prometheus requires a lot of attention, and you might face configuration challenges. Further, it is dependent on Grafana for dashboarding.

cAdvisor

cAdvisor (short for container Advisor) is another open-source solution which can help you monitor resource usage and performance data of your Docker containers. It’s easy to set up and provides graphs showing CPU, memory, network throughput and disk space utilization. This visualization helps you quickly assess if your cluster needs additional resources.

For deeper troubleshooting, you can also gather usage statistics for individual containers. While the toolkit could be a good match for some teams, it has some limitations. With cAdvisor, monitoring more than one Docker host is not simple, and the charts let you view trends over a window of only about one minute. Further, it doesn't offer any alerting mechanism.

Splunk

Last but not least, you can use Splunk for monitoring Docker containers and the applications running in them. Splunk is an excellent option for centralizing all your application and infrastructure logs for convenient monitoring and in-depth analysis. You can use a range of options for collecting data from Docker, including Docker's native logging (Splunk Logging driver, JSON, Syslog, etc.), forwarders, logging libraries (.NET, Java and Node.js) and more.

With Splunk, you can accelerate root-cause analysis and conduct thorough post-incident reviews with minimal effort. It allows you to intelligently index, search and correlate container-based data across your distributed stack. Further, it offers advanced dashboards and alerts for efficient monitoring.

Concluding notes on the top Docker monitoring and alerting solutions

As more organizations adopt new technologies for application development and delivery, their environments are becoming more complex. The data is also becoming fragmented, leading organizations to move away from reactive measures and to approach monitoring and alerting proactively with AIOps tools.

Such platforms help you leverage data from multiple sources by bringing together collection methods, analytical technologies and visualization techniques. They can help you enhance your IT operations with useful insights and a higher degree of automation.

See how a centralized solution for monitoring, alerting and collaboration leads to efficient incident management. Download the free Incident Management Buyer’s Guide to learn what else you can do to make on-call suck less for DevOps and IT.
