
Your Open Source Tool Belt: Chaos Testing

Marlo Vernon December 21, 2018


A couple of years ago, when Amazon announced its famous Prime Day sale, website traffic suddenly increased by 28%, amounting to 73.8 million visitors (Digital Commerce). As a result, a resource on the homepage took a long time to load, which delayed other pages and slowed the website down for nearly an hour. Apart from the obvious financial loss, failing to meet peak load during the sale was a significant reputational loss for Amazon. As this shows, even large, reputable organizations like Amazon need a plan for rapid incident response and remediation, and so do other collaborative DevOps teams.

According to research by Neumob, slow performance is the leading cause of shopping cart abandonment. In fact, if a product image doesn’t load in a mobile app, more than 47% of users are likely to exit the app and buy from a different source. Even if only a quarter of the users who abandoned their shopping carts eventually go to competitors, your company still encounters a net loss of 29% of its customers.

So, it’s imperative to perform load and stress testing to overcome these challenges and build resilient systems that can handle such overwhelming traffic. Stress testing checks the reliability, speed, scalability, effectiveness, and interoperability of your system. The tests help you examine how your system behaves under extreme conditions and help ensure that your service doesn’t crash during critical situations. For many products, performance is among the most crucial features.

When stress testing or load testing is performed at the end of the development cycle, developers often get little to no time to make necessary changes, potentially causing a delay in the final release. In a DevOps environment, new features and functions are added at a much quicker pace. It’s possible that a new feature or service passes all the automated tests, and gets deployed to the server within minutes. But, if the code is not optimized to manage multiple concurrent users, it could result in system failure. By integrating a stress testing tool with continuous integration testing, you can detect performance issues much earlier, helping you provide a robust platform for your users.
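One way to picture the continuous integration hook described above is a small gate that compares load test results against agreed limits. The sketch below uses only the Python standard library; the function names, the 200 ms budget, the 1% error limit, and the sample latencies are illustrative assumptions, and in practice the numbers would come from whichever load testing tool you run in the pipeline.

```python
import statistics


def p95(latencies_ms):
    """95th percentile of the observed latencies, without extrapolation."""
    return statistics.quantiles(latencies_ms, n=100, method="inclusive")[94]


def performance_gate(latencies_ms, failures, p95_budget_ms, max_error_rate):
    """Return a list of violated limits; an empty list means the build passes."""
    problems = []
    observed = p95(latencies_ms)
    if observed > p95_budget_ms:
        problems.append(f"p95 {observed:.0f} ms exceeds budget {p95_budget_ms} ms")
    error_rate = failures / len(latencies_ms)
    if error_rate > max_error_rate:
        problems.append(f"error rate {error_rate:.1%} exceeds {max_error_rate:.1%}")
    return problems


if __name__ == "__main__":
    # Made-up latencies standing in for the output of a real load test run.
    samples = [120, 135, 110, 480, 150, 140, 125, 130, 160, 145]
    violations = performance_gate(
        samples, failures=0, p95_budget_ms=200, max_error_rate=0.01
    )
    for line in violations:
        print("FAIL:", line)
    print("gate passed" if not violations else "gate failed")
```

A CI step could turn a non-empty result into a non-zero exit code so that the build stops before a slow change reaches production.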

You can find a number of free, paid, or “freemium” options for load testing and stress testing tools on the market. But which tool should you choose? In this blog post, we’ve collated a list of open source tools to help you make the decision.

The Tools

1) Apache JMeter

Apache JMeter is a pure Java open-source application designed to load test and stress test functional behavior of your system and measure performance. The application was originally built to test web applications but has since extended to other test functions.

JMeter can help you test performance on both static and dynamic resources. You can also simulate a heavy load on your server, network, or object to assess its strength and analyze the overall performance under various load types.

As one of the most popular tools in the industry, Apache JMeter allows you to test the performance and functionality of web applications and comes along with detailed reports. However, it can be challenging to scale JMeter to large-scale testing across multiple machines.

In a DevOps environment, performance engineering skills are required. Teams need someone who can update a test whenever a change to the application breaks the existing one. However, DevOps teams often include engineers who aren’t able to write a new test script. These engineers need a tool that’s easy to use and configure. JMeter doesn’t require coding: it can record a script from your browser once you change the browser’s proxy settings, and then run it. The tool doesn’t need state-of-the-art infrastructure for stress testing and can support several load injectors managed by a single controller.
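JMeter can also run a recorded test plan without its GUI, for example with `jmeter -n -t plan.jmx -l results.jtl`, where `plan.jmx` and `results.jtl` are placeholder file names. Assuming the results file uses JMeter's CSV format with a header row, a short Python sketch like the one below could summarize it per request label. The sample rows are made up, and only the columns the script reads are shown.

```python
import csv
import io
from collections import defaultdict

# Made-up sample shaped like a JMeter CSV results file (.jtl).
# Real files carry more columns; only the ones used here are shown.
SAMPLE_JTL = """timeStamp,elapsed,label,responseCode,success
1545350400000,120,Home,200,true
1545350400150,340,Home,200,true
1545350400300,95,Search,200,true
1545350400450,2100,Search,500,false
"""


def summarize(jtl_text):
    """Group samples by label; report count, mean elapsed ms, and error rate."""
    elapsed = defaultdict(list)
    errors = defaultdict(int)
    for row in csv.DictReader(io.StringIO(jtl_text)):
        label = row["label"]
        elapsed[label].append(int(row["elapsed"]))
        if row["success"].strip().lower() != "true":
            errors[label] += 1
    return {
        label: {
            "samples": len(times),
            "mean_ms": sum(times) / len(times),
            "error_rate": errors[label] / len(times),
        }
        for label, times in elapsed.items()
    }


if __name__ == "__main__":
    for label, stats in summarize(SAMPLE_JTL).items():
        print(label, stats)
```

To use it on a real run, read the contents of the results file and pass the text to `summarize`.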

2) Grinder

Grinder is a Java load testing framework that helps you perform stress testing, functional testing, capacity testing, and reliability testing. The tool runs on any platform and operating system, and can load test anything that has a Java API.

Grinder supports multiple technologies like HTTPClient, Jython, Apache XMLBeans, JEdit Syntax, Clojure, Standards, PicoContainer, etc. This can help your DevOps teams adequately test their applications since they keep adding new features developed on the latest technologies.

You can test web applications by recording a script from your browser through a proxy. You can also import the test data into a spreadsheet or another analysis tool. Grinder handles cookies and manages client connections for test contexts. Its graphical console helps you monitor and control multiple load injectors. Working in a DevOps environment is all about collaboration, and Grinder allows you to centralize script editing and distribution. It provides a common view of application reliability and speed at every stage of the development lifecycle.

3) Gatling

Gatling is a highly capable stress/load testing tool, designed to test and measure your application’s end-to-end performance. If you’re looking for a quick, easy-to-use performance testing tool, you should consider Gatling. The tool is built on Scala, Akka, and Netty. Gatling’s code-like scripting allows you to easily manage testing scenarios and automate them in your continuous delivery pipeline.

Gatling is well suited to testing system reliability. With only a few machines, you can run synthetic tests, simulate thousands of requests every second against your web application, and get high-precision metrics. Gatling works with any browser and on any operating system.

To develop reliable applications, you need to understand how location affects application performance. Gatling can automatically generate load from cloud servers around the world, so your DevOps team does not need to set up the tool on multiple servers, which saves time and money. Gatling allows you to execute test cases in different clouds; however, it does not distribute load across multiple machines.

4) Locust

Locust is a Python-based, open source, distributed user load testing tool. It is extensible and a great tool for testing APIs. You can use Locust to find out how many users your system can handle at the same time, and it lets you create hundreds of thousands of virtual users. You write test scripts in code, and Locust sends multiple virtual users to your website (or system) to run them. It offers a web-based user interface that shows your load test results in real time.

Web applications and web-based services are the main targets of the tool. However, if you’re comfortable with Python, you can test anything you want.
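The virtual-user idea can be illustrated without Locust itself. The sketch below uses only the Python standard library and is not Locust's API: `fake_endpoint`, `VirtualUser`, `swarm`, and the timings are hypothetical names and values chosen for illustration. Each simulated user calls the target, pauses briefly, and repeats.

```python
import random
import threading
import time


def fake_endpoint():
    """Hypothetical target; a real test would issue an HTTP request here."""
    time.sleep(0.001)
    return 200


class VirtualUser(threading.Thread):
    """One simulated user that calls the target, then pauses, in a loop."""

    def __init__(self, iterations, results, lock):
        super().__init__()
        self.iterations = iterations
        self.results = results
        self.lock = lock

    def run(self):
        for _ in range(self.iterations):
            start = time.perf_counter()
            status = fake_endpoint()
            duration = time.perf_counter() - start
            with self.lock:
                self.results.append((status, duration))
            # Short random pause between requests, like a user's think time.
            time.sleep(random.uniform(0.001, 0.003))


def swarm(user_count, iterations):
    """Start user_count virtual users and wait for all of them to finish."""
    results = []
    lock = threading.Lock()
    users = [VirtualUser(iterations, results, lock) for _ in range(user_count)]
    for user in users:
        user.start()
    for user in users:
        user.join()
    return results


if __name__ == "__main__":
    outcome = swarm(user_count=20, iterations=5)
    print(f"{len(outcome)} requests completed")
```

A dedicated tool adds what this sketch leaves out, such as distributing users across machines and reporting results while the test runs.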

5) Tsung

Tsung is another open source, multi-protocol distributed stress testing tool. It helps you monitor CPU usage, memory usage, and network traffic. The tool is developed in Erlang and is protocol-independent. Currently, it can be used to stress AMQP, MQTT, HTTP, WebDAV, MySQL, LDAP, SOAP, PostgreSQL, and XMPP/Jabber servers, helping your DevOps team test new features and functionality that are continuously developed on the latest technologies.

The main strength of Tsung lies in its ability to simulate a large number of concurrent users from a single machine. You can also distribute the users across a cluster of machines. It doesn’t provide a GUI for test development or execution, so you’ll have to use shell scripts.

6) Taurus

Taurus is an open source automation tool that offers an easy way to create, run, and analyze stress tests. Taurus allows you to perform chaos/load testing on a specific piece of code while it’s still in the development phase. The tool works independently as well as in conjunction with other open source stress testing tools, adding to their functionality.

The advantage of working with Taurus is that you write test cases in YAML, so a test is described in a plain text file. The reports are displayed directly within the application. Taurus can also integrate with Jenkins, a continuous integration tool, helping you automatically trigger a performance test with every build, quickly review the test results, and share instant feedback with the development team.
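To give a feel for what such a plain text test description contains, the Python sketch below builds a small Taurus-style configuration and prints it as JSON. It assumes that your Taurus version accepts JSON configuration files alongside YAML, and the URL, scenario name, user count, and durations are placeholder values. The keys (`execution`, `concurrency`, `ramp-up`, `hold-for`, `scenario`, `scenarios`, `requests`) follow the basic layout of a Taurus configuration.

```python
import json


def build_taurus_config(url, users, ramp_up, hold_for, scenario_name="smoke"):
    """Describe one scenario that sends GET requests to a single URL."""
    return {
        "execution": [
            {
                "concurrency": users,   # number of simulated users
                "ramp-up": ramp_up,     # time taken to reach full concurrency
                "hold-for": hold_for,   # time to stay at full concurrency
                "scenario": scenario_name,
            }
        ],
        "scenarios": {
            scenario_name: {"requests": [url]},
        },
    }


if __name__ == "__main__":
    # Placeholder target and load shape, chosen only for illustration.
    config = build_taurus_config("http://localhost:8080/", 50, "1m", "5m")
    print(json.dumps(config, indent=2))
```

Saving the output to a file such as `smoke.json` and passing it to the `bzt` command would start the run, and a Jenkins job could execute the same command on every build.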


Stress testing is required to ensure application reliability and strong performance under extreme conditions, providing your customers with a great user experience. It’s critical to identify the potential breaking points of your system and fix them before they become expensive for your business.

Various tools are available to simulate thousands of users, analyze your application’s performance, and monitor your servers in critical situations. By intentionally injecting chaos and stress into your systems, you can identify weaknesses and build reliability into your infrastructure and applications, which is likely to head off unplanned incidents in your production systems later on.

Leverage an incident management solution with your chaos engineering tools and practices to build resilient architecture and improve team collaboration. Try your own 14-day free trial of VictorOps to make on-call suck less and build highly reliable services.
