Showing Chaos Engineering Posts

What to Make of SRE's Golden Signals

Effective site reliability engineering (SRE) relies on a deep understanding of a service’s underlying infrastructure and architecture. Improving the visibility into application and infrastructure health...

September Roundup: A Game Day Recap

Preparation is the key to effective on-call management and faster incident remediation. From our State of On-Call Report, we found that incident response, on average,...

Simulators and Validators for SRE and Chaos Engineering

Common Gaps in SRE: At its core, SRE is an engineer’s approach to improving operational system reliability via a path that includes, unsurprisingly, even more...

How Workiva Built a Culture of DevOps and SRE

Creating a DevOps environment of collaboration, code ownership, and accountability inherently helps teams build on SRE efforts. We spoke with Mike, an SRE Manager at...

Understated Downtime Costs

I’m not completely sure everyone knows the real costs of downtime—and it’s a helluva number… In fact, DevOps.com conducted a study showing that Fortune 1000...

VictorOps' First Leap Into Chaos Engineering Tactics

We decided to embark on a journey to make our systems more reliable by creating intentional chaos. Our team developed the SRE Council, made up...

SRE Is a Behavior, Not a Dedicated Role

VictorOps, like many startups, has gone through major growth in the last couple of years. New teammates, new customers, and a maturing organization have all demanded...

