www.gremlin.com
2a03:b0c0:3:d0::d25:d001  Public Scan

Submitted URL: https://r20.rs6.net/tn.jsp?f=001BRRcEaWN7DnAU1kH85DuU7Wtg8lpd5oRM5BjvaOXyaui1qRVweKJyGbCrKUCOHVIcqRRwvenChDH_pD_bE4u...
Effective URL: https://www.gremlin.com//blog/the-state-of-chaos-engineering-in-2021/
Submission: On February 16 via api from US — Scanned from DE

Form analysis 1 forms found in the DOM

#

<form action="#"><input name="bot-field" value="" autocomplete="new-password" class="css-qvutmh">
  <div class="css-99phxe">
    <div>
      <div class="CK__FormGroup">
        <div class="CK__ControlWrapper css-w05ues">
          <div class="CK__ControlWrapper css-19o983o">First name</div><input type="text" class="CK__Input css-1s59yep" name="first_name" id="first_name-251adac8-9704-4035-bd40-c463b7f19e87" value="">
        </div>
      </div>
    </div>
    <div>
      <div class="CK__FormGroup">
        <div class="CK__ControlWrapper css-w05ues">
          <div class="CK__ControlWrapper css-19o983o">Last name</div><input type="text" class="CK__Input css-1s59yep" name="last_name" id="last_name-23b45c34-1088-471c-9f0e-30106f4ad5c2" value="">
        </div>
      </div>
    </div>
    <div>
      <div class="CK__FormGroup">
        <div class="CK__ControlWrapper css-w05ues">
          <div class="CK__ControlWrapper css-19o983o">Email</div><input type="email" class="CK__Input css-1s59yep" name="admin_email" id="admin_email-39b7713a-f026-4599-9879-9711e1226816" value="">
        </div>
      </div>
    </div>
    <div class="css-mmnj2q">
      <div class="css-mbkma0"><button class="CK__Button CK__Button--primary css-1l0vhvo" type="submit"><span class="css-0">Sign
            Up</span></button><a data-link-external="true" target="_blank" rel="noopener noreferrer nofollow" href="https://app.gremlin.com" class="CK__Button CK__Button--default css-qb7ep2"><span class="css-0">Log in</span></a></div>
    </div>
  </div>
</form>

Text Content


January 26, 2021 - 4 min read


THE STATE OF CHAOS ENGINEERING IN 2021



Aileen Horgan
VP of Marketing


Five years ago today, our co-founders launched Gremlin with a simple but bold
mission: Build a more reliable internet. Over the past five years, the practice
of Chaos Engineering has been increasingly employed as a means of proactively testing
systems to make them more resilient and reliable. Chaos Engineering has reached
mainstream media (with articles in Politico and Bloomberg), community
conferences on the subject have grown from a few hundred attendees to 3,500+
registrants, and Gremlin’s own user base has been actively testing systems,
having executed nearly half a million Chaos Engineering attacks. 

All of this interest got us curious: who is actually practicing Chaos
Engineering, what techniques are they using, what problems are they solving, and
what are they seeing as a result? To find out, we surveyed the community and
examined Gremlin platform data, resulting in the first State of Chaos
Engineering Report. While the report confirms the growing interest in Chaos
Engineering, we also observed that the nature of software failure modes has
shifted. To better support teams of all sizes in their efforts to prepare their
systems for a wide variety of failure scenarios, we are making the entire
Gremlin attack library available to users of Gremlin Free.


STATE OF CHAOS ENGINEERING REPORT

With more than 500 responses, primarily from software and site reliability
engineers, we identified the ways in which these roles use Chaos Engineering to
improve the reliability and resilience of their systems. 

The top benefits of Chaos Engineering? Increased availability and decreased mean
time to resolution (MTTR). In fact, teams who frequently run Chaos Engineering
experiments were more likely to have >99.9% availability—an absolutely
impressive feat. 23% of teams have an MTTR of under 1 hour, and over 60% of
teams have an MTTR of under 12 hours. Not only is Chaos Engineering keeping
services up and running, it's creating more informed dev and ops teams who are
better equipped to respond to incidents when they do happen.
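
To put those figures in concrete terms, the short sketch below converts an availability percentage into a yearly downtime budget and computes MTTR as the mean of incident durations. It is an illustration only: the function names and the sample incident durations are hypothetical and do not come from the report.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_budget_hours(availability_pct: float,
                          period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the maximum downtime (in hours) allowed over the period."""
    return period_hours * (1 - availability_pct / 100)


def mean_time_to_resolution(incident_durations_hours: list[float]) -> float:
    """Return MTTR as the arithmetic mean of incident durations (in hours)."""
    if not incident_durations_hours:
        raise ValueError("at least one incident is required")
    return sum(incident_durations_hours) / len(incident_durations_hours)


if __name__ == "__main__":
    # 99.9% availability leaves roughly 8.76 hours of downtime per year.
    print(f"{downtime_budget_hours(99.9):.2f} hours/year at 99.9%")
    # Hypothetical incident durations, in hours.
    print(f"MTTR: {mean_time_to_resolution([0.5, 2.0, 0.5]):.2f} hours")
```

By this arithmetic, a team whose incidents each take several hours to resolve can use up a 99.9% yearly budget in only a few incidents, which is why availability and MTTR tend to be tracked together.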

> Engineering teams across the globe use Chaos Engineering to intentionally
> inject harm into their systems, monitor the impact, and fix failures before
> they negatively impact customer experiences. The State of Chaos Engineering
> Report confirmed that in doing so, they avoid costly outages while reducing
> MTTR and MTTD, prepare their teams for the unknown, and protect the customer
> experience.

Kolton Andrus
CEO, Gremlin

However, the fear of testing in production is real; only 34% of respondents run
Chaos Engineering experiments in production. Dev and staging are much more
common environments for running attacks. Proactive testing in lower environments
can provide better confidence in the stability of services without directly
impacting customer experiences. While Gremlin has continued to advocate for
testing across all environments, and we expect the percentage of respondents
testing in production to increase over time, we also recognize that not all
critical services can be experimented on in production—emergency response
systems and autonomous vehicles come to mind. It’s important to consider a
system’s function and impact on customer experiences.

The report also uncovers trends in technology environments and looks to the
future to explore where Chaos Engineering is heading.

Get your copy of the State of Chaos Engineering report.


UNLEASH ALL THE GREMLINS, FOR FREE

As interest in Chaos Engineering continues to grow, we want to provide teams of
all sizes with a safe, secure, and scalable platform to run experiments. When we
launched Gremlin Free, teams were able to run CPU and shutdown attacks, and
shortly after we added blackhole attacks. We believed that offering access to
these three attack types helped teams get started with the foundations of
running Chaos Engineering experiments, but it also prevented teams from
identifying vulnerabilities across their entire stack. 

So much of our world has shifted online over the past 12 months and as a result,
significant increases in network traffic are a new reality for businesses in
just about every industry. Running latency attacks helps engineers ensure
customer experience continuity, and improves the overall reliability of
services. Latency attacks are the second most popular attack type for Gremlin paid
customers, following closely behind blackhole attacks. A latency attack allows
users to intentionally slow down network requests and observe how this affects
response time, page load time, application stability, and ultimately the
customer experience. 
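
The idea behind a latency attack can be sketched in a few lines of Python. This is not Gremlin's implementation or API, which injects latency at the network level. It is a minimal in-process illustration, assuming a hypothetical `fetch_profile` call, that wraps the call with an artificial delay and measures the effect on response time.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def with_latency(call: Callable[[], T], delay_seconds: float) -> Callable[[], T]:
    """Wrap a call so that it waits delay_seconds before running."""
    def delayed() -> T:
        time.sleep(delay_seconds)  # simulated network slowdown
        return call()
    return delayed


def timed(call: Callable[[], T]) -> Tuple[T, float]:
    """Run a call and return its result with the elapsed time in seconds."""
    start = time.perf_counter()
    result = call()
    return result, time.perf_counter() - start


def fetch_profile() -> str:
    """Hypothetical stand-in for a request to a downstream service."""
    return "profile-data"


if __name__ == "__main__":
    _, baseline = timed(fetch_profile)
    _, degraded = timed(with_latency(fetch_profile, 0.2))
    print(f"baseline: {baseline * 1000:.1f} ms, with latency: {degraded * 1000:.1f} ms")
```

Comparing the two timings against a timeout or page-load budget shows whether the caller degrades gracefully or fails once the dependency slows down.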

Starting today, all Gremlin Free users will have access to the full library of
attack types. They can now test for memory leaks, disks filling up, latency,
process killers, blocked DNS access, and more. In doing so, we’ve also opened up
all of our Recommended Scenarios, allowing users to easily run Chaos Engineering
experiments based on common real-world outages.




COMMUNITY AND CHAOS CHAMPIONS

A few years ago, we launched a Slack community to connect Chaos Engineering
practitioners, learn best practices, find mentorship, and build reliable
systems, together. Now, that community has nearly 7,000 members with more
joining each day. And it’s not just the Slack community that’s growing. Last
year, the world’s largest Chaos Engineering conference, Chaos Conf, saw a 440%
YoY increase in registrations, and Gremlin’s Chaos Engineering Bootcamps had
close to 2,000 registrants for these free, hands-on workshops.  

In October 2020, we announced the Gremlin Chaos Champions program to recognize
the work practitioners were doing for their teams, community, and the Chaos
Engineering field at large. The freshman class consisted of four Chaos
Champions, and today we’re excited to expand that group. Please meet the newest
members of the Gremlin Chaos Champion program!

> Chaos Engineering has allowed us to finally test long-held assumptions about
> our services and enables us to continuously build more reliable
> infrastructure.

Doug Campbell
SRE, Grubhub

> Chaos can lurk just behind the facade of order. For me, chaos engineering has
> been an amazing journey for discovering that reliability is based on thousands
> of tiny failures, and that success of resilience is based on how many times we
> have failed at something.

Yury Niño
SRE, Aval Digital Labs

> Chaos Engineering has helped me a lot in getting a better understanding of our
> systems and working towards making them resilient. Sometimes chaos is the only
> way to achieve stability.

Saurabh Shendye
Senior SRE, Shipt

If you know someone who deserves to be recognized for their efforts leading
Chaos Engineering in their organization with Gremlin, nominate them today for
the Gremlin Chaos Champion Program.

We’re excited to see how the practice of Chaos Engineering and the larger
reliability space evolve in 2021. We expect to see the global theme of
‘resilience’ continue to lead our thinking, as teams across all sectors and
industries look for ways to make their organizations, teams, and systems more
resilient and better equipped to handle the unexpected.


Categories
People and Culture, Industry, Announcements

--------------------------------------------------------------------------------

© 2022 Gremlin Inc. San Jose, CA 95113