Effective URL: https://newrelic.com/blog/how-to-relic/performing-effective-root-cause-analysis?_gl=1*1urhtal*_up*MQ..&gclid=Cj0KCQjw...

PERFORMING EFFECTIVE ROOT CAUSE ANALYSIS

Published Mar 5, 2024 • 7 min read

By Munwar Mohammed

By Shamika Woods


WHAT IS ROOT CAUSE ANALYSIS?

Root cause analysis is a process for identifying, when adverse incidents arise,
the breakdowns in existing processes and systems that could be improved.

When effectively implemented, a root cause analysis can:

 * Identify factors that contributed to an adverse event or near-miss so that
   measures can be put in place to address them.
 * Prevent incidents from happening again in the future.
 * Improve customer experience.
 * Reduce the costs associated with risk.

Conducting a root cause analysis is an in-depth process. It consists of
identifying the teams involved in resolving the adverse event, reviewing the
notes taken during the incident, and reviewing data from the monitoring tools in
place to determine mean time to detect (MTTD) and mean time to resolve (MTTR).
The goal is to make the response more efficient and to minimize or eliminate
the risk of the event recurring.
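
As a rough illustration of the two metrics mentioned above, the following Python sketch computes MTTD and MTTR from a list of incident timestamps. The `Incident` record and the sample data are hypothetical, and MTTR is measured here from the start of the incident to its resolution; some teams measure it from the moment of detection instead.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical incident record; field names are assumptions."""
    started: datetime    # when the fault actually began
    detected: datetime   # when monitoring or a person noticed it
    resolved: datetime   # when service was restored


def mean_delta(deltas):
    """Average a non-empty list of timedelta values."""
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents):
    """Mean time to detect: average of (detected - started)."""
    return mean_delta([i.detected - i.started for i in incidents])


def mttr(incidents):
    """Mean time to resolve: average of (resolved - started)."""
    return mean_delta([i.resolved - i.started for i in incidents])


# Made-up sample data for illustration only.
incidents = [
    Incident(datetime(2024, 1, 10, 9, 0), datetime(2024, 1, 10, 9, 10),
             datetime(2024, 1, 10, 10, 0)),
    Incident(datetime(2024, 2, 3, 14, 0), datetime(2024, 2, 3, 14, 20),
             datetime(2024, 2, 3, 16, 0)),
]

print("MTTD:", mttd(incidents))  # MTTD: 0:15:00
print("MTTR:", mttr(incidents))  # MTTR: 1:30:00
```

Tracking these two numbers across incidents gives a simple way to check whether the corrective actions from earlier analyses are shortening detection and resolution times.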

Team involvement, sharing the details of the adverse event, and effective
communication all contribute to a positive outcome.


ROOT CAUSE ANALYSIS: NUTS AND BOLTS

Root cause analysis is a comprehensive process of investigation, assessment,
evaluation, and correction, all of which feature in the services that New Relic
provides. A root cause analysis typically moves through the following stages:

 * Identify the problem: The first step upon recognizing an issue is to define a
   problem statement and the symptoms (for example, a machinery malfunction, a
   failed or faulty process, or human error). Once that’s done, it’s important
   to isolate any suspected contributing factors to contain the problem while
   you try to uncover the root cause.
 * Collect data: Once the problem is identified, compile as much data as
   possible, including incident reports, evidence in the form of screenshots and
   logs, and interviews with anyone involved with the issue. Using this data,
   you can determine the sequence of events, especially any adverse events
   that led to the problem, as well as the systems that were involved, how long
   the problem lasted, and the overall impact.
 * Determine root cause: The root cause analysis team conducts a brainstorming
   session using techniques such as Fishbone diagrams, Pareto charts, and other
   tools to ascertain the root cause. The root cause analysis manager moderates
   the meeting, which should be collaborative and blameless.
 * Implement the solution: The root cause may point to one or more solutions,
   and the root cause analysis team has to determine which fix is best and when
   it should be delivered. Once the solution is implemented, it must be
   monitored to ensure it’s effective. This process is more formally called root
   cause corrective action.
 * Document actions: A critical part of root cause analysis is preventing the
   problem in question from reoccurring. Documenting the problem and its
   resolution so teams can reference it in the future is essential. The root
   cause analysis team can also include recommendations for physical or process
   improvements as well as preventative actions in the documentation.
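
The "collect data" stage above mentions determining the sequence of events. One minimal way to do that, sketched here in Python, is to merge timestamped entries from each source into a single ordered timeline. The sources, messages, and the `build_timeline` helper are all hypothetical and only illustrate the idea.

```python
from datetime import datetime

# Hypothetical events gathered from different sources during data collection.
# Each entry is (timestamp, source, message).
alert_events = [
    (datetime(2024, 3, 1, 10, 5), "alerts", "Error rate above threshold"),
]
deploy_events = [
    (datetime(2024, 3, 1, 9, 58), "deploys", "checkout-service v2 released"),
]
chat_notes = [
    (datetime(2024, 3, 1, 10, 12), "chat", "On-call engineer acknowledges page"),
    (datetime(2024, 3, 1, 10, 40), "chat", "Rollback completed, errors subside"),
]


def build_timeline(*sources):
    """Merge events from every source into one list ordered by timestamp."""
    merged = [event for source in sources for event in source]
    return sorted(merged, key=lambda event: event[0])


timeline = build_timeline(alert_events, deploy_events, chat_notes)
for timestamp, source, message in timeline:
    print(f"{timestamp:%H:%M} [{source}] {message}")

# Time between the first and last recorded event.
duration = timeline[-1][0] - timeline[0][0]
print(f"Span of recorded events: {duration}")
```

Seeing the deployment, the alert, and the responders' notes in one ordered list makes it easier to spot which event preceded the first symptom and how long the problem lasted.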


FIVE STEPS TO CONDUCT A ROOT CAUSE ANALYSIS

 1. Properly define the problem using SMART rules to ensure you have identified
    it correctly. The problem statement should be:
     * Specific
     * Measurable
     * Action-oriented
     * Realistic
     * Time-constrained
 2. Confirm the problem is accurately identified based on data and not
    perceptions.
 3. Take immediate action to resolve the problem temporarily.
 4. Find the underlying root cause of the problem and take corrective action to
    prevent the problem from recurring in the future.
 5. Record the corrective action in your standard procedures to prevent the
    problem from happening again.

A temporary fix is acceptable while you work out a permanent solution, as long
as you have a plan to resolve the problem at a later date. After completing a
root cause analysis, it’s important to hold a postmortem call. The call helps
prevent repeat incidents by bringing teams together to share lessons learned
and plan how to keep the problem from happening again.


ROOT CAUSE ANALYSIS FRAMEWORKS AND METHODOLOGIES

 1. Fishbone diagram: Also called a cause-and-effect diagram (or Ishikawa
    diagram, after its creator, Kaoru Ishikawa), a fishbone diagram is a visual
    method for root cause analysis that organizes cause-and-effect
    relationships into categories.
 2. Pareto analysis (also known as the “80-20 rule”): The Pareto principle
    states that roughly 80 percent of problems can be traced back to 20 percent
    of causes. Pareto analysis identifies the problem areas or tasks that will
    have the biggest payoff. Applied to errors during a root cause analysis, it
    helps identify the few causes responsible for most of the errors, which can
    then be targeted for correction or resolution. The following are the phases
    of Pareto analysis:
     * Phase I: Identification of causes of defects
     * Phase II: Collection of sample data
     * Phase III: Graphical representation of results
     * Phase IV: Interpreting the graphed results
 3. Five whys technique (also known as Gemba Gembutsu, a Japanese phrase meaning
    “place and information”): The five whys is a simple problem-solving
    technique that helps to get to the root of a problem quickly. The strategy
    involves looking at any problem and drilling down by asking "Why?" or "What
    caused this problem?" While you want clear and concise answers, you want to
    avoid answers that are too simple and overlook important details. Start
    with the problem statement and ask why it occurred. Typically, the answer
    to the first "why" should prompt another "why," the answer to the second
    "why" will prompt another, and so on. Repeat the steps until you have asked
    at least five whys, hence the name “five whys.”
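
The four phases of Pareto analysis described above can be sketched in a few lines of Python. The error categories and counts are made up for illustration, the helper names `pareto_table` and `vital_few` are hypothetical, and the text bar chart stands in for the graphical representation in Phase III.

```python
from collections import Counter

# Phases I and II: hypothetical error causes sampled from incident records.
observed_errors = (
    ["timeout"] * 50 + ["null reference"] * 30 + ["bad config"] * 10
    + ["disk full"] * 6 + ["expired cert"] * 4
)


def pareto_table(samples):
    """Return (cause, count, cumulative percent) rows, most frequent first."""
    counts = Counter(samples)
    total = sum(counts.values())
    rows, running = [], 0
    for cause, count in counts.most_common():
        running += count
        rows.append((cause, count, 100 * running / total))
    return rows


def vital_few(rows, threshold=80.0):
    """Return the leading causes needed to reach the cumulative threshold."""
    selected = []
    for cause, _count, cumulative in rows:
        selected.append(cause)
        if cumulative >= threshold:
            break
    return selected


rows = pareto_table(observed_errors)

# Phase III: a text stand-in for the Pareto chart.
for cause, count, cumulative in rows:
    print(f"{cause:<15} {'#' * (count // 2):<25} {cumulative:5.1f}%")

# Phase IV: interpret the results by picking the causes to target first.
print(vital_few(rows))  # ['timeout', 'null reference']
```

In this made-up sample, two of the five causes account for 80 percent of the errors, so they would be the first candidates for corrective action.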


ROOT CAUSE ANALYSIS GOTCHAS: LESSONS LEARNED

While every root cause analysis is unique and the approach and techniques used
to arrive at a resolution vary, a common set of best practices and lessons can
be derived from the outcomes. The following are some of the pitfalls to avoid,
and practices to follow, along the way:

 * Incomplete or insufficient definition of the problem statement.
 * Focusing on the wrong things or signals. Think long-term and avoid a
   near-sighted approach.
 * Stopping at the first sign of a symptom or cause instead of digging deeper
   or exploring other possibilities.
 * Pick your battles. Not everything can be investigated right away. Focus on
   high-impact, high-consequence incidents.
 * Get relevant teams involved and start a root cause analysis as quickly as
   possible to prevent critical data loss or overlap with competing priorities.
 * Data gathering for root cause analysis is difficult and time consuming.
   Taking shortcuts or guessing instead of gathering evidence makes the process
   ineffective or leads to incorrect conclusions. The collaboration features in
   New Relic help break down data silos and enable teams to look at data in the
   context of other platform UI experiences (incidents, alerts, and dashboards).
 * Summarize findings and corrections, build an executive report with
   recommendations and share with the broader organization.
 * Track recommendations to completion. The ultimate goal of an effective root
   cause analysis is to make sure that incidents are never repeated. Most
   recommendations are long-term and will require significant changes or course
   correction. Tracking progress and holding management accountable with
   frequent status updates and communication is an effective way to accomplish
   this goal.


BEST PRACTICES: POST-INCIDENT IMPROVEMENTS

To make the root cause analysis process easier, here are a few things to
remember:

 * If multiple teams deploy simultaneously, keep detailed logs of what each
   team is doing and when, so that you can trace the root cause easily. If it’s
   hard to keep track of multiple deployments, consider staggering them at
   different times. The New Relic change tracking feature allows you to capture
   deployment changes in any part of the system and use them to contextualize
   performance data and help resolve issues more quickly.
 * After the incident, make sure to have a postmortem call to discuss the
   incident in addition to what you can implement as a long-term solution to
   prevent it from happening again. Also, make sure to update your standard
   operating procedures so that the operations team can know what to do if the
   same issue happens again.
 * Continuously review your real-time notifications and active alerts. Turn off
   those that are not important so that you do not miss an important alert when
   there is a critical issue.
 * Understand, and keep training on, the monitoring tools you use so that you
   can resolve incidents quickly.
 * Provide a central channel for team communication during incidents whether
   it’s Slack, Zoom, or a bridge call.
 * Use custom dashboards to track adverse events over time.
 * Despite your best efforts, a root cause analysis will not be effective if
   upper management isn’t committed to implementing the corrective action
   process or doesn’t take it seriously.
 * The keys to a successful root cause analysis are focusing on continuous
   improvement, sharing what you’ve learned from the incident with the broader
   organization, learning from past mistakes, and running exercises with other
   teams on how to respond if the incident happens again.
 * No one is to blame. Root cause analysis is a process in which multiple teams
   come together and communicate to define a process that prevents the issue
   from recurring.
 * New Relic is an all-in-one observability platform that eliminates blind
   spots, removes team/data silos, and provides complete visibility into your
   tech stack in a single unified experience that ultimately helps customers
   solve interesting business and technical challenges more effectively.
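
The first practice above, keeping a log of who deployed what and when, pays off when you need to line deployments up against the start of an incident. The Python sketch below shows that correlation on a hypothetical in-memory deployment log. It does not use the New Relic change tracking API, and the `Deployment` record, the sample data, and the one-hour lookback window are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Deployment:
    """Hypothetical deployment log entry; field names are assumptions."""
    team: str
    service: str
    deployed_at: datetime


def suspects(deployments, incident_start, lookback=timedelta(hours=1)):
    """Deployments inside the lookback window before the incident, newest first."""
    window_start = incident_start - lookback
    recent = [
        d for d in deployments
        if window_start <= d.deployed_at <= incident_start
    ]
    return sorted(recent, key=lambda d: d.deployed_at, reverse=True)


# Made-up deployment log for illustration only.
deployment_log = [
    Deployment("payments", "checkout-api", datetime(2024, 3, 1, 8, 0)),
    Deployment("search", "search-indexer", datetime(2024, 3, 1, 9, 40)),
    Deployment("payments", "billing-worker", datetime(2024, 3, 1, 9, 55)),
]
incident_start = datetime(2024, 3, 1, 10, 5)

for d in suspects(deployment_log, incident_start):
    print(d.deployed_at.strftime("%H:%M"), d.team, d.service)
```

A deployment that appears in the window is only a candidate, not a confirmed cause; the listing simply narrows down where to look first.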


CONCLUSION

When organizations build, deploy, and run high-performance, high-throughput
systems, there’s always a chance of failures in computing workloads. What’s
really critical is to have pre-formulated strategies to handle such breakdowns
when they occur and to restore operations quickly. Utilizing the best practices
and lessons learned, coupled with full-stack observability tools like New
Relic, will enable you to drive faster resolution, continue to accelerate
operational efficiency, and prevent incidents from recurring.


REFERENCES

 * New Relic Documentation: Troubleshoot slow application performance
 * New Relic Documentation: Improve your website’s performance with New Relic


--------------------------------------------------------------------------------

By Munwar Mohammed

Munwar Mohammed is a principal technical account manager at New Relic who
specializes in solution architecture, customer adoption, and sales engineering
across different industry domains. As a trusted advisor and a strategic partner
to mid-large enterprise customers, Munwar is accountable for driving the
overall technical vision and empowering customers to achieve positive business
outcomes around full stack observability for modern cloud architecture
environments.

By Shamika Woods

Shamika Woods is a technical account manager for the central enterprise region.
She is a member of multiple ERGs (Relics of Color, NR Veterans, and Women at
NR) and loves to volunteer and assist others. She has worked in the IT industry
for over 20 years. Prior to joining New Relic, Shamika was an information
systems technician (IT) in the US Navy and a contractor with the Department of
Defense (DoD) and the Centers for Medicare & Medicaid Services (CMS), and she
supported Healthcare.gov as a site reliability engineer (SRE). She’s driven to
help her customers maximize their observability while using New Relic and to
make sure they gain knowledge and value with each product and feature.

The views expressed on this blog are those of the author and do not necessarily
reflect the views of New Relic. Any solutions offered by the author are
environment-specific and not part of the commercial solutions or support offered
by New Relic. Please join us exclusively at the Explorers Hub
(discuss.newrelic.com) for questions and support related to this blog post. This
blog may contain links to content on third-party sites. By providing such links,
New Relic does not adopt, guarantee, approve or endorse the information, views
or products available on such sites.
