newrelic.com
151.101.194.217
Public Scan
Submitted URL: https://groove.grvlnk1.com/url/TsV41XeTGSWXAer5IA_qAgAs5fQ/aHR0cHM6Ly9uZXdyZWxpYy5jb20vYmxvZy9ob3ctdG8tcmVsaWMvcGVyZm9ybWlu...
Effective URL: https://newrelic.com/blog/how-to-relic/performing-effective-root-cause-analysis?_gl=1*1urhtal*_up*MQ..&gclid=Cj0KCQjw...
Submission: On October 31 via api from US — Scanned from DE
Text Content
APM

PERFORMING EFFECTIVE ROOT CAUSE ANALYSIS

Published Mar 5, 2024 • 7 min read
By Munwar Mohammed and Shamika Woods

WHAT IS ROOT CAUSE ANALYSIS?

Root cause analysis is a process for identifying the breakdowns in existing processes and systems that contribute to adverse incidents, so that those processes and systems can be improved. When effectively implemented, a root cause analysis can:

* Identify the factors that contributed to an adverse event or near miss so that measures can be put in place to address them.
* Prevent incidents from happening again.
* Improve customer experience.
* Reduce the costs associated with risk.

Conducting a root cause analysis is an in-depth process. It involves bringing together the teams that helped resolve the adverse event, reviewing the notes taken during the incident, and reviewing data from the monitoring tools in place to establish mean time to detect (MTTD) and mean time to resolve (MTTR). The goal is to make the response more efficient and to minimize or eliminate the risk of the incident recurring. Team involvement, shared details of the adverse event, and effective communication all contribute to a positive outcome.
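
As a rough illustration of how MTTD and MTTR might be derived from incident records, here is a minimal Python sketch. The `Incident` fields and the sample timestamps are hypothetical, and MTTR is assumed here to be measured from the start of the incident; some teams measure it from the moment of detection instead.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    # Hypothetical record layout; adapt to whatever your incident tool exports.
    started: datetime    # when the fault began
    detected: datetime   # when monitoring or a person noticed it
    resolved: datetime   # when service was restored


def mean_delta(deltas):
    """Average a non-empty list of timedelta values."""
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents):
    """Mean time to detect: average of (detected - started)."""
    return mean_delta([i.detected - i.started for i in incidents])


def mttr(incidents):
    """Mean time to resolve: average of (resolved - started), by assumption."""
    return mean_delta([i.resolved - i.started for i in incidents])


# Made-up sample data for illustration only.
incidents = [
    Incident(datetime(2024, 3, 1, 10, 0), datetime(2024, 3, 1, 10, 10), datetime(2024, 3, 1, 11, 0)),
    Incident(datetime(2024, 3, 2, 14, 0), datetime(2024, 3, 2, 14, 20), datetime(2024, 3, 2, 14, 40)),
]

print(mttd(incidents))  # 0:15:00
print(mttr(incidents))  # 0:50:00
```

Tracking these two averages across incidents gives the review team a baseline against which later process changes can be compared.
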

ROOT CAUSE ANALYSIS: NUTS AND BOLTS

Root cause analysis combines comprehensive investigation, assessment, evaluation, and correction, all of which feature in the services that New Relic provides. Many investigative procedures can be used to carry out a root cause analysis. They generally follow these steps:

* Identify the problem: The first step upon recognizing an issue is to define a problem statement and the symptoms (for example, a machinery malfunction, a failed or faulty process, or human error). Once that's done, isolate any suspected contributing factors to contain the problem while you try to uncover the root cause.
* Collect data: Once the problem is identified, compile as much data as possible, including incident reports, evidence in the form of screenshots and logs, and interviews with anyone involved with the issue. Using this data, you can determine the sequence of events (especially any adverse events that led to the problem), the systems that were involved, how long the problem lasted, and the overall impact.
* Determine root cause: The root cause analysis team conducts a brainstorming session using techniques such as fishbone diagrams and Pareto charts to ascertain the root cause. The root cause analysis manager moderates the meeting, which should be collaborative and blameless.
* Implement the solution: The root cause may point to one or more solutions, and the root cause analysis team has to determine which fix is best and when it should be delivered. Once the solution is implemented, it must be monitored to ensure it's effective. This process is more formally called root cause corrective action.
* Document actions: A critical part of root cause analysis is preventing the problem from recurring. Documenting the problem and its resolution so that teams can reference it in the future is essential. The root cause analysis team can also include recommendations for physical or process improvements, as well as preventive actions, in the documentation.

FIVE STEPS TO CONDUCT A ROOT CAUSE ANALYSIS

1. Properly define the problem using SMART rules to ensure you have identified it correctly. The problem definition should be:
   * Specific
   * Measurable
   * Action-oriented
   * Realistic
   * Time-constrained
2. Confirm that the problem is accurately identified based on data, not perceptions.
3. Take immediate action to resolve the problem temporarily.
4. Find the underlying root cause of the problem and take corrective action to prevent it from recurring.
5. Record the corrective action in your standard procedures to prevent the problem from happening again.

A temporary fix is reasonable while you work out a permanent solution, as long as you have a plan for resolving the problem at a later date.

After completing a root cause analysis, hold a postmortem call. Bringing teams together to share lessons learned and plan how to prevent a recurrence helps mitigate repeat incidents.

ROOT CAUSE ANALYSIS FRAMEWORKS AND METHODOLOGIES

1. Fishbone diagram: Also called a cause-and-effect diagram (or Ishikawa diagram, after its creator, Kaoru Ishikawa), a fishbone diagram is a visual method for root cause analysis that organizes cause-and-effect relationships into categories.
2. Pareto analysis (also known as the "80-20 rule"): The Pareto principle states that roughly 80 percent of problems can be traced back to 20 percent of causes. Pareto analysis identifies the problem areas or tasks that will have the biggest payoff.
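
The ranking idea behind the 80-20 rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `pareto` is a hypothetical helper (not part of any New Relic API), and the cause labels and counts are made-up sample data.

```python
from collections import Counter


def pareto(causes, threshold=0.8):
    """Return (cause, count, cumulative_share) rows, largest first,
    stopping once the cumulative share reaches the threshold."""
    counts = Counter(causes)
    total = sum(counts.values())
    rows, running = [], 0
    for cause, count in counts.most_common():
        running += count
        rows.append((cause, count, running / total))
        if running / total >= threshold:
            break
    return rows


# Made-up error causes observed across 100 incidents (illustration only).
observed = (
    ["timeout"] * 48
    + ["bad deploy"] * 34
    + ["disk full"] * 10
    + ["dns"] * 5
    + ["other"] * 3
)

vital_few = pareto(observed)
for cause, count, share in vital_few:
    print(f"{cause:<12}{count:>4}{share:>8.0%}")
```

With this sample data, two of the five causes account for more than 80 percent of the errors, so they would be the first candidates for corrective action.
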
Using Pareto analysis during a root cause analysis of errors helps identify the most significant errors, which are usually caused by a small number of problems that can then be targeted for correction. The phases of Pareto analysis are:
   * Phase I: Identification of causes of defects
   * Phase II: Collection of sample data
   * Phase III: Graphical representation of results
   * Phase IV: Interpretation of the graphed results
3. Five whys technique (associated with gemba gembutsu, a Japanese phrase meaning roughly "actual place, actual thing"): The five whys is a simple problem-solving technique that helps you get to the root of a problem quickly. It involves looking at any problem and drilling down by asking "Why?" or "What caused this problem?" You want clear and concise answers, but not answers so simple that they overlook important details. Start with the problem statement and ask why it occurred. Typically, the answer to the first "why" prompts another "why," the answer to the second prompts a third, and so on. Repeat until you have asked at least five whys, hence the name.

ROOT CAUSE ANALYSIS GOTCHAS: LESSONS LEARNED

Every root cause analysis is unique, and the approach and techniques used to arrive at a resolution vary, but a common set of best practices and lessons can be drawn from the outcomes. The following are some pitfalls to avoid and practices to follow along the way:

* Avoid an incomplete or insufficient definition of the problem statement.
* Avoid focusing on the wrong things or signals. Think long-term rather than taking a near-sighted approach.
* Avoid stopping at the first symptom or cause you find without exploring other possibilities or digging deeper.
* Pick your battles. Not everything can be investigated right away. Narrow your focus to high-impact, high-consequence incidents.
* Get the relevant teams involved and start the root cause analysis as quickly as possible to prevent loss of critical data or conflicts with competing priorities.
* Data gathering for root cause analysis is difficult and time-consuming. Taking shortcuts or guessing without evidence will render the process ineffective or lead to incorrect conclusions. The collaboration feature in New Relic helps break down data silos and lets teams look at data in the context of other platform UI experiences (incidents, alerts, and dashboards).
* Summarize findings and corrections, build an executive report with recommendations, and share it with the broader organization.
* Track recommendations to completion. The ultimate goal of an effective root cause analysis is to make sure that incidents are never repeated. Most recommendations are long-term and will require significant changes or course correction. Tracking progress and holding management accountable through frequent status updates and communication is an effective way to accomplish this goal.

BEST PRACTICES: POST-INCIDENT IMPROVEMENTS

To make the root cause analysis process easier, here are a few things to remember:

* If different teams deploy simultaneously, keep detailed logs of what each team is doing and when, so that the root cause can be traced easily. If it's hard to keep track of multiple deployments, consider staggering them at different times. The New Relic change tracking feature lets you capture deployment changes in any part of the system and use them to contextualize performance data and resolve issues more quickly.
* After the incident, hold a postmortem call to discuss the incident and what you can implement as a long-term solution to prevent it from happening again. Also update your standard operating procedures so that the operations team knows what to do if the same issue recurs.
* Continuously review your real-time notifications and active alerts. Turn off those that are not important so that you do not miss an important alert during a critical issue.
* Understand, and keep training on, the monitoring tools you use so that you can resolve incidents quickly.
* Provide a central channel for team communication during incidents, whether it's Slack, Zoom, or a bridge call.
* Analyze custom dashboards to track adverse events over time.
* Secure management commitment. Despite your best efforts, if upper management isn't committed to implementing the corrective action process, your root cause analysis will not be effective.
* Focus on the keys to success: continuous improvement, sharing what you've learned from the incident with the broader organization, learning from past mistakes, and running exercises with other teams on how to respond if the incident happens again.
* Keep it blameless. Root cause analysis is a process in which multiple teams come together and communicate to create a defined process that prevents the issue from recurring.
* New Relic is an all-in-one observability platform that eliminates blind spots, removes team and data silos, and provides complete visibility into your tech stack in a single unified experience, helping customers solve business and technical challenges more effectively.

CONCLUSION

When organizations build, deploy, and run high-performance, high-throughput systems, there's always a probability of failures in computing workloads. What's critical is to have strategies prepared in advance to handle such breakdowns when they occur and to restore operations quickly.
Applying these best practices and lessons learned, coupled with full-stack observability tools like New Relic, will enable you to drive faster resolution, improve operational efficiency, and prevent incidents from recurring.

TAGS: APM, Digital Experience Monitoring

--------------------------------------------------------------------------------

By Munwar Mohammed

Munwar Mohammed is a principal technical account manager at New Relic who specializes in solution architecture, customer adoption, and sales engineering across different industry domains. As a trusted advisor and strategic partner to mid-size and large enterprise customers, Munwar is accountable for driving the overall technical vision and empowering customers to achieve positive business outcomes around full-stack observability for modern cloud architecture environments.

By Shamika Woods

Shamika Woods is a technical account manager for the central enterprise region. She is a member of multiple employee resource groups (Relics of Color, NR Veterans, and Women at NR) and loves to volunteer and assist others. She has worked in the IT industry for over 20 years. Prior to joining New Relic, Shamika was an information systems technician (IT) in the US Navy and a contractor with the Department of Defense (DoD) and the Centers for Medicare & Medicaid Services (CMS), and she supported Healthcare.gov as a site reliability engineer (SRE). She's driven to help her customers maximize their observability with New Relic and to make sure they gain knowledge and value from each product and feature.

The views expressed on this blog are those of the author and do not necessarily reflect the views of New Relic. Any solutions offered by the author are environment-specific and not part of the commercial solutions or support offered by New Relic.
Please join us exclusively at the Explorers Hub (discuss.newrelic.com) for questions and support related to this blog post. This blog may contain links to content on third-party sites. By providing such links, New Relic does not adopt, guarantee, approve or endorse the information, views or products available on such sites.