Remove references to "root cause(s)", as incidents should focus on all "contributing factors", including the trigger, that lead to the incident #95

theckman · 2019-04-25T10:21:20Z

One change we are seeing in our industry is the wider adoption of the belief that being able to distill an incident down to a single root cause is a myth[1][2]. As the complexity of our systems grows, so does the complexity of our incidents, and trying to isolate an incident to one item doesn't produce the kinds of learnings we need to come out of those incidents.

The truth is that each incident is unique because of the multiple factors that contributed to it, and if any one of those factors had been different, it would have been a completely different incident. Without giving each of those factors the same care, we miss the opportunity to solve for those different parts.

While pluralizing "root cause" to "root causes" can get you a good part of the way there, in my experience the verbiage change from "root causes" to "contributing factors" makes a much bigger difference in how people think about incidents, and it drives the learnings in the way we want. While I was initially skeptical that such a minimal language change would make a difference, I can happily admit I was wrong.

At Netflix we've started to change our internal language around it, and have found a much richer set of learnings from teams after an incident. Since I was a responder at PagerDuty when we started to form these practices, and at the inception of this documentation, I feel like it'd be a miss if we didn't iterate on these documents to keep up with learnings from our industry.

We, and others, have started to talk about Contributing Factors instead. We still identify what was traditionally called the "root cause", but we list it as one of the factors (often called out as the trigger).

What are your thoughts on updating the verbiage of this documentation to align with our industry shifting its way of thinking?

[1] https://medium.com/@jpaulreed/dev-ops-and-determinism-966a57e3a5cc
[2] https://en.wikipedia.org/wiki/Fallacy_of_the_single_cause

richadams added the enhancement / new section label Apr 25, 2019

richadams mentioned this issue Jun 21, 2019

Change "root cause" to "contributing factors" #97

Merged
