“If only HP knew what HP knows, we would be three times more productive.”
Lew Platt, former CEO of Hewlett-Packard
One pattern that you see over and over again in operational surprises is that a person who was involved in the surprise was missing some critical bit of information. For example, there may be an implicit contract that gets violated when someone makes a code change. Or there might be a batch job that runs every Tuesday at 4 PM and puts some additional load on the database.
Almost always, this kind of information is present in the head of someone else within the organization. It just wasn’t in the head of the person who really needed it at that moment.
I think the problem of missing information is well enough understood that you see variants of it crop up in different places. Here are some examples I’ve encountered:
- The resilience engineering folks often talk about common ground and the problems that arise when common ground breaks down.
- Jorge Aranda wrote his (excellent!) PhD thesis on shared understanding in software organizations.
- At Netflix, sharing context is an important part of the culture.
It turns out that experts are very good at accumulating these critical bits of information and recalling them at the appropriate time. Experts are also very good at communicating efficiently with others who share a lot of that critical information in their heads.
However, what experts are not very good at is transmitting this information to others who don’t yet have it. Experts aren’t explicitly aware of the value of all of this information, and so they tend not to volunteer it without being asked. When a newcomer watches an expert in action, a common refrain is, “how did you know to do that?”
The fact that experts aren’t good at sharing the useful information that they know is one of the challenges that incident investigators face. One of the skills of an investigator is knowing how to elicit these bits of knowledge through interviews.
I think that advancing shared understanding in an organization has the potential to be enormously valuable. One of the things that I hope to accomplish with sharing out writeups of operational surprises is to use them as a vehicle for doing so.
Even if there isn’t a single actionable outcome from a writeup, you never know when that critical bit of knowledge that has been implanted in the heads of the readers will come in handy.