You can’t see what you don’t have a frame of reference for

In my recent talk at the LFI conference, I referenced this quote from Karl Weick:

Quote from Karl E. Weick: "Seeing what one believes and not seeing that for which one has no beliefs are central to sensemaking".

I was reminded of that quote when reading a piece by the journalist Yair Rosenberg:

Rosenberg notes that this particular story hasn’t gotten much widespread press, and his theory is that the attacks, as well as other uncovered antisemitic events, don’t fit neatly into the established narrative frames of reference.

Such is the problem with events that don’t fit neatly into a bucket: we can’t discuss them effectively because they don’t fit into one of our pre-existing categories.

Good category, bad category (or: tag, don’t bucket)

The baby, assailed by eyes, ears, nose, skin, and entrails at once, feels it all as one great blooming, buzzing confusion; and to the very end of life, our location of all things in one space is due to the fact that the original extents or bignesses of all the sensations which came to our notice at once, coalesced together into one and the same space.

William James, The Principles of Psychology (1890)

I recently gave a talk at the Learning from Incidents in Software conference. On the one hand, I mentioned the importance of finding patterns in incidents:

But I also had some… unkind words about categorizing incidents.

We humans need categories to make sense of the world, to slice it up into chunks that we can handle cognitively. Otherwise, the world would just be, as James put it in the quote above, one great blooming, buzzing confusion. So, categorization is essential to humans functioning in the world. In particular, if we want to identify meaningful patterns in incidents, we need to do categorization work.

But there are different techniques we can use to categorize incidents, and some techniques are more productive than others.

The buckets approach

An incident must be placed in exactly one bucket

One technique is what I’ll call the buckets approach of categorization. That’s when you define a set of categories up front, and then you assign each incident that happens to exactly one bucket. For example, you might have categories such as:

  • bug (new)
  • bug (latent)
  • deployment problem
  • third party

I have seen two issues with the bucketing approach. The first issue is that I’ve never actually seen it yield any additional insight. It can’t provide insights into new patterns because the patterns have already been predefined as the buckets. The best it can do is give you one kind of filter for drilling down into a subset of incidents in more detail. There’s some genuine value in having a subset of related incidents to look more closely at, but in practice, I’ve rarely seen anybody actually do the harder work of examining these subsets.

The second issue is that incidents, being messy things, often don’t fall cleanly into exactly one bucket. Sometimes they fall into multiple buckets, sometimes they don’t fall into any, and sometimes it’s just really unclear. For example, an incident may involve both a new bug and a deployment problem (as anyone who has accidentally deployed a bug to production and then gotten into even more trouble when trying to roll things back can tell you). By requiring you to put the incident into exactly one bucket, the bucket approach forces you to discard information that is potentially useful in identifying patterns. This inevitably leads to arguments about whether an incident should be classified into bucket A or bucket B. That sort of argument is a symptom that the approach is throwing away useful information, and that the incident really shouldn’t go into a single bucket at all.
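To make the contrast concrete, here’s a minimal sketch of what the bucket model amounts to as a data model. The bucket names come from the list above; the incident fields are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


# Hypothetical set of buckets, fixed up front.
class Bucket(Enum):
    NEW_BUG = "bug (new)"
    LATENT_BUG = "bug (latent)"
    DEPLOYMENT_PROBLEM = "deployment problem"
    THIRD_PARTY = "third party"


@dataclass
class Incident:
    title: str
    bucket: Bucket  # exactly one: the schema itself forces the argument


# An incident involving both a new bug and a botched rollback has to
# discard one of those facts just to fit the schema.
incident = Incident(
    title="Bad deploy, then rollback made things worse",
    bucket=Bucket.DEPLOYMENT_PROBLEM,  # ...or NEW_BUG? The model can't say "both".
)
```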

The tags approach

You may hang multiple tags on an incident

Another technique is what I’ll call the tags method of categorization. With the tags method, you annotate an incident with zero, one, or multiple tags. The idea behind tagging is that you let the details of the incident help you come up with meaningful categories. As incidents happen, you may come up with entirely new categories, coalesce existing ones, or split them apart. Tags also let you examine incidents across multiple dimensions. Perhaps you’re interested in attributes of the people responding (maybe there’s a “hero” tag if there’s a frequent hero who shows up in many incidents), or maybe there’s production pressure related to some new feature being developed (in which case, you may want to label the incident with both production-pressure and feature-name), or maybe it’s related to migration work (migration). There are many possible dimensions. Here are some examples of potential tags:

  • query-scope-accidentally-too-broad
  • people-with-relevant-context-out-of-office
  • unforeseen-performance-impact

Those example tags may seem weirdly specific, but that’s OK! The tags might be very high level (e.g., production-pressure) or very low level (e.g., pipeline-stopped-in-the-middle), or anywhere in between.
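For comparison, here’s a corresponding sketch of the tag model, again with invented incident details. Nothing forces a single choice, so no information has to be discarded:

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    title: str
    tags: set[str] = field(default_factory=set)  # zero, one, or many tags


incident = Incident(
    title="Bad deploy, then rollback made things worse",
    tags={
        "new-bug",
        "deployment-problem",
        "unforeseen-performance-impact",
        "people-with-relevant-context-out-of-office",
    },
)

# The same incident now shows up when you slice the data along any of
# these dimensions, rather than only under one predefined bucket.
```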

Top-down vs circular

The bucket approach is strictly top-down: you enforce a categorization on incidents from the top. The tags approach is a mix of top-down and bottom-up. When you start tagging, you’ll always start with some prior model of the types of tags that you think are useful. As you go through the details of incidents, new ideas for tags will emerge, and you’ll end up revising your set of tags over time. Someone might revisit the writeup for an incident that happened years ago, and add a new tag to it. This process of tagging incidents and identifying potential new tags will help you identify interesting patterns.

The tag-based approach is messier than the bucket-based one, because your collection of tags may be very heterogeneous, and you’ll still encounter situations where it’s not clear whether a tag applies or not. But it will yield a lot more insight.
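As one hedged sketch of how that extra insight might be harvested (assuming incidents are stored as tag sets like the ones above, with hypothetical data), counting which tags co-occur is a cheap way to surface candidate patterns that deserve a closer read:

```python
from collections import Counter
from itertools import combinations

# One tag set per incident (made-up data for illustration).
incidents = [
    {"migration", "pipeline-stopped-in-the-middle"},
    {"migration", "production-pressure", "hero"},
    {"migration", "pipeline-stopped-in-the-middle", "hero"},
]

# Count how often each pair of tags appears together across incidents.
pair_counts = Counter(
    pair
    for tags in incidents
    for pair in combinations(sorted(tags), 2)
)

# The most frequent pairs are prompts for closer reading, not conclusions.
for (a, b), n in pair_counts.most_common(3):
    print(f"{a} + {b}: {n} incidents")
```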

The value of social science research

One of the benefits of basic scientific research is the potential for bringing about future breakthroughs. Fundamental research in the physical and biological sciences might one day lead to things like new sources of power, radically better construction materials, remarkable new medical treatments. Social scientific research holds no such promise. Work done in the social sciences is never going to yield, say, a new vaccine, or something akin to a transistor.

The statistician (and social science researcher) Andrew Gelman goes so far as to say, literally, that the social sciences are useless. But I’m being unfair to Gelman by quoting him selectively. He actually advocates for social science, not against it. Gelman argues that good social science is important, because we can’t actually avoid social science, and if we don’t have good social science research then we will be governed by bad social “science”. The full title of his blog post is The social sciences are useless. So why do we study them? Here’s a good reason.

We study the social sciences because they help us understand the social world and because, whatever we do, people will engage in social-science reasoning.

Andrew Gelman

I was reminded of this at the recent Learning from Incidents in Software conference when listening to a talk by Dr. Ivan Pupulidy titled Moving Gracefully from Compliance to Learning, the Beginning of Forest Service’s Learning Journey. Pupulidy is a safety researcher who worked at the U.S. Forest Service and is now a professor at the University of Alabama at Birmingham.

In particular, it was this slide from Pupulidy’s talk that struck me right between the eyes.

Thanks to work done by Ivan Pupulidy, the Forest Service doesn’t look at incidents this way anymore

The text in yellow captures what you might call the “old view” of safety, where accidents are caused by people not following the rules properly. It is a great example of what I would call bad social science, which invariably leads to bad outcomes.

The dangers of bad social science have been known for a while. Here’s the economist John Maynard Keynes:

The ideas of economists and political philosophers, both when they are right and when they are wrong, are more powerful than is commonly understood. Indeed the world is ruled by little else. Practical men, who believe themselves to be quite exempt from any intellectual influence, are usually the slaves of some defunct economist.

John Maynard Keynes, The General Theory of Employment, Interest and Money (February 1936)

Earlier and more succinctly, here’s the journalist H.L. Mencken:

Explanations exist; they have existed for all time; there is always a well-known solution to every human problem — neat, plausible, and wrong.

H.L. Mencken, “The Divine Afflatus” in New York Evening Mail (16 November 1917)

As Mencken notes, we can’t get away from explanations. But if we do good social science research, we can get better explanations. These explanations often won’t be neat, and may not even be plausible at first glance. But they at least give us a fighting chance at not being (completely) wrong about the social world.

The high cost of low ambiguity

Natural language is often frustratingly ambiguous.

Homonyms are ambiguous at the individual word level

There are scenarios where ambiguity is useful. For example, in diplomacy, the ambiguity of a statement can provide wiggle room so that the parties can both claim that they agree on a statement, while simultaneously disagreeing on what the statement actually means. (Mind you, this difference in interpretation might lead to problems later on.)

Surprisingly, ambiguity can also be useful in engineering work, specifically around design artifacts. The sociologist Kathryn Henderson has done some really interesting research on how these ambiguous boundary objects enable participants from different engineering domains to work together, with each participant holding a different, domain-specific interpretation of the artifact. For more on this, check out her paper Flexible Sketches and Inflexible Data Bases: Visual Communication, Conscription Devices, and Boundary Objects in Design Engineering and her book On Line and On Paper.

Humorists and poets have also made good use of the ambiguity of natural language. Here’s a classic example from Groucho Marx:

Groucho Marx was a virtuoso in using ambiguous wordplay to humorous effect

Despite these advantages, ambiguity in communication hurts more often than it helps. Anyone who has coordinated over Slack during an incident has felt the pain of ambiguous Slack messages. There’s the famous $5 million missing comma. As John Gall, the author of Systemantics, reminds us, ambiguity is often the enemy of effective communication.

This type of ambiguity is rarely helpful

Given the harmful nature of ambiguity in communication, the question arises: why is human communication ambiguous?

One theory, which I find convincing, is that communication is ambiguous because ambiguity reduces overall communication costs. In particular, it’s much cheaper for the speaker to make utterances that are ambiguous than to speak in a fully unambiguous way. For more details on this theory, check out the paper The communicative function of ambiguity in language by Piantadosi, Tily, and Gibson.

Note that there’s a tradeoff: what’s easier for the speaker is more difficult for the listener, because the listener has to do the work of resolving the ambiguity from context. I think people pine for fully unambiguous forms of communication because they have experienced firsthand the costs of ambiguity as listeners, but haven’t experienced what fully unambiguous communication would cost them as speakers.

Even in mathematics, a field which we think of as being the most formal and hence unambiguous, mathematical proofs themselves are, in some sense, not completely formal. This has led to amusing responses from mathematically inclined computer scientists who think that mathematicians are doing proofs in an insufficiently formal way, like Leslie Lamport’s How to Write a Proof and Guy Steele’s It’s Time for a New Old Language.

My claim is that the reason you don’t see mathematical proofs written in a fully formal, machine-checkable style is that mathematicians have collectively come to the conclusion that it isn’t worth the effort.
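To give a feel for what that effort involves, here is a toy example of my own choosing (in Lean 4 syntax; the omega tactic ships with recent versions of Lean): a claim a mathematician would state and accept in a single sentence, spelled out so that a machine can check every step.

```lean
-- "The sum of two even numbers is even", with the evenness witnesses
-- passed in explicitly so the statement stays elementary.
theorem even_add_even (a b k l : Nat)
    (ha : a = 2 * k) (hb : b = 2 * l) :
    a + b = 2 * (k + l) := by
  subst ha    -- replace a with 2 * k
  subst hb    -- replace b with 2 * l
  omega       -- linear arithmetic closes the goal 2*k + 2*l = 2*(k + l)
```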

The role of cost-to-the-speaker vs cost-to-the-listener in language ambiguity is a great example of the nature of negotiated boundaries. We are constantly negotiating boundaries with other people, and those negotiations are generally of the form “we both agree that this joint activity would benefit both of us, but we need to resolve how we are going to distribute the costs amongst us.” Human evolution has worked out how to distribute the costs of verbal communication between speaker and listener, and the solution that evolution hit upon involves ambiguity in natural language.