The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures.
Fred Brooks, The Mythical Man-Month
We software engineers don’t work in a physical medium the way, say, civil, mechanical, electrical, or chemical engineers do. Yes, our software does run on physical machines, and we are not exempt from dealing with limits. But, as captured in that Fred Brooks quote above, there’s a sense in which we software folk feel that we are working in a medium that is limited only by our own minds, by the complexity of these ethereal artifacts we create. When a software system behaves in an unexpected way, we consider it a design flaw: the engineer was not sufficiently smart.
And, yet, contra Brooks, software is a limited medium. Let’s look at two areas where that’s the case.
Software is discrete in a way that the world isn’t
We persist our data in databases that have schemas, which force us to slice up our information in ways that we can represent. But the real world is not so amenable to this type of slicing: it’s a messy place. The mismatch between the messiness of the real world and the structured nature of software data representations results in a medium that is not well-suited to model the way humans treat concepts such as names or time.
Software as a medium, and data storage in particular, encourages over-simplification of the world, because we need to categorize our data, figure out which tables to store it in and what values those columns should have, and so many items in the world just aren’t easy to model well like that.
As an example, consider a common question in my domain, software deployment: is a cluster up? We have to make a decision about that, and yet the answer is often “it depends: why do you want to know”? But that’s not what software as a medium encourages. Instead, we pick a definition of “up”, implement it, and then hope that it meets most needs, knowing it won’t. We can come up with other definitions for other circumstances, but we can’t be comprehensive, and we can’t be flexible. We have to bake in those assumptions.
And so, just like all engineers, given our time and resource constraints, we have to make over-simplifications to get our work done. William Kent wrote a whole book on this topic called Data and Reality: A Timeless Perspective on Perceiving and Managing Information in Our Imprecise World (h/t Hillel Wayne).
Software systems are limited in how they integrate inputs
In the book Problem Frames, Michael Jackson describes several examples of software problems. One of them is a system for counting how many cars pass by on a street. The inputs are two sensors that emit a signal when the cars drive over them. Those two sensors provide a lot less input than a human would have sitting by the side of the road and counting the cars go by.
As humans, when we need to make decisions, we can flexibly integrate a lot of different information signals. If I’m talking to you, for example, I can listen to what you’re saying, and I can also read the expressions on your face. I can make judgments based on how you worded your Slack message, and based on how well I already know you. I can use all of that different information to build a mental model of your actual internal state. Software isn’t like that: we have to hard-code, in advance, the different inputs that the software system will use to make decisions. Software as a medium is inherently limited in modeling external systems that it interacts with.
A couple of months ago, I wrote a blog post titled programming means never getting to say “it depends”, where I used the example of an alerting system: when do you alert a human operator of a potential problem? As humans, we can develop mental models of the human operator: “does the operator already know about X? Wait, I see that they are engaged based on their Slack messages, so I don’t need to alert them, they’re already on it.”
Good luck building an alerting system that constructs a model of the internal state of a human operator! Software just isn’t amenable to incorporating all of the possible signals we might get from a system.
Recognizing the limits of software
The lesson here is that there are limits to how well software system can actually perform, given the limits of software. It’s not simply a matter of managing complexity or avoiding design flaws: yes, we can always build more complex schemas to handle more cases, and build our systems to incorporate large input sets, but this is the equivalent of adding epicycles. Incorrect categorizations and incorrect automated decisions are inevitable, no matter how complex our systems become. They are inherent to the nature of software systems. We’re always going to need to have humans-in-the-loop to make up for these sorts of shortcomings.
The goal is not to build better software systems, but how to build better joint cognitive systems that are made up of humans and software together.