In praise of the Wild West engineer

If you put software engineers on a continuum with slow and cautious at one end and Wild West at the other end, I’ve always been on the slow and cautious side. But I’ve come to appreciate the value of the Wild West engineer.

Here’s an example of Wild West behavior: during some sort of important operational task (let’s say a failover), the engineer carrying it out sees a Python stack trace appear in the UI. Whoops, there’s a bug in the operational code! They ssh to the box, fix the code, and then run it again. It works!

This sort of behavior used to horrify me. How could you just hotfix the code by ssh’ing to the box and updating the code by hand like that? Do you know how dangerous that is? You’re supposed to PR, run tests, and deploy a new image!

But here’s the thing. During an actual incident, the engineers involved will have to take risky actions in order to remediate. You’ve got to poke and prod at the system in a way that’s potentially unsafe in order to (hopefully!) make things better. As Richard Cook wrote, all practitioner actions are gambles. The Wild West engineers are the ones with the most experience making these sorts of production changes, so they’re more likely to be successful at them in these circumstances.

I also don’t think that a Wild West engineer is someone who simply takes unnecessary risks. Rather, they have a well-calibrated sense of what’s risky and what isn’t. In particular, if something breaks, they know how to fix it. Once, years ago, during an incident, I made a global change to a dynamic property in order to speed up a remediation, and a Wild West engineer I knew clucked with disapproval. That was a stupid risk I took. You always stage your changes across regions!

Now, simply because the Wild West engineers have a well-calibrated sense of risk, doesn’t mean their sense is always correct! David Woods notes that all systems are simultaneously well-adapted, under-adapted, and over-adapted. The Wild West engineer might miscalculate a risk But I think it’s a mistake to dismiss Wild West engineers as simply irresponsible. While I’m still firmly the slow-and-cautious type, when everything is on fire, I’m happy to have the Wild West engineers around to take those dangerous remediation actions. Because if it ends up making things worse, they’ll know how to handle it. They’ve been there before.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s