In short, the resilience of a system corresponds to its adaptive capacity tuned to the future. [emphasis added]
Branlat, Matthieu & Woods, David. (2010). How do systems manage their adaptive capacity to successfully handle disruptions? A resilience engineering perspective. AAAI Fall Symposium – Technical Report.
In simple terms, an incident is a bad thing that has happened that was unexpected. This is just the sort of thing that makes people feel uneasy. Instinctively, we want to be able to say “We now understand what has happened, and we are taking the appropriate steps to make sure that this never happens again.”
But here’s the thing. Taking steps to prevent the last incident from recurring doesn’t do anything to help you deal with the next incident, because your steps will have ensured that the next one is going to be completely different. There is, however, one thing that your next incident will have in common with the last one: both of them are surprises.
We can’t predict the future, but we can get better at anticipating surprise, and dealing with surprise when it happens. Getting better at dealing with surprise is what resilience engineering is all about.
The first step is accepting that surprise is inevitable. That’s hard to do. We want to believe that we are in control of our systems, that we’ve plugged all of the holes. Sure, we may have had a problem before, but we fixed that. If we can just take the time to build it right, it’ll work properly.
Accepting that future operational surprises are inevitable isn’t natural for engineers. It’s not the way we think. We design systems to solve problems, and one of the problems is staying up. We aren’t fatalists.
However, once we do accept that operational surprise is inevitable, we can shift our thinking about the system from the computer-based system to the broader socio-technical system that includes both the people and the computers. The solution space here looks very different, because we aren’t used to designing systems where people are part of the system, especially when we engineers are part of the system we’re building!
But if we want the ability to handle the things the future is going to throw at us, then we need to get better at dealing with surprise. Computers are lousy at this: they can’t adapt to situations they weren’t designed to handle. But people can.
In this frame, accepting that operational surprises are inevitable isn’t fatalism. Building adaptive capacity to deal with future surprises is how we tune to the future.