Hooray for human error

A system failure usually has at least three contributing factors, and human error is usually one or more of them.
Human error is inevitable, so we must design systems that both preempt it and are resilient to it: prevent it and survive it. If we don’t, the system is fragile.


All real-world systems are broken all the time. Defects are always present, and humans keep the systems working anyway. When a system does fail because the holes line up, as in the Swiss cheese model, and the errors overwhelm the operators, we shouldn’t blame the operators. Instead we should thank them for having kept it working so long between failures.
“It doesn’t surprise me that your system failed. What surprises me is that it ever works at all” – Richard I. Cook, MD (I think. I attribute it to him)

And we should thank them for having found a vulnerability of the system to human error, which we will now fix so that the system comes out stronger: antifragile. We can fix it with automation or with poka-yoke (mistake-proofing). An instruction to “don’t do the thing” isn’t a fix.
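
To make the difference between an instruction and a poka-yoke concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the `Database` type, the `drop_database` function and the database names are assumptions, not a real library's API. The idea is that the dangerous path does not rely on the operator remembering a rule. Dropping a production database only proceeds if the operator types its exact name, so the classic "wrong terminal window" slip is caught by the system itself.

```python
from dataclasses import dataclass


class ConfirmationMismatch(Exception):
    """Raised when the typed confirmation does not match the target name."""


@dataclass(frozen=True)
class Database:
    # Hypothetical stand-in for a real database handle.
    name: str
    is_production: bool


def drop_database(db: Database, typed_name: str = "") -> str:
    """Hypothetical destructive operation with a poka-yoke guard.

    For production databases the caller must pass the exact database
    name. A slip (wrong window, wrong target) fails safely instead of
    depending on the operator following a "be careful" instruction.
    """
    if db.is_production and typed_name != db.name:
        raise ConfirmationMismatch(
            f"refusing to drop {db.name!r}: confirmation was {typed_name!r}"
        )
    # A real implementation would perform the drop here; this sketch
    # only reports what it would have done.
    return f"dropped {db.name}"


if __name__ == "__main__":
    prod = Database(name="orders-prod", is_production=True)
    try:
        # The operator believes they are working on staging.
        drop_database(prod, typed_name="orders-staging")
    except ConfirmationMismatch as err:
        print(f"blocked: {err}")
```

The guard turns a human error into a harmless refusal: the system prevents the mistake and survives it, and nobody has to remember not to do the thing.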