Looking at my notes I thought, ‘Part 2 is going to be easy to write.’ It was not. Every day this week I spent looking at the ideas, trying to put them in an order that makes sense. I was failing at articulating how to limit or eliminate mistakes. Guess what, people: you cannot prevent mistakes or avoid failure. It is going to happen whether you like it or not. The line from Apollo 13, ‘failure is not an option,’ is untrue. There is nothing you can do that will ever be foolproof. In fact, there is a saying I learned while studying User Experience: a UX designer’s job is to create a user interface that even a fool can use, and the Universe’s job is to create better fools. So, what are you to do?
If you know mistakes are going to happen, then failure is an option, and there needs to be some process for handling it. Yes, what I am saying is plan for your mistakes, mishaps, and failures. To me the plan comes in two parts.
First, start with the premise of being creative. If you remember, after 9/11 the US government called on movie writers to come up with ways that the US could be attacked. When I worked on cars at Camp Sky Crest, one counselor would ask another, ‘What does Murphy say?’ (as in Murphy’s laws). The idea was to think of anything that could go wrong and try to front-run it. Now, this is not that easy. How can you get people to be creative about all the mistakes that can happen? Guess what: practice it, and eventually you will get better. If you never do it, I guarantee at least one of the things you would have thought of will happen.
The second part of planning to fail is ‘what to do when something fails.’ In technology there is this notion of planning for when a system goes down, but that is only one kind of failure. Projects fail, code has issues, business users do things wrong, and so on. For any failure, a process needs to be in place for how to recover and what to change, so that the possibility of it happening again is reduced.
This includes how to do the postmortem correctly. In most cases people are looking for the one simple mistake that caused the outage. What I started to learn later was that blame never solved the problem. Bringing back my agile methods of development, there is a practice called the five whys. It is mostly used in requirements gathering: keep asking why, up to five times, to get to the ‘real’ reason something is being done. It could be a gorilla and bananas problem, a poor process problem, a people problem, a system problem, a management problem, a time-to-market problem, etc. Unless you keep asking why, it is going to be hard to find all the breakdowns that need to be addressed.
In last week’s writing I described how one cause could be a lack of reading emails. If an email is sent to too many people, often each person assumes someone else reviewed it. In processes with, say, six or seven sign-offs, everyone thinks, ‘the other person read it, so I do not have to.’ The process is designed to make sure there are checks and balances, but with too many checks people take the shortcut, and no one checks.
The other thing to make sure of is that all the small issues are included in the whys. If part of the problem is that reduced staff meant more workload, which led to people not paying attention, it needs to be there. There could be a lack of documentation of the process, a lack of knowing the process, and even a lack of practicing the process so it is usable. Time to market often makes people take shortcuts, incentives can drive behavior, and egos can break things. Communication failures exist, whether it is a language issue, a cultural issue, people not speaking up (silence is not always agreement), or simply different understandings of the same sentence. If you do not work to find the small breaks and build solid communication frameworks and practices, mistakes and failures will still happen.
If you are already good at postmortems, then doing the first part is the same thing. Look at your processes, look at what you are doing, and challenge the premise that it is the right thing to do. The only difference is that you do it before something fails instead of after.
Guess what, I lied: there are not two parts but three. The third comes from Agile practices. Part one needs to be done often, on some regular basis. In Agile Scrum, this is called a retrospective, where what is being discussed is not the finished product but the process of how we got there. Mistakes are going to happen, but understanding why things went awry and making minor changes reduces the chances of them happening again.
“Insanity is doing the same thing over and over and expecting different results.” (often attributed to Einstein) But maybe it is also not thinking that things can go wrong, and not planning first how to prevent them and second how to handle them when they do.
This opinion is mine, and mine only; my current or former employers have nothing to do with it. I do not write for any financial gain, I do not take advertising, and any product company listed was not listed for payment. But if you do like what I write, you can donate to the charity I support (with my wife, who passed away in 2017), Morgan Stanley Children’s Hospital, or donate to your favorite charity. I pay to host my site out of my own pocket, and my intention is to keep it free. I do read all feedback, but I mostly won’t post any of it.
This blog is a labor of love, and was originally going to be a book. With the advent of being able to publish yourself on the web, I chose this path. I will write many of these and not worry too much about grammar or spelling (I will try to come back later and fix it) but focus on content. I apologize in advance for my ADD, as topics may often flip. I hope one day to turn this into a book and/or a podcast, but for now it will remain a blog. AI is not used in this writing other than using the web to find information. Images without notes are created using an AI tool that allows me to reuse them.