I was recently sitting in on an oversight meeting of Gemini management. Most of the meeting was spent in executive (i.e., private) sessions, but during one of the two public sessions, one main topic was how we learn from past mistakes and what we have learned from them. This discussion made me realize that learning from mistakes is not an easy thing to do for many institutions. I’m not saying here that Gemini is doing all these things wrong, but I have seen all these issues firsthand at various places, including Gemini. I’m sure that if asked, most people would say that they, personally, learn from their mistakes. Yet, institutions and corporations often don’t. Why not?
For one, you first have to be willing to acknowledge the mistake. This step can be a key roadblock for some. If you don’t have an atmosphere where honest, inward-looking thought and speech are encouraged, mistakes get covered up, denied, or attributed to something else, and are not brought out as potential lessons and means for improvement. If you believe in hiding bad news in an (ultimately futile) effort to look good, then you will never learn from mistakes.
Second, you need an environment where the mistake is viewed in the context of the system that allowed the mistake to happen and that allowed its effect to be as big as it was. What you don’t need is an environment of blame, where mistakes are dealt with by admonishments of “don’t do that again”. What you need is a blame-free exploration of what in the system could be changed to prevent similar mistakes in the future. Making and acknowledging mistakes is not about placing blame, but about fixing the system. People will always make mistakes, but you want the system in which they work to be as fault-tolerant and fault-preventive (if I can coin a new compound word) as possible.
Finally, you have to really broaden your horizons and fix the system, not the symptom. To make up a hypothetical example, if an instrument is damaged because a heating circuit failed (OK, the event is not so hypothetical, but the implicit bad response outlined below is), you could simply decide to remove the heating circuit from the instrument when it’s repaired. That fixes the symptom, and you certainly must address the symptom, or you look really foolish if the same accident happens again, but you can’t stop there. What allowed this single point of failure to exist in the first place? What allowed the failure to occur unnoticed? Was there real-time monitoring? Was anyone overseeing the project? Was anyone contacted? Was there a timer on the heater? Are other possible single-point failures being identified and backed up and/or isolated by a fail-safe or some other subsystem? If you don’t start asking yourself these types of questions, there will be no learning from your mistakes. On the other hand, if you are not afraid to take a rigorously honest look at what other similar vulnerabilities might exist, if you are willing to go beyond placing individual blame and look at how the process allowed both the single-point failure to exist in the first place and the eventual failure to go unnoticed and uncontained, then you are probably on a continual course of improvement and empowerment. Isn’t that a better place to be than only a few instances of bad luck away from a repeat of a mistake you chose not to learn from?
Scot remembers several bits of advice he received about mistakes. His water-skiing cousin told him if he wasn’t falling, he wasn’t trying hard enough. His graduate advisor told him “wisdom is that sinking feeling that you’ve made this mistake before.” He learned from these people and others that making mistakes is part of life. Making the same mistake twice doesn’t have to be.
Scot, great post, you nailed the key points. You are describing the Deming-based way of running a Total Quality Organization (Principle #8: Drive fear out of the organization, so folks aren’t afraid to admit mistakes, etc.). This is a bit ironic because it’s what “The Toyota Way” is all about. (Yes, I read your post on “not communicating like Toyota”, which is the reason I was compelled to comment.) For what it’s worth, here’s my take:
I’ve been studying and working within the framework of “the Toyota Way” for almost 25 years (although I’ve never worked directly for Toyota). But based on what I know, I’m convinced the problem is NOT with their corporate philosophy. What happened to Toyota (IMHO) is that they didn’t ADHERE to their own philosophy of “making problems visible”. (I blogged about this last year in “Learning to Love Problems”.) Why did they abandon their values? I can only guess:
1) They got so big that complacency set in (against their philosophy of staying humble enough to keep getting better);
2) They abandoned their commitment to quality by expanding too rapidly in their efforts to surpass GM (against their TQM philosophy);
3) They allowed the Americans to run Toyota America, but failed to indoctrinate them into the true Toyota way and monitor how careful they were in evaluating new designs. (There are exceptions, but generally speaking, Japanese tend to be more risk-averse and consequently more detail-oriented than their American counterparts.)
4) The Americans running the U.S. organization were more concerned with the legal ramifications in America of openly admitting mistakes. In their misguided efforts to protect their company and their own asses, they stuck their collective heads in the sand while hoping it all would disappear. Very stupid.
No doubt about it, Toyota America is reaping what it sowed, and they must take responsibility for their actions/inactions. No ambiguity here whatsoever.
That said, I can’t help but feel that the media, competitors and the general public are “piling on”. Gosh, Ford had the tire problem just a few years ago (they behaved similarly to Toyota). GM has had its share of recalls too, but now it’s protected by our government (because, of course, it’s now OWNED by our government). And I can’t help but think that perhaps the UAW is putting out a lot of negative PR to leverage the situation in their favor. This is expected, though, because Toyota built its reputation on “quality”.
But have you noticed that none of the Japanese automakers are joining in the tar-and-feathering festivities? I know at least one Japanese competitor that has made it their policy NOT to criticize Toyota during these tough times (no tacky “comparison” commercials, etc). This company’s top management told me it is not the honorable thing to do, and that recalls can and will happen to all the automotive players, just a matter of time. They know as well as anyone that no one’s perfect. And yet many domestic (and Korean) auto companies are attacking with reckless abandon. Someday it will come back and bite them in the ass. And when it happens, you won’t hear a peep from the Japanese automakers.
I’m not sticking up for Toyota; I truly believe they got too big for their britches. But with all the bashing going on, it’s unfair to lose sight of the fact that GM, Ford and Chrysler have not only had their share of recalls, but have also done much damage to our economy by sending jobs overseas, not to mention all the lay-offs they’ve done over the past 20 years. How many folks have Toyota laid off? Zero–even while they lost millions (billions?) during this recession. For this reason alone–their commitment to their U.S. employees–they deserve a fair shot at redemption.
As an aside, I truly feel sorry for the current head of Toyota. It didn’t happen on his watch. But now he must bear the burden of responsibility and get it fixed. Can’t imagine the pressure he’s under…
I’m personally a “Honda” man, but I’m still rooting for Toyota to turn this around. I think they will. But if they don’t, it will in the long run hurt the American economy.
Just my 2 cents…
Hey, shoot me an email. We gotta get together and catch up. Look forward to talking soon. Yoroshiku to Atsuko san!
Hi Tim,
Thanks for the comment. I totally agree with you that the way this problem was handled by Toyota was not the normal “Toyota Way”, and your reasons for why they veered off course sound quite plausible.
For my purposes, Toyota was just a good example of how NOT to deal with problems, in this case. Toyota will survive this, I’m sure, and hopefully, this will be a strong enough event to force them back to the true path. I seem to remember Audi having a similar problem a while back and look at them now….
scot
Hi Scot,
Do you know of any studies of why astronomy (or actually, any) projects go wrong? I’ve been talking to a few people here at HP and everyone has a woeful story to tell of an instrument that turned up massively late and not really working, or some other project that failed horribly. I can kind of see some of the things that go wrong, but I don’t have a very good view of the big picture. Has anyone been brave enough to review a botched project and publish their findings?
Thanks!
Rachel
Hi Rachel,
Good question. I’m sure these kinds of reports exist, but I don’t know of any that have gone public. Too bad. There is a sort of astronomy project management meeting that happens sporadically; the last one was about a year ago, but I didn’t find out about it before it was too late. I’m sure there are some lessons learned at these conferences which would be relevant.
From what I’ve seen from a few projects at places I’ve been, the biggest problems appear to be:
1) These projects nowadays are bigger than can be handled by a Genius and a set of postdocs and grad students. Therefore, informal methods of cost and schedule estimation that worked in the past don’t work so well now, so the project often starts out with an unrealistic plan.
2) Project management is also being done as if a single small group can do everything. In my post about contingency use, I pointed out how a project should use functional contingency before dipping into schedule and cost contingency, but rarely does this happen. We are too caught up in trying to have everything and believing everything will be OK hereafter. Functional contingency is useless in the later stages of the project, so you have to have cost and schedule contingency left at that point. If you spend your cost and schedule contingency early on, you’re left with nothing but delays and overruns at the end.
3) Overall project management is often unclear. I’ve seen a couple cases where the purchasing institution basically hired out project management for the instrument. The problem with this approach is the intermediary doesn’t really take accountability for the project, assuming its client will step in when it is really concerned. Meanwhile, the hiring institution assumes the hired managers are on the job. The end result is that no one is really accepting responsibility for the project oversight and things can go awry fast.
4) Because these projects are often distributed across different teams, there must be well-established interfaces and specifications up front. I know of one instrument that consists of two supposedly identical, but in practice totally different, spectrographs (each with different detector command sets even!) because there was no agreed-upon set of specifications and interfaces at the start of the project.
5) And finally, astronomy typically pushes the technological edge in its new projects. Risk, delay, overruns, etc. are all par for the course in that realm. So, the areas where technology is being developed have to be well-contained, limited in number and scope, and carefully managed to avoid taking down the rest of the project. If you’re pushing more than one or maybe two areas of technology, you are setting yourself up for delays and overruns.
scot
Thanks a lot, Scot. This is thought-provoking.
One other thing I think I see happening in a project I’m involved in (I’m sure you can guess which) is lack of strong leadership, kind of related to your point 3. There seems to be no one person with a broad view of the project, staying on top of its progress and identifying potential issues before they turn into real, big problems. It all seems very fragmented. Of course, it’s hard to know how much of this is my own fault for just not grasping what’s going on around me, but I can’t help feeling that I wouldn’t have that feeling if the project had an obvious leader with a talent for communicating.
Well, we’ll get there in the end…
-Rachel
Hi Rachel,
What I see happening a lot around here is that the project manager of a project is often the chief project doer. Now, for small projects, this might make sense. But for large projects, where varied resources from different groups have to be controlled, where schedules need to be managed and tasks arranged to interleave most efficiently, where different constituencies need to be kept involved and informed, etc., this approach is not a good one. The chief doer can either do or manage, not both. So, the choice is usually made to do, else the project wouldn’t EVER get done. The net effect, though, is that there isn’t a whole lot of project management, and it can cause some of the symptoms you mention.
scot