After Ten Years: A Few Words from Admiral Gehman

I’d like to interrupt my personal recollections of the Columbia accident and its aftermath to give a few words from Admiral Gehman. You might as well know that there are still people out there who will tell you the CAIB got it wrong. I am not one of them. Of course there are minor details that are not exactly right; no report is perfect. But in the main points the CAIB did get it right, in my opinion. Admiral Gehman led the CAIB, and in December 2005 he made a remarkable speech to a group of Navy officers. His talk included a discussion of the terrorist attack on the USS Cole and the USS Iowa main gun battery explosion, which killed a number of sailors. He also mentioned various aspects of other naval programs and investigations. In the interest of brevity, I have deleted all of that other material and retained just his comments on the Columbia accident. You should read this carefully.

———————————————————————
In the course of working closely with NASA engineers and NASA scientists as we tried to solve what had happened to the Columbia, we became aware of some organizational traits that caused our eyebrows to rise up on our heads. After not very long, we began to realize that some of these organizational traits were serious impediments to good engineering practices and to safe and reliable operations. They were doing things that took our breath away.

We concluded and put in our report that the organizational traits, the organizational faults, management faults that we found in the space shuttle program were just as much to blame for the loss of the Columbia as was the famous piece of foam that fell off and broke a hole in the wing. Now, that’s pretty strong language, and in our report, we grounded the shuttle until they fixed these organizational faults.

I need to give you the issue from the NASA point of view so you can understand the pressures that they were under. In a developmental program, any developmental program, the program manager essentially has four areas to trade. The first one is money. Obviously, he can go get more money if he falls behind schedule. If he runs into technical difficulties or something goes wrong, he can go ask for more money. The second one is quantity. The third one is performance margin. If you are in trouble with your program, and it isn’t working, you shave the performance. You shave the safety margin. You shave the margins. The fourth one is time. If you are out of money, and you’re running into technical problems, or you need more time to solve a margin problem, you spread the program out, take more time. These are the four things that a program manager has. If you are a program manager for the shuttle, the option of quantity is eliminated. There are only four shuttles. You’re not going to buy any more. What you got is what you got. If money is being held constant, which it is—they’re on a fixed budget, and I’ll get into that later—then if you run into some kind of problem with your program, you can only trade time and margin. If somebody is making you stick to a rigid time schedule, then you’ve only got one thing left, and that’s margin. By margin, I mean either redundancy—making something 1.5 times stronger than it needs to be instead of 1.7 times stronger than it needs to be—or testing it twice instead of five times. That’s what I mean by margin.

It has always been amazing to me how many members of Congress, officials in the Department of Defense, and program managers in our services forget this little rubric. Any one of them will enforce, for one reason or another, a rigid standard against one or two of those parameters. They’ll either give somebody a fixed budget, or they’ll give somebody a fixed time, and they forget that when they do that, it’s like pushing on a balloon. You push in one place, and it pushes out the other place, and it’s amazing how many smart people forget that.

The space shuttle Columbia was damaged at launch by a fault that had repeated itself in previous launches over and over and over again. Seeing this fault happen repeatedly with no harmful effects convinced NASA that something which was happening in violation of its design specifications must have been okay. Why was it okay? Because we got away with it. It didn’t cause a catastrophic failure in the past. You may think that this is ridiculous. This is hardly good engineering. If something is violating the design specifications of your program and threatening your program, how could you possibly believe that sooner or later it isn’t going to catch up with you? For you and me, we would translate this in our world into, “We do it this way, because this is the way we’ve always done it.”

The facts don’t make any difference to these people. Well, where were the voices of the engineers? Where were the voices that demanded facts and faced reality? What we found was that the organization had other priorities. Remember the four things that a program manager can trade? This program manager had other priorities, and he was trading all right, and let me tell you how it worked. In the case of the space shuttle, the driving factor was the International Space Station.

In January of 2001, a new administration takes office, and the new administration learns in the spring of 2001 that the International Space Station, after two years of effort, is three years behind schedule and 100 percent over budget. They set about to get this program back under control. An independent study suggested that NASA and the International Space Station program ought to be required to pass through some gates. Now, gates are definite times, definite places, and definite performance factors that you have to meet before you can go on. The White House and the Office of Management and Budget agreed to this procedure, and the first gate that NASA had to meet was called U.S. Core Complete. The name doesn’t make any difference, but essentially it was an intermediate stage in the building of the International Space Station, where if we never did anything more, we could quit then.  And the date set for Core Complete was February 2004. Okay, now this is the spring of 2001.

In the summer of 2001, NASA gets a new administrator. The new administrator is the Deputy Director of OMB, the same guy who just agreed to this gate theory. So now if you’re a worker at NASA, and somebody is leveling these very strict schedule requirements on you that you are a little concerned about, and now the new administrator of NASA becomes essentially the author of this schedule, to you this schedule looks fairly inviolate. If you don’t meet the gate, the program is shut down; they took it as a threat. If a program manager is faced with problems and shortfalls and challenges, and the schedule cannot be extended, he either needs money, or he needs to cut into margin. There were no other options, so guess what the people at NASA did? They started to cut into margins. No one directed them to do this. No one told them to do this. The organization did it, because the individuals in the organization thought they were defending the organization. They thought they were doing what the organization wanted them to do. There weren’t secret meetings in which people found ways to make the shuttle unsafe, but the organization responded the way organizations respond. They get defensive.

We actually found the PowerPoint viewgraphs that were briefed to NASA leadership when the program, for good, solid engineering reasons, began to slip, and I’ll quote some of them. These were the measures that the managers proposed to take to get back on schedule. These are quotes. One, work over the Christmas holidays. Two, add a third shift at the Kennedy shuttle turnaround facility. Three, do safety checks in parallel rather than sequentially. Four, reduce structural inspection requirements. Five, defer requirements and apply the reserve. And six, reduce testing scope. They’re going to cut corners. That’s what they’re going to do. Nevertheless, for very good reasons, good engineering reasons, and to their credit, they stopped operations several times, because they found problems in the shuttle, and they got farther and farther behind schedule.

Well, two launches before the Columbia’s ill-fated flight—it was in October—a large piece of foam came off at launch and hit the solid rocket booster. The solid rocket boosters are recovered from the ocean and brought back and refurbished. They could look at the damage, and it was significant. So here we have a major piece of debris coming off, striking a part of the shuttle assembly. The rules and regulations say that, when that happens, it has to be classified as the highest level of anomaly, requiring serious engineering work to explain it away. It’s only happened six or seven times out of 111 launches. But the people at NASA understand that if they classify this event as a serious violation of their flight rules, they’re going to have to stop and fix it. So they classify it as essentially a mechanical problem, and they do not classify it as what they call an in-flight anomaly, which is their highest level of deficiency.

Okay, the next flight flies fine. No problem. Then we launch Columbia, and Columbia has a great big piece of foam come off. It hits the shuttle. This has happened two out of three times.

Now, we go to these meetings. Columbia is in orbit, hasn’t crashed, and we’re going to these meetings about what to do about this. The meetings are tape-recorded, so we have been listening to the tape recordings of these meetings, and we listen to these employees as they talk themselves into classifying the fact that foam came off two out of three times as a minor material maintenance problem, not a threat to safety. Why did they talk themselves into this? Because they knew that, if they classified this as a serious safety violation, they would have to do all these engineering studies. It would slow down the launch schedule. They could not possibly complete the International Space Station on time, and they would fail to meet the gate. No one told them to do that. The organization came to that conclusion all by itself. They trivialized the work. They demanded studies, analyses, reviews, meetings, conferences, working groups, and more data. They kept everybody working hard, and they avoided the central issue: Were the crew and the shuttle in danger? [This was] a classic case where individuals, well-meaning individuals, were swept along by the institution’s overpowering desire to protect itself. The system effectively blocked honest efforts to raise legitimate concerns. The individuals who raised concerns and did complain and tried to get some attention faced personal, emotional reactions from the people who were trying to defend the institution. The organization essentially went into a full defensive crouch, and the individuals who were concerned about safety were not able to overcome.
________________________________________________
So back to me. I would tell you nobody intentionally did anything to compromise safety. Everyone that I worked with always held the safety of the crew to be their highest priority. So the real question that you have to answer is how did so many hardworking, intelligent, safety-minded people come to make such a fundamentally unsafe set of decisions? And do you think that you or your organization is immune?
Think again.


51 Responses to After Ten Years: A Few Words from Admiral Gehman

  1. Beth Webber says:

    Wayne, it makes me weep to read this. In the end, the result was to lose the crew, and end up in exactly the place NASA was trying to defend itself from: behind schedule. We lost over two years of construction on ISS, and seven men and women.

    But the Administration and OMB need to take their share of the guilt: pushing NASA into an untenable corner on the four areas they could trade.

    This was a hard read, but thank you for sharing it with us.

  2. Dave H. says:

    Wayne,

    I met Admiral Hal at the final CAIB public hearing in Washington. When I thanked him for the work he and the Board were doing, he turned the tables, asking me who I was, where I was from, and what I did for a living. He smiled, shook my hand, and thanked me for coming.
    Thus encouraged, I went to find the board member who kept asking questions about the foam.

    You should know that NASA was in the very same place two years ago with STS-133. During a tanking test, foam in the intertank area cracked, and when it was removed, four of the exposed stringers had visible cracks.

    …and yet another panel of this tapestry is woven into place…

  3. Rod Wallace says:

    I was part of this team. I am ex-military and I recognize a chain of command. I tried to use the chain of command (my boss) to express concern up the line after the STS-113 loss. I tried to request at least a waiver, but that was rejected. So after reading this part, I think I should have gone to the Center Director, but I have no confidence that he would have done anything. How is any organization going to listen to a mid-level manager or engineer without adequate data to prove a safety concern? Money and time should not diminish the need for data when the consequence is life or death. I was asked after the STS-107 launch if we could say that foam was not a safety concern. My reply was that I could not say it wasn’t a safety concern. If you remember, before the WLE testing proved foam could cause critical damage, the team was saying it could not have been the foam that caused the damage. It must have been the bolt catcher, wind shear, etc. Any organization that is not hard-data-driven, and relies on unverified models and PRAs based on even cruder models which are based on flight experience without verified root cause analyses, will always make the same mistakes that the Shuttle Program made if failure is not an option. Keep telling the story!

    • Dave H. says:

      “How is any organization going to listen to a mid-level manager or engineer without adequate data to prove a safety concern?”

      That’s the trick…to shift the paradigm from “all others bring data” to “I don’t have hard data to support my position but what harm is there in checking it out?”
      A good example would have been Linda Ham saying “I don’t think there is much we can do, but have the photos taken and let us know what they reveal.”

      • waynehale says:

        Read the post on “The Tyranny of Requirements” again.

      • Dave H. says:

        Wayne,

        I chuckled when I read your advice. Been through that many times, most recently, two weeks ago. You do make a good point about becoming trapped in a flow chart loop, though.

      • Rod Wallace says:

        We tried to ask for pictures. Lambert and I called Wayne to ask for them. The decision to not get the pictures was not his or Linda’s. Unfortunately, maybe the picture would have confirmed the damage, but on-orbit rescue was the only option and no vehicle was near ready to fly. It would have been a tough management decision to expedite vehicle processing and launch again without a root cause for the failure. Entry profile could not have been altered to prevent the accident. In the pre-accident environment, without data, it would have been hard to get anyone inside or outside of the normal chain of command to take action.

      • waynehale says:

        This will be the subject of a much longer post in the near future. That episode is one of the saddest in my personal history.

    • no one of consequence says:

      “I am ex-military and I recognize a chain of command. I tried to use the chain of command …”
      So have many of us. Learned the hard way that this did not always work the same way. In fact, it can be seen as an organizational threat, all too commonly.

      In NASA and other agencies, it is impossible to avoid the theory of “agenda”, which is much less present in the military (at the top of their game, totally absent). In fact, the only times I’ve seen it work is when two ex-military contemporaneous officers used it “out of band” across hierarchies, practically like a masonic ritual.

  4. Frank Ch. Eigler says:

    “I would tell you nobody intentionally did anything to compromise safety. Everyone that I worked with always held the safety of the crew to be their highest priority.”

    The first may well be so, but the Admiral’s excerpt appears to suggest that institutional/project survival weighed heavier on people’s minds than the safety of the crew.

    • waynehale says:

      I think you are missing the point.

      • Frank Ch. Eigler says:

        Seriously, how so? The Admiral spends paragraphs on how the crazy requirements have made people shave margins, and implicitly pass things, so as to not slow down the project. That’s exactly prioritizing the institution higher than the crew, isn’t it?

      • waynehale says:

        I’m sorry, sometimes I’m too abrupt in answering comments and your question is a serious one that requires an explanation.

        The top-level policy people who drove those requirements had a poor understanding of how fragile the shuttle really was. Partly that was due to our success in flying it safely for a number of years. A non-engineer policy wonk could hardly be expected to believe the engineers when they raised safety considerations.

        Meanwhile the guys in the trenches kept trying to do the best they could with the resources they had. We all believed that human space flight is important to the nation; if we didn’t believe that we would have gone looking for higher paying work. When the boss repeatedly tells you that there is no more money and you have to make it work with what you have, then virtually all of us kept looking for ways to make it work.

        There is no bright line painted on the floor to tell you when you have crossed over from safe to unsafe. Particularly not in an endeavor that is high risk to begin with. So when you have been brought up in a system that is inherently unsafe (flying in space is inherently unsafe) and there have been compromises made all along, when do you know that the latest retrenchment is the one that breaks the system?

        So I reiterate; nobody intentionally sacrificed safety to protect the institution. But we all did unknowingly.

        So how do you know in your business when you have crossed from safe to unsafe? Don’t quote OSHA regulations to me; that is the bare minimum of safety. Have you ever texted when stopped at a red light? Hmm.

      • Frank Ch. Eigler says:

        Thank you for the longer explanation. “Nobody intentionally sacrificed safety to protect the institution. But we all did unknowingly.” is a fine summary.

      • Rand Simberg says:

        There is no bright line painted on the floor to tell you when you have crossed over from safe to unsafe.

        Exactly. Because “safe” and “unsafe” are not binary conditions. It is always relative. There is no safety this side of the dirt. But S&MA lives in a fantasy world in which that’s not true. Another point I make in my book (he plugged again).

      • Dave H. says:

        “So how do you know in your business when you have crossed from safe to unsafe?”

        From my experience, you have to understand and respect the laws of physics that govern whatever it is you’re doing.
        For example…

        In 1976, at the ripe old age of 21, I was employed at Bettis Atomic Power Lab as a QC inspector inspecting nuclear fuel pellets before they were loaded into the fuel rods. I’d just graduated from DeVry’s Electronic Technician course, yet I knew enough about nuclear physics to respect the crit zones, enough so that when my boss tried to get me to violate the rules, the heated argument brought the Navy inspector over. A few minutes later he smiled at me and said “Walk with me” to my boss. Nothing was said, and no repercussions ensued. I knew the physics and the consequences of ignoring them better than my boss. Feynman would have been proud, had I known him.

      • no one of consequence says:

        “The top-level policy people who drove those requirements had a poor understanding of how fragile the shuttle really was.”

        The USAF, who had every right to … didn’t either. And then they did. With CELV.

        Yes, it looked to many like an airliner … it appeared to look too easy to do … too “finished” … too “completed”. Too much of a success for failings/flaws to be overlooked.

  5. Frank Ch. Eigler says:

    (By the way, Wayne, would it be possible to give a link to Admiral Gehman’s complete speech?)

  6. John Wegner says:

    Wayne — I really like and really appreciate your blog. Regarding the Admiral’s remarks and his frame of reference for those remarks, have you read Diane Vaughan’s book on Challenger? If you have read it, did you read it before or after the CAIB report?

  7. Coastal Ron says:

    Another informative post Wayne. Truly a case of unintended consequences, and also the inability of humans to fully understand complex systems, especially when they require humans to make assessments under duress.

    Your posts, though I’m sure painful for many, should be required reading for those coming through engineering and management school. Required because we have not solved the reasons for these problems, and may never reach that level of human maturity, yet we still need to strive for bigger and better things. Maybe through the lens of history we can at least recognize when we are deluding ourselves…

  8. Rand Simberg says:

    From a post I wrote right after the event: http://www.transterrestrial.com/archives/002068.html

    The Flight Director’s Nightmare:

    Ever since the Shuttle first started flying, and perhaps even before, I’ve often thought about a nightmare scenario. I’ve even thought about writing a SF short story, or even full-length novel about it, except that I can’t (intentionally) write fiction (though some would say that I do it often in my attempts to write non-fiction).

    A Shuttle launches. Once they attain orbit, it is discovered that they have damage to the tiles that will not allow them to safely enter. In the real world as it existed in the early nineteen-eighties, this would be a soul-torturing dilemma, and one that would likely be ultimately passed up to the President. Here’s the problem. The Shuttle doesn’t have enough consumables to last long enough to launch another one to rescue them. The Soviets might be persuaded to launch a couple of Soyuzes, but it’s not clear if they can do it in time, either, and there’s no way to dock them (though early on, they had the “rescue ball” concept for transfer).

    But assume as a given that they cannot be rescued (which really did correspond to reality). They only have two choices. They can cross their fingers, pray, or do whatever non-technical things they wish to maximize their chances, and attempt to come home anyway, or they can run out of air on orbit (or choose some faster way to go), and the vehicle becomes a flying tomb, to be either repaired and retrieved later, or reenter in a few weeks. The ethical question, related to this post, is should we destroy the vehicle in a futile attempt to save the crew, or should we sacrifice the crew, who will die either way, and at least attempt to salvage the vehicle? How do the politics play? How does the public react? To make it more interesting, assume that there really is a credible capability to do such a repair and retrieval–that the vehicle really can be saved, and that the crew really cannot.

    Now realize that we just averted this scenario in real life only because of the ignorance of Mission Control about the true situation. Is it possible that the tile damage was ignored partly out of (perhaps unconscious) wishful thinking, because the alternative to ignoring it was to face exactly that ethical dilemma and public-relations nightmare? The only difference is that the likelihood of repairing the Orbiter is small. But depending on the level of damage, it might have been larger than the prospects for a safe entry.

    One more consideration. If this had been an ISS mission, the crew would likely be alive today, and wondering what to do with a broken orbiter. It’s likely that the damage would have been viewable, and even apparent, when approaching ISS, and the crew would have been able to use the station as a safe haven. But once they launched into an inclination different than that of ISS, if it turns out to be true that the tiles were fatally damaged on ascent, then their fate was sealed, as was their inability to know about it.

    All of this, of course, points up the folly of the space policy that we have had in place for the past thirty years, in which we have a single, fragile, unresponsive system to get people to and from space.

    • waynehale says:

      When I was a Flight Director I would regularly wake up at 3 AM with the cold sweats about some failure scenario that cost the crew and vehicle. I never remember this being one of those nightmares. There were plenty of ways to lose a crew and this one just didn’t make it on my personal list. Maybe for some of the other guys.

      Of course there was never really a way to rescue a stranded crew in orbit, no real plan in any of the programs all the way back. I recently watched the hoary old Hollywood chestnut “Marooned” but I can’t recommend it (the acting is terrible). Having a rescue capability costs big money. Are we going to insist on a rescue capability for the first manned Mars mission?

      I think those moral dilemma choices that you outlined are very low probability; but even if one of them had occurred, I believe the team would have insisted on the best chance to save the crew.

      • Rand Simberg says:

        I think those moral dilemma choices that you outlined are very low probability; but even if one of them had occurred, I believe the team would have insisted on the best chance to save the crew.

        Another point I make in my book. That’s not how the Navy would do it. In the Navy, you are expected to risk and even sacrifice yourself if necessary to save the ship (e.g. closing a hatch of a compartment on a damaged sub to keep it from flooding the whole vessel, even if you know that others are on the wrong side of it, and even if you are yourself). And the Navy has a lot more submarines than NASA had orbiters.

        In the Coast Guard, the motto is “You have to go out, but you don’t have to come back.”

        If NASA doesn’t develop that kind of attitude, maybe we need a different organization if we’re serious about space.

      • no one of consequence says:

        Two things are being conflated – the lack of dissimilar redundancy in crew operations, and the moral dilemma of service and duty.

        The first has in the past been a questionable but conscious decision by representative government.

        The second is an artifact of history, as NASA’s role in the conquest of space has played out.

        Conflating the two unnecessarily injures the judgement of a situation.

        NASA’s role in space exploration should not be to set a price on human life by trading off risk for technology / economics … for many reasons including the obvious one that it fails at its mission to find the unexamined, undiscovered way to “do it right” where little or no risk is required at all. Otherwise there is no need for it to look.

        NASA is not the military. Cannot be the military. And even in military service, the honorable sacrifice that comes with the duty of the service is the most carefully guarded. We don’t let our people down, we don’t leave them behind, we watch the 6.

        The risk and the duty instead are about our battle with ourselves and our understanding as imperfect but always well meaning and well researched. In that way, when we fail, it’s not about a tradeoff of advancement, but of rising above our unseeable/unknowable limits.

      • no one of consequence says:

        It is “insisted upon” only because it is insisted upon.
        Utterly naive. Advocate a quick return to reality.

        Let me introduce you to the concept of institutions (aka “institutional memory”). Example: there are things in the army and the navy that predate America. Those who try to change them … can’t. For good reasons.

        NASA is a “formed” institution. For the moment, I’ll only focus on the proven positives and neglect discussion/analysis of the rest. It serves its mission, and that mission is necessary indefinitely. End of story. Same for all the other current agencies.

        Now, for the sake of argument, let’s grant your position that enough “insist upon” what you’re advocating – how to implement:
        Step 1) Find the legal / ethical / economic “place” for it in the grand scheme.
        Step 2) Find the rules / “interfaces” / regulatory to attach it to everything else (including NASA)
        Step 3) Fund and distribute returns from it.

        Solve these riddles and you’ve half a chance at a beginning. Otherwise you’re just playing with strawmen. My point is one of “it doesn’t fit”. And radical redesign of govt & institutions to make it so is working backwards – not a solution, just a bigger problem – bad engineering.

        My limited abilities are scoped simply by engineering/math/space, not politics.

    • Dave H. says:

      “You have to go out, but you don’t have to come back.” That’s an interesting concept.

      While that mantra may describe the Coast Guard’s mission as well as Star Fleet’s, the simple fact is that it doesn’t fit well in most workplaces and has at its heart the assumption that there are no-win scenarios.

      Safety in the non-military world is predicated on the belief that everyone goes home at the end of the day, or end of the shift. Once upon a time, it wasn’t that way. Accidents and fatalities were considered a cost of doing business, but workers who were losing their lives challenged this paradigm and took control of the process.

      Human spaceflight falls in between the two paradigms mentioned above. It’s not that “you don’t HAVE to come back”, it’s that “you may NOT come back”. But you always go out with the belief that you will return home and that you have the support necessary to make that happen. For every astronaut there are 10,000 intensely dedicated people at NASA working furiously to make sure they return home.

      Wayne is weaving a very intriguing tapestry with this series of blog posts. It is filling in a lot of gaps in my own story. It’s been nearly ten years, and I’m still trying to figure out how I went from Homestead Works to Banana River.

      • Rand Simberg says:

        For every astronaut there are 10,000 intensely dedicated people at NASA working furiously to make sure they return home.

        Which is why spaceflight is so expensive that we hardly ever do it. If we want to open up space, we have to have a different attitude.

      • no one of consequence says:

        But this is not NASA. Nor can it be. Violates professional conduct insisted upon.

        Accidents for NASA are deep failures that are felt in a quite different way, different “consequence”s entirely. It’s not helpful to ignore/deny.

        If you want to have a “wild west” advancement of space, you’ll have to create that with other institutions (or in the service, or with out of US jurisdiction “entities”). For modern legal / ethical / social / professional conduct as practiced in current times simply won’t countenance otherwise. However, once upon a time, the “law west of the Pecos” … could.

      • Rand Simberg says:

        But this is not NASA. Nor can it be. Violates professional conduct insisted upon.

        Accidents for NASA are deep failures that are felt in a quite different way, different “consequence”s entirely. It’s not helpful to ignore/deny.

        It’s not about ignoring/denying. It’s about recognizing it explicitly, having a frank national discussion about it, and deciding if we want to change it. It is “insisted upon” only because it is insisted upon. We could choose to insist upon something else, but we have to discuss it in order to make such a choice, instead of just throwing up our hands and saying “that’s just the way it is.”

      • Dave H. says:

        To my eyes, these first steps we’re taking into space are reminiscent of when humanity took to the seas. For several millennia, people sailed over the horizon and didn’t return. Not knowing why led to legends such as sailing off the edge or being eaten by dragons. Eventually humans learned to design vessels that were able to withstand the known conditions and learned navigational skills. This in turn led to discovering things like scurvy. Once again humans experimented, ways to keep people physically and mentally healthy were learned, and modern humans explored the globe.

        With space, we have a pretty good idea of how to get to Mars (navigation), for example, and a pretty good idea of what we need the ship to be able to do (shipbuilding skills) and a pretty good idea of how to keep the crew sane and healthy (the human element).
        History shows us that we’ll learn how much we don’t know when they don’t come home.

        Have you ever read “High Calling”? I don’t know how much Columbia’s crew was told, but David Brown’s comment that “if this doesn’t come out alright we’ll just go on to a higher place” tells me all I need to know about “acceptance of risk”.

        It also tells me that there’s a lot more to having “the right stuff” than I’ll ever have.

      • Frank Ch. Eigler says:

        (Dave H.’s reference to Evelyn Husband’s High Calling book took me a while to identify: It’s the subtitle of chapter 13, The Longest Day: “If this thing doesn’t come out right, don’t worry about me; I’m just going on higher”, attributed to Michael Anderson.)

      • Dave H. says:

        Frank, thank you for correcting me and pointing those who haven’t read the book in the right direction. I sent my copy to a good friend a few years ago.

        When I lent it to the lady across the street (she passed a few years ago) she told me that after about halfway through she wept until the end, knowing what had happened.

  9. Rand Simberg says:

    “Are we going to insist on a rescue capability for the first manned Mars mission?”

    If so, there won’t be one. That is a theme of the book I’m about to publish. If we do a Mars mission, it would best be an armada, with a variety of vehicles, but standardized docking.

    • waynehale says:

      Ah well; we shan’t live to see that.

      In a perfect world (perfect universe?) I am completely in agreement with you.

      Where we live now? That is going to be very hard to accomplish.

      • Rand Simberg says:

        Well, that’s partially a function of how long you think you’ll live, isn’t it? Space travel and life extension sort of go together. ;-)

        Seriously, if SpaceX achieves its cost goals, particularly reusability, it will be astonishing how much more we’ll be able to accomplish in the next couple decades.

  10. I remember going before the safety panel many times in the 1990s. It was a humbling and frustrating experience, because what was written in the NASA documents about safety was often ignored or emphasized, depending on what was going on in the background with other missions and payloads.

    I still remember the cold chills that people in the know had when describing the problem with the super ZIP explosive bolt cutters on the IUS for the Mars Observer mission. This was the mission where both the primary and redundant explosives went at once and threw shrapnel through the cargo bay door walls and into the OMS compartment. That one never made it into the media, but it sure caused changes in how the safety panel operated, because the design flaw was easily found in the documentation post-flight but had passed safety review with flying colors…

    • BenP says:

      What many people are unaware of was that on a prior shuttle mission the initial Super ZIP deploy failed when the crew performed the ARM A then FIRE A switch throws. The crew then tried the redundant switches (ARM B, FIRE B) with no joy. The initial call by the flight controllers to the Flight Director was to arm and fire BOTH circuits. Wisely, the Flight Director paused long enough until someone in the MER recalled that they had information that the A&B circuits were cross-wired, and then correctly called up to the crew to ARM A, FIRE B, which was successful. Had the initial call to arm and fire both circuits gone up, a failure comparable to the STS-51 mission would have occurred. That incident demonstrated good and bad: bad wiring installation, improper documentation, bad communication to flight controllers by KSC engineering, lack of understanding of pyro design by flight controllers (which differed from the more familiar shuttle design), making calls on the fly without procedures – but good situational awareness by the Flight Director and MER Manager for not stepping into an unknown configuration until more data was available. I don’t recall if this even made the list of “close calls” but I distinctly remember the discussions over the loops, because only after the later shrapnel incident were we, as shuttle flight controllers, even aware that firing redundant pyros simultaneously on the Super ZIP represented a risk.

  11. no one of consequence says:

    I remember Admiral Gehman’s talk. Very appropriate. But it reminds me of another talk I’d heard, about a military operations failure.

    In that context, the root cause was an insufficient, “stale” threat assessment. The team had gotten used to treating as routine a situation that wasn’t, and had mapped it as “harmless” without actually doing an active threat assessment per doctrine each time.

    Because they “knew how things worked”.

    Everyone knew that the Shuttle was the Space Transportation System. But it wasn’t.

    It was an in-depth, ongoing, experimental program with limited backup capability / contingency / margin, whose prime function was to increasingly improve our understanding of how to safely convey and return human crew in spaceflight, knowing the risks, including those our incomplete understanding and imperfect processes brought, with our drive and determination to tackle them as the objectives to achieve the prime function.

    And after that, when allowable, transport crew to space.

    That’s really what it was – an iceberg glimpsed by most only from the top.

    A quote from Abe Lincoln is surfacing a lot recently. It applies here:

    “The dogmas of the quiet past, are inadequate to the stormy present. The occasion is piled high with difficulty, and we must rise with the occasion. As our case is new, so we must think anew, and act anew. We must disenthrall ourselves, and then we shall save our country.”

    The Shuttle was enthralling. So much so that it was hard to realize that each launch was new, and we had to think / act anew.

    Human spaceflight will always require this.

  12. Andrew W says:

    A comment by Chris Bergin at NSF:

    “And @NASAWatch
    “Golden Spike” commercial lunar exploration company includes Wayne Hale, Jerry Griffin, Alan Stern

    You’ve heard of them! ;D”

    http://forum.nasaspaceflight.com/index.php?topic=30367.msg979987#msg979987

    Any idea what that’s about?

  13. Excellent reading, both the blog entry and the comments. Daily I am thankful that I don’t have to worry about systems that have an impact on lives, but even in what I do I see that it’s all too easy to give in to pressure to get things done quickly and skip all the normal checks and balances to satisfy a client’s needs. 90% of the time this works, but 10% of the time it ends in a total disaster because the configuration isn’t known, something wasn’t checked/tested properly, or the wrong thing was sold.

    It’s very easy to slowly and gradually get yourself trapped into a corner.

    • Dave H. says:

      Gary,

      It’s been my observation that there are at least two ways to paint oneself into a corner and have it result in a disaster:

      Deliberately. This is done most often to increase profits by limiting spending, usually on maintenance. Three good examples are BP’s Texas City explosion, a North Sea oil rig explosion and fire, and the recent Gulf of Mexico disaster.
      NASA had to deal with budget cuts.

      Inadvertently. Columbia, Challenger, and the Edmund Fitzgerald fell prey to known engineering and operational issues that their owners failed to address properly and in a timely manner. People knew of and periodically brought attention to those issues but those in a position of responsibility failed to take action.

  14. B.N.MURTY says:

    Mr. Wayne Hale,
    After reading all your posts on the Columbia disaster, though it is clear now how the flight was doomed, may I request that you develop and discuss the options to save the flight at reentry or to plan a rescue mission.
    I feel the Columbia case was a very difficult situation in which to come up with options to save the flight and the crew, because the damage to the wing was done when it was already on its way. This is only a matter of a lesson for the managers, in light of the comments made by Admiral Gehman, chairman of the CAIB.
    B.N.MURTY
    VIRGINIA

    • waynehale says:

      Various repair or rescue options were thoroughly studied as part of the accident investigation. The results are documented in the CAIB report. Please look there for the answer to your request.
