STS-121: The Hardest Launch – Part 3 Wing Leading Edge

Returning to the history of the hardest shuttle launch I ever participated in: earlier we visited the circumstances that delayed the launch of STS-121, the second test flight after the loss of Columbia, and the electrical sensor problems in the External Tank.  The next troubling problem addressed at the Flight Readiness Review was the integrity of the wing leading edge panels.

The very front part of the space shuttle wing gets incredibly hot during reentry, nearly 3,000 degrees F for almost half an hour.  A hole in the heat shield of the wing leading edge caused the loss of Columbia, so there is special emotion and attention focused on those panels.  Made of a special composite material, carbon phenolic cloth hand layered, impregnated with special resin, and fired under vacuum in a special oven, the reinforced carbon-carbon (RCC) panels were hand crafted with great difficulty.  And they were extremely hard to inspect.


Technicians in the Orbiter Processing Facility install wing leading edge RCC panels.


RCC panels showing the sharp bends on the sides and edges







Tests had been run showing that even very minor flaws in the interior of the RCC panels could burn through during reentry leading to the loss of another space shuttle and her crew.  The inspections of the wing leading edge consisted of a visual inspection with a magnifying aid and a ‘tap test’ where the technician tried to determine whether there was a void under the surface by rapping the RCC panel with his knuckles and listening to the sound.  These tests were not adequate to detect the kind of flaws that we now knew could be fatal.  A new inspection technique called ‘flash thermography’ used a strobe heat lamp to impart an energy pulse into the RCC panel and then an infrared camera monitored the temperature decay.  This could detect subsurface flaws in the RCC panels.  This inspection technique was new and there were no records of how the panels had appeared under flash thermography before STS-121.

It has been my experience that new tests often uncover things that were unexpected and not easily understood.  In this case the flash thermography test discovered ‘indications’ which might be a problem, or alternatively might be completely normal and not a problem at all.  In the corner areas of some wing leading edge panels, where there are folds or sharp corners, there were indications of unusual ‘signatures.’  In the worst case a void under the surface could erode during entry.  Bad.  But the experts were divided.  In hand laying up the carbon phenolic cloth during manufacturing there could be wrinkles, especially in the complex geometry of a bend in the panel.  If these wrinkles had existed since manufacturing and had not caused a problem over many flights, we should be OK.  But if the signatures indicated a growing flaw that might get bigger every flight until the panel failed, that would be different.

We started lab tests to see if the difference could be understood.  Those would take time.  We reviewed the inventory; one spare set of panels was available, and we quickly moved to put the best spares on each orbiter in the locations where panels had the biggest ‘signatures’ – but not for Discovery, already on the launch pad when the discussion came to a head.  We ordered new panels to be made; but the factory throughput was 1 per month and each orbiter needs 44.  And the cost was high at $800,000 per panel.  That would be a long-term plan, not something in the short term.
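To put rough numbers on why new panels could only be a long-term plan, here is a back-of-the-envelope sketch using the figures quoted above (1 panel per month, 44 panels per orbiter, $800,000 per panel).  The three-orbiter fleet size and the idea of replacing every panel at a steady factory rate are illustrative assumptions, not figures from the program, and the function name is hypothetical.

```python
import math

# Figures quoted in the post
PANELS_PER_ORBITER = 44
FACTORY_RATE_PER_MONTH = 1
COST_PER_PANEL_USD = 800_000

# Assumption for illustration only: three orbiters, every panel replaced
ASSUMED_FLEET_SIZE = 3


def replacement_estimate(orbiters):
    """Return (panels, months, cost in USD) to replace every panel on `orbiters` orbiters."""
    panels = orbiters * PANELS_PER_ORBITER
    months = math.ceil(panels / FACTORY_RATE_PER_MONTH)
    cost = panels * COST_PER_PANEL_USD
    return panels, months, cost


if __name__ == "__main__":
    for n in (1, ASSUMED_FLEET_SIZE):
        panels, months, cost = replacement_estimate(n)
        print(f"{n} orbiter(s): {panels} panels, {months} months "
              f"({months / 12:.1f} years), ${cost / 1e6:.1f} million")
```

Under these assumptions a single orbiter's full set would take 44 months and about $35.2 million, which is why the spares on hand mattered far more in the short term than anything on order.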

The NASA Engineering and Safety Center was created after the Columbia accident to be an organization of the best of the best engineers who would be called in for the agency’s toughest problems.  The technical expert on the NESC was adamantly opposed to flying Discovery with any panels which were less than perfect.

At the Flight Readiness Review – almost a year after the last shuttle flight – with Discovery on the launch pad – we had the final discussion.  As Program Manager, I proposed flying as is.  First, my rationale was that the indications seen from this new test were likely (in my opinion) present from the manufacturing of the panels and had been through multiple reentries that were successful.  Secondly, most of the indications were in protected parts of the panels, covered by seals or other panels.  Thirdly, we were doing everything we could to replace panels as quickly as possible throughout the fleet – not an applicable argument for Discovery.  Fourth, we were moving quickly on testing to determine whether ‘signatures’ like those seen on Discovery were a problem or not.  But those tests would take months to perform to gain sufficient sample runs to statistically prove the ‘signatures’ were not a problem.  I felt that the risk of not flying outweighed the risk of flying.  That was a programmatic stance, not an engineering one; if the second test flight returning after Columbia was delayed significantly, the pressure to end the entire program early would increase.  I admit that my recommendation was risky and not well grounded in engineering data.  But I had heard hours of presentations and discussions and that was my judgement.

Countering my position, the NESC pointed out that this was an indeterminate problem that could have fatal consequences and without more data it was an unacceptable risk.  Hard to argue with that.

The NASA Administrator was in the room and he stepped to the microphone and announced that he was accepting my recommendation and he would accept the risk.  It was very unusual, but that basically ended this topic.  The NESC does not sign the CoFR but if they had, I am sure they would have written a long dissent much as the Associate Administrator for Safety and the Chief Engineer did – but not for this topic, for the fourth one.  Stay tuned for the next installment.

So, what would you have done?  How would you have voted?  Stand down or go fly?  Acceptable risk or not?

We flew Discovery and it came home OK.  Months later we had the flash thermography tests showing that the ‘signatures’ had not grown in size with an additional reentry.  Months later, the laboratory tests demonstrated that RCC panels with fabric wrinkles deep inside were safe to fly.  But that is after the fact.  Sometimes decisions must be made under less than perfect circumstances.  That means somebody must accept the risk that things won’t go well.  It sounds easy until you put your signature on the line for it.


About waynehale

Wayne Hale is retired from NASA after 32 years. In his career he was the Space Shuttle Program Manager or Deputy for 5 years, a Space Shuttle Flight Director for 40 missions, and is currently a consultant and full time grandpa. He is available for speaking engagements through Special Aerospace Services.

19 Responses to STS-121: The Hardest Launch – Part 3 Wing Leading Edge

  1. Mark Spangenberg says:

    I’ve been curious for a long time regarding why the apparent hole in the Columbia wing was hit a problem as Columbia continued ascending into orbit? Was the altitude of the wing strike significantly higher than during reentry when distress on the vehicle was becoming apparent?

    Mark Spangenberg


    • Art Hare says:

      The vehicle doesn’t get going all that fast while still in the atmosphere while ascending, but is going mach 20+ when it hits the atmosphere on the way down.

      Doing some quick wikipedia-ing, the shuttle started breaking apart at 230,000ft, and Mach 23. I can’t find the same number for a shuttle, but a Falcon 9 is only doing Mach 6 when it crosses through 230,000ft on the way up.

    • Charley says:

      Mark, I’ll try to answer your question.
      Yes, the strike to the RCC occurred at a high enough altitude that there were no noticeable problems with aerodynamics. The damage allowed the hot gases created by the friction of reentry to melt supports in the wing until total failure.

    • Andy Oxenreider says:

      Altitude isn’t the only factor at play. Altitude – and the attendant change in air density – has a role, as does speed. There is structural heating on an ascending vehicle, but it’s nowhere near what is seen on reentry. Minimizing the losses (and thus, heating) to friction is in fact one of the things you’re trying to balance when designing an ascent trajectory.

      A very fast, low-altitude vehicle can get incredibly hot on ascent – see the Sprint missile of the 70s – but the Shuttle (and pretty much any vehicle trying to get to orbit) is trying to avoid it.

      The strike was at ~65000 feet, much lower (and, at mach 2.5, much slower) than the peak heating of reentry at 240000 feet and about Mach 24. To put a compensated ‘normal’ speed on those Mach numbers, it’s 1650 vs. almost 15000 MPH. By comparison on ascent (these numbers are harder to find) you’re going about Mach 5 at that altitude. Again, on reentry you’re (to a limited extent) trying to encounter atmospheric drag, on ascent you try to avoid it, leading to a much more benign thermal environment.

    • Thomas Moody says:

      Certainly not the expert here Mark but, to me, the obvious suspicion is that as Columbia was rapidly accelerating to orbit, the atmosphere was also rapidly thinning, i.e. the vehicle never got a chance to experience enough of a “thick” atmosphere for a “burn through” event. Seems that way to me anyway…

  2. John Gibbons says:

    Mark: I assume that question was intended to be “was not a problem”…
    The difference is velocity. Most of the delta-V in any launch is after the vehicle is essentially out of the atmosphere; otherwise you’re burning fuel pointlessly. On reentry, the vehicle is entering the atmosphere at orbital speeds; all reduction in velocity is by aerobraking rather than propulsive, so the thermal environment is enormously more challenging. (The OMS burn to begin reentry only serves to drop the perigee into the soup, not to slow down.)

  3. Marek says:

    It must be very difficult to make such decision when you are not risking your own life.

  4. Thomas Moody says:

    Great article, as always Wayne. This reminds me, by far not the same consequences of course, of a similar reaction to the discovery of previously unknown “data” in my own industry of nuclear power. We perform a test when the plant is shut down for refueling where an emergency diesel powered generator starts and connects to an emergency electrical bus that also powers a pump (we have two) that cools the reactor core. One year a few years back we got an industry notice from one plant that did an “engineering analysis” that conjectured: what if this emergency diesel generator connected to the emergency bus out of phase and “shorted out” the entire bus, essentially making it useless while you had the core cooling pump out of service for maintenance? Well this of course became a very big deal, notwithstanding the years of reliable data and electrical engineering folks (at our plant) showing the virtual impossibility of this occurrence. Nevertheless, our Risk Assessment people jumped on the industry bandwagon, urging us to re-schedule this test (I was the Scheduling Supervisor at the time which is why this rings so true to me) and managed to get it to the Plant Management Team for resolution. Long story short, we were very fortunate to have a pragmatic manager much like yourself who trusted our data and our engineers and we continued to perform the test as scheduled, saving much schedule time and, I’m sure, a LOT of money. What I learned from this was that it was far better for us to NOT react just because the “industry” thought this was the “new and exciting” thing to do. Again, great article Wayne and thanks for digging up some old memories!

  5. Andrew Worth says:

    If I follow correctly, after over a hundred shuttle missions in which the suspect problem had presumably existed but caused no issues a new system for detecting flaws found these suspected voids? That suggests to me that there’s no evidence that there was a genuine new issue to address.

    • The issue was: the new analysis showed voids after of missions. Were they there originally? No one knew. So it’s possible there were no voids originally and they were growing mission after mission and this would be the mission where they finally got big enough to burn through.

  6. Spacebrat1 says:

    Always fascinated by your recounting of your experiences on the STS program. This one took nerves of steel, and for that, America owes you a debt of gratitude.

  7. Julie Ritt says:


    It pains me to say it, but your “rationale” for flying sounds an awful lot like the normalization of deviance that had already killed two crews. Worse, actually. A true NoD would be “Yes, there’s a problem, but it hasn’t killed anyone yet, so we can safely ignore it”. Except in this case, damage to the RCC panels *had* killed a crew only two missions earlier. So it was really more like, “Yes, there’s a problem, but it hasn’t killed anyone yet. A very similar problem just killed seven people, but it wasn’t this exact problem so we can still ignore it”. That at least part of the rationale was that you might lose the program if you delayed didn’t help, given that you’d be guaranteed to lose it if you lost another shuttle and/or crew.

    I’m certainly glad the STS-121 crew came home safely of course, but sheeeeesh! The way you described it gave the impression that the over-arching mindset re: safety had not changed one iota. This may not be the case – it’s just the feeling I got from it. (N.B. I am *NOT* implying that you guys didn’t mourn the 107 crew or try to make sure it didn’t happen again.)

    • waynehale says:

      You are exactly right.

      • Dave H. says:

        I was thinking about the differences in the conclusions as specified in the final reports on the losses of Challenger and Columbia.

        Morton Thiokol engineers warned NASA not to launch.
        Foam strikes on the TPS were common and had never caused a problem before.

        One was deliberate. The data history did not suggest a fatal problem with the other.

    • Dave H. says:

      You can only make something “so safe”. There will always be weak links in the chain. The only way engineers sleep at night is by doing everything humanly possible to cover all of the bases. Even so, one head cannot contain all wisdom, and human eyes can only see so much. There’s risks you can control and risks you cannot control.

      It’s when deliberate ignorance of the laws of physics comes into play that lives are lost and tears are shed.

      • waynehale says:

        Not always deliberate

      • Dave H. says:

        I actually forgot to give credit where credit is due…The Apollo 1 fire, where no data existed and a design flaw no one could have anticipated resulted in a fire.

        Bottom line is, not everything is preventable. Sometimes bad things happen.

      • Patrick Chase says:

        Dave H: w.r.t. Apollo 1, IIRC there was widespread appreciation that the CM design was immature and that risks had been taken on multiple fronts (people sometimes cite the famous picture of the astronauts praying over the capsule in this context). Also, I recall reading that there was ongoing controversy about the amount and flammability of materials like velcro (to the point of dueling removals and re-additions) in the cabin atmosphere.

        Competent engineers are really, really good at identifying potential problems. It’s basically our job. If you look at most historical failures in complex systems, you will find that somebody worried about it beforehand, and Apollo 1 was no exception. Columbia was actually somewhat of an exception in the narrow technical sense inasmuch as nobody really believed that the RCC panels in particular were susceptible to a foam strike. See Wayne’s point in a previous post about everybody analyzing the wrong problem (tile strikes).

        The challenge is that in any program of nontrivial scope there will always be a seemingly infinite number of such “identified concerns”, and if you try to address all of them you’ll never get anything done (release product, launch rocket, etc). It is often said that the role of an engineering manager is to decide when it’s “good enough to ship”, and that’s exactly the role Wayne describes himself playing in this post. Mission-critical organizations often have groups with seemingly conflicting mandates (NESC vs management in this instance) precisely to make sure that concerns are brought to the fore and debated thoroughly, and that risk-vs-schedule tradeoffs are made with “eyes wide open”.

        I don’t see anything to fault in Wayne’s account, to be honest. I’ve seen plenty of instances where a new test with no track record or established rubric for interpretation initially returned “scary looking” results, and in those cases somebody has to make a very hard judgment call. IMO it’s rather different from the Challenger and Columbia incidents in that in both of those instances there was prior physical evidence of “something bad” happening in flight (partial O-ring burnthroughs, high-velocity foam shedding).

        A related psychological concern is that the engineers who raise what turns out to be the critical concern will have often raised a vast number of other concerns that turned out to be trivial because, well, that’s what many engineers do. Linda Ham’s remark (as related by Wayne) about the “excitability” of the engineer who first raised the foam strike seems telling in that respect, but I may be over-interpreting what he wrote.

        Spaceflight is a risky endeavor. If I were to pick one thing to fault the Shuttle program for, it would be for downplaying those risks in the early days and failing to plan for contingencies or inform the public as a result.

      • waynehale says:

        How do you think we should have explained this to the public? Hard to do
