Exploding Stovepipes

stove·pipe  /ˈstōvpīp/  noun

noun: stovepipe; plural noun: stovepipes; noun: stove-pipe; plural noun: stove-pipes

  1. the pipe taking the smoke and gases from a stove up through a roof or to a chimney.
  2. an information conduit that traverses vertical levels efficiently but does not disperse widely.

Are you ready to go down a detailed technical rabbit hole?  Hang on because there are lessons to be learned from this at the end.  I suppose that I should have tried to summarize this story in a more abbreviated manner, but I have come to the conclusion that missing the richness of the details in a history lesson limits the lessons that can be learned.  So, it is long.  Back to the story: 

The Space Shuttle was held down on the launch pad with eight huge nuts – 35 or so pounds each and about 8 inches in diameter – at the bottom of the two solid rocket boosters.  These nuts were screwed onto threaded steel studs about 3 inches in diameter and roughly 6 feet long.  Those studs, connected to the launch pad and placed under enormous tension, held the shuttle stack during roll out to the pad and during that famous ‘twang’.  Many people, looking at the launch pad, erroneously concluded that the three-story tall tail service masts (TSMs) on either side of the orbiter also provided support.  They did not; the TSMs merely connected cables and flexible hoses for various fluids and gases.  No support was provided.  All the weight of the combined stack – and the famous ‘twang’ or lean of the stack that occurred as the main engines ran up before the boosters fired – all those loads were transmitted through those eight bolts and the threaded steel studs from the launch pad.  None of these items were small.

The shuttle onboard computers issued the commands for multiple pyrotechnic devices all in the same minor cycle of software (80 milliseconds):  SRB ignition, hold down bolt separation, GUCP separation, TSM/T0 umbilical separation.  Things started happening fast after that. 

The massive hold down bolts had two sets of explosives 180 degrees apart which separated the nut into two halves in a symmetric trajectory designed to propel the halves away from the hold down stud.  Tension on the hold down stud should immediately cause it to retract into the housing in the launch pad while the bolt fragments were contained in ‘bolt catcher’ devices to make the ride up nearly to space and down to the ocean with the SRBs.  A failure of the nut to separate was considered catastrophic because that event would severely damage the aft skirt of the SRB where the hydraulics and steering mechanism for the nozzle were contained.  Thankfully, such a failure never occurred during the shuttle program.  But what was observed, on 25 or so launches, was a problem where the nut did not separate cleanly and the threaded stud was pulled up out of its housing for a few seconds.

What is maybe even more obscure is the consequence of this so called ‘stud hang up’.  A shock wave – imperceptible to view – travels through the system when that stud lets loose.  No deviation to the trajectory.  No damage to the aft skirt of the SRB other than some cosmetic scratches.  No damage to the attachment hardware that connected the SRB to the ET carrying the huge liftoff loads as those SRBs lift the fully fueled stack.  No damage to the ET.  No damage to the attachment hardware that connected the ET to the Orbiter. But deep inside the orbiter, analysis indicated the shock wave from the hang up release could cause significant structural damage.  In some cases, analysis indicated that the structure holding the vertical tail on the orbiter could be over-stressed.  Leaving the shuttle orbiter tail on the launch pad would be, well, catastrophic. 

The critical level of shock could not happen if just one stud hung up, nor if two studs hung up.  It might happen if three studs hung up and released in a certain sequence with certain limited wind conditions.  But the problem was certain to be critical if four or more studs hung up.
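For readers who like to see the logic written down, here is a minimal Python sketch of those thresholds.  It is not the program's actual risk model.  The function names, the per-stud probability, and especially the assumption that each of the eight studs hangs up independently are all hypothetical and chosen only for illustration; real hang ups depended on pyrotechnic timing and may well have been correlated.

```python
from math import comb

STUDS = 8  # eight hold down studs, as described above


def classify(hang_ups: int) -> str:
    """Map a count of hung-up studs to the severity described in the text."""
    if not 0 <= hang_ups <= STUDS:
        raise ValueError("hang_ups must be between 0 and 8")
    if hang_ups <= 2:
        return "not critical"
    if hang_ups == 3:
        # Only critical for a certain release sequence and wind conditions.
        return "possibly critical"
    return "critical"


def prob_at_least(k: int, p: float, n: int = STUDS) -> float:
    """Binomial probability that at least k of n studs hang up.

    ASSUMPTION: every stud hangs up independently with the same
    probability p. That is an illustration, not a flight-data result.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    p = 0.02  # hypothetical per-stud hang-up probability, not a measured value
    for count in range(STUDS + 1):
        print(count, classify(count))
    print("P(3 or more hang ups on one launch):", prob_at_least(3, p))
```

Even with a small made-up value for p, the chance of three or more hang ups on a given launch is not zero, and it accumulates over every launch flown.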

The shuttle program decided to ‘monitor’ the stud hang ups.  Observe whether they occurred, and if they did how frequently and how many on a given flight.  Hang ups occurred infrequently, rather like the O-ring erosion in the early solid rocket segments or major foam losses from the ET.  Monitoring was considered a viable technical way to control the problem.  If one or two stud hang ups occurred – two hang ups occurred on two flights – it was worrisome but OK.  If three ever occurred, the shuttle program promised to fix the problem.  And so, the troublesome design sat on the back burner, so to speak, simmering but not rising to the top of the priority list to fix.  In retrospect, this is nothing more than playing Russian roulette.  It was not a control.  It was whistling in the dark.  Not an acceptable management or technical protocol.  Lesson 1:  Do not do this.

Why does stud hang up occur sometimes and not all the time?  There are two pyrotechnic devices on the bolt which are subject to very slight delays in the firing circuit.  The pyrotechnic operation can occur a few milliseconds apart, with the result that one side of the nut opens slightly before the other side opens.  In some cases, this forces the stud up against the side of the aft skirt.  In some cases, one of the nut fragments holds to the stud threads for a few seconds.  Frequently, normal separation with no hang up occurs.  But the bottom line is that we knew we had a problem, we knew it could be serious, and we knew what caused it.  Turns out, we also knew how to fix it.   

After the loss of Columbia, there was a review of all the lingering problems with the shuttle system.  We put together plans to fix as many as we could.  This stud hang up was a problem that should be fixed, we decided. Marshall Space Flight Center and the solid rocket booster project were tasked to improve their pyrotechnic devices to eliminate the potential for a stud hang up.  The solution was surprisingly simple: merely cross strap the charges right there on top of the nut.  This eliminated the delay, the off-kilter separation, and thus eliminated the cause of the problem.

For twenty years, the shuttle program folks had studied this problem and spent untold engineering hours trying to analyze permutations and combinations of hang ups and what effect this might have. Every Shuttle launch had a probabilistic risk analysis for this failure.  The program spent millions of dollars in engineering manhours studying this problem each flight.

During the redesign, I went to the Marshall Space Flight Center where they were testing this cross-strap bolt.  Testing is important because any new design must be demonstrated to work and not cause unforeseen problems before it can be certified for flight. Watching some of the testing, I talked to the people on the floor who surprised me by reporting: ‘You know what Mr. Hale?  We had this cross-strap bolt ready to go in 1984. It was almost certified. We almost put it into work then.’  I was flabbergasted. For over twenty years the shuttle program had been living with this problem. Why was this improved design not implemented then? In response the workers noted that stud hang up does not cause any problems to the solid rocket booster other than cosmetic damage inside the skirt.  And the budget was cut on the SRB project – as all the shuttle projects’ budgets were cut in the middle 80’s. The shuttle was operational after all; cost was a problem.  Design and development were over, surely those costly engineers could be taken off the program and no new redesigns were required, right?  Something had to go and in the calculus of the SRB project manager stud hang up was not causing the solid rocket booster project any problem.  So, work to complete the cross-strap pyrotechnics testing and certification was eliminated as low priority work.  Folded in with other savings the cost reduction was summarized and reported to the Program Manager and he was pleased.

And so, for twenty years, another part of the program paid untold millions of dollars in engineering analysis.  And the whole system was at risk every single launch.   

This is, after all, rocket science. 

So, grasshopper, what can we learn?  I have a few lessons and you may be able to glean more.

  1.  Arbitrary budget cuts in high technology closely coupled systems can have unforeseen consequences.  A good program manager will delve deep enough into each work item deleted to understand the consequences of its deletion.  This takes time.  Take the time.
  2. Organizations that work on complex technologies are often, by necessity, broken into work elements to get the job done.  Overview management must be strong enough to detect when one part of the organization makes a decision that will have consequences in another part of the organization.  Stated another way, in space flight most failures occur at the interfaces – technical and organizational. Be wary of stovepipes.
  3. Space flight vehicles are by definition experimental.  They do not fly sufficient times to build up a strong data base like other transportation systems do.  Declaring that a system is ‘operational’ does not mean that engineering vigilance can be significantly reduced.  Bean counters always assume the engineering support withers away after the development phase is over.  This is not true for space flight systems; they carry large engineering workforces into the operational life of the vehicle for good reason.  Beware of program budget plans that assume large cost reductions when the vehicle is declared ‘operational.’  These devices may have an operational mission but from the engineering standpoint, they will always be ‘experimental.’
  4. Monitoring a problem is never the right answer. Fixing the problem is.

And one more for the policy makers:

Whenever we try to do something high risk and complicated – and spaceflight is terribly complicated – especially when the technical systems are highly coupled with low factors of safety, we will face problems like this.  John F. Kennedy spoke out about this when he proposed sending Americans to the Moon.  He added that if the nation were to decide to go to the Moon and stop because it becomes too expensive or too dangerous, it would be better not to go at all.

These systems are expensive, I am sorry. I wish it were different. Many people are working to reduce costs, but my experience is that they will remain expensive, at least with our current technology.  Space projects must be fully supported by adequate resources. Or it would be better not to start at all. 

About waynehale

Wayne Hale is retired from NASA after 32 years. In his career he was the Space Shuttle Program Manager or Deputy for 5 years, a Space Shuttle Flight Director for 40 missions, and has retired from consulting and is currently a full time grandpa. He might be available for speaking engagements for the right incentives (coffee and donuts work!)

12 Responses to Exploding Stovepipes

  1. denniswingo says:

    Wayne, very very interesting. I know some of the folks involved there at MSFC and they had a whole litany of improvements that were brought almost to the point of implementation and then killed. There was another one that could have had enormous impacts on the system. It was the electromechanical actuators to replace the hydraulics for the SSME’s and aerosurfaces. I saw the hardware under test in the lab in 4487 at MSFC when I started to work there in 1988.

    Going to an all electromechanical system would have eliminated the hydraulic system, and the pesky APU’s. It would have required only a minor upgrade in the electrical systems but would have saved a lot of time and money during the orbiter turn around times as well as made the entire system safer. It was one of the recommendations for Orbiter 2020 operation that was buried because it came out the same week that Columbia went down….

    The project was killed by the budget cuts following the return to flight after Challenger when all of these upgrade ideas were taken forward but then killed after the successful return to flight.

  2. Spacebrat1 says:

    your stories are fascinating, not many folks get to look inside ‘rocket science’. thank you again for sharing

  3. Rand Simberg says:

    Dennis, we were working with Moog on EMAs in Downey about that time period, but at the time, they didn’t have the strength and responsiveness of hydraulics. They probably do now.

    • denniswingo says:

      I can’t remember the guy’s name in charge at the Astrionics branch there at MSFC in 4487 (Monte Montenegro I think was his name), but he had the hardware on the bench for the SSME electric gimbal motors in 1988. Dunno how close that was to your effort, but I was working on return to flight and this was a big topic, how to get rid of the APU’s, at MSFC at the time.

  4. Clay Jones says:

    And I thought having a “stud hang up” was something else….

    I really enjoy these stories Wayne. I believe you could explain Calculus to English majors.

    Systems Engineering – I wonder what might be in the course syllabus (or series of courses)?

  5. Michael Kelly says:

    I find this astonishing, though perhaps I’m being naïve. The first 12 years of my career were spent at TRW Ballistic Missiles Division, the System Engineering/Technical Assistance contractor to the Air Force Ballistic Missile Office. One of the activities I was constantly called on to support was the Interface Control Working Group (ICWG). During development of both Peacekeeper and the Small ICBM, the ICWG met regularly to assess how proposed engineering change orders in any part of the weapon system propagated throughout the entire system – and I mean, the entire system, from logistics to ground support equipment to the missile subsystems and systems to operational crew training…the list was all encompassing. The ICWG had to sign off on any engineering change order, or it was not allowed. We didn’t take this activity lightly, despite how exhausting it was, and it worked quite well (not perfectly, but nothing is ever perfect).

    I’m really surprised that a NASA manned spaceflight program didn’t seem to have a similar activity. After all, the SRB certainly interfaced with the rest of the system…

    Aside from all that, I wonder why NASA didn’t simply implement a variation of the Saturn V hold-down arms (https://www.hq.nasa.gov/pao/History/SP-4204/ch13-4.html).

    • Dan Adamo says:

      Michael, the “soft release” mechanism in the Saturn V hold-down arms was genius in its design. As I recall, a malleable metallic pin was extruded through a steel die to actually depart the launch pad with minimal “lurch” or onsets. This may have been practical because Saturn V’s thrust-to-weight ratio at Saturn I-C ignition was less than 1. It may be the passive soft release technique was not applicable to the Space Shuttle, as it was literally blown off the pad at SRB ignition with appreciable onsets. I remember these from my stint as a “motion control lab rat” in the Shuttle Mission Simulator’s motion base when liftoff cues were being developed before STS-1.

  6. Dan Adamo says:

    Regarding Wayne’s “Grasshopper takeaway #3”, I’ve cast a jaundiced eye at “operational” HSF vehicles since STS-25/51-L’s loss of “Challenger” and her crew. Space Shuttles were declared “operational” after STS-4 landed, yet USAF combat aircraft typically undergo THOUSANDS of test flights before achieving operational status. Earlier this year, SpaceX’s Crew Dragon was declared operational after only two test flights, only one of them crewed. This trend does not bode well IMHO.

    Flying into orbit and flying into combat are both risky propositions, but it costs many times more bucks to conduct a Crew Dragon test launch (where the risk isn’t abated) as opposed to a combat aircraft test (where hostile aircraft with live armament don’t participate). So, as Wayne observes, a truly operational HSF vehicle is “unobtanium” without breakthrough technology.

    We in aerospace owe it to the public and their representatives in government to refrain from calling HSF vehicles “operational” without qualifying or altering that terminology. The first Crew Dragon “tourist mission” in LEO is being planned for about a year from now (ref. https://www.redorbit.com/retired-nasa-astronaut-to-head-first-private-mission-on-spacex-crew-dragon/). Are the risks of an undeniably “experimental” space transportation system consistent with tourism? Can SpaceX and NASA find better terminology than “operational” for Crew Dragon?

  7. John Curry says:

    Excellent summary as always Wayne. Having been in the commercial program management side of things for the last 11 years of my career, it is very interesting to be deeply ingrained in these Technical, Cost, Schedule, and Risk (TCSR) trades nearly every day. We often expend tons of energy assessing contingency cases at the expense of working hard to mitigate issues that manifest themselves in nominal operations (e.g. SRB O-ring burn through beginning as early as STS-2, foam coming off the tank for multiple flights before Columbia, etc.). As you said, it takes a lot of hard work and a solid team to understand the true LOC or LOV/LOM risk associated with technical performance issues and what the “Fix” would do to schedule and budget and overall program viability. Key is getting a real data driven assessment of risk and making sure everyone agrees we have acceptable risk vice unacceptable risk to continue…..or stop the train until a viable plan to mitigate the risk is developed, vetted, and approved (it has to be much better than Russian Roulette)!!!

  8. James Carleton says:

    I remember stud hang up well and the subsequent broaching of several Aft Skirt hold down post holes. Although not a “significant problem” at the project level it was one at the SRB element level. We were looking at sleeving the holes and other fixes but the cross strap of course solved the skewed pyro event and thus eliminated the structure damage. Thanks for reminding me of “the good old days”.

  9. Tom Marcucci says:

    After reading this post… I imagine Dr Feynman would share your thoughts on this particular issue. His appendix to the Challenger Commission report should be required reading for all involved in space travel and its engineering challenges.

  10. Mark Lopa says:

    Wayne, we need a new blog post from you soon! Your fans are waiting! 🙂
