stove·pipe /ˈstōvpīp/ noun
noun: stovepipe; plural noun: stovepipes; noun: stove-pipe; plural noun: stove-pipes
- the pipe taking the smoke and gases from a stove up through a roof or to a chimney.
- an information conduit that traverses vertical levels efficiently but does not disperse widely.
Are you ready to go down a detailed technical rabbit hole? Hang on because there are lessons to be learned from this at the end. I suppose that I should have tried to summarize this story in a more abbreviated manner, but I have come to the conclusion that missing the richness of the details in a history lesson limits the lessons that can be learned. So, it is long. Back to the story:
The Space Shuttle was held down on the launch pad with eight huge nuts – each 35 or so pounds and about 8 inches in diameter – at the bottom of the two solid rocket boosters. These nuts were screwed onto threaded steel studs about 3 inches in diameter and roughly 6 feet long. Those studs, connected to the launch pad and placed under enormous tension, held the shuttle stack during roll out to the pad and during that famous ‘twang’. Many people, looking at the launch pad, erroneously concluded that the three-story tall tail service masts (TSMs) on either side of the orbiter also provided support. The TSMs did not; they merely connected cables and flexible hoses for various fluids and gases. No support was provided. All the weight of the combined stack – and the famous ‘twang’ or lean of the stack that occurred as the main engines ran up before the boosters fired – all those loads were transmitted through those eight bolts and the threaded steel studs from the launch pad. None of these items were small.
The shuttle onboard computers issued the commands for multiple pyrotechnic devices all in the same minor cycle of software (80 milliseconds): SRB ignition, hold down bolt separation, GUCP separation, TSM/T0 umbilical separation. Things started happening fast after that.
The massive hold down bolts had two sets of explosives 180 degrees apart which separated the nut into two halves in a symmetric trajectory designed to propel the halves away from the hold down stud. Tension on the hold down stud should immediately cause it to retract into the housing in the launch pad while the bolt fragments were contained in ‘bolt catcher’ devices to make the ride up nearly to space and down to the ocean with the SRBs. A failure of the nut to separate was considered catastrophic because that event would severely damage the aft skirt of the SRB where the hydraulics and steering mechanism for the nozzle were contained. Thankfully, such a failure never occurred during the shuttle program. But what was observed, on 25 or so launches, was a problem where the nut did not separate cleanly and the threaded stud was pulled up out of its housing for a few seconds.
What is maybe even more obscure is the consequence of this so-called ‘stud hang up’. A shock wave – imperceptible to view – travels through the system when that stud lets loose. No deviation to the trajectory. No damage to the aft skirt of the SRB other than some cosmetic scratches. No damage to the attachment hardware that connected the SRB to the ET carrying the huge liftoff loads as those SRBs lift the fully fueled stack. No damage to the ET. No damage to the attachment hardware that connected the ET to the Orbiter. But deep inside the orbiter, analysis indicated the shock wave from the hang up release could cause significant structural damage. In some cases, analysis indicated that the structure holding the vertical tail on the orbiter could be over-stressed. Leaving the shuttle orbiter tail on the launch pad would be, well, catastrophic.
The critical level of shock could not happen if just one stud hung up, nor if two studs hung up. It might happen if three studs hung up and released in a certain sequence with certain limited wind conditions. But the problem was certain to be critical if four or more studs hung up.
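For readers who think better in code, the thresholds above can be written down as a tiny lookup. This is only a sketch of the rule as I have described it, not anything from the actual shuttle analysis: the names `Risk` and `hang_up_risk` are made up for illustration, and the three-stud case is left as 'conditional' because it depended on release sequence and winds that a simple count cannot capture.

```python
from enum import Enum

STUDS_PER_STACK = 8  # eight hold-down studs, as described in the story


class Risk(Enum):
    """Hypothetical labels for illustration only."""
    NOT_CRITICAL = "not critical"
    CONDITIONAL = "might be critical (depends on release sequence and winds)"
    CRITICAL = "critical"


def hang_up_risk(hung_studs: int) -> Risk:
    """Map a count of hung-up studs to the risk level described in the text."""
    if not 0 <= hung_studs <= STUDS_PER_STACK:
        raise ValueError(f"hung_studs must be between 0 and {STUDS_PER_STACK}")
    if hung_studs >= 4:
        return Risk.CRITICAL      # four or more: certain to be critical
    if hung_studs == 3:
        return Risk.CONDITIONAL   # three: only in certain sequences and winds
    return Risk.NOT_CRITICAL      # zero, one, or two: below the critical shock


if __name__ == "__main__":
    for n in range(STUDS_PER_STACK + 1):
        print(n, hang_up_risk(n).value)
```

Notice how little margin there is between the 'worrisome but OK' count of two and the count of three at which things might go very wrong.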
The shuttle program decided to ‘monitor’ the stud hang ups. Observe whether they occurred, and if they did, how frequently and how many on a given flight. Hang ups occurred infrequently, rather like the O-ring erosion in the early solid rocket segments or major foam losses from the ET. Monitoring was considered a viable technical way to control the problem. If one or two stud hang ups occurred – two hang ups occurred on two flights – it was worrisome but OK. If three ever occurred, the shuttle program promised to fix the problem. And so, the troublesome design sat on the back burner, so to speak, simmering but not rising to the top of the priority list to fix. In retrospect, this is nothing more than playing Russian roulette. It was not a control. It was whistling in the dark. Not an acceptable management or technical protocol. Lesson 1: Do not do this.
Why does stud hang up occur sometimes and not all the time? There are two pyrotechnic devices on the bolt which are subject to very slight delays in the firing circuit. The pyrotechnic operation can occur a few milliseconds apart, with the result that one side of the nut opens slightly before the other side opens. In some cases, this forces the stud up against the side of the aft skirt. In some cases, one of the nut fragments holds to the stud threads for a few seconds. Frequently, normal separation with no hang up occurs. But the bottom line is that we knew we had a problem, we knew it could be serious, and we knew what caused it. Turns out, we also knew how to fix it.
After the loss of Columbia, there was a review of all the lingering problems with the shuttle system. We put together plans to fix as many as we could. This stud hang up was a problem that should be fixed, we decided. Marshall Spaceflight Center and the solid rocket booster project were tasked to improve their pyrotechnic devices to eliminate the potential for a stud hang up. The solution was surprisingly simple: merely cross strap the charges right there on top of the nut. This eliminated the delay, the off-kilter separation, and thus eliminated the cause of the problem.
For twenty years, the shuttle program folks had studied this problem and spent untold engineering hours trying to analyze permutations and combinations of hang ups and what effect this might have. Every Shuttle launch had a probabilistic risk analysis for this failure. The program spent millions of dollars in engineering manhours studying this problem each flight.
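To give a flavor of what a probabilistic look at this problem involves, here is a deliberately oversimplified sketch. It is not the program's actual risk analysis. It assumes each of the eight studs hangs up independently with the same probability `p`, which is almost certainly not true of the real hardware, and the value of `p` and the launch count below are placeholders chosen purely for illustration, not measured data.

```python
from math import comb


def prob_at_least(k: int, n: int = 8, p: float = 0.03) -> float:
    """Binomial tail: chance that at least k of n studs hang up on one launch.

    Assumes independent, identical studs; p is an illustrative placeholder.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def prob_over_launches(per_launch: float, launches: int) -> float:
    """Chance the event happens at least once across a number of launches."""
    return 1 - (1 - per_launch) ** launches


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        q = prob_at_least(k)
        # 100 launches is an arbitrary round number, not a program figure.
        print(f"{k}+ hang ups: per launch {q:.2e}, "
              f"over 100 launches {prob_over_launches(q, 100):.2e}")
```

The point of the second function is the Russian roulette one: a per-launch chance that looks comfortably small keeps accumulating every time you fly.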
During the redesign, I went to the Marshall Spaceflight Center where they were testing this cross-strap bolt. Testing is important because any new design must be demonstrated to work and not cause unforeseen problems before it can be certified for flight. Watching some of the testing, I talked to the people on the floor who surprised me by reporting: ‘You know what Mr. Hale? We had this cross-strap bolt ready to go in 1984. It was almost certified. We almost put it into work then.’ I was flabbergasted. For over twenty years the shuttle program had been living with this problem. Why was this improved design not implemented then? In response the workers noted that stud hang up does not cause any problems to the solid rocket booster other than cosmetic damage inside the skirt. And the budget was cut on the SRB project – as all the shuttle projects’ budgets were cut in the middle 80’s. The shuttle was operational after all; cost was a problem. Design and development were over, surely those costly engineers could be taken off the program and no new redesigns were required, right? Something had to go, and in the calculus of the SRB project manager stud hang up was not causing the solid rocket booster project any problem. So, work to complete the cross-strap pyrotechnics testing and certification was eliminated as low priority work. Folded in with other savings, the cost reduction was summarized and reported to the Program Manager and he was pleased.
And so, for twenty years, another part of the program paid untold millions of dollars in engineering analysis. And the whole system was at risk every single launch.
This is, after all, rocket science.
So, grasshopper, what can we learn? I have a few lessons and you may be able to glean more.
- Arbitrary budget cuts in high technology closely coupled systems can have unforeseen consequences. A good program manager will delve deep enough into each work item deleted to understand the consequences of its deletion. This takes time. Take the time.
- Organizations that work on complex technologies are often, by necessity, broken into work elements to get the job done. Overview management must be strong enough to detect when one part of the organization makes a decision that will have consequences in another part of the organization. Stated another way, in space flight most failures occur at the interfaces – technical and organizational. Be wary of stovepipes.
- Space flight vehicles are by definition experimental. They do not fly sufficient times to build up a strong data base like other transportation systems do. Declaring that a system is ‘operational’ does not mean that engineering vigilance can be significantly reduced. Bean counters always assume the engineering support withers away after the development phase is over. This is not true for space flight systems; they carry large engineering workforces into the operational life of the vehicle for good reason. Beware of program budget plans that assume large cost reductions when the vehicle is declared ‘operational.’ These devices may have an operational mission but from the engineering standpoint, they will always be ‘experimental.’
- Monitoring a problem is never the right answer. Fixing the problem is.
And one more for the policy makers:
Whenever we try to do something high risk and complicated – and spaceflight is terribly complicated – especially when the technical systems are highly coupled with low factors of safety, we will face problems like this. John F. Kennedy spoke out about this when he proposed sending Americans to the Moon. He added that if the nation were to decide to go to the Moon and stop because it becomes too expensive or too dangerous, it would be better not to go at all.
These systems are expensive, I am sorry. I wish it were different. Many people are working to reduce costs, but my experience is that they will remain expensive, at least with our current technology. Space projects must be fully supported by adequate resources. Or it would be better not to start at all.