“You will never remember the many times the launch slipped, but the on-time failures are with you always” – Walt W. Williams, NASA Program Manager for X-15 and Mercury
In the summer of 2002, the word got out about the NASA HQ screen saver counting down to the launch of Node 2 (the completion of the US segment of the International Space Station). That was well over a year away and the screen saver was counting down in fractions of a second toward that scheduled event. All of us in Shuttle operations were offended. We had been schooled over and over again about the dangers of “launch fever” where people lose their judgment just to make a launch happen on schedule. To a person, we were all committed to not letting that happen.
And the program management agreed. Prime example: early that year, the schedule took a real series of delays over BSTRA balls. The Ball-Strut-Tie-Rod-Assembly was a complex part of the main propulsion system plumbing on the shuttle. Flexible pipes over a foot in diameter carried a huge flow of cryogenic hydrogen or cryogenic oxygen into the main engine inlets. Due to the large temperature changes, the pipes must have flexibility to respond as the metal contracts or expands. A complicated mechanism ensured that the lightweight pipes would move properly; a spiderweb of struts in the curving pipe met at a ball made of incredibly hard material. Cracks were found in one of these balls. This was not good. If chunks of the super-hard BSTRA ball came off, they would go right into the turbopumps where catastrophic damage would occur. But the assembly was too far back down the pipe to inspect easily, even with long optical fiber camera contraptions. Studies, experiments, analysis, and as much inspection as we could do gave us some confidence that the worst would not happen, but all of that work took time out of a compressed schedule. And the HQ countdown clock kept running. We ignored that.
NASA is not a military outfit. Sometimes new leaders are appointed who believe they can, by fiat, make changes in policy, direction, or actions. NASA is more like a benevolent anarchy where leadership pronouncements generally are considered the starting point for discussion. So I don’t buy schedule pressure as a major cause of the Columbia accident. Earlier blogs have given my point of view on those causes.
A sad side note is that months later, in the Columbia debris that filtered into the reconstruction hangar at KSC, I got my best close-up look at a BSTRA ball, still packaged inside its spiderweb of struts. No cracks. Not the way I wanted to see that item.
But on STS-112 we had a full-blown harbinger of what was to come, and we missed it. Here is my part of that story.
Early in the year I had been assigned to be a lead flight director and a Mission Ops Director rep, and for STS-113 I was supposed to requalify as an Ascent/Entry Flight Director. In addition to all the simulations and training, I needed to work closely with the A/E FD on the flight before mine. John Shannon was the A/E FD for STS-112; Bob Castle was the MOD, and I was “Weather Flight”. That meant I sat next to John (and in front of Bob) during pre-launch and pre-entry operations and kept track of all the weather information. As if John couldn’t do that himself. But it got me in the control center during a real flight and helped me get down the cadence and tempo. Weather, in a real flight, was more often than not the major problem. Simulations and training attempting to emulate weather problems always came off lame and easy. Real-life weather observations and forecasting were always dynamic, complex, and hard to follow.
After the usual difficulties, the countdown clock for STS-112 ticked down the final few seconds of the count. During that last part, the shuttle’s onboard computers – known as the “redundant set” – were in control. The last signal from the ground was “go for main engine start” at T-10 seconds. Unless the ground system – either automatically or manually – sensed a problem and issued an “RSLS Abort”, the onboard system would launch itself. The Redundant Set Launch Sequence was the software program that did all the onboard checking and commanding in the last 31 seconds. The four redundant set general purpose computers executed the same software in lockstep to the millisecond. The RSLS software commanded the main engines to start in a staggered sequence, 120 milliseconds apart, beginning at 6.6 seconds to go, listened for any failures detected by the engine controllers, checked to see that all three engines were at full thrust, checked a dozen other items, and in the milliseconds right at T=0 sent the pyrotechnic commands which separated the shuttle from the launch pad and ignited the solid rocket boosters. Once the solids lit, you were going someplace in a big hurry. The last step in the RSLS: terminate itself and start the onboard software programs required to actually fly. Of course, when the RSLS terminated, there was no program looking for a ground commanded “RSLS Abort.”
A real fear – alleviated by a million software verification checks – was that somehow the solid rocket boosters would be ignited and something – an RSLS Abort command for example – would stop the launch sequencer in the last milliseconds. That would be a disaster. If the hold down posts and T=0 umbilical panels and the GUCP arm did not separate, or the liquid engines were commanded to shut down, the consequences would be immediate and devastating. So that software was tested over and over again with all the variations of inputs that could be devised.
On STS-112, sitting with the Ascent Flight Director, I got to do the one thing that was not allowed for any of the critical flight controllers – watch the television. By some quirk of communications routing, the TV picture would beat the data to the screens of mission control. So I knew the shuttle had lifted off before the DPS officer sang out the magic words: “Liftoff Confirmed”. The Data Processing Systems officer was looking at the termination of the RSLS and the start of the software to fly the vehicle, not engines or bolts or anything physical. Out of the corner of my eye, I caught a red light on the screen in front of John: “RSLS Abort”. My heart stopped. But the TV showed that everything was OK, no fireball or explosion, so we rode it out.
Much later we found out the cause: bad pin connectors. The pyrotechnics to blow all those holddown bolts are actually part of the launch pad. The signal from the shuttle onboard computers has to go through a set of pull-away electrical connectors in what is called the T=0 umbilical panel (because the panel is to separate at T=0) and then the signals are routed to the explosive charges. One set of signals did not make it across the interface due to corroded or misaligned electrical pins. The ground system caught that only half of the pyrotechnics would fire and requested a hold – but it was too late. Of course, the system was redundant, so half the explosives did the job – just barely. A failure of one more pyrotechnic initiator and it would have been a very bad day.
For the next flight, new cables and pins were installed. But that was not enough; during the long return to flight process, a special working team examined every possible aspect of what causes pin connectors to fail. New processes were built into the last two dozen shuttle launches starting with a prohibition against re-using connector pins in those critical areas.
Meanwhile, back to the launch: a new camera was attached that looked down from the side of the external tank at the earth falling away below. It was powerfully mesmerizing and I watched the TV until the solid rocket booster separation motors fogged the lens and nothing more could be seen. It should be noted that the camera was on the wrong side of the tank to see the really interesting development that happened during early ascent: the loss of a big chunk of foam from the outside of the external tank.
A few days after launch the report came in that on the left-hand solid rocket booster, one of the foam areas that are sprayed on the booster case to alleviate splashdown water loads had a big dent in it. No real issue there; and some sleuthing turned up pictures of the big foam loss off of the external tank that must have caused it.
After the flight, as Ascent/Entry Flight Director for STS-113, I followed all the “anomalies” with great interest. We simply had to fix those connector pins for protection against disaster at launch. The program did that. The BSTRA ball issue was still not well understood and more work had to be done to check the Endeavour’s plumbing and make sure no fragments clogged a turbopump. Oh, that foam thing? It was not categorized as an “anomaly.” The program reviewed this “event” at the STS-113 ET/SRB Mate Review. That review was chaired by Jim Halsell, the Shuttle Launch Integration Manager. At that review, the ET Project Manager, Jerry Schmeltzer, categorized the loss as “not a safety of flight issue” and the potential for future losses was “accepted”. In the NASA shuttle system, once a program review had “dispositioned” an item, it was not reviewed further. If there was any discussion at the STS-113 Flight Readiness Review, I was not there to hear it; Ascent/Entry Flight Directors did not get to go to Florida for that review. The Lead Flight Director, Paul Dye, and the Mission Ops Director, Bob Castle, attended the FRR but they were focused on the in-flight operations, the assembly tasks. It was never a subject for any later flight – STS-107 – because it had already been “dispositioned.” There were simply too many issues to rehash everything, every launch. Later on, when I talked to Linda Ham, she was worried that the foam issue was not properly addressed; the rationale was lousy.
But by that point we were way behind schedule for the all-important Node 2 launch. Jim Halsell was assigned as commander for that flight, STS-120, and they needed a replacement for him. It turned out to be me. But that is a story for another day.
Turns out that we were more vulnerable to schedule pressure than I thought.