“You will never remember the many times the launch slipped, but the on-time failures are with you always” – Walt W. Williams, NASA Program Manager for X-15 and Mercury
In the summer of 2002, the word got out about the NASA HQ screen saver counting down to the launch of Node 2 (US segment of the International Space Station completion). That was well over a year away and the screen saver was counting down in fractions of a second toward that scheduled event. All of us in Shuttle operations were offended. We had been schooled over and over again about the dangers of “launch fever” where people lose their judgment just to make a launch happen on schedule. To a person, we all were committed to not let that happen.
And the program management agreed. Prime example: early that year, the schedule took a real series of delays over BSTRA balls. The Ball-Strut-Tie-Rod-Assembly was a complex part of the main propulsion system plumbing on the shuttle. Flexible pipes over a foot in diameter carried a huge flow of cryogenic hydrogen or cryogenic oxygen into the main engine inlets. Due to the large temperature changes, the pipes must have flexibility to respond as the metal contracts or expands. A complicated mechanism ensured that the lightweight pipes would move properly; a spiderweb of struts in the curving pipe met at a ball made of incredibly hard material. Cracks were found in one of these balls. This was not good. If chunks of the super-hard BSTRA ball came off, they would go right into the turbopumps where catastrophic damage would occur. But the assembly was too far back down the pipe to inspect easily, even with long optical fiber camera contraptions. Studies, experiments, analysis, and as much inspection as we could do gave us some confidence that the worst would not happen, but all of that work took time out of a compressed schedule. And the HQ countdown clock kept running. We ignored that.
NASA is not a military outfit. Sometimes there are new leaders appointed that believe they can, by fiat, make changes in policy, direction, or actions. NASA is more like a benevolent anarchy where leadership pronouncements generally are considered the starting point for discussion. So I don’t buy schedule pressure as a major cause of the Columbia accident. Earlier blogs have given my point of view on those causes.
A sad side note is that months later, in the Columbia debris that filtered into the reconstruction hangar at KSC, I got my best, close-up look at a BSTRA ball, still packaged inside its spiderweb of struts. No cracks. Not the way I wanted to see that item.
But on STS-112 we had a full-blown harbinger of what was to come, and we missed it. Here is my part of that story.
Early in the year I had been assigned to be a lead flight director, and a Mission ops director rep, and for STS-113 I was supposed to requalify as an Ascent/Entry Flight director. In addition to all the simulations and training, I needed to work closely with the A/E FD on the flight before mine. John Shannon was the A/E FD for STS-112; Bob Castle was the MOD, and I was “Weather Flight”. That meant I sat next to John (and in front of Bob) during pre-launch and pre-entry operations and kept track of all the weather information. As if John couldn’t do that himself. But it got me in the control center during a real flight and helped me get down the cadence and tempo. Weather, in a real flight, was more often than not the major problem. Simulations and training attempting to emulate weather problems always came off lame and easy. Real life weather observations and forecasting were always dynamic, complex, and hard to follow.
After the usual difficulties, the countdown clock for STS-112 ticked down the final few seconds of the count. During that last part, the shuttle’s onboard computers – known as the “redundant set” – were in control. The last signal from the ground was “go for main engine start” at T-10 seconds. Unless the ground system – either automatically or manually – sensed a problem and issued an “RSLS Abort”, the onboard system would launch itself. The Redundant Set Launch Sequence was the software program that did all the onboard checking and commanding in the last 31 seconds. The four redundant set general purpose computers executed the same software in lockstep to the millisecond. The RSLS software commanded the main engines to start in 120 millisecond staggered sequence at 6.6 seconds to go, listened for any failures detected by the engine controllers, checked to see that all three engines were at full thrust, checked a dozen other items, and in the milliseconds right at T=0 sent the pyrotechnic commands which separated the shuttle from the launch pad and ignited the solid rocket boosters. Once the solids lit, you were going someplace in a big hurry. The last step in the RSLS: terminate itself and start the onboard software programs required to actually fly. Of course, when the RSLS terminated, there was no program looking for a ground commanded “RSLS Abort.”
A real fear – alleviated by a million software verification checks – was that somehow the solid rocket boosters would be ignited and something – an RSLS Abort command for example – would stop the launch sequencer in the last milliseconds. That would be a disaster. If the hold down posts and T=0 umbilical panels and the GUCP arm did not separate, or the liquid engines were commanded to shut down, the consequences would be immediate and devastating. So that software was tested over and over again with all the variations of inputs that could be devised.
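For readers who think in code, here is a toy sketch of the one property described above: an abort is honored only while the sequencer is still running, and once it commits at T=0 and terminates, nothing is listening any more. This is an illustration under loud assumptions, not the actual RSLS logic. The class name, method names, and log messages are all hypothetical; only the 120 millisecond stagger and the 6.6 second start time come from the account above.

```python
from dataclasses import dataclass, field


@dataclass
class ToySequencer:
    """Toy model of a launch sequencer (hypothetical; not the real RSLS)."""
    engines_at_full_thrust: bool = True
    abort_requested: bool = False
    terminated: bool = False
    log: list = field(default_factory=list)

    def request_abort(self) -> bool:
        """Ground requests an abort; True only if the sequencer still listens."""
        if self.terminated:
            # After termination there is no program left to act on the request.
            self.log.append("abort ignored: sequencer already terminated")
            return False
        self.abort_requested = True
        self.log.append("abort accepted")
        return True

    def run_to_t_zero(self) -> str:
        """Run the final checks, then either abort or commit to liftoff."""
        # Three engine start commands, 120 ms apart, beginning at T-6.6 s.
        start_times = [round(-6.6 + 0.120 * i, 2) for i in range(3)]
        self.log.append(f"engine start commands at {start_times} s")
        if self.abort_requested or not self.engines_at_full_thrust:
            self.log.append("abort before T=0: no booster ignition commanded")
            return "aborted"
        # Commit point: release and ignition are commanded together,
        # then the sequencer hands over to the flight software and exits.
        self.log.append("T=0: hold-down release and booster ignition commanded")
        self.terminated = True
        return "liftoff"


early = ToySequencer()
early.request_abort()
print(early.run_to_t_zero())

late = ToySequencer()
print(late.run_to_t_zero())
print(late.request_abort())
```

The design point the sketch tries to capture is that the dangerous case is not an abort before the commit or after it, but anything that could split the commit step itself, which is why that step was tested so heavily.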
On STS-112, sitting with the Ascent Flight director, I got to do the one thing that was not allowed for any of the critical flight controllers – watch the television. By some quirk of communications routing, the TV picture would beat the data to the screens of mission control. So I knew the shuttle had lifted off before the DPS officer sang out the magic words: “Liftoff Confirmed”. The Data Processing systems officer was looking at the termination of the RSLS and the start of the software to fly the vehicle, not engines or bolts or anything physical. Out of the corner of my eye, I caught a red light on the screen in front of John: “RSLS Abort”. My heart stopped. But the TV showed that everything was OK, no fireball or explosion, so we rode it out.
Much later we found out the cause: bad pin connectors. The pyrotechnics to blow all those holddown bolts are actually part of the launch pad. The signal from the shuttle onboard computers has to go through a set of pull away electrical connectors in what is called the T=0 umbilical panel (because the panel is to separate at T=0) and then the signals are routed to the explosive charges. One set of signals did not make it across the interface due to corroded or misaligned electrical pins. The ground system caught that only half of the pyrotechnics would fire and requested a hold – but it was too late. Of course, the system was redundant, so half the explosives did the job – just barely. A failure of one more pyrotechnic initiator and it would have been a very bad day.
For the next flight, new cables and pins were installed. But that was not enough; during the long return to flight process, a special working team examined every possible aspect of what causes pin connectors to fail. New processes were built into the last two dozen shuttle launches starting with a prohibition against re-using connector pins in those critical areas.
Meanwhile, back to the launch: a new camera was attached that looked down from the side of the external tank at the earth falling away below. It was powerfully mesmerizing and I watched the TV until the solid rocket booster separation motors fogged the lens and nothing more could be seen. It should be noted that the camera was on the wrong side of the tank to see the really interesting development that happened during early ascent: the loss of a big chunk of foam from the outside of the external tank.
A few days after launch the report came in that on the left hand solid rocket booster, one of the foam areas that are sprayed on the booster case to alleviate splash down water loads, had a big dent in it. No real issue there; and some sleuthing turned up pictures of the big foam loss off of the external tank that must have caused it.
After the flight, as Ascent/Entry flight director for STS-113, I followed all the “anomalies” with great interest. We simply had to fix those connector pins for protection against disaster at launch. The program did that. The BSTRA ball issue was still not well understood and more work had to be done to check the Endeavour’s plumbing and make sure no fragments clogged a turbopump. Oh, that foam thing? It was not categorized as an “anomaly.” The program reviewed this ‘event’ at the STS-113 ET/SRB Mate Review. That review was chaired by Jim Halsell, the Shuttle Launch Integration Manager. At that review, the ET Project Manager, Jerry Schmeltzer, categorized the loss as “not a safety of flight issue” and the potential for future losses was “accepted”. In the NASA shuttle system, once a program review had “dispositioned” an item, it was not reviewed further. If there was any discussion at the STS-113 Flight Readiness review, I was not there to hear it; Ascent/Entry Flight Directors did not get to go to Florida for that review. The Lead Flight Director, Paul Dye, and the Mission Ops Director, Bob Castle, attended the FRR but they were focused on the in flight operations, the assembly tasks. It was never a subject for any later flight – STS-107 – because it had already been “dispositioned.” There were simply too many issues to rehash everything, every launch. Later on, when I talked to Linda Ham, she was worried that the foam issue was not properly addressed; the rationale was lousy.
But by that point we were way behind schedule for the all-important Node 2 launch. Jim Halsell was assigned as commander for that flight, STS-120, and they needed a replacement for him. It turned out to be me. But that is a story for another day.
Turns out that we were more vulnerable to schedule pressure than I thought.
Well written and a valuable reminder for all of us who work on deadline.
I certainly appreciate the insight you’re giving us readers about how innocent enough events led to the loss of Columbia, e.g. how the foam issue became a non-issue, that some didn’t think that foam loss had been properly addressed.
To this pilot, your post reads like an NTSB accident report. If one reads enough of those, one pattern that stands out is of innocent enough actions, not incompetence, combining to box in a pilot to the point from which there is little chance of a graceful exit. Pilots read those to educate themselves so they don’t make the same mistakes.
Thank you for educating us.
Wayne, it hurts to read much of this, but I’m grateful you are taking the time to tell us how this unfolded from your point of view.
It would be a gross oversimplification to say that Lassie was barking about the bridge being on fire but no one understood why she was barking.
If only it had been that easy.
If I recall correctly, STS-107 had been cancelled or delayed several times. It wasn’t your responsibility to review and/or create action plans to deal with events or anomalies. Those items were handled by those whose jobs were to handle them.
In turn, those people followed the tried and tested NASA way: in God we trust, all others bring data. At that point in time the rationale for diverting limited resources to resolving a “non-problem” wasn’t there. Throw in a desire not to be the “long pole in the tent”, the reason for making that screen saver inaccurate (“Sir, Mr. Hale here thinks that we should stop launching until we find the reason for the foam shedding. He has no data to back up his ‘rationale.’”), and our safety margin slipped a little more.
Lassie WAS barking…but some didn’t hear her at all for various reasons while others thought Timmy fell down the well again instead of figuring out that the bridge was on fire.
Jerry made the comment based on the fact that foam had not caused catastrophic damage before. We had only experienced tile damage, which was a turnaround concern and not a safety concern. The problem was that the Orbiter manager did not jump in with the information that no foam impact testing had ever been done and that we did not know whether it was a safety of flight concern. We all assumed that a lightweight material like foam could not damage anything important. Flight experience can give you a false sense of security when you don’t understand root causes. We had years of experience and a lot of opportunity to ask the really tough questions. But we did not!
“We all assumed that a lightweight material like foam could not damage anything important.”
In the days following the loss of Columbia, someone compared the foam loss’s effects to having a Styrofoam cooler blow out of the bed of a pickup truck in front of you and its effect on your car.
My first thought was that nobody at NASA rode a motorcycle; my second thought was, why did they choose that comparison?
I have been riding for 29 years. About 20 years ago, I was astride my Suzuki GS850G on US 40 a few miles west of the Yough Reservoir when a Styrofoam cooler blew out of the back of the pickup truck in front of me. The bike had a Plexifairing windshield and I wore a full-face helmet.
I rolled off the throttle and ducked behind the windshield. The impact destroyed the cooler and the force input to the handlebars sent the bike into a short but intense “tank slapper”. I barely missed the truck, whose driver thought it a good thing to stop on the highway.
F=MA, for the rest of us.
I still wear a full-face helmet. Stink bugs hurt, and they can cut you if you’re unlucky enough to hit one edge-on.
Sean O’Keefe made that comment at a press briefing. He also dismissed the Foamology. In that moment, I knew he had no understanding of fundamental physics.
I’ve seen grass driven through trees by tornado force winds; this would be much worse.
“In the NASA shuttle system, once a program review had “dispositioned” an item, it was not reviewed further.”
Looks like an interesting single-point-of-failure.
Hardly. In general a disposition involved dozens to hundreds of people. I hope that you got the sense from this one launch that we had lots of anomalies on every flight, many of them conceivably with dire consequences. To revisit every anomaly multiple times was not effective in terms of personnel time.
I did not suggest that a “disposition” was a single-person SPOF, just a process SPOF, if e.g. the existence of a prior judgement discourages (or god-forbid precludes?) subsequent contrary-data-gathering.
“To revisit every anomaly multiple times was not effective in terms of personnel time.”
Absolutely. But a *few* of them would have been effective in terms of personnel time.
So without 20-20 hindsight . . . how do you tell which few deserve reconsideration?
“without 20-20 hindsight . . . how do you tell which”
I guess that’s the question for the experts, if they are not limited to “none” by process.
The process continued to work very hard on those problems that were recognized as ongoing. And the program was roundly castigated for having a “marching army” to investigate and mitigate anomalies. Interesting paradox, isn’t it?
Sometimes I think the single most important — and mind-numbingly obvious and overlooked — conclusion from all of your posts is the simple one that humans in the aggregate do not behave like a mass of individual humans. As a species, we forget or deny that over and over and over to our great sorrow. 😦
I know this is painful for you to rake over, but I hope you understand how deep is the unreasoning love that a large part of the population holds for the shuttles, the people in them, and all the people who got them going. I know you’re going to say it doesn’t matter, but it’s there. When Atlantis flew right over my workplace and I saw it in the air before me, I welled up and caught my breath. Goddamn it, that means something. Those Rube-Goldberg monstrosities are precious to me.
Of course it matters. I feel the same way, too.
“I hope you understand how deep is the unreasoning love that a large part of the population holds for the shuttles, the people in them, and all the people who got them going.”
I second that emotion, and would like to expand upon it.
In a sense, NASA is all about dreams, and what we humans can and might be able to accomplish. Those dreams, born of the machinery of war (go on YouTube and watch the video from a White Sands V-2 launch), have resulted in footprints on the Moon, twin Voyagers probing the edges of our solar system, and multiple robotic explorers on Mars.
Those who flew aboard the shuttles carried OUR dreams along with them. That’s the human factor. When they died part of us died along with them.
I’ll third that then, and expand it to mean “population” in general. I know, the focus around here is very much narrowed to American endeavours in space, and the things NASA does and has accomplished in the past. But to me, this “spaceflight” thing is something that mankind does as a whole. It doesn’t make me feel “red, white’n blue all over” because I’m not American and those aren’t my colors, but that doesn’t diminish my emotional attachment to spaceflight and the people that make it happen, nor does it diminish my pride to witness this moment in our evolution as a species.
I hope that you will send a check to the US Treasury to help pay the bill. It will help you be more a part of it. Emotional support is nice but it doesn’t build rockets.
… the word got out about the NASA HQ screen saver counting down to the launch … All of us in Shuttle operations were offended.
A hamfisted political inspiration gone awry. Certain that the originator never had the slightest notion of the impact.
In the early days the Shuttle was sold as a trucking service to orbit or a space pickup truck equivalent. You throw the parts together, have fits and close calls for a while, but then it becomes your “daily driver” that you take into the shop now and then. The people and processes resemble that of “auto equivalents”. This became the political/PR default mode for far too many, which greatly undervalued the program, and set up a series of false dichotomies/dilemmas. Among them, a trucking fleet has an expectation of routine, reliable operation. A tame animal.
The closest thing to Shuttle was a perpetual X program. Many of the X-planes were far from tame. “Pet my tame snake, he probably won’t bite … this time”. So what got conflated was the justifiable pride in the management of one of the most difficult, extraordinary programs in the history of mankind … with reliable, scheduled delivery to orbit of payloads. How many trucks do you know that are at/beyond the bleeding edge of the envelope, where still too much is unresolved in their basic operation, with LOC/LOM numbers so large, with comparatively little effort to improve/refine technologies/operations? “It’s just a space pickup truck – get in it and drive, make those deliveries on time! After all, this is a delivery business.”
The men and women of Shuttle made and operated the largest, highest performance, most complex prototype vehicle that ascended, operated on orbit, and descended more than a hundred times.
The prototype had many (thousands!) issues large and small. In the large they were tamed. But the scope of those issues was beyond the program’s human ability to reliably remedy/remove.
In any flight test program, you mitigate known issues to accumulate flight history and vehicle characterization, you revise the design possibly radically, and remove issues continually throughout the program – the cost of which is continual DD/TME not just fixed cost. You do new vehicles to retire known risks, so that you can advance understanding to be able to characterize/test/remove deeper issues, do another vehicle to remove those, … and eventually stop when diminishing returns set in.
When a flight test program is forced to mitigate perpetually because it can’t revise substantially – usually due to budget (like liquid boosters/metallic TPS/avionics/airframe stress/…) – it bargains with fate, taxing people/processes to the max as a substitute. If something like X-33, 34, 38, OSP … had been done/flown concurrently, there would have been a means external to the main program to prove/retire risks, feeding back into the program. That takes away some of the enormous pressure to always be right in a million, million details … hundreds of times.
A real fear – alleviated by a million software verification checks – was that somehow the solid rocket boosters would be ignited and something – an RSLS Abort command for example – would stop the launch sequencer in the last milliseconds.
Every flight had this. Like every SLS flight may have as well – although the stack is simpler and there’s a LAS. The accumulated cost + risk was a major factor in the Shuttle program due to the non liquids choice (expediency) of boosters – which could never be “evolved” / “advanced” out of the program.
I think you give up too easily. Work harder
As an Aerospace Engineer I find these personal accounts as valuable as, if not more valuable than, the homogenized reports. I worked on many a deep space mission that had much more stringent launch schedules than the Shuttle. If you miss the launch window the next stop could be the Smithsonian and not space. However the unmanned missions did not have anywhere near the pressure of protecting a human crew from failure. As long as man goes into Space, man can die in Space.
Having read the reports and a few accounts like yours, it did not seem to me the Shuttle Program was placing personnel under excessive pressure due to schedule. There are hundreds of surprises/incidents/anomalies involved with every mission. Lines have to be drawn as to which issues are a priority and which are not a problem. The “disposition” probably does not mean the issue is forgotten or ignored but that no further investigation is warranted and the rest of the actions are to analyze the evidence to full closure. After all, the Shuttle is only in flight a couple of weeks and some issues can take weeks/months/years to close. Working on Rovers etc allows you to delve deeper into issues because the whole Project can fit in a 2 car garage. If you want to do statistical experimentation on the Shuttle you need a small Island of real estate and a standing army. With any space mission the space and especially Launch environments are difficult to adequately simulate. This is especially true of the Shuttle; the enormity of the Shuttle is awe inspiring and daunting.
I guess part of this particular problem was a lack of visualization. The fragility of the leading edge of the wings seemed not fully appreciated. I keep picturing Richard Feynman at the Challenger Investigation putting an O-Ring in a pitcher of ice water and showing how brittle it becomes. I guess the right person did not visualize a small piece of foam saturated with water and frozen as having such an impact. A Flight Rule of not launching if ice is discovered on the vehicle could have given the Shuttle a perfect operational record, but no launches in January.
From this Tragedy, as from any, it is important that we learn from our mistakes so as not to repeat them. I work for SpaceX and we are moving ahead as thoroughly and safely as we can towards a new era of Manned Space Flight. I witnessed the first landing at Edwards and the last miles of the Shuttle Program in the streets of Los Angeles. I speak for many when I say Thank You for your many years of dedication to NASA and the Shuttle and thank you for your personal account that I will help pass on to the next generation of diligent Engineers.
Of course, the foam off of Columbia’s tank did not have any water or ice in it. From the very start of the program, there were stringent Launch Commit Criteria that prohibited launching with ice on the tank. Later tests at Southwest Research showed that foam alone (dry, no water, no ice) at the velocities involved could destroy the reinforced carbon carbon wing leading edge.
So be careful of what you assume; it doesn’t take ice to bring down a spaceship, sometimes something as light and insubstantial as insulating foam can kill a rocket.
Good luck, do good work.
Living in San Antonio, I am familiar with Southwest Research Institute and their work. I also have owned and ridden several motorcycles, and as another poster mentioned, have experienced the pain tiny bugs can cause at speed. As such, when the results of the foam tests were made public, I wasn’t surprised at the results. I was surprised that when fuel tanks started shedding foam in flight, that NASA didn’t do the tests they contracted SWRI to do post-Columbia.
A wee ways back I was arguing that “The System” is more important than leadership, that usually leaders are still at the mercy of whatever constraints that the system they’re operating under places upon them, and to me this series of posts only confirms my thoughts.
Good write-up Wayne.
I have read through all your posts on STS-107 now. Hard but very good read. Thank you.
STS-112 piqued my curiosity tho with the near RSLS abort. When the abort light lit up was there any noticeable reaction in the room? I assume you were not the only one who privately went through a moment of chill.
Not many people saw it and almost everybody was concentrating on the actual flight operations.