When I was a rookie Flight Controller at JSC’s Mission Control, my boss had a hand-drawn poster on the wall of his office. During tedious checklist and flight rule reviews I had plenty of time to study that poster; I wish I had a copy of it today. It featured a hand with the thumb upraised and words that went something like this: “Thumb’s Rule: A Simple, Easy to Understand Falsehood is More Useful than A Complex and Incomprehensible Truth”.
In the Internet age, it seems many folks practice this maxim. After all, the 10-second sound bite on the evening news or the 140-character tweet conveys all the information the public needs to know about the complex and difficult problems facing us today, right? Just as the shouting heads on the “news” channels have replaced thoughtful adult conversation about how to move forward through the challenges of 21st-century life.
Nor is a thousand-word blog post going to fully convey the complexity of the real world.
February is a month of introspection for me, and the events of 8 years ago have been on my mind.
In the Columbia accident investigation report, there are several pages devoted to the use of a computer program called “Crater” which analyzed potential damage to the thermal tiles. The results from that program indicated that no serious damage had been done to Columbia’s tiles and that a safe landing would therefore occur.
Disaster occurred instead.
The accident investigation board spent a substantial amount of time looking into the Crater program and castigated NASA for using Crater to assess damage on Columbia. That program was developed for other uses and had been validated for tile damage assessment only with very small impacting objects. Clearly it was used inappropriately during the last flight of Columbia.
The Columbia Accident Investigation Board got all the big things right in my opinion, and most of the small things as well. But on this particular item, well, I’m having second thoughts.
Suitably chastised, we wrote new rules governing the use of any computer program analyzing information for the space shuttle. The limits of each program’s validation had to be clearly stated, and results were not to be used if the inputs fell outside the validation testing.
Meanwhile, we spent the next three years shooting different objects at shuttle tiles to determine how they could be damaged by impacts. This work became very important to assessing safety in areas where we could not completely eliminate the liberation of debris.
And after all this testing, it seemed that Crater accurately predicted tile damage, at least to first order, even for larger, harder, faster debris.
So the simple statement that NASA used the Crater program outside its validated limits and came to erroneous conclusions is . . . not quite the complete truth.
So what really happened?
It is complex.
During Columbia’s last flight, significant effort was made to assess the possible damage from debris during launch. Ultimately, that effort was misdirected. From the fuzzy long-range video, the strike appeared to be near the front of the left wing, but whether it hit the fragile tiles or the RCC wing leading edge panels was unclear.
During the flight, the RCC experts were unanimous in the opinion that the RCC would not have been damaged by such a strike. One of them, an engineer for whom I have great respect, told me after one of the MMT meetings: “That RCC is tough. We shot ice and other hard stuff at it and couldn’t break it.” That was the consensus opinion in late January 2003.
Unfortunately, as we know, that was wrong. During the investigation, it took a full-scale test, shooting a big hunk of foam at a flight-like installation of RCC panels, to prove to the experts that damage could occur.
No damage evaluation, no computer model, no assessment other than expert opinion was used during Columbia’s flight to conclude that the RCC was undamaged.
So what conclusions can we draw in retrospect?
Should computer models be used only within the limits to which they have been validated by test data? That would be a good practice if analysis tools could cover all conceivable situations. But the real world always surprises us. It is important to know when a situation has left the limits of a predictive program’s validated use. Even knowing that, the predicted results can sometimes be useful, and sometimes they can even be correct. Use such analysis with care and great trepidation, but sometimes it is better than nothing.
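To make that concrete, here is a minimal sketch, in Python, of the kind of envelope check such a rule implies. Everything in it is hypothetical: the function names, the envelope numbers, and the stand-in damage formula are invented for illustration and have nothing to do with Crater’s actual inputs or limits. The point is only that the model declares the range over which it was validated, and an out-of-range input produces a loud warning rather than either a refusal or a silent answer.

```python
# Sketch of a predictive model wrapped with an explicit validation
# envelope. All names and numbers are hypothetical illustrations,
# not Crater's actual inputs or limits.

from dataclasses import dataclass


@dataclass
class ValidatedRange:
    """The span of an input over which the model was validated by test."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Hypothetical envelope: this model was validated only for very small,
# relatively slow impacting objects (illustrative numbers).
ENVELOPE = {
    "debris_mass_lb": ValidatedRange(0.0, 0.03),
    "impact_velocity_fps": ValidatedRange(0.0, 800.0),
}


def predict_damage(inputs: dict[str, float]) -> tuple[float, list[str]]:
    """Return (damage estimate, warnings for any out-of-envelope input).

    Out-of-envelope inputs do not block the prediction; they flag it as
    an extrapolation so the engineer can weigh it accordingly.
    """
    warnings = [
        f"{name}={inputs[name]} outside validated range "
        f"[{rng.low}, {rng.high}]: extrapolation, use with care"
        for name, rng in ENVELOPE.items()
        if not rng.contains(inputs[name])
    ]
    # Stand-in damage formula; a real model would go here.
    damage = inputs["debris_mass_lb"] * inputs["impact_velocity_fps"]
    return damage, warnings


estimate, flags = predict_damage(
    {"debris_mass_lb": 1.7, "impact_velocity_fps": 750.0}
)
print(f"Predicted damage index: {estimate:.2f}")
for flag in flags:
    print("WARNING:", flag)
```

The design choice is the honest one argued for above: the tool still answers, but it tells you when you have left the territory the test data actually covers, and leaves the judgment call to the engineer.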
A more important lesson is that experts, even in their field of expertise, can be wrong. As managers of complex vehicles flying in extreme environments, using exotic technology with small margins for safety, we tend to trust our experts. That trust must be tempered with the knowledge that they are not always right. Somewhere in the budget-limited, schedule-compressed environment that is modern spaceflight, there has to be enough capacity, enough capability, and enough time, money, and equipment to check the experts’ opinions.
Especially when it means life or death.
That is a complex, almost incomprehensible truth that is going to haunt spacecraft designers and operators for a long time to come.
I’d replace that poster in my old boss’s office with this new rule of thumb:
“You Are Not As Smart As You Think You Are”