Reusable Launch Vehicle Certification

by

Edgar Zapata

National Aeronautics and Space Administration

Kennedy Space Center

August 1995


A shorter version of this paper was presented at the Thirty-Third Space Congress, Canaveral Council of Technical Societies, April 1996.


The following "paper" (I doubt anyone will read all this off the screen...) addresses the subject of certification for truly reusable launch systems. It's a somewhat different take on the subject than the "build it and then certify it" approach. The term "true certification" as I refer to it throughout this writeup precludes certain designs and approaches. They would be "un-certifiable" or "not truly certified" even if built. Although I've worked up a bulletized version (give me the "soundbite version"), I'll not place it here, given that the charts require a good amount of narration to make sense. It may be time to do some real engineering again: design, build, blow it up, learn (yes, learn) and change direction (heresy) if necessary, re-design and do it all over again.

The footnotes (publishing's original form of hypertext) have been placed at the end of the document for now; they are the occasional odd letters, hypertexted, in parentheses. The references are the occasional odd numbers in parentheses. Any peer review comments would be appreciated.

edgar.zapata-1@ksc.nasa.gov


Abstract

This paper will discuss the flight certification of the next generation Reusable Launch Vehicle (RLV). It will define certification as currently understood, as it will be required in the future, the difference between these two, and what this difference means for the next generation systems design.

Together, NASA and industry have been tasked with demonstrating technologies focused on a reusable single stage to orbit (SSTO) vehicle which will dramatically reduce the costs of achieving low earth orbit. This vision of routine and affordable access to space, if achieved, is driven toward bringing the benefits of space to humanity through a quantum leap in accessibility by means of drastically reduced vehicle turnaround times and recurring flight costs. The approach to certification will be key to the success or failure of this endeavor. Previous and current space vehicle efforts are familiar with the term "certification". However, the express goals of the RLV program, by necessity, will alter the current definition and mindset toward "certification".

This paper will focus principally on the certification process from two perspectives. The first is the NASA Shuttle operation and its approach to certification. This is chosen for being the only current space vehicle with partial reusability. It is, therefore, a starting point. The second perspective is the virtual target for operation of the next generation RLV. This vehicle has yet to be defined although many concepts, technologies and approaches are being worked by NASA and industry. It is precisely the approach to certification that will shape the future of this program and the eventual approach to the design. This approach, in turn, will define the resulting configuration.

In the process of defining certification for the RLV previous efforts in this area will be reviewed. The subjects of reliability, vehicle health management or "smart systems", affordability and supportability will be discussed in relation to the issue of certification. Also, the relationship of certification to launching off a range versus flying off an operational site will be reviewed. Other applicable subjects such as the methods of the U.S. Federal Aviation Administration will also be discussed. Current RLV concepts and program material relevant to technology pursuits and goals will also be reviewed with relation to the subject of certification.

Certification

Definition

The use of the term "certification" as focused on here is that process which assures a design is capable of safely carrying out its intended purpose. For a Reusable Launch Vehicle certification, the goal is to assure flightworthiness of a system. It is also intended that the certification be for "continued" flightworthiness, since reusability is a principal characteristic of the system.

It is not the intent here to use the term certification to reference methods or processes that assure the readiness for use of a particular vehicle or launch system. Neither is the intent here to use the term as that process by which a particular vehicle is made or maintained flightworthy. For this, the term maintenance will be used as separate from certification.

The Current NASA Shuttle Certification Process

In any discussion of Shuttle certification a distinction must be made between that certification which is a part of the research, design and development process through implementation versus that certification which occurs continually such as from flight to flight. The first type involves development of systems to a degree that subsequent assemblies can be manufactured and operated without having to undergo the same degree of test or scrutiny. The second type involves processes which assure a particular, actual assembly (part, component, subassembly, line replaceable unit or LRU, system or whole vehicle and ground system) is ready for operational use or flight. This last may also be called maintenance though many sub-categories of processes may be identified here for a system such as the Space Shuttle.

A current example of the first type of certification is the Space Shuttle Main Engine (SSME) Alternate Turbopump Development (ATD) program. Turbopumps are a major part of the turnaround work on the SSME's. The Shuttle main engines in particular, and propulsion in general, are among the main drivers of Shuttle recurring costs, whether strictly at Kennedy Space Center (KSC) or through the associated infrastructure (people and facilities) that exists elsewhere around the country, such as in these development efforts.

A related example of the second type of certification is the tracking and application of allowable life limits to processing a particular serial number engine. This determines what stays versus what is removed and replaced during engine refurbishment for example.

These two uses of the term "certification" are closely interrelated. One tie (a) between the two is the fleet leader program. Fleet leader components are set aside for the express purpose of being tested well beyond what would be acceptable for flight. Their purpose is to provide experience and insight into allowable life limits for components. Low and high cycle fatigue limits, analogous to starts and seconds (all similar to aircraft terminology), are determined by previous histories which include fleet leader components. For example, (17) when extensive fleet hot-fire exposure data are available and there is no failure or major material review disposition (MRD) history, the life limit may be 25 percent of the fleet leader. Under some criteria (requiring periodic inspection) this may be 50 percent of the representative fleet leader or the failed unit if one has occurred.

The above values of 25 and 50% are the result of applying factors of safety of 4 and 2 to the components. This would seem to indicate a high degree of safety and margin in the hardware. However, the ATD program was begun precisely to address criticality issues first, and second to eliminate removal and replacement of the Rocketdyne Turbopumps every flight. For many components safety factors of 2 or 4 can and do equal only a few uses given fleet leader experience that shows short useful lives.

If a fleet leader builds an experience base that, when factors of 2 are applied, results in operating limits that are low, the certification of systems built from those components has that much greater difficulty in realizing any goal of reusability with little or no maintenance between uses. For example, a Deviation Approval Request (DAR) for a turnaround duct may establish a life limit of 5 starts where the original requirement was for 60. Only by continuing to accumulate time on the fleet leaders (or development articles in this case) and having no failures can the limit be raised eventually toward the original goal.

Returning to the distinctions about the types of certification, it becomes clear that the second type, certification of particular hardware, might more appropriately be called "Verification of Flight Readiness" or "worthiness". As the useful life of systems diminishes and the resource intensiveness of turnarounds increases, the "verification" becomes equated with "certification." This second type is causally related to the first type (a truer use of the term): certification of designs so that subsequent assemblies need not receive the same degree of scrutiny (resources) while still assuring readiness for operation, safety during use, and the capability to carry out the intended purpose.

The ATD LOX turbopump, by virtue of its certification process, will allow greater reuse without any changes in safety factors. Having accumulated run times on units equal to over 40 flights or 20,000 seconds, the pump is then safely used up to 10 times (using the factor of 4) without any scheduled maintenance. This assumes, among other things (b), no major failures during this testing. This is not to say the road to reusability is entirely clear.
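The life-limit arithmetic described above can be sketched in a few lines of Python. This is only an illustration, assuming the simple rule stated in the text (allowable life equals demonstrated, failure-free fleet leader exposure divided by the factor of safety). The function names are hypothetical, and the actual SSME life-limit criteria involve more conditions than this.

```python
def allowable_life(fleet_leader_exposure: float, safety_factor: float) -> float:
    """Allowable life = failure-free fleet leader exposure / factor of safety."""
    if safety_factor <= 0:
        raise ValueError("safety_factor must be positive")
    return fleet_leader_exposure / safety_factor


def required_exposure(target_life: float, safety_factor: float) -> float:
    """Failure-free fleet leader exposure needed to support a target life."""
    if safety_factor <= 0:
        raise ValueError("safety_factor must be positive")
    return target_life * safety_factor


# ATD LOX turbopump figures quoted in the text: 40 flights, factor of 4.
atd_flights = allowable_life(40, 4)       # 10 flights between removals
atd_seconds = allowable_life(20_000, 4)   # 5,000 seconds

# Illustration only: exposure a fleet leader would need to accumulate before
# a 60-start requirement (the turnaround duct's original goal) could be met.
starts_needed_factor_4 = required_exposure(60, 4)   # 240 starts
starts_needed_factor_2 = required_exposure(60, 2)   # 120 starts

if __name__ == "__main__":
    print(atd_flights, atd_seconds, starts_needed_factor_4, starts_needed_factor_2)
```

The second function makes the cost of the approach visible: every additional allowed use must be paid for several times over in test exposure without a failure.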

Alternate Turbopump Development - An Example

The establishment of longer life limits through extensive testing is a focus of the Pratt & Whitney ATD program. This basic philosophy should allow the LOX turbopumps on the SSME's to be left on for 10 flights without a need for removal as exists on the Rocketdyne design. However, the new turbopumps, set to fly on STS-70, have had their share of problems. The first flight P&W High Pressure Oxidizer Turbopump (HPOTP) was shipped to KSC installed on an engine but nonetheless had to be removed and replaced later on. The new preburner boost pump impeller end balancer material was deemed inadequate(c). The replacement too had its problems, in this case an inlet guide vane crack issue. This will require inspections, not planned at all, after every flight for this particular S/N unit only. This is not to say the approach or basic philosophy is flawed. Actually, the ATD will likely enhance reusability (possibly saving up to a week's worth of work per SSME turnaround). The basic approach is what will likely be required in the future - only more so.

Consider the implications for a program such as RLV. The ATD program has so far spent $1.2 billion on one turbopump. For the RLV, 10 flights is only the beginning. Not only will components have to be reusable to degrees not yet seen, they will have to take the additional step of requiring no inspections. The R&D effort required for the realization of this goal will likely require, even with much improved cost control and management techniques, far more than the currently foreseen funding for RLV. "Aircraft type operations" will require true, extended reusability built into a design as a result of certification of the first type, rigorous development of systems to a degree that subsequent assemblies need not require the same degree of test or scrutiny, and the associated baggage - manpower and infrastructure.

Aircraft Type Operations - The FAA

Government rules and regulations are imposed on airplane manufacturers and operators to guarantee the general public a certain level of safety. This is done through the Federal Aviation Administration (FAA). The origins here trace back to 1926 and the Air Commerce Act which authorized the first significant federal regulation of civil aviation. Duties given to the Secretary of Commerce included fostering air commerce, establishing airways, investigating accidents and certifying aircraft. The new Aeronautics Branch that was formed eventually led to the formation of the FAA in 1958. By 1970 responsibilities also included setting airport safety standards and certificating those facilities as well.

It may be argued that using aircraft type analogies to space operations avoids obvious differences such as the extremely demanding operational environments of a launch or the on-orbit and re-entry environment. However, it is relevant here to consider approaches (not results) used in the aircraft arena that reflect on how to one day make launch systems that do have "aircraft like operations". Aircraft designers and launch system designers must each design for certain environments. The question is not whether the environments are similar but whether the approach used in one situation to meet requirements is applicable to the other. For an SSME, as an example, the major contributor to life limitations is the internal thermal environment. (10)It is estimated 70% of the problems encountered on a high-pressure fuel turbopump (HPFTP) are thermally induced. Transients(d) such as the startup process represent the most severe environment. The startup temperature transient is especially a major element in limiting turbine life. The thermal shock to the turbine airfoils during preburner startup may be imagined by visualizing a surface that wants to expand but a core that is cool and only catches up later. This thermal delta occurs quickly but with enough of a difference to cause thermal stresses and hence crack propagation concerns. Again, aircraft do not see such environments, but some approaches used in the aircraft world to meet their environments can shed light on how to overcome launch system environments. If a goal of any reusable launch vehicle is to have operations like aircraft it is certainly relevant to review how aircraft got where they are.

Advisory Circulars are used by the FAA as non-mandatory guides to meeting actual requirements or Federal Aviation Regulations (FAR's). Start-stop cyclic stresses or low cycle fatigue (LCF) are addressed in the (8)following:

"By a procedure approved by the FAA, operating limitations must be established which specify the maximum allowable number of start-stop stress cycles for each rotor structural part (such as discs, spacers, hubs, and shafts of the compressors and turbines), the failure of which could produce a hazard to the aircraft. A start-stop stress cycle consists of a flight cycle profile or an equivalent representation of engine usage. It includes starting the engine, accelerating to maximum rated power or thrust, decelerating, and stopping. For each cycle the rotor structural parts must reach stabilized temperature during engine operation at a maximum rated power or thrust and after engine shutdown, unless it is shown that the parts undergo the same stress range without temperature stabilization."

For materials the suitability and durability must "Be established on the basis of experience or test".

The actual number of cycles will be derived in many cases from pre-approved procedures already on file with the FAA for establishing initial LCF lives. The most severe mission cycle will be used in these determinations.

Tests that will form a part of the high-cycle fatigue (HCF) profile for turbine aircraft engines include 150 hour endurance tests accumulated by running 6-hour test sequences 25 times. If major repairs or the frequency of service is excessive during these tests then the engine will be subjected to further tests. Other tests will include vibration, calibration, detonation and operation tests. In conducting these (6)block tests separate engines of identical design and construction may be used for each of the various tests.

A current example here is the certification of the Pratt & Whitney PW4084 powerplant for the new Boeing 777. Tests included 3000 simulated (off aircraft) flight cycles with no major component failures and the ability to demonstrate maximum continuous thrust for 3 hours in repeated testing. Flight tests will further include another 1000 cycles. Again, the emphasis is "no major component failures" so as to demonstrate high life limits for both low and high cycle fatigue. Unique to the PW4084 case is the goal of demonstrating extended twin operations (ETOPS), that is, the ability to maintain the performance necessary for single engine flight. The focus here is to develop and demonstrate from the start the suitability for a particular type of operation such as ETOPS, rather than for operations which are less demanding but more constrained. An alternative would have been to initially certify to a less demanding operation and then allow the evolving flight experience to extend the operations envelope. This could also have resulted in ETOPS certification eventually. By certifying for ETOPS, customer requirements for flexibility (usable on many routes) and reduced operations costs (twin jets versus aircraft using more engines to allow ETOPS) are enabled from day one of delivery.

Maintaining certification once it is achieved will be done through maintenance according to certification maintenance requirements (CMR's). CMR's should not be confused with maintenance requirements arising from certification nor with other scheduled maintenance requirements. For a Boeing 767 the (3)"CMR tasks are identified whenever system probabilities and failure effects are not expected to fall within an acceptable range without a periodic maintenance requirement". CMR's or any changes to CMR's are approved exclusively by FAA engineering. For a 767 most of the CMR frequencies are in the range of thousands of hours of flight time. Prior to these times other requirements that may have been scheduled may have covered the items. These frequencies also list in the thousands of cycles and hours of usage.

For comparison, a review of a high pressure fuel turbopump (HPFTP) on a Shuttle SSME will show a life(e) usage of 8 starts and 2767 seconds. This does not represent usage since the last check. It does not include repair and refurbishment. Major work may have been performed on the assembly during this time. Failures may also have occurred. The usage numbers are only useful to a next order assembly such as the housing(11). Fleet leader numbers may be 69 starts and 25861 seconds for a HPFTP. Again, the same caveats apply. Work on the Rocketdyne Shuttle HPOTP's is done after every flight. Removal from the engine is driven by recurring problems with a tip seal retainer. This retainer uses set screws, staked in place, to hold it in. The backing out of these screws and a history of machining tolerance problems on the tip seal itself drive the removal every flight on the HPOTP's. Unlike other work which is also required every flight, this particular item, an inspection, cannot be done with the pumps installed on the engines. Following the fulfillment of requirements such as these, a particular serial number set of pumps may be certified for a particular Shuttle flight. However, these will have limited relevance to other pumps and other certification processes. It is the fleet leaders that will bear greatest relevance in certifying not only individual similar items but also the design in general. Again, however, short fleet leader lives will in turn create short actual lives for components to be used.

This is not to say that life limits comparable to the aircraft analogy apply to the case of the launch system. The need and ability for loitering capability in aircraft has as yet no equal in a comparison to a launch vehicle during ascent. The relevance is to methodology used in one case such as aircraft as a reflection on the testing for certification of candidate reusable launch vehicles and systems. The rigorous testing used for the design and subsequent certification of aircraft systems for high life limits and usage results in designs capable of extended reuse with no major failures between uses. This reflects on what will be required to design and certify a reusable launch vehicle for extended reuse with no major failures between uses.

Interestingly, before the FAA was formed, the certification of airline pilots and airplanes was done by Underwriters Laboratories (UL) which is today the world's largest independent certifier of product safety. While safety and reliability or reusability issues do not always overlap, they have in common a need for systems to demonstrate the ability to operate as planned under the most severe operating and environmental conditions. UL is well known for testing products by turning them on and off hundreds of thousands, perhaps millions, of times or perhaps turning them on and leaving them that way for ridiculously extended time periods. Design and destruction leads to the familiar UL symbol on multitudes of products which are safe and certified. It may be said they are "free of range constraints".

Implementation

The certification of a Space Shuttle for launch involves a variety of organizations. Requirements are created through one process (Requirement Change Notices). Waivers to these requirements are worked by another. Tracking work as it is performed is carried out by still another. One key organization involved in implementation, for example, is NASA Vehicle Engineering. In conjunction with the Shuttle Processing Contractor (SPC), work instructions are written to satisfy requirements. When work is actually performed by the technicians (or engineers in the Launch Control Center), the engineering organizations such as NASA, SPC and the Launch Site Support (LSS) for the particular element will be involved to varying degrees, given the high degree of unpredictability (unscheduled work) that inevitably arises during planned work. The first-hand involvement by these organizations assures that the requirements have in fact been properly met and that higher level organizations can use certification information provided by the lower level organizations to certify the whole system as "ready for launch." The process is especially manpower intensive. Further, the KSC element involvement in the Shuttle program accounts for 25% of total Shuttle costs, an indication of even greater degrees of oversight as well as other program functions away from the launch site.

Processes such as the Shuttle certification drive infrastructure, a combination of people and facilities. The high degree of intrusiveness in the vehicle processing (planned disassembly and reassembly) as well as spares consumption (unscheduled removals and replacements) is a major driver in the degree of oversight that is used to assure mission reliability. In turn, low life limits on many components drive processes away from being repetitive. Again, this reflects on the required infrastructure.

CoFR - Space Shuttle and the Paper Trail

The Space Shuttle Certificate of Flight Readiness (CoFR) Review(14) process documents the readiness for flight of a Space Shuttle Vehicle. It consists of seven major reviews. It encompasses all the required preparations for a Space Shuttle mission, from acceptance of the major hardware elements through processing, mating, launch, and ferry when required. As an example, the most important of these reviews is the Flight Readiness Review (FRR). This review process leads up to a certification that includes, but is not limited to, 34 items which must be signed off by contractor and NASA management. The total number of organizations now involved is 29. One organization, such as Shuttle Management and Operations, will be responsible for over one third of the items on the list of 34.

The Space Shuttle "certification" has certain key characteristics. First, the certification is "one use only". It is not intended in the Shuttle use of the term certification to apply to the launch-worthiness of a fleet of vehicles or to a system of similar design. The certificate applies only to the case at hand and expires immediately upon landing of that vehicle. The whole process is repeated for this vehicle and the others in the fleet every flight. Also, under close scrutiny, neither is the process one of "maintaining airworthiness" given the degree to which much of the vehicle is replaced from launch to launch. This last point applies not only to new and major expendable elements such as the shuttle ET and practically new elements such as the SRB's but also to the large number of new components on each of the orbiters. This last number(f) may be 100 to 150 LRU's from flow to flow not counting some major components in the main engines and tile work on the TPS.

Second, the certificate is also dependent on the results from a previous flight. FRR's are generally scheduled 2 weeks prior to the launch. This ensures that most of the work of launch preparation will be reviewed with only a small delta up to the actual launch date, while also assuring time to resolve issues raised prior to launch. Another preference is for the last flight to have landed prior to this review. Information on the SRB's (nozzles, O-rings...) as well as the addressing of in-flight anomalies or IFA's that may affect the next flight is a part of the decision making process. To realize how distinct this situation is from aircraft, imagine having to land the previous DC-10, tear down some components and address issues on this aircraft, prior to allowing the next one at the gate to taxi for takeoff.

Third, the certificate of flight readiness is based not only on scheduled work which complies with detailed lower level operations maintenance requirements (OMR's) but also on significant quantities of both unscheduled work and support work which is not documented in the OMR specification. Unscheduled work is documented in a problem reporting system that causes either unscheduled maintenance actions or a review that may accept the condition. Unscheduled work will significantly contribute to the final certificate. Although the proportion varies from system to system, an average Shuttle flow may consist of as much as 50% unscheduled work out of the total work performed (in manhours). Support work, for example a functional verification of pad systems after rollout, also flows into eventual certification through lower level certificates that are processed by each system.

Relationship of the STS Certification Approach to Other Issues

Reliability

Any discussion of reliability must draw a distinction between mission reliability and the reliability of a system as it is being prepared for its mission. Reliability during processing should more appropriately be distinguished as "support reliability" or "process reliability". Mission and support reliability are related. Mission reliability goes up when redundancy and hence complexity is increased. Support reliability is worsened by additional parts and complexity. Thus, the STS per mission certification approach can assure high mission reliability (0.985 demonstrated reliability) for a very complex system. This same degree of complexity, however, assures this can only be done at great cost due to high support requirements. For example, consider a system at 0.98 mission reliability with certain types of parts. Using the same parts, but only half as many, automatically gives a 0.99 mission reliability with no change in the reliability of the individual components. The reverse is also true. This has important implications for a launch system certification. Even assuming a rigorous test program for components and systems as part of a certification process, the degree of complexity of the vehicle could still greatly affect the support reliability and hence operating costs. The parts count at various levels represents "opportunities" for failure. The more parts, the greater the number of opportunities. While these may aid the mission, through redundancy, they will not aid support costs.
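The halving example can be checked numerically. The sketch below assumes the simplest possible model, which the text does not state explicitly: N identical, independent parts in series with no redundancy, so that system reliability is the part reliability raised to the power N. The part count of 1,000 is an arbitrary placeholder; under this model the result does not depend on it.

```python
def system_reliability(part_reliability: float, n_parts: int) -> float:
    """Series model (an assumption): every part must work, parts are independent."""
    return part_reliability ** n_parts


def part_reliability_from_system(system_r: float, n_parts: int) -> float:
    """Invert the series model to recover the per-part reliability."""
    return system_r ** (1.0 / n_parts)


N_PARTS = 1000  # hypothetical part count, for illustration only

# Per-part reliability implied by a 0.98 system built from N_PARTS parts.
r_part = part_reliability_from_system(0.98, N_PARTS)

# Same parts, half as many: equals sqrt(0.98), roughly 0.99.
r_half = system_reliability(r_part, N_PARTS // 2)

# "The reverse is also true": same parts, twice as many gives 0.98 squared.
r_double = system_reliability(r_part, 2 * N_PARTS)

if __name__ == "__main__":
    print(f"half the parts:  {r_half:.4f}")
    print(f"twice the parts: {r_double:.4f}")
```

Redundant (parallel) arrangements behave differently, which is why added parts can raise mission reliability while still adding "opportunities" for failure on the support side.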

Certification for a truly reusable launch system complicates the previous picture. The two variables, mission reliability and support reliability now have another component, certification. It may be considered certification is a subset of support reliability. As complexity increases to assure mission reliability, support reliability decreases, requiring more intrusion into launch systems and components resulting in a certification that approaches the Shuttle model of certification for "one use only". The following diagram 1.0 shows this relationship.

Diagram 1.0

Arguably, if aircraft are considered to have less complexity than launch systems, then curves C1 and C2 would be appropriate. This demonstrates how a high level of mission reliability, A on curve C1, can, for a given resulting complexity, have a correspondingly high resulting mean time between unscheduled removals (MTBUR) of B on curve C2, and thus low support requirements. This system is "supportable" and approaches the aircraft model of certification. As system curves move further down and to the right, such as C3 and C4, a lower mission reliability of C can, for a given resulting complexity, have a correspondingly low resulting MTBUR of D on curve C4, and thus high support requirements. This system is not supportable and approaches the Shuttle model of certification. The MTBUR contributes, along with other factors resulting from complexity, to intrusion on the system. Other factors may include pure support, such as connecting an interface or umbilical. Such an item has not failed and required removal, but refurbishment may be required after each use. It is simply required to operate and adds to support requirements.

The key point here is intrusion. Low MTBUR's as well as high degrees of complexity can move certification processes to the right, toward intrusion, toward Shuttle style certification.

Vehicle Health Management (VHM)

Vehicle Health Management, often used interchangeably with condition monitoring or diagnosis, is loosely defined(2) as periodically or continuously sensing, measuring and recording data about the operating parameters of a machine in order to support decisions related to the operation and maintenance of the machine. Other definitions include establishing links between acquired data and operational conditions so as to have a rational basis for fault diagnosis. Putting aside for now instrumentation, data processing and decision making related to development efforts, the use of VHM in operational systems may be divided into 2 categories, in flight and on the ground. The first category, flight, is generally much more developed than the second. The Shuttle case is similar. Flight operation health management is more developed than ground operations health management, which is virtually non-existent. The priority given to high mission reliability, as well as the practical need to diagnose flight operations from afar, has driven this. However, low Shuttle MTBUR's and low support reliability have not been approached from a system health management perspective.

Critical functions which must be verified on a launch system include much more than the current in flight instrumentation emphasis on pressures and temperatures. Assuming high MTBUR's and the development of highly reusable systems, many functions of a launch system would likely still require verifications in order to assure readiness for flight. This would, in many cases, be regardless of high reusability and high MTBUR's.

For Shuttle, verification for flight involves a multitude of verifications of fluid system integrity. This is driven primarily by the existence of closed compartments such as the orbiter aft and the ET intertank. Although system surveillance systems can indicate the presence of external leakage past an interface the systems in use are not capable of determining the source of the problem. This requires manpower intensive methods, typically violating system integrity, and also the use of hazardous purges in these compartments. Besides external leakage concerns such as interfaces, seals, welds, porosity and structure, there is also internal leakage. Internal leakage verifications include check valves for backflow leakage, valve closure, regulators and relief valves. Functional checks further include position and positive closure or engagement. Internal or external, leakage or functional, a common trait in the Shuttle processing approach is a violation of system integrity due to intrusions(g) into the systems being verified. Intrusion here is meant to refer to personnel access to systems in compartments, installation of access kits, subsequent disassembly of systems for verification as worst case, access to specialized ports and plugs as best case, and reassembly followed by still further testing requirements.(h)

A VHM system with an emphasis on ground turnaround and non-intrusive verification of systems integrity and function does not exist on Shuttle systems. The number of in-flight anomalies detected during a flight reflects on this. Although on average few, less than 10 per flight for example, the actual number of components that will require removal, repair, refurbishment or replacement is many times this. The Shuttle components are not "wired" to indicate where these components are however. A Shuttle ground turnaround health management system does not exist. Such a system would relate to certification. High degrees of intrusion such as in the Shuttle model drive toward certification for "one use only." Eliminating intrusive operations enables the aircraft model of "certification for many uses". Operations such as verifying internal and external leakage, structural integrity verifications, purity and dewpoints of commodities, axial travel of shafts, torque checkout of pumps, integrity of electrical connectors, and hosts of other verifications that may still be required even for a highly reusable launch system will have to be done non-intrusively to maintain certification from flight to flight. Certification of the first type, certification of a design, is thus enabled by being maintained from flight to flight. Higher degrees of intrusion will drive toward certification of the second type, certification of a particular vehicle or system every flight, for one flight only. This would more appropriately be called maintenance or processing. Here, certification is not maintained, it is done every flight.

A VHM system emphasizing ground operations for turnaround would also require flight data as part of the information feeding into operation and maintenance decisions. Instrumentation already available for other purposes, such as flight operation, could provide data that, if recorded, could be processed and analyzed to determine the health of systems non-intrusively. Subsequent ground verification using manpower intensive techniques and violating system integrity could be avoided. An example of this is the F/A-18 Inflight Engine Condition Monitoring System or the Fleet Fatigue Monitoring System using strain sensors. Shuttle examples of this approach also exist such as in some components of the orbital maneuvering system (OMS). Again, the need for intrusive and hence manpower intensive operations is reduced. Not only does a true certification as previously defined require this, it is an integral part of low recurring costs for operation of a system.

Summary of Qualities of the NASA Shuttle Certification Process

In summary, the NASA Shuttle Certification process has the following principal characteristics:

(a) For many components safety factors of 2 or 4 can and do equal only a few uses given fleet leader experience that shows short useful lives.

(b) Shuttle type operations arise due to lack of extended reusability built into the design. "Aircraft type operations" will require true, extended reusability built into a design as a result of certification of the first type, rigorous development of systems to a degree that subsequent assemblies need not require the same degree of test or scrutiny, and the associated baggage - manpower and infrastructure.

(c) Rigorous testing for high life limits is not the norm. For comparison, the rigorous testing used for the design and subsequent certification of aircraft systems for high life limits and usage results in designs capable of extended reuse with no major failures between uses.

(d) The required infrastructure for Shuttle processing is a reflection of low life limits while still requiring high mission reliability.

(e) The certification is "one use only". The Shuttle certification process is not a process of "maintaining airworthiness" given the degree to which much of the vehicle is replaced from launch to launch.

(f) Intrusion is driven by low mean times between unscheduled removals (MTBUR's). Low MTBUR's as well as high degrees of complexity can move certification processes to the right, toward intrusion, toward Shuttle style certification.

(g) Vehicle Health Management (VHM) is not developed on Shuttle. This drives to intrusive operations as a part of verifications for flight. Intrusive operations again move certification processes to the right (reference diagram 1.0), toward intrusion, toward Shuttle style certification.

What Will be Required to Certify RLV

RLV Goals

An understanding of what the RLV goals and requirements are, followed by an understanding of what they mean is key to subsequently determining the impact on eventual certification practices. Higher level requirements lead to detailed lower level requirements. As an example, lower level Space Shuttle operations maintenance requirements from flight to flight form a basis for the one time certification of this vehicle. These requirements may change from one flight to another for various reasons which may or may not be accompanied by modifications to the vehicle hardware. Some requirements are inherent to the vehicle design while others arise or die based on a mature understanding or lack thereof of the design. One aspect of certification becomes clearer in this context. The certification process that will eventually arise for a vehicle is built into a design that responds to certain top level requirements that are met or not and become more detailed and understood with time. A review of the RLV top level requirements can provide insight into what the RLV certification process must be. Although these targets are moving, the relation of RLV goals and requirements to certification can lead to key insights.

The Challenge - Qualitative and Quantitative

The goals of the RLV Technology program are extremely ambitious. A decrease in recurring launch costs by an order of magnitude compared to the Space Shuttle, full reusability, and flight rates of 30 to 40 launches per year for a fleet of five vehicles are routinely referenced in the program. The target is high degrees of automation, vehicle health management and robustness over current designs. The term "leap frog" technology is often used.

The Cooperative Agreement Notice for the X-33, the RLV Advanced Technology Demonstrator, draws a distinction between goals and requirements. The former is not firm, the latter is a must. Requirements for the RLV include a 20 to 25K payload to the International Space Station at 220 nautical mile altitude and 51.6 degrees inclination. Other requirements such as standardized payload interfaces, docking and robustness to adverse weather are not as quantitative. Other more quantitative requirements include:

-The probability of launching within TBD days of scheduled is 0.95

-The probability of safe recovery of the flight vehicle per mission is 0.995

-The probability of safe recovery of the human passengers per mission is 0.999
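A rough sense of what these per mission probabilities imply over many flights can be had with a short calculation. The sketch below assumes, purely for illustration, that missions are statistically independent and that the per mission probability never changes. Neither assumption comes from the Cooperative Agreement Notice, and the function name is hypothetical.

```python
def prob_at_least_one_loss(p_safe_per_mission: float, missions: int) -> float:
    """Probability of one or more unsafe recoveries in `missions` flights.

    Assumes independent missions with a constant per mission probability
    (an illustrative simplification, not a program requirement).
    """
    if not 0.0 <= p_safe_per_mission <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if missions < 0:
        raise ValueError("missions must be non-negative")
    return 1.0 - p_safe_per_mission ** missions


# 40 missions corresponds to one year at the upper fleet flight rate
# quoted earlier (30 to 40 launches per year).
vehicle_risk = prob_at_least_one_loss(0.995, 40)
passenger_risk = prob_at_least_one_loss(0.999, 40)
print(f"vehicle: {vehicle_risk:.3f}, passengers: {passenger_risk:.3f}")
```

Under these assumptions, a year of 40 flights carries roughly an 18% chance of at least one failed vehicle recovery and roughly a 4% chance of at least one failed passenger recovery across the fleet.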

Quantitative RLV goals include a 7 day ground processing time from landing to launch and a 3.5 day ground processing time from landing to launch for reflight under emergency conditions.

Availability

Planning and evaluation efforts used to surface issues have other commonly used goals. One of these is an availability of 90%. What does this mean for certification? First of all, inherent availability has a standard definition(18):

Ai=MTBF/(MTBF+MTTR)

Where:

Ai = The Inherent Availability of a System

MTBF = Mean Time Between Failure

MTTR = Mean Time To Repair

This may be considered the fraction of time the system is in use rather than in stand-down or delay.

For operational availability(18) (Ao) :

Ao=MTBF/(MTBF+MDT)

Where:

MTBF = Mean Time Between Failure

MDT = Mean Down Time

Also used for operational availability(20) is:

Ao=MFHBM/(MFHBM+Mct+LDT)

Where:

MFHBM = Mean Flight Hours Between Maintenance (on aircraft)(i)

Mct = Mean Corrective Time

LDT = Logistics Down Time

Qualitatively Ao represents the expected percentage of time that a system will be ready to perform satisfactorily in an operating environment. Other definitions of availability are similar. For example, the probability that an aircraft, when used under stated conditions and in an actual operational environment will operate satisfactorily at the start of the mission when the mission is called for at an unknown (random) point in time. Also, as a principal factor of system effectiveness, availability is 1 minus the fraction of the aircraft not available due to maintenance and support only.
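The three definitions above can be compared side by side in a few lines. This is a minimal sketch: the function names are hypothetical, and the input figures are invented solely to show how the results differ once down time includes more than repair.

```python
def inherent_availability(mtbf: float, mttr: float) -> float:
    """Ai = MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)


def operational_availability(mtbf: float, mdt: float) -> float:
    """Ao = MTBF / (MTBF + MDT)."""
    return mtbf / (mtbf + mdt)


def operational_availability_flight_hours(
    mfhbm: float, mct: float, ldt: float
) -> float:
    """Ao = MFHBM / (MFHBM + Mct + LDT)."""
    return mfhbm / (mfhbm + mct + ldt)


# Hypothetical figures in hours, chosen only for illustration.
ai = inherent_availability(mtbf=90.0, mttr=10.0)
ao = operational_availability(mtbf=90.0, mdt=30.0)
ao_fh = operational_availability_flight_hours(mfhbm=90.0, mct=10.0, ldt=20.0)
print(f"Ai = {ai:.2f}, Ao = {ao:.2f}, Ao (flight hours form) = {ao_fh:.2f}")
```

With the same 10 hours of repair, adding 20 hours of logistics or other delay lowers the operational figure below the inherent one, which is the distinction the definitions are meant to capture.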

The difficulty of comparisons with Shuttle is readily apparent. What is a failure or time to repair or what is the down time for an orbiter for instance? Where can the line be drawn between "ready" versus "actually operating"? How resource sensitive are these definitions? What qualifies as a random point in time?

What is Space Shuttle availability?

For example, MTTR, MDT and Mct can be reduced almost arbitrarily for many systems simply by increasing the manpower and other resources available to ready a system. Consider current Shuttle resources. Using an average time between flights for a Shuttle orbiter of roughly 130 to 150 days as "Mct", and an average flight time of 14 days (MFHBM), the availability of a single orbiter is roughly 8 to 10%. It is preferable to use mean time between flight hours versus time between failures given that an orbiter in space, even though having various failures (IFA's), can be considered "operational" until landing. At that point the orbiter flight hours end (losing certification) and a new maintenance period begins. However, there is a generous assumption built into this view - that all the flight hours are operational. Actually, an orbiter in use is, in a sense, unavailable for another payload customer in a way that aircraft, with typically much shorter individual flight times, are not. Another assumption is that all the hours between orbiter flights are maintenance when some are failures or servicing or simply support (such as connecting umbilicals or interfaces). However, this would be the same as considering the 130 to 150 days as Mct+LDT and so the figure would remain the same.
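The single orbiter estimate can be reproduced directly from the figures quoted in the text. In this sketch the function name is hypothetical, and the 100 and 30 day split between Mct and LDT is invented only to show that the split does not change the result.

```python
def orbiter_availability(
    flight_days: float, maintenance_days: float, logistics_days: float = 0.0
) -> float:
    """Ao = MFHBM / (MFHBM + Mct + LDT), with every term expressed in days."""
    return flight_days / (flight_days + maintenance_days + logistics_days)


FLIGHT_DAYS = 14.0  # average flight time used in the text

# 130 to 150 days between flights, all treated as Mct.
low = orbiter_availability(FLIGHT_DAYS, 150.0)
high = orbiter_availability(FLIGHT_DAYS, 130.0)
print(f"single orbiter Ao: {low:.1%} to {high:.1%}")

# Splitting the same 130 days between Mct and LDT (hypothetical split)
# leaves the figure unchanged, since only the sum enters the formula.
split = orbiter_availability(FLIGHT_DAYS, 100.0, 30.0)
print(f"with Mct + LDT split: {split:.1%}")
```

The results, about 8.5% and 9.7%, match the "roughly 8 to 10%" quoted above.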

A fleet "in-operation" figure for Shuttle would, for 7 flights at 14 days each, have a per year figure of (7)(14)/365 or 27%. The prior range of numbers is similar to the Ao result that would be obtained using the definition of "1 minus the fraction of the aircraft not available due to maintenance and support only" which would be somewhere between zero and 25% for Shuttle depending on whether a flight is in progress or not (two orbiters never fly at the same time).
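Both fleet figures can be checked with simple arithmetic. The sketch below uses hypothetical function names and assumes a fleet of four orbiters, a number that is not stated in this passage but is what the zero to 25% range implies.

```python
def fleet_in_operation_fraction(
    flights_per_year: float, days_per_flight: float, days_per_year: float = 365.0
) -> float:
    """Fraction of the year during which some orbiter is flying."""
    return flights_per_year * days_per_flight / days_per_year


def fraction_available(vehicles_in_maintenance: int, fleet_size: int) -> float:
    """1 minus the fraction of the fleet held in maintenance and support."""
    return 1.0 - vehicles_in_maintenance / fleet_size


FLEET_SIZE = 4  # assumption: four orbiters in the fleet

print(f"in operation: {fleet_in_operation_fraction(7, 14):.1%}")
print(f"no flight in progress: {fraction_available(4, FLEET_SIZE):.0%}")
print(f"one orbiter flying:    {fraction_available(3, FLEET_SIZE):.0%}")
```

The first figure rounds to the 27% quoted above, and the last two give the zero and 25% bounds.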

The difficulty of comparisons with commercial aircraft such as jet airliners is readily apparent. Given high relative flight times for a space vehicle (days on orbit) versus an aircraft (hours), the space vehicle (any one in particular) decreases its "customer availability" (for another flight / customer) even if there is zero turnaround time simply by virtue of the orbit stay times.

Were the operational availability definition used as is, a 90% RLV availability for 7 day flights would require no more than a 19 hour turnaround per vehicle.(j)
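The 19 hour figure follows from solving Ao = F/(F+T) for the turnaround time T, which gives T = F(1-Ao)/Ao. The sketch below treats flight time F as the only up time and turnaround as the only down time, which is an interpretation of the definition rather than something the text spells out. The function name is hypothetical.

```python
def max_turnaround_hours(flight_hours: float, target_availability: float) -> float:
    """Largest turnaround T satisfying Ao = F / (F + T) for a given Ao."""
    if not 0.0 < target_availability <= 1.0:
        raise ValueError("target availability must be in (0, 1]")
    return flight_hours * (1.0 - target_availability) / target_availability


seven_day_flight = 7 * 24.0  # 168 hours
limit = max_turnaround_hours(seven_day_flight, 0.90)
print(f"maximum turnaround: {limit:.1f} hours")
```

The result is just under 19 hours. The same function shows how quickly the allowable turnaround shrinks as flights get shorter, since T scales directly with F.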

However, one basic point about availability holds true for both aircraft and a hypothetical RLV - for a given, fixed resource for operations (manpower, infrastructure) per vehicle with a certain productivity, the lower the turnaround time, the greater the availability. Buying more birds may increase "customer availability" (decrease the time between possible flights) but it does not increase operational availability. There are only two ways to increase operational, "true", availability. One is to increase the resource per vehicle so as to decrease the turnaround time, the other is to decrease the turnaround time by design. Increased resources (manpower, infrastructure) may not produce, through increased customer availability, any increased revenue if the resource expenditure is too vast (hence the declining returns often seen when this avenue is pursued).

A fleet of X number of vehicles with 7 day flights and 7 day turnarounds will always have an operational availability of 50%. This is regardless of the number of vehicles. Half will always be sitting on the ground; again, unless the per vehicle resource is increased.
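The independence from fleet size can be seen by computing both measures for several fleet sizes. In this sketch the function names are hypothetical, and the "customer availability" measure assumes vehicles are staggered evenly through the cycle, which is an idealization not stated in the text.

```python
def fleet_operational_availability(
    n_vehicles: int, flight_days: float, turnaround_days: float
) -> float:
    """Flying time as a fraction of total time, summed over the fleet."""
    flying = n_vehicles * flight_days
    grounded = n_vehicles * turnaround_days
    return flying / (flying + grounded)


def mean_days_between_launch_opportunities(
    n_vehicles: int, flight_days: float, turnaround_days: float
) -> float:
    """Average spacing of launches if vehicles are evenly staggered."""
    return (flight_days + turnaround_days) / n_vehicles


for n in (1, 2, 5, 10):
    ao = fleet_operational_availability(n, 7.0, 7.0)
    gap = mean_days_between_launch_opportunities(n, 7.0, 7.0)
    print(f"{n:2d} vehicles: Ao = {ao:.0%}, launch every {gap:.1f} days")
```

Operational availability stays at 50% for every fleet size, while the spacing between launch opportunities shrinks as vehicles are added. This is the distinction between operational and "customer" availability drawn above.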

Certification and Turnaround Time

Returning to the distinctions about certification, the relationship to turnaround times and availability may be established. First, true certification is that process which assures a design is capable of safely carrying out its intended purpose. This involves development of systems to a degree that subsequent assemblies can be manufactured and operated without having to undergo the same degree of scrutiny (resources) while still assuring readiness for operation and safety during use. Having reviewed aircraft methodologies it can be shown that the rigorous testing used for the design and subsequent certification of aircraft systems for high life limits and usage results in designs capable of extended reuse with no major failures between uses. This is the other approach to decreased turnaround time, decreased turnaround time by design. It will be key to achieving the previously discussed goals such as that for operational availability for a reusable launch system.

Relationship of the RLV Certification Approach to Other Issues

Reliability

The subject of "range safety" is very much related to reliability of a launch system. A range may be considered a requirement of current launch systems given low demonstrated reliability. For a reusable launch system to one day sever its ties to a range, and transition to a truly operational site, or no longer require explosive charges, the reliability of the system will first have to be demonstrated. This may be considered mission reliability as well as the probability of safe vehicle recovery. This is similar to a new Boeing 777 not being able to simply take over routes previously operated by other twinjet aircraft. The new aircraft first has to free itself of operating constraints. It requires a high demonstrated reliability. A rigorous certification process is a way of substituting for historical experience which would be built up more slowly. The first aircraft produced are then ready to operate free of these constraints because they have high demonstrated reliability. That these aircraft have a high predicted reliability, or that a shorter experience base could predict a great likelihood of matching other twinjet aircraft demonstrated reliabilities, is irrelevant. The key is to demonstrate this reliability. For a reusable launch vehicle, operating (or in this case "range") constraints may be overcome similarly, through demonstration. Demonstrated reliability here would have various aspects. First, individual component or subsystem reliability benefits from rigorous testing as part of a design and certification process focused on high life limits and reuse with no major failures between uses. This results in component designs capable of reliable extended reuse. This increases the likelihood of successfully demonstrating no need for a range for a reusable launch system built from these components.

Second, demonstrating high mission reliability through a similar focus at higher system and whole vehicle and ground system levels may allow much reduced complexity while still assuring mission reliability. A distinction is required here between different types of complexity.

Functional complexity: Means complexity arising from the very existence of a system, as opposed to the function being eliminated entirely or being replaced by a set of hardware that is simpler, as in having "less hardware". For example, having X number of engines or having X number of interfaces at umbilicals. All of these engines or connectors are required for the function of the vehicle. If one vehicle had X number of engines versus another with half as many of a similar type, it has greater functional complexity. Both vehicles would likely have an engine out capability. The vehicle with X number of engines would not have half its engines to spare for launch. The number of engines is a part of the design intended for a certain function.

Criticality complexity: Means redundancy of parts for a given function so as to be able to operate with failures. Here, regardless of demonstrated reliability, dictated entirely by criticality(k), more parts and hence complexity is added. For example, quad redundant serial checkvalves and dual redundant parallel filters. Should one filter clog the system continues to function normally. Should one checkvalve backflow, the other one may not and again the function may continue uninterrupted. All of the hardware is not required for the system to function and for operation to continue. This is precisely the purpose, to enhance mission reliability.
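The two kinds of complexity pull reliability in opposite directions, which a simple model can illustrate. The sketch below assumes independent failures, identical parts and a single failure mode per part. It ignores the new failure modes that redundant hardware itself introduces (a series check valve that fails closed, for example). The 0.99 unit reliability is invented for illustration and the function names are hypothetical.

```python
def all_required_reliability(unit_reliability: float, n_units: int) -> float:
    """Functional complexity: every one of n parts must work."""
    return unit_reliability ** n_units


def redundant_reliability(unit_reliability: float, n_units: int) -> float:
    """Criticality complexity: the function survives if any one part works."""
    return 1.0 - (1.0 - unit_reliability) ** n_units


UNIT_RELIABILITY = 0.99  # hypothetical per part figure

for n in (1, 2, 4):
    required = all_required_reliability(UNIT_RELIABILITY, n)
    redundant = redundant_reliability(UNIT_RELIABILITY, n)
    print(f"n={n}: all required = {required:.6f}, redundant = {redundant:.8f}")
```

Under these assumptions, adding parts that are all required lowers the reliability of the function, while adding redundant parts raises it. In both cases the part count, and with it the ground verification workload, goes up.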

Functional complexity, if reduced, means both "fewer" and "more integrated." Consider a Shuttle example where the orbital maneuvering system (OMS) is separate from the main propulsion system (MPS). A more integrated system (shared tankage, propellant lines, etc.) would, if demonstrated, greatly enhance both support reliability and mission reliability. A system based on such an approach has a much greater probability of freeing itself one day from operating constraints such as the need for a "range" because of (a) higher level systems which, by means of reduced functional complexity, have fewer opportunities for failure and hence a built in higher mission reliability by virtue of simplicity, assuming (b) individual component and subsystem demonstrated reliability through rigorous certification testing focused on reuse, high life limits and no major failures between uses. It is easier to achieve high demonstrated reliability for such simplified systems.

This is not to say redundancy may be eliminated on critical systems. Redundancy too will be required to free a reusable launch system of the need for a range. Demonstration of an eventual design will not significantly alter the added complexity that critical functions require. Rather, redundancy is reduced as a result of addressing functional complexity. This does not affect mission reliability except to enhance it. Again, simplicity contributes to higher mission reliability through fewer opportunities for failure.

Mission reliability then becomes intertwined with support reliability in a way that overcomes the traditional launch system paradigm. Returning to Diagram 1.0, the complexity curves are moved "up and to the left" for simpler systems. Redundancy for criticality continues but for a much reduced number of separate systems and hardware. Most importantly, high mission reliability becomes compatible with high supportability.

Complexity driving one way, toward greater mission reliability, but opposingly, toward less support reliability, ceases to be true. A reusable launch system focused on simplicity has the greatest likelihood of achieving the combination of demonstrating high mission reliability aimed toward freeing itself from the "range" as well as doing so affordably.

Vehicle Health Management (VHM)

The addition of Vehicle Health Management systems to the next generation reusable launch systems may arguably be considered an addition of functional complexity. Especially when considered as a ground turnaround system it is one system Shuttle, for example, currently does without. Addition of such a system to the existing orbiter fleet would seem to go directly in opposition to simplicity, affordability, freeing the system from a range and supportability. This assumes however (1) addition to today's complex systems with low supportability versus tomorrow's simpler launch systems, composed of rigorously certified components and (2) use of today's sensors and techniques which have evolved driven by inflight or on orbit operational concerns versus new sensors and techniques evolved from turnaround operational concerns.

Criticality may dictate a hazardous gas detection system will exist for a reusable launch vehicle's closed compartments. A certification process as previously outlined may dictate a focus on leak free joints and interface technologies. Maintaining certification will dictate being able to verify as required that there are no leaks. Affordability, however, will dictate doing this verification, or locating a leak should one occur, with a system that does so quickly and with little manpower or intrusion. A consideration of life cycle costs(4) may make this last such system economically viable.

For example, criticality in aircraft structures is an area that involves periodic inspections on the ground in order to verify acceptability(4). Periodic inspections involving disassembly, gaining access to an area, or maintenance crews using non-destructive tests (NDT) involve manpower. Current work(19) in the field of automating these functions involves smart materials sensors which form an integral part of the structure. These may be fiber optic sensors embedded into composites during the manufacturing process. Rather than add downtime for structural inspections the vehicle would, in effect, be undergoing continuous inspection. This increases availability and has the potential to reduce life cycle costs through greater efficiency such as reduced manpower.

A similar situation will exist for a reusable launch system. Though it may be considered that a major system such as propulsion on launch vehicles is already well advanced in this area (and the same is often the case for aircraft propulsion systems(4)) this is only relative. Existing "vehicle health" systems for Shuttle propulsion are focused on ascent, on orbit operation and landing and data reduction for these intervals. Although information from these intervals will determine some ground operations such as an unscheduled removal and replacement, most verification and ground work is characterized by diverse, intrusive, hands on, manpower intensive operations. Leak checks (bubble soap, baggies, baths, connected sense lines in baths, audibles, flowmeters, mass spectrometers, multiple gas analysis, hazardous gas detection systems, PVT relations, system performance, fluid quantity verification, leakage capture methods, Uson probes, toxic vapor detectors and halogen detectors) are one example of diverse methodologies being used to prepare systems with basically similar concerns. There is no built in smart health management system on Shuttle which can automatically, continuously and non-intrusively verify the acceptability of all these interfaces.

Assuming even a high demonstrated reliability many of these functions would still require flight to flight verification in order to maintain certification. Verification with Shuttle techniques for an RLV would be intrusive to a degree that "maintaining certification" becomes a misnomer and the certification process would approach the Shuttle model of "one use only", flight to flight certification. Such intrusive approaches would also work against demonstrated reliability in the first place given that the technology maturation process of problem diagnosis is slowed by dependence on these techniques. "Development only" instrumentation often overcomes this but with no regard to the sensors and techniques being integrated permanently with the eventual design with an eye on future turnaround needs.

A reusable launch vehicle will require such systems focused on non-intrusive (no access, connection, disconnection, assembly, disassembly) turnaround verification of systems readiness. With such systems certification can be "one time only"(12). This will be a necessary complement to rigorous certification processes at all system levels, reductions in complexity, and demonstration of reliability. This is key to true certification, development of systems to a degree that subsequent assemblies can be manufactured and operated without having to undergo the same degree of test or scrutiny.

Summary of What Will be Required for RLV Certification

The RLV technology program goals are extremely ambitious. The basic goal is to demonstrate technologies leading to a reusable launch vehicle that will be affordable and provide routine access to space. Low cost and high availability will only be combined through a reduction in single vehicle turnaround times.

True certification is development of systems to a degree that subsequent assemblies can be manufactured and operated without having to undergo the same degree of test or scrutiny. Having reviewed aircraft methodologies it can be shown that the rigorous testing used for the design and subsequent certification of aircraft systems for high life limits and usage results in designs capable of extended reuse with no major failures between uses. This approach is one key to reducing turnaround times and achieving RLV goals.

For a reusable launch system to one day sever its ties to a range, and transition to a truly operational site, or no longer require explosive charges, the reliability of the system will first have to be demonstrated. Rigorous certification processes for components and subsystems will increase reliability and the likelihood of demonstrating no need for a range. A key to establishing a high demonstrated reliability for the whole launch system, however, will be to also reduce functional complexity of the launch system. Reducing functional complexity means both "fewer" and for what's left, "more integrated." A reduction in functional complexity will also reduce criticality complexity.

Complexity driving one way, toward greater mission reliability, but opposingly, toward less support reliability, ceases to be true. A reusable launch system focused on simplicity has the greatest likelihood of achieving the combination of demonstrating high mission reliability aimed toward freeing itself from the "range" as well as doing so affordably.

An advanced health management system (HMS) focused on ground turnaround will be required for any RLV aimed at one time only certification and the twin goals of affordability and high availability. This additional system should be evolved from turnaround operational concerns versus current systems focused on ascent or on orbit operations only.

An HMS will be a necessary complement to rigorous certification processes at all system levels, reductions in complexity, and demonstration of reliability. This is key to true certification, development of systems to a degree that subsequent assemblies can be manufactured and operated without having to undergo the same degree of test or scrutiny.

In closing, although this paper is not intended to address issues of cost in relation to certification, it is highly probable that the foreseen funding for reusable launch system technologies is inadequate, assuming a certification approach as previously reviewed which is consistent with achieving the long term goals of affordable and highly available transportation to space. The term "quantum leap" is often used in the program to refer to what is technologically required to dramatically reduce the cost of space transportation. This would seem to imply that whereas once there was continuity of development, all of a sudden there will be a discontinuity, a new state with no traceable connection between it and what came before. This is unlikely. This is not to say affordable and highly available space transportation cannot be achieved. However, rigorous certification at all system levels, reductions in complexity, demonstrated reliability and advanced health management systems will be required. This will involve an appreciable investment in the future. This will create the path connecting where we are to where we want to go.

References

1. Advisory Group for Aerospace Research and Development, AGARD, Smart Structures for Aircraft and Spacecraft, AGARD Conference Proceedings 531 Lindau, Germany, October 1992.

2. Aerospace Engineering, Condition Monitoring and Diagnostics, SAE International, January/February 1995.

3. Boeing Company, Boeing 767 Maintenance Planning Data, Volume 2.

4. Boller, Chr. and Dilger, R., In-Flight Aircraft Structure Health Monitoring Based on Smart Structures Technology, AGARD Conference Proceedings 531, Section 17, October 1992.

5. Department of Defense, Logistics Support Analysis, MIL-STD-1388-1A, 11 April 1983.

6. Federal Aviation Administration, U.S. Dept. of Transportation, Aircraft Engine Type Certification Handbook, Advisory Circular AC 33-2B, June 30, 1994.

7. Federal Aviation Administration, U.S. Dept. of Transportation, Certification Procedures for Products and Parts, Special Federal Aviation Regulations, Subchapter C - Aircraft, Part 21.

8. Federal Aviation Administration, U.S. Dept. of Transportation, Airworthiness Standards: Aircraft Engines, Special Federal Aviation Regulations, Subchapter C - Aircraft, Part 33.

9. Feynman, R.P., "Personal Observations on the Reliability of the Shuttle," Report by the Presidential Commission on the Space Shuttle Challenger Accident, Appendix F, 1986.

10. Goracke, B. David, Levack, Daniel J.H., Margin Considerations in SSTO O2/H2 Engines, AIAA 94-4676, AIAA Space Programs and Technologies Conference and Exhibit, September 27-29, 1994.

11. National Aeronautics and Space Administration, Endeavour, STS-68 Delta SSME Project, SSME Flight Readiness Review, 21 September 1994.

12. National Aeronautics and Space Administration / Industry Operations Synergy Team, Operations Concept Vision and Operability Criteria Document, November 1994.

13. National Aeronautics and Space Administration, Marshall Space Flight Center, RLV Concept Study Team Review, October 1994.

14. National Aeronautics and Space Administration, Johnson Space Center, Space Shuttle Requirements and Procedures for Certification of Flight Readiness (NSTS 08117), February 21, 1995.

15. National Aeronautics and Space Administration, SSME Accident / Incident Report SSC Test 904-044, Rockwell International, 23 June 1994.

16. National Aeronautics and Space Administration, Subsystem Certification Plan, Main Propulsion, Rockwell International, November 1977.

17. Rockwell International Corporation, Rocketdyne Division, SSME Component Allowable Life and Hardware Tracking Program Requirements, Specification - RL00532, Rockwell International, February 1994.

18. SAE International RMS Committee (G-11), Reliability, Maintainability, and Supportability Guidebook, 2nd Edition, Society of Automotive Engineers, Inc., 1992.

19. Schmidt, W. and Boller, Chr., Smart Structures, A Technology For Next Generation Aircraft, AGARD Conference Proceedings 531, Section 1, October 1992.

20. Smiljanic, Ray R., Definitions, Models and Methods for Supportability Analyses, McDonnell Douglas Aerospace (MDA), 28 June 1994.

21. U.S. Statistical Abstract, U.S. Major and National Airline Costs, Air Transport Association of America, 1993.

Acknowledgments

Dave Spacek, NASA KSC, on the subject of Alternate Turbopump Development.

D.R. Komar, NASA KSC, on the subject of KSC Shuttle Main Engine verification and processing for flight.

Footnotes

(a) Other methods used include analysis to determine life limits and assure reliability.

(b) Such as the ability to control production processes so as to have consistent, repeatable results (subsequent manufactured units).

(c) The impeller is balanced in a process that basically removes material and then achieves balance by adding set screws that are staked to preclude backing out or loosening. These set screws were made of tantalum. The material is dense so as to require few for balancing. Cracking (heads cracked through on the screws) was discovered to be a problem leading to loosening. The decision was made to switch to a stainless steel. This will mean more may be required for balancing (a less dense material). Notably, the particular pump passed inspection but, given the experience on other units in work the decision was made to remove and replace the pump.

(d) The 1850R temperatures in this transient are addressed in the new Pratt & Whitney ATD by the use of hollow airfoils. Tactics such as decreasing operating temperatures (using either fuel rich or oxidizer rich cycles) do not address this start transient LCF problem, but do address high cycle fatigue. Pressure transients also contribute to the LCF problem. Notably, aircraft turbine engines operate up to 2160R uncooled. Current aircraft turbine inlet temperatures are as high as 3260R and are being pushed toward 4600R, the stoichiometric limit of JP4 fuel. However, the ramp up to these temperatures is slow compared to the SSME startup.

(e) The actual usefulness of tracking, regardless of the degree of repair and refurbishment, is as an indication of life limits for some internal components which are used over and over with only inspections. A sheet metal problem can, for example, be better understood by comparing a unit set to fly against higher life units.

(f) STS-42 through STS-62, Data through February 28, 1994.

(g) This use of the word intrusion, as in breaking into a system, is not the same as the use in terms such as intrusive instrumentation. This latter issue deals with instrumentation protruding into systems, being inside a system and creating undesirable interfaces such as leak paths. It is related, however, in that verification or repair of such instrumentation involves "breaking into" systems.

(h) Subsequent verification work is often driven by having accessed an area in the first place and having handled equipment or access kits, the installation or removal of which can cause damage to hardware. Also, subsequent verification work may be of a different nature than the original intrusion. For example, a leak check may require a plug or blanking plate installation. Completing testing on the valve or component may then require a remote timing check, still intrusive to a degree given the need to connect and disconnect interfaces for power, test commodities, data or command and control.

(i) MTBUR, or "mean time between unscheduled removals", is more often used for operational concerns in commercial aircraft operations since it has an actual impact on schedule and hence cost. Commercial schedule availability figures, for example, are in the 96 to 99% range for some operators and aircraft such as Boeing 737-400s or 747-400s.

(j) The time required for loading is not included in the 19 hours required for turnaround.

(k) This is similar in philosophy to FAA Certification Maintenance Requirements (CMRs).



Edgar Zapata, NASA Kennedy Space Center

Shuttle Process Engineering Directorate, Fluid Systems Division