
SAND2002-0529
Unlimited Release
Printed March 2002

Verification and Validation in Computational Fluid Dynamics

William L. Oberkampf
Validation and Uncertainty Estimation Department

Timothy G. Trucano
Optimization and Uncertainty Estimation Department

Sandia National Laboratories
P. O. Box 5800
Albuquerque, New Mexico 87185

Abstract

Verification and validation (V&V) are the primary means to assess accuracy and reliability in computational simulations. This paper presents an extensive review of the literature in V&V in computational fluid dynamics (CFD), discusses methods and procedures for assessing V&V, and develops a number of extensions to existing ideas. The review of the development of V&V terminology and methodology points out the contributions from members of the operations research, statistics, and CFD communities. Fundamental issues in V&V are addressed, such as code verification versus solution verification, model validation versus solution validation, the distinction between error and uncertainty, conceptual sources of error and uncertainty, and the relationship between validation and prediction. The fundamental strategy of verification is the identification and quantification of errors in the computational model and its solution. In verification activities, the accuracy of a computational solution is primarily measured relative to two types of highly accurate solutions: analytical solutions and highly accurate numerical solutions. Methods for determining the accuracy of numerical solutions are presented and the importance of software testing during verification activities is emphasized. The fundamental strategy of validation is to assess how accurately the computational results compare with the experimental data, with quantified error and uncertainty estimates for both. This strategy employs a hierarchical methodology that segregates and simplifies the physical and coupling phenomena involved in the complex engineering system of interest. A hypersonic cruise missile is used as an example of how this hierarchical structure is formulated. The discussion of validation assessment also encompasses a number of other important topics. A set of guidelines is proposed for designing and conducting validation experiments, supported by an explanation of how validation experiments are different from traditional experiments and testing. A description is given of a relatively new procedure for estimating experimental uncertainty that has proven more effective at estimating random and correlated bias errors in wind-tunnel experiments than traditional methods. Consistent with the authors’ contention that nondeterministic simulations are needed in many validation comparisons, a three-step statistical approach is offered for incorporating experimental uncertainties into the computational analysis. The discussion of validation assessment ends with the topic of validation metrics, where two sample problems are used to demonstrate how such metrics should be constructed. In the spirit of advancing the state of the art in V&V, the paper concludes with recommendations of topics for future research and with suggestions for needed changes in the implementation of V&V in production and commercial software.


Acknowledgements

The authors sincerely thank Frederick Blottner, Gary Froehlich, and Martin Pilch of Sandia National Laboratories, Patrick Roache, consultant, and Michael Hemsch of NASA/Langley Research Center for reviewing the manuscript and providing many helpful suggestions for its improvement. We also thank Rhonda Reinert of Technically Write, Inc. for providing extensive editorial assistance during the writing of the manuscript.


Contents

1. Introduction
   1.1 Background
   1.2 Outline of the paper
2. Terminology and Methodology
   2.1 Development of terminology for verification and validation
   2.2 Contributions from fluid dynamics
   2.3 Methodology for verification
   2.4 Methodology for validation
3. Verification Assessment
   3.1 Introduction
   3.2 Fundamentals of verification
      3.2.1 Definitions and general principles
      3.2.2 Developing the case for code verification
      3.2.3 Error and the verification of calculations
   3.3 Role of computational error estimation in verification testing
      3.3.1 Convergence of discretizations
      3.3.2 A priori error information
      3.3.3 A posteriori error estimates
   3.4 Testing
      3.4.1 Need for verification testing
      3.4.2 Algorithm and software quality testing
      3.4.3 Algorithm testing
      3.4.4 Software quality engineering
4. Validation Assessment
   4.1 Fundamentals of validation
      4.1.1 Validation and prediction
      4.1.2 Validation error and uncertainty
   4.2 Construction of a validation experiment hierarchy
      4.2.1 Hierarchy strategy
      4.2.2 Hierarchy example
   4.3 Guidelines for validation experiments
   4.4 Statistical estimation of experimental error
   4.5 Uncertainty quantification in computations
   4.6 Hypothesis testing
   4.7 Validation metrics
      4.7.1 Recommended characteristics
      4.7.2 Validation metric example
      4.7.3 Zero experimental measurement error
      4.7.4 Random error in experimental measurements
5. Recommendations for Future Work and Critical Implementation Issues


Figures

1 Phases of Modeling and Simulation and the Role of V&V
2 Verification Process
3 Validation Process
4 Validation Tiers
5 Demonstration of Extrapolated Error Estimation for Mixed First and Second Order Schemes
6 Integrated View of Verification Assessment for CFD
7 Relationship of Validation to Prediction
8 Validation Hierarchy for a Hypersonic Cruise Missile
9 Validation Pyramid for a Hypersonic Cruise Missile
10 Domain of Boundary Value Problem
11 Proposed Validation Metric as a Function of Relative Error
12 Validation Metric as a Function of Relative Error and Data Quantity

Table 1 Major Software Verification Activities


1. Introduction

1.1 Background

During the last three or four decades, computer simulations of physical processes have been used in scientific research and in the analysis and design of engineered systems. The systems of interest have been existing or proposed systems that operate at design conditions, off-design conditions, failure-mode conditions, or accident scenarios. The systems of interest have also been natural systems. For example, computer simulations are used for environmental predictions, as in the analysis of surface-water quality and the risk assessment of underground nuclear-waste
repositories. These kinds of predictions are beneficial in the development of public policy, in the preparation of safety procedures, and in the determination of legal liability. Thus, because of the impact that modeling and simulation predictions can have, the credibility of the computational results is of great concern to engineering designers and managers, public officials, and those who are affected by the decisions that are based on these predictions.

For engineered systems, terminology such as “virtual prototyping” and “virtual testing” is now being used in engineering development to describe numerical simulation for the design, evaluation, and “testing” of new hardware and even entire systems. This new trend of modeling-and-simulation-based design is primarily driven by increased competition in many markets, e.g., aircraft, automobiles, propulsion systems, and consumer products, where the need to decrease the time and cost of bringing products to market is intense. This new trend is also driven by the high cost and time that are required for testing laboratory or field components, as well as complete systems. Furthermore, the safety aspects of the product or system represent an important,
sometimes dominant element of testing or validating numerical simulations. The potential legal and liability costs of hardware failures can be staggering to a company, the environment, or the public. This consideration is especially critical, given that some of these computationally based designs are for high-consequence systems whose reliability, robustness, or safety cannot ever be tested. Examples are the catastrophic failure of a full-scale containment building for a nuclear power plant, a fire spreading through (or explosive damage to) a high-rise office building, and a nuclear weapon involved in a ground-transportation accident. In computational fluid dynamics (CFD) research simulations, in contrast, an inaccurate or misleading numerical simulation in a conference paper or a journal article has comparatively little impact.

Users and developers of computational simulations today face a critical issue: How should confidence in modeling and simulation be critically assessed? Verification and validation (V&V) of computational simulations are the primary methods for building and quantifying this confidence. Briefly, verification is the assessment of the accuracy of the solution to a computational model by comparison with known solutions. Validation is the assessment of the accuracy of a computational simulation by comparison with experimental data. In verification, the relationship of the simulation to the real world is not an issue. In validation, the relationship between computation and the real world, i.e., experimental data, is the issue. Stated differently, verification is primarily a
mathematics issue; validation is primarily a physics issue [278].

In the United States, the Defense Modeling and Simulation Office (DMSO) of the Department of Defense has been the leader in the development of fundamental concepts and terminology for
V&V [98, 100]. Recently, the Accelerated Strategic Computing Initiative (ASCI) of the Department of Energy (DOE) has also taken a strong interest in V&V. The ASCI program is focused on computational physics and computational mechanics, whereas the DMSO has
traditionally emphasized high-level systems engineering, such as ballistic missile defense systems, warfare modeling, and simulation-based system acquisition. Of the work conducted by DMSO, Cohen recently observed [73]: “Given the critical importance of model validation . . . , it is surprising that the constituent parts are not provided in the (DoD) directive concerning . . .
validation. A statistical perspective is almost entirely missing in these directives.” We believe this observation properly reflects the state of the art in V&V, not just the directives of DMSO. That is, the state of the art has not developed to the point where one can clearly point out all of the actual methods, procedures, and process steps that must be undertaken for V&V. It is our view that the present method of qualitative “graphical validation,” i.e., comparison of computational results and experimental data on a graph, is inadequate. This inadequacy especially affects complex
engineered systems that heavily rely on computational simulation for understanding their predicted performance, reliability, and safety. We recognize, however, that the complexities of the
quantification of V&V are substantial, from both a research perspective and a practical perspective. To indicate the degree of complexity, we suggest referring to quantitative V&V as “validation science.”

It is fair to say that computationalists and experimentalists in the field of fluid dynamics have been pioneers in the development of methodology and procedures in validation. However, it is also fair to say that the field of CFD has, in general, proceeded along a path that is largely
independent of validation. There are diverse reasons why the CFD community has not perceived a strong need for code V&V, especially validation. A competitive and frequently adversarial
relationship (at least in the U.S.) has often existed between computationalists (code users and code writers) and experimentalists, which has led to a lack of cooperation between the two groups. We, on the other hand, view computational simulation and experimental investigations as
complementary and synergistic. To those who might say, “Isn’t that obvious?” we would answer, “It should be, but they have not always been viewed as complementary.” The “line in the sand” was formally drawn in 1975 with the publication of the article “Computers versus Wind Tunnels” [63]. We call attention to this article only to demonstrate, for those who claim it never existed, that a competitive and adversarial relationship has indeed existed in the past, particularly in the U.S. This relationship was, of course, not caused by the quoted article; the article simply brought the competition and conflict to the foreground. In retrospect, the relationship between
computationalists and experimentalists is probably understandable because it represents the classic case of a new technology (computational simulation) that is rapidly growing and attracting a great deal of visibility and funding support that had been the domain of the older technology
(experimentation).

During the last few years, however, the relationship between computationalists and experimentalists has improved significantly. This change reflects a growing awareness that competition does not best serve the interests of either group [3, 38, 53, 78, 106, 205, 207, 212, 227, 236, 237]. Even with this awareness, there are significant challenges in implementing a more cooperative working relationship between the two groups, and in making progress toward a
validation science. From the viewpoint of some experimentalists, one of the challenges is overcoming the perceived threat that CFD poses. Validation science requires a close and synergistic working relationship between computationalists and experimentalists, rather than
competition. Another significant challenge involves required changes in the perspective of most experimentalists toward validation experiments. We argue that validation experiments are indeed different from traditional experiments, i.e., validation experiments are designed and conducted for the purpose of model validation. For example, there is a critical need for the detailed
characterization of the experimental conditions and the uncertainty estimation of the experimental measurements. Similarly, quantitative numerical error estimation by CFD analysts is a must. For complex engineering problems, this requires a posteriori error estimation; not just formal error analyses or a priori error estimation. And finally, we believe validation science will require the incorporation of nondeterministic simulations, i.e., multiple deterministic simulations that reflect uncertainty in experimental parameters, initial conditions, and boundary conditions that exist in the experiments that are used to validate the computational models.

1.2 Outline of the Paper

This paper presents an extensive review of the literature in V&V in CFD, discusses methods and procedures for assessing V&V, and develops a number of extensions to existing ideas. Section 2 describes the development of V&V terminology and methodology and points out the various contributions by members of the operations research (OR), statistics, and CFD
communities. The development of V&V terminology and methodology is traced back to the OR community, and the accepted terminology in the CFD community is discussed, with differences noted where the terminology differs in the computer science community. The contributions of the CFD community in the development of concepts and procedures for V&V methodology, as well as those in validation experimentation and database construction, are described. Section 2 also summarizes portions of the first engineering standards document published on V&V in CFD [12]. Here we summarize the fundamental approach to verification and validation assessment, five major error sources to be addressed in verification, and the recommended hierarchical, or building-block, validation methodology.

Section 3 discusses the primary methods and procedures of verification assessment. The strategy involves the identification and quantification of errors in the formulation of the discretized model, in the embodiment of the discretized model, i.e., the computer program, and the
computation of a numerical solution. The distinction is made between error and uncertainty, code verification and solution verification, and the relationship of verification to software quality engineering. In verification activities, the accuracy of a computational solution is primarily
measured relative to two types of highly accurate solutions: analytical solutions and highly accurate numerical solutions. Methods for determining the accuracy of these numerical solutions are
presented and the importance of software testing during verification activities is emphasized. A survey of the principal methods used in software quality engineering is also provided.

Section 4 discusses the primary methods and procedures for validation assessment. The strategy of validation is to assess how accurately the computational results compare with the experimental data, with quantified error and uncertainty estimates for both. We begin this section with a discussion of model validation as opposed to “solution validation”, the relationship of validation to prediction, four sources of uncertainty and error in validation, and the relationship between validation and calibration. The recommended validation strategy employs a hierarchical methodology that segregates and simplifies the physical and coupling phenomena involved in the complex engineering system of interest. The hierarchical methodology is clearly directed toward
validation assessment of models in an engineering analysis environment, not simply a research environment. A hypersonic cruise missile is used as an example of how this hierarchical structure is formulated. The discussion of validation assessment also encompasses a number of other important topics. A set of guidelines is proposed for designing and conducting validation experiments, supported by an explanation of how validation experiments are different from traditional experiments and testing. Next is a description of a relatively new procedure for estimating experimental uncertainty that has proven more effective at estimating random and correlated bias errors in wind-tunnel experiments than traditional methods. Consistent with the authors’ contention that nondeterministic simulations are needed in many validation comparisons, a three-step statistical approach is offered for incorporating experimental uncertainties into the
computational analysis. Error concepts from hypothesis testing, which is commonly used in the statistical validation of models, are also considered for their possible application in determining and evaluating a metric for comparing the results of a computational simulation with experimental data. The discussion of validation assessment ends with the topic of validation metrics, where two sample problems are used to demonstrate how such metrics should be constructed.

Section 5 makes a number of recommendations for future research topics in computational, mathematical, and experimental activities related to V&V. Suggestions are also made for needed improvements in engineering standards activities, for the implementation of V&V processes in software development, and for changes in experimental activities directed toward validation.

2. Terminology and Methodology

2.1 Development of Terminology for Verification and Validation

The issues underlying the V&V of mathematical and computational models of physical processes in nature touch on the very foundations of mathematics and science. Verification will be seen to be rooted in issues of continuum and discrete mathematics and in the accuracy and
correctness of complex logical structures (computer codes). Validation is deeply rooted in the question of how formal constructs of nature (mathematical models) can be tested by physical observation. The renowned twentieth-century philosophers of science, Popper [263, 264] and Carnap [56], laid the foundation for the present-day concepts of validation. The first technical discipline that began to struggle with the methodology and terminology of verification and
validation was the operations research (OR) community, also referred to as systems analysis [27, 29, 31, 54, 59, 64, 71, 77, 86, 93, 121, 128, 160, 181-183, 189, 190, 193, 198, 217, 224-226, 248-250, 252, 253, 293-297, 304, 345, 349]. The fundamental issues of V&V were first debated about 30 to 40 years ago in the OR field. (See [183] for an excellent historical review of the philosophy of science viewpoint of validation. See [30, 146, 235] for bibliographies in
verification and validation.) V&V are specific concepts that are associated with the very general field of modeling and simulation. In the OR activities, the systems analyzed could be
extraordinarily complex, e.g., industrial production models, industrial planning models, marketing models, national and world economic models, and military conflict models. These complex models commonly involve a strong coupling of complex physical processes, human behavior, and computer-controlled systems. For such complex systems and processes, fundamental conceptual issues immediately arise with regard to assessing accuracy of the model and the resulting
simulations. Indeed, the accuracy of most of these models cannot be validated in any meaningful way.

The high point of much of the early work by the OR community was publication of the first definitions of V&V by the Society for Computer Simulation (SCS) [297]. The published
definitions were the following:

Model Verification: Substantiation that a computerized model represents a conceptual model within specified limits of accuracy.

Model Validation: Substantiation that a computerized model within its domain of applicability possesses a satisfactory range of accuracy consistent with the intended application of the model.

The SCS definition of verification, although brief, is quite informative. The main implication is that the computerized model, i.e., the computer code, must accurately mimic the model that was originally conceptualized. The SCS definition of validation, although instructive, appears somewhat vague. Both definitions, however, contain an important feature: substantiation, which is evidence of correctness. Figure 1 depicts the role of V&V within the phased approach for modeling and simulation adopted by the SCS.

Figure 1 Phases of Modeling and Simulation and the Role of V&V [297]: reality, the conceptual model, and the computerized model, linked by analysis, programming, and computer simulation, with model qualification, model verification, and model validation as the assessment activities.

Figure 1 identifies two types of models: a conceptual model and a computerized model. The conceptual model is composed of all information, mathematical modeling data, and mathematical
equations that describe the physical system or process of interest. The conceptual model is produced by analyzing and observing the physical system. In CFD, the conceptual model is dominated by the partial differential equations (PDEs) for conservation of mass, momentum, and energy. In addition, the CFD model includes all of the auxiliary equations, such as turbulence models and chemical reaction models, and all of the initial and boundary conditions of the PDEs. The SCS defined Qualification as: Determination of adequacy of the conceptual model to provide an acceptable level of agreement for the domain of intended application. Since we are focusing on V&V, we will not address qualification issues in this paper. The computerized model is an operational computer program that implements a conceptual model. Modern terminology refers to the computerized model as the computer model or code. Figure 1 clearly shows that verification deals with the relationship between the conceptual model and the computerized model and that validation clearly deals with the relationship between the computerized model and reality. These relationships are not always recognized in other definitions of V&V, as will be discussed shortly.
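
As a brief illustration of the conceptual model described above for CFD, the governing conservation laws can be written compactly in a generic divergence form. The sketch below is illustrative only; the particular flux vector, source terms, auxiliary closures (e.g., a turbulence model and an equation of state), and initial and boundary conditions must be supplied for any specific application.

```latex
% Generic conservation-law form of a CFD conceptual model (illustrative sketch).
% U is the vector of conserved variables, F_j the flux in coordinate direction x_j
% (summation over j implied), and S the source terms.
\frac{\partial \mathbf{U}}{\partial t}
  + \frac{\partial \mathbf{F}_j(\mathbf{U},\nabla\mathbf{U})}{\partial x_j}
  = \mathbf{S}(\mathbf{U}),
\qquad
\mathbf{U} =
\begin{pmatrix}
  \rho \\ \rho u_i \\ \rho E
\end{pmatrix}.
```

The auxiliary equations and the initial and boundary conditions mentioned above are what close this system for a particular problem; they are as much a part of the conceptual model as the conservation laws themselves.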

Fundamentally, V&V are tools for assessing the accuracy of the conceptual and computerized models. For much of the OR work, the assessment was so difficult, if not impossible, that V&V became more associated with the issue of credibility, i.e., the quality, capability, or power to elicit belief. In science and engineering, however, quantitative assessment of accuracy, at least for some important physical cases, is mandatory. And in certain situations, assessment can only be
conducted using subscale physical models or a subset of the active physical processes. Regardless of the difficulties and constraints, methods must be devised for measuring the accuracy of the model for as many conditions as the model is deemed appropriate. As the complexity of a model increases, its accuracy and range of applicability can become questionable.

During the 1970s, the importance of modeling and simulation dramatically increased as computer-controlled systems started to become widespread in commercial and public systems, particularly automatic flight-control systems for aircraft. At about the same time, the first commercial nuclear-power reactors were being built, and issues of public safety in normal
operational environments and accident environments were being critically questioned. In response to this interest, the Institute of Electrical and Electronics Engineers (IEEE) defined verification as follows [167, 168]:

Verification: The process of evaluating the products of a software development phase to provide assurance that they meet the requirements defined for them by the previous phase.

This IEEE definition is quite general and it is referential; that is, the value of the definition is related to the definition of “requirements defined for them by the previous phase.” Because those requirements are not stated in the definition, the definition does not contribute much to the intuitive understanding of verification or to the development of specific methods for verification. While the definition clearly includes a requirement for the consistency of products (e.g., computer programming) from one phase to another, the definition does not contain a specific requirement for correctness or accuracy.

At the same time, IEEE defined validation as follows [167, 168]:

Validation: The process of testing a computer program and evaluating the results to ensure compliance with specific requirements.


The IEEE definitions emphasize that both V&V are processes, that is, ongoing activities. The definition of validation is also referential because of the phrase “compliance with specific requirements.” Because specific requirements are not defined (to make the definition as generally applicable as possible), the definition of validation is not particularly useful by itself. The substance of the meaning must be provided in the specification of additional information. Essentially the same definitions for V&V have been adopted by the American Nuclear Society (ANS) [19] and the International Organization for Standardization (ISO) [170].

One may ask why the IEEE definitions are included since they seem to provide less
understanding than the earlier definitions of the SCS. First, these definitions provide a distinctly different perspective toward the entire issue of V&V than what is needed in CFD and
computational mechanics. This perspective asserts that because of the extreme variety of requirements for modeling and simulation, the requirements should be defined in a separate
document for each application, not in the definitions of V&V. Second, the IEEE definitions are the more prevalent definitions used in engineering, and one must be aware of the potential confusion when other definitions are used. The IEEE definitions are dominant because of the worldwide influence of this organization. It should be noted that the IEEE definitions are also used by the computer science community and the software quality assurance community. Given that members of these two communities often work together with the computational mechanics community, we expect there to be long-term ambiguity and confusion in the terminology.

The Defense Modeling and Simulation Office (DMSO) of the U.S. Department of Defense (DoD) has also played a major role in attempting to standardize the definitions of V&V. For this standardization effort, the DMSO obtained the expertise of researchers in the fields of OR, operational testing of combined hardware and software systems, man-in-the-loop training simulators, and warfare simulation. Recently, the DoD published definitions of V&V that are clear, concise, and directly useful by themselves [99, 100]. Also during the early 1990’s, the CFD Committee on Standards of the American Institute of Aeronautics and Astronautics (AIAA) was discussing and debating definitions of V&V. Because the AIAA committee essentially adopted the DMSO definitions, they will be discussed together.

The DMSO definition of verification, although directly useful, does not make it clear that the accuracy of the numerical solution to the conceptual model should be included in the definition. The reason for this lack of clarity is that the numerical solution of PDEs was not a critical factor in DMSO’s perspective of what verification is intended to accomplish. The AIAA, however, was primarily interested in the accuracy of the numerical solution—a concern that is common to essentially all of the fields in computational sciences and engineering, such as computational mechanics, structural dynamics, and computational heat transfer. Consequently, the AIAA slightly modified the DMSO definition for verification and adopted verbatim the DMSO definition of validation. The AIAA definitions are given as follows [12]:

Verification: The process of determining that a model implementation accurately represents the developer's conceptual description of the model and the solution to the model.

Validation: The process of determining the degree to which a model is an accurate representation of the real world from the perspective of the intended uses of the model.

By comparing the AIAA definitions with the definitions that were developed by the SCS and represented in Fig. 1, one finds that the definitions from the two organizations are fundamentally identical—though the AIAA definitions are clearer and more intuitively understandable, particularly the definition for validation. The AIAA definition for validation directly addresses the issue of fidelity of the computational model to the real world.

There are some important implications and subtleties in the definitions of V&V that should be addressed. The first significant feature is that both V&V are “process[es] of determining.” That is, they are ongoing activities that do not have a clearly defined completion point. Completion or sufficiency is usually determined by practical issues such as budgetary constraints and intended uses of the model. The definitions include the ongoing nature of the process because of an unavoidable but distressing fact: the veracity, correctness, and accuracy of a computational model cannot be demonstrated for all possible conditions and applications, except for trivial models. Trivial models are clearly not of interest. All-encompassing proofs of correctness, such as those developed in mathematical analysis and logic, do not exist in complex modeling and simulation. Indeed, one cannot prove that complex computer codes have no errors. Likewise, models of physics cannot be proven correct, they can only be disproved. Thus, V&V activities can only assess the correctness or accuracy of the specific cases tested.

The emphasis on “accuracy” is the second feature that is common in the definitions of V&V. This feature assumes that a measure of correctness can be determined. In verification activities, accuracy is generally measured in relation to benchmark solutions of simplified model problems. Benchmark solutions refer to either analytical solutions or highly accurate numerical solutions. In validation activities, accuracy is measured in relation to experimental data, i.e., our best indication of reality. However, benchmark solutions and experimental data also have shortcomings. For example, benchmark solutions are extremely limited in the complexity of flow physics and
geometry; and all experimental data have random (statistical) and bias (systematic) errors that may cause the measurements to be less accurate than the corresponding computational results in some situations. These issues are discussed in more detail later in this paper.

Effectively, verification provides evidence (substantiation) that the conceptual (continuum mathematics2) model is solved correctly by the discrete mathematics computer code. Verification does not address whether the conceptual model has any relationship to the real world. Validation, on the other hand, provides evidence (substantiation) for how accurately the computational model simulates reality. This perspective implies that the model is solved correctly, or verified.

However, multiple errors or inaccuracies can cancel one another and give the appearance of a validated solution. Verification, thus, is the first step of the validation process and, while not simple, is much less involved than the more comprehensive nature of validation. Validation addresses the question of the fidelity of the model to specific conditions of the real world. The terms “evidence” and “fidelity” both imply the concept of “estimation of error,” not simply “yes” or “no” answers.

2 When we refer to “continuum mathematics” we are not referring to the physics being modeled by the mathematics. For example, the equations for noncontinuum fluid dynamics are commonly expressed with continuum mathematics. Additionally, our discussion of V&V does not restrict the mathematics to be continuum in any substantive way.


2.2 Contributions in Fluid Dynamics

In science and engineering, CFD was one of the first fields to seriously begin developing concepts and procedures for V&V methodology. Our review of the literature has identified a number of authors who have contributed to the verification of CFD solutions [11, 28, 33, 34, 39-42, 45, 47, 49, 57, 60, 65, 74, 87-89, 92, 97, 110, 116, 117, 120, 127, 129-131, 137-139, 141, 142, 144, 147, 165, 171, 177, 209, 218, 219, 221, 228, 229, 273, 274, 277, 278, 281, 298, 306-309, 318, 325, 327, 329, 333-337, 340, 347, 350, 351, 353]. Most of these authors have contributed highly accurate numerical solutions, while some have contributed analytical solutions useful for verification. In addition, several of these authors have contributed to the numerical methods needed in verification activities, e.g., development of procedures using Richardson extrapolation. Note, however, that many of these authors, especially those in the early years, do not refer to verification or may even refer to their work as “validation” benchmarks. This practice simply reflects the past, and even present, confusion and ambiguity in the terminology. Several textbooks, including [151, 176, 223, 256, 341, 343], also contain a number of analytical solutions that are useful for verification of CFD codes.
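
Because several of the verification procedures cited above rely on Richardson extrapolation, a minimal sketch of how it is typically applied may be useful. The example below is our own illustration, not code from any of the cited references; the grid-solution values and the refinement ratio are hypothetical, and the formulas assume the solutions lie in the asymptotic convergence range.

```python
import math

def observed_order(f_coarse, f_medium, f_fine, r):
    """Observed order of accuracy from solutions on three grids with a
    constant refinement ratio r (assumes monotone convergence in the
    asymptotic range)."""
    return math.log((f_coarse - f_medium) / (f_medium - f_fine)) / math.log(r)

def richardson_extrapolate(f_medium, f_fine, r, p):
    """Richardson-extrapolated estimate of the exact value, plus the
    corresponding error estimate for the fine-grid solution."""
    f_exact_est = f_fine + (f_fine - f_medium) / (r**p - 1.0)
    return f_exact_est, f_exact_est - f_fine

# Hypothetical coarse/medium/fine values of a solution functional
# (e.g., a drag coefficient), with grid refinement ratio r = 2.
f3, f2, f1 = 1.0320, 1.0082, 1.0021
p = observed_order(f3, f2, f1, r=2.0)
f_exact, fine_error = richardson_extrapolate(f2, f1, r=2.0, p=p)
print(f"observed order ~ {p:.2f}, extrapolated value ~ {f_exact:.5f}, "
      f"fine-grid error estimate ~ {fine_error:.2e}")
```

The observed order computed this way can be compared with the formal order of the scheme, and the difference between the fine-grid solution and the extrapolated value serves as an a posteriori estimate of the discretization error.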

Work in validation methodology and validation experiments has also been conducted by a large number of researchers through the years. Examples of this published work can be found in [2-4, 8-10, 32, 37, 38, 45, 53, 55, 58, 76, 79-81, 90, 101, 102, 106, 123, 135, 136, 152, 163, 166, 174, 201, 202, 204-208, 211, 213, 214, 227, 234, 236-238, 240, 241, 246, 254, 265, 269, 272, 274, 276, 278, 279, 302, 310-313, 317, 324, 326, 330, 331, 338, 342]. Much of this work has focused on issues such as fundamental methodology and terminology in V&V, development of the concepts and procedures for validation experiments, confidence in predictions based on
validated simulations, and methods of incorporating validation into the engineering design process. Some of this work has dealt with experiments that were specifically designed as validation
experiments. (This topic is discussed in detail in Section 4.) Essentially all of this early work dealt with CFD for aircraft and reentry vehicle aerodynamics, gas turbine engines, and turbopumps.

In parallel with the aerospace activities, there have been significant efforts in V&V
methodology in the field of environmental quality modeling—most notably, efforts to model the quality of surface and ground water and to assess the safety of underground radioactive-waste repositories [35, 85, 96, 178, 196, 251, 282, 288, 299, 305, 323, 328, 352]. This water-quality work is significant for two reasons. First, it addresses validation for complex processes in the physical sciences where validation of models is extremely difficult, if not impossible. One reason for this difficulty is extremely limited knowledge of underground transport and material properties. For such situations, one must deal with calibration or parameter estimation in models. Second, because of the limited knowledge, the environmental-modeling field has adopted statistical methods of calibration and validation assessment.

The typical validation procedure in CFD, as well as other fields, involves graphical
comparison of computational results and experimental data. If the computational results “generally agree” with the experimental data, the computational results are declared “validated.” Comparison of computational results and experimental data on a graph, however, is only incrementally better than a qualitative comparison. With a graphical comparison, one does not commonly see
quantification of the numerical error or quantification of computational uncertainties due to missing initial conditions, boundary conditions, or modeling parameters. Also, an estimate of experimental
uncertainty is not typically quoted, and in most cases it is not even available. In the event that computational uncertainty due to missing data or experimental uncertainty was available, statistical comparisons would be required. A graphical comparison also gives little quantitative indication of how the agreement varies over the range of the independent variable, e.g., space or time, or the parameter of interest, e.g., Reynolds number or a geometric parameter. An important issue concerns how comparisons of computational results and experimental data could be better quantified. We suggest that validation quantification should be considered as the evaluation of a metric, or a variety of appropriate metrics, for measuring the consistency of a given computational model with respect to experimental measurements. A metric would quantify both errors and uncertainties in the comparison of computational results and experimental data. Only a few researchers have pursued this topic [35, 76, 103, 128, 148, 157, 181, 192, 194-196, 246]. And of these researchers, only two [76, 246] are in the field of fluids-engineering systems.
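
To suggest what such a metric might look like in its simplest form, the sketch below (an illustrative construction of our own, not one of the metrics developed later in this paper) computes an average relative error between computed and measured values of a response quantity and reports it alongside the average relative experimental uncertainty, so that the comparison is expressed as numbers rather than judged from a graph. All data values and uncertainty estimates in the example are hypothetical.

```python
import numpy as np

def relative_error_metric(y_computed, y_measured):
    """Average relative disagreement between computation and experiment
    over the measured points (illustrative metric only)."""
    y_computed = np.asarray(y_computed, dtype=float)
    y_measured = np.asarray(y_measured, dtype=float)
    return np.mean(np.abs(y_computed - y_measured) / np.abs(y_measured))

# Hypothetical response quantity measured at four conditions, with an
# estimated one-standard-deviation experimental uncertainty for each point.
y_exp = np.array([0.92, 1.10, 1.31, 1.55])
u_exp = np.array([0.04, 0.04, 0.05, 0.06])
y_cfd = np.array([0.95, 1.07, 1.36, 1.49])

metric = relative_error_metric(y_cfd, y_exp)
avg_exp_rel_unc = np.mean(u_exp / np.abs(y_exp))
print(f"average |relative error| = {metric:.3f}, "
      f"average relative experimental uncertainty = {avg_exp_rel_unc:.3f}")
```

Whether the computed disagreement is acceptable then becomes a quantitative judgment made in light of the stated experimental uncertainty and the intended use of the model, rather than an impression taken from overlaid curves.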

An additional important thread in the increasing importance and visibility of validation has been the construction and dissemination of experimental databases. Some of the important work in this area was performed by the NATO Advisory Group for Aerospace Research and Development (AGARD) and several independent researchers [1, 5-10, 38, 94, 95, 134, 162, 163, 207, 300-302]. Participants in the Fluid Dynamics Panel of AGARD recognized very early the importance of validation databases in the development and maturing of CFD for use in engineering
applications. Although many of the defining characteristics of true validation experiments were lacking in the early work, the work of the Fluid Dynamics Panel helped develop those
characteristics. V&V databases for CFD are now appearing on the World Wide Web, with some of the most important of these described in [109, 118, 220, 222, 230]. The Web provides
extraordinarily easy access to these databases not only within a country but also around the globe. However, most of these databases have restricted access to part or all of the database. Even though some initiatives have been started for constructing verification and validation databases, we believe it is fair to characterize most of the present effort as ad hoc and duplicative.

An examination of the literature from the diverse disciplines of OR, CFD (primarily for aerospace sciences), and environmental sciences clearly shows that each discipline developed concepts and procedures essentially independently. Each of these fields has emphasized different aspects of V&V, such as terminology and philosophy, numerical error estimation, experimental methods and uncertainty estimation, benefits of V&V in the engineering design environment, and uses of V&V in environmental risk assessment using modeling and simulation. This paper attempts to inform the aerospace-sciences CFD community about the productive and beneficial work performed by the OR community and by other communities, such as the environmental sciences.

2.3 Methodology for Verification

In 1992, the AIAA Computational Fluid Dynamics Committee on Standards began a project to formulate and standardize the basic terminology and methodology in the V&V of CFD
simulations. The committee is composed of representatives from academia, industry, and government, with representation from the U.S., Canada, Japan, Belgium, Australia, and Italy. After six years of discussion and debate, the committee’s project culminated in the publication of Guide for the Verification and Validation of Computational Fluid Dynamics Simulations [12], referred to herein as the “AIAA Guide.” The AIAA Guide defines a number of key terms, discusses fundamental concepts, and specifies general procedures for conducting V&V in CFD.


AIAA standards documents are segregated into three levels that reflect the state of the art: guides, recommended practices, and standards. The AIAA Guide is at the first level, denoting the early stage of development of concepts and procedures in V&V. It is also the first standards-like document to be published by any engineering society on the topic of V&V. The American Society of Mechanical Engineers (ASME) has recently formed a new standards committee and is
developing a similar document in the field of solid mechanics [20]. In this section, we briefly review portions of the AIAA Guide that deal with fundamental V&V methodology.

Verification is the process of determining that a model implementation accurately represents the developer's conceptual description of the model and the solution to the model. The
fundamental strategy of verification is the identification, quantification, and reduction of errors in the computational model and its solution. To quantify numerical solution error, a highly accurate, reliable fiducial (benchmark) must be available. Highly accurate solutions refer to either analytical solutions or highly accurate numerical solutions. Highly accurate solutions, unfortunately, are only available for simplified model problems. Verification, thus, provides evidence
(substantiation) that the conceptual (continuum mathematics) model is solved correctly by the discrete mathematics embodied in the computer code. The conceptual model does not require any relationship to the real world. As Roache [278] lucidly points out, verification is a mathematics and computer science issue; not a physics issue. Validation is a physical sciences and mathematics issue. Figure 2 depicts the verification process of comparing the numerical solution from the code in question with various types of highly accurate solutions.

Figure 2 Verification Process [12]: comparison of the computational solution with the correct answer provided by highly accurate solutions (analytical solutions, benchmark ordinary differential equation solutions, and benchmark partial differential equation solutions).

Verification activities are primarily performed early in the development cycle of a
computational code. However, these activities must be repeated when the code is subsequently modified or enhanced. Although the required accuracy of the numerical solutions that are obtained during validation activities depends on the problem to be solved and the intended uses of the code, the accuracy requirements in verification activities are generally more stringent than the accuracy requirements in validation activities. Note that the recommended methodology presented here applies to finite difference, finite volume, finite element, and boundary element discretization procedures. An extensive description of verification methodology and procedures is given in [278].

Given a numerical procedure that is stable, consistent, and robust, the five major sources of errors in CFD solutions are insufficient spatial discretization convergence, insufficient temporal discretization convergence, insufficient convergence of an iterative procedure, computer round-off, and computer programming errors. The emphasis in verification is the identification and
quantification of these errors, as well as the demonstration of the stability, consistency, and robustness of the numerical scheme. Stated differently, an analytical or formal error analysis is inadequate in the verification process; verification relies on demonstration. Error bounds, such as global or local error norms, must be computed in order to quantitatively assess the test of
agreement shown in Fig. 2. Upon examining the CFD literature as a whole, it is our view that verification testing of computer codes is severely inadequate. The predominant view in the field of CFD, as in scientific software in general, is that little verification testing is necessary prior to utilization. We believe this view is indicative of the relative immaturity of the field of CFD, that is, much of the literature is of a research nature or concentrates on numerical issues such as solution algorithms, grid construction, or visualization. When CFD is applied to a real-world engineering application, such as design or risk assessment, there is typically an abundance of experimental data or experience with the application. For confidence in CFD to improve, especially for engineering applications where there is little experimental data or experience, we contend that additional effort should be expended on verification testing.
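
As a small, self-contained illustration of the kind of demonstration-based testing described above, the sketch below compares a standard second-order central-difference approximation against an analytical solution using a discrete L2 error norm on a sequence of refined grids; the observed rate at which the norm decreases demonstrates the order of accuracy rather than merely asserting it. The model problem and grid sizes are chosen purely for illustration.

```python
import numpy as np

def l2_error_of_second_derivative(n):
    """Discrete L2 norm of the error of the central-difference approximation
    to d2u/dx2 for u(x) = sin(pi x) on [0, 1] with n intervals."""
    x = np.linspace(0.0, 1.0, n + 1)
    h = x[1] - x[0]
    u = np.sin(np.pi * x)
    exact = -np.pi**2 * np.sin(np.pi * x[1:-1])
    approx = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / h**2
    return np.sqrt(h * np.sum((approx - exact) ** 2))

# Grid refinement study: the observed order is log2(E(h) / E(h/2)).
errors = {n: l2_error_of_second_derivative(n) for n in (16, 32, 64, 128)}
ns = sorted(errors)
for coarse, fine in zip(ns[:-1], ns[1:]):
    p_obs = np.log2(errors[coarse] / errors[fine])
    print(f"n = {fine:4d}: L2 error = {errors[fine]:.3e}, observed order ~ {p_obs:.2f}")
```

In an actual verification exercise, the same kind of norm would be computed from the full code on systematically refined grids against an analytical or highly accurate benchmark solution, and the observed convergence rate compared with the formal order of the discretization.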

The first three error sources listed above (spatial discretization error, temporal discretization error, and iterative error) are considered to be within the traditional realm of CFD, and there is extensive literature, some of it referenced in Section 2.2, dealing with each of these topics. The fourth error source, computer round-off, is rarely dealt with in CFD. Collectively, these four topics in verification could be referred to as solution verification or solution accuracy assessment. The fifth source, programming errors, is generally considered to be in the realm of computer science or software engineering. Programming errors, which can occur in input data files, source code programming, output data files, compilers, and operating systems, are addressed using methods and tools in software quality assurance, also referred to as software quality engineering (SQE) [186]. The identification of programming errors could be referred to as code verification, as opposed to solution verification [278, 279]. Although the CFD community generally has given little attention to SQE, we believe that more effort should be devoted to this topic. Section 3 discusses issues in SQE, but the primary emphasis in that section is on estimation of error from the first three sources.

2.4 Methodology for Validation

Validation is the process of determining the degree to which a model is an accurate representation of the real world from the perspective of the intended uses of the model [12]. Validation deals with the assessment of the comparison between sufficiently accurate computational results and the experimental data. Validation does not specifically address how the computational model can be changed to improve the agreement between the computational results and the experimental data, nor does it specifically address the inference of the model’s accuracy for cases different from the validation comparison. The fundamental strategy of validation involves identification and quantification of the error and uncertainty in the conceptual and computational models, quantification of the numerical error in the computational solution, estimation of the experimental uncertainty, and finally, comparison between the computational results and the experimental data. That is, accuracy is measured in relation to experimental data, our best measure of reality. This strategy does not assume that the experimental measurements are more accurate than the computational results. The strategy only asserts that experimental measurements are the most faithful reflections of reality for the purposes of validation. Validation requires that the estimation process for error and uncertainty must occur on both sides of the coin: mathematical physics and experiment. Figure 3 depicts the validation process of comparing the computational results of the modeling and simulation process with experimental data from various sources.

Figure 3 Validation Process [12]: comparison of the computational solution with the correct answer provided by experimental data from unit problems, benchmark cases, subsystem cases, and the complete system.

Because of the infeasibility and impracticality of conducting true validation experiments on most complex systems, the recommended method is to use a building-block approach [12, 79, 201, 207, 310, 311]. This approach divides the complex engineering system of interest into three, or more, progressively simpler tiers: subsystem cases, benchmark cases, and unit problems. (Note that in the AIAA Guide, the building-block tiers are referred to as phases.) The strategy in the tiered approach is to assess how accurately the computational results compare with the experimental data (with quantified uncertainty estimates) at multiple degrees of physics coupling and geometric complexity (see Fig. 4). The approach is clearly constructive in that it (1)
recognizes that there is a hierarchy of complexity in systems and simulations and (2) recognizes that the quantity and accuracy of information that is obtained from experiments varies radically over the range of tiers. It should also be noted that additional building-block tiers beyond the four that are discussed here could be defined, but additional tiers would not fundamentally alter the
recommended methodology.

Figure 4 Validation Tiers [12]: complete system, subsystem cases, benchmark cases, and unit problems.

The complete system consists of the actual engineering hardware for which a reliable computational tool is needed. Thus, by definition, all the geometric and physics effects occur simultaneously. For typical complex engineering systems (e.g., a gas turbine engine), multidisciplinary, coupled physical phenomena occur together. Data are measured on the engineering hardware under realistic operating conditions. The quantity and quality of these measurements, however, are essentially always very limited. It is difficult, and sometimes
impossible, to quantify most of the test conditions that are required for computational modeling, e.g., various fluid flow-rates, thermophysical properties of the multiple fluids, and coupled time-dependent boundary conditions. Not only are many required modeling parameters unmeasured, but no experimental uncertainty analysis is generally conducted.

Subsystem cases represent the first decomposition of the actual hardware into simplified systems or components. Each of the subsystems or components is composed of actual hardware from the complete system. Subsystem cases usually exhibit three or more types of physics that are coupled. Examples of types of physics are fluid dynamics, structural dynamics, solid dynamics, chemical reactions, and acoustics. The physical processes of the complete system are partially represented by the subsystem cases, but the degree of coupling between various physical
phenomena in the subsystem cases is typically reduced. For example, there is normally reduced coupling between subsystems as compared to the complete system. The geometric features are restricted to the subsystem and its attachment, or simplified connection, to the complete system. Although the quality and quantity of the test data are usually significantly better for subsystem cases than for the complete system, there are still limited test data for subsystem cases. Some of the necessary modeling data, initial conditions, and boundary conditions are measured, particularly the most important data.

Experimental data from complete systems and data from subsystem tests are always specific to existing hardware and are available mainly through large-scale test programs. Existing data from these tests have traditionally focused on issues such as the functionality, performance, safety, or reliability of systems or subsystems. For large-scale tests, there is often competition between alternative system, subsystem, or component designs. If the competition is due to outside
organizations or suppliers of hardware, then the ability to obtain complete and unbiased validation information becomes even more difficult. Such tests generally provide only data that are related to engineering parameters of clear design interest, system functionality, and high-level system
performance measures. The obtained data typically have large uncertainty bands, or no attempt has been made to estimate uncertainty. The test programs typically require expensive ground-test facilities, or the programs are conducted in uncontrolled, hostile, or unmeasured environments. Commonly, the test programs are conducted on a rigid schedule and with a tightly constrained budget. Consequently, it is not possible to obtain the complete set of physical modeling
parameters (e.g., thermochemical fluid and material properties), initial conditions, and boundary conditions that are required for quantitative validation assessment. Also, there are certain situations where it is not possible to conduct a validation experiment of the complete system. Such situations could involve public-safety or environmental-safety hazards, unattainable experimental-testing requirements, or international treaty restrictions.

Benchmark cases represent the next level of decomposition of the complete system. For these cases, special hardware is fabricated to represent the main features of each subsystem. By special hardware, we mean hardware that is specially fabricated with simplified properties or materials. For example, benchmark hardware is normally not functional hardware nor is it fabricated with the same materials as actual subsystems or components. For benchmark cases, typically only two or three types of coupled flow physics are considered. For example, in fluid dynamics one could have turbulence, combustion, and two-phase flow, but one would eliminate any structural dynamics coupling that might exist at the subsystem level. The benchmark cases are normally simpler geometrically than those cases at the subsystem level. The only geometric features that are retained from the subsystem level are those that are critical to the types of physics
that are considered at the benchmark level. Most of the experimental data that are obtained in benchmark cases have associated estimates of measurement uncertainties. Most of the required modeling data, initial conditions, and boundary conditions are measured, but some of the less important experimental data have not been measured. The experimental data, both code input data and system response data, are usually documented with moderate detail. Examples of important experimental data that are documented include detailed inspection of all hardware, specific characterization of materials and fluids used in the experiment, and detailed measurement of environmental conditions that were produced by the experimental apparatus or testing equipment.

Unit problems represent the total decomposition of the complete system. At this level, high-precision, special-purpose hardware is fabricated and inspected, but this hardware rarely resembles the hardware of the subsystem from which it originated. One element of complex physics is allowed to occur in each of the unit problems that are examined. The purpose of these problems is to isolate elements of complex physics so that mathematical models or submodels can be critically evaluated. For example, unit problems could individually involve turbulence, laminar flow combustion, or laminar gas/liquid droplet flows. Unit problems are characterized by very simple geometries. The geometry features are commonly two-dimensional, but they can be very simple three-dimensional geometries with important geometric symmetry features. Highly instrumented, highly accurate experimental data are obtained from unit problems, and an extensive uncertainty analysis of the experimental data is prepared. If possible, experiments on unit
problems are repeated at separate facilities to ensure that bias (systematic) errors in the experimental data are identified. For unit problems, all of the important code input data, initial conditions, and boundary conditions are accurately measured. These types of experiments are commonly

conducted in universities or in research laboratories.
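
Repeating a unit-problem experiment at separate facilities supports a simple quantitative check for facility-to-facility bias, of the kind sketched below. This Python fragment is our illustration only; the measured values and uncertainties are hypothetical, and the criterion used (comparing the difference of facility means against the combined standard uncertainties with a coverage factor) is a common rule of thumb rather than a prescribed procedure.

```python
# Illustrative sketch: compare repeated measurements of the same quantity from two
# facilities to flag a possible bias (systematic) error. Data values are hypothetical.
import numpy as np

facility_a = np.array([4.61, 4.58, 4.63, 4.60, 4.62])   # e.g., heat flux, kW/m^2
facility_b = np.array([4.71, 4.69, 4.73, 4.70])

def mean_and_std_uncertainty(x):
    """Sample mean and standard uncertainty of the mean."""
    return x.mean(), x.std(ddof=1) / np.sqrt(len(x))

mean_a, u_a = mean_and_std_uncertainty(facility_a)
mean_b, u_b = mean_and_std_uncertainty(facility_b)

difference = abs(mean_a - mean_b)
combined_u = np.sqrt(u_a**2 + u_b**2)

# If the facility means differ by much more than the combined random uncertainty,
# an unidentified bias error in one (or both) experiments is a plausible explanation.
k = 2.0   # coverage factor, a common (but not universal) choice
if difference > k * combined_u:
    print(f"possible bias: |{mean_a:.3f} - {mean_b:.3f}| = {difference:.3f} "
          f"> {k:.0f} x {combined_u:.3f}")
else:
    print("facility means agree within their combined random uncertainty")
```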

Experimental data from benchmark cases and unit problems should be of the quantity and quality that are required in true validation experiments. If, however, significant parameters that are needed for the CFD simulation of benchmark-case and unit-problem experiments were not measured, then the analyst must assume values for these quantities. For example, suppose that for benchmark cases and unit problems careful dimensional measurements and specific material properties of the hardware were not obtained. In this case, the computational analyst typically assumes reasonable or plausible values for the missing data, possibly from an engineering handbook. An alternative technique, although rarely used in CFD, is for the analyst to assume probability distributions with specified means and variances for the unmeasured parameters. Multiple computations are then performed using these assumed distributions, and likelihoods are computed for the output quantities that are compared with the experimental data. In existing or older published experimental data for benchmark cases and unit problems, it is common that a significant number of parameters are missing from the description of the experiment. Experiments in the past were typically conducted to improve the physical understanding of specific phenomena or to determine parameters in models, rather than to validate CFD models. That is, these experiments were used inductively to construct mathematical models of physical phenomena, rather than deductively to evaluate the validity of models.
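
The alternative technique just mentioned, in which unmeasured parameters are assigned assumed probability distributions and multiple computations are performed, can be sketched as follows. The Python fragment is purely illustrative: the parameter names, the assumed distributions, and the simulate function are hypothetical stand-ins for whatever the actual CFD code and missing inputs would be.

```python
# Illustrative sketch: propagate assumed distributions for unmeasured inputs through
# a simulation by sampling, producing a distribution of the output quantity that can
# then be compared with the experimental measurement. All names and numbers are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n_samples = 200

# Assumed distributions (means and variances chosen by the analyst, e.g., handbook values).
wall_conductivity = rng.normal(loc=16.0, scale=0.8, size=n_samples)    # W/(m K)
gap_thickness = rng.normal(loc=1.5e-3, scale=1.0e-4, size=n_samples)   # m

def simulate(k_wall, gap):
    """Hypothetical stand-in for a CFD run returning one output quantity."""
    return 350.0 + 4.0 / (k_wall * 1.0e-2 + gap * 50.0)   # a made-up response

outputs = np.array([simulate(k, g) for k, g in zip(wall_conductivity, gap_thickness)])
print(f"predicted output: mean = {outputs.mean():.2f}, "
      f"std = {outputs.std(ddof=1):.2f}, "
      f"95% interval ~ [{np.percentile(outputs, 2.5):.2f}, "
      f"{np.percentile(outputs, 97.5):.2f}]")
```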


3. Verification Assessment

3.1 Introduction

The verification assessment activities discussed in this paper apply to finite-difference, finite-volume, and finite-element discretization procedures. Roache has extensively treated the subject of verification in his book [278], and a detailed description of verification methodology and procedures can be found there. It is not the purpose of our current treatment to completely review material that is otherwise available in that reference and others. Instead, we desire to summarize key features of verification assessment that serve to emphasize its role as an important partner with validation assessment. We review selected topics in verification assessment for CFD that most clearly illustrate the importance of such assessment.

3.2 Fundamentals of Verification

3.2.1 Definitions and General Principles

This section discusses some fundamentals of verification that are required to place the remainder of this section in proper perspective. We emphasize the proper and necessary role of verification, as well as its relation to validation. We give a discussion of error estimation and error quantification in verifying calculations, as well as the distinction between verifying codes and verifying calculations. Proper focus on error has impact on both of these assessment activities. In fact, verification assessment may be viewed as the process that minimizes our belief that there are errors in the code, as well as in particular calculations. Verification assessment can be considered as an optimization problem that is certainly constrained by available resources and needs, as well as by intended applications of the code. We also introduce the appropriate role of software engineering (SE), also called software quality engineering (SQE), in verification assessment.

Verification as defined in Section 2 is an equal partner to validation in the overall verification and validation strategy for a code and its applications. Validation depends on solution accuracy as well as on experimental accuracy. For example, computational error that arises from failure to adequately converge a calculation contributes to the discrepancy, or to the apparent agreement, between that calculation and the experimental results with which it is compared when validation is performed. If severe enough, such a strictly computational error could dominate this discrepancy; in complex problems, it raises the real possibility of confusing a coding error with a mathematical modeling error. The goal should be to distinguish errors in mathematical modeling accuracy as clearly as possible from other errors.

The study of solution error is fundamentally empirical. Achieving the goal of rigorously demonstrating that solution error is small for all feasible applications of a CFD code is essentially impossible for complex codes. However, such a goal may be feasible for particular calculations using those codes. In this sense, verification assessment is quite similar to validation assessment. We aim to understand what the computational error is for given calculations or types of calculations, not for codes in general. In many cases, this error may be understood to be small enough to justify the beginning of validation assessment, or continuation of validation assessment in different directions. We can expect that over time we will accumulate increasing amounts of empirical evidence that the error is indeed small enough for important classes of computations, rather than simply for single calculations. Thus we can hope to generalize the understanding we acquire from the study of specific calculations. It is unlikely, however, that rigorous generalization beyond this approach will be possible for the foreseeable future.

Assessment of the accuracy of code solutions, given the underlying equations, is the basic objective of verification assessment. Developing confidence that the code solution is accurate for problems other than verification tests is a key goal of this effort. An obvious requirement, therefore, is to perform the verification activities in a way that maximizes confidence in the accuracy of new calculations. We will call this the confidence optimization problem.

In verification assessment, the goal of achieving a predictive understanding of the numerical accuracy of future application of codes is strongly dependent upon a detailed understanding of the numerical accuracy of past applications of the code. There are several major themes of verification assessment that enable the achievement of this goal. These include (1) classical concepts of convergence of discretizations of partial differential equations (PDEs), especially a posteriori error estimates; (2) the formalization of testing in performing verification assessment; (3) the role of benchmarks as accuracy quantification standards; and (4) the impact of SQE (software quality engineering). These themes are discussed below.
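
One of the classical concepts listed above, convergence of discretizations with a posteriori error estimation, is often exercised in practice by computing an observed order of accuracy and an extrapolated error estimate from solutions on systematically refined grids. The short Python sketch below is our illustration only, under the usual assumptions of a smooth solution and grids in the asymptotic range; the drag-coefficient values and the refinement ratio are hypothetical.

```python
# A minimal sketch of a common a posteriori error estimate: Richardson extrapolation
# applied to a scalar output computed on three systematically refined grids with a
# constant refinement ratio r. The numerical values below are hypothetical.
import math

def observed_order(f_coarse, f_medium, f_fine, r):
    """Observed order of accuracy from three grid levels (refinement ratio r)."""
    return math.log((f_coarse - f_medium) / (f_medium - f_fine)) / math.log(r)

def richardson_estimate(f_medium, f_fine, p, r):
    """Extrapolated 'grid-converged' value and estimated error on the fine grid."""
    f_exact_est = f_fine + (f_fine - f_medium) / (r**p - 1.0)
    return f_exact_est, abs(f_exact_est - f_fine)

# Hypothetical drag-coefficient values from coarse, medium, and fine grids (r = 2).
f1, f2, f3 = 0.02970, 0.02855, 0.02828   # coarse, medium, fine
p = observed_order(f1, f2, f3, r=2.0)
f_est, err = richardson_estimate(f2, f3, p, r=2.0)
print(f"observed order ~ {p:.2f}, extrapolated value ~ {f_est:.5f}, "
      f"estimated fine-grid error ~ {err:.2e}")
```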

The verification process traditionally rests upon comparing computational solutions to the “correct answer,” which is provided by “highly accurate solutions” for a set of well-chosen test problems. As a result, this is a test-dominated strategy for solving the problem of optimizing confidence. The resulting confidence is a specific product of the knowledge that computational error is acceptably small on such a set of tests.
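
When a test problem has an analytical solution, the comparison with the "correct answer" is typically quantified with discrete error norms of the computed field. The fragment below is a minimal sketch assuming a one-dimensional field on a uniform grid; the "numerical" solution shown is a stand-in, since in practice it would come from the code under verification.

```python
# Minimal sketch: discrete L2 and L-infinity error norms of a computed field against
# an analytical solution on the same uniform grid. u_num is a stand-in here; in
# practice it would be produced by the code undergoing verification.
import numpy as np

def error_norms(u_num, u_exact, dx):
    """Discrete L2 and L-infinity norms of the solution error."""
    e = u_num - u_exact
    return np.sqrt(np.sum(e**2) * dx), np.max(np.abs(e))

x = np.linspace(0.0, 1.0, 41)
dx = x[1] - x[0]
u_exact = np.sin(np.pi * x)                            # hypothetical exact solution
u_num = u_exact + 1.0e-3 * np.sin(3.0 * np.pi * x)     # stand-in for a computed solution

l2_error, linf_error = error_norms(u_num, u_exact, dx)
print(f"L2 error = {l2_error:.3e}, Linf error = {linf_error:.3e}")
```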

The “correct answer” can only be known in a relatively small number of isolated cases. These cases therefore assume a very important role in verification and are usually carefully formalized in test plans for verification assessment of the code. However, because the elements that constitute such test plans are sparse, there should be some level of concern in the CFD community when these plans are the dominant content of the verification assessment activities. Elements of verification assessment that transcend testing, such as SQE practices [113, 215, 260, 292], are also important. Such elements provide supplemental evidence that the CFD code, as written, has a minimal number of errors.

Verification activities are primarily initiated early in the development cycle of a code. This timing further emphasizes the connection between testing alone and more formal SQE activities that are undertaken as part of the software development and maintenance process. One basic problem that underlies the distinction between testing alone and other elements of the verification process is ultimately how constrained resources can be used to achieve the highest confidence in performance of the code. For simplification of our presentation, however, unless it is obvious from our context, we will use the default understanding that “verification” means “testing” in the remainder of this section.

Because of the code-centric nature of many verification activities, the common language used in discussing verification often refers to code verification. What does this concept really mean?


In our view, to rigorously verify a code requires rigorous proof that the computational implementation accurately represents the conceptual model and its solution. This, in turn, requires proof that the algorithms implemented in the code correctly approximate the underlying PDEs, along with the stated initial and boundary conditions. In addition, it must also be proven that the algorithms converge to the correct solutions of these equations in all circumstances under which the code will be applied. It is unlikely that such proofs will ever exist for CFD codes. The inability to provide proof of code verification in this regard is quite similar to the problems posed by validation. Verification, in an operational sense, then becomes the absence of proof that the code is incorrect. While it is possible to prove that a code is functioning incorrectly, it is effectively impossible to prove the opposite. Single examples suffice to demonstrate incorrect functioning, which is also a reason why testing occupies such a large part of the verification assessment effort.

Defining verification as the absence of proof that the code is wrong is unappealing from several perspectives. For example, that state of affairs could result from complete inaction on the part of the code developers or their user community. An activist definition that still captures the philosophical gist of the above discussion is preferable and has been stressed by Peercy [261]. In this definition, verification of a code is equivalent to the development of a legal case. Thus verification assessment consists of accumulating evidence substantiating that the code does not have algorithmic or programming errors, that the code functions properly on the chosen hardware, and so on. This evidence needs to be documented, accessible, referenceable, and repeatable. The accumulation of such evidence also serves to reduce the regimes of operation of the code where one might possibly find such errors. Confidence in the verification status of the code then results from the accumulation of a sufficient mass of evidence.

The present view of code verification as a continuing, ongoing process, akin to accumulating evidence for a legal case, is not universally accepted. In an alternative view [278], code verification is not ongoing but reaches a termination, more akin to proving a theorem. Obviously, the termination can only be applied to a fixed code; if the code is modified, it is a new code (even if the name of the code remains) and the new code must be re-verified. Also, all plausible non-independent combinations of input options must be exercised so that every line of code is executed in order to claim that the entire code is verified; otherwise, the verification can be claimed only for the subset of options exercised. The ongoing code exercise by multiple users is still useful, in an evidentiary sense (and in user training), but is referred to as confirmation rather than code verification. In this alternative view of verification, it is argued that contractual and/or regulatory requirements for delivery or use of a "verified code" can be more easily met, and superficial exercises are less likely to be claimed as partial verification. Ongoing use and exercise can possibly uncover mistakes missed in the code verification process, just as a theorem might turn out to have a faulty proof or to have been misinterpreted, but (in this view) the code verification can be completed, at least in principle. Verification of individual calculations, and certainly validations, are still viewed as ongoing processes, of course.
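
Whichever view is adopted, a claim about how much of the code a verification test suite actually exercises can be documented mechanically with a coverage tool. The sketch below uses the coverage.py package for Python codes purely as an example of the idea; the package, module, and driver names shown are hypothetical placeholders, and a compiled CFD code would rely on analogous compiler-based coverage instrumentation rather than this API.

```python
# Illustrative only: record which source lines a verification test suite executes,
# using the coverage.py API. "flow_solver" and "verification_suite" are hypothetical.
import coverage

cov = coverage.Coverage(source=["flow_solver"])   # hypothetical package under test
cov.start()

from verification_suite import run_all_cases      # hypothetical test driver
run_all_cases()

cov.stop()
cov.save()
cov.report(show_missing=True)   # lists executed and unexecuted lines per file
```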

The primary approach to proving that implemented code is a rigorously correct representation of the underlying conceptual model is through the use of formal methods. A great deal of effort has recently been devoted to the development and application of these methods [51, 258, 287]. The actual goal of formal methods, namely rigorous "proof" that a system of software is correctly implemented, remains controversial in our opinion. The application of these methods is also complicated by disputes over issues of cost, appropriateness, utility, and impact [50].

Formal methods have certainly not been applied to software systems characteristic of those of interest in CFD, namely those systems in which floating point arithmetic is dominant and an effectively infinite variability in software application is the norm. The utility of these methods to CFD codes, even if resource constraints were not issues, is an unsolved problem. This fact has led to interest in the application of formal methods to more restricted problems that still may be of significant interest in CFD software. For example, there is current interest in applying formal methods to aid in verification of the mathematical formalism of the conceptual model that underlies a CFD code rather than in the full details of the software implementation [72, 188].

3.2.2 Developing the Case for Code Verification

How should evidence supporting confidence in code verification be assembled in greater detail? It is clear that accumulating evidence for code verification is a multifold problem. For example, at the very beginning of software implementation of algorithms, any numerical analysis of the algorithms implemented in the code becomes a fundamental check for code verification. Given the simplifying assumptions that underlie most effective numerical analysis (such as linearity of the PDEs), failure of the algorithms to obey the constraints imposed by numerical analysis can only reduce our confidence in the verification status of the code as a whole. As verification assessment proceeds, we are necessarily forced to relax the simplifying assumptions that facilitated numerical analysis. Evidence therefore becomes increasingly characterized by computational experiments and the resulting empirical evidence of code performance.

The scientific community places great emphasis on the evidence developed by executing test problems with the code that is undergoing verification. In fact, testing of this type (including the “free” testing that occurs by allowing users to apply the code) is generally considered to be the most important source of verification evidence for CFD and engineering codes [18, 278]. As is emphasized in Section 3.4, this evidence would be greatly strengthened by the formation and application of agreed-upon test suites and of standards for comparing results of calculations with the benchmarks established by the tests, as well as by community agreement on the levels of accuracy that are required to provide evidence that the code has correctly solved the test problem.
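
An agreed-upon test suite of the kind advocated here would, in effect, codify checks such as the following sketch: a computed quantity for a standard test problem is compared against an established benchmark value using an agreed accuracy level. The function, the benchmark value, and the tolerance below are hypothetical placeholders, not community standards.

```python
# Illustrative sketch of a codified benchmark comparison. The benchmark value and the
# accuracy tolerance are hypothetical placeholders for community-agreed numbers.
def check_against_benchmark(computed_value,
                            benchmark_value=-0.3886,   # hypothetical agreed value
                            tolerance=5.0e-3):         # hypothetical agreed accuracy level
    """Return the absolute error; raise if it exceeds the agreed tolerance."""
    error = abs(computed_value - benchmark_value)
    if error > tolerance:
        raise AssertionError(
            f"benchmark miss: error {error:.2e} exceeds agreed tolerance {tolerance:.1e}")
    return error

# Example usage with a stand-in number in place of an actual code result:
print("error =", check_against_benchmark(-0.3871))
```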

Gathering evidence from the user community on the performance of a CFD code contributes to verification assessment. We distinguish this information from formal planned testing because user evidence is not typically accumulated by executing a rational test plan (such as would enter into the design of a test suite for the code). The accumulation of user evidence has some characteristics similar to what is called random testing in the software testing literature [175], a topic we also discuss briefly in Section 3.4. Even though user testing is ad hoc compared to formal planned testing, it is the type of testing that is typically associated with verification assessment of CFD software. User testing is known to be incredibly effective at uncovering errors in codes and typically contributes to code verification by uncovering errors that are subsequently fixed, although this may be accompanied by a corresponding loss of confidence in the code by the user. User testing is also a major cost that can conveniently be hidden from the overall cost and effort associated with code verification assessment centered on the software developers. Both formal controlled testing and "random" user testing are important in CFD verification assessment.
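
The idea behind random testing can be illustrated with a short sketch: drive a routine with randomly sampled but valid inputs and check invariants that any correct implementation must satisfy, such as conservation and boundedness. The advection routine below is a hypothetical stand-in for a piece of a real CFD code, and the invariants checked are specific to that simple upwind scheme.

```python
# A sketch of randomized testing: feed a solver-like function randomly sampled valid
# inputs and check invariants of the scheme. "advect_scalar" is a hypothetical stand-in.
import numpy as np

rng = np.random.default_rng(12345)

def advect_scalar(phi, velocity, dt, dx):
    """Hypothetical stand-in: one upwind step of linear advection on a periodic grid."""
    c = velocity * dt / dx
    return phi - c * (phi - np.roll(phi, 1))

for trial in range(100):
    n = int(rng.integers(16, 257))
    phi0 = rng.random(n)                       # random non-negative field
    dx = 1.0 / n
    vel = float(rng.uniform(0.1, 1.0))
    dt = 0.5 * dx / vel                        # respect the CFL limit
    phi1 = advect_scalar(phi0, vel, dt, dx)
    # Invariants for conservative, monotone upwind advection on a periodic domain:
    assert np.isclose(phi1.sum(), phi0.sum()), "mass not conserved"
    assert phi1.min() >= phi0.min() - 1e-12, "new minimum created"
    assert phi1.max() <= phi0.max() + 1e-12, "new maximum created"
print("100 randomized trials passed")
```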
