Reliability Modeling

The RIAC Guide to Reliability Prediction, Assessment and Estimation

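$$L_I\left[(T_{a1},T_{b1}),\ldots,(T_{aL},T_{bL}) \mid \theta\right] \propto \prod_{i=1}^{L}\left[F(T_{bi}\mid\theta) - F(T_{ai}\mid\theta)\right]$$

$$1 - CL = \sum_{k=0}^{r}\frac{(\lambda t)^{k} e^{-\lambda t}}{k!} = e^{-\lambda t}\left[1 + \lambda t + \cdots + \frac{(\lambda t)^{r-1}}{(r-1)!} + \frac{(\lambda t)^{r}}{r!}\right]$$

$$\lambda = \lambda_b \, e^{-E_a/(KT)} \, S^{n}$$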


Reliability Modeling - The RIAC Guide to Reliability Prediction, Assessment and Estimation

Prepared by:

Reliability Information Analysis Center
6000 Flanagan Rd.
Suite 3
Utica, NY 13502-1348

Under Contract to:

Defense Technical Information Center
DTIC-AI
8725 John J. Kingman Rd.
Suite 0944
Fort Belvoir, VA 22060

RIAC is a DoD Information Analysis Center sponsored by the Defense Technical Information Center. RIAC is operated by a team of Wyle Laboratories, Quanterion Solutions Inc., the University of Maryland, the Penn State University Applied Research Laboratory and the State University of New York Institute of Technology.


The information and data contained herein have been compiled from government and nongovernment technical reports and from material supplied by various manufacturers and are intended to be used for reference purposes. Neither the United States Government nor the Wyle Laboratories contract team warrant the accuracy of this information and data. The user is further cautioned that the data contained herein may not be used in lieu of other contractually cited references and specifications.

Publication of this information is not an expression of the opinion of The United States Government or of the Wyle Laboratories contract team as to the quality or durability of any product mentioned herein and any use for advertising or promotional purposes of this information in conjunction with the name of The United States Government or the Wyle Laboratories contract team without written permission is expressly prohibited.

ISBN-10: 1-933904-17-8 (Hardcopy)
ISBN-13: 978-1-933904-17-7 (Hardcopy)
ISBN-10: 1-933904-18-6 (PDF Download)
ISBN-13: 978-1-933904-18-4 (PDF Download)


gathering and maintaining the data needed and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden to Department of Defense, Washington Headquarters Services, Directorate for Information Operations and Reports (0704-0188), 1215 Jefferson Davis Highway, Suite 1204, Arlington, VA 22202-4302. Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to any penalty for failing to comply with a collection of information if it does not display a current or valid OMB control number.

PLEASE DO NOT RETURN YOUR FORM TO THE ABOVE ADDRESS.

1. REPORT DATE: 31 May 2010

2. REPORT TYPE: Technical

3. DATES COVERED (From - To): N/A

4. TITLE AND SUBTITLE: Reliability Modeling – The RIAC Guide to Reliability Prediction, Assessment and Estimation

5a. CONTRACT NUMBER: HC1047-05-D-4005

5b. GRANT NUMBER: N/A

5c. PROGRAM ELEMENT NUMBER: N/A

5d. PROJECT NUMBER: N/A

5e. TASK NUMBER: N/A

5f. WORK UNIT NUMBER: N/A

6. AUTHORS: William Denson

7. PERFORMING ORGANIZATIONS NAME(S) AND ADDRESS(ES): Reliability Information Analysis Center, 100 Sherman Rd., Suite C101, Utica, NY 13502-1348

8. PERFORMING ORGANIZATION REPORT NUMBER: RPAE

9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES):
Defense Technical Information Center, DTIC-AI, 8725 John J. Kingman Rd., STE 0944, Ft. Belvoir, VA 22060
Air Force Research Lab/RISE, 525 Brooks Rd., Rome, NY 13440

10. SPONSORING/MONITOR’S ACRONYM(S): DTIC-AI and AFRL/RISE

11. SPONSORING/MONITOR’S REPORT NUMBERS: N/A

12. DISTRIBUTION/AVAILABILITY STATEMENT: Approved for public release, distribution unlimited.

13. SUPPLEMENTARY NOTES: Hardcopies available from Reliability Information Analysis Center, 100 Sherman Rd., Suite C101, Utica, NY 13502-1348. (Price: $85 US/$95 Non-US). PDF Download available from http://theRIAC.org (Price $70).

14. ABSTRACT: The intent of this book is to provide guidance on modeling techniques that can be used to quantify the reliability of a product or system. In this context, reliability modeling is the process of constructing a mathematical model that is used to estimate the reliability characteristics of a product. There are many ways in which this can be accomplished, depending on the product or system and the type of information that is available to, or practical to obtain by, the analyst. This book will review possible approaches, summarize their advantages and disadvantages, and provide guidance on selecting a methodology based on the specific goals and constraints of the analyst. While this book will not discuss the use of specific published methodologies, where examples are provided they use tools and methodologies that the author has personal experience developing, such as life modeling, NPRD, MIL-HDBK-217 and 217Plus.

15. SUBJECT TERMS: Reliability Modeling; Reliability Prediction; Reliability Assessment; Reliability Estimation; NPRD; MIL-HDBK-217; 217Plus

16. SECURITY CLASSIFICATION OF: a. REPORT: UNCLASSIFIED; b. ABSTRACT: UNCLASSIFIED; c. THIS PAGE: UNCLASSIFIED

17. LIMITATION OF ABSTRACT: UNLIMITED

18. NUMBER OF PAGES: 410

19a. NAME OF RESPONSIBLE PERSON: David Nicholls

19b. TELEPHONE NUMBER (include area code): 315.351.4202

Standard Form 298 (Rev. 8/98) Prescribed by ANSI Std. Z39.18


The Reliability Information Analysis Center (RIAC), formerly the Reliability Analysis Center (RAC), is a Department of Defense Information Analysis Center sponsored by the Defense Technical Information Center, managed by the Air Force Research Laboratory (formerly Rome Laboratory), and operated by a team of Wyle Laboratories, Quanterion Solutions, the University of Maryland, the Penn State University Applied Research Laboratory and the State University of New York Institute of Technology. RIAC is chartered to collect, analyze and disseminate reliability, maintainability, quality, supportability and interoperability (RMQSI) information pertaining to systems and products, as well as the components used in them. The RIAC addresses both military and commercial perspectives.

The data contained in the RIAC databases is collected on a continuous basis from a broad range of sources, including testing laboratories, device and equipment manufacturers, government laboratories and equipment users (government and industry). Automatic distribution lists, voluntary data submittals and field failure reporting systems supplement an intensive data solicitation program. Users of RIAC are encouraged to submit their RMQSI data to enhance these data collection efforts. RIAC publishes documents for its users in a variety of formats and subject areas. While most are intended to meet the needs of RMQSI practitioners, many are also targeted to managers and designers. RIAC also offers RMQSI consulting, training and responses to technical and bibliographic inquiries.

REQUESTS FOR TECHNICAL ASSISTANCE AND INFORMATION ON AVAILABLE RIAC SERVICES AND PUBLICATIONS MAY BE DIRECTED TO:

Reliability Information Analysis Center
100 Sherman Rd.
Suite C101
Utica, NY 13502-1348
General Information: (877) 363-RIAC, (877) 363-7422
Technical Inquiries: (315) 351-4200
Fax: (315) 351-4209
E-Mail: [email protected]
Internet: http://theRIAC.org

ALL OTHER RIAC REQUESTS SHOULD BE DIRECTED TO:

Air Force Research Laboratory
AFRL – Systems and Information Interoperability Branch
Attn: R. Hyle
525 Brooks Road
Rome, NY 13441-4505
Telephone: (315) 330-4857
DSN: 587-4857
Fax: (315) 330-7647
E-Mail: [email protected]

Copyright © 2010 by Quanterion Solutions Incorporated. This handbook was developed by Quanterion Solutions Incorporated, in support of the prime contractor (Wyle Laboratories) in the operation of the Department of Defense Reliability Information Analysis Center (RIAC) under Contract HC1047-05-D-4005. The Government has a fully paid up perpetual license for free use of and access to this publication and its contents among all the DOD IACs in both hardcopy and electronic versions, without limitation on the number of users or servers. Subject to the rights of the Government, this document (hardcopy and electronic versions) and the content contained within it are protected by U.S. Copyright Law and may not be copied, automated, re-sold, or redistributed to multiple users without the express written permission. The copyrighted work may not be made available on a server for use by more than one person simultaneously without the express written permission. If automation of the technical content for other than personal use, or for multiple simultaneous user access to a copyrighted work is desired, please contact 877.363.RIAC (toll free) or 315.351.4202 for licensing information.

Table of Contents

1. INTRODUCTION
  1.1. Scope
  1.2. Book Organization
  1.3. Reliability Program Elements
  1.4. The History of Reliability Prediction
  1.5. Acronyms
  1.6. References

2. GENERAL ASSESSMENT APPROACH
  2.1. Define System
  2.2. Identify the Purpose of the Model
  2.3. Determine the Appropriate Level at Which to Perform the Modeling
    2.3.1. Level vs. Data Needed
    2.3.2. Using an FMEA as the basis for a reliability model
    2.3.3. Model Form vs. Level
  2.4. Assess Data Available
  2.5. Determine and Execute Appropriate Approach
    2.5.1. Empirical
      2.5.1.1. Test
      2.5.1.2. Field Data
    2.5.2. Physics
      2.5.2.1. Stress/Strength Modeling
      2.5.2.2. First Principles
  2.6. Combine Data
    2.6.1. Bayesian Inference
  2.7. Develop System Model
    2.7.1. Monte Carlo Analysis
  2.8. References

3. FUNDAMENTAL CONCEPTS
  3.1. Reliability Theory Concepts
  3.2. Probability concepts
    3.2.1. Covariance
    3.2.2. Correlation Coefficient
    3.2.3. Permutations and Combinations
    3.2.4. Mutual Exclusivity

    3.2.5. Independent Events
    3.2.6. Non-independent (Dependent) Events
    3.2.7. Non-independent (Dependent) Events: Bayes Theorem
    3.2.8. System Models
    3.2.9. K-out-of-N Configurations
  3.3. Distributions
    3.3.1. Exponential
    3.3.2. Weibull
    3.3.3. Lognormal
  3.4. References

4. DOE-BASED APPROACHES TO RELIABILITY MODELING
  4.1. Determine the Feature to be Assessed
  4.2. Determine Factors
  4.3. Determine the Factor Levels
  4.4. Design the Tests
  4.5. Perform Tests and Measurements
  4.6. Analyze the Data
  4.7. Develop the Life Model
  4.8. References

5. LIFE DATA MODELING
  5.1. Selecting a Distribution
  5.2. Parameter Estimation Overview
    5.2.1. Closed Form Parameter Approximations
    5.2.2. Least Squares Regression
    5.2.3. Parameter Estimation Using MLE
      5.2.3.1. Brief Historical Remarks
      5.2.3.2. Likelihood Function
      5.2.3.3. Maximum Likelihood Estimator (MLE)
    5.2.4. Confidence Bounds and Uncertainty
      5.2.4.1. Confidence Bounds with MLE
      5.2.4.2. Confidence Bounds Approximations
  5.3. Acceleration Models
    5.3.1. Fundamental Acceleration Models

    5.3.2. Combined Models
    5.3.3. Cumulative Damage Model
  5.4. MLE Equations
    5.4.1. Likelihood Functions
  5.5. References

6. INTERPRETATION OF RELIABILITY ESTIMATES
  6.1. Bathtub Curve
  6.2. Common Cause vs. Special Cause
  6.3. Confidence Bounds
    6.3.1. Traditional Techniques for Confidence Bounds
    6.3.2. Uncertainty in Reliability Prediction Estimates
  6.4. Failure Rate vs pdf
  6.5. Practical Aspects of Reliability Assessments
  6.6. Weibayes
  6.7. Weibull Closure Property
  6.8. Estimating Event-Related Reliability
  6.9. Combining Different Types of Assessments at Different Levels
  6.10. Estimating the Number of Failures
  6.11. Calculation of Equivalent Failure Rates
  6.12. Failure Rate Units
  6.13. Factors to be Considered When Developing Models
    6.13.1. Causes of Electronic System Failure
    6.13.2. Selection of Factors
    6.13.3. Reliability Growth of Components
    6.13.4. Relative vs. Absolute Humidity
  6.14. Addressing Data with No Failures
  6.15. Reliability of Components Used Outside of Their Rating
  6.16. References

7. EXAMPLES
  7.1. MIL-HDBK-217 Model Development Methodology
    7.1.1. Identify Possible Variables
    7.1.2. Develop Theoretical Model
    7.1.3. Collect and QC Data
    7.1.4. Correlation Coefficient Analysis

    7.1.5. Stepwise Multiple Regression Analysis
    7.1.6. Goodness-of-Fit Analysis
    7.1.7. Extreme Case Analysis
    7.1.8. Model Validation
  7.2. 217Plus Reliability Prediction Models
    7.2.1. Background
    7.2.2. System Reliability Prediction Model
      7.2.2.1. 217Plus Background
      7.2.2.2. Methodology Overview
      7.2.2.3. System Reliability Model
      7.2.2.4. Initial Failure Rate Estimate
      7.2.2.5. Process Grading Factors
      7.2.2.6. Basis Data for the Model
      7.2.2.7. Uncertainty in Traditional Approach Estimates
      7.2.2.8. System Failure Causes
      7.2.2.9. Environmental Factor
      7.2.2.10. Reliability Growth
      7.2.2.11. Infant Mortality
      7.2.2.12. Combining Predicted Failure Rate with Empirical Data
    7.2.3. Development of Component Reliability Models
      7.2.3.1. Model Form
      7.2.3.2. Acceleration Factors
      7.2.3.3. Time Basis of Models
      7.2.3.4. Failure Mode to Failure Cause Mapping
      7.2.3.5. Derivation of Base Failure Rates
      7.2.3.6. Combining the Predicted Failure Rate with Empirical Data
      7.2.3.7. Estimating Confidence Levels
      7.2.3.8. Using the 217Plus Model in a Top-Down Analysis
      7.2.3.9. Capacitor Model Example
      7.2.3.10. Default Values
    7.2.4. Photonic Model Development Example
      7.2.4.1. Introduction
      7.2.4.2. Model development methodology and results
      7.2.4.3. Uncertainty Analysis
      7.2.4.4. Comments on Part Quality Levels
      7.2.4.5. Explanation of Failure Rate Units
    7.2.5. System-Level Model

      7.2.5.2. 217Plus Process Grading Criteria
      7.2.5.3. Design Process Grade Factor Questions
      7.2.5.4. Manufacturing Process Grade Factor Questions
      7.2.5.5. Part Quality Process Grade Factor Questions
      7.2.5.6. System Management Process Grade Factor Questions
      7.2.5.7. Can Not Duplicate (CND) Process Grade Factor Questions
      7.2.5.8. Induced Process Grade Factor Questions
      7.2.5.9. Wearout Process Grade Factor Questions
      7.2.5.10. Growth Process Grade Factor Questions
  7.3. Life Modeling Example
    7.3.1. Introduction
    7.3.2. Approach
    7.3.3. Reliability Test Plan
    7.3.4. Results
      7.3.4.1. Times to Failure Summary
      7.3.4.2. Life Models
  7.4. NPRD Description
    7.4.1. Data Collection
    7.4.2. Data Interpretation
    7.4.3. Document Overview
      7.4.3.1. "Part Summaries" Overview
      7.4.3.2. "Part Details" Overview
      7.4.3.3. Section 4 "Data Sources" Overview
      7.4.3.4. Section 5 "Part Number/MIL Number" Index
      7.4.3.5. Section 6 "National Stock Number Index with Federal Stock Class"
      7.4.3.6. Section 7 "National Stock Number Index without Federal Stock Class Prefix"
  7.5. References

8. THE USE OF FMEA IN RELIABILITY MODELING
  8.1. Introduction
  8.2. Definitions
  8.3. FMEA Logistics
    8.3.1. When initiated
    8.3.2. FMEA Team
    8.3.3. FMEA Facilitation

    8.3.4. Implementation
  8.4. How to Perform an FMEA
  8.5. Identify System Hierarchy
  8.6. Function Analysis
  8.7. IPOUND Analysis
  8.8. Identify the Severity
  8.9. Identify the Possible Effect(s) that Result from Occurrence of Each Failure Mode
  8.10. Identify Potential Causes of Each Failure Mode
  8.11. Identify Factors for Each Failure Cause
    8.11.1. Accelerating Stress(es) or Potential Tests
    8.11.2. Occurrence
      8.11.2.1. Occurrence Rankings
    8.11.3. Preventions
    8.11.4. Detections
    8.11.5. Detectability
  8.12. Calculate the RPN
  8.13. Determine Appropriate Corrective Action
  8.14. Update the RPN
  8.15. Using Quality Function Deployment to Feed the FMEA
  8.16. References

9. CONCLUDING REMARKS

List of Figures

FIGURE 1.1-1: PHASES OF A RELIABILITY PROGRAM
FIGURE 1.1-2: RELATIVE COST OF FAILURES VS. PHASE
FIGURE 1.1-3: RELIABILITY PREDICTION, ASSESSMENT AND ESTIMATION
FIGURE 1.1-4: PERCENT OF COMPANIES USING RELIABILITY ENGINEERING TOOLS
FIGURE 1.3-1: EXAMPLE RELIABILITY PROGRAM APPROACH
FIGURE 2.0-1: GENERAL MODELING APPROACH
FIGURE 2.1-1: FAULT TREE REPRESENTATION OF SYSTEM MODEL
FIGURE 2.1-2: FAULT TREE REPRESENTATION TO THE FAILURE CAUSE LEVEL
FIGURE 2.2-1: BREAKDOWN OF POTENTIAL RELIABILITY MODELING PURPOSES
FIGURE 2.3-1: TYPICAL DATA REQUIREMENTS VS. LEVEL OF HIERARCHY
FIGURE 2.3-2: THE BASIC FMEA APPROACH
FIGURE 2.3-3: HIERARCHICAL RELATIONSHIP BETWEEN CAUSE, MODE AND EFFECT
FIGURE 2.3-4: APPROACH TO IDENTIFYING CAUSES
FIGURE 2.3-5: FAULT TREE OF PRODUCT OR SYSTEM
FIGURE 2.3-6: FAULT TREE OF PRODUCT OR SYSTEM WITH CAUSE AS THE LOWEST LEVEL
FIGURE 2.3-7: FAULT TREE OF PRODUCT OR SYSTEM WITH CAUSE ABOVE THE LOWEST LEVEL
FIGURE 2.3-8: FAULT TREE OF PRODUCT OR SYSTEM WITH CAUSE TWO LEVELS ABOVE THE LOWEST LEVEL
FIGURE 2.5-1: BREAKDOWN OF RELIABILITY ASSESSMENT OPTIONS
FIGURE 2.5-2: QUALIFICATION CONCEPTS AND TERMINOLOGY
FIGURE 2.5-3: EVT, DVT AND PVT RELATIONSHIPS
FIGURE 2.5-4: ACCELERATION LEVELS
FIGURE 2.5-5: UNCERTAINTY IN EXTRAPOLATION
FIGURE 2.5-6: ACCELERATION LEVELS
FIGURE 2.5-7: ACCELERATION ALTERNATIVES
FIGURE 2.5-8: RELATIVE LIFETIME VS. STRESS
FIGURE 2.5-9: RELIABILITY REQUIREMENT VS. SMALL POPULATION RELIABILITY INFERENCE
FIGURE 2.5-10: LIFE MODELING METHODOLOGY
FIGURE 2.5-11: IDENTIFICATION OF TEST STRESSES BASED ON THE FMEA
FIGURE 2.5-12: USING THE DESTRUCT LIMIT TO DEFINE THE LIFE TEST MAX STRESS
FIGURE 2.5-13: POSSIBLE STRESS PROFILES
FIGURE 2.5-14: MEASUREMENT POINTS FOR AN INFANT MORTALITY FAILURE CAUSE
FIGURE 2.5-15: MEASUREMENT POINTS FOR A WEAROUT FAILURE CAUSE
FIGURE 2.5-16: ACCELERATION WHEN THE DISTRIBUTIONS FOR AT LEAST TWO STRESSES ARE AVAILABLE
FIGURE 2.5-17: ACCELERATION WHEN THE DISTRIBUTIONS FOR LOW STRESSES ARE NOT AVAILABLE
FIGURE 2.5-18: LIFE MODEL SEQUENCE
FIGURE 2.5-19: DEGRADATION MODELING APPROACH
FIGURE 2.5-20: DEGRADATION DATA EXAMPLE
FIGURE 2.5-21: DEGRADATION DATA CONVERSION TO TIMES TO FAILURE
FIGURE 2.5-22: RELIABILITY ESTIMATES FROM FIELD DATA

FIGURE 2.5-23: FMEA AS A TOOL FOR ASSESSING SIMILARITY
FIGURE 2.5-24: MIL-HDBK-217 PART COUNT EXAMPLE
FIGURE 2.5-25: MIL-HDBK-217 PART STRESS EXAMPLE
FIGURE 2.5-26: TELCORDIA SR-332 (BELLCORE)
FIGURE 2.5-27: RAC PRISM REPLACED BY RIAC 217PLUS
FIGURE 2.5-28: CNET/RDF 2000
FIGURE 2.5-29: CNET/RDF 2000 MODEL EXAMPLE
FIGURE 2.5-30: FIDES
FIGURE 2.5-31: USES OF PROGRAM DATA ELEMENTS
FIGURE 2.5-32: PROGRAM DATABASE STRUCTURE
FIGURE 2.5-33: DATABASE INFORMATION FLOW
FIGURE 2.5-34: HIERARCHY OF MAINTENANCE ACTIONS
FIGURE 2.5-35: CALCULATION OF PART LIFE UNIT
FIGURE 2.5-36: FAILURE TIMES BASED ON OPERATING TIME
FIGURE 2.5-37: FAILURE TIMES BASED ON CALENDAR TIME
FIGURE 2.5-38: FAILURE RATE SIMULATION WITH WEIBULL BETA = 20
FIGURE 2.5-39: FAILURE RATE SIMULATION WITH WEIBULL BETA = 5.0
FIGURE 2.5-40: FAILURE RATE SIMULATION WITH WEIBULL BETA = 2.0
FIGURE 2.5-41: FAILURE RATE SIMULATION WITH WEIBULL BETA = 1.0
FIGURE 2.5-42: FAILURE RATE SIMULATION WITH WEIBULL BETA = 0.5
FIGURE 2.5-44: STRESS/STRENGTH INTERFERENCE
FIGURE 2.5-45: STRESS/STRENGTH INTERFERENCE VS. TIME
FIGURE 2.6-1: 217PLUS APPROACH TO FAILURE RATE ESTIMATION
FIGURE 2.6-3: BAYESIAN INFERENCE OUTLINE
FIGURE 2.7-1: COMBINING SEVEN FAILURE CAUSE DISTRIBUTIONS
FIGURE 2.7-2: POSSIBLE FAULT TREE REPRESENTATION OF A SERIES RELIABILITY BLOCK DIAGRAM
FIGURE 2.7-3: PDF OF NORMAL DISTRIBUTION WITH MEAN OF 10 AND STANDARD DEVIATION OF 3
FIGURE 2.7-4: CUMULATIVE NORMAL DISTRIBUTION WITH MEAN OF 10 AND STANDARD DEVIATION OF 3
FIGURE 2.7-5: VALUE SELECTION FROM A DISTRIBUTION
FIGURE 2.7-6: VALUE SELECTION FROM A WEIBULL DISTRIBUTION
FIGURE 2.7-7: RELIABILITY BLOCK DIAGRAM OF REDUNDANT EXAMPLE
FIGURE 2.7-8: SYSTEM MONTE CARLO EXAMPLE
FIGURE 2.7-9: MONTE CARLO SIMULATION OF EXAMPLE SYSTEM
FIGURE 3.1-1: DISCRETE PROBABILITY DISTRIBUTION
FIGURE 3.1-2: CONTINUOUS PROBABILITY DISTRIBUTION
FIGURE 3.2-1: EXAMPLES OF CORRELATION COEFFICIENTS
FIGURE 3.2-2: VENN DIAGRAM OF MUTUALLY EXCLUSIVE EVENTS
FIGURE 3.2-3: INDEPENDENT EVENTS
FIGURE 3.2-4: FAULT TREE OR GATE
FIGURE 3.2-5: RELIABILITY BLOCK DIAGRAM FOR AN OR GATE

FIGURE 3.2-7: RELIABILITY BLOCK DIAGRAM FOR AN AND GATE
FIGURE 3.2-8: FAULT TREE OF AN AND/OR COMBINATION
FIGURE 3.2-9: RBD OF AND/OR COMBINATION
FIGURE 3.3-1: SHAPES OF FAILURE DENSITY AND RELIABILITY FUNCTIONS OF COMMONLY USED DISCRETE DISTRIBUTIONS (FROM MIL-HDBK-338B)
FIGURE 3.3-2: SHAPES OF FAILURE DENSITY, RELIABILITY AND HAZARD RATE FUNCTIONS FOR COMMONLY USED CONTINUOUS DISTRIBUTIONS (FROM MIL-HDBK-338B)
FIGURE 3.3-3: EXAMPLE PDF PLOTS FOR THE WEIBULL DISTRIBUTION
FIGURE 3.3-4: EXAMPLE HAZARD RATE PLOTS FOR THE WEIBULL DISTRIBUTION
FIGURE 3.3-5: EXAMPLE PROBABILITY PLOTS FOR WEIBULL DISTRIBUTION
FIGURE 3.3-6: EXAMPLE PDF PLOTS FOR THE LOGNORMAL DISTRIBUTION
FIGURE 3.3-7: EXAMPLE HAZARD RATE PLOTS FOR THE LOGNORMAL DISTRIBUTION
FIGURE 3.3-8: EXAMPLE PROBABILITY PLOTS FOR THE LOGNORMAL DISTRIBUTION
FIGURE 4.0-1: THE DOE CONCEPT
FIGURE 4.3-1: POSSIBLE RESPONSE-FACTOR LEVEL RELATIONSHIP
FIGURE 4.4-1: DOE TERMINOLOGY
FIGURE 4.4-2: ONE-FACTOR-AT-A-TIME EXPERIMENTS
FIGURE 4.4-3: STANDARD DOE NOMENCLATURE
FIGURE 4.4-4: POTENTIAL INTERACTIONS
FIGURE 4.6-1: ANALYSIS OF MEANS
FIGURE 4.6-2: LINEARIZATION OF THE ARRHENIUS RELATIONSHIP
FIGURE 4.6-3: OPTIMAL FACTOR SETTINGS
FIGURE 5.4-1: LIKELIHOOD CONTOUR EXAMPLE
FIGURE 6.1-1: BATHTUB CURVE
FIGURE 6.2-1: EXAMPLE OF NON-MONOMODAL DISTRIBUTION
FIGURE 6.2-2: MULTIMODAL DISTRIBUTION EXAMPLE 1
FIGURE 6.2-3: MULTIMODAL DISTRIBUTION EXAMPLE 2
FIGURE 6.2-4: MULTIMODAL DISTRIBUTION EXAMPLE 3
FIGURE 6.2-5: MULTIMODAL DISTRIBUTION EXAMPLE 4
FIGURE 6.2-6: MULTIMODAL DISTRIBUTION EXAMPLE 5
FIGURE 6.2-7: MULTIMODAL DISTRIBUTION EXAMPLE OF POOLED DATA SET
FIGURE 6.2-8: AGE AT DEATH DATA
FIGURE 6.2-9: PDF OF MULTIMODE DISTRIBUTION OF AGES
FIGURE 6.2-10: FAILURE RATE OF AGE DATA
FIGURE 6.2-11: PROBABILITY PLOT OF AGE DATA
FIGURE 6.2-12: SINGLE MODE WEIBULL FIT TO THE AGE DATA
FIGURE 6.3-1: SOURCES OF ERROR IN EMPIRICAL MODELS
FIGURE 6.3-2: CONFIDENCE LEVEL THROUGH PREDICTION, ASSESSMENT AND ESTIMATION
FIGURE 6.6-1: WEIBAYES EXAMPLE
FIGURE 6.13-1: NOMINAL FAILURE CAUSE DISTRIBUTION OF ELECTRONIC SYSTEMS

FIGURE 6.13-2: IPO MODEL
FIGURE 6.13-3: RELATIONSHIP BETWEEN ABSOLUTE AND RELATIVE HUMIDITY
FIGURE 6.14-1: ESTIMATED UPPER BOUND FAILURE RATES VS OPERATING TIME AT 60 AND 90% CONFIDENCE
FIGURE 7.1-1: MIL-HDBK-217 MODEL DEVELOPMENT METHODOLOGY
FIGURE 7.2-1: FAILURE CAUSE DISTRIBUTION OF ELECTRONIC SYSTEMS
FIGURE 7.2-2: OPTICAL AMPLIFIER FAILURE CAUSE DISTRIBUTION
FIGURE 7.2-3: ΠG VS. TIME AND GROWTH RATES
FIGURE 7.2-4: MODEL DEVELOPMENT METHODOLOGY FLOWCHART
FIGURE 7.2-5: DISTRIBUTION OF LOG10 PREDICTED/OBSERVED FAILURE RATE RATIO FOR ALL DATA
FIGURE 7.2-6: DISTRIBUTION OF LOG10 PREDICTED/OBSERVED RATIO FOR FIELD DATA ONLY
FIGURE 7.2-7: DISTRIBUTIONS OF THE PREDICTED/OBSERVED FAILURE RATE RATIO FOR ALL DATA AND FOR FIELD DATA ONLY
FIGURE 7.3-1: TIMES TO FAILURE DISTRIBUTIONS
FIGURE 7.3-2: PROBABILITY OF FAILURE VS. TEMPERATURE AND RELATIVE HUMIDITY AT 50,000 HOURS
FIGURE 7.4-1: APPARENT FAILURE RATE FOR REPLACEMENT UPON FAILURE
FIGURE 7.4-3: EXAMPLE OF PART DETAIL ENTRIES
FIGURE 8.1-1: TWO BASIC TYPES OF FMEA
FIGURE 8.4-1: FMEA PROCESS FLOW
FIGURE 8.7-1: FAILURE CAUSE-MODE EFFECT RELATIONSHIP
FIGURE 8.10-1: FAILURE CAUSE, MODE AND EFFECT HIERARCHY
FIGURE 8.10-2: FAILURE CAUSES
FIGURE 8.11-1: OCCURRENCE DEFINITIONS
FIGURE 8.11-2: OCCURRENCE GUIDELINES
FIGURE 8.11-3: DETECTABILITY DEFINITIONS
FIGURE 8.11-4: LIFE CYCLE VS DETECTABILITY DIMENSION
FIGURE 8.13-1: POTENTIAL CORRECTIVE ACTIONS
FIGURE 8.15-1: QFD-TO-FMEA LINKS
FIGURE 8.15-2: QFD-FMEA

List of Tables

TABLE 1.3-1: RANGES OF POTENTIAL CUSTOMER REACTIONS
TABLE 2.2-1: RELIABILITY ASSESSMENT PURPOSES
TABLE 2.2-2: PROGRAM PHASE VS. RELIABILITY ASSESSMENT PURPOSE
TABLE 2.3-1: EXAMPLES OF INITIAL CONDITIONS, STRESSES AND MECHANISMS
TABLE 2.3-2: RELATIONSHIP BETWEEN CAUSE, MODE AND EFFECT
TABLE 2.5-1: SUMMARY OF RELIABILITY ASSESSMENT OPTIONS
TABLE 2.5-1: SUMMARY OF ASSESSMENT OPTIONS (CONTINUED)
TABLE 2.5-2: RELEVANCY OF APPROACH TO PREDICTION, ASSESSMENT AND ESTIMATION
TABLE 2.5-3: IDENTIFICATION OF APPROPRIATE APPROACHES BASED ON THE PURPOSE
TABLE 2.5-4: RANKING THE ATTRIBUTES OF EMPIRICAL DATA
TABLE 2.5-5: EVT, DVT AND PVT PURPOSE AND APPROACH
TABLE 2.5-6: RELIABILITY DEMONSTRATION EXAMPLE
TABLE 2.5-7: EXAMPLE OF A QUALIFICATION PLAN FOR AN ASSEMBLY
TABLE 2.5-8: QUALIFICATION EXAMPLE FOR A LASER DIODE
TABLE 2.5-9: STRESS PROFILE OPTION ADVANTAGES AND DISADVANTAGES
TABLE 2.5-10: SIMILARITY ANALYSIS
TABLE 2.5-11: DIGITAL CIRCUIT BOARD FAILURE RATES (IN FAILURES PER MILLION PART HOURS)
TABLE 2.5-12: TEST CONDITIONS
TABLE 2.5-13: DATA TO ESTIMATE DIFFUSION RATE
TABLE 2.5-14: PREDICTED LIFETIMES VS. OBSERVED
TABLE 3.1-1: PROBABILITY DISTRIBUTION NOTATION & MATHEMATICAL REPRESENTATIONS
TABLE 3.2-1: COMBINATIONS EXAMPLE
TABLE 3.2-2: COMBINATIONS OF AN OR CONFIGURATION
TABLE 3.2-3: COMBINATIONS OF AN AND CONFIGURATION
TABLE 3.2-4: EXAMPLE OF "K-OUT-OF-N" PROBABILITY CALCULATIONS
TABLE 3.2-5: EXAMPLE OF "2-OUT-OF-3" REQUIRED FOR SUCCESS
TABLE 3.3-1: PROBABILITY DISTRIBUTIONS APPLICABLE TO RELIABILITY ENGINEERING
TABLE 3.3-2: EXPONENTIAL DISTRIBUTION PARAMETERS
TABLE 3.3-3: CONFUSING TERMINOLOGY OF THE WEIBULL DISTRIBUTION
TABLE 3.3-4: WEIBULL DISTRIBUTION PARAMETERS
TABLE 4.3-1: POSSIBLE CONCLUSIONS FOR A NON-LINEAR RESPONSE-FACTOR RELATIONSHIP
TABLE 4.4-1: FULL-FACTORIAL EXAMPLE
TABLE 4.4-2: FULL AND HALF FACTORIAL EXAMPLE FOR CORROSION
TABLE 5.2-1: TERMINOLOGY USED IN PARAMETER ESTIMATION
TABLE 5.2-2: TECHNIQUES FOR PARAMETER ESTIMATION
TABLE 5.2-3: PARAMETERS TYPICALLY ESTIMATED FROM STATISTICAL DISTRIBUTIONS
TABLE 5.2-4: CONFIDENCE BOUNDS FOR THE POISSON DISTRIBUTION
TABLE 5.2-5: CONFIDENCE BOUNDS FOR THE BINOMIAL DISTRIBUTION
TABLE 5.2-6: CONFIDENCE BOUNDS FOR THE EXPONENTIAL DISTRIBUTION
TABLE 5.2-8: CONFIDENCE BOUNDS FOR THE NORMAL DISTRIBUTION
TABLE 5.3-10: CONFIDENCE BOUNDS FOR THE WEIBULL DISTRIBUTION

TABLE 6.1-1: CATEGORIES OF FAILURE EFFECTS
TABLE 6.2-2: BIMODAL POPULATION EXAMPLE 1
TABLE 6.2-3: BIMODAL POPULATION EXAMPLE 2
TABLE 6.1-4: BIMODAL POPULATION EXAMPLE 3
TABLE 6.1-5: BIMODAL POPULATION EXAMPLE 4
TABLE 6.1-6: BIMODAL POPULATION EXAMPLE 5
TABLE 6.1-7: FOUR MODE WEIBULL DISTRIBUTION PARAMETERS
TABLE 6.3-1: FAILURE RATE UNCERTAINTY LEVEL MULTIPLIERS
TABLE 6.9-1: EXAMPLE OF COMBINING DIFFERENT TYPES OF MODELS
TABLE 6.13-1: FACTORS TO BE CONSIDERED IN A RELIABILITY MODEL
TABLE 6.13-2: FAILURE RATE DATA SUMMARY
TABLE 7.1-1: DATA COLLECTED FOR MODEL DEVELOPMENT
TABLE 7.1-2: DATA TRANSFORMS
TABLE 7.1-3: REGRESSION DATA INCLUDING CATEGORICAL VARIABLES
TABLE 7.2-1: UNCERTAINTY LEVEL MULTIPLIER
TABLE 7.2-2: PERCENTAGE OF FAILURES ATTRIBUTABLE TO EACH FAILURE CAUSE
TABLE 7.2-3: WEIBULL PARAMETERS FOR FAILURE CAUSE PERCENTAGES
TABLE 7.2-4: MULTIPLIERS AS A FUNCTION OF PROCESS GRADE
TABLE 7.2-5: EXAMPLE OF FAILURE MODE-TO-FAILURE CAUSE CATEGORY MAPPING
TABLE 7.2-6: CAPACITOR PARAMETERS
TABLE 7.2-7: DEFAULT ENVIRONMENTAL STRESS VALUES
TABLE 7.2-8: DEFAULT OPERATING PROFILE VALUES
TABLE 7.2-9: FAILURE CAUSE SUMMARY FOR CONNECTORS
TABLE 7.2-10: FAILURE MODE TO FAILURE CAUSE CATEGORY FOR CONNECTORS (SC AND FC)
TABLE 7.2-11: FAILURE CAUSE PERCENTAGES FOR CONNECTORS
TABLE 7.2-12: DATA COLLECTED FOR CONNECTORS
TABLE 7.2-13: CATEGORIES OF ACCELERATION MODEL PARAMETERS
TABLE 7.2-14: ACCELERATION MODEL PARAMETERS
TABLE 7.2-15: DEFAULT MODEL PARAMETERS
TABLE 7.2-16: SUMMARY OF PI-FACTOR CALCULATIONS
TABLE 7.2-17: APPLICABILITY OF TEST DATA
TABLE 7.2-18: BASE FAILURE RATES (FAILURES PER MILLION CALENDAR HOURS)
TABLE 7.2-19: PART QUALITY PROCESS GRADE FACTOR QUESTIONS FOR PHOTONIC DEVICE MODELS
TABLE 7.2-20: SUMMARY OF UNCERTAINTY METRICS
TABLE 7.2-21: PARAMETERS FOR THE PROCESS GRADE FACTORS
TABLE 7.2-22: INDEX OF PROCESS GRADE TYPE QUESTIONS
TABLE 7.2-23: DESIGN PROCESS GRADE FACTOR QUESTIONS
TABLE 7.2-24: MANUFACTURING PROCESS GRADE FACTOR QUESTIONS
TABLE 7.2-25: PART QUALITY PROCESS GRADE FACTOR QUESTIONS
TABLE 7.2-26: SYSTEM MANAGEMENT PROCESS GRADE FACTOR QUESTIONS
TABLE 7.2-27: CAN NOT DUPLICATE (CND) PROCESS GRADE FACTOR QUESTIONS

TABLE 7.2-29: WEAROUT PROCESS GRADE FACTOR QUESTIONS
TABLE 7.2-30: GROWTH PROCESS GRADE FACTOR QUESTIONS
TABLE 7.3-1: PARAMETER LEVELS
TABLE 7.3-2: TEST PLAN SUMMARY
TABLE 7.3-3: LIFE TEST RESULTS
TABLE 7.3-4: TIMES TO FAILURE DISTRIBUTION PARAMETERS
TABLE 7.3-5: ESTIMATED PARAMETER 80% 2-SIDED CONFIDENCE BOUNDS
TABLE 7.4-1: DATA SUMMARIZATION PROCESS
TABLE 7.4-2: TIME AT WHICH ASYMPTOTIC VALUE IS REACHED
TABLE 7.4-3: α/MTTF RATIO AS A FUNCTION OF β
TABLE 7.4-4: PERCENT FAILURE FOR WEIBULL DISTRIBUTION
TABLE 7.4-5: FIELD DESCRIPTIONS
TABLE 7.4-6: APPLICATION ENVIRONMENTS DEFINED IN NPRD
TABLE 8.7-1: FAILURE MODE RELATIONSHIP TO TAGUCHI LOSS FUNCTION
TABLE 8.8-1: DIMENSIONS OF FUNCTIONAL SEVERITY
TABLE 8.8-2: DIMENSIONS OF SEVERITY
TABLE 8.11-1: CATEGORIES OF FAILURE EFFECTS
TABLE 8.11-2: RECOMMENDED DETECTABILITY RATING CRITERIA


1. Introduction

Few engineering techniques have caused as much controversy in the last several decades as the topic of reliability prediction. One of the primary reasons for this is the stochastic nature of reliability. Whereas many engineering disciplines are governed by deterministic processes, reliability is governed by a complex interaction of stochastic processes. As a result, the metrics of interest in other engineering disciplines are generally much more quantifiable by their very nature. While there is always a stochastic element in any engineering model, the topic of reliability quantification must address its extreme stochastic nature.

Many highly respected reliability engineering texts treat the topic of reliability modeling thoroughly and in great detail. Included in these texts are detailed ways to model system reliability using techniques like Failure Modes and Effects Analysis (FMEA), Fault Tree Analysis (FTA), Markov models, fault tolerant design techniques, etc. However, these texts often gloss over a fundamental requirement for using the techniques effectively: the ability to quantify the reliability of the constituent components and subsystems comprising the system.

The intent of this book is to provide guidance on reliability modeling techniques that can be used to quantify the reliability of a product or system. In this context, reliability modeling is the process of constructing a mathematical model that is used to estimate the reliability characteristics of an item. There are many ways in which this can be accomplished, depending on the item and the type of information that is available to, or practical to obtain by, the analyst. This book will review possible approaches, summarize their advantages and disadvantages, and provide guidance on selecting a methodology based on specific goals and constraints. While this book will not discuss the use of specific published methodologies, where examples are provided they draw on tools and methodologies that the author has personal experience developing, such as life modeling, NPRD, MIL-HDBK-217 and 217Plus.

The Reliability Information Analysis Center (RIAC) has prepared many documents in the past relating to many different reliability engineering techniques, such as FMEA, FTA, Worst Case Analysis (WCA), etc. However, one noteworthy omission from this list is reliability modeling. This, coupled with (1) the RIAC’s history of providing reliability modeling data and solutions, and (2) the need to objectively address some of the


In years past, DoD contracts would require that a specific reliability prediction methodology, usually MIL-HDBK-217, be used. This left system developers with very little flexibility in applying different reliability prediction practices. Since the DoD has not, until very recently, supported updates to MIL-HDBK-217, companies were encouraged to use best practices in quantifying product reliability. The difficult question to be addressed is “what are the best practices that should be used?” This book attempts to provide guidance on selecting an appropriate methodology based on the specific conditions and constraints of the company and its products or systems.

It is hoped that the author’s experience gained by attempting many different reliability assessment approaches, including physics and empirical approaches, can be used to the advantage of the reader in a practical way.

1.1. Scope

The intent of a reliability program is to identify and mitigate failure modes/mechanisms, verify their removal through reliability testing, implement corrective actions for “discovered” failures, and maintain reliability levels after reliability has been designed in. These correspond to the designing-in reliability, reliability growth and ensuring on-going reliability goals, respectively, as illustrated in Figure 1.1-1.


The cost to an organization increases exponentially as a function of when failure causes are discovered, as illustrated in Figure 1.1-2. It is most efficient to discover failure modes and mechanisms as early as possible, when they can be effectively mitigated. If failure modes and mechanisms are discovered late in development or, worse, in the field, organizations can be faced with staggering costs associated with corrective actions.

Figure 1.1-2: Relative Cost of Failures vs. Phase

The use of reliability engineering techniques early in the development cycle of a system is critical to achieving high reliability. An important part of these efforts is the modeling of reliability before the product or system is fielded.

The term “Reliability Prediction” has had a relatively narrow connotation, primarily associated with “handbook” approaches. This document attempts to take a broader view of this topic by investigating the various approaches for quantifying reliability, and their effectiveness when used to achieve specific objectives. For this reason, the book is entitled “Reliability Modeling – the RIAC Guide to Reliability Prediction, Assessment and Estimation”. The definitions of these are:

Prediction - something that is predicted, forecasted

Assessment - to determine the importance, size, or value of

Estimation - A tentative evaluation or rough calculation, as of worth, quantity, or


Predictions are performed very early, before there is any empirical data on the item under analysis. Reliability assessments are made to determine the effects of certain factors on reliability and to identify failure causes. Reliability estimates are made based on empirical data. This book covers all three areas, as illustrated in Figure 1.1-3.

Figure 1.1-3: Reliability Prediction, Assessment and Estimation

Figure 1.1-4 summarizes the results of a benchmarking study of best commercial reliability practices (Reference 9). In this study, reliability predictions were identified by more than 90% of the participants as being an appropriate reliability task during the product/system development life cycle. Approximately 70% of the survey respondents felt that reliability predictions were effective, supporting the proposition that, while generally perceived as beneficial, there are problems associated with their use. This information highlights the importance that organizations often place on assessing and predicting reliability.


Figure 1.1-4: Percent of Companies Using Reliability Engineering Tools

1.2. Book Organization

Chapter 1 of this book presents background information on reliability modeling. The next section of this chapter includes a description of a typical reliability program, the intent of which is to present the elements that should be considered when developing a program, and to highlight how reliability modeling fits into such a program. Also included is a section on the history of reliability prediction, to provide a historical perspective of its evolution.

Chapter 2 covers the primary topic of this book, and includes information on the various ways in which a product can be modeled and guidance on selecting an approach. It presents a generic approach, and describes the elements of this approach.

Chapter 3 presents fundamental concepts of reliability theory, probability and statistics. In many books, these topics are presented first. However, in this book, they are presented after Chapter 2 because they are not the primary topic. Rather, they are presented to provide the foundation for the concepts used in reliability modeling. They are also the foundation for Design of Experiments (DOE) and Life Modeling techniques, which are further detailed in Chapters 4 and 5.


Approaches like using a “Multi-cell”-based designed experiment to generate data from which a life model is developed are presented in Chapter 2. Here, a generic approach to this topic is presented. Since the topic of life modeling is central to reliability modeling, important elements of it are presented in more detail in Chapters 4 and 5. One of the critical aspects of life modeling is reliability testing.

Design of Experiments is a technique to maximize the usefulness of the data resulting from reliability tests, and is the topic of Chapter 4.

Chapter 5 presents information relative to development of the mathematical models that form the basis of the reliability model, and includes information pertaining to parameter estimation.

Chapter 6 presents a variety of topics pertaining to the interpretation of reliability models. This is provided to allow the reader to gain a better appreciation for what can, and cannot, be concluded from a model.

Chapter 7 is a compilation of examples of reliability models. Presented here are the following examples:

1. A typical MIL-HDBK-217 model development process

2. Information on the development of the RIAC’s 217Plus methodology

3. A life modeling example

4. A description of RIAC’s Nonelectronic Parts Reliability Data (NPRD), provided as an example of the use of field data in reliability modeling

These examples are provided to give the reader a better appreciation for the tools, techniques and limitations of various approaches to reliability modeling.

A discussion of FMEA is presented in Chapter 8. Although FMEA is secondary to the primary intent of this book, it can form the basis for many elements of a reliability program, including reliability modeling. Therefore, Chapter 8 is intended to present FMEA concepts in this context, as well as provide practical information on performing FMEAs that this author has found to be useful.


1.3. Reliability Program Elements

To provide a perspective on how reliability modeling fits into a reliability program, this section presents a generic reliability program and describes its various elements. There are many possible approaches to “designing in” reliability. The specific approach used will depend on the needs of the specific organization. Figure 1.3-1 presents one possible approach, and includes the elements that should be included in all approaches. The premise of this approach is to identify the critical parts and material which warrant detailed attention. Since it is impractical to perform some reliability modeling approaches on all system parts, it is imperative to identify the critical parts which are the highest risk. Since one of the most effective ways to verify the robustness of parts or materials is from experience, an effective reliability program must leverage knowledge gained in the development and deployment of previous systems. It will be shown that reliability assessments impact many of the elements of this approach.


Elements of the reliability program are summarized as follows:

1. Design requirements: The first step in any product development process is the identification of requirements. These requirements include items pertaining to Performance, Reliability (failure rate, life), Maintainability, Diagnostics, and Use Environment and Operational stresses (i.e., mission profiles). Typically, the medium for communicating these requirements is the product specification. While the specification usually contains details regarding the required performance of the product or system, it is often lacking relative to quantifying the reliability attributes required. The following questions should be answered to determine these reliability requirements:

• What is the required failure rate of the item in its useful life?

• What is the service life required?

• What criteria will be used to determine when the requirements are not met?

• Whose responsibility will it be to take corrective action if these requirements are not met?

• What are the operating and environmental profiles expected in field deployed conditions?

A valuable tool to assist in understanding the requirements is Quality Function Deployment (QFD).

The reliability that is considered acceptable will, of course, be specific to the industry, criticality of failure, etc. The specific value may or may not be specified, depending on the industry and the maturity of the product. The range of potential customer reactions to various scenarios is summarized in Table 1.3-1.

Table 1.3-1: Ranges of Potential Customer Reactions

Outcome | Field reliability | Likely customer reaction
Best | No failures | Pleased
 | Failures occur at an acceptable rate | Tolerant
 | Recurring failures, but on a relatively small percent of items | Annoyed
 | Recurring failures on a high percent of items | Angry
Worst | An unexpected failure mechanism is discovered that will affect the entire population, or critical safety related failures |


If the requirement is not specified, an estimate of the requirement must be made so that there is a goal that can be used in the development process.

2. Initial Design: After the product requirements are understood, the design team generally derives an initial, or preliminary, design for the product or system. Inputs to this initial design should be in the form of design rules and a standard parts list. Design rules are the culmination of lessons learned from previous development activities, drawn both from empirical field or test data and from analysis. These design rules should be a living document which is continuously updated based on current information. Effective use of design rules also saves much effort, since reliability attributes which have a reliability history or which have been previously studied do not need to be addressed in detail, thus saving resources to be applied to the study of critical parts.

3. Similarity analysis: Once an initial design is available, a similarity analysis can be performed to identify attributes which are similar to those for which a reliability history is available, and those for which it is not. An FMEA can be a valuable technique for this analysis, and will be discussed later. In this analysis, each reliability attribute identified in the FMEA is reviewed to determine if a reliability history exists or not.

4. Identify attributes that are similar: Similar attributes are those that have a reliability history.

5. Assess robustness of attribute: If the part or attribute does have a history, previous test data or field experience data can be used to assess the robustness of the part or attribute.

6. Identify attributes that are not similar: Attributes that are not similar do not have a reliability history.

7. Perform design analysis: Although any attribute that is potentially different in the new design relative to the previous design must be analyzed, particular attention is given to the attributes that are not similar. Design techniques that are used for this purpose are FMEA, tolerance or worst case analysis, thermal analysis, stress analysis, and reliability predictions.

8. Implement corrective action: From the results of the design analysis, corrective action should be taken to improve the robustness of the design.

9. Identify critical parts/materials: Based on the results of the analysis, critical parts or materials are identified.


10. Model critical parts/materials: Once critical parts are identified, action must be taken to ensure that the parts or materials are robust enough to meet the reliability and durability requirements. More details of the approach used for this purpose will be presented later in the book.

11. Identify effective tests for non-similar attributes: Based on the identification of critical parts and the design analysis that was performed, specific tests that will assess the reliability and durability of the attribute can be determined. Part of the FMEA should include identification of stresses that will accelerate failure of the attribute under analysis; this analysis is therefore important for identifying the appropriate stress tests.

12. Develop a test plan and execute tests: Based on the design analysis performed and the identification of tests for non-similar attributes, a test plan can be determined. In the context of this approach, the goal of these tests is to assess the robustness of the product by subjecting the product to test stresses that are intended to accelerate the critical parts and non-similar attributes to failure. In addition to these tests, other test requirements should be incorporated into this test plan. These additional test requirements include any tests required by the customer, such as qualification or reliability demonstration tests.

13. Document the test results: Once the tests have been performed and the data analyzed, the results should be fully documented, since they subsequently will be used for a variety of purposes.

14. Monitor field reliability: Once the product is deployed, field reliability experience data should be carefully gathered, since it will be used for a variety of purposes. Elements of the data to be gathered include:

1. Product or system deployment history by serial number, including when deployed, when fielded

2. Failure information, including failure date, root failure cause, results of failure analysis

3. Product or system re-deployment information

15. Update reliability database: A database is required to manage the reliability data, and should include both test data and field data. This data can be used to generate a company-specific reliability prediction methodology.


16. Update Design Rules: Data acquired from tests and field surveillance should be used to update the design rules. Field data is probably the most valuable type of data for this purpose since it represents the actual product or system in the intended use environment. The process of maintaining design rules and ensuring that they are used in new designs is the cornerstone of the means by which reliability is improved in a reliability growth process.
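As a rough illustration of the field data and reliability database elements described above, the following Python sketch shows one way field records might be stored and reduced to a point estimate of failure rate. The record fields mirror the data elements listed for field monitoring (serial number, deployment date, failure date, root failure cause). The class name, function names and sample values are hypothetical and are not taken from any RIAC tool. The estimate assumes a constant failure rate, counts calendar hours, and ignores re-deployment, so it is only a starting point.

```python
import math
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class FieldRecord:
    """One fielded unit (hypothetical schema for illustration only)."""
    serial_number: str
    deployed: date
    failed: Optional[date] = None       # None: still operating at the cutoff date
    root_cause: Optional[str] = None    # result of failure analysis, if any


def failure_rate_per_hour(records: List[FieldRecord], cutoff: date) -> float:
    """Point estimate: observed failures / cumulative calendar hours.

    Assumes a constant failure rate and that every failure date is on or
    before the cutoff date.
    """
    hours = 0.0
    failures = 0
    for rec in records:
        end = rec.failed if rec.failed is not None else cutoff
        hours += (end - rec.deployed).days * 24.0
        if rec.failed is not None:
            failures += 1
    if hours <= 0.0:
        raise ValueError("no accumulated operating time")
    return failures / hours


def reliability(rate_per_hour: float, hours: float) -> float:
    """Survival probability over `hours` under a constant failure rate."""
    return math.exp(-rate_per_hour * hours)


# Made-up sample data, not real field experience.
sample = [
    FieldRecord("SN001", date(2020, 1, 1), date(2020, 1, 11), "solder joint"),
    FieldRecord("SN002", date(2020, 1, 1)),
]
rate = failure_rate_per_hour(sample, cutoff=date(2020, 1, 31))
print(f"failures per million calendar hours: {rate * 1e6:.1f}")
print(f"reliability over 100 hours: {reliability(rate, 100.0):.4f}")
```

A real database would also need to handle re-deployed units, operating versus calendar time, and confidence bounds on the estimate, none of which this sketch attempts.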

Critical parts are those which may result in a significant risk to the project. This risk can be related to reliability, lifetime, availability, or maintainability. Some of the factors that constitute critical parts are:

• New, unproven technology

• New, unproven manufacturing processes

• Performance limitations: stringent environmental conditions or non-robust design practices

• Reliability limitations: components/materials with life limitations

• Vendors with a past history of delivery, cost performance or reliability problems

• Old technology with availability problems

These critical parts or items warrant additional attention in assessing their reliability, as they generally will represent the greatest reliability risk.

1.4. The History of Reliability Prediction

The term “reliability prediction” has historically been used to denote the process of applying mathematical models and data for the purposes of estimating field reliability of a product or system before empirical data is available on that product or system. This section will review some of the developments in the area of reliability prediction from the 1950’s to the present. While there are several techniques available to reliability practitioners to perform reliability predictions, the discussion inevitably centers around MIL-HDBK-217 due to its historical prominence as a reliability prediction tool.

During World War II, electronic tubes were by far the most unreliable component used in DoD electronic systems. This observation led to various studies and ad hoc groups whose purpose was to identify ways that their reliability, and the reliability of the systems in which they operated, could be improved. One group in the early 1950’s concluded that:

1. There needs to be better reliability data collected from the field

2. Better components need to be developed

3. Quantitative reliability requirements need to be established

4. Reliability needs to be verified by test before full scale production

5. A permanent committee needs to be established to guide the reliability discipline

Item 5, above, was implemented in the form of the Advisory Group on Reliability of Electronic Equipment (AGREE), whose charter was to identify actions that could be taken to provide more reliable electronic equipment. This time period was the advent of the reliability engineering discipline. It soon became clear that the emerging discipline was using several different methods to achieve its goal of higher reliability. One was the identification of root causes of field failure and determination of mitigating actions. Another was the specification of quantitative reliability requirements. The specification of requirements in turn led to the desire to have a means of estimating reliability before equipment is built and tested, so that the probability of achieving its reliability goal could be estimated. This, of course, was the beginning of reliability prediction. The 1950’s also saw much pioneering work in the reliability discipline, including:

• A variety of efforts to improve device reliability through data collection and design

• The establishment of reliability programs

• Symposiums devoted to quality and reliability engineering

• Statistical techniques development such as the Weibull distribution

• Military handbooks that provided guidance on the reliable application of electronic components

In addition to these accomplishments, the 50’s also included pioneering work in the area of quantitative reliability prediction. In 1956, RCA released TR-1100, “Reliability Stress Analysis for Electronic Equipment”, which presented mathematical models for the estimation of component failure rates. This report turned out to be the predecessor of MIL-HDBK-217.

Several additional early works in the area of reliability prediction were produced in the early 1960’s, including D.R. Erles’ report (Reference 2) and the Erles and Edins paper (Reference 3). In 1962, the first version of MIL-HDBK-217 was published by the Navy. Once issued, MIL-HDBK-217 quickly became the standard by which reliability predictions were performed, and other sources of failure rates gradually disappeared. Part of the reason for the demise of other sources was the fact that MIL-HDBK-217 was often a contractually cited document and defense contractors did not have the option of using other sources of data.


These early sources of failure rates also often included design guidance on the reliable application of electronic components. However, subsequent versions of the documents, primarily MIL-HDBK-217, would delete the application information because it was treated in more detail elsewhere.

By now, the reliability discipline was working under the tenet that reliability was a quantitative discipline that needed quantitative data sources to support its many statistically based techniques, such as allocations and redundancy modeling. However, another branch of the reliability discipline focused on the physical processes by which components were failing. The first symposium devoted to this topic was the “Physics of Failure In Electronics” Symposium sponsored by the Rome Air Development Center (RADC) and IIT Research Institute (IITRI) in 1962¹. This symposium later became known as the International Reliability Physics Symposium (IRPS). In this period of time, the two branches of reliability engineering seemed to be diverging, with the “systems” engineers devoted to the tasks of specifying, allocating, predicting and demonstrating reliability, while the physics-of-failure (PoF) engineers and scientists were devoting their efforts to identifying and modeling the physical causes of failure. Both branches were integral parts of the reliability discipline, and both were hosted at RADC (later to become Rome Laboratory). The physics-based information was necessary to develop part qualification, screening and application requirements, and the “systems” tasks of specifying, allocating, predicting and demonstrating reliability were necessary to ensure that reliability requirements were met. The component research efforts of the 1950’s and 1960’s culminated with the implementation of the “ER” and “TX” families of specifications. This complicated the issue of predicting their reliability because there were now many different combinations of quality levels and environments that needed to be addressed in MIL-HDBK-217.

In the early 1970’s, the responsibility for preparing MIL-HDBK-217 was transferred to RADC, which published revision B in 1974. However, other than the transition to RADC, the 1970’s maintained the status quo in the area of reliability prediction. MIL-HDBK-217 was updated to reflect the technology at that time, but there were few other efforts that changed the manner in which predictions were performed. One exception, however, was that there was a shift in the complexity of the models being developed for MIL-HDBK-217. There were several efforts to develop new and innovative models for reliability prediction. The results of these efforts were extremely complex models that may have been technically sound, but were criticized by the user community as being too complex, too costly, and unrealistic given the low level of detailed design information available at the point in time when the models were needed. RCA, under contract to RADC, had developed PoF-based models which were rejected as unusable, since the detailed design and construction data for microcircuits were simply unavailable to typical model users. These models were never incorporated into MIL-HDBK-217.

¹ IITRI was the original contractor of the Reliability Analysis Center (RAC). In 2005, the RAC contract was awarded as RIAC to the current team of Wyle Labs (prime), Quanterion Solutions Incorporated, the University of Maryland Center for Risk and Reliability, the Pennsylvania State University Applied Research Laboratory (ARL), and the State University of New York Institute of Technology (SUNYIT).

While MIL-HDBK-217 was updated again several times in the 1980’s, there were agencies that were developing reliability prediction models unique to their industries. As an example, the automotive industry, under the auspices of the Society of Automotive Engineers (SAE) Reliability Standards Committee, developed a series of models specific to automotive electronics. The SAE committee felt that there were no existing prediction methodologies that were applicable to the specific quality levels and environments of automotive applications. The Bellcore reliability prediction standard is another example of a specific industry developing methodologies for its unique conditions and equipment. It originally was developed by modifying MIL-HDBK-217 to better reflect the conditions of interest of the telecommunications industry. It has since taken on its own identity with models derived from telecommunications equipment and is now used widely within that industry.

The 1980’s also saw explosive growth in integrated circuit technology. Very dense circuits were being fabricated using feature sizes as small as 0.5 microns. This presented unique challenges to reliability modelers. The VHSIC (Very High Speed Integrated Circuit) program was the government’s attempt to leverage the technological advancements of the commercial industry and, at the same time, produce circuits capable of meeting the unique requirements of military applications. From the VHSIC program came the Qualified Manufacturers List (QML) - a qualification methodology that qualified an integrated circuit manufacturing line, unlike the traditional qualification of specific parts. The government realized that it needed a QML-like process if it were to leverage the advancements in commercial technologies and, at the same time, have a timely and effective qualification scheme for military parts.

A reliability prediction model was also developed for VHSIC devices in 1989 (Reference 9) in support of a MIL-HDBK-217 update. An interesting observation was made during that study that deviated from the premise on which most of the MIL-HDBK-217 models were based. The traditional approach to developing models was to collect as much field failure rate data as possible, statistically analyze it, and quantify model factors based on the results of the statistical analysis. For integrated circuits, one of the factors that was quantified was inevitably device complexity. This complexity was measured by the number of gates or transistors and was the primary factor on which the models were based. The correlation between failure rate and complexity was strong and could be quantified because the failure rate of circuits was much higher than it is today and the defect rate was directly proportional to the complexity. As technology advanced, the gate or transistor count became so high that it could no longer effectively be used as the measure of complexity in a reliability model. Furthermore, transistor or gate count data was often difficult or impossible to obtain. Therefore, the model developed for VHSIC microcircuits needed another measure of complexity on which to base the model. The best measures, and the ones most highly correlated to reliability, are defect density and silicon area. It can be shown that the failure rate (for small cumulative percent failure) is directly proportional to the product of the area and defect density. However, another factor that is highly correlated to defect density and area is the yield of the die, or the percent of die that are functional upon manufacture. Ideally, a reliability model would use either yield or defect density/area as the primary factor(s) on which to base the model. The problem in using these factors in a model is that they are considered highly proprietary parameters from a market competition viewpoint and, therefore, are rarely released by the manufacturers. Therefore, the single most important driver of reliability cannot be obtained by the user of the device, which is unfortunate because the accuracy of the model suffers. The conflict between the usability of a model and its accuracy has always been a difficult tradeoff to address for model developers.
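The proportionality to area times defect density, and its link to yield, can be illustrated numerically. The Python sketch below is not the VHSIC model itself. It assumes a simple Poisson defect model, in which the fraction of die with no defects is exp(-area × density), and it treats the density of defects that have caused failure by the time of interest as a hypothetical input. All function names and numbers are illustrative assumptions.

```python
import math


def poisson_yield(area_cm2: float, defect_density_per_cm2: float) -> float:
    """Fraction of die with zero yield-killing defects (Poisson assumption)."""
    return math.exp(-area_cm2 * defect_density_per_cm2)


def fraction_failed(area_cm2: float, failing_density_per_cm2: float) -> float:
    """Fraction of die with at least one defect that has caused failure.

    `failing_density_per_cm2` is a hypothetical effective density of
    defects that have become failures by the time of interest.
    """
    return 1.0 - math.exp(-area_cm2 * failing_density_per_cm2)


def fraction_failed_linear(area_cm2: float, failing_density_per_cm2: float) -> float:
    """Small-percent-failure approximation: proportional to area x density."""
    return area_cm2 * failing_density_per_cm2


# Made-up values chosen only to show the trend.
for area in (0.5, 1.0, 2.0):
    exact = fraction_failed(area, 0.01)
    approx = fraction_failed_linear(area, 0.01)
    print(f"area={area:.1f} cm^2  exact={exact:.5f}  linear={approx:.5f}")
print(f"yield at 0.5 cm^2 and 1 defect/cm^2: {poisson_yield(0.5, 1.0):.3f}")
```

Under these assumptions, doubling the die area roughly doubles the fraction failed while that fraction stays small, and the same area-density product that drives failures also drives yield, which is why yield is so closely tied to reliability.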

Much of the literature in the 1990’s on the topic of reliability prediction has centered around the debate as to whether the reliability discipline should focus on PoF-based or empirically-based models (such as MIL-HDBK-217) for the quantification of reliability. In the author’s opinion, many of the primary criticisms of MIL-HDBK-217 stem from the fact that it was often used for purposes for which it was not intended. For example, it was often used as a means by which the reliability of a product was demonstrated. Since its use was contractually required, contractors would try to demonstrate compliance to the specified reliability requirements by “adjusting” factors in the model to make it appear that the reliability would meet requirements. Sometimes these adjustments had a technical basis, and sometimes they did not. Les Gubbins, one of the government’s first project managers for the handbook, once made the analogy that engaging in the use of these adjustment factors is like pushing the needle on your car’s speedometer up, and convincing yourself you’re going faster. This, of course, is not good engineering practice, but rather was done for nontechnical reasons.

Another key development in the area of reliability predictions was related to the implications of acquisition reform. In 1994, Military Specifications and Standards Reform (MSSR) was initiated which decreed the adoption of performance-based
