Quality Control
Step 4.3: Documentation of the improvement. We should understand that the project is not complete until the changes are documented in the appropriate quality management
3.7 Assessing Model Adequacy
As there has been an emphasis on continuous lifetime distributions thus far in the chapter, the discussion here is limited to model adequacy tests for continuous distributions. The popular chi-square goodness-of-fit test can be applied to both continuous and discrete distributions, but suffers from the limitations of arbitrary interval widths and application only to large data sets. This section focuses on the Kolmogorov–Smirnov (KS) goodness-of-fit test for assessing model adequacy.
A notational difficulty arises in presenting the KS test. The survivor function S(t) has been emphasized to this point in the chapter, but the cumulative distribution function, where F(t) = P[T ≤ t] = 1 − S(t) for continuous distributions, has traditionally been used to define the KS test statistic. To keep with this tradition, F(t) is used in the definitions in this section.
FIGURE 3.17 Confidence bands for the product-limit survivor function estimate for the 6-MP data. Axes: t (horizontal) and S(t) (vertical).
The KS goodness-of-fit test is typically used to compare an empirical cumulative distribution function with a fitted or hypothesized parametric cumulative distribution function for a continuous model. The KS test statistic is the maximum vertical difference between the empirical cumulative distribution function F̂(t) and a hypothesized or fitted cumulative distribution function F0(t). The null and alternative hypotheses for the test are
$$H_0: F(t) = F_0(t) \qquad H_1: F(t) \neq F_0(t)$$
where F(t) is the true underlying population cumulative distribution function. In other words, the null hypothesis is that the data set of random lifetimes has been drawn from a population with cumulative distribution function F0(t). For a complete data set, the defining formula for the test statistic is
$$D_n = \sup_t \left| \hat{F}(t) - F_0(t) \right|$$
where sup is an abbreviation for supremum. This test statistic has intuitive appeal since larger values of Dn indicate a greater difference between F̂(t) and F0(t) and hence a poorer fit. In addition, the distribution of Dn under H0 does not depend on the parametric form of F0(t) when the cumulative distribution function is hypothesized rather than fitted. From a practical standpoint, computing the KS test statistic requires only a single loop through the n data values. This simplification occurs because F̂(t) is a nondecreasing step function and F0(t) is a nondecreasing continuous function, so the maximum difference must occur at, or just to the left of, a data value.
The usual computational formulas for Dn require a single pass through the ordered data values t(1) ≤ t(2) ≤ · · · ≤ t(n). Let
$$D_n^+ = \max_{i=1,2,\ldots,n} \left( \frac{i}{n} - F_0(t_{(i)}) \right), \qquad D_n^- = \max_{i=1,2,\ldots,n} \left( F_0(t_{(i)}) - \frac{i-1}{n} \right)$$
so that $D_n = \max\{D_n^+, D_n^-\}$. These computational formulas are typically easier to translate into computer code than the defining formula.
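The computational formulas can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `ks_statistic`, the four-point sample, and the U(0, 1) hypothesized distribution are all assumptions made for the example, and the data set is assumed to be complete (no censoring).

```python
def ks_statistic(data, cdf):
    """Return (D_n, D_n_plus, D_n_minus) for a complete sample.

    data: iterable of observed lifetimes (complete, uncensored).
    cdf:  hypothesized cumulative distribution function F0, a callable.
    """
    t = sorted(data)          # order statistics t(1) <= ... <= t(n)
    n = len(t)
    if n == 0:
        raise ValueError("data must contain at least one observation")
    # D_n^+ : largest amount by which the step function exceeds F0
    d_plus = max(i / n - cdf(x) for i, x in enumerate(t, start=1))
    # D_n^- : largest amount by which F0 exceeds the step function
    d_minus = max(cdf(x) - (i - 1) / n for i, x in enumerate(t, start=1))
    return max(d_plus, d_minus), d_plus, d_minus


# Hypothetical example: four observations tested against U(0, 1).
def uniform_cdf(x):
    return min(max(x, 0.0), 1.0)


sample = [0.45, 0.10, 0.90, 0.40]
d_n, d_plus, d_minus = ks_statistic(sample, uniform_cdf)
print(d_n, d_plus, d_minus)
```

For this sample the largest gap comes from the third order statistic, where i/n = 0.75 and F0 = 0.45, so the statistic is about 0.30.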
We consider only hypothesized (as opposed to fitted) cumulative distribution functions F0(t) here because the distribution of Dn is free of the hypothesized distribution specified.
FIGURE 3.18 KS statistic for the ball bearing data set (exponential fit). Axes: t (horizontal) and F(t) (vertical); curves: exponential fit and nonparametric estimator, with D23 marked.
To illustrate the geometric aspects of the KS test statistic Dn, however, the fitted exponential distribution is compared to the empirical cumulative distribution function for the ball bearing data set. Figure 3.18 shows the empirical step cumulative distribution function F̂(t) associated with the n = 23 ball bearing failure times, along with the exponential fit F0(t). The maximum difference between these two cumulative distribution functions occurs just to the left of t(4) = 41.52 and is D23 = 0.301, as indicated on the figure.
The test statistic for the KS test is nonparametric in the sense that it has the same distribution regardless of the distribution of the parent population under H0 when all the parameters in the hypothesized distribution are known. The reason for this is that F0(t(1)), F0(t(2)), . . ., F0(t(n)) have the same joint distribution as U(0, 1) order statistics under H0 regardless of the functional form of F0. Tests of this kind are often called omnibus tests, as they are not tied to one particular distribution (e.g., the Weibull) and apply equally well to any hypothesized distribution F0(t). This also means that fractiles of the distribution of Dn depend on n only.
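The distribution-free property can be made explicit with a short derivation. The sketch below assumes, for simplicity, that F0 is continuous and strictly increasing so that its inverse exists; the symbols U and U_(i) are introduced here only for illustration.

```latex
% Probability integral transform under H_0: T has cdf F_0.
% Let U = F_0(T). For 0 < u < 1,
\[
P[U \le u] = P[F_0(T) \le u] = P[T \le F_0^{-1}(u)] = F_0\bigl(F_0^{-1}(u)\bigr) = u ,
\]
% so U ~ U(0,1). Because F_0 is increasing, it preserves order, and
% U_{(i)} = F_0(t_{(i)}) are the order statistics of a U(0,1) sample. Hence
\[
D_n^+ = \max_{i=1,2,\ldots,n} \left( \frac{i}{n} - U_{(i)} \right), \qquad
D_n^- = \max_{i=1,2,\ldots,n} \left( U_{(i)} - \frac{i-1}{n} \right),
\]
% which involve n and the uniform order statistics only, not F_0.
```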
The rows in Table 3.2 denote the sample sizes and the columns denote several levels of significance. The values in the table are estimates of the 1 − α fractiles of the distribution of Dn under H0 in the all-parameters-known case (hypothesized, rather than fitted distribution) and have been determined by Monte Carlo simulation with one million replications. Not surprisingly, the fractiles are a decreasing function of n, as increased sample sizes will have lower sampling variability. Test statistics that exceed the appropriate critical value lead to rejecting H0.
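A Monte Carlo estimate of such a critical value can be sketched as follows. Because the distribution of Dn under H0 depends on n only, it is enough to simulate U(0, 1) samples. The function names, the seed, and the use of 20,000 replications (far fewer than the one million behind Table 3.2) are assumptions chosen to keep the example quick, so the estimates should agree with the table only to about two decimal places.

```python
import random


def ks_uniform(n, rng):
    """KS statistic for one simulated U(0, 1) sample of size n."""
    u = sorted(rng.random() for _ in range(n))
    d_plus = max(i / n - x for i, x in enumerate(u, start=1))
    d_minus = max(x - (i - 1) / n for i, x in enumerate(u, start=1))
    return max(d_plus, d_minus)


def ks_critical_value(n, alpha, replications=20000, seed=1):
    """Estimate the 1 - alpha fractile of D_n under H0 by simulation.

    The replication count and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    stats = sorted(ks_uniform(n, rng) for _ in range(replications))
    # Index of the empirical 1 - alpha fractile in the sorted statistics.
    k = min(int((1 - alpha) * replications), replications - 1)
    return stats[k]


# Estimated critical value for n = 23 at alpha = 0.10.
crit_23 = ks_critical_value(23, 0.10)
print(round(crit_23, 3))
```

Increasing `replications` tightens the estimate at the cost of run time; the sorting step dominates for large replication counts.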
Example 3.18
Run the KS test (at α = 0.10) to assess whether the ball bearing data set was drawn from a Weibull population with λ = 0.01 and κ = 2.
Note that the Weibull distribution in this example is a hypothesized, rather than fitted distribution, so the all-parameters-known case for determining critical values is appropriate.
The goodness-of-fit test
$$H_0: F(t) = 1 - e^{-(0.01t)^2} \qquad H_1: F(t) \neq 1 - e^{-(0.01t)^2}$$
does not involve any parameters estimated from data. The test statistic is D23 = 0.274. The empirical cumulative distribution function, the Weibull(0.01, 2) cumulative distribution function, and the maximum difference between the two [which occurs at t(15) = 68.88] are shown in Figure 3.19. At α = 0.10, the critical value is 0.248, so H0 is rejected. The test statistic is very close to the critical value for α = 0.05 (0.275), so the attained p-value for the test is approximately p = 0.05.
TABLE 3.2 Selected Approximate KS Percentiles for Small Sample Sizes
n     α = 0.20   α = 0.10   α = 0.05   α = 0.01
1     0.900      0.950      0.975      0.995
2     0.683      0.776      0.842      0.930
3     0.565      0.636      0.708      0.829
4     0.493      0.565      0.624      0.733
5     0.447      0.509      0.563      0.668
6     0.410      0.468      0.519      0.617
7     0.381      0.436      0.483      0.576
8     0.358      0.409      0.454      0.542
9     0.339      0.388      0.430      0.513
10    0.323      0.369      0.409      0.489
11    0.308      0.352      0.391      0.468
12    0.296      0.338      0.376      0.449
13    0.285      0.325      0.361      0.433
14    0.275      0.314      0.349      0.418
15    0.266      0.304      0.338      0.404
16    0.258      0.295      0.327      0.392
17    0.250      0.286      0.318      0.381
18    0.243      0.278      0.309      0.370
19    0.237      0.271      0.302      0.361
20    0.232      0.265      0.294      0.352
21    0.226      0.259      0.287      0.345
22    0.221      0.253      0.281      0.337
23    0.217      0.248      0.275      0.330
24    0.212      0.242      0.269      0.323
25    0.208      0.237      0.264      0.317
FIGURE 3.19 KS statistic for the ball bearing data set (Weibull fit). Axes: t (horizontal) and F(t) (vertical); curves: hypothesized Weibull distribution and nonparametric estimator, with D23 marked.
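The calculation in Example 3.18 can be reproduced with a short Python sketch. The 23 failure times below are the commonly cited ball bearing data; they are not listed in this section, so treat them as an assumption, although the values of t(4) and t(15) agree with those quoted in the text. The variable and function names are illustrative, and the critical value is read from the α = 0.10 column of the KS table for n = 23.

```python
import math

# Ball bearing failure times (assumed; commonly cited n = 23 data set).
ball_bearing = [
    17.88, 28.92, 33.00, 41.52, 42.12, 45.60, 48.48, 51.84,
    51.96, 54.12, 55.56, 67.80, 68.64, 68.64, 68.88, 84.12,
    93.12, 98.64, 105.12, 105.84, 127.92, 128.04, 173.40,
]


def weibull_cdf(t, lam=0.01, kappa=2.0):
    """Hypothesized Weibull cdf F0(t) = 1 - exp(-(lam * t) ** kappa)."""
    return 1.0 - math.exp(-((lam * t) ** kappa))


t = sorted(ball_bearing)
n = len(t)

# Collect both one-sided gaps at every order statistic, remembering
# the index i and the data value at which each gap occurs.
gaps = []
for i, x in enumerate(t, start=1):
    f0 = weibull_cdf(x)
    gaps.append((i / n - f0, i, x))          # contribution to D_n^+
    gaps.append((f0 - (i - 1) / n, i, x))    # contribution to D_n^-

d_n, index, location = max(gaps)

critical_value = 0.248   # n = 23, alpha = 0.10, from the KS table
reject_h0 = d_n > critical_value
print(f"D_{n} = {d_n:.3f} at t({index}) = {location}; reject H0: {reject_h0}")
```

Under these assumptions the sketch gives a statistic of about 0.274 at the 15th ordered failure time, in agreement with the example, and H0 is rejected at α = 0.10.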
The KS test can be extended in several directions. First, it can be adapted for the case in which the parameters are estimated from the data. Unfortunately, a separate table of critical values must be given for each fitted distribution. Second, the KS test can be adapted for right-censored data sets. Many researchers have devised approximate methods for determining the critical values for the KS test with random right censoring and parameters estimated from data. Finally, there are several variants of the KS test, such as the Anderson–Darling and Cramér–von Mises tests, which improve on the power of the test.
3.8 Summary
The purpose of this chapter has been to introduce the mathematics associated with the design and assessment of systems with respect to their reliability. Specifically, this chapter has:
• outlined basic techniques for describing the arrangement of components in a system by defining the structure function φ(x) that maps the states of the components to the state of the system;
• defined reliability as the probability that a nonrepairable item (component or system) is functioning at a specified time;
• introduced two techniques, definition and expectation, for determining the system reliability from component reliabilities;
• defined four functions, the survivor function S(t), the probability density function f(t), the hazard function h(t), and the cumulative hazard function H(t), which describe the distribution of a nonnegative random variable T, which denotes the lifetime of a component or system;
• reviewed formulas for calculating the mean, variance, and a fractile (percentile) of T;
• illustrated how to determine the system survivor function as a function of the component survivor functions;
• introduced two parametric lifetime distributions, the exponential and Weibull distributions, and outlined some of their properties;
• surveyed characteristics (e.g., right-censoring) of lifetime data sets;
• outlined point and interval estimation techniques for the exponential and Weibull distributions;
• derived a technique for comparing the failure rates of items with lifetimes drawn from two populations;
• derived and illustrated the nonparametric Kaplan–Meier product-limit estimate for the survivor function;
• introduced the Kolmogorov–Smirnov goodness-of-fit test for assessing model adequacy.
All of these topics are covered in more detail in the references. In addition, there are many topics that have not been covered at all, such as repairable systems, incorporating covariates into a survival model, competing risks, reliability growth, mixture models, failure modes and effects analysis, accelerated testing, fault trees, Markov models, and life testing. These topics and others are considered in the reliability literature, highlighted by the textbooks cited below. Software for reliability analysis has been written by several vendors and incorporated into existing statistical packages, such as SAS, S-Plus, and R.
References
1. Barlow, R. and Proschan, F. (1981), Statistical Theory of Reliability and Life Testing: Probability Models, To Begin With, Silver Spring, MD.
2. Kalbfleisch, J.D. and Prentice, R.L. (2002), The Statistical Analysis of Failure Time Data, Second Edition, John Wiley & Sons, New York.
3. Kapur, K.C. and Lamberson, L.R. (1977), Reliability in Engineering Design, John Wiley & Sons, New York.
4. Lawless, J.F. (2003), Statistical Models and Methods for Lifetime Data, Second Edition, John Wiley & Sons, New York.
5. Meeker, W.Q. and Escobar, L.A. (1998), Statistical Methods for Reliability Data, John Wiley & Sons, New York.