PRINCIPLES OF LIMIT STATES DESIGN
3.4 Review of Statistics and Probability Concepts
3.4.2 Probability Density Functions
A histogram showing the frequency of occurrence of the measured to predicted axial driven pile resistances from Table 3-2 is presented in Figure 3-1. The histogram was constructed by counting the number of ratios in each of the equal intervals of 0.25 (e.g., in the interval from 0.51 to 0.75 there are 2 values and in the interval from 0.76 to 1.00 there are 10 values). It is apparent by examining Figure 3-1 that the distribution of measured to predicted resistance is not symmetrical about the mean value of 1.22. Further, the histogram shows how extreme the value near 4.0 is and brings into question its validity.
Figure 3-1
Histogram of Measured to Predicted Axial Driven Pile Resistance for Example Problem 3.2
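The counting procedure behind a histogram like Figure 3-1 can be sketched in a few lines of Python. This is a minimal illustration only: the ratios below are hypothetical and are not the Table 3-2 data, and the helper name bin_label is an assumption introduced here. Intervals follow the convention used in the text (0.51 to 0.75, 0.76 to 1.00, and so on).

```python
from collections import Counter

# Hypothetical measured/predicted ratios for illustration only;
# these are NOT the values from Table 3-2.
ratios = [0.67, 0.72, 0.80, 0.95, 1.00, 1.10, 1.25, 1.40, 2.10]

def bin_label(ratio, width_hundredths=25):
    """Return the (low, high) interval, e.g. (0.51, 0.75), that contains ratio."""
    hundredths = round(ratio * 100)            # integer arithmetic avoids float edge cases
    index = (hundredths - 1) // width_hundredths
    low = (index * width_hundredths + 1) / 100
    high = ((index + 1) * width_hundredths) / 100
    return (low, high)

# Count how many ratios fall in each interval of width 0.25
histogram = Counter(bin_label(r) for r in ratios)
for (low, high), count in sorted(histogram.items()):
    print(f"{low:.2f} to {high:.2f}: {count}")
```

Working in hundredths keeps a ratio such as 1.00 in the 0.76 to 1.00 interval rather than letting floating-point rounding push it into the next one.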
The histogram also provides an approximate estimate of the probability that Meyerhof's method equals or overpredicts the driven pile resistance: 12/24 = 0.50 (the estimate is approximate because the data set includes only 24 values out of possibly thousands of load tests that could be used for the comparison). This ratio is analogous to flipping a coin to obtain either a heads or tails outcome, but that is not the whole story. The histogram shows that most of the data points are clustered around the mean value (1.22), which is desirable. In this case, 16 of the 24 values (i.e., 67%) are within the range 0.76 to 1.25, and 22 of 24 (92%) are within the range 0.51 to 1.75. The histogram shows that there is more to interpreting data than knowing the mean value. To have confidence in the measured to predicted resistance, the scatter or dispersion of the data must also be known.
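The fractions quoted above follow directly from the interval counts. The short Python sketch below repeats that arithmetic using only the counts stated in the text; the variable names and the helper fraction are illustrative assumptions.

```python
total = 24                    # number of load tests in the data set (from the text)
at_or_below_one = 2 + 10      # counts in the 0.51-0.75 and 0.76-1.00 intervals
within_076_125 = 16           # values between 0.76 and 1.25 (from the text)
within_051_175 = 22           # values between 0.51 and 1.75 (from the text)

def fraction(count, n=total):
    """Relative frequency used as an approximate probability estimate."""
    return count / n

print(f"ratio <= 1.00:      {fraction(at_or_below_one):.2f}")   # prints 0.50
print(f"0.76 to 1.25 range: {fraction(within_076_125):.2f}")    # prints 0.67
print(f"0.51 to 1.75 range: {fraction(within_051_175):.2f}")    # prints 0.92
```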
To ensure that the predicted resistance is less than or equal to the measured resistance for all cases in Figure 3-1, the predicted resistance can be multiplied by 0.67, where 0.67 represents the minimum ratio of measured to predicted resistance in Table 3-2. The result of this multiplication is to shift the entire histogram to the right so that no value of measured to predicted resistance will be less than 1. The value of 0.67 is similar to a reduction or resistance factor applied to Meyerhof's method for predicting the axial resistance of driven piles.
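The effect of such a factor can be sketched numerically. Multiplying the predicted resistance by a factor divides every measured to predicted ratio by that factor. The ratios below are hypothetical (not the Table 3-2 data); only the minimum of 0.67 mirrors the value quoted in the text.

```python
# Hypothetical measured/predicted ratios for illustration only (NOT Table 3-2).
# The minimum is set to 0.67 to mirror the minimum ratio quoted in the text.
ratios = [0.67, 0.80, 0.95, 1.00, 1.10, 1.25, 1.40, 2.10]

factor = min(ratios)   # 0.67, applied to the predicted resistance

# measured / (factor * predicted) = (measured / predicted) / factor
factored_ratios = [r / factor for r in ratios]

print(f"minimum ratio before factoring: {min(ratios):.2f}")
print(f"minimum ratio after factoring:  {min(factored_ratios):.2f}")
print("all ratios >= 1:", all(r >= 1.0 for r in factored_ratios))
```

Choosing the factor from the single smallest observation is the simplest possible rule; it makes no use of the scatter of the remaining data.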
For years, histograms have been plotted to show distributions of natural phenomena. A recurring pattern was noticed in these distributions that could be described mathematically. The derived mathematical function is referred to as a “probability density function.” This function is similar to drawing a smooth curve that approximately passes through the values of the histogram in Figure 3-1. A sketch of a function f(x) is shown in Figure 3-2.
Figure 3-2
Lognormal Probability Density Function
The function f(x) in Figure 3-2 is standardized by dividing its ordinates by the total area under the curve. The total area under this normalized curve is unity, or a probability of one, because it includes all the data points in the theoretical total population of all possible outcomes. Thus, the area under the curve, or probability of occurrence, P(x), over a small interval, dx, between x and x + dx is equal to the product of the function at x and dx (i.e., P(x) = f(x) dx).
The mathematical expression for the standard or normal form of f(x) is derived as an exponential function symmetrical about the mean value. Values of f(x) and areas (probabilities) between fixed intervals are available in textbooks. The shape of the normal distribution curve is like a bell, and just two parameters describe the function: the mean, x̄ (m in Figure 3-3), and the standard deviation, σ. Changes in the shape of f(x) with variations in these two parameters are shown in Figure 3-3. If m increases and σ remains constant, the curve shifts to the right, but its shape does not change (see Figure 3-3b). If m remains constant and σ changes, the position of the curve does not change, but its shape does. If σ decreases (i.e., less scatter of data), the shape becomes more compact (see Figure 3-3c). If σ increases (i.e., more scatter), the shape spreads out (see Figure 3-3d). In all of the cases shown in Figure 3-3, the area under the curve is unity.
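These properties of the normal density can be checked numerically. The sketch below uses the usual closed form of the normal density and a simple midpoint-rule integration. The function names, the integration limits, and the parameter values are assumptions chosen for illustration, not values from the figures.

```python
import math

def normal_pdf(x, mean, sigma):
    """Normal probability density with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area(pdf, lo, hi, steps=20000):
    """Midpoint-rule approximation of the area under pdf between lo and hi."""
    dx = (hi - lo) / steps
    return sum(pdf(lo + (i + 0.5) * dx) for i in range(steps)) * dx

mean = 0.0  # hypothetical mean
for sigma in (0.5, 1.0, 2.0):  # hypothetical standard deviations
    total = area(lambda x: normal_pdf(x, mean, sigma),
                 mean - 10.0 * sigma, mean + 10.0 * sigma)
    peak = normal_pdf(mean, mean, sigma)
    print(f"sigma = {sigma}: area = {total:.4f}, peak ordinate = {peak:.4f}")

# Probability over a small interval is approximately f(x) * dx
x, dx = 1.0, 0.01
exact = area(lambda t: normal_pdf(t, 0.0, 1.0), x, x + dx)
approx = normal_pdf(x, 0.0, 1.0) * dx
print(f"area over [{x}, {x + dx}] = {exact:.6f}, f(x)*dx = {approx:.6f}")
```

The area stays at unity for every σ, while the peak ordinate falls as σ grows, which is the compact versus spread-out behavior described for Figure 3-3.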
Figure 3-3
Standard Normal Density Function (Benjamin and Cornell, 1970)
When the data distribution is asymmetric, a logarithmic normal (or simply lognormal) probability density function is often suitable. Stated mathematically, if y = ln(x) is normally distributed, then x is said to be lognormal. In calibrating the AASHTO LRFD Specification, the lognormal function was used because it appeared to better represent the observed distribution of resistance data for predicting the moment strength of bridge girders. The distribution is probably skewed because the materials supplied usually have strengths greater than the nominal values assumed in the prediction equations. Based on the distribution of data collected from weigh-in-motion studies, a normal probability density function was used to represent the observed distribution of load data. Other design methods and databases could result in the use of different distribution functions for load and resistance (e.g., both distributions for load and resistance could be lognormal).
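The definition of a lognormal variable can be illustrated by simulation: draw y from a normal distribution and take x = exp(y). The parameters, sample size, and seed below are hypothetical choices for illustration and are not taken from any calibration data.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical parameters of y = ln(x); not taken from any calibration data
mean_y, sigma_y = 0.0, 0.5

y = [random.gauss(mean_y, sigma_y) for _ in range(10000)]  # normally distributed
x = [math.exp(v) for v in y]                               # lognormally distributed

# y is symmetric, so its mean and median nearly coincide;
# x is skewed, so its mean lies above its median.
print(f"y: mean = {statistics.mean(y):.3f}, median = {statistics.median(y):.3f}")
print(f"x: mean = {statistics.mean(x):.3f}, median = {statistics.median(x):.3f}")
```

The gap between the mean and median of x is a simple indicator of the asymmetry that motivates the lognormal model.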
Lognormal probability density functions are shown in Figure 3-4 for different values of the lognormal standard deviation, ζ. Notice that as ζ increases, the lack of symmetry becomes more pronounced and the smooth curve more closely represents the histogram of Figure 3-1.
Figure 3-4
Lognormal Density Function
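The growing asymmetry can also be seen numerically. The sketch below uses the standard closed form of the lognormal density, written in terms of the mean and standard deviation of ln(x), here named xi and zeta. The parameter values and function names are assumptions for illustration and are not read from Figure 3-4.

```python
import math

def lognormal_pdf(x, xi, zeta):
    """Density of x when ln(x) is normal with mean xi and standard deviation zeta."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - xi) / zeta
    return math.exp(-0.5 * z * z) / (x * zeta * math.sqrt(2.0 * math.pi))

def area(pdf, lo, hi, steps=100000):
    """Midpoint-rule approximation of the area under pdf between lo and hi."""
    dx = (hi - lo) / steps
    return sum(pdf(lo + (i + 0.5) * dx) for i in range(steps)) * dx

def lognormal_mode(xi, zeta):
    """Location of the peak of the lognormal density."""
    return math.exp(xi - zeta ** 2)

def lognormal_mean(xi, zeta):
    """Arithmetic mean of the lognormal variable."""
    return math.exp(xi + 0.5 * zeta ** 2)

xi = 0.0  # hypothetical lognormal mean, chosen only for illustration
for zeta in (0.25, 0.5, 1.0):  # hypothetical lognormal standard deviations
    total = area(lambda x: lognormal_pdf(x, xi, zeta), 0.0, 100.0)
    print(f"zeta = {zeta}: area = {total:.4f}, "
          f"mode = {lognormal_mode(xi, zeta):.3f}, "
          f"mean = {lognormal_mean(xi, zeta):.3f}")
```

As zeta increases, the peak moves left while the mean moves right, so the separation between them is one way to quantify the skew.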
The lognormal mean, ξ, and lognormal standard deviation, ζ, can be determined using Eqs. 3-11 and 3-12, respectively.

$$\xi = \ln\left[\frac{\bar{x}}{\sqrt{1 + \mathrm{COV}^2}}\right] \qquad \text{(Eq. 3-11)}$$

and

$$\zeta = \sqrt{\ln\left(1 + \mathrm{COV}^2\right)} \qquad \text{(Eq. 3-12)}$$

where:
x̄ = Mean value defined by Eq. 3-7
COV = Coefficient of variation defined by Eq. 3-9
ln ( ) = Natural logarithm of the expression in parentheses

For values of COV less than 0.2, Eqs. 3-11 and 3-12 are approximately equal to the following simplified relationships:

$$\xi \approx \ln \bar{x} \qquad \text{(Eq. 3-13)}$$

and

$$\zeta^2 \approx \mathrm{COV}^2 \qquad \text{(Eq. 3-14)}$$
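The relationship between the exact and simplified expressions can be explored with a short Python sketch. The exact forms are taken here as ξ = ln[x̄ / √(1 + COV²)] and ζ = √ln(1 + COV²). The simplified forms are taken as ξ ≈ ln(x̄) and ζ ≈ COV, which follows from ln(1 + COV²) ≈ COV² for small COV. The function names and the input values are hypothetical.

```python
import math

def lognormal_params(mean, cov):
    """Lognormal mean xi and standard deviation zeta from the mean and COV."""
    xi = math.log(mean / math.sqrt(1.0 + cov ** 2))
    zeta = math.sqrt(math.log(1.0 + cov ** 2))
    return xi, zeta

def lognormal_params_approx(mean, cov):
    """Small-COV simplification: xi ~ ln(mean), zeta ~ COV."""
    return math.log(mean), cov

mean = 1.0  # hypothetical mean value
for cov in (0.1, 0.2, 0.5):  # hypothetical coefficients of variation
    xi, zeta = lognormal_params(mean, cov)
    xi_a, zeta_a = lognormal_params_approx(mean, cov)
    print(f"COV = {cov}: xi = {xi:.4f} (approx {xi_a:.4f}), "
          f"zeta = {zeta:.4f} (approx {zeta_a:.4f})")
```

The difference between the exact and simplified values grows with COV, which is why the simplification is limited to small coefficients of variation.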
Thus, the lognormal mean and lognormal standard deviation can be calculated from the statistics (mean and COV) obtained from the standard normal function. In Example Problem 3-2, using Eqs. 3-11 and 3-12, the lognormal mean and lognormal standard deviation are 0.071 and 0.50, respectively. The COV is too large to use the approximate equations. Taking the anti-log of 0.071 gives a comparable normal mean of 1.07, which indicates that the skew is to the left of the standard normal mean value of 1.22.
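The anti-log step can be reproduced with the values quoted above. The sketch also inverts the relationship ζ = √ln(1 + COV²) to show the COV implied by the rounded ζ of 0.50. That implied value is a back-calculation made here for illustration and is not the COV reported for the example data.

```python
import math

xi, zeta = 0.071, 0.50     # lognormal mean and standard deviation quoted in the text
arithmetic_mean = 1.22     # mean of the measured/predicted ratios quoted in the text

# Anti-log of the lognormal mean
anti_log = math.exp(xi)
print(f"exp(xi) = {anti_log:.2f}")   # prints 1.07

# COV implied by the rounded zeta (back-calculated, not a reported value)
implied_cov = math.sqrt(math.exp(zeta ** 2) - 1.0)
print(f"implied COV = {implied_cov:.2f}")
print("small-COV simplification applicable:", implied_cov < 0.2)
print("anti-log lies below the arithmetic mean:", anti_log < arithmetic_mean)
```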