New Neural Network Response Surface Methods for Reliability Analysis

(1)

Contents lists available at ScienceDirect

Chinese Journal of Aeronautics

journal homepage: www.elsevier.com/locate/cja Chinese Journal of Aeronautics 24 (2011) 25-31

New Neural Network Response Surface Methods for

Reliability Analysis

REN Yuan, BAI Guangchen*

School of Jet Propulsion, Beijing University of Aeronautics and Astronautics, Beijing 100191, China Received 5 May 2010; revised 30 June 2010; accepted 25 September 2010

Abstract

This article presents two new kinds of artificial neural network (ANN) response surface methods (RSMs): the ANN RSM based on early stopping technique (ANNRSM-1), and the ANN RSM based on regularization theory (ANNRSM-2). The follow-ing improvements are made to the conventional ANN RSM (ANNRSM-0): 1) by monitorfollow-ing the validation error durfollow-ing the training process, ANNRSM-1 determines the early stopping point and the training stopping point, and the weight vector at the early stopping point, which corresponds to the ANN model with the optimal generalization, is finally returned as the training result; 2) according to the regularization theory, ANNRSM-2 modifies the conventional training performance function by adding to it the sum of squares of the network weights, so the network weights are forced to have smaller values while the training error decreases. Tests show that the performance of ANN RSM becomes much better due to the above-mentioned improvements: first, ANNRSM-1 and ANNRSM-2 approximate to the limit state function (LSF) more accurately than ANNRSM-0; second, the esti-mated failure probabilities given by ANNRSM-1 and ANNRSM-2 have smaller errors than that obtained by ANNRSM-0; third, compared with ANNRSM-0, ANNRSM-1 and ANNRSM-2 require much fewer data samples to achieve stable failure probability results.

Keywords: neural networks; response surface; reliability analysis; early stopping; regularization

1. Introduction1

The research on polynomial regression response surface began in the 1980s. Though numerous studies have been made of its application to structural reliabil-ity analysis[1-6]_{, there are still several difficulties in}

practices: 1) the polynomial regression response sur-face may produce large errors when approximating complexly shaped limit state function (LSF)[7-8]_{; 2) the}

definitions of error terms in polynomial regression conflict with deterministic simulation models[9]_{; 3) the}

sample number required for generating the regression

*Corresponding author. Tel.: +86-10-82317418. E-mail address: [email protected]

Foundation item: National High-tech Research and Development Pro-gram of China (2006AA04Z405)

polynomial increases quickly with the input num-ber[10]_.

To overcome these problems faced by polynomial regression response surface, some scholars presented artificial neural network (ANN) response surface methods (RSMs), in which the ANN was used to ap-proximate the LSF. Papadrakakis, et al. [11]_combined

the importance sampling with ANN RSM to perform the reliability analysis of elastic-plastic structures; Lu, et al.[12]_{improved the sampling strategy and put}

for-ward ANN RSM based on properly-selected training samples; Elhewy, et al. [10]_{compared ANN RSM with}

that of polynomial and found that the former gave more accurate reliability results and required fewer sampling points.

To ensure the accuracy of the estimated failure probability, ANN response surface needs to accurately approximate the LSF in the parameter space[13-14]_.

However, based on the empirical risk minimization

(2)

(ERM) principle[15]_{, ANN cannot guarantee}

satisfac-tory prediction accuracy for the inputs outside the training set, especially for small-sized data sets. Therefore, in many cases the obtained ANN model produces large errors for unknown inputs. Such an inaccurate ANN model may lead to erroneous reliabil-ity results.

This article presents two new kinds of ANN RSMs, and they are the ANN RSMs being based on early stopping technique (ANNRSM-1) and regularization theory (ANNRSM-2) respectively. The two use dif-ferent strategies, i.e. early stopping[16]_and

regulariza-tion[17]_{, to improve the accuracy of ANN for}

approxi-mating the LSF. Experiments show that after the above-mentioned improvements the approximation error of ANN response surface greatly decreases, and the required sample number of data is reduced signifi-cantly.

2. Early Stopping Technique-based ANN RSM

In many cases the ANN response surface cannot accurately approximate the LSF, and the essence of this problem is that ANN’s unsatisfactory generaliza-tion ability makes it unable to make full use of the in-formation contained in training samples, so the LSF cannot be reconstructed efficiently. Adopting the re-search results in the field of machine learning are the fundamental way to solve the above-mentioned prob-lem that can enhance neural network’s generalization and then making modifications to conventional ANN RSM (ANNRSM-0). Presently there are two kinds of methods for improving the performance of ANN’s generalization, i.e. applying early stopping technique and regularization theory.

2.1. Early stopping technique

Early stopping technique is derived from the fol-lowing facts:

(1) In the training process, the ANN training error decreases monotonously as the number of iterative loop increases (see the fine curve in Fig.1).

(2) Choose some samples outside the training set and treat them as the validation data, then monitor the ANN validation error during the training process. It will be found that in the beginning the validation error also decreases as the number of iterative loop increases. However, after reaching a minimum (point A in Fig.1), it will increase as the number of iterative loop in-creases, as shown by the heavy curve in Fig.1.

(3) The samples in the validation set are never used to update the network weights, so the validation error reflects the real generalization ability of ANN. After reaching the early stopping point (point A in Fig.1), the ANN validation error continues to increase with the decreasing training error, and this indicates the starting of overfitting[16,18]_.

Fig.1 Rationale of early stopping technique.

In short, during the initial phase ANN’s generaliza-tion will become better and better with the advance of training process; however, after reaching an optimum it will continually get worse as the number of iterative loop increases.

Early stopping technique is a self-adaptive criterion for stopping training. During the training process it monitors the validation error to determine the early stopping point and the training stopping point. When the validation error continues increasing until a speci-fied number of iterative loop is attained, the training is stopped, and returned to the weight vector with the minimum validation error (this weight vector corre-sponds with the ANN model with the optimal gener-alization). The processing flowchart shown in Fig.2 is adopted in this work, where w is the current ANN weight vector, V

D

E is the validation error, NF records

how many times V D

E has continued increasing, and vector wV_{stores the best weight vector ever found}

(having the minimum validation error). The training stops when NF exceeds the preset MF, and the current

vector stored in wV_{is returned back as the final result.}

(3)

2.2. Analysis procedure of ANNRSM-1

Based on early stopping technique, the modifica-tions are made to ANNRSM-0, and the following pro-cedure is obtained:

(1) Analyze the problem to be solved, and define the LSF expression g(x), and the input variables and their value-taking ranges.

(2) Choose data samples uniformly in the parame-ter space defined by the value-taking ranges of input variables using experimental design methods, and separate the collected data samples into two different sets, i.e. the validation set and the training set (keep the ratio of their sample sizes being between 0.7 and 0.8).

(3) Train ANN with the BP algorithm. During the training process, the processing flowchart shown in Fig.2 is used to determine the early stopping point and the weight vector at that point is returned back, then the ANN response surface gc(x) is obtained.

(4) Combine gc(x) with Monte Carlo simulation (or other reliability analysis methods) and compute the failure probability.

This is the proposed ANNRSM-1 which is based on early stopping technique.

3. Regularization Theory-based ANN RSM 3.1. Regularization theory

The regularization theory suggests that the values of ANN weights should be constrained to be as small as possible in the training process[17-18]_{. If the network}

weights are forced to have smaller values, the ANN model will have no overfitting and then have better generalization performance. Therefore it will achieve good prediction accuracy for inputs outside the train-ing set[19]_{. Preventing the weights from having large}

values is equivalent to constraining the nonsmoothness and complexity of ANN. Hence the regularization the-ory is in agreement with the structural risk minimiza-tion principle in essence.

The conventional performance function for ANN training is 2 D 1 min ( ) N i( ) i E w

¦

e w (1) where N is the sample number of training, ei the

ANN’s prediction error for the ith training sample,

w = [w1 w2 … wW]T the weight vector, and its

length W is the total number of weights.

The regularization theory-based training perform-ance function has a different expression contrast to Eq.(1): 2 2 D 1 1 min ( ) W( ) N i( ) W j i j E E e w E wD w E

¦

w D

¦

(2) where EW is the sum of squares of the weights, D and E

are weight coefficients. Obviously, the training

per-formance function expressed in Eq.(2) can force the network weights to have smaller values when the training error decreases. The values of D and E deter-mine the emphases of ANN training: if D E ,the training will focus on reducing the training error; if

,

D E the algorithm will emphasize the weight size reduction at the expense of larger training error. The optimal D and E should maximize likelihood function

P(D|D, E, M), where D represents the data set and M is

the ANN model being used. P(D|D, E, M) can be ex-pressed as[17] ( | , , ) ( | , ) ( | , , ) ( | , , , ) P D M P M P D M P D M E D D E D E w w w (3)

2 / 2 1 1 ( | , , ) exp( ) (ʌ / ) N i N i P D E M E e E

¦

w (4) 2 / 2 1 1 ( | , ) exp( ) (ʌ / ) W j W j P D M D w D

¦

w (5) 2 2 1 1 1 ( | , , , ) exp( ) ( , ) N W i j i j P D M e w Z D E E D D E

¦

w (6) If we substitute the above results into Eq.(3) and es-timate Z(D,E) by Taylor series expansion, we obtain

/ 2 MP 1/ 2 / 2 / 2 MP MP (2 ) [det( )] ( | , , ) ( / ) ( / ) exp ( ) ( ) W N W D W P D M E E D E E D E D S S S ª º ¬ ¼ H w w (7) The values of D and E can be estimated by taking the derivatives with respect to the log of both sides of Eq.(7) and setting them equal to zero. wMP_{is the}

mini-mum point of Eq.(2), and H is its Hessian matrix, i.e.

H = E 2_E

D+D2EW. After adopting the regularization

theory, the training performance function changes from Eq.(1) to Eq.(2), and hence it is necessary to make corresponding modification to the weight updating formula. The amended weight updating for-mula can be written as

T 1 T

[E ( ) ( ) (D P) ] [ E ( ) ( ) D ]

' w J w J w I J w e w w (8) whereP is a positive scalar, I an identity matrix, e(w) the error vector on the training set, and J(w) Jacobian matrix.

3.2. Analysis procedure of ANNRSM-2

On the basis of the regularization theory, another improved version of ANNRSM-0 can be obtained, and

(4)

it is ANNRSM-2 which is based on the regularization theory. Its analysis procedure is as follows: (1) Analyze the problem to be solved, define the LSF expression g(x), and the input variables and their value-taking ranges.

(2) Choose data samples uniformly in the parameter space defined by the value-taking ranges of input vari-ables using experimental design methods (Unlike ANNRSM-1, ANNRSM-2 needs no validation sam-ples).

(3) Train ANN by taking Eq.(2) as the performance function and using Eq.(8) to update the weights. In the training process, D and E are estimated by maximizing Eq.(7). Then the ANN response surface gc(x) can be obtained.

(4) Combine gc(x) with Monte Carlo simulation (or other reliability analysis methods) and compute the failure probability.

4. Numerical Examples 4.1. Example 1

The LSF for the first example is defined as

1 3 2 ( ) 0.018 461 54 74.769 23x g x x (9) where x1~N(1 000, 2002), and x2~N(250, 37.52). The

exact Pf (failure probability) obtained by Monte Carlo

simulation is 0.009 607[20]_.

Estimated LSFs of three methods (25 training sam-ple) are shown in Fig.3. From Fig.3, it can be seen that both ANNRSM-1 and ANNRSM-2 can give good ap-proximation of LSF, whereas the ANNRSM-0 pro-duces large error. The Pf results of the three methods

obtained by combining the estimated LSFs with Monte Carlo simulations are listed in Table 1. The following remarks can be mentioned as: 1) ANNRSM-1 and ANNRSM-2 can obtain more accurate Pf results than

ANNRSM-0 does, since their estimated LSFs have higher approximation accuracy; 2) both early stopping technique and regularization theory can improve the generalization of ANN response surface.

Fig.3 Estimated LSFs of three methods (25 training sam-ples).

Table 1 Estimated failure probabilities Pf of three met- hods (25 training samples)

Result Exact value ANNRSM-0 ANNRSM-1 ANNRSM-2 Pf 0.009 607 0.007 883 0.009 726 0.009 695 Relative.

error/% í17.9 1.24 0.92

Taking into account the randomness of the initial weights, each of the three methods is run another 10 times respectively, and the estimated failure probabilities are shown in Fig.4, where l is the number of ANN hid-den neurons. For all runs the results of ANNRSM-1 and ANNRSM-2 show good agreement with the exact value, and this indicates that 25 training samples are sufficient for both of the two methods to approximate

Fig.4 Results of three methods for 10 runs (25 training samples, l=6).

(5)

the LSF with reliable accuracy, and the initial weights have little effects on the accuracy of their response surfaces. The results of ANNRSM-0 are very unstable, since 25 training samples are not enough to guarantee the reproduction accuracy of its response surface. Ac-cording to Ref.[13], ANNRSM-0 needs at least 70 samples to achieve estimated Pf that is stable and close

to the exact solution.

In the above calculations, l is fixed at 6. When l=8, both ANNRSM-1 and ANNRSM-2 can still give stable and accurate failure probabilities. However, when l is increased to 12 and 16, the results shown in Fig.5 are obtained. ANNRSM-1 achieves satisfactory perform-ance when l equals 6 or 8; however, it gives unstable

Pf results when l is larger than 12. By contrast, the Pf

results of ANNRSM-2 are quite close to the exact value in all four cases. The above facts indicate that the early stopping technique adopted by ANNRSM-1 has restrictions on the value of l, and will fall flat if the number of hidden neurons is beyond a certain limit. Nevertheless the regularization theory-based ANNR- SM-2 has no such kind of limitation.

Fig.5 Results of ANNRSM-1 and ANNRSM-2 for 10 runs (25 training samples).

4.2. Example 2

The LSF for the second example is defined as

3 2 3

1 1 2 2

( ) 18

g x x x x x (10) where x1~N(10, 52), and x2~N(9.9, 52). The exact Pf

obtained by Monte Carlo simulation is 0.005 63[21]_.

Fig.6 shows the estimated LSFs given by three meth-ods (40 training samples). The Pf results are listed in

Table 2.

Fig.6 Estimated LSFs of three methods (40 training sam-ples).

Table 2 Estimated failure probabilities Pf of three methods (40 training samples)

Result Exact value ANNRSM-0 ANNRSM-1 ANNRSM-2 Pf 0.005 63 0.020 614 0.005 983 0.005 527 Relative

error/% 266.10 6.27 í1.83

Taking into account the randomness of initial weights, each method is run another 10 times, and the results are illustrated in Fig.7. According to Cheng’s experiments, ANNRSM-0 is unable to provide stable failure probabilities even when the training sample number is increased to 100[13]_{. Hence the present}

(6)

conclu-sion that compared with ANNRSM-0, ANNRSM-1 and ANNRSM-2 require much fewer data samples to achieve Pf results that are both stable and accurate.

Fig.7 Results of three methods for 10 runs (40 training samples).

4.3. Example 3

Fig.8 shows the finite element model of a compres-sor disk. The disk is made of titanium alloy, and its elasticity modulus E is regarded as a random variable. The uncertainties of r (radius of pressure-balancing hole) and R (radius of the circle where the center of pressure-balancing hole is located) are not negligible due to manufacture tolerance. The centrifugal load that the disk is subjected to depends on the rotating speed Z, so the uncertainty of Z also has significant effect on the structural reliability. The statistics of these random input variables are listed in Table 3, and they all as-sume normal distribution. The LSF is defined as

( , , , ) [ ] P( , , , )

g E r R Z V V E r R Z (11) where [V] is the allowable stress, and VP(E, r, R,Z) is

the stress at node P (see Fig.8). The program for finite element computation is necessary to solve VP(E, r,

R,Z), so Eq.(11) is an implicit LSF. ANN can be used to reproduce the implicit VP(E, r, R,Z). The predicted

results of the three methods on the test set are illus-trated in Fig.9. The Pf results of each method for

dif-ferent [V] are listed in Table 4. The exact solution is obtained from importance sampling (2 000 times of simulations).

Fig.8 Finite element model of compressor disk. Table 3 Random input variables for Example 3 Random variable Mean Standard deviation

E/MPa 1.15u105_1.15u103

r/mm 10 0.066 7

R/mm 200 0.5 Ȧ/(r·min1) 1.1u104 ₁₈₇

Fig.9 Predicted results of three methods for stress at node

P (50 training samples).

Table 4 Estimated failure probabilities of three methods (50 training samples)

Exact solution ANNRSM-0 ANNRSM-1 ANNRSM-2

[Vҏҏҏҏ]/MPa

Pf Pf Relative error/% Pf Relative error/% Pf Relative error/%

510 0.015 244 0.011 545 í24.3 0.015 047 í1.3 0.015 682 2.9

515 0.007 167 0.004 874 í32.0 0.006 679 í6.8 0.007 258 1.3

525 0.001 060 0.000 772 í27.2 0.001 128 6.4 0.001 119 5.6

From Fig.9 and Table 4 it is known that compared with ANNRSM-0, ANNRSM-1 and ANNRSM-2 can more accurately approximate the implicit function VP(E,r,R,Z), and hence obtain the failure probabilities

with smaller errors for different [V] values.

Like the previous example, there is no significant difference between the qualities of ANNRSM-1 and ANNRSM-2 solutions, but for the sample number re-quirements the two are different. ANNRSM-2 needs

no validation samples, and it can accurately reproduce VP(E,r,R,Z) through only 50 repetitions of finite

ele-ment analysis. But validation data are necessary for ANNRSM-1 which is adopting early stopping tech-nique, and hence it invokes the finite element program as many as 90 times (including 40 times for validation data). It is thus clear that if the samples are obtained from time consuming numerical simulation, ANNR- SM-1 will certainly be inferior to ANNRSM-2 in

(7)

computational efficiency.

5. Conclusions

The ERM principle-based ANN cannot guarantee satisfactory prediction accuracy for inputs outside the training set. However, to ensure the accuracy of the Pf

results, the ANN response surface needs to accurately approximate the LSF (or implicit unknown function) in the parameter space. To solve this problem, two new ANN RSMs are proposed in this article, and they are the ANN RSM based on early stopping technique (ANNRSM-1), and the ANN RSM based on regulari-zation theory (ANNRSM-2). Experiments show that the estimated LSF given by ANNRSM-1 and ANNRSM-2 has higher approximation accuracy than that of the conventional ANN RSM (ANNRSM-0), so that they are able to provide failure probabilities with smaller errors; meanwhile, ANNRSM-1 and ANNRSM-2 re-quire much fewer data samples than ANNRSM-0 to achieve stable Pf results.

The contrastive research on the performances of ANNRSM-1 and ANNRSM-2 indicates that ANNRSM-2 is superior to ANNRSM-1 in the following two aspects: 1) ANNRSM-2 needs no validation data, and hence its demand for aggregate sample number is lower than that of ANNRSM-1; 2) ANNRSM-1 has restrictions on the value of l, the number of ANN hidden neurons, but ANNRSM-2 has no such limitation.

References

[1] Bucher C G, Bourgund U. A fast and efficient response surface approach for structural reliability problems. Structural Safety 1990; 7(1): 57-66.

[2] Kim S, Na S. Response surface method using vector projected sampling points. Structural Safety 1997; 19(1): 3-19.

[3] Das P K, Zheng Y. Cumulative formation of response surface and its use in reliability analysis. Probabilistic Engineering Mechanics 2000; 15(4): 309-315.

[4] Guan X L, Melchers R E. Effect of response surface parameter variation on structural reliability estimates. Structural Safety 2001; 23(4): 429-444.

[5] Gupta S, Manohar C S. An improved response surface method for the determination of failure probability and importance measures. Structural Safety 2004; 26(2): 123-139.

[6] Gavin H P, Yau S C. High-order limit state functions in the response surface method for structural reliability analysis. Structural Safety 2008; 30(2): 162-179. [7] Krishnamurthy T, Romero V. Construction of response

surface with higher order continuity and its application to reliability engineering. AIAA-2002-1466, 2002. [8] Krishnamurthy T. Comparison of response surface

construction methods for derivative estimation using moving least squares, Kriging and radial basis func-tions. AIAA-2005-1821, 2005.

[9] Jones D R, Schonlau M, Welch W J. Efficient global optimization of expensive black-box functions. Journal

of Global Optimization 1998; 13(4): 455-492. [10] Elhewy A H, Mesbahi E, Pu Y. Reliability analysis of

structures using neural network method. Probabilistic Engineering Mechanics 2006; 21(1): 44-53.

[11] Papadrakakis M, Papadopoulos V, Lagaro N. Structural reliability analysis of elastic-plastic structures using neural networks and Monte Carlo simulation. Com-puter Methods in Applied Mechanics and Engineering 1996; 136(1-2): 145-163.

[12] Lu Z Z, Yang Z Z. New reliability analysis method based on artificial neural network. Journal of Me-chanical Strength 2006; 28(5): 699-702. [in Chinese] [13] Cheng J, Li Q S, Xiao R C. A new artificial neural

network-based response surface method for structural reliability analysis. Probabilistic Engineering Mecha- nics 2008; 23(1): 51-63.

[14] Cheng J, Li Q S. Reliability analysis of structures us-ing artificial neural network based genetic algorithms. Computer Methods in Applied Mechanics and Engi-neering 2008; 197(45-48): 3742-3750.

[15] Yuan X F, Wang Y N. Parameter selection of support vector machine for function approximation based on chaos optimization. Journal of Systems Engineering and Electronics 2008; 19(1): 191-197.

[16] Prechelt L. Automatic early stopping using cross vali-dation: quantifying the criteria. Neural Networks 1998; 11(4): 761-767.

[17] Yan P F, Zhang C S. Artificial neural networks and evolutionary computing. 2nd ed. Beijing: Tsinghua University Press, 2005. [in Chinese]

[18] Hagiwara K. Regularization learning, early stopping and biased estimator. Neurocomputing 2002; 48(1-4): 937-955.

[19] Burger M, Neubauer A. Analysis of Tikhonov regu-larization for function approximation by neural net-works. Neural Networks 2003; 16(1): 79-90.

[20] Rajashekhar M R, Ellingwood B R. A new look at the response surface approach for reliability analysis. Structural Safety 1993; 12(3): 205-220.

[21] Kaymaz I. Application of Kriging method to structural reliability problems. Structural Safety 2005; 27(2): 133-151.

Biographies:

REN Yuan Born in 1982, he received B.S. degree from Beijing University of Aeronautics and Astronautics in 2004, and now he is a Ph.D. candidate. He has already published nine technical articles in mechanical or aeronautical journals. He was granted Airbus scholarship and Guanghua scholar-ship for his researching efforts. His main research interest focuses on structural and mechanical reliability.

E-mail: [email protected]

BAI Guangchen Born in 1962, he received Ph.D. degree in 1993, and now is a professor of Beijing University of Aeronautics and Astronautics. He has published seventy technical papers in mechanical or aeronautical journals. His main research interests are reliability engineering and opti-mization design.