A Wiener-process-based degradation model with a recursive filter
algorithm for remaining useful life estimation
Xiao-Sheng Si
a,b, Wenbin Wang
c,n, Chang-Hua Hu
a, Mao-Yin Chen
b, Dong-Hua Zhou
b,na
Department of Automation, Xi’an Institute of High-Tech, Xi’an, Shaanxi 710025, China b
Department of Automation, TNList, Tsinghua University, Beijing 100084, China c
Dongling School of Ecomonics and Management, University of Science and Technology Beijing, Beijing 100083, China
a r t i c l e
i n f o
Article history:
Received 2 February 2012 Received in revised form 27 July 2012
Accepted 10 August 2012 Available online 1 September 2012 Keywords:
Reliability
Remaining useful life Wiener process Recursive filter
Expectation maximization
a b s t r a c t
Remaining useful life estimation (RUL) is an essential part in prognostics and health management. This paper addresses the problem of estimating the RUL from the observed degradation data. A Wiener-process-based degradation model with a recur-sive filter algorithm is developed to achieve the aim. A novel contribution made in this paper is the use of both a recursive filter to update the drift coefficient in the Wiener process and the expectation maximization (EM) algorithm to update all other para-meters. Both updating are done at the time that a new piece of degradation data becomes available. This makes the model depend on the observed degradation data history, which the conventional Wiener-process-based models did not consider. Another contribution is to take into account the distribution in the drift coefficient when updating, rather than using a point estimate as an approximation. An exact RUL distribution considering the distribution of the drift coefficient is obtained based on the concept of the first hitting time. A practical case study for gyros in an inertial navigation system is provided to substantiate the superiority of the proposed model compared with competing models reported in the literature. The results show that our developed model can provide better RUL estimation accuracy.
&2012 Elsevier Ltd. All rights reserved.
1. Introduction
Enhancing safety, efficiency, availability, and effectiveness of industrial and military systems through prognostics and health management (PHM) paradigm has gained momentum over the last decade[1,2]. PHM is a systematic approach that is used to evaluate the reliability of a system in its actual life-cycle conditions, predict failure progression, and mitigate operating risks via management actions. There are two parts in PHM, namely, ‘‘prognostics’’ and ‘‘health management’’. Prognostics is often characterized by estimating the remaining useful life (RUL) of a system using available condition monitoring (CM) information[3–7]. Once such prognosis is available, appropriate health manage-ment actions such as repair, replacemanage-ment, and logistic support can be performed to achieve the required system’s operational objectives[8–10]. In PHM, the term ‘‘RUL estimation’’ often implies to find the probability density function (PDF) of the RUL or the mean of the RUL[11], but the emphasis is often placed more on estimating the PDF of the
Contents lists available atSciVerse ScienceDirect
journal homepage:www.elsevier.com/locate/ymssp
Mechanical Systems and Signal Processing
0888-3270/$ - see front matter & 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.ymssp.2012.08.016
n
Corresponding authors. Tel.: þ 86 010 62794461; fax: þ86 010 62786911.
RUL than the mean RUL since a PDF can characterize the uncertainty associated with the RUL and is hence more informative for management decision making.
The current RUL estimation approaches can be broadly classified as physics of failure, data driven and fusion methods. Physics of failure approaches rely on the physics of underlying failure mechanisms. Data driven approaches achieve RUL estimation via data fitting mainly including machine learning and statistics based approaches. The fusion approaches are the combination of the physics of failure and data driven approaches. However, for complex or large-scale engineering systems, it is typically difficult to obtain the physical failure mechanisms in advance or cost-expensive and time-consuming to capture the physics of failure by experiments. In contrast, data-driven approaches attempt to derive models directly from collected degradation data or life data, and thus are more appealing and have gained much attention in recent years.
Statistics-based data driven methods for RUL estimation can be classified into the models based on the indirectly observed state processes and the models based on the directly observed state processes[2]. The former models considered the data partially indicating the underlying state of the system, and assumed that the available CM data were stochastically related to the underlying health state. In this case, lifetime data must be available to establish the relationship between the CM data and failure. The latter models utilized the observed degradation data directly to describe the underlying state of the system. Therefore, the RUL is defined as the time to reach the failure threshold of the monitored degradation data for the first time, namely the first hitting time (FHT)[12]. It is noted that the observed degradation data with a threshold are easier to manipulate and implement in practice when lifetime data are scarce[13,14].
It is well recognized that degradation process is uncertain over time and thus stochastic models are frequently used to characterize the evolution of degradation process. In literature, random effect regression (RER) models and stochastic process (SP) models are two kinds of most commonly used stochastic models. Lu and Meeker in[15]first presented the RER model to characterize the degradation of a population of units, and many extensions appeared later[2,16]. However, the degradation modeling paradigm in RER models is based on the fact that a population of ‘‘identical’’ systems (or devices) has a common degradation form. However, individual systems may exhibit different degradation rates, hence the different failure times. On the other side, Pandey et al. in[17]identified that the temporal uncertainty of the degradation process was not taken into account in RER models and thus argued that SP models could remedy it well. They also showed the advantages of SP models over RER models in condition based maintenance. SP models such as Markov chain, Gamma processes, and Wiener processes have been widely used to model the degradation process[5,18–21]. A Wiener-process-based degradation model is one of statistics-Wiener-process-based data driven models, which can characterize a non-monotonic degradation process, and provide a good description of system’s behavior due to an increased or reduced intensity of the use[22]. This type of models has been applied to model the degradation process and to estimate the RUL of a variety of industrial assets, such as rotating element bearings [19], LED lamps [23], self-regulating heating cables [24], laser generator[25], bridge beams[26], and fatigue crack dynamics[27]. Therefore, in this paper we focus on RUL estimation based on a Wiener process where the degradation process can be observed directly and a failure threshold of degradation is available. It is known that the selection of the failure threshold is an important problem in practice. However, such threshold is usually set based on either engineering domain knowledge or accepted industrial standards. For example, the ISO 2372 and ISO 10816 are frequently adopted for defining acceptable vibration threshold levels. It is therefore an issue beyond the scope of this paper. In this work, we assume that the failure threshold is known a priori.
A Wiener-process-based model has a drift term characterized by its drift coefficient and a noise term by Brownian motion. It has been widely used to model degradation processes which can be observed directly, and conduct lifetime analysis. Tseng et al. in[23]used a Wiener process to determine the lifetime for the light intensity of LED lamps of contact image scanners. As an extension, Tseng and Peng proposed an integrated Wiener process to model the cumulative degradation path of a product’s quality characteristics[28]. Joseph and Yu used a Wiener process for degradation modeling and reliability improvement[14]. Other recent extensions in lifetime estimation can be found in[29–31]. However, the degradation modeling paradigm for lifetime analysis in these conventional Wiener-process-based models is based on an assumption that the estimated PDF of RUL depends only on the currently observed degradation data, which is a strong Markovian assumption.
To relax this assumption, Gebraeel et al. in [19]presented an exponential degradation model for rotating element bearings based on a Wiener process, but incorporated some new and important improvements for RUL estimation. Their model established a linkage between the past and current degradation data of the same system by a Bayesian mechanism. However, it is worth noting that the Brownian motion in the Wiener process was just used as an error term in their models and the availability of the explicit distribution of the FHT from the Wiener process was not utilized. Instead, they directly estimated the RUL distribution using an implicit monotonic assumption. It is well-known that Wiener processes are non-monotonic. As such, the resulted RUL estimates in[19]are approximations. In addition, we note that the stochastic coefficients reflecting the individual-to-individual variability in[19]followed some prior distributions, but no elaborated method was presented to select the parameters in the prior distributions. Typically, several systems’ historical degradation data of the same type are required to determine the prior parameters. However, such historical degradation data of many systems are not always available in practice, particularly for newly commissioned systems. It is shown in Section IV that the inappropriate selection of the prior parameters can result in inaccurate estimation of the degradation and the RUL. Wang et al. in[32]recently proposed a Wiener-process-based model which used all past degradation data to date of the system for RUL estimation. Their model explicitly used the FHT from the Wiener process for RUL estimation.
However, we note that the model in[32]also required the data of many same systems for parameter estimation, and the distribution of the updated drift coefficient was not considered. Our results reveal that considering such distribution can lead to uncertainty reduction in the estimated RUL. One final observation from the existing literature of Wiener-process-based RUL estimation models is that all estimated parameters are not updated in line with newly observed data.
From the above review of related researches, we observe that there are three issues remaining to be solved when applying Wiener process for RUL estimation. The first is how to estimate the model parameters from an individual system’s data without the need of past data from many systems. The second is to consider the distribution in the estimated drift coefficient, which is a critical parameter having impact on both the mean and variance of the RUL. The third is to update model parameters based on newly observed degradation data. As we know the system’s life is heavily influenced by the way it is operated, maintained and the environment where it has been operating. The consideration of the above three issues will make our model tailored to an individual system through its actual monitoring data which relate to its operational and environmental characteristics.
In this paper, we address the above issues by utilizing a Wiener-process-based model with a recursive filter algorithm for RUL estimation. We use two techniques for the updating of the RUL estimation. A state-space model is used to recursively update the drift coefficient and an expectation maximization (EM) algorithm is used to re-estimate all unknown parameters at each time when new data are available. The new contributions of this paper are summarized as follows: (1) Different from all the previous works, our model estimates an individual system’s RUL based on its entire monitoring information to date through a recursive filter and an EM algorithm, and does not require historical degradation data of other systems in a population; (2) Unlike the work of Wang et al. in[32]where a recursive filter was also used, the drift coefficient is treated as a random variable to incorporate its distribution in estimation; (3) Our model is also different from the approximated results in[19,33] in that our result on the PDF of the RUL is exact in the sense of the FHT, and we also show that our result can ensure that the moments of the RUL exist, but this is not the case for the approximated results in[19,33]; and (4) We apply the proposed model to estimate the RUL of gyros in an inertial navigation system used in weapon systems as a case application.
The remainder parts are organized as follows. In Section II, we develop a Wiener-process-based degradation model and obtain the distribution of the RUL. In Section III, we discuss the parameter estimation algorithm in detail. Section IV provides a case study to illustrate the application and usefulness of the developed model. Section V draws the main conclusions.
2. Wiener-process-based degradation modeling and RUL estimation Notation used in this paper is summarized as follows.
ti Time of the ith CM point (can be irregularly spaced)
XðtÞ Random variable representing the degradation at time t xi Degradation observation at ti
X0:i¼ fx0,x1,x2,:::,xig History of degradation observations for the system to ti
U
i¼l
0,l
1,. . .,l
i1
History of drift coefficients to ti
EðUÞ Expectation operator BðtÞ Standard Brownian motion
l
Drift coefficientl
i Drift coefficient at til
0 Initial state which follows Nða0,P0Þs
Diffusion coefficientZ
Noise ofl
i which follows Nð0,Q Þw Failure threshold ^
l
i Estimatedl
i conditional on X0:i, denoted by Eðl
i9X0:iÞ^
l
j9i Smoothedl
jconditional on X0:i, denoted by Eðl
j9X0:iÞPi9i Variance of estimated
l
i, denoted by Varðl
i9X0:iÞPi9i1 Variance of estimated
l
i at ti1, denoted by Varðl
i9X0:i1ÞKi Filter gain
Ri RUL at tiwith its realization, ri
T Lifetime
h
Parameter column vector, denoted byh
¼ ½a0,P0,Q ,s
2T^
h
i Maximum likelihood estimate (MLE) ofh
based on X0:i^
h
i ðkÞEstimated
h
in the kth step based on X0:iin the EM algorithmLið
h
Þ Log-likelihood function for observed events, x0,x1,x2,:::,xi‘ið
h
Þ Joint log-likelihood function for observed events, X0:iandU
i‘ð
h
9 ^h
ðkÞi Þ Expectation of ‘iðh
Þconditional on ^h
ðkÞ i and X0:iD
n Order principal minor determinant of the second derivation of ‘ðh
9 ^h
ðkÞi Þ, with n ¼ 1,2,3,4
F
ðUÞ Standard normal CDF2.1. An outline of Wiener-process-based degradation model for lifetime analysis
In this section we briefly present the conventional Wiener-process-based degradation model for lifetime analysis. A Wiener process is typically used for modeling degradation processes where the degradation increases linearly in time with random noise. The rate of degradation is characterized by the drift coefficient. The wear of break pads on automotive wheels is a practical example. Christer and Wang in[34]modeled the wear of break pads as a linear function of time where the thickness of the break pad decreases linearly in time with a Gaussian noise.
In general, a Wiener-process-based degradation model can be represented as,
XðtÞ ¼
l
t þs
BðtÞ, ð1Þwhere
l
is the drift coefficient,s
40 is the diffusion coefficient, and BðtÞ is the standard Brownian motion representing the stochastic dynamics of the degradation process.In physics, a Wiener process aims at modeling the movement of small particles in fluids and air with tiny fluctuations. A characteristic feature of this process in the context of reliability is that the plant’s degradation can increase or decrease gradually and accumulatively over time. The small increase or decrease in degradation over a small time interval behaves similarly to the random walk of small particles in fluids and air. Therefore, this type of stochastic processes has been widely used to characterize the path of degradation processes where successive fluctuations in degradation can be observed, such as the degradation observations of rotating element bearings[19], LED lamps[23], self-regulating heating cables[24], laser generator[25], bridge beams[26]and other examples in[14]and[30–35] and our case of gyros’ drifting. Modeling a stochastic degradation process as a Wiener process implies that the mean degradation path is a linear function of time, i.e. E XðtÞ½ ¼
l
t. Therefore, the drift parameterl
is closely related with the progression of the degradation. In addition, we have the variance of the degradation process var XðtÞ½ ¼s
2t, which represents the uncertainty of thedegradation at time t.
For in service lifetime estimation at time ti with the obtained degradation observation xi, we can use,
XðtÞ ¼ xiþ
l
ðttiÞ þs
ðBðtÞBðtiÞÞ¼xiþ
l
ðttiÞ þs
BðttiÞ, f or t 4 ti ð2ÞAt time ti, we assume xiow, otherwise the degradation has crossed w and the system would have failed as defined.
Although the above model setting is the same, there are two different ways to relate the degradation XðtÞ to lifetime T at ti
in the literature. The first one is that the lifetime is directly defined as T ¼ t : XðtÞ Z w9xi
, and then the lifetime distribution can be represented by FT9xiðt9xiÞ ¼PrðXðtÞ Z w9xiÞ, such as[19,31,33]. To see this, using T ¼ t : XðtÞ Z w9xi
, the PDF and the cumulative density function (CDF) of lifetime T at time ti can be directly obtained as,
fT9xiðt9xiÞ ¼ 1
s
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi2p
ðttiÞ p exp ðwxil
ðttiÞÞ 2 2s
2ðtt iÞ ! ð3Þ FT9xiðt9xiÞ ¼1F
wxil
ðttiÞs
ffiffiffiffiffiffiffiffiffitti p , ð4Þwhere
F
ðUÞdenotes the standard normal CDF.Another one defines the lifetime based on the concept of the FHT as T ¼ inf t : XðtÞ Z w9x i, such as[21,29,30,32,35].
Therefore, using the FHT, the following results can be obtained[36], fT9xiðt9xiÞ ¼ wxi ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2
p
ðttiÞ3s
2 q exp ðwxil
ðttiÞÞ 2 2s
2ðtt iÞ ! , ð5Þ FT9xiðt9xiÞ ¼1F
wxil
ðttiÞs
ffiffiffiffiffiffiffiffiffitti p þexp 2l
ðwxiÞs
2F
ðwxiÞl
ðttiÞs
ffiffiffiffiffiffiffiffiffitti p , ð6Þwith the mean E T9xi
¼tiþ ðwxiÞ=
l
and variance var T9xi
¼ ðwxiÞ
s
2=l
3h
. This reflects the relationship among the parameters used in the model, the current degradation measurement and the estimated future life of the plant modeled. Particularly,
l
is critical for both the mean and variance of the estimated lifetime.Clearly, the above two definitions are different from each other and also lead to different lifetime estimations. As noted by Park and Bae in[16], T ¼ t : XðtÞ Zw9x icompletely ignores the possible hitting events within interval ðti, tÞ and thus is
only a crude approximation to T ¼ inf t : XðtÞ Zw9xi
when the degradation fluctuations are large. In addition, the CDF using the FHT as Eq.(6)is greater than the approximation by Eq.(4). For safety-critical systems, using the approximated result as Eq.(4)for maintenance scheduling may lead to under-maintenance because of the lower risk of failure estimated by Eq.(4). As such, it is necessary to consider the FHT as the lifetime, which is exact if the failure is defined as the FHT. We note that Eq.(6)uses only the current degradation data, but not its history before ti. However, as we have discussed,
this is a strong Markovian assumption and ideally the future FHT should depend on the path that the degradation has involved to date. For example, inFig. 1, at the same level of xi, case (a) would be expected to fail faster than case (b), but
Consequently, it is desired to utilize the degradation data to date for evaluating the RUL of the degraded system. It is expected that utilizing the degradation data to date can make the RUL estimation sharper and more tailored to an individual system than only using the current data. This is our main focus in the remaining parts of this paper.
2.2. Wiener-process-based degradation modeling
Now we address the issues discussed in the introduction. Since some variants introduced in [19] can be easily transformed into Eq.(1)by logarithmic transformation, we only focus on Eq.(1), which is used to describe the evolution of the monitored degradation variable over time in this paper.
To incorporate the history of the observations and to maintain at the same time the nice property of the Wiener process, we consider an updating procedure for coefficient
l
by a random walk modell
i¼l
i1þZ
over time whereZ
Nð0,Q Þ. Thus the drift coefficientl
evolves as a time-dependent random variable with a distribution, conditional onl
i1. In fact, the diffusion coefficient,s
, can also be made time-dependent. However the structure of Eq.(1)does not allowus to use the state space model shown later, and therefore a general filter has to be used, which is computationally difficult. There is also a practical reason why we are only interested in making
l
time-varying. It is known from Eq.(1)and the discussion in Section II-A that the mean degradation and the progression of the degradation are governed by the drift coefficientl
and the time while the diffusion coefficients
controls in part the uncertainty in the degradation process. The trend in degradation is determined by the drift coefficient while the diffusion coefficient only influences the noise, which can be considered to be constant. A similar idea can also be found in statistical process control literature, in which it is frequently assumed that the process mean will change but the variance will be constant when the process shifts from ‘‘in control’’ to ‘‘out of control’’ [37]. Motivated by the state space model [38], the degradation equation can be reconstructed via a linear state-space model as,l
i¼l
i1þZ
, ð7Þxi¼xi1þ
l
i1ðtiti1Þ þse
i, ð8Þwhere t0¼0, x0¼0, and
e
iNð0,titi1Þ. The use of titi1 as the variance ofe
iis required by the property of Brownianmotion. Eq.(7)is called the system equation, while Eq.(8)is the observation equation [38]. The reason to use a linear system equation is not only because of its simplicity. We can use a nonlinear system equation for
l
i, but we expect that thegain obtained will be minor, but at the expense of a substantially long computation time. There is also a practical problem as what form of nonlinearity we should use because
l
iis not observable. Since the system equation is to model the changeof the drift coefficient over a sampling interval which is not long, we would expect that
l
ishould be aroundl
i1adjustedby the noise term.
We assume that the initial drift coefficient
l
0follows a normal distribution with mean a0and variance P0as required bythe state space model. The drift coefficient is considered as a hidden ‘‘state’’ and can be estimated from the observations up to ti, denoted by X0:i¼ fx0, x1, x2,:::, xig. As such, this model establishes the linkage between the drift coefficient and the
observation history up to ti. In Eq.(7),
l
i follows a distribution which can be estimated by a recursive filter once newobservation xiis available at ti. We denote its mean by ^
l
i¼Eðl
i9X0:iÞand its variance by Pi9i¼Varðl
i9X0:iÞ.In order to compute ^
l
iand Pi9i, we need to know the PDF of thel
igiven X0:i, denoted by pðl
i9X0:iÞ. Recursion solution ofpð
l
i9X0:iÞcan be computed from pðl
i19X0:i1Þby the well-known Bayesian rule as follows,pð
l
i9X0:iÞ ¼Z
pð
l
i9l
i1Þpðl
i19X0:iÞdl
i1¼R
pð
l
i9l
i1Þpðxi9l
i1,X0:i1Þpðl
i19X0:i1Þdl
i1pðxi9X0:i1Þ:
ð9Þ It has been well established that if Eqs.(7) and (8)are used, Eq.(9)is Gaussian with mean ^
l
iand variance Pi9i, which canbe computed by the Kalman filter[38]. As a result, the entire history is captured via recursively updating the estimate of
l
i,which is the advantage of the state space model. The recursive estimations for ^
l
i and Pi9i using Kalman filtering aresummarized as Algorithm 1 inAppendix A.
In literature, the Kalman filter has been successfully applied when system states (here referred the drift coefficient as a state) and observations evolve in a smooth and gradually changing way. However, degradation sometimes may have jumps or sudden changes[19]. Here, we introduce an algorithm to deal with this by a strong tracking filter (STF)[39]. STF is also Kalman filter-based, but adjusts the prediction variance Pi9i1so that it is sensitive to the prediction error,
xi xi
w w
Degradation Degradation
Time Time
xixi1 ^
l
i1ðtiti1Þ, and then the filter gain Ki is sensitive to the change of the system state. The details about the STFalgorithm are summarized inAppendix Bas Algorithm 2.
Based on Eqs.(7), (8), and (9), the PDF of
l
iconditional on X0:iis,pð
l
i9X0:iÞ ¼ 1 ffiffiffiffiffiffiffiffiffiffiffiffiffi 2p
Pi9i q expl
i ^l
i 2 2Pi9i , ð10Þwhere the dependence between
l
i and X0:iis contained in ^l
iand Pi9i. Based on this result, we derive the associated RULdistribution in the following.
2.3. Real-time updating of the RUL distribution
Based on a predefined threshold w, the RUL modeling principle is that when degradation XðtÞ first reaches threshold w, the system is declared to be non-operable and its lifetime terminates. Consequently, it is natural to view the event of lifetime termination as the point that the degradation XðtÞ exceeds threshold w for the first time. In this paper, from the concept of the FHT, we define RUL Ri at time ti as
Ri¼inf ri: XðriþtiÞ Zw9X0:i, ð11Þ
with CDF FRi9X0:iðri9X0:iÞand PDF fRi9X0:iðri9X0:iÞ. From Eq.(2), it is direct to obtain the PDF and CDF of the RUL at time ti defined in Eq.(11)as follows[36],
fRi9li,X0:iðri 9
l
i,X0:iÞ ¼ wxi ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2p
ri3s
2 p exp ðwxil
iriÞ 2 2s
2r i ! , ri40: ð12Þ FRi9li,X0:iðri9l
i,X0:iÞ ¼1F
wxil
iris
ffiffiffiffiri p þexp 2l
iðwxiÞs
2F
ðwxiÞl
iris
ffiffiffiffiri p , ri40: ð13ÞIn Eq.(12)if we replace
l
i by ^l
i, then it is the RUL model used in[32]. We call this as Wang’s model for subsequentcomparisons in Section IV. However, as mentioned above, the drift coefficient evolves as a random variable in Eq.(7)with a distribution, pð
l
i9X0:iÞ, conditional on the observed data up to time ti as formulated by Eq.(10). Now we want to usepð
l
i9X0:iÞfor deriving the estimated RUL distribution. In order to achieve this aim, we first give two lemmas, which cansignificantly simplify the course of the derivation for the RUL distribution. Lemma 1. If Y Nð
m
1,s
21Þ, then EY½F
ðYÞ can be formulated as EY½F
ðYÞ ¼F
ðm
1=ffiffiffiffiffiffiffiffiffiffiffiffiffi
s
2 1þ1q
Þwhere
F
ðUÞdenotes the standard normal CDF.Proof: From the property of the normal distribution, we have EY½
F
ðYÞ ¼E EðIfZr Yg9YÞ9Y¼PrðZrYÞ ¼ PrðZY r0Þ ¼
F
ðm
1= ffiffiffiffiffiffiffiffiffiffiffiffiffis
2 1þ1 q Þ: ð14Þ In the derivation process, IfZr Yg is the indicator function, Z is a standard normal variable and independent of Y, andZY Nð
m
1,s
21þ1Þ.Lemma 2. If
l
Nðm
l,s
2lÞ, the PDF and CDF of the FHT of process XðtÞ ¼l
t þs
BðtÞ to first hit threshold w can be formulated as fTð Þ ¼t w ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2p
t3s
2 lt þs
2 q exp wm
lt 2 2ts
2 lt þs
2 " # : ð15Þ FTð Þ ¼tF
m
ltw ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffis
2 lt 2þs
2t q 0 B @ 1 C Aþexp 2m
s
l2wþ 2s
2 lw2s
4 !F
2s
2 lwt þs
2m
lt þws
2 ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffis
2 lt 2þs
2t q 0 B @ 1 C A: ð16ÞLemma 2 is similar to the results given in[40]and[41]. This lemma can be obtained by some direct manipulations using Lemma 1 and the total law of probability[25]. From Lemma 2, we give the following theorem.
Theorem 1. For the Wiener process defined by Eq.(2)and the state space model as Eqs.(7) and (8), the PDF and CDF of the updated RUL at tibased on the updated PDF of
l
ican be obtained asfRi9X0:iðri9X0:iÞ ¼ wxi ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2
p
ri3 Pi9iriþs
2 r exp wxi ^l
iri 2 2ri Pi9iriþs
2 0 B @ 1 C A, ri40: ð17ÞFRi9X0:iðri9X0:iÞ ¼1
F
wxi ^l
iri ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi Pi9iri2þs
2ri q 0 B @ 1 C Aþexp 2 ^l
iðwxs
2 iÞþ 2Pi9iðwxiÞ2s
4 !F
2Pi9iðwxiÞriþs
2l
^iriþwxis
2 ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiP i9iri2þs
2ri q 0 B @ 1 C A: ð18Þ Proof: Using Eqs.(10), (12), (13), (15), (16)and the total law of probability, we havefRi9X0:iðri9X0:iÞ ¼ Z þ 1
1
fRi9li,X0:iðri9
l
i,X0:iÞpðl
i9X0:iÞdl
i, ð19ÞFRi9X0:iðri9X0:iÞ ¼ Z þ 1
1
FRi9li,X0:iðri9
l
i,X0:iÞpðl
i9X0:iÞdl
i: ð20Þ Following Lemma 2, it is straightforward to obtain Eqs.(17) and (18).Comparing Eq.(17)with Eq.(12), we observe that the observation history and the variance of
l
iare involved in Eqs.(17)and (18), which is also recursively updated. We call Eq.(17)as our model to distinguish other models in Section IV.
Remark 1. In[19], they directly used PrðRirri9X0:iÞ ¼Pr XðriþtiÞ Zw9X0:i
to calculate the RUL distribution and obtained the associated RUL distribution with similar form to Eqs.(3) and (4). However, using PrðRirri9X0:iÞ ¼Pr XðriþtiÞ Zw9X0:i
ignores the possible hitting events within ðti,tiþriÞas discussed in Section II-A and thus their results are approximations.
Instead, Eqs.(17) and (18)are exact in the sense of the FHT.
Remark 2. The moment of the RUL distribution obtained in [19]and [33], does not exist since their obtained RUL distributions belong to the family of Bernstein distributions, known without moments, but this is not the case for our result. For example, the mean of RUL can be easily formulated by
EðRi9X0:iÞ ¼E½EðRi9
l
i,X0:iÞ9X0:i ¼Eðwxi
l
i 9X1:kÞ ¼ wxk Pi9i expð ^l
2i 2Pi9i Þ Z l^i 0 expðu 2 2Pi9i Þdu ¼ ffiffiffi 2 p wxi ð Þ ffiffiffiffiffiffiffi Pi9i q Dð ^l
i ffiffiffiffiffiffiffiffiffiffi 2Pi9i q Þ where DðzÞ ¼ expðz2ÞRz0expðu2Þdu is the Dawson integral for real z, which is known to exist. This property is desired in
maintenance practice, since the expectation of the life estimation is required to be existent sometimes[42,43].
In Eqs.(17) and (18), parameters a0,P0, Q and
s
2should be estimated. As opposed to the result in[19,32], we develop aparameter estimation algorithm in the following section for this task. 3. Parameter estimation
Now we return to estimate and update a0, P0, Q and
s
2in Eqs.(7) and (8). Unlike the method adopted in[32], we useonly the data from one system from the time of installation and recursively update the estimate along with the observation process. We denote
h
¼ ½a0, P0, Q ,s
2T as a parameter vector. We use the maximum likelihood estimation(MLE) to estimate
h
once new degradation observation xiis available. In this case, the log-likelihood function for X0:ican bewritten as
Lið
h
Þ ¼log½pðX0:i9h
Þ, ð21Þwhere pðX0:i9
h
Þis the joint PDF of the degradation data X0:i. Then the MLE estimate ofh
, denoted by ^h
i, conditional on X0:ican be obtained by ^
h
i¼argmaxh Lið
h
Þ: ð22ÞIf the drift coefficient is constant then maximizing Eq.(21)with respect to
h
is straightforward. However, since we treatl
i as a hidden variable which is given by Eq. (7), then directly maximizing Eq. (21)is impossible. However, the EMalgorithm provides a possible framework for estimating the parameters involving hidden variables[44]. A fundamental assumption of the EM algorithm is that the hidden variables can be estimated by observed data. This is the case for our problem since pð
l
i9X0:iÞcan be obtained by Eq.(10).The fundamental principle of the EM algorithm is to replace the hidden variables with their expectations conditional on the observed data. Then the parameter estimation can be formulated as maximizing the joint likelihood function pðX0:i,
U
i9h
Þ. Specifically, by manipulating the relationship between pðX0:i9h
Þand pðX0:i,U
i9h
Þ, Liðh
Þcan be divided into twoparts as
Lið
h
Þ ¼‘iðh
Þlog pðU
i9X0:i,h
Þ, ð23Þwhere
‘ið
h
Þ ¼log pðX0:i,U
i9h
Þ: ð24ÞThen taking the expectation operator on both sides of Eq.(23)with respect to
U
i9X0:i,h
0, we havewhere ‘ð
h
9h
0Þ ¼EUi9X0:i,h0 ‘ið
h
Þ
, ð26Þ
Kð
h
9h
0Þ ¼EUi9X0:i,h0 log pðU
i9X0:i,h
Þ
: ð27Þ
Finally, via Eq.(25), the following holds Lið
h
ÞLiðh
0Þ ¼‘ðh
9h
0Þ‘ðh
09h
0Þ þKðh
09h
0ÞKðh
9h
0Þ|fflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflffl}
Z0
, ð28Þ
where the positivity of the last term is implied by the Kullback–Leibler divergence metric between pð
U
i9X0:i,h
Þ andpð
U
i9X0:i,h
0Þ,[45].
Obviously, if ‘ð
h
9h
0Þ4 ‘ðh
09h
0Þ, then Liðh
ÞLiðh
0Þ40 holds. This is achieved by the EM algorithm which takes anapproximation ^
h
i ðkÞof MLE ^
h
igiven in Eq.(22)and updates it to a better ^h
i ðk þ 1Þaccording to the following two steps: E-step Calculate ‘ð
h
9 ^h
ðkÞ i Þ ¼EU i9X0:i, ^h ðkÞ i ‘iðh
Þ , ð29Þ where ^y
i ðkÞ¼ ½a0iðkÞ, P0iðkÞ, QiðkÞ,
s
i2ðkÞTdenotes the estimated parameters in the kth step conditional on X0:i.M-step Calculate ^
h
i ðk þ 1Þ ¼argmax h EUi9X0:i, ^hi ðkÞ‘iðh
Þ : ð30ÞThen we iterate the E-step and M-step until a criterion of convergence is satisfied. In our case, we can calculate the E-step and M-step separately but just outline the properties of the estimation algorithm. The details of the algorithm are summarized inAppendix C. Interestingly, it can be observed from Theorem 2 inAppendix Cthat the M-step in our approach can be solved analytically and we can obtain the unique maximum point. This implies that each iteration of the EM algorithm can be performed with a single computation, which leads to an extremely fast and simple estimation procedure. This computation advantage plus the exact RUL distribution are particularly attractive for practical applications. The convergence property of the proposed algorithm can be similarly demonstrated in[46–50].
4. A practical case study
In this section, we provide a practical case study for gyros in an inertial navigation system (INS) to illustrate the application of our model and compare the performance of our model with the models presented in[19,32]. In our study, it is found that STF can generate superior results to Kalman filter to a certain extent. As such, in the following, we only use STF in the filtering step for illustration. Of course, in practice which one to use needs to be tested against the model fit. However, due to limited space, we do not discuss this issue in this paper. Actually, the detailed comparisons between STF and Kalman filter with sudden state changing can be found in[51,52].
4.1. Problem description
As a key device of the INS in weapon systems and space equipment, an inertial platform plays an important and irreplaceable role in the INS. Its operating state has a direct influence on navigation precision. The sensors fixed in an inertial platform include three gyros and three accelerometers, which measure angular velocity and linear acceleration, respectively. The gyro fixed on an inertial platform is a mechanical structure having two degrees of freedom from the driver and sense axis (see[53]for a general description of inertial navigation platforms and gyros). When the inertial platform is operating, the wheels of the gyros rotate at very high speeds and can lead to rotation axis wear. As the wear is accumulated, the bearing on the gyro’s electric motor will become deformed and such deformation can lead to the drift of the gyros. The increasing drift finally results in the failure of gyros and then the inertial platform. Past data show that almost 70% of the failures of inertial platforms result from gyroscopic drift and such drift is largely resulted from the wear of bearings as the case of rolling element bearings which are extensively investigated in the literature. However, the difference of our case is that we use the drift data of gyros to estimate the RUL rather than the vibration data as rolling element bearings since we cannot obtain the vibration data as it is not allowed to fix the vibration sensors in the inertial platform. As such, the drift of gyros is often used as a performance indicator to evaluate the health condition of an inertial platform and to schedule maintenance activities.
For an illustrative purpose, we provide an illustration of the inertial platform (see Fig. 2), and an illustration of a deformed bearing of the gyro’s electric motor (seeFig. 3), which was obtained by scanning electron microscopy S-3700N. It can be found that the maximum length of the metal flake is 155
m
m. Such deformation is reflected by the drifting data of gyros, which can be monitored directly.In this study, we assume that CM values of drift coefficients reflect the performance of the inertial platform, and the larger the drift coefficients monitored are, the worser the performance. Therefore, according to the CM data and technical
index of the inertial platform, failure prediction can be implemented by modeling the drift coefficients. The drift coefficients of an inertial platform mainly include K0X, K0Y, K0Z, KSX, KSY, KIZ, in which K0X, K0Y, K0Zdenote constant drift
coefficients, and KSX, KSY, KIZ are stochastic drift coefficients, where KSX, KSY denote the coefficients related to the first
moment of specific force along the sense axis, and KIZdenotes the coefficient related to the first moment of specific force
along the input axis. Generally, the drift degradation measurement along the sense axis, KSX, plays a dominant role in the
assessment of gyro degradation. In our study, we take the CM data of KSX as the degradation signals and use them for RUL
estimation of the INS. For our monitored INS in certain weapon system with the terminated life 180.5 h, 73 points of drift coefficients data were collected with regular CM intervals 2.5 h in field condition. The collected data are illustrated inFig. 4.
In the practice of the INS health monitoring, it is usually required that the drift measurement along the sense axis should not exceed 0:37ð1=hÞ. This threshold is predetermined at the design stage and is strictly enforced in practice since an INS is a critical device used in a navigated weapon system.
4.2. The implementation of our model for RUL estimation of the INS
Using our model, the predictions of the gyro’s drift and the distribution of the RUL can be obtained at each CM point. Specifically, using our approach initialized by parameter vector
h
0¼ ½0:002, 0:001, 0:01, 0:01T, the one step predicteddrifting path by ^xi þ 1¼xiþ ^
l
iðti þ 1tiÞis illustrated inFig. 4to show the fitness of our model to the gyro’s drift degradation Fig. 2. Illustration of the inertial platform.data. Clearly, the predicted results match with the actual data well and the mean squared error (MSE) of the predictions is 1.1962E-4 which is small. This demonstrates that our developed model can model the gyro’s drift degradation data effectively. Correspondingly, the evolving path of the estimated parameter vector ^
h
, consisting of a0, P0, Q ands
2, areillustrated inFig. 5, which are estimated by Algorithm 4 summarized inAppendix E.
Fig. 5shows that the updated parameters converge quickly as the observed degradation data are accumulated. In this
case, once the parameters converge, further updating may be unnecessary. However, such updating is needed for the case that the degradation process may be subject to unusual changes in its progression. Once the parameters in the model are updated, the PDF of the estimated RUL can be calculated at each CM point. It is noted that the RUL estimation at a CM point is not achieved by estimating the long term future degradation state with estimating errors, and instead is achieved by directly calculating the FHT of the degradation process.Fig. 6illustrates the RUL distributions at six different CM points. It is noted that the first drift reading is zero according to the model setting. In our used practical data, the last drift reading is 0:3566 ð1=hÞ at the monitoring time t73¼180 h. It can be found that this drift reading is very close to the failure
threshold (w ¼ 0:37 ð1=hÞ). Therefore, we can consider that the data captures full life cycle history (useful life) of the machine component. In other words, the actual lifetime of the gyros in an INS is approximated to be 180.5 h and thus the actual RUL at each CM point is known from the full life cycle data. As shown inFig. 6, the actual RUL (denoted by square) falls within the range of the estimated PDF of the RUL at each CM point from our model and further the estimated PDF of the RUL becomes sharper as the degradation data are accumulated. This implies that the uncertainty of the estimated RUL is reduced since more data are utilized during estimating the model parameters. When we use our predictive model for RUL estimation at a given CM point, we use the CM data up to that CM point. In other words, if we estimate the RUL at ti,
the data X0:iare used to update the model and for RUL estimation. Therefore, at the last CM time t73, all the available CM
data are used to estimate the RUL.Fig. 7illustrates the performance of our predictive model against the full life cycle data. It can be observed that the estimated mean RUL and the actual RUL match each other well. For example, the relative error between the actual life and the estimated mean life at t73 is 0.16%. This reflects that the estimated result of our
developed model can match the actual result from the full life cycle data closely.
4.3. Comparative studies
In this part, we conduct some comparative studies with the models presented in[19,32]. We first compare our model with Wang’s model about their performance of the RUL estimation for the INS. The detailed implementation process of Wang’s model can be found in[32].
In order to compare our model with Wang’s model, a loss function is employed to enable a direct comparison of the distributions for the RUL estimation between two models. The loss function is the MSE about the actual RUL obtained at each observation point[4], defined as
MSEi¼ Z 1 0 ðri~riÞ2fRi9X0:i ri9X0:i dri, ð31Þ where ~riis the actual RUL obtained at tiand fRi9X0:i ri9X0:i
is the estimated PDF of the RUL.
Fig. 8(a) compares the estimated RUL distributions from our model with Wang’s model. In both cases, the unknown
parameters, such as a0,P0, Q and
s
2, are obtained by the EM algorithm, but the difference is that our model recursivelyupdates all parameter estimates whenever a new piece of information is available, whereas the model in[32]uses the data of past systems to estimate the model parameters and once estimated they are fixed. Only ^
l
iis updated with the new data0 20 40 60 80 100 120 140 160 180 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4
The CM point (hours)
The degradation path of gyro drift (
0/hour)
The actual observed path The predicted path
0 20 40 60 80 100 120 140 160 180 0 0.005 0.01 0.015 0.02 0.025 The CM point(hours) Estimated a 0 0 20 40 60 80 100 120 140 160 180 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1x 10 -3 The CM point(hours) Estimated P 0 0 20 40 60 80 100 120 140 160 180 0 0.2 0.4 0.6 0.8 1 1.2x 10 -3 The CM point(hours) Estimated Q 0 20 40 60 80 100 120 140 160 180 0 0.2 0.4 0.6 0.8 1 1.2 1.4x 10 -3 The CM point(hours) Estimated σ 2
Fig. 5. The updated parameters at monitoring points t0, t1,:::, t73.
0 20 40 60 80 100 120 140 160 180 200 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1 The RUL(hours)
Probability density function (PDF)
The 21th CM point The 41th CM point The 51th CM point The 61th CM point The 71th CM point The 73th CM point The actual RUL
in Wang’s model. Clearly, the PDFs of the RULs from both models can cover the actual RULs well as the obtained data are accumulated. However, it is observed fromFig. 8(a) that the PDFs of the estimated RULs are typically more dispersed when using Wang’s model.Fig. 8(b) shows the calculated MSE between the actual RUL and the estimated RUL at each CM point. We can observe that the MSEs using Wang’s model change irregularly at different sampling points with relatively large
0 20 40 60 80 100 120 140 160 180 0 20 40 60 80 100 120 140 160 180 200 RUL(hours) The CM point(hours)
The actual RUL The estimated RUL
Fig. 7. The estimated mean RUL and the actual RUL at monitoring points t0,t1,:::,t73.
0 5 10 15 20 25 165 170 175 180 0 0.05 0.1 0.15 0.2
The RUL (hours) The CM time (hours)
Probability density function (PDF) Our model Wang's model 0 20 40 60 80 100 120 140 160 180 0 0.5 1 1.5 2 2.5 3x 10 4 The CM point(hours) MSE i at each CM point Wang's model Our model
Fig. 8. Comparative results of our model with Wang’s model (a) the estimated PDFs of the RULs at the last six CM points, and (b) the MSEs of the RULs at all CM points (&–actual RUL).
fluctuations, particularly in the earlier stage of estimation. This implies that the estimated RUL PDFs by Wang’s model is sensitive to small changes in observations shown inFig. 4. This may also lead to completely different health management decisions at two consecutive CM points where the actual degradation observations may only have changed a little as seen
fromFig. 4. The reason for such behavior seen fromFig. 8(b) stems from the factor that the distribution pð
l
i9X0:iÞof theupdated drift coefficient is not considered and the estimated parameters are not recursively updated in [32]. From
Fig. 8(b), we observe that our model can make the MSEs about the actual RUL at each CM point less sensitive to small
changes. But most importantly our model can improve the accuracy of the estimated PDFs of the RULs with reduced variances, seeFig. 8. This reduction in variance is due to the use of the complete distribution of
l
i via the Bayesian ruleshown earlier. These comparisons reflect the superiority of our model to Wang’s model in the RUL estimation for the INS.
0 20 40 60 80 100 120 140 160 180 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4
The drift degradation data of the INS(
o/hour)
The CM time (hours) The actual degradation path
Gebraeel's model with random prior parameters Gebraeel's model with appropriate prior parameters Our model with random prior parameters
0 5 10 15 20 25 165 170 175 1800 0.05 0.1 0.15 0.2 0.25 0.3
The RUL (hours) The CM time (hours)
Probability density function (PDF)
Our model
Gebraeel's model with appropriate prior parameters
0 5 10 15 20 25 165 170 175 180 0 0.05 0.1 0.15 0.2 0.25 0.3
The RUL (hours) The CM time (hours)
Probability density function (PDF)
Our model
Gebraeel model with random prior parameters
Fig. 9. Comparative results of our model with Gebraeel’s model (a) the predicted degradation path, (b) RUL estimation using appropriate prior parameters, and (c) RUL estimation using random prior parameters (&–actual RUL).
Now, we further compare our model with Gebraeel’s model about their performance in the RUL estimation. The detailed implementation process of Gebraeel’s model can be found in[19]. It is noted that the model in[19]needs prior parameters as well our model. In order to compare the goodness of fit, the prior parameters of our model in the following comparison are selected at random at time zero, but we consider two cases for the prior parameters of Gebraeel’s model. One uses the appropriately selected parameters as the prior parameters and the other uses random prior parameters.Fig. 9
shows the comparative results.
Fig. 9(a) shows that the predicted degradation path. The MSEs between the actual data and the corresponding
predictions of the degradation path for our model with random prior parameters, Gebraeel’s model with random prior parameters, and Gebraeel’s model with appropriate prior parameters are 1.179E-4, 0.0026, and 7.9864E-4, respectively. Thus, our model has the best fitting for the degradation data. We also find that our model is robust with the choice of the prior parameters, but Gebraeel’s model is not. Fig. 9(b) illustrates several estimated PDFs of the RULs over time using Gebraeel’s model with appropriate prior parameters. It is clear that the range of PDFs of RULs covers the actual RULs, but the uncertainty in the estimated RUL of our model is less than that from Gebraeel’s model, particular for the last few sampling points, since our estimated PDFs of the RULs are sharper than the results of Gebraeel’s model. However, if the prior parameters in Gebraeel’s model are selected at random, the estimated RUL may be incorrect as illustrated inFig. 9(c) (the observed RULs are outside of the predicted RUL PDF ranges). In comparison, our model using the random selected prior parameters can produce reasonable estimates. This demonstrates the merit of our model for RUL estimation of a particular system. Therefore, if there are no sufficient historical degradation data at the beginning for selecting the prior parameters, our model may be more appropriate for RUL estimation.
As noted by Remark 2, the moments of the RUL distribution obtained by Gebraeel’s model do not exist. Therefore, we cannot calculate the MSE about the actual RUL defined in Eq.(31). But, to have a further look at the difference of the estimated RUL distributions between our exact result and Gebraeel’s approximate result, it is desired to compare the estimated reliability from both models. Here, we use FRi9X0:iðri9X0:iÞ ¼1FRi9X0:iðri9X0:iÞas the conditional reliability obtained from our exact approach at ti, while FR0
i9X0:iðri9X0:iÞ as the conditional reliability obtained from Gebraeel’s model by PrðR0
irri9X0:iÞ ¼Pr Xðr iþtiÞ Zw9X0:iat ti. The difference between them at the last CM point is illustrated inFig. 10.
It can be found that the difference between both results is significant. This is due in part to the many fluctuations existing in the degradation path of gyro’s drift so that using PrðL0
krlk9X1:kÞ ¼Pr XðlkþtkÞ Zw9X1:k
to approximate the RUL will lead to overestimating of the reliability, as discussed in Remark 1. Consequently, when the approximated result is applied to reliability-centered maintenance, the obtained decision may be far from the reality.
In sum, this practical case study demonstrates that our developed model can work well and efficiently. On the other hand, we verify that incorporating the observation history to date can improve the accuracy of the RUL estimation indeed. 5. Concluding remarks
In this paper, we present a Wiener-process-based degradation model with a recursive filter algorithm and Bayesian updating to estimate the PDF of the RUL. Based on the Wiener process, we construct a RUL estimation model which depends on the system CM data to date via recursively updating. The updating is done twice in that the drift coefficient is recursively updated via the state-space model and all parameters are updated via the EM algorithm, both at the time that a new piece of CM data becomes available. During the RUL estimation, we also consider the distribution in the estimated drift coefficient and accordingly an explicit RUL distribution is obtained based on the concept of FHT. The usefulness of the proposed model is demonstrated via a real-world example for gyros in an inertial navigation system. By comparing the
0 20 40 60 80 100 120 140 160 180 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 The RUL(hours) Co n d it io n a l r e li a b il it y a t th e la s t CM p o in t Our model Gebraeel's model
proposed model with existing models, we show that the proposed model can generate better results than existing models compared both in terms of the mean and variance of the estimated RUL and is robust with respect to the choice of the prior parameters.
Although the practical case studies show that our model is a simple but efficient method for RUL estimation, there are still several issues needed to be further studied. Firstly, in this paper, we only consider a Wiener process with a linear drift, but for complex systems in practice, a nonlinear model may be more appropriate. Secondly, we mainly consider the case that the failure is resulted by the gradual and continuous degradation in this paper. However, considering the sudden failure in degradation modeling and RUL estimation is interesting and of practical significance. Because the failure of complex engineering systems is often the result of gradual degradation and stresses generated either by the systems themselves or from external shocks. Such shocks frequently lead to the sudden failure. Therefore, considering both the gradual and sudden cases are necessary in future research. Thirdly, we only consider the case that the degradation process is observable in this paper. However, hidden or partially observable degradation process is frequently encountered in many engineering practices. This may encourage the research for RUL estimation subject to hidden degradation processes. The key to tackle this issue is to incorporate the estimation error or estimation uncertainty in RUL estimation due to the unobservability of the degradation state. In addition, we primarily discuss the issues associated with estimating the RUL. Much more decision-focused research under the prognostic information needs to be researched so that we can generate new insights on the effect of the estimated RUL upon decision making.
Acknowledgments
The authors would like to sincerely thank and acknowledge the support and constructive comments from the Editor and the anonymous reviewers. This work was partially supported by the National 973 Project under grants 2010CB731800 and 2009CB32602, the NSFC under grants 61028010, 61021063, 60931160440, 61174030, 71071097, 71231001, 61210012, and 61104223 the National Science Fund for Distinguished Young Scholars of China under grant 61025014 and the Fundamental Research Funds for the Central Universities of China.
Appendix A. Algorithm 1: Kalman filtering algorithm Step 1: Initialize ^
l
0¼a0, P0.Step 2: State estimation at time ti
Pi9i1¼Pi19i1þQ
Ki¼ ðtiti1Þ2Pi9i1þ
s
2ðtiti1Þ^
l
i¼ ^l
i1þPi9i1ðtiti1ÞKi1 xixi1 ^l
i1ðtiti1Þ
:
Step 3: Updating variance
Pi9i¼Pi9i1Pi9i1ðtiti1Þ2K1i Pi9i1:
Appendix B. Algorithm 2: Strong tracking filtering algorithm Step 1: Initialize ^
l
0¼a0, P0,a
,r
.Step 2: Calculating fading factor uðtiÞfrom orthogonality principle
V0ðtiÞ ¼
g
2ðt1Þ, i ¼ 1 rV0ðti1Þ þg2ðtiÞ
1 þr , i4 1
with
g
ðtiÞ ¼xixi1 ^l
i1ðtiti1Þ8 < :
BðtiÞ ¼V0ðtiÞQ ðtiti1Þ2
as
2ðtiti1Þ; CðtiÞ ¼Pi19i1ðtiti1Þ2; u0¼BðtiÞ=CðtiÞ:uðtiÞ ¼
u0, u0Z1 1, u0o1
(
Step 3: State estimation Pi9i1¼uðtiÞPi19i1þQ
Ki¼ ðtiti1Þ2Pi9i1þ
s
2ðtiti1Þ^
l
i¼ ^l
i1þPi9i1ðtiti1ÞKi1 xixi1 ^l
i1ðtiti1Þ
Step 4: Updating variance
Pi9i¼Pi9i1Pi9i1ðtiti1Þ2K1i Pi9i1:
In Algorithm 2,
a
Z1 andr
denote the softening factor and the forgetting factor, respectively, which can be selected heuristically.r
¼0:95 has been used in general[39,51].Appendix C. The implementation of EM algorithm for our model
Following the above-mentioned procedures for the EM algorithm, the joint log-likelihood function for our problem can be expressed as
‘ið
h
Þ ¼log pðX0:i9U
i,h
Þ þlog pðU
i9h
Þ¼log pð
l
09h
Þ þlogYi
j ¼ 0pð
l
j9l
j1,h
Þ þlogYi
j ¼ 0pðxj9
l
j1,h
Þ: ðC 1ÞFrom Eqs.(7) and (8), we directly have
l
j9l
j1Nðl
j1,Q Þ, xj9l
j1N x j1þl
j1ðtjtj1Þ,s
2ðtjtj1Þandl
0Nða0,P0Þ.Using Eq.(C-1)and ignoring the constant terms, the joint log-likelihood function can be formulated as 2‘ið
h
Þ ¼ logP0ðl
0a0Þ2=P0 Xi j ¼ 1 log Q þðl
jl
j1Þ 2=Q Xi j ¼ 1 logs
2þ x jxj1l
j1ðtjtj1Þ 2 =s
2ðt jtj1Þ : ðC 2ÞTo calculate the conditional expectation ‘ð
h
9 ^h
ðkÞi Þdefined in Eq.(29), we have
2‘ð
h
9 ^h
ðkÞi Þ ¼EU i9X1:i, ^h ðkÞ i 2‘iðh
Þ ½ ¼E Ui9X0:i, ^h ðkÞ i logP0ðl
0a0Þ2=P0 Xi j ¼ 1 log Q þ ðl
jl
j1Þ 2=Q h Xi j ¼ 1 logs
2þ x jxj1l
j1ðtjtj1Þ 2 =s
2ðt jtj1Þ i : ðC 3ÞClearly, to calculate the expectation of this expression requires to obtain E Ui9X0:i, ^h ðkÞ i ð
l
jÞ, E Ui9X0:i, ^h ðkÞ i ðl
j2Þand E Ui9X0:i, ^h ðkÞ i ðl
jl
j1Þ,which are the conditional expectations with respect to
U
i, given the observed history X0:i. In this paper, we use the Rauch–Tung–Striebel (RTS) smoother to provide an optimal estimation of E Ui9X0:i, ^h ðkÞ i ð
l
jÞ, E Ui9X0:i, ^h ðkÞ i ðl
j2Þ and E Ui9X0:i, ^h ðkÞ i ðl
jl
j1Þ,summarized as Algorithm 3,[47,54]. In Algorithm 3, we define Mj9i¼Covð
l
j,l
j19X0:iÞ.Algorithm 3. RTS smoothing algorithm
Step 1: Forwards iteration by Algorithm 1 or Algorithm 2 Step 2: Backwards iteration
Sj¼Pj9jP 1 j þ 19j
^
l
j9i¼ ^l
jþSjð ^l
j þ 19i ^l
j þ 19jÞ ¼ ^l
jþSjð ^l
j þ 19i ^l
jÞPj9i¼Pj9jþSj2ðPj þ 19iPj þ 19jÞ
Step 3: Initialize
Mi9i¼ð1Kiðtiti1ÞÞPi19i1
Step 4: Backwards iteration for smoothing covariance Mj9i¼Pj9jSj1þSjðMj þ 19iPj9jÞSj1
From the RTS smoothing algorithm, we can obtain the conditional expectations of E Ui9X0:i, ^h ðkÞ i ð
l
jÞ, E Ui9X0:i, ^h ðkÞ i ðl
j2Þ and E Ui9X0:i, ^h ðkÞ ið
l
jl
j1Þin the following lemma.Lemma 3. Conditional on current estimated parameter ^
h
ðkÞi and observations history X0:i, the values of E Ui9X0:i, ^h ðkÞ i ðl
jÞ, E Ui9X0:i, ^h ðkÞ i ðl
j 2 Þand E Ui9X0:i, ^h ðkÞ i ðl
jl
j1Þare given by E Ui9X0:i, ^h ðkÞ i ðl
jÞ ¼ ^l
j9i, E Ui9X0:i, ^h ðkÞ i ðl
j2Þ ¼ ^l
j9i 2 þPj9i, E Ui9X0:i, ^h ðkÞ iProof: These equations are the direct results of applying the properties of variance-covariance and RTS smoothing algorithm and the proof is hence omitted.
From Eq.(C-3)and Lemma 3, ‘ð
h
9 ^h
ðkÞi Þcan be written as 2‘ð
h
9 ^h
ðkÞ i Þ ¼EU i9X0:i, ^h ðkÞ i 2‘iðh
Þ ½ ¼E Ui9X0:i, ^h ðkÞ i logP0ðl
0a0Þ2=P0 Xi j ¼ 1 log Q þ ðl
jl
j1Þ 2=Q h Xi j ¼ 1 logs
2 þxjxj1l
j1ðtjtj1Þ 2 =s
2ðt jtj1Þ i¼ logP0ðC09i2a09ia0þa02Þ=P0
Xi
j ¼ 1 log Q þ ðCj9i2Cj,j19iþCj19iÞ=Q
Xi j ¼ 1 log
s
2þ x jxj1 2 2 ^l
j19ixjxj1ðtjtj1Þ þ ðtjtj1Þ2Cj19i =s
2ðt jtj1Þ : ðC 5ÞThis completes the E-step and in the following we handle the M-step. After obtaining ‘ð
h
9 ^h
ðkÞi Þ, the results of estimated parameter ^
h
ðk þ 1Þi in the (kþ1)th step can be summarized in the
following theorem.
Theorem 2. ^
h
ðk þ 1Þi , by maximizing ‘ðh
9 ^h
ðkÞi Þ, is given by
a0iðk þ 1Þ¼a09i,
P0iðk þ 1Þ¼C09ia09i2¼P09i,
Qiðk þ 1Þ¼
1 i
Xi
j ¼ 1ðCj9i2Cj,j19iþCj19iÞ,
s
2 i ðk þ 1Þ ¼1 i Xi j ¼ 1 xjxj1 2 2 ^l
j19i xjxj1 ðtjtj1Þ þ ðtjtj1Þ2Cj19i tjtj1 0 @ 1 A: ðC 6Þ with Cj9i¼EU i9X0:i, ^h ðkÞ i ðl
j 2Þ,a09i¼ ^
l
09i,Cj,j19i¼EU i9X0:i, ^hðkÞ i
ð
l
jl
j1Þand ^h
ðk þ 1Þi is uniquely determined and located at the maximum.
Proof: See Appendix D.
The preceding derivations are summarized as Algorithm 4 in Appendix E via a complete specification of the EM-based algorithm for estimating the parameters in
h
when new observation xiis available.Appendix D. The proof of Theorem 2
The unknown parameters ^
h
ðk þ 1Þi can be obtained by maximizing the ‘ðh
9 ^h
ðkÞi Þwith
h
. Therefore, the central goal in thisstep is to find the maximum of function ‘ð
h
9 ^h
ðkÞi Þon
h
, i.e. ^h
ðk þ 1Þi ¼argmax h EUi9X0:i, ^h ðkÞ i ‘iðh
Þ ¼argmax h ‘ðh
9 ^h
ðkÞ i Þ ðD 1ÞFrom Eq. (C-5), taking @‘ð
h
9 ^h
ðkÞi Þ=@h
, we obtain the solution by @‘ðh
9 ^h
ðkÞi Þ=@
h
¼0, which leads to Eq. (36), and taking@2‘ð
h
9 ^h
ðkÞi Þ=@
h
@h
T, the following is obtained,
@2‘ð
h
9 ^h
ðkÞ i Þ @h
@h
T ¼ 1 2 2=P0 2ða0a09iÞ=P 2 0 0 0 2ða0a09iÞ=P 2 0 1=P 202ðC09i2a09ia0þa02Þ=P30 0 0
0 0 i=Q22
c
=Q3 0 0 0 0 i=s
42f
=s
6 2 6 6 6 6 6 4 3 7 7 7 7 7 5 , ðD 2Þ withc
¼Pij ¼ 1ðCj9i2Cj,j19iþCj19iÞ:f
¼Pij ¼ 1 ðxjxj1Þ2
2 ^lj19iðxjxj1Þðtjtj1Þ þ ðtjtj1Þ2Cj19i tjtj1
ðD 3Þ
We show that the matrix in (D-2) is negative definite at
h
¼ ^h
ðk þ 1Þi , by calculating the order principal minor determinantas follows,
D
1¼ 1 P0 ,D
2¼ 1 2P0 1 P2 02ðC09i2a09ia0þa0
2Þ P3 0 ! ða0a09iÞ 2 P4 0 ,
D
3¼ 1 2 i Q2 2c
Q3D
2,D
4¼ 1 2 is
4 2f
s
6D
3:Then, at
h
¼ ^h
ðk þ 1Þi , the following results are obtained,D
19 h¼ ^hðk þ 1Þi ¼ 1 P09i o0,D
29 h¼ ^hðk þ 1Þi ¼ 1 2P09i 1 P09i22ðC09i2a09ia09iþa09i
2Þ
P09i3
!
ða09ia09i Þ2 P09i4 ¼ 1 2P09i 1 P09i2 2 P09i2 ! ¼ 1 2P09i3 40,
D
39 h¼ ^hðk þ 1Þi ¼1 2 i3c
2 2i3c
c
3 !D
29 h¼ ^hðk þ 1Þi ¼ i 3 2c
2D
2 9 h¼ ^hðk þ 1Þi o0,D
49 h¼ ^hðk þ 1Þi ¼1 2 is
4 2f
s
6D
39 h¼ ^hðk þ 1Þi ¼1 2 i3f
2 2i3f
f
3 !D
39 h¼ ^hðk þ 1Þi ¼ i 3 2f
2D
3 9 h¼ ^hðk þ 1Þi 40:This completes the proof that the matrix in (D-2) is negative definite at
h
¼ ^h
ðk þ 1Þi , verifying that ^h
ðk þ 1Þi is located at a
maximum. In addition, ^
h
ðk þ 1Þi is unique since ^h
ðk þ 1Þi is the only solution satisfying @‘ð
h
9 ^h
ðkÞ i Þ=@h
¼0.Appendix E. Algorithm 4: EM algorithm for parameter estimation 1) Initialization Initialize the initial parameters in ^
h
ð0Þi .2) E-Step Calculate the expectation quantities defined in Eq.(C-3)using Algorithm 3 and Eq.(C-4), with state-space model of Eqs.(7) and (8)parameterized by ^
h
ðkÞi .3)
M-Step Maximize ‘ð
h
9 ^h
ðkÞi Þusing Eq.(C-6)to obtain the updated parameter estimates by Theorem 2.
4) Test convergence Test the convergence of the algorithm. If converged, then stop. Otherwise set k ¼ k þ1, go to Step 2 and repeat.
References
[1] M. Pecht, Prognostics and Health Management of Electronics, John Wiley, New Jersey, 2008.
[2] X.S. Si, W. Wang, C.H. Hu, D.H. Zhou, Remaining useful life estimation–a review on the statistical data driven approaches, Eur. J. Oper. Res. 213 (2011) 1–14.
[3] F. Camci, R.B. Chinnam, Health state estimation and prognostics in machining processes, IEEE Trans. Autom. Sci. Eng. 7 (2010) 581–597. [4] M.J. Carr, W. Wang, Modeling failure modes for residual life prediction using stochastic filtering theory, IEEE Trans. Rel. 59 (2010) 346–355. [5] M. Dong, D. He, Hidden semi-Markov model-based methodology for multi-sensor equipment health diagnosis and prognosis, Eur. J. Oper. Res. 178
(2007) 858–878.
[6] Y. Peng, M. Dong, A prognosis method using age-dependent hidden semi-Markov model for equipment health prediction, Mech. Syst. Signal Process. 25 (2011) 237–252.
[7] J. Sikorska, M. Hodkiewicz, L. Ma, Prognostic modelling options for remaining useful life estimation by industry, Mech. Syst. Signal Process. 25 (2011) 1803–1836.
[8] M.I. Mazhar, S. Kara, H. Kaebernick, Remaining life estimation of used components in consumer products life cycle data analysis by Weibull and artificial neural networks, J. Oper. Manag. 25 (2007) 1184–1193.
[9] W. Wang, A two-stage prognosis model in condition based maintenance, Eur. J. Oper. Res. 182 (2007) 1177–1187.
[10] M.Y. You, L. Li, G. Meng, J. Ni, Statistically planned and individual improved predictive maintenance management for continuously monitored degrading systems, IEEE Trans. Rel. 59 (2010) 744–753.
[11] A.K.S. Jardine, D. Lin, D. Banjevic, A review on machinery diagnostics and prognostics implementing condition-based maintenance, Mech. Syst. Signal Process. 20 (2006) 1483–1510.
[12] M.-L.T. Lee, G.A. Whitmore, Threshold regression for survival analysis: modeling event times by a stochastic process reaching a boundary, Stat. Sci. 21 (2006) 501–513.
[13] L.A. Escobar, W.Q. Meeker, A review of accelerated test models, Stat. Sci. 21 (2006) 552–577.
[14] V.R. Joseph, I.T. Yu, Reliability improvement experiments with degradation data, IEEE Trans. Rel. 55 (2006) 149–157. [15] C.J. Lu, W.Q. Meeker, Using degradation measures to estimate a time-to-failure distribution, Technometrics 35 (1993) 161–174.
[16] J.I. Park, S.J. Bae, Direct prediction methods on lifetime distribution of organic light-emitting diodes from accelerated degradation tests, IEEE Trans. Rel. 59 (2010) 74–90.
[17] M.D. Pandey, X.-X. Yuan, J.M. van Noortwijk, The influence of temporal uncertainty of deterioration on life-cycle management of structures, Struct. Infrastruct. Eng. 5 (2009) 145–156.
[18] S. Bloch-Mecier, A preventive maintenance policy with sequential checking procedure for a Markov deteriorating system, Eur. J. Oper. Res. 147 (2002) 548–576.
[19] N. Gebraeel, M.A. Lawley, R. Li, J.K. Ryan, Residual-life distributions from component degradation signals: a Bayesian approach, IIE Trans. 37 (2005) 543–557.
[20] M.C. Delia, P.O. Rafael, A maintenance model with failures and inspection following Markovian arrival processes and two repair modes, Eur. J. Oper. Res. 186 (2008) 694–707.
[21] H.T. Liao, E.A. Elsayed, Reliability inference for field conditions from accelerated degradation testing, Nav. Res. Log. 53 (2006) 576–587. [22] C.T. Barker, M.J. Newby, Optimal non-periodic inspection for a multivariate degradation model, Reliab. Eng. Sys. Safe. 94 (2009) 33–43. [23] S.T. Tseng, J. Tang, L.H. Ku, Determination of optimal burn-in parameters and residual life for highly reliable products, Nav. Res. Log. 50 (2003) 1–14. [24] G.A. Whitmore, F. Schenkelberg, Modelling accelerated degradation data using Wiener diffusion with a time scale transformation, Lifetime Data
Anal. 3 (1997) 27–45.