• No results found

Mechanical Systems and Signal Processing

N/A
N/A
Protected

Academic year: 2021

Share "Mechanical Systems and Signal Processing"

Copied!
19
0
0

Loading.... (view fulltext now)

Full text

(1)

A Wiener-process-based degradation model with a recursive filter

algorithm for remaining useful life estimation

Xiao-Sheng Si

a,b

, Wenbin Wang

c,n

, Chang-Hua Hu

a

, Mao-Yin Chen

b

, Dong-Hua Zhou

b,n

a

Department of Automation, Xi’an Institute of High-Tech, Xi’an, Shaanxi 710025, China b

Department of Automation, TNList, Tsinghua University, Beijing 100084, China c

Dongling School of Ecomonics and Management, University of Science and Technology Beijing, Beijing 100083, China

a r t i c l e

i n f o

Article history:

Received 2 February 2012 Received in revised form 27 July 2012

Accepted 10 August 2012 Available online 1 September 2012 Keywords:

Reliability

Remaining useful life Wiener process Recursive filter

Expectation maximization

a b s t r a c t

Remaining useful life estimation (RUL) is an essential part in prognostics and health management. This paper addresses the problem of estimating the RUL from the observed degradation data. A Wiener-process-based degradation model with a recur-sive filter algorithm is developed to achieve the aim. A novel contribution made in this paper is the use of both a recursive filter to update the drift coefficient in the Wiener process and the expectation maximization (EM) algorithm to update all other para-meters. Both updating are done at the time that a new piece of degradation data becomes available. This makes the model depend on the observed degradation data history, which the conventional Wiener-process-based models did not consider. Another contribution is to take into account the distribution in the drift coefficient when updating, rather than using a point estimate as an approximation. An exact RUL distribution considering the distribution of the drift coefficient is obtained based on the concept of the first hitting time. A practical case study for gyros in an inertial navigation system is provided to substantiate the superiority of the proposed model compared with competing models reported in the literature. The results show that our developed model can provide better RUL estimation accuracy.

&2012 Elsevier Ltd. All rights reserved.

1. Introduction

Enhancing safety, efficiency, availability, and effectiveness of industrial and military systems through prognostics and health management (PHM) paradigm has gained momentum over the last decade[1,2]. PHM is a systematic approach that is used to evaluate the reliability of a system in its actual life-cycle conditions, predict failure progression, and mitigate operating risks via management actions. There are two parts in PHM, namely, ‘‘prognostics’’ and ‘‘health management’’. Prognostics is often characterized by estimating the remaining useful life (RUL) of a system using available condition monitoring (CM) information[3–7]. Once such prognosis is available, appropriate health manage-ment actions such as repair, replacemanage-ment, and logistic support can be performed to achieve the required system’s operational objectives[8–10]. In PHM, the term ‘‘RUL estimation’’ often implies to find the probability density function (PDF) of the RUL or the mean of the RUL[11], but the emphasis is often placed more on estimating the PDF of the

Contents lists available atSciVerse ScienceDirect

journal homepage:www.elsevier.com/locate/ymssp

Mechanical Systems and Signal Processing

0888-3270/$ - see front matter & 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.ymssp.2012.08.016

n

Corresponding authors. Tel.: þ 86 010 62794461; fax: þ86 010 62786911.

(2)

RUL than the mean RUL since a PDF can characterize the uncertainty associated with the RUL and is hence more informative for management decision making.

The current RUL estimation approaches can be broadly classified as physics of failure, data driven and fusion methods. Physics of failure approaches rely on the physics of underlying failure mechanisms. Data driven approaches achieve RUL estimation via data fitting mainly including machine learning and statistics based approaches. The fusion approaches are the combination of the physics of failure and data driven approaches. However, for complex or large-scale engineering systems, it is typically difficult to obtain the physical failure mechanisms in advance or cost-expensive and time-consuming to capture the physics of failure by experiments. In contrast, data-driven approaches attempt to derive models directly from collected degradation data or life data, and thus are more appealing and have gained much attention in recent years.

Statistics-based data driven methods for RUL estimation can be classified into the models based on the indirectly observed state processes and the models based on the directly observed state processes[2]. The former models considered the data partially indicating the underlying state of the system, and assumed that the available CM data were stochastically related to the underlying health state. In this case, lifetime data must be available to establish the relationship between the CM data and failure. The latter models utilized the observed degradation data directly to describe the underlying state of the system. Therefore, the RUL is defined as the time to reach the failure threshold of the monitored degradation data for the first time, namely the first hitting time (FHT)[12]. It is noted that the observed degradation data with a threshold are easier to manipulate and implement in practice when lifetime data are scarce[13,14].

It is well recognized that degradation process is uncertain over time and thus stochastic models are frequently used to characterize the evolution of degradation process. In literature, random effect regression (RER) models and stochastic process (SP) models are two kinds of most commonly used stochastic models. Lu and Meeker in[15]first presented the RER model to characterize the degradation of a population of units, and many extensions appeared later[2,16]. However, the degradation modeling paradigm in RER models is based on the fact that a population of ‘‘identical’’ systems (or devices) has a common degradation form. However, individual systems may exhibit different degradation rates, hence the different failure times. On the other side, Pandey et al. in[17]identified that the temporal uncertainty of the degradation process was not taken into account in RER models and thus argued that SP models could remedy it well. They also showed the advantages of SP models over RER models in condition based maintenance. SP models such as Markov chain, Gamma processes, and Wiener processes have been widely used to model the degradation process[5,18–21]. A Wiener-process-based degradation model is one of statistics-Wiener-process-based data driven models, which can characterize a non-monotonic degradation process, and provide a good description of system’s behavior due to an increased or reduced intensity of the use[22]. This type of models has been applied to model the degradation process and to estimate the RUL of a variety of industrial assets, such as rotating element bearings [19], LED lamps [23], self-regulating heating cables [24], laser generator[25], bridge beams[26], and fatigue crack dynamics[27]. Therefore, in this paper we focus on RUL estimation based on a Wiener process where the degradation process can be observed directly and a failure threshold of degradation is available. It is known that the selection of the failure threshold is an important problem in practice. However, such threshold is usually set based on either engineering domain knowledge or accepted industrial standards. For example, the ISO 2372 and ISO 10816 are frequently adopted for defining acceptable vibration threshold levels. It is therefore an issue beyond the scope of this paper. In this work, we assume that the failure threshold is known a priori.

A Wiener-process-based model has a drift term characterized by its drift coefficient and a noise term by Brownian motion. It has been widely used to model degradation processes which can be observed directly, and conduct lifetime analysis. Tseng et al. in[23]used a Wiener process to determine the lifetime for the light intensity of LED lamps of contact image scanners. As an extension, Tseng and Peng proposed an integrated Wiener process to model the cumulative degradation path of a product’s quality characteristics[28]. Joseph and Yu used a Wiener process for degradation modeling and reliability improvement[14]. Other recent extensions in lifetime estimation can be found in[29–31]. However, the degradation modeling paradigm for lifetime analysis in these conventional Wiener-process-based models is based on an assumption that the estimated PDF of RUL depends only on the currently observed degradation data, which is a strong Markovian assumption.

To relax this assumption, Gebraeel et al. in [19]presented an exponential degradation model for rotating element bearings based on a Wiener process, but incorporated some new and important improvements for RUL estimation. Their model established a linkage between the past and current degradation data of the same system by a Bayesian mechanism. However, it is worth noting that the Brownian motion in the Wiener process was just used as an error term in their models and the availability of the explicit distribution of the FHT from the Wiener process was not utilized. Instead, they directly estimated the RUL distribution using an implicit monotonic assumption. It is well-known that Wiener processes are non-monotonic. As such, the resulted RUL estimates in[19]are approximations. In addition, we note that the stochastic coefficients reflecting the individual-to-individual variability in[19]followed some prior distributions, but no elaborated method was presented to select the parameters in the prior distributions. Typically, several systems’ historical degradation data of the same type are required to determine the prior parameters. However, such historical degradation data of many systems are not always available in practice, particularly for newly commissioned systems. It is shown in Section IV that the inappropriate selection of the prior parameters can result in inaccurate estimation of the degradation and the RUL. Wang et al. in[32]recently proposed a Wiener-process-based model which used all past degradation data to date of the system for RUL estimation. Their model explicitly used the FHT from the Wiener process for RUL estimation.

(3)

However, we note that the model in[32]also required the data of many same systems for parameter estimation, and the distribution of the updated drift coefficient was not considered. Our results reveal that considering such distribution can lead to uncertainty reduction in the estimated RUL. One final observation from the existing literature of Wiener-process-based RUL estimation models is that all estimated parameters are not updated in line with newly observed data.

From the above review of related researches, we observe that there are three issues remaining to be solved when applying Wiener process for RUL estimation. The first is how to estimate the model parameters from an individual system’s data without the need of past data from many systems. The second is to consider the distribution in the estimated drift coefficient, which is a critical parameter having impact on both the mean and variance of the RUL. The third is to update model parameters based on newly observed degradation data. As we know the system’s life is heavily influenced by the way it is operated, maintained and the environment where it has been operating. The consideration of the above three issues will make our model tailored to an individual system through its actual monitoring data which relate to its operational and environmental characteristics.

In this paper, we address the above issues by utilizing a Wiener-process-based model with a recursive filter algorithm for RUL estimation. We use two techniques for the updating of the RUL estimation. A state-space model is used to recursively update the drift coefficient and an expectation maximization (EM) algorithm is used to re-estimate all unknown parameters at each time when new data are available. The new contributions of this paper are summarized as follows: (1) Different from all the previous works, our model estimates an individual system’s RUL based on its entire monitoring information to date through a recursive filter and an EM algorithm, and does not require historical degradation data of other systems in a population; (2) Unlike the work of Wang et al. in[32]where a recursive filter was also used, the drift coefficient is treated as a random variable to incorporate its distribution in estimation; (3) Our model is also different from the approximated results in[19,33] in that our result on the PDF of the RUL is exact in the sense of the FHT, and we also show that our result can ensure that the moments of the RUL exist, but this is not the case for the approximated results in[19,33]; and (4) We apply the proposed model to estimate the RUL of gyros in an inertial navigation system used in weapon systems as a case application.

The remainder parts are organized as follows. In Section II, we develop a Wiener-process-based degradation model and obtain the distribution of the RUL. In Section III, we discuss the parameter estimation algorithm in detail. Section IV provides a case study to illustrate the application and usefulness of the developed model. Section V draws the main conclusions.

2. Wiener-process-based degradation modeling and RUL estimation Notation used in this paper is summarized as follows.

ti Time of the ith CM point (can be irregularly spaced)

XðtÞ Random variable representing the degradation at time t xi Degradation observation at ti

X0:i¼ fx0,x1,x2,:::,xig History of degradation observations for the system to ti

U

l

0,

l

1,. . .,

l

i1

 

History of drift coefficients to ti

EðUÞ Expectation operator BðtÞ Standard Brownian motion

l

Drift coefficient

l

i Drift coefficient at ti

l

0 Initial state which follows Nða0,P0Þ

s

Diffusion coefficient

Z

Noise of

l

i which follows Nð0,Q Þ

w Failure threshold ^

l

i Estimated

l

i conditional on X0:i, denoted by Eð

l

i9X0:iÞ

^

l

j9i Smoothed

l

jconditional on X0:i, denoted by Eð

l

j9X0:iÞ

Pi9i Variance of estimated

l

i, denoted by Varð

l

i9X0:iÞ

Pi9i1 Variance of estimated

l

i at ti1, denoted by Varð

l

i9X0:i1Þ

Ki Filter gain

Ri RUL at tiwith its realization, ri

T Lifetime

h

Parameter column vector, denoted by

h

¼ ½a0,P0,Q ,

s

2T

^

h

i Maximum likelihood estimate (MLE) of

h

based on X0:i

^

h

i ðkÞ

Estimated

h

in the kth step based on X0:iin the EM algorithm

Lið

h

Þ Log-likelihood function for observed events, x0,x1,x2,:::,xi

‘ið

h

Þ Joint log-likelihood function for observed events, X0:iand

U

i

‘ð

h

9 ^

h

ðkÞi Þ Expectation of ‘ið

h

Þconditional on ^

h

ðkÞ i and X0:i

D

n Order principal minor determinant of the second derivation of ‘ð

h

9 ^

h

ðkÞ

i Þ, with n ¼ 1,2,3,4

F

ðUÞ Standard normal CDF

(4)

2.1. An outline of Wiener-process-based degradation model for lifetime analysis

In this section we briefly present the conventional Wiener-process-based degradation model for lifetime analysis. A Wiener process is typically used for modeling degradation processes where the degradation increases linearly in time with random noise. The rate of degradation is characterized by the drift coefficient. The wear of break pads on automotive wheels is a practical example. Christer and Wang in[34]modeled the wear of break pads as a linear function of time where the thickness of the break pad decreases linearly in time with a Gaussian noise.

In general, a Wiener-process-based degradation model can be represented as,

XðtÞ ¼

l

t þ

s

BðtÞ, ð1Þ

where

l

is the drift coefficient,

s

40 is the diffusion coefficient, and BðtÞ is the standard Brownian motion representing the stochastic dynamics of the degradation process.

In physics, a Wiener process aims at modeling the movement of small particles in fluids and air with tiny fluctuations. A characteristic feature of this process in the context of reliability is that the plant’s degradation can increase or decrease gradually and accumulatively over time. The small increase or decrease in degradation over a small time interval behaves similarly to the random walk of small particles in fluids and air. Therefore, this type of stochastic processes has been widely used to characterize the path of degradation processes where successive fluctuations in degradation can be observed, such as the degradation observations of rotating element bearings[19], LED lamps[23], self-regulating heating cables[24], laser generator[25], bridge beams[26]and other examples in[14]and[30–35] and our case of gyros’ drifting. Modeling a stochastic degradation process as a Wiener process implies that the mean degradation path is a linear function of time, i.e. E XðtÞ½  ¼

l

t. Therefore, the drift parameter

l

is closely related with the progression of the degradation. In addition, we have the variance of the degradation process var XðtÞ½  ¼

s

2t, which represents the uncertainty of the

degradation at time t.

For in service lifetime estimation at time ti with the obtained degradation observation xi, we can use,

XðtÞ ¼ xiþ

l

ðttiÞ þ

s

ðBðtÞBðtiÞÞ

¼xiþ

l

ðttiÞ þ

s

BðttiÞ, f or t 4 ti ð2Þ

At time ti, we assume xiow, otherwise the degradation has crossed w and the system would have failed as defined.

Although the above model setting is the same, there are two different ways to relate the degradation XðtÞ to lifetime T at ti

in the literature. The first one is that the lifetime is directly defined as T ¼ t : XðtÞ Z w9xi

 

, and then the lifetime distribution can be represented by FT9xiðt9xiÞ ¼PrðXðtÞ Z w9xiÞ, such as[19,31,33]. To see this, using T ¼ t : XðtÞ Z w9xi

 

, the PDF and the cumulative density function (CDF) of lifetime T at time ti can be directly obtained as,

fT9xiðt9xiÞ ¼ 1

s

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi2

p

ðttiÞ p exp ðwxi

l

ðttiÞÞ 2 2

s

2ðtt iÞ ! ð3Þ FT9xiðt9xiÞ ¼1

F

wxi

l

ðttiÞ

s

ffiffiffiffiffiffiffiffiffitti p   , ð4Þ

where

F

ðUÞdenotes the standard normal CDF.

Another one defines the lifetime based on the concept of the FHT as T ¼ inf t : XðtÞ Z w9x i, such as[21,29,30,32,35].

Therefore, using the FHT, the following results can be obtained[36], fT9xiðt9xiÞ ¼ wxi ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2

p

ðttiÞ3

s

2 q exp ðwxi

l

ðttiÞÞ 2 2

s

2ðtt iÞ ! , ð5Þ FT9xiðt9xiÞ ¼1

F

wxi

l

ðttiÞ

s

ffiffiffiffiffiffiffiffiffitti p   þexp 2

l

ðwxiÞ

s

2  

F

ðwxiÞ

l

ðttiÞ

s

ffiffiffiffiffiffiffiffiffitti p   , ð6Þ

with the mean E T9xi



¼tiþ ðwxiÞ=

l



and variance var T9xi



¼ ðwxiÞ

s

2=

l

3

h

. This reflects the relationship among the parameters used in the model, the current degradation measurement and the estimated future life of the plant modeled. Particularly,

l

is critical for both the mean and variance of the estimated lifetime.

Clearly, the above two definitions are different from each other and also lead to different lifetime estimations. As noted by Park and Bae in[16], T ¼ t : XðtÞ Zw9x icompletely ignores the possible hitting events within interval ðti, tÞ and thus is

only a crude approximation to T ¼ inf t : XðtÞ Zw9xi

 

when the degradation fluctuations are large. In addition, the CDF using the FHT as Eq.(6)is greater than the approximation by Eq.(4). For safety-critical systems, using the approximated result as Eq.(4)for maintenance scheduling may lead to under-maintenance because of the lower risk of failure estimated by Eq.(4). As such, it is necessary to consider the FHT as the lifetime, which is exact if the failure is defined as the FHT. We note that Eq.(6)uses only the current degradation data, but not its history before ti. However, as we have discussed,

this is a strong Markovian assumption and ideally the future FHT should depend on the path that the degradation has involved to date. For example, inFig. 1, at the same level of xi, case (a) would be expected to fail faster than case (b), but

(5)

Consequently, it is desired to utilize the degradation data to date for evaluating the RUL of the degraded system. It is expected that utilizing the degradation data to date can make the RUL estimation sharper and more tailored to an individual system than only using the current data. This is our main focus in the remaining parts of this paper.

2.2. Wiener-process-based degradation modeling

Now we address the issues discussed in the introduction. Since some variants introduced in [19] can be easily transformed into Eq.(1)by logarithmic transformation, we only focus on Eq.(1), which is used to describe the evolution of the monitored degradation variable over time in this paper.

To incorporate the history of the observations and to maintain at the same time the nice property of the Wiener process, we consider an updating procedure for coefficient

l

by a random walk model

l

l

i1þ

Z

over time where

Z

Nð0,Q Þ. Thus the drift coefficient

l

evolves as a time-dependent random variable with a distribution, conditional on

l

i1. In fact, the diffusion coefficient,

s

, can also be made time-dependent. However the structure of Eq.(1)does not allow

us to use the state space model shown later, and therefore a general filter has to be used, which is computationally difficult. There is also a practical reason why we are only interested in making

l

time-varying. It is known from Eq.(1)and the discussion in Section II-A that the mean degradation and the progression of the degradation are governed by the drift coefficient

l

and the time while the diffusion coefficient

s

controls in part the uncertainty in the degradation process. The trend in degradation is determined by the drift coefficient while the diffusion coefficient only influences the noise, which can be considered to be constant. A similar idea can also be found in statistical process control literature, in which it is frequently assumed that the process mean will change but the variance will be constant when the process shifts from ‘‘in control’’ to ‘‘out of control’’ [37]. Motivated by the state space model [38], the degradation equation can be reconstructed via a linear state-space model as,

l

l

i1þ

Z

, ð7Þ

xi¼xi1þ

l

i1ðtiti1Þ þ

se

i, ð8Þ

where t0¼0, x0¼0, and

e

iNð0,titi1Þ. The use of titi1 as the variance of

e

iis required by the property of Brownian

motion. Eq.(7)is called the system equation, while Eq.(8)is the observation equation [38]. The reason to use a linear system equation is not only because of its simplicity. We can use a nonlinear system equation for

l

i, but we expect that the

gain obtained will be minor, but at the expense of a substantially long computation time. There is also a practical problem as what form of nonlinearity we should use because

l

iis not observable. Since the system equation is to model the change

of the drift coefficient over a sampling interval which is not long, we would expect that

l

ishould be around

l

i1adjusted

by the noise term.

We assume that the initial drift coefficient

l

0follows a normal distribution with mean a0and variance P0as required by

the state space model. The drift coefficient is considered as a hidden ‘‘state’’ and can be estimated from the observations up to ti, denoted by X0:i¼ fx0, x1, x2,:::, xig. As such, this model establishes the linkage between the drift coefficient and the

observation history up to ti. In Eq.(7),

l

i follows a distribution which can be estimated by a recursive filter once new

observation xiis available at ti. We denote its mean by ^

l

i¼Eð

l

i9X0:iÞand its variance by Pi9i¼Varð

l

i9X0:iÞ.

In order to compute ^

l

iand Pi9i, we need to know the PDF of the

l

igiven X0:i, denoted by pð

l

i9X0:iÞ. Recursion solution of

l

i9X0:iÞcan be computed from pð

l

i19X0:i1Þby the well-known Bayesian rule as follows,

l

i9X0:iÞ ¼

Z

l

i9

l

i1Þpð

l

i19X0:iÞd

l

i1¼

R

l

i9

l

i1Þpðxi9

l

i1,X0:i1Þpð

l

i19X0:i1Þd

l

i1

pðxi9X0:i1Þ:

ð9Þ It has been well established that if Eqs.(7) and (8)are used, Eq.(9)is Gaussian with mean ^

l

iand variance Pi9i, which can

be computed by the Kalman filter[38]. As a result, the entire history is captured via recursively updating the estimate of

l

i,

which is the advantage of the state space model. The recursive estimations for ^

l

i and Pi9i using Kalman filtering are

summarized as Algorithm 1 inAppendix A.

In literature, the Kalman filter has been successfully applied when system states (here referred the drift coefficient as a state) and observations evolve in a smooth and gradually changing way. However, degradation sometimes may have jumps or sudden changes[19]. Here, we introduce an algorithm to deal with this by a strong tracking filter (STF)[39]. STF is also Kalman filter-based, but adjusts the prediction variance Pi9i1so that it is sensitive to the prediction error,

xi xi

w w

Degradation Degradation

Time Time

(6)

xixi1 ^

l

i1ðtiti1Þ, and then the filter gain Ki is sensitive to the change of the system state. The details about the STF

algorithm are summarized inAppendix Bas Algorithm 2.

Based on Eqs.(7), (8), and (9), the PDF of

l

iconditional on X0:iis,

l

i9X0:iÞ ¼ 1 ffiffiffiffiffiffiffiffiffiffiffiffiffi 2

p

Pi9i q exp 

l

i ^

l

i 2 2Pi9i , ð10Þ

where the dependence between

l

i and X0:iis contained in ^

l

iand Pi9i. Based on this result, we derive the associated RUL

distribution in the following.

2.3. Real-time updating of the RUL distribution

Based on a predefined threshold w, the RUL modeling principle is that when degradation XðtÞ first reaches threshold w, the system is declared to be non-operable and its lifetime terminates. Consequently, it is natural to view the event of lifetime termination as the point that the degradation XðtÞ exceeds threshold w for the first time. In this paper, from the concept of the FHT, we define RUL Ri at time ti as

Ri¼inf ri: XðriþtiÞ Zw9X0:i, ð11Þ

with CDF FRi9X0:iðri9X0:iÞand PDF fRi9X0:iðri9X0:iÞ. From Eq.(2), it is direct to obtain the PDF and CDF of the RUL at time ti defined in Eq.(11)as follows[36],

fRi9li,X0:iðri 9

l

i,X0:iÞ ¼ wxi ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2

p

ri3

s

2 p exp ðwxi

l

iriÞ 2 2

s

2r i ! , ri40: ð12Þ FRi9li,X0:iðri9

l

i,X0:iÞ ¼1

F

wxi

l

iri

s

ffiffiffiffiri p   þexp 2

l

iðwxiÞ

s

2  

F

ðwxiÞ

l

iri

s

ffiffiffiffiri p   , ri40: ð13Þ

In Eq.(12)if we replace

l

i by ^

l

i, then it is the RUL model used in[32]. We call this as Wang’s model for subsequent

comparisons in Section IV. However, as mentioned above, the drift coefficient evolves as a random variable in Eq.(7)with a distribution, pð

l

i9X0:iÞ, conditional on the observed data up to time ti as formulated by Eq.(10). Now we want to use

l

i9X0:iÞfor deriving the estimated RUL distribution. In order to achieve this aim, we first give two lemmas, which can

significantly simplify the course of the derivation for the RUL distribution. Lemma 1. If Y  Nð

m

1,

s

21Þ, then EY½

F

ðYÞ can be formulated as EY½

F

ðYÞ ¼

F

ð

m

1=

ffiffiffiffiffiffiffiffiffiffiffiffiffi

s

2 1þ1

q

Þwhere

F

ðUÞdenotes the standard normal CDF.

Proof: From the property of the normal distribution, we have EY½

F

ðYÞ ¼E EðIfZr Yg9YÞ9Y

 ¼PrðZrYÞ ¼ PrðZY r0Þ ¼

F

ð

m

1= ffiffiffiffiffiffiffiffiffiffiffiffiffi

s

2 1þ1 q Þ: ð14Þ In the derivation process, IfZr Yg is the indicator function, Z is a standard normal variable and independent of Y, and

ZY  Nð

m

1,

s

21þ1Þ.

Lemma 2. If

l

Nð

m

l,

s

2lÞ, the PDF and CDF of the FHT of process XðtÞ ¼

l

t þ

s

BðtÞ to first hit threshold w can be formulated as fTð Þ ¼t w ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2

p

t3

s

2 lt þ

s

2   q exp  w

m

lt  2 2t

s

2 lt þ

s

2   " # : ð15Þ FTð Þ ¼t

F

m

ltw ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi

s

2 lt 2þ

s

2t q 0 B @ 1 C Aþexp 2

m

s

l2wþ 2

s

2 lw2

s

4 !

F

2

s

2 lwt þ

s

2

m

lt þw  

s

2 ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi

s

2 lt 2þ

s

2t q 0 B @ 1 C A: ð16Þ

Lemma 2 is similar to the results given in[40]and[41]. This lemma can be obtained by some direct manipulations using Lemma 1 and the total law of probability[25]. From Lemma 2, we give the following theorem.

Theorem 1. For the Wiener process defined by Eq.(2)and the state space model as Eqs.(7) and (8), the PDF and CDF of the updated RUL at tibased on the updated PDF of

l

ican be obtained as

fRi9X0:iðri9X0:iÞ ¼ wxi ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2

p

ri3 Pi9iriþ

s

2 r exp  wxi  ^

l

iri 2 2ri Pi9iriþ

s

2 0 B @ 1 C A, ri40: ð17Þ

(7)

FRi9X0:iðri9X0:iÞ ¼1

F

wxi ^

l

iri ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi Pi9iri2þ

s

2ri q 0 B @ 1 C Aþexp 2 ^

l

iðwx

s

2 iÞþ 2Pi9iðwxiÞ2

s

4 !

F

 2Pi9iðwxiÞriþ

s

2

l

^iriþwxi

s

2 ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiP i9iri2þ

s

2ri q 0 B @ 1 C A: ð18Þ Proof: Using Eqs.(10), (12), (13), (15), (16)and the total law of probability, we have

fRi9X0:iðri9X0:iÞ ¼ Z þ 1

1

fRi9li,X0:iðri9

l

i,X0:iÞpð

l

i9X0:iÞd

l

i, ð19Þ

FRi9X0:iðri9X0:iÞ ¼ Z þ 1

1

FRi9li,X0:iðri9

l

i,X0:iÞpð

l

i9X0:iÞd

l

i: ð20Þ Following Lemma 2, it is straightforward to obtain Eqs.(17) and (18).

Comparing Eq.(17)with Eq.(12), we observe that the observation history and the variance of

l

iare involved in Eqs.(17)

and (18), which is also recursively updated. We call Eq.(17)as our model to distinguish other models in Section IV.

Remark 1. In[19], they directly used PrðRirri9X0:iÞ ¼Pr XðriþtiÞ Zw9X0:i

 

to calculate the RUL distribution and obtained the associated RUL distribution with similar form to Eqs.(3) and (4). However, using PrðRirri9X0:iÞ ¼Pr XðriþtiÞ Zw9X0:i

 

ignores the possible hitting events within ðti,tiþriÞas discussed in Section II-A and thus their results are approximations.

Instead, Eqs.(17) and (18)are exact in the sense of the FHT.

Remark 2. The moment of the RUL distribution obtained in [19]and [33], does not exist since their obtained RUL distributions belong to the family of Bernstein distributions, known without moments, but this is not the case for our result. For example, the mean of RUL can be easily formulated by

EðRi9X0:iÞ ¼E½EðRi9

l

i,X0:iÞ9X0:i ¼Eð

wxi

l

i 9X1:kÞ ¼ wxk Pi9i expð ^

l

2i 2Pi9i Þ Z l^i 0 expðu 2 2Pi9i Þdu ¼ ffiffiffi 2 p wxi ð Þ ffiffiffiffiffiffiffi Pi9i q Dð ^

l

i ffiffiffiffiffiffiffiffiffiffi 2Pi9i q Þ where DðzÞ ¼ expðz2ÞRz

0expðu2Þdu is the Dawson integral for real z, which is known to exist. This property is desired in

maintenance practice, since the expectation of the life estimation is required to be existent sometimes[42,43].

In Eqs.(17) and (18), parameters a0,P0, Q and

s

2should be estimated. As opposed to the result in[19,32], we develop a

parameter estimation algorithm in the following section for this task. 3. Parameter estimation

Now we return to estimate and update a0, P0, Q and

s

2in Eqs.(7) and (8). Unlike the method adopted in[32], we use

only the data from one system from the time of installation and recursively update the estimate along with the observation process. We denote

h

¼ ½a0, P0, Q ,

s

2T as a parameter vector. We use the maximum likelihood estimation

(MLE) to estimate

h

once new degradation observation xiis available. In this case, the log-likelihood function for X0:ican be

written as

Lið

h

Þ ¼log½pðX0:i9

h

Þ, ð21Þ

where pðX0:i9

h

Þis the joint PDF of the degradation data X0:i. Then the MLE estimate of

h

, denoted by ^

h

i, conditional on X0:i

can be obtained by ^

h

i¼argmax

h Lið

h

Þ: ð22Þ

If the drift coefficient is constant then maximizing Eq.(21)with respect to

h

is straightforward. However, since we treat

l

i as a hidden variable which is given by Eq. (7), then directly maximizing Eq. (21)is impossible. However, the EM

algorithm provides a possible framework for estimating the parameters involving hidden variables[44]. A fundamental assumption of the EM algorithm is that the hidden variables can be estimated by observed data. This is the case for our problem since pð

l

i9X0:iÞcan be obtained by Eq.(10).

The fundamental principle of the EM algorithm is to replace the hidden variables with their expectations conditional on the observed data. Then the parameter estimation can be formulated as maximizing the joint likelihood function pðX0:i,

U

i9

h

Þ. Specifically, by manipulating the relationship between pðX0:i9

h

Þand pðX0:i,

U

i9

h

Þ, Lið

h

Þcan be divided into two

parts as

Lið

h

Þ ¼‘ið

h

Þlog pð

U

i9X0:i,

h

Þ, ð23Þ

where

‘ið

h

Þ ¼log pðX0:i,

U

i9

h

Þ: ð24Þ

Then taking the expectation operator on both sides of Eq.(23)with respect to

U

i9X0:i,

h

0, we have

(8)

where ‘ð

h

9

h

0

Þ ¼EUi9X0:i,h0 ‘ið

h

Þ

 

, ð26Þ

h

9

h

0Þ ¼EUi9X0:i,h0 log pð

U

i9X0:i,

h

Þ

 

: ð27Þ

Finally, via Eq.(25), the following holds Lið

h

ÞLið

h

0Þ ¼‘ð

h

9

h

0Þ‘ð

h

09

h

0Þ þKð

h

09

h

0ÞKð

h

9

h

|fflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflffl}

Z0

, ð28Þ

where the positivity of the last term is implied by the Kullback–Leibler divergence metric between pð

U

i9X0:i,

h

Þ and

U

i9X0:i,

h

0

Þ,[45].

Obviously, if ‘ð

h

9

h

0Þ4 ‘ð

h

09

h

0Þ, then Lið

h

ÞLið

h

0Þ40 holds. This is achieved by the EM algorithm which takes an

approximation ^

h

i ðkÞ

of MLE ^

h

igiven in Eq.(22)and updates it to a better ^

h

i ðk þ 1Þ

according to the following two steps: E-step Calculate ‘ð

h

9 ^

h

ðkÞ i Þ ¼EU i9X0:i, ^h ðkÞ i ‘ið

h

Þ   , ð29Þ where ^

y

i ðkÞ

¼ ½a0iðkÞ, P0iðkÞ, QiðkÞ,

s

i2ðkÞTdenotes the estimated parameters in the kth step conditional on X0:i.

M-step Calculate ^

h

i ðk þ 1Þ ¼argmax h EUi9X0:i, ^hi ðkÞ‘ið

h

Þ   : ð30Þ

Then we iterate the E-step and M-step until a criterion of convergence is satisfied. In our case, we can calculate the E-step and M-step separately but just outline the properties of the estimation algorithm. The details of the algorithm are summarized inAppendix C. Interestingly, it can be observed from Theorem 2 inAppendix Cthat the M-step in our approach can be solved analytically and we can obtain the unique maximum point. This implies that each iteration of the EM algorithm can be performed with a single computation, which leads to an extremely fast and simple estimation procedure. This computation advantage plus the exact RUL distribution are particularly attractive for practical applications. The convergence property of the proposed algorithm can be similarly demonstrated in[46–50].

4. A practical case study

In this section, we provide a practical case study for gyros in an inertial navigation system (INS) to illustrate the application of our model and compare the performance of our model with the models presented in[19,32]. In our study, it is found that STF can generate superior results to Kalman filter to a certain extent. As such, in the following, we only use STF in the filtering step for illustration. Of course, in practice which one to use needs to be tested against the model fit. However, due to limited space, we do not discuss this issue in this paper. Actually, the detailed comparisons between STF and Kalman filter with sudden state changing can be found in[51,52].

4.1. Problem description

As a key device of the INS in weapon systems and space equipment, an inertial platform plays an important and irreplaceable role in the INS. Its operating state has a direct influence on navigation precision. The sensors fixed in an inertial platform include three gyros and three accelerometers, which measure angular velocity and linear acceleration, respectively. The gyro fixed on an inertial platform is a mechanical structure having two degrees of freedom from the driver and sense axis (see[53]for a general description of inertial navigation platforms and gyros). When the inertial platform is operating, the wheels of the gyros rotate at very high speeds and can lead to rotation axis wear. As the wear is accumulated, the bearing on the gyro’s electric motor will become deformed and such deformation can lead to the drift of the gyros. The increasing drift finally results in the failure of gyros and then the inertial platform. Past data show that almost 70% of the failures of inertial platforms result from gyroscopic drift and such drift is largely resulted from the wear of bearings as the case of rolling element bearings which are extensively investigated in the literature. However, the difference of our case is that we use the drift data of gyros to estimate the RUL rather than the vibration data as rolling element bearings since we cannot obtain the vibration data as it is not allowed to fix the vibration sensors in the inertial platform. As such, the drift of gyros is often used as a performance indicator to evaluate the health condition of an inertial platform and to schedule maintenance activities.

For an illustrative purpose, we provide an illustration of the inertial platform (see Fig. 2), and an illustration of a deformed bearing of the gyro’s electric motor (seeFig. 3), which was obtained by scanning electron microscopy S-3700N. It can be found that the maximum length of the metal flake is 155

m

m. Such deformation is reflected by the drifting data of gyros, which can be monitored directly.

In this study, we assume that CM values of drift coefficients reflect the performance of the inertial platform, and the larger the drift coefficients monitored are, the worser the performance. Therefore, according to the CM data and technical

(9)

index of the inertial platform, failure prediction can be implemented by modeling the drift coefficients. The drift coefficients of an inertial platform mainly include K0X, K0Y, K0Z, KSX, KSY, KIZ, in which K0X, K0Y, K0Zdenote constant drift

coefficients, and KSX, KSY, KIZ are stochastic drift coefficients, where KSX, KSY denote the coefficients related to the first

moment of specific force along the sense axis, and KIZdenotes the coefficient related to the first moment of specific force

along the input axis. Generally, the drift degradation measurement along the sense axis, KSX, plays a dominant role in the

assessment of gyro degradation. In our study, we take the CM data of KSX as the degradation signals and use them for RUL

estimation of the INS. For our monitored INS in certain weapon system with the terminated life 180.5 h, 73 points of drift coefficients data were collected with regular CM intervals 2.5 h in field condition. The collected data are illustrated inFig. 4.

In the practice of the INS health monitoring, it is usually required that the drift measurement along the sense axis should not exceed 0:37ð1=hÞ. This threshold is predetermined at the design stage and is strictly enforced in practice since an INS is a critical device used in a navigated weapon system.

4.2. The implementation of our model for RUL estimation of the INS

Using our model, the predictions of the gyro’s drift and the distribution of the RUL can be obtained at each CM point. Specifically, using our approach initialized by parameter vector

h

0¼ ½0:002, 0:001, 0:01, 0:01T, the one step predicted

drifting path by ^xi þ 1¼xiþ ^

l

iðti þ 1tiÞis illustrated inFig. 4to show the fitness of our model to the gyro’s drift degradation Fig. 2. Illustration of the inertial platform.

(10)

data. Clearly, the predicted results match with the actual data well and the mean squared error (MSE) of the predictions is 1.1962E-4 which is small. This demonstrates that our developed model can model the gyro’s drift degradation data effectively. Correspondingly, the evolving path of the estimated parameter vector ^

h

, consisting of a0, P0, Q and

s

2, are

illustrated inFig. 5, which are estimated by Algorithm 4 summarized inAppendix E.

Fig. 5shows that the updated parameters converge quickly as the observed degradation data are accumulated. In this

case, once the parameters converge, further updating may be unnecessary. However, such updating is needed for the case that the degradation process may be subject to unusual changes in its progression. Once the parameters in the model are updated, the PDF of the estimated RUL can be calculated at each CM point. It is noted that the RUL estimation at a CM point is not achieved by estimating the long term future degradation state with estimating errors, and instead is achieved by directly calculating the FHT of the degradation process.Fig. 6illustrates the RUL distributions at six different CM points. It is noted that the first drift reading is zero according to the model setting. In our used practical data, the last drift reading is 0:3566 ð1=hÞ at the monitoring time t73¼180 h. It can be found that this drift reading is very close to the failure

threshold (w ¼ 0:37 ð1=hÞ). Therefore, we can consider that the data captures full life cycle history (useful life) of the machine component. In other words, the actual lifetime of the gyros in an INS is approximated to be 180.5 h and thus the actual RUL at each CM point is known from the full life cycle data. As shown inFig. 6, the actual RUL (denoted by square) falls within the range of the estimated PDF of the RUL at each CM point from our model and further the estimated PDF of the RUL becomes sharper as the degradation data are accumulated. This implies that the uncertainty of the estimated RUL is reduced since more data are utilized during estimating the model parameters. When we use our predictive model for RUL estimation at a given CM point, we use the CM data up to that CM point. In other words, if we estimate the RUL at ti,

the data X0:iare used to update the model and for RUL estimation. Therefore, at the last CM time t73, all the available CM

data are used to estimate the RUL.Fig. 7illustrates the performance of our predictive model against the full life cycle data. It can be observed that the estimated mean RUL and the actual RUL match each other well. For example, the relative error between the actual life and the estimated mean life at t73 is 0.16%. This reflects that the estimated result of our

developed model can match the actual result from the full life cycle data closely.

4.3. Comparative studies

In this part, we conduct some comparative studies with the models presented in[19,32]. We first compare our model with Wang’s model about their performance of the RUL estimation for the INS. The detailed implementation process of Wang’s model can be found in[32].

In order to compare our model with Wang’s model, a loss function is employed to enable a direct comparison of the distributions for the RUL estimation between two models. The loss function is the MSE about the actual RUL obtained at each observation point[4], defined as

MSEi¼ Z 1 0 ðri~riÞ2fRi9X0:i ri9X0:i  dri,  ð31Þ where ~riis the actual RUL obtained at tiand fRi9X0:i ri9X0:i

 

is the estimated PDF of the RUL.

Fig. 8(a) compares the estimated RUL distributions from our model with Wang’s model. In both cases, the unknown

parameters, such as a0,P0, Q and

s

2, are obtained by the EM algorithm, but the difference is that our model recursively

updates all parameter estimates whenever a new piece of information is available, whereas the model in[32]uses the data of past systems to estimate the model parameters and once estimated they are fixed. Only ^

l

iis updated with the new data

0 20 40 60 80 100 120 140 160 180 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4

The CM point (hours)

The degradation path of gyro drift (

0/hour)

The actual observed path The predicted path

(11)

0 20 40 60 80 100 120 140 160 180 0 0.005 0.01 0.015 0.02 0.025 The CM point(hours) Estimated a 0 0 20 40 60 80 100 120 140 160 180 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1x 10 -3 The CM point(hours) Estimated P 0 0 20 40 60 80 100 120 140 160 180 0 0.2 0.4 0.6 0.8 1 1.2x 10 -3 The CM point(hours) Estimated Q 0 20 40 60 80 100 120 140 160 180 0 0.2 0.4 0.6 0.8 1 1.2 1.4x 10 -3 The CM point(hours) Estimated σ 2

Fig. 5. The updated parameters at monitoring points t0, t1,:::, t73.

0 20 40 60 80 100 120 140 160 180 200 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1 The RUL(hours)

Probability density function (PDF)

The 21th CM point The 41th CM point The 51th CM point The 61th CM point The 71th CM point The 73th CM point The actual RUL

(12)

in Wang’s model. Clearly, the PDFs of the RULs from both models can cover the actual RULs well as the obtained data are accumulated. However, it is observed fromFig. 8(a) that the PDFs of the estimated RULs are typically more dispersed when using Wang’s model.Fig. 8(b) shows the calculated MSE between the actual RUL and the estimated RUL at each CM point. We can observe that the MSEs using Wang’s model change irregularly at different sampling points with relatively large

0 20 40 60 80 100 120 140 160 180 0 20 40 60 80 100 120 140 160 180 200 RUL(hours) The CM point(hours)

The actual RUL The estimated RUL

Fig. 7. The estimated mean RUL and the actual RUL at monitoring points t0,t1,:::,t73.

0 5 10 15 20 25 165 170 175 180 0 0.05 0.1 0.15 0.2

The RUL (hours) The CM time (hours)

Probability density function (PDF) Our model Wang's model 0 20 40 60 80 100 120 140 160 180 0 0.5 1 1.5 2 2.5 3x 10 4 The CM point(hours) MSE i at each CM point Wang's model Our model

Fig. 8. Comparative results of our model with Wang’s model (a) the estimated PDFs of the RULs at the last six CM points, and (b) the MSEs of the RULs at all CM points (&–actual RUL).

(13)

fluctuations, particularly in the earlier stage of estimation. This implies that the estimated RUL PDFs by Wang’s model is sensitive to small changes in observations shown inFig. 4. This may also lead to completely different health management decisions at two consecutive CM points where the actual degradation observations may only have changed a little as seen

fromFig. 4. The reason for such behavior seen fromFig. 8(b) stems from the factor that the distribution pð

l

i9X0:iÞof the

updated drift coefficient is not considered and the estimated parameters are not recursively updated in [32]. From

Fig. 8(b), we observe that our model can make the MSEs about the actual RUL at each CM point less sensitive to small

changes. But most importantly our model can improve the accuracy of the estimated PDFs of the RULs with reduced variances, seeFig. 8. This reduction in variance is due to the use of the complete distribution of

l

i via the Bayesian rule

shown earlier. These comparisons reflect the superiority of our model to Wang’s model in the RUL estimation for the INS.

0 20 40 60 80 100 120 140 160 180 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4

The drift degradation data of the INS(

o/hour)

The CM time (hours) The actual degradation path

Gebraeel's model with random prior parameters Gebraeel's model with appropriate prior parameters Our model with random prior parameters

0 5 10 15 20 25 165 170 175 1800 0.05 0.1 0.15 0.2 0.25 0.3

The RUL (hours) The CM time (hours)

Probability density function (PDF)

Our model

Gebraeel's model with appropriate prior parameters

0 5 10 15 20 25 165 170 175 180 0 0.05 0.1 0.15 0.2 0.25 0.3

The RUL (hours) The CM time (hours)

Probability density function (PDF)

Our model

Gebraeel model with random prior parameters

Fig. 9. Comparative results of our model with Gebraeel’s model (a) the predicted degradation path, (b) RUL estimation using appropriate prior parameters, and (c) RUL estimation using random prior parameters (&–actual RUL).

(14)

Now, we further compare our model with Gebraeel’s model about their performance in the RUL estimation. The detailed implementation process of Gebraeel’s model can be found in[19]. It is noted that the model in[19]needs prior parameters as well our model. In order to compare the goodness of fit, the prior parameters of our model in the following comparison are selected at random at time zero, but we consider two cases for the prior parameters of Gebraeel’s model. One uses the appropriately selected parameters as the prior parameters and the other uses random prior parameters.Fig. 9

shows the comparative results.

Fig. 9(a) shows that the predicted degradation path. The MSEs between the actual data and the corresponding

predictions of the degradation path for our model with random prior parameters, Gebraeel’s model with random prior parameters, and Gebraeel’s model with appropriate prior parameters are 1.179E-4, 0.0026, and 7.9864E-4, respectively. Thus, our model has the best fitting for the degradation data. We also find that our model is robust with the choice of the prior parameters, but Gebraeel’s model is not. Fig. 9(b) illustrates several estimated PDFs of the RULs over time using Gebraeel’s model with appropriate prior parameters. It is clear that the range of PDFs of RULs covers the actual RULs, but the uncertainty in the estimated RUL of our model is less than that from Gebraeel’s model, particular for the last few sampling points, since our estimated PDFs of the RULs are sharper than the results of Gebraeel’s model. However, if the prior parameters in Gebraeel’s model are selected at random, the estimated RUL may be incorrect as illustrated inFig. 9(c) (the observed RULs are outside of the predicted RUL PDF ranges). In comparison, our model using the random selected prior parameters can produce reasonable estimates. This demonstrates the merit of our model for RUL estimation of a particular system. Therefore, if there are no sufficient historical degradation data at the beginning for selecting the prior parameters, our model may be more appropriate for RUL estimation.

As noted by Remark 2, the moments of the RUL distribution obtained by Gebraeel’s model do not exist. Therefore, we cannot calculate the MSE about the actual RUL defined in Eq.(31). But, to have a further look at the difference of the estimated RUL distributions between our exact result and Gebraeel’s approximate result, it is desired to compare the estimated reliability from both models. Here, we use FRi9X0:iðri9X0:iÞ ¼1FRi9X0:iðri9X0:iÞas the conditional reliability obtained from our exact approach at ti, while FR0

i9X0:iðri9X0:iÞ as the conditional reliability obtained from Gebraeel’s model by PrðR0

irri9X0:iÞ ¼Pr Xðr iþtiÞ Zw9X0:iat ti. The difference between them at the last CM point is illustrated inFig. 10.

It can be found that the difference between both results is significant. This is due in part to the many fluctuations existing in the degradation path of gyro’s drift so that using PrðL0

krlk9X1:kÞ ¼Pr XðlkþtkÞ Zw9X1:k

 

to approximate the RUL will lead to overestimating of the reliability, as discussed in Remark 1. Consequently, when the approximated result is applied to reliability-centered maintenance, the obtained decision may be far from the reality.

In sum, this practical case study demonstrates that our developed model can work well and efficiently. On the other hand, we verify that incorporating the observation history to date can improve the accuracy of the RUL estimation indeed. 5. Concluding remarks

In this paper, we present a Wiener-process-based degradation model with a recursive filter algorithm and Bayesian updating to estimate the PDF of the RUL. Based on the Wiener process, we construct a RUL estimation model which depends on the system CM data to date via recursively updating. The updating is done twice in that the drift coefficient is recursively updated via the state-space model and all parameters are updated via the EM algorithm, both at the time that a new piece of CM data becomes available. During the RUL estimation, we also consider the distribution in the estimated drift coefficient and accordingly an explicit RUL distribution is obtained based on the concept of FHT. The usefulness of the proposed model is demonstrated via a real-world example for gyros in an inertial navigation system. By comparing the

0 20 40 60 80 100 120 140 160 180 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 The RUL(hours) Co n d it io n a l r e li a b il it y a t th e la s t CM p o in t Our model Gebraeel's model

(15)

proposed model with existing models, we show that the proposed model can generate better results than existing models compared both in terms of the mean and variance of the estimated RUL and is robust with respect to the choice of the prior parameters.

Although the practical case studies show that our model is a simple but efficient method for RUL estimation, there are still several issues needed to be further studied. Firstly, in this paper, we only consider a Wiener process with a linear drift, but for complex systems in practice, a nonlinear model may be more appropriate. Secondly, we mainly consider the case that the failure is resulted by the gradual and continuous degradation in this paper. However, considering the sudden failure in degradation modeling and RUL estimation is interesting and of practical significance. Because the failure of complex engineering systems is often the result of gradual degradation and stresses generated either by the systems themselves or from external shocks. Such shocks frequently lead to the sudden failure. Therefore, considering both the gradual and sudden cases are necessary in future research. Thirdly, we only consider the case that the degradation process is observable in this paper. However, hidden or partially observable degradation process is frequently encountered in many engineering practices. This may encourage the research for RUL estimation subject to hidden degradation processes. The key to tackle this issue is to incorporate the estimation error or estimation uncertainty in RUL estimation due to the unobservability of the degradation state. In addition, we primarily discuss the issues associated with estimating the RUL. Much more decision-focused research under the prognostic information needs to be researched so that we can generate new insights on the effect of the estimated RUL upon decision making.

Acknowledgments

The authors would like to sincerely thank and acknowledge the support and constructive comments from the Editor and the anonymous reviewers. This work was partially supported by the National 973 Project under grants 2010CB731800 and 2009CB32602, the NSFC under grants 61028010, 61021063, 60931160440, 61174030, 71071097, 71231001, 61210012, and 61104223 the National Science Fund for Distinguished Young Scholars of China under grant 61025014 and the Fundamental Research Funds for the Central Universities of China.

Appendix A. Algorithm 1: Kalman filtering algorithm Step 1: Initialize ^

l

0¼a0, P0.

Step 2: State estimation at time ti

Pi9i1¼Pi19i1þQ

Ki¼ ðtiti1Þ2Pi9i1þ

s

2ðtiti1Þ

^

l

i¼ ^

l

i1þPi9i1ðtiti1ÞKi1 xixi1 ^

l

i1ðtiti1Þ

:

Step 3: Updating variance

Pi9i¼Pi9i1Pi9i1ðtiti1Þ2K1i Pi9i1:

Appendix B. Algorithm 2: Strong tracking filtering algorithm Step 1: Initialize ^

l

0¼a0, P0,

a

,

r

.

Step 2: Calculating fading factor uðtiÞfrom orthogonality principle

V0ðtiÞ ¼

g

2ðt

1Þ, i ¼ 1 rV0ðti1Þ þg2ðtiÞ

1 þr , i4 1

with

g

ðtiÞ ¼xixi1 ^

l

i1ðtiti1Þ

8 < :

BðtiÞ ¼V0ðtiÞQ ðtiti1Þ2

as

2ðtiti1Þ; CðtiÞ ¼Pi19i1ðtiti1Þ2; u0¼BðtiÞ=CðtiÞ:

uðtiÞ ¼

u0, u0Z1 1, u0o1

(

Step 3: State estimation Pi9i1¼uðtiÞPi19i1þQ

Ki¼ ðtiti1Þ2Pi9i1þ

s

2ðtiti1Þ

^

l

i¼ ^

l

i1þPi9i1ðtiti1ÞKi1 xixi1 ^

l

i1ðtiti1Þ

(16)

Step 4: Updating variance

Pi9i¼Pi9i1Pi9i1ðtiti1Þ2K1i Pi9i1:

In Algorithm 2,

a

Z1 and

r

denote the softening factor and the forgetting factor, respectively, which can be selected heuristically.

r

¼0:95 has been used in general[39,51].

Appendix C. The implementation of EM algorithm for our model

Following the above-mentioned procedures for the EM algorithm, the joint log-likelihood function for our problem can be expressed as

‘ið

h

Þ ¼log pðX0:i9

U

i,

h

Þ þlog pð

U

i9

h

Þ

¼log pð

l

09

h

Þ þlog

Yi

j ¼ 0pð

l

j9

l

j1,

h

Þ þlog

Yi

j ¼ 0pðxj9

l

j1,

h

Þ: ðC  1Þ

From Eqs.(7) and (8), we directly have

l

j9

l

j1Nð

l

j1,Q Þ, xj9

l

j1N x j1þ

l

j1ðtjtj1Þ,

s

2ðtjtj1Þand

l

0Nða0,P0Þ.

Using Eq.(C-1)and ignoring the constant terms, the joint log-likelihood function can be formulated as 2‘ið

h

Þ ¼ logP0ð

l

0a0Þ2=P0 Xi j ¼ 1 log Q þð

l

j

l

j1Þ 2=Q Xi j ¼ 1 log

s

2þ x jxj1

l

j1ðtjtj1Þ  2 =

s

2ðt jtj1Þ : ðC  2Þ

To calculate the conditional expectation ‘ð

h

9 ^

h

ðkÞ

i Þdefined in Eq.(29), we have

2‘ð

h

9 ^

h

ðkÞi Þ ¼EU i9X1:i, ^h ðkÞ i 2‘ið

h

Þ ½  ¼E Ui9X0:i, ^h ðkÞ i logP0ð

l

0a0Þ2=P0 Xi j ¼ 1 log Q þ ð

l

j

l

j1Þ 2=Q h Xi j ¼ 1 log

s

2þ x jxj1

l

j1ðtjtj1Þ  2 =

s

2ðt jtj1Þ   i : ðC  3Þ

Clearly, to calculate the expectation of this expression requires to obtain E Ui9X0:i, ^h ðkÞ i ð

l

jÞ, E Ui9X0:i, ^h ðkÞ i ð

l

j2Þand E Ui9X0:i, ^h ðkÞ i ð

l

j

l

j1Þ,

which are the conditional expectations with respect to

U

i, given the observed history X0:i. In this paper, we use the Rauch–

Tung–Striebel (RTS) smoother to provide an optimal estimation of E Ui9X0:i, ^h ðkÞ i ð

l

jÞ, E Ui9X0:i, ^h ðkÞ i ð

l

j2Þ and E Ui9X0:i, ^h ðkÞ i ð

l

j

l

j1Þ,

summarized as Algorithm 3,[47,54]. In Algorithm 3, we define Mj9i¼Covð

l

j,

l

j19X0:iÞ.

Algorithm 3. RTS smoothing algorithm

Step 1: Forwards iteration by Algorithm 1 or Algorithm 2 Step 2: Backwards iteration

Sj¼Pj9jP 1 j þ 19j

^

l

j9i¼ ^

l

jþSjð ^

l

j þ 19i ^

l

j þ 19jÞ ¼ ^

l

jþSjð ^

l

j þ 19i ^

l

Pj9i¼Pj9jþSj2ðPj þ 19iPj þ 19jÞ

Step 3: Initialize

Mi9i¼ð1Kiðtiti1ÞÞPi19i1

Step 4: Backwards iteration for smoothing covariance Mj9i¼Pj9jSj1þSjðMj þ 19iPj9jÞSj1

From the RTS smoothing algorithm, we can obtain the conditional expectations of E Ui9X0:i, ^h ðkÞ i ð

l

jÞ, E Ui9X0:i, ^h ðkÞ i ð

l

j2Þ and E Ui9X0:i, ^h ðkÞ i

ð

l

j

l

j1Þin the following lemma.

Lemma 3. Conditional on current estimated parameter ^

h

ðkÞi and observations history X0:i, the values of E Ui9X0:i, ^h ðkÞ i ð

l

jÞ, E Ui9X0:i, ^h ðkÞ i ð

l

j 2 Þand E Ui9X0:i, ^h ðkÞ i ð

l

j

l

j1Þare given by E Ui9X0:i, ^h ðkÞ i ð

l

jÞ ¼ ^

l

j9i, E Ui9X0:i, ^h ðkÞ i ð

l

j2Þ ¼ ^

l

j9i 2 þPj9i, E Ui9X0:i, ^h ðkÞ i

(17)

Proof: These equations are the direct results of applying the properties of variance-covariance and RTS smoothing algorithm and the proof is hence omitted.

From Eq.(C-3)and Lemma 3, ‘ð

h

9 ^

h

ðkÞ

i Þcan be written as 2‘ð

h

9 ^

h

ðkÞ i Þ ¼EU i9X0:i, ^h ðkÞ i 2‘ið

h

Þ ½  ¼E Ui9X0:i, ^h ðkÞ i logP0ð

l

0a0Þ2=P0 Xi j ¼ 1 log Q þ ð

l

j

l

j1Þ 2=Q h Xi j ¼ 1 log

s

2 þxjxj1

l

j1ðtjtj1Þ 2 =

s

2ðt jtj1Þ   i

¼ logP0ðC09i2a09ia0þa02Þ=P0

Xi

j ¼ 1 log Q þ ðCj9i2Cj,j19iþCj19iÞ=Q

Xi j ¼ 1 log

s

2þ x jxj1  2 2 ^

l

j19ixjxj1ðtjtj1Þ þ ðtjtj1Þ2Cj19i =

s

2ðt jtj1Þ   : ðC  5Þ

This completes the E-step and in the following we handle the M-step. After obtaining ‘ð

h

9 ^

h

ðkÞ

i Þ, the results of estimated parameter ^

h

ðk þ 1Þ

i in the (kþ1)th step can be summarized in the

following theorem.

Theorem 2. ^

h

ðk þ 1Þi , by maximizing ‘ð

h

9 ^

h

ðkÞ

i Þ, is given by

a0iðk þ 1Þ¼a09i,

P0iðk þ 1Þ¼C09ia09i2¼P09i,

Qiðk þ 1Þ¼

1 i

Xi

j ¼ 1ðCj9i2Cj,j19iþCj19iÞ,

s

2   i ðk þ 1Þ ¼1 i Xi j ¼ 1 xjxj1  2 2 ^

l

j19i xjxj1   ðtjtj1Þ þ ðtjtj1Þ2Cj19i tjtj1 0 @ 1 A: ðC  6Þ with Cj9i¼EU i9X0:i, ^h ðkÞ i ð

l

j 2

Þ,a09i¼ ^

l

09i,Cj,j19i¼EU i9X0:i, ^h

ðkÞ i

ð

l

j

l

j1Þand ^

h

ðk þ 1Þ

i is uniquely determined and located at the maximum.

Proof: See Appendix D.

The preceding derivations are summarized as Algorithm 4 in Appendix E via a complete specification of the EM-based algorithm for estimating the parameters in

h

when new observation xiis available.

Appendix D. The proof of Theorem 2

The unknown parameters ^

h

ðk þ 1Þi can be obtained by maximizing the ‘ð

h

9 ^

h

ðkÞ

i Þwith

h

. Therefore, the central goal in this

step is to find the maximum of function ‘ð

h

9 ^

h

ðkÞ

i Þon

h

, i.e. ^

h

ðk þ 1Þi ¼argmax h EUi9X0:i, ^h ðkÞ i ‘ið

h

Þ     ¼argmax h ‘ð

h

9 ^

h

ðkÞ i Þ ðD  1Þ

From Eq. (C-5), taking @‘ð

h

9 ^

h

ðkÞi Þ=@

h

, we obtain the solution by @‘ð

h

9 ^

h

ðkÞ

i Þ=@

h

¼0, which leads to Eq. (36), and taking

@2‘ð

h

9 ^

h

ðkÞ

i Þ=@

h

@

h

T

, the following is obtained,

@2‘ð

h

9 ^

h

ðkÞ i Þ @

h

@

h

T ¼ 1 2 2=P0 2ða0a09iÞ=P 2 0 0 0 2ða0a09iÞ=P 2 0 1=P 2

02ðC09i2a09ia0þa02Þ=P30 0 0

0 0 i=Q22

c

=Q3 0 0 0 0 i=

s

42

f

=

s

6 2 6 6 6 6 6 4 3 7 7 7 7 7 5 , ðD  2Þ with

c

¼Pij ¼ 1ðCj9i2Cj,j19iþCj19iÞ:

f

¼Pij ¼ 1 ðxjxj1Þ

2

2 ^lj19iðxjxj1Þðtjtj1Þ þ ðtjtj1Þ2Cj19i tjtj1

  ðD  3Þ

We show that the matrix in (D-2) is negative definite at

h

¼ ^

h

ðk þ 1Þi , by calculating the order principal minor determinant

as follows,

D

1¼  1 P0 ,

D

2¼  1 2P0 1 P2 0

2ðC09i2a09ia0þa0

2Þ P3 0 ! ða0a09iÞ 2 P4 0 ,

D

3¼ 1 2 i Q2 2

c

Q3  

D

2,

D

4¼ 1 2 i

s

4 2

f

s

6  

D

3:

(18)

Then, at

h

¼ ^

h

ðk þ 1Þi , the following results are obtained,

D

19 h¼ ^hðk þ 1Þi ¼  1 P09i o0,

D

29 h¼ ^hðk þ 1Þi ¼  1 2P09i 1 P09i2

2ðC09i2a09ia09iþa09i

2Þ

P09i3

!

ða09ia09i Þ2 P09i4 ¼  1 2P09i 1 P09i2  2 P09i2 ! ¼ 1 2P09i3 40,

D

39 h¼ ^hðk þ 1Þi ¼1 2 i3

c

2 2i3

c

c

3 !

D

29 h¼ ^hðk þ 1Þi ¼  i 3 2

c

2

D

2 9 h¼ ^hðk þ 1Þi o0,

D

49 h¼ ^hðk þ 1Þi ¼1 2 i

s

4 2

f

s

6  

D

39 h¼ ^hðk þ 1Þi ¼1 2 i3

f

2 2i3

f

f

3 !

D

39 h¼ ^hðk þ 1Þi ¼  i 3 2

f

2

D

3 9 h¼ ^hðk þ 1Þi 40:

This completes the proof that the matrix in (D-2) is negative definite at

h

¼ ^

h

ðk þ 1Þi , verifying that ^

h

ðk þ 1Þ

i is located at a

maximum. In addition, ^

h

ðk þ 1Þi is unique since ^

h

ðk þ 1Þ

i is the only solution satisfying @‘ð

h

9 ^

h

ðkÞ i Þ=@

h

¼0.

Appendix E. Algorithm 4: EM algorithm for parameter estimation 1) Initialization Initialize the initial parameters in ^

h

ð0Þi .

2) E-Step Calculate the expectation quantities defined in Eq.(C-3)using Algorithm 3 and Eq.(C-4), with state-space model of Eqs.(7) and (8)parameterized by ^

h

ðkÞi .

3)

M-Step Maximize ‘ð

h

9 ^

h

ðkÞ

i Þusing Eq.(C-6)to obtain the updated parameter estimates by Theorem 2.

4) Test convergence Test the convergence of the algorithm. If converged, then stop. Otherwise set k ¼ k þ1, go to Step 2 and repeat.

References

[1] M. Pecht, Prognostics and Health Management of Electronics, John Wiley, New Jersey, 2008.

[2] X.S. Si, W. Wang, C.H. Hu, D.H. Zhou, Remaining useful life estimation–a review on the statistical data driven approaches, Eur. J. Oper. Res. 213 (2011) 1–14.

[3] F. Camci, R.B. Chinnam, Health state estimation and prognostics in machining processes, IEEE Trans. Autom. Sci. Eng. 7 (2010) 581–597. [4] M.J. Carr, W. Wang, Modeling failure modes for residual life prediction using stochastic filtering theory, IEEE Trans. Rel. 59 (2010) 346–355. [5] M. Dong, D. He, Hidden semi-Markov model-based methodology for multi-sensor equipment health diagnosis and prognosis, Eur. J. Oper. Res. 178

(2007) 858–878.

[6] Y. Peng, M. Dong, A prognosis method using age-dependent hidden semi-Markov model for equipment health prediction, Mech. Syst. Signal Process. 25 (2011) 237–252.

[7] J. Sikorska, M. Hodkiewicz, L. Ma, Prognostic modelling options for remaining useful life estimation by industry, Mech. Syst. Signal Process. 25 (2011) 1803–1836.

[8] M.I. Mazhar, S. Kara, H. Kaebernick, Remaining life estimation of used components in consumer products life cycle data analysis by Weibull and artificial neural networks, J. Oper. Manag. 25 (2007) 1184–1193.

[9] W. Wang, A two-stage prognosis model in condition based maintenance, Eur. J. Oper. Res. 182 (2007) 1177–1187.

[10] M.Y. You, L. Li, G. Meng, J. Ni, Statistically planned and individual improved predictive maintenance management for continuously monitored degrading systems, IEEE Trans. Rel. 59 (2010) 744–753.

[11] A.K.S. Jardine, D. Lin, D. Banjevic, A review on machinery diagnostics and prognostics implementing condition-based maintenance, Mech. Syst. Signal Process. 20 (2006) 1483–1510.

[12] M.-L.T. Lee, G.A. Whitmore, Threshold regression for survival analysis: modeling event times by a stochastic process reaching a boundary, Stat. Sci. 21 (2006) 501–513.

[13] L.A. Escobar, W.Q. Meeker, A review of accelerated test models, Stat. Sci. 21 (2006) 552–577.

[14] V.R. Joseph, I.T. Yu, Reliability improvement experiments with degradation data, IEEE Trans. Rel. 55 (2006) 149–157. [15] C.J. Lu, W.Q. Meeker, Using degradation measures to estimate a time-to-failure distribution, Technometrics 35 (1993) 161–174.

[16] J.I. Park, S.J. Bae, Direct prediction methods on lifetime distribution of organic light-emitting diodes from accelerated degradation tests, IEEE Trans. Rel. 59 (2010) 74–90.

[17] M.D. Pandey, X.-X. Yuan, J.M. van Noortwijk, The influence of temporal uncertainty of deterioration on life-cycle management of structures, Struct. Infrastruct. Eng. 5 (2009) 145–156.

[18] S. Bloch-Mecier, A preventive maintenance policy with sequential checking procedure for a Markov deteriorating system, Eur. J. Oper. Res. 147 (2002) 548–576.

[19] N. Gebraeel, M.A. Lawley, R. Li, J.K. Ryan, Residual-life distributions from component degradation signals: a Bayesian approach, IIE Trans. 37 (2005) 543–557.

[20] M.C. Delia, P.O. Rafael, A maintenance model with failures and inspection following Markovian arrival processes and two repair modes, Eur. J. Oper. Res. 186 (2008) 694–707.

[21] H.T. Liao, E.A. Elsayed, Reliability inference for field conditions from accelerated degradation testing, Nav. Res. Log. 53 (2006) 576–587. [22] C.T. Barker, M.J. Newby, Optimal non-periodic inspection for a multivariate degradation model, Reliab. Eng. Sys. Safe. 94 (2009) 33–43. [23] S.T. Tseng, J. Tang, L.H. Ku, Determination of optimal burn-in parameters and residual life for highly reliable products, Nav. Res. Log. 50 (2003) 1–14. [24] G.A. Whitmore, F. Schenkelberg, Modelling accelerated degradation data using Wiener diffusion with a time scale transformation, Lifetime Data

Anal. 3 (1997) 27–45.

References

Related documents

If the retirement and removal of a depreciable asset occurs in connection with the installation or production of a replacement asset, are the costs incurred in removing the

Restorative Eye Therapy is a cooling treatment that incorporates the benefits of Aveda’s Green Science skin care products and pressure point massage to diminish fine

The primary objective of this research was not necessarily to “fix” or remedy specific leadership issues or capability gaps. Instead, this research described why leadership

Thus, due to the complex and dynamic nature of these real-life situation projects, the possibility of virtual, interactive and collaborative immersive visu- alization of

All their customers numbered in the scaling of class c subnet address tells how is cidr also to as which hosts need this range of?. Kiselev VY, Kirschner K, Schaub MT, Andrews T, Yiu

The book’s first section offers a foundation of four simple but comprehensive Lean key performance indicators (KPIs): waste of the time of things (as in cycle time),

These criteria are as follows: promote the students' perception of the symbiosis between code and documentation; enhance their awareness of the program's structure using

India stands out for its comprehensive rural water database known as Integrated Management Information System (IMIS), which conducts annual monitoring of drinking water coverage,