Statistical Algorithms - Data-driven Models

Chapter 2 Background and Literature Survey

2.3 Review of RUL Prediction Methods

2.3.3 Data-driven Models

2.3.3.2 Statistical Algorithms

Statistical data-driven models estimate the damage progression based on condition monitoring information from similar machines. Their main difference from stochastic approaches is that they commonly provide a single point estimate rather than a probability distribution. They are often used as an alternative or supplement to artificial neural networks when a suitable model is available to account for the dynamics of a system (Sikorska et al., 2011).

Trend extrapolation is one of the earliest time series prediction and forecasting approaches to prognostics (Batko, 1984; Kazmierczak, 1983; Cempel, 1987). The degradation pattern is associated with a single time series (either monitored or calculated), which is assumed to follow a monotonic trend. This single degradation parameter is plotted as a function of time, and a threshold level is pre-established to decide the end-of-life point. The major advantage of such forecasting methods is the simplicity of their calculations, which can be carried out with a basic programmable algorithm. However, their main drawback is the assumption that operating conditions are stable and/or do not have any effect on the monitored time series.
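The trend extrapolation idea can be sketched in a few lines of Python. This is a minimal illustration, assuming a linear trend and an increasing degradation indicator; the function names, the sample values and the threshold are hypothetical and are not taken from the cited works.

```python
def fit_linear_trend(times, values):
    """Ordinary least-squares fit of values = intercept + slope * time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


def remaining_useful_life(times, values, threshold):
    """Extrapolate the fitted trend to the threshold.

    Returns the time left after the last sample, or None when the fitted
    trend is not increasing (the threshold would never be reached).
    """
    slope, intercept = fit_linear_trend(times, values)
    if slope <= 0:
        return None
    end_of_life = (threshold - intercept) / slope
    return max(0.0, end_of_life - times[-1])


# Hypothetical degradation indicator sampled at unit time steps
times = [0, 1, 2, 3, 4, 5]
values = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
rul = remaining_useful_life(times, values, threshold=6.0)
flat = remaining_useful_life(times, [2.0] * 6, threshold=6.0)
```

Because the fit uses time as the only explanatory variable, any change in operating conditions that shifts the indicator is read as damage, which is the drawback noted above.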

In the statistical analysis of time series, autoregressive moving average (ARMA) models are widely used for modelling and forecasting (Box et al., 2015). The main idea of ARMA is to fit the data to a parametric time series model and extract features based on this model. It consists of two parts: the autoregressive part and the moving average part. The damage progression is obtained with curve fitting in the moving average part and added to the autoregressive model output to predict the future values (Liao and Köttig, 2014). For a time series $x_1, \dots, x_t$, the ARMA model with an autoregressive part of order P and a moving average part of order Q takes the following form:

$$\ddot{x}_t = c + \varepsilon_t + \sum_{i=1}^{P} \varphi_i \, \ddot{x}_{t-i} + \sum_{i=1}^{Q} \Theta_i \, \varepsilon_{t-i}, \tag{2.23}$$

where φ and Θ are respectively the autoregressive and moving average coefficients, c is a constant and ε denotes the error terms (Whittle, 1951). This representation is effective for short-term predictions; however, it cannot provide reliable long-term predictions due to the model’s sensitivity to initial system conditions and systematic errors in the predictor (Box et al., 2015). Although this problem can be minimised by avoiding the use of past estimations for future predictions and relying on condition monitoring data instead (Wu et al., 2007), ARMA models remain limited for non-stationary and dynamic processes.
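The long-horizon behaviour described above can be illustrated by iterating the recursion of Eq. (2.23) with known coefficients. The sketch below assumes the coefficients have already been estimated; the function name `arma_forecast` and all numerical values are hypothetical. Future error terms are replaced by their expected value of zero, and each prediction is fed back as if it were an observation.

```python
def arma_forecast(history, residuals, c, phi, theta, steps):
    """Iterate the ARMA recursion forward from the last observation.

    history   -- observed values, most recent last
    residuals -- one-step errors for the same time steps
    phi       -- autoregressive coefficients phi_1 .. phi_P
    theta     -- moving average coefficients Theta_1 .. Theta_Q
    """
    x = list(history)
    eps = list(residuals)
    forecasts = []
    for _ in range(steps):
        ar = sum(p * x[-i] for i, p in enumerate(phi, start=1))
        ma = sum(q * eps[-i] for i, q in enumerate(theta, start=1))
        x_next = c + ar + ma
        forecasts.append(x_next)
        x.append(x_next)   # past estimation reused for the next prediction
        eps.append(0.0)    # expected value of an unknown future error
    return forecasts


# Hypothetical ARMA(1,1) model: c = 1.0, phi_1 = 0.5, Theta_1 = 0.4
forecasts = arma_forecast(history=[3.0], residuals=[0.5],
                          c=1.0, phi=[0.5], theta=[0.4], steps=30)
long_run_mean = 1.0 / (1.0 - 0.5)
```

In this example the first forecast still depends on the last observation and error, but after a few steps the forecasts settle at the long-run mean c / (1 - φ₁), so the prediction no longer carries information about the current condition of the system.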

The review of data-driven models in this section shows that, due to the incomplete understanding of multi-dimensional failure mechanisms, time series prediction and forecasting methods lack the ability to deal effectively with complicated, multi-dimensional and noisy data. Further data processing methods are needed to handle such data efficiently.

The raw values of multi-dimensional time series, which are inconsistent with each other and recorded under various operating conditions, need a feature extraction step that transforms the multi-regime data from the high-dimensional space to a space with a single health level dimension (Bektas et al., 2017). This transformation can reduce the dimensionality of the time series from their original scales to a notionally common scale that retains meaningful information for prognosis.

step by using regression models which perform a mapping of the multi-regime data to a lower-dimensional space in such a way that the variance of the measurements in the low-dimensional representation is maximised. Ramasso (2014a,b) applied a multiple linear regression model for complex systems operating under different conditions. This model could standardise the multi-regime data into a common space for further time series prediction and forecasting prognostics. In similar complex system domains, Juesas et al. (2016) and Bektas et al. (2017) extended the approach and applied it to alternative estimation models. Multiple linear regression calculates the relationship between different explanatory variables and a target variable by fitting a linear equation to observed data (Chatterjee and Hadi, 1986; Freedman, 2009). This model is based on:

$$y = x\beta + \varepsilon \tag{2.24}$$

where y is an n×1 vector of values of the target variable, x is an n×p matrix of observed explanatory variables, β is a p×1 vector of coefficient estimates and ε is an n×1 vector of error terms. For a single observation, the fitted model reads:

$$y = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_p x_p \tag{2.25}$$

More complex models may include multiple observations (multivariate time series), in which case the equation is modified by considering x as a matrix instead of a vector:

$$y_i = \beta_1 x_{i,1} + \beta_2 x_{i,2} + \beta_3 x_{i,3} + \cdots + \beta_p x_{i,p}, \tag{2.26}$$

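$$y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \qquad x = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,p} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,p} \end{bmatrix} \tag{2.28}$$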

When datasets represent multiple instances and cover various realistic and difficult cases, including different operating conditions and fault modes with unknown characteristics, the “regression” can tackle the problem of feature extraction and standardisation (Ramasso, 2014a). After the coefficients, β, are estimated for a specific case, they can be applied to a similar matrix of observations. The major limitation of this model is that coefficient estimation is a supervised method that requires pre-defined target variables. Although mathematical definitions are used to define these variables, the calculation of the characteristic damage progression and the initial health level (or wear level) in individual complex systems remains a major issue.
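The regression-based mapping can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the procedure of the cited authors: the data are synthetic, the target variable is assumed to be a health level falling linearly from 1 to 0, and `gains` and `simulate_unit` are hypothetical names introduced only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
gains = np.array([2.0, -1.0, 0.5])   # hypothetical sensor sensitivities to health


def simulate_unit(n_cycles):
    """Synthetic run-to-failure unit: health falls linearly from 1 to 0."""
    health = np.linspace(1.0, 0.0, n_cycles)
    noise = rng.normal(scale=0.01, size=(n_cycles, gains.size))
    x = np.outer(health, gains) + noise
    return x, health


# Training case: the target y must be defined in advance (supervised step)
x_train, y_train = simulate_unit(200)

# Least-squares estimate of beta in y = x beta + error
beta, *_ = np.linalg.lstsq(x_train, y_train, rcond=None)

# Apply the estimated coefficients to the observations of a similar unit
x_new, health_new = simulate_unit(150)
health_indicator = x_new @ beta
```

The three sensor columns are collapsed into one health indicator per cycle. The example only works because `y_train` is known for the training unit, which is the supervised requirement identified above as the main limitation.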

Principal component analysis (PCA) is another statistical dimensionality reduction procedure, which uses an orthogonal transformation to convert a set of possibly correlated inputs into a set of linearly uncorrelated principal components. In PCA, these components are obtained from the singular value decomposition of the rectangular matrix x (Holland, 2008). To standardise the multi-dimensional data, the first principal component is used as the health indicator for prognosis (Ramasso, 2014a,b; Juesas et al., 2016; Bektas et al., 2017):

$$y_1 = \beta_{11} x_1 + \beta_{12} x_2 + \beta_{13} x_3 + \cdots + \beta_{1p} x_p = \beta_1^{T} x \tag{2.29}$$

Similar to the regression model, β is a matrix of coefficients, here determined by PCA. This dimensionality reduction method is not a supervised model and does not require pre-defined target variables. However, the method can only be applied to individual run-to-failure trajectories. Considering a dataset with multiple component-wise cases, the damage progression and initial health level of individual cases cannot be standardised into a common scale.
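A minimal NumPy sketch of extracting the first principal component through singular value decomposition is given below. The data are synthetic, the function name is hypothetical, and mean-centring the columns before the decomposition is an assumption of this sketch (it is the usual convention, but the text above does not state it).

```python
import numpy as np


def first_principal_component(x):
    """Return (loadings, scores) of the first principal component.

    x is an n x p matrix with one row per observation.
    """
    centred = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing
    # singular value, so vt[0] maximises the variance of the projection.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    beta_1 = vt[0]
    scores = centred @ beta_1
    return beta_1, scores


# Synthetic single trajectory: two sensors driven by one hidden trend
rng = np.random.default_rng(1)
trend = np.linspace(0.0, 1.0, 100)
x = np.outer(trend, [3.0, 4.0]) + rng.normal(scale=0.01, size=(100, 2))

beta_1, scores = first_principal_component(x)
```

No target variable is needed here. Note that the sign of `beta_1` is arbitrary, and that the loadings are specific to the trajectory they were computed from, which reflects the limitation described above for datasets with multiple trajectories.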

In order to provide a common scale across all the characteristics of a dataset, normalisation is a well-known pre-processing step that performs component-wise standardisation before the prognostic analysis. A standard way to achieve this is to use the standard score (Peel, 2008; Wang et al., 2008; Wang, 2010; Lam et al., 2014; Ramasso, 2014a,b; Malinowski et al., 2015; Rigamonti et al., 2016):

$$N(x^{d}) = \frac{x^{d} - \mu^{d}}{\sigma^{d}}, \quad \forall d \tag{2.30}$$

where $x^{d}$ are the original data values (data set) for feature d (regime), and $\mu^{d}$ and $\sigma^{d}$ are respectively the mean and standard deviation of the regime.

Peel (2008) and Wang et al. (2008) proposed such a component-wise “multi-regime normalisation” method to standardise the multi-regime sensor readings relative to each other within the same domain. Unlike regression analysis and PCA, their methodology can deal with the damage progression in complex systems and consider the population characteristics (µ, σ). They could standardise the entire dataset together with all its components and preserve the characteristic damage progression and initial health levels of multiple trajectories.
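The standard score of Eq. (2.30) applied per regime can be sketched as follows. The regime labels and readings are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption of this sketch.

```python
from collections import defaultdict
from statistics import mean, pstdev


def multi_regime_normalise(values, regimes):
    """Z-score each reading with the mean and std of its own regime."""
    grouped = defaultdict(list)
    for value, regime in zip(values, regimes):
        grouped[regime].append(value)

    # Population characteristics (mu, sigma) of every regime
    stats = {d: (mean(v), pstdev(v)) for d, v in grouped.items()}

    normalised = []
    for value, regime in zip(values, regimes):
        mu, sigma = stats[regime]
        # Guard against a constant reading within a regime (sigma == 0)
        normalised.append((value - mu) / sigma if sigma > 0 else 0.0)
    return normalised, stats


# Hypothetical sensor readings recorded under two operating regimes
values = [10.0, 12.0, 14.0, 100.0, 110.0, 120.0]
regimes = ["A", "A", "A", "B", "B", "B"]
z, stats = multi_regime_normalise(values, regimes)
```

Although the raw readings of regimes "A" and "B" differ by an order of magnitude, their normalised values fall on the same scale, so the relative position of a reading within its regime is preserved.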

However, for the case considered in the works of Peel (2008) and Wang et al. (2008), all trajectories were available at the same time. The “normalisation at once” was therefore not a major issue. Nevertheless, it would be rather unlikely to find such data in a real-life scenario due to proprietary restrictions on data and confidentiality considerations (Ramasso and Saxena, 2014).

In a real-world scenario, the “multi-regime normalisation” should be repeated for each novel incoming trajectory in order to calculate the changed population characteristics.
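One possible way to update the population characteristics as trajectories arrive is sketched below. It uses Welford's online algorithm for the running mean and variance, which is a choice made for this illustration and is not taken from the cited works; the class name and the readings are hypothetical, and one instance would be kept per feature and regime.

```python
import math


class RunningRegimeStats:
    """Running mean and standard deviation of one feature in one regime."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0   # sum of squared deviations from the current mean

    def update(self, trajectory):
        """Fold the readings of a newly arrived trajectory into the statistics."""
        for value in trajectory:
            self.count += 1
            delta = value - self.mean
            self.mean += delta / self.count
            self._m2 += delta * (value - self.mean)

    @property
    def std(self):
        """Population standard deviation of all readings seen so far."""
        return math.sqrt(self._m2 / self.count) if self.count else 0.0

    def normalise(self, value):
        """Standard score of a reading under the current statistics."""
        return (value - self.mean) / self.std if self.std > 0 else 0.0


stats = RunningRegimeStats()
stats.update([10.0, 12.0, 14.0])     # first trajectory
z_before = stats.normalise(12.0)
stats.update([16.0, 18.0])           # novel incoming trajectory
z_after = stats.normalise(12.0)
```

The same reading of 12.0 receives a different standard score once the second trajectory has been added, which shows why previously normalised data need to be recomputed whenever the population characteristics change.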

In document An adaptive data filtering model for remaining useful life estimation (Page 66-71)