Multi Step Ahead Prediction - Research Procedures

Chapter 3 Methodology

3.4 Research Procedures

3.4.3 Multi Step Ahead Prediction

In the future failure mode, the filtered health indicators are stored to calculate the pairwise distance relations for lifetime prediction. A similarity-based RUL estimation model identifies the best-matching training units for each test case and makes multi-step-ahead predictions over the filtered health indicators. A particular difficulty in making meaningful long-term predictions is accounting for the different kinds of uncertainty that arise from various sources.

3.4.3.1 Pairwise Distance Relation

The similarity-based prediction algorithm proposed in this research estimates the future behaviour of the systems only when sufficient training data to map out the damage space is present and has been examined by comparison with a robust standardisation. The filtered health indicator derived from the data must give a realistic representation of system performance and manage entire trajectories correctly. For example, when more information about historical damage propagation becomes available, the filtering should be devised to narrow the errors in identification of trajectory characteristics, and the results should demonstrate the relationships between the historical performance deterioration of initially trained subsets and RUL predictions of test subsets.

training units that have a run-to-failure history (Wang et al., 2008; Lam et al., 2014; Ramasso, 2014a; Eker et al., 2014; Peng et al., 2012b). Distance matrices are used in similarity-based estimation for pairwise distance estimations. These distances are then used to measure the similarity between training and test HIs.

Each training curve is accepted as a baseline degradation pattern for prediction. The pairwise distance between a training HI and a test HI can be regarded as an error rate for each corresponding point. According to this error rate, the similarity between the degradation trajectories of two different instances is calculated first; then, the failure threshold point of the test trajectory is estimated from the actual failure point of the corresponding training trajectory. Finally, the RULs estimated from multiple training units can be fused to compute a final RUL estimate.

Euclidean distance is used to measure the ordinary distance between the points of two corresponding trajectories in Euclidean space. Although various distance-based methods can be used in the model, the Euclidean distance is an appropriate choice of distance function because the trajectories lie in a continuous space in which all dimensions are properly scaled and relevant. Equipped with this distance, Euclidean space is a metric space, and the distance between pairs of objects in a data matrix can be calculated to represent the similarity of corresponding parts.

The Euclidean distance between the vector of the test trajectory ($p$) and the corresponding part of the training trajectory ($q$) is the length of the line segment connecting both vectors ($\overline{pq}$). In Cartesian coordinates, when $p = (p_1, p_2, \cdots, p_n)$ and $q = (q_1, q_2, \cdots, q_n)$ are two vector points in Euclidean n-space, the distance ($d$) between $p$ and $q$, or the reverse, is calculated by Pythagoras' theorem, which can be written as an equation relating the lengths of the sides $p$, $q$ and $d$ (Sally, 2007).

\[
d(p,q) = d(q,p) = \sqrt{(q_1-p_1)^2 + (q_2-p_2)^2 + \cdots + (q_n-p_n)^2} = \sqrt{\sum_{i=1}^{n} (q_i-p_i)^2}. \tag{3.46}
\]
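As an illustration of Equation 3.46, the distance can be sketched in a few lines of Python. The function name `euclidean_distance` and the sample vectors are assumptions made for this example and do not come from the original work.

```python
import math


def euclidean_distance(p, q):
    """Euclidean distance between two equal-length sequences (Eq. 3.46).

    Hypothetical helper for illustration; p is a test segment and q is the
    corresponding part of a training trajectory.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    # Sum the squared differences point by point, then take the square root.
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))


# Illustrative values only: a 3-4-5 right triangle.
d = euclidean_distance([0.0, 0.0], [3.0, 4.0])
print(d)  # 5.0
```

The distance is symmetric, so swapping the two arguments gives the same value, matching d(p,q) = d(q,p) in the equation.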

Figure 3.6 shows pairwise distance calculations between a particular test sample and two sets of training sample observations which include run-to-failure degradation progress. As seen in this example, the best-matching training units for the test data can be located in the ongoing parts of the curve rather than in the initial stages. The testing curve is moved in a step-by-step manner along the base curve to identify the minimum pairwise distance between the trajectories. The relation between the test and training samples is therefore calculated and stored at each step in order to find the best-matching part of the training domain. Given the measure of the distance between each pair, the matching locations are used as feedback to complete the missing parts of the test trajectories, and the algorithm continues to the next training time units by repeating the same process. The pairwise distance over each time unit is given by the following equation:

\[
d(te, tr, j) = \sum_{j=n_{te}}^{n_{tr}} \sqrt{\sum_{i=1}^{n_{te}} (te_i - tr_{i+j})^2}. \tag{3.47}
\]

where $n_{te}$ is the length of the test trajectory and $n_{tr}$ is the length of the training trajectory (the base curve).
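The step-by-step movement of the test curve along the base curve can be sketched as follows. This is a minimal sketch under one assumption about Equation 3.47: it is read as producing one Euclidean distance per offset j, with j running from 1 to ntr − nte as in the later equations. The name `sliding_distances` and the sample trajectories are hypothetical.

```python
import math


def sliding_distances(test, train):
    """Distance between the test curve and each window of the training curve.

    Returns a dict {j: distance} for offsets j = 1 .. n_tr - n_te, where
    offset j compares test point i with training point i + j, i.e. the
    window that starts after skipping j training samples.
    """
    n_te, n_tr = len(test), len(train)
    if n_te >= n_tr:
        raise ValueError("the training curve must be longer than the test curve")
    distances = {}
    for j in range(1, n_tr - n_te + 1):
        window = train[j:j + n_te]  # training segment aligned with the test curve
        distances[j] = math.sqrt(
            sum((te - tr) ** 2 for te, tr in zip(test, window))
        )
    return distances


# Illustrative health-index values only.
train_hi = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]
test_hi = [0.8, 0.7]
for offset, dist in sliding_distances(test_hi, train_hi).items():
    print(offset, round(dist, 4))
```

Storing every distance, rather than only the running minimum, mirrors the description above in which the relation is calculated and stored at each step.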

3.4.3.2 RUL Estimation

[Figure 3.6: Similarity Methodology. Axes: Time, Health Index. Elements shown: run-to-failure degradation progress of the training sample, the test sample, and the most similar segments (a b c, a' b' c', a'' b'' c'').]

Once the testing curve has been moved throughout the baseline, the minimum of the stored pairwise distance values is identified as the best-matching part between the test data and the training baseline, $Mn(te,tr)$:

\[
Mn(te,tr) = \min\left\{ d(te,tr,1),\ d(te,tr,2),\ \cdots,\ d(te,tr,n_{tr}-n_{te}) \right\} \tag{3.48}
\]

The location of the best-matching time instant at the training baseline, $L_{te,tr}$, is determined by:

\[
L_{te,tr} = \arg \operatorname{find}\, \big( d(te,tr,j) = Mn(te,tr) \big); \quad j = 1, 2, \cdots, (n_{tr}-n_{te}) \tag{3.49}
\]
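Equations 3.48 and 3.49 amount to taking the minimum of the stored distances and recording the offset at which it occurs. The sketch below assumes the distances are already available as a mapping from offset j to distance; the name `best_match` and the tie-breaking rule (smallest offset wins) are assumptions of this example, since the text does not say how ties are resolved.

```python
def best_match(distances):
    """Return (minimum distance, offset j where it occurs).

    `distances` maps each offset j to the stored pairwise distance
    d(te, tr, j). Ties are broken in favour of the smallest offset,
    which is an assumption of this sketch.
    """
    if not distances:
        raise ValueError("no distances to search")
    # Iterating over sorted offsets makes min() return the earliest tie.
    location = min(sorted(distances), key=distances.get)
    return distances[location], location


# Illustrative stored distances for offsets 1..4.
stored = {1: 0.30, 2: 0.05, 3: 0.20, 4: 0.45}
minimum, location = best_match(stored)
print(minimum, location)  # 0.05 2
```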

The minimum distance and the estimated RUL are illustrated in Figure 3.7. The baseline data before this location is treated as non-useful information and removed, and the remaining part is used as a representation of the test trajectory's future behaviour.

[Figure 3.7: RUL Estimation. Axes: Time, Health Index. Elements shown: best-matching part of the training sample, the test sample, the parts to be removed, and the remaining useful life estimates RUL1 and RUL2.]

By using the calculated location of the minimum Euclidean distance, a RUL matrix for each test trajectory at each training baseline can be calculated by the following formula:

\[
RUL_{te,tr} = n_{tr} - (n_{te} + L_{te,tr}) \tag{3.50}
\]
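A short sketch of Equation 3.50 follows, applied both to a single test and training pair and to a small RUL matrix with one row per test trajectory and one column per training baseline. The function names, the argument layout, and the numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def estimate_rul(n_tr, n_te, location):
    """RUL of one test trajectory against one training baseline (Eq. 3.50)."""
    rul = n_tr - (n_te + location)
    if rul < 0:
        raise ValueError("the matched window extends past the training curve")
    return rul


def rul_matrix(test_lengths, train_lengths, locations):
    """Build the RUL matrix: rows are test units, columns are training units.

    locations[a][b] is the best-matching offset of test unit a on
    training unit b.
    """
    return [
        [estimate_rul(n_tr, n_te, locations[a][b])
         for b, n_tr in enumerate(train_lengths)]
        for a, n_te in enumerate(test_lengths)
    ]


# Illustrative lengths and offsets only.
print(estimate_rul(200, 50, 120))  # 200 - (50 + 120) = 30
print(rul_matrix([50, 40], [200, 150], [[120, 60], [100, 110]]))
# [[30, 40], [60, 0]]
```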

The length of the time series after the identified location of the best-matching time instant, $L_{te,tr}$, gives the estimated RUL of the test trajectory. However, a single RUL prediction from a single training baseline can be extended to a collaborative group of best-matching training trajectories, and estimation with more training samples can increase prediction performance in multi-step-ahead RUL estimation. In other words, data can be pooled from multiple suppliers covering a wide variety of degradation cases rather than drawn from separate and self-contained sources.
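The text does not specify here how the RULs from several training baselines are combined. Purely as an illustration, the sketch below fuses them with an inverse-distance weighted average over the k closest baselines. The weighting scheme, the function name `fuse_ruls`, and the parameters `k` and `eps` are all assumptions of this example, not the method of the original work.

```python
def fuse_ruls(candidates, k=3, eps=1e-9):
    """Fuse per-baseline RUL estimates into one value.

    `candidates` is a list of (minimum_distance, rul) pairs, one per
    training baseline. The k closest baselines are kept and averaged
    with weights 1 / (distance + eps). This weighting is an assumed,
    illustrative choice.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    best = sorted(candidates)[:k]  # smallest distances first
    weights = [1.0 / (dist + eps) for dist, _ in best]
    weighted_sum = sum(w * rul for w, (_, rul) in zip(weights, best))
    return weighted_sum / sum(weights)


# Illustrative candidates: (minimum distance, estimated RUL).
print(fuse_ruls([(0.10, 30), (0.10, 50), (0.90, 100)], k=2))
```

Closer baselines receive larger weights, so a poorly matching training unit has little influence on the fused estimate.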

In document An adaptive data filtering model for remaining useful life estimation (Page 137-142)