CHAPTER 4 Case Study 1 – Three Tank Simulation

4.5 Model Application for On-line Fault Detection and End-of-Batch Prediction

4.5.1 On-line Fault Detection

The monitoring results for the NOC test data synchronised with window sizes of 5 and 20 are presented in table 4.6. Every alarm recorded during the NOC test batch was classed as a false alarm. For the fault test batch, every alarm recorded before the 160th interval (when the fault occurred) was classed as a false alarm, and every interval after the 160th in which no alarm was recorded was classed as a missing alarm. The missing alarm rate was calculated using the number of intervals after the fault occurrence (462).
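The classification rules above can be sketched in a few lines of Python. This is only an illustration: the function name `summarise_alarms`, the boolean alarm vector and the definition of detection delay as the offset of the first post-fault alarm are assumptions, not taken from the thesis. The fault interval itself is grouped with the pre-fault intervals so that 622 - 160 = 462 post-fault intervals remain.

```python
from typing import Optional, Sequence


def summarise_alarms(alarms: Sequence[bool], fault_interval: Optional[int] = None) -> dict:
    """Count false alarms, missing alarms and detection delay for one batch.

    alarms[k] is True when a monitoring statistic exceeded its limit at
    interval k + 1. fault_interval is the 1-based interval at which the
    fault occurred, or None for an NOC batch (hypothetical interface).
    """
    if fault_interval is None:
        # NOC batch: every alarm is a false alarm.
        false_alarms = sum(bool(a) for a in alarms)
        return {
            "false_alarms": false_alarms,
            "false_alarm_rate": false_alarms / len(alarms),
        }

    # Assumption: intervals 1..fault_interval are pre-fault, the rest post-fault.
    before = alarms[:fault_interval]
    after = alarms[fault_interval:]

    false_alarms = sum(bool(a) for a in before)
    # A post-fault interval without an alarm is a missing alarm.
    missing_alarms = sum(not a for a in after)
    # Delay = number of intervals after the fault until the first alarm.
    delay = next((k + 1 for k, a in enumerate(after) if a), None)

    return {
        "false_alarms": false_alarms,
        "missing_alarms": missing_alarms,
        "missing_alarm_rate": missing_alarms / len(after),
        "detection_delay": delay,
    }
```

For an NOC batch of 433 intervals with 154 alarms, this gives a false alarm rate of 154 / 433 ≈ 0.356, which matches the first row of table 4.6.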

Table 4.6: Fault Detection Results for NOC Test Batch – Case Study 1

| Window Size | No. Intervals | Phase Division Method | Number False Alarms | False Alarm Rate |
|---|---|---|---|---|
| 5 | 433 | Global | 154 | 0.356 |
| | | Operational Phases | 130 | 0.300 |
| | | MP Algorithm | 133 | 0.307 |
| 20 | 433 | Global | 143 | 0.330 |
| | | Operational Phases | 32 | 0.073 |
| | | MP Algorithm | 33 | 0.076 |

A high false alarm rate (FAR) was recorded for the NOC test batch when synchronising with a window width of 5. The majority of these false alarms came from the SPEx statistic at the end of the batch, as shown in figure 4.12 for the operational phase model. They are caused by suboptimal synchronisation at the end of the batch run. Figure 4.13 shows fewer false alarms in this region for a window width of 20. A plot of the synchronisation paths for the two window sizes (figure 4.14) shows that different paths were taken at the 296th signal interval. The greedier approach of the smaller window size assumes the batch to be further along than it actually is, and excessively expands the trajectory. This is probably due to the slower dynamics of the variables in this part of the process, which the smaller window size cannot correct for. This suboptimal warping may need to be taken into account when defining the monitoring chart limits. The limits were calculated from data synchronised off-line; if the greedy approach were instead applied to the calibration data, the limits would reflect the suboptimal warping. However, this could leave the model susceptible to missing alarms.

The global model did not significantly reduce its false alarm rate when a larger window width was used, and the cause lies in the second operational phase of the process. The limits defined in this phase for the global model are quite low compared with the other models, so more false alarms were recorded. This is a symptom of using the training data to define the limits, as opposed to a separate set of calibration data. These limits should be recalibrated and relaxed somewhat.

Figure 4.13: SPE Chart for NOC Test Data (Operational Phase Models, γ = 20)

The fault detection results for the faulty batch for both window sizes are shown in table 4.7. The fault introduced was a small leak in tank 1, which resulted in the process taking much longer to reach completion. Compared with the NOC batch, much better performance is seen for the smaller window size.

Table 4.7: Fault Detection Results for Fault Batch – Case Study 1

| Window Size | No. Intervals | Phase Division Method | False Alarms | Missing Alarms | Detection Delay |
|---|---|---|---|---|---|
| 5 | 622 | Global | 12 | 76 | 34 |
| | | Operational Phases | 2 | 45 | 52 |
| | | MP Algorithm | 20 | 48 | 41 |
| 20 | 622 | Global | 81 | 172 | 37 |
| | | Operational Phases | 8 | 138 | 52 |
| | | MP Algorithm | 20 | 273 | 54 |

For all the approaches, the fault was detected 34 to 54 intervals after it occurred (table 4.7). The reason for this delay probably lies with the nature of the fault, as it did not drastically change the variable trajectories. The fault was identified only after tank 2 had reached its set point, at which point the deviation of tank 1 from its set point was great enough that the data no longer fitted the model.

After the fault had been identified, the larger window size allowed the trajectories to be compressed and many missing alarms occurred, as the synchronisation paths in figure 4.15 show. The fault affected the time taken for the batch to reach completion, and because synchronisation warps the trajectories to a pseudo time, this information was lost when the trajectory was compressed. This did not occur for the smaller window size, for which the synchronisation paths show that a relatively consistent linear path was maintained in this part of the process.

Using the smaller window size might therefore be advantageous in this part of the process, as it can detect faults such as this one that affect the time taken for the batch to reach completion. The false alarms that may occur at the end of the process could be obviated by using a dynamic window size that changes from 5 intervals to 20 intervals once a particular reference point has been reached. Alternatively, time could be included as a variable, considering it is such an important part of the process.
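The dynamic window size suggested above could be sketched as a simple switching rule. The function name `window_size` and the `switch_index` parameter are hypothetical; the text does not specify which reference point would trigger the change, so the switch point is left as an input to be chosen by the user.

```python
def window_size(reference_index: int, switch_index: int,
                small: int = 5, large: int = 20) -> int:
    """Return the synchronisation window size for the current position.

    reference_index is the current position on the reference trajectory.
    switch_index is the (hypothetical, user-chosen) reference point at
    which the window changes from `small` to `large` intervals.
    """
    if reference_index < switch_index:
        # Early in the batch: the small window keeps a near-linear path,
        # so faults that stretch the batch duration stay visible.
        return small
    # Late in the batch: the larger window avoids the over-expansion
    # that caused false alarms with the small window.
    return large
```

The rule is deliberately a hard switch. A gradual increase in window size would be another option, but nothing in the results above indicates which would perform better.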

Figure 4.15: Synchronisation Path Deviations for Fault Test Batch

In almost all the alarms detected, only the SPEx statistic exceeded the control limits imposed. The Hotelling's T² statistic rarely exceeded its limit. The limit was calculated from the F-statistic at the chosen confidence level with A and I – A degrees of freedom. This limit is quite high if a smaller number of training batches (I) are used and fewer PCs (A) are retained, which is the case for this simulation. Adjusting these control limits is recommended, as the Hotelling's T² does increase in some of the cases when the fault occurred, just not enough to exceed the imposed limit (figure 4.16).
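For reference, one commonly used form of the T² control limit for new observations is given below. The text only states that the limit is based on the F-statistic with A and I – A degrees of freedom, so the scaling factor shown here is an assumption and may differ from the one actually used; α denotes the significance level.

```latex
T^2_{\lim} = \frac{A\,(I^2 - 1)}{I\,(I - A)}\; F_{\alpha}(A,\; I - A)
```

With few training batches, I – A is small and the F quantile grows quickly, which is consistent with the high limit observed for this simulation.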

Overall, the MPPLS and operational phase models showed the best performance. Their false alarm rates for the NOC batch were closest to the 5% expected from the confidence limits. However, the detection delay was longer for these models than for the global model, which could be a result of the confidence limits being high enough to allow only a small number of false alarms. It is inconclusive whether these missing alarms are a result of the nature of the fault itself, although the deviation from the model occurred consistently around the 200th interval for both phase-based models. The number of false alarms recorded before the fault occurred in the fault batch was much lower for the operational phase model than for the MPPLS model, which is due to the lower limits in the second operational phase for the MPPLS model. The limits would probably need to be relaxed slightly in this region. Based on these results, the best monitoring scheme would probably use the operational phase model with a dynamic window size that changes when tank 2 gets closer to its set point.

Figure 4.16: T² Chart for Fault Test Data (Operational Phase Models, γ = 5)

In document On-line fault detection and end-of-batch quality prediction for batch processes incorporating on-line synchronisation and phase identification (Page 92-97)