Using Topology for Fault Identification - Exploiting process topology for optimal process monit

In addition to their use for blocking process data into groups according to strongly connected components, connectivity graphs can be used for fault identification to determine possible root causes of faults. This is achieved by tracing faults from the variables that showed symptoms of the fault back to the variables that were its root causes.

3.5.1. Change in connectivity for identification of symptom nodes

To use the connectivity graph to trace faults to their root causes, the symptom nodes, i.e. the variables that showed symptoms of the fault, must first be identified. Fault conditions result in a change in the connectivity structure of a process. This change can be due to a change in the physical or chemical behaviour of the process, a change in the control action, or some other change in the process behaviour. By comparing the connectivity extracted from process data under normal operating conditions (NOC) to that extracted from fault condition data, information can be gained about the fault conditions to aid root cause analysis.

This change in causality, or connectivity, due to faults was demonstrated by Chiang and Braatz (2003), who used it for detection of faults. They used two causality measures, the modified distance (DI), which is similar to transfer entropy since it is also a measure of mutual information, and the causal dependency (CD) (based on the T² statistic). They compared the values obtained during fault conditions to those obtained during NOC. Significant deviation of the observed values from those observed under NOC provided an indication of a fault condition. The variables that showed the highest DI and CD were identified as symptom nodes, and then variables that were highly correlated (connected) with these variables were considered to be possible root nodes. Their method was applied to faults in the Tennessee Eastman Process case study and it was found that it performed very well for fault detection, with MAR of about 15%, but, more importantly, also performed well for fault identification, allowing an expert to highlight propagation paths correctly and allowing the root cause to be identified in most cases.
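The comparison of NOC and fault connectivity can be pictured with a minimal Python sketch. It does not implement the DI or CD measures of Chiang and Braatz (2003). As a stand-in, it assumes that each connectivity graph is available as a square matrix of edge weights (row = source variable, column = target variable) and scores each variable by the total absolute change in the weights of its entering and leaving edges. The function name `rank_symptom_candidates`, the scoring rule and the example weights are all hypothetical.

```python
def rank_symptom_candidates(w_noc, w_fault, names, top_k=2):
    """Rank variables by the total absolute change in their edge weights.

    w_noc, w_fault: square lists of lists, entry [i][j] is the weight of the
    edge from variable i to variable j (assumed representation).
    """
    n = len(names)
    scores = []
    for i in range(n):
        # Change over edges leaving variable i (row i).
        out_change = sum(abs(w_fault[i][j] - w_noc[i][j]) for j in range(n))
        # Change over edges entering variable i (column i).
        in_change = sum(abs(w_fault[j][i] - w_noc[j][i]) for j in range(n))
        scores.append((names[i], out_change + in_change))
    # Largest change first; the top_k entries are the symptom candidates.
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_k]


# Hypothetical three-variable example.
names = ["x1", "x2", "x3"]
w_noc = [[0.0, 0.8, 0.0],
         [0.0, 0.0, 0.6],
         [0.0, 0.0, 0.0]]
w_fault = [[0.0, 0.2, 0.0],
           [0.0, 0.0, 0.5],
           [0.0, 0.0, 0.0]]
ranked = rank_symptom_candidates(w_noc, w_fault, names)
candidate_names = [name for name, _ in ranked]
```

In this example the edge from x1 to x2 weakens the most, so x2 and x1 are ranked as the leading candidates.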

3.5.2. Back propagation in connectivity graphs for fault identification

Once symptom nodes have been identified, either from contribution plots or from connectivity change, the connectivity graph can be used to trace them back to possible root causes.

Connectivity maps have been widely used for fault diagnosis, typically by identifying possible fault propagation paths. One method for inference of propagation paths uses expert systems. This type of rule-based inference can only be used when a set of expert rules is available (Yang and Xiao, 2012), which makes it a very limited method. Bayesian nets have also been used for inference (Yang and Xiao, 2012); the probability and conditional probability of fault conditions are used to infer the probability of the fault occurring.

However, the most common method for finding the root cause of a fault using a connectivity map is depth-first traversal on the map (Iri et al., 1979; Venkatasubramanian et al., 2003a; Yang and Xiao, 2012). This method constructs a propagation path by moving to adjacent nodes until no further edges are found. A node that has been identified as having fault conditions associated with it is taken as the starting point, and possible propagation paths are traced back until a node is found that has no entering edges (a root node). Applying this method will, however, generally just trace a fault back to the first node in the graph, or the first variable in the process. In a graph that has captured control and recycle loops in its connectivity structure, it would be difficult to determine if a root node was captured inside one of these loops. This method also does not account for the weights of the edges in the graph, i.e. the strength of the connections between the variables. A propagation path that follows strongly connected variables from a possible root node to a symptom node is more likely to be a true representation of the actual fault propagation path.
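The depth-first back-tracing described above, and its two limitations, can be illustrated with a minimal Python sketch. The graph is assumed to be given as a list of directed (source, target) edges; the function name `trace_back_roots` and the node names are hypothetical.

```python
def trace_back_roots(edges, symptom):
    """Trace back from a symptom node, depth first, and return every
    ancestor that has no entering edges (a root node)."""
    # Map each node to the set of nodes with an edge pointing into it.
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)

    roots, visited, stack = set(), set(), [symptom]
    while stack:
        node = stack.pop()
        if node in visited:
            continue  # already explored; prevents endless cycling in loops
        visited.add(node)
        preds = parents.get(node, set())
        if not preds:
            roots.add(node)  # no entering edges: a root node
        stack.extend(preds - visited)
    return roots


# Hypothetical process with a recycle loop between reactor and separator.
edges = [("feed", "reactor"), ("reactor", "separator"),
         ("separator", "reactor"), ("reactor", "product")]
roots_with_feed = trace_back_roots(edges, "product")

# A graph consisting only of a loop has no node without entering edges.
roots_in_loop = trace_back_roots([("a", "b"), ("b", "a")], "a")
```

In the first example the trace ends at the first variable in the process (`feed`), whether or not the fault originated inside the recycle loop. In the second, no root node is found at all. Edge weights play no part in either result.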

It is therefore proposed to use a slightly modified back propagation method that involves finding all the shared ancestors of the identified symptom nodes, finding the shortest distances from these ancestors to the symptom nodes by taking into account the weights of the edges in the graph, and then finding the furthest shared ancestors.
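The three steps of the proposed method can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation used in this work: each edge is assumed to already carry a positive distance (how connection weights are converted to distances is not specified here), the distances from an ancestor to the individual symptom nodes are combined by summation, and the function names `distances_to` and `furthest_shared_ancestors` are hypothetical.

```python
import heapq


def distances_to(target, edges):
    """Shortest distance from every ancestor of `target` to `target`,
    found with Dijkstra's algorithm on the reversed edges."""
    reverse = {}
    for (src, dst), dist in edges.items():
        reverse.setdefault(dst, []).append((src, dist))

    best = {target: 0.0}
    heap = [(0.0, target)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > best.get(node, float("inf")):
            continue  # stale heap entry
        for parent, dist in reverse.get(node, []):
            nd = d + dist
            if nd < best.get(parent, float("inf")):
                best[parent] = nd
                heapq.heappush(heap, (nd, parent))
    return best


def furthest_shared_ancestors(edges, symptoms):
    """Return the shared ancestors with the largest summed shortest
    distance to the symptom nodes, plus the summed distances."""
    per_symptom = [distances_to(s, edges) for s in symptoms]
    # Step 1: nodes from which every symptom node can be reached.
    shared = set.intersection(*(set(d) for d in per_symptom))
    # Step 2: summed shortest distance (assumed way of combining symptoms).
    totals = {a: sum(d[a] for d in per_symptom) for a in shared}
    if not totals:
        return [], totals
    # Step 3: keep the ancestors that are furthest away.
    far = max(totals.values())
    return sorted(a for a, t in totals.items() if t == far), totals


# Hypothetical graph: {(source, target): assumed edge distance}.
edges = {("feed", "reactor"): 1.0, ("reactor", "temp"): 2.0,
         ("reactor", "press"): 1.0, ("cooling", "temp"): 1.0}
roots, totals = furthest_shared_ancestors(edges, ["temp", "press"])
```

Here `cooling` is excluded because it is an ancestor of `temp` only, and `feed` is returned because it is further from the symptom nodes than `reactor`.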

Chapter 4 - Fault Diagnosis Methodology

This chapter presents the proposed fault diagnosis methodology, which combines the techniques described in Chapters 2 and 3, as well as the methodology followed to determine which combination of techniques performed best according to the aims of the project.

4.1. Fault Diagnosis Techniques

The various topology extraction, pre-processing, fault detection and fault identification techniques considered for this research include: three topology extraction methods (TEM), namely linear cross-correlation (LC), partial cross-correlation (PC) and transfer entropy (TE); two pre-processing methods (PPM), namely the unblocked case (i.e. no pre-processing) and blocking; two feature extraction methods (FEM), namely principal components analysis (PCA) and kernel principal components analysis (KPCA); three monitoring chart methods (MCM), namely Shewhart, exponentially weighted moving average (EWMA) and cumulative sum (CUSUM) monitoring charts; and two fault identification methods (FIM), namely contributions and connectivity change. Figure 4-1 summarises the techniques considered, with a reference to where each technique was discussed in Chapter 2 and Chapter 3.

TEM
TEM1: LC (Chapter 3.3.3)
TEM2: PC (Chapter 3.3.4)
TEM3: TE (Chapter 3.3.5)

PPM
PPM1: Unblocked
PPM2: Blocked (Chapter 3.4 and Chapter 2.6)

FEM
FEM1: PCA (Chapter 2.4)
FEM2: KPCA (Chapter 2.5)

MCM
MCM1: Shewhart (Chapter 2.7.1)
MCM2: EWMA (Chapter 2.7.2)
MCM3: CUSUM (Chapter 2.7.3)

FIM
FIM1: Contributions (Chapter 2.4.3)
FIM2: Connectivity Change (Chapter 3.5)

Figure 4-1: Summary of fault diagnosis techniques considered

Each possible combination of methods was applied in order to determine which combination resulted in the best fault detection and identification performance. The performance was evaluated based on the following performance metrics:

• Fault detection: AUCs and DDs

• Fault identification: Location identified for each fault
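Taken at face value, applying each possible combination of the techniques in Figure 4-1 means evaluating 3 × 2 × 2 × 3 × 2 = 72 combinations. The following Python sketch enumerates them; the dictionary layout and variable names are illustrative assumptions, and the evaluation of each combination is not shown.

```python
from itertools import product

# Technique options per stage, as listed in Figure 4-1.
TECHNIQUES = {
    "TEM": ["LC", "PC", "TE"],
    "PPM": ["Unblocked", "Blocked"],
    "FEM": ["PCA", "KPCA"],
    "MCM": ["Shewhart", "EWMA", "CUSUM"],
    "FIM": ["Contributions", "Connectivity change"],
}

# One dictionary per combination, e.g. {"TEM": "LC", "PPM": "Unblocked", ...}.
combinations = [
    dict(zip(TECHNIQUES, choice))
    for choice in product(*TECHNIQUES.values())
]
n_combinations = len(combinations)
```

Each entry of `combinations` could then be passed to whatever routine computes the fault detection and fault identification metrics for that combination.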
