2.1 FAULT DIAGNOSIS METHODS
2.1.1 Data-Driven Methods
Data-driven methods mainly use multivariate statistical analysis for FDD. They rely on relationships between multiple measurements of a system, but use them implicitly through analysis of historical data. For this reason, such methods are also referred to as process history-based methods (Venkatasubramanian, Rengaswamy, Kavuri, & Yin, 2003). Since the challenging task of explicit system modeling is not required, data-driven methods are attractive for practical FDD applications. They are particularly suitable for FDI in steady-state systems. In fact, data-driven methods have been successfully used for FDI in sensors, machines, and processes of various industrial systems. However, a key limitation is that a data-driven model works well only within the operational range represented by its training data.
Data-driven methods can be applied to fault diagnosis in two different ways. The first approach is based on transformations of a set of measurements using multivariate statistical algorithms. Some popular algorithms using this approach include principal component
analysis (PCA) (Dunia & Qin, 1998; Jolliffe, 2002; Wise & Gallagher, 1996), partial least squares (PLS) (Geladi & Kowalski, 1986; Wise & Gallagher, 1996; Kourti &
MacGregor, 1996; Qin & McAvoy, 1992; Wold, 1994; Qin, 1998; Rosipal & Kramer, 2006), independent component analysis (ICA) (Hyvärinen & Oja, 2000; Ding, Gribok, Hines, & Rasmussen, 2004), Fisher linear discriminant analysis (FDA or LDA) (Chiang, Russell, & Braatz, 2000; Chiang, Kotanchek, & Kordon, 2004; He, Qin, & Wang, 2005), and nonlinear extensions of those algorithms (Mika, Rätsch, Weston, Schölkopf, & Müller, 1999; Baudat & Anouar, 2000; Bach & Jordan, 2003; Lee, Qin, & Lee, 2007; Zhang & Qin, 2007). In the second approach, a fault is detected and isolated by comparing a set of measurement data with analytical
estimates generated by a data-driven model. Popular algorithms in this approach include PCA, artificial neural networks (ANN) (Mehrotra, Mohan, & Ranka, 1997; Venkatasubramanian,
Rengaswamy, Kavuri, et al., 2003), the multivariate state estimation technique (MSET) (Herzog, Wegerich, & Gross, 1998; Hines & Usynin, 2005), auto-associative kernel regression (AAKR) (Garvey & Hines, 2006), and cross calibration (Hashemian, 2006). ANN and MSET have been used for a large variety of FDD applications (Watanabe, Matsura, Abe, Kubota, & Himmelblau, 1989; Venkatasubramanian, Vaidyanathan, & Yamamoto, 1990; Kramer, 1992; Nieman & Singer, 2002; Hines & Davis, 2005; White, Gross, Kubic, & Wigeland, 1994; Gross, Wegerich, Singer, & Mott, 1996; GE, 2014).
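The second approach, comparing measurements with estimates from a data-driven model, can be sketched with AAKR. The following is a minimal illustration only, assuming a Gaussian kernel on Euclidean distance; the function name `aakr_estimate`, the bandwidth value, and the synthetic memory matrix of four redundant sensors are hypothetical and are not taken from the cited works.

```python
import numpy as np


def aakr_estimate(memory, query, bandwidth):
    """Estimate the fault-free value of `query` from stored fault-free samples."""
    d2 = np.sum((memory - query) ** 2, axis=1)      # squared Euclidean distances
    # Gaussian kernel weights; subtracting the minimum distance leaves the
    # normalised weights unchanged and avoids underflow for distant queries.
    w = np.exp(-(d2 - d2.min()) / (2.0 * bandwidth ** 2))
    return (w @ memory) / w.sum()                   # weighted average of memory rows


# Hypothetical memory matrix: four redundant sensors over the operating range.
level = np.linspace(0.0, 1.0, 201)
memory = np.column_stack([level, level, level, level])

x_ok = np.array([0.5, 0.5, 0.5, 0.5])
x_faulty = np.array([0.5, 0.5, 0.5, 0.9])           # bias of +0.4 on sensor index 3

# Residual = measurement minus model estimate; a large residual flags a fault.
residual_ok = x_ok - aakr_estimate(memory, x_ok, bandwidth=0.1)
residual_faulty = x_faulty - aakr_estimate(memory, x_faulty, bandwidth=0.1)
print("fault-free residuals:", np.round(residual_ok, 3))
print("faulty residuals:", np.round(residual_faulty, 3))
```

In this toy case the biased sensor shows the largest residual, but the estimate is also pulled toward the faulty reading, so the healthy sensors show smaller residuals of opposite sign. This spillover is one reason residual thresholds need care in practice.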
It is interesting to note that PCA is one of the best known algorithms in both approaches. PCA is basically a linear projection of a set of data onto a lower-dimensional principal component subspace, where the maximum variance is captured. The principal components reveal how the variables are correlated with each other. Projections onto the non-principal subspace are considered residuals (Dunia & Qin, 1998; Jolliffe, 2002). Standard PCA can be conveniently trained by applying singular value decomposition (SVD) to the covariance matrix of some historical data (Wise & Gallagher, 1996). Faults in the
measurement data will break down the normal correlations and increase the residuals. Fault detection can be achieved by comparing the squared prediction error (SPE) with a threshold. The faulty sensor can be isolated using techniques such as contribution plots and sensor reconstruction (Wise & Gallagher, 1996; Dunia, Qin, Edgar, & McAvoy, 1996; Qin, 2003; Qin, 2012). PCA has a simple structure, is easy to train, and is a powerful
tool that captures the maximum variance in correlated data. It is a very popular choice for FDI in real systems (Kaistha & Upadhyaya, 2001; Upadhyaya, Zhao, & Lu, 2003; Ma & Jiang, 2009). Standard PCA has been extended to obtain variant algorithms such as recursive PCA (Li, Yue, Valle-Cervantes, & Qin, 2000), dynamic PCA (Ku, Storer, & Georgakis, 1995; Russell, Chiang, & Braatz, 2000; Chen & Liu, 2002; Lee, Choi, & Lee, 2004), multi-way PCA (Wise & Gallagher, 1996; Nomikos & MacGregor, 1994), and multi-scale PCA (Bakshi, 1999; Yoon & MacGregor, 2004). PCA has also been used in hybrids with model-based FDD methods to improve performance (Gertler, Li, Huang, & McAvoy, 1999; Qin & Li, 1999; Li & Qin, 2001).
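The PCA procedure described above (training by SVD of the covariance matrix, detection by comparing the SPE with a threshold, isolation by per-sensor contributions) can be sketched as follows. This is an illustrative example rather than the method of any cited paper: the synthetic data set, the function names, and the use of an empirical 99th-percentile control limit in place of an analytical one are all assumptions.

```python
import numpy as np


def fit_pca(X_train, n_components):
    """Fit PCA to fault-free data (rows = samples, columns = sensors)."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std                # standardise each sensor
    cov = np.cov(Z, rowvar=False)             # sample covariance matrix
    U, _, _ = np.linalg.svd(cov)              # columns of U are the loading vectors
    return mean, std, U[:, :n_components]     # keep the principal subspace


def spe_and_contributions(x, mean, std, P):
    """Return the SPE of one sample and each sensor's contribution to it."""
    z = (x - mean) / std
    residual = z - P @ (P.T @ z)              # part of z outside the principal subspace
    contributions = residual ** 2             # per-sensor share of the SPE
    return contributions.sum(), contributions


# Hypothetical data: four redundant sensors observing one latent variable.
rng = np.random.default_rng(0)
latent = rng.standard_normal((500, 1))
X_train = latent @ np.ones((1, 4)) + 0.05 * rng.standard_normal((500, 4))

mean, std, P = fit_pca(X_train, n_components=1)

# Empirical control limit (assumption): 99th percentile of the training SPE.
train_spe = np.array([spe_and_contributions(x, mean, std, P)[0] for x in X_train])
threshold = np.percentile(train_spe, 99)

x_faulty = np.array([0.3, 0.3, 1.3, 0.3])     # bias of +1.0 on sensor index 2
spe, contrib = spe_and_contributions(x_faulty, mean, std, P)
print("fault detected:", spe > threshold)
print("largest contribution from sensor index:", int(np.argmax(contrib)))
```

With a single biased sensor the largest contribution falls on that sensor. When several sensors are faulty at once, the residual is shared among them and among healthy sensors, which is why contribution-based isolation becomes less reliable.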
Standard PCA is a linear method. Large errors can be induced when PCA is applied to data containing nonlinearities. A few nonlinear PCA methods (Kramer, 1991; Webb, 1996; Dong & McAvoy, 1996) have been developed. However, complicated nonlinear optimization is often required, which carries the risk of converging to local optima. In addition, the model structures usually need to be specified a priori. Moreover, existing fault isolation techniques for PCA, notably the contribution plot (Kramer, 1991) and the sensor validity index (SVI) based on sensor reconstruction (Dunia et al., 1996), may not give reliable isolation results if more than one sensor fault exists at the same time. A more recent development is the combination of kernel-based nonlinear learning methods with PCA to obtain kernel PCA (KPCA) (Schölkopf, Smola, & Müller, 1998; Mika, Schölkopf, et al., 1999). This technique has been adopted in some FDD studies (Lee, Yoo, Choi, Vanrolleghem, & Lee, 2004; Choi, Lee, Lee, Park, & Lee, 2005; Ma & Jiang, 2012). KPCA first maps
measurements from the input space onto a feature space via nonlinear mapping functions. Procedures used in linear PCA can then be directly applied in the feature space. Through the use of kernel functions, dot products in the feature space can be computed implicitly (Aizerman, Braverman, & Rozonoer, 1964). Nonlinear optimizations and a priori model structure specifications, as required in other nonlinear PCA techniques, are not involved in KPCA. KPCA has been studied for fault detection applications (Lee et al., 2004; Choi et al., 2005) in analogy to PCA. For fault detection, an SPE in the feature space can be calculated and compared with a predetermined threshold (Lee et al., 2004; Choi et al., 2005). Fault isolation and identification are more difficult for KPCA (Schölkopf & Smola, 2002). One possible approach is to reconstruct new measurements from the training data.
After replacing the output of a sensor with the reconstructed value, an SVI, called the fault index by Choi et al. (2005), can be defined for this sensor as the ratio between the SPEs after and before the reconstruction. A sensor with a considerably reduced fault index is considered faulty. However, the fault index may not provide reliable results if more than one fault exists at the same time. In addition, the direction and magnitude of a detected fault cannot be identified.
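The detection step of KPCA can be sketched as below. This is one common formulation, given here as an assumption rather than as the exact procedure of the cited studies: a Gaussian kernel is used, and the SPE is taken as the squared norm of the centred feature vector minus the sum of its squared scores on the retained components. The kernel width, the number of components, and the synthetic two-sensor data are hypothetical. Reconstruction-based isolation (the fault index) is not covered.

```python
import numpy as np


def rbf_kernel(A, B, width):
    """Gaussian kernel matrix with entries exp(-||a - b||^2 / width)."""
    d2 = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / width)


def fit_kpca(X, n_components, width):
    """Fit kernel PCA on fault-free data (rows = samples)."""
    n = X.shape[0]
    K = rbf_kernel(X, X, width)
    J = np.full((n, n), 1.0 / n)
    Kc = K - J @ K - K @ J + J @ K @ J          # centre the data in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)       # eigenvalues in ascending order
    lam = eigvals[::-1][:n_components]          # largest eigenvalues first
    V = eigvecs[:, ::-1][:, :n_components]
    alphas = V / np.sqrt(lam)                   # unit-norm components in feature space
    return {"X": X, "K": K, "alphas": alphas, "width": width}


def kpca_spe(model, x):
    """SPE of one sample in feature space, computed only through kernel values."""
    X, K, alphas, width = model["X"], model["K"], model["alphas"], model["width"]
    k = rbf_kernel(x[None, :], X, width).ravel()           # k(x, x_i)
    k_centred = k - k.mean() - K.mean(axis=0) + K.mean()   # centred kernel vector
    scores = alphas.T @ k_centred                          # nonlinear principal scores
    norm2 = 1.0 - 2.0 * k.mean() + K.mean()     # k(x, x) = 1 for the Gaussian kernel
    return norm2 - np.sum(scores ** 2)


# Hypothetical nonlinear data: sensor 2 is approximately the square of sensor 1.
rng = np.random.default_rng(0)
s = rng.uniform(-1.0, 1.0, 200)
X_train = np.column_stack([s, s ** 2]) + 0.02 * rng.standard_normal((200, 2))

model = fit_kpca(X_train, n_components=5, width=1.0)
train_spe = np.array([kpca_spe(model, x) for x in X_train])
threshold = np.percentile(train_spe, 99)        # empirical limit (assumption)

spe_ok = kpca_spe(model, np.array([0.5, 0.25]))       # consistent with training data
spe_faulty = kpca_spe(model, np.array([0.5, 3.25]))   # bias of +3.0 on sensor 2
print("SPE fault-free:", spe_ok, "SPE faulty:", spe_faulty, "limit:", threshold)
```

All quantities are obtained from kernel evaluations against the training samples, so no nonlinear optimization or explicit mapping function is needed, which matches the advantage of KPCA noted above.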
Overall, data-driven methods do not need an explicit model of a system. Therefore, they are flexible for applications in practical systems. In fact, they have been favored choices for FDD in various industries. The major limitation is that the models work well only within the range of the training data. PCA is probably the most widely used algorithm in practical FDD systems. PCA has been extended to nonlinear applications using kernel functions; however, the techniques used to isolate faulty sensors can be unreliable.