Classification Algorithms based on Generalized Polynomial Chaos

by

Yuncheng Du

A thesis

presented to the University of Waterloo

in fulfillment of the

thesis requirement for the degree of

Doctor of Philosophy

in

Chemical Engineering

Waterloo, Ontario, Canada, 2016

AUTHOR'S DECLARATION

I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners.

Abstract

Classification is one of the most important tasks in process systems engineering. Since most classification algorithms are based on mathematical models, they inherently involve the quantification and propagation of model uncertainty onto the variables used for classification. Such uncertainty may originate either from a lack of knowledge of the underlying process or from intrinsic time varying phenomena such as unmeasured disturbances and noise. Model uncertainty has often been modeled in a probabilistic way, and Monte Carlo (MC) type sampling methods have been the method of choice for quantifying its effects. However, MC methods may be computationally prohibitive, especially for complex nonlinear systems and systems involving many variables.

Alternatively, stochastic spectral methods such as the generalized polynomial chaos (gPC) expansion have emerged as a promising technique that can be used for uncertainty quantification and propagation. Such methods can approximate the stochastic variables by a truncated gPC series where the coefficients of these series can be calculated by Galerkin projection with the mathematical models describing the process. Following these steps, the gPC expansion based methods can converge much faster to a solution than MC type sampling based methods.

Using the gPC based uncertainty quantification and propagation method, this project focuses on the following three problems: (i) fault detection and diagnosis (FDD) in the presence of stochastic faults entering the system; (ii) simultaneous optimal tuning of an FDD algorithm and a feedback controller to enhance the detectability of faults while mitigating the closed loop process variability; (iii) classification of apoptotic cells versus normal cells using morphological features identified from a stochastic image segmentation algorithm in combination with machine learning techniques. The algorithms developed in this work are shown to be computationally efficient and to provide improved fault diagnosis and accurate classification of apoptotic versus normal cells.

Acknowledgements

First and foremost, I would like to express my deepest gratitude to Professor Hector M. Budman and Professor Thomas A. Duever. I truly appreciate their continuous support, encouragement, understanding, and their selfless dedication to both my personal and academic development. Without their help, this work would not have been possible.

I also would like to thank the members of my Ph.D. examining committee: Professor Sirish L. Shah, Professor Fakhri O. Karray, Professor Ali Elkamel and Professor Luis Ricardez-Sandoval, for devoting the time to reading my thesis and for providing valuable suggestions.

I also thank everyone in my research group for all the memories we have shared over the last four years, and all my friends for the wonderful moments.

Dedication

To you as a reader.

List of Figures

Figure 2.1 General scheme of fault detection and diagnosis

Figure 3.1 Faults profiles

Figure 3.2 The grid points for two-dimensional heat conduction problem

Figure 3.3 Flowchart to formulate the adaptive optimization model

Figure 3.4 FDD algorithm by using the PDF profiles of measured variables

Figure 3.5 Joint Confidence Region (JCR) array

Figure 3.6 Sketch of JCR based FDD algorithm

Figure 3.7 Mean and variance distribution over two-dimensional domain

Figure 3.8 Sensors placement for model optimization (top-left part of the square domain)

Figure 3.9 PDF profiles of six classes at grid point 8 by gPC model (Q = -100)

Figure 3.10 Fault detection rate for single fault with gPC model

Figure 3.11 Fault detection rate for single fault by gPC model with 10 replicates

Figure 3.12 Fault detection rate for single fault with different weights

Figure 3.13 Mean and variance distribution over two-dimensional domain

Figure 3.14 Sensor placements for Case II (one stochastic boundary)

Figure 3.15 JCRs for two measurements at sensor 1 and 3 with a 99% confidence interval

Figure 3.16 Comparisons of expected value (a) and variance (b) between gPC and MC

Figure 3.17 Comparison of model calibration results between gPC and MC (single fault)

Figure 3.18 PDF profiles of six classes at grid 8 by MC (Q = -100, 10,000 samples)

Figure 3.19 Comparison of result at each grid point between gPC and MC (Q = -100)

Figure 4.1 Fault profile representing an intermittent stochastic input fault and resulting measured variable

Figure 4.2 Visual interpretation of FDD with the level-1 algorithm

Figure 4.3 Two reactors in series with separator and recycle unit

Figure 4.4 Comparisons of the gPC model and MC simulations using controlled variable T1

Figure 4.5 Multi-level pseudo random sequence

Figure 4.6 The PDF profiles of the measured variable (Q1) at 3 operating modes

Figure 4.7 Illustration of Bayesian inference estimation based fault detection

Figure 4.8 Illustration of Maximum likelihood based fault estimator

Figure 5.1 Fault profile representing an intermittent stochastic input fault and resulting measured variable

Figure 5.2 The PDF profiles of measured variables

Figure 5.3 The CSTR with a concentration control loop and typical industrial stochastic faults

Figure 5.4 Simulation results of the gPC model, MC simulations and deterministic nonlinear model

Figure 5.5 Multi-level pseudo random sequence

Figure 5.7 Illustration of the effect of weights on the control performance

Figure 5.8 Illustration of maximum likelihood estimation based fault detection

Figure 6.1 Fluorescent photomicrograph of CHO cells stained with AO and EB

Figure 6.2 Visual interpretation of stochastic images

Figure 6.3 Stochastic segmentation algorithm

Figure 6.4 Sketch of the morphological feature along the boundary

Figure 6.5 Segmentation results and PDF of pixel intensities defining boundary

Figure 6.6 Visual illustration of pixels intensities in the background

Figure 6.7 Segmentation results with deterministic and stochastic level set algorithms

Figure 6.8 Histograms of curvature for apoptotic and normal cells

List of Tables

Table 2.1 Correspondence of Wiener-Askey polynomial and random input

Table 3.1 Comparison of acceptance rate for six sensor placement structures

Table 3.2 Summary of model calibration results (noise variance σ² = 0.1)

Table 3.3 Summary of model calibration results (noise variance σ² = 0.1)

Table 3.4 Summary of results for fault detection rate for two simultaneous faults (noise variance σ² = 0.1)

Table 3.5 Type I and Type II analysis for training set (gPC)

Table 3.6 Type I and Type II analysis for training set (MC)

Table 4.1 Parameter declaration for the Reactor-Separator process

Table 4.2 Sensitivity analysis of reactor 1

Table 4.3 Sensitivity analysis of reactor 2

Table 4.4 Sensitivity analysis of separator

Table 4.5 Model calibration result for the level-1 algorithm

Table 5.1 Parameter declaration and setting used for CSTR

Table 5.2 Comparison of the inner level optimization strategies (noise 1%)

Table 5.3 Summary of the results for the outer level optimization without tuning set point

Table 5.4 Summary of the results for the outer level optimization with tuning set point

Table 5.5 Summary of the FIR using transient measurements

Table 5.6 Summary of inner level optimization with Latin hypercube sampling

Table 6.1 Examples of feature vector (apoptosis)

Table 6.2 Examples of feature vector (normal)

Chapter 1 Introduction

1.1 Background

The quantitative analysis of phenomena occurring in many engineering applications is generally based on mathematical models. Such models provide a representation of a real system by using a number of hypotheses, approximations and parameters. Since models are never exact, the system of interest cannot be exactly characterized in practice. Model uncertainties may originate from: (i) a lack of knowledge about the underlying process; (ii) the intrinsic time varying nature of model parameters; and (iii) inaccurate measurements due to random noise. Thus, uncertainties are generally related both to errors in the assumed model structure and to inaccuracies in the estimated model parameters. Three main tasks are involved in the use of models with uncertainties: (a) the quantification of these uncertainties from data; (b) the propagation of the uncertainties through the mathematical model onto variables of interest; and (c) the characterization of the model outputs resulting from the propagation of the uncertainty.

Probabilistic analysis such as Monte Carlo (MC) simulation is the most popular method for propagating uncertainties and characterizing the outputs of uncertain models. In this approach, uncertainty is quantified by drawing a large number of samples and running the model with each of these samples. However, approaches such as MC simulation are computationally prohibitive, especially for complex systems. Moreover, the uncertainty propagation results may be questionable when the available information does not provide strong support for a particular probability assumption. To improve the computational efficiency and the accuracy of the uncertainty propagation step, the generalized polynomial chaos (gPC) expansion is used in this work, which leads to a significant reduction in computational time. Using a gPC approach, it was possible to treat in this thesis a variety of problems that would otherwise be computationally prohibitive when approached with MC methods.

Abnormal events defined as faults, such as sensor or actuator failures, frequently occur in chemical processes; they can affect process reliability and lead to economic losses. Different fault detection and diagnosis (FDD) approaches can be used to diagnose and isolate faults, prevent them from propagating, and improve the reliability and efficiency of the supervisory control. The main limiting factor for an efficient model-based FDD algorithm is model uncertainty. The step of quantifying the effect of uncertainty on the variables used for isolation or diagnosis is typically omitted, leading to a loss of performance of the FDD algorithm. Moreover, faults often occur intermittently, i.e., systems may switch between non-faulty and faulty operating conditions in a random fashion. Such intermittent occurrences are difficult to diagnose and further complicate the proper detection of faults. In terms of application, fault diagnosis that explicitly considers dynamic transients has not been extensively addressed in the literature. FDD algorithms that are based on steady state analysis may result in a high false alarm rate or in missed detection of faults when they use data collected during dynamic transients. In the current work, the gPC method is combined either with Maximum Likelihood or with Bayesian Inference to recursively estimate faults of a stochastic nature, while taking the uncertainty and dynamic transients into account.

In practice, most of the available FDD systems are implemented at a supervisory hierarchical level above the closed-loop control system and use measurements that are also used for feedback control. While there is a large body of methods for FDD, the problem of integrating process control and fault diagnosis algorithms has received less attention, in particular in the presence of stochastic faults. The key challenge for such integration is that these two activities have competing objectives. For example, if the measured quantities are perfectly controlled, they will not exhibit the variability required for the detection of faults. Thus, there is a trade-off between the closed loop control performance and the fault detectability. The optimal trade-off between these two activities is addressed in the present project by a bi-level optimization problem that accounts for the uncertainty and dynamic transients.

Automated cell detection and characterization is important in many problems such as cancer research, stem cell research and wound healing. Studying in vitro cellular behavior via living-cell imaging and high throughput screening involves a great amount of imaging data. Accurate and fast quantitative analysis of these images is useful for the evaluation of experimental outcomes and cell culture protocols. However, these images usually vary in quality, and the manual quantification and analysis of these data is time consuming and prone to errors. Motivated by this, the current work proposes new image processing tools to segment cells from the background in a computationally efficient way. The main idea behind automated image segmentation is to detect the boundary of cells and separate the cells from the background. However, any measurement error due to noise or uncertainty in the pixel intensities may result in significant variations in the results of segmentation. To address this problem, a stochastic image segmentation algorithm is developed to account for the uncertainty in a given image.

1.2 Objectives

In this project, the following objectives were investigated:

i- The development of a new fault detection and diagnosis (FDD) algorithm to identify and diagnose stochastic intermittent faults and to evaluate the detectability of faults with statistical analysis methods.

ii- The development of recursive FDD algorithms to improve accuracy of fault diagnosis accounting for dynamic transients and uncertainties.

iii- The investigation of the trade-off between fault detectability and closed loop control performance.

iv- The development of efficient algorithms to distinguish apoptotic versus normal cells using identified morphological features of cells in combination with machine learning techniques.

1.3 Contributions

To summarize, the contributions of this work are (i) the use of generalized polynomial chaos (gPC) expansions for efficient uncertainty quantification and propagation, and (ii) their application to a wide array of engineering problems including fault detection and diagnosis (FDD), integration of FDD and feedback control, and efficient image segmentation. The contributions in each chapter of this work can be summarized as:

i- Chapter 2 provides an up-to-date literature review that covers the main aspects of this work, i.e., gPC based uncertainty propagation, FDD, integration of fault detection and control, as well as image segmentation.

ii- Chapter 3 presents a computationally efficient FDD algorithm and its application to a two-dimensional heat conduction problem. The proposed method is specifically targeted to detect the average of input faults consisting of stochastic perturbations around mean values that change intermittently. The detectability of faults is assessed by calculating Type I and Type II errors. This method is shown to be significantly better in terms of computational efficiency and accuracy as compared to Monte Carlo simulations.

iii- Chapter 4 develops FDD algorithms to identify faults of a stochastic nature with dynamic transients by combining gPC approximation with nonlinear models of the process and by using either Maximum Likelihood or Bayesian Inference based estimators. Optimal selection of sensors is addressed based on sensitivity analysis of the gPC model. This method is shown to be more computationally efficient than an equivalent Particle Filter (PF) and less sensitive to the user-selected tuning parameters than the PF.

iv- Chapter 5 investigates the problem of the optimal simultaneous tuning of an FDD algorithm and a controller in the presence of stochastic time varying faults. This method is successful in achieving a trade-off between fault detectability and closed loop control performance, and is advantageous in terms of computational efficiency as well as fast fault detection.

v- Chapter 6 presents an efficient gPC model based image segmentation algorithm for fast segmentation of fluorescence microscopy images of Chinese Hamster Ovary (CHO) cells. An automated support vector machine (SVM) classifier is formulated to distinguish apoptotic versus normal cells based on morphological features identified with the segmentation algorithm. The combination of the developed morphological feature extraction method and the trained SVM classifier is shown to be more efficient in terms of differentiation accuracy.

vi- Chapter 7 concludes with detailed recommendations for future work on the following topics: (i) arbitrary uncertainty quantification and propagation; (ii) integration of plant design, control and fault diagnosis; (iii) image segmentation and classification.

Most of the findings in the current work have been presented in refereed journal papers and conference proceedings, as listed below:

Refereed Publications

1. Y. Du, T. A. Duever, H. Budman, “Fault detection and diagnosis with parametric uncertainty using generalized polynomial chaos”, Computers and Chemical Engineering, vol. 76, pp. 63-75, 2015.

2. Y. Du, H. Budman, T. A. Duever, “Integration of fault diagnosis and control based on a trade-off between fault detectability and closed-loop performance”, Journal of Process Control, vol. 38, pp. 42-53, 2016.

3. Y. Du, T. A. Duever, H. Budman, “Generalized polynomial chaos based fault detection and classification for nonlinear dynamic processes”, Industrial & Engineering Chemistry Research, in press.

4. Y. Du, H. Budman, T. A. Duever, “Classification of normal and apoptotic cells from fluorescence microscopy images using generalized polynomial chaos and level set functions”, Microscopy and Microanalysis, 2nd revision.

5. Y. Du, H. Budman, T. A. Duever, “Parameter estimation for an inverse nonlinear stochastic problem: reactivity ratio studies in copolymerization”, Computers and Chemical Engineering, submitted.

6. Y. Du, T. A. Duever, H. Budman, “Comparison of stochastic fault detection and diagnosis algorithms for nonlinear chemical processes”, Chemometrics and Intelligent Laboratory Systems, ready to submit.

7. Y. Du, H. Budman, T. A. Duever, “Segmentation and quantitative analysis of normal and apoptotic cells from fluorescence microscopy images”, the 11th International Federation of Automatic Control (IFAC) Symposium on Dynamics and Control of Process Systems, including Biosystems (DYCOPS-CAB), June 6-8, 2016, Trondheim, Norway.

8. Y. Du, T. A. Duever, H. Budman, “Stochastic fault diagnosis using generalized polynomial chaos and maximum likelihood”, the International Symposium on Advanced Control of Chemical Processes (ADCHEM), June 7-10, 2015, Whistler, British Columbia, Canada.

9. Y. Du, T. A. Duever, H. Budman, “Integration of fault diagnosis and control by finding a trade-off between observability of stochastic faults and economics”, the 19th World Congress of the International Federation of Automatic Control (IFAC), August 24-29, 2014, Cape Town, South Africa.

Chapter 2 Theoretical Background and Literature Review

Fault diagnosis in chemical processes and classification of cell states in bioengineering are two typical examples of classification problems in engineering. For fault diagnosis, classification methods are used to predict whether the process is operated at a faulty or a non-faulty operating condition. In the context of classification of cell states, the goal is to assess the in vitro status of cells, e.g., healthy cells versus cells undergoing programmed cell death (apoptosis).

This chapter provides a brief literature review on the fault detection and diagnosis (FDD), and on cell imaging techniques. Section 2.1 discusses the general uncertainty quantification and propagation method used in this work. This is followed by reviews on fault detection and diagnosis methods, and on the interaction between process control and fault detection. Understanding this interaction is essential for achieving an optimal trade-off of fault detection and control, since in industrial practice both algorithms are operated simultaneously. The review on segmentation of images is given in Section 2.3 followed by a summary of the literature review in Section 2.4.

2.1 Spectral Representation of Stochastic Process

There has been a good amount of research on the numerical solution of large scale engineering problems in the presence of uncertainty (Stefanou, 2009). Such uncertainties may originate from intrinsic time varying phenomena or may result from the use of noisy data for model calibration. Uncertain model parameters can then be used to describe the model uncertainty. Different techniques have been proposed to take the uncertainties into account from the very beginning of the problem definition and analysis (Xiu & Karniadakis, 2003). Uncertainties may be associated with uncertain boundary or initial conditions and/or geometric discrepancies between model and process. A common approach to describe uncertainty is to assume that the uncertain parameters are stochastic quantities. However, the treatment of these uncertainties as stochastic with a specific probability distribution is not simple, due to the lack of relevant experimental data to calibrate this distribution. Stochastic processes can be roughly categorized into two main groups based on their probability distribution, i.e., Gaussian and non-Gaussian. The simulations of Gaussian and non-Gaussian stochastic processes are different, and a review of available methods for both representations is presented in the following two subsections.

2.1.1 Quantification of Uncertainty

Although most of the uncertainties in engineering problems may be represented as non-Gaussian, the Gaussian assumption is usually made to keep the analysis simple (Spanos & Zeldin, 1998). Currently available methods for the simulation of Gaussian processes are divided into two categories, i.e., the spectral representation method (Shinozuka & Deodatis, 1996) and the Karhunen-Loeve (K-L) expansion (Ghanem & Spanos, 1991).

Both approaches are based on the representation of a stochastic process 𝑓(𝑥) as a summation of particular predefined functions with respect to specific random variables as follows:

$$ f(x) = \sum_{n=0}^{N} C_n \phi_n(x) \tag{2.1} $$

The spectral representation approach is based on expanding f(x) as a sum of trigonometric functions with random phase angles ($\phi_n(x)$ in Eq. 2.1) and amplitudes ($C_n$ in Eq. 2.1). The simplest version of this type of representation, which is widely adopted in most applications, is given as a function of one random phase angle. The coefficients of the description given in Eq. 2.1 are deterministic and depend on the prescribed power spectrum of the stochastic field (Stefanou, 2009). Spectral representation algorithms have been employed for various kinds of Gaussian stochastic processes, such as multivariate, multidimensional, and non-homogeneous problems (Liang, Chaudhuri, & Shinozuka, 2007; Spanos, Tezcan, & Tratskas, 2005), and have been successfully implemented in the framework of Monte Carlo (MC) simulations for solving problems with the stochastic finite element method (Lagaros & Papadopoulos, 2006).
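The idea of deterministic amplitudes combined with random phase angles can be illustrated with a minimal sketch. The Gaussian-shaped two-sided power spectrum, the midpoint frequency grid, the cut-off frequency, and all function names below are assumptions chosen for illustration; they are not taken from the cited references.

```python
import numpy as np

def spectral_amplitudes(sigma=1.0, b=1.0, n_terms=256, omega_max=8.0):
    """Deterministic amplitudes C_n for an assumed Gaussian-shaped two-sided spectrum."""
    d_omega = omega_max / n_terms
    omega = (np.arange(n_terms) + 0.5) * d_omega            # midpoint frequencies
    spectrum = sigma**2 * b / (2.0 * np.sqrt(np.pi)) * np.exp(-(b * omega / 2.0) ** 2)
    amplitudes = np.sqrt(2.0 * spectrum * d_omega)           # C_n = sqrt(2 S(omega_n) d_omega)
    return omega, amplitudes

def sample_path(x, omega, amplitudes, rng):
    """One realization: sqrt(2) * sum_n C_n * cos(omega_n * x + phase_n)."""
    phases = rng.uniform(0.0, 2.0 * np.pi, size=omega.size)  # independent random phase angles
    return np.sqrt(2.0) * np.cos(np.outer(x, omega) + phases) @ amplitudes

rng = np.random.default_rng(0)
omega, amplitudes = spectral_amplitudes()
x = np.linspace(0.0, 10.0, 201)
path = sample_path(x, omega, amplitudes, rng)

# The variance implied by the amplitudes should recover the prescribed sigma**2.
variance_from_amplitudes = float(np.sum(amplitudes**2))
```

Because the amplitudes are fixed by the spectrum, all randomness enters through the phases, so each call to `sample_path` with fresh phases yields a new realization with the same second-order statistics.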

The K-L expansion is a special case of an orthogonal series expansion, in which the orthogonal functions are chosen as the eigenfunctions of a Fredholm integral equation. In a K-L expansion, the first term in Eq. 2.1 (n = 0) is the expectation of the random variable, and it is identical to 0 in most applications. In addition, $\phi_n(x)$ is defined as the multiplication of eigenvalues by their corresponding eigenfunctions of a set of uncorrelated random variables, where the eigenvalues and eigenfunctions are calculated from the covariance function. This expansion is particularly suitable for the representation of strongly correlated random variables, where only a few terms in Eq. 2.1 suffice to capture the majority of the information contained in the data used for calibration (Stefanou, 2009). However, the K-L expansion has drawbacks that limit its application (Xiu, 2010). The first challenge is solving the Fredholm integral equation, since an analytical solution for this kind of integral equation is only available for simple geometries and special forms of the autocovariance function. Furthermore, the covariance function of the stochastic system is generally unknown, and the accuracy of the K-L expansion depends strongly on the eigenvalues and eigenfunctions computed from the autocovariance function (Phoon, Huang, & Quek, 2002; Schwab & Todor, 2006). In order to overcome these shortcomings, the polynomial chaos expansion (PCE) and the generalized polynomial chaos (gPC) expansion were proposed.
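When no analytical solution of the Fredholm equation is available, the eigenpairs are usually approximated numerically. The sketch below discretizes the problem on a uniform grid, assuming an exponential covariance with unit correlation length on [0, 1]; the covariance model, the grid size, and the function name `discrete_kl` are illustrative assumptions only.

```python
import numpy as np

def discrete_kl(grid, corr_length=1.0, sigma=1.0):
    """Eigenpairs of an assumed exponential covariance, sorted by decreasing eigenvalue."""
    dx = grid[1] - grid[0]
    cov = sigma**2 * np.exp(-np.abs(grid[:, None] - grid[None, :]) / corr_length)
    # Uniform-grid discretization of the Fredholm eigenproblem: (cov * dx) v = lambda v
    eigvals, eigvecs = np.linalg.eigh(cov * dx)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order] / np.sqrt(dx)

grid = np.linspace(0.0, 1.0, 200)
eigvals, eigfuns = discrete_kl(grid)

# Fraction of the total variance captured by the leading modes
captured = np.cumsum(eigvals) / np.sum(eigvals)

# One zero-mean realization built from the first five modes only
rng = np.random.default_rng(1)
n_modes = 5
xi = rng.standard_normal(n_modes)                  # uncorrelated standard Gaussian variables
sample = eigfuns[:, :n_modes] @ (np.sqrt(eigvals[:n_modes]) * xi)
```

Inspecting `captured` shows how quickly the eigenvalues decay for a strongly correlated field, which is the property that makes a short truncation of Eq. 2.1 adequate.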

2.1.2 Generalized Polynomial Chaos Expansion

The problem of modeling non-Gaussian uncertainty has gained considerable attention since uncertain model components often exhibit non-Gaussian probabilistic characteristics. The polynomial chaos expansion (PCE) is an alternative method to generate sample functions of non-Gaussian, non-stationary stochastic processes that employs Hermite polynomials as orthogonal basis functions of random variables. However, the Hermite polynomials have difficulties in approximating probabilities for non-Gaussian uncertainties. Subsequently, the generalized polynomial chaos (gPC) method was proposed (Xiu & Karniadakis, 2002). Different kinds of orthogonal polynomials can be selected as basis functions, depending on the probability distribution function (PDF) of the random variables to be described by the expansion, so as to obtain optimal convergence and to maintain orthogonality.

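A random process X(θ), viewed as a function of a random event θ, is expressed as:

$$ X(\theta) = a_0 H_0 + \sum_{i_1=1}^{\infty} a_{i_1} H_1(\xi_{i_1}(\theta)) + \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{i_1} a_{i_1 i_2} H_2(\xi_{i_1}(\theta), \xi_{i_2}(\theta)) + \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} a_{i_1 i_2 i_3} H_3(\xi_{i_1}(\theta), \xi_{i_2}(\theta), \xi_{i_3}(\theta)) + \cdots \tag{2.2} $$

where $H_n(\xi_{i_1}, \ldots, \xi_{i_n})$ is the Hermite polynomial of order n in terms of the multidimensional independent standard Gaussian random variables $\boldsymbol{\xi} = (\xi_{i_1}, \ldots, \xi_{i_n})$ with zero mean and unit variance. This expression is the discrete version of the original Wiener polynomial chaos expansion, in which the continuous integrals are replaced by summations. The general equation of the Hermite polynomial is defined as:

$$ H_n(\xi_{i_1}, \ldots, \xi_{i_n}) = e^{\frac{1}{2} \boldsymbol{\xi}^T \boldsymbol{\xi}} (-1)^n \frac{\partial^n}{\partial \xi_{i_1} \cdots \partial \xi_{i_n}} e^{-\frac{1}{2} \boldsymbol{\xi}^T \boldsymbol{\xi}} \tag{2.3} $$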

For example, one dimensional Hermite polynomials are:

$$ I_0 = 1, \quad I_1 = \xi, \quad I_2 = \xi^2 - 1, \quad I_3 = \xi^3 - 3\xi, \quad \ldots \tag{2.4} $$
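The polynomials in Eq. 2.4 can be generated with the standard three-term recurrence for probabilists' Hermite polynomials, $He_{n+1}(\xi) = \xi He_n(\xi) - n He_{n-1}(\xi)$. The following sketch builds their power-series coefficients; the helper name `hermite_coefficients` is hypothetical and the recurrence is a textbook identity rather than something stated in this thesis.

```python
import numpy as np
from numpy.polynomial import hermite_e as He

def hermite_coefficients(order):
    """Power-series coefficients (lowest degree first) of He_0, ..., He_order."""
    polys = [np.array([1.0]), np.array([0.0, 1.0])]
    for n in range(1, order):
        shifted = np.concatenate(([0.0], polys[n]))   # coefficients of xi * He_n
        previous = np.pad(polys[n - 1], (0, 2))        # He_{n-1}, zero-padded to equal length
        polys.append(shifted - n * previous)
    return polys[: order + 1]

polys = hermite_coefficients(3)
for n, coeffs in enumerate(polys):
    print(f"He_{n}: {coeffs}")

# Cross-check the cubic against NumPy's own basis conversion.
reference_cubic = He.herme2poly([0, 0, 0, 1])
```

The arrays list coefficients from the constant term upward, so the entry for order 2 encodes $\xi^2 - 1$ and the entry for order 3 encodes $\xi^3 - 3\xi$, in agreement with Eq. 2.4.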

For notational convenience, Eq. 2.2 can be rewritten as follows:

$$ X(\theta) = \sum_{j=0}^{\infty} \hat{a}_j I_j(\boldsymbol{\xi}) \tag{2.5} $$

There is a one-to-one correspondence between the functions $H_n(\xi_{i_1}, \ldots, \xi_{i_n})$ and $I_j(\boldsymbol{\xi})$, as well as between the coefficients $a_{i_1 \cdots i_r}$ and $\hat{a}_j$. In Eq. 2.2, the summation is carried out according to ascending order of the Hermite polynomials.

The Hermite based chaos expansion sometimes converges very slowly or may diverge for non-Gaussian random inputs (Xiu, 2009). In order to deal with more general random inputs, basis functions other than Hermite polynomials can be used. These basis functions are selected as per the Wiener-Askey scheme (Xiu & Karniadakis, 2002), which is a generalization of the original Wiener Hermite-chaos expansion. Due to their ability to produce more compact representations, gPC expansions are considered in the current work. Analogous to the Hermite-chaos expansion in Eq. 2.2, a general gPC expansion of the random process 𝑿(𝜃) is defined as:

$$ X(\theta) = c_0 \psi_0 + \sum_{i_1=1}^{\infty} c_{i_1} \psi_1(\xi_{i_1}(\theta)) + \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{i_1} c_{i_1 i_2} \psi_2(\xi_{i_1}(\theta), \xi_{i_2}(\theta)) + \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} c_{i_1 i_2 i_3} \psi_3(\xi_{i_1}(\theta), \xi_{i_2}(\theta), \xi_{i_3}(\theta)) + \cdots \tag{2.6} $$

where $\psi_n(\xi_{i_1}, \ldots, \xi_{i_n})$ is the gPC from the Askey-chaos scheme, and n is the order of the multi-dimensional random variables $\boldsymbol{\xi} = (\xi_{i_1}, \ldots, \xi_{i_n})$. The polynomials in Eq. 2.6 are not restricted to Hermite polynomials and are selected according to the Askey scheme, depending on the PDF of the random variables to be used in a particular problem. For example, Jacobi polynomials can be used when the random variables have a Beta distribution. For notational convenience, Eq. 2.6 can also be expressed as:

$$ X(\theta) = \sum_{j=0}^{\infty} \hat{c}_j \phi_j(\boldsymbol{\xi}) \tag{2.7} $$

There is a one-to-one correspondence between the functions $\psi_n(\xi_{i_1}, \ldots, \xi_{i_n})$ and $\phi_j(\boldsymbol{\xi})$, as well as between their coefficients $c_{i_1 \cdots i_r}$ and $\hat{c}_j$. Since each polynomial family considered in the Askey scheme forms a complete basis in the Hilbert space determined by its corresponding support, each type of Askey-chaos converges to any $L_2$ functional in the $L_2$ sense in the corresponding Hilbert functional space. The basis polynomials are orthogonal, i.e.,

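$$ \langle \phi_i \phi_j \rangle = \langle \phi_i^2 \rangle \delta_{ij} \tag{2.8} $$

where $\delta_{ij}$ is the Kronecker delta and $\langle \cdot, \cdot \rangle$ denotes the inner product in the Hilbert space of the variables:

$$ \langle f(\xi) g(\xi) \rangle = \int f(\xi) g(\xi) W(\xi) \, d\xi \tag{2.9} $$

where $W(\xi)$ is the weighting function in Eq. 2.9, and is defined as:

$$ W(\xi) = \frac{1}{\sqrt{(2\pi)^n}} e^{-\frac{1}{2} \xi^T \xi} \tag{2.10} $$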

where 𝑛 is the dimension of the random variables 𝝃. The key difference between gPC and many other possible expansions is that the polynomials are orthogonal with respect to the weighting function 𝑊(𝜉). The correspondence between the type of Wiener-Askey polynomial chaos and the continuous random inputs is given in Table 2.1 (Xiu, 2009). It is worth mentioning that uniformly distributed random variables correspond to a special case of the Jacobi polynomials with parameters α = β = 0*, and this case is shown separately in Table 2.1. The support is defined as the set of points where the PDF associated with a particular polynomial is nonzero. Specifically, the support of the Beta and the Uniform distributions is defined by two parameters, 𝑎 and 𝑏, which are their minimum and maximum values.
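Orthogonality (Eq. 2.8) implies that each coefficient of the expansion in Eq. 2.7 can be obtained by projection, $\hat{c}_j = \langle X \phi_j \rangle / \langle \phi_j^2 \rangle$. As a minimal one-dimensional sketch, the code below projects the assumed test function $X = e^{\xi}$, with $\xi$ a standard Gaussian variable, onto Hermite polynomials using Gauss-Hermite quadrature. The test function, the quadrature size, and the name `hermite_gpc_coefficients` are illustrative choices and do not come from the thesis.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

def hermite_gpc_coefficients(func, order, n_quad=40):
    """Project func(xi), xi ~ N(0, 1), onto He_0..He_order by Gauss-Hermite quadrature."""
    nodes, weights = He.hermegauss(n_quad)
    weights = weights / np.sqrt(2.0 * np.pi)       # rescale so the weights sum to one (Gaussian PDF)
    basis = He.hermevander(nodes, order)           # column j holds He_j evaluated at the nodes
    norms = np.array([math.factorial(j) for j in range(order + 1)], dtype=float)  # <He_j^2> = j!
    return (basis.T @ (weights * func(nodes))) / norms

order = 6
coeffs = hermite_gpc_coefficients(np.exp, order)

# Statistics follow directly from the coefficients.
mean = coeffs[0]
variance = sum(math.factorial(j) * coeffs[j] ** 2 for j in range(1, order + 1))

# Known closed form for this particular test function: c_j = exp(1/2) / j!
closed_form = np.array([math.exp(0.5) / math.factorial(j) for j in range(order + 1)])
```

For this test function the closed-form coefficients allow a direct check of the projection, and the truncated variance approaches the exact value $e^2 - e$ as the order increases.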

* The weighting function of a uniform distribution in (-1, 1) is $W(\xi) = 1/2$, and the first few Legendre orthogonal polynomials are: $u_0(\xi) = 1$, $u_1(\xi) = \xi$, $u_2(\xi) = \frac{3}{2}\xi^2 - \frac{1}{2}$, …

The weighting function of a beta distribution in (-1, 1) is $W(\xi) = (1-\xi)^{\alpha}(1+\xi)^{\beta}$, $(\alpha, \beta > 0)$, and the first few Jacobi orthogonal polynomials are: $b_0(\xi) = 1$, $b_1(\xi) = \frac{1}{2}[\alpha - \beta + (\alpha + \beta + 2)\xi]$, …

Table 2.1 Correspondence of Wiener-Askey polynomial and random input

| Random Input | Polynomial | Support |
|---|---|---|
| Gaussian | Hermite-chaos | (−∞, ∞) |
| Gamma | Laguerre-chaos | [0, ∞) |
| Beta | Jacobi-chaos | [a, b] |
| Uniform | Legendre-chaos | [a, b] |
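The pairing in Table 2.1 can be checked numerically by computing the Gram matrix $\langle \phi_i \phi_j \rangle$ of each polynomial family under its matching weighting function. The sketch below does this for three rows of the table using NumPy's quadrature rules. It is restricted by assumption to the standard normal, the Gamma distribution with unit shape (weight $e^{-\xi}$), and the uniform distribution on (-1, 1); the Jacobi row is omitted because NumPy does not ship a Jacobi module. The dictionary and function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import hermite_e, laguerre, legendre

# (quadrature rule, Vandermonde builder, factor converting quadrature weights to a PDF)
FAMILIES = {
    "Gaussian -> Hermite": (hermite_e.hermegauss, hermite_e.hermevander, 1.0 / np.sqrt(2.0 * np.pi)),
    "Gamma (shape 1) -> Laguerre": (laguerre.laggauss, laguerre.lagvander, 1.0),
    "Uniform(-1, 1) -> Legendre": (legendre.leggauss, legendre.legvander, 0.5),
}

def gram_matrix(family, order=4):
    """Matrix of inner products <phi_i phi_j> for i, j = 0..order."""
    quad, vander, scale = FAMILIES[family]
    nodes, weights = quad(order + 1)               # exact for polynomial degree up to 2 * order + 1
    basis = vander(nodes, order)
    return basis.T @ (scale * weights[:, None] * basis)

grams = {name: gram_matrix(name) for name in FAMILIES}
for name, gram in grams.items():
    print(name, np.round(np.diag(gram), 6))
```

Each Gram matrix is diagonal, and the diagonal entries are the normalization constants $\langle \phi_j^2 \rangle$ needed when dividing by the norm in a projection.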

2.1.3 Uncertainty Propagation

The second part in the analysis of a stochastic system consists of propagating the effect of uncertainties in the model parameters onto the system outputs. The stochastic finite difference or element method is an extension of the corresponding classical deterministic approach and has been gaining attention in the past decades to solve stochastic problems (Ghanem & Spanos, 1991). This method basically proceeds as per the following three steps: (1) the representation of the random inputs by the spectral approach; (2) the propagation of uncertainties into the stochastic system equation (first at the element and then at the global system level); and (3) the response variability calculation with respect to the stochastic inputs/parameters.

In this work, a gPC approximation is used for the first step as per the discussion in the previous subsection. Then, for step 2, the gPC expansions are substituted into the governing equations and, subsequently, a Galerkin projection is applied to compute the coefficients of the gPC expansions using their orthogonality properties. The general procedure for Galerkin projection is presented below.

Suppose the general stochastic elliptic partial differential equations with random inputs are given as†:

\[ \nabla \cdot [\kappa(x; \omega) \nabla u(x; \omega)] = f(x; \omega) \quad \text{on } \mathcal{D} \times \Omega \]

\[ u(x; \omega) = g(x; \omega) \quad \text{on } \partial\mathcal{D} \times \Omega \tag{2.11} \]

where 𝒟 is the spatial domain and Ω is the probability space, and 𝑓, 𝑔 and κ are functions on 𝒟 × Ω. Here 𝑢 is the solution, 𝑓 is the source term, 𝑔 is the Dirichlet boundary condition, and κ is a model parameter. All of these quantities are functions of the uncertainty 𝜔, which may be introduced into the system via stochastic boundary conditions, initial conditions, material properties, etc.

† The application of the gPC approximation to ordinary differential equations follows similar procedures and will be further explained in

In order to solve for the solution 𝑢, which is a random variable, gPC expansions are employed to expand the variables as follows:

\[ \kappa(x; \omega) = \sum_{i=0}^{P} \kappa_i(x) \phi_i(\xi) \tag{2.12} \]

\[ u(x; \omega) = \sum_{i=0}^{P} u_i(x) \phi_i(\xi) \tag{2.13} \]

\[ f(x; \omega) = \sum_{i=0}^{P} f_i(x) \phi_i(\xi) \tag{2.14} \]

where the infinite summation of 𝝃 in Eq. 2.5 has been replaced by a truncated finite term summation of {𝝓} in the finite dimensions of 𝝃 = {𝜉1, ⋯ , 𝜉𝑛}. The dimensionality 𝑛 of 𝝃 is determined by the random inputs. According to the gPC expansion, the random parameter 𝜔 is embedded into the polynomial basis 𝜙(𝝃) while the coefficients in the above equations, i.e., $\kappa_i$, $u_i$, $f_i$, are deterministic.

The truncated finite summation parameter 𝑃 is determined by the dimensionality (𝑛) of random inputs and the highest order (𝑝) of the polynomials $\{\phi_i\}$, which satisfies:

\[ P + 1 = \frac{(n + p)!}{n! \, p!} \tag{2.15} \]

In order to achieve exponential convergence in the coefficients $u_i$, the optimum polynomial should be chosen from the Askey-chaos scheme (see Table 2.1) and the weighting function is calculated accordingly. By substituting the expansions into Eq. 2.11:

\[ \nabla \cdot \left[ \sum_{i=0}^{P} \kappa_i(x) \phi_i(\xi) \, \nabla \sum_{j=0}^{P} u_j(x) \phi_j(\xi) \right] = \sum_{i=0}^{P} f_i(x) \phi_i(\xi) \tag{2.16} \]

After some algebra:

\[ \sum_{i=0}^{P} \sum_{j=0}^{P} \left[ \kappa_i(x) \nabla^2 u_j(x) + \nabla \kappa_i(x) \cdot \nabla u_j(x) \right] \phi_i \phi_j = \sum_{i=0}^{P} f_i(x) \phi_i \tag{2.17} \]

The choice of 𝝃 and 𝜙(𝝃) defines the weighting function to be used. Using the concept of the inner product, a Galerkin projection of Eq. 2.17 onto each basis polynomial $\phi_k$ is then conducted. The projection ensures that the error is orthogonal to the functional space spanned by the finite dimensional basis $\{\phi_i\}$. Based on the orthogonality of $\{\phi_i\}$, the following expression can be obtained for each $k = 0, \dots, P$:

\[ \sum_{i=0}^{P} \sum_{j=0}^{P} \left[ \kappa_i(x) \nabla^2 u_j(x) + \nabla \kappa_i(x) \cdot \nabla u_j(x) \right] e_{ijk} = f_k(x) \langle \phi_k^2 \rangle \tag{2.18} \]

where $e_{ijk} = \langle \phi_i \phi_j \phi_k \rangle$. Based on the orthogonality of the basis functions, some of these products vanish, and the original stochastic partial differential equation is reduced to a system of coupled deterministic differential equations for the coefficients of the truncated gPC expansion. The central differencing method is used to solve the deterministic system. Once the coefficients of the expansion are obtained, it is possible to compute statistics for the solved output with the following formulae:

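\[ E(u) = E\left[ \sum_{i=0}^{P} u_i \phi_i \right] = u_0 E[\phi_0] + \sum_{i=1}^{P} u_i E[\phi_i] = u_0 \tag{2.19} \]

\[ Var(u) = E\left[ (u - E(u))^2 \right] = E\left[ \left( \sum_{i=0}^{P} u_i \phi_i - u_0 \right)^2 \right] = E\left[ \left( \sum_{i=1}^{P} u_i \phi_i \right)^2 \right] = \sum_{i=1}^{P} u_i^2 E(\phi_i^2) \tag{2.20} \]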

Also, the PDF of 𝑢 can be efficiently calculated by sampling from the distribution of ξ and substituting the corresponding sampled values into Eq. 2.13. It should be noted that Taylor approximations are needed to apply the Galerkin projection to nonlinear terms that are not of polynomial form. The polynomial chaos quadrature (PCQ) can be used to overcome this challenge when using a non-intrusive PCE method (Xiu D., 2009). In Appendix B, PCQ is used to replace the exact integration in Eq. 2.17 with respect to ξ and is applied to the estimation of reactivity ratios in copolymerization.
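The post-processing steps described above can be sketched for a one-dimensional Legendre-chaos expansion. This is a minimal illustration and not the implementation used in this work: the coefficient values are hypothetical (they correspond to 𝑢 = 1 + 2ξ + 3ξ² with ξ uniform on (-1, 1)), and the function names are assumptions. The sketch counts the expansion terms from Eq. 2.15, evaluates the mean and variance from Eqs. 2.19 and 2.20, and compares them against samples obtained by substituting sampled values of ξ into Eq. 2.13.

```python
import math
import random


def num_gpc_terms(n, p):
    """Number of expansion terms P + 1 = (n + p)! / (n! p!), Eq. 2.15."""
    return math.comb(n + p, p)


def legendre_values(x, order):
    """Values of the Legendre polynomials of order 0..order at the point x."""
    vals = [1.0, x]
    for k in range(1, order):
        vals.append(((2 * k + 1) * x * vals[k] - k * vals[k - 1]) / (k + 1))
    return vals[:order + 1]


def gpc_mean_var(coeffs):
    """Mean (Eq. 2.19) and variance (Eq. 2.20) from Legendre-chaos coefficients."""
    # For a uniform input on (-1, 1) with W = 1/2, E[phi_i^2] = 1 / (2i + 1).
    norms = [1.0 / (2 * i + 1) for i in range(len(coeffs))]
    mean = coeffs[0]
    var = sum(c * c * nrm for c, nrm in zip(coeffs[1:], norms[1:]))
    return mean, var


def sample_output(coeffs, n_samples, seed=0):
    """Sample u by drawing xi ~ U(-1, 1) and evaluating the expansion (Eq. 2.13)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        xi = rng.uniform(-1.0, 1.0)
        phi = legendre_values(xi, len(coeffs) - 1)
        samples.append(sum(c * f for c, f in zip(coeffs, phi)))
    return samples


# Hypothetical coefficients: u = 2*P0 + 2*P1 + 2*P2 = 1 + 2*xi + 3*xi^2.
coeffs = [2.0, 2.0, 2.0]
mean, var = gpc_mean_var(coeffs)

samples = sample_output(coeffs, 20000)
mc_mean = sum(samples) / len(samples)
mc_var = sum((s - mc_mean) ** 2 for s in samples) / (len(samples) - 1)
```

The mean and variance follow from the coefficients alone, whereas the sampling step is only needed when the full PDF of 𝑢 is required. Since each sample is a polynomial evaluation rather than a model solution, this sampling is inexpensive compared with MC simulation of the original model.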

2.2 Fault Detection and Diagnosis

Distributed control systems have brought great benefits to modern engineering systems, such as those in the chemical and petrochemical industries. However, abnormal events usually occur in practice, affecting their performance and resulting in economic losses (Isermann, 2005). To detect faults and improve the reliability and efficiency of supervision, fault detection and diagnosis (FDD) has become an essential activity.

FDD activities involve the timely detection of abnormal events, correct diagnosis of their causal origins, efficient isolation of a fault and appropriate actions to bring the process back to its normal operating state. Generally, FDD methods can be categorized into three classes: model based analytical methods, data driven based empirical methods and hybrid approaches (Frank, 1990). All of the available methodologies involve a series of steps: (1) information transformation; (2) symptoms extraction; (3) symptoms classification; and (4) cause-effect mapping according to the obtainable measurements or constructible reference indicator (signal) (Venkatasubramanian, Rengaswamy, & Yin, 2003). A general schematic depiction of FDD is given in Figure 2.1 (Gertler, 1998).

[Figure 2.1: schematic of FDD. Stages: information transformation, symptoms extraction, symptoms classification, cause-effect mapping. Spaces: measurement space, feature space, decision space, class space.]

2.2.1 Model based Analytical Methods

Different mathematical models have been proposed for use in the framework of FDD. A straightforward approach to detect a potential fault in a process is to compare the process behavior with a mathematical model describing the nominal process performance, i.e., without the faults. The inconsistencies between the measurements and the ideal model predictions are employed as an indicator to describe the discrepancies between the actual behavior and the normal operation state predicted by the model (Isermann, 2005). When a fault occurs, a nonzero indicator should be obtained to reveal the relation between the observed variables and the model based predictions.

The advantage of model based FDD methods is that the effects of faults and other inputs, such as disturbances and noise, can be mathematically modeled as either additive or multiplicative contributions according to the physical understanding of the process (Frank, 1990; Isermann, 2005). Therefore, the discrepancy between the nominal model and the true system can be clearly expressed mathematically, and the fault can then be classified more easily. According to the types of measured input signals and output signals, there are three kinds of model based FDD methods: parameter estimation, state/output observers and parity space based approaches (Frank, 1990).

The parameter estimation method is based on the premise that the fault in the process can change a model parameter significantly. Thus changes in model parameters, as obtained from regression of the model with data, can be used to infer faults (Isermann, 2005). The presence of the fault can be inferred from the discrepancies between the nominal model parameter values and the estimated parameter where the nominal model parameters are associated with normal (fault free) operating conditions. Computing the differences (Eq. 2.21) between the nominal values and the estimated parameters is a straightforward way to identify the occurrence of a fault:

\[ \Delta p = p - \hat{p} \tag{2.21} \]

where 𝑝 and 𝑝̂ are the nominal value and the estimate of the physical parameter, respectively. Normally, due to disturbances and noise as well as modeling uncertainty, the difference ∆𝑝 is not identically 0 even if there is no fault. Therefore, a threshold must be set to indicate whether a fault has occurred or not. If the value of the indicator ∆𝑝 is greater than the threshold, a fault is identified.
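The parameter estimation test can be sketched as follows. This is a minimal illustration under stated assumptions: the static model 𝑦 = 𝑝·𝑢, the data, the threshold value and the function names are all hypothetical, and the magnitude of ∆𝑝 is compared with the threshold so that deviations in either direction are flagged.

```python
def estimate_gain(u, y):
    """Least-squares estimate of p in the assumed static model y = p * u."""
    return sum(ui * yi for ui, yi in zip(u, y)) / sum(ui * ui for ui in u)


def fault_flag(p_nominal, p_hat, threshold):
    """Return (fault detected?, delta_p), with delta_p = p - p_hat as in Eq. 2.21."""
    delta_p = p_nominal - p_hat
    return abs(delta_p) > threshold, delta_p


# Hypothetical data: nominal gain 2.0, small measurement noise on the output.
u = [1.0, 2.0, 3.0, 4.0]
noise = [0.02, -0.01, 0.01, -0.02]
y_normal = [2.0 * ui + ni for ui, ni in zip(u, noise)]
y_faulty = [1.5 * ui + ni for ui, ni in zip(u, noise)]  # gain drops to 1.5

threshold = 0.1  # assumed value; in practice set from the noise level
normal_flag, normal_delta = fault_flag(2.0, estimate_gain(u, y_normal), threshold)
faulty_flag, faulty_delta = fault_flag(2.0, estimate_gain(u, y_faulty), threshold)
```

In the fault free case ∆𝑝 is small but nonzero because of the noise, which is why a threshold is needed instead of a test for ∆𝑝 = 0.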

An alternative method is to use either state observers or output observers. This kind of methodology is referred to as the observer based method (Isermann, 2005; Venkatasubramanian, Rengaswamy, & Yin, 2003). A state observer can be applied if the faults can be modeled as a state variable, and the output observer is used if the state observer is not feasible, e.g. because of lack of observability. The observer based method is especially appropriate if the fault occurs in sensors and actuators because the latter are not part of the state space model used for state estimation. Similar to the parameter estimation approach, a relatively precise mathematical model for the plant is required. An indicator is also necessary, which is defined as the residual between the estimated state and the measured state, or between the nominal output and the measured output of the process when the state observer is not available. Although linear observers have generally been used, nonlinear state estimators have also been reported. For nonlinear systems, the extended Kalman filter (EKF) has also been used (Chetouania, Mouhaba, Cosmaoa, & Estela, 2002). However, the EKF can result in a suboptimal solution, since it is based on linearization of the nonlinear equations at each time interval. A class of estimators that do not require explicit linearization, involving particle filtering, has been investigated recently (Rawlings & Bakshi, 2006). However, this kind of approach belongs to the Markov Chain Monte Carlo based methodology, and its computational cost is very large.

In addition to the employment of observers for identifying potential faults, another promising approach is fault identification by input-output models (Isermann, 2005). Parity space based residual analysis belongs to this group. This method is based on comparing predictions from a fixed model $G_m$ to the measured outputs from the process $G_p$, thereby forming a residual vector with respect to the selected input 𝑢 and output 𝑦:

\[ r(s) = G_{My}(s) y(s) - G_{Mu}(s) u(s) \tag{2.22} \]

where 𝑟(𝑠) is the residual vector and $G_{My}$ and $G_{Mu}$ are transfer functions. Ideally, for a system free of model structure error and noise, the residual is 0 in the absence of faults. If the fault, model structure error and noise can be mathematically modeled, the parity space based method is capable of decoupling the fault from model structure error and noise. Therefore, the parity space based method exhibits certain robustness with respect to model structure error and noise.
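A discrete-time analogue of the residual in Eq. 2.22 can be sketched as follows. This is an illustration under stated assumptions and not a method from this work: the process is taken to be a noise free first-order model 𝑦[k+1] = 𝑎·𝑦[k] + 𝑏·𝑢[k] that is known exactly, the fault is an additive sensor bias, and all names and numerical values are hypothetical.

```python
def simulate(a, b, u, y0):
    """Simulate y[k+1] = a*y[k] + b*u[k]; returns len(u) + 1 output values."""
    y = [y0]
    for uk in u:
        y.append(a * y[-1] + b * uk)
    return y


def parity_residuals(y_meas, u, a_model, b_model):
    """Residual r[k] = y[k+1] - a*y[k] - b*u[k]: an operator on the measured
    output minus an operator on the input, mirroring the form of Eq. 2.22."""
    return [y_meas[k + 1] - a_model * y_meas[k] - b_model * u[k]
            for k in range(len(u))]


a, b = 0.8, 0.5
u = [1.0] * 30
y_true = simulate(a, b, u, 0.0)

# Hypothetical fault: constant sensor bias of 1.0 from sample 20 onwards.
fault_start, bias = 20, 1.0
y_meas = [yk + (bias if k >= fault_start else 0.0)
          for k, yk in enumerate(y_true)]

r = parity_residuals(y_meas, u, a, b)
```

With an exact model and no noise, the residual is zero before the fault, equals the bias at the sample where the fault appears, and settles at bias·(1 − 𝑎) afterwards. With model structure error or noise the residual would be nonzero even without a fault, which is the situation the decoupling discussed above is meant to address.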

2.2.2 Data Driven based Empirical Methods

Empirical methods are mainly based on univariate and multivariate statistical algorithms to identify the occurrence of a fault (Negiz & Cinar, 1997). They are useful in real process operations since accurate mechanistic (first-principles based) models are difficult to obtain due to lack of knowledge about the process. Considering that the systems are influenced by random inputs (disturbances or noise), it is reasonable to represent the measurements as statistical time series that can be analyzed in a probabilistic framework (Venkatasubramanian & Kavuri, 2003). When the process is fault free, the observations can be represented by a probability distribution that is assigned to normal operation. If the process operates under a faulty condition, the underlying distribution will deviate from the distribution associated with normal operation, thus revealing that the process is out of control. Accordingly, the fault is identified by detecting changes in the probability distribution of the collected data.

For the data driven method, measurements are sampled sequentially and decisions are made based on the observations up to the current time. The easiest way to make a decision regarding the occurrence or absence of a fault is to compare the values of the observations with predefined control limits. If the value is beyond the limits (or ranges) this can be interpreted as the occurrence of a fault. Obviously, an effective algorithm should be sensitive to the faults and robust to the random noise and model structure error. However, the sensitivity to process noise usually increases along with the sensitivity to actual input changes, which means that often false alarm rates tend to increase while detection ability increases.


The Shewhart control chart and the cumulative sums chart were the earliest algorithms proposed for online monitoring and fault detection. They are based on the assumption that a process subject to its natural variability will remain in a relatively steady state of statistical control where certain process and monitored variables remain close to the desired values. Therefore, abnormal events or faults can be identified as soon as they occur by monitoring deviations from the steady state of statistical control. On the other hand, since most of the chemical and petrochemical processes are characterized by strong interaction, the monitored variables are generally not independent, which limits the effectiveness of univariate control charts. Instead, multivariate statistical techniques have been proposed as a way of providing a better solution (MacGregor & Kourti, 1995).
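The two univariate charts can be sketched as follows. This is a minimal illustration, assuming a single monitored variable, control limits of ±3σ for the Shewhart chart, and a tabular CUSUM with reference value 0.5σ and decision interval 4σ; these settings, the data and the function names are assumptions chosen for the example.

```python
import statistics


def shewhart_alarms(x, mu, sigma, n_sigma=3.0):
    """Indices of observations outside the limits mu +/- n_sigma * sigma."""
    return [k for k, xk in enumerate(x) if abs(xk - mu) > n_sigma * sigma]


def cusum_alarms(x, mu, sigma, k_factor=0.5, h_factor=4.0):
    """Indices where the upper or lower cumulative sum exceeds the limit h."""
    k_ref, h = k_factor * sigma, h_factor * sigma
    s_hi = s_lo = 0.0
    alarms = []
    for idx, xk in enumerate(x):
        s_hi = max(0.0, s_hi + (xk - mu) - k_ref)  # accumulates upward drift
        s_lo = max(0.0, s_lo + (mu - xk) - k_ref)  # accumulates downward drift
        if s_hi > h or s_lo > h:
            alarms.append(idx)
    return alarms


# Hypothetical reference data from normal operation.
reference = [10.1, 9.9, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0]
mu = statistics.mean(reference)
sigma = statistics.stdev(reference)

# Hypothetical monitored data with a small sustained upward shift.
monitored = [10.0, 10.1, 9.9, 10.2, 10.3, 10.2, 10.3, 10.2, 10.3, 10.2]

shewhart = shewhart_alarms(monitored, mu, sigma)
cusum = cusum_alarms(monitored, mu, sigma)
```

For this data set no single observation leaves the Shewhart limits, while the cumulative sum crosses its limit after several consecutive small deviations. This illustrates why the cumulative sums chart is preferred for small sustained shifts.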

Most of the available multivariate analysis based algorithms are based on the idea of principal component analysis (PCA). PCA not only transforms a number of related process variables into a smaller set of uncorrelated variables, but it can also be used for control-detection in the presence of interactions among variables. Similar to PCA, partial least squares (PLS) is conceptually another dimension reduction method, which is employed to reduce the dimensions of both the process variables and the product quality variables to make the analysis simpler. Different versions of PCA/PLS algorithms have been reported in the literature (Venkatasubramanian & Kavuri, 2003).

PCA is based on an orthogonal decomposition of the covariance matrix of the underlying process variables along the directions that explain the maximum variability in the obtained data. Therefore, the advantage of using PCA is its ability to represent the original variables in a relatively lower dimension where the information can be properly explained and the major trends in the original data set can be identified. A major limitation of PCA based monitoring methods reported in the literature is that time invariant PCA models have been used whereas most practical processes are time varying. To address this, some studies developed algorithms to update the PCA model recursively. A general scheme for recursive PCA should update the mean, the covariance, the principal components (including the number of components to be retained), and the confidence limits for the 𝑇2 (scaled squared scores) and 𝑄 (residual) statistics. An algorithm involving recursive PCA (Li, Yue, Valle-Cervantes, & Qin, 2000) has been used for adaptive monitoring of a rapid thermal annealing process. A similar recursive PLS algorithm was employed to monitor a complex industrial process (Wang, Kruger, & Lennox, 2003).
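The 𝑇2 and 𝑄 statistics can be illustrated with a deliberately small case. The sketch below assumes two correlated process variables and one retained principal component, so that the eigendecomposition of the 2×2 covariance matrix is available in closed form. The data and function names are hypothetical, and the confidence limits for the two statistics are not computed.

```python
import math


def fit_pca_2d(data):
    """Mean, largest eigenvalue and first principal direction for 2-D data."""
    n = len(data)
    m1 = sum(p[0] for p in data) / n
    m2 = sum(p[1] for p in data) / n
    a = sum((p[0] - m1) ** 2 for p in data) / (n - 1)           # var(x1)
    c = sum((p[1] - m2) ** 2 for p in data) / (n - 1)           # var(x2)
    b = sum((p[0] - m1) * (p[1] - m2) for p in data) / (n - 1)  # cov(x1, x2)
    lam1 = 0.5 * (a + c) + math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    if b == 0.0:
        v = (1.0, 0.0) if a >= c else (0.0, 1.0)
    else:
        norm = math.hypot(b, lam1 - a)
        v = (b / norm, (lam1 - a) / norm)  # solves (Cov - lam1 * I) v = 0
    return (m1, m2), lam1, v


def t2_and_q(sample, mean, lam1, v):
    """T2 (scaled squared score) and Q (squared residual) for one sample."""
    d1, d2 = sample[0] - mean[0], sample[1] - mean[1]
    t = d1 * v[0] + d2 * v[1]              # score on the retained component
    e1, e2 = d1 - t * v[0], d2 - t * v[1]  # part not explained by the model
    return t * t / lam1, e1 * e1 + e2 * e2


# Hypothetical training data from normal operation: x2 closely follows x1.
training = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1), (6.0, 5.9)]
mean, lam1, v = fit_pca_2d(training)

t2_normal, q_normal = t2_and_q((4.5, 4.4), mean, lam1, v)  # respects correlation
t2_fault, q_fault = t2_and_q((4.5, 2.5), mean, lam1, v)    # breaks correlation
```

The second sample lies within the usual range of each variable taken separately, so its 𝑇2 value is small, but its 𝑄 value is large because the correlation structure captured by the PCA model is violated. This is the kind of fault that univariate charts tend to miss.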

Another variant of the PCA method is the multi-resolution or multi-scale PCA. In this approach, wavelet analysis is combined with PCA to deal with both cross-correlated and auto-correlated variables (Bakshi, 1998) as well as with robustness problems (Chen, Bandoni, & Romagnoli, 1996; Wang & Romagnoli, 2005). The combination of PCA and wavelet analysis can provide multi-resolution and multi-scale capabilities for fault detection. In particular, it can reveal frequency information about the fault.

To handle the nonlinear behavior that is typical in most chemical processes, different algorithms have been developed. A neural network based PCA model was proposed where an internal layer referred to as the bottleneck was used to reduce the model dimension (Kramer, 1991). A multi-scale nonlinear PCA was proposed using wavelet analysis (Maulud, Wang, & Romagnoli, 2006). Alternatively, a kernel PCA method has also been proposed as a relatively simple alternative to neural network based approaches since it only requires the straightforward solution of an eigenvalue problem (Lee, Yoo, Choi, Vanrolleghem, & Lee, 2004).

Compared with FDD schemes that are based on mechanistic models, multivariate statistical methods do not require an explicit mechanistic model and can handle high dimensional and correlated processes. However, they fail in predicting faults for data that are significantly different from those used for model calibration. Thus, hybrid methods that combine mechanistic models and multivariate statistical models were proposed to overcome this shortcoming (Gertler & Cao, 2004; Mylaraswamy & Venkatasubramanian, 1997).

2.2.3 Hybrid Algorithms

To determine the effectiveness of the available fault detection algorithm, four issues have to be addressed: (1) whether the fault is observable; (2) whether the fault can be distinguished from another unknown fault; (3) whether the fault can be detected in the presence of process and measurement noise; and (4) whether the fault can be distinguished from other known faults. All these questions are related to the observability of a fault from the available measurements or mathematical model. Since no single method is accurate enough to deal with all the requirements for a fault diagnostic system, hybrid approaches that combine mechanistic models and data driven empirical models become more attractive (Gertler & Cao, 2004). A successful implementation of such a hybrid framework has been conducted for the Amoco model IV fluid catalytic cracking unit. It was adopted by Honeywell for the development of an intelligent control system (Mylaraswamy & Venkatasubramanian, 1997).

2.2.4 Interaction between Control and Fault Diagnosis

Most of the FDD systems are implemented at the supervisory level on top of the available control system. As mentioned above, fault detection methods are based on measurements and some of these measurements are used for feedback in control loops. Thus, variations in the tuning of control loops may affect the closed loop dynamics of the controlled variables and subsequently may affect the performance of the fault detection algorithms. For example, detuning of a controller may be required to increase the variability in a controlled variable so as to improve the observability of a fault. However, in such a case the performance of the control unit would deteriorate. Hence, there is a tradeoff between fast fault detection and acceptable performance of the control unit. A control system that is tolerant to faults is referred to as a fault tolerant control (FTC) system. More precisely, FTC systems are closed loop control systems that can tolerate malfunctions of the system while maintaining desirable performance (Isermann, 2006).

Although the fault tolerant control problem has been extensively studied, most of the work on FTC was carried out on only one of the two components of the system, i.e. either the FDD component or the control strategy. The interaction between control and diagnosis has not been addressed as much. Hence, most available FDD algorithms that are operated together with a controller have not been designed to achieve an optimal trade-off between control and FDD performance. Thus, it is important to integrate FDD and control to develop flexible algorithms that satisfy both objectives (Blanke, Kinnaert, & Lunze, 2006).

Generally, interactive FTC approaches can be categorized into two classes, i.e., passive FTC and active FTC (Zhang & Jiang, 2008). For the passive FTC strategy, the controllers are fixed and predesigned to be robust against a class of predefined faults (Eterno, et al., 1985). In contrast, an active FTC system can react to potential faults by reconfiguring the control strategy to preserve stability and system properties. Thus, in active FTC, the controller has to compensate for the impacts of the possible faults either by selecting a pre-assumed control algorithm or by synthesizing a new one online (Patton, 1997). These two approaches rely highly on the real time FDD algorithm to provide timely information about the status of the system. Thus, the goal of a FTC system is to design controllers with flexible structures while maintaining stability and improving the performance, not only when all control components are performing normally, but also when faults occur.

The active FTC can be divided into four units: (1) a re-configurable controller; (2) a FDD algorithm; (3) a controller reconfiguration mechanism; and (4) a flexible reference governor (Zhang & Jiang, 2008). The issues are how to: (1) design controllers that can be reconfigured; (2) develop FDD schemes that are sensitive to faults while robust to model uncertainties, disturbances as well as noise; and (3) manipulate controllers in the event of faults to achieve desirable performance of monitored parameters. A four parameter controller setup that is a generalization of the two degrees of freedom controller was proposed to address the interaction between fault detection and control (Jacobson & Nett, 1991). The four degrees of freedom controller was reformulated into a general framework, where tools from optimal and robust control were applied (Tyler & Morari, 1994). Based on a standard fault diagnostic algorithm, simultaneous design of a controller and multivariate statistical model based fault diagnosis scheme was proposed and the economic impact of unobservable faults was discussed (Shams, Budman, & Duever, 2011). The influence of control on the fault detection problem was studied from the modeling point of view (Gertler & Cao, 2004), where the set point of the feedback control and/or the ratio coefficient to be used for ratio control was changed to improve the fault identification.

2.2.5 Estimation based on Sequential Monte Carlo Methods

Classification involves estimating unknown quantities from some given observations. When the prior knowledge about the phenomenon being modelled is available, Bayesian models can be formulated with this knowledge. The knowledge includes prior distributions for the unknown quantities and likelihood functions relating these quantities to the observations. Following this, all inference on the unknown quantities is based on the posterior distribution obtained from Bayes’ theorem. In terms of implementation, the observations (data) arrive sequentially in time and we are interested in performing inference online. Therefore, it is necessary to update the posterior distribution as new data become available. Computational efficiency is an additional motivation for real-time estimation with new data (Doucet, Freitas, & Gordon, 2001).


When the data can be modelled by a linear Gaussian state space model, it is possible to derive an exact analytical expression to compute the evolving sequence of posterior distributions. This procedure is the well-known Kalman filter (Ristic, Arulampalam, & Gordon, 2004). If the data are modelled as a partially observed finite state-space Markov chain, it is also possible to derive an analytical solution, which is known as the Hidden Markov Model (HMM) filter (Elliott, Aggoun, & Moore, 2008). These two popular filters rely on various assumptions to ensure mathematical tractability. However, the observations (data) collected can be very complex. For example, these data typically involve non-Gaussianity and nonlinearity, which may preclude an analytic solution. Many schemes, such as the extended Kalman filter, the Gaussian sum approximation and the grid-based filter, have been proposed to overcome this challenge. The first two methods cannot take all the salient statistical features of the process of interest into account, which may lead to poor estimation results. The third method, the grid-based filter (Ristic, Arulampalam, & Gordon, 2004), which uses deterministic numerical integration methods, can provide accurate results, but is difficult to implement and computationally prohibitive for high dimensional nonlinear problems.

Sequential Monte Carlo (SMC) methods are a set of simulation based methods that provide a convenient and attractive approach to computing the posterior distributions. SMC methods are flexible and can be easily applied to complicated problems (Doucet, Freitas, & Gordon, 2001). Over the last decades, several related algorithms, such as the particle filter and the Monte Carlo filter, have been proposed in several research fields. Since their introduction, particle filters have become a very popular method to solve optimal estimation problems in nonlinear and non-Gaussian scenarios. In the context of fault detection and diagnosis (FDD), the principle of particle filters is to approximate, by a number of particles, the conditional state probability distribution that can be used for fault detection. These particles contain samples from the state space and a set of weights that are associated with the particles. Particles can be easily generated and recursively updated using a given process model, which describes the evolution in time of the system under analysis. Thus, particle filter algorithms can be used to estimate the probability density function of the state, which can be further used to indicate the probability of the occurrence of a fault.
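A bootstrap particle filter can be sketched as follows. This is a minimal illustration under stated assumptions and not the algorithm used in this work: the state follows a scalar random walk, the measurement is the state plus Gaussian noise, the fault is a step change in the state, and the fault indicator is the posterior probability that the state exceeds a limit. All names and numerical values are hypothetical.

```python
import math
import random


def pf_step(particles, y, q_std, r_std, rng):
    """One bootstrap filter step: propagate, weight by likelihood, resample."""
    # Propagate each particle through the assumed random-walk process model.
    predicted = [x + rng.gauss(0.0, q_std) for x in particles]
    # Weight by the Gaussian measurement likelihood p(y | x).
    weights = [math.exp(-0.5 * ((y - x) / r_std) ** 2) for x in predicted]
    total = sum(weights)
    if total > 0.0:
        weights = [w / total for w in weights]
    else:
        # Guard against numerical underflow of all weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    # Multinomial resampling to counter weight degeneracy.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return predicted, weights, resampled


def exceedance_probability(particles, weights, limit):
    """Weighted-particle estimate of P(state > limit | measurements so far)."""
    return sum(w for x, w in zip(particles, weights) if x > limit)


rng = random.Random(1)
n_particles, q_std, r_std, limit = 500, 0.5, 0.5, 1.5

# Hypothetical scenario: the state steps from 0 to 3 when the fault occurs.
true_state = [0.0] * 20 + [3.0] * 20

particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
fault_prob = []
for x_true in true_state:
    y = x_true + rng.gauss(0.0, r_std)  # simulated noisy measurement
    predicted, weights, particles = pf_step(particles, y, q_std, r_std, rng)
    fault_prob.append(exceedance_probability(predicted, weights, limit))
```

The list `fault_prob` tracks the estimated probability of a fault over time. The estimate relies on a finite number of particles, so its accuracy and cost both grow with `n_particles`.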

2.3 Classification of Cells States

2.3.1 Microscopic Image Acquisition

Microscopy images of cells can be used to discriminate normal, apoptotic and necrotic cells. The morphological difference between apoptosis and necrosis was first observed by electron microscopy (Huerta, Goulet, Huerta-Yepez, & Livingston, 2007). Due to its high resolution, electron microscopy is capable of detecting the specific morphological changes of cells in early and late apoptosis. However, this method requires special technical training and is time consuming, which limits its application in practice.

Fluorescence microscopy can improve the observation of apoptotic bodies and also discern necrosis by staining cells with fluorescent dyes. Different fluorescent dyes such as Hoechst stains and Annexin V can be used to label
