Fault Detection and Model Identification in Linear Dynamical Systems

Horton, Kirk Gerritt. Fault Detection and Model Identification in Linear Dynamical Systems. (Under the direction of Dr. Stephen La Vern Campbell.)

Linear dynamical systems, $Ex' + Fx = f(t)$, in which $E$ is singular, are useful in a wide variety of applications. Because of this widespread applicability, much research has been done recently to develop theory for the design of linear dynamical systems. A key aspect of system design is fault detection and isolation (FDI). One avenue of FDI is via the multi-model approach, in which the parameters of the nominal, unfailed model of the system are known, as well as the parameters of one or more fault models. The design goal is to obtain an indicator for when a fault has occurred, and, when more than one type is possible, which type of fault it is. A choice that must be made in the system design is how to model noise. One way is as a bounded energy signal. This approach places very few restrictions on the types of noisy systems which can be addressed, requiring no complex modeling requirement.

This thesis applies the multi-model approach to FDI in linear dynamical systems, modeling noise as bounded energy signals. A complete algorithm is developed, requiring very little on-line computation, with which nearly perfect fault detection and isolation over a finite horizon is attained. The algorithm applies techniques to convert complex system relationships into necessary and sufficient conditions for the solutions

isolation via the separating hyperplane. The algorithm is implemented and tested on a suite of examples in commercial optimization software. The algorithm is shown to have promise in nonlinear problems, time varying problems, and certain types of

LINEAR DYNAMICAL SYSTEMS

by

Kirk Gerritt Horton

A dissertation submitted to the Graduate Faculty of
North Carolina State University
in partial fulfillment of the
requirements for the Degree of
Doctor of Philosophy

Operations Research Program

Raleigh, North Carolina

February 2001

Approved by:

S. L. Campbell (Chair of Advisory Committee)
R. Smith
K. Ito
H. T. Tran

Kirk Gerritt Horton was born September 14, 1963 in Paterson, New Jersey. He grew up attending public schools with his three sisters in West Milford, New Jersey. He was the valedictorian of the West Milford High School graduating class of 1981.

He received his B. Engineering degree with a major in Electrical Engineering and Computer Science from Stevens Institute of Technology in Hoboken, New Jersey in 1985. Shortly after graduation he received a commission in the United States Air Force from AFROTC. From 1985 to 1993 he was a pilot, flying the T-37 and the T-38 trainer, then the RF-4C reconnaissance/fighter, and then the F-111E fighter/bomber. He flew 19 missions over enemy territory in Iraq during Operation Desert Storm, receiving the AF Distinguished Flying Cross for one of those missions. From 1993 to 1995 he attended the Air Force Institute of Technology, earning an M.S. in Operations Analysis. From 1995 to 1997 he piloted the F-117A Stealth Fighter, ending that tour as an instructor in the aircraft. He arrived at N. C. State in 1997 to pursue a Ph. D. in Operations Research. He is currently serving on active duty as a Major in the Air Force.

The author is married to the former Susan Elaine Pringels, of Sumter, S.C. and

As countless students have before me, I owe an incredible amount of gratitude to my advisor, Dr. Stephen L. Campbell. An exceptional teacher, a meticulous researcher, a prolific writer, an accomplished artist, and a natural conversationalist, he guided me through the morass of graduate study with a firm but gentle hand. Without his expertise I would not have been able to complete this project.

I am also grateful to Dr. Ralph Smith, Dr. Hien T. Tran, Dr. Kazufumi Ito, and Dr. Ethelbert Chukwu for serving on my committee. Their mathematical and Operations Research experience was invaluable in ensuring the accuracy of my research.

I would like to thank the United States Air Force, and in particular the faculty and staff at the Air Force Institute of Technology, for preparing and selecting me for the program that allowed me to attend North Carolina State University to pursue my degree.

Finally, I must acknowledge my wife, Susan, and my daughters, Madisyn and Ashlyn. Without their love and support none of this would have been possible, nor would it have been worth doing. I pray that all of our father's blessings will be poured out on them as we travel around the world serving our country and spreading the

List of Tables
List of Figures

1 Introduction and Review of Prior Research
  1.1 Linear Descriptor Systems
    1.1.1 Basic Theory
    1.1.2 Numerical Solutions
  1.2 Fault Detection and Isolation
    1.2.1 Basic Theory
    1.2.2 Feedback and Observer Design
    1.2.3 Optimal Control
    1.2.4 $H_\infty$ Control
    1.2.5 Prior Research
    1.2.6 Conclusion
  1.3 Outline of Thesis
  1.4 Contributions of Thesis

2 Fault Detection via the Detection Signal
  2.1 The Problem - Finding the Minimum Energy Detection Signal
    2.1.1 Problem Setup
    2.1.2 Formulation as an Optimal Control Problem
    2.1.3 Problem Statement
  2.2 Necessary Conditions
    2.2.1 Computing the Necessary Conditions
    2.2.2 Riccati Form of Necessary Conditions
    2.2.3 Problem Formulation in Terms of the Necessary Conditions
    2.2.4 Sufficient Conditions
    2.4.2 Unreduced Model
    2.4.3 Controlled Systems
    2.4.4 Alternative Cost Functions
    2.4.5 Knowledge of Initial Conditions

3 Model Identification via the Separating Hyperplane
  3.1 The Problem - Determining the Origin of a Given Output
    3.1.1 Problem Setup
    3.1.2 The Separating Hyperplane
    3.1.3 Approximating the Separating Hyperplane
    3.1.4 Problem Statement
  3.2 The Model Identification Algorithm
  3.3 Variations
    3.3.1 Multiple Fault Models
    3.3.2 Alternative Formulations
    3.3.3 Controlled Systems
    3.3.4 Alternative Cost Functions
    3.3.5 Knowledge of Initial Conditions

4 Examples and Analysis of Results
  4.1 The Complete Problem and Algorithm
  4.2 Introduction of Software
    4.2.1 SOCS Parameters
    4.2.2 Choosing a Value of
  4.3 Introduction of Examples
  4.4 One-Dimensional State Examples
    4.4.1 Primary One-Dimensional Example
    4.4.2 Other One-Dimensional Examples
  4.5 Two-Dimensional State Examples
    4.5.1 Primary Two-Dimensional Example
    4.5.2 Other Two-Dimensional Examples
    4.5.3 Common Mode Two-Dimensional Example
  4.6 Industrial Example
  4.7 Multiple Fault Model Examples
    4.7.1 One-Dimensional Example

5 Future Work and Conclusions
  5.1 Future Work
    5.1.1 The Half-Infinite Interval
    5.1.2 Linear Time Varying Models
    5.1.3 Nonlinear Models
    5.1.4 Independent Noise Bounds
    5.1.5 Sensitivity Issues
  5.2 Conclusions

List of References

A Software Drivers
  A.1 Model Reduction
  A.2 Fortran Code Generation
  A.3 Optimization via the FDMI Algorithm
  A.4 Analysis and Presentation of Results
    A.4.1 Detection Signal Phase Processing

4.1 and $\|\hat v\|$ for Example 4.1: $t_f = 1, 10, 20, 100$
4.2 Formulation comparison of Example 4.1: $t_f = 1, 10, 20$
4.3 and $\|\hat v\|$ for Examples 4.1-4.4: $t_f = 1, 10$
4.4 Performance comparison of Example 4.5 on various time intervals
4.5 Performance comparison of Examples 4.2-4.10: $t_f = 1$
4.6 Performance comparison of Examples 4.2-4.10: $t_f$

3.1 Output sets under application of $\hat v$ and $\delta \hat v$, $\delta > 1$
3.2 Output sets under full and reduced noise contributions
4.1 Typical variation of $a(t)$ with
4.2 $\hat v$, for Example 4.1: $t_f = 1$ (left), $t_f = 10$ (right)
4.3 $\hat v$ for Example 4.1: $t_f = 20$ (left), $t_f = 100$ (right)
$t_f = 20, 100$
4.5 for Example 4.1 as a function of $t_f$
4.6 Rescaled $\hat v$ for Example 4.1: $t_f = 1, 10, 20, 100$
4.7 $y(t)$ and $a(t)$ for Example 4.1: $= 0.3, 0.5, 0.7, 0.9$
4.8 $\hat v$ for Examples 4.2 (left), 4.3 (center), 4.4 (right): $t_f = 1$
4.9 $\hat v$ for Examples 4.1-4.4: $t_f = 1$
4.10 $y(t)$ and $a(t)$ for Example 4.4: $= 0.3, 0.5, 0.7, 0.9$
$t_f = 1$ (left) and $t_f = 20$ (right)
$t_f = 1, 2, 4, 6, 8, 10, 20$
4.13 for Example 4.5 as a function of $t_f$ (left) and compared with Example 4.1 (right)
4.14 $y(t)$ and $a(t)$ for Example 4.5: $= 0.3, 0.5, 0.7, 0.9$
4.15 $\hat v$ for Examples 4.5-4.10: $t_f = 1$ (left), $t_f = 10$ (right)
4.16 $\hat v$ for Examples 4.2, 4.3, 4.4, 4.9: $t_f = 10$
4.17 Normalized $\hat v$ for Examples 4.5-4.10: $t_f = 10$
4.18 $y(t)$ and $a(t)$ for Example 4.10: $= 0.7$
4.19 Components of $\hat v$ for Example 4.11: $t_f = 5$
4.20 Components of $y(t)$ and $a(t)$ for Example 4.11: $= 0.7$
4.21 Components of $\hat v$ for Example 4.12: $t_f = 1$
4.22 Components of $y(t)$ and $a(t)$ for Example 4.12: $= 0.7$
4.23 $\hat v$ for Example 4.13 sequential vs. simultaneous (left) and full interval two-model vs. simultaneous (right)
4.24 $y(t)$ and $a(t)$ for Example 4.13: sequential solve
4.25 $y(t)$ and $a(t)$
4.27 $y(t)$ and $a(t)$ for Example 4.14: sequential
4.28 $y(t)$ and $a(t)$

Introduction and Review of Prior Research

Models of dynamical systems that consist of a set of linear differential and algebraic equations (DAEs)

$$Ez' + Fz = f(t) \tag{1.1}$$

in which the (square) matrix $E$ is singular, are called linear descriptor systems. Many systems throughout a wide variety of applications are most easily described as linear descriptor systems. Variational problems subject to constraints, such as the equations of motion for a robotic arm, can often be written as descriptor systems. Network modeling problems, as in electrical circuit design, are another example. The list continues with model reduction problems, singular perturbations, and discretizations of partial differential equations, just to name a few. (See [5, 8] for an in-depth description of applications and examples.) Because of this widespread applicability, much research has been done recently involving linear DAEs.

A key aspect of system design in linear DAE modeled systems is fault detection and isolation (FDI). One avenue of FDI is via the multi-model approach, in which the parameters of the nominal, unfailed model of the system are known, as well as the parameters of one or more fault models. The design goal is to obtain an indicator that tells the operator when a fault has occurred, and, when more than one type is possible, which type of fault it is.

Another aspect of system design is the modeling of noise. One way to model noise is as a bounded energy signal. This approach places very few restrictions on the types of noisy systems which can be addressed. It also presents no complex modeling requirement, a very useful computational tool of which we can take full advantage.

In this thesis we apply the multi-model approach to FDI in linear descriptor systems, modeling noise as bounded energy signals. The combination appears to be under-explored, in that very little research seems to exist that uses both the multi-model approach and bounded energy noise. We develop a complete algorithm, requiring very little on-line computation by an operator, with which nearly perfect fault detection and isolation over a finite horizon is attained.

1.1 Linear Descriptor Systems

We begin with a short introduction to descriptor systems, the basic theory and several numerical methods used to obtain solutions.

1.1.1 Basic Theory

As described above, DAEs occur in many applications. Models that consist of a set of ordinary differential equations (ODEs) often are first written as DAEs. A DAE is manipulated through differentiation and substitution to convert it to ODE form. Consider

$$x' = ax + by \tag{1.2a}$$
$$y = cx + d \tag{1.2b}$$

in which $a$, $b$, $c$, and $d$ are scalar constants. Equation (1.2) consists of a differential equation (1.2a) and an algebraic constraint (1.2b). The Jacobian of (1.2) with respect to $x', y'$ is

$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$$

which is singular. By differentiating (1.2b) we obtain the full system

$$x' = ax + by \tag{1.3a}$$
$$y = cx + d \tag{1.3b}$$
$$y' = cx'. \tag{1.3c}$$

We can substitute (1.3b) into (1.3a) to obtain the ODE

$$x' = (a + bc)x + bd \tag{1.4a}$$
$$y' = cx' \tag{1.4b}$$

the solution of which is easily obtained.
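The reduction from (1.2) to (1.4) can be checked numerically. The following is a minimal sketch, assuming illustrative coefficient values that do not come from the thesis: it uses the closed-form solution of (1.4a), recovers $y$ from the algebraic constraint, and confirms that the pair satisfies the original differential equation (1.2a).

```python
import math

# Hypothetical coefficients for (1.2); any values with a + b*c != 0 work here.
a, b, c, d = -2.0, 1.0, 0.5, 3.0
x0 = 1.0          # only x(0) may be chosen freely; y(0) is fixed by (1.2b)
k = a + b * c     # coefficient of x in the reduced ODE (1.4a)

def x(t):
    """Closed-form solution of x' = k*x + b*d with x(0) = x0."""
    return (x0 + b * d / k) * math.exp(k * t) - b * d / k

def y(t):
    """Algebraic variable recovered from the constraint (1.2b)."""
    return c * x(t) + d

def dae_residual(t, h=1e-6):
    """Residual of (1.2a), x' - (a*x + b*y), with x' by central difference."""
    x_dot = (x(t + h) - x(t - h)) / (2.0 * h)
    return x_dot - (a * x(t) + b * y(t))

residuals = [abs(dae_residual(t)) for t in (0.0, 0.5, 1.0, 2.0)]
consistent_y0 = y(0.0)   # equals c*x0 + d; it cannot be prescribed independently
```

Only $x(0)$ is free in this sketch. The initial value of $y$ follows from the constraint, which is the simplest instance of the consistency requirement on DAE initial conditions.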

Frequently, reasons exist for not attempting to manipulate a system like (1.2) into explicit (ODE) form. First, physical problems initially modeled as DAEs contain relationships between variables of interest. Changing to an explicit model may result in less meaningful variables, as well as a loss of the importance of the relationships between those variables. In addition, sparsity is usually lost. Numerical methods that rely on the sparsity of a DAE may not be suitable for solving the ODE that is obtained by differentiation and substitution. Finally, it may not be easy or even possible to convert a complex system into ODE form. When it is possible, it might be easier to solve the DAE directly than to do the mathematical manipulation necessary

It is due to these reasons, among others, that the base of research in DAEs has continuously grown over the last several years. At the heart of the theory are two key concepts, solvability and the uniform differentiation index [5].

Definition 1.1. The system (1.1), where $E$ and $F$ are $m \times m$ matrices, is solvable on an interval if for every $m$-times differentiable $f(t)$, there is at least one continuously differentiable solution to (1.1). In addition, solutions are defined on the entire interval and are uniquely determined by their value at any $t$ in the interval.

We will return to the necessary and sufficient conditions for solvability of certain types of DAEs a bit later.

Definition 1.2. The minimum number of times that all or part of (1.1) must be differentiated with respect to $t$ in order to determine $z'$ as a continuous function of $z, t$ is the index, $\nu$, of DAE (1.1).

Example (1.2) is an index one DAE. Numerical methods are well developed for index one DAEs. Higher index problems are notoriously more difficult to solve via numerical methods. Fortunately, all of the DAEs we will deal with in this thesis are of index one.

Of the several special structural forms for DAEs found in the literature, two will be of interest in this research:

Linear Time Invariant DAE

$$Ez' + Fz = f(t) \tag{1.5}$$

Linear Time Varying DAE

$$E(t)z' + F(t)z = f(t) \tag{1.6}$$

We mention three other types for completeness:

Linear in the derivative, nonlinear DAE

Semi-Explicit (nonlinear) DAE

$$z' = f[z(t), u(t), t] \tag{1.8a}$$
$$0 = g[z(t), u(t), t] \tag{1.8b}$$

Fully Implicit (nonlinear) DAE

$$F(z', z, t) = 0 \tag{1.9}$$

The extension of our algorithm to these problems will be left to future research.

The theory for (1.5) and (1.6) is fairly well understood. For (1.5), solvability is expressed in terms of a matrix pencil. For square matrices $E$ and $F$, and complex parameter $\lambda$, $\lambda E + F$ is called a matrix pencil. If its determinant is not identically zero as a function of $\lambda$, then the pencil $\lambda E + F$ is said to be regular. Equation (1.5) is solvable if and only if $\lambda E + F$ is a regular pencil [5]. If (1.5) is solvable we can let $z = Qw$ and premultiply by $P$ so that (1.5) becomes

$$PEQw' + PFQw = Pf(t) = g(t) \tag{1.10}$$

where $P$, $Q$ are nonsingular matrices such that

$$PEQ = \begin{bmatrix} I & 0 \\ 0 & N \end{bmatrix}, \qquad PFQ = \begin{bmatrix} C & 0 \\ 0 & I \end{bmatrix}.$$

$N$ is a nilpotent matrix the index of which is the same as the uniform differentiation index of DAE (1.5). The system is then decoupled, and can be written as

$$w_1' + Cw_1 = g_1(t) \tag{1.11a}$$
$$Nw_2' + w_2 = g_2(t). \tag{1.11b}$$

Equation (1.11a) is an ODE for which a solution exists for any initial value of $w_1$ and any continuous forcing function $g_1(t)$. The unique solution to (1.11b) is

$$w_2 = (ND + I)^{-1} g_2(t) = \sum_{i=0}^{\nu - 1} (-1)^i N^i g_2^{(i)}(t)$$

where $\nu$ is the index, or degree of nilpotency of $N$, and $D$ is the differentiation operator. Note that the initial values of $w_2$ are completely determined.
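The series solution of (1.11b) is easy to verify on a small case. The sketch below is only an illustration under assumed data: the $2 \times 2$ nilpotent matrix `N`, the forcing $g_2(t) = [\sin t, \cos t]^T$, and the helper names are hypothetical and not from the thesis. It also includes a sampling-based regularity check for a pencil, which is a heuristic (a nonzero determinant at one sample proves regularity, while all-zero samples only suggest singularity).

```python
import numpy as np

# Hypothetical index-2 example: N is nilpotent with N @ N = 0.
N = np.array([[0.0, 1.0],
              [0.0, 0.0]])

def pencil_is_regular(E, F, trials=5, seed=0):
    """Heuristic: det(lam*E + F) is nonzero for some sampled lam."""
    rng = np.random.default_rng(seed)
    for lam in rng.standard_normal(trials):
        if abs(np.linalg.det(lam * E + F)) > 1e-10:
            return True
    return False

def nilpotency_index(N, tol=1e-12):
    """Smallest nu with N**nu = 0 (assumes N really is nilpotent)."""
    power = np.eye(N.shape[0])
    for nu in range(1, N.shape[0] + 1):
        power = power @ N
        if np.all(np.abs(power) < tol):
            return nu
    raise ValueError("N is not nilpotent")

def g2_derivative(i, t):
    """i-th derivative of the illustrative forcing g2(t) = [sin t, cos t]."""
    # Each differentiation shifts the phase by pi/2.
    return np.array([np.sin(t + i * np.pi / 2), np.cos(t + i * np.pi / 2)])

def w2_derivative(j, t, N, nu):
    """j-th derivative of w2(t) = sum_{i<nu} (-1)^i N^i g2^{(i)}(t)."""
    total = np.zeros(N.shape[0])
    for i in range(nu):
        total += (-1) ** i * np.linalg.matrix_power(N, i) @ g2_derivative(i + j, t)
    return total

nu = nilpotency_index(N)
t = 0.7
# Residual of (1.11b): N w2' + w2 - g2 should vanish for the series solution.
residual = N @ w2_derivative(1, t, N, nu) + w2_derivative(0, t, N, nu) - g2_derivative(0, t)
max_residual = float(np.max(np.abs(residual)))
regular = pencil_is_regular(N, np.eye(2))   # pencil of the w2 subsystem
```

Because the series contains no free constants, $w_2(t_0)$ is dictated by $g_2$ and its derivatives, which is the sense in which the initial values of $w_2$ are completely determined.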

In the linear time-varying case, (1.6), a similar result holds. While the nature of the matrix pencil $\lambda E(t) + F(t)$ is no longer an indicator of solvability, the form of (1.11) is still important in linear time-varying DAEs.

Definition 1.3. The system (1.6) is in standard canonical form if it is in the form

$$\begin{bmatrix} I & 0 \\ 0 & N(t) \end{bmatrix} z' + \begin{bmatrix} C(t) & 0 \\ 0 & I \end{bmatrix} z = f(t) \tag{1.12}$$

where $N$ is strictly lower (or upper) triangular.

If $E(t)$, $F(t)$ are real analytic, then (1.6) is solvable if and only if, after linear time-varying coordinate changes, it can be written as (1.12). The problem exists in the difficulty of finding those coordinate changes that allow us to write the DAE in standard canonical form.

For the other three cases mentioned above, (1.7)-(1.9), the theory is much newer and also much less understood. For the purpose of this thesis, it is sufficient to note that the concepts presented above serve as a basis for the development of the theory for these cases. It should be noted that while this newer theory is beyond the scope of this thesis, the common starting point serves as a good indicator that the algorithm developed herein for linear time-invariant and linear time-varying DAEs may have applications in the more general case as well.

While there are many more interesting and useful items in the theory of DAEs, these few properties and definitions that we have mentioned will suffice for our

1.1.2 Numerical Solutions

While it is not our goal to present an exhaustive overview of the numerical methods that can be used to solve descriptor systems, we briefly mention those methods which will be used in later chapters. The discretization methods we will review are those that are used by the commercial software in which we implement the FDI algorithm developed in this thesis, namely the trapezoidal method, the Compressed Hermite-Simpson method, and the 4-stage implicit Runge-Kutta method. These direct transcription discretizations will be described using the semi-explicit DAE form (1.8). After discretization, several methods exist for solving the resulting finite dimensional problem. Of those methods, only sparse sequential quadratic programming (SQP) will be described here. While the software, Boeing's Sparse Optimal Control Software (SOCS), which will be introduced in a later chapter, can solve DAEs via an analytic transformation, as well as Euler's and linear multistep methods, these schemes will not be used in this thesis, and thus will not be mentioned here. Many of the approaches applied to DAEs are described in detail in [5, 13].

For our discussion of discretization and finite dimensional problem solution, consider a simple optimization problem based on the semi-explicit DAE (1.8)

$$\min_{t_0 \le t \le t_f} J[x(t), u(t), t] \tag{1.13a}$$

subject to

$$x' = f[x(t), u(t), t] \tag{1.13b}$$
$$0 = g[x(t), u(t), t]. \tag{1.13c}$$

Discretization

In general, transcription discretization schemes start by dividing the time interval, $[t_0, t_f]$, into $n$ segments

$$t_0 < t_1 < t_2 < \cdots < t_n = t_f$$

where the points $t_k$, $k = 0, \ldots, n$, are referred to as mesh points. Let $x_k = x(t_k)$ be the value of a state variable at a mesh point. Likewise, denote the value of a control at a mesh point as $u_k = u(t_k)$. Let $f_k = f(x_k, u_k, t_k)$ and $g_k = g(x_k, u_k, t_k)$ be the right-hand sides of (1.13b) and (1.13c), respectively. Finally, let $h_k = t_k - t_{k-1}$ be the step size for $k = 1, \ldots, n$.

Utilizing this notation, the trapezoidal method approximates the state equations (1.13b) and algebraic constraints (1.13c) as

$$x_k = x_{k-1} + \frac{h_k}{2}(f_k + f_{k-1}) \tag{1.14a}$$
$$0 = g_k. \tag{1.14b}$$

In the Compressed Hermite-Simpson scheme we denote the value of the control at the midpoint of a segment as $\bar u_k = u(\bar t_k)$ where $\bar t_k = \frac{1}{2}(t_k + t_{k-1})$, for $k = 1, \ldots, n$. The discrete approximations for this method are given by

$$x_k = x_{k-1} + \frac{h_k}{6}(f_k + 4\bar f_k + f_{k-1}) \tag{1.15a}$$
$$0 = g_k \tag{1.15b}$$

where

$$\bar f_k = f(\bar x_k, \bar u_k, \bar t_k)$$

for $k = 1, \ldots, n$. The 4-stage implicit Runge-Kutta discretization uses four intermediate, implicit steps

$$c_1 = h_k f(x_{k-1}, u_{k-1}, t_{k-1}) \tag{1.16a}$$
$$c_2 = h_k f(x_{k-1} + \tfrac{1}{2} c_1, \bar u_k, \bar t_k) \tag{1.16b}$$
$$c_3 = h_k f(x_{k-1} + \tfrac{1}{2} c_2, \bar u_k, \bar t_k) \tag{1.16c}$$
$$c_4 = h_k f(x_{k-1} + c_3, u_k, t_k). \tag{1.16d}$$

The discrete approximations for this method then become

$$x_k = x_{k-1} + \frac{1}{6}(c_1 + 2c_2 + 2c_3 + c_4) \tag{1.17a}$$
$$0 = g_k \tag{1.17b}$$

where $\bar u_k$ is defined as before.

These methods have all been proven to converge for index one DAEs, and are thus appropriate for our purpose [4, 5]. In every case, the result of the discretization, when combined with the cost function, $J$, is a sparse nonlinear programming (NLP) problem. The variables of the problem are the discretized states, $x_k$, controls, $u_k$, and time, $t_k$, for $k = 0, \ldots, n$.
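To make the three discretizations concrete, the sketch below applies each of them to a hypothetical scalar test problem, $x' = -x + u$ with $u \equiv 0$, which is not an example from the thesis. Two assumptions are worth flagging: the midpoint state in the Hermite-Simpson defect uses the usual Hermite interpolant $\bar x_k = \frac{1}{2}(x_k + x_{k-1}) + \frac{h_k}{8}(f_{k-1} - f_k)$, and the implicit relations are solved one step at a time by a secant iteration, whereas a transcription code would treat all steps simultaneously as NLP constraints.

```python
import math

def f(x, u, t):
    """Hypothetical test dynamics x' = -x + u (not taken from the thesis)."""
    return -x + u

def u_of(t):
    """Illustrative control; zero so the exact solution is x0 * exp(-t)."""
    return 0.0

def trapezoid_defect(x_prev, x_next, t_prev, t_next):
    """Residual of (1.14a)."""
    h = t_next - t_prev
    f_prev = f(x_prev, u_of(t_prev), t_prev)
    f_next = f(x_next, u_of(t_next), t_next)
    return x_next - x_prev - 0.5 * h * (f_next + f_prev)

def hermite_simpson_defect(x_prev, x_next, t_prev, t_next):
    """Residual of (1.15a) with the assumed Hermite midpoint state."""
    h = t_next - t_prev
    t_mid = 0.5 * (t_next + t_prev)
    f_prev = f(x_prev, u_of(t_prev), t_prev)
    f_next = f(x_next, u_of(t_next), t_next)
    x_mid = 0.5 * (x_prev + x_next) + h / 8.0 * (f_prev - f_next)  # assumed form
    f_mid = f(x_mid, u_of(t_mid), t_mid)
    return x_next - x_prev - h / 6.0 * (f_next + 4.0 * f_mid + f_prev)

def solve_step(defect, x_prev, t_prev, t_next, tol=1e-14, max_iter=50):
    """Solve defect(x_prev, x_next) = 0 for x_next by secant iteration."""
    xa = x_prev
    xb = x_prev + (t_next - t_prev) * f(x_prev, u_of(t_prev), t_prev)
    da = defect(x_prev, xa, t_prev, t_next)
    db = defect(x_prev, xb, t_prev, t_next)
    for _ in range(max_iter):
        if abs(db) < tol or db == da:
            break
        xa, xb, da = xb, xb - db * (xb - xa) / (db - da), db
        db = defect(x_prev, xb, t_prev, t_next)
    return xb

def rk4_step(x_prev, t_prev, t_next):
    """One step of (1.16)-(1.17) with the control known in advance."""
    h = t_next - t_prev
    t_mid = 0.5 * (t_next + t_prev)
    c1 = h * f(x_prev, u_of(t_prev), t_prev)
    c2 = h * f(x_prev + 0.5 * c1, u_of(t_mid), t_mid)
    c3 = h * f(x_prev + 0.5 * c2, u_of(t_mid), t_mid)
    c4 = h * f(x_prev + c3, u_of(t_next), t_next)
    return x_prev + (c1 + 2.0 * c2 + 2.0 * c3 + c4) / 6.0

def integrate(step, n=10, t_end=1.0, x0=1.0):
    """March across a uniform mesh t_k = k * t_end / n."""
    x = x0
    for k in range(1, n + 1):
        x = step(x, (k - 1) * t_end / n, k * t_end / n)
    return x

exact = math.exp(-1.0)
err_trap = abs(integrate(lambda x, ta, tb: solve_step(trapezoid_defect, x, ta, tb)) - exact)
err_hs = abs(integrate(lambda x, ta, tb: solve_step(hermite_simpson_defect, x, ta, tb)) - exact)
err_rk4 = abs(integrate(rk4_step) - exact)
```

On this smooth problem with ten steps, the trapezoidal error should be visibly larger than the errors of the two higher-order schemes, which is the usual motivation for starting on a coarse trapezoidal mesh and refining with a higher-order method.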

Solving the Finite Dimensional Problem

One way to solve this NLP problem, and the approach used by SOCS, is via a SQP approach [3]. Dropping subscripts for now, let $w$ be the vector of state and control variables, $(x, u)$, and let $F(w, t)$ be the constraint set resulting from the discretization of the DAE. That is, (1.14), (1.15), or (1.17), after shifting everything to the right hand side, becomes

$$0 = F(w, t).$$

Note that $F$ is a function of the state, control, and time variables at all time steps. The SQP algorithm requires an initial guess, $w^0$, and forms a new iterate by adding a scalar multiple, $\alpha$, of the search direction, $p$. That is,

$$w^1 = w^0 + \alpha p.$$

The search direction is found by solving a quadratic programming (QP) subproblem defined at the current point. The QP subproblem is defined as

$$\min \; J_w^T p + \frac{1}{2} p^T H p$$

subject to

$$0 = Gp$$

where $J_w$ is the gradient vector of the cost function, $H$ is an approximation to the Hessian matrix of the Lagrangian of the NLP ($L = J - \lambda^T F$), and $G$ is the Jacobian matrix of gradients of the constraints $F$. The step length, $\alpha$, is computed such that $H$ remains positive definite. The QP subproblem can be solved via either a sparse Schur-Complement method, when appropriate, or a null-space quadratic programming algorithm when $G$ and/or $H$ are dense. Details about the latter can be found in [3].
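For a small dense case, the QP subproblem stated above can be solved directly from its first-order (KKT) conditions. The sketch below is an assumption-laden illustration, not the sparse Schur-complement or null-space method used by SOCS: the function name `solve_equality_qp` and the numbers are hypothetical, and the multiplier is returned under the sign convention $Hp + J_w + G^T \mu = 0$.

```python
import numpy as np

def solve_equality_qp(H, grad, G):
    """Minimize grad^T p + 0.5 p^T H p subject to G p = 0 via the KKT system."""
    n, m = H.shape[0], G.shape[0]
    # Stationarity and feasibility stacked into one linear system.
    kkt = np.block([[H, G.T],
                    [G, np.zeros((m, m))]])
    rhs = np.concatenate([-grad, np.zeros(m)])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n:]   # search direction p, multiplier estimate

# Hypothetical data: identity Hessian approximation, one linear constraint p1 = p2.
H = np.eye(2)
grad = np.array([1.0, 3.0])
G = np.array([[1.0, -1.0]])
p, mult = solve_equality_qp(H, grad, G)
```

With these numbers the constraint forces $p_1 = p_2 = s$, the objective reduces to $4s + s^2$, and the minimizer is $s = -2$, so the computed direction should be $p = (-2, -2)$.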

An algorithm based on the combination of a direct transcription scheme and the SQP approach begins with a discretization and an initial guess. The SQP problem is then solved via the QP subproblem iteration. After each QP subproblem is solved, the current point is updated and the procedure is repeated. The subproblem iteration terminates when a point is reached which satisfies necessary conditions for a local minimum within a given set of tolerances. The solution is then compared to that of the previous discretization iteration, or the initial guess if it is the first iterate. The

succes- demonstrates quadratic convergence under the right conditions [3]. Convergence rates for the direct transcription schemes, when applied to index one descriptor systems, are at least quadratic, and, under the right system coefficient conditions, often are considerably better [5].

1.2 Fault Detection and Isolation

With this basic understanding of the theory of descriptor systems and numerical methods for their treatment, we now turn our attention to the various approaches for treating faults in those systems. We begin with basic control theory, and then turn to feedback, the link between control theory and FDI. Following that is a discussion of the elements of optimal control and $H_\infty$ control pertinent to our approach. We conclude with a discussion of existing research into FDI in descriptor systems and the methods used.

1.2.1 Basic Theory

A descriptor system is one possible result of a system design problem. The problem begins with a task to be accomplished, and the design engineer is usually given goals or objectives that describe the desired performance characteristics of the system along with a set of constraints by which the system is bound. The development of a system which accomplishes the objectives while meeting the constraints is the system design problem.

A particular type of system design problem is the control problem, in which the goal is to generate certain outputs from the system or to maintain the state of the system within certain bounds. For example, an engineer might be asked to design

essential elements of such a control problem are

a mathematical model of the system,
a desired output,
a set of admissible controls,
a performance measure.

Often, as stated above, a descriptor system is the natural product of the system design problem. For the remainder of this thesis, we will restrict most of our study to linear time-invariant systems (1.5). Comments extending our algorithm to linear time-varying systems (1.6) are included in a later chapter.

Consider a system based on the linear time invariant DAE (1.5)

$$x' = Ax + Bu \tag{1.18a}$$
$$y = Cx \tag{1.18b}$$

where $x$, $y$, and $u$ are the state, output, and control vectors, respectively, and the time interval considered is $t \in [t_0, t_f]$. Systems often allow for noise or unknown inputs by adding a term to each equation of (1.18)

$$x' = Ax + Bu + M\mu \tag{1.19a}$$
$$y = Cx + N\mu \tag{1.19b}$$

where $\mu$ is the noise or unknown input, and the matrices $M$ and $N$ are the weight matrices for the state and output noise, respectively.

Central to the study of system (1.18) are the concepts of controllability, observability, and stability [6].

Definition 1.4. A linear system is said to be controllable at $t_0$ if it is possible to find some input function $u(t)$, defined over $t \in [t_0, t_f]$, which will transfer the initial state $x(t_0)$ to the origin at some finite time $t_1 \in [t_0, t_f]$, $t_1 > t_0$. If this is true for all $t_0$ and $x(t_0)$, the system is completely controllable.

Definition 1.5. A linear system is said to be observable at $t_0$ if $x(t_0)$ can be determined from the output function $y_{[t_0, t_1]}$ for $t_0 \in [t_0, t_f]$ and $t_0 \le t_1$, where $t_1$ is some finite time, $t_1 \in [t_0, t_f]$. If this is true for all $t_0$ and $x(t_0)$, the system is completely observable.

Since controllability describes the ability of the control to affect the system state, it involves the matrices $A$ and $B$. Likewise, since observability describes the ability of the output to characterize the state, it involves the matrices $A$ and $C$. Simply stated, the $n$th-order system (1.18) is controllable if and only if $[sI - A \mid B]$ has rank $n$ for all values of $s$. The same system is observable if and only if $[sI - A^T \mid C^T]$ has rank $n$ for all values of $s$. Proofs of these characteristics can be found in [6], along with the requirements for controllability and observability in more complex systems.
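The two rank conditions can be tested numerically. Since $sI - A$ already has rank $n$ whenever $s$ is not an eigenvalue of $A$, it is enough to check the rank at the eigenvalues. The sketch below does this for a hypothetical double integrator; the function names and matrices are illustrative assumptions, not from the thesis.

```python
import numpy as np

def is_controllable(A, B, tol=1e-9):
    """Rank test on [sI - A | B], evaluated at each eigenvalue s of A."""
    n = A.shape[0]
    for s in np.linalg.eigvals(A):
        test_matrix = np.hstack([s * np.eye(n) - A, B])
        if np.linalg.matrix_rank(test_matrix, tol=tol) < n:
            return False
    return True

def is_observable(A, C, tol=1e-9):
    """Rank test on [sI - A^T | C^T], i.e. controllability of the dual pair."""
    return is_controllable(A.T, C.T, tol)

# Hypothetical double integrator: position x1, velocity x2, force input.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B_force = np.array([[0.0], [1.0]])      # input drives the velocity
B_bad = np.array([[1.0], [0.0]])        # input cannot reach the velocity
C_position = np.array([[1.0, 0.0]])     # measure position
C_velocity = np.array([[0.0, 1.0]])     # measure velocity only

results = {
    "force input controllable": is_controllable(A, B_force),
    "bad input controllable": is_controllable(A, B_bad),
    "position output observable": is_observable(A, C_position),
    "velocity output observable": is_observable(A, C_velocity),
}
```

Measuring only velocity leaves the position unobservable, because a constant offset in position never appears in the output.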

The concept of stability helps us deal with systems that may not be controllable and/or observable. Stability is closely related to the eigenvalues of the $A$ matrix. Intuitively, a solution to (1.18) is stable if we can stay close to the solution by starting close enough to it via the initial condition. A solution is asymptotically stable if, by starting close enough, we converge to the solution. A system is stabilizable if all unstable modes are controllable, and detectable if all unstable modes are observable. Thus the system can be handled effectively provided all uncontrollable and unobservable modes are stable. This situation can often be tolerated in a control system [6]. For the remainder of this thesis, we will assume that we are dealing only with the controllable and/or observable modes of control systems.

1.2.2 Feedback and Observer Design

The bridge between basic control theory and fault detection is the concept of feedback. In a feedback control system, the control, $u(t)$, is modified by information about the

information to the controller, which adjusts the control based on the input from the sensors. One of the fundamental goals of feedback compensator design is to improve the performance of the system through eigenvalue placement. As stated earlier, stability depends on the eigenvalues of the $A$ matrix. By assigning desirable values to eigenvalues, system stability can be enhanced. For the state feedback case, the relation

$$u(t) = Fv(t) - Kx(t) \tag{1.20}$$

is used, where the matrix $K$ is called the feedback gain matrix, and $F$ the feed-forward matrix. Substituting into (1.18), we obtain

$$x' = (A - BK)x + BFv \tag{1.21a}$$
$$y = Cx. \tag{1.21b}$$

Clearly, the eigenvalues of the $A - BK$ matrix now determine the stability of the system. By careful construction of the feedback gain matrix $K$, the eigenvalues are assigned the desired values. For the output feedback case, the relation

$$u(t) = Fv(t) - Ky(t) \tag{1.22}$$

is used, where the $K$ and $F$ matrices are as defined above. Substituting this relation into (1.18), we obtain

$$x' = (A - BKC)x + BFv. \tag{1.23}$$

Now the eigenvalues of the $A - BKC$ matrix determine the stability of the system. Unfortunately, due to the presence of the $C$ matrix in this expression, output feedback usually cannot place all of the eigenvalues of the system. This limitation is present

have no impact on controllability. Output feedback can impact neither controllability nor observability of a system [6].
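The contrast between (1.21) and (1.23) shows up already in a two-state example. The following sketch assumes a hypothetical double integrator with a position measurement; the gain values are illustrative. State feedback places both eigenvalues in the left half plane, while static output feedback with a scalar gain $k$ gives the closed-loop matrix $\begin{bmatrix} 0 & 1 \\ -k & 0 \end{bmatrix}$, whose eigenvalues are $\pm\sqrt{-k}$ and therefore never both have negative real part.

```python
import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator (hypothetical)
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])               # only position is measured

# State feedback u = Fv - Kx: eigenvalues of A - BK as in (1.21a).
K_state = np.array([[2.0, 3.0]])
state_fb_eigs = np.sort(np.linalg.eigvals(A - B @ K_state).real)

def output_fb_max_real(k):
    """Largest real part among eigenvalues of A - B K C for scalar gain k."""
    K_out = np.array([[k]])
    return float(np.max(np.linalg.eigvals(A - B @ K_out @ C).real))

# Output feedback u = Fv - Ky: sweep a few gains, as in (1.23).
output_fb_worst = [output_fb_max_real(k) for k in (-4.0, 0.5, 1.0, 10.0, 100.0)]
```

For `K_state = [2, 3]` the characteristic polynomial of $A - BK$ is $s^2 + 3s + 2$, so the state-feedback eigenvalues are $-2$ and $-1$. No gain in the output-feedback sweep achieves a negative maximum real part, which is the limitation an observer is meant to remove.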

Using feedback, the basic tool for many FDI approaches can be constructed: the observer. For most systems the only information about the system state is through the output vector, which often provides only partial information. Thus, output feedback is the only option, and not all system eigenvalues can be placed where desired. To improve system stability in these cases, the most frequently used method is to reconstruct information about the remaining elements of the state vector through development of an observer of the system. Consider the observer

$$\hat x' = A\hat x + Bu + L(y - C\hat x) \tag{1.24}$$

where $\hat x$ is the observer estimate for the state vector. Note that $y$ is the output from the real system, (1.21), and $C\hat x$ is the observer output. Subtracting (1.18a), and letting $e = \hat x - x$ be the observer error, we obtain

$$e' = (A - LC)e.$$
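The error equation can be exercised in simulation. The sketch below is illustrative only: the double integrator, the gain `L_gain`, the input $u(t) = \sin t$, and the initial states are assumptions chosen so that $A - LC$ has eigenvalues $-1$ and $-2$. The plant (1.18) and the observer (1.24) are integrated together with a classical Runge-Kutta loop, and the estimation error is compared at the start and end of the run.

```python
import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # hypothetical double integrator
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
L_gain = np.array([[3.0], [2.0]])        # chosen so eig(A - L C) = {-1, -2}

observer_eigs = np.sort(np.linalg.eigvals(A - L_gain @ C).real)

def rhs(t, s):
    """Stacked dynamics of the plant state x and the observer state x_hat."""
    x, x_hat = s[:2], s[2:]
    u = np.array([np.sin(t)])            # illustrative known input
    y = C @ x                            # measured output of the plant
    dx = A @ x + B @ u
    dx_hat = A @ x_hat + B @ u + L_gain @ (y - C @ x_hat)
    return np.concatenate([dx, dx_hat])

def rk4(rhs, s, t0, t1, n):
    """Classical fourth-order Runge-Kutta integration over n uniform steps."""
    h = (t1 - t0) / n
    t = t0
    for _ in range(n):
        k1 = rhs(t, s)
        k2 = rhs(t + h / 2, s + h / 2 * k1)
        k3 = rhs(t + h / 2, s + h / 2 * k2)
        k4 = rhs(t + h, s + h * k3)
        s = s + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return s

s0 = np.array([1.0, -1.0, 0.0, 0.0])     # plant starts at (1, -1), observer at 0
s_end = rk4(rhs, s0, 0.0, 10.0, 1000)
initial_error = float(np.linalg.norm(s0[2:] - s0[:2]))
final_error = float(np.linalg.norm(s_end[2:] - s_end[:2]))
```

The input cancels in the error dynamics, so the decay of the error depends only on the eigenvalues of $A - LC$, not on how the plant itself is being driven.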

Since $L$ is arbitrary and $(A, C)$ is observable, we can guarantee that observer error goes to zero by selecting $L$ so that $A - LC$ is stable. With this construction, state feedback is possible using the observer estimate for the state vector. Thus all system eigenvalues can be placed where desired, and complete control over system stability is possible. It should be noted that since the complete state vector is reconstructed by the observer, faults which send the system into unpredicted or undesirable states may be detectable by such an observer simply by comparing the observer estimate with those elements of the system state vector which are available. This fault detection can be accomplished without using the observer to affect any feedback compensator

1.2.3 Optimal Control

Later, when we develop our algorithm, we will work with a control system which acts as the constraints in an optimization problem. This optimal control structure is key to the multi-model approach to FDI, which we will discuss in Section 1.2.5. Accordingly, we briefly review optimal control theory. While this area of study is vast, the only topic which we will need for our discussion is the state regulator problem, also called the linear quadratic regulator (LQR) problem. Consider the optimization problem

$$J(x, u) = \min \; \frac{1}{2} x(t_f)^T S_f x(t_f) + \frac{1}{2} \int_{t_0}^{t_f} \left( x^T Q x + u^T R u \right) dt \tag{1.25a}$$

subject to

$$x' = Ax + Bu \tag{1.25b}$$

as well as some initial conditions at the beginning of the interval, where $S_f$, $Q$, and $R$ are the weight matrices for the terminal cost, the trajectory, and the control. It is assumed that $Q$ is positive semi-definite and $R$ is positive definite. This is one form of the LQR problem and it is important for three reasons. First, the theory is elegant and robust. Results are easy to understand and implement in numerical algorithms. Second, it has strong geometry. $J(x, u)$ is actually an inner product norm with useful properties. Finally, there are strong physical correlations to this type of problem. Energy is a quadratic form, as is power.

As with any optimization problem, the LQR problempossesses necessary

condi-tions for aminimum. Forthe problem (1.25), weconstruct the Hamiltonian

H(x;;u)= 1

2 (x

T

Qx+u T

R u)+ T

(Ax+Bu) (1.26)

(28)

must besatised by any extremum of the problem,are

H T

= x 0

(1.27a)

H T

x

=

0

(1.27b)

H T

u

= 0: (1.27c)

When applied to(1.26), we obtain

0

= Qx A

T

(1.28a)

x 0

= Ax+Bu (1.28b)

0 = R u+B T

: (1.28c)

Using (1.28c) and our assumption that $R > 0$, we can eliminate $u$ from (1.28) to
obtain a set of differential equations in $x$ and $\lambda$
\begin{align}
x' &= Ax - B R^{-1} B^T \lambda \tag{1.29a} \\
\lambda' &= -Qx - A^T \lambda. \tag{1.29b}
\end{align}
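As a rough numerical illustration of the elimination step, the sketch below stacks the state and costate into one vector and builds the coefficient matrix of the linear system in $(x, \lambda)$ obtained by removing $u$. The function name `lqr_hamiltonian_matrix` and the double-integrator data are assumptions made for the example, not part of the thesis software.

```python
import numpy as np

def lqr_hamiltonian_matrix(A, B, Q, R):
    """Coefficient matrix of the system in the stacked variable z = (x, lambda)."""
    R_inv_Bt = np.linalg.solve(R, B.T)        # R^{-1} B^T; R > 0 makes this well defined
    top = np.hstack([A, -B @ R_inv_Bt])       # x'      =  A x - B R^{-1} B^T lambda
    bottom = np.hstack([-Q, -A.T])            # lambda' = -Q x - A^T lambda
    return np.vstack([top, bottom])

# Hypothetical example data: a double integrator with identity state weight.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[2.0]])
Ham = lqr_hamiltonian_matrix(A, B, Q, R)

# Cross-check: pick x and lambda, recover u from 0 = R u + B^T lambda.
x = np.array([1.0, -0.5])
lam = np.array([0.3, 0.7])
u = -np.linalg.solve(R, B.T @ lam)
z_dot = Ham @ np.concatenate([x, lam])
```

The first half of `z_dot` should agree with $Ax + Bu$ for the recovered $u$, which is a quick way to catch a sign error in the blocks.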

While this form will be useful in our algorithm, it is possible to take an additional step
and eliminate $\lambda$, resulting in a matrix Riccati differential equation for the optimal
control feedback gain matrix. The derivation of the Riccati equation will be detailed
when we develop our algorithm in the next chapter. It should be noted that our
assumptions on the $Q$ and $R$ matrices, while not restrictive in an applicability sense,
guarantee that the extremum which satisfies the necessary conditions represents at
least a local minimum of the cost $J(x,u)$. In fact, $Q$ is often positive definite, and
in that case, the conditions for an extremum are necessary and sufficient. Detailed

1.2.4 $H_\infty$ Control

$H_\infty$ control in the time domain is similar to optimal control. It takes advantage of
the linear quadratic (LQ) form in addressing significant uncertainties in the energies
of system noises. For bounded energy noise inputs, where little or no other knowledge
is available about the signal, the LQR formulation is an elegant worst-case approach.
The model generally takes the form of (1.19), and all functions are assumed to exist in
the space of square integrable functions, denoted $L^2$. While $H_\infty$ performance criteria
vary, they all share the structure of the optimal control cost function, that is, they
are all in LQ form.

In this setting, filtering, smoothing, and compensator design are efficiently
accomplished. Nagpal and Khargonekar [27] apply a filtering and smoothing method
using an $H_\infty$ performance criterion on both the finite and half-infinite intervals to
accomplish state estimation (filtering) and output smoothing. Tadmor [36] attempts
to find, in LQ game-theoretic terms, the compensator which provides the best control
in response to the worst disturbance. Matrix Riccati equations provide solutions in
each case.

While the structure of our problem is very similar to the $H_\infty$ problem, several
key differences will become apparent. First, we will solve a different problem. While
Tadmor [36] designs a worst-case compensator, and Nagpal and Khargonekar [27]
solve for the optimal filter and smoother in the face of various initial conditions, we
will solve for the optimal fault detection signal. In addition, while both [27] and
[36] work in single model systems, we will work in a multi-model system. Finally,
while the noise present in our system is also $L^2$, it is not the same kind of signal as
is commonly assumed in $H_\infty$ control. The impact of these differences will become

1.2.5 Prior Research

In addition to the two approaches mentioned above, fault detection and isolation in
linear control systems has been attempted from many angles. To begin, we note that
there exist two basic types of approaches to FDI: passive and active. In the passive
approach, only monitoring of system performance is allowed. No interaction with the
system occurs, either for material or security reasons. The system states (or outputs)
are measured and compared to "normal" system behavior, generating a residual. The
residual is computed such that it is equal or close to zero when no faults are present,
and much different from zero when a fault occurs. The vast majority of research in
FDI using the passive approach applies observers to generate residuals.

Passive Methods

Nuninger et al. [32] use analytic redundancy in order to detect sensor and actuator
failures or process disturbances. Analytic redundancy attempts to generate a residual
that might contain information about the faults. Two methods for generating the
residual are examined. First, direct residual generation is based on the parity space
approach, using the input-output transfer function (the parity equation). Second,
indirect residual generation is based on output estimation, using an observer to do
state estimation first. The authors apply the first method to known input systems
and the second method to both known and unknown input systems. A drawback of
this approach is that some faults may have no influence on the residuals generated by
either method, so perfect FDI is not attained. Chen and Speyer [15] also use analytic
redundancy, generating a residual via an observer that reconstructs the system state
vector. Their model has the target fault direction explicitly in the detection filter

Koenig et al. [24] present a comparative study of several design methods for
unknown input observers (UIO) used for FDI and Correction. Their goal is to design
an integrated approach which can detect, isolate, and correct a large variety of faults
for a desired system with real-time computation constraints. Methods compared are:
failure isolation by using banks of observers (robust to some faults, but sensitive to
others, in combination so that all faults are detectable), failure isolation by observer
pole placement (to create an unknown input fault detection observer), and failure
correction via general structured UIO (design of full order observer to estimate states
as well as unknown inputs). Chung and Speyer [17] develop a game theoretic
detection filter, which is similar to the UIO, that attenuates disturbances, bounding all
signals except the fault to be detected, embedding the exogenous signals into an
unobservable, invariant subspace. The subspace structure is used to reduce the order of
the limiting filter by factoring the invariant subspace out of the state space, resulting
in a lower order filter sensitive only to the fault to be detected. The filter is applied
to the flight control characteristics of the F-16XL and a simple rocket.

The parity relation approach to residual generation is applied by Youssouf and
Kinnaert [40]. The method is based on the inverse of the map from both unknown
inputs and faults to the observable signals (measured inputs and outputs), using a
variable change in the frequency domain. Tools available for nonsingular systems can
be used on the resulting map. The authors contend that FDI for singular systems
depended previously on state estimation, which put unnecessary requirements on
the plant, as there is no need to reconstruct the entire state vector to do residual
generation. Where Youssouf and Kinnaert [40] apply their method to continuous time
systems, Sauter et al. [35] do the same for the discrete case in mechanical systems,
though the theory and algorithm are completely different. They do state equation

of from the residual.

Chowdhury and Aravena [16] go in a slightly different direction. They apply a
modular methodology to the area of fast fault detection and classification in dynamical
electrical power systems. Module I is the generation of fault indicators in one of two
ways:

1. model-based, in which a residual is generated using either an accurate
mathematical or I/O model of the nominal system, or an I/O model is built on-line,
which is very difficult,

2. model-free, in which detectable variables are measured and enhanced if
necessary by signal-processing techniques.

The authors present a model-free orthogonal decomposition based on multirate filter
banks to produce a fault indicator. Module II is the measuring and testing of fault
indicators via either statistical test or feedforward neural-network testing. The authors
explore the neural network aspect. Fault classification occurs in module III, another
neural network, the operation of which depends on the existence of a system model.
The emphasis is on model-free methods, those less explored and less restrictive
cases where models are not available, non-linearities prevent model derivation, or too
many uncertainties exist in the system. These cases appear to hold the most promise
for neural-network applications in fault detection.

Hybrid Passive-Active Methods

Some research has been done using a hybrid of the passive/active approaches. The
passive approach is used to detect faults, then an active approach is used to
correct or compensate for faults through feedback. Zhang and Jiang [43] investigate

to discrete-time stochastic vertical take-off and landing aircraft systems. A bank of
two-stage adaptive Kalman filters is used for FDI, and statistical decisions are made
for fault detection, diagnosis and activation of controller reconfiguration. In a related
paper [42], the same authors apply an interacting multiple-model based approach to
the same type of control problem. A finite-state Markov chain is linked to the same
Kalman filter bank for fault diagnosis. The decision from this process is used to
activate system reconfiguration via eigenstructure assignment.

Active Methods

The drawback inherent in the passive approach is that faults can be masked by the
application of the control. Thus it is possible that a fault could go undiscovered until
it is too late to correct it. In direct contrast, the active approach interacts with the
system on a periodic basis, or at critical times, to detect faults, thus eliminating the
possibility of the presence of undetected faults. The approach uses various types of
interaction with the system to detect faults. A test signal, which is constructed in
such a way that faults are highlighted, is fed into the system. Observation and/or
manipulation of the resulting output is used to make a decision about system faults.
Observers designed to aid in feedback, as well as various other types of feedback
compensators, are examples of the active approach.

Bennett et al. [2] apply speed dependent feedback (a stable time-varying linear
observer) to detect intermittent, short duration faults in bilinear dynamical systems.
The AC drive system for an electric train is considered. These systems experience
abrupt disconnections which introduce severe transient errors and which are hard to
detect due to their short durations. The parity equation approach is not preferred in
this case due to the intermittent nature of the faults. By combining the observer and

This case is an example of the application of a test signal as part of the feedback to
control the system and correct faults.

The multi-model approach is well-suited to the case where it is desirable to apply
a test signal independent of feedback control. The approach relies on the presence of
the system model
\begin{align}
x_i' &= A_i x_i + B_i v + M_i \nu_i \tag{1.30a} \\
y &= C_i x_i + N_i \nu_i \tag{1.30b}
\end{align}

for $i = 0, \ldots, m$, where $m$ is the number of faults expected from the system, and
$v$ is the test signal. A different system model exists, with known parameters, for
each possible fault. It is assumed that any feedback control has been absorbed into
the $A_i$ matrix, thus eliminating the control $u$ from the differential equation. The
difficulties in this approach lie in determining from which model an output $y$ derives,
as well as the computation of system parameters for each fault model. Nikoukhah
[28] presents the use of a test signal for active FDI in discrete-time linear systems
subject to inequality-bounded perturbations. Detectability is required, but when
present, guaranteed FDI is attained. The discrete time case lends itself to recursive
algorithms, and so recursion is used extensively by the author to develop the test
signal. After constructing a test signal that separates outputs into disjoint convex
sets, the author uses the separating hyperplane approach to determine which set a
certain output falls into, and thus whether a fault has occurred. Linear programming
is used to construct the separating hyperplane. Nikoukhah et al. [29] have the same
goal as [28], but go about it completely differently. Among the differences, fault
isolation is accomplished by a ratio test, and optimal control theory is applied. This
paper is the inspiration for our current research, and thus uses some of the same

and algorithmic differences. In addition, both [28] and [29] consider only two-model
systems, whereas our approach can handle problems with three or more models.

The multi-model approach is also useful with Kalman filtering. Keller et al. [21]
present the multi-model approach for fault detection in stochastic systems with
unknown inputs. The method uses the two-stage Kalman filter with unknown inputs
and constant biases, the first stage of which is bias-free (for fault detection) and the
second stage is a bias filter (for fault isolation). The optimum state estimate is
expressed as the output of the bias-free filter corrected with the output of the bias filter.
Different fault types are detected using a bank of such filters. The two stages of the
filter reduce computational time associated with the presence of multiple faults.

1.2.6 Conclusion

As mentioned in the introduction to this chapter, the combination of the
multi-model approach and the bounded energy noise model seems to be under-explored.
The common thread running through most of the applications mentioned in the last
section is the modeling of noise. References [16, 21, 22, 23, 33, 37, 41, 42, 43] model noise as
some type of random variable. Many use filtering or statistical tests to make the fault
decision, and thus do not model noise at all. Only [17, 27, 28, 29, 36] model noise
as bounded energy signals. As we shall see, the bounded energy noise model is very
well suited to the multi-model approach, and the combination as developed in this thesis
provides a powerful tool for fault detection and isolation in descriptor systems.

1.3 Outline of Thesis

In the next chapter, we present the theory and algorithm for the fault detection
problem. Following that, Chapter 3 is the development of the algorithm for the
model identification phase. In Chapter 4, we state the full algorithm, then present
and analyze several examples. Lastly, Chapter 5 is the conclusion and outline of future
research possibilities in this area. Software codes for the algorithm are in Appendix
A.

1.4 Contributions of Thesis

The research in this thesis will appear, or has already appeared, in the following
publications:

S. L. Campbell, K. Horton, R. Nikoukhah, and F. Delebecque, Rapid
Model Selection and the Separability Index, in Proc. 4th IFAC
Symposium on Fault Detection, Supervision and Safety for Technical Processes
(SAFEPROCESS 2000), Budapest, Hungary, June 2000, pp. 1187-1192.

R. Nikoukhah, F. Delebecque, S. L. Campbell, and K. Horton,
Multi-model Identification and the Separability Index, in Proc. 14th
International Symposium of the Mathematical Theory of Networks and Systems
2000, Perpignan, France, June 2000, CDROM.

R. Nikoukhah, S. L. Campbell, Kirk Horton, and F. Delebecque, Auxiliary
signal design for robust multi-model identification, IEEE Transactions on
Automatic Control, accepted subject to final revision.

S. L. Campbell, Kirk Horton, R. Nikoukhah, and F. Delebecque,
Auxiliary signal design for rapid multi-model identification using optimization,

Fault Detection via the Detection Signal

2.1 The Problem - Finding the Minimum Energy Detection Signal

As introduced in the previous chapter, our goal is to attain near-perfect fault detection
and model identification in linear descriptor systems using the multi-model approach.
This approach allows the treatment of the problem in two steps. In this chapter, our
focus will be on the fault detection step of the problem, while the next chapter will
tackle the model identification step.

Multi-model fault detection and model identification means that we have two or
more possible models for a system, and we decide which one corresponds to the system
based on measurements of the inputs and outputs of the system over a finite test
period, $[0, t_f]$. While other possible test periods exist, we will restrict our discussion
to the finite interval.

In order to exclude all but one model based on input-output measurements, the
input signal must have special properties. A signal with such properties is called a
proper detection signal. For the remainder of the present discussion we will assume

the fault model. This assumption is not restrictive in any way, and later we will
describe how the algorithm can be extended to include the case in which more than
one fault model is present.

2.1.1 Problem Setup

The true model of the system is one of two models
\begin{align}
x_i' &= A_i x_i + B_i v + M_i \nu_i \tag{2.1a} \\
y &= C_i x_i + N_i \nu_i \tag{2.1b}
\end{align}

for $i = 0$ and $1$, and for $t \ge 0$, where $x_i$, $y$, $v$, and $\nu_i$ are the system states, output,
detection signal, and noise, respectively. The matrices $A_i$, $B_i$, $C_i$, $M_i$, and $N_i$ are
matrices of appropriate dimensions. We assume that $v$ and $\nu_i$ are in $L^2[0, t_f] = L^2$,
forcing $x_i$ and $y$ to be in $L^2$ as well. While we assume full row rank of the $M_i$ and $N_i$,
and controllability/observability of the system for computational reasons, there is no
assumption that the dimensions of the state or noise vectors of the two models are the
same. We also assume no a priori information about the system before $t = 0$, and in
particular no information about $x_i(0)$. Thus, unlike some existing theory, in particular
[30], we have no weights on $x_i(0)$. (We will discuss the impact of information about
initial conditions and the subsequent presence of weight matrices on $x_i(0)$ later in this
chapter.) In addition, we assume that any feedback control has been absorbed into
the $A_i$ matrices as described in Chapter 1, or else is nulled at $t = 0$ for the duration of
the test period. Thus, the only common elements of the two models are the output,
$y$, and the detection signal, $v$. Note that (2.1) is a linear descriptor system since the
output $y$ is known.

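Consider the detection signal $v$ and let $\mathcal{A}^0(v)$ be the set of possible outputs
if Model 0, the nominal model, is the correct model, and
let $\mathcal{A}^1(v)$ be the set of outputs if Model 1, the fault model, is the correct model. Then
perfect model identification based on output measurement implies that
\[
\mathcal{A}^0(v) \cap \mathcal{A}^1(v) = \emptyset. \tag{2.2}
\]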

This is achievable thanks to the bounded energy noise model. This noise model can
be expressed as
\[
S_i(\nu_i) = \|\nu_i\|^2 = \int_0^{t_f} |\nu_i(t)|^2 \, dt < 1, \quad i = 0, 1 \tag{2.3}
\]
where $|\cdot|$ is the (pointwise) Euclidean norm, and thus $\|\cdot\|$ is the $L^2$ norm. In practice
one has bounds $\|\nu_i\|^2 < K$. It is always possible to rescale the $M_i$, $N_i$ to get $K = 1$,
so we assume without loss of generality that $K = 1$.
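The rescaling argument can be made concrete with a short sketch. It approximates the noise energy by the trapezoidal rule and then divides the noise by $\sqrt{K}$ while multiplying the noise matrix by $\sqrt{K}$, so that their product is unchanged. The sample signal, the bound `K`, and the matrix `M` are hypothetical values chosen only for illustration.

```python
import math

def energy(samples, dt):
    """Trapezoidal estimate of the integral of |nu(t)|^2 on a uniform grid."""
    sq = [sum(c * c for c in s) for s in samples]      # pointwise squared Euclidean norm
    return dt * (sum(sq) - 0.5 * (sq[0] + sq[-1]))

def apply(matrix, vec):
    """Matrix-vector product for plain nested lists."""
    return [sum(m * c for m, c in zip(row, vec)) for row in matrix]

# Hypothetical two-component noise signal on [0, t_f].
t_f, n = 2.0, 2000
dt = t_f / n
nu = [(math.sin(3 * k * dt), 0.5 * math.cos(k * dt)) for k in range(n + 1)]

K = 4.0                                                # assumed known bound, ||nu||^2 < K
scale = math.sqrt(K)
nu_unit = [tuple(c / scale for c in s) for s in nu]    # rescaled noise, energy below 1
M = [[1.0, 2.0]]                                       # hypothetical noise input matrix
M_unit = [[scale * m for m in row] for row in M]       # absorbs the factor sqrt(K)
```

Because `M_unit` applied to `nu_unit` equals `M` applied to `nu` at every sample, the state and output trajectories are unaffected by the normalization.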

This expression for the noise allows us to distinguish between the two basic types
of detection signals.

Definition 2.1. The detection signal $v$ is not proper if there exist $x_0$, $x_1$, $\nu_0$, $\nu_1$,
and $y$ satisfying (2.1) and (2.3). The detection signal $v$ is called proper otherwise.

Thus we say that the $L^2$ vector function $v$ is a proper detection signal if its
application implies that we are always able to distinguish the two candidate models
based on observation $y$. That is, condition (2.2) is satisfied [30]. Note that $v = 0$ is
not proper since the zero solution is always in the intersection of (2.2). In addition,
if $v$ is proper then $cv$ is also proper for $c \ge 1$, but if $v$ is not proper then there exists
an $\epsilon > 0$ such that $cv$ is also not proper for $0 \le c \le 1 + \epsilon$. These facts will be useful
when we develop the optimization problem later in the chapter.

The conditions for the existence of proper detection signals are quite weak. For
their characterization, let
\[
L_i(f) = \int_0^t e^{A_i (t-s)} f(s) \, ds \tag{2.4}
\]
be the solution of $z' = A_i z + f$, $z(0) = 0$. Then the solutions to (2.1) are
\begin{align}
x_i &= L_i(B_i v) + L_i(M_i \nu_i) + e^{A_i t} \xi_i \tag{2.5a} \\
y &= C_i L_i(B_i v) + C_i L_i(M_i \nu_i) + C_i e^{A_i t} \xi_i + N_i \nu_i \tag{2.5b}
\end{align}
for $i = 0, 1$, where $\xi_i$ is the free initial condition for $x_i$. Thus the output set for each
model is the sum of three terms:

1. $y_i = C_i L_i(B_i v)$, which is a vector depending linearly on the detection
signal, $v$,

2. $\{ (C_i L_i M_i + N_i) \nu_i : \|\nu_i\| < 1 \}$, which is an open convex set,

3. $\{ C_i e^{A_i t} \xi_i : \xi_i \in \mathbb{R}^n \ (\text{or } \mathbb{C}^n) \}$, which is a finite dimensional subspace of $L^2$.

Because of these facts, and noting that $y_0$ and $y_1$ are respectively the outputs of
Model 0 and Model 1 corresponding to zero noise and zero initial state, we see that
the output sets $\mathcal{A}^0(v)$ and $\mathcal{A}^1(v)$ are translates by $y_0$ and $y_1$ of bounded open sets.
Since $y_0$ and $y_1$ depend linearly on $v$, either $y_0 = y_1$ for all $v$, or $y_0 - y_1$ can be made
arbitrarily large with proper choice of $v$. So proper detection signals exist provided
the linear mapping of $v$ to $y_0$ is distinct from the linear mapping of $v$ to $y_1$ [30]. In
the time invariant case, this is equivalent to
\[
C_0 (sI - A_0)^{-1} B_0 - C_1 (sI - A_1)^{-1} B_1 \ne 0 \tag{2.6}
\]
for some $s$.
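A quick numerical screen for the time invariant existence condition is sketched below. It evaluates both transfer functions at a few complex sample points and reports whether they differ at any of them. The helper names and the scalar example models are assumptions for illustration. A difference at one point confirms the condition, whereas agreement at a handful of points is only suggestive.

```python
import numpy as np

def transfer(A, B, C, s):
    """Evaluate C (sI - A)^{-1} B at a complex point s that is not an eigenvalue of A."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B)

def mappings_differ(model0, model1, points=(1.0 + 0.5j, 2.0 + 1.0j, 0.3 + 2.0j), tol=1e-9):
    """True if the two transfer functions differ at one of the sample points."""
    return any(
        np.linalg.norm(transfer(*model0, s) - transfer(*model1, s)) > tol
        for s in points
    )

# Hypothetical scalar models given as (A_i, B_i, C_i).
nominal = (np.array([[-1.0]]), np.array([[1.0]]), np.array([[1.0]]))
faulty = (np.array([[-2.0]]), np.array([[1.0]]), np.array([[1.0]]))
# Same input-output map as `nominal`, written in scaled state coordinates.
rescaled = (np.array([[-1.0]]), np.array([[2.0]]), np.array([[0.5]]))
```

Here `mappings_differ(nominal, faulty)` is true, so proper detection signals exist for that pair, while `nominal` and `rescaled` share one transfer function and cannot be separated by any detection signal.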

The amount of energy required for a detection signal to be proper determines the
separability of the output sets $\mathcal{A}^0(v)$ and $\mathcal{A}^1(v)$.

Definition 2.2. Let $V$ denote the set of proper detection signals $v$. Then,
\[
\gamma^* = \left( \inf_{v \in V} \|v\|^2 \right)^{-1/2} \tag{2.7}
\]
is called the separability index.

Thus, $(\gamma^*)^{-2}$ is a lower bound on the energy of proper detection signals. Also, the
inverse relationship between the separability index and the proper detection signal
energy indicates that systems with lower energy proper detection signals have a higher
separability index. The separability index is zero if there are no proper detection
signals. Later, the algorithm we develop will compute $\gamma^*$ as the objective function
of an embedded optimal control problem. In Section 2.4.5 we describe an existing
algorithm that computes $\gamma^*$ [30]. Our approach has the advantage of being able to
address several problems that the algorithm in [30] cannot handle.

2.1.2 Formulation as an Optimal Control Problem

Before we describe the algorithm, however, the problem of finding the minimum
energy proper detection signal must be formulated as an optimal control problem.
First, note that for the detection signal $v$ to be not proper, (2.1) must hold as well
as (2.3). We can rewrite (2.3) as
\[
\max \left\{ \int_0^{t_f} |\nu_0(t)|^2 \, dt, \ \int_0^{t_f} |\nu_1(t)|^2 \, dt \right\} < 1. \tag{2.8}
\]
This expression can also be written as
\[
\max_{0 \le \beta \le 1} \int_0^{t_f} \beta |\nu_0(t)|^2 + (1 - \beta) |\nu_1(t)|^2 \, dt < 1. \tag{2.9}
\]
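The step from the maximum of two energies to a maximum over a scalar weight in $[0, 1]$ rests on the fact that a convex combination of two numbers is linear in the weight, so it peaks at an endpoint. A small sketch with made-up energy values (the function names are illustrative) checks this on a grid:

```python
def weighted_energy(beta, e0, e1):
    """beta * e0 + (1 - beta) * e1, the weighted sum of the two noise energies."""
    return beta * e0 + (1.0 - beta) * e1

def max_over_beta(e0, e1, grid=101):
    """Maximize over a uniform grid of beta in [0, 1] (endpoints included)."""
    return max(weighted_energy(k / (grid - 1), e0, e1) for k in range(grid))

# Hypothetical noise energies for the two models.
pairs = [(0.4, 0.9), (1.3, 0.2), (0.7, 0.7)]
results = [max_over_beta(e0, e1) for e0, e1 in pairs]
```

In each case the grid maximum coincides with the larger of the two energies, so testing either quantity against 1 gives the same verdict.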

Thus we obtain a useful characterization of not proper detection signals [30].

Lemma 2.1. The detection signal $v$ is not proper if and only if
\[
\inf \max_{0 \le \beta \le 1} \int_0^{t_f} \beta |\nu_0(t)|^2 + (1 - \beta) |\nu_1(t)|^2 \, dt < 1 \tag{2.10}
\]
where the infimum is taken over $(x_i, \nu_i, y)$ in $L^2$, subject to (2.1), $i = 0, 1$.

This characterization is useful because the algorithm we develop will compute the

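The next step in formulating the computation of the separability index as an
optimal control problem involves dimension reduction. By assumption the $N_i$ are
both full row rank. Thus, we can perform a constant orthogonal change of coordinates
on the $N_i$ (via a QR decomposition on $N_i^T$). As a result we obtain
\[
N_i = [\, \bar{N}_i \ \ 0 \,] \tag{2.11}
\]
where $\bar{N}_i$ is invertible, and
\[
M_i = [\, \bar{M}_i \ \ \widetilde{M}_i \,]. \tag{2.12}
\]
Let
\[
\nu_i = \begin{pmatrix} \bar{\nu}_i \\ \tilde{\nu}_i \end{pmatrix}
\]
with the same decomposition as $N_i$, and subtract (2.1b) for $i = 1$
from (2.1b) for $i = 0$. Equation (2.1b) becomes
\[
0 = C_0 x_0 - C_1 x_1 + \bar{N}_0 \bar{\nu}_0 - \bar{N}_1 \bar{\nu}_1. \tag{2.13}
\]
Now we can solve for either $\bar{\nu}_i$ and use the resulting expression to eliminate (2.13) by
substituting it into (2.1a). Solving for $\bar{\nu}_0$, we obtain
\[
\begin{pmatrix} x_0' \\ x_1' \end{pmatrix}
=
\begin{bmatrix} A_0 - \bar{M}_0 \bar{N}_0^{-1} C_0 & \bar{M}_0 \bar{N}_0^{-1} C_1 \\ 0 & A_1 \end{bmatrix}
\begin{pmatrix} x_0 \\ x_1 \end{pmatrix}
+
\begin{bmatrix} \widetilde{M}_0 & \bar{M}_0 \bar{N}_0^{-1} \bar{N}_1 & 0 \\ 0 & \bar{M}_1 & \widetilde{M}_1 \end{bmatrix}
\begin{pmatrix} \tilde{\nu}_0 \\ \bar{\nu}_1 \\ \tilde{\nu}_1 \end{pmatrix}
+
\begin{bmatrix} B_0 \\ B_1 \end{bmatrix} v. \tag{2.14}
\]
With the obvious correspondences, the reduced system, no longer a descriptor system,
can be written as
\[
x' = Ax + Bv + M\nu. \tag{2.15}
\]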
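The reduction can be sketched numerically as follows. The code rotates the noise coordinates of each model with a complete QR factorization of the transposed noise output matrix, splits the rotated matrices into their leading and trailing column blocks, and assembles the stacked system matrices with the noise ordered as (trailing block of model 0, leading block of model 1, trailing block of model 1). The function names and the two small models are assumptions for illustration only and are not taken from the thesis software in Appendix A.

```python
import numpy as np

def split_noise(M_i, N_i):
    """Rotate the noise coordinates so that N_i becomes [Nbar_i 0]."""
    p = N_i.shape[0]
    Qfac, _ = np.linalg.qr(N_i.T, mode="complete")   # N_i^T = Qfac @ Rfac
    N_rot, M_rot = N_i @ Qfac, M_i @ Qfac            # N_rot = [Nbar_i 0]
    return M_rot[:, :p], M_rot[:, p:], N_rot[:, :p]  # Mbar_i, Mtilde_i, Nbar_i

def reduce_models(model0, model1):
    """Assemble A, B, M of the reduced system from (A_i, B_i, C_i, M_i, N_i)."""
    A0, B0, C0, M0, N0 = model0
    A1, B1, C1, M1, N1 = model1
    Mb0, Mt0, Nb0 = split_noise(M0, N0)
    Mb1, Mt1, Nb1 = split_noise(M1, N1)
    G = Mb0 @ np.linalg.inv(Nb0)                     # Mbar_0 Nbar_0^{-1}
    n0, n1 = A0.shape[0], A1.shape[0]
    A = np.block([[A0 - G @ C0, G @ C1],
                  [np.zeros((n1, n0)), A1]])
    M = np.block([[Mt0, G @ Nb1, np.zeros((n0, Mt1.shape[1]))],
                  [np.zeros((n1, Mt0.shape[1])), Mb1, Mt1]])
    B = np.vstack([B0, B1])
    return A, B, M

# Hypothetical models: two states vs. one state, one output, three vs. two noise inputs.
model0 = (np.array([[0.0, 1.0], [-2.0, -0.5]]), np.array([[0.0], [1.0]]),
          np.array([[1.0, 0.0]]), np.array([[0.1, 0.0, 0.3], [0.0, 0.2, 0.1]]),
          np.array([[1.0, 0.5, 0.0]]))
model1 = (np.array([[-1.0]]), np.array([[1.0]]), np.array([[2.0]]),
          np.array([[0.2, 0.4]]), np.array([[0.0, 2.0]]))
A, B, M = reduce_models(model0, model1)
```

The sketch assumes each model has more noise inputs than outputs, so the trailing blocks are non-empty. The explicit inverse is kept for readability and would normally be replaced by a linear solve.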

and we desire to detect the fault in a short test period to prevent the instability of
the fault from creating problems for the system.

The characterization of not proper, (2.10), for the reduced system becomes
\[
\inf \max_{0 \le \beta \le 1} P(x, \nu, \beta) < 1 \tag{2.16}
\]
where
\[
P(x, \nu, \beta) = \int_0^{t_f} \beta \left( \left| -\bar{N}_0^{-1} C_0 x_0 + \bar{N}_0^{-1} C_1 x_1 + \bar{N}_0^{-1} \bar{N}_1 \bar{\nu}_1 \right|^2 + |\tilde{\nu}_0|^2 \right) + (1 - \beta) \left( |\bar{\nu}_1|^2 + |\tilde{\nu}_1|^2 \right) dt \tag{2.17}
\]
and the infimum is now taken over $(x, \nu)$ in $L^2$, subject to (2.15).

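The third step in the transformation to an optimal control problem involves using
the definition of the Euclidean norm to expand the integrand. After doing so and
combining like terms, we can rewrite (2.17) as
\[
P(x, \nu, \beta) = \frac{1}{2} \int_0^{t_f} x^T Q_\beta x + x^T H_\beta \nu + \nu^T R_\beta \nu \, dt \tag{2.18}
\]
where
\[
Q_\beta = 2\beta \begin{bmatrix} C_0^T \bar{N}_0^{-T} \bar{N}_0^{-1} C_0 & -C_0^T \bar{N}_0^{-T} \bar{N}_0^{-1} C_1 \\ -C_1^T \bar{N}_0^{-T} \bar{N}_0^{-1} C_0 & C_1^T \bar{N}_0^{-T} \bar{N}_0^{-1} C_1 \end{bmatrix} \tag{2.19}
\]
\[
H_\beta = 4\beta \begin{bmatrix} 0 & -C_0^T \bar{N}_0^{-T} \bar{N}_0^{-1} \bar{N}_1 & 0 \\ 0 & C_1^T \bar{N}_0^{-T} \bar{N}_0^{-1} \bar{N}_1 & 0 \end{bmatrix} \tag{2.20}
\]
\[
R_\beta = 2 \begin{bmatrix} \beta I & 0 & 0 \\ 0 & (1 - \beta) I + \beta \bar{N}_1^T \bar{N}_0^{-T} \bar{N}_0^{-1} \bar{N}_1 & 0 \\ 0 & 0 & (1 - \beta) I \end{bmatrix}. \tag{2.21}
\]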
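The expansion of the integrand into the three weight matrices is easy to get wrong by a sign or a factor of two, so a numerical spot check is a reasonable safeguard. The sketch below builds the state weight, cross weight, and noise weight for a given value of the convex parameter and compares the resulting quadratic form against the integrand evaluated directly from its definition. The function names, dimensions, and random data are hypothetical.

```python
import numpy as np

def lq_weights(beta, C0, C1, Nb0, Nb1, k0, k1):
    """State, cross and noise weights; k0 and k1 are the lengths of nu~_0 and nu~_1."""
    Ninv = np.linalg.inv(Nb0)
    W = Ninv.T @ Ninv                                  # Nbar_0^{-T} Nbar_0^{-1}
    n0, n1, p = C0.shape[1], C1.shape[1], Nb1.shape[1]
    Q = 2 * beta * np.block([[C0.T @ W @ C0, -C0.T @ W @ C1],
                             [-C1.T @ W @ C0, C1.T @ W @ C1]])
    H = 4 * beta * np.block([[np.zeros((n0, k0)), -C0.T @ W @ Nb1, np.zeros((n0, k1))],
                             [np.zeros((n1, k0)), C1.T @ W @ Nb1, np.zeros((n1, k1))]])
    R = 2 * np.block([[beta * np.eye(k0), np.zeros((k0, p)), np.zeros((k0, k1))],
                      [np.zeros((p, k0)), (1 - beta) * np.eye(p) + beta * Nb1.T @ W @ Nb1, np.zeros((p, k1))],
                      [np.zeros((k1, k0)), np.zeros((k1, p)), (1 - beta) * np.eye(k1)]])
    return Q, H, R

def integrand(beta, C0, C1, Nb0, Nb1, x0, x1, nt0, nb1, nt1):
    """Weighted noise energy density evaluated directly from its definition."""
    nb0 = np.linalg.solve(Nb0, -C0 @ x0 + C1 @ x1 + Nb1 @ nb1)
    return beta * (nb0 @ nb0 + nt0 @ nt0) + (1 - beta) * (nb1 @ nb1 + nt1 @ nt1)

# Hypothetical data: p = 2 outputs, n0 = 3, n1 = 2, k0 = 1, k1 = 2.
rng = np.random.default_rng(1)
C0, C1 = rng.normal(size=(2, 3)), rng.normal(size=(2, 2))
Nb0 = np.array([[2.0, 0.0], [0.5, 1.0]])
Nb1 = np.array([[1.0, 0.0], [-0.3, 1.5]])
x0, x1 = rng.normal(size=3), rng.normal(size=2)
nt0, nb1, nt1 = rng.normal(size=1), rng.normal(size=2), rng.normal(size=2)
beta = 0.35
Q, H, R = lq_weights(beta, C0, C1, Nb0, Nb1, k0=1, k1=2)
x, nu = np.concatenate([x0, x1]), np.concatenate([nt0, nb1, nt1])
quadratic = 0.5 * (x @ Q @ x + x @ H @ nu + nu @ R @ nu)
direct = integrand(beta, C0, C1, Nb0, Nb1, x0, x1, nt0, nb1, nt1)
```

For a convex parameter strictly between 0 and 1 the noise weight is symmetric positive definite, which is the property the necessary conditions rely on later.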

Finally, letting $S_v$ be the set of $L^2$ functions $(x, \nu)$ satisfying the constraints (2.15),
and defining
\[
J_v(\beta) = \inf_{(x, \nu) \in S_v} P(x, \nu, \beta) \tag{2.22}
\]
we call on a useful result [30].

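Theorem 2.1. The function $P$ has at least one saddle point $(x^s, \nu^s, \beta^s)$ on $S_v \times [0, 1]$ and
\[
\inf_{(x, \nu) \in S_v} \max_{0 \le \beta \le 1} P(x, \nu, \beta)
= \min_{(x, \nu) \in S_v} \max_{0 \le \beta \le 1} P(x, \nu, \beta)
= \max_{0 \le \beta \le 1} \min_{(x, \nu) \in S_v} P(x, \nu, \beta)
= P(x^s, \nu^s, \beta^s). \tag{2.23}
\]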

Proof (from [30]) Let $(x^\beta, \nu^\beta)$ be the solution of problem (2.22). Then $S_i(\nu_i^\beta)$,
$i = 0, 1$, depend continuously on $0 < \beta < 1$. Moreover, since
\[
S_0(\nu_0^\beta) = 0, \quad \text{if } \beta = 1, \tag{2.24}
\]
$S_0(\nu_0^\beta)$ is continuous for $\beta \in (0, 1]$, and since
\[
S_1(\nu_1^\beta) = 0, \quad \text{if } \beta = 0, \tag{2.25}
\]
$S_1(\nu_1^\beta)$ is continuous for $\beta \in [0, 1)$. Suppose
\[
\lim_{\beta \to 1} S_1(\nu_1^\beta) > 0. \tag{2.26}
\]
Then for some $0 < \beta^s < 1$, we must have
\[
S_0(\nu_0^s) = S_1(\nu_1^s). \tag{2.27}
\]
Let $(x^s, \nu^s) = (x^{\beta^s}, \nu^{\beta^s})$. Then
\[
P(x^s, \nu^s, \beta) \le S_1(\nu_1^s), \quad \forall \beta \in [0, 1] \tag{2.28}
\]
(holding at equality because $\beta$ cancels out) and
\[
P(x, \nu, \beta^s) \ge S_1(\nu_1^s), \quad \forall (x, \nu) \in S_v \tag{2.29}
\]
because $(x^s, \nu^s)$ is the optimal solution of (2.22) for $\beta = \beta^s$. This implies that
$(x^s, \nu^s, \beta^s)$ is a saddle point and the rest follows. Now suppose that (2.26) does not
hold so that
\[
\lim_{\beta \to 1} S_1(\nu_1^\beta) = 0. \tag{2.30}
\]
In that case $S_0$ and $S_1$ can be made arbitrarily small simultaneously. This implies
that $J_v(\beta) = 0$ for all $\beta$, which means that there exists $(x^s, \nu^s)$ such that (2.27) holds
with equality to zero. Then, clearly (2.28) holds because both sides of the inequality
are zero. In addition, (2.29) holds for all $\beta^s \in [0, 1]$ because the right hand side of
the inequality is zero and the left hand side cannot be negative. This implies that
$(x^s, \nu^s, \beta^s)$ is a saddle point and the rest follows.

Note that the above proof in [30] is done with knowledge of, and weight matrices
on, the initial state $x_i(0)$. In that case, the bounded energy noise model becomes
\[
S_i(x_i(0), \nu_i) = x_i(0)^T F_{i,0}\, x_i(0) + \int_0^{t_f} |\nu_i(t)|^2 \, dt < 1, \quad i = 0, 1. \tag{2.31}
\]
Since each $S_i$ is the sum of positive semi-definite terms, letting one term go to zero
does not alter the proof.

This result allows us to interchange the order of the inf and the max in (2.16),
and replace inf with min. Thus
\[
J_v(\beta) = \min_{(x, \nu) \in S_v} P(x, \nu, \beta) \tag{2.32}
\]
and the characterization of not proper becomes
\[
\max_{0 \le \beta \le 1} J_v(\beta) < 1. \tag{2.33}
\]

Expanding this result to its fully explicit form, we see that a detection signal $v$ is not
proper if and only if
\[
\max_{0 \le \beta \le 1} \min_{\nu} \ \frac{1}{2} \int_0^{t_f} x^T Q_\beta x + x^T H_\beta \nu + \nu^T R_\beta \nu \, dt < 1 \tag{2.34}
\]
where the min is subject to
\[
x' = Ax + Bv + M\nu. \tag{2.35}
\]
The inner minimization, the $J_v(\beta)$ problem, is a standard LQR optimal control
problem with an added cross term in the objective function and the forcing function $Bv$
in the constraint. $J_v(\beta)$ is called the auxiliary cost function for the problem.

The auxiliary cost function exhibits several useful qualities [12].

Lemma 2.2. For all $v \in L^2$, for $0 \le \beta \le 1$, $J_v(\beta)$ is defined and has the following
properties:

1. It is zero for $\beta = 0$ and $\beta = 1$,

2. It is quadratic in $v$, i.e., for all scalar $c$, $J_{cv}(\beta) = |c|^2 J_v(\beta)$,

3. It is a continuous function of $\beta$,

4. If $v$ is not proper, then $J_v(\beta) < 1$ for all $0 \le \beta \le 1$. Equivalently, $J_v(\beta) \ge 1$
for some $\beta$ implies $v$ is proper,

5. It is a strictly concave function of $\beta$ if the set of proper detection signals is not
empty, otherwise it is identically zero.

The proof is straightforward and relies on continuity and linearity. It can be found
in [12]. With this result, we can state the original problem of finding a minimum
energy proper detection signal $v$ as
\[
\min \|v\| \quad \text{subject to} \quad \max_{0 < \beta < 1} J_v(\beta) \ge 1. \tag{2.36}
\]
Note that the cases $\beta = 0$ and $\beta = 1$ are excluded because $J_v(0) = J_v(1) = 0$, and
Lemma 2.2 demonstrates continuity of $J_v(\beta)$ at these points.

Using the fact that $J_v(\beta)$ is quadratic in $v$, we arrive at the following fundamental
result.

Theorem 2.2. Let
\[
J^*(\beta) = \sup_{v \ne 0} \frac{J_v(\beta)}{\int_0^{t_f} |v|^2 \, dt} = \sup_{\|v\| = 1} J_v(\beta). \tag{2.37}
\]
Then
\[
(\gamma^*)^2 = \max_{0 < \beta < 1} J^*(\beta) \tag{2.38}
\]
where $\gamma^*$ is the separability index defined previously.

This theorem, while similar to results in [29] and [30], has added technical
difficulties due to the presence of the infinite dimensional space of the independent variable
and the unbounded finite dimensional subspace of the output sets. Despite these
differences, the proof is an extension of that in [29]. However, it is somewhat technical
and requires functional analysis and convergence theory for sequences. See [12] for
the complete proof.

Note that the ease of separating the nominal and fault models of a system is
proportional to the size of $\gamma^*$. When $\gamma^* = 0$, the models are indistinguishable regardless
of the detection signal.
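The quadratic scaling property of the auxiliary cost has a simple computational consequence: once the auxiliary cost of one candidate signal has been sampled over the convex parameter, the smallest multiple of that signal that meets the threshold of 1 follows without re-solving the inner problem. The sketch below illustrates this with a hypothetical stand-in curve, not a cost computed from a real model. For a unit-norm signal, the peak of the curve is also a lower bound on the squared separability index.

```python
import math

def min_proper_scale(J_samples):
    """Smallest c > 0 with max over beta of J_{cv} >= 1, using J_{cv} = c^2 J_v."""
    peak = max(J_samples)
    if peak <= 0.0:
        return math.inf        # J_v is identically zero: no scaling of v is proper
    return 1.0 / math.sqrt(peak)

# Hypothetical stand-in for J_v(beta): concave and zero at both endpoints.
betas = [k / 200 for k in range(201)]
J_v = [b * (1.0 - b) for b in betas]
c = min_proper_scale(J_v)
J_cv = [c * c * j for j in J_v]
```

With this stand-in the peak is 0.25 at the midpoint, so the signal must be doubled in amplitude, and its energy quadrupled, before the scaled cost reaches 1.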

As a final result before defining the optimization problems we will address, we
state a useful corollary to Lemma 2.2.

Corollary 2.1. A detection signal $v$ is proper if and only if $J_v(\beta) \ge 1$ for some
$0 < \beta < 1$.

Proof Lemma 2.2, part 4, shows that $v$ is proper if $J_v(\beta) \ge 1$ for some $\beta$. To show

where $J_v(\beta)$ attains its minimum. Clearly, the values producing a minimum at each
endpoint are $\nu_1 = 0$ for $\beta = 0$, and $\nu_0 = 0$ for $\beta = 1$. Thus there will be a value $\beta$
where $\|\nu_0(\beta)\| = \|\nu_1(\beta)\|$. But then $\|\nu_i(\beta)\| < 1$, which shows that $v$ is not proper.

2.1.3 Problem Statement

We can now state the two versions of the problem solved by the first half of the
algorithm. Version One, from (2.37)-(2.38), is:
\[
(\gamma^*)^2 = \max_{\substack{\|v\| = 1 \\ 0 < \beta < 1}} J_v(\beta). \tag{2.39}
\]
Version Two, from (2.7) and (2.36), is:
\[
(\gamma^*)^{-2} = \min_{\substack{J_v(\beta) \ge 1 \\ 0 < \beta < 1}} \int_0^{t_f} |v|^2 \, dt. \tag{2.40}
\]
These problems will be solved by first calculating the necessary conditions for a
minimum of the inner problem which defines $J_v(\beta)$, then numerically solving the
outer problem using the previously computed necessary conditions as constraints.

2.2 Necessary Conditions

As with many types of optimization problems, the $J_v(\beta)$ problem possesses conditions
that any extrema must satisfy in order to be an optimal solution. In Chapter 1 we
introduced the necessary conditions for an optimal solution to the standard LQR

in Chapter 1 because of the presence of the cross term in the integral, so in this section
we develop the necessary conditions for the $J_v(\beta)$ problem explicitly.

2.2.1 Computing the Necessary Conditions

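Recall from (2.32) that
\[
J_v(\beta) = \min \ \frac{1}{2} \int_0^{t_f} x^T Q_\beta x + x^T H_\beta \nu + \nu^T R_\beta \nu \, dt \tag{2.41a}
\]
subject to
\[
x' = Ax + Bv + M\nu. \tag{2.41b}
\]
The Hamiltonian for system (2.41) is
\[
\hat{H} = \frac{1}{2} x^T Q_\beta x + \frac{1}{2} x^T H_\beta \nu + \frac{1}{2} \nu^T R_\beta \nu + \lambda^T (Ax + Bv + M\nu). \tag{2.42}
\]
As described in Chapter 1, the Euler equations for an extremum are
\begin{align}
\hat{H}_\lambda^T &= x' \tag{2.43a} \\
\hat{H}_x^T &= -\lambda' \tag{2.43b} \\
\hat{H}_\nu^T &= 0. \tag{2.43c}
\end{align}
These conditions applied to (2.42) give
\begin{align}
x' &= Ax + Bv + M\nu \tag{2.44a} \\
\lambda' &= -Q_\beta x - \frac{1}{2} H_\beta \nu - A^T \lambda \tag{2.44b} \\
0 &= R_\beta \nu + \frac{1}{2} H_\beta^T x + M^T \lambda \tag{2.44c}
\end{align}
which is an index one DAE in $(x, \lambda, \nu)$ since $R_\beta > 0$.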
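Because the noise weight is positive definite, the algebraic condition can be solved for the noise and substituted into the two differential equations, leaving an ordinary linear system in the state and costate driven by the detection signal. The sketch below performs that substitution and checks it against the unreduced conditions. The function name `eliminate_noise` and the random data are assumptions for illustration, and this is only one possible way to handle the DAE, not necessarily the route taken by the thesis algorithm.

```python
import numpy as np

def eliminate_noise(A, B, M, Q, H, R):
    """Solve the algebraic condition for nu, leaving z' = F z + G v for z = (x, lambda)."""
    Rinv_Ht = np.linalg.solve(R, H.T)                  # R^{-1} H^T
    Rinv_Mt = np.linalg.solve(R, M.T)                  # R^{-1} M^T
    F = np.block([[A - 0.5 * M @ Rinv_Ht, -M @ Rinv_Mt],
                  [-(Q - 0.25 * H @ Rinv_Ht), -(A.T - 0.5 * H @ Rinv_Mt)]])
    G = np.vstack([B, np.zeros_like(B)])
    def nu_of(x, lam):
        return -(0.5 * Rinv_Ht @ x + Rinv_Mt @ lam)    # nu from 0 = R nu + H^T x / 2 + M^T lambda
    return F, G, nu_of

# Hypothetical data with n = 3 states, q = 2 noise inputs, one detection input.
rng = np.random.default_rng(2)
n, q = 3, 2
A, B, M = rng.normal(size=(n, n)), rng.normal(size=(n, 1)), rng.normal(size=(n, q))
H = rng.normal(size=(n, q))
Q = np.eye(n)
R = np.diag([1.5, 0.8])
F, G, nu_of = eliminate_noise(A, B, M, Q, H, R)

x, lam, v = rng.normal(size=n), rng.normal(size=n), rng.normal(size=1)
nu = nu_of(x, lam)
z_dot = F @ np.concatenate([x, lam]) + G @ v
```

The reduced system still needs boundary conditions at both ends of the test interval, so in practice it would be handed to a two-point boundary value solver rather than integrated forward from an initial value.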