diagnosing dynamic systems using trace patterns PRL1999


Diagnosing dynamic systems using trace patterns

Girish Keshav Palshikar a,*, Deepak Khemani b,1

a Tata Research Development and Design Centre, #54B, Hadapsar Industrial Estate, Pune 400 013, India

b Department of Computer Science and Engineering, Indian Institute of Technology, Chennai 600 036, India

Received 3 September 1998; received in revised form 18 March 1999

Abstract

Process control systems usually generate a system activity log or trace. Normal behavior results in normal patterns in the trace. We model the normal trace patterns using a context free grammar and develop a technique for automatic qualitative diagnosis, based on grammar perturbation, to recognize deviations from the normal trace patterns. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Diagnosis; Context-free grammar; Dynamic systems

1. Introduction

Dynamic systems (e.g., process control systems) usually generate a system activity log (i.e., a trace) for later inspection. The log may include information about input events received, output events generated, internal data values at different times, etc. This trace serves as a basis for diagnosis in case of a system malfunction. A trace can be understood as containing patterns of system behavior. Normal behavior results in normal patterns in the trace. In case of a malfunction, the trace may contain specific faulty patterns (i.e., deviations from the normal behavior patterns) characteristic of particular system faults. A diagnostician examines the trace, detects faulty patterns and relates them to his knowledge of the system structure, behavior and faults. This paper presents the trace-based diagnosis (TBD) framework to formalize such high-level, qualitative diagnostic reasoning as a syntactic pattern recognition problem.

We view the trace generated by a system as a string over an alphabet. Then the normal behavior of a system is characterized by the set of normal traces. Whenever the system malfunctions, this fact is reflected in the trace. Human experts seem to detect faulty patterns in the trace and diagnose the system by understanding them as deviations from normal trace patterns. We formalize this human ability in a syntactic pattern recognition framework.

Grammars have long been used in syntactic pattern recognition (Fu, 1982). Following this, the set of normal traces of a dynamic system is modeled as the language $L(G)$ generated by a context free grammar (CFG) G. Given an abnormal trace w generated by the system, TBD repairs the normal grammar G to another grammar $G'$, using a simple repair operator, so that $w \in L(G')$. The perturbation needed to accept the invalid trace w identifies `faulty' non-terminals in G, i.e., points of deviation from the normal behavior. Since a non-terminal stands for a part of the normal behavior, the perturbation is a qualitative diagnosis that explains w.

* Corresponding author. E-mail: [email protected]

1 E-mail: [email protected]

Model-based diagnosis (MBD) (de Kleer and Williams, 1987; Reiter, 1987) is a fundamental approach to diagnosis, which incorporates reasoning based on deep knowledge about the system. Essentially, MBD accepts a functional description of a system in logic and an observation of the system. A diagnostic procedure generates all diagnoses, where a diagnosis is a list of components whose being faulty explains the given observation. Due to its simplicity, sound theoretical foundations and robustness (compared to heuristics-based expert systems), MBD has sparked off intense research in theory, techniques, tools and applications (Davis and Hamscher, 1988; Hamscher et al., 1992).

Of particular relevance here is the development of the MBD framework for the diagnosis of dynamic systems (Hamscher, 1991; Downing, 1992; Washio et al., 1997). The MBD framework has also been extended to include hierarchical modeling schemes (Mozetic, 1991; Nakakuki et al., 1992). Another approach to the diagnosis of dynamic systems requires simulation and snapshots (Struss et al., 1997). In our approach, the trace of a system is essentially an observation of the system behavior over time, the normal system behavior is modeled as a CFG, and a diagnosis is a subset of non-terminals in the CFG.

The specific motivation for this work is as follows. Diagnostic procedures in MBD use sophisticated mathematical logics for the description of and reasoning about system models. The logical notations have the advantage of being compositional but appear to be difficult to use for dynamic systems. On the other hand, CFGs provide a hierarchical and simple notation, which is particularly suitable for modeling behavior of dynamic systems and for which efficient algorithms are known. We later illustrate how CFGs can also be used to build structural models of standard MBD applications like digital circuits.

Next, MBD is difficult to apply when the detailed system model is either too complex or not fully known. In such cases, typically the diagnostician uses a model of the normal and abnormal patterns of system observations to obtain a first-cut `black-box' diagnosis. This is followed by a detailed diagnosis identifying the faulty components.

Finally, a trace-based diagnosis system is more suitable for condition (or health) monitoring tasks in a dynamic system (Williams et al., 1994; Eick and Lucas, 1996), e.g., in safety critical systems in areas like process control, avionics, nuclear power plants, etc.

This paper presents a diagnosis mechanism that is closely related to MBD and addresses the points raised above. First, the trace-based diagnosis approach is defined within the syntactic pattern recognition framework and motivating examples are presented. A theoretical characterization of TBD is obtained. An efficient diagnostic procedure to generate all diagnoses is outlined. This work demonstrates that MBD can use CFG as a simple, hierarchical model description notation (rather than logic) for which efficient diagnosis is possible.

2. Trace-based diagnosis


relevant (in some well-defined sense) for diagnosis purposes, the more should be the accuracy of the diagnosis. The important problem of selecting the terminal alphabet (i.e., primitive selection) does not have a simple solution. The choice depends on the goals of the diagnosis process and details desired in diagnoses. Also, an actual log may contain a lot of data; some transformations may be needed to convert this log to a trace string in terms of the terminals. This is the usual primitive extraction problem in pattern recognition.

The normal behavior of the system is represented by a set of externally observable legal traces, called the normal trace language of the system. When the system is behaving normally, it generates a trace which belongs to its normal trace language. However, when the system generates a trace which does not belong to the normal trace language, obviously something within the system is not working right and we have a diagnosis problem on hand. We seek a theory of diagnosis in which the explicit structure or behavior of the system is not used. Instead, we assume that all we know about the system is the structure of its normal traces. The normality or the abnormality of the system is to be deduced from the observed trace. This is the trace-based diagnosis problem.

We use a CFG to define the normal trace language of a system. A system is described by a CFG, $G = (T, N, P, S)$, where T is the terminal alphabet, N is the set of non-terminals, P is the set of context-free production rules and S is the non-terminal in N called the start symbol. The language $L(G)$ (i.e., the set of terminal strings in $T^*$ accepted by G) is called the normal trace language of the system. A string w in $T^*$ is called a trace; w is a valid trace if $w \in L(G)$, otherwise it is an invalid trace.

It is reasonable to say that the grammar G itself forms a model of the system (rather, a model of the normal system observations). In this sense, TBD is similar to MBD, except that the model is not in terms of logic but in terms of a grammar; hence, we do not need the notions of logical consistency and logical consequence. Also, a CFG models the observable behavior of a system rather than its functionality or structure in terms of components. The MBD framework is chiefly concerned with identification of the faulty components, and the system model is more structural and connectivity oriented. On the other hand, the TBD framework is more behavior oriented. Basically, a non-terminal stands for a `behavioral component' in the system model and the grammar as a whole represents the behavioral model of the artifact, as a composition of the behavioral components. In this paper, we do not investigate the relationship between the diagnosis based on behavioral and functional models.

The key idea behind TBD is as follows. Suppose that we are given a terminal string w from $T^*$ which does not belong to the normal trace language of the system, i.e., $w \notin L(G)$. Then TBD attempts to identify that part of the normal behavior (i.e., those non-terminals in G) whose deviation from the required behavior explains the observed illegal trace. Insofar as non-terminals of G represent parts of normal behavior, the non-terminals so altered to accept the illegal trace constitute a diagnosis.

We thus formalize the trace-based diagnosis problem as a grammar repair problem, in which we systematically perturb the original grammar so that the observed faulty trace can be accepted by the repaired grammar. The idea is that the details of each such repair operation form a diagnosis. The task is to identify the repair set, which includes all possible repairs to the original grammar. To simplify the problem, we constrain the repair operation to the simplest possible, that of adding rules of the form $X \rightarrow u$ (X is a non-terminal and u is a terminal string). We do not add new terminals or non-terminals, do not delete rules and do not add rules of any other form.

Let $G = (T, N, P, S)$ be a grammar. Let $D \subseteq N$. Then by a repair rule set $P_D$ we mean a set of production rules of the form $X \rightarrow u$ for each $X \in D$ and some $u \in T^*$. Given a grammar $G = (T, N, P, S)$, $D \subseteq N$ and a repair rule set $P_D$, a D-repair grammar $G_D$ is the grammar G augmented with the additional rules from $P_D$, i.e., $G_D = (T, N, P', S)$ where $P' = P \cup P_D$.

Given a grammar G and a trace $w \in T^*$, we say that the tuple $(D, P_D)$, where $D \subseteq N$ and $P_D$ is a repair rule set, is a diagnosis for (G, w) if and only if $w \in L(G_D)$ and, for any proper subset $D'$ of D, $w \notin L(G_{D'})$. Thus


A few words on the nature, contents and interpretation of a trace are in order. The trace of a system reflects the activity that is going on within the system. The displayed behavior of the entire system in turn depends upon the activities taking place in the subsystems. A component connected model of a static system, for example the 3-bit adder described later, can be viewed as recording the activity of all the components. In this case the grammar of the system is in fact a composed model of the system, where the behavior of each component is described by the corresponding non-terminal, and the structure or connectivity is captured by the rules of the grammar. In this case, when a broken or faulty part of the grammar is identified, in effect a broken component is identified.

On the other hand, a trace need not describe the activity at the component level of detail. The trace generated may in fact be designed to capture important behavior from the monitoring and diagnosis point of view. An immediate advantage of this approach is the capability to capture the behavior over time, a capability notably absent in a purely structural model. This makes it attractive for monitoring of dynamic systems, with the added capability of fault identification as soon as anomalous behavior is detected. Once the faulty behavior-component is identified, it is a separate problem to link it to the cause of the anomalous behavior. The cause of the behavior could be a faulty component, but it could also be due to operating conditions violating design assumptions.

This link needs to be established explicitly. This could be done at the time of designing the system and the trace it generates. It could also be done heuristically as humans often do for natural (biological and ecological) systems. It may be observed that a `static' device also lends itself to this `dynamic' treatment when its input-output behavior is captured in temporal cycles.

3. Illustrative examples

We now present some examples, which illustrate and motivate the TBD approach. Example 1 illustrates the idea of grammar repair, Example 2 discusses a dynamic system and Example 3 shows how to encode the structural model of a static system in the TBD framework.

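Example 1. Let $G_1 = (\{a, b\}, \{S, A, B\}, P, S)$ where P contains the following rules:

    $S \rightarrow aB \mid bA$
    $A \rightarrow a \mid aS \mid bAA$
    $B \rightarrow b \mid bS \mid aBB$

Let w = aaab; then $w \notin L(G_1)$. Then $(\{S\}, S \rightarrow aaab)$, $(\{B\}, B \rightarrow a)$ and $(\{B\}, B \rightarrow aab)$ are all the diagnoses for $(G_1, w)$.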
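The repair operation in Example 1 can be checked mechanically. The following Python sketch is only an illustration and is not part of the original system: the names `G1`, `derives` and `repaired` are hypothetical, grammar symbols are single characters (upper case for non-terminals), and the recognizer assumes that every right-hand side starts with a terminal, which holds for the rules of Example 1 and for any repair rule with a non-empty terminal string.

```python
# Rules of the grammar in Example 1; upper-case letters are non-terminals.
G1 = {"S": ["aB", "bA"], "A": ["a", "aS", "bAA"], "B": ["b", "bS", "aBB"]}


def derives(rules, form, s):
    """Return True if the sentential form `form` derives the terminal string `s`.

    Assumption: every right-hand side starts with a terminal, so each
    expansion is followed by consuming one input symbol and the recursion
    terminates.
    """
    if len(form) > len(s):      # every symbol yields at least one terminal
        return False
    if not form:
        return s == ""
    head, rest = form[0], form[1:]
    if head not in rules:       # terminal symbol: must match the next input
        return s[0] == head and derives(rules, rest, s[1:])
    return any(derives(rules, rhs + rest, s) for rhs in rules[head])


def repaired(rules, repair_rules):
    """Return the repaired grammar: `rules` plus the extra rules (X, u)."""
    new_rules = {x: list(rhss) for x, rhss in rules.items()}
    for x, u in repair_rules:
        new_rules[x].append(u)
    return new_rules


w = "aaab"
print(derives(G1, "S", w))      # the original grammar rejects w
for rule in [("S", "aaab"), ("B", "a"), ("B", "aab")]:
    # each single repair rule from Example 1 makes w acceptable
    print(rule, derives(repaired(G1, [rule]), "S", w))
```

Each call builds a fresh augmented rule table, so the original grammar is never modified, mirroring the fact that a repair only adds rules.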

Example 2 (A furnace temperature controller). Consider a furnace heated by two heater coils H1 and H2. A furnace control program P tries to maintain the furnace temperature close to some fixed temperature T. The furnace has a temperature sensor S which sends a temperature reading every n seconds. Initially, assume that the temperature is below T and H1 and H2 are off. If the temperature falls below T, the control program switches on the coil H1. If the temperature does not reach T within a fixed time interval of m seconds after switching on H1, then P switches on H2 also. Whenever the temperature goes above T, P switches off the coils which are currently on (i.e., only H1 or both H1 and H2).


We introduce the following alphabet T = {a, a+, a-, b, b+, b-, o, f, O, F, t}. The meaning of the symbols is given below. We ignore actual temperature values, events like timer start and stop and time-stamps on the trace entries.

    a     temperature same as last entry (it is ≥ threshold T)
    a+    temperature increased from last entry (it is ≥ T)
    a-    temperature decreased from last entry (it is ≥ T)
    b     temperature same as last entry (it is < threshold T)
    b+    temperature increased from last entry (it is < T)
    b-    temperature decreased from last entry (it is < T)
    t     time-out of the timer occurred
    o     heater H1 was switched on
    f     heater H1 was switched off
    O     heater H2 was switched on
    F     heater H2 was switched off

Trace fragment [b,o,b+,b+,a+,f] indicates a normal system behavior where the temperature is initially below T, so H1 is switched on. The temperature then continues to rise and crosses T; H1 is then switched off. Another normal behavioral pattern is given by the trace [b,o,b+,b+,b+,t,O,b+,a+,f,F] where H2 was also needed to reach T. The following Prolog DCG grammar (Clocksin and Mellish, 1984) defines the normal system behavior.

    normal --> [].
    normal --> temp_below, heater_normal, normal.
    normal --> temp_rises_above, temp_falls_above, temp_falls_to, normal.
    heater_normal --> h1_normal.
    heater_normal --> h2_normal.
    h1_normal --> h1_on, temp_rises_below, temp_rises_to, h1_off.
    h2_normal --> h1_on, temp_rises_below, timeout, h2_on, temp_rises_below, temp_rises_to, h1_off, h2_off.
    temp_below --> [b]; ['b+']; ['b-'].
    temp_above --> [a]; ['a+']; ['a-'].
    temp_rises_below --> []; [b]; ['b+'].
    temp_rises_below --> [b], temp_rises_below.
    temp_rises_below --> ['b+'], temp_rises_below.
    temp_rises_to --> ['a+'].
    temp_falls_to --> ['a-'].
    temp_rises_above --> []; [a]; ['a+'].
    temp_rises_above --> [a], temp_rises_above.
    temp_rises_above --> ['a+'], temp_rises_above.
    temp_falls_below --> []; [b]; ['b-'].
    temp_falls_below --> [b], temp_falls_below.
    temp_falls_below --> ['b-'], temp_falls_below.
    temp_falls_above --> []; [a]; ['a-'].
    temp_falls_above --> [a], temp_falls_above.
    temp_falls_above --> ['a-'], temp_falls_above.
    h1_on --> [o].
    h1_off --> [f].
    h2_on --> ['O'].
    h2_off --> ['F'].
    timeout --> [t].
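The language generated from `normal` in this grammar happens to be regular, so a quick way to experiment with it outside Prolog is a hand-translated regular expression. The Python sketch below is an assumption-laden illustration, not the authors' implementation: the helper names `tok`, `alt` and `is_normal` are hypothetical, the decrease symbols are written `a-` and `b-`, and the translation from the DCG rules was done by hand.

```python
import re


def tok(symbol):
    """Regex matching one trace symbol followed by a separating space."""
    return "(?:" + re.escape(symbol + " ") + ")"


def alt(*options):
    """Regex matching any one of the given sub-expressions."""
    return "(?:" + "|".join(options) + ")"


# Hand translation of the DCG non-terminals (assumed equivalent).
temp_below = alt(tok("b"), tok("b+"), tok("b-"))
rises_below = alt(tok("b"), tok("b+")) + "*"
rises_above = alt(tok("a"), tok("a+")) + "*"
falls_above = alt(tok("a"), tok("a-")) + "*"
rises_to, falls_to = tok("a+"), tok("a-")
h1_normal = tok("o") + rises_below + rises_to + tok("f")
h2_normal = (tok("o") + rises_below + tok("t") + tok("O")
             + rises_below + rises_to + tok("f") + tok("F"))
heater_normal = alt(h1_normal, h2_normal)
normal = alt(temp_below + heater_normal,
             rises_above + falls_above + falls_to) + "*"
NORMAL = re.compile(normal)


def is_normal(trace):
    """Return True if the list of trace symbols is a normal trace."""
    text = "".join(symbol + " " for symbol in trace)
    return NORMAL.fullmatch(text) is not None


print(is_normal("b o b+ b+ a+ f".split()))
print(is_normal("b o b+ b+ b+ t O b+ a+ f F".split()))
print(is_normal("b o b+ b-".split()))
```

The first two calls use the two normal traces discussed in the text; the third uses a trace in which the temperature drops while H1 is on, which the expression rejects.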

Informally, the following patterns describe some of the abnormal system behavior:

· Both heaters are off, yet the temperature does not decrease; [a+,f,'b+','b+'].

· Some heater is on, yet the temperature does not increase; e.g., [b,o,'b+','b-'].

Example 3 (A 3-bit adder). The 3-bit adder (Fig. 1) is a typical example for MBD (Mozetic, 1991). Here, T = {0, 1} and a trace is a string of 5 symbols In1 In2 In3 Out1 Out2. The normal behavior of the adder corresponds to a finite trace language {00000, 00110, 01010, 01101, 10010, 10101, 11001, 11111}.

The following DCG attribute/action grammar represents the circuit structure and accepts the same trace language. Note that the component gates each have a corresponding non-terminal in the grammar. The structure and connectivity of the components is also reflected in the grammar through shared attributes.

    adder --> x1(I1,I2,A), a1(I1,I2,B), a2(I3,A,C), x2(A,I3,O1), o1(C,B,O2).

    x1(I1,I2,A) --> [I1,I2], { xor(I1,I2,A) }.
    a1(I1,I2,B) --> [], { and(I1,I2,B) }.
    a2(I3,A,C) --> [I3], { and(I3,A,C) }.
    x2(A,I3,O1) --> [O1], { xor(A,I3,O1) }.
    o1(C,B,O2) --> [O2], { or(C,B,O2) }.

    xor(1, 1, 0).
    xor(1, 0, 1).
    xor(0, 1, 1).
    xor(0, 0, 0).
    and(1, 1, 1).
    and(1, 0, 0).
    and(0, 1, 0).
    and(0, 0, 0).
    or(1, 1, 1).
    or(1, 0, 1).
    or(0, 1, 1).
    or(0, 0, 0).
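As a cross-check of Example 3, the gate structure encoded by the grammar can be evaluated directly to enumerate the normal trace language. This Python sketch is illustrative only; the function name `adder_trace` is hypothetical, and the gate wiring is read off the attribute grammar above.

```python
from itertools import product


def adder_trace(i1, i2, i3):
    """Return the 5-symbol trace In1 In2 In3 Out1 Out2 for the given inputs."""
    a = i1 ^ i2      # gate x1: xor(I1, I2, A)
    b = i1 & i2      # gate a1: and(I1, I2, B)
    c = i3 & a       # gate a2: and(I3, A, C)
    o1 = a ^ i3      # gate x2: xor(A, I3, O1)
    o2 = c | b       # gate o1: or(C, B, O2)
    return f"{i1}{i2}{i3}{o1}{o2}"


# Enumerate all 8 input combinations to obtain the normal trace language.
normal_traces = {adder_trace(*bits) for bits in product((0, 1), repeat=3)}
print(sorted(normal_traces))
print("10110" in normal_traces)   # membership test for an observed trace
```

The enumerated set coincides with the eight traces listed for the adder, so any five-symbol string outside it is an invalid trace in the sense of Section 2.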

4. Theoretical characterization

It is easy to prove that the grammar repair mechanism is conservative (i.e., preserves the original language).

Proposition 1. $L(G) \subseteq L(G_D)$ for each diagnosis $(D, P_D)$.

Proposition 2. $(D = \{\}, P_D = \{\})$ is a diagnosis for (G, w) if $w \in L(G)$.

Proposition 3. For any CFG G and an invalid terminal string w, a diagnosis always exists.

Proposition 4. For any CFG G and an invalid terminal string w, let $(D, P_D)$ be a diagnosis. Then the right hand side of any rule in $P_D$ contains only a substring of w.

Proposition 2 says that the diagnosis is empty when a normal trace is generated. Proposition 3 guarantees diagnosability and follows from the fact that if S is the start symbol, then $(\{S\}, S \rightarrow w)$ is a trivial diagnosis for any w. Proposition 4 is important as it makes the diagnosis space (of all possible repair rules) finite, though still large.

Let w be a string. By an ordered partition of w, we mean a sequence of possibly empty sub-strings $\langle u_1, u_2, \ldots, u_k \rangle$ of w such that $w = u_1 u_2 \cdots u_k$; e.g., each of $\langle a, aab \rangle$ and $\langle a, a, ab \rangle$ is an ordered partition of w = aaab but $\langle aa, ba \rangle$ is not an ordered partition of w.

Every sentential form $\alpha$ has one of the following two general forms: $V_1$ or $V_1 u_1 V_2 u_2 \cdots V_k u_k V_{k+1}$, where each $V_i$ is a possibly empty string of non-terminals and each $u_i$ is a non-empty string of terminals. For each of these general forms of $\alpha$, we now define the conditions under which $\alpha$ is said to be consistent with the given terminal string w. If $\alpha = V_1$, then $\alpha$ is always consistent with w. If $\alpha = V_1 u_1 V_2 u_2 \cdots V_k u_k V_{k+1}$, then $\alpha$ is consistent with w if there exists an ordered partition $\langle v_1, u_1, v_2, u_2, \ldots, v_k, u_k, v_{k+1} \rangle$ of w where each $v_i$ is a terminal sub-string of w such that if $V_i = \varepsilon$ then $v_i = \varepsilon$. We make the following simple observations:

· A terminal string u is consistent with w iff u = w.

· A sentential form uV (Vu) is consistent with w iff u is a prefix (suffix) of w.

· If a sentential form $\alpha u \beta$ is consistent with w then a sentential form $\alpha X \beta$ is also consistent with w if u is a terminal string and X is a non-terminal.

· Generalizing, if a sentential form $\alpha \gamma \beta$ is consistent with w then a sentential form $\alpha X \beta$ is also consistent with w if $\gamma$ is a sentential form and X is a non-terminal.

In Example 1 and w = aaab, the sentential forms aaBB, aaBbS, aaBaBB are consistent with w and the sentential forms bA, aabSB are not consistent with w.
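The consistency test amounts to a wildcard match: each maximal run of non-terminals may stand for any (possibly empty) piece of w, while the terminal pieces must appear literally and in order. A minimal Python sketch of this reading follows, assuming single-character symbols; the function name `is_consistent` and its parameters are hypothetical.

```python
import re


def is_consistent(form, w, nonterminals):
    """Return True if the sentential form `form` is consistent with `w`.

    Each maximal run of non-terminals becomes the wildcard ".*"; each
    terminal must match literally. A form made only of non-terminals is
    always consistent, as in the definition.
    """
    if all(symbol in nonterminals for symbol in form):
        return True
    parts = []
    for symbol in form:
        if symbol in nonterminals:
            if not parts or parts[-1] != ".*":   # merge adjacent non-terminals
                parts.append(".*")
        else:
            parts.append(re.escape(symbol))
    return re.fullmatch("".join(parts), w) is not None


N = {"S", "A", "B"}
for form in ["aaBB", "aaBbS", "aaBaBB", "bA", "aabSB"]:
    print(form, is_consistent(form, "aaab", N))
```

Using a full match rather than a search enforces that an empty run of non-terminals at either end corresponds to an empty piece of w, which is what makes a terminal prefix of the form a prefix of w.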

Given a CFG G and a terminal string w, a derivation step $\alpha \Rightarrow \beta$ is G-consistent with w (or just consistent, if G is understood) if both sentential forms $\alpha$ and $\beta$ are consistent with w. Such a consistent derivation step is denoted as $\alpha \leadsto \beta$. A sentential form $\alpha$ is said to be G-consistent (or just consistent) with w if $S \leadsto^* \alpha$ in G, where $\leadsto^*$ denotes zero or more consistent derivation steps. Clearly, the sentential form S is always G-consistent with any terminal string w.

In Example 1, some derivation steps which are $G_1$-consistent with w = aaab are: $S \Rightarrow aB$, $aB \Rightarrow aaBB$,


$aaBB \Rightarrow aabSB$, $aaBbS \Rightarrow aaBbaB$. The sentential form aaBB is $G_1$-consistent with w but the sentential form bA is not $G_1$-consistent with w.

Let G be a CFG and w be an invalid string with respect to G. A consistent derivation graph for w in G, denoted $D^G_w$, is a directed connected graph in which a sentential form $\alpha$ is a vertex if $\alpha$ is G-consistent to w and there is a directed edge from a vertex $\alpha$ to a vertex $\beta$ if $\alpha \leadsto \beta$. For Example 1, the consistent derivation graph for the invalid string w = aaab is shown in Fig. 2.

· In general, the derivation graph for w in G may be infinite. For example, if G contains a production rule $S \rightarrow SS$ (among other rules) then the derivation graph for any w is infinite.

· The derivation graph is finite if either the given CFG is in Greibach normal form or does not contain any $\varepsilon$-production (i.e., a production of the form $X \rightarrow \varepsilon$).

Proposition 5. Let G be a context-free grammar and w an invalid string. Let the tuple $(D, P_D)$ be a diagnosis for (G, w). Let $F^G_w$ and $F^{G_D}_w$ denote the set of sentential forms which are G-consistent and $G_D$-consistent to w, respectively. Then $F^G_w \subset F^{G_D}_w$.

Proof. The proposition follows from the observation that if $\alpha$ is a sentential form G-consistent to w then $\alpha$ is also $G_D$-consistent to w. This is so since every consistent derivation step in the derivation of $\alpha$ in G also holds in $G_D$. The strict subset relation follows from the fact that the sentential form w is $G_D$-consistent but not G-consistent. □

Let G be a context-free grammar and w an invalid string. Let $\alpha$ be a sentential form G-consistent to w. Then a non-terminal X occurring in $\alpha$ is said to be G-live in $\alpha$ for w if there is a production rule for X in G which, when applied to $\alpha$, leads to another G-consistent sentential form. Otherwise, X is said to be G-dead in $\alpha$ for w.

In Example 1, when w = aaab, in the $G_1$-consistent sentential form $\alpha = aaBB$, both occurrences of B are $G_1$-live for w. The first occurrence of B is $G_1$-live in $\alpha$ for w because $\alpha \leadsto aaaBBB$ using the production rule $B \rightarrow aBB$ for the first occurrence of B. The second occurrence of B is $G_1$-live in $\alpha$ for w because $\alpha \leadsto aaBb$ using the rule $B \rightarrow b$ for the second occurrence of B. However, in the sentential form aaBbS, B is $G_1$-live for w and S is $G_1$-dead for w.
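The live/dead classification can be reproduced with a few lines of code. The sketch below is illustrative only: `G1`, `is_consistent` and `liveness` are hypothetical names, upper-case letters are taken to be non-terminals, and the caller is assumed to pass a sentential form that is already known to be consistent and derivable from the start symbol.

```python
import re

G1 = {"S": ["aB", "bA"], "A": ["a", "aS", "bAA"], "B": ["b", "bS", "aBB"]}


def is_consistent(form, w):
    """Wildcard test: non-terminals match any piece of w, terminals literally."""
    pattern = "".join(".*" if s.isupper() else re.escape(s) for s in form)
    return re.fullmatch(pattern, w) is not None


def liveness(form, w, rules=G1):
    """Map the position of each non-terminal in `form` to True (live) or False (dead)."""
    result = {}
    for i, symbol in enumerate(form):
        if symbol in rules:
            # live if some production for this occurrence keeps the form consistent
            result[i] = any(
                is_consistent(form[:i] + rhs + form[i + 1:], w)
                for rhs in rules[symbol]
            )
    return result


print(liveness("aaBB", "aaab"))    # both occurrences of B
print(liveness("aaBbS", "aaab"))   # B at position 2, S at position 4
```

Positions are zero-based indices into the sentential form, so the two occurrences of B in aaBB are reported at positions 2 and 3.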


Proposition 6. Let G be a CFG and w be an invalid terminal string. Let $\alpha$ be any sentential form which is G-consistent to w. Let $\alpha$ contain a G-dead variable X. Then X remains G-dead in any sentential form $\beta$ s.t. $\alpha \leadsto \beta$.

Proof. Let $\alpha = \alpha_1 Y \alpha_2 X \alpha_3$ be a sentential form G-consistent to w in which X is G-dead but Y is G-live. Let us obtain a sentential form $\beta$ by using an applicable production $Y \rightarrow \gamma$ to expand Y in $\alpha$, i.e., $\alpha_1 Y \alpha_2 X \alpha_3 \leadsto \alpha_1 \gamma \alpha_2 X \alpha_3$. We need to show that X is G-dead in $\beta$. Assume to the contrary. Then there is a production $X \rightarrow v$ in G such that $\alpha_1 \gamma \alpha_2 v \alpha_3$ is consistent to w. Since $\alpha_1 \gamma \alpha_2 v \alpha_3$ is consistent to w, clearly, $\alpha_1 Y \alpha_2 v \alpha_3$ is also consistent to w and, consequently, $\alpha_1 Y \alpha_2 X \alpha_3$ (i.e., $\alpha$) is also consistent to w. Therefore, the production $X \rightarrow v$ can now be applied to X in $\alpha$. Thus X is G-live in $\alpha$ for w, contradicting the assumption. □

Informally, the binary relation $\leadsto$ between sentential forms preserves dead non-terminals. However, it is easy to see that it does not preserve live non-terminals. In Example 1, for w = aaab, all the three occurrences of B are $G_1$-live in the $G_1$-consistent sentential form aaaBBB. However, whenever any applicable production is applied to any occurrence of B, the remaining two occurrences of B in the resulting sentential form are $G_1$-dead for w.

Given a CFG G and an invalid string w, a non-terminal X in N is said to be G-relevant to w if X appears in at least one w-consistent sentential form in G. The set of all non-terminals which are G-relevant to w is denoted by $R^G_w$. A non-terminal which is not G-relevant to w is called a non-terminal which is G-irrelevant to w. The set of all non-terminals which are G-irrelevant to w is denoted by $I^G_w$. Clearly, a non-terminal X in N must belong to either $R^G_w$ or $I^G_w$ but not to both. Thus $R^G_w$ and $I^G_w$ together form a partition of the set N of non-terminals in G. The start symbol S always belongs to $R^G_w$ for any w, since the sentential form S is always G-consistent and it is always consistent to any w. For Example 1 and w = aaab, $R^{G_1}_w = \{S, B\}$ and $I^{G_1}_w = \{A\}$.
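For the grammar of Example 1 the set of consistent sentential forms reachable from the start symbol is finite, so the relevant and irrelevant non-terminals can be found by exhaustive search. The following Python sketch is an illustration under stated assumptions: the names `G1`, `is_consistent` and `consistent_forms` are hypothetical, symbols are single characters, and termination relies on the grammar being in Greibach normal form.

```python
import re
from collections import deque

G1 = {"S": ["aB", "bA"], "A": ["a", "aS", "bAA"], "B": ["b", "bS", "aBB"]}


def is_consistent(form, w):
    """Wildcard test: non-terminals match any piece of w, terminals literally."""
    pattern = "".join(".*" if s.isupper() else re.escape(s) for s in form)
    return re.fullmatch(pattern, w) is not None


def consistent_forms(rules, start, w):
    """Breadth-first search over consistent derivation steps from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        for i, symbol in enumerate(form):
            for rhs in rules.get(symbol, []):
                new_form = form[:i] + rhs + form[i + 1:]
                if new_form not in seen and is_consistent(new_form, w):
                    seen.add(new_form)
                    queue.append(new_form)
    return seen


forms = consistent_forms(G1, "S", "aaab")
relevant = {s for form in forms for s in form if s in G1}
irrelevant = set(G1) - relevant
print(sorted(relevant), sorted(irrelevant))
```

The set `forms` corresponds to the vertices of the consistent derivation graph; recording the pairs (`form`, `new_form`) as they are discovered would give its edges as well.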

Proposition 7. Let G be a context-free grammar and w an invalid string. Let the tuple $(D, P_D)$ be a diagnosis for (G, w). Then $R^G_w = R^{G_D}_w$.

Proof. By Proposition 5, $F^G_w \subset F^{G_D}_w$. This means that $R^G_w \subseteq R^{G_D}_w$. Let $\alpha \in F^{G_D}_w$. Then $\alpha$ has a w-consistent derivation in $G_D$, say, $\langle \alpha_1 = S, \alpha_2, \alpha_3, \ldots, \alpha_n = \alpha \rangle$ where $\alpha_i \leadsto \alpha_{i+1}$ in $G_D$ for $i = 1, 2, \ldots, n-1$. The proposition is proved if, for each $\alpha_i$, $i = 1, 2, \ldots, n$, it is true that $\alpha_i$ contains only variables in $R^G_w$. We prove this statement by induction on the length of the derivation sequence.

Basis. Since $\alpha_1 = S$ and S is in $R^G_w$, the proposition holds for $i = 1$.

Induction step. Assume that the proposition holds for $\alpha_i$, i.e., $\alpha_i$ contains only those non-terminals which are in $R^G_w$. Let $\alpha_i \leadsto \alpha_{i+1}$ in $G_D$. There are 4 cases to consider.

Case 1. $\alpha_i \in F^G_w$ and a rule from G was used in the derivation step. Then $\alpha_{i+1} \in F^G_w$ and the non-terminals in $\alpha_{i+1}$ are in $R^G_w$.

Case 2. $\alpha_i \in F^G_w$ and a rule from $P_D$ was used in the derivation step. Since every rule in $P_D$ contains only a terminal string on the right hand side, the non-terminals in $\alpha_{i+1}$ are a subset of those in $\alpha_i$. Since the non-terminals in $\alpha_i$ are in $R^G_w$, so are those in $\alpha_{i+1}$.

Case 3. $\alpha_i \notin F^G_w$ and a rule from $P_D$ was used in the derivation step. By an argument similar to that for case 2, it is seen that the non-terminals in $\alpha_{i+1}$ are also in $R^G_w$.

Case 4. $\alpha_i \notin F^G_w$ and a rule $X \rightarrow \beta$ from G was used in the derivation step. Since $X \in R^G_w$, there exists a sentential form in $F^G_w$ such that X occurs in it. We show that there exists a sentential form $\gamma$ in $F^G_w$ such that X is G-live in $\gamma$. Such a sentential form is obtained from the derivation sequence up to $\alpha_i$ by replacing all


Proposition 8. Let G be a CFG and w be an invalid terminal string. Let the tuple $(D, P_D)$ be a diagnosis for (G, w). Then $D \subseteq R^G_w$.

Proof. Follows immediately from Proposition 7. □

Proposition 8 says that only the relevant non-terminals can be used for diagnosis. It can be used to rule out certain non-terminals as candidates for faults.

Proposition 9 (Complexity of TBD). Let G be a context-free grammar containing n non-terminals. Let w be an invalid string of length m. Then the size of the diagnosis space (i.e., the total number of possible candidate rule-sets) for diagnosis of w is given by $R(m, n) = [1 + S(m)]^n$, where $S(m)$ is the number of non-empty distinct substrings of w.

Proof. Consider a subset A, of the set N of non-terminals, containing k elements. Then we need to assign a substring of w to each of the non-terminals in A; each such substring can be chosen in $S(m)$ ways. Thus the total number of rule sets possible for A is $[S(m)]^k$. Now the set A itself can be chosen in ${}^{n}C_{k}$ ways. Thus the total number of rule-sets for k variables is ${}^{n}C_{k} \cdot [S(m)]^k$. Summing for $k = 0, \ldots, n$, we get the total number of rule sets:

$$R(m, n) = {}^{n}C_{0}\,[S(m)]^0 + {}^{n}C_{1}\,[S(m)]^1 + {}^{n}C_{2}\,[S(m)]^2 + \cdots + {}^{n}C_{n}\,[S(m)]^n = [1 + S(m)]^n.$$

Of course, not all rule sets in the diagnosis space are diagnoses for the given invalid string w. □

Thus, the size of the diagnosis space suffers from combinatorial explosion. For Example 1, with w = aaab, n = 3, m = 4. There are 7 non-empty distinct substrings of w: a, aa, aaa, b, ab, aab, aaab. Hence, $S(m) = 7$. Then $R(m, n) = [1 + S(m)]^n = [1 + 7]^3 = 512$.
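The arithmetic for Example 1 is easy to reproduce. In the sketch below, which is an illustration only, `distinct_substrings` is a hypothetical helper; the last lines also check the binomial sum used in the proof of Proposition 9.

```python
from math import comb


def distinct_substrings(w):
    """Return the set of non-empty distinct substrings of w."""
    return {w[i:j] for i in range(len(w)) for j in range(i + 1, len(w) + 1)}


w, n = "aaab", 3                 # invalid trace and number of non-terminals
subs = distinct_substrings(w)
s_m = len(subs)                  # S(m): number of candidate right-hand sides
size = (1 + s_m) ** n            # R(m, n): size of the diagnosis space

print(sorted(subs, key=lambda u: (len(u), u)))
print(s_m, size)

# The closed form agrees with the sum over subsets of k non-terminals.
assert size == sum(comb(n, k) * s_m ** k for k in range(n + 1))
```

Note that $S(m)$ depends on the string itself and not only on its length; a string of length m has at most m(m + 1)/2 distinct non-empty substrings.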

There is a simple but highly inefficient generate-and-test procedure to generate all possible diagnoses for a given grammar G and a given invalid string w. It systematically picks up a subset of non-terminals and constructs the associated rule set by assigning a substring of w to each non-terminal picked. It then merges the resulting rule set with G and tests whether w is parsed by the merged grammar; if yes, the rule set is a diagnosis, otherwise, a new attempt is made to form a rule set.
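A direct rendering of this generate-and-test procedure for Example 1 is sketched below in Python. It is an illustration under several assumptions, not the procedure's actual implementation: the names `G1`, `derives`, `accepts` and `all_diagnoses` are hypothetical, the recognizer only works when every right-hand side starts with a terminal, and minimality is checked by restricting the same repair rules to each proper subset of the chosen non-terminals.

```python
from itertools import combinations, product

G1 = {"S": ["aB", "bA"], "A": ["a", "aS", "bAA"], "B": ["b", "bS", "aBB"]}


def derives(rules, form, s):
    """True if `form` derives `s`; assumes each rule starts with a terminal."""
    if len(form) > len(s):
        return False
    if not form:
        return s == ""
    head, rest = form[0], form[1:]
    if head not in rules:
        return s[0] == head and derives(rules, rest, s[1:])
    return any(derives(rules, rhs + rest, s) for rhs in rules[head])


def accepts(rules, assignment, w, start="S"):
    """Test w against `rules` extended with one repair rule per key of `assignment`."""
    extended = {x: rhss + ([assignment[x]] if x in assignment else [])
                for x, rhss in rules.items()}
    return derives(extended, start, w)


def all_diagnoses(rules, w):
    """Enumerate the whole diagnosis space and keep the minimal accepting rule sets."""
    substrings = sorted({w[i:j] for i in range(len(w))
                         for j in range(i + 1, len(w) + 1)})
    nonterminals = sorted(rules)
    found = []
    # One choice per non-terminal: no repair (None) or one substring of w.
    for choice in product([None] + substrings, repeat=len(nonterminals)):
        assignment = {x: u for x, u in zip(nonterminals, choice) if u is not None}
        if not accepts(rules, assignment, w):
            continue
        # Minimality: no proper subset of the repaired non-terminals suffices.
        proper_subsets = ({x: assignment[x] for x in subset}
                          for k in range(len(assignment))
                          for subset in combinations(assignment, k))
        if not any(accepts(rules, sub, w) for sub in proper_subsets):
            found.append(assignment)
    return found


for diagnosis in all_diagnoses(G1, "aaab"):
    print(diagnosis)
```

The outer loop visits exactly $[1 + S(m)]^n$ candidate rule sets, which is why this approach is only practical for very small grammars and traces.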

5. Diagnostic algorithm

In MBD, the diagnosis algorithm systematically selects the logical components whose being faulty makes the observation logically consistent with the system model. In TBD, the diagnostic algorithm systematically selects non-terminals and identifies (and adds to the original grammar) an additional production rule for them, so as to make the observed string derivable from the new grammar. Thus, the notion of logical consistency in MBD is replaced by the use of the `derivable' relation between sentential forms. For example, it is easy to see in Fig. 2 that the three sentential forms marked with a * are the only ones which contain a non-terminal and which can be `repaired'. We now formalize this approach.

Let G be a CFG and w be an invalid terminal string. Let $C = \{X \rightarrow u \mid X$ is a non-terminal in G and u is a substring of w$\}$ be the diagnosis space (i.e., the set of all possible repair rules for (G, w)). Let $P' = P \cup C$. Then a completed consistent derivation graph for w in G, denoted $H^G_w$, is defined as follows. Every sentential form $\alpha$ G-consistent to w is a vertex. There is a directed edge, labeled by a rule r in $P'$, from a vertex $\alpha$ to a vertex $\beta$ if $\alpha \leadsto_r \beta$.


reasonably efficient diagnostic algorithms can be designed to generate all possible diagnoses for a given CFG G and given invalid string w. The efficiency of the diagnostic algorithm follows because it `rides' on an efficient parser. The graph $H^G_w$ is implicitly generated and traversed by a parser. Whenever the parser is stuck in its attempt to parse w, the diagnostic algorithm nudges it forward by generating a repair rule. The backtracking inherent in a parser is then used to generate all possible diagnoses. The algorithm can diagnose multiple faults in the given invalid trace. The algorithm does not generate diagnoses containing irrelevant non-terminals and it performs only consistent derivation steps. The output lists only the faulty non-terminals in a diagnosis, rather than the repair rules for each non-terminal.

Although it is possible to use any parser in the diagnostic procedure, the trace analysis system (TAS/RD) currently uses the top-down recursive descent parser (Aho and Ullman, 1972) for Prolog DCG. However, this means that the grammar is forbidden to contain left-recursive rules. It also means that the diagnostic algorithm has the worst-case complexity $O(c^n)$ where c is a constant and n is the length of the trace (Aho and Ullman, 1972). However, in practice, with a careful writing of grammar rules, the recursive-descent parser in Prolog works quite efficiently.

The diagnostic algorithm in TAS/RD is implemented in Prolog and takes advantage of Prolog backtracking for generating all possible repair rules to parse the given invalid string. For each non-terminal X, the algorithm systematically attempts to generate all possible repair rules of the (simplified) form X --> [L] where L is a relevant string from the as yet unparsed (remaining) part of the given input string. TAS/RD collects all rules applied for a parse of w in a parse-tree like structure. Once a parse for w is obtained, a diagnosis is extracted from the parse tree and printed. Backtracking inherent in the parser is used to automatically generate all such parse trees for w. Care is taken to avoid printing non-minimal diagnoses and diagnoses already generated.

For Example 1 and w = aaab, the TAS/RD system shows all possible diagnoses (obtained on backtracking):

    ?- diagnose('S',[a,a,a,b],[]).
    ['B'];
    ['S'];
    no

TAS/RD internally generates the following three repair rules for the given string w, corresponding to the three sentential forms marked with a * in Fig. 2:

    B --> [a]
    B --> [a,a,b]
    S --> [a,a,a,b]

This indicates that B can `fail' in two ways; however, only one diagnosis for B is printed.
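The idea of letting a top-down parser generate repair rules whenever needed can be imitated in a few lines. The Python sketch below is not TAS/RD: it is a simplified illustration with hypothetical names (`G1`, `parse`, `diagnose`), it tries every non-empty prefix of the unparsed input as a repair string rather than only relevant ones, it assumes a grammar without left recursion whose rules start with a terminal, and it uses generators in place of Prolog backtracking.

```python
G1 = {"S": ["aB", "bA"], "A": ["a", "aS", "bAA"], "B": ["b", "bS", "aBB"]}


def parse(rules, symbols, s):
    """Yield (rest, faulty) for each way `symbols` can consume a prefix of `s`.

    `faulty` is the set of non-terminals for which a repair rule was used.
    """
    if not symbols:
        yield s, frozenset()
        return
    head, tail = symbols[0], symbols[1:]
    if head not in rules:                      # terminal: must match the input
        if s.startswith(head):
            yield from parse(rules, tail, s[len(head):])
        return
    # Ordinary productions of the grammar.
    for rhs in rules[head]:
        yield from parse(rules, rhs + tail, s)
    # Repair rule head -> L for every non-empty prefix L of the unparsed input.
    for k in range(1, len(s) + 1):
        for rest, faulty in parse(rules, tail, s[k:]):
            yield rest, faulty | {head}


def diagnose(rules, start, w):
    """Return the minimal sets of faulty non-terminals over all complete parses."""
    complete = {faulty for rest, faulty in parse(rules, start, w) if rest == ""}
    return {d for d in complete if not any(other < d for other in complete)}


for faulty in sorted(diagnose(G1, "S", "aaab"), key=sorted):
    print(sorted(faulty))
```

As in the output shown above, only the faulty non-terminals are reported, so the two different repairs of B collapse into a single diagnosis.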

When a CFG has a large number of non-terminals, the TAS/RD system allows the user to specify only a subset of them to be considered for repair (akin to the notion of components in MBD). For Example 2 and w = [b,o,'b+','b-'], let us restrict the components to {h1_on, h1_off, h2_on, h2_off, temp_rises_below, temp_rises_to}. Then the TAS/RD system shows the only possible diagnosis, which is that the temperature has not risen to T (perhaps H1 may be off due to some fault). The addition of more components does not eliminate this diagnosis, but only adds a few more.

    ?- diagnose(normal,[b,o,'b+','b-'],[]).
    [temp_rises_to, h1_off];
    no


backtracking) each value from the domain of the attribute. For Example 3, the domain of each attribute is Boolean, having only values 0 and 1. Once this information is given, the TAS/RD system generates all diagnoses for the given invalid string. Note that the non-terminals in a diagnosis correspond to faulty components.

?- diagnose('adder', [1,0,1,1,0],[]).
[x2, o1];
[x2, o1];
[x1];
no

6. Conclusion

This paper starts from the observation that the initial diagnosis of a dynamic system is usually based on an analysis of the run-time log of system activities. We advocate the use of a CFG (as an alternative model description mechanism) to describe the set of normal traces of the system. A CFG is a natural choice for representing traces, especially because it is a hierarchical notation and allows non-terminals to be naturally interpreted as behavior components. The theoretical framework of trace-based diagnosis is developed and illustrated. Given a normal trace grammar and a faulty trace, a diagnostic procedure is described which identifies the faulty parts of the grammar (i.e., deviations from the normal trace patterns). Diagnoses based on TBD may be used as clues to improve the efficiency of more detailed diagnosis using other techniques. TBD is a new diagnosis framework, unifying MBD and syntactic pattern recognition. Essentially, TBD reformulates the MBD problem in a syntactic pattern recognition framework. The diagnostic algorithm in TBD is distinct from the standard error-correcting parsing in syntactic pattern recognition (Fu, 1982): (i) no new non-terminals are added, (ii) no notion of a 'distance' between strings (i.e., between the given invalid string and a possible correct string 'closest' to it) is used and (iii) a single unified repair operator is used (instead of the usual insert, delete, substitute operators).

Additional work is needed to develop the TBD framework; e.g., TBD may be extended to stochastic grammars to take probabilities of behavior patterns into account. For complex systems, TAS/RD generates a large number of diagnoses without discriminating among them. Two ways to reduce the number of diagnoses are: (a) focus on a small number of components, (b) generate diagnoses containing only non-terminals at the highest or lowest 'level'. It is an open question whether the use of more sophisticated repair operators (e.g., the insert, delete and substitute operations on production rules) will lead to faster and more focused diagnosis. The TBD framework does not yet include tests and measurements, notions of minimum-cost diagnosis, explanation facilities, etc.

Finally, it would be interesting to explore the theoretical relationship between the diagnosis based on behavioral and functional models; we have not done so in this paper. Such a relationship must necessarily reflect the deep connection between a set of logical formulae and a grammar. In particular, the notion of logical consistency in MBD is replaced in TBD by the use of the 'derivable' relation between sentential forms.

Acknowledgements

anonymous referees were of considerable help and value. The first author would like to thank Dr. Manasee G. Palshikar for her patience and confidence.

References

Aho, A.V., Ullman, J.D., 1972. Theory of Parsing, Translation and Compiling, Vol. I: Parsing. Prentice-Hall, Englewood Cliffs, NJ.

Clocksin, W.F., Mellish, C.S., 1984. Programming in Prolog. Springer, Berlin.

Davis, R., Hamscher, W., 1988. Model based reasoning: troubleshooting. In: Exploring Artificial Intelligence, Chapter 8. Morgan Kaufmann, Los Altos, CA.

de Kleer, J., Williams, B., 1987. Diagnosing multiple faults. Artificial Intelligence 32 (1), 97–130.

Downing, K.L., 1992. Consistency-based diagnosis in physiological domains. In: Proceedings of the Tenth National Conference on Artificial Intelligence. AAAI Press/MIT Press, San Jose, California, pp. 558–563.

Eick, S.G., Lucas, P.J., 1996. Displaying trace files. Software Practice and Experience 26 (4), 399–410.

Fu, K.S., 1982. Syntactic Pattern Recognition and Applications. Prentice-Hall, Englewood Cliffs, NJ.

Hamscher, W., 1991. Modeling digital circuits for troubleshooting. Artificial Intelligence 51 (1–3), 223–271.

Hamscher, W., Console, L., de Kleer, J., 1992. Readings in Model-based Diagnosis. Morgan Kaufmann, Los Altos, CA.

Mozetic, I., 1991. Hierarchical model-based diagnosis. International Journal of Man-Machine Studies 35 (3), 329–362.

Nakakuki, Y., Koseki, Y., Tanaka, M., 1992. Adaptive model-based diagnostic mechanism using a hierarchical model scheme. In: Proceedings of the Tenth National Conference on Artificial Intelligence. AAAI Press/MIT Press, San Jose, California, pp. 564–569.

Reiter, R., 1987. A theory of diagnosis from first principles. Artificial Intelligence 32 (1), 57–95.

Struss, P., Sachenbacher, M., Dummert, F., 1997. Diagnosing a dynamic system with (almost) no observations. In: Proceedings of the 11th International Workshop on Qualitative Reasoning.

Washio, T., Sakuma, M., Kitamura, M., 1997. A new approach to quantitative and credible diagnosis of multiple faults of components and sensors. Artificial Intelligence 84 (1–2), 103–130.