Dynamic Bayesian Networks - Dynamic Bayesian Ontology Languages

8.2 Dynamic Bayesian Ontology Languages

8.2.1 Dynamic Bayesian Networks

BNs are static models, i.e., they cannot capture the dynamic features of the application domain. For instance, a patient is very likely to have a high fever given that the patient had a high fever in the previous time step. Such scenarios can be modeled using dynamic Bayesian networks (DBNs) (Dean and Kanazawa 1989; Murphy 2002), which extend BNs to provide a compact representation of evolving JPDs for a fixed set of random variables.

BNs are known for compactly representing a state space, while DBNs can also represent the state-transition probabilities and can thus be used to make projections about the future states of a system. The update of the joint probability distribution is typically expressed through a two-slice BN, which expresses the probabilities at the next point in time, given the current context.

Formally, a two-slice BN (TBN) over a finite set of variables $V$ is a pair $B_\to = (G, \Theta)$, where $G = (V \cup V_+, E)$ with $V_+ = \{X_+ \mid X \in V\}$ is a DAG such that all edges are directed from elements of $V \cup V_+$ to elements of $V_+$, and $\Theta$ contains, for every $X_+ \in V_+$, a conditional probability distribution $P(X_+ \mid \pi(X_+))$ for $X_+$ given the parents of $X_+$. As standard in BNs, every node is independent of all its non-descendants given its parents in TBNs. Thus, for a TBN $B_\to$, the conditional probability distribution at time $t + 1$ given time $t$ is

$$P_{B_\to}(V_{t+1} \mid V_t) = \prod_{X_+ \in V_+} P_{B_\to}(X_+ \mid \pi(X_+)).$$

Example 8.26 Figure 8.2 depicts a TBN $B_\to$ and thereby defines the transition probabilities between the two time slices. For instance, the probability for bob to have high fever at time $t + 1$ provided he did not have high fever at time $t$ is given by $P_{B_\to}(f_+ \mid \neg f) = 0.1$. ♦
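As a rough illustration of the transition distribution, the following Python sketch encodes the F+ and S+ tables of Figure 8.2 and multiplies the matching entries, one factor per next-slice node. The dictionary layout, the names `TBN` and `transition`, and the convention that a trailing "+" marks a parent in the next slice are assumptions made for this sketch only. C+ is left out for brevity; according to the tables, C is not a parent of F+ or S+, so the two factors shown are unaffected.

```python
from itertools import product

# Tables transcribed from Figure 8.2 for F+ and S+ (C+ is omitted here).
# Each next-slice variable maps to (parents, table), where the table gives
# P(variable = 1 | parent values). A parent ending in "+" belongs to the
# next slice; every other parent belongs to the current slice.
TBN = {
    "F+": (("F",), {(1,): 0.7, (0,): 0.1}),
    "S+": (("F", "S", "F+"), {
        (1, 1, 1): 0.9, (1, 1, 0): 0.5,
        (1, 0, 1): 0.8, (1, 0, 0): 0.4,
        (0, 1, 1): 0.8, (0, 1, 0): 0.4,
        (0, 0, 1): 0.7, (0, 0, 0): 0.1,
    }),
}


def transition(prev, nxt):
    """Return P(V_{t+1} = nxt | V_t = prev) as a product over next-slice nodes.

    `prev` and `nxt` are dicts such as {"F": 1, "S": 0} (1 = true, 0 = false).
    """
    prob = 1.0
    for var, (parents, table) in TBN.items():
        # Look up every parent in the slice it belongs to.
        key = tuple(nxt[p[:-1]] if p.endswith("+") else prev[p] for p in parents)
        p_true = table[key]
        prob *= p_true if nxt[var[:-1]] == 1 else 1.0 - p_true
    return prob


if __name__ == "__main__":
    prev = {"F": 0, "S": 0}
    for f, s in product((1, 0), repeat=2):
        nxt = {"F": f, "S": s}
        print(prev, "->", nxt, round(transition(prev, nxt), 3))
```

Summing `transition` over both values of S+ for a previous state without fever recovers the value 0.1 of Example 8.26.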

A dynamic Bayesian network (DBN) is a pair $D = (B_1, B_\to)$, where $B_1$ is a BN and $B_\to$ is a TBN, both over $V$. The TBN can be thought of as containing two disjoint copies of the random variables in $V$, where the probability distribution at time $t + 1$ depends on the distribution at time $t$.

Figure 8.2: The DBN $D_h = (B_1, B_\to)$, consisting of (a) a BN $B_1$ ($= B_h$) over $V = \{F, S, C\}$, which compactly represents a joint probability distribution, and (b) a two-slice BN $B_\to$ over $V$, which defines the transition probabilities between two time slices $V_t$ and $V_{t+1}$. The conditional probability tables of $B_\to$, relating F, S, C at time $t$ to F+, S+, C+ at time $t + 1$, are:

P(f+ | F)
  f     .7
  ¬f    .1

P(s+ | F, S, F+)
  f    s    f+     .9
  f    s    ¬f+    .5
  f    ¬s   f+     .8
  f    ¬s   ¬f+    .4
  ¬f   s    f+     .8
  ¬f   s    ¬f+    .4
  ¬f   ¬s   f+     .7
  ¬f   ¬s   ¬f+    .1

P(c+ | C, F+, S+)
  c    f+    s+     .7
  c    f+    ¬s+    .2
  c    ¬f+   s+     .4
  c    ¬f+   ¬s+    ₁
  ¬c   f+    s+     .6
  ¬c   f+    ¬s+    .1
  ¬c   ¬f+   s+     .3
  ¬c   ¬f+   ¬s+    ₁

To be able to distinguish the variables in different time slices, we use $V_t$ and $X_t$ to denote the set of variables $V$ and the variable $X \in V$ at time $t$, respectively. As in BNs, $x$ is an abbreviation for $X = 1$ and $\neg x$ for $X = 0$. Moreover, we assume the (first-order) Markov property: the probability of the future state is independent of the past, given the present state. We note, however, that all of our results can be generalized to k-slice BNs, which relax this assumption to k slices and add memory. Given the (first-order) Markov property, a DBN $D = (B_1, B_\to)$ defines, for every $t \geq 1$, the unique joint probability distribution

$$P_D(V_t) = P_{B_1}(V_1) \cdot \prod_{i=2}^{t} \prod_{X_i \in V_i} P_{B_\to}(X_i \mid \pi(X_i)).$$
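The product on the right-hand side can be evaluated for one concrete sequence of states, as in the Python sketch below. It is a minimal illustration under stated assumptions: the transition tables for F+ and S+ are transcribed from Figure 8.2 (C is omitted), whereas `P_INIT` is a hypothetical stand-in for $B_1$, since Figure 8.1 is not reproduced in this section. The function names are likewise chosen for illustration.

```python
# Transition tables transcribed from Figure 8.2 (F+ and S+ only; C+ omitted).
P_F = {1: 0.7, 0: 0.1}                      # P(f+ | F)
P_S = {(1, 1, 1): 0.9, (1, 1, 0): 0.5,      # P(s+ | F, S, F+)
       (1, 0, 1): 0.8, (1, 0, 0): 0.4,
       (0, 1, 1): 0.8, (0, 1, 0): 0.4,
       (0, 0, 1): 0.7, (0, 0, 0): 0.1}

# HYPOTHETICAL joint distribution over (F, S) at time 1, standing in for B1.
P_INIT = {(1, 1): 0.12, (1, 0): 0.08, (0, 1): 0.08, (0, 0): 0.72}


def transition(prev, nxt):
    """P(V_i = nxt | V_{i-1} = prev) for states given as (F, S) tuples."""
    (f, s), (f_next, s_next) = prev, nxt
    p_f = P_F[f] if f_next else 1.0 - P_F[f]
    p_s = P_S[(f, s, f_next)] if s_next else 1.0 - P_S[(f, s, f_next)]
    return p_f * p_s


def trajectory_probability(states):
    """P_B1(v_1) times the transition probabilities for v_2, ..., v_t."""
    prob = P_INIT[states[0]]
    for prev, nxt in zip(states, states[1:]):
        prob *= transition(prev, nxt)
    return prob


if __name__ == "__main__":
    # Both variables false at times 1 and 2, both true at time 3.
    print(trajectory_probability([(0, 0), (0, 0), (1, 1)]))
```

Because every factor is a conditional distribution, the values returned for all sequences of a fixed length sum to one.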

We briefly illustrate these notions on our running example.

Example 8.27 Consider the TBN $B_\to$ depicted in Figure 8.2. The pair $D = (B_1, B_\to)$ is a DBN, where $B_1$ is the BN depicted in Figure 8.1. We can pose non-static queries to the DBN $D$. For instance, the probability of bob having high fever at time point 2, $P_D(f_2)$, can be computed as

$$P_{B_1}(f_1) \cdot P_{B_\to}(f_2 \mid f_1) + P_{B_1}(\neg f_1) \cdot P_{B_\to}(f_2 \mid \neg f_1),$$

which is a dynamic version of standard probabilistic inference in BNs. ♦

Intuitively, the distribution at time $t$ is defined by unraveling the DBN starting from $B_1$, using the two-slice structure of $B_\to$ until $t$ copies of $V$ have been created. This produces a new BN $B_{1:t}$ encoding the distribution over time of the different variables. Figure 8.3 shows the unraveling to $t = 3$ of the DBN $(B_1, B_\to)$, where $B_1$ and $B_\to$ are the networks shown in Figures 8.1 and 8.2, respectively.

Figure 8.3: Three-step unraveling $B_{1:3}$ of $(B_1, B_\to)$, with nodes $F_1, C_1, S_1$, $F_2, C_2, S_2$, and $F_3, C_3, S_3$.

The conditional probability tables of each node given its parents (not shown) are those of $B_1$ for the nodes in $V_1$, and of $B_\to$ for the nodes in $V_2 \cup V_3$. Notice that $B_{1:t}$ has $t$ copies of each random variable in $V$. For a given $t \geq 1$, we call $B_t$ the BN obtained by unraveling the DBN to time $t$, which yields $B_{1:t}$, and eliminating all variables not in $V_t$. In particular, we have that $P_{B_t}(V) = P_{B_{1:t}}(V_t)$.
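The relation between $B_t$ and the unraveling $B_{1:t}$ can be checked numerically on a small case. The Python sketch below computes the distribution over the last slice in two ways: slice by slice, summing out the previous slice at each step, and by enumerating all trajectories of the unraveled network. The F+ and S+ tables are transcribed from Figure 8.2 (C is omitted); `P_INIT` is a hypothetical stand-in for $B_1$, and all function names are assumptions of this sketch.

```python
from itertools import product

# Transition tables transcribed from Figure 8.2 (F+ and S+ only; C+ omitted).
P_F = {1: 0.7, 0: 0.1}                      # P(f+ | F)
P_S = {(1, 1, 1): 0.9, (1, 1, 0): 0.5,      # P(s+ | F, S, F+)
       (1, 0, 1): 0.8, (1, 0, 0): 0.4,
       (0, 1, 1): 0.8, (0, 1, 0): 0.4,
       (0, 0, 1): 0.7, (0, 0, 0): 0.1}

# HYPOTHETICAL joint distribution over (F, S) at time 1, standing in for B1.
P_INIT = {(1, 1): 0.12, (1, 0): 0.08, (0, 1): 0.08, (0, 0): 0.72}

STATES = list(product((1, 0), repeat=2))


def transition(prev, nxt):
    """P(V_i = nxt | V_{i-1} = prev) for states given as (F, S) tuples."""
    (f, s), (f_next, s_next) = prev, nxt
    p_f = P_F[f] if f_next else 1.0 - P_F[f]
    p_s = P_S[(f, s, f_next)] if s_next else 1.0 - P_S[(f, s, f_next)]
    return p_f * p_s


def marginal_at(t):
    """P(V_t), computed slice by slice while summing out the previous slice."""
    dist = dict(P_INIT)
    for _ in range(t - 1):
        dist = {nxt: sum(dist[prev] * transition(prev, nxt) for prev in STATES)
                for nxt in STATES}
    return dist


def marginal_by_unraveling(t):
    """P(V_t), computed from the full joint over all t slices."""
    dist = {state: 0.0 for state in STATES}
    for traj in product(STATES, repeat=t):
        prob = P_INIT[traj[0]]
        for prev, nxt in zip(traj, traj[1:]):
            prob *= transition(prev, nxt)
        dist[traj[-1]] += prob
    return dist


if __name__ == "__main__":
    step2 = marginal_at(2)
    # P(f_2) as in Example 8.27, under the hypothetical P_INIT above.
    print(round(step2[(1, 1)] + step2[(1, 0)], 4))
```

The slice-by-slice version touches only pairs of consecutive states, whereas the enumeration grows exponentially in $t$; the latter is included only as a cross-check.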

For notational convenience, we write $V_{1:t} := V_1 \cup \ldots \cup V_t$, and $W_{1:t}$ for a valuation of $V_{1:t}$. Moreover, we write $X_t \subseteq V_t$ to denote a set of variables at time $t$ and $x_t$ to denote an instantiation of these variables; if, furthermore, $X_t = V_t$, then $x_t$ corresponds to a state at time $t$. Analogously, we write $X_{1:t}$ to denote a sequence of variables $X_1, \ldots, X_t$ and $x_{1:t}$ to denote an instantiation of these variables, which is usually called a trajectory.

Traditional inference problems in DBNs are given in Figure 8.4 with the help of a simple timeline. Formally, given a DBN, filtering (also called monitoring) is the task of computing $P_D(x_t \mid y_{1:t})$. Smoothing and prediction are the past ($P_D(x_{t-l} \mid y_{1:t})$) and future ($P_D(x_{t+h} \mid y_{1:t})$) analogs of filtering, respectively; and classification is a special case of filtering, which amounts to computing $P_D(y_{1:t})$. Finally, finding a hypothesis that maximizes the probability of the query is called decoding, and is computed as $\arg\max_{x_{1:t}} P_D(x_{1:t} \mid y_{1:t})$. Decoding is analogous to maximum a posteriori hypothesis (MAP) and most probable explanation (MPE) in BNs (Park and Darwiche 2004b). For further details, we refer to Murphy (2002).
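For very small networks, these inference tasks can be sketched by brute-force enumeration, as in the Python code below. The setup is an assumption made for illustration: S is treated as the observed variable $y$ and F as the hidden variable $x$, the F+ and S+ tables are transcribed from Figure 8.2 (C is omitted), and `P_INIT` is a hypothetical stand-in for $B_1$. Practical systems use dedicated algorithms rather than enumeration; see Murphy (2002).

```python
from itertools import product

# Transition tables transcribed from Figure 8.2 (F+ and S+ only; C+ omitted).
P_F = {1: 0.7, 0: 0.1}                      # P(f+ | F)
P_S = {(1, 1, 1): 0.9, (1, 1, 0): 0.5,      # P(s+ | F, S, F+)
       (1, 0, 1): 0.8, (1, 0, 0): 0.4,
       (0, 1, 1): 0.8, (0, 1, 0): 0.4,
       (0, 0, 1): 0.7, (0, 0, 0): 0.1}

# HYPOTHETICAL joint distribution over (F, S) at time 1, standing in for B1.
P_INIT = {(1, 1): 0.12, (1, 0): 0.08, (0, 1): 0.08, (0, 0): 0.72}


def transition(prev, nxt):
    """P(V_i = nxt | V_{i-1} = prev) for states given as (F, S) tuples."""
    (f, s), (f_next, s_next) = prev, nxt
    p_f = P_F[f] if f_next else 1.0 - P_F[f]
    p_s = P_S[(f, s, f_next)] if s_next else 1.0 - P_S[(f, s, f_next)]
    return p_f * p_s


def joint(fs, ss):
    """P(F_1..F_t = fs, S_1..S_t = ss) for two equally long 0/1 tuples."""
    prob = P_INIT[(fs[0], ss[0])]
    for i in range(1, len(fs)):
        prob *= transition((fs[i - 1], ss[i - 1]), (fs[i], ss[i]))
    return prob


def evidence_probability(ss):
    """P(y_{1:t}): sum of the joint over all hidden sequences."""
    return sum(joint(fs, ss) for fs in product((1, 0), repeat=len(ss)))


def filtering(ss):
    """P(f_t | y_{1:t}): posterior of fever at the last observed time point."""
    hidden = product((1, 0), repeat=len(ss))
    numerator = sum(joint(fs, ss) for fs in hidden if fs[-1] == 1)
    return numerator / evidence_probability(ss)


def decoding(ss):
    """arg max over x_{1:t} of P(x_{1:t} | y_{1:t}).

    The evidence probability is the same for every candidate, so it is
    enough to maximize the joint.
    """
    return max(product((1, 0), repeat=len(ss)), key=lambda fs: joint(fs, ss))


if __name__ == "__main__":
    observations = (1, 1)                   # S observed true at times 1 and 2
    print(round(filtering(observations), 4), decoding(observations))
```

Smoothing and prediction follow the same pattern: the condition `fs[-1] == 1` is replaced by a test on an earlier position, or the enumeration is extended by unobserved future slices.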

In document Query Answering in Probabilistic Data and Knowledge Bases (Page 187-189)