
2. Bayesian Artificial Intelligence

2.3 Bayesian Networks

2.3.4 Dynamic Bayesian Networks

The previous section showed the advantages of Bayesian networks (BNs) as a simplified graphical representation of the joint probability distribution of a set of random variables defined over a given process. The resulting network can then be used to answer any probabilistic query once some evidence is available. For example, the BN of the lung cancer patient shown in figure 7 can be used to estimate the likelihood of a patient having lung cancer given

Bayesian Artificial Intelligence 64

that he/she is a smoker. However, the network assumes that all patients’ cases can be represented by the same variables connected in a fixed structure. This type of modelling is referred to as variable-based modelling [20,p. 199].

There are many applications where the process changes over time and the interest lies in capturing its dynamic behaviour. For example, when inferring patients’ states in an ICU, the state of a patient changes over time such that the next state depends on the current one and on some observed variables that are sampled at certain time intervals, such as heart rate, blood pressure and urine output. While an ordinary BN can model the relationship between the current patient state and the observed variables, it fails to capture how that state evolves over time and, in turn, fails to represent the distribution of the patient’s state over time.

In addition, there are other classes of problems where the structure changes with every case. Consider, for example, the modelling of genetic inheritance. Each family has its own members, who in turn have their own variables [20,p. 199]. Nonetheless, the way in which genes are inherited is the same for every family [20,p. 199]. This calls for a better way of representing dynamic processes than a purely variable-based one, such as Dynamic Bayesian Networks (DBNs). DBNs are models that work as templates to represent the temporal dynamics of an entire class of distributions in a compact way [20,p. 199]. The basic assumption of a DBN is that the world consists of successive temporal snapshots, each containing random variables that are either observed or hidden [1,p. 567], and that the way the system evolves over time, called the transition model, depends on a fixed number of previous states so as to prevent the transition probability


between the current and next state from depending on an unbounded number of previous states [1,p. 568]. This assumption is known as the Markov assumption, and a process that satisfies it is known as a Markov chain [1,p. 568]. If the state in a temporal transition model depends only on the immediately preceding state, the process is called a first-order Markov chain, whereas if it depends on the previous two states it is called a second-order Markov chain [1,p. 568]. The set of system states at a given time instant t is often designated as $X_t$ and the set of evidence variables as $E_t$ [1,p. 567]. Thus the transition model of a first-order Markov chain can be expressed as [1,p. 568]:

$$P(X_t \mid X_{0:t-1}) = P(X_t \mid X_{t-1}) \tag{64}$$

Whereas the transition model of a second order Markov chain can be expressed as [1,p. 568]:

$$P(X_t \mid X_{0:t-1}) = P(X_t \mid X_{t-2}, X_{t-1}) \tag{65}$$

In addition, the transition model is assumed to be fixed over time; that is, the conditional probability of the next state given the current one is the same at every time slice. Using the chain rule of probability, the temporal joint probability distribution of a Markovian process can be expressed as [20,p. 201]:

$$P(X_{0:T}) = P(X_0) \prod_{t=0}^{T-1} P(X_{t+1} \mid X_t) \tag{66}$$
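To make the factorisation of a first-order Markov chain concrete, the following minimal Python sketch evaluates the joint probability of one state sequence as the prior times a product of transition probabilities. The two-valued state, the names `PRIOR`, `TRANSITION` and `sequence_probability`, and every probability value are illustrative assumptions and are not taken from the source.

```python
from itertools import product

# Illustrative two-state chain; every number below is an assumption.
PRIOR = {True: 0.9, False: 0.1}                 # P(X_0)
TRANSITION = {                                  # P(X_{t+1} | X_t), fixed over time
    True: {True: 0.95, False: 0.05},
    False: {True: 0.20, False: 0.80},
}


def sequence_probability(states):
    """Return P(X_0, ..., X_T) = P(X_0) * prod_t P(X_{t+1} | X_t)."""
    prob = PRIOR[states[0]]
    for current, following in zip(states, states[1:]):
        prob *= TRANSITION[current][following]
    return prob


p = sequence_probability([True, True, False])   # 0.9 * 0.95 * 0.05

# Sanity check: the probabilities of all length-3 sequences sum to one.
total = sum(sequence_probability(seq) for seq in product([True, False], repeat=3))
```

Because the transition table is the same at every step, a sequence of any length can be scored with the same two tables, which is the sense in which the model acts as a template.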


The Markov assumption can be further extended to the evidence. Evidence variables, or observed variables, may in principle depend on previous variables as well as on the process states. However, with careful modelling, the current process states suffice to generate the observed variables, so that the Markov assumption for the evidence, known as the sensor model, can be written as [1,p. 568]:

$$P(E_t \mid X_{0:t}, E_{1:t-1}) = P(E_t \mid X_t) \tag{67}$$

Combining equations 66 and 67 yields the general temporal template model of a DBN that satisfies the Markov property, given in equation 68 [1,p. 569]:

$$P(X_{0:T}, E_{1:T}) = P(X_0) \prod_{t=1}^{T} P(X_t \mid X_{t-1}) \, P(E_t \mid X_t) \tag{68}$$

Equation 68 assumes that the evidence, or observations, starts to arrive from time slice 1. Hence at time slice 0 there is no evidence to condition on, and the only information available about the process is its unconditional probability; this is intuitive, given that the unconditional probability of a variable is its likelihood when no evidence is available. In addition, equation 68 shows that a DBN can be represented by three sub-models: the transition model $P(X_t \mid X_{t-1})$, which structures the evolution of the process variables between the current and next time slice; the sensor model $P(E_t \mid X_t)$, which connects the current process states with the observed sensors; and the unconditional probability distribution $P(X_0)$ of the process variables [1,p. 591]. Hence, it is more convenient to only plot


one slice of the DBN that shows the prior unconditional variables, the transition model, and the sensor model [1,p. 591]. Figure 10 shows a DBN representation of patient monitoring in an ICU.
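As a minimal sketch of how the three sub-models combine, the Python snippet below multiplies a prior, a transition table and a sensor table to obtain the joint probability of one state sequence together with its observations, with evidence starting at slice 1. The single heart-rate style reading, the names `PRIOR`, `TRANSITION`, `SENSOR` and `joint_probability`, and all numbers are assumptions chosen for illustration.

```python
# Hypothetical one-slice template; all probability values are assumptions.
PRIOR = {True: 0.9, False: 0.1}                         # P(X_0)
TRANSITION = {True: {True: 0.95, False: 0.05},          # P(X_t | X_{t-1})
              False: {True: 0.20, False: 0.80}}
SENSOR = {True: {"beating": 0.9, "non-beating": 0.1},   # P(E_t | X_t)
          False: {"beating": 0.1, "non-beating": 0.9}}


def joint_probability(states, evidence):
    """P(X_0..X_T, E_1..E_T); states has T+1 entries, evidence has T entries."""
    if len(states) != len(evidence) + 1:
        raise ValueError("need exactly one more state than evidence items")
    prob = PRIOR[states[0]]                             # slice 0 carries no evidence
    for t, reading in enumerate(evidence, start=1):
        prob *= TRANSITION[states[t - 1]][states[t]]    # transition model
        prob *= SENSOR[states[t]][reading]              # sensor model
    return prob


p = joint_probability([True, True, False], ["beating", "non-beating"])
```

Only the three tables are stored, regardless of how many time slices are evaluated, which is why plotting a single slice is sufficient to specify the whole model.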

The state variables are shown as ovals whereas evidence variables are shown as circles. For simplicity, the state of the patient (denoted S) can be either true or false. A true state indicates that the patient is alive, whereas false indicates that the patient is deteriorating. The sensor model consists of three variables: heart rate (H), blood pressure (B) and oxygen saturation (O). The probability tables are filled with arbitrary values to demonstrate what they would look like. Although figure 10 only specifies the probability table of the heart rate sensor, the remaining tables follow the same structure. Like the state variable, the heart rate sensor variable can assume either of two states: beating or non-beating. The event of the patient deteriorating while the heart rate sensor still shows beating is assigned a probability of 0.1 to account for the likelihood of sensor failure. This is the simplest way of modelling a sensor failure, commonly known as the transient failure model [1,p. 593].

Figure 10. Simple DBN for monitoring patients at ICU


The transient failure model makes it possible to distinguish between nonsensical sensor readings caused by sensor failure and reliable readings. If the predicted likelihood of a non-beating heart rate sensor state, given all the past patient states, is much less than the probability of a transient sensor failure, then the best explanation of the reading is that the sensor has failed [1,p. 593]. Equation 69 gives a mathematical criterion for detecting a heart rate sensor failure at time slice t:

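$$P(H_t = \text{non-beating} \mid S_{0:t-1}) \ll P(H_t = \text{non-beating} \mid S_t = \text{true}) \tag{69}$$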
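One way to make the comparison described above concrete is to weigh the two competing explanations of a non-beating reading with Bayes' rule: either the patient is alive and the sensor has glitched, or the patient is genuinely deteriorating. The sketch below is an illustration under stated assumptions, not the source's procedure: the function name `failure_posterior`, the input `belief_alive` (the predicted probability that the patient is alive before the reading arrives), and the values 0.1 and 0.9 for the two sensor probabilities are all hypothetical.

```python
def failure_posterior(belief_alive, p_failure=0.1, p_detect=0.9):
    """Posterior probability that a non-beating reading is a transient failure.

    belief_alive: predicted P(S_t = true) before the reading (assumed input).
    p_failure:    assumed P(H_t = non-beating | S_t = true), the transient failure rate.
    p_detect:     assumed P(H_t = non-beating | S_t = false).
    """
    failure = belief_alive * p_failure           # patient alive, sensor glitch
    genuine = (1.0 - belief_alive) * p_detect    # patient really deteriorating
    return failure / (failure + genuine)


# Deterioration is predicted to be very unlikely, so failure explains the reading best.
confident = failure_posterior(belief_alive=0.99)

# With an uncertain prediction, the reading is better explained by deterioration.
uncertain = failure_posterior(belief_alive=0.5)
```

When the predicted probability of deterioration is far below the failure rate, the posterior is dominated by the failure explanation, which matches the verbal criterion given above.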

While the transient model helps smooth out the recorded history of sensor readings by removing the less probable ones according to equation 69, it still fails in cases where the failure is persistent [1,p. 593], for example when the heart rate sensor lead attached to the patient is disconnected. In order for the DBN to accommodate persistent failure, a persistent sensor model needs to

Figure 11. Modified DBN of figure 10 (nodes shown: S0, S1, IsH0, IsH1, IsB0, IsB1, IsO0, IsO1)


be developed, in which the sensor itself has a hidden state that is inferred from the available evidence. If the state of a sensor is designated with the prefix Is, the DBN of figure 10 can be expanded to that of figure 11, which introduces three new states describing the conditions of the sensors.
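The idea of a hidden sensor state can be illustrated with a deliberately simplified Python sketch that tracks only the heart rate sensor state IsH. It assumes that a broken sensor stays broken and always reports non-beating, and it holds the belief about the patient fixed at `p_alive` rather than inferring S and IsH jointly as a full DBN would. All names and numbers are hypothetical.

```python
# Hypothetical numbers for the heart rate sensor of figure 11; all are assumptions.
P_STAY_WORKING = 0.999      # P(IsH_t = working | IsH_{t-1} = working)
P_BEATING_IF_ALIVE = 0.9    # P(H_t = beating | S_t = true,  sensor working)
P_BEATING_IF_NOT = 0.1      # P(H_t = beating | S_t = false, sensor working)


def reading_likelihood(reading, p_alive, working):
    """P(H_t = reading | IsH_t = working), marginalised over the patient state."""
    if not working:
        # Assumed failure behaviour: a broken sensor always reports non-beating.
        return 1.0 if reading == "non-beating" else 0.0
    p_beating = p_alive * P_BEATING_IF_ALIVE + (1.0 - p_alive) * P_BEATING_IF_NOT
    return p_beating if reading == "beating" else 1.0 - p_beating


def update_sensor_belief(p_working, reading, p_alive):
    """One predict-update step for P(IsH_t = working | readings so far)."""
    # Predict: a working sensor may break; a broken one stays broken (persistence).
    predicted = p_working * P_STAY_WORKING
    # Update: weigh both sensor states by how well they explain the reading.
    working_term = predicted * reading_likelihood(reading, p_alive, True)
    broken_term = (1.0 - predicted) * reading_likelihood(reading, p_alive, False)
    return working_term / (working_term + broken_term)


belief = 0.99
trace = []
for reading in ["non-beating"] * 4:
    belief = update_sensor_belief(belief, reading, p_alive=0.95)
    trace.append(belief)
```

Under these assumptions a single odd reading barely moves the belief that the sensor works, whereas a run of identical non-beating readings drives it down steadily, which is the behaviour a persistent failure model is meant to capture.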

Inference in a DBN can also be classified as exact or approximate, and the same techniques used to query an ordinary BN can also be used with a DBN [1,p. 595]. However, the basic model of figure 10 or 11 needs to be replicated, or unrolled, until it covers all the available observations [1,p. 595]. Figure 12 shows the unrolling of figure 10 to time slice 3, where the three observation nodes are combined into one node for simplicity of drawing.

There are many inference techniques proposed by researchers to reduce the amount of computation required for probabilistic querying. Reference [20] discusses some of them in more detail.

Figure 12. DBN of figure 10 rolled to time slice 3 (observation nodes shown: H1 B1 O1, H2 B2 O2, H3 B3 O3)
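As an illustration of exact inference over an unrolled network, the sketch below applies recursive filtering (the forward algorithm), one standard technique and not necessarily the one used in the cited references. Each loop iteration corresponds to one unrolled time slice. For brevity it uses a single heart rate reading per slice instead of the three sensors of figure 10; the table names and all probability values are assumptions.

```python
# Hypothetical template tables; all probability values are assumptions.
PRIOR = {True: 0.9, False: 0.1}                         # P(S_0)
TRANSITION = {True: {True: 0.95, False: 0.05},          # P(S_t | S_{t-1})
              False: {True: 0.20, False: 0.80}}
SENSOR = {True: {"beating": 0.9, "non-beating": 0.1},   # P(H_t | S_t)
          False: {"beating": 0.1, "non-beating": 0.9}}


def filter_states(readings):
    """Return P(S_t | H_1..H_t) for t = 1..len(readings), one dict per slice."""
    belief = dict(PRIOR)
    history = []
    for reading in readings:
        # Predict: push the belief through the transition model (one unrolled slice).
        predicted = {s: sum(belief[prev] * TRANSITION[prev][s] for prev in belief)
                     for s in (True, False)}
        # Update: weight by the sensor model and renormalise.
        unnormalised = {s: predicted[s] * SENSOR[s][reading] for s in predicted}
        norm = sum(unnormalised.values())
        belief = {s: value / norm for s, value in unnormalised.items()}
        history.append(belief)
    return history


beliefs = filter_states(["beating", "beating", "non-beating"])
```

The belief after each slice depends only on the previous belief and the new reading, so the unrolled network never has to be held in memory all at once.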


In document Design and implementation of advanced Bayesian networks with comparative probability (Page 80-87)