Probabilistic Inference in Bayesian Networks and Influence Diagrams


2.2 Probabilistic Graphical Models

2.2.5 Probabilistic Inference in Bayesian Networks and Influence Diagrams

Bayesian networks and influence diagrams have the ability to answer different types of queries about the nodes they contain. Making queries to a BN or ID is a form of probabilistic reasoning or probabilistic inference [68]. Normally, the inference starts when we have evidence, which means we observe a variable or set of variables, and we want to know the probability of other variables given the evidence. For example, in the situation described by figure 2.2.4 we observe that J = 1 and want to know the probability of R = 0; we represent this query as P(R = 0 | J = 1). This operation is known as conditioning. Another query is marginalization, in which we look for the probability of a variable not conditioned on the other variables. In the example shown in figure 2.2.4 we could marginalize to obtain P(J), P(T), P(S), or P(R). Other operations that BNs allow are: most probable explanation (MPE), maximum a posteriori probability (MAP), and sensitivity analysis [69, 72, 74].

Figure 2.2.4: Example of Bayesian Network

Let us have a BN composed of the sets E and Q. The set E contains the evidence variables, whereas Q contains the remainder of the variables of the BN that are not evidence. When E = e, MPE finds the instantiation q of Q that maximizes the probability P(q | e).


That is, MPE looks for the instantiation of all the non-evidence variables Q that explains E = e with the highest probability. MAP does the same, but only for a subset of the non-evidence variables.
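To make the distinction concrete, the following minimal sketch enumerates a tiny hypothetical two-node network S → R with made-up CPT values (it is not the network of figure 2.2.4) and, with empty evidence for simplicity, compares the MPE assignment over all variables with the MAP query over S alone:

```python
from itertools import product

# Tiny hypothetical two-node network S -> R (NOT the network of figure 2.2.4);
# the CPT values below are made up purely for illustration.
p_s = {0: 0.55, 1: 0.45}                      # P(S)
p_r_given_s = {(0, 0): 0.5, (1, 0): 0.5,      # P(R | S = 0), keys are (r, s)
               (0, 1): 0.9, (1, 1): 0.1}      # P(R | S = 1)

def joint(s, r):
    return p_s[s] * p_r_given_s[(r, s)]

# No evidence here, so Q = {S, R}.
# MPE: the single assignment to ALL of Q with the highest joint probability.
mpe = max(product([0, 1], repeat=2), key=lambda sr: joint(*sr))

# MAP over S only: sum R out first, then maximize over S.
map_s = max([0, 1], key=lambda s: sum(joint(s, r) for r in [0, 1]))

print("MPE (s, r):", mpe)    # (1, 0): joint probability 0.45 * 0.9 = 0.405
print("MAP for S:", map_s)   # 0: P(S = 0) = 0.55 > P(S = 1) = 0.45
```

In this toy case the MPE assignment sets S = 1 while the MAP query over S alone prefers S = 0, which is why the two query types are listed separately.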

Out of the aforementioned types of queries, conditioning, or the conditional probability query, is the most common [69]. For solving this type of query there exist exact and approximate inference algorithms. For this dissertation I use an exact inference algorithm called variable elimination (VE). The main idea behind the VE algorithm is to eliminate the variables that are neither query nor evidence. The VE algorithm takes the factorized joint probability distribution and sums out the variables that it needs to eliminate. This operation is also called factor marginalization.

The factorized form of the joint probability distribution obtained with the chain rule for Bayesian networks - equation 2.2.7 - can be seen as a product of factors. A factor φ is a function that maps a set of variables X to a real number, φ : Val(X) → ℝ, where Val(X) means the values of the variables in X and → means maps to. The set X is called the scope of the factor; therefore, Scope[φ] = X. Equation 2.2.18 shows the chain rule for Bayesian networks as a product of factors:

P(X_1, …, X_n) = ∏_{i=1}^{n} φ_{X_i} = ∏_{i=1}^{n} P(X_i | Pa_{X_i}),   (2.2.18)

where the conditional probability P(X_i | Pa_{X_i}) is represented by the factor φ_{X_i}. The scope of φ_{X_i} is the variable X_i and its parents Pa_{X_i}.
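As a small illustration of equation 2.2.18, the sketch below builds the joint distribution of a hypothetical three-node chain A → B → C as a product of one factor per node; the CPT values are arbitrary and only serve to show the construction:

```python
from itertools import product

# Hypothetical chain A -> B -> C; one factor per node, as in equation 2.2.18.
phi_A = {(0,): 0.3, (1,): 0.7}                                 # scope {A}: P(A)
phi_B = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6}   # scope {A, B}: P(B | A), key (a, b)
phi_C = {(0, 0): 0.2, (0, 1): 0.8, (1, 0): 0.5, (1, 1): 0.5}   # scope {B, C}: P(C | B), key (b, c)

# P(a, b, c) = phi_A(a) * phi_B(a, b) * phi_C(b, c)
joint = {(a, b, c): phi_A[(a,)] * phi_B[(a, b)] * phi_C[(b, c)]
         for a, b, c in product([0, 1], repeat=3)}

print(sum(joint.values()))   # 1.0 (up to rounding): the product of factors is the full joint
```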


Let us consider a set of variables X and a variable Y ∉ X, with X ∪ {Y} being the scope of the factor φ: φ(X, Y). The factor marginalization of Y in φ, denoted ∑_Y φ(X, Y), is equivalent to a factor ψ over X such that [69]:

ψ(X) = ∑_Y φ(X, Y).   (2.2.19)

Another name for the operation in 2.2.19 is summing out Y in φ. In this operation we should only sum up combinations where the states of X coincide.
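A minimal sketch of the summing-out operation in 2.2.19, assuming a factor is stored as a pair (scope, table) where the table maps value tuples, ordered as in the scope, to real numbers:

```python
# Sum a variable out of a factor (eq. 2.2.19).
def sum_out(factor, var):
    scope, table = factor
    idx = scope.index(var)                      # position of the variable to remove
    new_scope = scope[:idx] + scope[idx + 1:]
    new_table = {}
    for values, weight in table.items():
        key = values[:idx] + values[idx + 1:]   # drop var; entries whose remaining
        new_table[key] = new_table.get(key, 0.0) + weight   # states coincide add up
    return (new_scope, new_table)

# phi(X, Y) with scope (X, Y); the values are arbitrary illustrative numbers.
phi = (("X", "Y"), {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.2})
psi = sum_out(phi, "Y")          # psi(X) = sum_Y phi(X, Y)
print(psi)                       # (('X',), {(0,): 0.4, (1,): 0.6}) up to rounding
```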

The factor product and summation operations have properties equivalent to those of the product and summation over numbers [69]. Both operations are commutative: φ_1 · φ_2 = φ_2 · φ_1 and ∑_X ∑_Y φ = ∑_Y ∑_X φ; the product is associative: (φ_1 · φ_2) · φ_3 = φ_1 · (φ_2 · φ_3); and they are interchangeable:

∑_X (φ_1 · φ_2) = φ_1 · ∑_X φ_2,   (2.2.20)

if X ∉ Scope[φ_1]. This property allows us to "push in" the summation, so that the summation is performed only on the subset of factors that contain the variable we want to eliminate. For instance, in 2.2.20, since we want to eliminate X, we push in the summation to sum only over φ_2 because φ_1 does not contain X.
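A quick numeric check of property 2.2.20, using two small factors with arbitrary values in which X is not in the scope of φ_1:

```python
# phi1 has scope {Y} (so X is not in its scope) and phi2 has scope {X, Y}.
phi1 = {0: 2.0, 1: 5.0}                                       # phi1(y)
phi2 = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.2}   # phi2(x, y)

for y in [0, 1]:
    lhs = sum(phi1[y] * phi2[(x, y)] for x in [0, 1])   # sum_X (phi1 . phi2)
    rhs = phi1[y] * sum(phi2[(x, y)] for x in [0, 1])   # phi1 . sum_X phi2
    print(y, lhs, rhs)   # the two columns agree because X is not in Scope[phi1]
```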

The variable elimination (VE) algorithm takes advantage of the aforementioned properties. Figure 2.2.5 shows a very simple Bayesian network. The joint probability distribution for this Bayesian network is P(A, B, C, D) = φ_A · φ_B · φ_C · φ_D. If, for instance, we want to know the marginal probability of D, P(D), we apply factor marginalization: P(D) = ∑_C ∑_B ∑_A P(A, B, C, D). By applying property 2.2.20 to it we get:

P(D) = ∑_C ∑_B ∑_A φ_A · φ_B · φ_C · φ_D = ∑_C φ_D · ∑_B φ_C · ∑_A φ_B · φ_A.   (2.2.21)

The procedure in 2.2.21 can be summarized as:

∑_Z ∏_{φ ∈ Φ} φ.   (2.2.22)

Figure 2.2.5: Simple BN for illustration

The expression 2.2.22 is also called the sum-product inference task [69]. The variable elimination (VE) algorithm performs this inference task by summing out variables one at a time, using property 2.2.20. When summing out a variable, we multiply the factors that contain that variable to obtain a product factor. The next step is to sum out the variable from this product factor to generate a new factor, which goes to the next iteration as part of the new set of factors that the VE algorithm will be applied on. The VE algorithm iterates until it removes all the variables it aims to eliminate. We can summarize the VE algorithm as follows [68, 69]:


The VE algorithm receives a set of factors Φ, a set of variables to eliminate Z, and an ordering ≺ on Z. If the set Z is {Z_1, …, Z_k}, let the ordering be Z_i ≺ Z_j if and only if i < j. The set Z encompasses those variables that are neither query nor evidence. We refer to this algorithm as the procedure Sum-Product-VE(Φ, ≺, Z). The procedure Sum-Product-VE(Φ, ≺, Z) follows these steps:

For i = 1, …, k do
    Φ ← Sum-Product-Eliminate-Var(Φ, Z_i)
φ* ← ∏_{φ ∈ Φ} φ after completing the k-th iteration.
Return φ* at the end.

The procedure Sum-Product-Eliminate-Var(Φ, Z_i) is performed for each of the iterations i = 1, …, k. This procedure receives the set of factors Φ and the variable to be eliminated Z; then it performs the following operations (a small runnable sketch follows the list):

1. Form a set Φ' with the factors that have Z in their scope: Φ' ← {φ ∈ Φ : Z ∈ Scope[φ]}

2. Form a set Φ'' with the factors that do not have Z in their scope, which is the set Φ without Φ': Φ'' ← Φ − Φ'

3. Multiply the factors in the set Φ' and save the result in the factor ψ: ψ ← ∏_{φ ∈ Φ'} φ

4. Add up the entries of the factor ψ where Z varies; this action eliminates Z from that factor. After eliminating Z from ψ, save the result in the factor τ: τ ← ∑_Z ψ

5. Return the union of the set Φ'' and the factor τ: return Φ'' ∪ {τ}
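Putting the two procedures together, here is a minimal, self-contained sketch of Sum-Product-VE over binary variables. The factor representation (scope, table) and the chain A → B → C → D with made-up CPT values are assumptions for illustration only; the concrete network and numbers of figure 2.2.5 are not reproduced here.

```python
from itertools import product
from functools import reduce

def factor_product(f1, f2):
    """Multiply two factors over the union of their scopes (binary variables)."""
    s1, t1 = f1
    s2, t2 = f2
    scope = s1 + tuple(v for v in s2 if v not in s1)
    table = {}
    for values in product([0, 1], repeat=len(scope)):
        assignment = dict(zip(scope, values))
        table[values] = (t1[tuple(assignment[v] for v in s1)]
                         * t2[tuple(assignment[v] for v in s2)])
    return (scope, table)

def sum_out(factor, var):
    """Marginalize var out of a factor (eq. 2.2.19)."""
    scope, table = factor
    idx = scope.index(var)
    new_scope = scope[:idx] + scope[idx + 1:]
    new_table = {}
    for values, weight in table.items():
        key = values[:idx] + values[idx + 1:]
        new_table[key] = new_table.get(key, 0.0) + weight
    return (new_scope, new_table)

def sum_product_eliminate_var(factors, z):
    """Steps 1-5 above; assumes at least one factor mentions z."""
    with_z = [f for f in factors if z in f[0]]          # Phi'
    without_z = [f for f in factors if z not in f[0]]   # Phi''
    psi = reduce(factor_product, with_z)                # step 3
    tau = sum_out(psi, z)                               # step 4
    return without_z + [tau]                            # step 5

def sum_product_ve(factors, ordering):
    """Eliminate the variables in `ordering`, then multiply what remains."""
    for z in ordering:
        factors = sum_product_eliminate_var(factors, z)
    return reduce(factor_product, factors)              # phi*

# Hypothetical chain A -> B -> C -> D with made-up CPT values.
phi_A = (("A",), {(0,): 0.6, (1,): 0.4})
phi_B = (("A", "B"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8})
phi_C = (("B", "C"), {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.1, (1, 1): 0.9})
phi_D = (("C", "D"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.3, (1, 1): 0.7})

# P(D): eliminate A, B, C, which are neither query nor evidence.
print(sum_product_ve([phi_A, phi_B, phi_C, phi_D], ["A", "B", "C"]))
```

Running this prints φ* with scope {D}, i.e. the marginal P(D); its entries sum to 1 because no evidence has been introduced yet.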

The VE algorithm also applies when introducing evidence. Let us have a Bayesian network B that parameterizes the set of variables X, a set of query variables Y, and evidence E = e. When introducing evidence, the task is to compute P(Y, e). To execute this task, the factors are reduced by E = e and the set of variables to eliminate is Z = X − Y − E before applying the Sum-Product-VE(Φ, ≺, Z) procedure to the network B. The factor φ*, which comes from Sum-Product-VE(Φ, ≺, Z), is P(Y, e); dividing it by the normalizing constant α yields the conditional probability P(Y | e). This whole procedure whereby P(Y, e) is obtained is called the Cond-Prob-VE(B, Y, E = e) procedure. This procedure encompasses the following steps [69] (a small worked example follows the list):

1. Φ ← factors parameterizing B

2. Replace each φ ∈ Φ by the reduced factor φ[E = e]

3. Set an elimination ordering ≺

4. Z ← X − Y − E

5. φ* ← Sum-Product-VE(Φ, ≺, Z)

6. α ← ∑_{y ∈ Val(Y)} φ*(y)

7. Return α, φ*
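The steps above can be traced on a small example. The sketch below assumes a hypothetical chain A → B → C with made-up CPT values, query Y = {A}, and evidence C = 1; the factor reduction, the elimination of Z = {B}, and the normalization by α correspond to steps 2 and 4-6:

```python
p_a = {0: 0.6, 1: 0.4}                                      # P(A)
p_b = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}  # P(B | A), key (a, b)
p_c = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.1, (1, 1): 0.9}  # P(C | B), key (b, c)

c_obs = 1                                                    # evidence E = e: C = 1
p_c_reduced = {b: p_c[(b, c_obs)] for b in [0, 1]}           # step 2: phi_C[C = 1], scope {B}

# steps 4-5: eliminate Z = {B} and multiply, giving phi*(a) = P(A = a, C = 1)
phi_star = {a: p_a[a] * sum(p_b[(a, b)] * p_c_reduced[b] for b in [0, 1])
            for a in [0, 1]}

alpha = sum(phi_star.values())                               # step 6: alpha = P(C = 1)
posterior = {a: phi_star[a] / alpha for a in [0, 1]}         # P(A | C = 1) = phi* / alpha
print(phi_star, alpha, posterior)
# {0: 0.372, 1: 0.328}  0.7  {0: ~0.531, 1: ~0.469}
```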


CHAPTER 3

BAYESIAN APPROACH FOR COGNITIVE RADIO
