
2.2 Decomposition Methods in Graphical Models

2.2.3 Mini-bucket Tree and Join-graph Decomposition

Exact inference with tree clustering schemes is intractable for many practical problems because its complexity is exponential in the size of the largest cluster, so designing approximate inference algorithms with bounded complexity is a main research topic in graphical models. Our primary interest is in decomposition methods that generate upper bounds through graph decompositions with smaller clusters and therefore lower complexity.

Approximate decomposition schemes such as mini-bucket elimination [Dechter and Rish, 2003] or join-graph decomposition [Mateescu et al., 2010] use mini-bucket relaxations to decompose a join-tree into a possibly loopy graph. Such mini-bucket based graph decomposition schemes limit the maximum cluster size with a bounding parameter, the i-bound, so that the complexity of one iteration of message passing is bounded exponentially by the i-bound.

Consider a bucket $B_X$ that collects the functions having a variable $X$ in their scope. The mini-bucket relaxation partitions the bucket into $P$ mini-buckets $\{B_{X^1}, \dots, B_{X^P}\}$ such that the i-bound limits the scope size of each mini-bucket by $|sc(B_{X^p})| \le i + 1$. Then, the message $\lambda_X$ from the bucket $B_X$ can be bounded by the weighted mini-bucket relaxation [Dechter and Rish, 2003, Liu and Ihler, 2011].

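Theorem 2.6 (Weighted Mini-bucket Relaxation).

$$\lambda_X := \sum_{X}^{w_X} \prod_{F \in B_X} F \;\le\; \prod_{p=1}^{P} \sum_{X^p}^{w_{X^p}} \prod_{F \in B_{X^p}} F, \tag{2.25}$$

$$w_X = \sum_{p=1}^{P} w_{X^p}, \qquad w_{X^p} > 0, \tag{2.26}$$

$$X = X^p, \qquad p = 1, \dots, P, \tag{2.27}$$

where $\sum_{X}^{w} f(X) := \big(\sum_{X} f(X)^{1/w}\big)^{w}$ denotes the weighted (power) sum of Liu and Ihler [2011].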

If a variable $X$ is eliminated by maximization, we use $w_X = 0^+$, and all the weights $\{w_{X^1}, \dots, w_{X^P}\}$ in Eq. (2.26) are arbitrarily close to zero. If $X$ is eliminated by summation, $w_X = 1$, and any combination of weights satisfying Eq. (2.26) generates a valid upper bound.

Dechter and Rish [2003] presented the mini-bucket relaxation for the maximization and summation operations, and Liu and Ihler [2011] generalized it to the weighted mini-buckets shown in Eq. (2.25). Eq. (2.27) defines a set of equality constraints enforcing that the variable $X$ at the bucket $B_X$ and its duplicates $X^p$ at each mini-bucket $B_{X^p}$ are the same.
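The bound can be checked numerically on a small case. The following is a minimal Python sketch, assuming the weighted sum is the power sum $(\sum_x f(x)^{1/w})^w$ as in Liu and Ihler [2011] and that $X$ is eliminated by summation ($w_X = 1$). The tables `f1`, `f2` and the helper names `power_sum`, `exact_message`, and `wmb_bound` are hypothetical and chosen only for illustration.

```python
def power_sum(values, w):
    """Weighted (power) sum: (sum_x f(x)^(1/w))^w for a weight w > 0."""
    return sum(v ** (1.0 / w) for v in values) ** w


def exact_message(f1, f2):
    """Exact elimination by summation (w_X = 1) of the full bucket {f1, f2}."""
    return sum(a * b for a, b in zip(f1, f2))


def wmb_bound(f1, f2, w1):
    """Relaxed value with one mini-bucket per function and weights w1 + w2 = 1."""
    w2 = 1.0 - w1
    return power_sum(f1, w1) * power_sum(f2, w2)


# Hypothetical non-negative tables over a binary variable X.
f1 = [0.5, 2.0]
f2 = [3.0, 1.0]

exact = exact_message(f1, f2)
for w1 in (0.1, 0.3, 0.5, 0.7, 0.9):
    bound = wmb_bound(f1, f2, w1)
    print(f"w1={w1:.1f}  bound={bound:.4f}  exact={exact:.4f}")
```

Every weight split should give a value no smaller than the exact message, which is an instance of Hölder's inequality. As the weight approaches zero, `power_sum` approaches the maximum of the table, which matches the use of $w_X = 0^+$ for maximization.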

The idea of decomposing a coupled problem (bucket) into smaller problems (mini-buckets) appears frequently in the context of optimization, and we will revisit this mini-bucket relaxation from the optimization perspective later when we review variational decomposition bounds.

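Algorithm 2.1 WMB(i): Weighted Mini-bucket Elimination

Require: Graphical model $M_G := \langle X, D, F \rangle$, total variable elimination ordering $O$, i-bound
Ensure: Mini-bucket tree decomposition $T_{MB} := \langle T(C, E), \chi, \psi \rangle$, upper bound UB to a mixed inference task

 1: UB ← 1.0    ▷ Initialize UB for the multiplicative functions
 2: for each variable $X_i \in O$ do
 3-5:   Collect the functions with $X_i$ in their scope into the bucket $B_{X_i}$ and partition it into $P$ mini-buckets $\{B_{X_i^1}, \dots, B_{X_i^P}\}$ with $|sc(B_{X_i^p})| \le i + 1$
 6:     Assign positive weights $w_{X_i^p}$ to each variable $X_i^p$ such that $w_{X_i} = \sum_{p=1}^{P} w_{X_i^p}$
 7:     for each mini-bucket $B_{X_i^p} \in \{B_{X_i^1}, \dots, B_{X_i^P}\}$ do
 8:         Add a cluster node $B_{X_i^p}$ to $T_{MB}$    ▷ Structure mini-bucket tree
 9:         $\psi(B_{X_i^p}) \leftarrow \{F_j \mid F_j \in B_{X_i^p}\}$    ▷ Update node labeling functions
10:         $\chi(B_{X_i^p}) \leftarrow \bigcup_{F_j \in B_{X_i^p}} sc(F_j)$
11:         Connect two cluster nodes $U, V$ in $T$ if $\psi(V)$ contains the outgoing message from $U$
12:         $\lambda_{X_i^p} \leftarrow \sum_{X_i^p}^{w_{X_i^p}} \prod_{F_j \in B_{X_i^p}} F_j$    ▷ Compute weighted mini-bucket message
13:         if $sc(\lambda_{X_i^p})$ is empty then
14:             UB ← UB · $\lambda_{X_i^p}$    ▷ Multiply constant message to upper bound
15:         else
16:             $F \leftarrow F \cup \{\lambda_{X_i^p}\}$
return UB and message propagated mini-bucket tree $T(C, E)$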

By using the weighted mini-bucket relaxation, we can modify the bucket elimination algorithm into the weighted mini-bucket elimination algorithm in a straightforward manner. Algorithm 2.1 shows the weighted mini-bucket elimination algorithm for a mixed inference task with the combination operator ⊗ being the multiplication between functions. From line 3 to line 6, a bucket $B_{X_i}$ is partitioned into $P$ mini-buckets, and each mini-bucket cluster is added to the mini-bucket tree decomposition from line 8 to line 11. From line 12 to line 16, the algorithm computes the outgoing messages from the mini-buckets and accumulates constant messages to yield UB; these steps can be skipped when we only need to structure the mini-bucket tree decomposition. Since the scope size of every cluster is bounded by the i-bound, the space and time complexity of bounding the inference task by the mini-bucket elimination algorithm is exponential in the i-bound.
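The elimination loop can be sketched in Python as follows. This is a simplified illustration under several assumptions, not the algorithm as stated: it only computes the bound (no tree construction), it handles pure summation ($w_X = 1$) rather than a mixed task, it partitions each bucket greedily, and it assigns uniform weights $1/P$. The factor representation (a scope tuple plus a table dictionary) and the names `multiply`, `eliminate`, and `wmb` are hypothetical.

```python
from itertools import product


def multiply(factors, domains):
    """Product of factors as one factor over the union of their scopes."""
    scope = tuple(sorted({v for s, _ in factors for v in s}))
    table = {}
    for assignment in product(*(domains[v] for v in scope)):
        env = dict(zip(scope, assignment))
        value = 1.0
        for s, t in factors:
            value *= t[tuple(env[v] for v in s)]
        table[assignment] = value
    return scope, table


def eliminate(factor, var, w):
    """Eliminate var from a factor by the power sum with weight w."""
    scope, table = factor
    idx = scope.index(var)
    acc = {}
    for assignment, value in table.items():
        key = assignment[:idx] + assignment[idx + 1:]
        acc[key] = acc.get(key, 0.0) + value ** (1.0 / w)
    return scope[:idx] + scope[idx + 1:], {k: v ** w for k, v in acc.items()}


def wmb(factors, domains, order, i_bound):
    """Upper bound on the sum over all variables of the product of factors."""
    factors = list(factors)
    ub = 1.0
    for var in order:
        bucket = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        if not bucket:
            ub *= len(domains[var])  # summing the constant 1 over var
            continue
        minis = []  # greedy partition into mini-buckets of scope size <= i_bound + 1
        for f in bucket:
            for m in minis:
                joint = {v for s, _ in m for v in s} | set(f[0])
                if len(joint) <= i_bound + 1:
                    m.append(f)
                    break
            else:
                minis.append([f])
        w = 1.0 / len(minis)  # uniform positive weights summing to w_X = 1
        for m in minis:
            scope, table = eliminate(multiply(m, domains), var, w)
            if not scope:
                ub *= table[()]  # constant message goes into the bound
            else:
                factors.append((scope, table))
    return ub


# Hypothetical model: three pairwise functions over binary variables A, B, C.
domains = {"A": [0, 1], "B": [0, 1], "C": [0, 1]}
f_ab = (("A", "B"), {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 0.5, (1, 1): 1.5})
f_bc = (("B", "C"), {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0})
f_ac = (("A", "C"), {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 2.0, (1, 1): 1.0})
model, order = [f_ab, f_bc, f_ac], ["C", "B", "A"]

exact = wmb(model, domains, order, i_bound=2)    # no bucket is split
relaxed = wmb(model, domains, order, i_bound=1)  # bucket of C is split in two
print(exact, relaxed)
```

With `i_bound=2` the bucket of C fits in one cluster, so the sketch reduces to exact bucket elimination. With `i_bound=1` the bucket of C is split into two mini-buckets and the returned value should be an upper bound on the exact one.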

A join-graph decomposition [Mateescu et al., 2010] refines a join-tree into a join-graph with smaller clusters.

Definition 2.29 (Join-graph Decomposition $G_{JG}$). A join-graph decomposition of a graphical model $M := \langle X, D, F \rangle$ is a tuple $G_{JG} := \langle G, \chi, \psi \rangle$, where $G := \langle C, S \rangle$ is a graph with nodes $C$ and edges $S$, and $\chi$ and $\psi$ are labeling functions: $\chi$ maps a node $C \in C$ to a set of variables $\chi(C) = X_C \subseteq X$, and $\psi$ allocates each function $F_i \in F$ exclusively to a node $C \in C$ such that $sc(F_i) \subseteq X_C$. An edge $(C_i, C_j) \in S$ is associated with a subset of the variables shared between the two clusters, $\chi(C_i) \cap \chi(C_j)$, called the separator $S_{C_i,C_j}$. The labeling function $\chi$ must satisfy the running intersection property: for each variable $X_i \in X$, the set $\{C \in C \mid X_i \in \chi(C)\}$ induces a connected subgraph.
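The running intersection property can be tested directly from the cluster labels. The following Python sketch follows the definition above, checking connectivity of the subgraph induced by the clusters that contain each variable; it ignores separator labels. The function name `has_running_intersection` and the cluster names in the example are hypothetical.

```python
from collections import deque


def has_running_intersection(chi, edges):
    """chi maps a cluster name to its variable set; edges are (u, v) pairs."""
    adjacency = {c: set() for c in chi}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    variables = set().union(*chi.values())
    for x in variables:
        holders = {c for c, scope in chi.items() if x in scope}
        start = next(iter(holders))
        seen, queue = {start}, deque([start])
        # Breadth-first search restricted to clusters that contain x.
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor in holders and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        if seen != holders:
            return False
    return True


# Hypothetical clusters: C1 and C2 both contain C but are not adjacent in the tree.
chi = {"C1": {"A", "C"}, "C2": {"B", "C"}, "B": {"A", "B"}, "A": {"A"}}
tree_edges = [("C1", "A"), ("C2", "B"), ("B", "A")]

print(has_running_intersection(chi, tree_edges))                   # False
print(has_running_intersection(chi, tree_edges + [("C1", "C2")]))  # True
```

In this example the variable C appears in two clusters with no path between them that stays inside clusters containing C, so the property fails until the edge between `C1` and `C2` is added.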

A valid join-graph can be systematically obtained from a mini-bucket tree by connecting the mini-buckets $\{B_{X_i^1}, \dots, B_{X_i^P}\}$ of each bucket in a chain. The separators between mini-buckets of the same bucket $B_X$ contain the single variable $\{X\}$, and the scope of every other separator $S_{C_i,C_j}$ is determined by the scope of the message sent from mini-bucket cluster $C_i$ to $C_j$. The join-graph structured from a mini-bucket tree can be simplified further by merging adjacent clusters when the scope of one cluster subsumes the other. A nice property of a join-graph based on a mini-bucket tree is that the separators are minimal, in the sense that removing any variable from a separator renders the join-graph decomposition invalid. Message propagation over a join-graph aims to provide an approximation that improves over the belief propagation algorithm [Pearl, 1988]. However, unlike mini-bucket tree elimination, it does not guarantee an upper bound to the optimal solution. Still, we can propagate messages iteratively until convergence, if it happens, or until a time limit is reached, with the space and time complexity of each iteration bounded exponentially by the i-bound.
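The simplification step described above, merging adjacent clusters when one scope subsumes the other, can be sketched in Python as follows. This is an illustrative sketch only: it tracks cluster scopes and edges but not the function labeling $\psi$ or the separator labels, and the name `merge_subsumed` and the example clusters are hypothetical.

```python
def merge_subsumed(chi, edges):
    """Repeatedly absorb a cluster into an adjacent cluster whose scope contains it."""
    chi = {c: set(scope) for c, scope in chi.items()}
    edges = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        for edge in list(edges):
            u, v = tuple(edge)
            if chi[u] <= chi[v]:
                small, big = u, v
            elif chi[v] <= chi[u]:
                small, big = v, u
            else:
                continue
            # Redirect the neighbors of the absorbed cluster to the larger one.
            new_edges = set()
            for e in edges:
                if small in e:
                    other = next(iter(e - {small}))
                    if other != big:
                        new_edges.add(frozenset((other, big)))
                else:
                    new_edges.add(e)
            edges = new_edges
            del chi[small]
            changed = True
            break  # restart the scan because the edge set changed
    return chi, edges


# Hypothetical join-graph: mini-buckets C1 and C2 chained, plus clusters B and A.
chi = {"C1": {"A", "C"}, "C2": {"B", "C"}, "B": {"A", "B"}, "A": {"A"}}
edges = [("C1", "A"), ("C2", "B"), ("B", "A"), ("C1", "C2")]

merged_chi, merged_edges = merge_subsumed(chi, edges)
print(sorted(merged_chi), sorted(sorted(e) for e in merged_edges))
```

In this example the cluster `A` is subsumed by both of its neighbors, and absorbing it into either one leaves the same three clusters connected in a cycle.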

Example 2.4. Figure 2.4a shows an example of a mini-bucket tree that approximates the bucket tree shown in Figure 2.3a. The bucket $B_C$ is divided into two mini-buckets $\{B_{C^1}, B_{C^2}\}$ because the i-bound is 1. Figure 2.4b is a join-graph structured from the mini-bucket tree by adding a separator with the single variable $\{C\}$ between clusters $B_{C^1}$ and $B_{C^2}$, which introduces a cycle. Figure 2.4c is a simplified join-graph that merges clusters subsumed by adjacent ones.

Figure 2.4: Example of Mini-bucket Tree and Join-graph Decomposition. (a) Mini-bucket Tree (i=1) (b) Join-graph (i=1) (c) Simplified Join-graph

The mini-bucket tree and join-graph decompositions provide a structural decomposition of an input graphical model for approximate inference algorithms. In the following, we review the variational inference framework, which derives message passing algorithms that send messages over such graph decompositions. The message passing algorithms derived by variational inference use region graphs, which can be any graph reflecting the structure of the graphical model. The join-graph is a reasonable choice for the region graph since it allows a systematic construction procedure for generating higher-order region graphs that can improve the quality of approximation with increasing i-bounds, leading to an anytime property.