Binary Decision Diagrams - A Probabilistic Prolog and its Applications

A binary decision diagram (BDD) [Bryant, 1986] is a data structure that graphically represents a Boolean function. Roughly speaking, a BDD is a rooted directed acyclic graph, where nodes correspond to Boolean variables, edges to truth value assignments to their source node’s variable, and the two designated sink nodes, called 0- and 1-terminal node (or 0- and 1-leaf), to the function values 0 (or false) and 1 (or true), respectively. Each path through such a diagram thus encodes a truth value assignment together with the corresponding function value. While various variants of such diagrams exist, in this thesis, we will use the term BDD to refer to reduced ordered binary decision diagrams. As the name states, in this variant, all paths through the diagram respect the same variable ordering, and furthermore, the diagram is reduced as much as possible to achieve maximal compression. We will now discuss the basics of BDDs by means of an example.

Example 2.10 Consider the propositional formula x ∨ (y ∧ z), defining a Boolean function over three variables. Alternatively, this function could be specified by means of a truth table, that is, by listing all truth assignments to the variables together with the truth value of the formula. In Figure 2.3(a), such an explicit encoding is graphically depicted as a Boolean decision tree, where each branch corresponds to one assignment. Leaves are labeled with the truth value of the formula under the branch’s assignment. All edges are implicitly directed top-down. Dotted edges denote the assignment of 0 to the variable of their source node, solid ones that of 1. Corresponding child nodes are called low and high child, respectively. The leftmost branch thus assigns 0 to all three variables, the next one assigns 0 to both x and y, but 1 to z, and so forth. Clearly, already for such a small example, this encoding contains redundant information. For instance, once x is set to 1, the truth value of the entire formula is determined and the remaining tests listed in the corresponding subtree are unnecessary. The key idea of BDDs is to remove such redundancies by dropping nodes or sharing identical subtrees, which will transform the tree into a directed acyclic graph. For our example, this graph, which is a canonical representation given the variable ordering, is shown in Figure 2.3(b).
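The explicit encoding of Example 2.10 can be reproduced in a few lines. The following is only an illustrative sketch: the function name `f` and the list `leaves` are hypothetical names chosen here, and the leaves are enumerated left to right under the assumption that x is tested first, then y, then z, with 0 before 1.

```python
from itertools import product


def f(x, y, z):
    """The formula x OR (y AND z) from Example 2.10 (name f is illustrative)."""
    return x or (y and z)


# One leaf per branch of the decision tree, enumerated left to right:
# (0,0,0), (0,0,1), (0,1,0), ..., (1,1,1).
leaves = [int(f(x, y, z)) for x, y, z in product((0, 1), repeat=3)]
print(leaves)  # [0, 0, 0, 1, 1, 1, 1, 1]

# The last four leaves form the subtree below x = 1: all of them are 1,
# so the tests on y and z in that subtree carry no information.
print(set(leaves[4:]))  # {1}
```

The uniform right half of the leaf list is exactly the kind of redundancy that the reduction to a BDD removes.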

Two BDDs g1 and g2 are isomorphic if there exists a one-to-one mapping σ from edges in g1 to edges in g2 such that if σ(s1, t1) = (s2, t2), the edges (s1, t1) and (s2, t2) are of the same type and each of the associated node pairs (s1, s2) and (t1, t2) shares the same label. Starting from a full binary tree with the same variable ordering on all branches, a BDD can be obtained using the following two reduction operators:

Subgraph Merging: If two subgraphs g1 and g2 are isomorphic, all edges leading from some node outside g2 to some node in g2 are redirected to the corresponding node in g1, and g2 is removed from the graph.

Node Deletion: If both outgoing edges of a node n lead to the same node c, all incoming edges of n are redirected to c and n is removed from the graph.

Example 2.11 In Figure 2.3(a), the two rightmost trees with root label z are isomorphic and can thus be merged, resulting in both outgoing edges of their parent node y leading to the same node. Thus, this parent node can be deleted.
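The two reduction operators can be sketched as a single bottom-up pass over the decision tree. This is a minimal illustration under stated assumptions, not the procedure of any particular BDD package: nodes are represented as nested tuples `(var, low, high)`, structural equality of tuples stands in for the isomorphism test, and the names `build_tree`, `reduce_tree`, `count_inner` and `unique` are hypothetical. Because children are reduced before their parent, the operators may fire in a different order than in Example 2.11, but the resulting graph is the same.

```python
def build_tree(f, order, bits=()):
    """Full decision tree: a leaf is 0 or 1, an inner node is (var, low, high)."""
    if len(bits) == len(order):
        return int(f(**dict(zip(order, bits))))
    low = build_tree(f, order, bits + (0,))    # dotted edge: variable = 0
    high = build_tree(f, order, bits + (1,))   # solid edge: variable = 1
    return (order[len(bits)], low, high)


def reduce_tree(node, unique):
    """Apply both reduction operators bottom-up and return the reduced root."""
    if isinstance(node, int):                  # terminal node
        return node
    var, low, high = node
    low, high = reduce_tree(low, unique), reduce_tree(high, unique)
    if low == high:                            # node deletion
        return low
    key = (var, low, high)
    return unique.setdefault(key, key)         # subgraph merging


def count_inner(node):
    """Number of inner nodes of an unshared tree."""
    if isinstance(node, int):
        return 0
    return 1 + count_inner(node[1]) + count_inner(node[2])


formula = lambda x, y, z: x or (y and z)       # formula of Example 2.10
tree = build_tree(formula, ("x", "y", "z"))
unique = {}                                    # one entry per distinct inner node
root = reduce_tree(tree, unique)
print(count_inner(tree), len(unique))          # 7 3
print(root)                                    # ('x', ('y', 0, ('z', 0, 1)), 1)
```

For the ordering x, y, z, the seven inner nodes of the tree shrink to three: the high edge of x leads directly to the 1-terminal, and the low edge of y leads directly to the 0-terminal.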

BDDs are among the most popular data structures in many branches of computer science, such as computer architecture and verification, even though their use is perhaps not yet so widespread in artificial intelligence and machine learning (but see [Chavira and Darwiche, 2007] and [Minato et al., 2007] for recent work on Bayesian networks using variants of BDDs). ProbLog is the first probabilistic logic programming system using BDDs as a basic data structure for probability calculation, a principle that is receiving increasing interest in the fields of probabilistic logic learning and probabilistic databases, cf. for instance [Riguzzi, 2007; Ishihata et al., 2008; Olteanu and Huang, 2008; Thon et al., 2008; Riguzzi, 2009]. Since their introduction by Bryant [1986], there has been a lot of research on BDDs and their computation, leading to many variants of BDDs and off-the-shelf systems.

The reduction approach to BDD construction described above is clearly impractical, as it starts from an exponential encoding of the Boolean formula. However, BDDs can also be constructed by applying Boolean operators to smaller BDDs, starting with BDDs corresponding to single variables and following the structure of the formula to be encoded, where reduction operators are applied on intermediate results. Denoting the number of nodes in a BDD g as |g|, reducing g has time complexity O(|g| · log(|g|)), while combining g1 and g2 has time complexity O(|g1| · |g2|); for further details, we refer to [Bryant, 1986]. BDD tools construct BDDs following a user-defined sequence of operations.

The size of a BDD is highly dependent on its variable ordering, as this determines the amount of structure sharing that can be exploited for reduction; see Figure 2.4 for an example. As computing the order that minimizes the size of a BDD is a coNP-complete problem [Bryant, 1986], BDD packages include heuristics to reduce the size by reordering variables. While reordering is often necessary to handle large BDDs, it can be quite expensive. To control the complexity of BDD construction, it is therefore crucial to aim at small intermediate BDDs and to avoid redundant steps when specifying the sequence of operations to be performed by the BDD tool.

Figure 2.4: Example illustrating the effect of variable ordering on BDD size for formula (a ∧ b) ∨ (c ∧ d) ∨ (e ∧ f), taken from [Bryant, 1986]. [Figure not reproduced. Panel (a) contains one node for each of the variables a to f; panel (b) contains one a node, two c nodes, four e nodes, four b nodes, two d nodes and one f node.]
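Construction by operator application, and the effect of the variable ordering on the formula of Figure 2.4, can be illustrated with a toy BDD manager. The sketch below is an assumption-laden simplification and not the interface of any real BDD package: the class `BDD`, its methods `mk`, `var`, `top`, `cofactor`, `apply` and `size`, and the helper `bdd_size` are all hypothetical names. Reduction is enforced whenever a node is created, and `apply` memoizes node pairs, so each call handles at most |g1| · |g2| pairs. The second ordering used here, a, c, e, b, d, f, is chosen for this sketch as one that separates each pair of conjoined variables.

```python
import operator


class BDD:
    """Toy ROBDD manager; ids 0 and 1 are the terminals, inner nodes start at 2."""

    def __init__(self, order):
        self.order = list(order)
        self.level = {v: i for i, v in enumerate(self.order)}
        self.nodes = {}     # id -> (var, low, high)
        self.unique = {}    # (var, low, high) -> id

    def mk(self, var, low, high):
        if low == high:                          # node deletion
            return low
        key = (var, low, high)
        if key not in self.unique:               # subgraph merging via lookup
            self.unique[key] = len(self.nodes) + 2
            self.nodes[self.unique[key]] = key
        return self.unique[key]

    def var(self, name):
        return self.mk(name, 0, 1)

    def top(self, u):
        """Level of u's variable; terminals sit below all variables."""
        return self.level[self.nodes[u][0]] if u > 1 else len(self.order)

    def cofactor(self, u, lvl, bit):
        """Child of u for the given bit if u tests the variable at lvl, else u."""
        return self.nodes[u][1 + bit] if self.top(u) == lvl else u

    def apply(self, op, u, v, memo=None):
        memo = {} if memo is None else memo
        if u <= 1 and v <= 1:                    # both terminal: evaluate op
            return int(op(u, v))
        if (u, v) not in memo:                   # each node pair is solved once
            lvl = min(self.top(u), self.top(v))
            low = self.apply(op, self.cofactor(u, lvl, 0), self.cofactor(v, lvl, 0), memo)
            high = self.apply(op, self.cofactor(u, lvl, 1), self.cofactor(v, lvl, 1), memo)
            memo[(u, v)] = self.mk(self.order[lvl], low, high)
        return memo[(u, v)]

    def size(self, root):
        """Number of inner nodes reachable from root."""
        seen, stack = set(), [root]
        while stack:
            u = stack.pop()
            if u > 1 and u not in seen:
                seen.add(u)
                stack.extend(self.nodes[u][1:])
        return len(seen)


def bdd_size(order):
    """Inner-node count of (a AND b) OR (c AND d) OR (e AND f) under the given ordering."""
    bdd = BDD(order)
    terms = [bdd.apply(operator.and_, bdd.var(p), bdd.var(q)) for p, q in ("ab", "cd", "ef")]
    root = terms[0]
    for term in terms[1:]:
        root = bdd.apply(operator.or_, root, term)
    return bdd.size(root)


print(bdd_size("abcdef"), bdd_size("acebdf"))    # 6 14
```

With the ordering a, b, c, d, e, f the BDD has one inner node per variable, whereas separating the paired variables more than doubles the number of inner nodes for the same function.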
