NFA-to-MDD Conversion - Designing and Optimizing Representations for Non-Binary Constraints

Since we want to investigate the performance of GAC algorithms on different representations, we may first transform between those representations, i.e., convert an $r$-ary NFA constraint into an equivalent MDD constraint. One approach is to first construct a DFA from the NFA, e.g., by using the subset construction algorithm [Les95]. The DFA can then be traversed to generate solutions in lexicographic order, which are then inserted into an MDD using the mddify construction algorithm [CY10]. However, the DFA may be exponentially larger than the NFA, in which case a DFA minimization may be used. Note that minimizing an NFA directly is PSPACE-hard in general [MS72].
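As a rough illustration of the first step of this indirect route, the sketch below determinizes a small NFA by subset construction. The encoding of the transition function as a Python dictionary, the name `subset_construction`, and the three-state example NFA are assumptions made for illustration; they are not taken from [Les95].

```python
from collections import deque

def subset_construction(alphabet, delta, q0, finals):
    """Determinize an NFA without epsilon-transitions.

    delta maps (state, symbol) -> set of successor states (assumed encoding).
    Each DFA state is a frozenset of NFA states.
    """
    start = frozenset([q0])
    dfa_states = {start}
    dfa_delta = {}
    queue = deque([start])
    while queue:
        S = queue.popleft()
        for a in alphabet:
            # collective successors of all NFA states in S on symbol a
            T = frozenset(t for q in S for t in delta.get((q, a), ()))
            if not T:
                continue  # no successor: leave this DFA transition undefined
            dfa_delta[(S, a)] = T
            if T not in dfa_states:
                dfa_states.add(T)
                queue.append(T)
    dfa_finals = {S for S in dfa_states if S & finals}
    return dfa_states, dfa_delta, start, dfa_finals

# Hypothetical NFA over {0, 1}: accepts words whose second-to-last symbol is 0.
delta = {
    ("p", 0): {"p", "q"},
    ("p", 1): {"p"},
    ("q", 0): {"r"},
    ("q", 1): {"r"},
}
states, trans, start, finals = subset_construction([0, 1], delta, "p", {"r"})
print(len(states), len(finals))
```

Only the subsets reachable from the start state are generated, which is why the DFA is often much smaller than the $2^{|Q|}$ upper bound in practice.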

In this section, we propose instead to convert directly from the NFA into the corresponding MDD constraint. More precisely, given an NFA $G = \langle Q, \Sigma, \delta, q_0, F \rangle$, the algorithm nfa2mdd creates an MDD $G' = \langle Q', \Sigma, \delta', q'_0, \{tt\} \rangle$ such that the $r$-ary constraints represented by $G$ and $G'$ are equivalent. The pseudo-code of the algorithm nfa2mdd is in Figure 3.3. The idea is to simulate the NFA [Tho68] in a depth-first fashion and build the MDD in a bottom-up fashion. In other words, it combines NFA-to-DFA conversion (or subset construction) and trie-to-MDD construction.

We first introduce the two dictionaries cache and unique that are used in nfa2mdd and nfa2mdd-recur. To avoid exploring the same part of the NFA repeatedly, cache maps a pair $\langle S, i \rangle$ to an already constructed MDD node $q$, where $S$ is a set of NFA states and $i$ is the current level of the MDD. To construct reduced MDDs, unique maps the set of all outgoing arcs of an MDD node $q$ to $q$ itself.

The execution of the main procedure nfa2mdd-recur is as follows. If the current subset $S$ of states of the NFA $G$ is at depth $i = r + 1$ (line 1), the unique final state tt of the MDD $G'$ is returned iff $S$ contains a final state of $G$. This is because, by definition, any path from the start state to a final state corresponds to a solution of the NFA constraint iff its length is $r$. Otherwise ($i \leq r$), the traversal continues by visiting the collective successors of $S$ for every $a \in \Sigma$, the initial domain of the variable (line 3). The recursive call returns the start state $q'_a$ of the sub-MDD. If $q'_a$ is not the failure state, the pair $(a, q'_a)$ is recorded in the set $E$ (line 5). Thus the MDD only contains paths leading to tt (true). Now, if all successors of the states in $S$ lead to a dead-end, the failure state ff is returned (line 6), which means the current sub-MDD has no solution. Otherwise, we need to create a (new) start state for this sub-MDD. By definition, two equivalent (sub-)MDD constraints should be represented by the same sub-MDD, so we use unique, implemented as a dictionary, to store all created states in the MDD. With our bottom-up construction, we can identify equivalent sub-MDDs easily by using $E$ as a key (line 7). When we cannot find an existing state, we generate a new one (line 8) and the corresponding transitions (line 9).

    nfa2mdd(G, r)
    // input: NFA G = ⟨Q, Σ, δ, q0, F⟩ and arity r
    begin
        Q′ := ∅
        initialize cache[], unique[]            // dictionaries
        δ′ := ∅
        q′0 := nfa2mdd-recur({q0}, 1, r)
        return ⟨Q′, Σ, δ′, q′0, {tt}⟩           // output: MDD G′

    nfa2mdd-recur(S, i, r)
    begin
    1   if i = r + 1 then
            if S ∩ F = ∅ then
                return ff                       // failure state
            else
                return tt                       // final state (unique)
    2   q′ := cache[⟨S, i⟩]                     // dictionary lookup
        if q′ ≠ Null then return q′
    3   E := ∅
        for a ∈ Σ do
    4       T := ⋃ q∈S δ(q, a)
            if T ≠ ∅ then
                q′a := nfa2mdd-recur(T, i + 1, r)
    5           if q′a ≠ ff then E := E ∪ {⟨a, q′a⟩}
    6   if E = ∅ then
            q′ := ff
        else
    7       q′ := unique[E]                     // dictionary lookup
            if q′ = Null then
    8           make a new state q′
                Q′ := Q′ ∪ {q′}                 // insert new state
                for ⟨a, q′a⟩ ∈ E do
    9               δ′(q′, a) := {q′a}          // insert new transitions
                unique[E] := q′
    10  cache[⟨S, i⟩] := q′
        return q′
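The pseudo-code can be transcribed into a short runnable sketch. The version below is one possible Python reading of it, not the implementation used for the experiments: the dictionary encoding of the transition function, the use of integers as node names, the helper `accepts`, and the example NFA are all assumptions made for illustration. Comments give the corresponding pseudo-code line numbers.

```python
def nfa2mdd(alphabet, delta, q0, finals, r):
    """Build a reduced MDD for the length-r words accepted by an NFA.

    delta maps (state, symbol) -> set of successor states (assumed encoding).
    Returns (nodes, arcs, root); arcs maps (node, symbol) -> child node.
    Non-terminal nodes are integers; "tt" and "ff" are the terminals.
    """
    TT, FF = "tt", "ff"
    cache = {}    # (frozenset of NFA states, level) -> MDD node
    unique = {}   # frozenset of (symbol, child) arcs -> MDD node
    nodes, arcs = set(), {}

    def recur(S, i):
        if i == r + 1:                                   # line 1
            return TT if S & finals else FF
        key = (S, i)
        if key in cache:                                 # line 2
            return cache[key]
        E = set()                                        # line 3
        for a in alphabet:
            T = frozenset(t for q in S for t in delta.get((q, a), ()))  # line 4
            if T:
                child = recur(T, i + 1)
                if child != FF:                          # line 5
                    E.add((a, child))
        if not E:                                        # line 6
            node = FF
        else:
            E = frozenset(E)
            node = unique.get(E)                         # line 7
            if node is None:
                node = len(nodes)                        # line 8
                nodes.add(node)
                for a, child in E:                       # line 9
                    arcs[(node, a)] = child
                unique[E] = node
        cache[key] = node                                # line 10
        return node

    root = recur(frozenset([q0]), 1)
    return nodes, arcs, root

def accepts(arcs, root, word):
    """Follow a word of length r through the MDD (illustrative helper)."""
    node = root
    for a in word:
        node = arcs.get((node, a), "ff")
        if node == "ff":
            return False
    return node == "tt"

# Hypothetical NFA over {0, 1}: the second-to-last symbol is 0.
delta = {("p", 0): {"p", "q"}, ("p", 1): {"p"}, ("q", 0): {"r"}, ("q", 1): {"r"}}
nodes, arcs, root = nfa2mdd([0, 1], delta, "p", {"r"}, 3)
print(len(nodes), accepts(arcs, root, (1, 0, 1)), accepts(arcs, root, (0, 1, 0)))
```

For arity 3 the example constraint reduces to $x_2 = 0$, so the sketch produces one non-terminal node per level. Storing arc sets as frozensets makes them hashable and order-independent, which is what lets unique detect identical sub-MDDs.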

Proposition 3.1. nfa2mdd-recur constructs reduced MDDs.

Proof. According to the algorithm, a state $q$ is generated only when there is no other state $q'$ such that $q$ and $q'$ have the same set of outgoing arcs. This guarantees that no two different states $q$ and $q'$ are generated if $q$ and $q'$ are isomorphic, so the MDD being constructed is reduced.

The depth-first traversal implicitly enumerates an exploration tree of size $|\Sigma|^r$. To reduce the size of the traversal, we make use of caching (lines 2 and 10) to ensure that any sub-MDD is expanded only once. In the worst case, nfa2mdd-recur visits $O(2^{|Q|} \cdot r)$ sets of NFA states. This is also the maximum number of slots in cache. However, the cache size can be smaller, thus trading time for space. Note that if we convert an NFA into a DFA using subset construction, the time complexity will be $O(2^{2|Q|})$ since the DFA will have at most $O(2^{|Q|})$ states.

Given an NFA $G = \langle Q, \Sigma, \delta, q_0, F \rangle$ and arity $r$, the runtime and space complexities of nfa2mdd are given as follows:

Proposition 3.2. The worst-case time complexity of nfa2mdd is $O(|Q|^2 \cdot |\Sigma| \cdot \min\{2^{|Q|} \cdot r, |\Sigma|^{r+1}\})$.

Proof. In each call of nfa2mdd-recur, the caching operations on cache and unique take $O(|Q|)$ and $O(|\Sigma|)$ time respectively. Every set union at line 4 can take $O(|Q|^2)$ time. The last term accounts for the maximum number of recursive calls: (1) each subset of $Q$ is visited at most $r$ times, and (2) the generation tree has $O(|\Sigma|^r)$ non-leaf nodes.

In practice, the use of caching may significantly reduce the runtime.
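To see which term of the minimum dominates, it may help to plug small numbers into the bound of Proposition 3.2, read as $O(|Q|^2 \cdot |\Sigma| \cdot \min\{2^{|Q|} \cdot r, |\Sigma|^{r+1}\})$. The values $|Q| = 3$ and $|\Sigma| = 2$ below are chosen only for illustration, and constant factors are ignored.

```latex
\begin{align*}
r = 3:  &\quad \min\{2^{3} \cdot 3,\; 2^{4}\}   = \min\{24,\; 16\}   = 16,
        &\quad |Q|^2 \cdot |\Sigma| \cdot 16 &= 9 \cdot 2 \cdot 16 = 288, \\
r = 10: &\quad \min\{2^{3} \cdot 10,\; 2^{11}\} = \min\{80,\; 2048\} = 80,
        &\quad |Q|^2 \cdot |\Sigma| \cdot 80 &= 9 \cdot 2 \cdot 80 = 1440.
\end{align*}
```

For small arity the size of the generation tree is the tighter limit, while for larger arity the number of distinct pairs of subset and level takes over, and the bound grows only linearly in $r$.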

Proposition 3.3. The worst-case space complexity of nfa2mdd is $O(\max\{2^{|Q|} \cdot |Q| \cdot r, |\Sigma|^{r} \cdot |\Sigma|\})$.

Proof. (1) For cache, the maximum number of slots is $O(2^{|Q|} \cdot r)$, and for each slot the key $S$ takes $O(|Q|)$ space in the dictionary entry; (2) for unique, the MDD size is in $O(|\Sigma|^r)$ and, as for cache, each slot takes $O(|\Sigma|)$ space.

To reduce this high memory requirement, we can trade time for space and restrict the number of entries in cache and/or unique. If unique is restricted, the resulting MDD may not be reduced and may have more nodes, i.e., it may contain equivalent sub-MDDs.

Proposition 3.4. The output MDD $G'$ is in the worst case $r$ times larger than the DFA equivalent to the input NFA $G$, but it is exponentially smaller than the DFA in the best case.

Proof. For the worst case, suppose nfa2mdd is run on a DFA equivalent to $G$. Since nfa2mdd-recur visits every DFA state at most $r$ times, there will be at most $r$ MDD states created for each DFA state. For the best case, consider an NFA that represents the regular expression $\{0, 1\}^{*}\, 0\, \{0, 1\}^{n}$. The equivalent DFA has $2^n$ nodes, whereas for any fixed $r > n$ the corresponding MDD has $r + 1$ states and represents the constraint $(x_{r-n} = 0) \wedge \bigwedge_{i=1}^{r} x_i \in \{0, 1\}$.
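The MDD side of this best case can be checked with a compact sketch. It hard-codes an NFA for $\{0, 1\}^{*}\, 0\, \{0, 1\}^{n}$ with states $0, \ldots, n+1$ (an encoding assumed here for illustration) and counts the non-terminal nodes created by a condensed variant of the recursive construction; the name `mdd_node_count` is hypothetical.

```python
def mdd_node_count(n, r):
    """Count non-terminal nodes of the reduced MDD for {0,1}* 0 {0,1}^n at arity r."""
    # Assumed NFA: state 0 loops on both symbols and may guess the marked 0;
    # state k (1 <= k <= n) moves to k + 1 on either symbol; state n + 1 is final.
    def step(S, a):
        T = set()
        for q in S:
            if q == 0:
                T.add(0)
                if a == 0:
                    T.add(1)
            elif q <= n:
                T.add(q + 1)
        return frozenset(T)

    cache, unique = {}, {}

    def recur(S, i):
        if i == r + 1:
            return "tt" if (n + 1) in S else "ff"
        if (S, i) not in cache:
            # keep only arcs whose sub-MDD is not the failure state
            E = frozenset((a, c) for a in (0, 1)
                          for c in [recur(step(S, a), i + 1)] if c != "ff")
            # reuse the node with the same arc set, or number a new one
            cache[(S, i)] = unique.setdefault(E, len(unique)) if E else "ff"
        return cache[(S, i)]

    recur(frozenset([0]), 1)
    return len(unique)

print(mdd_node_count(2, 5), mdd_node_count(3, 8))
```

In both calls the count equals $r$; adding the terminal tt gives the $r + 1$ states stated in the proof.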

Thus, we suggest that the sizes of the DFA and the MDD derived from an NFA are not directly comparable.
