Since we want to investigate the performance of GAC algorithms on different representations, we may first transform between those representations, i.e., convert an r-ary NFA constraint into an equivalent MDD constraint. One approach is to first construct a DFA from the NFA, e.g., by using the subset construction algorithm [Les95]. The DFA can then be traversed to generate solutions in lexicographic order, which are then inserted into an MDD using the mddify
construction algorithm [CY10]. However, the DFA may be exponentially larger than the NFA, in which case a DFA minimization may be used. Note that DFA minimization is PSPACE-hard in general [MS72].
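To make the first step of this indirect route concrete, the following is a minimal sketch of subset construction for an NFA without ε-transitions. The encoding (a dictionary `delta` from (state, symbol) pairs to successor sets, with frozensets as DFA states) and the name `subset_construction` are assumptions made for this illustration; they are not taken from [Les95].

```python
from collections import deque


def subset_construction(alphabet, delta, q0, finals):
    """Determinise an NFA (no epsilon moves); DFA states are frozensets of NFA states."""
    start = frozenset([q0])
    dfa_delta = {}
    seen = {start}
    queue = deque([start])
    while queue:
        S = queue.popleft()
        for a in alphabet:
            # Collective successors of all NFA states in S on symbol a.
            T = frozenset(t for q in S for t in delta.get((q, a), ()))
            if not T:
                continue  # no successor: leave the DFA transition undefined
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    dfa_finals = {S for S in seen if S & finals}
    return seen, dfa_delta, start, dfa_finals


# Hypothetical example NFA for (0|1)* 0 (0|1): the second-to-last symbol is 0.
delta = {(0, "0"): {0, 1}, (0, "1"): {0}, (1, "0"): {2}, (1, "1"): {2}}
states, dfa_delta, start, dfa_finals = subset_construction("01", delta, 0, {2})
print(len(states), len(dfa_finals))  # 4 2
```

Even on this three-state NFA the construction reaches four subsets, which hints at the blow-up discussed above.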
In this section, we propose instead to convert directly from the NFA into the corresponding MDD constraint. More precisely, given an NFA $G = \langle Q, \Sigma, \delta, q_0, F \rangle$, the algorithm nfa2mdd creates an MDD $G' = \langle Q', \Sigma, \delta', q'_0, \{tt\} \rangle$ such that the r-ary constraints represented by $G$ and $G'$ are equivalent. The pseudo-code of the algorithm nfa2mdd is in Figure 3.3. The idea is to simulate the NFA [Tho68] in a depth-first fashion and build the MDD in a bottom-up fashion. In other words, it combines NFA-to-DFA conversion (or subset construction) and trie-to-MDD construction.
We first introduce the two dictionaries cache and unique that are used in nfa2mdd and nfa2mdd-recur. To avoid re-exploring the NFA, cache maps a pair $\langle S, i \rangle$ to an already constructed MDD node $q$, where $S$ is a set of NFA states and $i$ is the current level of the MDD. To construct reduced MDDs, unique maps the set of all outgoing arcs of an MDD node $q$ to $q$ itself.
The execution of the main procedure nfa2mdd-recur is as follows. If the current subset $S$ of states of the NFA $G$ is at depth $i = r + 1$ (line 1), the unique final state tt of the MDD $G'$ is returned iff $S$ contains a final state of $G$; otherwise the failure state ff is returned. This is because, by definition, any path from the start state to a final state corresponds to a solution of the NFA constraint iff its length is $r$. Otherwise ($i \le r$), the traversal continues by visiting the collective successors of $S$ for every $a \in \Sigma$, the initial domain of the variable (line 3). The recursive call returns the start state $q'_a$ of the sub-MDD. If $q'_a$ is not the failure state, the pair $(a, q'_a)$ is recorded in $E$ (line 5). Thus the MDD only contains paths leading to tt (true). Now, if all successors of the states in $S$ lead to a dead end, the failure state ff is returned (line 6), which means the current sub-MDD has no solution. Otherwise, we need to create a (new) start state for this sub-MDD. By definition, two equivalent (sub-)MDD constraints should be represented by the same sub-MDD, so we use unique, implemented as a dictionary, to store all created states in the MDD. With our bottom-up construction, we can identify equivalent sub-MDDs easily by using $E$ as a key (line 7). When we cannot find an existing state, we generate a new one (line 8) and the corresponding transitions (line 9).
nfa2mdd(G, r)
// input: NFA G = ⟨Q, Σ, δ, q0, F⟩ and arity r
begin
    Q' := ∅
    initialize cache[], unique[]        // dictionaries
    δ' := ∅
    q0' := nfa2mdd-recur({q0}, 1, r)
    return ⟨Q', Σ, δ', q0', {tt}⟩       // output: MDD G'

nfa2mdd-recur(S, i, r)
begin
 1  if i = r + 1 then
        if S ∩ F = ∅ then
            return ff                   // failure state
        else
            return tt                   // final state (unique)
 2  q' := cache[⟨S, i⟩]                 // dictionary lookup
    if q' ≠ Null then return q'
 3  E := ∅
    for a ∈ Σ do
 4      T := ⋃_{q ∈ S} δ(q, a)
        if T ≠ ∅ then
            q'_a := nfa2mdd-recur(T, i + 1, r)
 5          if q'_a ≠ ff then E := E ∪ {⟨a, q'_a⟩}
 6  if E = ∅ then
        q' := ff
    else
 7      q' := unique[E]                 // dictionary lookup
        if q' = Null then
 8          make a new state q'
            Q' := Q' ∪ {q'}             // insert new state
            for ⟨a, q'_a⟩ ∈ E do
 9              δ'(q', a) := {q'_a}     // insert new transitions
            unique[E] := q'
10  cache[⟨S, i⟩] := q'
    return q'
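The pseudo-code can be mirrored almost line by line in Python. The following is a minimal sketch under stated assumptions, not the implementation evaluated in this work: the NFA is encoded as a dictionary `delta` from (state, symbol) pairs to successor sets, MDD nodes are integer ids, the terminals are the strings "tt" and "ff", and the helper `accepts` is introduced only for checking the result.

```python
def nfa2mdd(alphabet, delta, q0, finals, r):
    """Return (root, nodes); nodes maps an MDD node id to its arcs {symbol: child}."""
    TT, FF = "tt", "ff"
    cache = {}   # (frozenset S, level i) -> MDD node
    unique = {}  # frozenset of (symbol, child) arcs -> MDD node
    nodes = {}   # node id -> {symbol: child}; the failure state ff is never stored

    def recur(S, i):
        if i == r + 1:                                    # line 1
            return TT if S & finals else FF
        key = (S, i)
        if key in cache:                                  # line 2
            return cache[key]
        E = []                                            # line 3
        for a in alphabet:
            T = frozenset(t for q in S for t in delta.get((q, a), ()))  # line 4
            if T:
                child = recur(T, i + 1)
                if child != FF:                           # line 5
                    E.append((a, child))
        if not E:                                         # line 6
            node = FF
        else:
            arcs = frozenset(E)
            node = unique.get(arcs)                       # line 7
            if node is None:
                node = len(nodes)                         # line 8: fresh node id
                nodes[node] = dict(E)                     # line 9
                unique[arcs] = node
        cache[key] = node                                 # line 10
        return node

    return recur(frozenset([q0]), 1), nodes


def accepts(root, nodes, word):
    """Follow word from the root; True iff the path ends in the terminal tt."""
    node = root
    for a in word:
        node = nodes.get(node, {}).get(a)
        if node is None:
            return False
    return node == "tt"


# Hypothetical example NFA for (0|1)* 0 (0|1), with arity r = 3 (so x2 must be 0).
delta = {(0, "0"): {0, 1}, (0, "1"): {0}, (1, "0"): {2}, (1, "1"): {2}}
root, nodes = nfa2mdd("01", delta, 0, {2}, 3)
print(len(nodes))  # 3 internal nodes, one per level
```

Using the arc set itself as the key of `unique` is what merges identical sub-MDDs; in this example the two level-2 subsets {0, 1} and {0} end up sharing a single node.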
Proposition 3.1. nfa2mdd-recur constructs reduced MDDs.
Proof. According to the algorithm, a state $q$ is generated only when there is no other state $q'$ such that $q$ and $q'$ have the same set of outgoing arcs. This guarantees that no two different states $q$ and $q'$ are generated if $q$ and $q'$ are isomorphic, and hence the MDD being constructed is reduced.
The depth-first traversal will implicitly enumerate an exploration tree of size $|\Sigma|^r$. To reduce the size of the traversal, we make use of caching (lines 2 and 10) to ensure that any sub-MDD will only be expanded once. In the worst case, nfa2mdd-recur visits $O(2^{|Q|} \cdot r)$ pairs $\langle S, i \rangle$ of NFA state subsets and levels. This is also the maximum number of slots in cache. However, the cache size can be smaller, thus trading time for space. Note that if we convert an NFA into a DFA using subset construction the time complexity will be $O(2^{2|Q|})$ since the DFA will have at most $O(2^{|Q|})$ states.
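The effect of caching on the traversal can be observed by counting recursive calls with and without the lookup. The sketch below is an illustration only: it ignores MDD construction entirely, and the function name `count_calls`, the NFA encoding, and the example automaton are assumptions made here.

```python
def count_calls(alphabet, delta, q0, r, use_cache):
    """Count invocations of the recursive traversal over (S, i) pairs."""
    seen = set()
    calls = 0

    def recur(S, i):
        nonlocal calls
        calls += 1
        if i == r + 1:
            return                      # leaf of the exploration tree
        if use_cache:
            if (S, i) in seen:
                return                  # this pair was already expanded
            seen.add((S, i))
        for a in alphabet:
            T = frozenset(t for q in S for t in delta.get((q, a), ()))
            if T:
                recur(T, i + 1)

    recur(frozenset([q0]), 1)
    return calls


# Hypothetical example NFA for (0|1)* 0 (0|1), explored with arity r = 10.
delta = {(0, "0"): {0, 1}, (0, "1"): {0}, (1, "0"): {2}, (1, "1"): {2}}
uncached = count_calls("01", delta, 0, 10, use_cache=False)
cached = count_calls("01", delta, 0, 10, use_cache=True)
print(uncached, cached)  # 2047 71
```

Without the cache every node of the binary exploration tree is visited; with it, only the distinct (subset, level) pairs are expanded, and at most four subsets are reachable at any level of this automaton.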
Given an NFA $G = \langle Q, \Sigma, \delta, q_0, F \rangle$ and arity $r$, the runtime and space complexities of nfa2mdd are given as follows:
Proposition 3.2. The worst-case time complexity of nfa2mdd is $O(|Q|^2 \cdot |\Sigma| \cdot \min\{2^{|Q|} \cdot r, |\Sigma|^{r+1}\})$.
Proof. In each call of nfa2mdd-recur, the caching operations on cache and unique take $O(|Q|)$ and $O(|\Sigma|)$ time respectively. Every set union at line 4 can take $O(|Q|^2)$ time. The last term accounts for the maximum number of recursive calls: (1) each subset of $Q$ is visited at most $r$ times, and (2) the generation tree has $O(|\Sigma|^r)$ non-leaf nodes.
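As a rough illustration of which term of the minimum dominates, the bound of Proposition 3.2 can be instantiated with small values chosen purely for this example ($|Q| = 3$, $|\Sigma| = 2$), ignoring constant factors:

\[
r = 10:\quad |Q|^2 \cdot |\Sigma| \cdot \min\{2^{|Q|} \cdot r,\ |\Sigma|^{r+1}\} = 9 \cdot 2 \cdot \min\{80,\ 2048\} = 1440,
\]
\[
r = 2:\quad |Q|^2 \cdot |\Sigma| \cdot \min\{2^{|Q|} \cdot r,\ |\Sigma|^{r+1}\} = 9 \cdot 2 \cdot \min\{16,\ 8\} = 144.
\]

For large arity the number of distinct subsets limits the work, whereas for small arity the size of the exploration tree is the tighter bound.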
In practice, the use of caching may significantly reduce the runtime.
Proposition 3.3. The worst-case space complexity of nfa2mdd is $O(\max\{2^{|Q|} \cdot |Q| \cdot r, |\Sigma|^r \cdot |\Sigma|\})$.
Proof. (1) For cache, the maximum number of slots is $O(2^{|Q|} \cdot r)$, and for each slot the key $S$ takes $O(|Q|)$ space in the dictionary entry; (2) for unique, the MDD size is in $O(|\Sigma|^r)$ and, as for cache, each slot takes $O(|\Sigma|)$ space.
To reduce this high memory requirement, we can trade time for space and restrict the number of entries in cache and/or unique. When unique is restricted, the resulting MDD may not be reduced and may have more nodes, i.e., it may contain equivalent sub-MDDs.
Proposition 3.4. The output MDD $G'$ is in the worst case $r$ times larger than the DFA equivalent to the input NFA $G$, but it is exponentially smaller than the DFA in the best case.
Proof. For the worst case, suppose nfa2mdd is run on a DFA equivalent to $G$. Since nfa2mdd-recur visits every DFA state at most $r$ times, there will be at most $r$ MDD states created for each DFA state. For the best case, consider an NFA that represents the regular expression $\{0, 1\}^{*}\, 0\, \{0, 1\}^{n}$. The equivalent DFA has at least $2^n$ states, whereas, for any fixed $r > n$, the corresponding MDD has $r + 1$ states and represents the constraint $(x_{r-n} = 0) \wedge \bigwedge_{i=1}^{r} x_i \in \{0, 1\}$.
Thus, for a given NFA, neither the equivalent DFA nor the equivalent MDD is always the smaller representation; their sizes are not directly comparable.
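The best case in the proof of Proposition 3.4 can be checked on a small instance. The sketch below is only an illustration under assumed encodings: `make_nfa`, `dfa_size` and `mdd_size` are hypothetical helper names, the DFA size is taken to be the number of subsets reached by subset construction, and the MDD is built in the style of nfa2mdd while keeping only node counts.

```python
def make_nfa(n):
    """NFA for (0|1)* 0 (0|1)^n: state 0 loops, states 1..n+1 count the suffix."""
    delta = {(0, "0"): {0, 1}, (0, "1"): {0}}
    for k in range(1, n + 1):
        delta[(k, "0")] = {k + 1}
        delta[(k, "1")] = {k + 1}
    return delta, 0, {n + 1}


def step(delta, S, a):
    """Collective successors of the NFA states in S on symbol a."""
    return frozenset(t for q in S for t in delta.get((q, a), ()))


def dfa_size(delta, q0, alphabet="01"):
    """Number of subsets reached by subset construction."""
    start = frozenset([q0])
    seen, stack = {start}, [start]
    while stack:
        S = stack.pop()
        for a in alphabet:
            T = step(delta, S, a)
            if T and T not in seen:
                seen.add(T)
                stack.append(T)
    return len(seen)


def mdd_size(delta, q0, finals, r, alphabet="01"):
    """Number of MDD nodes including tt; assumes the constraint has a solution."""
    cache, unique = {}, {}

    def recur(S, i):
        if i == r + 1:
            return "tt" if S & finals else "ff"
        if (S, i) not in cache:
            E = []
            for a in alphabet:
                T = step(delta, S, a)
                if T:
                    child = recur(T, i + 1)
                    if child != "ff":
                        E.append((a, child))
            # Identical arc sets share one node id; dead ends map to ff.
            cache[(S, i)] = unique.setdefault(frozenset(E), len(unique)) if E else "ff"
        return cache[(S, i)]

    recur(frozenset([q0]), 1)
    return len(unique) + 1  # internal nodes plus the terminal tt


delta, q0, finals = make_nfa(10)
print(dfa_size(delta, q0), mdd_size(delta, q0, finals, 12))
```

In this setting the subset count grows exponentially with n, while the MDD stays at one node per level plus tt, in line with the $r + 1$ states stated in the proof.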