IMPLEMENTATION OF LAZY LEARNING

Lazy learning

3.4. IMPLEMENTATION OF LAZY LEARNING

3.3.7.1. Various learning schemes for CSP invented by Dechter et al. There are a few issues with implementing s-nogood learning (§2.6.2) with lazy explanations. The first problem is that, as shown in Example 2.7, s-explanations for prunings cannot necessarily be expressed using only the variables in the scope of the constraint. Hence it is impossible to be sure that an explanation can be produced using a minimal record. Rather, the state of all the variables and all the constraints may need to be taken into account. However, s-nogood learning is obsolete (see justification in §2.6.3).

3.3.7.2. Conflict directed backjumping. CBJ has a similar drawback to Dechter et al.’s learning, because it uses s-explanations. However once lazy s-explanations are available, it can easily be implemented.

In g-learning, the backjump target is found by analysing the lazily-built implication graph.

3.3.7.3. Lazy clause generation. Lazy explanations are a bad fit for lazy clause generation (LCG) (§2.6.6). This is because in LCG, clauses corresponding to the propagation are posted into the solver and these also act as explanations. Since they are expected to propagate immediately and repeatedly the effort cannot be delayed. However in [FS09], one of the solver options is to delete clauses after backtracking and for this lazy explanations would be ideal, since lazily posted clauses can be replaced with lazy explanations.

Hence, apart from when clauses are deleted after backtracking, the LCG approach to propagation is fundamentally different to that of normal CSP solvers and obviates lazy explanations.

I think it is worth noting that whilst lazy clause generation and SMT are very similar technologies, a wedge can be driven between them mainly in their use of explanations. SMT solvers very sensibly use lazy explanation, whereas for lazy clause generation lazy explanation is inappropriate.

3.4. Implementation of lazy learning

I will now describe how I implement the lazy learning framework in minion, since there are various interesting design choices to make, as well as other important choices like which variable ordering heuristic to use.


3.4.1. Framework. The g-learning solver used is based on release 0.7 of the minion solver, a highly optimised solver that didn’t originally contain any learning or explanation mechanisms [GJM06]. By convention, I will call the eager learning variant “minion-eager” and the lazy variant “minion-lazy”. Implementation decisions are made so that compared to the experiments in [Kat09], which use eager explanations, only the method used to produce explanations is varied. Hence dom/wdeg variable ordering [BHLS04] and far backtracking as described in [Kat09] are used. The solver learns the firstUIP cut. From personal correspondence I know that Katsirelos’ solver also uses firstDecision cuts if a loop is detected but the details are unpublished [Kat08]. Finally node counts are not directly comparable because I do not know how they were calculated.

Learned clauses are propagated by the 2-watch literal scheme [MMZ+01].

3.4.2. Storage of depths and explanations. Recall that an explanation and a depth must be available for each and every (dis-)assignment that occurs. It is a very common operation to request this information, the depth being requested once for each (dis-)assignment involved in the derivation of a new g-nogood and the explanation being requested for a subset of these.

How to implement depth and explanation storage depends on whether the constraint solver uses copying or trailing to maintain backtrackable state (see [RSST09] for a detailed discussion of the choice). Copying means that the entire backtrackable state is copied before a new decision is made, and when search backtracks to that point again it can be copied into place to undo any changes that occurred. In trailing, whenever the state is changed, a record is pushed onto a stack, consisting of the address changed plus the old value. When a new decision is made, a NULL record is pushed. When search backtracks, the solver pops records off the stack until the NULL record is reached, restoring the old value to the appropriate address each time. In this way the state is restored to its original value.
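The difference between the two schemes can be illustrated with a minimal Python sketch. It is not taken from minion or any other solver: the class `Trail`, the helpers `snapshot` and `restore`, and the use of a dictionary to stand in for addressable memory are all assumptions made for illustration, and `None` plays the role of the NULL record.

```python
class Trail:
    """Trailing: undo records on a stack, with a None marker per decision."""

    def __init__(self, state):
        self.state = state      # dict standing in for addressable solver memory
        self.records = []       # (location, old_value) pairs and None markers

    def new_decision(self):
        self.records.append(None)   # the NULL record marks a decision point

    def write(self, loc, value):
        # Save the old value before overwriting, so it can be restored later.
        self.records.append((loc, self.state[loc]))
        self.state[loc] = value

    def backtrack(self):
        # Pop records until the most recent marker, restoring old values.
        while self.records:
            rec = self.records.pop()
            if rec is None:
                break
            loc, old = rec
            self.state[loc] = old


def snapshot(state):
    """Copying: duplicate the whole backtrackable state before a decision."""
    return dict(state)


def restore(state, saved):
    """Copying: put a saved snapshot back in place verbatim."""
    state.clear()
    state.update(saved)


state = {"x": 1, "y": 2}
trail = Trail(state)
trail.new_decision()
trail.write("x", 5)
trail.write("y", 0)
trail.backtrack()
print(state)  # {'x': 1, 'y': 2}
```

Note that `restore` knows nothing about which individual locations changed, whereas `backtrack` visits every change one at a time; this distinction matters for the storage of depths and explanations discussed next.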

With trailing, each (dis-)assignment is written onto the stack and is thus implicitly labelled with a depth, since the order on the stack mirrors the order of events. Explanation records can be pushed onto the stack for each (dis-)assignment. It may seem that the complexity of obtaining the depths and explanations will be unreasonably high: linear in the size of the trail, which can contain |V|·|D| records in the worst case, one for each possible (dis-)assignment. It could be even worse if propagators and other solver code have backtrackable state, e.g. the α pointer for the GAC propagator for lexicographic ordering [FHK+06]. However, recall from Algorithm 5 that during the firstUIP algorithm, the (dis-)assignments are processed deepest first. Since the explanations and depths are only needed by the firstUIP algorithm, this algorithm can be combined with restoring the trail: the trail can be unstacked until a (dis-)assignment in the current cut is found, at which time its depth and explanation are found. Hence no additional cost is incurred obtaining them, since the trail must be unstacked anyway. Katsirelos [Kat09] implements depth and explanation storage using the stack in this way.
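The idea of combining lookup with unstacking can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not the code of [Kat09]: the function name `pop_until_in_cut`, the representation of the trail as a list of `(event, explanation)` pairs, and the use of strings for events are all hypothetical.

```python
def pop_until_in_cut(trail, cut):
    """Unstack `trail` until an event belonging to `cut` is popped.

    `trail` is a list of (event, explanation) pairs, oldest first. The
    index of an event in the list orders it relative to all other events,
    so it serves as the implicit depth label. Returns a tuple
    (event, explanation, position), or None if the trail empties first.
    """
    while trail:
        event, explanation = trail.pop()
        if event in cut:
            # len(trail) is now the index the event occupied before popping.
            return event, explanation, len(trail)
    return None


trail = [
    ("x!=1", []),
    ("y=2", ["x!=1"]),
    ("z!=3", ["y=2"]),
    ("w!=0", ["z!=3"]),
]
print(pop_until_in_cut(trail, {"z!=3", "x!=1"}))  # ('z!=3', ['y=2'], 2)
```

Because every record is popped at most once over the whole conflict analysis, the search through the trail is paid for by the unstacking that backtracking would have performed in any case.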

Minion uses copying [GJM06] and so the stack implementation is inefficient: time spent searching the trail cannot be amortized against unstacking it. Hence in my implementation there are two variables for each assignment and each disassignment, one for the depth and one for the explanation record, organised as arrays which are not backtracked. Explanations and depths can be obtained in O(1) time. In the trail implementation validity is not a problem: if an explanation is on the stack it is for a current pruning. However, in minion's implementation of copying, invalid explanations and depths are left in the arrays and are not deleted once they become invalid. This is because copying treats the whole domain state as a piece of uninterpreted memory which is copied into place verbatim, hence the backtracking memory system does not know which values are being restored to the domains. Fortuitously, finding out if an explanation or depth is valid is simple: an explanation for assignment x ← a (disassignment x ↚ a) is valid if and only if x is currently assigned to a (a is not in dom(x)). Hence, invalid depths and explanations can efficiently be left in the arrays but ignored.
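A minimal Python sketch of this storage scheme follows. It is an illustration only and does not reproduce minion's data structures: the class `ExplanationStore` and its methods are hypothetical names, and dictionaries keyed by `(variable, value, is_assignment)` stand in for the non-backtracked arrays.

```python
class ExplanationStore:
    """Non-backtracked depth/explanation tables with a validity check."""

    def __init__(self, init_domains):
        # Current domains: the only part a copying solver would restore.
        self.dom = {x: set(vals) for x, vals in init_domains.items()}
        # Tables that are never backtracked; stale entries are tolerated.
        self.depth = {}
        self.expl = {}

    def record(self, x, a, is_assignment, depth, explanation):
        key = (x, a, is_assignment)
        self.depth[key] = depth
        self.expl[key] = explanation

    def is_valid(self, x, a, is_assignment):
        if is_assignment:
            return self.dom[x] == {a}    # x is currently assigned to a
        return a not in self.dom[x]      # a is currently pruned from dom(x)

    def lookup(self, x, a, is_assignment):
        """Return (depth, explanation), or None if absent or stale."""
        key = (x, a, is_assignment)
        if key in self.depth and self.is_valid(x, a, is_assignment):
            return self.depth[key], self.expl[key]
        return None


store = ExplanationStore({"x": [1, 2, 3]})
saved = {x: set(d) for x, d in store.dom.items()}   # copy before a decision
store.dom["x"].discard(2)                           # prune value 2 from dom(x)
store.record("x", 2, False, 1, ["c1"])
print(store.lookup("x", 2, False))  # (1, ['c1'])
store.dom = saved                                   # backtrack by copying
print(store.lookup("x", 2, False))  # None
```

After the domains are copied back, the entry for the pruning is still present in the tables, but the validity check causes it to be ignored.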

3.4.3. Explanations for internal solver events. In minion, variable types implement a couple of rules internally:

• if x ← a, must force x ↚ b, ∀b ∈ dom(x) \ {a}, and

• if x ↚ b, ∀b ∈ initdom(x) \ {a}, must force x ← a.


These can be summarised as, respectively, x can take at most one value (AMOV) and x must have a value (MHAV). They can be treated like clausal constraints: for each variable x the solver implements

• the AMOV constraints: ∀a, b ∈ initdom(x) with a ≠ b, x ↚ a ∨ x ↚ b, and

• the MHAV constraint: ⋁_{val ∈ initdom(x)} x ← val.

Explanations are then normal explanations for a clause as described in Example 2.12.
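To make the clausal reading concrete, the following Python sketch enumerates the AMOV and MHAV clauses for one variable. The function names and the encoding of a literal as a `(variable, operator, value)` tuple are assumptions for illustration; minion enforces these rules inside its variable types rather than posting explicit clauses.

```python
from itertools import combinations


def amov_clauses(x, initdom):
    """At most one value: one binary clause per unordered pair a != b."""
    return [
        [(x, "!=", a), (x, "!=", b)]
        for a, b in combinations(sorted(initdom), 2)
    ]


def mhav_clause(x, initdom):
    """Must have a value: a single disjunction over the initial domain."""
    return [(x, "=", v) for v in sorted(initdom)]


print(amov_clauses("x", {1, 2, 3}))
# [[('x', '!=', 1), ('x', '!=', 2)], [('x', '!=', 1), ('x', '!=', 3)], [('x', '!=', 2), ('x', '!=', 3)]]
print(mhav_clause("x", {1, 2, 3}))
# [('x', '=', 1), ('x', '=', 2), ('x', '=', 3)]
```

For a domain of size d this gives d(d−1)/2 AMOV clauses and one MHAV clause of length d.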

3.4.4. Eager and lazy explanations. In my solver, each explanation record is an object representing a (dis-)assignment DA, equipped with methods to return the explanation and depth of DA on demand.
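One way to picture such records is the Python sketch below. The class names `EagerExplanation` and `LazyExplanation` and the callback-based design are assumptions for illustration rather than a description of the actual solver: the eager record stores its literals when the propagation happens, whereas the lazy record stores only a function that can rebuild them on request.

```python
class EagerExplanation:
    """Explanation computed and stored at propagation time."""

    def __init__(self, depth, literals):
        self._depth = depth
        self._literals = list(literals)

    def depth(self):
        return self._depth

    def explain(self):
        return self._literals


class LazyExplanation:
    """Explanation rebuilt only when conflict analysis asks for it."""

    def __init__(self, depth, compute):
        self._depth = depth
        self._compute = compute   # callable holding just enough saved state

    def depth(self):
        return self._depth

    def explain(self):
        return self._compute()    # no caching: computed on each request


calls = []


def rebuild():
    calls.append("computed")
    return ["y=2", "z!=3"]


lazy = LazyExplanation(depth=(3, 1), compute=rebuild)
print(len(calls))      # 0
print(lazy.explain())  # ['y=2', 'z!=3']
print(len(calls))      # 1
```

Both classes expose the same two methods, so conflict analysis can be written once and run unchanged against either kind of record.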

As a final point, [NOT06] says that in SMT with lazy explanation “each theory propagated literal may occur in more than one conflict”. This suggests that there may be a benefit to keeping an explanation that has been computed lazily in case it is needed again, i.e. the computational technique of memoization. However, the authors of this paper appear to be mistaken, as they use firstUIP and the following theorem shows that each explanation can be used at most once.

Theorem 3.1. Using firstUIP learning, each explanation for a specific solver event is needed at most once.

Proof. Clearly, in Algorithm 5 any (dis-)assignment for which the explanation is requested is at the current depth (in the algorithm, depth(e) ≥ cd). These explanations are requested exactly once. However, after the constraint is built, the solver backtracks at least one level, and hence the explanations at the current depth all become invalid and will never be requested again.

To be clear, I do not dispute that the same literal may be inferred multiple times using the exact same explanation in different branches of search; only that exactly the same (dis-)assignment can appear twice in different conflict analyses.

3.4.5. Failure. Conflict analysis (Algorithm 5) begins with a set of events that directly caused the initial failure. I will now describe issues with obtaining such a set, also touching on consistency of state at failures. There are two types of failure:


(1) failure detected by constraint, where a constraint detects that it has no remaining consistent assignments and stops immediately rather than removing values; and

(2) inconsistent state, i.e. a variable x has no values left in its domain; I will call this a domain wipeout in variable x.

3.4.5.1. Constraint detects. The first type of failure is dealt with by requiring the constraint to return a set of events that are inconsistent. The lexicographical ordering constraint works in this way and I will defer discussion until §3.5.2.3.

3.4.5.2. Inconsistent state. The second type of failure is similar because if variable x wipes out then the built-in MHAV constraint is failed. The set of events in the failure is the set of all disassignments to x. However, there are some subtleties in the implementation.

Minion deals with 3 types of failure due to an inconsistent domain:

• domain wipeout (DWO);

• assignment and pruning to same value; and

• out of range assignment, i.e. x ← a and a ∉ initdom(x).

The latter two types of failure involve assignments. It would be possible to sidestep them by transforming them into a DWO. This would be achieved by pruning all but the assigned value instead of assigning it directly, but the initial cut is smaller when assignments are allowed.

For a domain wipeout in variable x, the initial cut is the negation of the MHAV constraint, i.e. {x ↚ a : a ∈ initdom(x)}. This constraint is guaranteed to yield a new and valid firstUIP constraint when Algorithm 5 is applied to it, as I will shortly show, but a preliminary definition and lemma are needed first to make it easier to prove.

Definition 3.1. A constraint propagator C is said to be subsumed by another constraint propagator D if D will perform a superset of the propagation that C will irrespective of the domain state.

Lemma 3.2. (due to [Rya02]) Apart from a propagator corresponding to the initial cut, and assuming that constraints are propagated in strict order of when they become able to propagate, no propagator corresponding to a cut created by Algorithm 5 is subsumed by another constraint propagator already posted.

Proof. Suppose that a new cut is subsumed by another constraint C already posted. It was built by resolving together two existing constraints denoted {x, A} and {¬x, B} in this proof. Suppose without loss of generality (w.l.o.g.) that literal x was true before literal ¬x was forced. Consider the solver state immediately before ¬x is forced. All literals in the new cut are false and hence the corresponding constraint would have propagated in this state, had it been posted. Hence constraint C should have propagated to cause the failure before ¬x was forced. This is a contradiction because that didn’t happen and hence there is no C that subsumes the new constraint.

Now the following lemma shows that a new constraint that is not subsumed by any other will be obtained.

Lemma 3.3. A new firstUIP constraint is produced by applying Algorithm 5 (page 39) to {x ↚ a : a ∈ initdom(x)} when variable x has a DWO.

Proof. First, there must be at least two (dis-)assignments at the current depth. This is because if there were 0 (dis-)assignments at the current depth then the solver would have had a DWO at the previous depth. If there were 1, then it would have unit propagated the single (dis-)assignment not already assigned at the previous depth, and the conflict would have been avoided. Hence there are at least 2.

Hence Algorithm 5 will iterate at least once, since the loop condition will initially be true, and by Lemma 3.2 there will be a new constraint learned not subsumed by any other.

If an assignment x ← a and disassignment x ↚ a occur contemporaneously, there is a choice of which cut to begin with. The easy way is to use x ← a as a justification for pruning any remaining values in dom(x), and then to use the MHAV nogood as the initial cut. However, a smaller initial cut is available: the cut C = {x ← a, x ↚ a}. Unfortunately, the proof of Lemma 3.3 does not show that firstUIP will create a new constraint from C. The problem is that either x ← a or x ↚ a may be from an earlier depth. In this case Algorithm 5 would terminate immediately and learn a redundant constraint. To get around this, a mandatory resolution is applied to C before Algorithm 5 is used.

Lemma 3.4. Algorithm 5 generates a new firstUIP constraint from {x ← a, x ↚ a}, provided that the more recent of x ← a and x ↚ a has first been replaced by its explanation.

Proof. Suppose w.l.o.g. that depth(x ↚ a) > depth(x ← a) and hence the starting cut is C = {x ← a} ∪ E where E is the explanation for x ↚ a. E must contain at least one (dis-)assignment from the current depth, otherwise the propagation should have happened at an earlier depth. Hence C = {x ← a} ∪ E contains at least one (dis-)assignment from the current depth. Hence Algorithm 5 iterates at least once and, by Lemma 3.2, there will be a new constraint learned not subsumed by any other.

Finally, the initial cut for x ← a such that a ∉ initdom(x) is simply the explanation for x ← a.

Lemma 3.5. Let E be the explanation for x ← a. A new firstUIP constraint is produced by applying Algorithm 5 to E when x←a is an out of range assignment.

Proof. Similar to proof of Lemma 3.3.

Hence I have shown how to start the conflict analysis process for both lazy and eager learning, using various different starting points for failure, and proved that each one results in a valid new constraint.
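The three starting points for an inconsistent domain can be summarised in a short Python sketch. It is illustrative only: the function `initial_cut`, its string-valued `kind` argument, the tuple encoding of (dis-)assignments, and the `depth` and `explain` callbacks are all hypothetical and are not taken from minion's source.

```python
def initial_cut(kind, x, initdom, a=None, depth=None, explain=None):
    """Build the set of events from which conflict analysis starts.

    kind    -- "dwo", "assign_and_prune" or "out_of_range"
    depth   -- callable mapping an event to a comparable depth
    explain -- callable mapping an event to its explanation (a list)
    """
    if kind == "dwo":
        # Negation of MHAV: every value of x has been pruned.
        return {(x, "!=", v) for v in initdom}
    assign, prune = (x, "=", a), (x, "!=", a)
    if kind == "assign_and_prune":
        # Mandatory resolution: replace the more recent event of the pair
        # by its explanation before running conflict analysis.
        if depth(assign) > depth(prune):
            newer, older = assign, prune
        else:
            newer, older = prune, assign
        return {older} | set(explain(newer))
    if kind == "out_of_range":
        # The cut is simply the explanation for the offending assignment.
        return set(explain(assign))
    raise ValueError(f"unknown failure kind: {kind}")


depths = {("x", "=", 1): (2, 0), ("x", "!=", 1): (3, 4)}
expls = {("x", "!=", 1): [("y", "=", 0), ("z", "!=", 2)]}
cut = initial_cut("assign_and_prune", "x", {1, 2}, a=1,
                  depth=depths.get, explain=expls.get)
print(sorted(cut))
# [('x', '=', 1), ('y', '=', 0), ('z', '!=', 2)]
```

In the example the pruning x ↚ 1 is the more recent event, so it is replaced by its explanation while the older assignment x ← 1 is kept in the cut.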

3.4.5.3. Minimising learned constraints. When a (dis-)assignment is already false at the root node, i.e. at depth 0.i for some i, it will be false throughout search. Such a (dis-)assignment can simply be removed from any learned disjunction in which it appears. It is better for the solver to perform this optimisation, to avoid complicating each and every explainer⁴. This optimisation does not affect the level of propagation obtained, because the zero-level (dis-)assignment would never be watched by the

⁴ basically, adding throughout the code conditions that depth(dis) > 1.0 before including a

In document Improving the efficiency of learning CSP solvers (Page 85-92)