Recent advances in hardware model checking

3.5 Discussion and related work

3.5.3 Recent advances in hardware model checking

In 2010, Aaron Bradley proposed a new SAT-based algorithm for checking safety properties of finite state systems (Bradley, 2011). The algorithm, called Property Directed Reachability (PDR) and also known as IC3, has been shown to perform remarkably well in practice and was given the title of the “most important contribution to bit-level formal verification in almost a decade” (Eén et al., 2011). Recently, two new algorithms for checking liveness, an equivalent of our TST (un)satisfiability, have been developed, each building on the success of PDR and using it as a subroutine.

In Chapter 5, we will specialize LS4 to (single time) reachability⁸ and will obtain an algorithm very similar to PDR. It is, therefore, appropriate to briefly compare LS4 to those two liveness checking algorithms. The reader may want to return to this section after the relation of LS4 and PDR has been presented in full detail in Section 5.3.

It should be noted that, unlike LS4, NuSMV-BMC guarantees to find models of minimal size.

⁷ This is an LTL analogue of the k-induction method, originally proposed by Sheeran et al. (2000).

⁸ Instead of looking for an infinite path with infinitely many goal worlds, reachability is concerned with the existence of just a finite path from an initial world to a goal world. Specializing LS4 to reachability essentially corresponds to limiting its computation to the first block.

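[Figure 3.6 (image not reproduced): two panels, labeled LS4 and k-Liveness. Node labels: I and G in the LS4 panel; I, G, G^2, G^3, ... in the k-Liveness panel.]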

Figure 3.6: Comparing state space exploration of LS4 and k-Liveness. Constructed paths grow along the arrows, guiding layers (denoted by ovals) in the opposite direction.

Fair

The less related of the two algorithms is Fair by Bradley et al. (2011). It uses a SAT solver to iteratively pick a selection of worlds, called a skeleton, and attempts to connect these worlds by paths using reachability queries delegated to PDR. A successfully connected skeleton represents a witness for satisfiability of the given problem: an initial world connected to a loop with a goal world on it. A failure to connect two worlds of a skeleton leads to the discovery of a new wall in the state space, such that new skeletons must lie entirely on one side of the wall. Walls are extracted from clausal reachability information maintained by PDR, an almost exact equivalent of LS4’s layers.

The algorithm that counts

The idea behind the k-Liveness algorithm by Claessen and Sörensson (2012) is to reduce liveness checking to safety checking (i.e., reachability). Such a reduction was proposed earlier in the form of a one-time encoding (see Biere et al., 2002), but in k-Liveness, the reduction is incremental.

To show that the given problem is unsatisfiable, the algorithm counts and bounds the number of times the goal condition G can be satisfied along an infinite path that starts in an initial world. If the goal condition cannot be reached even once, the given problem is obviously unsatisfiable and k-Liveness terminates. Otherwise, it constructs a strengthened condition G^2, which expresses that the goal G should be satisfied twice in a row, and runs the safety check again. In general, if the given problem is unsatisfiable, the goal cannot be satisfied more than k times for some k ∈ N and the algorithm will terminate after failing to reach a world that satisfies G^{k+1}. In the setting of verifying hardware circuits considered by Claessen and Sörensson (2012), the transformation from condition G^i to G^{i+1} can be realized by adding a simple one-bit memory element to the circuit.
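The counting idea can be illustrated on a small explicit-state system. The following Python sketch rests on several assumptions that are not part of the original algorithm: the system is given as an explicit successor map rather than a circuit, a plain breadth-first search stands in for PDR, goal visits are counted anywhere along a path (not necessarily consecutively), and the names `can_hit_goal` and `k_liveness` are hypothetical. To make the loop terminate on satisfiable inputs, the sketch stops once the goal can be visited more often than there are states, since some goal state must then repeat and thus lie on a cycle.

```python
from collections import deque


def can_hit_goal(init, succ, goal, times):
    """Safety check: can a finite path from an initial state visit
    goal states at least `times` times? (BFS stands in for PDR here.)"""
    # A product state pairs a system state with a saturating visit counter,
    # playing the role of the extra memory elements added to the circuit.
    start = {(s, min(times, 1 if s in goal else 0)) for s in init}
    seen = set(start)
    queue = deque(start)
    while queue:
        state, count = queue.popleft()
        if count >= times:
            return True
        for nxt in succ.get(state, ()):
            item = (nxt, min(times, count + (1 if nxt in goal else 0)))
            if item not in seen:
                seen.add(item)
                queue.append(item)
    return False


def k_liveness(states, init, succ, goal):
    """Return ("unsatisfiable", k) if the goal can be visited at most k times,
    or ("satisfiable", None) if it can be visited more often than there are
    states (an assumption-based cutoff for explicit systems)."""
    for k in range(len(states) + 1):
        if not can_hit_goal(init, succ, goal, k + 1):
            return ("unsatisfiable", k)
    return ("satisfiable", None)


# Hypothetical three-state system: 0 -> 1 -> 2 -> 2.
STATES = {0, 1, 2}
SUCC = {0: [1], 1: [2], 2: [2]}
result_once = k_liveness(STATES, {0}, SUCC, goal={1})   # goal visited once only
result_loop = k_liveness(STATES, {0}, SUCC, goal={2})   # goal lies on a loop
```

In this toy setting each safety check restarts from scratch; the incremental character of k-Liveness, where information learned by PDR is carried over between queries, is not modeled.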

The algorithm relies on PDR for answering the reachability queries. While in LS4 we construct the model path from an initial world towards the goal, in PDR the default direction is reversed. This means that the equivalents of LS4’s layers in PDR encode over-approximations of the image of the set of initial worlds, as opposed to the preimage of the goal worlds as in LS4 (see Figure 3.6). This has an important consequence for k-Liveness. Because these “layers” do not depend on the goal conditions G^i, they can be shared and reused for guidance by the individual reachability queries. This was shown to be key to the good performance of the algorithm.

Performance comparison estimate

In order to obtain a rough comparison of the relative performance of LS4, Fair, and k-Liveness, we have extended LS4 to parse circuits in the AIGER format (Biere, 2012) of the Hardware Model Checking Competition (HWMCC) and to translate the corresponding liveness problems to equisatisfiable TSTs. Using the same time limit of 900 seconds per problem as in the competition, we then ran the extended LS4 on the 118 problems of HWMCC 2012 (Biere et al., 2012). The results we obtained can be summarized as follows.

If LS4 had participated in the competition, it could have⁹ placed third with 66 problems solved (44 satisfiable and 19 unsatisfiable), just after the system iimc2011, which implements Fair and solved 70 problems (27 satisfiable and 43 unsatisfiable), and the winner TIP, which implements k-Liveness and solved 92 problems (46 satisfiable and 46 unsatisfiable). There are, however, several reasons why this is a fair comparison only of the implementations, not of the algorithms themselves.

• k-Liveness is not suitable for recognizing satisfiable instances; therefore, it was complemented in TIP by a Bounded Model Checker running in lock-step.

• Both iimc2011 and TIP employ circuit-specific preprocessing of the input, while our extension of LS4 relies only on a straightforward encoding followed by variable and clause elimination (see also Chapter 4).

• Unlike LS4, both Fair in iimc2011 and k-Liveness in TIP incorporate special heuristic techniques for efficiently dealing with problems involving counters.¹⁰ About 10 of the unsatisfiable problems were solved thanks to this technique by each of the systems (see Claessen and Sörensson, 2012, Fig. 9).

For a better estimate of LS4’s potential as a liveness model checking algorithm, a full-fledged implementation including the mentioned techniques would be needed. This is left as future work.

⁹ Actually, our hardware configuration is stronger than the one used in the competition (3.16 GHz CPU and 16 GB RAM versus 2.6 GHz CPU and 8 GB RAM), but because all the problems solved by LS4 were solved before the 300 second mark, this should not make a difference.

¹⁰ Look for “skeleton-independent proofs” (in Bradley et al., 2011, Section III-E) and for “stabilizing constraints” (in Claessen and Sörensson, 2012, Section IV).

3.6 Conclusion

We have presented LS4, a new algorithm for LTL satisfiability. Building on the calculus LPSup and, in particular, on its ability to construct (partial) models on the fly, LS4 departs from the saturation paradigm and instead employs a modern SAT solver to efficiently drive the search and select inferences. This gives rise to a hybrid between explicit and symbolic exploration of the space of potential models.

LS4 was shown to perform remarkably well in practice, staying on par with or even improving over the state-of-the-art LTL satisfiability checkers. In particular, the model guidance approach clearly outperforms saturation. This is more evident on the satisfiable instances, where an explicit model is typically discovered long before a full saturation of the clause set can confirm satisfiability indirectly. But a performance gain can also be observed on the unsatisfiable problems. A likely explanation is the lazy manner in which LS4 derives clauses from failed attempts at model extension, thus focusing only on the relevant reasons for unsatisfiability.

4.1 Introduction

When developing practically useful tools for satisfiability checking of formulas in a particular logic, one is on a constant lookout for techniques that would improve performance and help to fight the typically high inherent computational complexity of the decision problem. One possibility for speeding up such a tool lies in simplifying the input formula before the actual decision method is started.

In the context of resolution-based methods for LTL, where the given formula is first translated into a clausal normal form, simplification means reducing the number of clauses and variables while preserving satisfiability of the formula. Such a preprocessing step may have a significant positive impact on the subsequent running time.

In this chapter we take inspiration from the SAT community where a technique called variable and clause elimination (Eén and Biere, 2005) has been shown to be particularly effective. It combines exhaustive application of the resolution rule over selected variables with subsumption and other reductions. Our main contribution lies in showing that variable and clause elimination can be adapted from SAT to the setting of LTL.
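For readers unfamiliar with the propositional technique, the following Python sketch shows one elimination step in its simplest form. It is only an illustration under stated assumptions: clauses are represented as frozensets of non-zero integers (a negative integer denotes a negated variable), the input contains no tautological clauses, and the function names `resolve` and `eliminate` are hypothetical. Practical preprocessors typically eliminate a variable only when this does not increase the size of the clause set; that check is omitted here.

```python
def resolve(pos_clause, neg_clause, var):
    """Resolvent of a clause containing `var` and a clause containing `-var`,
    or None if the resolvent is a tautology."""
    resolvent = (pos_clause - {var}) | (neg_clause - {-var})
    if any(-lit in resolvent for lit in resolvent):
        return None
    return frozenset(resolvent)


def eliminate(clauses, var):
    """Eliminate `var` by replacing all clauses that mention it with their
    pairwise resolvents over `var`, then drop subsumed clauses."""
    pos = [c for c in clauses if var in c]
    neg = [c for c in clauses if -var in c]
    result = {c for c in clauses if var not in c and -var not in c}
    for p in pos:
        for n in neg:
            r = resolve(p, n, var)
            if r is not None:
                result.add(r)
    # Subsumption: a clause is redundant if a proper subset is also present.
    return {c for c in result if not any(d < c for d in result)}


# Hypothetical input: (x1 or x2), (not x1 or x3), (not x1 or not x2), (x3 or x4).
CNF = {
    frozenset({1, 2}),
    frozenset({-1, 3}),
    frozenset({-1, -2}),
    frozenset({3, 4}),
}
REDUCED = eliminate(CNF, 1)
```

The resulting set is equisatisfiable with the input and no longer mentions the eliminated variable, which is the property the chapter sets out to transfer to the temporal setting.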

Preprocessing and normal forms

It was observed by Eén and Biere (2005) that work on preprocessing techniques can be seen as a viable alternative to optimizing normal form transformation procedures. Let us recall the SNF transformation presented in Figure 2.2 (Section 2.2.2) and consider the LTL formula

¬p ∨ p. (4.1)

In order to produce an equisatisfiable SNF, our transformation will introduce several new variables and eventually end up with the following set of temporal clauses

i, (¬i ∨ u ∨ v), (¬u ∨ ¬p), (¬v ∨ w), (¬w ∨ w), (¬w ∨ p).

With variable and clause elimination for LTL, as described in this chapter, we will be able to reduce this set to

¬p ∨ w, (¬w ∨ w), (¬w ∨ p), (4.2)

and, if we also eliminate the original variable p, further to just the single clause

By enlarging the set of rules of Figure 2.2 and introducing additional side conditions, it would be possible to optimize the transformation such that it directly produces (4.2) as the SNF of (4.1). We believe that investing the effort into general-purpose preprocessing is more worthwhile. For one thing, it is applicable even when we are not in control of the normal form transformation and obtain the input already in SNF. Moreover, it typically allows us to reduce the input further, as shown by our example and its final form (4.3).

Strategy and chapter overview

We start our exposition by reviewing propositional variable and clause elimination in Section 4.2.1. To lift the technique from SAT to LTL, we reuse the idea of labels and labeled clauses (recall Section 2.3.1). Because the labels we introduced in Chapter 2 are not sufficiently expressive to justify the lift of variable elimination, we extend them by a third component and correspondingly update the semantics and operations on labels (Section 4.2.2). The intuitive explanation is that we need a correspondence between labeled resolution and the represented propositional resolution that is not only sound, as was sufficient in the case of LPSup, but also complete, in the sense that every propositional resolution is lifted by some labeled resolution.

We develop the actual variable and clause elimination for labeled clauses in Section 4.2.3. Interestingly, we will be able to show that when the elimination is completed, clauses with a non-trivial third label component can be removed from the resulting set without affecting satisfiability. This means that the third label component, although an important part of the theoretical argument, does not need to be explicitly realized in an actual implementation.

As a proof of concept, we implemented (a restricted version of) variable and clause elimination for LTL, building on top of the simplification capabilities of the SAT solver Minisat (Section 4.3.1). We then experimentally evaluated the effect of the preprocessing on the performance of resolution-based LTL provers LS4 and TRP++ (Section 4.3.2). Our results confirm that even in the temporal setting substantial reductions in formula size and subsequent decrease of running time can be achieved.

The results of this chapter have been published in (Suda, 2013d,a).

In document Resolution-based methods for linear temporal reasoning (Page 123-128)