Overview - Disproving in First-Order Logic with Definitions, Arithmetic and Finite Domains

5.1.1 Overview

While the first-order validity problem is semi-decidable, the satisfiability problem is not, as there is no way to enumerate first-order models. If interpreted theories are added, then even refutationally complete validity checking becomes intractable (linear integer arithmetic with free symbols has a Π¹₁-hard validity problem [Dow72, Hal91]). In practice, this lack of completeness is a major concern in software verification applications, including ranking function and loop invariant synthesis, which require the capability to disprove non-valid proof obligations. In such cases, incomplete theorem provers run out of resources or report ‘unknown’ instead of detecting non-validity (i.e., satisfiability of the negated conjecture).

There are various methods to circumvent this problem. SMT-solvers generally use instantiation heuristics to reduce the input problem to a quantifier-free one, while approaches based on first-order theorem proving are either incomplete, do not accept free BG-sorted operators at all (for example [KV07, Rüm08, GK06, BT11]), or are complete only for certain fragments of the input language.

Nieuwenhuis et al. [NOT06] give an overview of SMT instantiation heuristics, while specific ones are described by Ge et al. [GBT07], and de Moura and Bjørner [dMB07]. These heuristics are complete only in rather restricted cases, as in Ge and de Moura [GdM09]. For theorem proving, the approaches described in [BGW94, AKW09, KW12, BW13a, BW13b] all restrict the input language to obtain completeness.

Some complete fragments can be very useful. For example, the data structure theories given previously are known to have finite saturations under the Superposition calculus (when the conjecture is ground and without interpreted theories) [ABRS09]. It seems straightforward to include theory reasoning in these fragments, so long as compactness is not a problem. Since the only new inferences on the BG part of clauses are simplifications or constraint refutations, a finite saturation should be possible. The Define rule is then able to recover sufficient completeness by renaming each of the finitely many ground free BG-sorted terms in the finite saturation.

More general fragments, such as the array property fragment, allow limited use of quantifiers. These are usually instantiated first, then the proof goal is discharged using a dedicated decision procedure for the ground fragment. Solvers for first-order logic typically degrade in performance as the number of clauses increases, hence it is desirable to minimize the number of instances, if possible. However, the ability of first-order solvers to reason natively with quantifiers properly extends the capability of SMT solvers.

As described in Chapter 2, the Hierarchic Superposition calculus requires both compactness of the base specification and sufficient completeness of the input clause set for refutation completeness. A lack of sufficient completeness results either in non-termination or, more seriously, in termination with a saturated clause set none of whose models properly extend any model of the base specification B. Thus, any clause set that has a finite saturation under the Hierarchic Superposition calculus requires sufficient completeness in order to conclude B-satisfiability.

The GBT-fragment, in which all free BG-sorted terms are ground, is sufficiently complete. This will be the starting point of the method described in this chapter.

The GBT-fragment will be modelled by finitely quantified clauses, in which every variable occurring below a free BG-sorted operator is quantified over a finite cardinality subset of its domain. The advantage of this is twofold: instantiation is limited to only those quantifiers which must be instantiated for completeness, and sets of clause instances (and hence sets of relevant terms) can be represented efficiently by ΣZ-formulas.

If all quantifiers range over finite sets, decidability can be recovered trivially by exhaustive instantiation, followed by calling a suitable SMT-solver. Of course, the instantiation approach scales poorly with increasing domain size, as observed in the context of finite-model finding (see, for example, [Sla92, ZZ95, McC03, CS03, BFdNT09, RTG+13, RTGK13]).
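The scaling problem can be made concrete with a small sketch of exhaustive instantiation. This is only an illustration, not the procedure of this chapter: the function name `ground_instances`, the string-template representation of clauses and the example clause are all assumptions introduced here.

```python
from itertools import product

def ground_instances(template, domains):
    """Return every ground instance of a clause template (hypothetical helper).

    template: a format string with named holes, e.g. "f({x}) <= f({y})"
    domains:  dict mapping each variable name to a finite range of integers
    """
    names = sorted(domains)
    return [
        template.format(**dict(zip(names, values)))
        for values in product(*(domains[name] for name in names))
    ]

# Three finitely quantified variables with domains of size 3, 3 and 2.
instances = ground_instances(
    "f({x}) <= f({y}) + {z}",
    {"x": range(3), "y": range(3), "z": range(2)},
)
print(len(instances))  # the product of the domain sizes
```

The number of instances is the product of the domain sizes, so three variables ranging over ten values each already yield a thousand ground clauses from a single input clause.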

The main goal, then, is to design a procedure that recovers sufficient completeness while minimizing instantiation of clauses. To this end, the satisfiability procedure maps multiple free BG-sorted terms to the same constant, and refinements are made by exempting selected terms from that default assignment in a conflict-guided way. After each refinement, the given clause set is rewritten with the new assignment into a clause set with sufficient completeness, so B-satisfiability can be checked with existing reasoners. Suitable reasoners are, e.g., theorem provers implementing Hierarchic Superposition and, with one more simple transformation step, SMT-solvers for the EA-fragment of the background theory. The procedure stops after finitely many refinement steps, either with a representation of a model (i.e., a saturated clause set) or with a set of clause instances which demonstrates the unsatisfiability of the input clause set.

The satisfiability procedure can be understood as testing a succession of over- and under-approximations of the given clause set. Under-approximations are created using a conjectured equality relation on the free BG-sorted terms. Concretely, terms assigned the same default constant are in the same equivalence class. Again, simplifying the clause set using this relation (i.e., replacing free BG-sorted terms with constants) produces a clause set for which saturation in the Hierarchic Superposition calculus implies B-satisfiability. It is called an under-approximation in keeping with naming conventions, e.g., in counter-example guided abstraction refinement, where an under-approximation may exclude some Σ-interpretations, but satisfiability of the under-approximation implies satisfiability of the original set.
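The effect of the conjectured equality relation on the number of clause instances can be sketched as follows. The sketch assumes a single free BG-sorted operator f over a finite index range and ground instances of the form f(i) <= f(j); the names `rename` and `under_approximate` and the constant names `d` and `c2` are hypothetical and chosen only for illustration.

```python
def rename(index, exempt):
    """Constant standing in for the term f(index): its own constant if the
    term is exempt, otherwise the shared default constant "d"."""
    return f"c{index}" if index in exempt else "d"

def under_approximate(pairs, exempt):
    """Rewrite each instance f(i) <= f(j) into a constraint over constants.
    Instances that become identical after renaming collapse into one."""
    return {(rename(i, exempt), rename(j, exempt)) for i, j in pairs}

# Ten ground instances f(i) <= f(j) with 0 <= i < j < 5.
pairs = [(i, j) for i in range(5) for j in range(5) if i < j]

print(len(pairs))                            # instances before renaming
print(len(under_approximate(pairs, set())))  # every term mapped to "d"
print(len(under_approximate(pairs, {2})))    # f(2) exempted from the default
```

With no exempt terms, all ten instances collapse to the single constraint d <= d. Exempting f(2) leaves three distinct constraints, which is still fewer than the ten original instances.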

The over-approximation phase takes a certain subset of clause instances which have been produced by a sound assignment to free BG-sorted terms, and tests this for unsatisfiability. If neither test is conclusive, then the current equality relation is refined by removing some terms from equivalence classes. Effectively, this enlarges the set of Σ-interpretations considered in the under-approximation phase. Doing so naïvely will require more work than simply instantiating outright, and so a critical part of the procedure is the heuristic used to choose the terms to be removed from the equivalence relation and added as instances after an iteration.
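The alternation of the two phases can be illustrated with a toy loop. It is a sketch under strong assumptions and not the procedure developed in this chapter: the background sort is replaced by a small finite range of values, brute-force enumeration stands in for the reasoner, and the refinement heuristic (exempt the term occurring in the most instances) is a placeholder. The names `holds`, `search`, `refute` and `solve` are hypothetical.

```python
from itertools import product

def holds(instances, val):
    """True if every instance (indices, predicate) is satisfied by val."""
    return all(pred(*(val[i] for i in idx)) for idx, pred in instances)

def search(n, instances, values, exempt):
    """Under-approximation: all non-exempt terms f(i) share one default value."""
    ex = sorted(exempt)
    for default in values:
        for choice in product(values, repeat=len(ex)):
            val = {i: default for i in range(n)}
            val.update(zip(ex, choice))
            if holds(instances, val):
                return val  # also a model of the original instances
    return None

def refute(instances, values, exempt):
    """Over-approximation: keep only instances that mention exempt terms
    exclusively; if these alone are unsatisfiable, so is the whole set."""
    core = [(idx, pred) for idx, pred in instances if set(idx) <= exempt]
    ex = sorted(exempt)
    for choice in product(values, repeat=len(ex)):
        if holds(core, dict(zip(ex, choice))):
            return None
    return core

def solve(n, instances, values):
    exempt = set()
    while True:
        model = search(n, instances, values, exempt)
        if model is not None:
            return "sat", model, exempt
        core = refute(instances, values, exempt)
        if core is not None:
            return "unsat", core, exempt
        # Placeholder heuristic: exempt the most frequently occurring term.
        rest = [i for i in range(n) if i not in exempt]
        exempt.add(max(rest, key=lambda i: sum(i in idx for idx, _ in instances)))

lt = lambda a, b: a < b
# f(0) < f(1) and f(1) < f(2), with f defined on indices 0..5.
chain = [((0, 1), lt), ((1, 2), lt)]
# f(0) < f(1) and f(1) < f(0): no assignment can satisfy both.
cycle = [((0, 1), lt), ((1, 0), lt)]

sat_status, sat_model, sat_exempt = solve(6, chain, range(4))
unsat_status, unsat_core, unsat_exempt = solve(6, cycle, range(4))
print(sat_status, sat_model, sat_exempt)
print(unsat_status, len(unsat_core), unsat_exempt)
```

In both runs only two of the six terms are exempted before a test becomes conclusive. The loop terminates because, once every term is exempt, the two tests range over the same assignments and one of them must succeed.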

In summary, the satisfiability algorithm aims to fix the immediate problem that follows the restriction to the GBT-fragment: the exponential increase in clause numbers due to instantiation with ground free BG-sorted terms. The fix involves representing clause instances symbolically using LIA formulas, then aggressively replacing relevant terms with constants. This unsound step is rectified by heuristic instantiation of clauses which appear to be causing unsatisfiability, a form of conflict-guided instantiation.

5.2 Example Application
