AlloyPF starts its proof process by translating the Alloy specification (model and assertion) to an equisatisfiable RFOL formula F (details of the RFOL logic and the Alloy to RFOL translation can be found in Chapter 4 and Chapter 5, respectively).
Using our SufGT technique (see Chapter 7 for details), AlloyPF first checks whether a general sufficient maximal bound exists for F, and thus for the Alloy specification. If so, AlloyPF uses the Alloy Analyzer to prove the correctness of the Alloy specification via bounded verification within the computed scope. For the specifications of both our assertions, noDirAliases and someDir, no general sufficient maximal bound could be computed; moreover, no individual type admits a sufficient maximal bound. In such a case, and in order to gain confidence in the validity of the assertion, AlloyPF applies bounded verification using the Alloy Analyzer for increasingly larger scopes until a time-out (or out-of-memory) is reached. The time-out was set to 600 seconds. For both specifications, the Alloy Analyzer shows bounded correctness up to a scope of 56.
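The scope-deepening loop described above can be sketched as follows. This is a minimal illustration, not AlloyPF's actual code: the callable `check`, the result codes, and the reading of the 600-second limit as one overall budget (rather than a per-scope limit) are all assumptions made for the example.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical result codes for a single bounded check at one scope.
HOLDS, COUNTEREXAMPLE, TIMEOUT = "holds", "counterexample", "timeout"


def deepen_scope(
    check: Callable[[int, float], str],
    budget_s: float = 600.0,
    first_scope: int = 1,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[int, Optional[int]]:
    """Run bounded checks for increasing scopes until the budget is used up.

    `check(scope, remaining_seconds)` stands in for one bounded analysis run
    (an assumption; it is not a real Alloy Analyzer API).

    Returns (largest scope at which the assertion held, refuting scope or None).
    """
    deadline = clock() + budget_s
    verified = 0
    scope = first_scope
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            return verified, None      # overall budget exhausted
        outcome = check(scope, remaining)
        if outcome == COUNTEREXAMPLE:
            return verified, scope     # assertion refuted at this scope
        if outcome == TIMEOUT:
            return verified, None      # this run hit the time limit
        verified = scope               # assertion holds within this scope
        scope += 1
```

A result such as `(56, None)` would correspond to the situation reported above: no counterexample up to scope 56, and no conclusive answer beyond it.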
Having gained this confidence in the validity of the assertions, AlloyPF starts its main automatic proof process, using an engine named AlloyPE. This process is based on the satisfiability modulo theories (SMT) analysis of the RFOL translation of the Alloy specification. Because of the complete axiomatization of relational operators in RFOL, the RFOL formulas resulting from the Alloy to RFOL translation are, in general, too difficult for SMT solvers to analyze. To overcome this limitation, AlloyPE uses our SB+ simplification technique (see Chapter 6 for details). Since SB+ does not preserve satisfiability in general, AlloyPE checks a priori whether the RFOL formulas belong to the completeness fragment of SB+. Both of our assertions do.
We denote the SB+-simplified RFOL translations of the Alloy specifications of the assertions noDirAliases and someDir by F1 and F2, respectively. In a first attempt, AlloyPE checks the satisfiability of their negations ¬F1 and ¬F2 using the Z3 SMT solver.
Because of the equisatisfiability of the RFOL translation and of the SB+ simplification, AlloyPE can directly conclude the validity of the corresponding assertion if the SMT solver reports “unsat”, and its invalidity if the SMT solver reports “sat”.³
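As a small illustration of this decision rule, the mapping from the solver's answer on the negated formula to a verdict on the assertion can be written as below. The function name and the verdict strings are hypothetical; only the three solver answers and the treatment of “unknown” as a time-out come from the text.

```python
def assertion_verdict(solver_answer: str) -> str:
    """Map the SMT answer for the negated formula to a verdict on the assertion.

    The solver is assumed to have been run on the negation (e.g. ¬F1), so
    "unsat" means no counterexample exists and the assertion is valid.
    """
    answer = solver_answer.strip().lower()
    if answer == "unsat":
        return "valid"
    if answer == "sat":
        return "invalid"
    # "unknown" (or anything else) is treated like a time-out:
    # the first proof attempt is inconclusive.
    return "timeout"
```

The rule is only sound because both translation steps preserve satisfiability for the formulas at hand; outside the completeness fragment of SB+, a “sat” answer could not be read as invalidity.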
Using this process, AlloyPE proves the validity of the first assertion, noDirAliases, in 0.19 seconds but times out (after 600 seconds) when trying to prove the validity of the second assertion, someDir. These results might seem odd at first glance, since the first assertion looks (syntactically) more complicated than the second one. The reason for this behavior, however, goes back to a general limitation of SMT solvers in handling quantified formulas: an SMT solver can only refute a formula containing quantifiers if it can construct, using quantifier instantiations, a ground subformula of the original formula that is by itself refutable. This is the case for ¬F1 but not for ¬F2. For RFOL formulas resulting from the Alloy to RFOL translation, this limitation is due to a large extent to the use of so-called recursive theories. In Alloy, these are transitive closure, set cardinality, ordering, and integers.
³ We treat “unknown” reports of the SMT solver as time-outs.
For formulas on which the first proof attempt times out, AlloyPE applies, in a second attempt, our TCPInv calculus (see Chapter 8 for details). The TCPInv calculus can automatically refute RFOL formulas that cannot be refuted by standard SMT solving, and it uses transitive closure as its only recursive theory. It first detects all essential relational paths, i.e., positive literals containing transitive closure whose refutation is essential for the refutation of the overall formula. For each essential relational path, TCPInv detects so-called path-invariants and injects them into the original formula until the SMT solver can refute the essential relational path via standard SMT solving. The overall formula is refuted once all essential relational paths are refuted.
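The control flow of this second attempt can be sketched roughly as follows. This is only an outline of the loop as described in the paragraph above, not the calculus itself (which is defined in Chapter 8): every callable passed in is a hypothetical placeholder, and the assumption that injected invariants accumulate in one shared list is ours.

```python
from typing import Callable, Iterable, List, Tuple


def refute_with_path_invariants(
    paths: Iterable[str],
    is_essential: Callable[[str], bool],
    candidates_for: Callable[[str], Iterable[str]],
    is_path_invariant: Callable[[str, str], bool],
    smt_refutes: Callable[[str, List[str]], bool],
) -> Tuple[bool, List[str], int]:
    """Try to refute every essential relational path by injecting invariants.

    Returns (overall formula refuted?, injected invariants, candidates checked).
    All five arguments are hypothetical stand-ins for the real analyses.
    """
    injected: List[str] = []
    checked = 0
    for path in paths:
        if not is_essential(path):
            continue                      # non-essential paths are skipped
        if smt_refutes(path, injected):
            continue                      # already refutable as is
        refuted = False
        for candidate in candidates_for(path):
            checked += 1
            if not is_path_invariant(path, candidate):
                continue                  # candidate rejected
            injected.append(candidate)    # inject into the formula
            if smt_refutes(path, injected):
                refuted = True
                break
        if not refuted:
            return False, injected, checked
    return True, injected, checked
```

In this sketch, the number of candidates checked can exceed the number of invariants injected, since rejected candidates are counted but never added to the formula.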
Applying this technique to our second assertion, someDir, AlloyPE proves its validity in 16.31 seconds. Out of 6 relational paths, only one is essential. In order to refute this single essential relational path, AlloyPE needs to detect and inject 4 path-invariants. In total, AlloyPE checks 14 path-invariant candidates.
Chapter 4
A First-Order Relational Logic
In order to prove the correctness of Alloy proof obligations, we translate them to a first-order relational logic (RFOL). We designed RFOL with two objectives: (1) it should express the core Alloy language almost directly¹, in order to ease the correctness argumentation, and (2) it should fit into a target logic of satisfiability modulo theories (SMT) solvers, in order to exploit their reasoning power. We start by fixing the following basic, mutually disjoint sets:
• An infinite set 𝒮 of sort symbols s.
• An infinite set 𝒳 of variables x.
• An infinite set ℱ of function symbols f.
4.1 Sorts
In RFOL, a many-sorted language, each term is associated with a unique sort, and each sort is denoted by a sort symbol. The sorts in RFOL are neither composed nor parametrized. Also, the sort hierarchy is flat, i.e., all sorts are mutually disjoint.
Consequently, a sort signature Ω of RFOL is as simple as a subset S of 𝒮.
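To make this concrete, a sort signature can be modeled as a plain set of atomic names. The following is a minimal sketch under that reading; the class and function names, and the example sorts `Dir` and `File`, are hypothetical and not part of RFOL's definition.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Sort:
    """An atomic sort symbol: no components, no parameters, no subsorts."""
    name: str


@dataclass(frozen=True)
class Variable:
    """A variable, the simplest kind of term, carrying exactly one sort."""
    name: str
    sort: Sort


def sort_signature(names: Iterable[str]) -> FrozenSet[Sort]:
    """Build a sort signature: simply a finite set of sort symbols."""
    return frozenset(Sort(n) for n in names)


def is_well_sorted(var: Variable, signature: FrozenSet[Sort]) -> bool:
    """A variable is admissible only if its sort belongs to the signature."""
    return var.sort in signature


# Hypothetical example sorts for a file-system model.
omega = sort_signature(["Dir", "File"])
d = Variable("d", Sort("Dir"))
```

Because sorts are atomic and mutually disjoint, membership in the set is the only check needed; no subsort or unification reasoning is involved.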