Solving technologies - Counterexamples in probabilistic verification

2.6 Solving technologies

We introduce three different concepts of solvers which are all used later on. We use classical SAT solvers, their extension by “theories”, SAT modulo theories (SMT), and mixed integer linear programming (MILP).

2.6.1 SAT solving

We present a short summary of SAT solving. We first need the syntax of quantifier-free Boolean formulae.

Definition 38 (Boolean formula) Assume a set Var = {x₁, . . . , x_n} of Boolean variables. A quantifier-free Boolean formula is given by the following grammar:

ϕ ::= x_i | ¬ϕ | (ϕ ∧ ϕ)

with x_i ∈ Var.

The set of all Boolean formulae over Var is denoted by Bool(Var). We use syntactic sugar like ∨, →, ↔. As for evaluations for BDDs (see Section 2.4.1), let V(Var) = {ν : Var → {0, 1}} be the set of all variable valuations. An evaluation (also referred to as an assignment) ν ∈ V(Var) assigns a Boolean value to every variable from Var. Via the usual semantics of the Boolean connectives, ν extends to Boolean formulae; we overload the evaluation function accordingly as ν : Bool(Var) → {0, 1}. A formula ϕ ∈ Bool(Var) is called satisfiable if and only if there is an evaluation ν ∈ V(Var) such that ν(ϕ) = 1. The problem of checking the satisfiability of Boolean formulae is called the satisfiability problem. Tools deciding whether a Boolean formula is satisfiable are called SAT solvers. Though the satisfiability problem is known to be NP-hard [Coo71], SAT solvers have advanced to the point that practical examples with many thousands of variables can be handled. Fundamental work was done by Davis, Putnam, Logemann and Loveland, who developed the DPLL algorithm [DP60, DLL62].

For instance, a famous open-source SAT solver is MiniSAT [ES03]; a modern extension is Glucoser [AS09]. For further introduction to SAT solving we refer to [BHvMW09].
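To make the notions of evaluation and satisfiability concrete, the following minimal sketch evaluates formulae of the grammar in Definition 38 and searches all valuations exhaustively. It is an illustration only: the tuple encoding of formulae and the names evaluate, satisfying_valuation and disj are assumptions made for this sketch, and real SAT solvers rely on DPLL-style search rather than enumeration.

```python
from itertools import product

# Formulae follow the grammar of Definition 38, encoded as nested tuples
# (this encoding is an assumption of the sketch, not part of the definition):
#   ("var", "x1") | ("not", phi) | ("and", phi, psi)

def evaluate(phi, nu):
    """Extend a valuation nu : Var -> {0, 1} to the formula phi."""
    tag = phi[0]
    if tag == "var":
        return nu[phi[1]]
    if tag == "not":
        return 1 - evaluate(phi[1], nu)
    if tag == "and":
        return evaluate(phi[1], nu) & evaluate(phi[2], nu)
    raise ValueError(f"unknown connective: {tag}")

def satisfying_valuation(phi, variables):
    """Return some nu with nu(phi) = 1, or None if phi is unsatisfiable."""
    # Exhaustive search over all 2^n valuations; exponential in n.
    for bits in product((0, 1), repeat=len(variables)):
        nu = dict(zip(variables, bits))
        if evaluate(phi, nu) == 1:
            return nu
    return None

def disj(phi, psi):
    """Syntactic sugar: phi OR psi, expressed as NOT(NOT phi AND NOT psi)."""
    return ("not", ("and", ("not", phi), ("not", psi)))

x1, x2 = ("var", "x1"), ("var", "x2")
sat_example = ("and", disj(x1, x2), ("not", x1))   # (x1 OR x2) AND NOT x1
unsat_example = ("and", x1, ("not", x1))           # x1 AND NOT x1
```

Calling satisfying_valuation(sat_example, ["x1", "x2"]) yields a valuation that sets x1 to 0 and x2 to 1, whereas unsat_example has no satisfying valuation.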

2.6.2 SMT solving

SMT refers to SAT-modulo-theories [dMB11], which is a generalization of the classical satisfiability problem (SAT). An SMT formula allows for atoms of a given theory as atomic propositions; here we use linear real arithmetic as the theory.

Definition 39 (SMT formula) Assume a set Var = {x₁, . . . , x_n} of real-valued variables. A quantifier-free linear real-arithmetic formula ϕ is given by the following grammar:

p ::= t | p + p
c ::= p = p | p < p
ϕ ::= c | ¬ϕ | ϕ ∧ ϕ

where a ∈ Z.

SMT problems can be solved by the combination of a DPLL-procedure (as used for deciding SAT problems) and a theory solver that is able to decide the satisfiability of conjunctions of theory atoms. For a description of such a combined algorithm see, e. g., [DdM06]. Popular SMT solvers are, e. g., Z3 [dMB08] or MathSAT [BCF+08].
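The interplay of the DPLL-procedure and the theory solver can be illustrated on two small formulae. This is a hand-worked sketch, assuming that terms range over variables and integer constants; the Boolean abstraction variables a_1, ..., a_4 are introduced here for illustration only.

```latex
% phi_1 is satisfiable
\varphi_1 = (x + y < 3) \wedge \neg (x = y)
% Boolean abstraction: a_1 \wedge \neg a_2,
%   with a_1 standing for (x + y < 3) and a_2 for (x = y).
% DPLL proposes a_1 = 1, a_2 = 0. The theory solver checks the conjunction
%   x + y < 3 \wedge x \neq y, which is satisfied, e.g., by x = 0, y = 1.

% phi_2 is unsatisfiable
\varphi_2 = (x < y) \wedge (y < x)
% Boolean abstraction: a_3 \wedge a_4, satisfied by a_3 = a_4 = 1.
% The theory solver rejects the conjunction x < y \wedge y < x as infeasible.
% Since the abstraction has no other satisfying assignment,
%   \varphi_2 is unsatisfiable.
```

The second formula shows why the theory solver is needed: its Boolean abstraction is satisfiable, and only the arithmetic meaning of the atoms rules out every candidate assignment.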

2.6.3 MILP solving

A mixed integer linear program optimizes a linear objective function under a condition specified by a conjunction of linear inequalities. A subset of the variables in the inequalities is restricted to take only integer values, which makes solving MILPs NP-hard [GJ79, Problem MP1].

Definition 40 (Mixed integer linear program) Let A ∈ Q^{m×n}, B ∈ Q^{m×k}, b ∈ Q^m, c ∈ Q^n, and d ∈ Q^k. A mixed integer linear program (MILP) consists in computing min c^T x + d^T y

such that Ax + By ≤ b and x ∈ R^n, y ∈ Z^k.

MILPs are typically solved by a combination of a branch-and-bound algorithm with the generation of so-called cutting planes. These algorithms heavily rely on the fact that relaxations of MILPs, which result from removing the integrality constraints, can be solved efficiently. MILPs are widely used in operations research, hardware-software co-design, and numerous other applications. Efficient open-source as well as commercial implementations are available, such as Gurobi [Gur13], Scip [Ach09] or Cplex [cpl12] by IBM. We refer to, e. g., [Sch86] for more information on solving MILPs.
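The role of the relaxation in branch-and-bound can be sketched on a very restricted special case, the 0/1 knapsack problem: maximize the total value of the chosen items subject to one weight constraint, with y ∈ {0, 1}^k and no continuous variables. The sketch below is an assumption-laden illustration, not how the solvers named above work: it omits cutting planes, assumes positive weights, and solves the relaxation (0 ≤ y_i ≤ 1) greedily by value density instead of calling an LP solver. The function name knapsack_branch_and_bound is hypothetical.

```python
def knapsack_branch_and_bound(values, weights, capacity):
    """Maximize sum(values[i] * y[i]) s.t. sum(weights[i] * y[i]) <= capacity,
    y[i] in {0, 1}. Assumes all weights are positive."""
    # Sorting by value density lets the relaxation be solved greedily.
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    best_value, best_choice = 0, []

    def relaxation_bound(depth, value, room):
        """Optimum of the relaxation (0 <= y[i] <= 1) over undecided items."""
        bound = value
        for i in order[depth:]:
            if weights[i] <= room:
                room -= weights[i]
                bound += values[i]
            else:
                # Take a fraction of the first item that does not fit.
                return bound + values[i] * room / weights[i]
        return bound

    def branch(depth, value, room, chosen):
        nonlocal best_value, best_choice
        if value > best_value:
            best_value, best_choice = value, sorted(chosen)
        if depth == len(order):
            return
        # Prune: even the relaxed problem cannot beat the incumbent.
        if relaxation_bound(depth, value, room) <= best_value:
            return
        i = order[depth]
        if weights[i] <= room:                       # branch y[i] = 1
            branch(depth + 1, value + values[i], room - weights[i], chosen + [i])
        branch(depth + 1, value, room, chosen)       # branch y[i] = 0

    branch(0, 0, capacity, [])
    return best_value, best_choice
```

For values (60, 100, 120), weights (10, 20, 30) and capacity 50, the search returns the value 220 with items 1 and 2 selected. In the terms of Definition 40 this corresponds to d being the negated value vector, a single constraint row holding the weights, and additional bound constraints 0 ≤ y ≤ 1.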

3 Related work

In this chapter we discuss other works on counterexample generation for DTMCs and MDPs or PAs, respectively. We explain the intuition of all available approaches that—to the best of our knowledge—have been made so far. Differences and relations to our own work are discussed at the end of each corresponding chapter. Roughly, we have two categories: path-based counterexamples and subsystem-based counterexamples. For a detailed overview with numerous examples we refer to [21].

3.1 Path-based counterexamples

Counterexamples based on paths of a system are the classic notion as in Definition 35 on Page 45. We summarize all approaches that represent counterexamples in this way.

3.1.1 Minimal and smallest counterexamples

In [HK07a, HKD09], Han and Katoen shaped the notions of minimal and smallest counterexamples. A minimal counterexample for a reachability property and a DTMC contains the minimal number of evidences that is needed to form a counterexample. In general, a minimal counterexample is not unique. To require a stronger condition, a smallest counterexample is a minimal counterexample with maximal probability. Again, this stronger condition does not necessarily induce a unique counterexample.
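The two notions can be stated compactly as follows. This is a sketch under assumed notation, which may differ from that of Definition 35: the violated property is an upper probability bound λ on reaching the target states, C and C' denote sets of evidences, and Pr(C) is the sum of the probabilities of the paths in C.

```latex
\begin{align*}
C \text{ is a counterexample}
  &\iff \Pr(C) = \sum_{\pi \in C} \Pr(\pi) > \lambda, \\
C \text{ is minimal}
  &\iff |C| \le |C'| \text{ for all counterexamples } C', \\
C \text{ is smallest}
  &\iff C \text{ is minimal and } \Pr(C) \ge \Pr(C')
        \text{ for all minimal counterexamples } C'.
\end{align*}
```

Uniqueness fails as soon as two evidences have the same probability: exchanging one for the other changes neither the size nor the probability of the set.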

As a way to compute these counterexamples, the authors of [HK07a, HKD09] use a k-shortest path algorithm for weighted directed graphs. They show that the k most probable paths in a DTMC (forming a smallest counterexample) correspond to the k shortest paths in a related weighted digraph. This digraph is constructed from the DTMC by considering the negative logarithms of the transition probabilities as weights of the edges. In order to compute these k shortest paths, the authors choose the recursive enumeration algorithm (REA) [JM99] by Jiménez and Marzal, where the number k of paths is not determined beforehand but on-the-fly by an external condition. Here, this means that the search terminates once a counterexample has been found, i. e., once the paths have accumulated enough probability mass. This avoids fixing some (arbitrary) k in advance, and allows for finding the smallest k yielding a counterexample. Besides the mentioned publications, for more detailed information we refer to the dissertation of Tingting Han [Han09].
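The reduction to shortest paths can be sketched as follows. Note that this is not the REA of [JM99] but a plain best-first enumeration over path prefixes, chosen here only because it is short. It assumes an explicit DTMC given as a dictionary, that evidences end at the first visit of a target state, and an upper probability bound; all names (states_reaching, smallest_counterexample, example_dtmc, max_paths) are hypothetical.

```python
import heapq
from math import log

def states_reaching(transitions, targets):
    """States from which a target is reachable with positive probability."""
    predecessors = {}
    for state, successors in transitions.items():
        for succ, p in successors:
            if p > 0:
                predecessors.setdefault(succ, set()).add(state)
    reachable, stack = set(targets), list(targets)
    while stack:
        state = stack.pop()
        for pred in predecessors.get(state, ()):
            if pred not in reachable:
                reachable.add(pred)
                stack.append(pred)
    return reachable

def smallest_counterexample(transitions, initial, targets, bound, max_paths=1000):
    """Collect evidences by decreasing probability until their mass exceeds bound.

    Returns a list of (path, probability), or None if no counterexample
    is found within max_paths evidences.
    """
    useful = states_reaching(transitions, targets)
    if initial not in useful:
        return None
    # Heap entries: (weight, tie-breaker, probability, path),
    # where weight is the sum of -log(p) over the transitions of the path.
    queue = [(0.0, 0, 1.0, (initial,))]
    pushed = 0
    evidences, mass = [], 0.0
    while queue and len(evidences) < max_paths:
        weight, _, prob, path = heapq.heappop(queue)
        state = path[-1]
        if state in targets:
            evidences.append((path, prob))
            mass += prob
            if mass > bound:
                return evidences
            continue
        for succ, p in transitions.get(state, ()):
            # Skip successors that cannot reach a target (e.g. absorbing sinks).
            if p > 0 and succ in useful:
                pushed += 1
                heapq.heappush(
                    queue, (weight - log(p), pushed, prob * p, path + (succ,)))
    return None

# Illustrative DTMC: state -> list of (successor, probability).
example_dtmc = {
    "s0": [("s1", 0.5), ("s2", 0.25), ("s3", 0.25)],
    "s1": [("t", 0.5), ("s1", 0.25), ("s3", 0.25)],
    "s2": [("t", 0.5), ("s3", 0.5)],
    "s3": [("s3", 1.0)],
}
```

For the bound 0.4, the call smallest_counterexample(example_dtmc, "s0", {"t"}, 0.4) collects three evidences with probabilities 0.25, 0.125 and 0.0625, whose mass 0.4375 exceeds the bound. The number of paths is thus determined on-the-fly by the termination condition, as described above. If the reachability probability does not exceed the bound, only the max_paths cap stops the search on cyclic models.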

As an alternative, Aljazzar and Leue proposed the K*-algorithm [AL11] for finding the k shortest paths. In contrast to other algorithms like the aforementioned REA or Eppstein’s algorithm [Epp98], the state space of the graph at hand does not have to be generated to its full extent. Starting from an initial state, the graph is expanded on-the-fly, which is often beneficial for large state spaces.

Another possibility to handle large state spaces is to use symbolic representations of DTMCs, see Section 2.4.2. A first approach was made by Günther, Schuster and Siegle [GSS10], who proposed a BDD-based algorithm for computing the k most probable paths of a DTMC. They use an adaptation of Dijkstra’s shortest path algorithm [Dij59], called flooding Dijkstra, to determine the most probable path. In order to get the k shortest paths, they transform the DTMC in each step such that the most probable path of the transformed system corresponds to the second-most-probable path in the original DTMC. This involves copying the DTMC and redirecting transitions from the original system to the copy. In the end, this yields a symbolic representation of a minimal counterexample. Details are explained in Section 5.7 as this is also relevant to our approaches. Note that the underlying BDD grows exponentially when applying these transformations for all needed paths.
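For orientation, the most probable path can be computed on an explicit-state representation with the textbook version of Dijkstra's algorithm on negative logarithmic weights. The sketch below is not the BDD-based flooding variant of [GSS10]; it assumes a DTMC given as a dictionary, and the names most_probable_path and example_dtmc are hypothetical.

```python
import heapq
from math import log

def most_probable_path(transitions, initial, targets):
    """Dijkstra on edge weights -log(p); returns a most probable path to a
    target state as a list of states, or None if no target is reachable."""
    distance = {initial: 0.0}
    parent = {initial: None}
    settled = set()
    queue = [(0.0, initial)]
    while queue:
        dist, state = heapq.heappop(queue)
        if state in settled:
            continue
        settled.add(state)
        if state in targets:
            # Reconstruct the path by following parent pointers backwards.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for succ, p in transitions.get(state, ()):
            if p > 0 and succ not in settled:
                candidate = dist - log(p)   # weights are non-negative as p <= 1
                if candidate < distance.get(succ, float("inf")):
                    distance[succ] = candidate
                    parent[succ] = state
                    heapq.heappush(queue, (candidate, succ))
    return None

# Illustrative DTMC: state -> list of (successor, probability).
example_dtmc = {
    "s0": [("s1", 0.5), ("s2", 0.25), ("s3", 0.25)],
    "s1": [("t", 0.5), ("s1", 0.25), ("s3", 0.25)],
    "s2": [("t", 0.5), ("s3", 0.5)],
    "s3": [("s3", 1.0)],
}
```

On this model the most probable path from s0 to t runs through s1 and has probability 0.25. Because all probabilities are at most 1, the weights are non-negative, which is the precondition for Dijkstra's algorithm.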

3.1.2 Heuristic approaches

Instead of striving to compute minimal or even smallest counterexamples, it might be reasonable to use heuristics and compute counterexamples more efficiently or for larger systems.

The only heuristic approach generating path-based counterexamples we are aware of is an adaptation of bounded model checking (BMC) [BCC+03] by Wimmer et al. in [WBB09, 26, 25]. A SAT solver is used to generate evidences until enough probability mass is accumulated. The basic idea of BMC is to formulate the existence of an evidence of a certain length as a satisfiability problem. In [WBB09], purely propositional formulas are used, which does not allow taking the actual probability of an evidence into account; in [26, 25] this was extended to SMT formulas over linear real arithmetic, which allows enforcing a lower bound on the probability of evidences. In both cases, the starting point is a symbolic representation of the DTMC, see Section 2.4.2. From the BDDs and MTBDDs, a SAT formula is generated where a satisfying variable assignment corresponds to a path of the DTMC starting in an initial state and ending at a target state. Starting with a small path length n, all paths of this length are enumerated. If afterwards the set of paths
