The cheap Program Verifier 1


The cHEAP Program Verifier

Etienne Lozes 1

19 July 2012

1 Work supported by and developed for the French ANR Project “Veridyc”, 2009–


Overview

1.1 What is cHEAP

cHEAP is a program verifier that targets concurrent heap-manipulating programs written in C. cHEAP is based on a program logic directly inspired by Separation Logic [Rey02, O’H08]. A program proof in this program logic guarantees the absence of errors with respect to heap manipulation (including memory leaks), the absence of data races, and the absence of unjoined threads.1

Program proofs are conveyed by annotations in the C code, written in the Sepashare assertion language (see Chapter 2). cHEAP users have to provide pre- and post-conditions for all functions, loop invariants for all loops, and lock invariants for all locks.

A large number of program verifiers for C code based on Separation Logic already exist, see for instance VCC [CDH+09] or Verifast [JSP+11]. There are two reasons why cHEAP looks a bit different from these two tools:

• cHEAP contains its own theorem prover, and

• cHEAP automatically folds and unfolds recursive definitions.

cHEAP's theorem prover is complete for the class of formulas it targets (in the sense that it cannot diverge), which makes the time needed by cHEAP to check a program proof predictably small. Handling recursive definitions automatically simplifies the design of the proof, since users do not have to worry about intermediate states of verification conditions. The counterpart of these design choices is that cHEAP is much more limited than VCC or Verifast with respect to the C features and recursive data structures it can handle (see Section 1.4).

Finally, note that cHEAP is a Frama-C plug-in, which may help it absorb some more advanced features of C rather smoothly in the future. Moreover, cHEAP has been developed during the ANR research project Veridyc jointly with several other Frama-C plug-ins. All these plug-ins are dedicated to heap manipulation or concurrency and address some topics that are not covered by cHEAP. It is quite tempting now to make these plug-ins collaborate.

1 Termination and deadlocks would be interesting as well, that’s nonetheless not yet

1.2 Installation

cHEAP is distributed under the Q Public License (mostly for compatibility with Smallfoot [BCO05] and Heap-Hop [VLC10], from which it inherits a significant part of the code).

To install cHEAP, you first need to have installed a recent version of Frama-C (Nitrogen or higher).2 You can then download the sources from the cHEAP web page.3

The tar file contains the following files:

README            short description of cHEAP
LICENSE           text of the Q Public License
cheap-manual.pdf  this file
src               folder of OCaml sources of cHEAP
examples          folder of example C programs with annotations

To compile the source, try

> make

The system requirements for cHEAP are the same as for Frama-C, so if you already have installed Frama-C, you should be able to compile cHEAP without difficulties. You may try cHEAP on the examples included in the distribution to check it compiled correctly.

> make test

1.3 Using cHEAP

cHEAP is typically invoked with one C file,4 and then outputs the result of the analysis of this file.

> ./cheap examples/simple-list-examples.c

[kernel] preprocessing with "gcc -C -E -I. memory-manager.c"
memory-manager.c:71:[kernel] warning: Body of function main falls-through. Adding a return statement
[cheap] warning: reference to parameter disposed_list in a post condition(memory-manager.c:21)
[kernel] warning: No code for function join, default assigns generated
[kernel] warning: No code for function spawn, default assigns generated
[cheap] Function main
Function dealloc
Function client
Function alloc
Function compose_init_main
[cheap] Valid
>

2 http://www.frama-c.com

3 http://www.lsv.ens-cachan.fr/~lozes/Cheap.

4 be sure that this C file contains #include "heap.h" and that the file heap.h will be

The output may contain warnings from the pre-processing phase ([kernel] lines), and possibly some warnings from cHEAP ([cheap] lines). If it terminates with the word Valid, then the program has been checked. Otherwise, for each function that could not be proved, cHEAP outputs the position of the instruction or annotation that caused an error during symbolic execution, with some information on the nature of the error.

1.4 cHEAP C

cHEAP targets programs that manipulate heap-allocated recursive data structures (lists, doubly-linked lists, or trees), create and join threads, and synchronize through heap-allocated locks. For simplicity, cHEAP assumes a unique type of cell.5

struct _cell {
  int val;
  int lock;
  struct _cell *tl;
  struct _cell *fst;
  struct _cell *snd;
};

typedef struct _cell cell;

Cell fields can be dereferenced and updated using standard C syntax, with the exception of the lock field, which is assumed to be a reserved field (it would be a ghost field if Frama-C supported it, and the type of this field should be an abstract type lock instead of int). The only other operations on cells that are currently handled by cHEAP are the following ones:

cell *new();

void dispose(cell *c);

const cell *nil = (cell *)0;


void init_lock(cell *x);

void finalize_lock(cell *x);

void lock(cell *x);

void unlock(cell *x);

int spawn(cell *f(cell *), cell *c);

cell *join(int tid);

When a cell c is allocated by a call c=new(), the field c->lock does not refer to a valid lock. After a call to init_lock(c), a lock has been allocated in the heap and has been associated with the cell c (it is morally pointed to by c->lock). Before disposing of a cell c, its associated lock should be disposed of as well by a call to finalize_lock(c). cHEAP makes some assumptions on the lock library: locks are not reentrant, and a lock can be released by a thread even if it was acquired by a different thread. cHEAP however prevents releasing non-acquired locks. For instance, lock(c);lock(c) always deadlocks, whereas (lock(c);lock(c))||unlock(c) may sometimes terminate.6
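The life cycle of a cell and its lock can be sketched with a small executable model. The function bodies below are sequential stand-ins written for this illustration only (they are assumptions, not cHEAP's library), and the states NO_LOCK, LOCK_FREE and LOCK_HELD stored in the reserved field are hypothetical. Following the axioms of Figure 2.2, init_lock is modelled as returning with the lock held by the caller.

```c
#include <assert.h>
#include <stdlib.h>

struct _cell { int val; int lock; struct _cell *tl; struct _cell *fst; struct _cell *snd; };
typedef struct _cell cell;

/* Hypothetical lock states kept in the reserved field (illustration only). */
enum { NO_LOCK = 0, LOCK_FREE = 1, LOCK_HELD = 2 };

/* Stand-in bodies: each one only checks the protocol described in the text. */
cell *new(void) {
    cell *c = calloc(1, sizeof *c);
    assert(c != NULL);
    c->lock = NO_LOCK;                 /* no valid lock after allocation */
    return c;
}
void dispose(cell *c)       { assert(c->lock == NO_LOCK);   free(c); }
void init_lock(cell *x)     { assert(x->lock == NO_LOCK);   x->lock = LOCK_HELD; }
void finalize_lock(cell *x) { assert(x->lock == LOCK_HELD); x->lock = NO_LOCK; }
/* Locks are not reentrant: in this sequential model a second lock() aborts. */
void lock(cell *x)          { assert(x->lock == LOCK_FREE); x->lock = LOCK_HELD; }
/* Releasing a lock that is not acquired is an error. */
void unlock(cell *x)        { assert(x->lock == LOCK_HELD); x->lock = LOCK_FREE; }

/* One complete life cycle: allocate, attach a lock, use it, detach it, free. */
int lifecycle_demo(void) {
    cell *c = new();
    init_lock(c);        /* lock allocated and associated with c */
    c->val = 42;
    unlock(c);           /* release the lock */
    lock(c);             /* acquire it again; lock(c);lock(c) would deadlock */
    int v = c->val;
    finalize_lock(c);    /* dispose of the lock before the cell */
    dispose(c);
    return v;
}
```

Calling lifecycle_demo() runs through every step without tripping an assertion; swapping the last two calls, or calling lock(c) twice in a row, makes the model abort.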

A call to spawn(f,c) creates a new thread that runs the function f with parameter c, and returns the thread identifier (tid) of this new thread to the caller. The tid can later be used by any other thread to wait for the termination of this thread, thereby obtaining the return value of f(c). If the return value of spawn is ignored by the caller, the spawned thread is assumed to be detached, and no error is reported for it not being joined.
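A minimal sketch of the spawn/join pattern follows. The bodies of spawn and join are sequential stand-ins assumed for this example (the "thread" simply runs to completion inside spawn), and worker, MAX_THREADS and spawn_join_demo are hypothetical names. Only the signatures come from the list of operations above.

```c
#include <assert.h>
#include <stdlib.h>

struct _cell { int val; int lock; struct _cell *tl; struct _cell *fst; struct _cell *snd; };
typedef struct _cell cell;

/* Sequential stand-ins (assumptions): results are stored per thread identifier. */
#define MAX_THREADS 16
static cell *results[MAX_THREADS];
static int next_tid = 0;

int spawn(cell *f(cell *), cell *c) {
    assert(next_tid < MAX_THREADS);
    results[next_tid] = f(c);          /* run the thread body immediately */
    return next_tid++;                 /* tid handed back to the caller */
}

cell *join(int tid) {
    assert(0 <= tid && tid < next_tid);
    return results[tid];               /* return value of f(c) */
}

/* Thread body: increments the value of the cell it receives and returns it. */
cell *worker(cell *c) {
    c->val += 1;
    return c;
}

int spawn_join_demo(void) {
    cell *c = calloc(1, sizeof *c);
    assert(c != NULL);
    c->val = 41;
    int tid = spawn(worker, c);        /* the tid is kept, so the thread must be joined */
    cell *r = join(tid);               /* wait for termination, get worker(c) */
    int v = r->val;
    free(r);
    return v;
}
```

Had the result of spawn(worker, c) been discarded, the thread would be treated as detached and the cell handed to it could not be recovered through join.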


Chapter 2

The Sepashare Assertion Language

Sepashare assertions are syntactically a fragment of ACSL assertions,1 as depicted in Figure 2.1. This should not be misleading: the semantics of Sepashare assertions differs significantly from that of ACSL assertions. The most elementary difference is that the && connective is always interpreted as a separating conjunction. There are more differences that are too technical for now, and are deferred to Section 2.2.

Sepashare assertions are intended to be used in pre-conditions, post-conditions, loop invariants, lock invariants, and user-defined predicates.

2.1 Built-in Predicates

2.1.1 Basics

cHEAP provides a library of built-in predicates, from which all annotations can be constructed. The complete list of built-in predicates can be found in examples/cheap.h. Some predicates are dedicated to heap ownership (cells and lists), and others to locks.

\emp           nothing owned
\cell(x)       ownership of a cell x
\node(x)       ownership of a cell x with a lock
\ls(x)         ownership of a null-ended list
\lseg(x,y)     acyclic list segment (possibly empty)
\tree(x)       rooted tree
\locked(x)     right to release the lock
\may_lock(x)   right to acquire the lock

1 http://www.frama-c.com


<expression> ::= <constant> | <variable> | \result | <expression>-><field>
<condition>  ::= <expression>==<expression>
<assertion>  ::= <condition> | <built-in predicate> | <user-defined predicate>
               | <assertion>&&<assertion>

Figure 2.1: Syntax of Sepashare

In order to illustrate these built-in predicates, a small axiomatization of the built-in functions manipulating cells and locks is depicted in Figure 2.2. Basically, the reasoning on locks is based on the principles introduced by Gotsman et al. [GBC+07]:

• releasing a lock requires owning the right to release it, which is obtained either by acquiring the lock or by ownership transfer;2

• acquiring a lock requires owning (a fraction of) the right to acquire it. This right is obtained by allocating the lock, or by ownership transfer;

• acquiring a lock transfers the ownership of the lock invariant from the lock to the thread that acquires it. The lock invariant is described by a user-defined predicate whose name must be lock_inv;

• releasing a lock transfers the ownership of the lock invariant from the thread back to the lock.

2.1.2 Permissions

Basic built-in predicates have permissioned counterparts:

\cell_p(x,p) fractional ownership of a cell x

\locked_p(x,p) fractional right to release the lock

\may_lock_p(x,p) fractional right to acquire the lock

Above, p is a permission, that is, either all or half(p’).3 For instance,

\cell_p(x,half(all)) && \cell_p(x,half(all))

is equivalent to \cell_p(x,all) and to \cell(x).
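The arithmetic behind this equivalence can be illustrated with a toy model in which a permission is the fraction 1/2^k, stored as its exponent k. This representation, and the helper combine, are assumptions made for the example; they are not cHEAP's internal representation (which, according to the footnote, relies on a library for tree shares).

```c
#include <assert.h>

/* Toy model (assumption): a permission is the fraction 1/2^k, stored as k. */
typedef struct { unsigned k; } perm;

static const perm all = { 0 };              /* full permission: 1/2^0 = 1 */

perm half(perm p) {                         /* half(p) is p divided by two */
    perm h = { p.k + 1 };
    return h;
}

/* Join two permissions on the same object. In this model only two equal
   halves half(p) can be joined, and the result is p. Returns 1 on success. */
int combine(perm a, perm b, perm *out) {
    if (a.k != b.k || a.k == 0)
        return 0;                           /* not two matching halves */
    out->k = a.k - 1;
    return 1;
}
```

With this model, combine(half(all), half(all), &r) succeeds and yields all, mirroring the equivalence stated above, while combining half(all) with half(half(all)) is rejected because the sum is not of the form all or half(p’).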

2 in that case the lock may be released by a thread that differs from the thread that acquired it.

3 more elaborate permissions are not needed for now, but might be interesting and sometimes not too hard to add to the current version (a library for tree shares and arbitrary fractional permissions is actually used internally).


/*@ requires \emp;
    ensures \cell(\result);
*/
extern cell *new();

/*@ requires \cell(c);
    ensures \emp;
*/
extern void dispose(cell *c);

/*@ requires \cell(x);
    ensures \node(x) && \may_lock(x) && \locked(x);
*/
extern void init_lock(cell *x);

/*@ requires \node(x) && \may_lock(x) && \locked(x);
    ensures \cell(x);
*/
extern void finalize_lock(cell *x);

/*@ requires \may_lock_p(x,p);
    ensures \may_lock_p(x,p) && \locked(x) && lock_inv(x);
*/
extern void lock(cell *x);

/*@ requires \locked(x) && lock_inv(x);
    ensures \emp;
*/
extern void unlock(cell *x);

Figure 2.2: Small axioms for built-in functions
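As an illustration of how the axioms of Figure 2.2 chain together, here is a proof outline for a critical section, with the assertion that holds at each program point written as a plain comment. The function name critical_section is hypothetical, the bodies of lock and unlock are trivial stand-ins so that the sketch compiles, and the contract uses \may_lock(x), i.e. the permission all.

```c
#include <assert.h>

struct _cell { int val; int lock; struct _cell *tl; struct _cell *fst; struct _cell *snd; };
typedef struct _cell cell;

/* Trivial sequential stand-ins (assumptions), only to make the sketch compile. */
void lock(cell *x)   { assert(x->lock == 0); x->lock = 1; }
void unlock(cell *x) { assert(x->lock == 1); x->lock = 0; }

/*@ requires \may_lock(x);
    ensures \may_lock(x);
*/
void critical_section(cell *x) {
    /* { \may_lock(x) } */
    lock(x);
    /* { \may_lock(x) && \locked(x) && lock_inv(x) }
       The thread now owns whatever lock_inv(x) describes and may use it,
       provided lock_inv(x) is re-established before the lock is released. */
    unlock(x);
    /* { \may_lock(x) }
       \locked(x) and lock_inv(x) were consumed by unlock; \may_lock(x)
       was not mentioned by its axiom and is preserved. */
}
```

The outline only shows the bookkeeping of assertions; whether cHEAP accepts a given function also depends on the definition of lock_inv, which is not shown here.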

2.1.3 Multiple Lock Invariants

Several lock invariants may be needed for locks with different roles (see for instance examples/readers-writers.c). In that case, the following should be done:

1. define all lock invariants as user-defined predicates with arbitrary names;

2. use the variants may_lock2_p and locked2_p of the predicates may_lock and locked to specify the name of the user-defined predicate representing the lock invariant. For instance,

\may_lock2_p(x,"lock_inv",all)

is equivalent to \may_lock(x).


2.2 Remarks

2.2.1 Well-Formed Assertions.

The syntax rules of Figure 2.1 miss a rule that should be followed when writing assertions:

If an assertion contains an expression with a dereferencing, then the left value should be guaranteed to be valid somewhere else in the same assertion.

For instance, \cell(x->tl) alone is going to be “useless”,4 but

\cell(x) && \cell(x->tl),

or \cell_p(x,half(all)) && \cell(x->tl) will be meaningful.
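A small annotated function shows why the rule matters in practice. In the sketch below, the contract owns both x and x->tl, so the dereference x->tl is justified by \cell(x). The function dispose_pair, the counter live_cells and the malloc-based bodies of new and dispose are assumptions introduced for this example.

```c
#include <assert.h>
#include <stdlib.h>

struct _cell { int val; int lock; struct _cell *tl; struct _cell *fst; struct _cell *snd; };
typedef struct _cell cell;

/* Stand-in allocator (assumption) that counts the cells currently allocated. */
static int live_cells = 0;

cell *new(void) {
    cell *c = calloc(1, sizeof *c);
    assert(c != NULL);
    live_cells++;
    return c;
}

void dispose(cell *c) {
    free(c);
    live_cells--;
}

/*@ requires \cell(x) && \cell(x->tl);
    ensures \emp;
*/
void dispose_pair(cell *x) {
    cell *t = x->tl;   /* reading x->tl is justified by \cell(x) */
    dispose(x);        /* consumes \cell(x) */
    dispose(t);        /* consumes the ownership given by \cell(x->tl) */
}
```

With the pre-condition \cell(x->tl) alone, nothing would guarantee that x itself can be dereferenced, and the first statement of the body could not be justified.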

2.2.2 Variables in Post Conditions.

Unlike in ACSL, a local or formal variable of a function occurring in its post-condition refers to the value of the variable at function exit (and not at function entry).

//@ requires \emp; ensures x==nil && i==1;
void foo(cell *x) {int i=1; x=nil;}

This should be considered carefully when reasoning on the caller’s side, as such a variable is converted there into an existential variable.

2.2.3 Break, Return, and Loop Invariants.

The semantics of loop invariants is the same as in ACSL, namely:

• the loop invariant holds at loop entry and at normal loop exit;

• the loop invariant holds at every continue statement;

• the loop invariant does not necessarily hold at break and return. The return case is quite simple: the function post-condition must hold. For the break case, what must hold is whatever the continuation of the loop requires (the code that follows and its post-condition, or the invariant of an enclosing loop).
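As a sketch of a loop invariant in this style, the function below disposes of a null-ended list. The invariant \ls(x) holds at loop entry (it is the pre-condition) and at normal exit, where x==nil turns it into \emp. The names dispose_list, cons and live_cells, the malloc-based bodies of new and dispose, and the exact comment form of the loop annotation are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

struct _cell { int val; int lock; struct _cell *tl; struct _cell *fst; struct _cell *snd; };
typedef struct _cell cell;

const cell *nil = (cell *)0;

/* Stand-in allocator (assumption) that counts the cells currently allocated. */
static int live_cells = 0;
cell *new(void) { cell *c = calloc(1, sizeof *c); assert(c != NULL); live_cells++; return c; }
void dispose(cell *c) { free(c); live_cells--; }

/* Hypothetical helper: puts a new cell holding v in front of the list tl. */
cell *cons(int v, cell *tl) {
    cell *c = new();
    c->val = v;
    c->tl = tl;
    return c;
}

/*@ requires \ls(x);
    ensures \emp;
*/
void dispose_list(cell *x) {
    //@ loop invariant \ls(x);
    while (x != nil) {
        cell *t = x->tl;   /* x is not nil, so \ls(x) unfolds to a cell and a list */
        dispose(x);        /* what remains is the list starting at t */
        x = t;             /* the invariant \ls(x) holds again */
    }
}
```

Because the loop contains neither break nor return, only the entry, iteration and normal-exit cases of the invariant come into play.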

For technical reasons due to the treatment of loops in Frama-C, it is simpler to keep this semantics in cHEAP. But for this reason, we forbid the application of the frame rule for loop invariants.5 For instance, the program

cell *x = new();
// loop invariant \emp;
while(0==1) skip;

4 It will be accepted by cHEAP, but with the meaning $\exists x_{tl}.\ \mathrm{cell}(x_{tl})$.

5 the frame of a loop was inferred at the time of symbolic execution in Smallfoot, but it might already be needed at the step of VC generation if we were to support break statements. A fix seems possible, but requires forcing the order in which symbolic execution works on verification conditions, so as to save the inferred frame for VCs that depend on it.


Bibliography

[BCO05] Josh Berdine, Cristiano Calcagno, and Peter W. O’Hearn. Smallfoot: Modular automatic assertion checking with separation logic. In Frank S. de Boer, Marcello M. Bonsangue, Susanne Graf, and Willem P. de Roever, editors, FMCO, volume 4111 of Lecture Notes in Computer Science, pages 115–137. Springer, 2005.

[CDH+09] Ernie Cohen, Markus Dahlweid, Mark A. Hillebrand, Dirk Leinenbach, Michal Moskal, Thomas Santen, Wolfram Schulte, and Stephan Tobies. VCC: A practical system for verifying concurrent C. In Stefan Berghofer, Tobias Nipkow, Christian Urban, and Makarius Wenzel, editors, TPHOLs, volume 5674 of Lecture Notes in Computer Science, pages 23–42. Springer, 2009.

[GBC+07] Alexey Gotsman, Josh Berdine, Byron Cook, Noam Rinetzky, and Mooly Sagiv. Local reasoning for storable locks and threads. In Zhong Shao, editor, APLAS, volume 4807 of Lecture Notes in

[JSP+11] Bart Jacobs, Jan Smans, Pieter Philippaerts, Frédéric Vogels, Willem Penninckx, and Frank Piessens. Verifast: A powerful, sound, predictable, fast verifier for C and Java. In Mihaela Gheorghiu Bobaru, Klaus Havelund, Gerard J. Holzmann, and Rajeev Joshi, editors, NASA Formal Methods, volume 6617 of Lecture Notes in

[O’H08] Peter W. O’Hearn. Tutorial on separation logic (invited tutorial). In Aarti Gupta and Sharad Malik, editors, CAV, volume 5123 of Lecture Notes in Computer Science, pages 19–21. Springer, 2008.

[Rey02] John C. Reynolds. Separation logic: A logic for shared mutable data structures. In LICS, pages 55–74. IEEE Computer Society, 2002.

[VLC10] Jules Villard, Étienne Lozes, and Cristiano Calcagno. Tracking heaps that hop with Heap-Hop. In Javier Esparza and Rupak Majumdar, editors, TACAS, volume 6015 of Lecture Notes in Computer
