• No results found

The CompCert verified C compiler

N/A
N/A
Protected

Academic year: 2021

Share "The CompCert verified C compiler"

Copied!
43
0
0

Loading.... (view fulltext now)

Full text

(1)

The CompCert verified C compiler

Compiler built and proved by Xavier Leroy et al.

Talk given by David Pichardie - Harvard University / INRIA Rennes Slides largely inspired by Xavier Leroy own material

(2)

Critical embedded software

(3)

Critical embedded software

2

critical_prog.ppc High degree of assurance is required

is the program critical_prog.ppc safe ?

(4)

Critical embedded software

High degree of assurance is required

is the program critical_prog.ppc safe ? 2 options:

qualify the PPC program as if hand-written (intensive testing, painful manual review...)

qualify the program at the source level (static analysis, model checking, or

program proof)

(5)

Critical embedded software

2

critical_prog.ppc High degree of assurance is required

is the program critical_prog.ppc safe ? 2 options:

qualify the PPC program as if hand-written (intensive testing, painful manual review...)

qualify the program at the source level (static analysis, model checking, or

program proof)

2nd option is preferred in practice

can you trust your compiler ?

this talk: apply formal verification techniques to the compiler itself !

(6)

Miscompilation happens

(7)

Miscompilation happens

3

NULLSTONE isolated defects [in integer division] in twelve of twenty commercially available compilers that were evaluated.

http://www.nullstone.com/htmls/category/divide.htm

(8)

Miscompilation happens

NULLSTONE isolated defects [in integer division] in twelve of twenty commercially available compilers that were evaluated.

http://www.nullstone.com/htmls/category/divide.htm

We tested thirteen production-quality C compilers and, for each, found situations in which the compiler generated incorrect code for accessing volatile variables.

E. Eide & J. Regehr, EMSOFT 2008

(9)

Miscompilation happens

3

NULLSTONE isolated defects [in integer division] in twelve of twenty commercially available compilers that were evaluated.

http://www.nullstone.com/htmls/category/divide.htm

We tested thirteen production-quality C compilers and, for each, found situations in which the compiler generated incorrect code for accessing volatile variables.

E. Eide & J. Regehr, EMSOFT 2008

We created a tool that generates random C programs, and then spent two and a half years using it to find compiler bugs. So far, we have reported more than 290 previously unknown bugs to compiler developers. Moreover, every compiler that we tested has been found to crash and also to silently generate wrong code when

presented with valid inputs.

J. Regehr et al, 2010

(10)

How do we verify a compiler ?

(11)

How do we verify a compiler ?

A simple idea:

4

(12)

How do we verify a compiler ?

Program and prove your compiler in the same language !

A simple idea:

(13)

How do we verify a compiler ?

Program and prove your compiler in the same language !

Which language ? A simple idea:

4

(14)

How do we verify a compiler ?

Program and prove your compiler in the same language !

Which language ?

Coq

A simple idea:

(15)

Coq: an animal with two faces

5

(16)

Coq: an animal with two faces

First face:

(17)

Coq: an animal with two faces

5

First face:

a proof assistant that allows to interactively build proof in constructive logic

(18)

Coq: an animal with two faces

First face:

a proof assistant that allows to interactively build proof in constructive logic

Second face:

(19)

Coq: an animal with two faces

5

First face:

a proof assistant that allows to interactively build proof in constructive logic

Second face:

a functional programming language with a very rich type system

(20)

sort : 8 l: list int , { l ’: list int |

Sorted l ’ ^ PermutationOf l l ’ }

Coq: an animal with two faces

First face:

a proof assistant that allows to interactively build proof in constructive logic

Second face:

a functional programming language with a very rich type system example:

(21)

sort : 8 l: list int , { l ’: list int |

Sorted l ’ ^ PermutationOf l l ’ }

sort : int list ! int list

Coq: an animal with two faces

5

First face:

a proof assistant that allows to interactively build proof in constructive logic

Second face:

a functional programming language with a very rich type system example:

with an extraction mechanism to Ocaml

(22)

Verifying a compiler in Coq

A simple recipe

Logical

Framework

(here Coq)

(23)

6

Verifying a compiler in Coq

A simple recipe

Logical

Framework

(here Coq)

Definition c o m p i l e r

( p : source ) : option target := ...

Program the compiler inside Coq

Compiler

(24)

Verifying a compiler in Coq

A simple recipe

Logical

Framework

(here Coq)

Definition c o m p i l e r

( p : source ) : option target := ...

Program the compiler inside Coq

Compiler Source & Target Language Semantics

State its correctness wrt. a formal specification of the language semantics

Theorem c o m p i l e r _ i s _ c o r r e c t : 8 p , compiler p = Some p ’ !

b e h a v i o r s (p ’) b e h a v i o r s ( p )

If compilation succeeds, the generated code should behave as prescribed by the semantics of the source program.

(25)

6

Verifying a compiler in Coq

A simple recipe

Logical

Framework

(here Coq)

Definition c o m p i l e r

( p : source ) : option target := ...

Program the compiler inside Coq

Compiler Source & Target Language Semantics

Proof. ... (* few days later *) ... Qed.

We interactively and mechanically prove this theorem

Soundness Proof

State its correctness wrt. a formal specification of the language semantics

Theorem c o m p i l e r _ i s _ c o r r e c t : 8 p , compiler p = Some p ’ !

b e h a v i o r s (p ’) b e h a v i o r s ( p )

(26)

Verifying a compiler in Coq

A simple recipe

Logical

Framework

(here Coq)

We extract an OCaml implementation of the compiler

Definition c o m p i l e r

( p : source ) : option target := ...

Program the compiler inside Coq

Compiler Source & Target Language Semantics

Proof. ... (* few days later *) ... Qed.

We interactively and mechanically prove this theorem

Soundness Proof

parser.ml compiler.ml linker.ml

State its correctness wrt. a formal specification of the language semantics

Theorem c o m p i l e r _ i s _ c o r r e c t : 8 p , compiler p = Some p ’ !

b e h a v i o r s (p ’) b e h a v i o r s ( p )

(27)

7

Trusted Computing Base (TCB)

1. Formal specification of the programming language semantics

already (informally) shared between any end- user programmer, compiler, static analyzer

must itself be rigorously proofread/animated/

tested

2. Logical Framework

only the proof checker needs to be trusted

we don’t trust sophisticated decision procedures

Source & Target Language Semantics

Logical

Framework

(here Coq)

Compiler

Soundness Proof

(28)

Trusted Computing Base (TCB)

1. Formal specification of the programming language semantics

already (informally) shared between any end- user programmer, compiler, static analyzer

must itself be rigorously proofread/animated/

tested

2. Logical Framework

only the proof checker needs to be trusted

we don’t trust sophisticated decision procedures

Source & Target Language Semantics

Logical

Framework

(here Coq)

Compiler

Soundness Proof

Proof Checker

(29)

7

Trusted Computing Base (TCB)

1. Formal specification of the programming language semantics

already (informally) shared between any end- user programmer, compiler, static analyzer

must itself be rigorously proofread/animated/

tested

2. Logical Framework

only the proof checker needs to be trusted

we don’t trust sophisticated decision procedures

Source & Target Language Semantics

Logical

Framework

(here Coq)

Compiler

Soundness Proof

Proof Checker

Still a large code base but at least a

foundational code base: logic & semantics

(30)

The CompCert project

X. Leroy, S. Blazy et al.

Develop and prove correct a realistic compiler, usable for critical embedded software.

• Source language: a very large subset of C.

• Target language: PowerPC/ARM/x86 assembly.

• Generates reasonably compact and fast code careful code generation; some optimizations.

Note: compiler written from scratch, along with its proof; not trying to prove an existing compiler

(otherwise see Zdancewic et al’s Verified LLVM project).

(31)

9

The formally verified part of the compiler

Compcert C Clight C#minor

Cminor CminorSel

RTL

LTL LTLin Linear

Mach ASM

side-effects out of expressions

type elimination loop simplifications

stack allocation of variables Optimizations: constant prop., CSE, tail calls

instruction selection CFG construction

expr. decomp.

register allocation

linearization of the CFG

spilling, reloading calling conventions

layout of stack frames asm code

generation

(32)

Theorem t r a n s f _ c _ p r o g r a m _ i s _ r e f i n e m e n t : 8 p tp ,

t r a n s f _ c _ p r o g r a m p = OK tp !

(8 beh , exec_C_program p beh ! not_wrong beh ) !

(8 beh , exec_asm_program tp beh ! exec_C_program p beh ).

After 50 000 lines of Coq and 4 person.years of effort

Formally verified in Coq

Behaviors beh = termination / divergence / going wrong

(33)

11

Compiler verification patterns

(for each pass)

(34)

Compiler verification patterns

(for each pass)

Verified transformation

transformation

(35)

11

Compiler verification patterns

(for each pass)

Verified transformation

transformation transformation

validator

Verified translation validation

= formally verified

= not verified

(36)

Compiler verification patterns

(for each pass)

Verified transformation

transformation transformation

validator

Verified translation validation

External solver with verified validation

transformation

(37)

12

External solver with verified validation

Example: register allocation

register allocation

graph coloring checker

graph coloring heuristics

RTL LTL

(38)

Verified translation validation

Example: SSA generation (in CompCert SSA extension)

untrusted SSA generator

validator

RTL SSA

(39)

13

Verified translation validation

Example: SSA generation (in CompCert SSA extension)

untrusted SSA generator

validator

RTL SSA

The untrusted generator can rely on advanced graph algorithms

as Lengauer and Tarjan’s dominator tree construction and frontier

dominance computation.

(40)

Verified translation validation

Example: SSA generation (in CompCert SSA extension)

untrusted SSA generator

validator

RTL SSA

The untrusted generator can rely on advanced graph algorithms

as Lengauer and Tarjan’s dominator tree construction and frontier

(41)

14

The whole CompCert compiler

C source AST C

Executable Assembly AST Asm

parsing, construction of an AST type-checking, de-sugaring

assembling linking

printing of asm syntax Type reconstruction

Graph coloring Code linearization heuristics

Not proved (hand-written in OCaml)

Proved in Coq

(extracted to OCaml)

Part of the TCB

Not part of the TCB

Verified compiler

(42)

Performance of generated code

(On a PowerPC G5 processor)

Performance of generated code

(On a PowerPC G5 processor)

es ch e dy l

Execution time

gcc -O0

Compcert

gcc -O1

gcc -O3

(43)

16

The formal verification of realistic compilers is feasible.

(Within the limitations of contemporary proof assistants)

Much work remains:

• Shrinking the TCB

(e.g. verified parsing, validated assembling & linking).

• More optimizations

(see CompCert SSA).

• Front-ends for other languages

• Concurrency

(see Sevcik et al’s CompCert TSO and Appel and al’s Verified Software Toolchain).

• Connections with source-level verification

(ongoing french project on a verified C static analyzer)

Conclusions

References

Related documents

MIR830 Human Resource Management MIR840 Labour Economics and IR MIR850 Organizational Theory & Design Elective coursework and skills seminars provide students with

This study undertook a detailed field investigation, which included the installation of a network of groundwater monitoring bores, collection of spatial and temporal water quality

discussions on the theory of fana and baqa , the influence of Rumi’s predecessors on his understanding, the connection of the process to fundamental principles of Islam, the

An increase in expenditure on research and development (per capita) by 1% raises TFP by barely 0.1%. An increase in U.S. We also find monetary policy to have an effect, albeit

Investment income percentage for 2015 (line 10c, column (f) divided by line 13, column (f)). The organization qualifies as a publicly supported organization. The organization

According to the definition by COSO in 1992, an internal control system is defined as a set of methods, designed and controlled by senior management and board of directors

• Royal Bank of Scotland sold its Global Merchant Services business to two US private equity firms – Advent International and Bain Capital – for €2.3bn; and.. • Santander agreed