The inferrer implements the shape-and-effect inference algorithm of Chapter 5. Given a CIL AST, it infers the memory shapes and aliasing relationships of all program variables, and the effects for all statements.
2 http://lxr.free-electrons.com/ident?i=container_of
3 http://lxr.free-electrons.com/ident?i=__acquires
6.2. Shape-and-effect inferrer

[Figure 6.2 graphic: a CFG whose nodes are annotated with !lock and !unlock effects, with IF(*) branches, return nodes, and a call node for inode_get_rsv_space summarized as {lock, unlock}.]

Figure 6.2: An illustration of our bug-finding technique for the double-lock bug in Figure 1.2. The figure shows the associated CFG annotated with lock and unlock effects. The numbers next to the CFG nodes show corresponding line numbers. The gray nodes visualize the (red) path, via the function call in line 15, to the double-lock (in line 4).
Crucially, each function is assigned a shape-and-effect signature, which establishes aliasing relationships between inputs and outputs, and provides a flow-insensitive summary of its observable behavior. This information is superimposed on the CFG, obtaining the so-called effect-based CFG, or F-CFG, of the program. Figure 6.2 shows an example F-CFG (basically the same as that of Fig. 1.3, repeated here for convenience).
We begin with a standard CFG, where nodes represent program locations, and edges specify the control flow. We distinguish branching decisions (diamond nodes), atomic operations (circles), function calls (dotted squares), and return statements (double circles). An F-CFG is an effect abstraction of a program, obtained from the standard CFG by annotating variables with their memory shapes, and nodes with the effects inferred for the corresponding locations. Function call nodes hold a flow-insensitive over-approximation of the callee's behavior. For instance, in Fig. 6.2, the call to inode_get_rsv_space is summarized as {lock, unlock}, indicating that this function call may both acquire and release inode->lock.
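The idea of a call node holding a flow-insensitive summary can be pictured with a small sketch. Everything here is hypothetical and is not EBA's actual data structure: the `Node` class, the string encoding of effects, and the `summarize` helper are all assumptions. It further assumes that a summary is simply the union of the effects found on the callee's nodes, with must-effects downgraded to may-effects. The callee body is only loosely modeled on Figure 6.2, and its edges are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical F-CFG node: a program location annotated with effects."""
    line: int
    kind: str                          # "branch", "atomic", "call" or "return"
    effects: frozenset = frozenset()   # e.g. {"!lock"}; "!" marks a must-effect
    succs: list = field(default_factory=list)  # line numbers of successors


def summarize(nodes):
    """Flow-insensitive summary: union of all node effects, ignoring edges.

    Must-effects lose their bang, since the summary no longer records
    which path through the callee produced them.
    """
    return frozenset(
        e[1:] if e.startswith("!") else e
        for n in nodes
        for e in n.effects
    )


# Callee body loosely modeled on Figure 6.2 (edges are an assumption).
callee = [
    Node(3, "branch", frozenset(), [4, 6]),
    Node(4, "atomic", frozenset({"!lock"}), [5]),
    Node(5, "atomic", frozenset({"!unlock"}), [6]),
    Node(6, "return"),
]

# The call site carries the callee's summary instead of its body.
call_node = Node(15, "call", summarize(callee))
print(sorted(call_node.effects))  # ['lock', 'unlock']
```

Because the summary discards the callee's edges, it cannot say in which order the lock is acquired and released, which is why a call node may need to be inlined before a path through it can be confirmed.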
104 Chapter 6. Effective Bug Finding with EBA
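\[
\vdash_S \;\subseteq\; \mathit{Env} \times \mathit{Shape} \times \mathit{Stmt} \times \mathit{Effect}
\]

\[
\text{[If]}\quad
\frac{\Gamma \vdash_E E : Z_0 \,\&\, F_0
      \qquad \Gamma \vdash_S^{\Downarrow Z} S_1 \,\&\, F_1
      \qquad \Gamma \vdash_S^{\Downarrow Z} S_2 \,\&\, F_2}
     {\Gamma \vdash_S^{\Downarrow Z} \mathtt{if}\ (E)\ S_1\ S_2 \;\&\; F_0 \cup \mathit{may}\,F_1 \cup \mathit{may}\,F_2}
\]

\[
\text{[Switch]}\quad
\frac{\Gamma \vdash_E E : Z_0 \,\&\, F_0
      \qquad \Gamma \vdash_S^{\Downarrow Z} S_i \,\&\, F_i \;/\; i \in [1,n]}
     {\Gamma \vdash_S^{\Downarrow Z} \mathtt{switch}\ (E)\ \{\, S_1 \cdots S_n \,\} \;\&\; F_0 \cup \Bigl(\bigcup_{i \in [1,n]} \mathit{may}\,F_i\Bigr)}
\]

\[
\text{[Block]}\quad
\frac{\Gamma \vdash_S^{\Downarrow Z} S_i \,\&\, F_i \;/\; i \in [1,n]}
     {\Gamma \vdash_S^{\Downarrow Z} \{\, S_1 \cdots S_n \,\} \;\&\; F_1 \cup \Bigl(\bigcup_{i \in [2,n]} \mathit{may}\,F_i\Bigr)}
\]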
Figure 6.3: Typing of statements with must-effects. Only the rules that change are shown, and those changes are marked using a bold blue font.
6.2.1 May and must effects
The shape-and-effect system of Chapter 4 infers only may-effects, whereas EBA also considers must-effects. The may-effects of a function describe all the effects that may result from applying the function, but a given application does not necessarily have all of those effects. For instance, if a function has a may-effect lock_ρ, it may or may not acquire a lock on ρ, depending on the flow of control.
The must-effects of a function describe the effects that any invocation of the function is guaranteed to have. For instance, the function spin_lock will only return after acquiring a lock. Must-effects are prefixed by a bang (!), as in !lock_ρ. Must-effects are useful to mark the basic operations that perform such effects (see Sect. 6.2.2). The bug-filter (cf. Sect. 6.4) can rely on must-effects to rank the bug reports, and to decide whether inlining is necessary (more on this later).
However, the inferrer is not able to infer must-effects, since that would require a flow-sensitive analysis. When the inferrer cannot guarantee that must-effects hold for a statement, it simply downgrades them to may-effects. Figure 6.3 shows how the typing rules are changed to deal with must-effects. The operator may turns must-effects into may-effects, e.g., may({!lock_ρ2}) = {lock_ρ2} (so it just "drops the bang"). Note that in a sequence of statements S1; S2, S2 may never execute if S1 either diverges or aborts the execution. This is why rule [Block] keeps the must-effects of the first statement only.
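The effect components of the three rules in Figure 6.3 can be sketched as follows. This is only an illustration, not EBA's implementation: encoding effects as strings such as "!lock(r)", and the function names `if_effects`, `switch_effects` and `block_effects`, are assumptions made for this example. Returning the empty set for an empty block is also an assumption, since the rule is stated for blocks with at least one statement.

```python
def may(effects):
    """Downgrade must-effects to may-effects by dropping the leading bang."""
    return frozenset(e[1:] if e.startswith("!") else e for e in effects)


def if_effects(f0, f1, f2):
    """[If]: the condition always runs, each branch only possibly."""
    return frozenset(f0) | may(f1) | may(f2)


def switch_effects(f0, cases):
    """[Switch]: like [If], generalized to n cases."""
    result = frozenset(f0)
    for fi in cases:
        result |= may(fi)
    return result


def block_effects(stmts):
    """[Block]: only the first statement keeps its must-effects.

    Later statements may never run if an earlier one diverges or aborts.
    """
    if not stmts:
        return frozenset()  # assumption: the rule itself needs n >= 1
    result = frozenset(stmts[0])
    for fi in stmts[1:]:
        result |= may(fi)
    return result


print(sorted(may({"!lock(r2)"})))                             # ['lock(r2)']
print(sorted(block_effects([{"!lock(r)"}, {"!unlock(r)"}])))  # ['!lock(r)', 'unlock(r)']
print(sorted(if_effects({"read(x)"}, {"!lock(r)"}, set())))   # ['lock(r)', 'read(x)']
```

In the second call, the unlock effect of the second statement loses its bang while the lock effect of the first statement keeps it, matching the asymmetry in rule [Block].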
6.2.2 Axioms
The shape-and-effect inferrer knows about a few built-in effects that are inherent to the C language, such as reads and writes to l-values. Yet, bug checkers are often more interested in tracking effects associated with specialized APIs. For EBA to track these effects, it is necessary to mark the elementary API functions that are the root of those effects. (Functions that are built on top of these elementary functions do not need to be annotated, if their source code is available to EBA.) EBA offers two mechanisms for doing this.
Full axioms
A full axiom is simply a shape scheme that is associated with a function name, and added to the global typing environment as an axiom. The axiom fully specifies the shape and effects of the function, and even if a definition of the function exists, it is ignored. Full axioms can be burdensome to specify, and thus should be used to specify simple function signatures, mainly when the code is not available. For instance, in EBA, the libc function free is axiomatized as follows:
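\[
\mathtt{free} : \forall \rho_1\, \rho_2\, \zeta.\;\; \mathit{ref}_{\rho_1}\ \mathit{ptr}\ \mathit{ref}_{\rho_2}\ \zeta \;\xrightarrow{\;\{!\mathit{read}_{\rho_1},\ !\mathit{free}_{\rho_2}\}\;}\; \bot \tag{6.1}
\]

This axiom specifies that free takes a pointer to an arbitrary memory location (ρ2), containing an object of an arbitrary shape (ζ), and it has the effect of freeing that chunk of memory from the heap.

Partial axioms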
A partial axiom refines the shape scheme that has been previously inferred for a function. Usually, the refinement consists in extending the set of latent effects. Partial axioms are preferred when functions manipulate struct types, or when the source code of the function is available. (It is not good practice to specify a complete struct shape in an axiom, since the declaration may change, or even be configuration-dependent.) For instance, Linux spin locks are mainly manipulated through the spin_lock and spin_unlock functions, which have the following prototype: void f (spinlock_t *lock). EBA contains the following partial axioms to track operations on spin locks:4
\[
\mathtt{spin\_lock} : \mathit{ref}_{\rho_1}\ \mathit{ptr}\ \mathit{ref}_{\rho_2}\ Z \;\xrightarrow{\;+\,!\mathit{lock}_{\rho_2}\;}\; \bot \tag{6.2}
\]
spin_unlock: refρ1 ptr refρ2 Z
+!unlockρ2