Specification-Based Repair - Advanced Techniques for Search-Based Program Repair

Unlike search-based and semantics-based program repair, both of which rely on user-provided test suites to establish correctness, specification-based repair techniques assume the existence of formal specifications. This requirement limits their applicability considerably: none of the datasets previously used to evaluate search-based or semantics-based repair approaches (e.g., SIR, ManyBugs, Coreutils) possess any such specifications. On the other hand, specification-based approaches can leverage these specifications to generate higher-quality patches that are likely to satisfy them. Concerns shift from the adequacy of the test suite to the soundness and completeness of the specifications that are used. Like semantics-based repair, specification-based approaches use program synthesis techniques to generate patches.

Below, we briefly discuss AutoFix-E, one of the few and foremost approaches to specification-based program repair.

AutoFix-E

AutoFix-E [Wei et al., 2010] leverages partial specifications, provided in the form of contracts, to automatically repair faults in Eiffel classes. These contracts are used to specify preconditions, postconditions and intermediary assertions over Eiffel classes.

Provided a defective Eiffel class, AutoFix-E attempts to expose and repair the underlying bug, following the steps below:

1. Test Suite Generation: In contrast to almost all search-based and semantics-based repair approaches, AutoFix-E generates a test suite automatically, rather than relying on one provided by the programmer. AutoTest, an automated test generation tool for Eiffel, is used to generate as large a test suite as possible within a fixed window of time, covering the defective class.

By evaluating the program under repair against this test suite, the tests are partitioned into passing runs and failing runs (equivalent to positive and negative tests). A run is determined to be a failure if it results in a contract violation.

2. Object State: To aid in determining the source of the fault, AutoFix-E examines object state by observing the output of argument-less, boolean-valued functions (referred to as boolean queries). Boolean queries are widely used in Eiffel contracts as a means of concisely capturing the key properties of object state. Together, the $n$ boolean queries for a given class describe its $2^n$ abstract states. Although the resulting state space may seem intractable, a study of a large Eiffel library, conducted by Wei et al. [2010], suggests that the majority of Eiffel classes have 15 or fewer such queries.

Using the set of boolean queries for the class under repair, AutoFix-E uses contract mining to discover implications between queries. For each mined implication, AutoFix-E also generates three mutants of that implication: one negating its antecedent, one negating its consequent, and one negating both. Together, the sets of boolean queries and implications form the predicate set P; this set describes a set of simple facts about the class under repair, encoded as logical statements. To improve the efficiency of later stages in the repair process, the predicate set is pruned with the help of an automated theorem prover.
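The notions of abstract state and implication mutants in step 2 can be illustrated with a minimal Python sketch. It is not AutoFix-E's implementation: the `ToyList` class, its three queries, and the `(query_name, polarity)` encoding of literals are hypothetical stand-ins for an Eiffel class and its boolean queries.

```python
def abstract_state(obj, queries):
    """Evaluate each argument-less boolean query on obj, yielding one abstract state."""
    return tuple(bool(q(obj)) for q in queries)


def negate(literal):
    """A literal is a (query_name, polarity) pair; negation flips the polarity."""
    name, polarity = literal
    return (name, not polarity)


def implication_mutants(antecedent, consequent):
    """Return a mined implication followed by its three mutants."""
    return [
        (antecedent, consequent),                  # original implication
        (negate(antecedent), consequent),          # negated antecedent
        (antecedent, negate(consequent)),          # negated consequent
        (negate(antecedent), negate(consequent)),  # both negated
    ]


class ToyList:
    """Hypothetical stand-in for an Eiffel list class with a cursor."""

    def __init__(self, items, cursor):
        self.items, self.cursor = items, cursor

    def is_empty(self):
        return len(self.items) == 0

    def before(self):
        return self.cursor == 0

    def after(self):
        return self.cursor == len(self.items) + 1


queries = [ToyList.is_empty, ToyList.before, ToyList.after]
state = abstract_state(ToyList([1, 2], 0), queries)  # (False, True, False)
num_states = 2 ** len(queries)                       # 3 queries give 8 abstract states
```

With 15 queries, the same calculation gives 32,768 abstract states, which is why the observation about typical query counts matters for tractability.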

3. Fault Profiling: Using Daikon, a popular invariant mining tool, and a superset Π of the predicate set P that also contains the negation of each predicate, AutoFix-E generates a pair of invariants for each executed program location $\ell$. This pair, $(I_\ell^{+}, I_\ell^{-})$, comprises the predicates $I_\ell^{+} \subseteq \Pi$ that hold for all passing runs and the predicates $I_\ell^{-} \subseteq \Pi$ that hold for all failing runs.

Together, this sequence of pairs forms a fault profile, describing the possible causes for the failed runs, in terms of abstract state.
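As a rough illustration of the fault profile in step 3, the sketch below computes, for each location, the predicates that hold in every passing observation and those that hold in every failing observation. It replaces Daikon with a plain "holds in all observed states" check, and the state encoding (dictionaries of query values), the location label `"L1"`, and the function names are assumptions made for the example.

```python
def invariants(observations, predicates):
    """Names of the predicates that hold in every observed state (none if unobserved)."""
    if not observations:
        return set()
    return {name for name, p in predicates.items() if all(p(s) for s in observations)}


def fault_profile(passing, failing, predicates):
    """Map each location to a pair (I_plus, I_minus).

    passing / failing: dict mapping a location to the list of states observed there.
    predicates: dict mapping a predicate name to a callable state -> bool.
    """
    # Build the superset Pi: every predicate together with its negation.
    pi = dict(predicates)
    for name, p in predicates.items():
        pi["not " + name] = lambda s, p=p: not p(s)

    profile = {}
    for loc in sorted(set(passing) | set(failing)):
        i_plus = invariants(passing.get(loc, []), pi)
        i_minus = invariants(failing.get(loc, []), pi)
        profile[loc] = (i_plus, i_minus)
    return profile


# Hypothetical observations at a single location "L1".
predicates = {
    "is_empty": lambda s: s["is_empty"],
    "before": lambda s: s["before"],
}
passing = {"L1": [{"is_empty": False, "before": False},
                  {"is_empty": False, "before": True}]}
failing = {"L1": [{"is_empty": True, "before": True}]}

profile = fault_profile(passing, failing, predicates)
```

In this toy profile, `not is_empty` holds on all passing runs while `is_empty` and `before` hold on all failing runs, which points at the empty-list state as a plausible cause of the failure.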

4. Behaviour Model Generation: To serve as a source of ingredients for its repair process, AutoFix-E mines a simple behavioural model over all of its passing runs, encoded as a finite-state automaton. The states of this automaton are labelled by the predicates that hold over that state. Transitions are labelled with the name of a routine, and are used to describe a particular input-output behaviour of a routine, in terms of abstract state; implicitly, each transition describes a Hoare triple.

The resulting model is used by AutoFix-E to determine how to reach a desired state (i.e., the intended state) from some current state (i.e., a faulty state). In general, the model is neither sound nor complete, since it is inferred over a finite set of executions. In practice, the model appears to be sufficiently precise for the purposes of finding a repair.

(a) snippet
    old_stmt

(b) if fail then snippet end
    old_stmt

(c) if not fail then old_stmt end

(d) if fail then snippet else old_stmt end

Figure 2.9: The fix schemas implemented in AutoFix-E [Wei et al., 2010]. snippet is replaced with a sequence of routine calls that move the program from a faulty state into a desired state. old_stmt may either be a single statement, or the block to which a statement belongs. fail is used to monitor the conditions under which the fault manifests, and not to affect the appropriate action.
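To illustrate how a behaviour model of the kind described in step 4 could be queried for a route from a faulty state to a desired one, the following sketch runs a breadth-first search over a small automaton. The use of breadth-first search, the three hand-written states, and the routine names `forth` and `back` are assumptions for the example; the source does not say how AutoFix-E searches its model.

```python
from collections import deque


def find_routine_sequence(transitions, start, is_desired):
    """Shortest sequence of routine calls from start to a desired abstract state.

    transitions: dict mapping a state to a list of (routine_name, next_state).
    Returns a list of routine names, or None if no desired state is reachable.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if is_desired(state):
            return path
        for routine, nxt in transitions.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [routine]))
    return None


# Hypothetical model of a list cursor; each state is the set of predicates that hold.
BEFORE = frozenset({"before"})
ON_ITEM = frozenset({"not before", "not after"})
AFTER = frozenset({"after"})

transitions = {
    BEFORE: [("forth", ON_ITEM)],
    ON_ITEM: [("forth", AFTER), ("back", BEFORE)],
    AFTER: [("back", ON_ITEM)],
}

path = find_routine_sequence(transitions, AFTER, lambda s: "before" in s)
```

Because the model is mined from a finite set of passing runs, a returned path is only a candidate: a missing transition can hide a valid route, and an observed transition may not generalise to every concrete state.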

5. Fix Generation: Beginning at the location where the fault occurred and iterating backwards, AutoFix-E uses its fault profile and behaviour model to generate a set of candidate repairs. At each location, AutoFix-E finds all possible instantiations of the four fix schemas, outlined in Figure 2.9, within its model. Each of these candidate repairs uses the fault profile to move the program from a possibly faulty state to a possibly correct state. To achieve this change in state, AutoFix-E uses its behavioural model to find the sequence of routine calls that results in the desired postcondition.

A special fix schema is also introduced for repairing linear assertion violations (e.g., count ≥ 0).
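The four general fix schemas of Figure 2.9 can be pictured as text templates filled in with a snippet, the original statement, and a failure condition. The sketch below is only a string-level illustration under that assumption; the function name and the Eiffel-like fragments `forth`, `remove`, and `before` are hypothetical, and the special schema for linear assertions is not covered.

```python
def instantiate_schemas(snippet: str, old_stmt: str, fail: str) -> dict:
    """Fill the four fix schemas (a)-(d) with concrete text fragments."""
    return {
        # (a) run the snippet unconditionally before the old statement
        "a": f"{snippet}\n{old_stmt}",
        # (b) run the snippet only when the failure condition holds
        "b": f"if {fail} then\n    {snippet}\nend\n{old_stmt}",
        # (c) skip the old statement when the failure condition holds
        "c": f"if not ({fail}) then\n    {old_stmt}\nend",
        # (d) replace the old statement with the snippet when the condition holds
        "d": f"if {fail} then\n    {snippet}\nelse\n    {old_stmt}\nend",
    }


candidates = instantiate_schemas(snippet="forth", old_stmt="remove", fail="before")
print(candidates["b"])
```

Note that schema (c) does not use the snippet at all: it repairs by guarding the original statement rather than by changing state first.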

6. Fix Validation: Once a set of candidate repairs has been generated, AutoFix-E evaluates them to determine the set of valid repairs (i.e., those which pass all of the tests). Finally, AutoFix-E uses a series of metrics to rank the valid repairs according to some proxy for quality. These metrics measure the size of the snippet, the distance in state, the number of old statements captured by the fix schema, and the number of branches required to reach old_stmt from the point of injection of the instantiated fix schema.
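The validate-then-rank structure of step 6 can be sketched as follows. The source names the four metrics but not how they are combined, so the lexicographic ordering used here (smaller is better for every metric) is purely an assumption, as are the `Candidate` type and the toy candidates and tests.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    name: str
    run: Callable[[int], bool]  # True if the test input triggers no contract violation
    snippet_size: int
    state_distance: int
    old_stmts_captured: int
    branches_to_old_stmt: int


def validate(candidates: List[Candidate], tests: List[int]) -> List[Candidate]:
    """Keep only the candidates that pass every test."""
    return [c for c in candidates if all(c.run(t) for t in tests)]


def rank(valid: List[Candidate]) -> List[Candidate]:
    """Order valid repairs by the four metrics (assumed: lexicographic, ascending)."""
    return sorted(
        valid,
        key=lambda c: (c.snippet_size, c.state_distance,
                       c.old_stmts_captured, c.branches_to_old_stmt),
    )


# Hypothetical tests and candidates; each test is just an integer input.
tests = [0, 1, 5]
candidates = [
    Candidate("A", lambda t: t >= 0, 2, 1, 1, 0),
    Candidate("B", lambda t: t > 0, 1, 1, 1, 0),   # fails on input 0
    Candidate("C", lambda t: True, 1, 2, 1, 0),
]

ranked_names = [c.name for c in rank(validate(candidates, tests))]
```

Here candidate B is discarded during validation, and C is ranked ahead of A because its snippet is smaller.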

To evaluate AutoFix-E, Wei et al. [2010] ran it on a dataset of 42 faults detected by AutoTest in 10 data structure classes taken from popular, open-source Eiffel libraries. AutoFix-E was able to repair 16 of these faults. To assess the quality of AutoFix-E’s patches, they manually inspected the top five patches reported for each bug to determine if the patch fixed the underlying bug without introducing a new defect. 13 of the 16 bugs fixed by AutoFix-E were found to satisfy this criterion.
