Related Techniques - Advanced Techniques for Search-Based Program Repair

To conclude our review of the relevant literature, in this section we briefly discuss a number of techniques loosely related to general-purpose program repair:

• Reflective Grammatical Evolution: [Timperley, 2013; Timperley and Stepney, 2014] incorporate concepts from failure-obliviousness into genetic programming. After using computational reflection to create a failure-oblivious dialect of Ruby, the authors demonstrate a higher success rate and greater efficiency (measured by the number of candidate evaluations) when programs are evolved with these measures enabled. These results suggest that allowing the program to persist in the presence of errors may smooth the fitness landscape and reduce the difficulty of the search.
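To make the effect on the fitness landscape concrete, the following is a minimal sketch in Python rather than the authors' Ruby dialect. Every name in it (`strict_div`, `oblivious_div`, `fitness`, `candidate`, `CASES`) is hypothetical. It also assumes that all test cases are run in a single execution that aborts on the first unhandled error, and that a division by zero is replaced by the value 0.0; neither assumption is taken from the cited work.

```python
def strict_div(a, b):
    # Ordinary semantics: raises ZeroDivisionError when b == 0.
    return a / b


def oblivious_div(a, b):
    # Assumed failure-oblivious policy: a division by zero yields 0.0.
    try:
        return a / b
    except ZeroDivisionError:
        return 0.0


def fitness(candidate, div, cases):
    """Number of cases passed in one run; an unhandled error aborts the run."""
    passed = 0
    try:
        for x, expected in cases:
            if candidate(x, div) == expected:
                passed += 1
    except ZeroDivisionError:
        return 0  # the whole execution crashed, so no credit is given
    return passed


def candidate(x, div):
    # Hypothetical evolved program: correct except when x == 0.
    return div(10, x)


# (input, expected output) pairs; the third case triggers the division by zero.
CASES = [(1, 10.0), (2, 5.0), (0, -1.0), (5, 2.0)]

strict_score = fitness(candidate, strict_div, CASES)
oblivious_score = fitness(candidate, oblivious_div, CASES)
```

Under these assumptions `strict_score` is 0 and `oblivious_score` is 3: the nearly correct candidate is indistinguishable from a useless one under strict semantics, but receives partial credit once the error is tolerated.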

• Data Structure Repair: [Demsky and Rinard, 2003, 2005] present techniques for automatically recovering from data structure corruption errors at runtime. Repair is achieved by enforcing a data structure consistency specification, which may be manually provided by the developer, or automatically inferred from correct program executions, using tools such as Daikon.

• Integer Bug Fixing: [Coker and Hafiz, 2013] propose a set of three program transformations for repairing all instances of integer bugs within C programs:

1. Add Integer Cast: adds explicit casts to disambiguate integer usage, and to address signedness and widthness problems.

2. Replace Arithmetic Operator: replaces arithmetic operations with safe equivalents, which detect overflows and underflows at runtime.

3. Change Integer Type: modifies types of integers to avoid signedness and widthness problems.

Together, these three transformations fixed all 7,147 programs within NIST’s SAMATE dataset [Black, 2007], covering over 15 million lines of code. Unlike general-purpose repair techniques, Coker and Hafiz’s program transformations produce sound and complete repairs for a restricted class of defects, forgoing the need for test suite evaluation or symbolic execution. These transformations prevent integer bugs from manifesting by ensuring safe behaviour, rather than by finding particular unsafe instances of integer usage and patching the source code directly, and thus avoid the need for potentially expensive instrumentation.
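The idea behind the Replace Arithmetic Operator transformation can be illustrated with a small sketch. It is written in Python, which has unbounded integers, so the 32-bit signed range is emulated explicitly. The names `wrapping_add` and `checked_add` are hypothetical and are not part of Coker and Hafiz's tooling. Wraparound on overflow is also only an assumption about how an unchecked addition behaves on a typical platform, since signed overflow is undefined behaviour in C.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrapping_add(a, b):
    """Emulate an unchecked 32-bit signed addition that wraps around."""
    return (a + b - INT32_MIN) % 2**32 + INT32_MIN


def checked_add(a, b):
    """Safe equivalent: report the overflow or underflow instead of wrapping."""
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"32-bit signed overflow in {a} + {b}")
    return result


# The unchecked addition silently produces a negative number.
wrapped = wrapping_add(INT32_MAX, 1)

# The checked addition turns the same operation into a detectable error.
try:
    checked_add(INT32_MAX, 1)
    overflow_detected = False
except OverflowError:
    overflow_detected = True
```

Here `wrapped` equals `INT32_MIN`, while `overflow_detected` is `True`. Replacing every `+` with a call of this kind makes the program safe without first locating the particular additions that can overflow.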

• Bolt and Jolt: Bolt [Kling et al., 2012] and Jolt [Carbin et al., 2011] are techniques for monitoring the execution of a program, detecting whether an infinite loop occurs, and if so, allowing the loop to be escaped or the program to be terminated, at the request of the user. Whereas Jolt requires source code instrumentation to inject monitoring code, Bolt obviates this need through on-demand dynamic binary instrumentation; the (unstripped) binaries remain unmodified until the user attaches Bolt to the application.

To detect the occurrence of (some) infinite loops, both techniques monitor the state of the program at the start of each loop iteration. If the state at the start of one iteration is the same as the state at the start of the previous iteration, an infinite loop is detected and the user is given the option to escape the loop or to terminate the program. Across eight infinite loops in five small but real programs (grep, ctags, indent, ping, look), Jolt was able to detect an infinite loop in seven cases [Carbin et al., 2011]. Importantly, in each case, the program was allowed to continue executing, producing a more useful output than simply terminating the program [Carbin et al., 2011; Kling et al., 2012].
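The state comparison described above can be sketched as follows. This is a simplified illustration in Python, not the instrumentation used by Jolt or Bolt: the program state is assumed to be a plain dictionary, the loop is passed in as a `condition` and a `body` function, and the names `monitor_loop`, `buggy_body`, `correct_body` and the `max_iterations` cut-off are all hypothetical.

```python
def monitor_loop(state, condition, body, max_iterations=1000):
    """Run `while condition(state): state = body(state)` under a monitor.

    The state at the start of each iteration is compared with the state at
    the start of the previous one; identical states mean no progress.
    """
    previous = None
    for _ in range(max_iterations):
        if not condition(state):
            return "terminated", state
        snapshot = dict(state)
        if snapshot == previous:
            # A real tool would now ask the user whether to escape the loop.
            return "infinite loop detected", state
        previous = snapshot
        state = body(state)
    return "undecided", state  # state kept changing; this check cannot tell


def condition(state):
    return state["i"] < state["n"]


def buggy_body(state):
    # Hypothetical defect: the counter is never advanced once i reaches 3.
    if state["i"] == 3:
        return dict(state)
    return dict(state, i=state["i"] + 1)


def correct_body(state):
    return dict(state, i=state["i"] + 1)


buggy_result = monitor_loop({"i": 0, "n": 5}, condition, buggy_body)
correct_result = monitor_loop({"i": 0, "n": 5}, condition, correct_body)
```

With these inputs `buggy_result[0]` is `"infinite loop detected"` and `correct_result[0]` is `"terminated"`. A loop whose state changes on every iteration without ever terminating would not be caught by this check, which is consistent with the techniques detecting only some infinite loops.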

• CodePhage: Instead of addressing bugs via source code modification, CodePhage [Sidiroglou-Douskos et al., 2015] attempts to fix bugs by automatically identifying correct code in foreign, donor applications, and transferring that code into the faulty program. By operating at the binary level, CodePhage allows programs to be repaired without the source code of either the program under repair or the donor applications. In its evaluation, CodePhage was able to successfully repair five out of ten security bugs across seven different programs.

• ClearView: ClearView [Perkins et al., 2009] uses Daikon to infer likely invariants for a given system, based on data collected from several training executions. Prior to deployment, ClearView uses binary instrumentation to inject monitoring code into the program. Two of these monitors, HeapGuard and Determina Memory Firewall, check for out-of-bounds memory accesses, and illegal control flow transfer errors, respectively. A third monitor, Shadow Stack, allows invariant violations along the call stack to be recorded. In the event that a monitor reports erroneous behaviour at run-time, ClearView attempts to generate a patch that restores violated invariants and satisfies the monitor. The generated patches re-establish the inferred invariants by altering control flow, register values, and/or values of memory locations.
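The cycle of inferring invariants, detecting a violation and enforcing the invariant can be sketched at a much higher level than ClearView operates. The example below is an illustration only: it works on Python dictionaries rather than on binaries, it supports a single kind of invariant (a lower and upper bound per variable), and the functions `infer_bounds`, `violations` and `enforce` are hypothetical names that correspond to neither Daikon's nor ClearView's interfaces.

```python
def infer_bounds(training_runs):
    """Infer a (lowest, highest) bound for each variable seen in training."""
    bounds = {}
    for run in training_runs:
        for name, value in run.items():
            low, high = bounds.get(name, (value, value))
            bounds[name] = (min(low, value), max(high, value))
    return bounds


def violations(bounds, state):
    """Names of the variables whose current value breaks a learned bound."""
    return [
        name
        for name, (low, high) in bounds.items()
        if name in state and not low <= state[name] <= high
    ]


def enforce(bounds, state):
    """Candidate repair: force each violating value back into its bound."""
    repaired = dict(state)
    for name in violations(bounds, state):
        low, high = bounds[name]
        repaired[name] = min(max(state[name], low), high)
    return repaired


# Values observed during (assumed) correct training executions.
training = [
    {"index": 0, "length": 8},
    {"index": 7, "length": 8},
    {"index": 3, "length": 8},
]
bounds = infer_bounds(training)

# A state produced by a hypothetical attack: the index is out of range.
attack_state = {"index": 42, "length": 8}
broken = violations(bounds, attack_state)
patched_state = enforce(bounds, attack_state)
```

In this run `broken` is `["index"]` and `patched_state` is `{"index": 7, "length": 8}`. Because the invariants are only likely invariants learned from a finite set of executions, a repair of this kind is a candidate that still has to be checked against the monitors.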

To evaluate ClearView, Perkins et al. [2009] conducted a Red Team exercise, wherein members of the Red Team were asked to perform attacks on a program protected by ClearView, using exploits discovered on the unprotected version of the program. Firefox, a popular open-source web browser, was used as the target application for the exercise. The external Red Team was able to generate ten code-injection exploits in the target application. When ClearView was used, all of these attacks were detected by its monitors, and thus prevented. In seven out of ten cases, ClearView generated a patch that allowed the program to survive the attack and to safely resume its execution.

• LeakFix: LeakFix [Gao et al., 2015] uses a series of program analyses to locate and repair (a subset of) memory leaks in C programs. Identified leaks are patched by inserting deallocation statements at appropriate points in the program. The resulting patches are guaranteed to not interrupt normal program execution. In an evaluation on 15 programs, comprising 522 KLOC, each containing multiple leaks, LeakFix generated patches for 28% of the leaks. LeakFix exemplifies an alternative approach to automated program repair: tackling specific defect classes with high quality and accuracy. This approach is an appealing one, but in practice, it is difficult to cleanly assign bugs to any one particular defect class.
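The patching step described for LeakFix, inserting a deallocation at an appropriate point, can be illustrated on a deliberately tiny model. The sketch below is not LeakFix's analysis: it assumes a straight-line program given as a list of `(operation, variable)` pairs, with no branches, no aliasing and at most one allocation per variable. The function name `insert_frees` is hypothetical.

```python
def insert_frees(statements):
    """Insert ("free", v) after the last use of each allocated, unfreed v.

    `statements` is a list of (op, var) pairs with op in
    {"malloc", "use", "free"}. Assumes straight-line code, no aliasing,
    and at most one malloc per variable.
    """
    allocated = {var for op, var in statements if op == "malloc"}
    freed = {var for op, var in statements if op == "free"}
    leaked = allocated - freed

    # Position of the last statement that touches each leaked variable.
    last_touch = {}
    for index, (op, var) in enumerate(statements):
        if var in leaked:
            last_touch[var] = index
    free_after = {index: var for var, index in last_touch.items()}

    patched = []
    for index, statement in enumerate(statements):
        patched.append(statement)
        if index in free_after:
            patched.append(("free", free_after[index]))
    return patched


program = [
    ("malloc", "a"),
    ("malloc", "b"),
    ("use", "a"),
    ("free", "a"),
    ("use", "b"),  # "b" is never freed: a leak
]
patched_program = insert_frees(program)
```

For this input the only change is a `("free", "b")` appended after the final use of `b`. Placing the deallocation after the last use is what keeps the patch from interrupting normal execution in this model; with branches and aliasing, establishing that property requires the kind of program analyses the text attributes to LeakFix.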

• Genetic Improvement: Search-based program repair can be viewed as part of the wider field of Genetic Improvement (GI) [Langdon,2015], which seeks to apply machine learning and search techniques to improving existing programs, more generally. In contrast to Genetic Programming, which typically attempts to evolve a program from scratch, GI uses existing code as its seed. In addition to program repair, GI has been used to automatically improve runtime performance, reduce power consumption, port functionality, and more.
