Code Reuse - Snow_unc_0153D

“Beware of the Turing tar-pit in which everything is possible but nothing of interest is easy.” -Alan Perlis in Epigrams on Programming (1982)

The general principle of any code reuse attack is to redirect the logical program flow to instructions already present in memory, then use those instructions to provide alternative program logic. There exist countless methods of orchestrating such an attack, the simplest of which involves an adversary redirecting the program execution to an existing library function (Solar Designer, 1997; Nergal, 2001). More generally, Shacham (2007) introduced return-oriented programming (ROP), showing that attacks may combine short instruction sequences from within functions, called gadgets, allowing an adversary to induce arbitrary (i.e., Turing-complete) program behavior. This concept was later generalized by removing the reliance on actual return instructions (Checkoway et al., 2010). However, for simplicity, this section highlights the basic idea of code reuse using ROP in Figure 2.8.

[Diagram labels: Program Memory (Stack, Heap, Code (Executable), Libraries); Adversary; Heap Vulnerability; Stack Pivot; STORE, LOAD, and ADD gadgets, each ending in ret; Return Addresses 1 to 3; stack pointer (SP); Steps ① to ⑦.]

Figure 2.8: Basic overview of code reuse attacks.

First, the adversary writes a so-called ROP payload into the application’s memory space. In particular, the payload is placed into a memory area that can be controlled by the adversary, i.e., the area is writable and the adversary knows its start address. For instance, an exploit that involves JavaScript can allocate the payload as a string, which also enables one to include binary data by using the JavaScript unescape function. The payload mainly consists of a number of pointers (the return addresses) and any other data that is needed for running the attack (Step ①). The next step is to exploit a memory error vulnerability of the target program to hijack the intended execution flow (Step ②), as covered in the last section. In the example shown in Figure 2.8, the adversary exploits a spatial error on the heap by overwriting a function pointer with the address of a so-called stack pivot sequence (Zovi, 2010). Once the overwritten function pointer is used by the application, the execution flow is redirected to the stack pivot instructions (Step ③), which were already present in the application’s memory.

Loosely speaking, stack pivot sequences change the value of the stack pointer (esp) to a value stored in another register. Hence, by controlling that register⁶, the attacker can arbitrarily change the stack pointer. The most common strategy is to use a stack pivot sequence that directs the stack pointer to the beginning of the payload (Step ④). A concrete example of a stack pivot sequence is the x86 assembler code sequence mov esp,eax; ret⁷. The sequence changes the value of the stack pointer to the value stored in register eax and afterwards invokes a return (ret) instruction. The x86 ret instruction simply loads the address pointed to by esp into the instruction pointer and increments esp by one word. Hence, the execution continues at the first gadget (STORE) pointed to by Return Address 1 (Step ⑤). In addition, the stack pointer is increased and now points to Return Address 2.

A gadget represents an atomic operation such as LOAD, ADD, or STORE, followed by a ret instruction. For example, on the x86, a LOAD gadget can take the form of pop eax; ret, hence loading the next value present on the stack into the eax register. Similarly, an ADD gadget could be implemented with add eax,ebx; ret, among other possibilities. It is exactly the terminating ret instruction that enables the chained execution of gadgets by loading the address the stack pointer points to (Return Address 2) into the instruction pointer and updating the stack pointer so that it points to the next address in the payload (Return Address 3). Steps ⑤ to ⑦ are repeated until the adversary reaches her goal. To summarize, the combination of different gadgets allows an adversary to induce arbitrary program behavior.
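The pivot-and-chain mechanics described above can be sketched with a toy interpreter. This is only an illustrative model, not real x86 emulation: the gadget addresses, the payload and result addresses, the pop ebx, pop ecx, and mov [ecx],eax gadgets, and all Python names are assumptions introduced for the example. It models ret as "load the word at esp into the instruction pointer, then advance esp by one word".

```python
WORD = 4  # assumed 32-bit word size


class ToyMachine:
    """A drastically simplified CPU model: a few registers and a sparse memory."""

    def __init__(self):
        self.reg = {"eax": 0, "ebx": 0, "ecx": 0, "esp": 0}
        self.mem = {}      # address -> word
        self.trace = []    # names of the gadgets executed, in order

    def pop(self):
        """Read the word esp points to, then advance esp by one word."""
        value = self.mem[self.reg["esp"]]
        self.reg["esp"] += WORD
        return value

    def write_words(self, base, words):
        for i, word in enumerate(words):
            self.mem[base + i * WORD] = word

    def run(self, eip, gadgets):
        """Execute gadgets until control reaches an address that holds no gadget."""
        while eip in gadgets:
            name, body = gadgets[eip]
            self.trace.append(name)
            body(self)
            eip = self.pop()  # the terminating `ret` of every gadget


def mov_esp_eax(m):
    m.reg["esp"] = m.reg["eax"]


def pop_into(register):
    def body(m):
        m.reg[register] = m.pop()
    return body


def add_eax_ebx(m):
    m.reg["eax"] = (m.reg["eax"] + m.reg["ebx"]) & 0xFFFFFFFF


def store_eax_at_ecx(m):
    m.mem[m.reg["ecx"]] = m.reg["eax"]


# Hypothetical gadget addresses inside the executable region.
PIVOT, POP_EAX, POP_EBX, POP_ECX, ADD, STORE = (0x08001000 + i * 0x10 for i in range(6))

GADGETS = {
    PIVOT: ("mov esp,eax; ret", mov_esp_eax),
    POP_EAX: ("pop eax; ret", pop_into("eax")),
    POP_EBX: ("pop ebx; ret", pop_into("ebx")),
    POP_ECX: ("pop ecx; ret", pop_into("ecx")),
    ADD: ("add eax,ebx; ret", add_eax_ebx),
    STORE: ("mov [ecx],eax; ret", store_eax_at_ecx),
}

PAYLOAD_BASE = 0x0A000000  # assumed adversary-controlled, writable area
RESULT_ADDR = 0x0B000000   # assumed writable location for the result


def run_demo():
    m = ToyMachine()
    # The payload interleaves return addresses with the data the gadgets consume.
    payload = [POP_EAX, 40, POP_EBX, 2, ADD, POP_ECX, RESULT_ADDR, STORE, 0]
    m.write_words(PAYLOAD_BASE, payload)
    m.reg["eax"] = PAYLOAD_BASE   # the register the adversary controls
    m.run(PIVOT, GADGETS)         # the hijacked function pointer targets the pivot
    return m
```

Calling run_demo() leaves the sum of the two payload constants at RESULT_ADDR. The trailing 0 in the payload is simply an address with no gadget, which stops the toy machine; a real chain would instead continue into further gadgets or a function call.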

⁶ To control the register, the adversary can either use a buffer overflow exploit that overwrites memory areas that are used to load the target register, or invoke a sequence that initializes the target register and then directly calls the stack pivot.

⁷ The Intel assembly notation described by the Intel 64 and IA-32 Architectures Software Developer’s Manual (Volume 2) is used throughout this dissertation. In general, instructions take the form of instr dest,src.

Randomization for Exploit Mitigation: As noted in the last section, a well-accepted countermeasure against code reuse attacks is the randomization of the application’s memory layout. The basic idea of address space layout randomization (ASLR) dates back to Forrest et al. (1997), wherein a new stack memory allocator was introduced that adds a random pad for stack objects larger than 16 bytes. Today, ASLR is enabled on nearly all modern operating systems such as Windows, Linux, iOS, or Android. For the most part, current ASLR schemes randomize the base (start) address of segments such as the stack, heap, libraries, and the executable itself. This basic approach is depicted in Figure 2.9, where the start address of an executable is relocated between consecutive runs of the application. As a result, an adversary must guess the location of the functions and instruction sequences needed for successful deployment of a code reuse attack. The intent of ASLR is to hinder such guessing schemes to a point wherein they are probabilistically infeasible within a practical time-frame.
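To make "probabilistically infeasible" concrete, the following minimal sketch estimates the cost of guessing a randomized base address. It assumes the base is drawn uniformly from 2^n equally likely positions (n bits of entropy) and distinguishes two simplified cases: the layout is re-randomized after every failed attempt, or it stays fixed across attempts so that wrong guesses can be ruled out. The function names and the entropy values used with them are illustrative assumptions, not measurements of any particular system.

```python
def expected_guesses(entropy_bits, rerandomized):
    """Expected number of attempts until the first correct guess."""
    positions = 2 ** entropy_bits
    if rerandomized:
        # Each attempt succeeds independently with probability 1/positions
        # (geometric distribution), so the mean is `positions`.
        return float(positions)
    # Fixed layout: the adversary searches without replacement.
    return (positions + 1) / 2


def success_probability(entropy_bits, attempts, rerandomized):
    """Probability that at least one of `attempts` guesses is correct."""
    positions = 2 ** entropy_bits
    if rerandomized:
        return 1 - (1 - 1 / positions) ** attempts
    return min(attempts, positions) / positions
```

Under these assumptions, each additional bit of entropy doubles the expected work, which is why the attempt count grows from tens of thousands at 16 bits to hundreds of millions at 28 bits.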

[Diagram labels: Program Memory; Executable containing the ADD, LOAD, and STORE gadgets and the stack pivot, each ending in ret; Execution i and Execution i + 1; start addresses 0x08000000 and 0x07000000.]

Figure 2.9: Address Space Layout Randomization (ASLR)

Unfortunately, the realization of ASLR in practice suffers from two main problems. First, the entropy on 32-bit systems is too low, and thus ASLR can be bypassed by means of brute-force attacks (Shacham et al., 2004; Liu et al., 2011). Second, all ASLR solutions are vulnerable to memory disclosure attacks (Sotirov and Dowd, 2008b; Serna, 2012b), where the adversary gains knowledge of a single runtime address, e.g., from a function pointer within a vtable, and uses that information to “de-randomize” memory. Modern exploits use JavaScript or ActionScript (hereafter referred to as a script) and a memory-disclosure vulnerability to reveal the location of a single code module (e.g., a dynamically-loaded library) loaded in memory. Since current ASLR implementations only randomize on a per-module level, disclosing a single address within a module effectively reveals the location of every piece of code within that module. Therefore, any gadgets from a disclosed module may be determined manually by the attacker offline prior to deploying the exploit. Once the prerequisite information has been gathered, the exploit script simply builds a payload from a pre-determined template by adjusting offsets based on the module’s disclosed location at runtime.
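The template-adjustment step can be sketched as follows. The gadget offsets, the offset of the disclosed pointer's target, the template contents, and the page-alignment sanity check are all hypothetical values and assumptions chosen for illustration; in a real exploit they would come from offline analysis of one specific build of the module.

```python
import struct

# Hypothetical offsets, relative to the module base, determined offline.
GADGET_OFFSETS = {
    "pop_eax": 0x1A2B,
    "pop_ebx": 0x2B3C,
    "add_eax_ebx": 0x3C4D,
}
LEAKED_TARGET_OFFSET = 0x7000  # assumed offset of the address the disclosure reveals

# Pre-determined template: gadget references are resolved at runtime,
# data words are copied as-is.
TEMPLATE = [
    ("gadget", "pop_eax"), ("data", 40),
    ("gadget", "pop_ebx"), ("data", 2),
    ("gadget", "add_eax_ebx"),
]


def module_base(leaked_address):
    """Recover the module base from a single disclosed address."""
    base = leaked_address - LEAKED_TARGET_OFFSET
    if base % 0x1000 != 0:  # assumption: modules are loaded page-aligned
        raise ValueError("leaked address does not match the expected module build")
    return base


def build_payload(leaked_address):
    """Resolve the template against the disclosed location; pack as little-endian words."""
    base = module_base(leaked_address)
    words = [
        base + GADGET_OFFSETS[value] if kind == "gadget" else value
        for kind, value in TEMPLATE
    ]
    return struct.pack("<%dI" % len(words), *words)
```

Because only the base changes between runs, the same template works for every execution once one address inside the module is known.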

To confound these attacks, a number of fine-grained ASLR and code randomization schemes have recently appeared in the academic literature (Bhatkar et al., 2005; Kil et al., 2006; Pappas et al., 2012; Hiser et al., 2012; Wartell et al., 2012). These techniques are elaborated on later (in Chapter 3 §3.1), but for now it is sufficient to note that the underlying idea in these works is to randomize the data and code structure, for instance, by shuffling functions or basic blocks (ideally for each program run (Wartell et al., 2012)). As shown in Figure 2.10, the result of this approach is that the location of all gadgets is randomized. The assumption underlying all these works is that the disclosure of a single address no longer allows an adversary to deploy a code reuse attack.
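The difference between base-only and fine-grained randomization can be illustrated with a small layout model. The function names, sizes, and address range below are hypothetical, and shuffling whole functions stands in for the various schemes cited above; the sketch only illustrates the stated assumption that one disclosed address no longer fixes the location of other code.

```python
import random

# Hypothetical functions (name, size in bytes), each assumed to contain a gadget.
FUNCTIONS = [("pivot_fn", 0x40), ("load_fn", 0x80), ("add_fn", 0x20), ("store_fn", 0x60)]


def layout(base, functions):
    """Place functions back to back starting at `base`; return name -> address."""
    addresses, cursor = {}, base
    for name, size in functions:
        addresses[name] = cursor
        cursor += size
    return addresses


def random_base(rng):
    return rng.randrange(0x01000000, 0x7F000000, 0x10000)


def coarse_aslr(rng):
    """Randomize only the base address; the internal order stays fixed."""
    return layout(random_base(rng), FUNCTIONS)


def fine_grained(rng):
    """Randomize the base address and shuffle the function order."""
    shuffled = list(FUNCTIONS)
    rng.shuffle(shuffled)
    return layout(random_base(rng), shuffled)


def distinct_deltas(randomize, runs=200, seed=0):
    """Collect the distances between two functions observed over many simulated runs."""
    rng = random.Random(seed)
    deltas = set()
    for _ in range(runs):
        addresses = randomize(rng)
        deltas.add(addresses["store_fn"] - addresses["load_fn"])
    return deltas
```

With coarse_aslr the distance between load_fn and store_fn is the same in every run, so leaking one address yields the other; with fine_grained the distance varies from run to run, so a fixed offline offset no longer suffices.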

[Diagram labels: Program Memory; Executable in Execution i with the gadgets in the order ADD, LOAD, STORE, Stack Pivot; Executable in Execution i + 1 with the order Stack Pivot, STORE, LOAD, ADD; each gadget ends in ret; start addresses 0x08000000 and 0x07000000.]

Figure 2.10: Fine-Grained Memory and Code Randomization

Chapter 3 thoroughly examines the benefits and limitations of fine-grained ASLR. In short, however, these new mitigations offer no substantial benefit over existing ASLR schemes so long as one can construct payloads “online” rather than rely on manually crafting ROP payloads prior to exploitation.

Fortunately (for defenders), crafting code reuse payloads of interest is not easy, and each payload is highly specific to the environment in which it runs. For these reasons, the trend has been to build the simplest useful ROP payload—one that bootstraps execution of a code injection payload.
