DIFT Implementations - THE DESIGN AND IMPLEMENTATION OF HARDWARE SYSTEMS FOR INFORMATION FLOW T

Owing to the popularity and versatility of the DIFT security model, researchers have explored applying DIFT to software security in a number of environments.

2.3.1 Programming language platforms

One approach to applying DIFT is via language DIFT implementations, where DIFT capabilities are added to a language interpreter or runtime. Researchers have proposed DIFT implementations for many languages, such as PHP [67] and Java [33]. Additionally, DIFT concepts are already used in limited situations by many existing interpreted languages, such as the taint mode found in Perl [70] and Ruby [84]. In such implementations, the language interpreter serves as the runtime environment. From a DIFT perspective, memory consists of language variables, which are extended to accommodate taint.
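As a rough illustration of how language variables can be extended to accommodate taint, the following Python sketch marks untrusted strings with a subclass and propagates the mark through concatenation. The names TaintedStr and is_tainted are hypothetical and are not taken from any of the cited systems. A real language-level implementation would track taint inside the interpreter and cover every string operation, whereas this sketch handles only the + operator.

```python
class TaintedStr(str):
    """A string value that originated from an untrusted source (hypothetical name)."""

    def __add__(self, other):
        # tainted + anything: the result inherits the taint.
        return TaintedStr(str.__add__(self, other))

    def __radd__(self, other):
        # clean + tainted: Python falls back to the right operand's __radd__,
        # so the result is tainted here as well.
        return TaintedStr(str.__add__(other, self))


def is_tainted(value):
    """Report whether a value carries the taint mark."""
    return isinstance(value, TaintedStr)


user_input = TaintedStr("1 OR 1=1")  # e.g. a value read from an HTTP request
query = "SELECT * FROM t WHERE id = " + user_input
print(is_tainted(query))
```

Operations that the sketch does not override, such as formatting or slicing, return plain strings and silently drop the taint, which is one reason practical implementations modify the interpreter itself.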

Language platforms for DIFT are very flexible, and have been shown to provide good protection against high-level vulnerabilities, with low performance overheads [22, 26]. Researchers have modified the interpreters of dynamic languages such as PHP to provide protection against a wide variety of semantic, web-based input validation bugs such as SQL injection and cross-site scripting.
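To make the SQL injection case concrete, the sketch below tracks taint per character and checks it at the point where a query would be sent to the database. The policy shown (reject any tainted quote or semicolon) is an assumption chosen for brevity, and build_query, check_sql_sink and SQL_METACHARS are hypothetical names. The cited systems may use different and more precise policies.

```python
SQL_METACHARS = set("'\";")  # assumed policy: these must never come from user input


def build_query(template, user_input):
    """Fill the single "{}" in template; return the query and a per-character taint list."""
    prefix, suffix = template.split("{}")
    query = prefix + user_input + suffix
    taint = [False] * len(prefix) + [True] * len(user_input) + [False] * len(suffix)
    return query, taint


def check_sql_sink(query, taint):
    """Raise if any tainted character is an SQL metacharacter; otherwise return the query."""
    for ch, tainted in zip(query, taint):
        if tainted and ch in SQL_METACHARS:
            raise PermissionError("tainted SQL metacharacter: %r" % ch)
    return query


TEMPLATE = "SELECT * FROM users WHERE name = '{}'"
safe = check_sql_sink(*build_query(TEMPLATE, "alice"))
```

The quotes written by the programmer in the template are untainted and pass the check, while the same characters arriving through user input are rejected.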

The downside to language DIFT platforms is their inability to address vulnerabilities such as low-level memory corruption exploits or operating system errors. Additionally, since this technique is language-specific, it is impractical for defending against vulnerabilities that occur across a wide variety of languages.

2.3.2 Dynamic binary translation

Another method of applying DIFT in software is using a Dynamic Binary Translator (DBT). In a DBT-based DIFT implementation, the application (or even the entire system) is run within a DBT. The binary translation framework maintains metadata, or state associated with the application’s data. This metadata is used to maintain information about the taintedness of the associated data. The DBT dynamically inserts instructions for DIFT when performing binary translation. Every instruction from the application has an associated metadata instruction that manipulates the associated taint values.
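The pairing of each application instruction with a metadata instruction can be sketched with a toy register machine. The three-opcode instruction set, the t_set and t_or metadata opcodes, and the function names below are all invented for illustration. Real translators operate on machine code and keep taint in shadow registers and shadow memory.

```python
def translate(program):
    """Return a new instruction stream with one taint instruction after each original one."""
    out = []
    for ins in program:
        out.append(ins)
        op = ins[0]
        if op == "input":    # ("input", dst, value): data from an untrusted source
            out.append(("t_set", ins[1], True))
        elif op == "const":  # ("const", dst, value): trusted constant
            out.append(("t_set", ins[1], False))
        elif op == "add":    # ("add", dst, a, b): result is tainted if either source is
            out.append(("t_or", ins[1], ins[2], ins[3]))
        else:
            raise ValueError("unknown opcode: %r" % op)
    return out


def run(translated):
    """Execute data and taint instructions; return (registers, taint state)."""
    regs, taint = {}, {}
    for ins in translated:
        op = ins[0]
        if op in ("input", "const"):
            regs[ins[1]] = ins[2]
        elif op == "add":
            regs[ins[1]] = regs[ins[2]] + regs[ins[3]]
        elif op == "t_set":
            taint[ins[1]] = ins[2]
        elif op == "t_or":
            taint[ins[1]] = taint[ins[2]] or taint[ins[3]]
    return regs, taint


program = [
    ("input", "r0", 5),
    ("const", "r1", 10),
    ("add", "r2", "r0", "r1"),  # depends on untrusted r0
    ("const", "r3", 1),
    ("add", "r4", "r1", "r3"),  # depends only on trusted values
]
regs, taint = run(translate(program))
```

The translated stream is twice as long as the original, which gives a first intuition for why software-only tracking slows execution.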

Dynamic binary translators have been used for performing DIFT both on individual programs [65] and on the entire system [5]. Since the security analysis is performed in software, the policies employed can be arbitrarily complex and flexible. This provides the advantage of being able to use the same infrastructure for a wide range of policies. Binary translation, however, requires inserting an additional instruction to manipulate the taint associated with each of the original program’s instructions, so the disadvantage of this scheme is its high performance overhead. DBT-based DIFT systems have been shown to have performance overheads ranging from 3× [73] to 37× [66], depending upon the application and policies in question. Applying DIFT support to the entire system requires that the DBT solution virtualize all devices, the MMU, the OS, and all applications. The overheads of performing this virtualization alone using whole-system binary translation frameworks such as QEMU are between 5× and 20× [5]. Adding DIFT support increases these overheads significantly. Such high performance overheads restrict the widespread applicability of a DBT-based DIFT solution.

Another drawback of binary translation frameworks is the lack of support for multithreaded applications. When executing a multithreaded workload, the DIFT platform must ensure consistency between updates to data and tags, so that all other threads in the system perceive these updates as atomic operations [18]. Failing to do so could cause race conditions that lead to false negatives (undetected security breaches) or false positives (spurious security exceptions), which undermine the utility of the DIFT mechanism. Software DBT schemes deal with this issue by either forgoing support for multiple threads entirely [9, 73], restricting applications to only execute a single thread at a time [65], or requiring tool developers to explicitly implement the locking mechanisms needed to access metadata [54]. Since many security-critical workloads such as databases and web servers are multithreaded, this limits the practicality and applicability of the DBT DIFT solution. Recent research into hybrid DIFT systems has shown that with additional hardware support, multithreaded applications can be run within DBTs [40], but this requires significant hardware modifications to existing systems.
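The atomicity requirement can be illustrated with a minimal sketch in which a single lock covers both the data write and the tag write, so no thread can observe a new value paired with a stale tag. This corresponds loosely to the explicit-locking option mentioned above. The TaggedCell class is a hypothetical name and is not drawn from any cited tool.

```python
import threading


class TaggedCell:
    """A memory location whose value and taint tag are always updated together."""

    def __init__(self, value=0, tainted=False):
        self._lock = threading.Lock()
        self._value = value
        self._tainted = tainted

    def store(self, value, tainted):
        # Without the lock, another thread could run between these two
        # assignments and see the new value with the old tag.
        with self._lock:
            self._value = value
            self._tainted = tainted

    def load(self):
        # Readers take the same lock so they always see a matching pair.
        with self._lock:
            return self._value, self._tainted
```

Taking a lock on every tagged access is itself costly, which helps explain why some software schemes serialize threads or leave locking to the tool developer.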

2.3.3 Hardware DIFT

An alternative approach to DIFT is to perform the taint tracking and checking in hardware [14, 20, 81]. The hardware is responsible for maintaining and managing the state associated with taint tracking. Hardware, being the lowest layer of abstraction in a computer system, is the ideal level for implementing DIFT support. All programs, binaries and executables must run on top of the hardware. Implementing DIFT mechanisms in hardware allows the DIFT security policies to be applied to scripting languages, binaries, applications, or even operating systems. This renders the protection independent of the choice of programming language, since all languages must eventually be translated to some form of assembly language understood by the hardware.

This approach has a very low performance overhead, even when applied to the whole operating system, because tag propagation and checks occur in hardware, often in parallel with the execution of the original instruction. Additionally, hardware can apply DIFT policies to the whole system without the performance and complexity challenges faced by whole-system dynamic binary translation.
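A software model can only approximate this behavior, but the following sketch conveys the idea: every word carries a tag, the ALU computes the result value and the result tag from the same operands in one step, and a check fires when a tainted word is used as a jump target. The one-bit tag, the 32-bit word width, and the indirect-jump policy are assumptions made for illustration, as are the names Word, alu_add and jump_indirect.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """A machine word extended with a one-bit taint tag."""
    value: int
    tag: bool  # True means tainted


class SecurityException(Exception):
    """Raised when a tag check fails."""


def alu_add(a, b):
    # The data path and the tag path use the same operands, modelling tag
    # propagation that happens alongside the original instruction.
    return Word((a.value + b.value) & 0xFFFFFFFF, a.tag or b.tag)


def jump_indirect(target):
    # Assumed check policy: never transfer control to a tainted address.
    if target.tag:
        raise SecurityException("tainted jump target 0x%x" % target.value)
    return target.value


base = Word(0x4000, False)
offset = Word(0x10, True)  # e.g. derived from network input
target = alu_add(base, offset)
```

In this model the program never issues a separate taint instruction, in contrast to the binary translation approach, where the instruction stream itself grows.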

Unlike DBT-based solutions, hardware DIFT platforms can also apply protection to multithreaded applications. This can be done either by ensuring atomic updates to both data and tags [24, 41], or by making minor modifications to the coherence protocols to ensure that an atomic view of data and tags is always presented to other processors [40]. Since computer systems are migrating to multi-core environments, such support is key in ensuring the practical viability of the DIFT solution. Overall, hardware DIFT support has been shown to provide comprehensive protection against both low-level memory corruption exploits such as buffer overflows [20, 81], and high-level web attacks such as SQL injections [66], with low performance overheads.

The downside to hardware DIFT systems, however, is their inflexibility. Hardware architectures implemented thus far use single fixed security policies to catch all classes of attacks. Worms that target multiple vulnerabilities are, however, becoming increasingly common [11]. Such worms can bypass the protection offered by current hardware DIFT architectures, since these architectures can protect against only one kind of exploit using a single security policy. Casting security policies in silicon impairs the ability of the solution to adapt to future threats, and limits the utility of the solution. Modern software is extremely complex and ridden with corner cases that often require special handling. The lack of flexibility restricts the ability of a hardware DIFT system to handle such cases. We discuss this issue further in Chapter 3.
