4. Code Obfuscation Techniques

4.2. Types of Obfuscation

4.2.3. Control Transformations

Control transformations alter the control constructs of the program code. “The idea here is to disguise the real control flow in a program” [34]. With these transformations a certain amount of computational overhead is unavoidable, meaning that there is a trade-off between efficiency and obscurity ([8] p.10).

4.2.3.1. Opaque Constructs

An opaque variable is a variable that has some property q which is known a priori to the obfuscator, but which is difficult for the deobfuscator to deduce ([8] p.10). Similarly, an opaque predicate is a boolean expression for which a deobfuscator can deduce its outcome only with great difficulty, while this outcome is well known to the obfuscator ([8] p.10). Opaque constructs are the key to highly resilient control transformations ([8] p.10).

4.2.3.1.1. Definition

“A variable V is opaque at a point p in a program, if V has a property q at p which is known at obfuscation time. We write this as V_p^q or V^q if p is clear from context.

A predicate P is opaque at p if its outcome is known at obfuscation time. We write P_p^F (P_p^T) if P always evaluates to False (True) at p, and P_p^? if P sometimes evaluates to True and sometimes to False” ([8] p.10).

Figure 5.7: Different types of opaque predicates. Solid lines indicate paths that may sometimes be taken, dashed lines paths that will never be taken ([8] p.10).
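The three kinds of opaque predicate defined above can be illustrated with a minimal Java sketch. The class name, the method names and the arithmetic identities used here are assumptions chosen for illustration; they are not predicates taken from [8], and a real obfuscator would pick expressions that are much harder to analyse.

```java
// Hypothetical illustration: names and identities are assumptions, not from [8].
final class OpaquePredicates {
    // P^T: the product of two consecutive integers is always even,
    // and int wrap-around does not change the lowest bit.
    static boolean alwaysTrue(int x) {
        return (x * (x + 1)) % 2 == 0;
    }

    // P^F: the two lowest bits of a square are 00 or 01, never 10.
    static boolean alwaysFalse(int x) {
        return ((x * x) & 3) == 2;
    }

    // P^?: the outcome depends on the run-time value of x.
    static boolean sometimesTrue(int x) {
        return x % 2 == 0;
    }
}
```

The obfuscator knows the outcome of the first two predicates when it inserts them, whereas a deobfuscator has to rediscover the underlying identity.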

4.2.3.1.2. Trivial Constructs

“An opaque construct is trivial if a deobfuscator can crack it (deduce its value) by a static local analysis. An analysis is local if it is restricted to a single basic block of a control flow graph” ([8] p.11).

{ int v, a=5, b=6;
  v = a + b;
  if (b > 5)^T …
  if (random(1,5) < 0)^F …
}

Figure 5.8 An example of a trivial opaque construct. ([8] p.12)

4.2.3.1.3. Weak Constructs

“An opaque construct is weak if a deobfuscator can crack it by a static global analysis. An analysis is global if it is restricted to a single control flow graph” ([8] p.11).

{ int v, a=5, b=6;
  if (…) …
  …  (b is unchanged)
  …
  if (b < 7)^T a++;
  v = (a > 5) ? b*b : b;
}

Figure 5.9 An example of a weak opaque construct. ([8] p.12)
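A runnable rendering of the weak construct may make the distinction from a trivial construct easier to see. The sketch below is an assumption-laden illustration: the class name, the `flag` parameter and the body of the intervening block are hypothetical stand-ins for the elided code in the figure.

```java
// Hypothetical rendering of a weak opaque construct; the intervening block
// is an invented stand-in for the elided code.
final class WeakOpaque {
    static int compute(boolean flag) {
        int v;
        int a = 5, b = 6;
        if (flag) {
            a = 5;            // intervening block: touches a, leaves b unchanged
        }
        if (b < 7) {          // always true, but proving it means tracking b
            a++;              // across the earlier blocks (global analysis)
        }
        v = (a > 5) ? b * b : b;   // a is always 6 here, so v is always b * b
        return v;
    }
}
```

Because the test on `b` is separated from the assignment to `b` by other basic blocks, a purely local analysis is not enough; an analysis of the whole control flow graph is needed.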

4.2.3.2. Control Aggregation

“Control aggregation obfuscation changes the way in which program statements are grouped together” [34]. It does this by breaking up computations that logically belong together and merging computations that do not ([8] p.10). In other words, “(1) code which the programmer aggregated into a method (presumably because it logically belonged together) should be broken up and scattered over the program and (2) code which seems not to belong together should be aggregated into one method” ([8] p.14).

Method inlining removes procedural abstractions from the program ([8] p.14). It is “a highly resilient transformation, since once a procedure call has been replaced with the body of the called procedure and the procedure itself has been removed, there is no trace of the abstraction left in the code” ([8] p.14). Method outlining involves turning a sequence of statements into a subroutine and is a very useful companion to inlining ([8] p.14). Both transformations have medium potency and free cost; they differ in resilience, with inlining highly resilient as quoted above and outlining strongly resilient.
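Inlining and outlining can be sketched on a small Java example. All names below are hypothetical, and the particular split chosen for the outlined method is an arbitrary assumption made to show that the new method need not correspond to a meaningful abstraction.

```java
// Hypothetical illustration of method inlining and outlining.
final class Aggregation {
    // Original: the programmer factored out a meaningful helper.
    static int square(int x) {
        return x * x;
    }

    static int sumOfSquaresOriginal(int a, int b) {
        return square(a) + square(b);
    }

    // Inlined: each call is replaced by the body of square(); once square()
    // itself is deleted, nothing records that the abstraction existed.
    static int sumOfSquaresInlined(int a, int b) {
        return a * a + b * b;
    }

    // Outlined: an arbitrary part of the computation is moved into a new
    // method that has no obvious meaning of its own.
    static int outlinedPart(int a, int b) {
        return a * a + b;
    }

    static int sumOfSquaresOutlined(int a, int b) {
        return outlinedPart(a, b) + b * (b - 1);   // a*a + b + b*b - b
    }
}
```

All three versions compute the same value; only the grouping of statements into methods differs.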

Method interleaving transformations merge the bodies and parameters of two or more methods declared in the same class and add an extra opaque variable to discriminate between calls to the individual methods ([8] p.15). Method cloning transformations involve generating methods (within the same class) that appear different to each other but have identical behaviour, with an opaque predicate to select the correct method ([8] p.15). The quality of both types of transformations depends on the quality of the opaque predicate used.
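Method interleaving can be sketched as follows. The class, the two original methods and the parity convention for the extra parameter are all hypothetical; in particular, a plain parity test is only a stand-in for a genuinely opaque variable, which would be computed in a way that is hard to analyse statically.

```java
// Hypothetical illustration of method interleaving.
final class Interleaving {
    // Two original methods declared in the same class.
    static int area(int w, int h) {
        return w * h;
    }

    static int perimeter(int w, int h) {
        return 2 * (w + h);
    }

    // Interleaved method: merged parameter list and bodies, plus an extra
    // variable v that discriminates between the two original calls.
    // Assumed convention: even v selects area, odd v selects perimeter.
    static int merged(int w, int h, int v) {
        if (v % 2 == 0) {
            return w * h;
        }
        return 2 * (w + h);
    }
}
```

A call `area(3, 4)` would become something like `merged(3, 4, 10)`, and `perimeter(3, 4)` something like `merged(3, 4, 7)`, so the quality of the transformation rests on how well the value of `v` is hidden at each call site.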

Looping transformations affect the control flow in loop constructs. These include loop blocking, in which the loop is “decomposed into blocks of a constant blocking factor (the step is multiplied by this factor)” [39]; loop unrolling, in which the body of the loop is replicated a number of times; and loop fission, in which a loop with a compound body is turned into several loops with the same iteration space ([8] p.16). All of these transformations have low potency, weak resilience and free cost, except for loop unrolling, which has cheap cost ([8] p.30).
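The three loop transformations can be compared on one small loop. This is a minimal sketch with hypothetical names; the blocking factor of 4 and the unrolling factor of 2 are arbitrary assumptions.

```java
// Hypothetical illustration of loop fission, unrolling and blocking.
final class LoopTransforms {
    // Original: one loop with a compound body. Returns {sum, max}.
    static int[] original(int[] a) {
        int sum = 0, max = Integer.MIN_VALUE;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
            max = Math.max(max, a[i]);
        }
        return new int[] { sum, max };
    }

    // Fission: the compound body is split over two loops with the same
    // iteration space.
    static int[] fission(int[] a) {
        int sum = 0, max = Integer.MIN_VALUE;
        for (int i = 0; i < a.length; i++) sum += a[i];
        for (int i = 0; i < a.length; i++) max = Math.max(max, a[i]);
        return new int[] { sum, max };
    }

    // Unrolling by 2: the body is replicated; a clean-up loop handles an
    // odd trailing element.
    static int[] unrolled(int[] a) {
        int sum = 0, max = Integer.MIN_VALUE;
        int i = 0;
        for (; i + 1 < a.length; i += 2) {
            sum += a[i];
            max = Math.max(max, a[i]);
            sum += a[i + 1];
            max = Math.max(max, a[i + 1]);
        }
        for (; i < a.length; i++) {
            sum += a[i];
            max = Math.max(max, a[i]);
        }
        return new int[] { sum, max };
    }

    // Blocking with factor 4: the outer loop steps by the blocking factor
    // and the inner loop walks one block.
    static int[] blocked(int[] a) {
        final int factor = 4;
        int sum = 0, max = Integer.MIN_VALUE;
        for (int start = 0; start < a.length; start += factor) {
            int end = Math.min(start + factor, a.length);
            for (int i = start; i < end; i++) {
                sum += a[i];
                max = Math.max(max, a[i]);
            }
        }
        return new int[] { sum, max };
    }
}
```

Each variant returns the same result as the original, which is consistent with the low potency reported for these transformations: the loops look different but remain easy to recognise.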

4.2.3.3. Control Ordering

“Control ordering obfuscation alters the order in which statements are executed. For example, loops can be made to iterate backwards instead of forwards” [34]. This follows on from the same principles of locality outlined in Section 5.2.2.4. “There is locality among terms within expressions, statements within basic blocks, basic blocks within methods, methods within classes, classes within files, etc. For some types of items (methods within classes, for example) this is trivial. In other cases (such as statements within basic blocks) a data dependency analysis will have to be performed to determine which reorderings are legal” ([8] p.16,17). Data dependency analysis “involves the determination of what variables depend on what other variables” [29].

“These transformations have low potency (they do not add much obscurity to the program) but their resilience is high, in many cases one-way. For example, when the placement of statements within a basic block has been randomised, there will be no traces of the original order left in the resulting code” ([8] p.17). Control ordering transformations also have low cost ([8] p.30).
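The loop-reversal case mentioned above can be sketched in Java. The class and method names are hypothetical. The reversal is legal here only because no iteration reads a value written by another iteration, which is the kind of fact a data dependency analysis would have to establish.

```java
// Hypothetical illustration of a control ordering transformation.
final class ControlOrdering {
    // Original: iterates forwards.
    static int[] forward(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i * i;
        }
        return a;
    }

    // Reordered: iterates backwards. Each a[i] depends only on i, so the
    // iterations are independent and any order gives the same array.
    static int[] backward(int n) {
        int[] a = new int[n];
        for (int i = n - 1; i >= 0; i--) {
            a[i] = i * i;
        }
        return a;
    }
}
```

A loop such as a running sum stored into `a[i]` from `a[i - 1]` could not be reversed in this way without changing its result.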

4.2.3.4. Control Computations

Computation transformations make algorithmic changes to the source applications. This involves hiding the real control flow behind irrelevant statements that do not contribute to the actual computations, introducing code sequences at the object code level for which there exist no corresponding high-level language constructs, removing real control-flow abstractions and adding bogus control-flow abstractions. For these transformations, the quality depends on the quality of the opaque predicate and the nesting depth at which the construct is inserted ([8] p.30).
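Hiding real control flow behind a bogus branch can be sketched as follows. The names, the arithmetic in both branches and the choice of predicate are assumptions made for illustration; the predicate is the simple "product of consecutive integers is even" identity, which a real obfuscator would replace with something far harder to crack.

```java
// Hypothetical illustration of inserting a bogus branch guarded by an
// opaquely true predicate.
final class BogusControlFlow {
    static int original(int x) {
        int y = x + 3;
        return y * 2;
    }

    static int obfuscated(int x) {
        int y = x + 3;
        if ((x * (x + 1)) % 2 == 0) {   // opaquely true (P^T)
            return y * 2;               // real path, always taken
        } else {
            return y * 3 - x;           // bogus path, never executed
        }
    }
}
```

To a reader who cannot resolve the predicate, the method appears to have two feasible paths, so the bogus statements must be analysed as if they mattered.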

4.2.4. Preventive Transformations

The main goal of preventive transformations is not to alter any particular type of program code, but rather to cause a deobfuscator or decompiler to crash, or to stop it from successfully undoing the transformations ([8] p.24). There are two types: inherent and targeted.

4.2.4.1. Inherent Preventive Transformations

Inherent preventive transformations attempt to make known deobfuscation techniques harder to employ ([33] p.16), e.g. by reversing an iterating control construct and inserting bogus data dependencies to prevent the deobfuscator from undoing the transformation ([8] p.24). They have medium potency, weak resilience and free cost ([8] p.31).
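The combination of a reversed loop and a bogus data dependency can be sketched in Java. The names, array sizes and the form of the bogus update are assumptions made for this illustration and are not presented as the exact example given in [8].

```java
// Hypothetical illustration of an inherent preventive transformation.
final class InherentPreventive {
    // Original: a forward loop whose iterations are independent.
    static int[] original() {
        int[] a = new int[11];
        for (int i = 1; i <= 10; i++) {
            a[i] = i;
        }
        return a;
    }

    // Transformed: the loop runs backwards, and a bogus array b introduces
    // what looks like a dependency between iterations. b is never used for
    // the result; it is sized 51 so that index i * i / 2 (at most 50) is valid.
    static int[] obfuscated() {
        int[] a = new int[11];
        int[] b = new int[51];
        for (int i = 10; i >= 1; i--) {
            a[i] = i;
            b[i] += b[i * i / 2];   // bogus loop-carried dependency
        }
        return a;
    }
}
```

A deobfuscator that wants to restore the forward loop must first show that the update to `b` does not constrain the iteration order, which is harder than recognising a plain reversed loop.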

4.2.4.2. Targeted Preventive Transformations

Targeted preventive transformations are designed to counter specific analysis tools, e.g. by inserting code so as to cause the deobfuscator to crash ([8] p.24). They have free cost, but as they may be susceptible to attack from other deobfuscators, they also have low potency and trivial resilience ([8] p.31).

In document Java Obfuscation Salah Malik BSc Computer Science 2001/2002 (Page 31-35)