5.3 Formalization of DEXO

5.3.3 Non-Interference Proof for DEXO

The definition of high result is the same as in DEXI, with the addition of the heap. The high branching lemma and the locally respects lemma are also the same, with the addition of the heap and the β mapping.

Lemma indist2_intra : forall m sgn se rt ut ut’ s s’ u u’ b b’, forall H0:P (SM _ _ m sgn),

indist sgn rt rt b b’ s s’ -> pc s = pc s’ ->

exec m s (inl _ u) -> exec m s’ (inl _ u’) ->

texec m (PM_P _ H0) sgn se (pc s) rt (Some ut) -> texec m (PM_P _ H0) sgn se (pc s) rt (Some ut’) -> exists bu, exists bu’,

border b bu /\ border b’ bu’ /\ indist sgn ut ut’ bu bu’ u u’.

Lemma indist2_return : forall (m : Method) (sgn : Sign) (se : PC -> L.t) (rt : registertypes) (s s’ : istate) (u u’ : rstate) (b b’ : pbij), forall H:P (SM Method Sign m sgn),

indist sgn rt rt b b’ s s’ -> pc s = pc s’ ->

exec m s (inr istate u) -> exec m s’ (inr istate u’) ->

texec m (PM_P _ H) sgn se (pc s) rt None -> exists bu, exists bu’,

border b bu /\ border b’ bu’ /\ rindist sgn bu bu’ u u’.

The requirement “border b bu /\ border b’ bu’” corresponds to the ordering between β and β′ required in Lemma 4.2.14.

Indistinguishability at Junction Point

The definitions of path and change are the same as those of DEXI, with the addition of the three new instructions iget, iput, and new.

Inductive changed_at (m:Method) (i:istate) (r:Reg) : Prop := | const_change : forall k v, instructionAt m (pc i) =

Some (DEX_Const k r v) -> changed_at m i r

| move_change : forall k rs, instructionAt m (pc i) = Some (DEX_Move k r rs) -> changed_at m i r

| ineg_change : forall rs, instructionAt m (pc i) = Some (DEX_Ineg r rs) -> changed_at m i r

| inot_change : forall rs, instructionAt m (pc i) = Some (DEX_Inot r rs) -> changed_at m i r

| i2b_change : forall rs, instructionAt m (pc i) = Some (DEX_I2b r rs) -> changed_at m i r

| i2s_change : forall rs, instructionAt m (pc i) = Some (DEX_I2s r rs) -> changed_at m i r

| ibinop_change : forall op ra rb, instructionAt m (pc i) = Some (DEX_Ibinop op r ra rb) -> changed_at m i r

| ibinopConst_change : forall op rs v, instructionAt m (pc i) = Some (DEX_IbinopConst op r rs v) -> changed_at m i r

| iget_change : forall t ro f, instructionAt m (pc i) = Some (DEX_Iget t r ro f) -> changed_at m i r

| new_change : forall c, instructionAt m (pc i) = Some (DEX_New r c) -> changed_at m i r.
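As a rough illustration of what changed_at captures, the following Python sketch returns the destination register of an instruction, if any. The operand positions mirror the constructors listed above; the `Insn` class, the string opcode names, and the operand layout chosen for `Iput` are hypothetical and are not part of the Coq development.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Insn:
    """Hypothetical stand-in for a DEX instruction: opcode name plus operands."""
    op: str
    args: Tuple = ()

# Position of the destination register r among the operands, read off the
# constructors of changed_at (e.g. DEX_Const k r v -> index 1, DEX_New r c -> index 0).
DEST_INDEX = {
    "Const": 1, "Move": 1,
    "Ineg": 0, "Inot": 0, "I2b": 0, "I2s": 0,
    "Ibinop": 1, "IbinopConst": 1,
    "Iget": 1, "New": 0,
}

def changed_register(insn: Insn) -> Optional[int]:
    """Return the register written by insn, or None if it writes no register."""
    idx = DEST_INDEX.get(insn.op)
    return None if idx is None else insn.args[idx]

def changed_at(insn: Insn, r: int) -> bool:
    """True when executing insn changes register r."""
    return changed_register(insn) == r

# Iget loads a field into register 3; Iput (assumed operand layout) only writes the heap.
iget = Insn("Iget", ("int", 3, 1, "f"))
iput = Insn("Iput", ("int", 3, 1, "f"))
```

In this sketch `changed_at(iget, 3)` holds while `changed_at(iput, 3)` does not, which matches the fact that changed_at has constructors for iget and new but none for iput.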

Since we now also need to ensure that heap indistinguishability is maintained during execution, we have proven the following lemmas.

Lemma high_path_heap_indist_onestep_left : forall m sgn s i i’ b b’ j (H: P (SM _ _ m sgn)),

(forall k:PC, region (cdr m (PM_P _ H)) s k -> ~ L.leql (se m sgn k) kobs) ->

region (cdr m (PM_P _ H)) s (pc i) -> indist_heap i i’ b b’ ->

exec m i (inl j) -> indist_heap j i’ b b’.

Lemma high_path_heap_indist_onestep_right : forall m sgn s i i’ b b’ j (H: P (SM _ _ m sgn)),

(forall k:PC, region (cdr m (PM_P _ H)) s k -> ~ L.leql (se m sgn k) kobs) ->

region (cdr m (PM_P _ H)) s (pc i’) -> exec m i’ (inl j) ->

indist_heap i i’ b b’ -> indist_heap i j b b’.

Lemma high_path_heap_indist : forall m sgn s i i’ b b’ j (H:P (SM _ _ m sgn)) (Hpath: path m i j),

(forall k:PC, region (cdr m (PM_P _ H)) s k -> ~ L.leql (se m sgn k) kobs) ->

path_in_region m (cdr m (PM_P _ H)) s i j Hpath -> region (cdr m (PM_P _ H)) s (pc i) ->

junc (cdr m (PM_P _ H)) s (pc j) -> indist_heap i i’ b b’ ->

indist_heap j i’ b b’.

Lemma high_step_indist_heap_result : forall m sgn u u’ b b’ res res’ i (H: P (SM _ _ m sgn)),

(forall k:PC, region (cdr m (PM_P _ H)) i k -> ~ L.leql (se m sgn k) kobs) ->

(forall jun : PC, ~ junc

(cdr m (PM_P {| unSign := m; sign := sgn |} H)) i jun) ->

region (cdr m (PM_P {| unSign := m; sign := sgn |} H)) i (pc u) -> region (cdr m (PM_P {| unSign := m; sign := sgn |} H)) i (pc u’) -> indist_heap u u’ b b’ ->

exec m u (inr res) -> exec m u’ (inr res’) -> indist_heap_result res res’ b b’.

These lemmas state that, in a high region, heap indistinguishability is preserved throughout the execution.

Lemma junction_indist : forall m sgn ns ns’ s s’ u u’ b b’ bu bu’ res res’ i (H: P (SM m sgn)),

indist sgn (RT m sgn (pc s)) (RT m sgn (pc s’)) b b’ s s’ -> exec m s (inl u) -> exec m s’ (inl u’) ->

region (cdr m (PM_P _ H)) i (pc u) -> region (cdr m (PM_P _ H)) i (pc u’) -> high_region m (PM_P _ H) sgn i -> evalsto m ns u res ->

evalsto m ns’ u’ res’ ->

indist sgn (RT m sgn (pc u)) (RT m sgn (pc u’)) bu bu’ u u’ -> border b bu -> border b’ bu’ ->

(exists v, exists v’, exists ps, exists ps’, exists bv, exists bv’, evalsto m ps v res /\ ps <= ns /\

evalsto m ps’ v’ res’ /\ ps’ <= ns’ /\ junc (cdr m (PM_P _ H)) i (pc v) /\ junc (cdr m (PM_P _ H)) i (pc v’) /\ border bu bv /\ border bu’ bv’ /\

indist sgn (RT m sgn (pc v)) (RT m sgn (pc v’)) bv bv’ v v’) \/ (exists br, exists br’, border bu br /\ border bu’ br’ /\

indist_heap_result res res’ br br’ /\

high_result sgn res /\ high_result sgn res’).

Lemma junction_indist_2 : forall m sgn ns ns’ s s’ u u’ b b’ bu bu’ res res’ i (H: P (SM m sgn)),

indist sgn (RT m sgn (pc s)) (RT m sgn (pc s’)) b b’ s s’ -> exec m s (inl u) -> exec m s’ (inl u’) ->

region (cdr m (PM_P _ H)) i (pc u) -> junc (cdr m (PM_P _ H)) i (pc u’) -> high_region m (PM_P _ H) sgn i -> evalsto m ns u res ->

evalsto m ns’ u’ res’ ->

indist sgn (RT m sgn (pc u)) (RT m sgn (pc u’)) bu bu’ u u’ -> border b bu -> border b’ bu’ ->

(exists v, exists ps, exists bv, evalsto m ps v res /\ ps <= ns /\ junc (cdr m (PM_P _ H)) i (pc v) /\ border bu bv /\

indist sgn (RT m sgn (pc v)) (RT m sgn (pc u’)) bv bu’ v u’).

These two lemmas are similar to the ones in DEXI, with the additional requirement that the β mappings must be ordered (the relation between β and β′ expressed by border).

The definition of type checking is the same as that of DEXI. The definitions of non-interference and type system soundness are also similar to those of DEXI, with the addition of the β mapping and the heap.

Type System Soundness

Definition NI (p:DEX_ExtendedProgram) : Prop :=

forall kobs m sgn i h1 h2 r1 r2 hr1 hr2 res1 res2 b1 b2, P p (SM _ _ m sgn) ->

init_pc m i ->

indist kobs p sgn (rt0 m sgn) (rt0 m sgn) b1 b2 (i,(h1,r1)) (i,(h2,r2)) ->

DEX_BigStepAnnot.DEX_BigStep p.(DEX_prog) m (i,(h1,r1)) (hr1,res1) -> DEX_BigStepAnnot.DEX_BigStep p.(DEX_prog) m (i,(h2,r2)) (hr2,res2) -> exists br1, exists br2,

border b1 br1 /\ border b2 br2 /\

hp_in kobs (DEX_ft p) br1 br2 hr1 hr2 /\

Chapter 6

Type-Preserving Compilation of Android Bytecode

We have proposed a type system design to ensure non-interference on Android bytecode, and we have shown that this type system is sound, i.e., typable DEX bytecode is non-interferent. We now prove type-preserving compilation for Android bytecode. We first analyze the translation process performed by the actual dx tool and then show that this translation preserves typing, i.e., typable JVM bytecode yields typable DEX bytecode.

We take this approach in order to leverage existing security approaches for JVM bytecode, given the close relationship between JVM bytecode and DEX bytecode. It is also closer to our larger goal of providing a framework in which the developer can obtain a formal guarantee. We could target DEX bytecode directly, but then a failure to type DEX bytecode would give the developer no information about what caused the issue. With this approach, the typability of DEX bytecode depends on the typability of JVM bytecode, whose relationship has been studied (see Section 3.1 for the discussion).

6.1 Translation Phase

We now describe the translation process from JVM to DVM. This is an abstracted version of what is implemented in the dx tool of Android.

The dx tool translates JVM bytecode in blocks of code. To formalize this, it is useful to first define what we call a Basic Block. A basic block is a construct containing a group of code that has one entry point and one exit point (though not necessarily one successor or one parent). It has a parent list, a successor list, a primary successor, and an order. The basic block also contains the translated DEX instructions for the contained group of code. Formally, a basic block is a tuple

{parents; succs; pSucc; order; JVM_insn; DEX_insn; handlers}

where

• parents ⊆ Z is the set of the block’s parents,

• succs ⊆ Z is the set of the block’s successors,

• pSucc ∈ Z is the primary successor of the block (if the block does not have a primary successor, it has −1 as the value),

• order ∈ Z is the order of the block in the output phase,

• JVM_insn ⊆ JVMins is the JVM instructions contained in the block,

• DEX_insn ⊆ DEXins is the DEX instructions contained in the block (as a result of translating JVM_insn), and

• handlers is the associated exception handlers for the block.

The set of BasicBlock is denoted as BasicBlocks. When instantiating a basic block, we use a default object NewBlock, which is a basic block with

{parents = ∅; succs = ∅; pSucc = −1; order = −1;

JVM_insn = ∅; DEX_insn = ∅; handlers = ∅}
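The tuple and its default value can be sketched as a small record type. This is a minimal Python sketch, not the dx implementation: the class and field names follow the tuple above, while the choice of lists for the instruction and handler components (to keep instruction order) and the type of the handler entries are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Set

@dataclass
class BasicBlock:
    """One basic block, with the defaults of NewBlock."""
    parents: Set[int] = field(default_factory=set)   # labels of parent blocks
    succs: Set[int] = field(default_factory=set)     # labels of successor blocks
    p_succ: int = -1                                 # primary successor, -1 if none
    order: int = -1                                  # position in the output phase, -1 if unset
    jvm_insn: List[str] = field(default_factory=list)  # JVM instructions in the block (assumed ordered)
    dex_insn: List[str] = field(default_factory=list)  # translated DEX instructions
    handlers: List[int] = field(default_factory=list)  # associated exception handlers (assumed labels)

def new_block() -> BasicBlock:
    """Return a fresh block whose components are all empty or -1."""
    return BasicBlock()

block = new_block()
block.jvm_insn.append("Load 1")
```

Using `default_factory` gives every block its own empty collections, so adding an instruction to one block never affects another freshly created block.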

We also need some auxiliary functions to define the translation:

BMap : PP → BasicBlock is a function from program points in JVM bytecode to DEX basic blocks. Initially, the mapping is empty.

PMap : PP → PP is a function from a program point in JVM bytecode to the starting point of the block that contains it (refer to Section 6.1.1). Initially, every program point is mapped to itself (∀pp ∈ PP, PMap(pp) = pp).

SBMap : PP → boolean is, similarly to BMap, a function that takes a program point in JVM bytecode and returns whether that instruction is the start of a basic block. Initially, every program point is mapped to false (∀pp ∈ PP, SBMap(pp) = false).

TSMap : PP → Z is a function that maps a program point in JVM bytecode to an integer denoting the index of the top of the stack. This mapping is initialized with the number of local variables, as that number is the index used by DEX to simulate the stack (∀pp ∈ PP, TSMap(pp) = locN, where locN is the number of local variables).

NewBlock : BasicBlock is a function which returns a NewBlock.
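The initial state of these mappings can be sketched with plain dictionaries. The function name `init_maps` and the use of Python dictionaries keyed by integer program points are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Tuple

def init_maps(program_points: Iterable[int], loc_n: int) -> Tuple[Dict, Dict, Dict, Dict]:
    """Build the initial BMap, PMap, SBMap and TSMap for the given program points."""
    pps = list(program_points)
    bmap: Dict[int, object] = {}          # BMap: initially empty
    pmap = {pp: pp for pp in pps}         # PMap: every point maps to itself
    sbmap = {pp: False for pp in pps}     # SBMap: no point starts a block yet
    tsmap = {pp: loc_n for pp in pps}     # TSMap: top of stack starts at locN
    return bmap, pmap, sbmap, tsmap

# Ten program points and two local variables.
bmap, pmap, sbmap, tsmap = init_maps(range(10), loc_n=2)
```

Later phases would then update these dictionaries in place as blocks are discovered and stack indices are computed.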

Since DEX is register-based whereas JVM is stack-based, to bridge this gap, the translation uses registers to simulate JVM stacks. This is done as follows:

• We set aside l registers to hold local variables (registers 1, . . . , l). We denote these registers by locR;

Note that although, in principle, the stack can grow indefinitely, it is impossible to write a program that does so in Java, due to the strict stack discipline in Java. We assume the JVM bytecode has passed the Java bytecode verifier (see Lindholm et al. [2013] for the verifier’s specifications), which ensures, among other things, that the maximum height of the operand stack in a method is fixed. Similarly, the bytecode verifier guarantees that the (Java) types of the operand stack (and, by implication, also its height) at each program point are fixed. This makes it possible to statically map each operand stack location to a register in DEX (cf. TSMap above); see Davis et al. [2003] for a discussion on how this can be done.

There are several phases involved in translating JVM bytecode into DEX bytecode. To help illustrate each phase, we use the following idealized JVM bytecode with its abstracted labels:

0 : Load 1
1 : Goto 7
2 : Ifeq 5
3 : Push 0
4 : Ireturn
5 : Push 1
6 : Ireturn
7 : Load 2
8 : Sub
9 : Goto 2

For this particular example, we also assume that the number of local variables is 2. After each phase, we show the result of the transformation applied to the above code. In general, we apply the transformations to P, the source JVM program that we want to translate; in this example, P is the program containing the code above.
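To see why a static stack-to-register mapping is possible for this example, the following Python sketch computes the operand stack height before each instruction and derives a top-of-stack index from it. It is an illustration under stated assumptions, not the dx algorithm: the stack effects assigned to the idealized instructions and the formula `LOC_N + height` for the index are assumptions, chosen so that an empty stack gives the initial TSMap value locN.

```python
from collections import deque

# The example program: label -> (opcode, operand...).
PROGRAM = {
    0: ("Load", 1), 1: ("Goto", 7), 2: ("Ifeq", 5), 3: ("Push", 0),
    4: ("Ireturn",), 5: ("Push", 1), 6: ("Ireturn",), 7: ("Load", 2),
    8: ("Sub",), 9: ("Goto", 2),
}
LOC_N = 2  # number of local variables assumed for the example

# Assumed net effect of each idealized instruction on the stack height.
STACK_EFFECT = {"Load": 1, "Push": 1, "Goto": 0, "Ifeq": -1, "Sub": -1, "Ireturn": -1}

def successors(pc, insn):
    """Program points that may execute right after insn at pc."""
    op = insn[0]
    if op == "Goto":
        return [insn[1]]
    if op == "Ifeq":
        return [pc + 1, insn[1]]
    if op == "Ireturn":
        return []
    return [pc + 1]

def stack_heights(program, entry=0):
    """Stack height before each reachable instruction; fail if paths disagree."""
    heights = {entry: 0}
    work = deque([entry])
    while work:
        pc = work.popleft()
        insn = program[pc]
        after = heights[pc] + STACK_EFFECT[insn[0]]
        if after < 0:
            raise ValueError(f"stack underflow at {pc}")
        for nxt in successors(pc, insn):
            if nxt not in heights:
                heights[nxt] = after
                work.append(nxt)
            elif heights[nxt] != after:
                raise ValueError(f"inconsistent stack height at {nxt}")
    return heights

heights = stack_heights(PROGRAM)
# Assumed indexing: the top-of-stack index is locN plus the current height.
ts_map = {pc: LOC_N + h for pc, h in heights.items()}
```

Every program point is reached with a single height regardless of the path taken (for instance, label 2 is reached only through the jump at 9), which is the property the bytecode verifier guarantees and which makes the register assignment well defined.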