Browser Model for Security Analysis of Browser-Based Protocols

(1)

Browser Model for Security Analysis of Browser-Based

Protocols

Thomas Groß

IBM Zurich Research Lab R¨uschlikon, Switzerland [email protected]

Birgit Pfitzmann

IBM Zurich Research Lab R¨uschlikon, Switzerland [email protected]

Ahmad-Reza Sadeghi

Ruhr-University Bochum Bochum, Germany [email protected]

Abstract

Currently, many industrial initiatives focus on web-based applications. In this context an impor-tant requirement is that the user should only rely on a standard web browser. Hence the underlying security services also rely solely on a browser for interaction with the user. Browser-based identity federation is a prominent example of such a protocol. Unfortunately, very little is still known about the security of browser-based protocols, and they seem at least as error-prone as standard security protocols. In particular, standard web browsers have limited cryptographic capabilities and thus new protocols are used. Furthermore, these protocols require certain care by the user in person, which must be modeled. In addition, browsers, unlike normal protocol principals, cannot be assumed to do nothing but execute the given security protocol.

In this paper, we lay the theoretical basis for the rigorous analysis and security proofs of browser-based security protocols. We formally model web browsers, secure browser channels, and the security-relevant browsing behavior of a user as automata. As a first rigorous security proof of a browser-based protocol we prove the security of password-based user authentication in our model. This is not only the most common stand-alone type of browser authentication, but also a fundamental building block for more complex protocols like identity federation.

1 Introduction

Web-based services have received increasing attention in the last years. The idea is simple: users should be able to send their requests for desired services using a browser, which offers a set of basic function-alities, and receive and view the results at the browser. This allows easy deployment of applications at low cost and without specific user education. Services can be offered by one service provider or several affiliated enterprises. The requirement on such services not to need any special client software is also

calledzero-footprint. The underlying security services must also be zero-footprint, i.e., only a browser

is used for user authentication, and, if desired, for retaining a secure channel with the user, requesting additional security relevant attributes about the users, and potential confirmation by third parties of the authentication of the authentication or attribute information.

The typical approach in other security protocols is to perform a key exchange, based on local mas-ter keys, masmas-ter keys shared with a third party, or public-key certificates, and to subsequently use the exchanged key to secure the communication. A large body of literature on such protocols exist. A seminal paper was [29], although a vulnerability in one of the original protocols was later found in [22]. Tool-supported proofs were initiated in [26, 18, 25], based on abstractions of cryptographic primitives introduced in [8]. Recent tool-supported proofs concentrated on using existing general-purpose model checkers, first in [23, 27, 6], and theorem provers, first in [9, 30]. Cryptographic proofs of key-exchange

(2)

and authentication protocols were initiated in [1]. Cryptography also added interesting additional prop-erties to pure authentication, e.g., see [20]. Modeling secure channels by a comparison to ideal se-cure channels, a technique that we will use for the underlying sese-cure channels below, was introduced in [40, 34, 3]. Analyses specifically for SSL and TLS, and thus close to an underlying mechanism used in browsers, were made in [42, 28, 31, 21].

However, standard browsers simply do not execute most of these protocols. The only exception that would include user authentication would be to use SSL or TLS channels with client certificates for 2-party authentication. However, this is not considered truly zero-footprint because the users would have to obtain the certificates, and because it would not allow a user to easily use different browsers at different times. Thus it is very rarely used, and not used at all as a basis in larger browser-based security protocols. Hence browser-based protocols are different from all protocols for which prior security proofs exist.

A prominent example of browser-based security protocols is identity federation, which aims at link-ing a user’s (otherwise) distinct identities at several locations. The advantage of such systems is that the involved organizations can reduce user management costs, such as the cost of password helpdesks and user registration and deletion. In particular in this area, concrete and complex browser-based secu-rity protocols were proposed, e.g., Microsoft’s Passport [5], the Secusecu-rity Assertion Markup Language (SAML) standardized by OASIS [41], the Shibboleth project for university identity federation [4], the Liberty Alliance project [38], and WS-Federation [16, 17]. Several papers discussed vulnerabilities of such protocols, in particular for Passport [19], the Liberty enabled-client protocol [37], and a SAML profile [13]. Others discussed privacy design principles and details [36, 32, 33]. Basic browser-based authentication without federated identity management is discussed in [11]. As far as the vulnerabilities found were removable security problems (in contrast to fundamental limitations of the browser-based protocol class or matters of taste like privacy), they were removed in the next version of the proto-cols. However, past experience in protocol design has shown that incorporating countermeasures against known attacks does not guarantee to eliminate all vulnerabilities. Hence it is desirable to devise security proofs.

It is not trivial to apply previous security proof techniques, both cryptographic techniques and formal-methods techniques, to browser-based protocols. The primary reason is that a browser repre-sents a new party with its own, predefined behavior that has impact on the security of the protocols executed across it. In usual security protocols, principals are assumed to execute precisely the security protocol under consideration (unless they are corrupted). A browser, in contrast, reacts on a number of predefined messages, adds information to responses automatically, and stores certain information such as histories in places which cannot always be assumed secure, e.g., in an Internet kiosk. For instance, one of the SAML problems found in [13] is based on the HTTP Referer tag, i.e., a browser feature that is not mentioned at all on the level of the SAML protocol. Another usual issue is that browser-based protocols use a multitude of names for a principal, while other protocols typically assume a one-to-one mapping; for instance, there are URL addresses, identities used in SSL certificates, and identities used in higher protocols, and it is easy to forget some name comparisons in protocols and thus to enable man-in-the-middle attacks. All this means that a detailed and rigorous browser model is a prerequisite for convincing security proofs of browser-based protocols, and no such model exists so far. For the

resulting model, one has to assume that a real browser does notperform additional actions, because it

seems that for most security protocols arbitrary additional actions could destroy the security. Hence it is not enough to make a minimal model covering the few messages and parameters explicitly used by security protocols, but one has to get as close as possible to real browsers.

Another aspect is that due to the limited capabilities of browsers, the user at the browser is an active participant and certain assumptions must be made about the user as well, e.g., that the user verifies that a secure channel to a trusted server is used before entering an important password.

In this paper, we lay the theoretical basis for research in this area by modeling the major build-ing blocks for browser-based protocols. We present a rigorous and abstract model for a standard web

(3)

browser as a principal for browser-based protocols. While our model is still extensible – in particular we do not model cookies and scripting but assume a browser with these features turned off – we be-lieve that we have captured the major explicit and implicit browser features that play a role in typical browser-based protocols. In addition, we model the security-relevant browsing behavior of a user, i.e., a machine that implements the explicit constraints on a user that are needed for protocol proofs, but still allows arbitrary behavior apart from that. Furthermore, we model browser channels in order to capture, in particular, the naming issues across multiple protocol layers.

As a first security lemma for a browser-based protocol in our model, we focus on the security of the initial authentication of the user behind a browser by a password. Initial user authentication is an integral part of all browser-based protocols, and passwords are the standard technique used in the zero-footprint scenario.

A first step in the direction of proofs of browser-based protocols was taken in [14]. There, however, we only modeled exactly those parts of the user and browser behavior that we concretely needed for the protocol, and made assumptions that other things would not happen, where the assumptions were made top-down for the needs of the protocol rather than bottom-up from a browser and user model. In this paper, we lay the bottom-up groundwork for such assumptions.

Another related area is the analysis of web services security protocols [12, 2], because in standard-ization web-based protocols and web services protocols are closely related, e.g., in WS-Federation or in the use of SAML tokens for web services protocols. The techniques developed there are very useful for security proofs for real standards. However, so far, they do not consider browser-based protocols, and hence do not affect the novelty of our browser and user models and of proofs based on them.

2 Notation

General Notation.We use a straight font for , including and

,

and

, where

are predefined constant sets. We use italics for !"#%$&"'(*)%+ and variable

,

)-.+ . Let / be an alphabet without the symbols 0 “1 ” , “2”, “3 ”, “4 ” , “5”, “6”, “787 ” 9 ; then /;: is the

set of strings over / where1 denotes the empty string and /=< > /?:A@B019 . For a set

,

,C D

,FE

denotes

the powerset of ,

and ,

: the set of finite sequences over

,

. We define ,

. (G ) as

, H

>

, I

0GJ9

and ,

.

KBL

(G ) as

, H

>

,

@M0 NO9 . Assignment by possibly probabilistic functions is written as P .

Assignment of a value to a tuple of variables means making correspondingly many projections; if one of these fails the entire assignment fails. We denote the set of URL host names, including protocol names

such as “https”, by QSRUTV

, and the set of URL host and path names by QSRUTV

WX %Y

. We write an

address "[Z# \ QSRUTV

WX %Y

as a pair D&]X^8_ `%acbed`]

E

of a host name ]X^8_ `=\ QSRUTV

and a path. The

typef

Y

H

> 0

g

a

hg

9 contains the channel types available.

Automata. We represent our machines such as the browser model as I/O automata, in other words

finite-state machines with additional variables. This is a very usual basis for specifying participants in distributed protocols; the first specific use for security is in [24]. Specifically we use the automata model proposed in [35], which has a well-defined realization by probabilistic interactive Turing machines and is therefore linked to more detailed cryptographic considerations where those become necessary in multi-layer proofs. In the following we give a brief overview of this machine model (see also Figure 2).

Machines may have multiple fixed connections to other machines organized by means of ports. We

define asimple port for message transmission asb > D.ijagk

E

\ /l< m 0[2*a%39 wherei indicates the port

nameand k the portdirection. Ports are uni-directional, where k > 2 denotes output and k > 3 input.

The machine model connects simple ports

3 and

2 of same namei and opposite directionk ; these are

calledcomplement ports. We call ports without such a complementfreeports. We define aclock port

b > D.ijan4Uagk

E

\ /?< m 0 49om 0[2*a%39 as a port that schedules the connection between simple ports

3 and

2 with same namei , or is free itself if this connection does not exist.

Amachine p is defined as a tuple p > Dqe"r )sta%utv#-.+[stawX"#+s?a

,

-x"8-y)%+stazsta%{qX$%sta%|F$.q}s

E

of a

name qe"r )~s \ /?< , a finite sequence utv#-.+~s of ports, a finite sequence wX"#+s of local variables a

(4)

n

ngn

ne

O ¢¡ ¤£¦¥

§

©¨g%ª

¬«®

y

§

¯¨%ª

°

¦©ª

«

±

²

±¨

³´%¨

µ®¶·¹¸»º¹¼¾½y¿»ÀÁ¯ÂÃÅÄÆÄ¢ÇyÈ±º

É

Ê¤¸

Figure

1:

K

ey

to

the

state

diagrams

U

B

S

Ë

Ì~Í[Î[ÎÏÐÒÑ

Ó»ÔÖÕ¹×¹×¢Ø»Ù

ÚÖÛÝÜ¤Þ¯ßàáÓ»ÔÖÕ¹×â×hØ¹Ù

Ú.ã

×hßnä

Ó»Ô¹ÕÖ×»×Ø»Ù

ÚÝã

×Öåä

Ó»ÔÖÕ¹×»×¢Ø»Ù

ÚâÛÝÜÞånä

æèç¹éêhê¹ë±ì

íÅî

ê¤ï¹ð

ænç¹é¤ê¢êhë±ì

íñ¤òhó

ïô

õ*Üâã

ön÷

å

à

õÝÜâã

å

÷¯ö

ä

ê¹ë¤ó

íÅî

ê¹øúùûð

ê¹ë¢ó

íñhòhó

ø±ù

ô

êhë¢ó

íÅî

êhü¤ð

ê¹ëhó

ícñhòhó

ü

ô

ÛÝÜ¤Þ

ßý

à

ã

×¢ßý¯ä

õÜâã

ön÷

ånä

õÝÜâã

å

÷¯ö

à

Ó»Ô¹ÕÖ×»×Ø»Ù

ÚâÛÅÜ¤Þåúà

ÓâÔ¹Õh×â×hØâÙ

ÚÝã³×

å

à

ÓâÔhÕ¹×â×hØâÙ

ÚâÛÜÞ¯ßèý¯äþÓ»ÔÖÕ¹×â×hØâÙ

ÚÝã

×hßnýà

ÿ

! #"%$&"(')"+*

,.-!$0/! )132)'4/

S

5

ÚâÛÝÜÞ

ß6

àÓâÔhÕ¹×â×hØâÙ

ÚÝã³×

ß6

ä

ÛÝÜ±Þ¯ß

6

à

ã³×¢ß

6

ä

ÚâÛÝÜÞ

ß6

ä

Ó»ÔÖÕh×»×¢Ø»Ù

ÚÝã×

ß6

à

7

8

7

êhëhó

íÅî

ê

ø9

ð

êÖë¤ó

íñhò¤ó

ø:9

ô

;<;!;

;=;&;

>?>@>

>?>A>

>?>?>

>@>3>

ñhò¤ó

ü

ô

BC

ÛÝÜ¤Þ

ö

à

ã×

ö

ä

DFEHGIKJL

MON

IPJQ

DFEHGISR

L

MON

ITRUQ

D<EVGWKL

MXN

W

Q

î

ê

ü

ð

Figure

2:

System

architecture

for

bro

wser

-based

protocols

with

a

bro

wser

Y

,

a

user

Q

,

serv

ers

[Z

,

gY

and

their

interf

aces.

set

,

-x"8-y)%+ás

\

/?:

of

major

states,

a

probabilistic

state-transition

function

z~s

,and

sets

{qX$sta%|F$.qUs

\

,

-x"8-y)%+[s

of

initial

and

final

states.

The

inputs

are

tuples

]

>

D^]

Z

E

ZP_a`<bcccbdfeBg0hjilk<m3noFprq?d

,

where

]

Z

\

/?:

is

the

input

for

the

s

-th

in-port,

h

Dxutv#-.+

s

E

is

the

input

ports

of

the

machine

and

t

h

Dxutv#-.+Òs

E

t

denotes

the

number

of

the

input

ports.

Analogously

,the

outputs

are

tuples

u

>

D?u

Z

E

ZP_a`<bcccbd.vxwHy3hjilk<m3no<prq?d

.

The

empty

w

ord,

1

,denotes

“no

in-or

output”,

respecti

vely

.

The

value

assignments

of

local

variables

are

tuples

z

>

D

z

Z

E

ZP_a`<bcccbdP{%|<m3o}p~d

,where

z

Z

\

/?:

is

the

value

for

the

s

-th

variable

in

the

sequence

wX"#+

s

.

W

e

define

the

state

transition

function

z[s

of

a

machine

p

using

a

notation

analogous

to

UML

state

diagrams

[15

]

(see

Figure

1).

W

e

define

a

tr

ansition

of

such

a

state

diagram

as

an

arro

w

from

state

_

to

_%

with

label

!8)q-5

^

X"#Z6c787

-c$&vq

,where

the

components

are

defined

as

follo

ws:

_

and

_

are

the

source

and

destination

states,

!8)q-is

a

sequence

of

non-empty

inputs

to

the

input

ports,

X"#Z

is

a

predicate

ov

er

the

!8)q-and

z

which

is

the

machine’

s

current

variable

allocation.

%`

}s

y^i

specifies

computations

and

outputs

of

the

transition.

System

Ov

er

view

for

Br

owser

-based

Pr

otocols.

Figure

2

gi

ves

an

ov

ervie

w

of

the

automata

in

our

model.

W

e

call

the

generic

bro

wser

machine

Y

,

the

user

machine

that

implements

the

minimum

as-sumptions

on

secure

user

beha

vior

Q

,and

the

machine

that

models

the

beha

vior

of

secure

channels

as

implemented

within

HTTPS

(see

[39

])

by

gY

.T

o

analyze

and

pro

ve

bro

wser

-based

security

proto-cols

one

complements

these

general-purpose

machines

with

one

or

more

serv

er

machines,

here

denoted

by

rZ

,that

jointly

ex

ecute

the

bro

wser

-based

protocol.

Furthermore,

one

configures

the

user

machine

Q

with

a

suitable

initial

trust

relationship

and

kno

wledge

about

trusted

parties.

3 Ideal

W

eb

Br

owser

In

this

section,

we

introduce

the

functionality

of

real

web

bro

wsers,

and

we

describe

the

feature

set

that

we

model

and

its

benefit

for

the

rigorous

security

analysis

of

bro

wser

-based

security

protocols.

A

web

bro

wser

acts

as

the

client

in

transactions

of

the

Hyperte

xt

Transfer

Protocol

(HTTP)

[10

]and

renders

protocol

state

and

payloads

to

its

user

.A

bro

wser

acts

on

behalf

of

one

single

user

in

a

bro

wsing

session.

Ho

we

ver

,the

bro

wser

may

display

multiple

windo

ws

that

render

dif

ferent

HTTP

transactions

in

parallel.

The

bro

wser

model

represents

a

windo

w

by

a

windo

w

identifier

in

the

communication

between

Q

and

Y

.

W

e

therefore

allo

w

a

user

machine

to

distinguish

multiple

windo

ws;

ho

we

ver

,

the

windo

w

management

of

the

bro

wser

machine

is

omitted

for

bre

vity

.A

real

bro

wser

accepts

inputs

from

the

user

(5)

specifying

addresses

to

retrie

ve

and

render

the

content

associated

with

such

addresses,

as

well

as

the

status

of

the

channel

to

the

serv

er

and

the

identity

of

the

serv

er

.

The

bro

wser

also

initiates

dialogs

with

the

user

to

ne

gotiate

changes

of

the

channel

state,

verify

a

serv

er’

s

authentication

or

request

for

user

authentication.

W

e

model

the

interf

ace

to

w

ards

a

user

such

that

the

user

has

the

same

information

as

in

the

real

w

orld.

Thus,

the

interf

ace

not

only

contains

content

and

error

messages,

but

also

information

about

the

channel

state

and

the

address

of

the

serv

er

.Normally

the

bro

wser

al

w

ays

displays

the

serv

er’

s

hostname,

yet,

a

certified

serv

er

identity

only

if

secure

channels

are

used.

HTTP

is

a

client-serv

er

protocol

positioned

in

the

application

layer

of

the

TCP/IP

protocol

stack

with

variable

underlying

transport

protocols.

In

order

to

initiate

an

HTTP

transaction,

a

web

bro

wser

establishes

a

connection

to

a

serv

er

specified

by

the

address

to

access

and

may

le

verage

multiple

types

of

transport

protocols

underlying

the

HTTP

transaction,

e.g.,

TCP/IP

,SSL

3.0

or

TLS1.0

[7

].

Ha

ving

established

a

channel,

the

bro

wser

issues

an

HTTP

request

to

the

serv

er

.

Such

a

request

specifies

the

resource

that

the

bro

wser

intends

to

retrie

ve,

but

may

also

contain

additional

data

and

parameters.

The

serv

er

ev

aluates

the

request

and

issues

a

response

using

the

same

channel.

W

e

call

such

an

interaction

an

HTTP

transaction.

In

principle,

bro

wsers

do

not

need

to

hold

state

be

yond

such

a

single

transac-tion,

ho

we

ver

,a

real

web

bro

wser

builds,

e.g.,

local

cache

and

bro

wsing

history

and

lets

a

transaction

influence

the

subsequent

one.

Our

bro

wser

model

reflects

this

beha

vior

of

real

bro

wsers.

HTTP

transactions

may

implement

various

functions.

Clients

such

as

real

bro

wsers

may

use

a

transaction

to

retrie

ve

data

by

a

GET

request,

and

send

data

by

a

POST

request.

Serv

ers

may

not

only

deli

ver

content

but

also

direct

the

bro

wser

to

a

beha

vior

change

by

issuing

ex

ecutable

scripts

and

error

messages.

Most

prominent

examples

are

HTTP

responses

with

scripted

form

POST

and

redirect

messages,

which

direct

the

bro

wser

to

another

address

of

the

serv

er’

s

choice.

The

serv

er

may

also

issue

an

HTTP

response

that

requests

a

user’

s

authentication

by

means

of

a

username-passw

ord

pair

W

e

model

bro

wser

messages

as

abstract

formats

and

abstract

from

parsing

bitstrings.

W

e

focus

on

the

subset

of

message

and

parameter

types

of

HTTP

that

is

acti

vely

used

in

bro

wser

-based

protocols

or

may

ha

ve

impact

on

their

security

.Thus,

for

modeling

a

standard

bro

wser

-based

protocol

lik

e

SAML

[41

],

Liberty

[38

]or

WS-Federation

[16

]the

model

pro

vides

the

right

set

of

functionality

without

ov

erwhelming

with

too

man

y

options.

An

important

aspect

of

real

web

bro

wsers

is

that

the

y

do

more

than

bro

wser

-based

protocols

intend.

Most

prominent

is

the

problem

of

information

flo

w

.

On

the

one

hand,

information

flo

ws

through

the

HTTP

requests

to

the

serv

er

,

e.g.,

by

means

of

the

HTTP

Referer

tag.

Also,

features

such

as

history

,

cache

or

passw

ord

storage

ha

ve

data

flo

w

to

the

underlying

operating

system.

W

e

explicitly

model

this

property

of

real

bro

wsers

in

order

to

allo

w

information

flo

w

analysis.

W

e

dedicate

Section

3.4

to

this

important

part

of

the

bro

wser

specification.

Supported

by

a

rudimentary

trust

model,

a

web

bro

wser

can

establish

secure

channels

to

serv

ers

by

le

veraging

the

SSL3.0

or

TLS

protocol.

W

e

model

the

establishment

of

insecure

and

secure

channels

and

the

corresponding

ke

y

management

by

the

channel

machine

gY

.

The

user’

s

log

of

f

from

the

bro

wser

session

that

remo

ves

bro

wsers

state

from

the

machine’

s

memory

is

also

security-rele

vant.

Further

,we

do

not

consider

cookies

as

man

y

bro

wser

-based

protocols

do

not

use

them

directly

.

In

Section

3.1

we

define

the

interf

ace

consisting

of

ports

utv#-.+

and

possible

ev

ents.

Section

3.3

specifies

static

elements

of

Y

such

as

the

set

of

local

variables

wX"#+

,whereas

we

explain

algorithms

and

predicates

in

Section

B.

Specific

abstract

bro

wser

messages

is

the

subject

of

Section

3.2

while

Section

3.4

considers

the

imperfections

of

the

bro

wsers.

Section

3.5

describes

the

beha

vior

model

of

Y

consisting

of

the

state

sets

,

-x"8-y)%+

,

{qX$

0

,and

|F$.q

as

well

as

the

transition

function

z

.

3.1

Interface

of

Br

owser

Machine

W

e

refine

the

interf

ace

of

Y

depicted

in

Figure

2

by

defining

the

exact

messages

types

transferred

ov

er

the

ports

of

Y

.

W

e

list

them

all

in

Table

5

of

Appendix

Section

B.

Here

we

only

explain

them

as

far

as

it

is

useful

to

understand

the

upcoming

state

diagram.

The

ports

b

3

and

b

2

model

the

bro

wser’

s

user

interf

ace

and

connect

Y

to

a

user

machine

(6)

(7)

Name

Domain

Description

Init.

¼r½}¾&¿

½<ÀÁ

ÂÄÃÅÆ(Ç%ÈÊÉ

ËÍÌ~ÎÏ~Ð¥Ñ?Ò&ÓÔxÒ<ÃÕÉ

ÖØ×

Data

of

preceding

HTTP

transaction

Ù(ÚÛ

È&Ü

Ý«Þàß

ÖØ×

Identifier

of

the

bro

wser

windo

w

Ù(ÚÛ

È&Ü

áãâä

ÁÁr¾&åæ

ÂÄÃÔ

Ú(Ú

È%ç

×

Set

of

all

channels

the

bro

wser

holds

è

élê

Àë

â

ÖØ×É

ÖØ× É

ËÍÌ~ÎÏ~Ð¥Ñ?Ò

Set

of

login

data

stored

by

ì

è

í

Þæ<ë3î½<ï

ËÍÌ~ÎÏ~Ð¥Ñ?Ò&ÓÔxÒ<Ã

×

History

of

addresses

successfully

retrie

ved

è

á[ä¥ðFâ

¾

ñ?ËÍÌ~ÎÏ~Ð¥Ñ?Ò&ÓÔxÒ<ÃÊÉ

ÖØ×0ò}×

Cache

of

all

pages

retrie

ved

è

Table

1:

Persistent

local

variables

of

the

bro

wser

machine

Y

.

"[Z#

and

then

sends

¡"8-£¢

and

¤

)#

x¦

ov

er

that

channel.

The

channel

type

is

implied

by

the

HTTP

proto-col

name

“http”

or

“https”

in

"[Z#

.

The

abstract

message

ó?

%Y

D

E

queries

the

bro

wser

for

a

user

authentication

and

triggers

the

bro

wser

for

this

purpose.

3.3

Static

Model

of

Br

owser

Machine

W

e

no

w

define

the

bro

wser’

s

local

variables

wX"#+

ô

.The

variable

space

is

not

only

a

major

prerequisite

of

the

definition

of

the

transition

function,

it

is

also

the

source

of

all

information

flo

w

of

Y

.

W

e

distinguish

volatile

and

per

sistent

variables.

Persistent

variables

hold

the

bro

wser’

s

long-term

state

and

are

only

deleted

by

the

(õ

command,

whereas

volatile

variables

are

deleted

in

ev

ery

final

state

of

an

HTTP

transition.

W

e

start

by

describing

the

volatile

variables,

which

we

also

summarize

in

Table

6

of

Appendix

Section

B.

At

the

be

ginning

of

an

HTTP

transaction

a

bro

wser

is

directed

to

retrie

ve

an

address,

either

through

a

user

input

or

deri

ved

from

the

pre

vious

HTTP

transaction.

The

variable

"[Z#

contains

the

address

to

be

retrie

ved

and

implies

the

value

of

¢v+-and

¬¢

-^¦0¡

X)

.

The

value

of

¬¢

-^¦0¡

X)

specifies

the

type

of

the

channel

the

bro

wser

establishes

to

the

serv

er

with

hostname

¢v+-.

The

variables

+v

+

#

<

%)

#%$

and

r

)-£¢

v Z

specify

the

properties

of

the

HTTP

request,

where

the

+v

+

#

<

%)

#%$

states

that

the

entity

issuing

the

request

for

"[Z#

has

a

URI

on

its

own.

This

variable

implies

whether

a

Referer

Tag

is

included

in

the

request

and

therefore

implies

an

information

flo

w

to

the

serv

er

.

The

variable

r

)-£¢

v Z

contains

the

HTTP

method

to

be

used

in

this

HTTP

transition.

The

variable

¨v#%r

$.q

contains

additional

input

pro

vided

by

the

user

.

The

variable

¬¢

\

f

Y

contains

the

bro

wser’

s

local

representation

of

a

channel

established

to

a

serv

er

,i.e.,

the

data

the

bro

wser

has

acquired

about

the

channel.

An

element

of

f

Y

is

a

tuple

D<

$&Z

a¬¢

v+-a+%$&Z

a

-^¦0¡

X)a

3¨

%#g))

E

from

the

domain

ö

m

QSRUTV

m

/

:Bm

f

Y

m

Y

.

The

variables

¢v+-and

+%$&Z

describe

the

serv

er

to

which

the

bro

wser

channel

is

connected,

where

¢v+-contains

the

hostname

of

the

serv

er

whereas

+%$&Z

names

the

serv

er’

s

identity

in

a

secure

channel,

and

is

1

in

an

insecure

channel.

The

variable

-^¦0¡

X)

represents

the

channel

type

(secure

or

insecure).

The

variable

¨%#g))

is

used

to

or

ganize

the

reuse

of

existing

channels

and

flags

that

the

channel

is

currently

not

associated

to

a

HTTP

transaction.

The

variable

r

contains

the

payload

of

an

HTTP

response,

whereas

variable

+-xv#g)

flags

the

user’

s

decision

whether

to

store

login

data

in

the

bro

wser’

s

state.

W

e

describe

the

persistent

variables

in

Table

1.

W

e

first

consider

variables

that

are

associated

with

ev

ery

single

windo

w

.

Firstly

,the

variable

÷j$&Z

names

the

identifier

of

the

bro

wser

windo

w

.

Secondly

,

the

tuple

¡e#g)!

#x

q

contains

data

about

the

preceding

HTTP

transaction

in

this

windo

w

.

Its

element

¬¢

-^¦0¡

X)

contains

the

channel

type,

whereas

"[Z#

contains

the

address

retrie

ved

in

the

preceding

HTTP

transaction.

The

element

¨v#%r

contains

the

structure

of

an

HTML

form

together

with

hidden

value

fields

already

included

in

the

form.

The

other

persistent

variables

are

global

for

Y

.The

set

øa¢

"qXq

)(¤+

contains

representations

of

type

f

Y

for

all

channels

the

bro

wser

has

established.

The

follo

wing

variables

are

important

for

the

information

flo

w

analysis

of

bro

wser

-based

protocols.

The

set

ù[

-£¢

contains

a

user’

s

login

information

the

user

decided

to

store

in

the

bro

wser’

s

state.

The

persistent

variable

ú

$+-xv#

x¦

is

a

finite

sequence

of

addresses

successfully

retrie

ved

by

the

bro

wser

.

The

variable

øj"

¬¢

)

models

the

local

bro

wser

cache.

It

is

a

finite

sequence

of

pairs

of

addresses

and

page

contents

retrie

ved

from

these

addresses.

(8)

3.4

Imperfections

Ev

en

correct

bro

wsers

produce

information

flo

w

to

(i)

communication

partners

and

(ii)

the

underly-ing

operating

system.

As

this

information

flo

w

may

impose

vulnerabilities

to

bro

wser

-based

security

protocols,

we

model

it

as

explicit

imperfection

in

the

bro

wser

model.

A

web

bro

wser

leaks

information

about

pre

vious

transactions

and

its

user

to

communication

partners

in

HTTP

transactions.

A

real

bro

wser

includes

such

information

into

HTTP

header

4 Ideal

User

Br

owsing

Beha

vior

¡

In

bro

wser

-based

protocols,

the

bro

wser’

s

user

has

an

important

role.

As

we

ha

ve

seen,

the

bro

wser

is

a

state-less

de

vice

that

only

pro

vides

a

rudimental

trust

management.

Ho

we

ver

,in

higher

protocols

one

needs

to

store

information

be

yond

a

single

transaction

and

ha

ve

a

stronger

trust

model.

Also,

the

user

controls

most

of

the

bro

wser

beha

vior

and

has

the

final

say

about

the

bro

wser’

s

actions.

Therefore,

without

a

user

fulfilling

certain

tasks

and

properties

all

bro

wser

-based

protocols

must

fail.

Thus,

we

consider

a

user

as

acti

ve

protocol

participant

and

model

it

by

an

in

general

transparent

machine,

which,

ho

we

ver

,

enforces

the

requirements

for

bro

wser

-based

protocols.

Firstly

,

the

user

stores

data

of

the

protocols

it

is

in

volv

ed

in.

Such

data

may

be

addresses

of

trusted

serv

ers

or

identity

information.

The

user

kno

ws

its

trust

relationship

to

other

parties

in

the

bro

wser

-based

protocols.

The

machine

Q

stores

this

data

in

its

state.

As

the

web

bro

wser

is

not

aw

are

of

an

y

higher

-le

vel

protocol,

the

user

acts

as

a

supervisor

of

the

protocol

flo

w

the

bro

wser

is

in

volv

ed

in.

It

checks

certificates,

observ

es

the

status

of

secure

channels

and

logs

of

f

from

the

bro

wser

in

error

cases.

Also,

the

user

engages

in

the

user

authentication

with

a

serv

er

and

performs

the

crucial

verification

of

the

serv

er’

s

identity

.

The

machine

acts

autonomically

upon

bro

wser

dialogs

concerning

these

tasks.

As

depicted

in

Figure

2,

the

machine

Q

w

orks

as

proxy

between

bro

wser

machine

and

the

protocol

interf

ace.

It

forw

ards

communication

from

the

protocol

interf

ace

to

Y

and

the

bro

wser’

s

pages

back.

W

ith

the

message

K

g

it

indicates

that

the

bro

wser

beha

ved

contrary

to

the

expectations

of

Q

and

that

Q

aborted

the

interaction

with

it.

The

user

machine

is

connected

to

the

bro

wser

through

the

user

interf

ace

ports

b

e2

and

b

3

,which

we

described

in

Section

3.1

in

detail.

The

user

machine

contains

confidential

metadata

in

its

state.

Therefore,

wX"#+

is

an

important

initial

point

for

information

flo

w

analysis.

As

in

Section

3.3,

we

distinguish

persistent

and

volatile

variables.

W

e

use

similar

names

as

in

the

bro

wser

machine

for

the

volatile

variables:

the

address

"[Z#

and

channel

type

¬¢

-^¦0¡

X)

refer

to

the

address

the

bro

wser

established

a

channel

to.

For

secure

channels

the

serv

er

identity

+%$&Z

additionally

contains

the

identity

according

to

the

serv

er’

s

certificate.

The

variable

u

refers

to

instances

of

the

type

L

,

e.g.

to

serv

ers

Q

trusts.

A

L

is

a

tuple

DH¢

v+-a+%$&ZJa(»v

&§

8$.qUa+ )

&

E

of

domain

QSRUTV

m

/

:

m

/?:

m

C

Dyf

Y

E

where

¢v+-contains

the

serv

er’

s

hostname,

+%$&Z

its

identity

in

a

secure

channel

and

(»v

&§

8$.q

the

login

information

for

user

Q

.

The

set

+ )

&

contains

the

channel

types

allo

wed

for

a

user

authentication

with

that

serv

er

.

In

Table

2,

we

introduce

the

persistent

variables

of

user

machine

Q

,which

we

use

to

model

the

trust

relationships

of

Q

.

The

set

¢

contains

all

instances

of

type

L

to

which

machine

Q

has

a

trust

relationship.

The

pairs

DH¢

v+-a+%$&Z

E

within

this

table

must

be

unique.

The

set

X"

+

-£¢

+ )

&

models

the

general

polic

y

of

Q

for

allo

wed

channel

types

for

user

authentication

and

contains

the

channel

types

that

are

acceptable.

W

e

define

the

state

transition

function

z

by

the

state

diagram

in

Figure

5.

The

state

of

this

machine

models

a

user

being

idle,

w

aiting

for

an

input

from

the

higher

protocol

layer

which

address