Solving Some Modeling Challenges when Testing Rich Internet Applications for Security

(1)


Software Security Research Group (SSRG), University of Ottawa

(2)

SSRG Members

University of Ottawa

• Prof. Guy-Vincent Jourdan

• Prof. Gregor v. Bochmann

• Suryakant Choudhary (Master student)

• Emre Dincturk (PhD student)

• Khaled Ben Hafaiedh (PhD student)

• Seyed M. Mir Taheri (PhD student)

• Ali Moosavi (Master student)

In collaboration with

Research and Development, IBM® Security AppScan® Enterprise

(3)

Introduction: Traditional Web Applications

• Navigation is achieved using the links (URLs)

• Synchronous communication

[Figure: Traditional synchronous communication pattern. Each user interaction sends a request; the user waits during server processing; the response triggers a full page refresh.]

(4)

Introduction: Rich Internet Applications

• More interactive and responsive web apps

▫ Page changes via client-side code (JavaScript)

▫ Asynchronous communication

[Figure: Asynchronous communication pattern (in RIAs). User interaction continues while requests are sent; server processing returns responses that trigger partial page updates.]

(5)

Crawling and web application security testing

• All parts of the application must be discovered before we analyze for security.

• Why are automatic crawling algorithms important for security testing?

▫ Most RIAs are too large for manual exploration

▫ Efficiency

(6)

What we present…

• Techniques and approaches to make web application security assessment tools perform better

• How to improve the performance?

▫ Make them efficient by analysing only what’s important and ignoring irrelevant information

(7)

Web application crawlers

• Main components:

▫ Crawling strategy

▪ Algorithm which guides the crawler

▫ State equivalence

▪ Algorithm which indicates what should be considered new

(8)

State Equivalence

• Client states

• Decides if two client states of an application should be considered different or the same.

• Why important?

▫ Infinite runs or state explosion

(9)

Techniques

• Load-Reload: Discovering non-relevant dynamic content of web pages

(10)

1. Load-Reload: Discovering non-relevant dynamic content of web pages

(11)

What we propose

• Reload the web page (URL) to determine the parts of the content that are relevant.

Calculate Delta (X): Content that changed between the two loads.

(12)

• Delta(X): X is any web page, and Delta(X) is the collection of XPaths of the content that is not relevant

• E.g. Delta(X) = {html\body\div\, html\body\a\@href}
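The Load-Reload idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's implementation: the pages are assumed to be well-formed XML so that the standard library parser can read them, text following a child element (tail text) is ignored, and the function names `flatten`, `delta` and `same_state` are hypothetical.

```python
import xml.etree.ElementTree as ET


def flatten(root):
    """Map each XPath-like location in the page to the content found there."""
    items = {}

    def walk(node, path):
        # Text directly inside the element (tail text is ignored in this sketch).
        items[path + "/text()"] = (node.text or "").strip()
        for name, value in node.attrib.items():
            items[path + "/@" + name] = value
        seen = {}
        for child in node:
            # Number siblings with the same tag, as XPath positions do.
            seen[child.tag] = seen.get(child.tag, 0) + 1
            walk(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    walk(root, "/" + root.tag)
    return items


def delta(first_load, second_load):
    """Locations whose content differs between two loads of the same URL."""
    a = flatten(ET.fromstring(first_load))
    b = flatten(ET.fromstring(second_load))
    return {p for p in a.keys() | b.keys() if a.get(p) != b.get(p)}


def same_state(page_a, page_b, ignored):
    """Compare two pages while skipping the locations listed in `ignored`."""
    a = flatten(ET.fromstring(page_a))
    b = flatten(ET.fromstring(page_b))
    return all(a.get(p) == b.get(p) for p in (a.keys() | b.keys()) - ignored)
```

A state-equivalence check could then call `same_state(page_a, page_b, delta(load1, load2))`, so that content such as timestamps or rotating links does not make the crawler see a new state on every visit.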

(15)

2. Identifying Session Variables and Parameters

• What is a session?

▫ A session is a conversation between the server and a client.

▫ Why should a session be maintained?

▫ HTTP is stateless: when a client sends a series of requests to a server, the server cannot identify which client each request comes from

(16)

Identifying Session Variables and Parameters (2)

• Session tracking methods:

▫ User authorization

▫ Hidden fields

▫ URL rewriting

▫ Cookies

▫ Session tracking API

• Problems that are addressed:

▫ Redundant crawling: Might result in crawler trap or infinite runs.

▫ Session termination problem: Incomplete coverage of the application

(17)

What we propose

• Two recordings of the log-in sequence are done on the same website, using the same user input (e.g. same user name and password) and the same user actions.
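One way to use the two recordings is sketched below in Python. The slide does not spell out the comparison step, so this is an assumption: since user input and actions are identical in both runs, any query parameter whose value still differs is treated as session-related. The function names and the example URLs are hypothetical, and only URL query parameters are examined (cookies and hidden fields would need the same treatment).

```python
from urllib.parse import urlsplit, parse_qsl


def request_params(url):
    """Return the query parameters of a recorded request as a dict."""
    return dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))


def session_parameters(recording_a, recording_b):
    """Names of parameters whose values differ between two identical runs."""
    suspects = set()
    # Both recordings follow the same actions, so requests line up pairwise.
    for url_a, url_b in zip(recording_a, recording_b):
        params_a, params_b = request_params(url_a), request_params(url_b)
        for name in params_a.keys() & params_b.keys():
            if params_a[name] != params_b[name]:
                suspects.add(name)
    return suspects


run1 = [
    "http://example.test/login?user=alice&sid=abc123",
    "http://example.test/home?sid=abc123&tab=1",
]
run2 = [
    "http://example.test/login?user=alice&sid=xyz789",
    "http://example.test/home?sid=xyz789&tab=1",
]
print(session_parameters(run1, run2))
```

Parameters flagged this way can be ignored when comparing states and watched for when detecting that a session has ended.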

(19)

3. Crawling Strategies For RIAs

• Crawling extracts a “model” of the application that consists of

▫ States, which are “distinct” web pages

▫ Transitions, which are triggered by event executions

• Strategy decides how the application exploration should proceed

(20)

Standard Crawling Strategies

• Breadth-First and Depth-First

• They are not flexible

▫ They do not adapt themselves to the application

• Breadth-First often goes back to the initial page

▫ Increases the number of reloads (loading the URL)

• Depth-First requires traversing long paths
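The reload cost of Breadth-First can be illustrated with a small simulation in Python. This is a sketch under simplifying assumptions, not a crawler: the application is a hypothetical dictionary mapping each state to its events and target states, and the only way back to an earlier state is assumed to be reloading the URL and replaying the known path.

```python
from collections import deque


def bfs_crawl(app, initial):
    """Explore every event breadth-first; return (states found, reloads)."""
    path_to = {initial: []}   # shortest known event path from the initial state
    queue = deque([initial])
    current = initial         # state the crawler is in right now
    reloads = 0
    while queue:
        state = queue.popleft()
        for event, target in app[state].items():
            if current != state:
                # Assumed: reload the URL, then replay path_to[state].
                reloads += 1
                current = state
            current = target  # executing the event moves the crawler
            if target not in path_to:
                path_to[target] = path_to[state] + [event]
                queue.append(target)
    return set(path_to), reloads


# Hypothetical four-state application.
app = {
    "s0": {"a": "s1", "b": "s2"},
    "s1": {"c": "s3"},
    "s2": {},
    "s3": {},
}
states, reloads = bfs_crawl(app, "s0")
```

In this toy model every event after the first one executed from a state costs a reload, which is the behaviour the bullet above describes.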

(21)

What we propose

• Model Based Crawling

▪ Model is an assumption about the structure of the application

▪ Specify a good strategy for crawling any application that follows the model.

▪ Specify how to adapt the crawling strategy in case that

(22)

What we propose (2)

• Existing models:

▫ Hypercube Model

1. Independent events

2. The set of events enabled at a state is the same as at the initial state, except for the events executed to reach that state.

▫ Probability Model

▪ Statistics gathered about event execution results are used to guide the application exploration strategy

[Figure: hypercube for two events, with states {e1,e2}, {e2}, {e1}, {} and transitions labeled e1 and e2.]
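The two hypercube assumptions are enough to predict the whole state graph before crawling. The Python sketch below builds that anticipated graph; identifying a state by the set of events executed to reach it is a choice made for this illustration, and the function name `hypercube` is hypothetical.

```python
from itertools import combinations


def hypercube(events):
    """Build the state graph anticipated for a set of independent events."""
    events = frozenset(events)
    graph = {}
    for size in range(len(events) + 1):
        for combo in combinations(sorted(events), size):
            # Assumption 1 (independence): order of execution does not matter,
            # so a state is identified by the set of events executed so far.
            executed = frozenset(combo)
            # Assumption 2: enabled events are the initial ones minus those executed.
            enabled = events - executed
            graph[executed] = {e: executed | {e} for e in sorted(enabled)}
    return graph


model = hypercube(["e1", "e2"])
```

For n events the model has 2^n states and n·2^(n-1) transitions, so a crawler following this model knows in advance what it expects to find and can adapt when the application deviates.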

(23)

Conclusion

• Crawling is essential for automated security testing of web applications

• We introduced two techniques to enhance security testing of web applications

▫ Identifying and ignoring irrelevant web page contents

▫ Identifying and ignoring session information

• We have worked on new crawling algorithms

(25)

Demonstration

• Rich Internet Application Security Testing - IBM® Security AppScan® Enterprise

(26)

DEMO – IBM® Security AppScan® Enterprise

• IBM Security AppScan Enterprise is an automated web application scanner

• We added RIA crawling capability to a prototype of AppScan

• We will demo how the coverage of the tool increases with RIA crawling capability

(28)

DEMO – Results
