Solving Some Modeling Challenges when
Testing Rich Internet Applications for Security
Software Security Research Group (SSRG), University of Ottawa
SSRG Members
University of Ottawa
• Prof. Guy-Vincent Jourdan
• Prof. Gregor v. Bochmann
• Suryakant Choudhary (Master student)
• Emre Dincturk (PhD student)
• Khaled Ben Hafaiedh (PhD student)
• Seyed M. Mir Taheri (PhD student)
• Ali Moosavi (Master student)
In collaboration with
Research and Development, IBM® Security AppScan® Enterprise
Introduction: Traditional Web Applications
• Navigation is achieved using links (URLs)
• Synchronous communication
Traditional Synchronous Communication Pattern
[Figure: timeline of the traditional pattern. Each user interaction sends a request; the user waits during server processing until the response triggers a full page refresh.]
Introduction : Rich Internet Applications
• More interactive and responsive web apps
▫ Page changes via client-side code (JavaScript)
▫ Asynchronous communication
Asynchronous Communication Pattern (in RIAs)
[Figure: timeline of the asynchronous pattern. User interaction continues while requests are sent and processed by the server; each response triggers a partial page update.]
Crawling and web application security testing
• All parts of the application must be discovered before we analyze for security.
• Why are automatic crawling algorithms important for security testing?
▫ Most RIAs are too large for manual exploration
▫ Efficiency
What we present…
• Techniques and Approaches to make web
application security assessment tools perform better
• How to improve the performance?
▫ Make them efficient by analysing only what is important and ignoring irrelevant information
Web application crawlers
• Main components:
▫ Crawling strategy
Algorithm which guides the crawler
▫ State equivalence
Algorithm which indicates what should be considered new
State Equivalence
• Client states
• Decides if two client states of an application should be considered different or the same.
• Why important?
▫ Infinite runs or state explosion
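The effect of the equivalence choice can be sketched in a few lines of Python. This is an illustration only, not the method used in the tool: `state_key`, `count_states`, and the rule of masking runs of digits are assumptions made for the example.

```python
import hashlib
import re

def state_key(dom: str) -> str:
    """Reduce a DOM snapshot to a fingerprint; equal fingerprints mean 'same state'."""
    # Assumption for this sketch: runs of digits (counters, clocks) are noise.
    normalized = re.sub(r"\d+", "#", dom)
    # Collapse whitespace so formatting differences do not create new states.
    normalized = re.sub(r"\s+", " ", normalized).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def count_states(snapshots, key=state_key):
    """Count how many snapshots a crawler would treat as distinct states."""
    return len({key(dom) for dom in snapshots})

# Five snapshots of the same page that differ only in a clock value.
snapshots = [
    f"<div id='clock'>{t}</div><ul><li>Home</li></ul>" for t in range(1000, 1005)
]

exact = count_states(snapshots, key=lambda dom: dom)  # strict string equality
equivalent = count_states(snapshots)                  # equivalence via state_key
print(exact, equivalent)
```

With strict equality every reload of the page looks like a new state, which is how infinite runs arise; with the looser key the five snapshots collapse into one state.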
Techniques
• Load-Reload: Discovering non-relevant dynamic content of web pages
1. Load-Reload: Discovering non-relevant dynamic content of web pages
What we propose
• Reload the web page (URL) to determine the parts of the content that are relevant.
• Calculate Delta(X): the content that changed between the two loads.
• Delta(X): X is any web page, and Delta(X) is the collection of XPaths of the content that is not relevant.
• E.g. Delta(X) = {/html/body/div, /html/body/a/@href}
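The load-reload idea can be sketched as follows. This is a minimal sketch under stated assumptions, not the tool's implementation: the page is assumed to be well-formed markup that `xml.etree.ElementTree` can parse, text that follows a child element (tail text) is ignored, and the helper names `flatten` and `delta` and the two sample pages are hypothetical.

```python
import xml.etree.ElementTree as ET

def flatten(elem, path, out):
    """Map every text node and attribute under elem to an XPath-like key."""
    text = (elem.text or "").strip()
    if text:
        out[path + "/text()"] = text
    for name, value in elem.attrib.items():
        out[f"{path}/@{name}"] = value
    counts = {}
    for child in elem:
        # Index children by position among siblings with the same tag.
        counts[child.tag] = counts.get(child.tag, 0) + 1
        flatten(child, f"{path}/{child.tag}[{counts[child.tag]}]", out)
    return out

def delta(first_load: str, second_load: str) -> set:
    """Return the paths whose content differs between two loads of one URL."""
    root_a, root_b = ET.fromstring(first_load), ET.fromstring(second_load)
    a = flatten(root_a, "/" + root_a.tag, {})
    b = flatten(root_b, "/" + root_b.tag, {})
    return {p for p in a.keys() | b.keys() if a.get(p) != b.get(p)}

# Two loads of the same hypothetical page: a clock and a session id change.
load1 = ("<html><body><div>Served at 10:00:01</div>"
         "<a href='/home?sid=abc'>Home</a></body></html>")
load2 = ("<html><body><div>Served at 10:00:07</div>"
         "<a href='/home?sid=xyz'>Home</a></body></html>")

ignored = delta(load1, load2)
print(sorted(ignored))
```

The resulting set plays the role of Delta(X): a crawler could skip these paths when comparing two snapshots of the page.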
2. Identifying Session Variables and
Parameters
• What is a session?
▫ A session is a conversation between the server and a client.
▫ Why should a session be maintained?
▫ HTTP is stateless: when a series of requests and responses is exchanged between one client and a server, the server cannot identify which client each request comes from.
Identifying Session Variables and Parameters (2)
• Session tracking methods:
▫ User authorization
▫ Hidden fields
▫ URL rewriting
▫ Cookies
▫ Session tracking API
• Problems that are addressed:
▫ Redundant crawling: might result in a crawler trap or infinite runs.
▫ Session termination problem: incomplete coverage of the application.
What we propose
• Two recordings of the log-in sequence are made on the same website, using the same user input (e.g. same user name and password) and the same user actions.
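The slide does not say how the two recordings are compared. One plausible reading, sketched below as an assumption, is that any parameter whose value differs between the recordings, although input and actions were identical, is a candidate session variable. The recording format (a list of URL and cookie pairs), the helper names, and the sample URLs are all hypothetical.

```python
from urllib.parse import urlsplit, parse_qsl

def request_params(url: str, cookies: dict) -> dict:
    """Collect query parameters and cookies of one request under tagged keys."""
    params = {("query", k): v for k, v in parse_qsl(urlsplit(url).query)}
    params.update({("cookie", k): v for k, v in cookies.items()})
    return params

def session_candidates(recording_a, recording_b) -> set:
    """Return parameters whose values differ between two identical log-ins."""
    candidates = set()
    for (url_a, cookies_a), (url_b, cookies_b) in zip(recording_a, recording_b):
        a = request_params(url_a, cookies_a)
        b = request_params(url_b, cookies_b)
        for key in a.keys() & b.keys():
            if a[key] != b[key]:
                candidates.add(key)
    return candidates

# Two recordings of the same log-in, same user name, same actions.
rec_a = [
    ("http://example.test/login?user=alice", {}),
    ("http://example.test/home?sid=111&lang=en", {"JSESSIONID": "A1"}),
]
rec_b = [
    ("http://example.test/login?user=alice", {}),
    ("http://example.test/home?sid=222&lang=en", {"JSESSIONID": "B7"}),
]

found = session_candidates(rec_a, rec_b)
print(sorted(found))
```

Parameters such as `user` and `lang` are stable across both recordings and are kept; `sid` and `JSESSIONID` change and would be flagged for the crawler to ignore when comparing URLs and states.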
3. Crawling Strategies For RIAs
• Crawling extracts a “model” of the application that consists of
▫ States, which are “distinct” web pages
▫ Transitions, which are triggered by event executions
• Strategy decides how the application exploration should proceed
Standard Crawling Strategies
• Breadth-First and Depth-First
• They are not flexible
▫ They do not adapt themselves to the application
• Breadth-First often goes back to the initial page
▫ Increases the number of reloads (loading the URL)
• Depth-First requires traversing long paths
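The reload cost of the two standard strategies can be illustrated on a toy application. This is a simplified sketch, not a real crawler: the application is a hypothetical dictionary of states and events, and returning to an earlier state is assumed to require reloading the URL and replaying the known event path, since a RIA has no reliable "back".

```python
def crawl(app, initial, breadth_first):
    """Explore every event of a toy app; return (states, reloads, events run)."""
    paths = {initial: []}                    # known event path from the initial state
    order = [initial]                        # states in discovery order
    pending = {initial: list(app[initial])}  # events not yet executed, per state
    current, reloads, executed = initial, 0, 0
    while True:
        candidates = [s for s in order if pending[s]]
        if not candidates:
            break
        # Breadth-first: oldest state first. Depth-first: newest state first.
        state = candidates[0] if breadth_first else candidates[-1]
        if current != state:
            reloads += 1                     # reload the URL ...
            executed += len(paths[state])    # ... and replay the path to the state
        event = pending[state].pop(0)
        current = app[state][event]
        executed += 1
        if current not in paths:
            paths[current] = paths[state] + [event]
            order.append(current)
            pending[current] = list(app[current])
    return set(paths), reloads, executed

# Hypothetical application: state -> {event: next state}.
app = {
    "s0": {"a": "s1", "b": "s2"},
    "s1": {"c": "s3", "d": "s4"},
    "s2": {"e": "s5"},
    "s3": {}, "s4": {}, "s5": {},
}

bfs = crawl(app, "s0", breadth_first=True)
dfs = crawl(app, "s0", breadth_first=False)
print("BFS reloads:", bfs[1], "DFS reloads:", dfs[1])
```

Both runs find the same six states, but the breadth-first run keeps returning to shallow states and pays for more reloads. The graph is too small to show the depth-first weakness named above, which appears when the replayed paths become long.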
What we propose
• Model-Based Crawling
▫ A model is an assumption about the structure of the application
▫ Specify a good strategy for crawling any application that follows the model
▫ Specify how to adapt the crawling strategy in case the application deviates from the model
What we propose (2)
• Existing models:
▫ Hypercube Model
1. Independent events
2. The set of events enabled at a state is the same as at the initial state, except for the events executed to reach it.
▫ Probability Model
Statistics gathered about event execution results are used to guide the application exploration strategy
[Figure: two-dimensional hypercube with four states labeled {}, {e1}, {e2}, {e1,e2}, connected by transitions e1 and e2]
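The two hypercube assumptions can be made concrete with a short sketch that builds the anticipated model for a given set of events. It assumes that each state is identified by the set of events executed to reach it; the function name `hypercube` is hypothetical, and the sketch builds only the model, not the crawling strategy derived from it.

```python
from itertools import combinations

def hypercube(events):
    """Build the anticipated states and transitions for n independent events."""
    events = sorted(events)
    # One state per subset of events (assumed: the set already executed).
    states = [
        frozenset(subset)
        for size in range(len(events) + 1)
        for subset in combinations(events, size)
    ]
    # At each state, every event not yet executed is still enabled.
    transitions = {
        (state, event): state | {event}
        for state in states
        for event in events
        if event not in state
    }
    return states, transitions

states, transitions = hypercube(["e1", "e2"])
print(len(states), "states,", len(transitions), "transitions")

# Independence: executing e1 then e2 reaches the same state as e2 then e1.
start = frozenset()
via_e1 = transitions[(transitions[(start, "e1")], "e2")]
via_e2 = transitions[(transitions[(start, "e2")], "e1")]
print(via_e1 == via_e2)
```

For n events the model has 2^n states and n·2^(n-1) transitions, so two events give the four states and four transitions of the figure.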
Conclusion
• Crawling is essential for automated security testing of web applications
• We introduced two techniques to enhance security testing of web applications
▫ Identifying and ignoring irrelevant web page contents
▫ Identifying and ignoring session information
• We have worked on new crawling algorithms
Demonstration
• Rich Internet Application Security Testing - IBM® Security AppScan® Enterprise
DEMO – IBM® Security AppScan® Enterprise
• IBM Security AppScan Enterprise is an automated web application scanner
• We added RIA crawling capability to a prototype of AppScan
• We will demo how the coverage of the tool increases with RIA crawling capability