Model checking web applications


by

Mohammed Yahya Alzahrani

Submitted for the degree Doctor of Philosophy

Department of Computer Science

School of Mathematical and Computer Sciences

Heriot-Watt University

December 2015

The copyright in this thesis is owned by the author. Any quotation from the thesis or use of any of the information contained in it must acknowledge this thesis as the source of the quotation or

Abstract

The modelling of web-based applications can assist in capturing and understanding their behaviour. The development of such applications requires the use of sound methodologies to ensure that the intended and actual behaviour are the same.

As a verification technique, model checking can assist in finding design flaws and simplifying the design of a web application, and as a result the design and the security of the web application can be improved. Model checking has the advantage of using an exhaustive search of the state space of a system to determine if the specifications are true or not in a given model.

In this thesis we present novel approaches to modelling and verifying web applications’ properties to ensure their design correctness and security. Since the actions in web applications rely on both the user input and the server status, we propose an approach for modelling and verifying dynamic navigation properties. The Spin model checker has been used successfully in verifying communication protocols. However, the current version of Spin does not support modelling time. We integrate discrete time into the Spin model to allow the modelling of realistic properties that rely on time constraints and to analyse the sequence of actions and time. Examining the sequence of actions in web applications assists in understanding their behaviour in different scenarios, such as navigation errors and the presence of an intruder. The model checker Uppaal is presented in the literature as an alternative to Spin when modelling real-time systems. We develop models with real-time constraints in Uppaal in order to validate the results from the Spin models and to compare modelling with real time against modelling with discrete time as in Spin. We also compare the complexity and expressiveness of each model checker in verifying web applications’ properties.

The web application models in our research are developed gradually to ensure their correctness and to manage the complexities of specifying the security and navigation properties. We compare the sequence of actions and time in the compromised model with that in the secure model, to assist in improving the early detection of malicious behaviour in web applications.

Acknowledgements

I am grateful to my supervisor Dr. Lilia Georgieva for her guidance, support and encouragement throughout my research. Her extensive comments, from the early stages of my Ph.D. through to the writing up, helped me to understand my research. Without her inspiration, knowledge and enthusiasm, this work would never have been finished. I am also grateful to Professor Gerard Holzmann for his support and patience in answering my questions regarding the Spin models.

I would also like to thank Vaggelis for all the encouragement and support during my Ph.D.

Also, many thanks to my office colleagues Prabhat and Konstantina for their support and useful comments.

My gratitude goes to Nesreen for being an endless source of inspiration and for supporting me in undertaking my Ph.D.

Last but not least, I am so grateful to my parents and my siblings (Elham, Seham, Ali, Wafaa, Waleed and Dana), whose constant encouragement, love and support helped me throughout my Ph.D. It is to them that I dedicate this work.

Contents

Abstract
Acknowledgements
Contents
List of Figures
List of Tables

1 Introduction
  1.1 Web Applications
  1.2 Formal Methods
  1.3 State of the Problem
  1.4 Research Aims and Objectives
  1.5 Research Methodology
  1.6 Contributions
  1.7 Publications
  1.8 Structure of Thesis

2 Background and Related Work
  2.1 Web Applications
    2.1.1 Security Threats for Web Applications
  2.2 Web Navigation Properties
  2.3 Web Security Properties
    2.3.1 Session Management
    2.3.2 Authentication
    2.3.3 Control-Flow
  2.4 Model Checking
    2.4.1 Model Checking Tools
  2.5 Temporal Logic
    2.5.1 Linear Temporal Logic (LTL)
    2.5.2 Computational Tree Logic (CTL)
    2.5.3 Temporal Logic Patterns
  2.6 Timed Automata Theory
    2.6.1 Formal Syntax
  2.7 Modelling an Intruder (Man in the Middle)
  2.8 Conclusions

3 Modelling in Spin
  3.1 The Spin Model Checker
    3.1.1 Promela
    3.1.2 Verification in Spin
    3.1.3 Modelling Time in Spin
  3.2 Modelling Web Applications in Spin
    3.2.1 Model without Timer
      3.2.1.1 Simulation and Verification Results of Model without Timer
    3.2.2 Modelling Dynamic Navigation
      3.2.2.1 Simulation and Verification Results of the Dynamic Navigation Model
    3.2.3 Modelling with Time Constraints
      3.2.3.1 Simulation and Verification Results of Timed Model
    3.2.4 Adding an Intruder to the Model
      3.2.4.1 Simulation and Verification Results of Model with Intruder
  3.3 Summary

4 Modelling in Uppaal
  4.1 The Uppaal Model Checker
    4.1.1 The Modelling Language
    4.1.2 Modelling Time in Uppaal
      Locations in Uppaal
  4.2 Modelling Web Applications in Uppaal
    4.2.1 Model without Time Constraints
      4.2.1.1 Simulation and Verification Results of Model without Time Constraints
    4.2.2 Modelling Dynamic Navigation
      4.2.2.1 Simulation and Verification Results of Dynamic Navigation Model
    4.2.3 Modelling with Time Constraints
      4.2.3.1 Simulation and Verification Results of Timed Model
    4.2.4 Adding an Intruder to the Model
      4.2.4.1 Simulation and Verification Results of Model with Intruder
  4.3 Summary

5 Comparison
  5.1 Modelling Web Applications in Spin
  5.2 Modelling Web Applications in Uppaal
  5.3 Comparison
  5.4 Summary

6 Conclusion
  6.1 An Overview of the Research
  6.2 Summary of Thesis Contributions to Research Areas
    6.2.1 Contributions to Model Checking Web Applications
    6.2.2 Contributions to Model Checking Timed Models of Web Applications
    6.2.3 Contributions to Modelling Security Properties of Applications
  6.3 Future Work

Bibliography

List of Figures

1.1 Model Checking Web Application Properties
2.1 Overview of Web Applications [Li and Xue, 2014]
2.2 Percentage of Common Vulnerability Types in Web Applications [Cenzic, 2014]
2.3 Model Checking Process
2.4 Attack Example
3.1 Model of Online Banking
3.2 Safety Verification of Model without Timer
3.3 Message Sequence Chart of Model without Timer
3.4 Page Sequence and LTL Proposition Letters
3.5 Verification Result of Property 3.1
3.10 Never Claim for Property 3.5
3.11 Verification Result of Property 3.6
3.12 Never Claim for Property 3.6
3.13 Verification Error Result of Property 3.13
3.14 Error-trail File of Property 3.13
3.16 Verification Error Result of Property 3.7
3.19 Safety Verification Result of Dynamic Navigation Model
3.20 Message Exchange Sample of Dynamic Navigation Model
3.22 Verification Result of Global Variable Assertions
3.24 Simulation Chart of the Discrete Time Model
3.25 Simulation Results of the Discrete Time Model
3.27 Secure Model
3.28 Model with Intruder
4.1 Path Formulas Supported in Uppaal
4.2 The Automata P1 with Obs Observer
4.3 Possible Behaviour of the First Example
4.4 Uppaal Verification Example
4.5 Uppaal Behaviour with Invariant
4.6 Uppaal Behaviour with Guard
4.7 Location Types in Uppaal
4.8 Client Automaton
4.9 Server Automaton
4.10 Simulation Result of Model without Time Constraints
4.11 Verification Result of CTL Formula 4.1
4.12 Dynamic Client Automaton
4.13 Dynamic Server Automaton
4.14 Verification Results of CTL Formula 4.6
4.15 Timed Client Automaton
4.16 Timed Server Automaton

List of Tables

2.1 Navigation Properties [Stock et al., 2014]
2.2 Session Management Properties [Stock et al., 2014]
2.3 Authentication Properties [Stock et al., 2014]
2.4 Control-Flow Properties [Stock et al., 2014]
2.5 LTL formula operators with their mathematical and Spin notation
2.6 Temporal Logic Patterns
3.1 Operators in Promela
3.2 Verification Results of Properties for Model without Time
3.3 Verification Results of the Secure and the Compromised Model
4.1 CTL Syntax in Uppaal
4.2 Verification Results of the Secure Model and Compromised Model in Uppaal
5.1 Web Applications’ Properties [Stock et al., 2014]
5.2 Comparison of Number of States between Spin and Uppaal


Introduction

In this chapter we first discuss the challenges of web application development that lead to security and design vulnerabilities. In Section 1.2 we provide an overview of formal methods and present the advantages of model checking over alternative verification methods. We then present the state of the problem addressed by our research in Section 1.3. In Section 1.4 we list and discuss the research aims and objectives. In Section 1.5 we present the research methodology. In Section 1.6 we show the contributions of our research. We conclude with the structure of the following chapters in Section 1.8.

1.1 Web Applications

Web applications are common in today’s economic and social life. Such applications provide business services to customers, business-to-business communications, and various services to users around the world. Online businesses use web applications to reach more clients and to improve their services. Sectors such as banking, travel, education and governmental services rely on web applications to promote and increase their operations [Ginige and Murugesan, 2001, Homma et al., 2011, Miao and Zeng, 2007]. The rapid spread of web applications in the areas of communications and business services has promoted them to one of the leading and most essential branches of the software development industry [Offutt, 2002]. Along with the increased demand for web applications, concerns have been raised about design flaws that can cause vulnerabilities in security and navigation properties [Huang and Lee, 2005].

The development of web applications has been evolving rapidly, resulting in poor quality, security vulnerabilities and maintenance challenges [Murugesan and Deshpande, 2002]. Unstable design and development processes, as well as poor project management practices, are the main reasons for such problems [Ginige, 2002]. The data handled by web applications often contains sensitive values (e.g. credit card numbers) for both users and service providers. In 2015, attacks on organizations’ web applications were considered the most common method leading to sensitive data disclosure [Hesseldahl, 2015, Solutions, 2015].

Web application vulnerabilities, which lead to the compromise of sensitive information, are regularly reported [Falk et al., 2008, Jovanovic et al., 2006], as indicated by the following reports:

• According to a report by [Cenzic, 2014], 96% of web applications tested in 2013 had vulnerabilities categorised as high risk. In addition, an average of 14 vulnerabilities per web application was found in 2013, due to design errors.

• A recent report by [Hoff, 2013] showed that there were more than 800 reported hacking incidents in 2012 alone, and 70% of those were carried out through web application vulnerabilities.


• A study carried out by [Falk et al., 2008] showed that 75% of online banking web sites have at least one major security flaw.

• In 2010, more than 8,000 online banking clients’ credentials were stolen from a server where they were stored as plain text [Fundation, 2010].

A report by [Solutions, 2015] stated: “by tracking user behaviour and using some form of fraud detection to get an early warning of suspicious behaviour ... can help to identify malicious activity before your last bit of sensitive data is fully exfiltrated.”

Larger and more complex web applications will also increase the need for rigorous methods of developing high quality applications that are secure and easy to maintain [Lee and Shirani, 2004, Ricca and Tonella, 2001, Taylor et al., 2002]. The development of such applications requires the use of sound methodologies to ensure that the intended and actual behaviour are the same. Also, web applications must satisfy essential security properties, such as authentication, session management and navigation properties [Stock et al., 2014].

In this thesis we apply model checking to the simulation and verification of time-sensitive web applications. We model security and navigation properties, which include session management, authentication and control-flow properties. We use the model checking tools Spin [Holzmann, 2004] and Uppaal [Amnell et al., 2001] to verify an online banking web application in which a client communicates with a server to complete a transaction.


1.2 Formal Methods

Formal methods are mathematically based languages, techniques and tools for verifying hardware and software systems. Using formal methods does not guarantee the correctness of a given system, but it can increase the understanding of a system’s inconsistencies and incompleteness that can lead to design errors [Clarke and Wing, 1996].

Traditional validation techniques, such as testing, can be effective in the early stages of debugging. However, testing cannot detect all errors, and it can miss errors in systems that have a very large number of states, as the testing process can only explore part of the possible behaviour of the system. Furthermore, it is not evident when such techniques have reached their limit, nor is there a clear estimate of the remaining number of bugs [Clarke et al., 1999, Donini et al., 2006].

In contrast, theorem proving and proof checking do not have this shortcoming. However, they are time consuming and often require that the design team includes an expert in both the language used to model the system and the mathematical background of the language. In addition, theorem proving is complex when timing requirements are included in verification [Davis, 2000].

An alternative approach is formal verification, which can exhaustively explore the possible behaviour of a system. In contrast to testing, where only some parts of the behaviour are explored, formal verification can show that a design is correct by exploring all possible states, so that no security vulnerability or design flaw in the model is left unexplored [Clarke et al., 1999].

Model checking tools have played a key role in the design of concurrent and distributed systems and have also been reported in industrial applications [Baier et al., 2008, Clarke, 2008, Holzmann, 2004]. The model checking process helps designers to ensure the correctness of a system in the early stages of development.

In order for a model checking tool to verify a web application model, three main tasks need to be carried out. The first task is modelling, in which the system’s design is converted into a formalism that is accepted by the model checking tool. In some cases, this is a straightforward task, while complex systems may require the use of abstraction to eliminate unrelated or non-essential system details.

The second task is specification: stating the properties that the model of the system must satisfy. Model checking tools commonly use temporal logic, which can assert how the behaviour of the system evolves over time. The final task is verification. Ideally, this task is performed in a completely automated fashion. In the case of a negative result, the model checking tool provides an error trace (counterexample) that assists in locating where the error occurred. Each of these tasks is examined and demonstrated further in Chapter 2.

Model checking has two important advantages over other techniques [Baier et al., 2008, Clarke, 2008, Clarke et al., 1999]:

• The process is fully automatic, so the user does not need to be an expert in mathematical disciplines such as logic and theorem proving.

• The model checking tool provides an error trace (counterexample) that shows where the error has occurred if the property fails. This error trace provides insight into the reason for the error, as well as essential clues to fix the problem.

The main disadvantage of model checking is the state explosion problem, in which the number of states to be analysed or verified grows rapidly, typically exponentially in the number of processes and variables of the system [Holzmann, 2004, McMillan, 1992, Valmari, 1998].
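The three tasks and the counterexample mechanism described above can be illustrated with a minimal explicit-state checker. This is only a sketch in Python, not the algorithm implemented by Spin or Uppaal: the toy model, the page names and the functions `check_invariant`, `successors` and `invariant` are hypothetical and chosen for illustration. The model deliberately contains a design flaw (the account page is reachable without logging in), so that the exhaustive search returns an error trace.

```python
from collections import deque


def check_invariant(initial, successors, invariant):
    """Explore every reachable state breadth-first.

    Returns (holds, counterexample, states_explored). The counterexample is
    the shortest path from the initial state to a violating state, or None.
    """
    parent = {initial: None}          # also serves as the visited set
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while state is not None:  # walk back to the initial state
                trace.append(state)
                state = parent[state]
            return False, trace[::-1], len(parent)
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return True, None, len(parent)


# Modelling task: a hypothetical toy model whose state is (page, logged_in).
def successors(state):
    page, logged_in = state
    if page == "login":
        yield ("home", True)        # correct credentials
        yield ("login", False)      # wrong credentials
        yield ("account", False)    # design flaw: direct access without login
    elif page == "home":
        yield ("account", logged_in)
        yield ("login", False)      # logout
    elif page == "account":
        yield ("home", logged_in)


# Specification task: the account page is only shown to logged-in users.
def invariant(state):
    page, logged_in = state
    return page != "account" or logged_in


# Verification task: exhaustive search with an error trace on failure.
holds, trace, explored = check_invariant(("login", False), successors, invariant)
print(holds, trace, explored)
```

Because every component added to such a model multiplies the number of combined states, the `parent` dictionary grows quickly for larger models, which is the state explosion problem in miniature.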


1.3 State of the Problem

Web applications are dynamically changing and evolving. They are used in services such as the banking, governmental and health sectors [Homma et al., 2011, Huang and Lee, 2005, Krishnamurthi, 2006]. As web applications often involve the transmission of sensitive data, their correctness must be ensured to avoid vulnerabilities. Security is a major concern for developers, since simple errors could lead to the loss of valuable information and threaten the privacy of online users. As a result, the need for automated tools that detect vulnerabilities and protect users against attacks is evident. Verifying web applications using model checking is an emerging research area, and there is a clear gap between theory and practice. This research investigates web application behaviour under different situations (e.g. in the presence of an attacker or under different server statuses). Realistic web application models are built and extended with time constraints to verify and analyse their behaviour.

1.4 Research Aims and Objectives

In this research we apply model checking to model and verify web application behaviour under different scenarios. In particular, the focus is on web application security and navigation properties. This aim can be achieved by fulfilling three interconnected objectives, as follows:

• Develop web application models and extend them with time constraints, so that the properties of realistic models can be verified. In addition, secure models will be investigated and compared with a model in the presence of an attacker to study the weaknesses of the specifications and the sequence of timing and actions. The results could capture the behaviour of the attacker in order to identify vulnerabilities in the models and to analyse compromised and secure models.

• Apply model checking to verify web applications’ behaviour and compare it with other verification methods.

• Present a critical review of formal methods and investigate the landscape of web application modelling and verification techniques.

In this thesis we present a novel approach for the modelling of web applications. We gradually add features to the models in order to verify additional properties in each model. We integrate discrete time in the Spin model checker [Holzmann, 2004] to model properties that rely on time constraints. The advantage of using discrete time is that we are able to capture the value of time at each step in order to compare the behaviour of different models. Uppaal [Amnell et al., 2001] uses real-time modelling, which we first use to validate the results obtained from Spin. Secondly, we compare both tools in the context of verifying web applications.

1.5 Research Methodology

Modelling can provide significant benefits to web application development. The view of a system shifts from basic implementation to more detailed aspects, such as security, which will improve the quality of the final product. Model checking assists in understanding the interactions and states of web applications, reducing design flaws and ensuring consistent conditions and well-defined behaviour [Schätz, 2004]. Additional benefits of model checking for web applications are [Baier et al., 2008, Clarke, 2008,


Modelling phase: Describing and analysing the high-level, abstract and non-deterministic behaviour of the application avoids the cost of implementation details that could complicate the design. Errors can be caught more easily and earlier, in a less expensive development phase.

Properties definition: The properties of the web application model to be verified are defined using temporal logic formulas. For example, Spin uses Linear Temporal Logic (LTL) [Burstall, 1974, Kröger, 1977, Pnueli, 1981], while Uppaal uses a subset of Computation Tree Logic (CTL) [Huth and Ryan, 2006].

Simulation phase: Using simulation and verification to analyse the model and its interactions, we identify potential issues, such as undesired behaviour of the system and modelling errors.

Verification phase: In this phase we verify that the model guarantees the properties of the system in different scenarios. If a property fails, the failure analysis of the model checker assists in finding the error through a trace. We use either temporal logic formulas or simple assertions.
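As an illustration of the properties definition step, a response requirement such as "every request is eventually answered" can be written in both logics. The propositions request and response are hypothetical names chosen for this sketch and are not taken from the models in later chapters.

```latex
% LTL: on every execution, each request is eventually followed by a response
\Box\,(\mathit{request} \rightarrow \Diamond\,\mathit{response})

% CTL: on all paths, globally, a request leads on all paths to a response
\mathbf{AG}\,(\mathit{request} \rightarrow \mathbf{AF}\,\mathit{response})
```

In Spin's input syntax the LTL formula would be written as `[] (request -> <> response)`. Uppaal does not allow nested path quantifiers, so the CTL form is expressed there with the leads-to operator, `request --> response`.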

Figure 1.1 shows the process we will use in Chapters 3 and 4 to analyse and verify the properties of web applications.


Figure 1.1: Model Checking Web Application Properties.

According to [Fenton and Bieman, 2014], a formal experiment is a rigorous and controlled investigation of a model in which important variables are identified and changed such that the outcome can be validated. [Mendes and Mosley, 2006] stated that formal investigation is best suited to the web applications research community, as it is applicable across various types of projects and processes. In a formal investigation a variable is manipulated such that all possible values of the variable are validated.

The formal analysis framework used in this research consists of four components. The application and properties are expressed using formal semantics. A formal language is then used to describe the system. Next, a formal language is used to describe the property under analysis. Finally, a formal technique checks whether the application satisfies the property.

In our research, we are interested in the verification of the applications’ behaviour properties, rather than the data transmission properties. We model both web page transactions and the input of web applications, since the dynamic nature of web applications means that the input could lead to different pages (e.g. wrong authentication credentials). This dynamic behaviour can be affected by different input from the user, or by the server state.
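The dependence of navigation on both the user input and the server state can be sketched as a transition function. This is a minimal illustration in Python, not the Promela or Uppaal model developed in later chapters; the page names, the input labels and the function `next_page` are hypothetical.

```python
def next_page(page, user_input, server_up):
    """Return the page reached from `page`, given user input and server status.

    The same request can lead to different pages: the outcome depends on
    what the user submits and on whether the server is available.
    """
    if not server_up:
        return "error"                      # server state overrides the input
    if page == "login":
        # Wrong credentials keep the user on the login page.
        return "home" if user_input == "valid_credentials" else "login"
    if page == "home" and user_input == "transfer":
        return "transfer"
    if page == "transfer" and user_input == "confirm":
        return "confirmation"
    return page                             # unrecognised input: stay put


print(next_page("login", "valid_credentials", True))
print(next_page("login", "wrong_credentials", True))
print(next_page("login", "valid_credentials", False))
```

A model checker explores all combinations of such inputs and server states rather than the few that a test suite would typically exercise.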


As Spin does not support the modelling of time constraints, it will be extended with discrete time, enabling the construction of realistic web application models. In the model checking of timed models, discrete time is preferred to reduce the risk of state space explosion [Valmari, 1998], which is one drawback of model checking. Modelling with real time could increase the number of system states to an intractable level. The state space explosion problem is discussed in Chapter 2.
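The idea of discrete time can be sketched with an integer clock that advances in unit ticks, so that every action carries a time stamp and a session timeout becomes a simple comparison. The following Python sketch is an assumption-laden illustration, not the timer process used in the Spin models: the timeout value, the event names and the function `run_session` are hypothetical.

```python
SESSION_TIMEOUT = 3  # hypothetical limit, in clock ticks


def run_session(events, timeout=SESSION_TIMEOUT):
    """Replay events against an integer clock and return a time-stamped trace.

    Each event is either "tick" (time advances by one unit) or a user action.
    The user is assumed to have authenticated at time 0. An action arriving
    more than `timeout` ticks after the previous one finds the session
    expired and is redirected to the login page.
    """
    clock = 0
    last_activity = 0
    logged_in = True
    trace = []
    for event in events:
        if event == "tick":
            clock += 1
            continue
        if logged_in and clock - last_activity > timeout:
            logged_in = False               # session timed out
        if logged_in:
            page = event                    # the requested page is served
            last_activity = clock
        else:
            page = "login"                  # timeout leads to a different page
        trace.append((clock, event, page))
    return trace


print(run_session(["transfer", "tick", "confirm"]))
print(run_session(["transfer", "tick", "tick", "tick", "tick", "confirm"]))
```

Because the clock only takes integer values, the number of distinct clock values between two actions is finite, which is why discrete time keeps the state space smaller than a dense-time model.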

1.6 Contributions

In this thesis, we make the following contributions:

• The challenges in adopting model checking for the analysis and verification of web applications are critically reviewed. The usage of model checking is examined for critical properties of web applications, such as security, navigation and time-sensitive properties. After providing background information on the current challenges in verifying web applications, methods are devised to develop more secure and easy-to-maintain web applications. In Chapter 2 we present and discuss the challenges in more detail.

• We design a novel web application model and extend it with a novel approach to time constraints to enable the modelling of web application properties. The time constraints assist in time stamping the messages exchanged between the parties in the communication and also in the analysis of the sequence of actions. By adding time we are able to express properties such as session management properties and dynamic navigation properties, where a timeout can lead to different pages. Chapter 3 describes the steps in modelling time constraints.


• We develop web application models in the Spin model checker. We first analyse the models without time constraints to understand the difference when we add time and to ensure the correctness of the models. We then model dynamic navigation properties by showing how different input from both sides could affect the simulation and verification process. We then introduce a novel approach for modelling the discrete time process so we can model further time-related properties such as session management properties. Finally, we add an intruder to the model to analyse the behaviour of the system in different scenarios. Chapter 3 describes the modelling steps in further detail.

• By analysing the time sequence and action sequence within a web application session, we can identify the difference between a secure session and a compromised session, with the presence of an intruder. Understanding the web application behaviour in different scenarios leads to improved security and more stable development. Furthermore, our approach can assist in developing methods to detect malicious behaviour at early stages. This is analysed in Chapter 4.

• In addition to modelling the static properties of web applications, a novel approach was developed for modelling the dynamic properties of web applications, in which a single input can lead to different pages based on time constraints and server state. As highlighted in the literature review, there is a gap in modelling the dynamic navigation properties of web applications. Our research shows how it is possible to model web applications using the model checking tool’s existing capabilities, resulting in simplified models that contain security and navigation properties. We present the models in further detail in Chapter 3 and Chapter


• We verify web applications’ properties in the Uppaal real-time model checker. Uppaal has a graphical editor which makes it easy to design a system model, along with a graphical simulator that shows the possible dynamic behaviour of a system description. We compare the models in Chapter 5.

• A comparison was made between the Spin and Uppaal model checkers for web application analysis and verification. This comparison aims to answer the following questions:

– How complex and how expressive is each model checking tool when verifying the properties of web application models?

– To what extent can the property specification language be adapted to the specification of web application properties?

– How capable is the model checker of verifying models without running into the state explosion problem?

– How do the results differ when integrating a simple timing constraint into Spin, in contrast to Uppaal, which is based on timed automata specifications?

The results validate the rationale for using model checking in web application development, as explained in Chapter 5.
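The comparison of a secure session with a compromised one, described in the contributions above, can be sketched as a search for the first step at which two time-stamped traces diverge. This Python sketch is only an illustration under stated assumptions: the trace format `(time, action)`, the sample traces, the tolerance and the function `first_divergence` are hypothetical and are not the analysis performed on the Spin or Uppaal output.

```python
def first_divergence(secure, observed, tolerance=0):
    """Return the index of the first step where two traces disagree.

    Each trace is a list of (time, action) pairs. Two steps disagree when
    the actions differ or when their time stamps are more than `tolerance`
    ticks apart. Returns None when the traces are consistent.
    """
    for i, ((t_s, a_s), (t_o, a_o)) in enumerate(zip(secure, observed)):
        if a_s != a_o or abs(t_s - t_o) > tolerance:
            return i
    if len(secure) != len(observed):
        # One trace has extra steps after the common prefix.
        return min(len(secure), len(observed))
    return None


# Hypothetical traces: an intruder relaying messages delays the transfer.
secure = [(0, "login"), (1, "home"), (2, "transfer"), (3, "confirm")]
compromised = [(0, "login"), (1, "home"), (4, "transfer"), (5, "confirm")]

print(first_divergence(secure, compromised, tolerance=1))
print(first_divergence(secure, secure))
```

In this example the actions are identical and only the timing differs, which is the kind of deviation that an untimed model cannot reveal.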

1.7 Publications

Part of the work presented in this thesis has been published and presented in peer-reviewed conferences and workshops:


1. Alzahrani, M. & Georgieva, L. (2012) Modelling Trusted Web Applications. 1st International Workshop on Trustworthy Multi-Agent Systems, KES-AMSTA Special Session, Dubrovnik, Croatia, 25-27 June.

2. Alzahrani, M. & Georgieva, L. (2012) Analysing Data-Sensitive and Time-Sensitive Web Applications. 19th Automated Reasoning Workshop, University of Manchester, 2-4 April.

3. Alzahrani, M. & Georgieva, L. (2013) Comparative Analysis of Time-Sensitive Web Applications using Spin and Uppaal. 20th Automated Reasoning Workshop, University of Dundee, 11-12 April.

4. Alzahrani, M. (2015) Model Checking Web Applications using Spin and Uppaal. 15th International Workshop on Automated Verification of Critical Systems, Edinburgh, 1-4 September.

1.8 Structure of Thesis

The remainder of this thesis is organised as follows:

Chapter 2 summarises the background on web application fundamentals and properties, and provides a comparison of the analysis and verification methods found in the literature. Model checking and the tools used in this research are then described, as a basis for subsequent chapters.

Chapter 3 presents the first model checker, Spin. First the tool and its input language Promela are described. Second, the web application is modelled, and a description of the steps followed during the modelling and verification is provided. The secure and compromised models are presented, and then the simulation and verification results are shown.

Chapter 4 describes the second model checking tool, Uppaal. First a brief description of the tool is presented, followed by background information on timed automata theory as a basis for modelling web applications. A comparison of the secure and compromised models is made, and subsequently, the simulation and verification results are provided.

Chapter 5 provides a comparison between the results obtained from Chapter 3 and Chapter 4. The results of the experiments are analysed, illustrating the differences between the tools, as well as the challenges of modelling web applications.

Chapter 6 assesses the results that were obtained and presents conclusions, contributions, limitations and possible future work.


Background and Related Work

This chapter describes the development process of web applications and the challenges arising in both the design and implementation phases. We outline evolving trends and discuss related work, and then present an overview of model checking principles for web applications. Moreover, we list the verification requirements for web applications. The OWASP Application Security Verification Standard [Stock et al., 2014], which is updated annually, is used to illustrate a list of verification properties. This chapter provides the research context and lays the foundation for the modelling and analysis work described in the next chapters.

2.1 Web Applications

Web applications enable much of today’s online business, including banking, social networking and governmental activities, to thrive. As a result of the rapid development of new programming models and technologies, web applications are evolving continuously. Such rapid change brings new challenges [Alpuente et al., 2010, Armando et al., 2010, Conallen, 1999, Di Sciascio et al., 2003].

Securing web applications is a challenging task. Security is a continuous process of identifying and analysing potential threats [MSDN, 2011].

Furthermore, new security challenges emerge due to the increasing amount of application code being moved to the client’s side. With larger amounts of code exposed to the user come greater risks of vulnerability. Attackers are able to gain knowledge of the code and are therefore more likely to compromise the server-side application state. The data protected by web applications is security sensitive in most cases, including credit card details and personal information, and is typically of significant value to both users and service providers. Emerging types of attacks, such as the HTTP parameter pollution attack, place a wider range of web applications at risk [Balduzzi et al., 2011]. As a result, major companies offer rewards for finding vulnerabilities within their web applications [Google, 2015].

This inherent complexity poses challenges to the modelling, analysis and verification of this type of application. Some of these challenges are summarised below [Alalfi et al., 2009, Li and Xue, 2014]:

• The complex nature of the web application environment may lead to integration difficulties with other diverse hardware and software platforms. The analysis of many components could make the verification extremely difficult.

• The dynamic behaviour, such as the dynamic interaction between clients and servers, and the continual changes in the system’s context and web technologies can be another major challenge.


• Web applications may have several entry points, allowing interaction with the system in a way that cannot be predicted (due to design errors) and that cannot be blocked by the web application.

• Another challenge is the efficient monitoring and tracking of the outputs of web applications. The change of states between different components is often difficult to analyse.

Early websites contained only a collection of documents with static content, encoded in the HyperText Markup Language (HTML). Since then, web applications have evolved from static hypermedia to complex and dynamic infrastructures. In addition, development technologies have shifted the focus of web applications from information delivery alone to include application execution [Casteleyn et al., 2009].

New technologies have been developed to enable web applications to change from simple static HTML pages to dynamic web pages that are able to interact with other systems [Casteleyn et al., 2009, Conallen, 1999]. Web pages and various elements of web applications are stored on the server. Users primarily interact with the browser; the request from the client’s side is sent to the web server and in turn to the database management system. Servers respond to the user’s request and carry out data processing to complete the transaction. The processed results are then returned to the user via the web browser.

Web applications are commonly designed as a three-tiered architecture (shown in Figure 2.1) consisting of the following components:

Web browser is the software application that serves as a user interface for presenting information.


Web application server manages the dynamic flow-control of the web application. The web application server receives user input via the web browser and results from the database server. The code is constructed dynamically and the challenge arises when checking or modifying the incoming data before processing it or when passing it to the lower tiers for execution. Failure in this process can lead to compromising the security of the web application.

Database server provides data management and persistence functionality.

Figure 2.1: Overview of Web Applications [Li and Xue, 2014]

Accordingly, the features that differentiate web applications from traditional software and information systems can be summarised as follows [Casteleyn et al., 2009, Fraternali, 1999]:

• Accessibility: Users with different levels of computing skills and with different needs are able to access web applications.

• Data management: The data in web applications is distributed in different formats and using various technologies.

• Architecture complexity: Web application accessibility requires distributed, multi-tier architectures to access the full range of information and services.


2.1.1 Security Threats for Web Applications

Web applications are built on complex systems consisting of various components and technologies. The current web application development and testing frameworks offer limited support for security validation. Web application development is an error-prone process, and the implementation of security metrics requires substantial effort [Alalfi et al., 2009]. Security relies on the following attributes [MSDN, 2011]:

• Authentication: The process of knowing who is accessing the information on the server. All principals of a communication need to prove their identities in order to gain access.

• Authorization: The process of controlling the information and actions that an authenticated principal is permitted to access.

• Auditing: Effective auditing prevents clients from denying their transactions.

• Integrity: Ensuring that transmitted data is protected from accidental or deliberate malicious modification.

• Confidentiality: Ensuring that the data remains private and confidential from unauthorized users or eavesdroppers who monitor the flow of traffic across a network.

• Availability: Ensuring that systems remain available for legitimate users. Denial of service attacks cause the system to crash so that other users cannot gain access.

A large number of web applications deployed on the Internet are open to security vulnerabilities. According to a report by [Cenzic, 2014], 96% of web applications tested in 2013 had vulnerabilities categorised as “high risk”. In addition, an average of 14 vulnerabilities per web application was estimated in 2013. A report by [Hoff, 2013] showed that in 2012 alone there were more than 800 reported hacking incidents, 70% of which were carried out by exploiting web application vulnerabilities.

Figure 2.2 shows the percentage of web applications affected by each of the common types of vulnerability. The variety of types makes a universal solution difficult to find, as each type requires a different fix. The three main categories of threats were session management vulnerabilities (found in 79% of web applications in 2013), Cross-Site Scripting (XSS) vulnerabilities (60%) and authentication and authorisation vulnerabilities (56%).

Figure 2.2: Percentage of Common Vulnerability Types in Web Applications [Cenzic, 2014]

The development of web applications requires careful consideration with a focus on security and navigation correctness. The use of model-based verification assists in capturing the system’s behaviour. Furthermore, model-based verification can simplify future analysis in order to improve or measure the quality of the system.

In addition, modelling plays an important role during the software development phase by formally defining the requirements and providing exhaustive detail. A central goal of model-based development is to enable an analysis of the system, thus ensuring quality at the model level. Certain properties of the system, such as deadlock freedom, timing consistency and the availability of memory resources, need to be considered prior to implementation [Engels et al., 2003].

Following traditional software verification, the use of forward engineering-based verification simplifies the development process and establishes a basis for later phases, such as verification. On the other hand, the use of reverse engineering methods to extract models from existing applications simplifies maintenance and evaluation. Forward-engineering verification employed in early development stages enables early error detection, reducing the cost and effort of rectifying errors in later development stages [Huth and Ryan, 2006].

Web application modelling is viewed from different perspectives based on the purpose of the verification.

In order to present and discuss the level of modelling and the scope of properties under this research, we will demonstrate and discuss primary web application properties and the attempts of both researchers and the industry to ensure correctness.

According to a survey carried out by [Alalfi et al., 2009], the modelling of web applications can be viewed from three perspectives: web navigation, web behaviour and web content. Web content properties are outside the scope of this research, since they mostly rely on checking the programming language and content components used. The other two perspectives, web navigation and web behaviour, are discussed, along with related work, in the following sections.


2.2 Web Navigation Properties

Navigation within a web application is key to ensuring both its security and usability [Kappel et al., 2006]. Web navigation properties are divided into three categories. First, static navigation properties address issues such as broken links, reachability (e.g., links to the home page), consistency of frame structure and cost of navigation (e.g., longest path analysis).

Second are dynamic navigation properties, whereby some links may lead to different web pages depending on the input. Input can be provided by either the user or the system. The action then depends on the server that uses information, such as session information, time or date, to apply access control and user privileges.

The third category of properties is interaction navigation analysis, which focuses on properties outside the control of the web application, such as user interaction with the web browser, e.g., the back and forward buttons. Table 2.1 lists examples of navigation properties.

Property  Description
d1        The page is reachable from the top page and always has a next page in the transition.
d2        Every page is reachable from the top page.
d3        The top page is reachable from all pages.
d4        Eventually a chosen-page is visited.
d5        The first page is the page and the next page is either the login-error-page or the home-page.
d6        Whenever the login-page is visited, the next page is either the login-error-page or the login-success-login-error-page.

Table 2.1: Navigation Properties [Stock et al., 2014].
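To make the static properties concrete, the following minimal Python sketch checks properties d2 and d3, together with broken links, on a navigation graph. The page names and the `site` graph are hypothetical and serve only as an illustration; this is not the modelling approach developed later in this thesis.

```python
from collections import deque

def reachable(graph, start):
    """Return the set of pages reachable from start by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

def broken_links(graph):
    """Return the links whose target is not a known page."""
    return sorted((src, dst) for src, targets in graph.items()
                  for dst in targets if dst not in graph)

def every_page_reachable_from(graph, top):
    """Property d2: every page is reachable from the top page."""
    return reachable(graph, top) >= set(graph)

def top_reachable_from_all(graph, top):
    """Property d3: the top page is reachable from all pages."""
    return all(top in reachable(graph, page) for page in graph)

# Hypothetical site: each page maps to the pages it links to.
site = {
    "top": ["login", "help"],
    "login": ["home", "login-error"],
    "login-error": ["login"],
    "home": ["top"],
    "help": [],  # dead end: no link back to the top page
}

print(every_page_reachable_from(site, "top"))  # d2 holds for this graph
print(top_reachable_from_all(site, "top"))     # d3 fails because of "help"
print(broken_links(site))
```

In this example d2 holds, while d3 is violated because the help page offers no path back to the top page.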

The work of [Homma et al., 2011] uses the Spin model checker [Holzmann, 2004] to model web application navigation properties. The authors propose a method that uses two finite state automata, with the first representing page transitions and the second modelling the internal state transitions of the web application.
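The idea of combining a page automaton with an internal-state automaton can be illustrated with a small Python sketch. It is our own simplified illustration under assumed page names, events and states, and it does not reproduce the method of Homma et al. The sketch enumerates the reachable (page, internal state) pairs and checks that the login page is always followed by either the home page or the login error page.

```python
from collections import deque

# Events that the user can trigger on each page (hypothetical example).
PAGE_EVENTS = {
    "login": ["valid_credentials", "invalid_credentials"],
    "login-error": ["retry"],
    "home": ["logout"],
}

# Internal-state automaton: pairs that are not listed leave the state unchanged.
STATE_TRANSITIONS = {
    ("anonymous", "valid_credentials"): "authenticated",
    ("authenticated", "logout"): "anonymous",
}

def next_page(page, state_after):
    """Page automaton: the target page depends on the internal state."""
    if page == "login":
        return "home" if state_after == "authenticated" else "login-error"
    return "login"  # both "retry" and "logout" lead back to the login page

def explore(initial=("login", "anonymous")):
    """Enumerate reachable (page, state) pairs and the transitions between them."""
    seen = {initial}
    edges = []
    queue = deque([initial])
    while queue:
        page, state = queue.popleft()
        for event in PAGE_EVENTS[page]:
            state_after = STATE_TRANSITIONS.get((state, event), state)
            target = (next_page(page, state_after), state_after)
            edges.append(((page, state), event, target))
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen, edges

def login_leads_to_home_or_error(edges):
    """Every transition leaving the login page ends on home or login-error."""
    return all(target[0] in {"home", "login-error"}
               for source, _event, target in edges if source[0] == "login")

states, edges = explore()
print(sorted(states))
print(login_leads_to_home_or_error(edges))
```

Because the page transition is computed from the internal state, the same link can lead to different pages, which is the essence of a dynamic navigation property.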

[Castelluccia et al., 2006, Di Sciascio et al., 2003] modelled web applications as a directed graph in which pages, links, windows and actions are represented as states. The implemented prototype system embeds a component that automatically imports the web application design from a UML tool; Computation Tree Logic (CTL) specifications are then added and translated into source code for the NuSMV model checker [Cimatti et al., 2000]. The main advantage of this method is the ability to perform a priori verification of the web application design by applying the verification process to the UML design of the web application in a single automated process using the verification tool WAver.

[Ricca and Tonella, 2000, 2001] propose a model of web applications using a UML class diagram. The model is used for reachability checking and semi-automatic test case generation.

[Han and Hofmeister, 2006] present an approach that uses state charts to formally model the adaptive navigation of web applications and checks for unreachable web pages. This model only focuses on user mode (e.g., whether the user is logged in or not) and page history (e.g., what pages the user has previously visited).

The work of [Haydar et al., 2005] proposes a way to discriminate states of interest by introducing a specialised operator for Linear Temporal Logic (LTL), which is used to verify web applications. This work focuses on the distinction of states rather than on the modelling of web applications.

In [Yuen et al., 2006], the authors propose a behavioural model of web applications, called Web Automata, which is based on the Model-View-Controller (MVC) architecture. They model the behaviour of a web application with dynamic content as an extension of links-automata with the constraint logic feature of Extended Finite Automata (EFA). They also present a testing framework for web applications based on the behavioural model.

In [Haydar et al., 2004] the authors present a formal approach for modelling web applications using communicating automata. They observe the external behaviour of an explored part of a web application using a monitoring tool. The observed behaviour is then converted into communicating automata representing all windows, frames and frame sets of the application being tested by intercepting HTTP requests and responses using a proxy server. Their model differs from the one proposed in this research, as it focuses on external behaviour.

The approaches described in this section use either a graph-based model or UML to represent the navigation properties of a web application. UML is considered the modelling standard for a wide range of applications and systems [Alalfi et al., 2009]. However, UML is not directly suitable for the verification of web applications, as UML models must first be translated into formal specifications. The alternative is to use graph-based modelling methods that can be translated directly into a form accepted by model checking tools. From the research listed in this section, we identify the need for a sound method that also includes the dynamic behaviour of web applications.

2.3 Web Security Properties

Since web applications are developed to be available across the Internet, their security is a major concern for developers and users [Huang and Lee, 2005, McClure et al., 2003, Tracy et al., 2002]. Web application developers review web application vulnerabilities regularly. The Open Web Application Security Project (OWASP) is concerned with web application security, and publishes a periodically updated list of the most critical security risks for web applications [OWASP, 2013], as well as general guidance for building and verifying web applications. In addition, technical reports are published by other organisations, such as Microsoft, which focus on developing secure web applications using the .NET framework [Microsoft, 2011]. The Web Application Security Consortium (WASC) published a report on security threat classifications [WASC, 2011], summarising the most common security threats to web applications.

The modelling of web behaviour properties is divided into two categories. The first category is security properties, which focus on access control and session control mechanisms. Security properties are related to navigation properties. For example, a wrongly designed navigation link could lead to unauthorised access to sensitive information in the web application. The second category is instruction processing properties; this type of modelling addresses issues related to execution and state changes at both ends, without communication between them.

Predicting the kinds of attacks that could affect the security of a web application is challenging when observing the diversity of these attacks. However, by modelling specific web application properties, it is possible to model the cause and effects of the attacks [Corin et al., 2003].

According to a survey by [Li and Xue, 2014], the three primary security aspects that should be considered to achieve an acceptable level of security are:

• User inputs are potentially dangerous and cannot be trusted in an open environment. Thus, input validation is an essential aspect of web application security to detect untrusted user inputs. Due to the unique features of web applications in contrast with other applications, input validation is a challenging task.

• It is equally important to employ session management to correlate web requests from the same user into one web session during a certain period of time. Communication between a user and a server is carried out through HTTP, which is a stateless protocol. As a result, without session management, multiple inputs from the same user are processed as independent requests, as if they originated from multiple users of the web application. The session variables can be stored either at the client side (via cookies) or at the server side (using files or databases). In the latter case, a unique session identifier (session ID) is issued to the client and used to index the session variables stored at the server side.

• Additionally, the implementation of control flow between the user and server must be accurate to protect sensitive information. This can be achieved explicitly through source code security checks or implicitly through the navigation paths presented to users. Security checks examine the state of a web application by relying on session variables and persistent objects in the database before revealing sensitive information to the user. Authentication and authorisation are the most common mechanisms for control flow in data-sensitive web applications, enabling an application to protect its sensitive information and privileged operations from unauthorised users.

In this research, the focus is on the three primary aspects of session management, authentication properties and control flow properties. The remainder of this chapter describes and discusses the aforementioned security aspects and the most common attacks that can exploit vulnerabilities.


2.3.1 Session Management

Web application session management is essential to track and record user input and to maintain accurate application states. Session management is accomplished through collaboration between the client and the server. The simplest approach is for the server to send a unique identifier (i.e. session ID) to the client.

Since the session ID is the sole proof of the client’s identity, its confidentiality, authenticity and integrity need to be secured in order to avoid session hijacking. First, a session ID should be randomly generated for each user’s visit and should expire after a short period of inactivity. Second, transmissions between the parties should be protected by a secure transport layer protocol (e.g., SSL/TLS), to ensure that attackers are unable to capture the session ID and eventually control the session. Finally, the user should make sure that the session ID in use is the one provided by the server, and should not adopt a session ID from an external source, since attackers can set a session ID to a value that is known to them.

A web session is a sequence of HTTP request and response transactions associated with the same user. Complex web applications require the retention of information or status about each user for the duration of multiple requests [Stock et al., 2014]. Therefore, sessions provide the ability to establish variables, such as access rights and localisation settings, which will apply to each interaction between the user and the web application for the duration of the session.

Web applications can create sessions to keep track of anonymous users after the very first user request, for example to save the user’s language preference. In addition, web applications make use of sessions once the user has been authenticated. This makes it possible to associate the user with any subsequent requests, to apply security access controls, to authorise access to the user’s private data and to enhance the usability of the application.

Table 2.2 lists examples of the session management properties:

b1  Verify that sessions are invalidated when the user logs out.
b2  Verify that sessions time out after a specified period of inactivity.
b3  Verify that the application does not permit duplicate concurrent user sessions, originating from different machines.
b4  Verify that sessions time out after an administratively-configurable maximum time period regardless of activity (an absolute time-out).

Table 2.2: Session Management Properties [Stock et al., 2014].
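As an illustration of properties b1, b2 and b4, the following Python sketch implements a minimal server-side session store. The class `SessionStore`, its method names and the use of integer time stamps passed in by the caller (a discrete notion of time) are assumptions made for this example only; it is a sketch, not a production-ready implementation.

```python
import secrets

class SessionStore:
    """Hypothetical server-side session store with two kinds of time-out."""

    def __init__(self, idle_timeout, absolute_timeout):
        self.idle_timeout = idle_timeout
        self.absolute_timeout = absolute_timeout
        self._sessions = {}  # session ID -> (user, created_at, last_seen)

    def create(self, user, now):
        """Issue a fresh, randomly generated session ID for the user."""
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = (user, now, now)
        return session_id

    def validate(self, session_id, now):
        """Return the user for a live session, or None if it is unknown or expired."""
        record = self._sessions.get(session_id)
        if record is None:
            return None
        user, created_at, last_seen = record
        idle_expired = now - last_seen > self.idle_timeout          # b2
        absolute_expired = now - created_at > self.absolute_timeout  # b4
        if idle_expired or absolute_expired:
            del self._sessions[session_id]
            return None
        self._sessions[session_id] = (user, created_at, now)
        return user

    def logout(self, session_id):
        """Invalidate the session when the user logs out (b1)."""
        self._sessions.pop(session_id, None)

store = SessionStore(idle_timeout=5, absolute_timeout=20)
sid = store.create("alice", now=0)
print(store.validate(sid, now=4))   # still active
print(store.validate(sid, now=10))  # idle for 6 time units: expired
```

Passing the current time explicitly keeps the sketch deterministic and mirrors the way time advances in discrete steps in a model.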

2.3.2 Authentication

Authentication is the process of verifying that an individual or entity is who they claim to be. Authentication is commonly performed by submitting a user name or ID and one or more items of private information that only a given user should know [Stock et al., 2014]. A session record is then created and a cookie is set, which the browser will send with each subsequent request to the application. The application then shows data related to the authenticated user (e.g., shopping cart content, posts, and stored files) during their use of the application. Table 2.3 lists the most important web application authentication properties.


a1  Verify that all pages and resources require authentication except those specifically intended to be public.
a2  Verify that all authentication controls are enforced on the server side.
a3  Verify that re-authentication is required before any application-specific sensitive operations are permitted as per the risk profile of the application.
a4  Verify that a failure of the authentication controls ensures that attackers cannot log in.

Table 2.3: Authentication Properties [Stock et al., 2014].

2.3.3 Control-Flow

Each web application maintains its own application control flow (also known as business logic). Ensuring the correctness of the control flow is key to a secure web application, and this mainly depends on the intended functionality of the application. The main logic property is that users can only access authorised information and perform operations allowed by the intended workflow of the web application [Li and Xue, 2014].

Web developers use two main mechanisms in an attempt to prevent control flow vulnerabilities. First, the interface-hiding mechanism, which relies on the principle of security through obscurity, has been widely used as an access control mechanism in web applications. However, this mechanism alone is not sufficient to ensure the control flow of web applications, as attackers can simply expose hidden links to access unauthorised information and operations. Second, developers may manually add explicit security checks prior to all sensitive operations. However, it is difficult to anticipate and check all possible execution paths that may lead to a security vulnerability, and a security check is likely to be missing on certain paths, exposing vulnerabilities to attackers.

As discussed above, control flow vulnerabilities depend on the intended purpose of a web application. For example, an online banking website may have a vulnerability that allows attackers to bypass vital security pages or steps. The 2013 OWASP report on the top ten security risks for web applications [Stock et al., 2014] includes risks that can be attributed to application logic vulnerabilities (e.g., missing function-level access control and unvalidated redirects and forwards). Table 2.4 lists control-flow verification properties.

c1  Verify that the application does not allow spoofed high value transactions, such as allowing Attacker User A to process a transaction as Victim User B by tampering with or replaying a session, transaction state, transaction or user IDs.
c2  Verify that the application is protected against information disclosure attacks, such as direct object reference, tampering, session brute force.
c3  Verify that the application will only process business logic flows in sequential step order, with all steps being processed in realistic human time, and restricting out of order, skip steps, process steps from another user, or submitting transactions too quickly.
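Property c3 can be illustrated with a small Python sketch of a server-side workflow guard that accepts a step only if it is the next expected step and is not submitted too quickly. The checkout steps, the class `Workflow` and the `min_delay` parameter are hypothetical names chosen for this example.

```python
CHECKOUT_STEPS = ["cart", "shipping", "payment", "confirm"]  # hypothetical flow

class Workflow:
    """Accept business-logic steps only in sequential order and at human speed."""

    def __init__(self, steps, min_delay):
        self.steps = steps
        self.min_delay = min_delay  # minimum time units between two accepted steps
        self.position = 0           # index of the next expected step
        self.last_time = None       # time at which the last step was accepted

    def submit(self, step, now):
        """Return True and advance if the step is accepted, otherwise False."""
        finished = self.position >= len(self.steps)
        if finished or step != self.steps[self.position]:
            return False  # out-of-order, skipped or repeated step
        if self.last_time is not None and now - self.last_time < self.min_delay:
            return False  # submitted too quickly
        self.position += 1
        self.last_time = now
        return True

flow = Workflow(CHECKOUT_STEPS, min_delay=2)
print(flow.submit("cart", now=0))      # accepted
print(flow.submit("payment", now=1))   # rejected: the shipping step was skipped
print(flow.submit("shipping", now=1))  # rejected: submitted too quickly
print(flow.submit("shipping", now=3))  # accepted
```

A rejected request leaves the workflow state unchanged, so an attacker cannot advance the flow by replaying or reordering steps.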


2.4 Model Checking

Model checking is based on a collection of techniques for the automatic analysis of a system. A formal definition of model checking is given by [Baier et al., 2008]:

“Model checking is an automated technique that, given a finite state model of a system and a formal property, systematically checks whether this property holds for (a given state in) that model.”

The model checker tool takes as input the description of the system and the properties of that system. The system, in most cases, is defined as a finite state system, and its properties are expressed as temporal logic formulas. The model checker verifies whether the properties hold or not. In most cases, if a property does not hold, the model checker provides a counterexample.

In practice, the model of the system being analysed is approximate, thus the results are limited. Errors in the model may still remain after the verification. When applying model checking to a system’s design, three main phases may be identified, as described in [Baier et al., 2008, Clarke, 2008, Clarke et al., 1999]:

• Modelling Phase. The system is modelled in a language accepted by the model checker, the model is then explored through simulation, and finally the property to be checked is formalised in the property specification language.

• Running Phase. The model checker is run to check whether the specified properties hold in the model.

• Analysis Phase. The results are analysed to determine whether the specified properties are satisfied or not. Depending on the result, the model is refined, the properties are re-designed and the process is repeated.

Figure 2.3: Model Checking Process.

Figure 2.3 gives an overview of the model checking approach. The requirements of the system under consideration are first identified and these requirements are then formalised in a property specification language. The system is then modelled in a language acceptable to the model checker. A combination of the model and the properties of the model are fed into the model checker. The model checker outputs the results as ‘satisfied’ if no property is ‘violated,’ or ‘violated’ if a property fails. To build any model for verification purposes, there are guidelines to be followed, in order to correctly model the system under consideration.

The model checking problem can be stated as follows [Clarke, 2008]:

Let M be a Kripke structure [Kripke, 2007], that is, a state-transition graph that represents the behaviour of a system. Let f be a formula of temporal logic (i.e. the specification). Find all states s of M such that M, s |= f (see Figure 2.3). The term model refers to whether the structure M is a model for the formula f; it does not refer to an abstract model of the system under study.
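The problem statement can be illustrated with a minimal Python sketch for one particular kind of formula: f states that a state labelled with a given proposition can be reached (written EF p in the CTL notation of Section 2.5). The Kripke structure below, with states s0 to s3 and page names as labels, is a hypothetical example, and the backward fixpoint computation is a textbook technique rather than the algorithm of any specific tool.

```python
def states_that_can_reach(transitions, labels, proposition):
    """Return all states from which some path reaches a state labelled
    with the proposition, computed as a backward fixpoint."""
    satisfied = {s for s, props in labels.items() if proposition in props}
    changed = True
    while changed:
        changed = False
        for state, successors in transitions.items():
            if state not in satisfied and any(t in satisfied for t in successors):
                satisfied.add(state)
                changed = True
    return satisfied

# Hypothetical Kripke structure: transition relation and labelling function.
transitions = {"s0": ["s1", "s2"], "s1": ["s1"], "s2": ["s3"], "s3": ["s3"]}
labels = {"s0": {"login"}, "s1": {"error"}, "s2": {"home"}, "s3": {"account"}}

print(sorted(states_that_can_reach(transitions, labels, "account")))
```

The result is the set of all states s with M, s |= f; state s1 is excluded because its only transition loops back to itself.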

Some of the main advantages of model checking over other verification techniques, such as automated theorem proving or proof checking [Clarke, 2008], are:


• The checking process is automatic. Users need to enter a formal description of the system model and its specifications and the model checking tool will produce a result.

• Model checking is faster than traditional verification techniques (i.e. testing), and therefore it saves time and expense.

• The model checker produces a counterexample if the specification is not satisfied. Counterexamples are important to show the reason why the specification does not hold; this assists in the debugging of complex systems.

• It can evaluate partial specifications. In complex systems, model checking can be used during the design phase.

• Temporal logic can express essential properties for reasoning about concurrent systems.
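The role of counterexamples can be illustrated with a small Python sketch of an exhaustive breadth-first search that returns a shortest path to a state violating a safety property. The web application model (pages top, login and account, and a hidden link to the account page) is hypothetical, and the search is a generic explicit-state technique rather than the algorithm implemented in any particular model checker.

```python
from collections import deque

def find_counterexample(initial, successors, is_bad):
    """Explore all reachable states; return a shortest path to a bad state, or None."""
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if is_bad(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

def successors(state, link_is_guarded):
    """Hypothetical model: a state is a pair (page, logged_in)."""
    page, logged_in = state
    if page == "top":
        yield ("login", logged_in)
        if logged_in or not link_is_guarded:
            yield ("account", logged_in)  # hidden link to the account page
    elif page == "login":
        yield ("top", True)        # successful login
        yield ("top", logged_in)   # login cancelled
    elif page == "account":
        yield ("top", logged_in)

def unauthorised(state):
    """Safety violation: the account page is shown to an anonymous user."""
    page, logged_in = state
    return page == "account" and not logged_in

start = ("top", False)
print(find_counterexample(start, lambda s: successors(s, False), unauthorised))
print(find_counterexample(start, lambda s: successors(s, True), unauthorised))
```

For the unguarded design the search returns a trace leading directly from the top page to the account page, whereas for the guarded design it explores every reachable state and finds no violation.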

In contrast, one of the main disadvantages of model checking is the state explosion problem, since model checking searches the state space of a system, which may increase exponentially with the system’s description size.

This problem arises because the many possible interleavings of interactions between processes cause the state space to grow exponentially with the number of processes. If such behaviour is inherent to the system, one way out is to use bit-state hashing, which implies only partial checking of the state space. This is based on the idea that the presence of errors is easier to detect than their absence [Bosnacki, 1998, Clarke, 2008]. Partial-order reduction, which is used by the Spin model checker, is one of the most effective solutions to this problem [Clarke, 2008]. Partial-order reduction means that instead of generating all possible interleaving execution paths in the state space, it is possible to generate only representatives of certain classes of execution paths. As [Holzmann, 1997] describes, the reduction is based on the observation that the validity of an LTL formula is often insensitive to the order in which concurrent or independently executed events are interleaved in the depth-first search.

This makes it possible to record state changes and in this way to ascertain that two different paths of execution are equivalent, which in turn enables the verifier to discard the other paths, as they would not contribute anything new to the verification. In contrast, state compression means simply compressing the state data, which naturally incurs runtime overhead. Both partial-order reduction and state compression are guaranteed not to make reachable system states unreachable [Holzmann, 1997].
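The benefit of exploring only representative paths can be quantified with a short Python sketch. It assumes fully independent processes, each executing a fixed number of sequential steps, and counts both the interleaved execution paths and the distinct global states. This is a counting illustration only and not the reduction algorithm used by Spin.

```python
from math import factorial

def interleavings(steps_per_process):
    """Number of ways to interleave independent processes (multinomial coefficient)."""
    total = factorial(sum(steps_per_process))
    for steps in steps_per_process:
        total //= factorial(steps)
    return total

def global_states(steps_per_process):
    """Number of distinct global states: each process is at one of steps + 1 points."""
    count = 1
    for steps in steps_per_process:
        count *= steps + 1
    return count

for processes in ([1, 1], [2, 2, 2], [3, 3, 3, 3]):
    print(processes, interleavings(processes), global_states(processes))
```

For three processes with two steps each, there are 90 interleaved execution paths over only 27 global states. When the steps are independent, all of these paths end in the same final state, so one representative path is sufficient.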

2.4.1 Model Checking Tools

Model checking tools are being developed continuously in both industry and the research community. Tools such as ProVerif [Blanchet, 2001], Scyther [Cremers, 2008], NuSMV [Cimatti et al., 2000] and Tamarin [Meier et al., 2013] are examples of model checking tools used for the verification of security protocols and can provide simulation and verification of their properties. Each of the tools has a different level of ability to model specific properties of the system under verification. Moreover, tools such as ProVerif provide support for modelling intruders and cryptographic primitives.

The Spin model checker [Holzmann, 2004] is the primary tool used in this research; it was chosen for its simplicity and high degree of expressiveness. In contrast to other model checkers, Spin provides insights into the first stages of modelling through its simulation charts. Since Spin does not support the modelling of time, we integrate discrete time into the models.


The second model checker used in our research is Uppaal [Amnell et al., 2001]. Uppaal is used for the verification of real-time systems. The graphical editor of Uppaal makes it easy to design a system model, along with a graphical simulator that shows the possible dynamic behaviour of a system description.

2.5 Temporal Logic

Temporal logics are used to describe sequences of events in time without referring to time explicitly. Temporal logics were originally developed by philosophers and linguists to investigate how time is used in natural language arguments [Clarke, 2008, Hughes and Cresswell, 1996]. This work uses two model checking tools: the Spin model checker uses LTL, while the Uppaal model checker uses CTL for verification [Burstall, 1974, Kröger, 1977, Pnueli, 1981].

The use of temporal logics for reasoning about systems was proposed by [Burstall, 1974, Kröger, 1977, Pnueli, 1981]. The two most widely used temporal logics are Linear Temporal Logic (LTL) and Computation Tree Logic (CTL). LTL considers every moment in time as having a unique possible future; events are checked along a single linear timeline. In contrast, CTL considers each moment in time as having several possible futures. Thus, CTL views the structure of time as a tree, rooted in the current time with any number of branching paths from each node of the tree. LTL and CTL share a common subset of properties, but neither subsumes the other: properties exist that are expressible in LTL but cannot be expressed in CTL, and vice versa. CTL* is another temporal logic that contains both LTL and CTL.


Safety Property: Describes a behaviour that must not occur on any path (“something bad never happens”). In order to verify a safety property, all execution paths need to be exhaustively checked.

Invariance Property: Describes a behaviour that must hold on all paths.

Liveness Property: States that “something good eventually happens”, i.e. a certain state will eventually be reached on every path of the system.

2.5.1 Linear Temporal Logic (LTL)

Linear Temporal Logic (LTL) [Fisher, 2011, Holzmann, 1997, Venema, 2001] reasons over linear traces through time. At each time instant, there is only one real future state that will occur. Traditionally, that timeline is defined as starting “now,” in the current time step, and progressing infinitely into the future.

LTL formulas are composed of a finite set Prop of atomic proposition variables (denoted by letters p, q, ...), the Boolean connectives ¬, ∧, ∨, → and the temporal connectives U (until), X (next time), G (globally, also written □) and F (eventually, also written ♦). Intuitively, X ϕ states that ϕ is true in the next time step after the current one. ϕ U ψ states that either ψ is true now, or ϕ is true now and remains true until such a time when ψ holds. Finally, □ϕ means that ϕ is true in every time step, while ♦ϕ means that ϕ must be true either now or at some future time step. Table 2.5 lists the operators used in LTL formulas and their equivalents in the Spin model checker:

Operator                      Math or Logic   Spin
not                           ¬               !
and                           ∧               &&
or                            ∨               ||
implies                       →               ->
equivalent                    ↔               <->
until                         U               U
always or globally            □ or G          []
eventually or in the future   ♦ or F          <>
next                          X               X

Table 2.5: LTL formula operators with their mathematical and Spin notation

Formally, an LTL formula ϕ has the following syntax, where p is an atomic proposition from the set Prop:

ϕ ::= p | ¬ϕ | ϕ ∧ ϕ | ϕ ∨ ϕ | Gϕ | Fϕ | ϕ U ϕ | Xϕ

For example, we verify the navigation property “the home page is always followed by an account page” in a web application model. We use the LTL formula (2.1), where p is defined in the Spin model as the home page and q as the account page, [] means always and <> means eventually.

[](p -> <> q)    (2.1)
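The meaning of formula (2.1) can be made concrete with a small evaluator. The following Python sketch is only an illustration and is not how Spin works internally: it assumes that an infinite execution is given as a lasso, that is, a finite list of steps whose last step jumps back to position loop_start, and it encodes formulas as nested tuples. The encoding, the function names and the two example traces are all assumptions made for this example.

```python
def holds(formula, trace, loop_start):
    """Return the set of positions of a lasso trace at which `formula` holds.

    `trace` is a list of sets of atomic propositions; after the last
    position the execution continues at index `loop_start`.
    """
    n = len(trace)
    succ = [i + 1 if i + 1 < n else loop_start for i in range(n)]
    everything = set(range(n))

    def sat(f):
        op = f[0]
        if op == "true":
            return set(everything)
        if op == "ap":
            return {i for i in range(n) if f[1] in trace[i]}
        if op == "not":
            return everything - sat(f[1])
        if op == "and":
            return sat(f[1]) & sat(f[2])
        if op == "or":
            return sat(f[1]) | sat(f[2])
        if op == "implies":
            return (everything - sat(f[1])) | sat(f[2])
        if op == "X":
            inner = sat(f[1])
            return {i for i in range(n) if succ[i] in inner}
        if op == "U":
            # Least fixpoint: psi holds now, or phi holds now and
            # (phi U psi) holds at the next position.
            left, result = sat(f[1]), set(sat(f[2]))
            while True:
                new = {i for i in left if succ[i] in result} - result
                if not new:
                    return result
                result |= new
        if op == "F":
            return sat(("U", ("true",), f[1]))
        if op == "G":
            # G phi is equivalent to not F not phi.
            return everything - sat(("F", ("not", f[1])))
        raise ValueError("unknown operator: %r" % (op,))

    return sat(formula)


def satisfies(formula, trace, loop_start):
    """True when the formula holds at the first position of the trace."""
    return 0 in holds(formula, trace, loop_start)


# Formula (2.1): [](p -> <> q)
RESPONSE = ("G", ("implies", ("ap", "p"), ("F", ("ap", "q"))))

# p, then nothing, then q, repeated forever: every p is followed by q.
good_trace = [{"p"}, set(), {"q"}]
# p, q, p, then an idle step forever: the second p is never followed by q.
bad_trace = [{"p"}, {"q"}, {"p"}, set()]

print(satisfies(RESPONSE, good_trace, 0))
print(satisfies(RESPONSE, bad_trace, 3))
```

The second trace shows why the property is stronger than “q eventually occurs”: q does occur once, but the later visit to the home page is never answered by an account page.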

More properties will be expressed in LTL later in Chapter 3.

2.5.2 Computation Tree Logic (CTL)

Computation Tree Logic (CTL) [Fisher, 2011] is a branching-time logic that reasons over many possible traces through time. Unlike LTL, in which every time instant has exactly one immediate successor, in CTL every time instant has a finite, non-zero number of immediate successors. CTL was the first logic to be used in model checking [Clarke and Emerson, 1982]. The CTL branching timeline starts in the current time step and may progress to any one of potentially many possible infinite futures. Consequently, CTL operators must reason not only along a timeline but also over all possible branches. CTL temporal operators are all two-part operators: one part specifies, as in LTL, what is to occur along a future timeline, and the other specifies whether this takes place on at least one branch or on all branches. The path operators are:

• A: On all future paths, starting from the current state.

• E: On some future path, starting from the current state.

The second model checker used in this research, Uppaal, uses a simplified subset of CTL. The web application properties verified with Uppaal are therefore expressed in CTL.
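The effect of the two path operators can be seen on a small branching model. The following Python sketch computes, for a hypothetical navigation graph, the sets of states satisfying EX, AX, EF and AF by the usual fixpoint iteration. The graph and the function names are assumptions made for this example; the sketch also assumes that every state has at least one successor, and it does not reflect how Uppaal evaluates its queries.

```python
# Hypothetical navigation graph: from the home page the user may go to the
# login page or to the about page, and may stay on the about page forever.
TRANSITIONS = {
    "home": ["login", "about"],
    "login": ["account"],
    "about": ["about"],
    "account": ["home"],
}


def ex(trans, target):
    """EX: states with at least one successor in `target`."""
    return {s for s, nxt in trans.items() if any(t in target for t in nxt)}


def ax(trans, target):
    """AX: states whose successors all lie in `target`."""
    return {s for s, nxt in trans.items() if all(t in target for t in nxt)}


def ef(trans, target):
    """EF: states from which some path eventually reaches `target`."""
    result = set(target)
    while True:
        new = ex(trans, result) - result
        if not new:
            return result
        result |= new


def af(trans, target):
    """AF: states from which every path eventually reaches `target`."""
    result = set(target)
    while True:
        new = ax(trans, result) - result
        if not new:
            return result
        result |= new


print(sorted(ex(TRANSITIONS, {"login"})))    # some next page is login
print(sorted(ax(TRANSITIONS, {"login"})))    # every next page is login
print(sorted(ef(TRANSITIONS, {"account"})))  # account reachable on some path
print(sorted(af(TRANSITIONS, {"account"})))  # account reached on every path
```

In this model the account page is reachable from the home page on some path (EF) but not on all paths (AF), because the user may remain on the about page indefinitely.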

To understand the expressiveness of LTL and CTL in verifying web application properties, we consider the differences between the logics. [Vardi, 2001] states that the linear and branching time logics correspond to two distinct views of time, and therefore LTL and CTL are expressively incomparable. In general, CTL allows explicit existential quantification over paths, which makes it more expressive in cases where there is a need to reason about the possible existence of a specific path through the transition system model M. This includes cases where M is best modelled as a computation tree, such as a single page from which several pages can be reached dynamically. For example, there is no LTL equivalent of the CTL formula EX p, since LTL cannot express the possibility of p occurring on some path but not on all paths at the next time step or in the future. Moreover, it is impossible to express in LTL scenarios where distinct behaviour occurs on distinct branches at the same time.


On the other hand, it is difficult to use CTL in situations where the same behaviour may occur on distinct branches at distinct times; here the ability of LTL to describe individual paths is important and useful. [Vardi, 2001] explains that the former situation, in which existential quantification over paths is needed, rarely arises, and that LTL is more expressive than CTL in other ways.

A further comparison between the logics will be discussed in Chapter 5.

2.5.3 Temporal Logic Patterns

Temporal logic formalisms are commonly used to describe states and event sequences in systems. Defining temporal formulas can be a straightforward process if the property is small. However, when there is a need to verify complex properties, it is advisable to use the formula patterns described in [Dwyer et al., 1999, Salamah et al., 2005]. Table 2.6 shows the temporal patterns that were collected from research studies:

Pattern             Description
Absence             A given state q does not occur within a scope.
Existence           A state q must occur within a scope.
Bounded Existence   A given state q must occur a number of times in a scope.
Universality        A state q must occur throughout the scope.
Precedence          A given state p must always be preceded by a state q within a scope.
Response            A given state p must always be followed by a given state q within a scope.


[Corbett et al., 2000] developed pattern scopes, which restrict a pattern to a specified region of the system's execution. There are five basic scopes in which a pattern can hold:

• Globally: A given state must hold throughout the system’s execution.

• After: A given state p must occur after the first occurrence of a state q.

• Before: A given state p must occur before the first occurrence of a state q.

• Between: A given state p must occur between a pair of designated states.

• After and until: The state must occur after one state until the next occurrence of another state.

The use of the temporal logic patterns assists in reducing the complexity of using temporal logic formulas. We use the patterns in verifying properties in Spin in Chapter 3.
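As a sketch of how the patterns in Table 2.6 translate into formulas, the following Python fragment instantiates five of them in the Globally scope using Spin's LTL notation from Table 2.5. The templates are the commonly cited LTL mappings for these patterns and are stated here as an assumption rather than as the formulas used in Chapter 3; Bounded Existence and the other four scopes are omitted because their formulas are considerably longer. The proposition names in the usage lines are hypothetical.

```python
# Templates for the Globally scope, written in Spin's LTL syntax.
# Following Table 2.6: for Precedence, q must precede p;
# for Response, p must be followed by q.
GLOBAL_SCOPE_PATTERNS = {
    "absence": "[] (!{q})",
    "existence": "<> ({q})",
    "universality": "[] ({q})",
    "precedence": "(<> {p}) -> (!{p} U {q})",
    "response": "[] ({p} -> <> {q})",
}


def instantiate(pattern, **props):
    """Fill a pattern template with proposition names, e.g. p='home'."""
    return GLOBAL_SCOPE_PATTERNS[pattern].format(**props)


# Hypothetical propositions for a web application model.
print(instantiate("response", p="home", q="account"))
print(instantiate("absence", q="error_page"))
print(instantiate("precedence", p="account", q="login"))
```

The Response instance is formula (2.1) again. The Precedence template reads: if p ever occurs, then p does not occur until q has occurred.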

2.6 Timed Automata Theory

The second model checker used in this research, Uppaal, is based on the theory of timed automata, so the theory is introduced in this section.

Timed automata theory [Alur and Dill, 1994, Bengtsson and Yi, 2004, Kourkouli and Hassapis, 2005] is a formal framework for modelling and analysing the behaviour of real-time systems. Real-time systems are described as systems that must fulfil time constraints, such as deadlines, response time, communication delays and execution. In secure web applications, time constraints are crucial to deliver security to both the user and the web application. Timed automata are finite-state directed