Classification, Formalization and Automatic Verification of Untraceability in RFID Protocols

(1)

CLASSIFICATION, FORMALIZATION AND AUTOMATIC VERIFICATION OF UNTRACEABILITY IN RFID PROTOCOLS

ALI KHADEM MOHTARAM

DÉPARTEMENT DE GÉNIE INFORMATIQUE ET GÉNIE LOGICIEL ´

ECOLE POLYTECHNIQUE DE MONTR´EAL

M ÉMOIRE PRÉSENTÉ EN VUE DE L’OBTENTION DU DIPL ÔME DE MAÎTRISE ÈS SCIENCES APPLIQUÉES

(G ´ENIE INFORMATIQUE) AVRIL 2013

c

(2)

´

ECOLE POLYTECHNIQUE DE MONTR´EAL

Ce m´emoire intitul´e :

CLASSIFICATION, FORMALIZATION AND AUTOMATIC VERIFICATION OF UNTRACEABILITY IN RFID PROTOCOLS

pr´esent´e par : KHADEM MOHTARAM Ali

en vue de l’obtention du diplôme de : Maˆıtrise ès Sciences Appliquées a été dûment accepté par le jury d’examen constitué de :

M. ANTONIOL Guiliano, Ph.D., pr´esident

M. MULLINS John, Ph.D., membre et directeur de recherche

M. QUINTERO Alejandro, Doct., membre et codirecteur de recherche M. ADAMS Bram, Ph.D., membre

(3)

(4)

ACKNOWLEDGEMENTS

First, and foremost, I would like to express my thanks and gratitude to my supervisor John Mullins for welcoming me to his research laboratory. John has dedicated a considerable amount of time and energy to the supervision of my research and his guidance, patience and support throughout this research and the writing of this dissertation was invaluable. I look forward to our future collaboration and continued friendship.

My thanks also aimed at my co-supervisor, Alejandro Quintero, president of the Jury, Guiliano Antoniol, as well as jury member, Bram Adams, for accepting evaluation of my dissertation.

My father always valued education and taught me hard work and perseverance. I would like to extend my thanks to my family ; My parents, Dara and Parivash, and my sister, Mojan, for their love and moral support in everything I have chosen to do ; Above all, my wife, Hoda, who showed me unquestioning faith in my abilities, unfailing support and unconditional love. They have been very supportive and concerned all along. I really missed having them by my side.

Finally I would like to thank my cousin Nima, for a great friendship and numerous hours spent together in Montreal, and also all the people, who their names are not mentioned and helped me with various aspects of conducting research and the writing of this dissertation.

(5)

R´ESUM´E

Les protocoles sécurité RFID sont des sous-ensembles des protocoles cryptographiques mais avec des fonctions cryptographiques légères. Leur objectif principal est l’identification `

a l’égard de certaines propriétés de intimité comme la non-tra¸cabilité et la confidentialité de l’avant. La intimité est un point essentielle de la société d’aujourd’hui. Un protocole d’iden-tification RFID devrait non seulement permettre à un lecteur légitime d’authentifier un tag, mais il faut aussi protéger la intimité du tag. Des failles de sécurité ont été découvertes dans la plupart de ces protocoles, en dépit de la quantité considérable de temps et d’efforts requis pour la conception et la mise en œuvre de protocoles cryptographiques. La responsabilité de la vérification adéquate devient cruciale.

Les méthodes formelles peuvent jouer un rôle essentiel dans le développement de proto-coles de sécurité fiables. Les systèmes critiques qui nécessitent une haute fiabilité tels que les protocoles de sécurité sont difficiles à évaluer en utilisant les tests conventionnels et les tech-niques de simulation. Cela a eu comme effet de concentrer les recherches sur les techtech-niques de vérification formelle de tels systèmes pour assurer un degré élevé de fiabilité. Par conséquent, certaines recherches ont été faites dans ce domaine, mais une définition explicite de certaines de ces propriétés de sécurité n’ont pas encore été donnée.

L’objectif principal de cette thèse est de démontrer l’utilisation de méthodes formelles pour analyser les propriétés de intimité du protocole RFID. Plusieurs définitions sont don-nées dans la littérature pour les propriétés non-tra¸cabilité, mais il n’y a pas d’accord sur sa définition exacte. Nous avons introduit trois niveaux différents pour cette propriété en ce qui concerne les expériences de intimité existantes. Nous avons également classé toutes les définitions existantes avec différents points forts de la propriété non-tra¸cabilité dans la litt´ e-rature. De plus, notre approche utilise spécifiquement les techniques de calculs de processus pi calcul appliqués pour créer un modèle pour un protocole. Nous démontrons les définitions formelles de nos niveaux de non-tra¸cabilité proposées et l’applique à des études de cas sur les protocoles existants.

(6)

ABSTRACT

Radio Frequency IDentification (RFID) is the wireless non-contact use of radio-frequency electromagnetic fields to transfer data, for the purposes of automatically identifying and tracking tags attached to objects. Since RFID tags can be attached to clothing, possessions, or even implanted within people, the possibility of reading personally-linked information without consent has raised privacy concerns. Privacy is the essential part of today’s society. RFID protocols are subsets of cryptographic protocols but with lightweight cryptographic functions. Their main objective is identification with respect to some privacy properties, like untraceability. An RFID identification protocol should not only allow a legitimate reader to authenticate a tag but also it should protect the privacy of the tag. Although design and implementation of cryptographic protocols are tedious and time consuming, security flaws have been discovered in most of these protocols. Therefore the responsibility for reliable and proper verification becomes crucial.

Formal methods can play an essential role in the development of reliable security proto-cols. Critical systems which requires high reliability such as security protocols are difficult to be evaluated using conventional tests and simulation techniques. This has encouraged the researchers to focus on the formal verification techniques to ensure a high degree of reliability in such systems. In spite of the studies which have been carried out in this field, an explicit definition for some of these security properties is still missing.

The main goal of this work is to demonstrate the use of formal methods to analyse RFID protocol’s untraceability. Untraceability generally defined as ensuring that an attacker can not get any information to be used to trace an object in time and space. Several definitions are discussed in the literature for untraceability property, however, there is no compromise on its exact definition. We have introduced three different levels for this property. We also have classified all former definitions of untraceability property in the literature. In order to create a model for a protocol and define privacy properties, our approach employs process calculi technique, applied pi calculus. We have demonstrated the formal definitions of suggested untraceability levels, extending them to some case studies on the existing protocols.

Keywords: RFID security protocol, privacy, untraceability, formal method, process alge-bra, applied pi.

(7)

TABLE OF CONTENTS

DEDICATIONS . . . iii

ACKNOWLEDGEMENTS . . . iv

R´ESUM´E . . . v

ABSTRACT . . . vi

TABLE OF CONTENTS . . . vii

LIST OF TABLES . . . x

LIST OF FIGURES . . . xi

LIST OF ACRONYMS AND ABBREVIATIONS . . . xii

CHAPTER 1 INTRODUCTION . . . 1

1.1 RFID Systems . . . 2

1.2 Cryptographic Protocols . . . 4

1.3 Security Properties . . . 6

1.4 Formal Verification of Security Protocols . . . 9

1.4.1 Formal Methods . . . 9

1.4.2 Verification Techniques for Cryptographic protocols . . . 10

1.4.3 Adversary Model . . . 12 1.5 Problem Statement . . . 14 1.6 Research Objectives . . . 15 1.6.1 Research Question . . . 15 1.6.2 General Objective . . . 15 1.6.3 Specific Objectives . . . 15

1.6.4 Original Scientific Hypothesis of our Contribution(OSHC) . . . 15

1.7 Outline of the Thesis . . . 16

CHAPTER 2 BACKGROUND: APPLIED PI CALCULUS . . . 17

2.1 Syntax and Informal Semantics . . . 17

(8)

2.2.1 Structural Equivalence . . . 20

2.2.2 Internal Reduction . . . 21

2.3 Equivalences . . . 22

2.3.1 Observational Equivalence . . . 22

2.3.2 Labelled Bisimilarity . . . 23

CHAPTER 3 RELATED APPROACHES . . . 27

3.1 Works in Symbolic Settings . . . 27

3.1.1 Arapinis Model . . . 27

3.1.2 Bruso Model . . . 28

3.1.3 Deursen’s Trace-based Model . . . 29

3.2 Works in Computational Settings . . . 30

3.2.1 Avoine Model . . . 33

3.2.2 Vaudenay Model . . . 33

3.2.3 Juels-Weis Model . . . 34

CHAPTER 4 FORMALIZATION AND CLASSIFICATION . . . 36

4.1 Classifying Privacy Games . . . 36

4.1.1 Game Type I . . . 37

4.1.2 Game Type II . . . 37

4.1.3 Game Type III . . . 37

4.1.4 Comparison . . . 38

4.2 Modelling RFID Protocols . . . 40

4.3 Formalizing Untraceability . . . 41

4.3.1 Low Untraceable . . . 41

4.3.2 Mid Untraceable . . . 43

4.3.3 High Untraceable . . . 44

4.4 Classification of Untraceability Definitions . . . 45

4.5 Results and Discussions . . . 50

CHAPTER 5 CASE STUDIES . . . 53

5.1 ProVerif . . . 53

5.2 Feldhofer Protocol . . . 55

5.2.1 Description . . . 56

5.2.2 The Model In Applied Pi . . . 56

5.2.3 Automatic Verification Results . . . 58

(9)

5.3.2 The Model in Applied Pi . . . 61

5.4 KIM Secure and Private Protocol . . . 61

5.4.2 The Model in Applied Pi . . . 63

5.4.3 Attack on untraceability . . . 64

5.4.4 Fixing the Protocol . . . 65

5.5 OSK protocol . . . 66

5.5.2 The model in Applied Pi . . . 67

5.5.3 Automatic Verification Result . . . 67

5.6 Results and Discussions . . . 68

CHAPTER 6 CONCLUSION . . . 70

6.1 Review of the Research . . . 70

6.2 Possible Improvements and Future Work . . . 71

(10)

LIST OF TABLES

Table 1.1 Tag Classification . . . 4

Table 2.1 Syntax for terms . . . 17

Table 2.2 Syntax for processes . . . 18

Table 2.3 Syntax for extended processes . . . 19

Table 2.4 Semantics for processes (1) . . . 20

Table 4.1 Comparison of Untraceability properties . . . 52

Table 5.1 Feldhofer protocol ProVerif code . . . 57

Table 5.2 Feldhofer protocol - Correspondence properties . . . 59

Table 5.3 Feldhofer protocol - secrecy and correspondence queries . . . 59

Table 5.4 Arapinis’s sample protocol . . . 62

Table 5.5 Kim Protocol ProVerif Code . . . 64

Table 5.6 OSK Protocol ProVerif Code . . . 67

Table 5.7 Untraceability results of case studies . . . 68

(11)

LIST OF FIGURES

Figure 1.1 Overview of an RFID system . . . 2

Figure 1.2 Two samples of RFID Applications: Access Control & Supply chain . . 3

Figure 1.3 3-Round Identification Protocol . . . 6

Figure 1.4 Anonymity . . . 7

Figure 1.5 Untraceability . . . 8

Figure 1.6 Forward Secrecy . . . 8

Figure 2.1 A Sample Protocol . . . 25

Figure 4.1 A protocol trace . . . 39

Figure 4.2 1st type Untraceability Vs. 2nd type Untraceability . . . 39

Figure 4.3 2nd type Untraceability Vs. 3rd type Untraceability . . . 40

Figure 5.1 ProVerif Architecture . . . 54

Figure 5.2 The Feldhofer Mutual Authentication Protocol . . . 56

Figure 5.3 Arapinis’s Sample Protocol . . . 60

Figure 5.4 The Kim Secure and Private Protocol . . . 63

Figure 5.5 Man-in-the-middle Attack on Kim et al. Protocol . . . 65

(12)

LIST OF ACRONYMS AND ABBREVIATIONS

BAN Burrows–Abadi–Needham FSM Finite State Machine

IEC International Electrotechnical Commission IFF Identify Friend or Foe

ISO International Organization for Standardization kbps Kilobit per second

OBDD Ordered Binary Decision Diagram OSK Ohkubo, Suzuki and Kinoshita PCL Protocol Composition Logic

PPTM Probabilistic Polynomially-bounded Turing Machine PRF Pseudonym Random Function

PRNG Pseudo Random Number Generator

RF Radio Frequency

RFID Radio Frequency IDentification SATMC SAT-based Model-Checker

(13)

CHAPTER 1

INTRODUCTION

As we advance through the 21st _{century, it has been demonstrated that, RFID technology}

will continue to grow and expand beyond our dreams. RFID systems are expected to become more common and useful tools in most of the remote object identification systems. Because of their low production costs and tiny size, they are currently deployed in many fields. Sensitive informations, such as credit card numbers, passport informations, social security, medical in-formation, pass everyday through the insecure wireless channel. Unfortunately, there is some concerns within the use of RFID in view of the minimal privacy and security held. Privacy and security have been defined in (Marcella et Stucki, 2003) as, “when and with whom you share your personal information,” and “how well information is protected from unauthorized access, alteration, or destruction.”

The security and privacy objectives of so many systems have not been met. The compu-tational limitations of RFID tags impose significant restrictions on the number and type of cryptographic primitives that can be implemented on them. Nonetheless, widespread deploy-ment of these tags and also the contactlessness of communication bring up new threats to the user privacy. As an example, a tag can be embedded in a credit card or a passport which is linked to the person’s identity,. In this case, the tracking of a tag turns into the tracking of a person. Chotia and Smirnov studied on privacy attacks on e-passports (Chothia et Smirnov, 2010), illustrating that, it is possible to trace the displacement of particular passport, without breaking the cryptographic functions.

This thesis will help the private design of RFID protocols and facilitate the analysis of existing schemes through the particular focus on untraceability property. Untraceability is one of the most important privacy properties which must be respected by an RFID system. We would consider the protocols private, if privacy requirement could not be violated by any adversary. Although, these properties are mostly used in the context of RFID systems, they are issues for any protocol, which is applied by a mobile device.

The remainder of this chapter is structured as follows: RFID systems will be introduced (Section 1.1). A brief description of cryptographic protocols will be revealed (Section 1.2) and their security properties will be discussed (Section 1.3). Formal approaches for protocol

(14)

verification techniques will be investigated in (Section 1.4). The Problem statement will be notified (Section 1.5). The objectives of this thesis and the description of the contribution will be presented (Section 1.6). Finally, an overview of the structure will be summarised (Section 1.7).

1.1 RFID Systems

RFID is a system that transmits the identity (in the form of a unique serial number) of an object wirelessly, using radio waves. It’s grouped under the broad category of automatic identification technologies. Auto-ID technologies include bar codes, optical character read-ers and some biometric technologies, such as retinal scans. The auto-ID technologies have been used to reduce the amount of time and labour needed to input data manually, like bar code systems, and to improve data accuracy. Some auto-ID technologies, such as bar code systems, often require a person to manually scan a label or tag to capture the data. RFID is designed to enable readers to capture data on tags and transmit it to a computer system without needing a person to be involved.

The beginnings of RFID technology can be traced back to work carried out during World War II on radar technology. In particular, an ancestor of modern RFID is the so-called Identify Friend or Foe (IFF) system, first introduced in second world war and still in use to-day to identify friendly or enemy aircraft. IFF, as used by the British Air Force at that time, was what we would describe today as a programmable tag that upon interrogation would return a code that identified the aircraft that carried it as friendly. Early IFF systems were further developed in the 1950s and nowadays are in common use in civil as well as military aviation.

Figure 1.1 Overview of an RFID system

(15)

(transponder), a reader (transceiver) and the reader’s back-end database. A tag typically has a microchip to store basic information (ID, manufacturer’s info) and an antenna for transmitting signals to RFID reader. A reader can interrogate a tag by sending a signal via electro-magnetic fields. The tag receives the signal through its antenna and responds with information stored on its microchip, which is verified by the reader against its back-end database. Although RFID systems are used as one of the most pervasive computing technolo-gies, there are still so many security issues that need to be solved before their vast deployment. This technology poses critical privacy and security concerns (Chothia et Smirnov, 2010). In an RFID system, it is mostly supposed that the communication channel between a reader and its back-end database is via secured wired channel while the channel between a reader and a tag is wireless and insecure. Therefore studying the communication between the database and the readers is not relevant. Hence, readers and database are often considered as a single and unique entity in the security analysis.

Figure 1.2 Two samples of RFID Applications: Access Control & Supply chain

The use of RFID in society attracts lots of attention in the past few years because of its beneficial features. The most well known application of the RFID tag is the supply chain and logistics. RFID tags can be attached to the objects in a supply chain to track, secure and manage goods throughout the entire production cycle. The overall cost of the supply chain could be reduced by using low cost tags and also adopting high-efficiency bulk-processing and non-human-intervention method. (Sarac et al., 2010; Khashkhashi Moghaddamet al., 2012). RFID chips can also be embedded into passports to record the holder’s biometric information such as fingerprint and iris data (Kumar et Srinivasan, 2012). Some 95 countries, including the Canada and the United States have been using e-passports for several years (Passport Canada). RFID tags also can be a key and used in access control systems. They can store personal information for security check-ins. For example, an employee carries an ID card, embedded with a RFID chip, could authenticate his or her identity at the security entry in a facility very fast (Xiang, 2012). RFID system also has been used in a other industries like

(16)

automatic payment systems, medical systems, animal tracking, libraries and smart appliances (Fosso, 2012; Wamba, 2012; Kushal et al., 2012).

Different kind of tags are used in different places. Tags can be divided into three main types. Passive tags, which do not require batteries, can be much smaller, and have an unlimited life span. They are currently being developed by several companies globally. These tags are much less expensive than other types. Passive tags receive all power from the reader and necessarily cannot initiate any communications. Semi-passive tags are very similar to passive tags except for the addition of a small battery. This battery allows the tag to be constantly powered. Therefore, semi-passive tags are faster in response than passive, though less reliable and powerful than active tags. Active tags have their own internal power source, which is used to power tag to generate the outgoing signal. They are typically much more reliable (e.g. fewer errors) than passive tags due to the ability for active tags to conduct a ”session” with a reader. Due to their on-board power supply, also transmitting at higher power levels than passive tags, allows them to be more effective in ”RF challenged” environments like water, heavy metal (shipping containers, vehicles), or at longer distances. Any active tags have practical ranges of hundreds of meters. Table 1.1 depicts the main characteristic of each group.

Table 1.1 Tag Classification

Property Type

Passive Semi-Passive Active

Frequency(MHz) 860-960 868-915 and 2.4GHz 860-960 and 2.4GHz

Internal Power No Yes Yes

Bit Rate(kbps) 246 16 20/40/250

Read Range(m) 10 30 100

Cost(cents) 10-150 500-2000 1250-2500

1.2 Cryptographic Protocols

A security protocol also called a cryptographic protocol is a communication protocol that performs security functions while applying cryptographic methods. Participants to a protocol, called agents, have to follow some steps, known in advance, to communicate with each other.

Cryptographic techniques are used for the protocols with a security objective. The security objective might be confidentiality or might involve authentication, generation of random sequences, or partial sharing of a secret. Since the communication medium in RFID systems

(17)

is public, the messages in the wireless channels can be eavesdropped by any adversary with the receiver equipment. An adversary may interface and even intercept, insert, block, modify the messages in the wireless channel. This has made the design of secure protocols difficult. Using cryptographic protocols is one of the solutions to overcome security threats in RFID systems. Because of the resource limitation of RFID tags, it requires lots of efforts to find a light weight and secure protocol. Therefore simple and light weight cryptographic oper-ations like logical operoper-ations(e.g. XOR, AND, OR) mostly used in these protocols. Many researchers have looked into the problem in order to design protocols which allow authorised persons to identify the tags without an adversary being able to trace them. Numerous papers addressing RFID cryptographic protocols have been published recently such as (Wang et al., 2012; Bassil et al., 2012; Qi et al., 2012; Sun et Zhong, 2012; Lee, 2012; Jia et Wen, 2012).

Many works have used cryptographic primitives to encrypt the secrets like tag identifier. Hash function-based protocols like (Lee et al., 2006; Henrici et M¨uller, 2004; Ohkubo et al., 2003) are taking the advantage of one-way function to prevent direct exposure of secret. For example authors in (Ohkubo et al., 2003) used two different hash functions, one for sending tag’s internal state to the reader and the other for updating the tag’s secret. In addition, Henrici in (Henrici et M¨uller, 2004) proposed a scheme that all the data management is done in back-end and the tag only requires a hash function. Therefore the communication channel does not need to be reliable and the reader need not be trusted and also no long-term secrets stored in tags. Later Dimitriou in (Dimitriou, 2006) presented an optimized scheme using hash functions, which do not require exhaustive search in the back-end database. Pseudonym Random Function (PRF) has also been used in the design of RFID protocols. For exapmle in (Tsudik, 2006), author presented an algorithm called YATRAP which needs only the secret ID and a Pseudo Random Number Generator (PRNG). Later Chatmon et al. (Chatmon et al., 2006) presented YA-TRAP+ and OTRAP.

Most of these schemes are 3-round identification protocols as depicted in Figure 1.3. The first message is the starting request or may contain data such as a nonce. Nonce is an arbitrary number used only once in a cryptographic communication. The main part of the protocol lies on the information in the second message which changes at each new identification. It could be either the tag’s identifier, or an encrypted version of its identifier. The third message is not used for the identification of the tag but it may be used by the tag to update its identifier. Within this thesis we will represent protocols graphically using message sequence charts. Every message sequence chart shows a reader role R and a tag role T with the role names framed, on the top of the chart. Above the role names, the role’s secret terms are shown.

(18)

Figure 1.3 3-Round Identification Protocol

Events, such as nonce generation, computation, and assignments are shown in boxes. Mes-sages to be sent and expected to be received are specified above arrows connecting the roles. It is assumed that an agent continues the execution of its run only if it receives a message conforming to the specification.

1.3 Security Properties

Cryptographic protocols are required to satisfy some security requirements also called security properties. Depending on the protocol’s purpose, failure to preserve security prop-erties may result in unveiling confidential information to an intruder and consequently, may lead to protocol failure. These includes classical properties like secrecy and authentication or privacy properties like anonymity or untraceability. These requirements must be overcome to consider a protocol secure against the threats.

Secrecy and authentication are the two basic standard requirements of any authenti-cation protocol. To define secrecy informally, we could say that a protocol P preserves the secrecy of dataM ifP never publishesM, or anything that would permit the computation of M, even in interaction with an adversary. In addition to the secrecy, an authentication pro-tocol should also enable an agent to prove her identity. Recalling the client-server handshake protocol, authentication ensures that the client C is only willing to share her secret with the serverS; it follows that, if she completes the protocol, then she believes she has done so with S and hence authentication ofS toC should hold. In contrast, serverS is willing to run the protocol with the client C who he is willing to, and hence at the end of the protocol he only expects authentication of C toS to hold, if he believes C was indeed his interlocutor.

(19)

conisdered as the main issue in RFID systems rather than other communication systems. An RFID authentication protocol should not only allow a legitimate reader to authenticate a tag but it should also protect the privacy of the tag. Here, we present the main privacy properties which are generally required for RFID protocols.

Anonymity allows an entity to make transactions that cannot be known by others. If the tag ID can be kept anonymous, the problem of leaking information pertaining to the user belongings would be solved. Anonymity is the topic of most researches. This property is highly important for tag owner privacy in the environment. It is informally defined by the ISO/IEC standard 15408 as ensuring that a user may use a service or resource without disclosing the user’s identity.

Figure 1.4 Anonymity

Intuitively, it says that an attacker cannot discern the tag from any information that the owner necessarily reveals during the identification. This might include nonces or keys which the tag owner is given during the protocol. Figure 1.4 shows the RFID tag anonymity. Untraceability says that an attacker cannot trace the movement of a tag. RFID readers in strategic locations can record sightings of unique tag identifiers, which are then associated with personal identities. Passive tags respond to readers without alerting their owners when they powered by electromagnetic waves from a reader. The threat comes out when a tag identifier linked to personal information. For example, by using a credit card or an e-passport, if an attacker can establish a link between a person’s identity and the tags he is carrying, at any such point the tracing of objects can become the tracing of a person.

Let’s Consider set of messages sent by different agents, for example in a communication system, all messages sent by the same sender are related to each other for him but should not for an attacker. Untraceability as first defined by Avoine (Avoine et Oechslin, 2005) means that an attacker can not get any information to be used to trace an object in time and space. Figure 1.5 shows the concept of untraceability in a simple manner.

(20)

Figure 1.5 Untraceability

Comparing with anonymity, in analysing untraceability the identity of the tag is not im-portant, the point is that the link between different sessions of the tags cannot be established by an attacker. In this sense, untraceability is not intended to protect the link between a tag participated in the protocol and the identity of the user, but rather the link between different sessions of the tags.

Forward Secrecy ensures that even by compromising a tag at some later time, an adver-sary can not link the tag to its previous sessions. In other word she cannot trace the data back through past events in which the tag was involved. For example once the secret in the tag is stolen, all past activities can be traced by searching past logs. So the past activities should be protected from tampering. For forward privacy, the attacker is allowed to tamper with a tag and retrieve the data stored in it. We call a protocol forward secure if the attacker cannot use the obtained data to link the tag to past sessions. It requires that the adversary who only eavesdrops on the tag output, cannot associate the current output with past output. In other words given set of observations between tags and readers and given the fact that all information stored in the involved tags has been revealed at time t, the adversary must not be able to find any relation between any observations of the same tag or a set of tags that occurred at time ´t < t. It is depicted in Figure 1.6.

(21)

1.4 Formal Verification of Security Protocols

Critical systems which require high reliability such as aircraft navigation systems, banking transactions, security protocols and other similar systems are difficult to be evaluated using conventional test and simulation techniques. Unlike testing that does not provide exhaustive coverage, verification using model checking gives a more thorough analysis of the system by checking satisfiability of the requirement through every branch of the model representing the system. It covers all possible scenarios of the behavior of the system. Formal verification techniques used in such systems to ensure a high degree of reliability. (Siminiceanu et Ciardo, 2012; Ahamad et al., 2012; Souyris et al., 2009)

Formal methods have proven to be a promising technique toward developing automated and generic protocol verification and testing methods. The main benefits of formal verification comparing to testing, is its soundness and exhaustiveness.

1.4.1 Formal Methods

Formal methods, as defined by Meadows (see Meadows, 2003), combine a language which can be used to model a cryptographic protocol and its security properties. It means the use of mathematical and logical techniques to express the system and its properties. By building a mathematically rigorous model of such a system and a mathematical definition of the properties, it is possible to verify the system’s properties in a more thorough fashion than empirical testing. With these techniques, we can develop specifications and models, which describe all or part of a system’s behavior at various levels of abstraction, and use them as input to an automated model checker. Formal methods in context of cryptographic protocols generally divided into two parts: Formal modelling of the protocol specification and Formal verification which refers to analysis of the security properties.

Although cryptographic protocols design is improving day by the day but there are still some security issues that are discovered mostly as the result of an informal approach. For-mal methods successfully applied to find such problems. Many forFor-mal techniques and tools have been developed to reason about security and protocols. Probably the first mention of formal methods as a possible tool for cryptographic protocol analysis came in Needham and Schroeder (see Needham et Schroeder, 1978). However, the first work that was actually done in this area was done by Dolev and Yao (Dolev et Yao, 1983).

Modelling a cryptographic protocol begins with specification of the protocol and its requirements. Specification language could be a formal language or calculi. Milner’s Pi-calculus(Milner, 1991) is a general-purpose calculus that has been successfully used to demon-strate security properties for some protocols. By extending it with specialized cryptographic

(22)

primitives, Abadi obtained what he called the SPI calculus (Abadi et Gordon, 1997), a pro-cess algebra specifically designed for modelling cryptographic protocols (Abadi et Gordon, 1997), which allows verification of a broader spectrum of security protocols, as the encryp-tion and decrypencryp-tion can now be checked. Since we used applied pi calculus as our modelling language, we will present the preliminaries of this language in chapter 2.

1.4.2 Verification Techniques for Cryptographic protocols

Generally formal verification techniques in context of cryptographic protocols divided into three main categories: Logic oriented methods, Model checking and Theorem proving.

Logic oriented methods

This method uses the modal logics to specify and analyse cryptographic protocols. Modal logic is a type of formal logic primarily developed in the 1960s that extends classical propo-sitional and predicate logic to include operators expressing modality. They reason about the evolution of knowledge and beliefs within a system to show that certain conditions are satisfied. The basic idea of these approaches is the analysis of knowledge or belief during the protocol run. Inference rules are used to describe how beliefs or knowledge can be derived from other beliefs or knowledges.

The most famous in the class of logics is the Burrows–Abadi–Needham (BAN) logic (see Burrows et al., 1989). It models messages and the beliefs of an agent. BAN logic is a set of rules for defining and analysing information exchange protocols. It is an example of a logic of belief, which consists of a set of modal operators describing the relationship of principals to data, a set of possible beliefs that can be held by principals (such as a belief that a message was sent by a certain other principal), and a set of inference rules for deriving new beliefs from old ones. Specifically, it helps its users determine whether exchanged information is trustworthy, secured against eavesdropping, or both. Note that belief logics such as BAN are generally weaker than state exploration tools since they operate at a much higher level of abstraction.

There are a numbers of others including Bieber’s CKT5 (Bieber, 1990), Syverson’s KPL (Syverson, 1990) and Protocol Composition Logic (PCL) (Datta et al., 2005).

Model checking

Model checking (Clarke, 1997) is one of the most common formal verification techniques. Model refers to the mathematical representation of the system and using model checking technique a temporal logic formula besides the model of your system can determine whether

(23)

the model satisfies the property or not. The basic idea is an automated method to determine the correctness property holds by exploring the state space of a model. If the property does not hold in each state, the model checking algorithm produces a counterexample and an execution trace leads to a state in which the property is violated. For models with a small state space this search can be exhaustive and the results are then equivalent to a formal proof. Since the state space might be too large to be examined directly, model checking is typically combined with abstraction techniques.

The inputs to the model checker are, a finite-state modelM, a model that represents all possible scenarios of the behaviour of a system S, and a property P to be checked in every state of M. The model checker exhaustively explores all the paths throughM while checking that P is true at each reachable state. If no branch in the model violates this property then the system is said to satisfy the property. Otherwise, the verifier outputs the branch that has violated the property, known as the counterexample.

Two general approaches to model checking are used: Explicit state enumeration and Symbolic model checking. In the first, the states of the system under verification are basically encoded and stored in a table and then checked against the desired properties. Many state reduction techniques have been developed to help reduce the state space without affecting the ability of the tool to discover insecure states. Other techniques are also used in order to optimize both the search and storage procedures. The second approach, symbolic model checking (Meadows et Meadows, 1995) brought some remedy to the state space explosion problem of model checking and is based on the use of Ordered Binary Decision Diagram (OBDD)s. Using OBDDs allows for an efficient representation of system state transition, thereby increasing the size of the system that could be verified. By using symbolic model checking, it is possible to verify extremely large reactive systems. This is possible because the number of nodes in the OBDDs that must be constructed no longer depends on the actual number of states or the size of the transition relation. Because of this breakthrough it is now possible to verify reactive systems with realistic complexity.

The most works on state machine approaches goes back to Dolev and Yao (Dolev et Yao, 1983). A Finite State Machine (FSM) includes a finite number of states and produces output on state transitions after receiving an input symbol. It is often used to model control portion of a protocol. In these models a state space is defined and then explored by the tool to determine if there are any paths through the space corresponding to a successful attack by the intruder. Much of this work was successful in finding flaws in protocols that had been previously undetected by human analysts. One of the first systems that used Dolev-Yao approach is Interrogator developed by Millen (Millen et al., 1987). The tool attempts to locate protocol security flaws by an exhaustive search of the state space. In this class we

(24)

can also mention NRL that is proposed by Meadows (Meadows, 1996). It is also based on the Dolev-Yao approach and use the similar strategy as Interrogator. Other researches has also focused on state exploration such as MurΦ (Mitchellet al., 1997), Maude (Denkeret al., 1998), SATMC (SAT-based Model-Checker) (Armando et al., 2003) or FDR (Lowe, 1996). The most significant problem of this kind of methods is that the search space of the paths to find the insecure state can be infinite which leads to a termination problem.

Theorem proving

This techniques model the computations performed in a protocol and define security properties as theorem, automated theorem checkers are then used to verify these theorems and thus prove properties of the model. Automated theorem proving is the oldest technique of formal verification, and it has been studied and practised since the 1960s. The logic is given by a formal system based on a set of axioms and inference rules. Theorem proving is the process of finding proofs using axioms of the system. The idea is to represent two models or properties as two formulas f and g in a reasonably expressive logic, consequently prove that f ⇒g (Pradhan et Harris, 2009).

Theorem provers are being used more and more in the mechanical verification of safety critical properties of hardware and software designs. Logic based proofs for security protocols build on traditional mathematical reasoning, using models extended for representing security concepts. Logical notations typically offer a wide range of data structures and operators which makes them capable of accurately representing the messages transmitted in security protocols. Unfortunately, a particular weakness of logic-based models is that they have no in-built notions of communication concepts such as message sequencing and these must be explicitly expressed in the model. Paulson in (Paulson, 1994) used the theorem proverIsabelle to analyze protocols.

In contrast to model checking, theorem proving can be applied to systems with infinite state spaces because it doesn’t have the state space exploration problem. It also relies on techniques such as structural induction to prove over infinite domains. As a disadvantage to this technique, we can point that interactive theorem provers, require human interaction, so the theorem proving process is slow and sometimes error prone.

1.4.3 Adversary Model

In security scenarios typically we assume that cryptographic protocols should achieve their objectives in the presence of a malicious entity that seeks to break the security of the system, called adversary. The characterization of the adversary is essentially done by specifying the

(25)

actions that she is allowed to perform (i.e. the oracles she can query), the goal of her attack (i.e. the game she plays) and the way in which she can interact with the system (i.e. the rules of the game).

Consider a public-key cryptosystem including a plaintext x underlying a challenge ci-phertext y. The classic attacks in order of increasing strength are chosen-plaintext attacks (CPA), non-adaptive chosen-ciphertext attacks (CCA1), and adaptive chosen-ciphertext at-tacks (CCA2) (Naor et Yung, 1990). Under CPA, the adversary has the capability to choose arbitrary plaintexts to be encrypted and obtain the corresponding ciphertexts. For example, in the public-key setting, giving the adversary the public key suffices to capture these attacks. Under CCA1, the adversary gets, in addition to the public key, access to an oracle for the decryption function. The adversary may use this decryption function only for the period of time preceding her being given the challenge ciphertext y. Under CCA2, the adversary again gets (in addition to the public key) access to an oracle for the decryption function, but this time she may use this decryption function even on ciphertexts chosen after obtaining the challenge ciphertext y, the only restriction being that the adversary may not ask for the decryption of y itself (Rackoff et Simon, 1992).

For RFID systems, no universal adversary model has been defined yet, up until now designs and attacks have been made in an ad hoc way. Even though there are concepts like CPA, CCA1 and CCA2 for confidentiality, in RFID systems the adversaries resources are defined in an ad hoc manner and vary depending on the publication. They can be passive, active but limited in the number of queries on the tag, active but cannot modify exchange between the legitimate tag and a reader or active but cannot tamper with the tags. For example Avoine defined the adversary as a Probabilistic Polynomially-bounded Turing Machine(PPTM) (Avoine, 2005), which interacts with RFID tags and readers through some oracles. Vaudenay (Vaudenay, 2007) also proposed a hierarchical model for adversary. His model captures eight classes of adversary capabilities ranging over four different types of tag corruption and two modes of observation. There are also other works in modelling the adversary in RFID systems like (Canard et al., 2010; Cai et al., 2012)

In this dissertation we consider the adversary model described by Dolev and Yao (Dolev et Yao, 1983). It is based on its initial knowledge which is extended by observing the communi-cations. Since the adversary has complete control of the network in this model, it is assumed that the adversary has unlimited inference capabilities and messages may be read, modified, deleted or injected. It means that she has the ability to influence all communications between a tag and the reader, she can change the order of messages, combine the information in his knowledge to construct or interpret new terms and can therefore perform man-in-the-middle attacks on any tag that is within its range. In brief Dolev-Yao adversary is as follows:

(26)

• Eavesdrop all messages transmitted on the network • Replay to captured messages

• Inject new messages deduced from captured messages • Intercept messages

However, these capabilities are restricted due to the assumption of perfect cryptography. This means that the adversary cannot reverse hash functions and that she is not able to learn the contents of an encrypted term, unless she knows the decryption key.

1.5 Problem Statement

The contactlessness of communication and the expected ubiquity of RFID systems en-courage nefarious entities to track and analyze tags in time and space. If at any such point a tag is linked to an individual, the tracking of a tag becomes the tracking of a person. The need for RFID protocols to be secured against these threats has been realized early on (Juels, 2006; Breu, 2011). A number of papers discussed about the untraceability problems, raised by RFID technologies and its importance, however, very few defined a formal model and the untraceability property precisely.

In the literature, there are several definitions of untraceability, while there is no agree-ment on its comprehensive definition. Generally two type of models have been considered in order to formalize untraceability; The Symbolic model and Computational model. Most of the definitions have been accomplished in computational settings. These experiments are poorly supported by automatic tools. In the symbolic world, in one hand Arapinis defined the notions of weak and strong untraceability in the process algebra (Arapinis et al., 2009), on the other hand, Bruso defined a single definition for untraceability (Brus´oet al., 2010). By investigating the existing models we found that, there are some cases that make it impossible to satisfy the defined properties, or the definitions are not complied properly with some type of protocols. All the above models will be fully presented in the related works. Therefore, the need to sum up and classify all these definitions is essential.

These problems made us to think of studying the former definitions, providing more clear and explicit definitions for untraceability with different strength levels. Finally a solution is proposed for the automatic analysis of untraceaility of protocols.

(27)

1.6 Research Objectives 1.6.1 Research Question

• How to formalize and automatically verify untraceability of RFID protocols? 1.6.2 General Objective

The main idea of this thesis is to provide the design of private cryptographic protocols and facilitate the privacy evaluation of existing schemes, using formal verification techniques. This thesis presents an approach for modelling and verifying privacy property, particularly in RFID identification protocols. Our work is an attempt to the formal analysis of a more challenging privacy property in RFID protocols, so-called untraceability. Thus, by proving untraceability in an RFID protocol, it prevents a real world attacker to invalidate the protocol, making it impossible to trace a tag, and hence its owner as well.

1.6.3 Specific Objectives

• Establish different strength levels for untraceability properties • Formalize untraceability properties using applied pi calculus • Classify all existing definitions of untraceability in the literature • Automatic Verification of untraceability in RFID protocols

1.6.4 Original Scientific Hypothesis of our Contribution(OSHC) Hypothesis:

• Verifying untraceability in each proposed levels for any RFID protocol, prevents an adversary invalidate the untraceability of modelled protocol.

Justification of originality: Few researches have looked at formalizing untraceability in symbolic settings. Providing explicit definitions for different levels of untraceability in pro-cess algebra, considering the game based definitions in computational setting has not been provided in any research.

Refutability: The hypothesis will be refuted if an adversary can interfere the privacy of the protocol which satisfied our untraceability levels.

(28)

1.7 Outline of the Thesis

The rest of the dissertation is organized as follows.

•Chapter 2 introduces the syntax and semantics of the language of our symbolic setting: applied pi calculus.

•Chapter 3 surveys the related works in the application of formal methods to the analysis of untraceability in RFID protocols. Discussion about symbolic and computational settings will be presented in this chapter.

•Chapter 4 begins with classifying existing privacy games. Then we will propose our levels of untraceability related to the former games. After that we will presents our protocol model and formalization of proposed untraceability levels based on applied pi calculus. Finally we will compare all existing definitions of untraceability in the literature and provide a classification of this property.

•Chapter 5 includes some case studies which describe the results of verifying the RFID identification protocols from the implementation using proposed framework. Three existing RFID protocols will be modelled. Finally Verification results will be demon-strated.

•Chapter 6 summarizes the work presented and underlines our original contributions. We will also explain the limitations of the work and future improvements in this chapter. Finally, the Bibliography section lists all the articles, reports, books and etc.

(29)

CHAPTER 2

BACKGROUND: APPLIED PI CALCULUS

The applied pi calculus is a process calculi, a mathematical formalism for describing and analysing concurrent systems and their interactions. It is an extension of the pi calculus (Milner, 1999) with the addition of a rich term algebra, value-passing, primitive function symbols and equational theory to enable modelling of the cryptographic operations used by security protocols. It is specifically targeted at modelling security protocols by adding the possibility to model wide variety of complex cryptographic primitives, including, non-deterministic encryption, digital signatures, and proofs of knowledge, while the pi calculus has a fixed set of primitives built-in like symmetric and public key encryption. Symbolic model relies on Dolev-Yao model(Dolev et Yao, 1983) in which messages are modelled as algebraic terms instead of computational model. The applied pi calculus permits formal modelling of properties including: reachability, correspondence and observational equivalence. In the context of cryptographic protocols, these properties are particularly useful, since they allow the analysis of traditional security goals such as secrecy and authentication. Moreover, emerging properties such as privacy and traceability can also be considered. Here we briefly recall its basic notions, for an extended descriptions see (Abadi et Fournet, 2001).

2.1 Syntax and Informal Semantics

The calculus consists of terms including infinite set of names(a, b, c, . . .), variables(x, y, z) and a signature Σ consisting of a finite set of function symbols each with an associated ar-ity(e.g. one-way hash functions, encryption, digital signatures and data structures as pairing), plain processes and extended processes.

Definition 2.1. (Algebraic Term) The setT(Σ) of algebraic terms is the smallest set defined by the grammer of table 2.1

Table 2.1 Syntax for terms L, M, N, T, U, V ::= terms

a, b, c, . . . , k, . . . , m, n, . . . , s names

x, y, z variables

(30)

where f rages over the functions of Σ and ar matches the arity of f. A function symbol with arity 0 is called constant, with arity 1 is called unary and with arity 2 is called binary. Metavariables such asu, v, w are used to denote both names and variables. Tuplesu1, . . . , uar

andM1, . . . , Mar are abbreviated to ˜uand ˜M respectively. We assume a type system for terms

generated by a set of base types such as Integer, Key, or simply a universal type, Data. In addition if τ is a type, then Channel(τ) is a type too. Typically a,b, and c are used for channel names, s and k as names of some base types, and m and n as names of any type. We always assume that terms are well-typed and the substitutions preserves types.

Definition 2.2. (Ground Term) A term is called ground or closed if it contains no variables, and it is denoted by TΣ.

Example 2.3. Let a be a constant, f be a unary function andg be a binary function. The ground terms that could be built by above signature are like followings:

a f(a) g(a, a) g(f(a), f(a)) f(g(a, a)) . . .

In the applied pi calculus, one has plain processes and extended processes. The grammar for processes is similar to the pi calculus, except that messages can contain terms (rather than just names) and that names need not be just channel names. The syntax of process is shown below:

Table 2.2 Syntax for processes

P, Q, R ::= processes (or plain processes)

0 null process

P|Q parallel composition

!P replication

νn.P name restriction

if M =N then P else Q conditional

u(x).P message input(also written as in(u, x).P) ¯

uhMi.P message output (also written asout(u, x).P) let x=g(M1, . . . , Mn) in P else Q destructor application

let x=M inP binding

The null process 0 does nothing; P|Q is the parallel composition of process P and Q to represent the agents of the protocol running in parallel; the replication !P behaves as an infinite composition of P running in parallel (P|P|. . .) which mostly used to show the unbounded number of protocol sessions. The name restriction νn.P makes a new, private name n inside P which is often used to represent nonces, keys, private channels and other fresh random numbers. The conditional construct if M =N then P else Qis standard, but

(31)

M =N represents equality rather than strict syntactic identity. IfQis a null process we can abbreviate it to if M = N then P. Finally, the communication between agents is shown by message input and message output. The process u(x).P is ready to input from channel u, then to run P with the received message replaced with parameter x, while ¯uhMi.P is ready to output M on channel u, then to run P. In both of these, the P is omitted when it is a null process. The set of free names in P, denotedf n(P), consists of those namesn occurring in P not in the scope of a restriction νxor input u(x). In opposite, The set of bound names bn(P) contains every namen which is not in free inP.

Further, the processes are extended with active substitutions:

Table 2.3 Syntax for extended processes A, B, C ::= extended processes

P plain process

A|B parallel composition νn.A name restriction νx.A variable restriction {M/x} active substitution

Processes are extended with active substitution to take into account the knowledge ex-posed to the adversarial environment and reflect in its interaction with the protocol. The active substitution {M/x} denotes the substitution that replaces the variable x with the term M in every process. In other wordsM is available to the environment but the variable x is just a way to refer to M. This allows access to terms which the environment cannot construct. Although the substitution {M/x} concerns only one variable, large substitution can be written as: {M1/x1, . . . , Mar/xar} or{M /˜˜ x}.

Contexts may be used to represent the adversarial environment in which a process is run; that environment provides the data that the process inputs, and consumes the data that it outputs. A context C[ ] is an expression (a process or extended process) with a hole. C[P] is obtained as the result of fillingC[ ]’s hole withP. An evaluation context C[ ] is a context whose hole is not under a replication, a conditional, an input, or an output.

Active substitutions are useful because they allow us to map an extended process A to its frame ϕ(A) by replacing every plain process in A with 0. A frame, denoted ϕ or ψ, is an extended process built from 0 and active substitution {M/x}; The frameϕ(A) represents the static knowledge exposed by a process A to its environment. The domain dom(ϕ) of a frame ϕis the set of the variables that ϕexports.

(32)

consider the following processes:

P =c(x).if x =req then ¯chhelloielse 0

This process waits for a message on the public channelc. If the received message is req, then the process sends hello on the public channel c.

2.2 Operational Semantics

Operational semantics of the applied pi calculus are defined by the means of two relations: structural equivalence and internal reductions.

2.2.1 Structural Equivalence

Informally, two processes are structurally equivalent if they model the same thing, but the grammar permits different encodings. For example, to describe a pair of processes A,B running in parallel, the grammar forces us to put one on the left and one on the right; that is, we have to write either A|B, or B|A. These two processes are said to be structurally equivalent.

Definition 2.5. (α-Conversion) is renaming a bound name or variable without changing the semantics.

Definition 2.6. (Structural equivalence≡) is the smallest equivalence relation on extended processes that is closed underα-conversion of both bound names and variables and application of evaluation contexts which satisfies the rules in table 2.4.

Table 2.4 Semantics for processes (1)

A|0 ≡ A νn.0 ≡ 0

A|(B|C) ≡ (A|B)|C νu.νw.A ≡ νw.νu.A

A|B ≡ B|A A|νu.B ≡ νu.(A|B)

!P ≡ P|!P if u /∈f n(A)∪f v(A)

νx.{M/x} ≡ 0 {M/x} ≡ {N/x}

{M/x}|A ≡ {M/x}|A{M/x} if M =E N

The parallel composition, replication and restriction rules are self-explanatory. We briefly describe the substitution rules here. The rule νx.{Mx} ≡ 0 called ALIAS, enables us to introduce active substitutions with restricted scopes. The second rule, {M/x}|A ≡

(33)

{M/x}|A{M/x}calledSUBST, describes the active substitution to a process. The final rule called REWRITE, allows terms that are equal modulo the equational theory to be swapped. Example 2.7. Consider the following process:

P ≡νs.νk.( ¯c1henc(s, k)i|c1(y).c¯2hdec(y, k)i)

The first component publishes the messageenc(s, k) on channel c1. The second receives

a message from channel c1, then using the secret key k decrypt it and sends the result on

channel c2. This process is structurally equivalent to the following extended process A:

A≡νs.νk.νx.( ¯c1hxi|c1(y).c¯2hdec(y, k)i|{enc(s, k)/x})

Using structural equivalence, every closed extended process A can be rewritten to consist of a substitution and a closed plain process with some restricted names: A≡ν.˜n.{M˜x}|P˜ , where f v(P) = ∅ ,f v( ˜M) =∅ and {˜n} ⊆f n( ˜M).

2.2.2 Internal Reduction

Internal reduction (→) is informally defined as the execution of a process with respect to control flow and communication. It is formally defined as follows:

Definition 2.8. (Internal reduction→) is the smallest relation on extended processes that is closed under structural equivalence and application of evaluation satisfying the rules of table 2.5.

Table 2.5 Semantics for processes (2) COMM chxi.P¯ |c(x).Q→P|Q

THEN if N =N then P else Q→P ELSE if L=M then P else Q→Q

The reflexive and transitive closure of → is written as →∗_{. To better understand the}

substitution and reduction, we provide some examples as follows: Example 2.9. Consider the process P described in Example 2.7:

P ≡νs.νk.( ¯c1henc(s, k)i)|c1(y).c¯2hdec(y, k)i

(34)

P ≡ νs.νk.νx1.( ¯c1hx1i)|c1(y).c¯2hdec(y, k)i|{enc(s, k)/x1}) (by ALIAS)

νx1.c¯1hx1i

−−−−−−−→ νs.νk.(c1(y).c¯2hdec(y, k)i|{enc(s, k)/x1}) (by COMM)

c1(x1)

−−−−→ νs.νk.( ¯c2hdec(x1, k)i|{enc(s, k)/x1})

≡ νs.νk.νx2.( ¯c2hx2i|{enc(s, k)/x1}|{dec(x1, k)/x2}) (by SUBST)

νx2.c¯1hx2i

−−−−−−−→ νs.νk.({enc(s, k)/x1}|{dec(x1, k)/x2}) (by COMM)

Example 2.10. ¯

chMi.P|c(x).Q ≡ νx.(¯chxi.P|c(x).Q|{M/x}) (by ALIAS) → νx.(P|Q|{M/x}) (by COMM) ≡ νx.(P|Q{M/x}|{M/x}) (by SUBST)

≡ P|Q{M/x}|νx.{M/x} (by restriction rules)

≡ P|Q{M/x} (by ALIAS)

2.3 Equivalences

In addition to secrecy and correspondence properties, there are more complex properties which enables us to define privacy. Intuitively, two processes are said to be equivalent if an observer has no way to tell them apart. In order to define observational equivalence, we could say that processes P and Qare equivalent if they can output on the same channels, no matter what the context they are placed inside.

2.3.1 Observational Equivalence

We writeP ⇓cwhenP emits a message on the channelc, that is, whenP →∗ C[out(c, M);Q]

for some evaluation context C that does not bind cand some process Q.

Definition 2.11. (Observational Equivalence ≈ ) is the largest symmetric relation R on processes such that P RQ implies:

1. ifP ⇓c then Q⇓c;

2. ifP →∗ _P0 _{then there exists} _Q0 _{such that}_Q_→∗ _Q0 _and _P0_RQ0 _;

3. C[P] R C[Q] for all evaluation contextsC[ ].

⇓cis called abarb oncin the pi calculus, and ≈is one of the two usual notions of barbed

bisimulation congruence. Intuitively, a context may represent an attacker, and two processes are observationally equivalent if they cannot be distinguished by any attacker.

Two terms M and N are said to be equal in the frame ϕ, and written as (M =N)ϕ, if

and only if ϕ ≡ νñ.σ, M σ = N σ, and {ñ} ∩(f n(M)∪f n(N)) = ∅ for some names ñ and substitution σ.

(35)

2.3.2 Labelled Bisimilarity

The quantification over contexts makes the definition of observational equivalence hard to use in practice. Therefore, labelled bisimilarity is introduced, which is more suitable for both manual and automatic reasoning. Labelled bisimilarity relies on an equivalence relation between frames, called static equivalence, which we define first. Two substitutions may be seen as equivalent when they behave equivalently when applied to terms. This is shown by ≈S and called static equivalence.

Definition 2.12. (Static equivalence ≈S). Two closed framesϕ and ψ are statically

equiv-alent, and is written ϕ≈S ψ, when:

1. dom(ϕ) = dom(ψ)

2. For all terms M and N we have that (M =N)ϕ if and only if (M =N)ψ

We say that two closed extended process are statically equivalent, and denoted by A≈S B,

when their frames are statically equivalent ϕ(A)≈S ϕ(B).

This is called static because by using frames we are able to examine only the current state of the processes and not their dynamic behaviour. Static equivalence captures the static part of observational equivalence, more precisely they coincide on frames(Abadi et Fournet, 2001). Example 2.13. Let’s assume the simple signature Σ ={hash, pair, f st, snd} and the equa-tional theory satisfying f st(pair(x, y) = x and snd(pair(x, y) = y over variables x and y, where hash, f st and snd are unary functions and pair is a binary function. We could have the following statements:

• νm.{m/x} ≈S νn.{n/x}; Since they are structurally equivalent.

• νm.{m/x} ≈S νn.{hash(n)/x}

• {m/x} 6≈S {hash(n)/x}; Since the first one satisfies x = m but the second one does

not.

• νs.{pair(s, s)/x} 6≈S νs.{s/x}; Since the first one satisfies pair(f st(x), snd(x)) = x

but the second one does not.

The operational semantics is extended by a labelled operational semantics enabling us to reason about processes that interact with their environment. Labelled operational semantics depicted in table 2.6 defines the relation −→α where α is a label of the form c(M), ¯chui or νu.¯chui such that u is either a channel name or a variable of base type. It enables us to capture the dynamic part of observational equivalence.

The transition A −c−−(M→) B means that the process A performs an input of the term M from the environment on the channel c, and the resulting process is B. If the item is a free variablexor a free channeld, then the label ¯chxi, respectively ¯chdi, is used. If the item being

(36)

Table 2.6 Semantics for processes (3)

IN c(x).P −c−−(M→) P{M/x}

OUT-ATOM chui.P¯ −−→¯chui P

OPEN-ATOM A

νu.¯chui

−−−−→A0 u6=c νu.A−−−−→νu.¯chui A0

SCOPE A

α

− →A0

νu.A−→α νu.A0 u does not occur in α

PAR A α − →A0 bv(α)∩f v(B) = bn(α)∩f n(B) =∅ A|B →αA0_|B STRUCT A≡B B α − →B0 B0 ≡A0 A−→α A0

output is a restricted channel d, then the label νd.¯chdiis used. Finally, if the item is a term M, then the label νx.¯chxi is used, after replacing the occurrence of the term M by x and wrapping the process in νx.({M/x}| ).

Definition 2.14. (Labelled bisimilarity≈L). Labelled bisimilarity is the largest symmetric

relation R on closed extended processes, such that ARB implies: 1. A≈S B

2. ifA→A0, thenB →∗ B0 and A0RB0 for some B0

3. ifA−→α A0 and f v(α)⊆dom(A) and bn(α)∩f n(B) =∅, then B →∗₋_→→α ∗ _B0 _and

A0RB0 for some B0.

Clauses 2 and 3 of this definition correspond to classical notions of bisimilarity (Milner, 1999). Clause 1 asserts static equivalence at each step of the bisimulation.

Example 2.15. Consider a handshake protocol between a clientCand serverS as illustrated below.

This protocol is started with the request from a client C, then the server S generates a fresh session key k, and encrypts it using the client’s public key pkC. When C receives this

message he decrypts it using his private key and extracts the session key k. Finally C uses this key to symmetrically encrypt the secret s and sends the encrypted message to S. Note that the above notation is an incomplete description and does not cover all specific parts of the protocol such as the creation of nonces or the various checks, made by participants.

(37)

Figure 2.1 A Sample Protocol

This protocol is defined with respect to the signature Σ ={adec, aenc, sdec, senc}, where all are binary functions representing the asymmetric and symmetric key encryption. We define following equation over the variables x and y, in order to model the behaviour of encryptions:

adec(x, aenc(pk(x), y)) =y sdec(x, senc(x, y)) =y

The protocol can now be modelled in pi calculus as the process P, defined as follows: P = ν skS. ν skC ν s.

let pkS =pk(skS) in letpkC =pk(skC) in

(¯chpkSi|¯chpkCi|!S|!C)

S = c(pk).νk.¯chaenc(pk, k)i.

c(x). if sdec(k, x) =tag then Q C = c(x). let k =adec(skC, x).

¯

chsenc(k, s)i

In addition to the above used symmetric and asymmetric encryption, the theory allows us to model different cryptographic primitives such as: pairing, digital signature schemes and etc, which are briefly presented below.

Ordered Pairing Algebraic data structures such as ordered pairs, tuples, arrays, and lists appear in several examples. It is not difficult to encode them in the Π-calculus. For example, the signature Σ may contain the binary function symbol pair and an element can be extracted using the unary functions fst and snd. pair(x, y) may abbreviated to (x, y) and the fst and snd equations are as follows:

(38)

f st(pair(x, y)) =x snd(pair(x, y)) = y

One-way Hash function A one-way hash function is represented as a unary function symbol h with no equation. The absence of any equational theory associated with it ensures the one-wayness of h and its collision resistance by the fact that h(x) = h(y) only when x=y.

Asymmetric encryption An alternative and equivalent formalism for asymmetric encryp-tion considers two unary constructors pk and sk for generating public and secret keys from an agent, to capture the notion of constructing a key pair from an agent:

adec(aenc(m, pk(ag)), sk(ag)) =m

Digital signature In a similar manner to asymmetric encryption, digital signatures rely on a pair of signing keys. In each pair, the secret key serves for computing signatures and the public key for verifying those signatures. In order to model digital signatures and their checking, in addition to the unary function symbol pk from asymmetric encryption, the new binary function symbol sign for constructing signatures, the unary function symbol getmass that allows the adversary to get the message m from the signature even without having the key and the binary function checksign to check the signature, and returningm only when the signature is correct.

getmess(sign(m, k)) =m checksign(sign(m, k), pk(k)) = m

It is also possible to model signatures that do not reveal the message m, in the Handbook of Applied Cryptography four different classes of digital signature schemes are defined which all can be modelled (Menezes et al., 1997).

Exclusive-Or Finally, the XOR function can be modelled by: xor(xor(x, y), y) =x

(39)

CHAPTER 3

RELATED APPROACHES

In order to formally verify a protocol we first need to build a mathematical model of that system. Generally two kind of models have been considered in the literature; The Symbolic model and Computational model.

This chapter surveys the related works in the application of formal methods to the analysis of untraceability in RFID protocols. We will present works based on the both symbolic and computational model. We will also compare all definitions in this chapter with our proposed definitions and will provide a classification for untraceability property in next chapter. 3.1 Works in Symbolic Settings

In symbolic settings, the most closest works to our approach are the models provided by Arapinis (Arapiniset al., 2009) and Bruso (Brus´oet al., 2010). They presented approaches for verifying protocols based on process algebra, written in applied pi calculus. Their approach models privacy properties as indistinguishability properties using the concept of observational equivalence in applied pi. Moreover in the symbolic world, Deursen et al. proposed a formal definition of untraceability in a particular trace model (Van Deursen et al., 2008).

We start with the works that used process algebra in order to model the RFID protocol and the properties. An RFID protocol is modelled as some processes running concurrently in parallel. As it is mentioned in the previous chapter, to describe processes in the applied pi calculus, we define set of names presenting the communication channels and other constants. A signature Σ which consists of the function symbols which will be used to define terms. Generally in the symbolic settings the cryptographic primitives are represented by function symbols and the messages are considered as the terms on these primitives. One can also describe the equations which hold on terms constructed

3.1.1 Arapinis Model

Arapinis et al. (Arapinis et al., 2009) developed a definition of untraceability in the applied pi calculus independent of the computational models. They defined two levels of un-traceability. Strong untraceability that holds if an attacker cannot tell the difference between an RFID system in which all tags are different and a system in which some tags appear twice. The weak untraceability holds if an attacker cannot identify two particular runs of a protocol

(40)

as having involved the same tag. An RFID protocol due to their framework is modelled as a closed plain process P such that:

P =νn.(DB|!R|!T)

where DB is the database process, R is the reader and T is the process modelling a tag as: T = νm.init.!main. init is the process of registering the tag to the database DB, and the main models one session of the tag’s protocol.

They defined their strong untraceability as ensuring that an intruder thinks that each tag session is initiated by the different tag. It is modelled as a observational equivalence between process P and P0 where the process P0 defined as: P0 = νn.(DB|!R|!T0_{) where}

T0 =νm.init.mainand P is said strong untraceable if: P ≈P0

In other words the intruder should not be able to tell the difference between the protocol P and the ideal version of P0 in which the the tags are allowed to execute themselves at most once.

On the other hand, theirweak untraceability definition ensures that a tag can execute its protocol multiple times without an intruder being able to link these executions together. This is modelled using the equivalence of two processes Q and Q0 which in the former T1 and T2

are modelling two different tags such that each one initiated a different session(session1 and

session2), and in the laterT10 andT

0

2 are modelling two tags such that both of the mentioned

sessions are initiated from T₁0.

Q=νn.(DB|!R|!T|T1|T2) Q0 =νn.(DB|!R|!T|T0 1|T 0 2) where:

T1 =νm.init.(!main|mainsession1|mainsession3)

T2 =νm.init.(!main|mainsession2)

T₁0 =νm.init.(!main|mainsession1|mainsession2)

T₂0 =νm.init.(!main|mainsession3)

and Q is said weak untraceable if: Q≈Q0 .

3.1.2 Bruso Model

Bruso et al.(Brus´o et al., 2010) defined untraceability property as equivalences in the applied pi calculus. They expressed their definition based on unlinkability game presented by Chatmon (Chatmon et al., 2006).