A Program Logic for Verifying Secure Routing Protocols



Chen Chen¹, Limin Jia², Hao Xu¹, Cheng Luo¹, Wenchao Zhou³, and Boon Thau Loo¹

¹ University of Pennsylvania {chenche,haoxu,boonloo}@cis.upenn.edu

² Carnegie Mellon [email protected]

³ Georgetown University [email protected]

Abstract. The Internet, as it stands today, is highly vulnerable to attacks. However, little has been done to understand and verify the formal security guarantees of proposed secure inter-domain routing protocols, such as Secure BGP (S-BGP). In this paper, we develop a sound program logic for SANDLog, a declarative specification language for secure routing protocols, for verifying properties of these protocols. We prove invariant properties of SANDLog programs that run in an adversarial environment. As a step towards automated verification, we implement a verification condition generator (VCGen) to automatically extract proof obligations. VCGen is integrated into a compiler for SANDLog that can generate executable protocol implementations; thus, both verification and empirical evaluation of secure routing protocols can be carried out in this unified framework. To validate our framework, we (1) encoded several proposed secure routing mechanisms in SANDLog, (2) verified variants of path authenticity properties by manually discharging the generated verification conditions in Coq, and (3) generated executable code based on the SANDLog specification and ran the code in simulation.

1 Introduction

In recent years, we have witnessed an explosion of services provided over the Internet. These services are increasingly transferring customers’ private information over the network and being used in mission-critical tasks. Central to ensuring the reliability and security of these services is a secure and efficient Internet routing infrastructure. Unfortunately, the Internet infrastructure, as it stands today, is highly vulnerable to attacks. The Internet runs the Border Gateway Protocol (BGP), where routers are grouped into Autonomous Systems (ASes) administered by Internet Service Providers (ISPs). Individual ASes exchange route advertisements with neighboring ASes using the path-vector protocol. Each originating AS first sends a route advertisement (containing a single AS number) for the IP prefixes it owns. Whenever an AS receives a route advertisement, it adds itself to the AS path, and advertises the best route to its neighbors based on its routing policies. Since these route advertisements are not authenticated, ASes can advertise non-existent routes or claim to own IP prefixes that they do


[Figure 1 components: SANDLog program and annotations; SANDLog compiler (code generation, verification condition generation); executable protocol; proof obligations; theorem prover; simulator (emulator).]

Fig. 1. Architecture of a unified framework for implementing and verifying secure routing protocols. The round objects represent the inputs and outputs of the framework, which are either code or proofs. The rectangular objects are software components of the framework.

not. These faults may lead to long periods of interruption of the Internet, best epitomized by recent high-profile attacks [10, 24].

In response to these vulnerabilities, several new Internet routing architectures and protocols for a more secure Internet have been proposed. These range from security extensions of BGP (Secure-BGP (S-BGP) [19], ps-BGP [28], so-BGP [30]), to “clean-slate” Internet architectural redesigns such as SCION [31] and ICING [22]. However, none of the proposals formally analyzed their security properties. These protocols are implemented from scratch, evaluated primarily experimentally, and their security properties shown via informal reasoning.

Existing protocol analysis tools [7, 12, 14] are rarely used in analyzing routing protocols because routing protocols are considerably more complicated than cryptographic protocols: they often compute local states, are recursive, and their security properties need to be shown to hold on arbitrary network topologies. As the number of models is infinite, model-checking-based tools, in general, cannot be used to prove the protocol secure.

To overcome the above limitations, we develop a novel proof methodology to verify these protocols. We augment prior work on declarative networking (NDLog) [21] with cryptographic libraries to provide compact encoding of secure routing protocols. We call this extension SANDLog (Secure and Authenticated Network DataLog). It has been shown that such a Datalog-like language can be used for implementing a variety of network protocols [21]. We develop a program logic for reasoning about SANDLog programs that run in an adversarial environment. Based on the program logic, we implement a verification condition generator (VCGen), which takes as inputs the SANDLog program and user-provided annotations, and outputs intermediary proof obligations as lemma statements in Coq’s syntax. Proofs for these lemmas are later completed manually. VCGen is integrated into the SANDLog compiler, which augments the declarative networking engine RapidNet [26] to handle cryptographic functions. The compiler is able to translate a SANDLog specification into executable code, which is amenable to implementation and evaluation. Both verification and empirical evaluation of secure routing protocols can be carried out in this unified framework (Figure 1).

We summarize our technical contributions:

1. We define a program logic for verifying SANDLog programs in the presence of adversaries (Section 3). We prove that our logic is sound.

2. We implement VCGen for automatically generating proof obligations and integrate VCGen into a compiler for SANDLog (Section 4).


3. We encode S-BGP and SCION in SANDLog, verify path authenticity properties of these protocols, and run them in simulation (Section 5).

Due to space constraints, we omit many details, which can be found in our companion technical report [9].

2 SANDLog

We introduce the syntax and operational semantics of SANDLog, which extends Network Datalog (NDLog) [21] with a library for cryptographic functions. The complete definitions can be found in our technical report [9].

2.1 Syntax

SANDLog’s syntax is summarized below. A SANDLog program is composed of a set of rules, each of which consists of a rule head and a rule body. A rule head is a tuple. A rule body consists of a list of body elements, which are either tuples or atoms. Atoms include assignments and inequality constraints. The binary operator $\mathit{bop}$ denotes inequality relations. Each SANDLog rule specifies that if all the tuples in the body are derivable and all the constraints specified by the atoms in the body are satisfied, then the head tuple is derivable. These features are shared between NDLog [21] and SANDLog. Unique to SANDLog are the cryptographic functions, denoted $f_c$, implemented as a library. This library includes commonly used functions such as signature generation and verification.

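\[
\begin{array}{llcl}
\text{Crypt func} & f_c & ::= & \mathsf{f\_sign\_asym} \mid \mathsf{f\_verify\_asym} \\
\text{Atom} & a & ::= & x := t \mid t_1 \;\mathit{bop}\; t_2 \\
\text{Terms} & t & ::= & x \mid c \mid \iota \mid f(\vec{t}) \mid f_c(\vec{t}) \\
\text{Body Elem} & B & ::= & p(\mathit{agB}) \mid a \\
\text{Arg List} & \mathit{ags} & ::= & \cdot \mid \mathit{ags}, x \mid \mathit{ags}, c \\
\text{Rule Body} & \mathit{body} & ::= & \cdot \mid \mathit{body}, B \\
\text{Body Args} & \mathit{agB} & ::= & @\iota, \mathit{ags} \\
\text{Head Args} & \mathit{agH} & ::= & \mathit{agB} \mid @\iota, \mathit{ags}, F_{\mathit{agg}}\langle x \rangle, \mathit{ags} \\
\text{Rule} & r & ::= & p(\mathit{agH}) \;\text{:-}\; \mathit{body} \\
\text{Program} & \mathit{prog}(\iota) & ::= & r_1, \cdots, r_k
\end{array}
\]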

To support distributed execution, SANDLog assumes that each node has a unique identifier denoted $\iota$. A SANDLog program $\mathit{prog}$ is parametrized over the identifier of the node it runs on. A location specifier, written $@\iota$, specifies where a tuple resides and is the first argument of a tuple. We require all body tuples to reside on the same node as the program. A rule head can specify a location different from its body tuples. When such a rule is executed, the derived head tuple is sent to the specified remote node. Finally, SANDLog supports aggregation functions (denoted $F_{\mathit{agg}}\langle x \rangle$), such as max and min, in the rule head.

An example program. The following program can be used to compute the best path between each pair of nodes in a network. s is the location parameter of the program, representing the ID of the node where the program is executing.


Each node stores three kinds of tuples: link(@s, d, c) means that there is a direct link from s to d with cost c; path(@s, d, c, p) means that p is a path from s to d with cost c; and bestPath(@s, d, c, p) states that p is the lowest-cost path between s and d.

sp1 path(@s, d, c, p) :- link(@s, d, c), p := [s, d].

sp2 path(@z, d, c, p) :- link(@s, z, c1), path(@s, d, c2, p1), c := c1 + c2, p := z::p1.

sp3 bestPath(@s, d, min⟨c⟩, p) :- path(@s, d, c, p).

Rule sp1 computes all one-hop paths based on direct links. Rule sp2 expresses that if there is a link from s to z of cost c1 and a path from s to d of cost c2, then there is a path from z to d with cost c1 + c2 (for simplicity, we assume links are symmetric, i.e., if there is a link from s to d with cost c, then a link from d to s with the same cost c also exists). Finally, rule sp3 aggregates all paths with the same pair of source and destination (s and d) to compute the best path. The arguments that appear before the aggregation denote the group-by keys.
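To make the reading of sp1, sp2 and sp3 concrete, the following is a minimal Python sketch that evaluates the three rules to a fixed point over a single global set of link tuples. It is an illustration only, not the SANDLog evaluation strategy: the names `evaluate`, `LINKS` and `BEST` are hypothetical, distribution across nodes is ignored, and a loop-free filter (which sp2 itself does not contain) is assumed so that the iteration stops.

```python
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str, int]                   # link(@s, d, c)
Path = Tuple[str, str, int, Tuple[str, ...]]  # path(@s, d, c, p)


def evaluate(links: Set[Link]) -> Dict[Tuple[str, str], Tuple[int, List[str]]]:
    """Evaluate sp1-sp3 to a fixed point over a global set of link tuples."""
    # sp1: every direct link yields a one-hop path.
    paths: Set[Path] = {(s, d, c, (s, d)) for (s, d, c) in links}

    changed = True
    while changed:
        changed = False
        for (s, z, c1) in links:
            for (s2, d, c2, p1) in list(paths):
                # sp2: link(@s,z,c1), path(@s,d,c2,p1) => path(@z,d,c1+c2,z::p1).
                # Assumption (not part of sp2): skip paths that would revisit z,
                # otherwise paths grow forever and no fixed point is reached.
                if s2 != s or z in p1:
                    continue
                new = (z, d, c1 + c2, (z,) + p1)
                if new not in paths:
                    paths.add(new)
                    changed = True

    # sp3: group by (s, d) and keep a minimum-cost path.
    best: Dict[Tuple[str, str], Tuple[int, List[str]]] = {}
    for (s, d, c, p) in paths:
        if (s, d) not in best or c < best[(s, d)][0]:
            best[(s, d)] = (c, list(p))
    return best


# Three nodes in a line with symmetric links of cost 1.
LINKS = {("A", "B", 1), ("B", "A", 1), ("B", "C", 1), ("C", "B", 1)}
BEST = evaluate(LINKS)
```

On this topology the only path from A to C goes through B, so `BEST[("A", "C")]` holds cost 2 and the path A, B, C.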

2.2 Operational Semantics

The operational semantics of SANDLog adopts a distributed execution model. Each node runs a designated program, and maintains a database of derived tuples in its local state. Nodes can communicate with each other by sending tuples over the network. The evaluation of SANDLog programs follows the PSN algorithm [20], and maintains the database incrementally. The semantics introduced here is similar to that of NDLog, except that we make explicit which tuples are derived, which are received, and which are sent over the network. This addition is crucial to specifying and proving protocol properties. The constructs needed for defining the operational semantics of SANDLog are presented below.

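\[
\begin{array}{llcl}
\text{Table} & \Psi & ::= & \cdot \mid \Psi, (n, P) \\
\text{Update} & u & ::= & +P \mid -P \\
\text{Update List} & \mathcal{U} & ::= & [u_1, \cdots, u_n] \\
\text{Network Queue} & \mathcal{Q} & ::= & \mathcal{U} \\
\text{Local State} & \mathcal{S} & ::= & (\iota, \Psi, \mathcal{U}, \mathit{prog}(\iota)) \\
\text{Configuration} & \mathcal{C} & ::= & \mathcal{Q} \rhd \mathcal{S}_1, \cdots, \mathcal{S}_n
\end{array}
\]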

We write $P$ to denote tuples. The database for storing all derived tuples on a node is denoted $\Psi$. Because there could be multiple derivations of the same tuple, we associate each tuple with a reference count $n$, recording the number of valid derivations for that tuple. An update is either an insertion of a tuple, denoted $+P$, or a deletion of a tuple, denoted $-P$. We write $\mathcal{U}$ to denote a list of updates. A node’s local state, denoted $\mathcal{S}$, consists of the node’s identifier $\iota$, the database $\Psi$, a list of unprocessed updates $\mathcal{U}$, and the program $\mathit{prog}$ that $\iota$ runs. A configuration of the network, written $\mathcal{C}$, is composed of a network update queue $\mathcal{Q}$, and the set of the local states of all the nodes in the network. The queue $\mathcal{Q}$ models the update messages sent across the network.
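The role of the reference count can be sketched in a few lines of Python. This is a hedged illustration under simple assumptions: the table is modeled as a dictionary from tuples to counts, updates as `("+", P)` or `("-", P)` pairs, and the helper name `apply_update` is hypothetical rather than part of SANDLog or RapidNet.

```python
from typing import Dict, Tuple

Tup = Tuple                 # a tuple P, e.g. ("link", "A", "B", 1)
Update = Tuple[str, Tup]    # ("+", P) is an insertion, ("-", P) a deletion


def apply_update(table: Dict[Tup, int], update: Update) -> None:
    """Apply one update to a table that maps each tuple to its reference count."""
    sign, tup = update
    if sign == "+":
        # One more valid derivation of the same tuple.
        table[tup] = table.get(tup, 0) + 1
    elif sign == "-":
        count = table.get(tup, 0)
        if count <= 1:
            # The last derivation is gone, so the tuple leaves the table.
            table.pop(tup, None)
        else:
            table[tup] = count - 1
    else:
        raise ValueError(f"unknown update sign: {sign!r}")


psi: Dict[Tup, int] = {}
link = ("link", "A", "B", 1)
for u in [("+", link), ("+", link), ("-", link)]:
    apply_update(psi, u)
```

After two insertions and one deletion the tuple is still stored with count 1; only a second deletion removes it.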

Figure 2 presents an example scenario of executing the shortest-path program shown in Section 2.1. The network consists of three nodes, A, B and C, connected by two links with cost 1. In the current state, all three nodes are aware of their direct neighbors, i.e., link tuples are in their databases $\Psi_A$, $\Psi_B$ and $\Psi_C$. They have constructed paths to their neighbors (i.e., the corresponding path and bestPath tuples are stored). In addition, node B has applied sp2 and generated updates


[Figure 2 contents: nodes A, B and C connected in a line; the link between A and B and the link between B and C each have cost 1.]

S_A = {A, Ψ_A = {link(@A,B,1), path(@A,B,1,[A,B]), bestPath(@A,B,1,[A,B])}, U_A = [], prog_A = sp}

S_B = {B, Ψ_B = {link(@B,A,1), link(@B,C,1), path(@B,A,1,[B,A]), path(@B,C,1,[B,C]), bestPath(@B,A,1,[B,A]), bestPath(@B,C,1,[B,C])}, U_B = [], prog_B = sp}

S_C = {C, Ψ_C = {link(@C,B,1), path(@C,B,1,[C,B]), bestPath(@C,B,1,[C,B])}, U_C = [], prog_C = sp}

Q = [+path(@A,C,2,[A,B,C]), +path(@C,A,2,[C,B,A])]

Fig. 2. An example scenario.

+path(@A,C,2,[A,B,C]) and +path(@C,A,2,[C,B,A]), which are currently queued and waiting to be delivered to their destinations (nodes A and C respectively).

Top-level transitions. The small-step operational semantics of a node is denoted $\mathcal{S} \hookrightarrow \mathcal{S}', \mathcal{U}$. From state $\mathcal{S}$, a node takes a step to a new state $\mathcal{S}'$ and generates a set of updates $\mathcal{U}$ for other nodes in the network. The small-step operational semantics of the entire system is denoted $\mathcal{C} \xrightarrow{\tau} \mathcal{C}'$, where $\tau$ is the time of the transition step. A trace $\mathcal{T}$ is a sequence of transitions: $\xrightarrow{\tau_0} \mathcal{C}_1 \xrightarrow{\tau_1} \mathcal{C}_2 \cdots \xrightarrow{\tau_n} \mathcal{C}_{n+1}$, where the time points on the trace are monotonically increasing ($\tau_0 < \tau_1 < \cdots < \tau_n$). We assume that the effects of a transition take place at time $\tau_i$ (reflected in $\mathcal{C}_{i+1}$). Figure 3 defines the rules for system state transition.

Rule NodeStep states that the system takes a step when one node takes a step. As a result, the updates generated by node $i$ are appended to the end of the network queue. We use $\circ$ to denote the list append operation. Rule DeQueue applies when a node receives updates from the network. We write $\mathcal{Q}_1 \oplus \mathcal{Q}_2$ to denote a merge of two lists. Any node can dequeue updates sent to it and append those updates to the update list in its local state. Here, we overload the $\circ$ operator, and write $\mathcal{S} \circ \mathcal{Q}$ to denote a new state, which is the same as $\mathcal{S}$, except that the update list is the result of appending $\mathcal{Q}$ to the update list in $\mathcal{S}$.

We omit the detailed rules for state transitions within a node. Instead, we explain them through examples. At a high level, those rules either fire base rules (rules that do not have a rule body) at initialization, or compute new updates based on the program and the first update in the update list. Continuing the example scenario, node A dequeues +path(@A,C,2,[A,B,C]), and puts it into the unprocessed update list $\mathcal{U}_A$ (rule DeQueue). Node A then fires all rules that are triggered by the update, and generates new updates $\mathcal{U}_{in}$ and $\mathcal{U}_{ext}$ ($\mathcal{U}_{in}$ and $\mathcal{U}_{ext}$ denote updates to local (internal) states and remote (external) states respectively). In the resulting state, the local state of node A is updated: path(@A,C,2,[A,B,C]) is

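\[
\boxed{\mathcal{C} \longrightarrow \mathcal{C}'}
\]

\[
\frac{\mathcal{S}_i \hookrightarrow \mathcal{S}_i', \mathcal{U} \qquad \forall j \in [1, n] \wedge j \neq i,\ \mathcal{S}_j' = \mathcal{S}_j}
     {\mathcal{Q} \rhd \mathcal{S}_1, \cdots, \mathcal{S}_n \longrightarrow \mathcal{Q} \circ \mathcal{U} \rhd \mathcal{S}_1', \cdots, \mathcal{S}_n'}\ \textsc{NodeStep}
\]

\[
\frac{\mathcal{Q} = \mathcal{Q}' \oplus \mathcal{Q}_1 \cdots \oplus \mathcal{Q}_n \qquad \forall j \in [1, n],\ \mathcal{S}_j' = \mathcal{S}_j \circ \mathcal{Q}_j}
     {\mathcal{Q} \rhd \mathcal{S}_1, \cdots, \mathcal{S}_n \longrightarrow \mathcal{Q}' \rhd \mathcal{S}_1', \cdots, \mathcal{S}_n'}\ \textsc{DeQueue}
\]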


inserted into $\Psi_A$, and $\mathcal{U}_A$ now includes $\mathcal{U}_{in}$. The network queue is updated to include $\mathcal{U}_{ext}$ (rule NodeStep).
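The two top-level rules can be sketched as operations on a small in-memory model. The following Python is a simplified illustration, not the authors' implementation: the classes `State` and `Config` and the functions `dequeue` and `node_step` are hypothetical, the database and program components of the local state are left out, and `dequeue` delivers every deliverable update at once, whereas rule DeQueue allows any part of the queue to be delivered.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Update = Tuple[str, Tuple]   # e.g. ("+", ("path", "A", "C", 2, ("A", "B", "C")))


@dataclass
class State:
    node: str
    updates: List[Update] = field(default_factory=list)  # unprocessed updates


@dataclass
class Config:
    queue: List[Update]
    states: Dict[str, State]


def dest(update: Update) -> str:
    """The location specifier is the first argument of the tuple."""
    return update[1][1]


def dequeue(cfg: Config) -> None:
    """DeQueue: append queued updates to the update list of their destination."""
    remaining = []
    for u in cfg.queue:
        if dest(u) in cfg.states:
            cfg.states[dest(u)].updates.append(u)
        else:
            remaining.append(u)   # undeliverable updates stay in the queue
    cfg.queue = remaining


def node_step(cfg: Config, node: str,
              local_step: Callable[[State], List[Update]]) -> None:
    """NodeStep: one node steps; its external updates go to the end of the queue."""
    cfg.queue.extend(local_step(cfg.states[node]))


cfg = Config(
    queue=[("+", ("path", "A", "C", 2, ("A", "B", "C"))),
           ("+", ("path", "C", "A", 2, ("C", "B", "A")))],
    states={n: State(n) for n in "ABC"},
)
dequeue(cfg)
```

After `dequeue(cfg)`, each of the two queued path updates sits in the update list of its destination node and the queue is empty, which mirrors the step described for node A above.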

Incremental maintenance. Following the strategy proposed in [20], the local database is maintained incrementally by processing updates one at a time. The rules are rewritten into ∆ rules, which can efficiently generate all the updates triggered by one update. For any given rule $r$ that contains $k$ body tuples, $k$ ∆ rules of the following form are generated, one for each $i \in [1, k]$.

\[
\Delta p(\mathit{agH}) \;\text{:-}\; p^{\nu}_1(\mathit{agB}_1), \ldots, p^{\nu}_{i-1}(\mathit{agB}_{i-1}), \Delta p_i(\mathit{agB}_i), p_{i+1}(\mathit{agB}_{i+1}), \ldots, p_k(\mathit{agB}_k), a_1, \ldots, a_m
\]

$\Delta p_i$ in the body denotes the update currently being considered. $\Delta p$ in the head denotes new updates that are generated as the result of firing this rule. Here $p^{\nu}$ denotes the set of tuples whose name is $p$ and includes the current update being considered. $p$ is drawn only from the set of tuples that does not include the current update. For example, the ∆ rules for sp2 are:

sp2a ∆path(@z, d, c, p) :- ∆link(@s, z, c1), path(@s, d, c2, p1), c := c1 + c2, p := z::p1.

sp2b ∆path(@z, d, c, p) :- link^ν(@s, z, c1), ∆path(@s, d, c2, p1), c := c1 + c2, p := z::p1.

Rules sp2a and sp2b are ∆ rules triggered by updates of the link and path relation respectively. For instance, when node A processes +path(@A,C,2,[A,B,C]), only rule sp2b is fired. In this step, path^ν includes the tuple path(@A,C,2,[A,B,C]), while path does not. On the other hand, link^ν and link denote the same set of tuples, because the update is a path tuple, and thus does not affect tuples with a different name.

Rule firing. We explain through examples how ∆ rules fire. Informally, a ∆ rule is fired if instantiations of its body tuples are present in the derived tuples and available updates. The resulting rule head will be put into one of the update lists, depending on whether it needs to be sent to another node or consumed locally. We revisit the example in Figure 2. Upon receiving +path(@A,C,2,[A,B,C]), A will trigger ∆ rule sp2b and generate a new update +path(@B,C,3,[B,A,B,C]), which will be included in $\mathcal{U}_{ext}$ as it is destined to a remote node B. The ∆ rule for sp3 will also be triggered and will generate a new update +bestPath(@A,C,2,[A,B,C]), which will be included in $\mathcal{U}_{in}$. After evaluating the ∆ rules triggered by the update +path(@A,C,2,[A,B,C]), we have $\mathcal{U}_{in}$ = {+bestPath(@A,C,2,[A,B,C])} and $\mathcal{U}_{ext}$ = {+path(@B,C,3,[B,A,B,C])}. In addition, bestpathagg, the auxiliary relation that maintains all candidate tuples for bestpath, is also updated to reflect that a new candidate tuple has been generated. It now includes bestpathagg(@A,C,2,[A,B,C]).
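The firing of ∆ rule sp2b in this example can be replayed with a short Python sketch. It is an illustration under stated assumptions: the function name `fire_sp2b` and the tuple encoding are hypothetical, only insertions are handled, and the ∆ rule for sp3 (which produces the bestPath update) is not modeled.

```python
from typing import List, Set, Tuple

Link = Tuple[str, str, int]   # link(@s, z, c1), stored at node s
Update = Tuple[str, tuple]    # ("+", ("path", s, d, c2, p1))


def fire_sp2b(node: str, links: Set[Link], update: Update):
    """Fire sp2b at `node` for one inserted path tuple.

    Returns (u_in, u_ext): derived updates that stay local and derived
    updates that must be sent to a remote node.
    """
    sign, (name, s, d, c2, p1) = update
    if sign != "+" or name != "path" or s != node:
        return [], []
    u_in: List[Update] = []
    u_ext: List[Update] = []
    for (ls, z, c1) in sorted(links):
        if ls != s:
            continue
        # Head of sp2b: path(@z, d, c1 + c2, z::p1).
        derived = ("+", ("path", z, d, c1 + c2, (z,) + tuple(p1)))
        (u_in if z == node else u_ext).append(derived)
    return u_in, u_ext


# Node A knows link(@A,B,1) and processes +path(@A,C,2,[A,B,C]).
U_IN, U_EXT = fire_sp2b(
    "A", {("A", "B", 1)}, ("+", ("path", "A", "C", 2, ("A", "B", "C")))
)
```

The single derived update is +path(@B,C,3,[B,A,B,C]); because its location specifier is B rather than A, it lands in the external list.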

Discussion. The semantics introduced here will not terminate for programs with a cyclic derivation of the same tuple, even though set-based semantics will. Most routing protocols do not have this issue (e.g., cycle detection is widely adopted in routing protocols). Our prior work [23] has proposed improvements to solve this issue. It is a straightforward extension to the current semantics and is not crucial for demonstrating the soundness of the program logic we develop.

The operational semantics is correct if the results are the same as one where all rules reside in one node and a global fixed point is computed at each round. The proof of correctness is out of the scope of this paper. We are working on correctness definitions and proofs for variants of PSN algorithms. Our initial results for a simpler language can be found in [23]. SANDLog additionally allows aggregates, which are not included in [23]. The soundness of our logic only depends on the specific evaluation strategy implemented by the compiler, and is orthogonal to the correctness of the operational semantics. Updates to the operational semantics are likely to come in some form of additional bookkeeping in the representation of tuples, which we believe will not affect the overall structure of the program logic, as these metadata are irrelevant to the logic.

3 A Program Logic for SANDLog

Inspired by program logics for reasoning about cryptographic protocols [12, 15], we define a program logic for SANDLog. The properties we are interested in are safety properties, which should hold throughout the execution of SANDLog programs that interact with attackers.

Attacker model. We assume connectivity-bound network attackers, a variant of the Dolev-Yao network attacker model. An attacker can send and receive messages to and from its neighbors. We assume a symbolic model of the cryptographic functions: an attacker can operate cryptographic functions to which it has the correct keys, such as encryption, decryption, and signature generation. This model does not allow an attacker to eavesdrop or intercept packets. This makes sense in the application domain that we consider, as attackers are malicious nodes in the network that participate in the routing protocol exchange. All the links we consider represent dedicated physical cables that connect neighboring nodes, which are hard to eavesdrop on without physical intrusion.

This attacker model manifests in the formal system in two places: (1) the network is modeled as connected nodes, some of which run the SANDLog program that encodes the prescribed protocol and others are malicious and run arbitrary SANDLog programs; (2) assumptions about cryptographic functions are admitted as axioms in proofs.

Syntax. We use first-order logic formulas, denoted $\varphi$, as property specifications. The atoms, denoted $A$, include predicates and term inequalities.

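\[
\begin{array}{lcl}
\text{Atoms}\quad A & ::= & P(\vec{t})@(\iota, \tau) \mid \mathsf{send}(\iota, \mathsf{tp}(P, \iota', \vec{t}))@\tau \mid \mathsf{recv}(\iota, \mathsf{tp}(P, \vec{t}))@\tau \\
 & \mid & \mathsf{honest}(\iota, \mathit{prog}(\iota), \tau) \mid t_1 \;\mathit{bop}\; t_2
\end{array}
\]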

Predicate $P(\vec{t})@(\iota, \tau)$ means that tuple $P(\vec{t})$ is derivable at time $\tau$ by node $\iota$. The first element in $\vec{t}$ is a location identifier $\iota'$, which may be different from $\iota$. When a tuple $P(\iota', ...)$ is derived at node $\iota$, it is sent to $\iota'$. This send action is captured by predicate $\mathsf{send}(\iota, \mathsf{tp}(P, \iota', \vec{t}))@\tau$. Predicate $\mathsf{recv}(\iota, \mathsf{tp}(P, \vec{t}))@\tau$ denotes that node $\iota$ has received a tuple $P(\vec{t})$ at time $\tau$. $\mathsf{honest}(\iota, \mathit{prog}(\iota), \tau)$ means that node $\iota$ starts to run program $\mathit{prog}(\iota)$ at time $\tau$. Since predicates take time points as an argument, we are effectively encoding linear temporal logic (LTL) in first-order logic [18]. Using these atoms and first-order logic connectives, we can specify security properties such as route authenticity (see Section 5 for details).

Logical judgments. The logical judgments use two contexts: context $\Sigma$ contains all the free variables and $\Gamma$ contains logical assumptions.


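\[
\boxed{\Sigma;\Gamma \vdash \mathit{prog}(i) : \{i, t_b, t_e\}.\varphi(i, t_b, t_e)}
\]

\[
\frac{
\begin{array}{c}
\forall p \in \mathsf{hdOf}(\mathit{prog}),\ \varphi_p \text{ is closed under trace extension} \\
\forall r \in \mathsf{rlOf}(\mathit{prog}),\ r = h(\vec{v}) \;\text{:-}\; p_1(\vec{s}_1), \ldots, p_m(\vec{s}_m), q_1(\vec{u}_1), \ldots, q_n(\vec{u}_n), a_1, \ldots, a_k \\
\Sigma;\Gamma \vdash \forall i, \forall t, \forall \vec{y},\ \bigwedge_{j \in [1,k]} [a_j] \wedge \bigwedge_{j \in [1,m]} \big(p_j(\vec{s}_j)@(i, t) \wedge \varphi_{p_j}(i, t, \vec{s}_j)\big) \wedge \bigwedge_{j \in [1,n]} \mathsf{recv}(i, \mathsf{tp}(q_j, \vec{u}_j))@t \supset \varphi_h(i, t, \vec{v}) \quad \text{where } \vec{y} = \mathsf{fv}(r)
\end{array}
}{
\Sigma;\Gamma \vdash \mathit{prog}(i) : \{i, y_b, y_e\}.\ \bigwedge_{p \in \mathsf{hdOf}(\mathit{prog})} \forall t, \forall \vec{x},\ y_b \le t < y_e \wedge p(\vec{x})@(i, t) \supset \varphi_p(i, t, \vec{x})
}\ \textsc{Inv}
\]

\[
\boxed{\Sigma;\Gamma \vdash \varphi}
\]

\[
\frac{\Sigma;\Gamma \vdash \mathit{prog}(i) : \{i, y_b, y_e\}.\varphi(i, y_b, y_e) \qquad \Sigma;\Gamma \vdash \mathsf{honest}(\iota, \mathit{prog}(\iota), t)}
     {\Sigma;\Gamma \vdash \forall t', t' > t, \varphi(\iota, t, t')}\ \textsc{Honest}
\]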

Fig. 4. Selected rules in the program logic.

(1) $\Sigma;\Gamma \vdash \varphi$ (2) $\Sigma;\Gamma \vdash \mathit{prog}(i) : \{i, y_b, y_e\}.\varphi(i, y_b, y_e)$

Judgment (1) states that $\varphi$ is provable given the assumptions in $\Gamma$. Judgment (2) is an assertion about SANDLog programs. We write $\varphi(\vec{x})$ when $\vec{x}$ are free in $\varphi$. $\varphi(\vec{t})$ denotes the resulting formula of substituting $\vec{t}$ for $\vec{x}$ in $\varphi(\vec{x})$. Recall that $\mathit{prog}$ is parametrized over the identifier of the node it runs on. The assertion of an invariant property for such a program is parametrized over not only the node ID $i$, but also the starting point of executing the program ($y_b$) and a later time point $y_e$. Judgment (2) states that any trace $\mathcal{T}$ containing the execution of program $\mathit{prog}$ by a node $\iota$, starting at time $\tau_b$, satisfies $\varphi(\iota, \tau_b, \tau_e)$ for any time point $\tau_e$ later than $\tau_b$. Intuitively, the trace contains several threads running concurrently; only one of them runs the program and the other threads can be malicious. Since $\tau_e$ is any time after $\tau_b$ (the time $\mathit{prog}$ starts), $\varphi$ is an invariant property of $\mathit{prog}$. For example, $\varphi(i, y_b, y_e)$ could specify that whenever $i$ derives a path tuple, every link in the path must have existed in the past.

Inference rules. The inference rules of our program logic include all standard first-order logic ones (e.g., modus ponens), omitted for brevity. We explain two key rules (Figure 4) in our proof system.

Rule Inv derives an invariant property of a program $\mathit{prog}$. The invariant property states that if a tuple $p$ is derived by this program, then some property $\varphi_p$ must be true; formally: $\forall t, \forall \vec{x},\ y_b \le t < y_e \wedge p(\vec{x})@(i, t) \supset \varphi_p(i, t, \vec{x})$, where $p$ is the tuple name of a rule head of $\mathit{prog}$, and $\varphi_p(i, t, \vec{x})$ is an invariant property associated with $p(\vec{x})$. For example, $p$ can be path, and $\varphi_p(i, t, \vec{x})$ can be that every link in the path argument must have existed in the past. Rule Inv states that the invariant of the program is the conjunction of all the invariants of the tuples it derives.

We require that the invariants $\varphi_p$ be closed under trace extension (the first premise of Inv). Formally: if $\mathcal{T} \vDash \varphi(\iota, t, \vec{s})$ and $\mathcal{T}$ is a prefix of $\mathcal{T}'$, then $\mathcal{T}' \vDash \varphi(\iota, t, \vec{s})$. For instance, the property that node $\iota$ has received a tuple $P$ before time $t$ is closed under trace extension; the property that node $\iota$ never sends $P$ to the network is not closed under trace extension. This restriction has not affected our case studies: the invariants used in verification only assert what happened in the past, or facts independent of time (e.g., arithmetic constraints).
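The difference between the two example properties can be checked on a toy trace. The sketch below is only an illustration: a trace is modeled as a list of hypothetical `(time, kind, node, tuple name)` events, which is far simpler than the configurations used in the formal semantics, and the function names are assumptions made for this example.

```python
from typing import List, Tuple

Event = Tuple[int, str, str, str]   # (time, "send" or "recv", node, tuple name)


def received_before(trace: List[Event], node: str, name: str, t: int) -> bool:
    """True if `node` received a tuple called `name` at some time before t."""
    return any(kind == "recv" and n == node and p == name and tau < t
               for (tau, kind, n, p) in trace)


def never_sends(trace: List[Event], node: str, name: str) -> bool:
    """True if `node` sends no tuple called `name` anywhere on the trace."""
    return not any(kind == "send" and n == node and p == name
                   for (_, kind, n, p) in trace)


prefix = [(1, "recv", "A", "path")]
extended = prefix + [(2, "send", "A", "path")]
```

Extending `prefix` to `extended` cannot invalidate `received_before`, since the witnessing event is still on the longer trace. `never_sends` holds on `prefix` but fails on `extended`, which is why it is not closed under trace extension.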

Intuitively, the premises of Inv need to establish that (1) when $p$ is a base tuple (its derivation is independent of any other tuples), $\varphi_p$ holds; and (2) when $p$ is derived using other tuples, $\varphi_p$ holds. The last (second) premise of Inv does precisely that. It checks every rule in $\mathit{prog}$ and proves that the body tuples and the invariants associated with the body tuples together imply the invariant of the head tuple. For example, for sp1, to show that the invariant associated with path is true, we can use the fact that there is a link tuple, that the invariant associated with that link tuple is true, and that the constraint p = [s, d] is true. This is sound because we are inducting over the derivation tree of the head tuple. This premise looks complicated because the body tuples need to be treated differently depending on whether they are derived locally, received from the network, or constraints. For each rule $r$ in $\mathit{prog}$, we assume that the body of $r$ is arranged so that the first $m$ tuples are derived by $\mathit{prog}$, the next $n$ tuples are received from the network, and constraints constitute the rest of the body. The right-hand side of the implication of the last (second) premise is the invariant associated with tuple $h$. A rule head is only derivable when all of its body tuples are derivable and its constraints are satisfied. For tuples that are derived earlier by $\mathit{prog}$ (denoted $p_j$), we can safely assume that their invariants hold at time $t$. All received tuples ($q_j$) should have been received prior to rule firing. Finally, the atoms (constraints, denoted $a_j$) should be true. Here, $[x := f(\vec{t})]$ rewrites the assignment statement into an equality check $x = f(\vec{t})$. The left-hand side of that implication is a conjunction of formulas denoting the above conditions. When $r$ only has a rule head, this premise is reduced to the right-hand side of that implication, which is case (1) mentioned above.

The last (second) premise of Inv can be automatically generated given a SANDLog program and all the corresponding $\varphi_p$s. In Section 4, we detail the implementation of the verification condition generator for Coq.

The Honest rule proves properties of the entire system based on invariants of a SANDLog program. If $\varphi(i, y_b, y_e)$ is the invariant of $\mathit{prog}$, and a node $\iota$ runs the program $\mathit{prog}$ at time $t_b$, then any trace containing the execution of this program satisfies $\varphi(\iota, t_b, t_e)$, where $t_e$ is a time point after $t_b$. SANDLog programs never terminate: after the last instruction, the program enters a stuck state.

Soundness. We prove the soundness of our logic with regard to the trace semantics. First, we define the semantics for our logic and judgments in Figure 5. Formulas are interpreted on a trace $\mathcal{T}$. We elide the rules for first-order logic connectives. A tuple $P(\vec{t})$ is derivable by node $\iota$ at time $\tau$, if $P(\vec{t})$ is either an internal update or an external update generated at a time point $\tau'$ no later than $\tau$. A node $\iota$ sends out a tuple $P(\iota', \vec{t})$ if that tuple was derived by node $\iota$. Because $\iota'$ is different from $\iota$, it is sent over the network. A received tuple is one that comes from the network (obtained using DeQueue). Finally, an honest node $\iota$ runs $\mathit{prog}$ at time $\tau$ and the local state of $\iota$ at time $\tau$ is the initial state with an empty table and update queue.


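$\mathcal{T} \vDash P(\vec{t})@(\iota, \tau)$ iff $\exists \tau' \le \tau$, $\mathcal{C}$ is the configuration on $\mathcal{T}$ prior to time $\tau'$, $(\iota, \Psi, \mathcal{U}, \mathit{prog}(\iota)) \in \mathcal{C}$, at time $\tau'$, $(\iota, \Psi, \mathcal{U}, \mathit{prog}(\iota)) \hookrightarrow (\iota, \Psi', \mathcal{U}' \circ \mathcal{U}_{in}, \mathit{prog}(\iota)), \mathcal{U}_e$, and either $P(\vec{t}) \in \mathcal{U}_{in}$ or $P(\vec{t}) \in \mathcal{U}_e$

$\mathcal{T} \vDash \mathsf{send}(\iota, \mathsf{tp}(P, \iota', \vec{t}))@\tau$ iff $\mathcal{C}$ is the configuration on $\mathcal{T}$ prior to time $\tau$, $(\iota, \Psi, \mathcal{U}, \mathit{prog}(\iota)) \in \mathcal{C}$, at time $\tau$, $(\iota, \Psi, \mathcal{U}, \mathit{prog}(\iota)) \hookrightarrow \mathcal{S}', \mathcal{U}_e$ and $P(@\iota', \vec{t}) \in \mathcal{U}_e$

$\mathcal{T} \vDash \mathsf{recv}(\iota, \mathsf{tp}(P, \vec{t}))@\tau$ iff $\exists \tau' \le \tau$, $\mathcal{C} \xrightarrow{\tau'} \mathcal{C}' \in \mathcal{T}$, $\mathcal{Q}$ is the network queue in $\mathcal{C}$, $P(\vec{t}) \in \mathcal{Q}$, $(\iota, \Psi, \mathcal{U}, \mathit{prog}(\iota)) \in \mathcal{C}'$ and $P(\vec{t}) \in \mathcal{U}$

$\mathcal{T} \vDash \mathsf{honest}(\iota, \mathit{prog}(\iota), \tau)$ iff at time $\tau$, node $\iota$’s local state is $(\iota, [], [], \mathit{prog}(\iota))$

$\Gamma \vDash \mathit{prog}(i) : \{i, y_b, y_e\}.\varphi(i, y_b, y_e)$ iff given any trace $\mathcal{T}$ such that $\mathcal{T} \vDash \Gamma$, and at time $\tau_b$, node $\iota$’s local state is $(\iota, [], [], \mathit{prog}(\iota))$, given any time point $\tau_e$ such that $\tau_e \ge \tau_b$, it is the case that $\mathcal{T} \vDash \varphi(\iota, \tau_b, \tau_e)$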

Fig. 5. Trace-based semantics.

The semantics of invariant assertion states that if a trace $\mathcal{T}$ contains the execution of $\mathit{prog}$ by node $\iota$ (formally defined as the node running $\mathit{prog}$ being one of the nodes in the configuration $\mathcal{C}$), then given any time point $\tau_e$ after $\tau_b$, the trace $\mathcal{T}$ satisfies $\varphi(\iota, \tau_b, \tau_e)$. This definition allows $\mathit{prog}$ to run concurrently with other programs, some of which may be controlled by the adversary.

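The program logic is proven to be sound with regard to the trace semantics.

Theorem 1 (Soundness)

1. If $\Sigma;\Gamma \vdash \varphi$, then for all grounding substitutions $\sigma$ for $\Sigma$, given any trace $\mathcal{T}$, $\mathcal{T} \vDash \Gamma\sigma$ implies $\mathcal{T} \vDash \varphi\sigma$;

2. If $\Sigma;\Gamma \vdash \mathit{prog}(i) : \{i, y_b, y_e\}.\varphi(i, y_b, y_e)$, then for all grounding substitutions $\sigma$ for $\Sigma$, $\Gamma\sigma \vDash (\mathit{prog})\sigma(i) : \{i, y_b, y_e\}.(\varphi(i, y_b, y_e))\sigma$.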

4 Verification Condition Generator

As a step towards automated verification, we implement a verification condition generator (VCGen) to automatically extract proof obligations from a SANDLog program. VCGen is implemented in C++ and fully integrated into RapidNet [26], a declarative networking engine for compiling SANDLog programs. We target Coq, but other interactive theorem provers such as Isabelle/HOL are possible.

More concretely, VCGen generates lemmas corresponding to the last premise of rule Inv. It takes as inputs the abstract syntax tree of a SANDLog program sp and type annotations tp. The generated Coq file contains the following: (1) definitions for types, predicates, and functions; (2) lemmas for rules in the SANDLog program; and (3) axioms based on the Honest rule.

Definition. Predicates and functions are declared before they are used. Each predicate (tuple) p in the SANDLog program corresponds to a predicate of the same name in the Coq file, with two additional arguments: a location specifier and a time point. For example, the generated declaration of the link tuple link(@node, node) is the following:

Variable link: node -> node -> node -> time -> Prop.

For each user-defined function, a data constructor of the same name is generated, unless it corresponds to one of Coq’s built-in operators (e.g., list operations). The function takes a time point as an additional argument.


link(@n, n′): there is a link between n and n′.

route(@n, d, c, p, sl): p is a path to d with cost c. sl is the signature list associated with p.

prefix(@n, d): n owns prefix (IP addresses) d.

bestRoute(@n, d, c, p, sl): p is the best path to d with cost c. sl is the signature list associated with p.

verifyPath(@n, n′, d, p, sl, pOrig, sOrig): a path p to d needs verifying against signature list sl. p is a sub-path of pOrig, and s is a sub-list of sOrig.

signature(@n, m, s): n creates a signature s of message m with private key.

advertisement(@n′, n, d, p, sl): n advertises path p to neighbor n′ with signature list sl.

Fig. 6. Tuples for prog_sbgp.

Lemmas. For each rule in a SANDLog program, VCGen generates a lemma in the form of the last premise in inference rule Inv (Figure 4). Rule sp1 of the example program in Section 2.1, for instance, corresponds to the following lemma:

Lemma r1: forall (s:node) (d:node) (c:nat) (p:list node) (t:time),
  link s d c s t -> p = cons s (cons d nil) -> p-path s t s d c p t.

Here, cons is Coq’s built-in list constructor, which prepends an element to a list, and p-path is the invariant associated with predicate path.
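The shape of such a generated lemma can be mimicked with a small string-building sketch. This is not VCGen (which is implemented in C++ inside RapidNet) and does not reflect its interface: the function `lemma_for_rule` and its parameters are hypothetical, and the body atoms and the head invariant are assumed to be already rendered as Coq terms.

```python
from typing import List, Tuple


def lemma_for_rule(name: str,
                   variables: List[Tuple[str, str]],
                   body: List[str],
                   head_invariant: str) -> str:
    """Render a Coq lemma whose hypotheses are the rule body and whose
    conclusion is the invariant attached to the rule head."""
    binders = " ".join(f"({var}:{ty})" for var, ty in variables)
    # Chain the body atoms and the conclusion with Coq implications.
    chain = " -> ".join(body + [head_invariant])
    return f"Lemma {name}: forall {binders},\n  {chain}."


LEMMA_R1 = lemma_for_rule(
    "r1",
    [("s", "node"), ("d", "node"), ("c", "nat"), ("p", "list node"), ("t", "time")],
    ["link s d c s t", "p = cons s (cons d nil)"],
    "p-path s t s d c p t",
)
print(LEMMA_R1)
```

For a rule with no body (a base rule), passing an empty `body` list leaves only the head invariant as the lemma statement, matching case (1) of the Inv premise.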

Axioms. For each invariant $\varphi_p$ of a rule head $p$, VCGen produces an axiom of the form: $\forall i, t, \vec{x},\ \mathsf{Honest}(i) \supset p(\vec{x})@(i, t) \supset \varphi_p(i, \vec{x})$. These axioms are conclusions of the Honest rule after invariants are verified. Soundness of these axioms is backed by Theorem 1. Since we always assume that the program starts at time $-\infty$, the condition that $t > -\infty$ is always true, and is thus omitted.

5 Case Studies

We investigate two secure routing protocols: S-BGP and SCION. Due to space constraints, we present in detail the verification of one property of S-BGP. All SANDLog specifications and Coq proofs can be found online at http://netdb.cis.upenn.edu/forte2014/.

Encoding. Secure Border Gateway Protocol (S-BGP) provides security guarantees such as origin authenticity and route authenticity over BGP through PKI and signature-based attestations. Our SANDLog encoding includes all necessary details of S-BGP’s route attestation mechanisms. S-BGP requires that each node sign the route information it advertises to its neighbor, which includes the path, the destination prefix (IP address), and the identifier of the intended neighbor. Along with the advertisement, a node sends its own signature as well as all signatures signed by previous nodes on the subpaths. Upon receiving an advertisement, a node verifies all signatures.

Key tuples generated at each node executing prog_sbgp are listed in Figure 6. Here n is the parameter representing the identifier of the node that runs prog_sbgp. All tuples except advertisement are stored at node n. An advertisement tuple encodes a route advertisement that, once generated, is sent over the network to one of n's neighbors. We summarize the prog_sbgp encoding in Table 1.

Empirical evaluation. We use RapidNet [26] to generate a low-level implementation from the SANDLog encoding of S-BGP and SCION. We validate the low-level implementation in the ns-3 simulator [1]. Our experiments are performed on a synthetically generated topology consisting of 40 nodes, where each node runs the generated implementation of the SANDLog program. The observed execution traces and communication patterns match the expected protocol behavior. We also confirm that the implementation defends against known attacks such as maliciously advertising non-existent routes.

Property specification. We focus on route authenticity, encoded as $\varphi_{auth1}$ below. It holds on any execution trace of a network where some nodes run S-BGP, and those who do not are considered malicious.

$$\varphi_{auth1} = \forall n, m, t, d, p, sl,\ \mathsf{Honest}(n) \land \mathsf{advertisement}(m, n, d, p, sl)@(n, t) \supset \mathsf{goodPath}(t, p, d)$$

Formula $\varphi_{auth1}$ asserts a property goodPath(t, p, d) on any advertisement tuple generated by an honest node n. goodPath(t, p, d), defined below, asserts that all links in path p reaching the destination IP prefix d must have existed at a time point no later than t. This means that every pair of adjacent nodes n and m in path p had in their databases the tuples link(@n, m) and link(@m, n), respectively.

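$$\dfrac{\mathsf{Honest}(n) \supset \exists t',\ t' \le t \land \mathsf{prefix}(n, d)@(n, t')}{\mathsf{goodPath}(t,\ n{::}\mathsf{nil},\ d)}$$

$$\dfrac{\mathsf{Honest}(n) \supset \exists t',\ t' \le t \land \mathsf{link}(n, n')@(n, t') \qquad \mathsf{goodPath}(t,\ n{::}\mathsf{nil},\ d)}{\mathsf{goodPath}(t,\ n'{::}n{::}\mathsf{nil},\ d)}$$

$$\dfrac{\mathsf{Honest}(n) \supset \exists t',\ t' \le t \land \mathsf{link}(n, n')@(n, t') \land \exists t'',\ t'' \le t \land \mathsf{link}(n, n'')@(n, t'') \qquad \mathsf{goodPath}(t,\ n{::}n''{::}p'',\ d)}{\mathsf{goodPath}(t,\ n'{::}n{::}n''{::}p'',\ d)}$$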

The base case is when p has only one node, and we require that d be one of the prefixes owned by n (i.e., the prefix tuple is derivable). When p has two nodes n' and n, we require that the link from n to n' exist from n's perspective, assuming that n is honest. The last case checks that both links (from n to n' and from n to n'') exist from n's perspective, assuming n is honest. In the last two rules, we also recursively check that the subpath also satisfies goodPath. By varying the definition of goodPath, we can specify different properties such as one that requires each subpath be authorized by the sender. $\varphi_{auth1}$ is a general topology-independent security property.

Table 1. Summary of the prog_sbgp encoding

Rule  Summary                                   Head Tuple
r1    Generate a route for prefix of own.       route(@n, d, c, p, sl)
r2    Generate a best route for destination.    bestRoute(@n, d, c, p, sl)
r3    Receive advertisement from neighbor.      verifyPath(@n, n', d, p, sl, pOrig, sOrig)
r4    Recursively verify signature list.        verifyPath(@n, n', d, p, sl, pOrig, sOrig)
r5    Generate a route for verified path.       route(@n, d, c, p, sl)
r6    Generate a signature for new route.       signature(@n, m, s)
r7    Send route advertisement to neighbors.    advertisement(@n', n, d, p, sl)
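The three goodPath rules can also be read as a recursive check over a path. The following is a minimal sketch of that reading, not part of the SANDLog or Coq development. The database layout (`db` maps a node to a list of `(tuple, time)` pairs), the `honest` set, and the function names are assumptions made for illustration.

```python
def derived_by(db, node, fact, t):
    """True if `fact` is in node's database at some time t' <= t."""
    return any(f == fact and t1 <= t for f, t1 in db.get(node, []))


def good_path(db, honest, t, path, prefix):
    """Recursive reading of the three goodPath rules; path is a list of nodes."""
    if not path:
        return False  # no rule derives goodPath for the empty path
    if len(path) == 1:
        n = path[0]
        # Rule 1: an honest origin must own the prefix.
        return n not in honest or derived_by(db, n, ("prefix", n, prefix), t)
    n1, n = path[0], path[1]
    # Rules 2 and 3: an honest n must have a link to the node before it ...
    ok = n not in honest or derived_by(db, n, ("link", n, n1), t)
    if ok and len(path) >= 3 and n in honest:
        # ... and, in rule 3, also to the node after it.
        ok = derived_by(db, n, ("link", n, path[2]), t)
    # Both rules then require goodPath on the subpath starting at n.
    return ok and good_path(db, honest, t, path[1:], prefix)
```

Note that the premises are implications guarded by Honest(n), so a node outside `honest` imposes no link or prefix requirement; only the recursive check on the rest of the path remains.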

Verification. To use the authenticity property of the signatures, we include the following axiom $A_{sig}$ in the logical context $\Gamma$. This axiom states that if s is verified by the public key of n', and the node n' is honest, then n' must have generated a signature tuple. Predicate verify(m, s, k)@(n, t), generated by VCGen when function f_verify in SANDLog returns true, means that node n verifies at time t that s is a valid signature of message m according to key k.

$$A_{sig} = \forall m, s, k, n, n', t,\ \mathsf{verify}(m, s, k)@(n, t) \land \mathsf{publicKeys}(n, n', k)@(n, t) \land \mathsf{Honest}(n') \supset \exists t',\ t' < t \land \mathsf{signature}(n', m, s)@(n', t')$$

We first prove that prog_sbgp has an invariant property $\varphi_I$:

(a) $\cdot; \cdot \vdash prog_{sbgp}(n) : \{i, y_b, y_e\}.\varphi_I(i, y_b, y_e)$

with $\varphi_I(i, y_b, y_e) = \bigwedge_{p \in \mathsf{hdOf}(prog_{sbgp})} \forall t\, \vec{x},\ y_b \le t < y_e \land p(\vec{x})@(i, t) \supset \varphi_p(i, t, \vec{x})$.

Here, every $\varphi_p$ in $\varphi_I$ denotes the invariant property associated with each head tuple in prog_sbgp, and needs to be specified by the user. For instance, the invariant associated with the advertisement tuple is denoted $\varphi_{advertisement}$:

$$\varphi_{advertisement}(i, t, n', n, d, p, sl) = \mathsf{goodPath}(t, p, d)$$

The proof of (a) is carried out in Coq; we manually discharged all lemmas generated by VCGen. Next, applying the Honest rule to (a), we can deduce $\varphi = \forall n\, t,\ \mathsf{Honest}(n) \supset \varphi_I(n, t, -\infty)$. $\varphi$ is injected into the assumptions ($\Gamma$) by VCGen, and is safe to use in subsequent proof steps. Finally, $\varphi \supset \varphi_{auth1}$ is also proven in Coq by applying standard first-order logic rules.

We now explain the interesting steps in proving (a). Following the same steps as above, using Inv and Honest and keeping only the clause related to signature, we derive the following:

(a2) $\cdot; \cdot \vdash \forall n, \forall t,\ \mathsf{Honest}(n) \land \mathsf{signature}(n, m, s)@(n, t) \supset \exists n', d,\ m = d{::}n'{::}p \land \varphi_{link2}(p, n, d, n', t)$

Formula $\varphi_{link2}(p, n, d, n', t)$ states that n is the first node on the path p, the link from n to the next node on p exists, and the link between n and the receiving node n' also exists. This matches the non-recursive conditions in the definition of goodPath.

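$$\varphi_{link2}(p, n, d, n', t) = \mathsf{link}(n, n')@(n, t) \land \exists p',\ p = n{::}p' \land (p' = \mathsf{nil} \supset \mathsf{prefix}(n, d)@(n, t)) \land \forall p'', m',\ p' = m'{::}p'' \supset \mathsf{link}(n, m')@(n, t)$$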

Now (a2) has connected an honest node's signature to the existence of links related to it. Combining (a2) and $A_{sig}$, each time a signature sig of a node n is properly verified in prog_sbgp, the invariant $\varphi_{link2}$ (link tuples existed) holds under the assumption that n is honest.
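To make the link condition concrete, the check it describes can be sketched as a small function over a set of timestamped facts. This is an illustrative reading only: the `facts` representation (a set of `(tuple, node, time)` entries standing for tuple@(node, time)) and the function names are assumptions, not part of the Coq development.

```python
def holds(facts, fact, node, t):
    """True if `fact` is recorded at `node` at exactly time `t`."""
    return (fact, node, t) in facts


def phi_link2(facts, p, n, d, n_recv, t):
    """Sketch of the link condition for signer n, path p, and receiver n_recv."""
    # The link from n to the receiving node must exist at n.
    if not holds(facts, ("link", n, n_recv), n, t):
        return False
    # n must be the first node on the path p.
    if not p or p[0] != n:
        return False
    rest = p[1:]
    if not rest:
        # n is the origin, so it must own the prefix d.
        return holds(facts, ("prefix", n, d), n, t)
    # Otherwise the link from n to the next node on p must exist at n.
    return holds(facts, ("link", n, rest[0]), n, t)
```

The two branches on `rest` correspond to the two implications in the formula: exactly one of them has a true antecedent for any non-empty path starting at n.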

Defining the invariant for tuple verifyPath is technically challenging. The direction in which we check the signature list is different from the direction in which the route is created. The invariant needs to convey that part of a path has been verified, and part of it still needs to be verified. The solution is to use implication to state that if the path to be verified satisfies goodPath, then the entire path satisfies goodPath.
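As a rough sketch of what such an implication could look like, assume that p in the verifyPath tuple is the part of the path still to be verified and pOrig is the entire path (the argument names from Figure 6). The argument list and exact shape below are assumptions made for illustration; the invariant used in the Coq proofs may differ.

```latex
\varphi_{verifyPath}(i, t, n, n', d, p, sl, pOrig, sOrig) \;=\;
  \mathsf{goodPath}(t, p, d) \supset \mathsf{goodPath}(t, pOrig, d)
```

Under this reading, each recursive verification step shortens p, and once the remaining path trivially satisfies goodPath, the conclusion for pOrig follows by modus ponens.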

6 Related Work

Cryptographic protocol analysis. Analyzing cryptographic protocols [12, 27, 17, 25, 14, 6, 4, 15] has been an active area of research. Compared to cryptographic protocols, secure routing protocols have to deal with arbitrary network topologies and the protocols are more complicated: they may access local storage and commonly include recursive computations. Most model-checking techniques are ineffective in the presence of those complications.

Verification of trace properties. A closely related body of work concerns logics for verifying trace properties of programs (protocols) that run concurrently with adversaries [12, 15]. We are inspired by their program logic, which requires the asserted properties of a program to hold even when that program runs concurrently with adversarial programs. One of our contributions is a general program logic for a declarative language, SANDLog, which differs significantly from an ordinary imperative language. The program logic and semantics developed here apply to other declarative languages that use a bottom-up evaluation strategy.

Wang et al. [29] have developed a proof system for proving correctness properties of networking protocols specified in NDLog. Building on the proof-theoretic semantics of Datalog, they automatically translate NDLog programs into equivalent first-order logic axioms. Those axioms state that all the body tuples are derivable if and only if the head tuple is derivable. One main difference is that, unlike theirs, our semantics makes explicit the trace associated with the distributed execution of a SANDLog program. Another important difference is that we verify invariants associated with each derived tuple in the presence of attackers, which are not present in their system. Therefore, their system cannot be directly used to verify the security properties of secure routing protocols.

Networking protocol verification. Recently, several papers have investigated the verification of route authenticity properties on specific wireless routing protocols for mobile networks [2, 3, 11]. They have shown that identifying attacks on route authenticity can be reduced to constraint solving, and that the security analysis of a specific route authenticity property that depends on the topologies of network instances can be reduced to checking these properties on several four-node topologies. In our own prior work [8], we have verified route authenticity properties on variants of S-BGP using a combination of manual proofs and an automated tool, Proverif [7]. The modeling and analysis in these works are specific to the protocols and the route authenticity properties. Some of the properties that we verify in our case study are similar. However, we propose a general framework for leveraging a declarative programming language for verification and empirical evaluation of routing protocols. The program logic proposed here can be used to verify generic safety properties of SANDLog programs.

There has been a large body of work on verifying the correctness of various network protocol designs and implementations using proof-based and model-checking techniques [5, 16, 13, 29]. The program logic presented here is customized to proving safety properties of SANDLog programs, and may not be expressive enough to verify complex correctness properties. However, the operational semantics for SANDLog can be used as the semantic model for verifying protocols encoded in SANDLog using other techniques.

7 Conclusion and Future Work

We have designed a program logic for verifying secure routing protocols specified in the declarative language SANDLog. We have integrated verification into a unified framework for formal analysis and empirical evaluation of secure routing protocols. As future work, we plan to expand our use cases, for example, to investigate mechanisms for securing the data (packet forwarding) plane [22]. In addition, as an alternative to Coq, we are also exploring the use of automated first-order logic theorem provers to automate our proofs.

8 Acknowledgments

The authors wish to thank the anonymous reviewers for their useful suggestions. Our work is supported by NSF 1218066, NSF 1117052, NSF CNS-1018061, NSF CNS-0845552, NSF ITR-1138996, NSF CNS-1115706, and AFOSR Young Investigator award FA9550-12-1-0327.

References

1. ns 3 project: Network Simulator 3, http://www.nsnam.org/

2. Arnaud, M., Cortier, V., Delaune, S.: Modeling and verifying ad hoc routing protocols. In: Proceedings of CSF (2010)

3. Arnaud, M., Cortier, V., Delaune, S.: Deciding security for protocols with recursive tests. In: Proceedings of CADE (2011)

4. Bau, J., Mitchell, J.: A security evaluation of DNSSEC with NSEC3. In: Proceedings of NDSS (2010)

5. Bhargavan, K., Obradovic, D., Gunter, C.A.: Formal verification of standards for distance vector routing protocols. J. ACM 49(4) (2002)

6. Blanchet, B.: Automatic verification of correspondences for security protocols. J. Comput. Secur. 17(4) (Dec 2009)

7. Blanchet, B., Smyth, B.: Proverif 1.86: Automatic cryptographic protocol verifier, user manual and tutorial, http://www.proverif.ens.fr/manual.pdf

8. Chen, C., Jia, L., Loo, B.T., Zhou, W.: Reduction-based security analysis of internet routing protocols. In: WRiPE (2012)

9. Chen, C., Jia, L., Xu, H., Luo, C., Zhou, W., Loo, B.T.: A program logic for verifying secure routing protocols. Tech. rep., CIS Dept. University of Pennsylvania (February 2014), http://netdb.cis.upenn.edu/forte2014

10. CNET: How Pakistan knocked YouTube offline, http://news.cnet.com/8301-10784_3-9878655-7.html

11. Cortier, V., Degrieck, J., Delaune, S.: Analysing routing protocols: four nodes topologies are sufficient. In: Proceedings of POST (2012)

12. Datta, A., Derek, A., Mitchell, J.C., Roy, A.: Protocol Composition Logic (PCL). Electronic Notes in Theoretical Computer Science 172, 311–358 (2007)

13. Engler, D., Musuvathi, M.: Model-checking large network protocol implementations. In: Proceedings of NSDI (2004)

14. Escobar, S., Meadows, C., Meseguer, J.: A rewriting-based inference system for the NRL protocol analyzer: grammar generation. In: Proceedings of FMSE (2005)

15. Garg, D., Franklin, J., Kaynar, D., Datta, A.: Compositional system security with interface-confined adversaries. ENTCS 265, 49–71 (September 2010)

16. Goodloe, A., Gunter, C.A., Stehr, M.O.: Formal prototyping in early stages of protocol design. In: Proceedings of ACM WITS (2005)

17. He, C., Sundararajan, M., Datta, A., Derek, A., Mitchell, J.C.: A modular correctness proof of IEEE 802.11i and TLS. In: Proceedings of CCS (2005)

18. Kamp, H.W.: Tense Logic and the Theory of Linear Order. PhD thesis, Computer Science Department, University of California at Los Angeles, USA (1968)

19. Kent, S., Lynn, C., Mikkelson, J., Seo, K.: Secure border gateway protocol (S-BGP). IEEE Journal on Selected Areas in Communications 18, 103–116 (2000)

20. Loo, B.T., Condie, T., Garofalakis, M., Gay, D.E., Hellerstein, J.M., Maniatis, P., Ramakrishnan, R., Roscoe, T., Stoica, I.: Declarative Networking: Language, Execution and Optimization. In: SIGMOD (2006)

21. Loo, B.T., Condie, T., Garofalakis, M., Gay, D.E., Hellerstein, J.M., Maniatis, P., Ramakrishnan, R., Roscoe, T., Stoica, I.: Declarative networking. In: Communications of the ACM (2009)

22. Naous, J., Walfish, M., Nicolosi, A., Mazieres, D., Miller, M., Seehra, A.: Verifying and enforcing network paths with ICING. In: Proceedings of CoNEXT (2011)

23. Nigam, V., Jia, L., Loo, B.T., Scedrov, A.: Maintaining distributed logic programs incrementally. In: Proceedings of PPDP (2011)

24. One Hundred Eleventh Congress: 2010 report to Congress of the U.S.-China Economic and Security Review Commission (2010), http://www.uscc.gov/annual_report/2010/annual_report_full_10.pdf

25. Paulson, L.C.: Mechanized proofs for a recursive authentication protocol. In: Proceedings of CSFW (1997)

26. RapidNet: A Declarative Toolkit for Rapid Network Simulation and Experimentation: http://netdb.cis.upenn.edu/rapidnet/

27. Roy, A., Datta, A., Derek, A., Mitchell, J.C., Jean-Pierre, S.: Secrecy analysis in protocol composition logic. In: Proceedings of ESORICS (2007)

28. Wan, T., Kranakis, E., Oorschot, P.C.: Pretty secure BGP (psBGP). In: Proceedings of NDSS (2005)

29. Wang, A., Basu, P., Loo, B.T., Sokolsky, O.: Declarative network verification. In: Proceedings of PADL (2009)

30. White, R.: Securing BGP through secure origin BGP (soBGP). The Internet Protocol Journal 6(3), 15–22 (2003)

31. Zhang, X., Hsiao, H.C., Hasker, G., Chan, H., Perrig, A., Andersen, D.G.: Scion: Scalability, control, and isolation on next-generation networks. In: Proceedings of IEEE S&P (2011)