Duality: Relating Duplicate and Complementary Reduction

7.2 Properties of Network Reduction

7.2.1 Duality: Relating Duplicate and Complementary Reduction

Theorem 14. a. If all the nodes in N_from can be merged into one node by (multiple steps of) complementary reductions, then u and v must be duplicate. b. If all the nodes in N_to can be merged into one node by (multiple steps of) duplicate reductions, then u and v must be complementary.

Proof. Proof by definitions. The complete proof and its graph illustration are in Appendix A.2.

Chapter 7. Scalability Techniques for Analysis

tary reduction, as prefigured in Section 7.1.2 (where the border routers were complementary but the internal, downstream routers were duplicate). It also implies that if two nodes’ upstream (or downstream) neighbors can be reduced into one node in our calculus, then these two nodes themselves can be further merged into one.

7.2.2 Soundness

The main soundness result is that the reduced policy configuration has the same safety and robustness properties as the original one, and so we can use the reduced one to analyze the original.

Theorem 15. If G' is safe then G is safe; and if G' experiences route oscillation, then in running G, there exists at least one execution trace that exhibits route oscillation.

According to Theorem 2, we only need to prove that the rewriting process preserves the presence or absence of cycles in the configuration’s EPD representation, as follows:

Lemma 1. The path digraph of G is acyclic if and only if the path digraph of G' is acyclic.

Proof. For duplicate reduction, we prove rewriting preserves cyclicity by constructing a cycle in G' for any cycle c in G. The duplicate rewrite from G to G' is defined by merging duplicate nodes u and v, and the proof proceeds by case analysis of whether any of the paths originating from u or v are on c. We prove rewriting preserves acyclicity via the contrapositive: if G' is cyclic then G is cyclic, which is also proved by construction.

For complementary reduction, the proof is similar thanks to the EPD formalization and the dual nature of the two rules.

We only provide a proof sketch here; the complete proof and its graph illustration are in Appendix A.2.2.
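The acyclicity condition that Lemma 1 preserves can be checked mechanically. The following is a minimal Python sketch, assuming the path digraph is supplied as an adjacency dictionary. The representation and the name `has_cycle` are illustrative assumptions and are not taken from the Maude prototype.

```python
from typing import Dict, Hashable, Iterable


def has_cycle(digraph: Dict[Hashable, Iterable[Hashable]]) -> bool:
    """Return True if the directed graph contains a cycle (iterative DFS)."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    colour = {}
    for root in digraph:
        if colour.get(root, WHITE) != WHITE:
            continue
        colour[root] = GREY
        stack = [(root, iter(digraph.get(root, ())))]
        while stack:
            node, successors = stack[-1]
            advanced = False
            for nxt in successors:
                state = colour.get(nxt, WHITE)
                if state == GREY:
                    return True  # back arc to a node still on the stack
                if state == WHITE:
                    colour[nxt] = GREY
                    stack.append((nxt, iter(digraph.get(nxt, ()))))
                    advanced = True
                    break
            if not advanced:
                colour[node] = BLACK
                stack.pop()
    return False
```

Under this assumed encoding, Lemma 1 states that `has_cycle` returns the same answer on the digraph of G and on the digraph of the reduced G'.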


7.2.3 Local Completeness

We first formalize the notion of “local reduction” and “local safety”, and then prove that duplicate and complementary reductions are locally complete with regard to preserving the presence or absence of EPD cycles. Intuitively, a reduction rule applied to nodes u and v is “local” if it only requires information from u, v and their immediate neighbors (Γ−(u, v) and Γ+(u, v)) in order to test the reduction precondition and generate the configuration of the merged node.

Let N_rest stand for the nodes in V which are not within one hop of u or v. We introduce a binary relation ∼_{u,v} on EPDs, capturing the idea that they differ only on the configuration of N_rest, by G ∼_{u,v} G' if and only if the following hold:

1. G and G' are on graphs having the same set of nodes.

2. They have the same path configuration for u and v: so P_u = P_u', P_v = P_v', with the same preference arcs; and they have the same set of transmission arcs to and from u or v.

3. A preference arc (y◦p, y◦q) is in E_p if and only if it is in E_p', for any y in Γ+(u, v), and any p and q in P_u ∪ P_v.

Definition 16 (Local Safety). A network reduction rule on G by merging u and v is locally safe if it also preserves safety for any G' with G' ∼_{u,v} G.

Theorem 17 (Local Completeness). If a network reduction rule that rewrites G by merging u and v is locally safe, then it must be either duplicate or complementary reduction.

Proof. We use proof by contradiction to establish that if u and v are neither duplicate nor complementary, then the reduction rule that merges them is not locally safe. That is, there is some acyclic EPD G, where application of the merge results in G' being cyclic, but G ∼_{u,v} G'.


of Figure 7.10). Consider an EPD where there is a series of transmission arcs from a downstream neighbor of v to an upstream neighbor of u (illustrated from y2 to x1). Merging u and v creates a cycle, shown in the right of Figure 7.10.

Figure 7.10: If u and v are neither duplicate nor complementary, merging them can create a cycle.
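The construction in this proof can be replayed on a toy graph. The sketch below is a deliberate simplification: it uses a plain directed graph rather than a full EPD, keeps only transmission-style arcs, and collapses the series of arcs from y2 to x1 into a single arc. The node names x1 and y2 follow the text, while x2, y1, and the helper names `is_acyclic` and `merge` are assumptions made for illustration.

```python
from graphlib import TopologicalSorter, CycleError


def is_acyclic(edges):
    """Return True if the directed edge set contains no cycle."""
    sorter = TopologicalSorter()
    for src, dst in edges:
        sorter.add(dst, src)  # record src as a predecessor of dst
    try:
        sorter.prepare()      # raises CycleError when a cycle exists
    except CycleError:
        return False
    return True


def merge(edges, u, v, z):
    """Rename both u and v to z in every arc (hypothetical helper)."""
    def rename(node):
        return z if node in (u, v) else node
    return {(rename(a), rename(b)) for a, b in edges}


# x1 and y2 follow the text; x2 and y1 are assumed extra neighbors.
before = {("x1", "u"), ("u", "y1"), ("x2", "v"), ("v", "y2"), ("y2", "x1")}
after = merge(before, "u", "v", "z")

print(is_acyclic(before))
print(is_acyclic(after))
```

Before the merge the only long walk is x2, v, y2, x1, u, y1, which never revisits a node. After the merge, z reaches y2, y2 reaches x1, and x1 reaches z again, so the merged graph is cyclic.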

Note that, while duplicate and complementary reduction are locally complete, we do not exclude the existence of other safety-preserving reductions that require checking policy configuration beyond u, v and their neighbors. That is, we do not exclude less efficient algorithms for simplifying networks.

7.2.4 Confluence

This section discusses confluence properties of the reductions: we first prove that duplicate reduction is confluent, and then show that complementary reduction is not.

Theorem 18. [Duplicate reduction is confluent] If, for a set of nodes V, any pair of nodes u and v in V are duplicate, then V can be merged into one single node by multiple steps of duplicate reduction, regardless of the reduction order.

Proof. By induction on the size ofV.

The base case. |V| = 2 is trivial.

The induction step. For |V| = k + 1 > 2, consider two nodes u and v in V, which by assumption are duplicate. By merging them into a new node z, we can rewrite V to V' = W ∪ {z}, where W = V \ {u, v}. By the induction hypothesis that any k pair-wise duplicate nodes can be merged into one node, it is sufficient to prove that V reduces to one node by showing that V' is pair-wise duplicate, since |V'| = k. By definition, in V', the subset W is pair-wise duplicate, so we only need to show that z is duplicate with any w in W. Since u and v are duplicate with w, it must be the case that z and w satisfy at least the duplicate conditions. Moreover, P_z = P_u ∪ P_v, and by the pair-wise duplicate definition we know that paths in P_u and P_w, in P_v and P_w, and in P_u and P_v always form a unique total ordering. That is, for any three paths p ∈ P_u, q ∈ P_v, and r ∈ P_w, we know how to set the preferences between any two of them. Then there must be a unique ordering among the three of them, and so all paths from P_u ∪ P_v ∪ P_w are totally ordered.

On the other hand, complementary reductions are not confluent; a counter-example is shown in Figure 7.11(a). Nodes u, v and w have the same set of downstream neighbors. For example, node u has two paths p2 and p3, and there is some downstream preference p2 ≺ p3. All downstream neighbors have consensus on preference among paths from u and v (p2 ≺ p1 ≺ p3), and among paths from v and w (p2 ≺ p1 ≺ p4). However, there is no consistent ranking for paths from u and w, since some nodes prefer p3 over p4, and others prefer the reverse. While complementary reduction can be applied to either u and v (as in Figure 7.11(b)), or v and w (as in Figure 7.11(c)), a further reduction step is not possible.
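The consensus test behind this counter-example can be sketched in a few lines of Python. The sketch assumes each downstream neighbor is summarized by a single ranking listed from least to most preferred; the neighbor names d1 and d2, their full rankings, and the helper names are hypothetical and chosen only to reproduce the orderings quoted above.

```python
def restricted_order(ranking, paths):
    """Order that one neighbor's ranking induces on the given path set."""
    return tuple(p for p in ranking if p in paths)


def have_consensus(rankings, paths):
    """True if every neighbor ranks the given paths in the same order."""
    induced = {restricted_order(r, paths) for r in rankings.values()}
    return len(induced) == 1


# Hypothetical downstream neighbors, least preferred path first.
rankings = {
    "d1": ["p2", "p1", "p3", "p4"],  # places p3 below p4
    "d2": ["p2", "p1", "p4", "p3"],  # places p4 below p3
}

print(have_consensus(rankings, {"p1", "p2", "p3"}))  # paths of u and v
print(have_consensus(rankings, {"p1", "p2", "p4"}))  # paths of v and w
print(have_consensus(rankings, {"p3", "p4"}))        # the conflicting pair
```

The first two checks succeed and the third fails, matching the text: either pair (u, v) or (v, w) can be merged, but once one merge is done the merged node carries both p3 and p4 into the next comparison, and the neighbors disagree on that pair.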

Finally, we show that duplicate reduction does not commute with complementary reduction, by exhibiting a counterexample. Consider the EPD in Figure 7.12, where nodes u and v are duplicate, and v and w are complementary. If u and v are merged into z through duplicate reduction, then this z is not reducible with w, due to the lack of consensus on paths p3 and p4 among downstream neighbors.


Figure 7.11: The EPD in (a) rewrites to either (b) or (c), depending on the order of two complementary reductions (u, v or v, w).

Figure 7.12: Duplicate/complementary reductions do not commute

7.3 Evaluation

We have implemented a prototype of network reduction using Maude. With the prototype, we demonstrate that network reduction is applicable on various networks, can be done efficiently at low overhead, and enables analysis of BGP configurations that cannot otherwise be completed. Moreover, by comparing BGP systems before and after reduction, we not only validate our reduction theory, but also gain insights into redundancy and conflicts in network configurations. We primarily selected Maude due to its existing libraries [85, 84] for modeling BGP systems and performing safety analysis [85].


7.3.1 Network Generation

We present evaluation on a variety of networks ranging from synthetic networks including configurations of Cisco guidelines [87], and random network topologies generated using GT-ITM, to actual network topologies including CAIDA inter-AS level topologies [4], and Rocketfuel router-level ISP topologies [68]. All experiments are carried out on an Intel Xeon 2.33GHz CPU with 4GB memory, running Linux 2.6.

Reduction on Synthetic Networks. We evaluate network configurations that span multiple ASes, consisting of both iBGP and eBGP configurations. We first develop a model of a BGP system [60] in Maude, which consists of several ASes and routers running the path-vector protocol, and exchanging routes based on their import, export, and route selection policies. In particular, both the Cisco-Synthetic and GT-ITM network policies are realized by the local preference and AS path attributes for route selection, and import/export filtering for route exchange. In addition, we develop Maude functions that generate the EPD model from a BGP system in terms of topology and configuration attributes [60].

Cisco-Synthetic Network. To evaluate network reduction on well-designed, highly structured policy configurations proposed by Cisco, we construct various synthetic topologies combining full-mesh and reflection configurations according to these guidelines [87].

A full-mesh topology is simply a complete graph of the routers. Our reduction theory will collapse all of these routers into a single node, so long as they implement the same policy. In a route reflector configuration, the network graph is partitioned into a set of clusters. Inter-cluster communication is done by special routers configured as ‘reflectors’; other routers within a cluster are clients of the local reflector(s). As depicted in Figure 7.13, the reflectors form a full-mesh core graph, while the clients are only connected to their reflectors. However, the clusters can be interconnected by either a reflector or a client, and our experiments include both. Our experiments also include configurations with multiple redundant reflectors in a single cluster, as shown in Figure 7.14.

Figure 7.13: Route reflector example: clients are border routers

Figure 7.14: Route reflector with POP

To understand how reduction helps in detecting route oscillation due to policy misconfigurations, we embed in the network three small substructures, or gadgets [26]: namely the Good, Bad, and Disagree gadgets, which correspond to safe, permanent-oscillation, and transient-oscillation behaviors respectively. These gadgets are embedded within the transit ASes by configuration of the local preference attributes. There are also several stub ASes, set up with full-mesh or reflection topologies (described below), and employing a policy that prefers paths with fewer AS hops. (To break ties, an older route is preferred over a newly generated one.)

GT-ITM networks. As an alternative dataset, we generate transit-stub topologies using the GT-ITM topology generator [47]. Each transit-stub topology is parameterized by the number of transit domains, nodes within a transit domain, stubs per transit node, and finally, nodes per stub. We increase the network size by increasing all of these parameters. We configure routing policies as follows: transit ASes are willing to carry all traffic, while each stub AS carries traffic only for itself. Given the randomness of GT-ITM topology generation, this dataset is less structured than the earlier Cisco-Synthetic topologies, resulting in increased variance in our results.
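To see how the four parameters drive network size, the sketch below computes the nominal node count of a transit-stub topology. It assumes the usual transit-stub accounting, in which every transit node carries the same number of stubs and every stub has the same number of nodes. GT-ITM treats these parameters as averages, so generated topologies can deviate from this figure, and the function name and example parameter values are illustrative only.

```python
def transit_stub_nodes(transit_domains: int,
                       nodes_per_transit: int,
                       stubs_per_transit_node: int,
                       nodes_per_stub: int) -> int:
    """Nominal node count: each transit node plus the stub nodes it carries."""
    transit_nodes = transit_domains * nodes_per_transit
    stub_nodes_per_transit_node = stubs_per_transit_node * nodes_per_stub
    return transit_nodes * (1 + stub_nodes_per_transit_node)


# Hypothetical parameter choice: 1 domain, 4 transit nodes,
# 3 stubs per transit node, 8 nodes per stub.
print(transit_stub_nodes(1, 4, 3, 8))  # → 100
```

Because the count is a product of the parameters, increasing all of them together grows the network much faster than increasing any single one.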

Reduction on Actual Topologies. We evaluate the effectiveness of our reduction techniques on actual Internet topologies, obtained from the CAIDA inter-AS level topologies [4] and the Rocketfuel router-level ISP topologies [68]. In the CAIDA and Rocketfuel datasets, we sample¹ the dataset to derive networks of sizes up to 185 and 128 respectively. For all the topology samples, we insert the same policy configurations as our earlier Cisco-Synthetic and GT-ITM setups. We observe that the reduction rate was high, achieving a rate of 75% and 69% on average respectively. This suggests that in practice, there is significant configuration redundancy in actual configurations, observable even for a sample of the network.

7.3.2 Reduction Performance

Table 7.1 summarizes the performance overhead of network reduction and analysis on the two classes of input topologies for various network sizes. Cisco-Good-22 refers to a 22-node Cisco-Synthetic topology embedded with Good Gadgets.

Input Topology      EPD Time (ms)  Reduction Time (ms)  Reduction Time (ms, Dup)  Reduction Rate  Reduction Rate (Dup)  Analysis Time (ms)
Cisco-Good-22       3              74                   22                        68%             63%                   429043
Cisco-Good-48       113            863                  124                       85%             84%                   429043
Cisco-Good-87       5299           5665                 649                       92%             92%                   429043
Cisco-Good-104      26567          10341                1814                      93%             93%                   429043
Cisco-Good-140      983300         32562                1814                      95%             94%                   429043
Cisco-Bad-22        5              96                   23                        69%             68%                   80224
Cisco-Bad-49        112            935                  119                       86%             86%                   80224
Cisco-Bad-87        5204           6075                 465                       92%             92%                   80224
Cisco-Bad-104       25449          11258                725                       93%             93%                   80224
Cisco-Bad-121       177421         19741                1111                      94%             94%                   80224
Cisco-Disagree-23   2              30                   14                        78%             80%                   184
Cisco-Disagree-53   40             352                  73                        90%             90%                   184
Cisco-Disagree-70   182            901                  164                       93%             92%                   184
Cisco-Disagree-103  3951           3641                 469                       95%             95%                   184
Cisco-Disagree-122  20792          6430                 810                       96%             96%                   184
GT-ITM-12           1              6                    2                         82%             81%                   1
GT-ITM-38           7              24                   9                         94%             94%                   1
GT-ITM-77           57             2279                 68                        95%             95%                   1
GT-ITM-80           71             5241                 84                        90%             90%                   2
GT-ITM-118          350            583143               455                       86%             91%                   2

Table 7.1: Summary of results across various input topologies. Averages across multiple runs are presented.

¹ Our experimental dataset was limited by the physical memory constraints of storing the entire EPD in memory. As future work, we plan to explore out-of-core implementations or the use of multiple machines for executing a single reduction.

The columns shown refer to:

• EPD Generation. Time to generate an EPD model from the input BGP configuration.

• Reduction Time. Time required to generate the reduced EPD from the corresponding input EPD. Both reduction rules are applied, duplicate followed by complementary.

• Reduction Time (Dup). Same as above, except that complementary reduction is not applied. The difference allows us to compare the marginal overhead of applying complementary reduction.

• Reduction Rate. Percentage of redundant nodes that are reduced. For example, 68% for Cisco-Good-22 means that the reduced EPD is only 100% − 68% = 32% of the original network size.

• Reduction Rate (Dup). Rate of reduction achieved by only merging duplicate nodes.

• Analysis Time. Time required to run the safety analysis, using our Maude analyzer (Section 6), on the reduced EPD.
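The reduction-rate arithmetic can be made concrete with a short sketch. The helper names are illustrative assumptions, and because the table reports averages across runs, the resulting node counts should be read as approximate rather than exact.

```python
def reduced_size(original_nodes: int, reduction_rate: float) -> float:
    """Expected size of the reduced network for a given reduction rate."""
    return original_nodes * (1.0 - reduction_rate)


def reduction_rate(original_nodes: int, reduced_nodes: int) -> float:
    """Fraction of the original nodes removed by reduction."""
    return 1.0 - reduced_nodes / original_nodes


# Cisco-Good-22 at a 68% reduction rate keeps about 32% of 22 nodes.
print(round(reduced_size(22, 0.68), 2))       # → 7.04
print(round(100 * reduction_rate(22, 7)))     # → 68
```

Under this reading, a 22-node topology reduced at 68% leaves roughly 7 nodes for the safety analysis to explore.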

EPD Generation and Reduction. The overhead of reduction includes the time required to generate the EPD representation of the policy configuration, and the overhead of doing the reduction itself. Due to space constraints, we will show performance graphs (derived from Table 7.1) for the Cisco-Synthetic networks, but discuss conclusions drawn from both input topology classes.

Figures 7.15 and 7.16 show the EPD generation time and the reduction time, respectively, as the number of nodes increases. We observe that the execution times are polynomial (cubic/quadratic) with respect to network size. While the complexity bounds are not ideal for scaling up, we note that the absolute numbers are easily within the realm of practicality. For instance, on a single commodity PC, EPD generation and reduction using our unoptimized Maude code require only 16 minutes and 32 seconds (or 18 seconds with duplicate-only reduction) respectively, for a network of 140 nodes (Cisco-Good-140). While the EPD generation time dominates, this cost is amortized across both reduction and analysis, since the subsequent analysis essentially uses the same EPD representation.

In Cisco-Synthetic networks, the reduction overhead is dominated by the EPD generation time. In GT-ITM networks, however, we observe that the actual reduction dominates over EPD generation, suggesting that a noisier (more randomized) configuration increases reduction overhead. Among Cisco-Synthetic networks, we observe that reduction times increase on denser topologies with full meshes within an AS, as compared to ASes that use route reflectors internally.


Figure 7.15: EPD generation time as number of nodes increases for the Cisco-Synthetic topologies. [Plot: average EPD generation time (ms, logarithmic axis) against average node number, with one series each for Good Gadget, Bad Gadget, and Disagree Gadget.]

Reduction rate. Table 7.1 shows that reduction is very effective at reducing the size of the EPDs. In some cases, as the network size increases, the reduction can shrink the original EPD by 95%. Figure 7.17 shows the reduction rates on the Cisco-Synthetic networks. For networks beyond 40 nodes, the reduction rate is above 80% and relatively stable. The effectiveness of reduction can be attributed to the highly structured nature of these topologies, where the resulting reduced EPD is often identical to the original embedded gadgets themselves. Another source of irreducibility arises when the BGP decision procedure falls through to attributes we do not analyze.

Figure 7.16: Reduction time as number of nodes increases for the Cisco-Synthetic topologies. [Plot: average reduction time (ms, logarithmic axis) against average node number, with one series each for Good Gadget, Bad Gadget, and Disagree Gadget.]

The trends observed in GT-ITM are largely similar, though we note that since these topologies are randomly generated, the reduction times and rates have higher variance across experimental runs. In Cisco-Synthetic networks, the reduction rate exhibits smaller variance due to their regular structure. In general, when a network becomes more hierarchical (from GT-ITM to Cisco, from full-mesh to reflection), the reduction rate improves due to increased redundancy. Moreover, the reduction overhead is relatively small compared with the growth of network size. All in all, our results imply that a well-structured hierarchical network configuration is easier to analyze in terms of reduction times. Such configurations are also more likely to be safe, that is, not to oscillate.

Figure 7.17: Reduction rate as number of nodes increases for the Cisco-Synthetic topologies. [Plot: average reduction rate (%) against average node number, with one series each for Good Gadget, Bad Gadget, and Disagree Gadget.]

Duplicate vs Complementary. As we noted in Section 7.1.2, the complementary condition is more complex. Our experimental results summarized in Table 7.1 validate that the overall reduction time tends to be dominated by complementary reduction. While duplicate reduction only requires two nodes to agree upon what they learned from their neighbors, complementary reduction requires all the neighbors of the two nodes to agree upon what is learned from them. In addition, the marginal benefit of performing complementary reduction on top of duplicate reduction is often small. For instance, Cisco-Good-22 results in a 63% reduction compared to the original EPD when only duplicate reduction is used, and 68% (i.e. an additional

In document Automated Formal Analysis of Internet Routing Configurations (Page 138-153)