
Chapter 4 BGP Session Lifetime Modeling in Congested Networks

4.2 Study Methodology

In this chapter, we use the same notation as in Chapter 3. The period of the Hold Timer is Th and the period of the KeepAlive Timer is Tk. The default values of Th and Tk given in the IETF RFC are 90 seconds and 30 seconds, respectively.
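To make the timer notation concrete, the following is a minimal sketch of how a Hold Timer with period Th behaves when it is restarted by each received KEEPALIVE. The function name `hold_timer_expiry` and the representation of arrivals as a list of timestamps are assumptions made for illustration; UPDATE messages, which also restart the timer, and TCP retransmissions are ignored.

```python
from typing import Iterable

T_H = 90.0  # default Hold Timer period Th, in seconds
T_K = 30.0  # default KeepAlive Timer period Tk, in seconds


def hold_timer_expiry(arrivals: Iterable[float],
                      th: float = T_H,
                      start: float = 0.0) -> float:
    """Return the time at which the Hold Timer expires.

    arrivals: times (seconds) at which KEEPALIVE messages are received.
    The timer starts at `start` and is restarted by every message that
    arrives before it has run out (illustrative simplification).
    """
    deadline = start + th
    for t in sorted(arrivals):
        if t < start:
            continue          # ignore messages from before the timer started
        if t >= deadline:
            break             # the timer had already expired
        deadline = t + th     # message arrived in time: restart the timer
    return deadline
```

For example, if KEEPALIVE messages arrive at 30 s and 60 s (one per Tk) and every later message is lost, `hold_timer_expiry([30.0, 60.0])` gives 150 s, i.e. Th after the last message that got through.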

Fig. 4.1 shows our system model. We focus on the reliability of a single BGP session. From the network’s point of view, a BGP network (e.g., a route reflection network) is affected if some of its sessions fail; this is addressed in Chapter 5. Recall that bandwidth saturation has two major sources: scanning worms cause UDP-type bandwidth saturation, and traffic engineering failures lead to TCP-type bandwidth saturation. In practical scenarios, the congestion may be a combination of UDP and TCP bandwidth saturation. In our experiments, we study BGP sessions in the extreme, representative cases in which either UDP or TCP saturation exists alone, and analyze the lifetime with respect to different magnitudes of bandwidth saturation. These extreme cases are easy to control in experiments, and they also reflect BGP robustness in the general case.

[Diagram labels: hosts 1, 2, ..., n (TCP or UDP host i); background traffic; BGP session; BGP routers r1 and r2; link capacity c; queue size qlen]

Figure 4.1: System model: BGP session passes a congested link with capacity c (bits/sec) and queue size qlen (pkts).

We define events such as worm outbreaks or incoming TCP flows as impulses on the BGP sessions. In our experiments, the starting time of an impulse is uniformly distributed on the time line. The impulse is generated by n UDP or TCP hosts that are connected to two BGP routers, r1 and r2. The link between these two routers can process c bits per second in each direction. If the volume of the impulses imposed on the system exceeds the capacity of the link, packets are buffered in a queue of size qlen. Two major queue management schemes are used in the Internet: drop-tail and Random Early Detection (RED). Our extended simulations show that the two schemes lead to similar statistical properties of BGP session lifetime in our specific cases. This is because we are interested in severe network congestion, where the queue is almost full at all times (i.e., the average queue length exceeds the larger threshold in RED), so RED behaves similarly to drop-tail in these scenarios. Therefore, we focus on the drop-tail queue for the purpose of BGP reliability investigation. Once the number of packets in the queue reaches qlen¹, new arrivals are discarded, including the KEEPALIVE messages of BGP. The properties of the packet drop probability under the two bandwidth saturation scenarios are discussed in Section 4.3.
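The drop-tail behaviour described above can be sketched as a FIFO buffer whose limit is counted in packets. This is an illustration only, not the modified SSFNet module; the class name `DropTailQueue` and its methods are hypothetical, and the sketch assumes an arrival is dropped whenever the buffer already holds qlen packets.

```python
from collections import deque


class DropTailQueue:
    """FIFO queue whose capacity is counted in packets, not bytes."""

    def __init__(self, qlen: int) -> None:
        self.qlen = qlen        # maximum number of buffered packets
        self.buffer = deque()   # packets waiting for transmission
        self.dropped = 0        # number of discarded arrivals

    def enqueue(self, packet) -> bool:
        """Accept the packet if there is room, otherwise discard it."""
        if len(self.buffer) >= self.qlen:
            self.dropped += 1   # tail drop: the new arrival is lost
            return False
        self.buffer.append(packet)
        return True

    def dequeue(self):
        """Remove and return the oldest packet, or None if the queue is empty."""
        return self.buffer.popleft() if self.buffer else None
```

Because the limit is a packet count, a small KEEPALIVE message that arrives at a full queue is discarded just like a large data packet; it is accepted again only after a slot has been freed by `dequeue()`.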

BGP session lifetime is defined as the time interval from the beginning of the impulse event to the expiration of the Hold Timer. After the Hold Timer expires, the BGP router falls into the Idle state and the routing function terminates. We do not consider the subsequent recovery process in BGP, such as BGP connection retry: once the Hold Timer expires, the session is declared failed. Because packet loss under congestion is a time-varying process, it is extremely hard to find an exact closed-form solution for the BGP session lifetime. Therefore, we rely on simulations and statistical analysis to investigate the statistical properties of BGP session lifetime, and then propose models to approximate the lifetime distributions. Specifically, by setting up the traffic impulses in the simulator, we collect a large number of samples of the BGP session lifetime, and use the Kaplan-Meier estimator [42] to extract the CCDF (Complementary Cumulative Distribution Function, also called the empirical survival function) of the BGP session lifetime. Based on the shape of the CCDFs, we give three models for BGP session lifetime approximation, and perform the Kolmogorov-Smirnov test [43] to validate the accuracy of the models.
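As an illustration of the two statistical tools named above, the following sketch computes a Kaplan-Meier survival curve from lifetime samples and a one-sample Kolmogorov-Smirnov statistic against a candidate model. It is a textbook-style sketch rather than the analysis code used in this study: the function names, the `observed` censoring flags, and the `model_cdf` argument are assumptions, `model_cdf` stands in for any candidate model (the three proposed models are not reproduced here), and `ks_statistic` assumes uncensored samples.

```python
from typing import Callable, List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 observed: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct failure time t.

    observed[i] is True if session i failed at times[i], and False if the
    sample was censored (e.g. the run ended before the session failed).
    """
    data = sorted(zip(times, observed))
    at_risk = len(data)   # sessions still alive just before time t
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        while i < len(data) and data[i][0] == t:  # group tied times
            failures += int(data[i][1])
            leaving += 1
            i += 1
        if failures:
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def ks_statistic(samples: Sequence[float],
                 model_cdf: Callable[[float], float]) -> float:
    """Largest gap between the empirical CDF and a candidate model CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = model_cdf(x)
        # the empirical CDF jumps from (i - 1) / n to i / n at x
        d = max(d, i / n - f, f - (i - 1) / n)
    return d
```

When every sample is an observed failure, the Kaplan-Meier curve reduces to the ordinary empirical CCDF. The statistic returned by `ks_statistic` would still have to be compared with a critical value for the chosen significance level, which this sketch leaves out.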

Our simulation is carried out on the network topology described in Fig. 4.1. The adopted simulator is SSFNet (version 2.0.0) [39]. We modified the drop-tail queuing module so that the queue size is controlled by the number of packets. The link capacity between hosts and routers is 100 Mbps, which is larger than the capacity between routers r1 and r2, so that the link that the BGP session passes through is the bottleneck. The propagation delay between the two routers is 10 ms. We use TCP Reno, which is the most widely deployed TCP version. The network parameters for the different simulation scenarios are summarized in Table 4.1.

¹ The other option for controlling queue length is to use the number of bits instead of the number of packets, which favors packets of smaller size. In this thesis, we only consider the drop-tail queue managed by the number of packets, which is widely adopted in Internet routers [41].

Table 4.1: Simulation parameters for studying BGP session lifetime.

Parameter                          Values
router link capacity, c            T1 (1.5 Mbps), 5 Mbps, and 10 Mbps
router queue size, qlen            75, 250, 500, and 1000 pkts
rtt of hosts (propagation delay)   30 ms and 120 ms, uniform for all hosts, or
                                   randomly generated from [0, 60] ms and [0, 240] ms
TCP connections, conn              500, 1000, 1500, 2000, and 3000
UDP connections, conn              500
worm propagation rate², β          0.0005, 0.001, 0.002, and 0.004
packet size, l                     TCP: 500 and 1000 bytes; UDP: 256 bytes
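To give a feel for the scale of these parameters, the sketch below computes how long the bottleneck link needs to transmit a completely full queue. It is a back-of-the-envelope calculation, not a result of this study; the function name `full_queue_delay` is hypothetical, and the calculation assumes that every queued packet has the same size l.

```python
def full_queue_delay(qlen_pkts: int, pkt_bytes: int, capacity_bps: float) -> float:
    """Seconds needed to transmit a full queue of equal-size packets."""
    return qlen_pkts * pkt_bytes * 8 / capacity_bps


# Smallest queue in Table 4.1 on the 1.5 Mbps link, 1000-byte packets
small = full_queue_delay(75, 1000, 1.5e6)    # about 0.4 s

# Largest queue in Table 4.1 on the same link
large = full_queue_delay(1000, 1000, 1.5e6)  # about 5.3 s
```

Under these assumptions, the queueing delay of a full queue ranges from a fraction of a second to a few seconds on the slowest link, which is larger than the configured host rtt values but well below the 30-second KeepAlive period.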