Overhead Analysis of TRP - TRP Code and Local Testbed

4.5 TRP Code and Local Testbed

5.1.6 Overhead Analysis of TRP

We also conducted simulation to count number of control packet to update a routing table after 1 link/node failure in AT&T network. Figure 5.17 shows distribution of the update packet at each router where the failure detected. In the case of OSPF, the maximum number of the update packets generated by

single failure is 5,115 and the minimum number is 3,530. On the other hand, the maximum number of the update packets in TRP is 1,584 and the minimum number is just 1. Average number of update packets of OSPF and TRP are 4,003.66 and 25.64 respectively. Number of updates packets are significantly small at an failure occurred any router location in AT&T network.

Figure 5.17: Number of Updates of OSPF and TRP in AT&T

OSPF TRP

5.1.7 Performance Statistics and Analysis of TRP on

Testbed

In addition to the development of the local site testbed, we also conducted large scale studies to evaluate our research concepts on test facilities provided by Emulab [69]. After the TRP code was tested and evaluated over the local testbed it was then ported to the Emulab facilities of GENI. In this section, we show performance comparison with OSPF routing protocol to evaluate the TRP as an IGP.

Emulab is an experimentation facility which allows creation of networks with different topologies to provide a fully controllable and repeatable ex-

Table 5.5: Emulab Testbed Configurations Topology 21 Nodes 45 Nodes

Type of processor Pentium III Quad Core Xeon Processor

Number of links 24 54

Link shaping nodes 12 20 Connection speed 100 Mbps 100 Mbps

perimental environment. Emulab uses different types of equipment for this purpose. D710 which is a 64 Bit Intel quad core Xeon based machine was used in a 45 nodes topology and PC850 type which is a Pentium 3 based machine was used in the 21 nodes topology; both topologies are shown in Figure 5.18. Different machines were used for the two topologies due to the allocation pro- cess at Emulab and systems availability at the time of request. Note that number of network interface of Emulab PC is limited to five where one is used for control, thus only four network interfaces are available (without virtual interface). The type of network topology is, thus limited by these constraints.

Figure 5.18: Testbed Topology with Tiered Routing Addresses

For the 21 nodes topology of Figure 5.18(a), the configuration details are provided in Table 5.5 along with the 45 nodes topology. In the 45 nodes topol-

ogy, the additional 24 nodes were added to the outer circle of routers utilizing a topological connection similar to that of the outer routers in Figure 5.18(b). In Figure 5.18(a), the IP addresses were allocated from address space 10.1.x.x /24 to the segments as shown. The TRA addresses to run TRP were allocated using the scheme described in Chapter 4.2. Link shaping nodes were used to emulate link failures. Emulab provides the use of such link shaping nodes that can be placed on the segments for this purpose. Emulab provided software commands were used to disable these nodes and thus emulate link failures.

Due to the limited numbers of machines, the limited durations of availability for use, and to provide a random environment for the test, which replicated topology using identical devices, they were conducted in two different sets of networks and the experiments were repeated five times in each case. The results were collected for convergence time of both OSPF and TRP in each topology. To collect convergence time and data on packet losses, each test topology required five hours and two hours of run time respectively.

Quagga 0.99.17 [70], a software routing suite for configuring OSPF was used for the comparison studies For the OSPF evaluations, only one area was defined, as the intention is to demonstrate the performance impacts to increase in the number of routers in a network or in an area. We also note that The TRP code which is in its research and development stage operates on the Linux kernel user space and hence the timings and dependent variables such as packet loss during convergence would project a higher value than if the code were run in kernel space. Comparatively the Quagga OSPF code, which was verified to run on Linux systems, runs in the kernel space.

Link Failure Detection Time

This is the same for OSPF and TRP as they detect a link failure on missing four hello messages. With a hello interval of 10 seconds, this was recorded to be 30 seconds with an additive time, which is the time between the first missing packet and the time when the link was actually brought down. This would be the same for OSPF and TRP due to the failure detection mechanism. However, TRP can fall back on an alternate path on missing a single hello packet, without affecting the forwarding operation. This was not implemented in the current version.

Time to Update the Routing Tables

This time is different for TRP and OSPF. The differences are explained below with the aid of Figures 5.19 and 5.20.

Figure 5.19: TRP Routing Convergence Time

• TRP Response to Link Failures: In Figure 5.19, the time t1 when the

link failed is noted along with time t3, which is the time it took to remove

Tc = Tru− Tf d (5.6)

where Tf d is the failure detection time given by

Tf d= t2 = t1 (5.7)

and Tru is the routing table update time given by

Tru = t3− t2 (5.8)

Thus,

Tc= t3− t1 (5.9)

Tf d will be the same for OSPF, but Tru is negligible in the case of TRP

as this is the time for the TRP code to access the routing tables and update its contents. In Figures 5.19 and 5.20, these times are identified based on the operations of TRP and OSPF respectively.

• OSPF Response to Link Failure: OSPF uses several timers on link failures, to rerun SPF algorithm and a few other hold times to avoid tog- gling. They are hold time, which is the seperation time in millisec- onds between consecutive SPF calculations. An initial hold time and max hold time are also specified. SPF starts with the initial hold time. If a new event occurs within the hold time of any previous SPF calculation then the new SPF calculation is increased by initial hold time up to a maximum of max hold time.

Let TLSA be the LSA propagation delay, TSP F be the time to run SPF

on subsequent LSA messages and TT U be the table update delay, then

Tru of OSPF is given by

Tru = tLSA+ TSP F + TT U (5.10)

TSP F, initial hold time and max hold time were set to 200 ms, 400

ms, and 5000 ms respectively for the test. Figure 5.20 captures the relationship between the delays for OSPF.

Performance Statistics and Analysis

The performance of OSPF and TRP, during the initial convergence phase and their response to subsequent link failures are presented in this section. The Network Time Protocol (NTP) is used to start OSPF and TRP protocols at the same time in all routers after the networks had stabilized. In the histograms, data collected for the two test sites are provided separately, to show the closeness of the two data sets collected under different environments which reflects the reliability of the experiments conducted.

Figure 5.21: TRP vs. OSPF Initial Convergence Time (sec)

• Initial Convergence Times: Figure 5.21 records the average initial convergence times in seconds as collected from the two test sites and for the two different topologies, one with 21 routers and the other with 45 routers. While the convergence times recorded for OSPF range from 55 seconds in the case of the 21 router network to over 60 seconds in the case of the 45 router network, the convergence times for the network running TRP is around 1 second. While the convergence times are sta- ble irrespective of the number of routers in networks running TRP, in the case of OSPF, the convergence times showed an increase by 5 to 6 seconds, indicating dependency of the convergence times to the network diameter. Thus, TRP under the FCT model offers an improvement in magnitude of 50-60 times as compared to OSPF.

• Control Overhead During Initial Convergence: Figure 5.22 shows the plot of the control overhead in Kbytes for OSPF and TRP. The control overhead in the case of OSPF varies from 250 Kbytes for the 21 router network to around 750 to 800 Kbytes for the 45 router network. The increase in overhead almost triples when the network size doubles. The

Figure 5.22: TRP vs. OSPF Routing Control Overhead Size (KB)

control overhead for TRP was 2.6 Kbytes for the 21 router network and around 6 Kbytes for the 45 router network. The improvement achieved with TRP in magnitude is 100 times in the case of the 21 router network and 130 times in the case of the 45 router network.

Figure 5.23: TRP vs. OSPF Routing Table Entry Size

• Routing Table Size: In Figure 5.23, the routing table sizes collected for the two test sites are the same in both case of TRP and OSPF, and hence the two site graphs have been merged into one. The figure records the maximum of the routing table entries noted in the routers. In the case of OSPF this value is 25 for the 21 router network (as there are

25 segments) and in the case of the 45 router network this value was recorded as 55. In the case of TRP, the routing table entry reflects the number of directly connected neighbors, so in both cases, i.e. the 21 router and 45 router networks the maximum routing table entry was 4. TRP routing table sizes do not depend on the network size, which provides proof to the scalability of TRP.

To summarize, Figures 5.21 and 5.22 provided the performance statistics relating to the initial convergence of the two routing protocols. These metrics were recorded as they reflect the difference of the underlying techniques in the two protocols. Their impact on routing performance during normal network operation especially when there are link or node failures is presented next.

Figure 5.24: TRP vs. OSPF Convergence Time after Failure (sec)

• Convergence Time after Link Failure: Figure 5.24 is the routing table update time in seconds subsequent to a link failure detection. While OSPF shows an update time of 1.5 to 2 seconds for the 45 router network and just over a second for the 21 router network, TRP update times were as low as 200 ms to 240 ms; a magnitude of 6 improvement for the smaller network and a magnitude of 8 improvement for the larger network. The

routing table update time is invariant to the network size in the case of TRP and the increased improvement over OSPF as the number of routers increase is to be noted.

Figure 5.25: TRP vs. OSPF Control Packet Size after Failure (KB)

• Control Overhead after Link Failure: Figure 5.25 is the plot of the control overhead for TRP and OSPF collected during the convergence times after link failure, which includes the time to detect a failure and also the time to update the routing tables. For the given topologies no control overhead is incurred with TRP, i.e. there is no necessity to propagate any change messages to the network. OSPF required around 100 Kbytes and 70 Kbytes of control packets for the 45 router and 21 router networks respectively. Though for the given topology TRP does not incur any control overhead, for complex topologies, in TRP the change information may have to be propagated to networks downstream to block usage of the invalid address when a link goes down. Similarly upstream router may also have to be informed when a downstream link fails. These features are planned for testing using simulations as with the limited

interfaces supported per Emulab equipment it was not possible to have such configurations.

• Data Packets Lost: Since link failure detection mechanism of missing four hello messages is the same for both OSPF and TRP, the packets lost during failure detection is the same for both protocols and hence is not presented. The time to update the routing tables is recorded to be around 0.2 seconds for TRP and 1.2 seconds to 2.0 seconds for OSPF. Thus the packets lost during routing table update time was a maximum of 1 packet for TRP and a maximum of 10 packets with OSPF at a data rate of 5 packets per second generated by SIperf and Iperf respectively.

The TRA address in the FCT architecture is used by TRP for packet forwarding. Initial convergence time and convergence time after failure are significantly low because TRP does not require message flooding to all nodes in the network or any significant recalculations and re-computing on topology change. As a result of no message flooding, control overheads are also very low. Entries in routing tables used in TRP are satisfied by addresses of only the direct neighbors because of the inherent routing information in the TRA addresses. Thus, the routing table sizes in TRP are significantly small. From the results presented thus far it would be clear that TRP would be an ideal routing protocol to address scalability concerns as networks grow in number and in size. This is true as the routing table sizes and routing table update time is independent of network size. This in turn will positively impact the routing performance in the network.

5.2 Evaluation of TRP in Inter-domain Rout-

In document Tiered Based Addressing in Internetwork Routing Protocols for the Future Internet (Page 90-102)