Routing Testbed: Initial Design

Cengiz Alaettinoglu, Klaudia Dussa, A. Udaya Shankar, Jean Bolot

Department of Computer Science

University of Maryland, College Park, Maryland 20742

UMIACS 90-71 CS-TR-2475 May 18, 1990

Abstract

This report presents the initial design of a testbed for developing routing algorithms for large dynamic computer networks.

This research was supported in part by RADC and DARPA under contract number F30602-90-C-0010 to UMIACS at the University of Maryland. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency, RADC, or the U.S. Government.

1 Introduction

Our objective is to provide a testbed for developing robust routing algorithms for large dynamic networks. The testbed should allow us to analyze, simulate, and evaluate different routing algorithms and network topologies. Because of the size of the networks to be studied (up to 2000 nodes or more), a direct implementation, with each node being a separate computer, is not possible. Hence, our approach is to realize the testbed with a simulator.

In order to achieve our objective, the simulator must effectively model all functions of a computer network that affect the performance and robustness of routing algorithms. These functions include the following:

Physical Network

The topology of the nodes and links. The bandwidth, propagation delay, and failure characteristics of links. The speed, memory, and failure characteristics of nodes.

Routing Protocols and Name Resolution Protocols

Routing protocols are distributed algorithms used for maintaining routes. Name resolution protocols are distributed algorithms used for mapping names to addresses. Typically in a large network, there are several of these protocols organized in various ways, both hierarchically and as peers.

Transport Protocols

Transport protocols are above the routing and naming protocols. They ensure reliable communication between any two hosts in a network, using retransmission-with-acknowledgement schemes and flow-control schemes. Both of these schemes interact very strongly with routing schemes.

Application Protocols

These protocols are above the transport protocols. They are the ultimate producers and consumers of data that travels through the network. Two categories of applications are important: file transfer and remote login.

In the remainder of this section, we give a brief overview of possible approaches in simulation and survey some existing simulators. Section 2 explains the fundamentals of discrete-event simulation. Section 3 presents the overall structure of our simulator. Sections 4 through 9 discuss the various modules of the simulator. Section 10 describes successively enhanced versions of the simulator.

Approaches to simulation

We use the term target system to refer to the real-time computer network that is to be simulated. Preliminary discussions led to three fundamentally different approaches to simulating the target system: (1) process emulation, (2) time-driven simulation, and (3) discrete-event simulation.

In process emulation, every node and link of the target system is represented by a separate process. Every node process executes code for application functions, transport functions, routing functions, buffer management, etc. Every link process executes code modeling the propagation of messages in transit. In order for this set of processes to faithfully capture the dynamics of the target system, it is necessary that all processes execute their code at the same rate. To achieve this, synchronization points are introduced into the code. When a process reaches a synchronization point, it waits until all processes have reached the synchronization point. Thus, the execution of the processes proceeds from one synchronization point to the next. One advantage of process emulation is that the actual code of the target system can be used, thereby making the simulation very realistic. A disadvantage is the overhead for the process synchronization.
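The synchronization-point idea can be illustrated with a small sketch. It assumes Python threads as stand-ins for node and link processes and uses the standard `threading.Barrier` as the synchronization point; the function name `run_emulation` and the trace format are hypothetical and are not part of the testbed design.

```python
import threading

def run_emulation(num_processes, num_steps):
    """Run num_processes threads in lockstep and return the execution trace."""
    barrier = threading.Barrier(num_processes)  # the synchronization point
    trace = []  # (step, process_id) entries in execution order

    def process(pid):
        for step in range(num_steps):
            trace.append((step, pid))  # stand-in for node or link code
            barrier.wait()             # wait until all processes arrive here

    threads = [threading.Thread(target=process, args=(pid,))
               for pid in range(num_processes)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return trace

trace = run_emulation(num_processes=3, num_steps=4)
steps = [step for step, _ in trace]
# No process starts step k+1 before every process has finished step k.
print(steps == sorted(steps))
```

Every pass through the barrier is pure overhead from the point of view of the simulated system, which is the disadvantage noted above.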

Time-driven simulation is like process emulation, except that instead of synchronizing processes, we divide each component's code into partitions that take the same amount of time to execute. The simulation is done in cycles, with one partition from each node and link executed in every cycle. Time-driven simulation can be very realistic, because the actual code of the target system can be used. But it is not very flexible. To add a new node function, the code of the node has to be repartitioned so that the execution time of each partition fits into the time frame.

In discrete-event simulation, the target system is abstracted by a set of variables and a set of events. The behavior of a system is represented by a list of event occurrences, where the time of each occurrence is known. Each event occurrence updates the values of the variables, as specified by event routines. The advantage of discrete-event simulation is that it is very flexible. More states and events can be introduced to make the simulation more realistic, at the expense of increasing computational cost.

From these approaches, we chose discrete-event simulation. It offers the greatest flexibility in modeling the target system, including different hierarchies and routing algorithms.

First we will implement a centralized discrete-event simulator. Later, we plan to implement a distributed discrete-event simulator, in order to simulate larger target systems.

All three approaches mentioned above require increasing amounts of processing time and memory as the size of the target system increases. One way to handle this problem is to simulate only a part of the target system, and abstract the rest by a simple approximation system [LZGS 84]. The problem here is how to obtain a simple approximation system that adequately represents the behavior of the rest of the target system. We believe that good approximations can be obtained by conducting careful studies of the complete target system.

Survey of some existing simulators

In order to reduce programming work, we examined three simulators that were available to us: (1) Network Simulator from MIT [Hey 89, Mar 88], (2) DeNet from University of Wisconsin-Madison [Liv 89], and (3) COMNET II.5 from CACI Products Company [CACI].

Network Simulator

Network Simulator is a discrete-event simulator designed for simulating a network of message-passing components. A component consists of a data structure and an action routine. The data structure stores the information needed by the component and the simulator. The action routine is called for every event that relates to the component. The user can define new components by specifying a new data structure and the associated action routine. The new component must respond to a number of predefined commands, e.g. create an instance, delete an instance.

The simulator has a graphic interface. It displays the topology of the network and the parameters of its operation. It allows the user to define and modify the network, control the simulation, log parameters, and save and load the network configurations. The user interface is almost entirely mouse-driven.

Network Simulator does not provide any components to describe routing, hierarchies and address resolution. The simulator is written in C and runs under UNIX and X windows.

The source code is available. The documentation inside the code is good, whereas the external documentation is poor [Hey 89, Mar 88].

DeNet

DeNet (Discrete Event Network) is a general purpose simulator. It views the target system as a set of objects that communicate with each other. Objects are defined as instances of Discrete Event Modules (DEVM). DEVMs can be arbitrarily complex. Discrete event connectors (DEVC) are used to interface objects. An instance of a DEVC establishes a direct interface through which one object can observe changes in the state of another object. The simulator is constructed from module types and connector types according to a description provided at runtime. A principal module manages the simulation run. It invokes the interpreter to read the target system description, and then constructs a representation in terms of DEVMs and DEVCs. After initialization, the principal starts the simulation run.

The DeNet simulator can be viewed as a graph, where nodes represent instances of DEVMs and directed arcs represent instances of DEVCs. The description of the graph can be given in DeTop, a special purpose language for defining the topology of directed graphs.

TopDraw is a graphical editor for DeTop files. DeVise is a visualization tool. The DeNet simulator is written in Modula-2. Currently, it runs on Sun workstations. There are no predefined modules for nodes, links, routing, hierarchy, etc. The external documentation is poor. We did not have access to the source code [Liv 89].

COMNET II.5

COMNET II.5 is a commercially available, discrete-event simulator designed for simulating message-passing networks. It accepts a description of the target network, including traffic and routing algorithms, forms an internal representation, and simulates it.

A menu-driven screen editor is used to define and modify the target network description. The network description is split up into three parts: network topology, network traffic, and network operation. The network topology specifies the nodes, the links, and the access facilities of the network. A node can model a circuit switch, a store-and-forward switch, or both. A link models a point-to-point transmission channel. Access facilities can model point-to-point or multipoint lines, token-ring LANs, or CSMA/CD LANs. There are various attributes to nodes, links, and access facilities. The network traffic can be defined as circuit-switched calls, data-messages, and virtual-circuit calls. Categories of traffic can be defined in terms of source node, destination node, and class of service. Network operation includes a description of the routing algorithm. Static routing and adaptive routing are offered by the simulator.

COMNET II.5 does not consider routing hierarchies or different name resolution schemes. The external documentation does not mention any possibility of adding user-defined components [CACI].

Our choice

We found Network Simulator most suitable for our purposes for several reasons. First, the source code was readily available, and the style of coding and documentation is good. Thus modifications and expansions seem easy to do. Second, it is written in a very portable form of C, a language and programming environment with which we have much expertise. Third, it does not enforce modular composition rules, unlike DeNet. As a result it can be extended to be a very efficient simulator, in terms of processing and memory requirements. This is very important because of the large target networks that we will be simulating.

2 Fundamentals of Discrete-Event Simulation

In discrete-event simulation, each execution of the target system is represented by a sequence of state transitions, where the time of each transition is known. The state space of the target system is represented by a set of state variables. The state transitions are represented by a set of events. Associated with each event is a routine that specifies how the state variables are updated when the event occurs. An event can occur multiple times during a simulation.

We use the event occurrence tuple $(e, t)$ to denote the occurrence of event $e$ at time $t$. Thus, a simulation log can be formally defined as a sequence of the form:

$$\langle s_0, (e_1, t_1), s_1, (e_2, t_2), \ldots, (e_n, t_n), s_n \rangle$$

where each $s_i$ is a system state and each $(e_i, t_i)$ is an event occurrence such that the following holds: (1) $s_0$ is an initial state; (2) for all $i$, state $s_i$ results from executing at time $t_i$ the routine of event $e_i$ in state $s_{i-1}$; and (3) the $t_i$'s are nondecreasing, non-negative numbers.
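As an illustration of conditions (2) and (3), the following sketch checks whether a recorded sequence is a simulation log. The representation is an assumption made for the example: the log is a list of `((event, time), state)` pairs, `routines` maps an event name to a function from the previous state and the time to the next state, and condition (1) is taken for granted by passing the initial state in.

```python
def is_valid_log(initial_state, occurrences, routines):
    """Check conditions (2) and (3) of the simulation log definition."""
    state, last_time = initial_state, 0.0
    for (event, time), recorded_state in occurrences:
        if time < last_time:                  # (3): nondecreasing, non-negative
            return False
        state = routines[event](state, time)  # (2): apply the routine of e_i
        if state != recorded_state:
            return False
        last_time = time
    return True

# Hypothetical one-variable system: the state is the number of customers.
routines = {
    "arrival": lambda s, t: s + 1,
    "departure": lambda s, t: s - 1,
}
good_log = [(("arrival", 0.0), 1), (("arrival", 1.5), 2), (("departure", 2.0), 1)]
bad_log = [(("arrival", 1.0), 1), (("departure", 0.5), 0)]  # time decreases

print(is_valid_log(0, good_log, routines), is_valid_log(0, bad_log, routines))
```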

Note that the last time value, $t_n$, indicates the time up to which the target system has been simulated. We refer to $t_n$ as the simulated time of the simulation log. Given a simulation log, we can compute performance measures.†

How does the simulator efficiently compute a simulation log? The classical solution is to generate the simulation log in an iterative fashion, which we describe next. In addition to the state variables mentioned above, the simulation maintains the following state variables with the indicated meanings holding just before an iteration.

Current Simulation Log. A sequence of alternating states and event occurrences. It equals a simulation log and is initially empty.

Current Simulated Time. Indicates the simulated time for the Current Simulation Log.

Event Occurrence List. A sequence of event occurrences that are yet to be executed. Each $(e, t)$ pair in this list satisfies $t \geq$ Current Simulated Time. Initially, the list contains one or more event occurrence pairs to trigger the simulation (e.g. a customer arrival at time 0).

We say that event $e$ is scheduled at time $t$ if and only if $(e, t)$ is in the Event Occurrence List. In the target system, the occurrence of an event $e$ can cause events to occur later on. (For example, a customer arrival at time $t$ to a server causes a departure at time $t + s$, where $s$ is the service time for the customer.) This is modeled by having the routine of event $e$ schedule new event occurrences. (For example, the routine for the customer arrival adds the entry (customer departure, Current Simulated Time $+ s$) to the Event Occurrence List.)

The iterative activity of the simulator can now be described by the following:

† In all the simulations of interest to us, the simulated time should increase without bound as the number of events simulated increases, i.e. $\lim_{n \to \infty} t_n = \infty$.

while simulation not over do
begin
    1. Pick up an (e, t) tuple with minimum t from the Event Occurrence List.
    2. Current Simulated Time := t
    3. Execute the routine of e (this may schedule new event occurrences).
    4. Append (e, t) and s to Current Simulation Log, where s is a subset of
       the current system state.
end

result: Upon termination, the current simulation log equals a simulation log. Post-processing can be done on it to obtain performance measures.
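The loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the testbed implementation (which builds on a simulator written in C): the class `Simulator`, the binary heap used for the Event Occurrence List, the FIFO tie-breaking counter, and the constant `SERVICE_TIME` are all choices made for the example. The customer arrival and departure routines follow the example in the text, with each customer departing $s$ time units after arriving.

```python
import heapq
import itertools

class Simulator:
    def __init__(self):
        self.now = 0.0                 # Current Simulated Time
        self.event_list = []           # Event Occurrence List (min-heap on time)
        self.log = []                  # Current Simulation Log
        self._tie = itertools.count()  # keeps equal-time events in FIFO order

    def schedule(self, time, routine, *args):
        assert time >= self.now, "cannot schedule in the past"
        heapq.heappush(self.event_list, (time, next(self._tie), routine, args))

    def run(self, until, state):
        while self.event_list and self.event_list[0][0] <= until:
            time, _, routine, args = heapq.heappop(self.event_list)  # step 1
            self.now = time                                           # step 2
            routine(self, state, *args)                               # step 3
            self.log.append((routine.__name__, time, dict(state)))    # step 4

SERVICE_TIME = 2.0  # assumed constant service time s

def arrival(sim, state):
    state["in_system"] += 1
    sim.schedule(sim.now + SERVICE_TIME, departure)  # schedules a later event

def departure(sim, state):
    state["in_system"] -= 1

sim = Simulator()
state = {"in_system": 0}
sim.schedule(0.0, arrival)
sim.schedule(1.0, arrival)
sim.run(until=10.0, state=state)
for entry in sim.log:
    print(entry)
```

The heap gives logarithmic-time insertion and removal of the minimum-time tuple; any priority queue with the same interface would serve equally well.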

In general, storing the entire simulation log requires excessive memory, and is not feasible. In such cases, the standard solution is to decide a priori on some performance measures that can be computed incrementally (such as accumulated response time, number of departures, etc.), and store these instead of the simulation log. Then, just before step 2, we insert a step, where the performance measures are updated, using the information that the current system state has lasted for the past ($t -$ Current Simulated Time) seconds. Now, there is no need for the Current Simulation Log and step 4 is removed.
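A sketch of this variant follows. It assumes the same customer example with a constant service time, and it picks two measures for illustration: the number of departures and the time-averaged number of customers in the system. The function name `simulate` and the accumulator `area` are hypothetical.

```python
import heapq

def simulate(arrival_times, service_time):
    """Return (mean number in system, departures) without storing a log."""
    # Heap entries are (time, kind_order, kind); arrivals sort before
    # departures at equal times.
    events = [(t, 0, "arrival") for t in arrival_times]
    heapq.heapify(events)
    now, in_system = 0.0, 0
    area, departures = 0.0, 0  # incrementally maintained measures
    while events:
        t, _, kind = heapq.heappop(events)
        area += in_system * (t - now)  # state lasted (t - now) seconds
        now = t                        # step 2
        if kind == "arrival":
            in_system += 1
            heapq.heappush(events, (t + service_time, 1, "departure"))
        else:
            in_system -= 1
            departures += 1
    mean = area / now if now > 0 else 0.0
    return mean, departures

mean_in_system, departures = simulate([0.0, 1.0], service_time=2.0)
print(mean_in_system, departures)
```

Memory use is now independent of the number of events simulated, at the cost of fixing the measures before the run.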

3 Modular Structure of our Simulator

To provide for flexible usage, the simulator is designed as a collection of interacting modules.

Most of the modules in our simulator correspond to components of the target system. These modules can be adapted or replaced without altering the rest of the simulator. The rest of the modules in the simulator are concerned with simulation functions, such as event handling, maintenance of the event occurrence list, performance measures, etc. Before describing the particular modules of our simulator, we give a generic description of modules.

A module encapsulates a data structure and events to access it. An event of one module can call or schedule an event of another module. Formally a module consists of the following (see Figure 1):

Data structures

Typically, the data structures of the module can be directly accessed only by events of the module. Other modules can access them only via the events of this module.

However, for the sake of efficiency, we sometimes allow these data structures to be directly accessed by events of other modules. Any such access is highlighted.

Internal events

An internal event operates only on the data structures of the module. Its routine is called by the event handler or by another event of this module. It can call and schedule events of this module and of other modules.

External events

An external event is called by the event handler, by another event of this module, or by an event of another module. An external event can call or schedule events of this module and of other modules.

Events can be defined with parameters, for example, representing data conveyed from one module to another.

As mentioned above, an event routine can call another event routine, and so on. It is necessary to ensure, not only that there is no cycle (infinite recursion), but also that at most one event of any module is executed in any single event occurrence. We enforce this by syntactic constraints on the event routines, i.e. in the simulator design.

[Figure 1: Structure of a module. A module comprises a data structure, internal events, and external events.]

Modules of the simulator

We now give an overview of the modules in our simulator. The modules are divided into those that model the target system and those that handle the simulation function. See Figure 2.

The modules modeling the target system are clustered into four groups. The Physical Network Group provides modules representing the physical nodes and links. The Routing/Naming Group provides modules for routing, hierarchy management, and name service. The Application/Transport Group provides modules for transport protocol functions (such as connection management, flow control, etc.) and application functions (file transfer traffic, remote login traffic). The Topology Modification Group provides modules for dynamic modification of the network topology, i.e. adding/removing links, nodes, connections, etc.

[Figure 2: Structure of the testbed. Groups shown: Simulator Manager Group; Performance Evaluation Group; Topology Modification Group; Application/Transport Group (Application: Telnet, FTP; Transport Protocol (TCP)); Routing/Naming Group (Routing, Hierarchy, Name Server); Physical Network Group (Nodes, Links).]

The modules handling the simulation function are divided into two groups. The Simulator Manager Group contains modules that start the simulation, manage the event occurrence list, generate the current simulation log, etc. The Performance Evaluation Group contains modules that collect statistics and monitor the performance of the target system modules.

Together, these six groups of modules constitute the testbed. In sections 4 to 9, we describe these groups of modules in more detail. For each module, we list the data structures and the events of the module. For each event, we describe (1) its effect on the data structures, (2) the events of this module and of other modules that it calls (if any), and (3) the events of this module and of other modules that it schedules (if any).

4 Physical Network Group

The physical network group models the following: (1) topology of the network, (2) buffer policy at each node regarding received messages and outgoing messages, (3) messages in transit on each link, and (4) failure and repair of nodes and links.

We divide the messages in the network into two classes: transport and routing packets. Transport packets carry information for application/transport modules. Routing packets carry information for routing/naming modules. A transport packet received by a node is either directed up to the transport module of this node, or redirected to an outgoing link. A routing packet received by a node has to be processed by the routing/naming modules of this node. We use two special types of routing packets, shutdown packets and startup packets, to indicate, respectively, failure and repair of links and nodes, to the routing/naming modules. If the routing/naming modules do not use this information, they can ignore it.

Each packet has additional fields that will be introduced in the report as needed.

Links

Each link in the target system is assumed to be bi-directional. We simulate each link by two queues of messages, one for each direction. Messages in a queue represent packets in transit. These queues are manipulated by send events, receive events, failure events, and repair events. We assume that in case of a link failure, both directions of a link are broken.

Parameters for links are propagation delay, transmission speed, error rate, failure rate, and repair rate. Any parameter can be deterministically or probabilistically defined.

Conventions: Consider a link that connects two nodes, say $A$ and $B$. The queue from $A$ to $B$ is referred to as an outgoing link queue of $A$ and as an incoming link queue of $B$. For the queue of messages from $A$ to $B$, $A$ is referred to as the source node and $B$ as the destination node. We consider the occurrence of a send event (receive event) to correspond to the arrival of the last bit of a packet at a link (node). One advantage of this is that if another packet is waiting to be sent (received), it can be scheduled by the send (receive) routine.

A link module consists of the following:

Link Status

Data structure. Indicates whether the status of the link is up or down.

$LQ_1$, $LQ_2$

Data structures. Two link queues, one for each direction. Each link queue is a queue of $(p, t)$ tuples, where $p$ is a packet and $t$ is a time value. Every tuple $(p, t)$ represents a packet in transit, with $t$ indicating the time of reception at the destination node.

Link.Send($LQ_i$, $p$)

External event, where parameter $LQ_i$ is either $LQ_1$ or $LQ_2$ and parameter $p$ is a packet. Called by a send event of the source node of the link queue $LQ_i$ (node events are described in section 4.2).

If the link status is up, Link.Send($LQ_i$, $p$) appends the tuple $(p, t)$ to the tail of $LQ_i$, where $t$ equals the sum of the current simulated time and the propagation time of the link. If $(p, t)$ is the only tuple in the link queue, Link.Send($LQ_i$, $p$) schedules a Link.Receive($LQ_i$) at time $t$.

If the link status is down, Link.Send($LQ_i$, $p$) does not append $p$ to $LQ_i$. Instead, if $p$ is a routing packet, it is discarded. If $p$ is a transport packet, it calls the transport retransmit event at the node that generated the packet $p$. (Transport events are described in section 6.) This is one way to model retransmissions initiated by the transport protocol.

Link.Receive($LQ_i$)

Internal event. Called by the event handler. Link.Receive($LQ_i$) removes the first $(p, t)$ tuple in link queue $LQ_i$. It calls the receive event of the destination node of $LQ_i$. If there is another tuple $(p', t')$ at the head of $LQ_i$, it schedules a Link.Receive($LQ_i$) event at time $t'$.

Link.Failure does the following. It sets the link status to down. For each transport packet $p$ in link queues $LQ_1$ and $LQ_2$, it calls the transport retransmit event at the node that generated the packet $p$. It empties the link queues. It places a tuple $(p, t)$ in each link queue, where $p$ is a shutdown packet and $t$ is the reception time of $p$. It schedules a Link.Receive($LQ_1$) and Link.Receive($LQ_2$), both for time $t$.

Link.Repair does the following. It sets the link status to up. It places a tuple $(p, t)$ in each link queue, where $p$ is a startup packet and $t$ is the reception time of $p$. It schedules a Link.Receive($LQ_1$) and Link.Receive($LQ_2$), both for time $t$.
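The send, receive, and failure events of the link module can be sketched as follows. Several things here are assumptions made for the illustration and are not part of the design above: the minimal `Scheduler` standing in for the event handler, packets represented as dictionaries with a `kind` field, the `deliver` and `retransmit` callbacks standing in for the node receive event and the transport retransmit event, the choice of current time plus propagation delay as the reception time of a shutdown packet, and the `epoch` counter used to ignore receive events that were scheduled before a failure emptied the queues.

```python
import heapq
from collections import deque

class Scheduler:
    """Hypothetical minimal event handler: a time-ordered list of callbacks."""
    def __init__(self):
        self.now, self._events, self._n = 0.0, [], 0

    def schedule(self, time, fn, *args):
        self._n += 1
        heapq.heappush(self._events, (time, self._n, fn, args))

    def run(self):
        while self._events:
            self.now, _, fn, args = heapq.heappop(self._events)
            fn(*args)

class Link:
    def __init__(self, sched, delay, deliver, retransmit):
        self.sched, self.delay = sched, delay
        self.deliver = deliver        # stands in for the node receive event
        self.retransmit = retransmit  # stands in for transport retransmit
        self.up = True                # Link Status
        self.epoch = 0                # assumption: invalidates stale receives
        self.queues = {1: deque(), 2: deque()}  # LQ_1 and LQ_2

    def send(self, i, packet):
        if not self.up:
            if packet["kind"] == "transport":
                self.retransmit(packet)
            return                    # routing packets are discarded
        t = self.sched.now + self.delay
        self.queues[i].append((packet, t))
        if len(self.queues[i]) == 1:
            self.sched.schedule(t, self.receive, i, self.epoch)

    def receive(self, i, epoch):
        if epoch != self.epoch:
            return                    # scheduled before the last failure
        packet, _ = self.queues[i].popleft()
        self.deliver(i, packet)
        if self.queues[i]:
            self.sched.schedule(self.queues[i][0][1], self.receive, i, epoch)

    def fail(self):
        self.up = False
        self.epoch += 1
        for i, queue in self.queues.items():
            for packet, _ in queue:
                if packet["kind"] == "transport":
                    self.retransmit(packet)
            queue.clear()
            t = self.sched.now + self.delay  # assumed reception time
            queue.append(({"kind": "shutdown"}, t))
            self.sched.schedule(t, self.receive, i, self.epoch)

sched = Scheduler()
delivered, resent = [], []
link = Link(sched, delay=1.0,
            deliver=lambda i, p: delivered.append((sched.now, i, p["kind"])),
            retransmit=lambda p: resent.append(p["id"]))
sched.schedule(0.0, link.send, 1, {"kind": "transport", "id": "a"})
sched.schedule(0.5, link.fail)
sched.schedule(0.6, link.send, 1, {"kind": "routing", "id": "b"})
sched.run()
print(delivered, resent)
```

In this run the transport packet in transit at the time of the failure triggers a retransmission, the routing packet sent on the failed link is dropped, and both ends of the link receive a shutdown packet.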

Nodes

In modeling a node for the physical network group we are concerned with how a node buffers received messages and outgoing messages. We realize that there are many possible buffer management policies. Here, we consider one policy. Other possibilities can be realized easily by changing the node module.

A node has a separate queue for each outgoing link and a single queue for all received routing packets. The queues in a node are manipulated by send events, receive events, produce events, failure events, and repair events. Parameters for the nodes include the available buffer space, buffer management policies, CPU speed, failure rate, and repair rate.

A node module consists of the following:

Node Status

Data structure, shared with the routing module. Indicates whether the node status is up or down.

Outgoing Packet Queue($LQ$)

Data structure, where the parameter $LQ$ ranges over outgoing link queues. Shared with the routing/naming modules of this node (described in section 5). Queue of $(p, t)$ tuples where $p$ is a packet and $t$ is a time value. Every tuple $(p, t)$ represents a packet that is to be transmitted to link queue $LQ$, where $t$ is the time of the end of transmission.

Routing Packet Queue

Data structure. Queue of $(p, t)$ tuples. Every tuple $(p, t)$ denotes a routing packet where $t$ signifies the time of its processing. We assume that this corresponds to the end of processing this packet.

Routing Table

Data structure. Shared with routing/naming modules of this node. Read-only access by the node module.

Node.Send($LQ$)

Internal event, where the parameter $LQ$ ranges over the outgoing link queues for this node. Called by the event handler. Node.Send($LQ$) assumes that Outgoing Packet Queue($LQ$) is not empty. It removes the first tuple $(p, t)$ from the queue, and calls the appropriate send event of the link module. If there is still another tuple $(p', t')$ in the outgoing packet queue, it schedules a Node.Send($LQ$) event at time $t'$.

Node.Receive($p$)

$p$ is a packet. Called by a link module receive event corresponding to an incoming link queue.

If the node status is up, Node.Receive($p$) does the following:

- If $p$ is a transport packet destined for this node, Node.Receive($p$) calls the receive event of the transport module of this node.

- If $p$ is a transport packet not destined for this node, Node.Receive($p$) consults the routing table and appends $(p, t)$ to an Outgoing Packet Queue($LQ$); if $(p, t)$ is the only packet in that queue, Node.Receive($p$) schedules a Node.Send($LQ$) at time $t$. Here $t$ is the sum of Current Simulated Time, the time for consulting the routing table, and the transmission time of the packet.

- If $p$ is a routing packet, Node.Receive($p$) evaluates the function Processing.Time($p$) and places the tuple $(p, t)$ at the tail of Routing Packet Queue, where $t$ is the sum of Current Simulated Time and Processing.Time($p$). (Processing.Time($p$) is a function that returns the time required to process $p$. It is described in section 5.) If $(p, t)$ is the only packet in the routing packet queue, Node.Receive($p$) schedules a routing process event at time $t$.

If the node status is down or there is no buffer space available, the packet $p$ is discarded. If $p$ is a transport packet, Node.Receive($p$) calls the retransmit event of the transport module that generated the packet $p$.
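The case analysis of the receive event can be sketched as below. To keep the example self-contained, calls and scheduling requests are only recorded in an `actions` list rather than executed. The packet fields (`kind`, `id`, `src`, `dst`), the `buffer_slots` limit counted in packets, and the timing parameters passed to `receive` are assumptions made for the illustration; the buffer check is applied to every arriving packet, which is one possible reading of the policy above.

```python
from collections import deque

class Node:
    def __init__(self, name, routing_table, buffer_slots):
        self.name = name
        self.up = True                      # Node Status
        self.routing_table = routing_table  # destination -> outgoing link id
        self.buffer_slots = buffer_slots    # assumed: packets the node can hold
        self.outgoing = {}                  # link id -> Outgoing Packet Queue
        self.routing_queue = deque()        # Routing Packet Queue
        self.actions = []                   # what a simulator would call/schedule

    def _buffered(self):
        return len(self.routing_queue) + sum(len(q) for q in self.outgoing.values())

    def receive(self, packet, now, lookup_time=0.0, tx_time=0.0, processing_time=0.0):
        if not self.up or self._buffered() >= self.buffer_slots:
            # Packet is discarded; transport packets trigger a retransmission.
            if packet["kind"] == "transport":
                self.actions.append(("retransmit", packet["src"]))
            return
        if packet["kind"] == "transport":
            if packet["dst"] == self.name:
                self.actions.append(("transport_receive", packet["id"]))
                return
            link = self.routing_table[packet["dst"]]
            queue = self.outgoing.setdefault(link, deque())
            t = now + lookup_time + tx_time  # end of transmission
            queue.append((packet, t))
            if len(queue) == 1:
                self.actions.append(("schedule_send", link, t))
        else:
            t = now + processing_time        # end of processing
            self.routing_queue.append((packet, t))
            if len(self.routing_queue) == 1:
                self.actions.append(("schedule_routing_process", t))

node = Node("B", routing_table={"C": "B-C"}, buffer_slots=2)
node.receive({"kind": "transport", "id": 1, "src": "A", "dst": "B"}, now=0.0)
node.receive({"kind": "transport", "id": 2, "src": "A", "dst": "C"},
             now=1.0, lookup_time=0.25, tx_time=0.25)
node.receive({"kind": "link-state", "id": 3, "src": "A"}, now=2.0, processing_time=0.5)
node.receive({"kind": "transport", "id": 4, "src": "A", "dst": "C"}, now=3.0)
for action in node.actions:
    print(action)
```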

Node.Produce($p$, $LQ$)

$p$ is a packet, and parameter $LQ$ is either null or specifies an outgoing packet queue of the node. Called by events of transport/application modules and routing modules. If $LQ$ is null, Node.Produce($p$, $LQ$) consults Routing Table and appends $(p, t)$ to the appropriate outgoing packet queue. If $LQ$ is not null, Node.Produce($p$, $LQ$) appends $(p, t)$ to $LQ$. In either case, if $(p, t)$ is the only tuple in the outgoing packet queue, it schedules a Node.Send event at time $t$. Here $t$ is the sum of the current simulated time, the time for consulting the routing table, and the transmission time of the packet. If there is no buffer space available, and $p$ is a transport packet, Node.Produce($p$, $LQ$) calls the retransmit event of the transport module.

Node.Failure does the following. It sets the node status to down. For each transport packet $p$ in the outgoing packet queues, it calls the transport retransmit event at the node that generated the packet $p$. It empties Routing Packet Queue and/or the outgoing packet queues, depending on the failure model. It appends a tuple $(p, t)$ to Routing Packet Queue of the node, where $p$ is a shutdown packet and $t$ is the time of the failure occurrence. If $(p, t)$ is the only packet in Routing Packet Queue, it schedules a process event at the node at time $t$.

Node.Repair does the following. It sets the node status to up and appends a tuple $(p, t)$ to Routing Packet Queue. Where the tuple $(p, t)$ is placed in Routing Packet Queue, and when it will be processed, depends on the repair model. If $(p, t)$ is the only packet in the Routing Packet Queue, it schedules a routing process event at the node at time $t$, where $t$ is the end of repair.

5 Routing/Naming Group

There can be many paths from a node to a destination node. A good routing protocol should cause packets to be routed along a shortest path, i.e. one of minimum cost. The cost of a path is the sum of the costs of its links. The cost of a link depends on its propagation delay, its bandwidth, and the size of its outgoing queue in the source node. Because of the last quantity, the cost of a link, and hence the cost of a path, varies with time, as connections are opened and closed, and paths are chosen and discarded.

There are many different approaches to routing. One of the most common approaches is next-hop routing. In this approach, for each destination each node is only aware of which outgoing link (i.e. next-hop) to take to reach the destination. The basic problem of the routing algorithms in this approach is to determine at each node a good next-hop for each destination.

Routing algorithms for next-hop routing are usually classified into link-state algorithms and distance-vector algorithms. In link-state algorithms, each node attempts to maintain the global topology, i.e. the state of all links in the network. Periodically, neighboring nodes exchange their global topology information. Each node calculates shortest paths, and thereby chooses the next-hops to take. In distance-vector algorithms, each node maintains for each destination the cost via each of its neighbours to reach the destination. Periodically, neighboring nodes exchange their costs to each destination. A node chooses an outgoing link as the next hop if the cost of that link plus the cost to the destination from the node at the other end is minimum over such costs of all other neighbours. There are many different versions of link-state algorithms and distance-vector algorithms [MeSe 79, JaMo 82, Garc 89, McRR 80].
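The distance-vector selection rule can be written down directly. In this sketch the two dictionaries are assumed representations of what a node knows: `link_cost` holds the cost of the link to each neighbour, and `neighbour_cost` holds the cost to each destination last reported by each neighbour. The function name `choose_next_hop` is hypothetical.

```python
def choose_next_hop(link_cost, neighbour_cost, destination):
    """Return (neighbour, total cost) minimizing link cost plus reported cost."""
    best, best_cost = None, float("inf")
    for neighbour, cost in link_cost.items():
        reported = neighbour_cost[neighbour].get(destination, float("inf"))
        total = cost + reported
        if total < best_cost:
            best, best_cost = neighbour, total
    return best, best_cost

link_cost = {"B": 1, "C": 4}
neighbour_cost = {"B": {"D": 5}, "C": {"D": 1}}
print(choose_next_hop(link_cost, neighbour_cost, "D"))
```

Here the more expensive link to C is preferred, because C reports a much lower cost to reach D. A destination that no neighbour reports yields no next hop and an infinite cost.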

We next describe a link-state routing algorithm very much like the Arpanet routing algorithm. In the course of the project, we will be implementing different routing algorithms under different hierarchies. The algorithm maintains three tables: a local topology table, a global topology table, and a routing table. The local topology table stores information about links incident to this node. The global topology table stores the latest reported link costs throughout the target network. The routing table stores the next-hop entries. Nodes periodically exchange local topology information with other nodes. Whenever link costs are updated or new link-state information is received, shortest paths are recalculated and new next-hops are determined.
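The recalculation step can be sketched with Dijkstra's algorithm, which is one standard way to compute shortest paths; the text above does not fix the method, so this choice is an assumption. The global topology table is represented here as a nested dictionary `topology[i][j]` giving the cost of the link from `i` to `j`, and the result maps each destination to its next hop, or to `None` if the destination is unreachable.

```python
import heapq

def next_hops(topology, source):
    """Compute a next-hop routing table for `source` from directed link costs."""
    dist = {source: 0.0}
    first = {}                        # node -> first hop on a shortest path
    heap = [(0.0, source, None)]      # (distance, node, first hop used)
    done = set()
    while heap:
        d, node, hop = heapq.heappop(heap)
        if node in done:
            continue                  # an outdated, longer entry
        done.add(node)
        first[node] = hop
        for nbr, cost in topology.get(node, {}).items():
            nd = d + cost
            if nbr not in done and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # Leaving the source, the first hop is the neighbour itself.
                heapq.heappush(heap, (nd, nbr, nbr if node == source else hop))
    return {n: first.get(n) for n in topology if n != source}

topology = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1},
    "D": {},                          # no links: unreachable from A
}
print(next_hops(topology, "A"))
```

In the example, the direct link from A to C costs 5, but the path through B costs 2, so the routing table sends traffic for C to B.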

The neighbours of a node $A$ can send to $A$ the local topology of a node $B$ at different instants of time. In order to determine which topology information is the latest, sequence numbers are used. Each node has a sequence number that it increments and sends along with local topology information. Each node also stores the largest sequence number it has received from every other node. All sequence numbers are stored in non-volatile memory, so that they survive node failures.

A node exchanges local topology information using link-state packets, which are a type of routing packet. A link-state packet contains the id of the node whose local topology is being propagated, the sequence number of that node when the packet was created, and the cost (at that time) of each link incident on that node.
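A link-state packet and the sequence-number test can be pictured as follows. The class name, the field names, and the helper is_newer are hypothetical and serve only to illustrate the description; treating a node never heard from as having sequence number -1 is likewise an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class LinkStatePacket:
    origin: str            # id of the node whose local topology is propagated
    sequence_number: int   # origin's sequence number when the packet was created
    link_costs: dict = field(default_factory=dict)  # {neighbour id: link cost}


def is_newer(packet, last_sequence_number):
    """True if the packet is more recent than anything seen from its origin.

    last_sequence_number maps each node id to the largest sequence number
    received from that node so far.
    """
    return packet.sequence_number > last_sequence_number.get(packet.origin, -1)
```

A packet from B with sequence number 3 is accepted by a node that has seen only sequence number 2 from B, and rejected by a node that has already seen 3.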

A routing/naming module

Local Topology Table

Data structure. For each neighbour node A, Local Topology Table stores a fixed cost corresponding to the propagation delay of the link to A, the size of the outgoing packet queue for this link, and the status of this link as known by this node.

Global Topology Table

Data structure. N by N matrix where N is the number of nodes in the target network. Entry (i, j) indicates the last received cost of the link between i and j (if any) in the i to j direction.

Routing Table

Data structure. Shared with the node module. For each node A in the network, stores either the next hop to reach A, or the entry null, indicating that A is unreachable.

Sequence Number

Data structure. Incremented whenever the local topology of this node is broadcast. Sequence Number is stored in non-volatile memory.


Last Sequence Number(N)

Data structure. For each node N in the target system, indicates the largest sequence number of any link-state packet received from N. Last Sequence Number(N) is stored in non-volatile memory.

Routing:Broadcast

Internal event. Called by the event handler. If the node is up, Routing:Broadcast increments Sequence Number, calculates link costs by consulting the local topology table, creates a link-state packet p for each neighbour node, calls node produce event with p and the appropriate outgoing link queue to use, recalculates shortest paths, updates the routing table, and schedules another Routing:Broadcast event.

Routing:Processing(p)

p is either a link-state packet, a shutdown packet, or a start-up packet. Called by node receive event or event handler.

- If p is a shutdown packet or a start-up packet of an adjacent link, Routing:Processing(p) stores the status of the link in the local topology table.

- If p is a shutdown packet due to the failure of this node, Routing:Processing(p) empties the local topology table, the global topology table, and the routing table, and sets Node Status to down.

- If p is a start-up packet due to a node repair event of this node, Routing:Processing(p) schedules a Routing:Broadcast event, and sets Node Status to up.

- If p is a link-state packet generated by a node B and the sequence number in p is greater than Last Sequence Number(B), then Routing:Processing(p) updates the global topology table, duplicates p for each neighbour node, calls node produce event with p and the appropriate outgoing link queue to use, recalculates shortest paths, updates the routing table, and schedules a Routing:Processing event if there is a packet in the routing packet queue.


Routing:ProcessingTime(p)

Function where parameter p is a routing packet. Invoked by node receive event. It returns an estimate for the time required to process p.
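The handling of a link-state packet in the routing/naming module can be sketched as below. This is a simplified illustration, not the testbed's implementation: the class and method names are hypothetical, the global topology is kept as a dictionary rather than a matrix, and the method returns the copies to forward instead of calling the node produce event. Two further assumptions are labeled in the comments: the stored sequence number is updated on acceptance, and stale packets are silently dropped. The shortest-path recalculation is omitted.

```python
class RoutingModule:
    """Hypothetical container for the link-state tables of one node."""

    def __init__(self, node_id, neighbours):
        self.node_id = node_id
        self.neighbours = list(neighbours)
        self.global_topology = {}       # {(i, j): last received cost of link i -> j}
        self.last_sequence_number = {}  # {origin: largest sequence number seen}

    def process_link_state(self, origin, sequence_number, link_costs):
        """Handle one link-state packet; return the (neighbour, packet) copies to send."""
        if sequence_number <= self.last_sequence_number.get(origin, -1):
            # Assumption: a stale or duplicate packet is dropped, which also
            # stops it from being forwarded again.
            return []
        # Assumption: accepting the packet records its sequence number.
        self.last_sequence_number[origin] = sequence_number
        # Replace what was previously known about the links leaving `origin`.
        for key in [k for k in self.global_topology if k[0] == origin]:
            del self.global_topology[key]
        for neighbour, cost in link_costs.items():
            self.global_topology[(origin, neighbour)] = cost
        packet = (origin, sequence_number, dict(link_costs))
        return [(neighbour, packet) for neighbour in self.neighbours]
```

Because a second copy of the same packet carries a sequence number that is no longer greater than the stored one, each node forwards a given packet at most once.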

6 Application/Transport Group

Application protocols are the ultimate producers and consumers in a computer network. They exist above the transport protocol. There are many different kinds of applications. File transfer (e.g. FTP[PoRe 85]) and remote login (e.g. TELNET[PoRe 83]) are two of the most common ones. A file transfer from node A to node B is characterized by large data packets going from A to B, and small acknowledgement packets going from B to A. The size of the packets and their intergeneration time are very regular. In a remote login application, data packets and acknowledgement packets go in both directions, and there is great variation in their sizes and generation times.

In a computer network, the transport protocol provides a reliable communication channel between any two nodes, whether or not they are directly connected by a link. To achieve this in the presence of message loss, link failures, node failures, buffer space limitations, etc., a transport protocol utilizes acknowledgement/retransmission schemes and flow control schemes. A typical retransmission policy is to retransmit a packet if it is not acknowledged within the estimated round trip time of its transmission.

Flow control mechanisms restrict the number of packets in transit in the network. Typical flow control mechanisms use either a window-based strategy[GrKl80, Jcob88, Reis79] or a rate-based strategy[Cher 86, Clar87]. Restricting the number of packets in transit keeps the outgoing packet queue sizes in nodes under control, reducing the probability of buffer overflow.

There is strong interaction between flow control schemes and routing schemes. Without a good flow control mechanism, outgoing packet queues can grow unreasonably. This increases the delay at the link, causing the routing algorithms to try to find a better path when no such path may exist. Therefore, in order to fairly compare routing algorithms, it is absolutely necessary to model flow control mechanisms.

We realize that there are many possible retransmission and flow control schemes. For preliminary consideration, we choose a simple scheme; this can be modified easily later.

Packet counters are used to represent the number of transport packets produced, the number of transport packets sent, the number of transport packets received, and the number of transport packets acknowledged (i.e. for which an acknowledgement was received). We model flow control by imposing an upper bound on the number of transport packets that have been sent but not yet acknowledged.

To estimate the round trip time, a special transport packet, referred to as a token, is exchanged between the nodes of a transport connection. Each node of the connection maintains the exponential average of the roundtrip times of the token, and uses that as its roundtrip time estimate.
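An exponential average of token roundtrip times can be sketched as follows. The report does not give the weighting, so the smoothing factor alpha, its default value, and the choice to start from the first sample are all assumptions of this example, as are the class and method names.

```python
class RoundtripEstimator:
    """Exponentially weighted average of measured roundtrip times."""

    def __init__(self, alpha=0.125):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha      # weight given to the newest sample (assumed value)
        self.estimate = None    # no estimate until the first token returns

    def update(self, sample):
        """Fold one measured roundtrip time into the estimate and return it."""
        if self.estimate is None:
            self.estimate = sample
        else:
            self.estimate = (1.0 - self.alpha) * self.estimate + self.alpha * sample
        return self.estimate
```

A sample would be the current simulated time minus the time at which the token was last sent. A larger alpha makes the estimate follow recent measurements more closely, at the price of more jitter.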

Each transport packet has the following: (1) a type field indicating whether the packet is a data packet, an acknowledgement packet, or a token packet, (2) a response bit which indicates whether a response is required, and (3) a connection id, identifying the transport connection.
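The three fields of a transport packet map naturally onto a small record type. The names below are hypothetical, and representing the connection id as an integer is an assumption; the report does not fix its form.

```python
from dataclasses import dataclass
from enum import Enum


class PacketType(Enum):
    DATA = "data"
    ACK = "acknowledgement"
    TOKEN = "token"


@dataclass
class TransportPacket:
    type: PacketType      # data, acknowledgement, or token
    response: bool        # True if the receiver is expected to respond
    connection_id: int    # identifies the transport connection (assumed integer)
```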

For each ordered pair of nodes (A, B), there is one application/transport module. All the traffic from A to B is generated by the produce event of this module. Parameters to this module include the size of packets, interpacket time, the maximum send window size, and the maximum produce window size. The interpacket time can specify various clustering phenomena in packet production. The maximum send window size restricts the number of packets that are sent but not yet acknowledged. The maximum produce window size restricts the number of packets produced but not yet sent.


An application/transport module

Packets Produced

Data Structure. Indicates the number of data packets produced.

Packets Sent

Data Structure. Indicates the number of data packets sent. Packets Produced - Packets Sent is always less than or equal to the produce window size.

Packets Acked

Data Structure. Indicates the number of data packets for which an acknowledgement was received at the source node. Packets Sent - Packets Acked is always less than or equal to the send window size.

Packets Received

Data Structure. Indicates the number of data packets received at the destination node.

Token Sent Time

Data Structure. Indicates the value of the current simulated time when the token was last sent.

Roundtrip Time Estimate

Data Structure. Indicates the current estimate of the roundtrip time.

Application:Produce

Internal event. Called by the event handler. If Packets Produced - Packets Sent is less than the produce window size, then Application:Produce increments Packets Produced, calls Transport:Send event, and schedules the next Application:Produce event according to the specified traffic pattern.

Application:Consume(p)

Internal event, where p is a data packet. Called by Transport:Receive event. If p requires a response, an Application:Produce event is scheduled at time t, where t is the sum of the current simulated time and the processing time.

Transport:Send

Internal event. Called by the events Application:Produce and Transport:Receive. If Packets Produced is greater than Packets Sent and Packets Sent - Packets Acked is less than the send window size, then Transport:Send increments Packets Sent, creates a data packet, and calls node produce event with this packet.

Transport:Receive(p)

External event, where p is a packet. Called by the node receive event. If p is a data packet, Transport:Receive(p) increments Packets Received, calls node produce event with an acknowledgement packet, and calls Application:Consume(p). If p is an acknowledgement packet, Transport:Receive(p) increments Packets Acked, and calls Transport:Send event. If p is a token packet, it updates the roundtrip time estimate, sets Token Sent Time to the value of current simulated time, and calls node produce event with p.

Transport:Retransmit(p)

External event, where p is a packet. Called by node failure, link failure, node receive, and node produce events. Transport:Retransmit(p) schedules a node produce event at time t, where t is the sum of current simulated time and a timeout duration based on the roundtrip time estimate. If p is a token packet, it also sets Token Sent Time to the value of current simulated time.
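The interplay of the packet counters and the two windows in Application:Produce, Transport:Send, and the acknowledgement case of Transport:Receive can be sketched as follows. The class and method names are hypothetical, an in-memory list stands in for the call to the node produce event, and event scheduling, tokens, and retransmission are left out.

```python
class TransportConnection:
    """Source-side counters of one application/transport module (sketch)."""

    def __init__(self, produce_window, send_window):
        self.produce_window = produce_window
        self.send_window = send_window
        self.packets_produced = 0
        self.packets_sent = 0
        self.packets_acked = 0
        self.outbox = []  # stand-in for "calls node produce event"

    def produce(self):
        """Application:Produce: produce a packet if the produce window allows."""
        if self.packets_produced - self.packets_sent < self.produce_window:
            self.packets_produced += 1
            self.send()

    def send(self):
        """Transport:Send: send one waiting packet if the send window allows."""
        if (self.packets_produced > self.packets_sent
                and self.packets_sent - self.packets_acked < self.send_window):
            self.packets_sent += 1
            self.outbox.append("data")

    def receive_ack(self):
        """Transport:Receive for an acknowledgement: count it, then try to send."""
        self.packets_acked += 1
        self.send()
```

With a produce window of 2 and a send window of 1, four produce attempts leave three packets produced and one sent; the arrival of an acknowledgement then releases the second packet. Both window invariants stated for the counters hold after every call.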

7 Topology Modification Group

This module dynamically modifies the topology of the network. It can add new links and nodes. It can delete existing links and nodes. These modifications are different from failures and repairs in that they are intentional. For example, before a node is deleted all connections involving the node are closed down. When a node is created, new application/transport modules and routing/naming modules are also created and appropriately initialized.

8 Performance Evaluation Group

The modules in the performance evaluation group collect statistics from the different modules of the target system and evaluate performance measures. One performance measure would be queue size characterizations (e.g. expected values, transient overshoots). Another would be the time between starting of a new connection and settling of the routing tables.

Because of the complexity of the target system, it is difficult to determine good performance measures a priori. These will have to be arrived at after much experimentation with the simulator. For this reason, in the preliminary version of the simulator, we consider post-processing of the simulation log.

9 Simulator Manager Group

The modules in this group maintain the event occurrence list and compute the simulation log. The activity of this module is basically the iterative event handling that has been described in section 2.
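The iterative event handling can be pictured with a generic discrete-event loop. The sketch below is an illustration only: the class name Simulator, the heap-based event occurrence list, and the tie-breaking counter are assumptions of this example, and the log here records only event names and times, without the state information a full simulation log would carry.

```python
import heapq
import itertools


class Simulator:
    """Minimal discrete-event loop with an event occurrence list and a log."""

    def __init__(self):
        self.current_time = 0.0
        self.log = []                      # (event name, time) in execution order
        self._events = []                  # heap of (time, tie-breaker, routine, args)
        self._counter = itertools.count()  # preserves insertion order at equal times

    def schedule(self, time, routine, *args):
        """Add the occurrence of `routine` at simulated time `time`."""
        if time < self.current_time:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._events, (time, next(self._counter), routine, args))

    def run(self, until=float("inf")):
        """Repeatedly execute the earliest scheduled event up to time `until`."""
        while self._events and self._events[0][0] <= until:
            time, _, routine, args = heapq.heappop(self._events)
            self.current_time = time
            routine(self, *args)  # the routine may schedule further events
            self.log.append((routine.__name__, time))
```

An event routine receives the simulator as its first argument, so that it can read the current simulated time and schedule follow-up events.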

10 Phases of The Testbed

In this section, we describe successive versions of the simulator.


Phase 1: Development of centralized simulator

In the preliminary version, we restrict ourselves to post-processing of the simulation log. Using this, we will develop effective performance measures to use instead of post-processing.

We will examine the following single-layer routing schemes:

1. Fixed-path routing on small static networks.

2. Link-state routing on small and medium static networks.

3. Distance-vector routing on small and medium static networks.

Phase 2: Incremental evaluation of performance measures

We use the centralized simulator to incrementally evaluate the performance measures specified in phase 1. We will examine various single-layer routing schemes on medium static networks. If memory resources allow, we will store the simulation log and evaluate the effectiveness of the performance measures.

Phase 3: Hierarchical routing algorithms

For the centralized simulator, we will build modules for various hierarchical routing algorithms, such as landmark hierarchy[Tsch 88]. We will specify performance measures for hierarchical routing algorithms. We will perform various experiments with these algorithms on medium static networks.

Phase 4: Dynamic network modification algorithms

For the centralized simulator, we will build modules for simulating hierarchical routing algorithms on medium dynamic networks.


Phase 5: Distributed Simulator

We will redesign the simulator to obtain a distributed discrete-event simulator. We realize that there are many ways of doing this[Mis 86, Fis 78, CMHo 79]. It is essential to take advantage of the localities of resource utilization that are naturally present in a routing system. One way to take advantage is to divide the target system at appropriate links (i.e. links which are seldom used), and simulate the separate parts of the target system on different computers. Then each computer simulates its own part of the target system, using its own local event occurrence list. The problem that arises is how to coordinate traffic that flows between different parts of the target system, and how to synchronize the simulated times on the different computers.
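Dividing the target system at chosen links amounts to removing those links and taking the connected components of what remains. The sketch below shows only that step; the function name is hypothetical, links are assumed to be bidirectional, and how the seldom-used links are identified is left open.

```python
def partition_at_links(nodes, links, cut_links):
    """Connected components of the topology after removing `cut_links`.

    nodes:     iterable of node ids
    links:     iterable of (a, b) pairs, treated as bidirectional (assumption)
    cut_links: the links at which the target system is divided
    Returns a list of sets of node ids, one set per part.
    """
    nodes = list(nodes)
    cut = {frozenset(link) for link in cut_links}
    adjacency = {n: set() for n in nodes}
    for a, b in links:
        if frozenset((a, b)) not in cut:
            adjacency[a].add(b)
            adjacency[b].add(a)
    parts, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        # Depth-first search collects one part of the divided system.
        stack, part = [start], set()
        seen.add(start)
        while stack:
            n = stack.pop()
            part.add(n)
            for m in adjacency[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        parts.append(part)
    return parts
```

Each returned part could then be assigned to one computer, with the cut links carrying the traffic that has to be coordinated between simulators.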


References

[CACI] CACI Products Company, "COMNET II.5 Overview", March 1990.

[Cher 86] D. R. Cheriton, "VMTP: A Transport Protocol For The Next Generation of Communication Systems", Proceedings ACM SIGCOMM '86 Symposium, Stowe, Vermont, pp. 406-415, Aug. 1986.

[Clar87] D. D. Clark, M. L. Lambert, L. Zhang, "NETBLT: A High Throughput Transport Protocol", Proceedings ACM SIGCOMM '87 Symposium, Stowe, Vermont, pp. 353-359, 1987.

[CMHo 79] K. M. Chandy, J. Misra, V. Holmes, "Distributed simulation of networks", Computer Networks, Vol. 3, pp. 105-113.

[PoRe 85] J. Postel, J. Reynolds, "File Transfer Protocol (FTP)", RFC 959, Network Information Center, SRI International, October 1985.

[PoRe 83] J. Postel, J. Reynolds, "Telnet Protocol Specification", RFC 854, Network Information Center, SRI International, May 1983.

[Fis 78] G. S. Fishman, Principles of Discrete Event Simulation, Wiley, New York.

[Gall 77] R. Gallager, "A Minimum Delay Routing Algorithm Using Distributed Computation", IEEE Transactions on Communications, Vol. COM-25, No. 1, January 1977.

[Garc 89] J. J. Garcia-Luna-Aceves, "A Unified Approach to Loop Free Routing Using Distance Vectors or Link States", Proceedings ACM SIGCOMM '89 Symposium, Austin, Texas, pp. 212-223, Sep. 1989.

[GrKl80] M. Gerla, L. Kleinrock, "Flow control: A Comparative Survey", IEEE Transactions on Communications, Vol. COM-28, No. 4, pp. 553-574, April 1980.

[Hey 89] A. Heybey, "The Network Simulator", Laboratory of Computer Science, Massachusetts Institute of Technology, October 1989.

[JaMo 82] J. M. Jaffe, F. H. Moss, "A Responsive Distributed Routing Algorithm for Computer Networks", IEEE Transactions on Communications, Vol. COM-30, No. 7, 1982.

[Jcob88] V. Jacobson, "Congestion avoidance and control", Proceedings ACM SIGCOMM '88, Stanford, California, pp. 314-329, August 1988.

[Liv 89] M. Livny, "DeNet Overview", Technical note, University of Wisconsin-Madison, November 1989.

[LZGS 84] E. Lazowska, J. Zahorjan, G. Graham, K. Sevcik, Quantitative System Performance Analysis, Prentice Hall, Inc., Englewood Cliffs, 1984.

[Mar 88] D. Martin, "Network Simulator User's Manual", Laboratory of Computer Science, Massachusetts Institute of Technology, September 1988.

[McRR 80] J. M. McQuillan, I. Richer, E. C. Rosen, "The New Routing Algorithm for the ARPANET", IEEE Transactions on Communications, Vol. COM-28, No. 5.

[MeSe 79] P. M. Merlin, A. Segall, "A Failsafe Distributed Routing Protocol", IEEE Transactions on Communications, Vol. COM-27, No. 9, September 1979.

[Mis 86] J. Misra, "Distributed Discrete-Event Simulation", Computing Surveys, Vol. 18, No. 1, March 1986.

[Reis79] M. Reiser, "A queueing network analysis of computer communication networks with window flow control", IEEE Transactions on Communications, Vol. COM-27, No. 10, pp. 1199-1209, Oct. 1979.

[SpGa 89] J. M. Spinelli, R. G. Gallager, "Event Driven Topology Broadcast Without Sequence Numbers", IEEE Transactions on Communications, Vol. 37, No. 5, May 1989.

[Tsch 88] P. F. Tsuchiya, "The Landmark Hierarchy: A New Hierarchy For Routing In Very Large Networks", ACM SIGCOMM '88 Symposium, August 1988.


Contents

1 Introduction

2 Fundamentals of Discrete-Event Simulation

3 Modular Structure of our Simulator

4 Physical Network Group

5 Routing/Naming Group

6 Application/Transport Group

7 Topology Modification Group

8 Performance Evaluation Group

9 Simulator Manager Group

10 Phases of The Testbed

References
