Cengiz Alaettinoglu, Klaudia Dussa, A. Udaya Shankar, Jean Bolot
Department of Computer Science
University of Maryland
College Park, Maryland 20742
UMIACS 90-71 CS-TR-2475 May 18, 1990
Abstract
This report presents the initial design of a testbed for developing routing algorithms for large dynamic computer networks.
This research was supported in part by RADC and DARPA under contract number F30602-90-C-0010 to UMIACS at the University of Maryland. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency, RADC, or the U.S. Government.
Contents
1 Introduction
2 Fundamentals of Discrete-Event Simulation
3 Modular Structure of our Simulator
4 Physical Network Group
5 Routing/Naming Group
6 Application/Transport Group
7 Topology Modification Group
8 Performance Evaluation Group
9 Simulator Manager Group
10 Phases of The Testbed
References
1 Introduction
Our objective is to provide a testbed for developing robust routing algorithms for large dynamic networks. The testbed should allow us to analyze, simulate, and evaluate different routing algorithms and network topologies. Because of the size of the networks to be studied (up to 2000 nodes or more), a direct implementation, with each node being a separate computer, is not possible. Hence, our approach is to realize the testbed with a simulator.
In order to achieve our objective, the simulator must effectively model all functions of a computer network that affect the performance and robustness of routing algorithms. These functions include the following:
Physical Network
The topology of the nodes and links. The bandwidth, propagation delay, and failure characteristics of links. The speed, memory, and failure characteristics of nodes.
Routing Protocols and Name Resolution Protocols
Routing protocols are distributed algorithms used for maintaining routes. Name resolution protocols are distributed algorithms used for mapping names to addresses. Typically in a large network, there are several of these protocols organized in various ways, both hierarchically and as peers.
Transport Protocols
Transport protocols are above the routing and naming protocols. They ensure reliable communication between any two hosts in a network, using retransmission-with-acknowledgement schemes and flow-control schemes. Both of these schemes interact very strongly with routing schemes.
Application Protocols
These protocols are above the transport protocols. They are the ultimate producers and consumers of data that travels through the network. Two categories of applications are important: file transfer and remote login.
In the remainder of this section, we give a brief overview of possible approaches in simulation and survey some existing simulators. Section 2 explains the fundamentals of discrete-event simulation. Section 3 presents the overall structure of our simulator. Sections 4 through 9 discuss the various modules of the simulator. Section 10 describes successively enhanced versions of the simulator.
Approaches to simulation
We use the term target system to refer to the real-time computer network that is to be simulated. Preliminary discussions led to three fundamentally different approaches to simulating the target system: (1) Process emulation, (2) Time-driven simulation, and (3) Discrete-event simulation.

In process emulation, every node and link of the target system is represented by a separate process. Every node process executes code for application functions, transport functions, routing functions, buffer management, etc. Every link process executes code modeling the propagation of messages in transit. In order for this set of processes to faithfully capture the dynamics of the target system, it is necessary that all processes execute their code at the same rate. To achieve this, synchronization points are introduced into the code. When a process reaches a synchronization point, it waits until all processes have reached the synchronization point. Thus, the execution of the processes proceeds from one synchronization point to the next. One advantage of process emulation is that the actual code of the target system can be used, thereby making the simulation very realistic. A disadvantage is the overhead for the process synchronization.
Time-driven simulation is like process emulation, except that instead of synchronizing processes, we divide each component's code into partitions that take the same amount of time to execute. The simulation is done in cycles, with one partition from each node and link executed in every cycle. Time-driven simulation can be very realistic, because the actual code of the target system can be used. But it is not very flexible. To add a new node function, the code of the node has to be repartitioned so that the execution time of each partition fits into the time frame.
In discrete-event simulation, the target system is abstracted by a set of variables and a set of events. The behavior of a system is represented by a list of event occurrences, where the time of each occurrence is known. Each event occurrence updates the values of the variables, as specified by event routines. The advantage of discrete-event simulation is that it is very flexible. More states and events can be introduced to make the simulation more realistic, at the expense of increasing computational cost.
From these approaches, we chose discrete-event simulation. It offers the greatest flexibility in modeling the target system, including different hierarchies and routing algorithms.
First we will implement a centralized discrete-event simulator. Later, we plan to implement a distributed discrete-event simulator, in order to simulate larger target systems.
All three approaches mentioned above require increasing amounts of processing time and memory as the size of the target system increases. One way to handle this problem is to simulate only a part of the target system, and abstract the rest by a simple approximation system [LZGS 84]. The problem here is how to obtain a simple approximation system that adequately represents the behavior of the rest of the target system. We believe that good approximations can be obtained by conducting careful studies of the complete target system.
Survey of some existing simulators
In order to reduce programming work, we examined three simulators that were available to us: (1) Network Simulator from MIT [Hey 89, Mar 88], (2) DeNet from University of Wisconsin-Madison [Liv 89], and (3) COMNET II.5 from CACI Products Company [CACI].
Network Simulator
Network Simulator is a discrete-event simulator designed for simulating a network of message-passing components. A component consists of a data structure and an action routine. The data structure stores the information needed by the component and the simulator. The action routine is called for every event that relates to the component. The user can define his own components by specifying a new data structure and the associated action routine. The new component must respond to a number of predefined commands, e.g. create an instance, delete an instance.
The simulator has a graphic interface. It displays the topology of the network and the parameters of its operation. It allows the user to define and modify the network, control the simulation, log parameters, and save and load the network configurations. The user interface is almost entirely mouse-driven.
Network Simulator does not provide any components to describe routing, hierarchies and address resolution. The simulator is written in C and runs under UNIX and X windows.
The source code is available. The documentation inside the code is good, whereas the external documentation is poor [Hey 89, Mar 88].
DeNet
DeNet (Discrete Event Network) is a general purpose simulator. It views the target system as a set of objects that communicate with each other. Objects are defined as instances of Discrete Event Modules (DEVM). DEVMs can be arbitrarily complex. Discrete event connectors (DEVC) are used to interface objects. An instance of a DEVC establishes a direct interface through which one object can observe changes in the state of another object. The simulator is constructed from module types and connector types according to a description provided at runtime. A principal module manages the simulation run. It invokes the interpreter to read the target system description, and then constructs a representation in terms of DEVMs and DEVCs. After initialization, the principal starts the simulation run.
The DeNet simulator can be viewed as a graph, where nodes represent instances of DEVMs and directed arcs represent instances of DEVCs. The description of the graph can be given in DeTop, a special purpose language for defining the topology of directed graphs.
TopDraw is a graphical editor for DeTop files. DeVise is a visualization tool. The DeNet simulator is written in Modula-2. Currently, it runs on Sun workstations. There are no predefined modules for nodes, links, routing, hierarchy, etc. The external documentation is poor. We did not have access to the source code [Liv 89].
COMNET II.5
COMNET II.5 is a commercially available, discrete-event simulator designed for simulating message-passing networks. It accepts a description of the target network, including traffic and routing algorithms, forms an internal representation, and simulates it.
A menu-driven screen editor is used to define and modify the target network description.
The network description is split up into three parts: network topology, network traffic, and network operation. The network topology specifies the nodes, the links, and the access facilities of the network. A node can model a circuit switch, a store-and-forward switch, or both. A link models a point-to-point transmission channel. Access facilities can model point-to-point or multipoint lines, token-ring LANs, or CSMA/CD LANs. There are various attributes to nodes, links, and access facilities. The network traffic can be defined as circuit-switched calls, data-messages, and virtual-circuit calls. Categories of traffic can be defined in terms of source node, destination node, and class of service. Network operation includes a description of the routing algorithm. Static routing and adaptive routing are offered by the simulator.
COMNET II.5 does not consider routing hierarchies or different name resolution schemes.
The external documentation does not mention any possibility of adding user-defined components [CACI].
Our choice
We found Network Simulator most suitable for our purposes for several reasons. First, the source code was readily available, and the style of coding and documentation is good. Thus modifications and expansions seem easy to do. Second, it is written in a very portable form of C, a language and programming environment with which we have much expertise. Third, it does not enforce modular composition rules, unlike DeNet. As a result it can be extended to be a very efficient simulator, in terms of processing and memory requirements. This is very important because of the large target networks that we will be simulating.
2 Fundamentals of Discrete-Event Simulation
In discrete-event simulation, each execution of the target system is represented by a sequence of state transitions, where the time of each transition is known. The state space of the target system is represented by a set of state variables. The state transitions are represented by a set of events. Associated with each event is a routine that specifies how the state variables are updated when the event occurs. An event can occur multiple times during a simulation.

We use the event occurrence tuple $(e, t)$ to denote the occurrence of event $e$ at time $t$. Thus, a simulation log can be formally defined as a sequence of the form:

$$\langle s_0, (e_1, t_1), s_1, (e_2, t_2), \ldots, (e_n, t_n), s_n \rangle$$

where each $s_i$ is a system state and each $(e_i, t_i)$ is an event occurrence such that the following holds: (1) $s_0$ is an initial state; (2) for all $i$, state $s_i$ results from executing at time $t_i$ the routine of event $e_i$ in state $s_{i-1}$; and (3) the $t_i$'s are nondecreasing, non-negative numbers.

Note that the last time value, $t_n$, indicates the time up to which the target system has been simulated. We refer to $t_n$ as the simulated time of the simulation log. Given a simulation log, we can compute performance measures.†

How does the simulator efficiently compute a simulation log? The classical solution is to generate the simulation log in an iterative fashion, which we describe next. In addition to the state variables mentioned above, the simulation maintains the following state variables with the indicated meanings holding just before an iteration.
Current Simulation Log. A sequence of alternating states and event occurrences. It equals a simulation log and is initially empty.

Current Simulated Time. Indicates the simulated time for the Current Simulation Log.

Event Occurrence List. A sequence of event occurrences that are yet to be executed. Each $(e, t)$ pair in this list satisfies $t \geq$ Current Simulated Time. Initially, the list contains one or more event occurrence pairs to trigger the simulation (e.g. a customer arrival at time 0).

We say that event $e$ is scheduled at time $t$, if and only if $(e, t)$ is in the Event Occurrence List. In the target system, the occurrence of an event $e$ can cause events to occur later on. (For example, a customer arrival at time $t$ to a server causes a departure at time $t + s$, where $s$ is the service time for the customer). This is modeled by having the routine of event $e$ schedule new event occurrences. (For example, the routine for the customer arrival adds the entry (customer departure, Current Simulated Time $+ s$) to the Event Occurrence List.)

The iterative activity of the simulator can now be described by the following:
while simulation not over do
begin
    1. Pick up an $(e, t)$ tuple with minimum $t$ from the Event Occurrence List.
    2. Current Simulated Time $\leftarrow t$
    3. Execute the routine of $e$ (this may schedule new event occurrences)
    4. Append $(e, t)$ and $s$ to Current Simulation Log, where $s$ is a subset of the current system state.
end
result: Upon termination, the current simulation log equals a simulation log. Post-processing can be done on it to obtain performance measures.

† In all the simulations of interest to us, the simulated time should increase without bound as the number of events simulated increases, i.e. $\lim_{n \to \infty} t_n = \infty$.
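The iteration above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the testbed's implementation: the class name `Simulator`, the methods `schedule` and `run`, the constant `SERVICE_TIME`, and the use of a binary heap for the Event Occurrence List are all hypothetical choices made for this example. The customer arrival and departure routines follow the example given in the text.

```python
import heapq
import itertools


class Simulator:
    """Centralized discrete-event simulator (illustrative sketch only)."""

    def __init__(self):
        self.now = 0.0                 # Current Simulated Time
        self.state = {}                # state variables of the target system
        self.log = []                  # Current Simulation Log
        self._events = []              # Event Occurrence List, kept as a min-heap on t
        self._tie = itertools.count()  # tie-breaker: equal-time events run in FIFO order

    def schedule(self, t, routine):
        """Add the event occurrence (routine, t); t must not lie in the past."""
        if t < self.now:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._events, (t, next(self._tie), routine))

    def run(self, until):
        """Execute event occurrences in nondecreasing time order up to `until`."""
        while self._events and self._events[0][0] <= until:
            t, _, routine = heapq.heappop(self._events)  # step 1: minimum t
            self.now = t                                  # step 2
            routine(self)                                 # step 3: may schedule new events
            # step 4: record the occurrence and a snapshot of the state
            self.log.append((routine.__name__, t, dict(self.state)))


SERVICE_TIME = 2.0  # hypothetical constant service time s


def arrival(sim):
    """Customer arrival: occupies the server and schedules the departure."""
    sim.state["in_service"] = sim.state.get("in_service", 0) + 1
    sim.schedule(sim.now + SERVICE_TIME, departure)


def departure(sim):
    """Customer departure: frees the server and counts the departure."""
    sim.state["in_service"] -= 1
    sim.state["departures"] = sim.state.get("departures", 0) + 1


sim = Simulator()
sim.schedule(0.0, arrival)
sim.schedule(1.0, arrival)
sim.run(until=10.0)
```

A heap keeps the selection of the minimum-time tuple logarithmic in the number of pending occurrences, which matters for large target networks; a sorted list would serve equally well for small experiments.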
In general, storing the entire simulation log requires excessive memory, and is not feasible. In such cases, the standard solution is to decide a priori on some performance measures that can be computed incrementally (such as accumulated response time, number of departures, etc.), and store these instead of the simulation log. Then, just before step 2, we insert a step, where the performance measures are updated, using the information that the current system state has lasted for the past $(t - \text{Current Simulated Time})$ seconds. Now, there is no need for the Current Simulation Log and step 4 is removed.

3 Modular Structure of our Simulator
To provide for flexible usage, the simulator is designed as a collection of interacting modules.
Most of the modules in our simulator correspond to components of the target system. These modules can be adapted or replaced without altering the rest of the simulator. The rest of the modules in the simulator are concerned with simulation functions, such as event handling, maintenance of the event occurrence list, performance measures, etc. Before describing the particular modules of our simulator, we give a generic description of modules.
A module encapsulates a data structure and events to access it. An event of one module can call or schedule an event of another module. Formally, a module consists of the following (see figure 1):
Data structures
Typically, the data structures of the module can be directly accessed only by events of the module. Other modules can access them only via the events of this module.
However, for the sake of efficiency, we sometimes allow these data structures to be directly accessed by events of other modules. Any such access is highlighted.
Internal events
An internal event operates only on the data structures of the module. Its routine is called by the event handler or by another event of this module. It can call and schedule events of this module and of other modules.
External events
An external event is called by the event handler, by another event of this module, or by an event of another module. An external event can call or schedule events of this module and of other modules.
Events can be defined with parameters, for example, representing data conveyed from one module to another.
As mentioned above, an event routine can call another event routine, and so on. It is necessary to ensure, not only that there is no cycle (infinite recursion), but also that at most one event of any module is executed in any single event occurrence. We enforce this by syntactic constraints on the event routine, i.e. in the simulator design.
Figure 1: Structure of a module (a data structure together with its internal events and external events)
Modules of the simulator
We now give an overview of the modules in our simulator. The modules are divided into those that model the target system and those that handle the simulation function. See figure 2.
The modules modeling the target system are clustered into four groups. The Physical Network Group provides modules representing the physical nodes and links. The Routing/Naming Group provides modules for routing, hierarchy management, and name service. The Application/Transport Group provides modules for transport protocol functions (such as connection management, flow control, etc.) and application functions (file transfer traffic, remote login traffic). The Topology Modification Group provides modules for dynamic modification of the network topology, i.e. adding/removing links, nodes, connections, etc.

The modules handling the simulation function are divided into two groups. The Simulator Manager Group contains modules that start the simulation, manage the event occurrence list, generate the current simulation log, etc. The Performance Evaluation Group contains modules that collect statistics and monitor the performance of the target system modules.

Figure 2: Structure of the testbed (Simulator Manager Group; Performance Evaluation Group; Topology Modification Group; Application/Transport Group, with Application (Telnet, FTP) and Transport Protocol (TCP); Routing/Naming Group, with Routing, Hierarchy, and Name Server; Physical Network Group, with Nodes and Links)

Together, these six groups of modules constitute the testbed. In sections 4 to 9, we describe these groups of modules in more detail. For each module, we list the data structures and the events of the module. For each event, we describe (1) its effect on the data structures, (2) the events of this module and of other modules that it calls (if any) and (3) the events of this module and of other modules that it schedules (if any).
4 Physical Network Group
The physical network group models the following: (1) topology of the network, (2) buffer policy at each node regarding received messages and outgoing messages, (3) messages in transit on each link, and (4) failure and repair of nodes and links.
We divide the messages in the network into two classes: transport and routing packets. Transport packets carry information for application/transport modules. Routing packets carry information for routing/naming modules. A transport packet received by a node is either directed up to the transport module of this node, or redirected to an outgoing link. A routing packet received by a node has to be processed by the routing/naming modules of this node. We use two special types of routing packets, shutdown packets and startup packets, to indicate, respectively, failure and repair of links and nodes, to the routing/naming modules. If the routing/naming modules do not use this information, they can ignore it.

Each packet has additional fields that will be introduced in the report as needed.
Links
Each link in the target system is assumed to be bi-directional. We simulate each link by two queues of messages, one for each direction. Messages in a queue represent packets in transit. These queues are manipulated by send events, receive events, failure events, and repair events. We assume that in case of a link failure, both directions of a link are broken.
Parameters for links are propagation delay, transmission speed, error rate, failure rate, and
repair rate. Any parameter can be deterministically or probabilistically dened.
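A minimal sketch of this two-queue link model follows. It is an illustration only and rests on several assumptions that are not part of the report: the class name `Link`, its method names, the choice to store each queued packet together with its arrival time, a deterministic propagation delay, and the decision to return the packets lost on failure so that a caller could react to them.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Link:
    """Bi-directional link between nodes a and b, modelled as two FIFO queues."""
    a: str
    b: str
    propagation_delay: float  # seconds; assumed deterministic in this sketch
    up: bool = True
    # (source, destination) -> deque of (packet, arrival_time) for packets in transit
    queues: dict = field(init=False, default_factory=dict)

    def __post_init__(self):
        self.queues = {(self.a, self.b): deque(), (self.b, self.a): deque()}

    def send(self, src, dst, packet, now):
        """Put a packet in transit; return its arrival time, or None if the link is down."""
        if not self.up:
            return None
        arrival = now + self.propagation_delay
        self.queues[(src, dst)].append((packet, arrival))
        return arrival

    def receive(self, src, dst):
        """Remove and return the oldest (packet, arrival_time) in the src -> dst direction."""
        return self.queues[(src, dst)].popleft()

    def fail(self):
        """A failure breaks both directions; return the packets that were in transit."""
        self.up = False
        lost = [packet for q in self.queues.values() for packet, _ in q]
        for q in self.queues.values():
            q.clear()
        return lost

    def repair(self):
        """Bring both directions back up with empty queues."""
        self.up = True


link = Link("A", "B", propagation_delay=0.010)
t1 = link.send("A", "B", "pkt-1", now=0.0)
t2 = link.send("A", "B", "pkt-2", now=0.004)
first = link.receive("A", "B")
lost = link.fail()
rejected = link.send("A", "B", "pkt-3", now=0.020)
```

Because every packet on a direction experiences the same propagation delay in this sketch, arrival times within a queue are nondecreasing and a FIFO queue suffices; a probabilistically defined delay would need either an ordered structure or an explicit no-overtaking rule.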
Conventions: Consider a link that connects two nodes, say $A$ and $B$. The queue from $A$ to $B$ is referred to as an outgoing link queue of $A$ and as an incoming link queue of $B$. For the queue of messages from $A$ to $B$, $A$ is referred to as the source node and $B$ as the destination node. We consider the occurrence of a send event (receive event) to correspond to the arrival of the last bit of a packet at a link (node). One advantage of this is that if another packet is waiting to be sent (received), it can be scheduled by the send (receive) routine.

A link module consists of the following:
Link Status
Data structure. Indicates whether the status of the link is up or down.
$LQ_1$, $LQ_2$
Data structures. Two link queues, one for each direction. Each link queue is a queue of $(p, t)$ tuples, where $p$ is a packet and $t$ is a time value. Every tuple $(p, t)$ represents a packet in transit, with $t$ indicating the time of reception at the destination node.
Link.Send($LQ_i$, $p$)
External event, where parameter $LQ_i$ is either $LQ_1$ or $LQ_2$ and parameter $p$ is a packet. Called by a send event of the source node of the link queue $LQ_i$ (node events are described in section 4.2).
If the link status is up, Link.Send($LQ_i$, $p$) appends the tuple $(p, t)$ to the tail of $LQ_i$, where $t$ equals the sum of the current simulated time and the propagation time of the link. If $(p, t)$ is the only tuple in the link queue, Link.Send($LQ_i$, $p$) schedules a Link.Receive($LQ_i$) at time $t$.
If the link status is down, Link.Send($LQ_i$, $p$) does not append $p$ to $LQ_i$. Instead, if $p$ is a routing packet, it is discarded. If $p$ is a transport packet, it calls the transport retransmit event at the node that generated the packet $p$. (Transport events are described in section 6). This is one way to model retransmissions initiated by the transport protocol.
Link.Receive($LQ_i$)
Internal event. Called by the event handler.
Link.Receive($LQ_i$) removes the first $(p, t)$ tuple in link queue $LQ_i$. It calls the receive event of the destination node of $LQ_i$. If there is another tuple $(p', t')$ at the head of $LQ_i$, it schedules a Link.Receive($LQ_i$) event at time $t'$.
Link.Failure
Internal event. Called by the event handler.
Link.Failure does the following. It sets the link status to down. For each transport packet $p$ in link queues $LQ_1$ and $LQ_2$, it calls the transport retransmit event at the node that generated the packet $p$. It empties the link queues. It places a tuple $(p, t)$ in each link queue, where $p$ is a shutdown packet and $t$ is the reception time of $p$. It schedules a Link.Receive($LQ_1$) and Link.Receive($LQ_2$), both for time $t$.
Link.Repair
Internal event. Called by the event handler.
Link.Repair does the following. It sets the link status to up. It places tuple $(p, t)$ in each link queue, where $p$ is a startup packet and $t$ is the reception time of $p$. It schedules a Link.Receive($LQ_1$) and Link.Receive($LQ_2$), both for time $t$.

Nodes
In modeling a node for the physical network group we are concerned with how a node buffers received messages and outgoing messages. We realize that there are many possible buffer management policies. Here, we consider one policy. Other possibilities can be realized easily by changing the node module.
A node has a separate queue for each outgoing link and a single queue for all received routing packets. The queues in a node are manipulated by send events, receive events, produce events, failure events, and repair events. Parameters for the nodes include the available buffer space, buffer management policies, CPU speed, failure rate, and repair rate.
A node module consists of the following:
Node Status
Data structure, shared with the routing module. Indicates whether the node status is up or down.
Outgoing Packet Queue($LQ$)
Data structure, where the parameter $LQ$ ranges over outgoing link queues. Shared with the routing/naming modules of this node (described in section 5). Queue of $(p, t)$ tuples where $p$ is a packet and $t$ is a time value. Every tuple $(p, t)$ represents a packet that is to be transmitted to link queue $LQ$, where $t$ is the time of the end of transmission.
Routing Packet Queue
Data structure. Queue of $(p, t)$ tuples. Every tuple $(p, t)$ denotes a routing packet where $t$ signifies the time of its processing. We assume that this corresponds to the end of processing this packet.
Routing Table
Data structure. Shared with routing/naming modules of this node. Read-only access by the node module.
Node.Send($LQ$)
Internal event, where the parameter $LQ$ ranges over the outgoing link queues for this node. Called by the event handler.
Node.Send($LQ$) assumes that Outgoing Packet Queue($LQ$) is not empty. It removes the first tuple $(p, t)$ from the queue, and calls the appropriate send event of the link module. If there is still another tuple $(p', t')$ in the outgoing packet queue, it schedules a Node.Send($LQ$) event at time $t'$.
Node.Receive($p$)
External event, where parameter $p$ is a packet. Called by a link module receive event corresponding to an incoming link queue.
If the node status is up, Node.Receive($p$) does the following:
- If $p$ is a transport packet destined for this node, Node.Receive($p$) calls the receive event of the transport module of this node.
- If $p$ is a transport packet not destined for this node, Node.Receive($p$) consults the routing table and appends $(p, t)$ to an Outgoing Packet Queue($LQ$); if $(p, t)$ is the only packet in $LQ$, Node.Receive($p$) schedules a Node.Send($LQ$) at time $t$. Here $t$ is the sum of Current Simulated Time, the time for consulting the routing table, and the transmission time of the packet.
- If $p$ is a routing packet, Node.Receive($p$) evaluates the function Processing.Time($p$) and places the tuple $(p, t)$ at the tail of Routing Packet Queue, where $t$ is the sum of Current Simulated Time and Processing.Time($p$). (Processing.Time($p$) is a function that returns the time required to process $p$. It is described in section 5). If $(p, t)$ is the only packet in the routing packet queue, Node.Receive($p$) schedules a routing process event at time $t$.
If the node status is down or there is no buffer space available, the packet $p$ is discarded. If $p$ is a transport packet, Node.Receive calls the retransmit event of the transport module that generated the packet $p$.
Node.Produce($p$, $LQ$)
External event, where parameter $p$ is a packet, and parameter $LQ$ is either null or specifies an outgoing packet queue of the node. Called by events of transport/application modules and routing modules.
If $LQ$ is null, Node.Produce($p$) consults Routing Table and appends $(p, t)$ to the appropriate outgoing packet queue. If $LQ$ is not null, Node.Produce($p$, $LQ$) appends $(p, t)$ to $LQ$. In either case, if $(p, t)$ is the only tuple in the outgoing packet queue, it schedules a Node.Send event at time $t$. Here $t$ is the sum of the current simulated time, the time for consulting the routing table, and the transmission time of the packet. If there is no buffer space available, and $p$ is a transport packet, Node.Produce($p$) calls the retransmit event of the transport module.
Node.Failure
Internal event. Called by the event handler.
Node.Failure does the following. It sets the node status to down. For each transport packet $p$ in the outgoing packet queues, it calls the transport retransmit event at the node that generated the packet $p$. It empties Routing Packet Queue and/or the outgoing packet queues, depending on the failure model. It appends a tuple $(p, t)$ to Routing Packet Queue of the node, where $p$ is a shutdown packet and $t$ is the time of the failure occurrence. If $(p, t)$ is the only packet in Routing Packet Queue, it schedules a process event at the node at time $t$.
Node.Repair
Internal event. Called by the event handler.
Node.Repair does the following. It sets the node status to up and appends a tuple $(p, t)$ to Routing Packet Queue. Where the tuple $(p, t)$ is placed in Routing Packet Queue and when it will be processed, depends on the repair model. If $(p, t)$ is the only packet in the Routing Packet Queue, it schedules a routing process event at the node at time $t$, where $t$ is the end of repair.

5 Routing/Naming Group
There can be many paths from a node to a destination node. A good routing protocol should cause packets to be routed along a shortest path, i.e. one of minimum cost. The cost of a path is the sum of the costs of its links. The cost of a link depends on its propagation delay, its bandwidth, and the size of its outgoing queue in the source node. Because of the last
quantity, the cost of a link, and hence the cost of a path, varies with time, as connections are opened and closed, and paths are chosen and discarded.
There are many different approaches to routing. One of the most common approaches is next-hop routing. In this approach, for each destination each node is only aware of which outgoing link (i.e. next-hop) to take to reach the destination. The basic problem of the routing algorithms in this approach is to determine at each node a good next-hop for each destination.

Routing algorithms for next-hop routing are usually classified into link-state algorithms and distance-vector algorithms. In link-state algorithms, each node attempts to maintain the global topology, i.e. the state of all links in the network. Periodically, neighboring nodes exchange their global topology information. Each node calculates shortest paths, and thereby chooses the next-hops to take. In distance-vector algorithms, each node maintains for each destination the cost via each of its neighbours to reach the destination. Periodically, neighboring nodes exchange their costs to each destination. A node chooses an outgoing link as the next hop if the cost of that link plus the cost to the destination from the node at the other end is the minimum of such costs over all neighbours. There are many different versions of link-state algorithms and distance-vector algorithms [MeSe 79, JaMo 82, Garc 89, McRR 80].
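The distance-vector selection rule can be illustrated with a short Python sketch. The function name `choose_next_hop`, its two dictionary arguments, and the sample costs are assumptions made for this example only; the report does not prescribe a representation.

```python
import math


def choose_next_hop(link_cost, reported_cost):
    """Pick the neighbour minimising (cost of link to neighbour) + (neighbour's cost to destination).

    link_cost:     neighbour -> cost of the outgoing link to that neighbour
    reported_cost: neighbour -> cost to the destination last reported by that neighbour
    Returns (neighbour, total_cost), or (None, math.inf) if no neighbour reaches the destination.
    """
    best_neighbour, best_total = None, math.inf
    for neighbour, cost in link_cost.items():
        # A neighbour that has reported nothing is treated as unable to reach the destination.
        total = cost + reported_cost.get(neighbour, math.inf)
        if total < best_total:
            best_neighbour, best_total = neighbour, total
    return best_neighbour, best_total


# Hypothetical costs at one node for a single destination.
link_cost = {"B": 1.0, "C": 4.0}
reported_cost = {"B": 5.0, "C": 1.0}
hop, cost = choose_next_hop(link_cost, reported_cost)
```

In this example the cheaper link (to B) is not chosen, because the total via C (4.0 + 1.0) is lower than the total via B (1.0 + 5.0).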
We next describe a link-state routing algorithm very much like the Arpanet routing algorithm. In the course of the project, we will be implementing different routing algorithms under different hierarchies. The algorithm maintains three tables: a local topology table, a global topology table, and a routing table. The local topology table stores information about links incident to this node. The global topology table stores the latest reported link costs throughout the target network. The routing table stores the next-hop entries. Nodes periodically exchange local topology information with other nodes. Whenever link costs are updated or new link-state information is received, shortest paths are recalculated and new next-hops are determined.
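The recalculation of shortest paths and next-hops from the global topology table can be sketched as follows. The report does not say which shortest-path algorithm is used; this example assumes Dijkstra's algorithm, a cost matrix in which `math.inf` marks a missing link, and the hypothetical function name `next_hops`. `None` in the result marks an unreachable node (or the source itself).

```python
import heapq
import math


def next_hops(cost, source):
    """Compute next-hop entries from a link-cost matrix using Dijkstra's algorithm.

    cost[i][j] is the latest reported cost of the link from i to j, or math.inf if none.
    Returns (hop, dist): hop[v] is the neighbour of `source` on a shortest path to v.
    """
    n = len(cost)
    dist = [math.inf] * n
    hop = [None] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry; a shorter path to u was already processed
        for v, w in enumerate(cost[u]):
            if v == u or w == math.inf:
                continue
            if d + w < dist[v]:
                dist[v] = d + w
                # The first hop is v itself when leaving the source, else inherited from u.
                hop[v] = v if u == source else hop[u]
                heapq.heappush(heap, (dist[v], v))
    return hop, dist


INF = math.inf
# Hypothetical 4-node topology; node 3 has no links.
cost = [
    [INF, 1.0, 4.0, INF],
    [1.0, INF, 2.0, INF],
    [4.0, 2.0, INF, INF],
    [INF, INF, INF, INF],
]
hop, dist = next_hops(cost, source=0)
```

Here node 2 is reached via node 1 (total cost 3.0) rather than over the direct link of cost 4.0, so the routing table entry for node 2 is node 1.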
The neighbours of a node $A$ can send to $A$ the local topology of a node $B$ over different instances of time. In order to determine which topology information is the latest, sequence numbers are used. Each node has a sequence number that it increments and sends along with local topology information. Each node also stores the largest sequence number it has received from every other node. All sequence numbers are stored in non-volatile memory, so that they survive node failures.

A node exchanges local topology information using link-state packets, which are a type of routing packet. A link-state packet contains the id of the node whose local topology is being propagated, the sequence number of that node when the packet was created, and the cost (at that time) of each link incident on that node.

A routing/naming module consists of the following:
Local Topology Table
Data structure. For each neighbour node A, Local Topology Table stores a fixed cost corresponding to the propagation delay of the link to A, the size of the outgoing packet queue for this link, and the status of this link as known by this node.
Global Topology Table
Data structure. N by N matrix, where N is the number of nodes in the target network. Entry (i, j) indicates the last received cost of the link between i and j (if any) in the i to j direction.
Routing Table
Data structure. Shared with the node module. For each node A in the network, stores either the next hop to reach A, or the entry null, indicating that A is unreachable.
Sequence Number
Data structure. Incremented whenever the local topology of this node is broadcast. Sequence Number is stored in non-volatile memory.
Last Sequence Number(N)
Data structure. For each node N in the target system, indicates the largest sequence number of a link-state packet received from N. Last Sequence Number(N) is stored in non-volatile memory.
Routing:Broadcast
Internal event. Called by the event handler. If the node is up, Routing:Broadcast increments Sequence Number, calculates link costs by consulting the local topology table, creates a link-state packet p for each neighbour node, calls node produce event with p and the appropriate outgoing link queue to use, recalculates shortest paths, updates routing table, and schedules another Routing:Broadcast event.
Routing:Processing(p)
External event, where parameter p is either a link-state packet, a shutdown packet, or a start-up packet. Called by node receive event or event handler.
- If p is a shutdown packet or a start-up packet of an adjacent link, Routing:Processing(p) stores the status of the link in the local topology table.
- If p is a shutdown packet due to the failure of this node, Routing:Processing(p) empties the local topology table, the global topology table, and the routing table, and sets Node Status to down.
- If p is a start-up packet due to a node repair event of this node, Routing:Processing(p) schedules a Routing:Broadcast event, and sets Node Status to up.
- If p is a link-state packet generated by a node B and the sequence number in p is greater than Last Sequence Number(B), then Routing:Processing(p) updates the global topology table, duplicates p for each neighbour node, calls node produce event with p and the appropriate outgoing link queue to use, recalculates shortest paths, updates routing table, and schedules a Routing:Processing event if there is a packet in the routing packet queue.
Routing:ProcessingTime(p)
Function, where parameter p is a routing packet. Invoked by node receive event. It returns an estimate of the time required to process p.
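The sequence-number handling of link-state packets in the module above can be sketched as follows. This is a simplified illustration under stated assumptions: the class and field names are hypothetical, the `outbox` list stands in for calls to the node produce event, the shortest-path recalculation is omitted, and both the update of Last Sequence Number on acceptance and the dropping of a node's own returning packets are assumptions not spelled out in the text.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class LinkStatePacket:
    origin: str        # id of the node whose local topology is carried
    seq: int           # origin's sequence number when the packet was created
    link_costs: tuple  # ((neighbour_id, cost), ...) for links incident on origin

@dataclass
class RoutingModule:
    node_id: str
    neighbours: list
    sequence_number: int = 0
    last_seq: dict = field(default_factory=dict)         # Last Sequence Number(N)
    global_topology: dict = field(default_factory=dict)  # origin -> {neighbour: cost}
    outbox: list = field(default_factory=list)           # stand-in for node produce event

    def broadcast(self, local_costs):
        """Routing:Broadcast: send this node's local topology to every neighbour."""
        self.sequence_number += 1
        p = LinkStatePacket(self.node_id, self.sequence_number,
                            tuple(local_costs.items()))
        self.global_topology[self.node_id] = dict(local_costs)
        for n in self.neighbours:
            self.outbox.append((n, p))
        return p

    def process(self, p):
        """Routing:Processing for a link-state packet; returns True if p was new."""
        if p.origin == self.node_id or p.seq <= self.last_seq.get(p.origin, 0):
            return False  # own or stale packet: drop without forwarding (assumption)
        self.last_seq[p.origin] = p.seq  # assumption: recorded when p is accepted
        self.global_topology[p.origin] = dict(p.link_costs)
        for n in self.neighbours:
            self.outbox.append((n, p))   # duplicate p for each neighbour
        return True

a = RoutingModule("A", ["B"])
b = RoutingModule("B", ["A", "C"])
p1 = a.broadcast({"B": 2.0})
accepted_first = b.process(p1)   # new information: accepted and forwarded
accepted_again = b.process(p1)   # same sequence number: rejected
```

Rejecting packets whose sequence number is not larger than the stored one is what stops the flood: a copy that arrives later over a different path is discarded instead of being forwarded again.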
6 Application/Transport Group
Application protocols are the ultimate producers and consumers in a computer network. They exist above the transport protocol. There are many different kinds of applications. File transfer (e.g. FTP [PoRe 85]) and remote login (e.g. TELNET [PoRe 83]) are two of the most common ones. A file transfer from node A to node B is characterized by large data packets going from A to B, and small acknowledgement packets going from B to A. The size of the packets and their intergeneration time are very regular. In a remote login application, data packets and acknowledgement packets go in both directions, and there is great variation in their sizes and generation times.
In a computer network, the transport protocol provides a reliable communication channel between any two nodes, whether or not they are directly connected by a link. To achieve this in the presence of message loss, link failures, node failures, buffer space limitations, etc., a transport protocol utilizes acknowledgement/retransmission schemes and flow control schemes. A typical retransmission policy is to retransmit a packet if it is not acknowledged within the estimated round trip time of its transmission.
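A minimal sketch of such a retransmission policy is given below. It assumes that the round trip time estimate is kept as an exponential average of observed samples; the class name, the function name, and the smoothing weight `alpha` are hypothetical choices made for the example.

```python
class RoundTripEstimator:
    """Exponentially weighted average of observed round trip times."""

    def __init__(self, initial_estimate, alpha=0.125):
        self.estimate = initial_estimate
        self.alpha = alpha  # weight of the newest sample (hypothetical value)

    def update(self, sample):
        # New estimate = old estimate decayed by (1 - alpha) plus alpha * new sample.
        self.estimate = (1 - self.alpha) * self.estimate + self.alpha * sample
        return self.estimate

def needs_retransmit(send_time, now, acked, estimator):
    """True if the packet is unacknowledged and the estimated round trip time has elapsed."""
    return (not acked) and (now - send_time) >= estimator.estimate

est = RoundTripEstimator(1.0, alpha=0.5)
new_estimate = est.update(3.0)                       # 0.5 * 1.0 + 0.5 * 3.0
too_early = needs_retransmit(0.0, 1.5, False, est)   # only 1.5 time units elapsed
timed_out = needs_retransmit(0.0, 2.5, False, est)   # estimate exceeded
```

A larger `alpha` makes the estimate follow recent samples more closely, while a smaller one smooths out transient delays.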
Flow control mechanisms restrict the number of packets in transit in the network. Typical flow control mechanisms use either a window-based strategy [GrKl80, Jcob88, Reis79] or a rate-based strategy [Cher 86, Clar87]. By restricting the number of packets in transit, the outgoing packet queue sizes in nodes become more controlled, reducing the probability of buffer overflow.
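The window-based strategy can be illustrated with a small sketch. It is a generic example rather than any of the cited schemes; the class name `WindowSender` and its methods are hypothetical.

```python
class WindowSender:
    """Window-based flow control: at most `window` packets may be unacknowledged."""

    def __init__(self, window):
        self.window = window
        self.sent = 0
        self.acked = 0

    def in_flight(self):
        # Packets sent but not yet acknowledged.
        return self.sent - self.acked

    def try_send(self):
        """Send one packet if the window allows it; return whether it was sent."""
        if self.in_flight() < self.window:
            self.sent += 1
            return True
        return False

    def on_ack(self):
        # An acknowledgement frees one slot in the window.
        if self.acked < self.sent:
            self.acked += 1

sender = WindowSender(window=3)
results = [sender.try_send() for _ in range(5)]  # the window closes after three sends
sender.on_ack()
after_ack = sender.try_send()                    # one slot was freed by the ack
```

Because a new packet can enter the network only when an acknowledgement returns, the number of packets queued along the path is bounded by the window size.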
There is strong interaction between flow control schemes and routing schemes. Without a good flow control mechanism, outgoing packet queues can grow unreasonably. This increases the delay at the link, causing the routing algorithms to try to find a better path when no such path may exist. Therefore, in order to fairly compare routing algorithms, it is absolutely necessary to model flow control mechanisms.
We realize that there are many possible retransmission and flow control schemes. For preliminary consideration, we choose a simple scheme; this can be modified easily later.
Packet counters are used to represent the number of transport packets produced, the number of transport packets sent, the number of transport packets received, and the number of transport packets acknowledged (i.e. for which an acknowledgement was received). We model flow control by imposing an upper bound on the number of transport packets that have been sent but not yet acknowledged.
To estimate the round trip time, a special transport packet, referred to as a token, is exchanged between the nodes of a transport connection. Each node of the connection maintains the exponential average of the roundtrip times of the token, and uses that as a roundtrip time estimate.
Each transport packet has the following: (1) a type field indicating if the packet is a data packet, an acknowledgement packet, or a token packet, (2) a response bit which indicates if a response is required or not, and (3) a connection id, identifying the transport connection.
For each ordered pair of nodes (A, B), there is one application/transport module. All the traffic from A to B is generated by the produce event of this module. Parameters to this module include the size of packets, interpacket time, the maximum send window size, and the maximum produce window size. The interpacket time can specify various clustering phenomena in packet production. The maximum send window size restricts the number of packets that are sent but not yet acknowledged. The maximum produce window size restricts the number of packets produced but not yet sent.
An application/transport module consists of the following:
Packets Produced
Data Structure. Indicates the number of data packets produced.
Packets Sent
Data Structure. Indicates the number of data packets sent. Packets Produced - Packets Sent is always less than or equal to the produce window size.
Packets Acked
Data Structure. Indicates the number of data packets for which an acknowledgement was received at the source node. Packets Sent - Packets Acked is always less than or equal to the send window size.
Packets Received
Data Structure. Indicates the number of data packets received at the destination node.
Token Sent Time
Data Structure. Indicates the value of the current simulated time when the token was last sent.
Roundtrip Time Estimate
Data Structure. Indicates the current estimate of the roundtrip time.
Application:Produce
Internal event. Called by the event handler. If Packets Produced - Packets Sent is less than the produce window size, then Application:Produce increments Packets Produced, calls Transport:Send event, and schedules the next Application:Produce event according to the specified traffic pattern.
Application:Consume(p)
Internal event, where p is a data packet. Called by Transport:Receive event. If p requires a response, an Application:Produce event is scheduled at time t, where t is the sum of the current simulated time and the processing time.
Transport:Send
Internal event. Called by the events Application:Produce and Transport:Receive. If Packets Produced is greater than Packets Sent and Packets Sent - Packets Acked is less than the send window size, then Transport:Send increments Packets Sent, creates a data packet, and calls node produce event with this packet.
Transport:Receive(p)
External event, where p is a packet. Called by the node receive event. If p is a data packet, Transport:Receive(p) increments Packets Received, calls node produce event with an acknowledgement packet, and calls Application:Consume(p). If p is an acknowledgement packet, Transport:Receive(p) increments Packets Acked, and calls Transport:Send event. If p is a token packet, it updates the roundtrip time estimate, sets Token Sent Time to the value of current simulated time, and calls node produce event with p.
Transport:Retransmit(p)
External event, where p is a packet. Called by node failure, link failure, node receive, and node produce events. Transport:Retransmit(p) schedules a node produce event at time t, where t is the sum of current simulated time and a timeout duration based on the roundtrip time estimate. If p is a token packet, it also sets Token Sent Time to the value of current simulated time.
7 Topology Modification Group
This module dynamically modifies the topology of the network. It can add new links and nodes. It can delete existing links and nodes. These modifications are different from failures and repairs in that they are intentional. For example, before a node is deleted, all connections involving the node are closed down. When a node is created, new application/transport modules and routing/naming modules are also created and appropriately initialized.
8 Performance Evaluation Group
The modules in the performance evaluation group collect statistics from the different modules of the target system and evaluate performance measures. One performance measure would be queue size characterizations (e.g. expected values, transient overshoots). Another would be the time between starting of a new connection and settling of the routing tables.
Because of the complexity of the target system, it is difficult to determine a priori good performance measures. These will have to be arrived at after much experimentation with the simulator. For this reason, in the preliminary version of the simulator, we consider post-processing of the simulation log.
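As an illustration of such post-processing, the sketch below computes a time-averaged queue size and the peak queue size for a single outgoing packet queue. The log format (a time-sorted list of (time, queue length) records) and the function name are assumptions; the report does not specify the layout of the simulation log.

```python
def queue_statistics(log, end_time):
    """Time-averaged and peak queue length from (time, queue_length) records.

    `log` must be sorted by time; each record holds until the next one,
    and the last record holds until `end_time`.
    """
    if not log:
        raise ValueError("empty log")
    duration = end_time - log[0][0]
    if duration <= 0:
        raise ValueError("end_time must be later than the first record")
    area = 0.0
    peak = 0
    for (t, q), (t_next, _) in zip(log, log[1:] + [(end_time, None)]):
        area += q * (t_next - t)  # queue length weighted by how long it lasted
        peak = max(peak, q)
    return area / duration, peak

log = [(0.0, 0), (2.0, 3), (3.0, 1), (5.0, 0)]
mean_queue, peak_queue = queue_statistics(log, end_time=10.0)
```

The time-weighted mean corresponds to an expected queue size, while the peak gives a crude indication of transient overshoot.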
9 Simulator Manager Group
The modules in this group maintain the event occurrence list and compute the simulation log.
The activity of this module is basically the iterative event handling that has been described in section 2.
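A minimal sketch of this iterative event handling is shown below, assuming the usual discrete-event loop in which the earliest pending event is removed from the event occurrence list, the simulated time is advanced to it, and its handler may schedule further events. The class `Simulator`, its methods, and the example `produce` handler are hypothetical names.

```python
import heapq
import itertools

class Simulator:
    """Centralized discrete-event loop driven by an event occurrence list."""

    def __init__(self):
        self.now = 0.0
        self._events = []                  # heap of (time, tie_breaker, handler, args)
        self._counter = itertools.count()  # keeps same-time events in scheduling order
        self.log = []                      # simulation log: (time, event name)

    def schedule(self, delay, handler, *args):
        heapq.heappush(self._events,
                       (self.now + delay, next(self._counter), handler, args))

    def run(self, until):
        # Repeatedly take the earliest event, advance the clock, and handle it.
        while self._events and self._events[0][0] <= until:
            self.now, _, handler, args = heapq.heappop(self._events)
            self.log.append((self.now, handler.__name__))
            handler(self, *args)

def produce(sim, interval):
    # Each produce event schedules its successor, like a periodic internal event.
    sim.schedule(interval, produce, interval)

sim = Simulator()
sim.schedule(1.0, produce, 2.0)
sim.run(until=6.0)
```

The counter in each heap entry ensures that two events with the same simulated time are handled in the order in which they were scheduled, and that handlers themselves are never compared.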
10 Phases of The Testbed
In this section, we describe successive versions of the simulator.
Phase 1: Development of centralized simulator
In the preliminary version, we restrict ourselves to post-processing of the simulation log. Using this, we will develop effective performance measures that can be evaluated during the simulation instead of by post-processing.
We will examine the following single-layer routing schemes:
1. Fixed-path routing on small static networks.
2. Link-state routing on small and medium static networks.
3. Distance-vector routing on small and medium static networks.
Phase 2: Incremental evaluation of performance measures
We use the centralized simulator to incrementally evaluate the performance measures specified in phase 1. We will examine various single-layer routing schemes on medium static networks. If memory resources allow, we will store the simulation log and evaluate the effectiveness of the performance measures.
Phase 3: Hierarchical routing algorithms
For the centralized simulator, we will build modules for various hierarchical routing algorithms, such as the landmark hierarchy [Tsch 88]. We will specify performance measures for hierarchical routing algorithms. We will perform various experiments with these algorithms on medium static networks.
Phase 4: Dynamic network modification algorithms
For the centralized simulator, we will build modules for simulating hierarchical routing algorithms on medium dynamic networks.
Phase 5: Distributed Simulator
We will redesign the simulator to obtain a distributed discrete-event simulator. We realize that there are many ways of doing this [Mis 86, Fis 78, CMHo 79]. It is essential to take advantage of the localities of resource utilization that are naturally present in a routing system. One way to take advantage is to divide the target system at appropriate links (i.e. links which are seldom used), and simulate the separate parts of the target system on different computers. Then each computer simulates its own part of the target system, using its own local event occurrence list. The problem that arises is how to coordinate traffic that flows between different parts of the target system, and how to synchronize the simulated times on the different computers.
References
[CACI] CACI Products Company, "COMNET II.5 Overview", March 1990.
[Cher 86] D. R. Cheriton, "VMTP: A Transport Protocol for the Next Generation of Communication Systems", Proceedings ACM SIGCOMM '86 Symposium, Stowe, Vermont, pp. 406-415, Aug. 1986.
[Clar87] D. D. Clark, M. L. Lambert, L. Zhang, "NETBLT: A High Throughput Transport Protocol", Proceedings ACM SIGCOMM '87 Symposium, Stowe, Vermont, pp. 353-359, 1987.
[CMHo 79] K. M. Chandy, J. Misra, V. Holmes, "Distributed Simulation of Networks", Computer Networks, Vol. 3, pp. 105-113.
[PoRe 85] J. Postel, J. Reynolds, "File Transfer Protocol (FTP)", RFC 959, Network Information Center, SRI International, October 1985.
[PoRe 83] J. Postel, J. Reynolds, "Telnet Protocol Specification", RFC 854, Network Information Center, SRI International, May 1983.
[Fis 78] G. S. Fishman, Principles of Discrete Event Simulation, Wiley, New York.
[Gall 77] Robert Gallager, "A Minimum Delay Routing Algorithm Using Distributed Computation", IEEE Transactions on Communications, Vol. COM-25, No. 1, January 1977.
[Garc 89] J. J. Garcia-Luna-Aceves, "A Unified Approach to Loop Free Routing Using Distance Vectors or Link States", Proceedings ACM SIGCOMM '89 Symposium, Austin, Texas, pp. 212-223, Sep. 1989.
[GrKl80] M. Gerla, L. Kleinrock, "Flow Control: A Comparative Survey", IEEE Transactions on Communications, Vol. COM-28, No. 4, pp. 553-574, April 1980.
[Hey 89] A. Heybey, "The Network Simulator", Laboratory of Computer Science, Massachusetts Institute of Technology, October 1989.
[JaMo 82] J. M. Jaffe, F. H. Moss, "A Responsive Distributed Routing Algorithm for Computer Networks", IEEE Transactions on Communications, Vol. COM-30, No. 7, 1982.
[Jcob88] V. Jacobson, "Congestion Avoidance and Control", Proc. ACM SIGCOMM '88, Stanford, California, pp. 314-329, August 1988.
[Liv 89] M. Livny, "DeNet Overview", Technical note, University of Wisconsin-Madison, November 1989.
[LZGS 84] E. Lazowska, J. Zahorjan, G. Graham, K. Sevcik, Quantitative System Performance Analysis, Prentice Hall, Inc., Englewood Cliffs, 1984.
[Mar 88] D. Martin, "Network Simulator User's Manual", Laboratory of Computer Science, Massachusetts Institute of Technology, September 1988.
[McRR 80] J. M. McQuillan, I. Richer, E. C. Rosen, "The New Routing Algorithm for the ARPANET", IEEE Transactions on Communications, Vol. COM-28, No. 5.
[MeSe 79] P. M. Merlin, A. Segall, "A Failsafe Distributed Routing Protocol", IEEE Transactions on Communications, Vol. COM-27, No. 9, September 1979.
[Mis 86] J. Misra, "Distributed Discrete-Event Simulation", Computing Surveys, Vol. 18, No. 1, March 1986.
[Reis79] M. Reiser, "A Queueing Network Analysis of Computer Communication Networks with Window Flow Control", IEEE Transactions on Communications, Vol. COM-27, No. 10, pp. 1199-1209, Oct. 1979.
[SpGa 89] J. M. Spinelli, R. G. Gallager, "Event Driven Topology Broadcast Without Sequence Numbers", IEEE Transactions on Communications, Vol. 37, No. 5, May 1989.
[Tsch 88] P. F. Tsuchiya, "The Landmark Hierarchy: A New Hierarchy for Routing in Very Large Networks", ACM SIGCOMM '88 Symposium, August 1988.