Active Queue Management and Scheduling Methods for Packet-Switched Networks


Abstract

AKIN, OZDEMIR. Active Queue Management and Scheduling Methods for Packet-Switched Networks

(under the supervision of Dr. J. Keith Townsend)

To support the myriad of envisioned communication products of the future, there is a need to develop a network infrastructure that can provide larger bandwidth, with better control of quality of service (QoS). However, with increasing demand for applications running over packet networks, congestion at the intermediate nodes (e.g., routers and switches) can be a serious problem. Consequences include long delays, large delay variation and high packet loss rates.

Different solutions requiring varying levels of modification to the currently used algorithms have been proposed both for responsive (e.g., TCP) and unresponsive (e.g., UDP) protocols. However, most of the solutions are either too complicated to implement in real life or not general enough to be applicable to an arbitrary network topology. In this thesis, we investigate two mechanisms, active queue management (AQM) and scheduling, that can improve QoS in packet networks.


of complexity and performance. We also propose a distributed networking scheme that improves the performance of our new AQM algorithms.


Active Queue Management and Scheduling Methods for

Packet-Switched Networks

by

Ozdemir Akin

A dissertation submitted to the Graduate Faculty of North Carolina State University

in partial fulfillment of the requirements for the Degree of

Doctor of Philosophy

Department of

Electrical and Computer Engineering

Raleigh, N.C. 2004

APPROVED BY:

J. K. Townsend I. Viniotis

Chair of Advisory Committee


Biography


Contents

List of Figures
List of Tables

1 Introduction
1.1 Related Work
1.1.1 Related Work on AQM
1.1.2 Related Work on Scheduling
1.2 Contribution

2 Self-Configuring PI Controller
2.1 Analytical Model for TCP
2.2 Self-Configuring PI Controller with RTT Estimation
2.2.1 SCPI Controller
2.2.2 SCPI with Fairness (SCPI-F)
2.3 Simulation Results
2.3.1 SCPI Controller
2.3.2 SCPI with Fairness (SCPI-F)

3.1.1 Adaptive PI Controller
3.1.2 Fairness
3.2 Simulation Results
3.2.1 Case I
3.2.2 Case II

4 Congested Node Control Network for TCP Connections
4.1 Definitions
4.2 TCP Traffic in CNC Networks
4.2.1 PI Controller
4.2.2 Header Encapsulation for TCP connections in CNC Networks
4.2.3 Implementing CNC Networks
4.2.4 Simulations

5 Deficit virtual Round Robin (DvRR)
5.1 Notation and Summary of Related Work
5.1.1 Packet-by-Packet Generalized Processor Sharing (PGPS)
5.1.2 Deficit Round Robin (DRR)
5.2 DvRR
5.3 Analysis
5.3.1 Latency
5.3.2 Number of queues (p)
5.3.3 Fairness Measure
5.3.4 Complexity vs. Performance
5.3.5 Guaranteed Rate Scheduling

6 Conclusion and Future Work

A Appendix


List of Figures

2.1 Block diagram of the linearized system model with AQM.

2.2 Plots of window size versus time are shown for the real window size (left) and the effective window size (right).

2.3 The single bottleneck topology used in the simulations.

2.4 Plots of queue length vs. time are shown for load changes (N) at 120 and 220 seconds, for the regular PI controller (top) and the self-configuring PI controller (bottom).

2.5 Plots of queue length vs. time are shown for changing RTT, for the regular PI controller (top) and the self-configuring PI controller (bottom).

2.6 Plots of the variance of the queue length are shown for different numbers of connections both for regular and self-configuring PI controllers. For each point five independent samples are used, and 95% confidence intervals are also shown.

2.7 Plots of time vs. queue length are shown for SCPI (top), PI (R=0.1) (middle) and PI (R=0.12) (bottom). The RTT is increased at time 90 seconds by 30 ms.

2.8 Plots of time vs. queue length are shown for SCPI (top), PI (R=0.1) (middle) and PI (R=0.12) (bottom). The RTT is decreased at time 90 seconds by 30 ms.

2.9 The average bandwidth used by each connection is shown for RED (top), SCPI (middle) and SCPI-F (bottom).

2.11 Plots of time vs. queue length are shown for RED (top), SCPI (middle) and SCPI-F (bottom).

3.1 Single bottleneck topology used in the simulations. Ci and C are the link capacities, di and d are the propagation delays of the links and pm is the random marking probability of the packets after leaving the queue.

3.2 Estimated RTT for different values of N and pm.

3.3 The bandwidths of all connections are shown for RED, the regular PI controller and the full PI controller.

3.4 Plots of queue length vs. time are shown. Only 200 seconds of the simulations are shown for clarity of the figures.

3.5 The queue length vs. time plots are shown for all three schemes.

3.6 The queue length versus time plots are shown.

3.7 The marking probability of the node is shown.

3.8 Topology used in the simulations for multiple-bottleneck. Ci’s are the link capacities, and dj,l are the propagation delays of the links.

3.9 The queue lengths of both nodes (node1 on the left and node2 on the right) are shown.

3.10 The queue lengths of both nodes (node2 on the left and node3 on the right) are shown.

4.1 Topology used in the simulations. Ni is the number of connections, and Ci is the link capacity.

4.2 Plots of queue length vs. time are shown for CNC-PI. The top plot is for link C3 and the bottom one is for link C1.

4.4 Plots of queue length vs. time are shown for CNC-PI. The top plot is for link C3 and the bottom one is for link C1.

4.5 Plots of queue length vs. time are shown for DT. The top plot is for link C3 and the bottom one is for link C1.

4.6 Plots of queue length vs. time are shown for CNC-PI. The top plot is for link C3 and the bottom one is for link C1.

4.7 Plots of queue length vs. time are shown for DT. The top plot is for link C3 and the bottom one is for link C1.

4.8 Plots of queue length vs. time are shown for CNC-PI. The top plot is for link C1, the middle one is for C2 and the bottom one is for link C3.

4.9 Plots of queue length vs. time are shown for DT. The top plot is for link C1, the

4.10 Plots of queue length vs. time are shown for CNC-PI. The top plot is for link C1, the

4.11 Plots of queue length vs. time are shown for DT. The top plot is for link C1, the

5.1 The bottom three plots are for the second group and the rest is for first group. The dotted line is for the DvRR scheduler, the dashed line is for Aliquem and the solid line is for DRR scheduler.

5.2 The average delay for both groups combined are shown in this figure. The dotted line is for the DvRR scheduler, the dashed line is for Aliquem and the solid line is for DRR scheduler.

5.3 The bottom three plots are for the second group and the rest is for first group. The

5.4 The average delay for both groups combined are shown in this figure. The dotted

DRR scheduler.

5.5 The bottom three plots are for the second group and the rest is for first group. The

is for DRR scheduler.

5.6 The average delay for both groups combined are shown in this figure. The dotted


List of Tables

2.1 Convergence times for SCPI and PI schemes.

2.2 Convergence times for SCPI and PI schemes.

2.3 Fairness indexes for RED, SCPI and SCPI-F are shown. The numbers in the parentheses show the 95% confidence intervals.

3.1 Simulation results for the fairness indexes for Case I.

3.2 Simulation results for the fairness indexes for different N2 values.

3.3 Simulation results for the fairness indexes for Case II.


Chapter 1

Introduction

To support the myriad of envisioned communication products of the future, there is a need to develop a network infrastructure that can provide larger bandwidth, with better control of quality of service (QoS). QoS metrics include packet loss, bandwidth, delay and delay variation. The growing demand for packet-switched applications over the Internet (e.g., voice over IP, teleconferencing, video-on-demand) requires improved QoS. However, in today’s best-effort traffic delivery, connections are typically treated equally regardless of their QoS requirements. Thus, congestion in network nodes results in large delay and delay variation for all flows passing through the congested nodes. When congestion continues, incoming traffic can exhaust the buffers of the intermediate nodes (e.g., routers and switches) and cause bursts of packet losses.

The mechanisms to solve the congestion problem at the intermediate nodes are called active queue management (AQM) algorithms. These techniques attempt to prevent congestion and regulate the queue length by sending congestion signals (i.e., dropping and/or marking packets) in a proactive manner, which is assumed to eventually cause the senders to decrease their sending rates.


rate according to the congestion level of the network. Transmission Control Protocol (TCP) is the most commonly used responsive protocol in today’s Internet. Different protocols use different methods to learn about the congestion level of the network. For example, TCP Reno uses packet losses and duplicate acknowledgments as indicators of congestion and cuts down its sending rate. On the other hand, unresponsive protocols, such as the User Datagram Protocol (UDP), have no congestion control mechanism and set their rate regardless of the network conditions. From the definition of unresponsive protocols it can be seen that these kinds of protocols may affect the performance of an AQM scheme, as they do not react to the congestion notifications sent by the network.

Although AQM schemes can improve the QoS parameters, generally they cannot provide strict guarantees. AQM schemes, as their name implies, are mostly used to regulate the queue length and prevent congestion. Another mechanism called scheduling is used to provide QoS guarantees. A scheduling mechanism decides on the sending order of the packets in order to satisfy the QoS requirements of the flows.

A flow can be defined in a number of ways. Generally a flow is defined by a group of fields in a packet header, such as source and destination IP addresses, port numbers, and transport protocol. Some applications may not use all these fields or may even use different fields to identify a flow. Throughout the thesis we use a simple definition based on a 4-tuple: the IP addresses and port numbers of the source and destination. Although we use this flow definition, the algorithms introduced later can be generalized for other flow definitions. We use the terms flow and connection interchangeably in this thesis.
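As a small illustration of the 4-tuple flow definition, the following Python sketch groups packets by flow. The field names and the dictionary-based packet representation are hypothetical and chosen only for this example.

```python
from collections import Counter
from typing import NamedTuple


class FlowKey(NamedTuple):
    """4-tuple flow identifier (field names are illustrative assumptions)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def packets_per_flow(packets):
    """Count packets per flow; each packet is a dict holding the four header fields."""
    counts = Counter()
    for pkt in packets:
        key = FlowKey(pkt["src_ip"], pkt["src_port"], pkt["dst_ip"], pkt["dst_port"])
        counts[key] += 1
    return counts
```

Two packets that agree on all four fields map to the same key, so they are treated as belonging to one flow (or connection).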

1.1 Related Work


brief summary of the previous research in these areas is provided.

1.1.1 Related Work on AQM

Tail-Drop (TD) is a very simple queue management scheme (i.e., no queue management), where a simple FIFO queue is used and packets arriving at a full buffer are simply dropped. The simplicity makes TD highly scalable. However, as mentioned in [1], this scheme has a bias against bursty traffic and may cause global synchronization, as many flows can lose packets at the same time and drop their sending rates. In order to overcome the problems of TD, several researchers have studied different AQM techniques.

In a survey [2], Random Drop was defined as a congestion control policy that drops packets with a uniform probability, which results in more packet drops for the flows that use a larger share of the bandwidth. In [3], the shortcomings of Random Drop are discussed and an Early Random Drop scheme is proposed. This scheme starts dropping packets with a fixed probability when the queue length exceeds a certain limit.

Random Early Detection (RED) [4] uses a dynamic drop probability instead of a fixed one. An exponentially weighted moving average of the queue length is used to calculate the dropping probability. When the average queue length is less than a minimum threshold no packets are dropped, and when it exceeds a maximum threshold all incoming packets are dropped. When the average queue length is between the thresholds, packets are dropped with a probability that is a function of the average queue length. In RED, packet marking (e.g., the ECN bit) can also be used instead of dropping.
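The threshold behavior described for RED can be sketched in Python as follows. This is a simplified illustration, not the full algorithm of [4]: the parameter names and default values (`min_th`, `max_th`, `max_p`, `weight`) are assumptions, the probability between the thresholds is taken to rise linearly, and the count-based spacing of drops is omitted.

```python
from dataclasses import dataclass


@dataclass
class RedQueue:
    """Simplified RED-style drop probability (illustrative sketch only)."""
    min_th: float = 5.0     # assumed minimum threshold (packets)
    max_th: float = 15.0    # assumed maximum threshold (packets)
    max_p: float = 0.1      # assumed drop probability just below max_th
    weight: float = 0.002   # assumed weight of the moving average
    avg: float = 0.0        # averaged queue length (packets)

    def update_average(self, queue_len: float) -> float:
        """Fold the instantaneous queue length into the moving average."""
        self.avg = (1.0 - self.weight) * self.avg + self.weight * queue_len
        return self.avg

    def drop_probability(self) -> float:
        """Drop (or marking) probability for the current average queue length."""
        if self.avg < self.min_th:
            return 0.0   # below the minimum threshold: never drop
        if self.avg >= self.max_th:
            return 1.0   # above the maximum threshold: drop everything
        # Between the thresholds: assumed linear ramp from 0 up to max_p.
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
```

Because the probability depends on the averaged rather than the instantaneous queue length, short bursts change it only slightly.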


In [10] the effect of RED on the performance of Web browsing was studied and in [9] the authors compare RED with Tail-Drop. Neither paper reports a significant performance difference between the two schemes.

Different analytical approaches for RED were studied in [11], [12], [13]. Using fluid-based analysis a TCP model was developed in [14]. In the same paper this model was used to investigate RED. In a follow-up paper [15] a control theoretic approach is used to investigate the RED algorithm and some guidelines for tuning RED parameters are also provided.

In a different approach, optimization techniques are used to design new AQM algorithms. In Random Exponential Marking (REM) [16, 17], a congestion measure called price is used. Price is updated according to the rate mismatch (difference between the incoming rate and target rate) and queue mismatch. When the number of users or the sending rate increases, the price also increases, and hence so does the marking probability, and vice versa.
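A price-based marking rule of this kind can be sketched as below. The update form (price raised by a weighted sum of queue mismatch and rate mismatch, with an exponential marking probability) follows the way REM is commonly stated, but it should be read as an assumption here; the constants `gamma`, `alpha` and `phi` are placeholder values, not values taken from [16, 17].

```python
def rem_update_price(price, queue_len, target_queue, input_rate, target_rate,
                     gamma=0.001, alpha=0.1):
    """One price update driven by queue mismatch and rate mismatch.

    gamma (step size) and alpha (queue-mismatch weight) are assumed constants.
    """
    mismatch = alpha * (queue_len - target_queue) + (input_rate - target_rate)
    # The price is not allowed to become negative.
    return max(0.0, price + gamma * mismatch)


def rem_mark_probability(price, phi=1.001):
    """Marking probability that grows with the price: 1 - phi**(-price), phi > 1."""
    return 1.0 - phi ** (-price)
```

With this form a sustained overload keeps raising the price, and therefore the marking probability, until the sources back off.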

BLUE [18] is another AQM technique which uses random drop to regulate the queue length. In contrast to RED, BLUE uses packet loss and link utilization to calculate packet drop probability instead of queue length.

On the other hand, in [19], a control theoretic approach is employed to design a proportional-integral (PI) controller using the analytical model for TCP derived in [14, 15]. The main advantage of the control theoretic approach over other approaches is that it is easier to analyze the transient behavior of the system.


In addition, there is a significant research effort in controlling TCP traffic with explicit bandwidth notification. Although these are not technically AQM schemes, they are another approach to the same problem that AQM schemes try to solve.

In [23] a method called Generalized Window Advertising is proposed. In this method the buffer space available in the network is sent to the end-users to regulate their window size. However, this method requires large buffer sizes in the network. To solve this problem another method is proposed in [24], in which the network sends available bandwidth and propagation delay information to the end-users. In a different context where an IP network connects to an ATM network, a similar technique, Explicit Window Adaptation, is used [25]. An important difference of this technique is that it does not require any modifications at the end-user nodes. The network modifies the window advertisement field in the ACK packets. In a similar method a CSFQ [26] network is considered in which a new header is used to send the available packet information to the end-user [27].

There is also a considerable amount of research on new versions of TCP to overcome the shortcomings of the regular TCP (i.e., TCP Reno). TCP Vegas [28] is a variation of TCP in which two throughput values, expected and actual, are calculated and the window size is modified according to the difference between these two values. TCP Westwood (TCPW) [29] is a variation of regular TCP which only requires modification at the source side. In TCPW, the available bandwidth of a connection is estimated from the ACK receiving rate of the source. As mentioned in [30], ACK compression may cause overestimation of the available bandwidth, which ultimately causes performance degradation.

1.1.2 Related Work on Scheduling


only if there are no packets awaiting. In sorted-priority schemes each packet is stamped with a value using a special function (e.g., virtual clock) and serviced in increasing order of their time-stamp values. Although these algorithms give good fairness and low delay guarantees, the sorting operation increases the per-packet complexity. Also, in some schemes, computing the virtual time can be difficult to do in real time.

On the other hand, frame-based algorithms divide time into intervals (frames) and at each frame flows receive service in a round-robin fashion. The amount of service that each flow receives at each frame is adjusted according to a flow’s reserved rate. Although frame-based schedulers can decrease the per-packet complexity (the number of operations required in the worst-case), they cannot sustain the fairness and delay bounds of sorted-priority schemes.

As a scheduling scheme at a router, Generalized Processor Sharing (GPS) is an idealized algorithm which assumes that all flows with packets waiting in the queue can be serviced simultaneously. Although GPS cannot be implemented in a real network, as only one packet can be in service at any time, the performance of a GPS scheduler can be used as a benchmark to evaluate new algorithms.

An approximation to GPS is the packet-by-packet GPS (PGPS) algorithm. In PGPS all packets are timestamped with the service finishing time in a GPS server and are serviced in increasing order of their timestamps. This scheme was first introduced in [31] (where it is called Weighted Fair Queueing (WFQ)) and later studied in [32]. Although the fairness and delay bounds of PGPS are very good, calculation and sorting of the timestamps require large computational power which might be prohibitive when the number of connections becomes large. Sorting operations require a complexity of


the GPS server can be sent.

Self-Clocked Fair Queueing (SCFQ) [34] was designed to reduce the computational complexity of WFQ by using a simpler function to calculate timestamps. SCFQ estimates the virtual time by the time-stamp of the packet that is being serviced at that time. In a variation of SCFQ, Start-Time Fair Queueing (SFQ) [35] uses the service starting times of packets as timestamps. These schemes are simpler than WFQ but they still require the sorting operation according to the timestamps.

Deficit Round Robin (DRR) [36] is a frame-based scheduling algorithm where each flow is assigned a quantum value (the amount of service a flow should receive in one frame) proportional to its reserved rate. At each frame, flows with packets in the queue are serviced in a round-robin fashion. When a flow cannot use its full quantum in a round, the difference between its quantum and the service received is stored in a variable called the Deficit Counter (DC). Using the DC lets a flow use its unused quantum in a later round. DRR exhibits a complexity of O(1) when the quantum size of each flow is greater than the corresponding flow’s maximum packet size. A variation of DRR called Surplus Round Robin (SRR) [37] services a flow until its DC value becomes non-positive. So if a flow is serviced more than its reserved rate in a round, it is penalized in the next rounds.
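The quantum and Deficit Counter mechanics of DRR can be illustrated with the short Python sketch below. It is a minimal model under stated assumptions: packets are represented only by their sizes, the class and method names are hypothetical, and the active-flow list that gives DRR its O(1) per-packet cost is replaced by a plain loop over all flows for brevity.

```python
from collections import deque


class DeficitRoundRobin:
    """Minimal DRR model: one FIFO and one deficit counter per flow."""

    def __init__(self, quanta):
        # quanta maps flow id -> quantum (bytes per round), assumed to be
        # proportional to the flow's reserved rate.
        self.quanta = dict(quanta)
        self.queues = {flow: deque() for flow in self.quanta}
        self.deficit = {flow: 0 for flow in self.quanta}

    def enqueue(self, flow, size):
        """Append a packet of the given size (bytes) to the flow's queue."""
        self.queues[flow].append(size)

    def serve_round(self):
        """Serve one round; return the (flow, size) pairs sent, in order."""
        sent = []
        for flow, queue in self.queues.items():
            if not queue:
                continue
            self.deficit[flow] += self.quanta[flow]
            # Send head-of-line packets while the accumulated credit allows it.
            while queue and queue[0] <= self.deficit[flow]:
                size = queue.popleft()
                self.deficit[flow] -= size
                sent.append((flow, size))
            if not queue:
                # A flow that empties its queue does not keep unused credit.
                self.deficit[flow] = 0
        return sent
```

In this sketch a flow whose head-of-line packet is larger than its current credit simply waits, and the unused credit carries over to the next round.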

In Pre-Order DRR (PDRR) [38], a pre-order queueing module is appended to DRR. This module uses Z priority queues, and assigns packets to these queues. This scheme is an approximation of WFQ as it comes closer with increasing Z. However, it has a complexity of O(N + Z log Z) [39].


to these queues according to their DC and quantum values. In other words, Aliquem keeps track of q rounds simultaneously instead of a single one as in DRR. The worst-case per-packet complexity of Aliquem is given as O(q). In a similar approach, Bin Sort Fair Queueing (BSFQ) [41] uses a virtual clock similar to SCFQ, but the queue is partitioned into N bins and each bin corresponds to an interval in virtual time.

A class of schedulers called Latency-Rate (LR) servers was defined in [42] and [43]. LR servers include WFQ, DRR, Aliquem, etc. Also in these references a new definition of latency is proposed, and according to the new definition the latencies of many LR servers are calculated.

Another generalized framework called Guaranteed Rate (GR) is introduced in [44, 45] for studying scheduling algorithms. In these papers it is shown that many well-known scheduling algorithms, such as PGPS and SCFQ, belong to the class of GR scheduling algorithms. In [46] the relationship between the LR and GR frameworks is studied, and it is shown that if a scheduler is an LR server, it is also a GR server.

1.2 Contribution


We also add a fairness utility to this new scheme to address TCP’s unfairness toward connections with long round-trip times.

Also based on this AQM scheme, we propose modifications to TCP which improve network fairness even in multiple bottleneck scenarios. The result is a robust overall system where the queue lengths are stable and the resources are distributed fairly among the competing TCP connections. We achieve this by adding four new fields to the TCP header which enable intermediate nodes and end-users to exchange information about their respective states. At the intermediate nodes this information is used to adjust the AQM scheme while the end-users utilize the information sent by the intermediate nodes to adjust the algorithms that determine the amount of window size increase.

Building on our self-configuring AQM algorithm, we introduce a new networking scheme that sustains fairness between the connections while requiring modifications only at the intermediate nodes. In this model the packets are encapsulated with a new header when they enter the network and the header is removed at the last node before the packet is forwarded to the end-user. The new header carries the information between the nodes that is required by the AQM schemes used. It is important to note that this algorithm does not require any changes at the end-user nodes. Although we only use PI controllers in this new scheme, other AQM and scheduling algorithms can be used to further improve network performance.


Chapter 2

Self-Configuring PI Controller

An alternative approach to the design of AQM algorithms is using control theoretical techniques. In [14], a fluid-based analysis using a non-linear model is developed for the analysis of networks using TCP. A linearized dynamic model is developed in [15] for TCP and AQM algorithms. This model is used in [47, 19] to design AQM techniques using proportional and proportional-integral (PI) controllers. The model provides an accurate representation of network behavior with TCP traffic.

In these papers simulations are used to show that the PI controller outperforms the RED technique under a wide variety of conditions. Also in [48] an AQM scheme is proposed based on the adaptive virtual queue algorithm and the linearized model in [15]. In [20, 21], a self-configuring PI controller is introduced and shown to perform better than the regular PI controller. In another control theoretic approach [22], a sliding mode variable structure is used.


Most of the recently proposed AQM schemes (e.g., RED, PI controller) only attempt to control the queue length of the buffer while maintaining a high utilization. Although this is very important, another crucial goal is to distribute the available resources among the connections in a fair fashion. However, a well-known problem of TCP is that the transmission rate of a connection is inversely proportional to its round trip time [49]. This introduces a bias against the connections with long RTT and thus results in an unfair distribution of resources among the users.

In order to solve this problem a Constant-Rate algorithm is proposed in [49]. In this algorithm, each connection increases its window by $a \times RTT^2$ packets each round trip time for a fixed constant a. As a result, each connection increases its window by $a \times RTT$ packets each second, so its sending rate grows by a packets per second each second regardless of its RTT. Although this solves the fairness problem, choosing a value for a is still an open issue.

Other approaches to solve the fairness issue include “increase-by-k” [50], New-ECN [51], and CANIT [52]. These algorithms manipulate how the window size increases to attempt to improve fairness among connections with different RTTs. A similar approach is used in [53, 54], where the drop probability of the router is changed as a function of the RTT of the packets to make the system fair. However, in order to do this the intermediate node needs to have an estimate of the RTTs of the connections. In these papers it is assumed that the RTT estimates are somehow (e.g., using a new field in the TCP header) sent by the end-users to the intermediate nodes.

Although these schemes are shown to work under some conditions, most require changes at both the end-user nodes and the intermediate nodes, which is not very practical. Also, most of these techniques only deal with the fairness problem and leave queue management to another module.


we propose a per-flow fairness scheme combined with the SCPI controller (SCPI-F) that only requires modifications at the intermediate nodes. Also, in [55] the use of some kind of dropping policy along with another module for fairness was suggested to improve the fairness for responsive protocols (e.g., TCP), as the sending rate is a direct function of packet dropping probability. The main idea of the new scheme is to distribute the packet marking/dropping probability calculated by the SCPI controller between the connections according to their RTTs so that all connections have a fair share of the bandwidth. This scheme not only keeps the queue at some predetermined length to avoid congestion, but also maintains fairness among all connections.

2.1 Analytical Model for TCP

TCP is one of the most commonly used reliable communication protocols over the Internet. Although there are different versions of TCP (e.g., Reno, Vegas, NewReno) we do not use a specific version. We only need a simple model that uses additive increase and multiplicative decrease. In this model the window size is increased by one over the current window size (1/W) with every received packet, and the window size is halved whenever a packet loss is detected.
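The additive-increase, multiplicative-decrease rule used in this model can be written as a few lines of Python. This is only a sketch of the simplified model described above; the function names are hypothetical, and the lower bound of one packet on the window is an added assumption.

```python
def aimd_step(window, loss_detected):
    """Update the window for one received packet or one detected loss."""
    if loss_detected:
        # Multiplicative decrease: halve the window (assumed floor of 1 packet).
        return max(1.0, window / 2.0)
    # Additive increase: add one over the current window size.
    return window + 1.0 / window


def window_after_round_trip(window):
    """Apply the increase once per packet of a full window (one round trip)."""
    for _ in range(int(window)):
        window = aimd_step(window, loss_detected=False)
    return window
```

Applying the increase once per packet of a window adds slightly less than one packet per round trip, which is the origin of the 1/R growth term in the fluid model.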

An analytical model for queue and TCP dynamics is developed in [14, 15] by using stochastic differential equations. There it is shown that these equations can be transformed into ordinary non-linear differential equations. In this chapter a very similar dynamic model is used where the TCP window and queue behavior is transformed into the following non-linear differential equations,

\[
\begin{aligned}
\dot{W}(t) &= \frac{1}{R(t)} - \frac{W(t)\,W(t-R(t))}{2R(t-R(t))}\,p_{total}(t-R(t)) \\
\dot{q}(t) &= \frac{W(t)}{R(t)}\,N(t) - C
\end{aligned}
\tag{2.1}
\]

where $p_{total}(t) = p_d(t) + p_m(t)$ and,

W = average window size of TCP (packets)
N = number of TCP connections
R = round trip time
C = link capacity (packets/sec)
$p_m$ = packet marking probability
$p_d$ = packet dropping probability
q = average queue length (packets)

In this chapter the explicit congestion notification (ECN) bit is used instead of dropping packets, and it is also assumed that the AQM schemes prevent packet drops due to buffer overflows. For notational simplicity we use p instead of $p_m$, since $p_d = 0$. It is also important to note that in these equations TCP is simplified by ignoring time-outs and slow-starts. However, it is shown in [14, 15] using simulations that these simplifications have negligible effect.

In [15], the non-linear differential equations are linearized around an operating point $(W_0, q_0, p_0)$. In Fig. 2.1 the block diagram of the linearized system is shown. The transfer functions for the TCP and queue dynamics are given as,

\[
P_1(s) = \frac{\dfrac{R_0 C^2}{2N^2}}{s + \dfrac{2N}{R_0^2 C}}, \qquad
P_2(s) = \frac{\dfrac{N}{R_0}}{s + \dfrac{1}{R_0}}
\tag{2.2}
\]

Figure 2.1: Block diagram of the linearized system model with AQM. [Diagram blocks: AQM; Delay ($e^{-sR_0}$); TCP Window Dynamic $P_1(s)$; Queue Dynamic $P_2(s)$. Signals: p, W, q.]

controller yields a simple difference equation for updating the drop probability, p. The algorithm for this difference equation is given as,

\[
p(kT) = p((k-1)T) + a\,(q(kT) - q_0) + b\,(q((k-1)T) - q_0) \tag{2.3}
\]

where $q_0$ is the target queue length, T is the sampling period, and a and b are constants.

In [19], it is advised that the sampling frequency (1/T) should be 10 to 20 times the loop bandwidth of the system.
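For illustration, one evaluation of the difference equation (2.3) can be sketched in Python as follows. The function name is hypothetical, clamping the result to [0, 1] is an added assumption (a probability cannot leave that range), and the coefficient b is passed with whatever sign the controller design produces, since (2.3) as written adds the term for the previous sample.

```python
def pi_update(p_prev, q_now, q_prev, q_target, a, b):
    """Evaluate (2.3) once and clamp the result to a valid probability."""
    p = p_prev + a * (q_now - q_target) + b * (q_prev - q_target)
    # Assumption: the marking probability is kept within [0, 1].
    return min(1.0, max(0.0, p))
```

The function would be called once per sampling period T, with `q_prev` set to the queue length observed at the previous call.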

2.2 Self-Configuring PI Controller with RTT Estimation


Another important parameter to estimate is the number of connections. A naive method is to look at each packet’s header and count the number of different connections. Although this simple method works in principle, it can be hard to implement in real life when the outgoing link of the connection is very fast, as the additional operations required for counting may become the bottleneck. To solve this problem a Bloom filter can be used.
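One way a Bloom filter could be used to count distinct connections is sketched below in Python. This is an illustrative assumption about how such a counter might look, not the method used in the thesis: the class name, the bit-array size and the number of hash functions are arbitrary, and SHA-256 is used only for convenience where a router would use much cheaper hash functions. Because a Bloom filter can report false positives, the count can slightly underestimate the number of flows.

```python
import hashlib


class FlowCounter:
    """Approximate distinct-flow counter backed by a Bloom filter."""

    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits        # assumed filter size (bits)
        self.num_hashes = num_hashes    # assumed number of hash functions
        self.bits = bytearray(num_bits // 8 + 1)
        self.count = 0                  # flows that looked new so far

    def _positions(self, flow):
        """Yield the bit positions selected by each hash function."""
        key = repr(flow).encode()
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def observe(self, flow):
        """Record one packet of the given flow; return True if the flow looked new."""
        is_new = False
        for pos in self._positions(flow):
            byte, bit = divmod(pos, 8)
            if not self.bits[byte] & (1 << bit):
                is_new = True
                self.bits[byte] |= 1 << bit
        if is_new:
            self.count += 1
        return is_new
```

Each packet costs a fixed number of hash evaluations and bit operations, independent of how many flows have already been seen.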

2.2.1 SCPI Controller

The coefficients of the PI controller are functions of the network parameters R, C and N. Thus, using fixed values for a and b may significantly affect the performance of the system when a change in the network configuration occurs. A self-configuring structure is proposed in [20, 21] to solve this problem. In these papers, the TCP workload at the bottleneck link (N) is estimated and the coefficients of the PI controller are updated using this estimated value. However, in these methods R is not estimated; it is suggested to set it to a fixed value, which may significantly limit the self-configuring capability of the controller. In this section we show how to estimate R and N at the congested node and use these methods to construct a new self-configuring PI (SCPI) controller.

Assuming the system is at equilibrium, $\dot{W} = 0$ and $\dot{q} = 0$. So using (2.1) we can get the following relations,

\[
W_0^2\,p_0 = 2, \qquad W_0 = \frac{R_0 C}{N} \tag{2.4}
\]

where $(W_0, q_0, p_0)$ is the operating point and $R_0 = q_0/C + \text{propagation delay}$. Then by combining the equations in (2.4) we can find the relationship between $R_0$ and other parameters (C, N and $p_0$) as,

\[
R_0 = \frac{N}{C}\sqrt{\frac{2}{p_0}} \tag{2.5}
\]

Figure 2.2: Plots of window size versus time are shown for the real window size (left) and the effective window size (right). [Both panels plot window size (levels w, w+1, w+2) against time between t1 and t2; the areas under the curves are labeled A1 and A2.]

It is important to note that in the above equation $W_0 = \sqrt{2/p_0}$, which is different from the estimate found in [56] ($W_0 \approx \sqrt{8/(3p_0)}$). This is because in our derivation we assumed all the packets are dropped or marked in a random fashion, while in [56] a deterministic assumption is used.

Even though (2.5) is valid only when the system is in equilibrium, it is used to estimate R even when the system is not at equilibrium. Thus, N, C and p values are necessary to estimate R. Here it is assumed that the link capacity (C) is constant and known; this assumption can be relaxed using the estimation method in [20]. The number of connections, N, is estimated by observing individual packets. Methods such as sampling and capture-recapture [57] can also be used to reduce the workload of the CPU, but we do not focus on them here. The other parameter that is estimated is p, which can be easily estimated by calculating the ratio of the number of marked packets to the total number of packets received. Then the expression for estimating R becomes,

\[
\hat{R} = \frac{\hat{N}}{C}\sqrt{\frac{2}{\hat{p}}} \tag{2.6}
\]

where $\hat{X}$ represents the estimated value of X.
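The estimate in (2.6) is straightforward to compute from packet counts. The Python sketch below shows one possible form; the function names and the handling of the edge cases (no packets seen, no marks seen) are assumptions made for this example.

```python
import math


def estimate_marking_probability(marked_packets, total_packets):
    """Ratio of marked packets to all packets received (0 if none were seen)."""
    if total_packets == 0:
        return 0.0
    return marked_packets / total_packets


def estimate_rtt(n_hat, capacity, p_hat):
    """Evaluate (2.6): R = (N / C) * sqrt(2 / p), with C in packets/sec."""
    if p_hat <= 0.0:
        # Assumption: the estimate is undefined until some packets are marked.
        raise ValueError("marking probability estimate must be positive")
    return (n_hat / capacity) * math.sqrt(2.0 / p_hat)
```

For example, with 50 connections, a capacity of 1250 packets/sec and 8% of packets marked, the estimate evaluates to 0.2 seconds.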


can only send complete packets. Here (using Fig. 2.2) we show that this causes the system, on average, to send 1 less packet than the real window size (we call this window $W_{effective}$),

\[
\begin{aligned}
A_1 - A_2 &= (t_2 - t_1)(w + 1) + (t_2 - t_1)\cdot 0.5 - (t_2 - t_1)(w + 1) \\
&= (t_2 - t_1)\cdot 0.5 \\
\text{on average}\quad \Delta W &= W_{real} - W_{effective} = \frac{A_1 - A_2}{t_2 - t_1} = 0.5
\end{aligned}
\tag{2.7}
\]

A further correction is needed, as w could have a decimal part (assuming a uniform distribution, on average the decimal part will be 0.5). This will cause the window size to be overestimated by additional 0.5 which makes the total error exactly 1 packet. This error becomes significant when the average window size is small. To correct this we modify (2.4), thus yielding,

\[ W_0^2 p_0 = 2, \qquad (W_0 - 1) = \frac{R_0 C}{N} \]

which gives,

\[ \hat{R} = \frac{\hat{N}}{C}\left(\sqrt{\frac{2}{\hat{p}}} - 1\right) \tag{2.8} \]
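The two RTT estimators, (2.6) and its corrected form (2.8), can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the packet counters are hypothetical, and the numbers in the usage note are not taken from the simulations.

```python
import math


def estimate_marking_probability(marked_packets: int, received_packets: int) -> float:
    """Ratio of marked packets to all packets received (hypothetical counters)."""
    if received_packets <= 0:
        raise ValueError("received_packets must be positive")
    return marked_packets / received_packets


def estimate_rtt(n_hat: float, capacity: float, p_hat: float,
                 corrected: bool = True) -> float:
    """Estimate the average RTT in seconds.

    corrected=False follows (2.6); corrected=True follows (2.8), which
    subtracts the one-packet window error derived in the text.
    """
    if p_hat <= 0.0:
        raise ValueError("p_hat must be positive")
    window = math.sqrt(2.0 / p_hat)   # equilibrium window from W0^2 * p0 = 2
    if corrected:
        window -= 1.0                 # effective window is one packet smaller
    return n_hat / capacity * window
```

For example, with 100 connections, C = 3750 packets/sec and an estimated marking probability of 0.02, the uncorrected estimate is about 0.267 s and the corrected one is 0.24 s, which shows how much the one-packet correction matters when the average window is small.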

It is still possible to overestimate R, as the model used ignores time-outs and slow-starts in TCP, which may cause slower convergence to the target queue length, $q_0$. To prevent this an upper limit is used for $\hat{R}$, which we call $R_{max}$. It is important to note that R is the average RTT of the connections, and the value for $R_{max}$ should be much


These parameter estimates are used to update the coefficients of the PI controller (a and b). Using Tustin’s transformation [58] the system can be digitized, giving the following expressions,

\[ a = \frac{K}{2}\,(2 + T w_g), \qquad b = \frac{K}{2}\,(2 - T w_g) \tag{2.9} \]

where $K = \frac{(2\hat{N})^2}{(\hat{R}C)^3}\sqrt{1 + \left(2\hat{N}/(\hat{R}C)\right)^2}$, $w_g = 2\hat{N}/(\hat{R}^2 C)$ and T is the sampling period.

The overall algorithm can be summarized as,

    if current-time > next1
        calculate p̂ and R̂
        if R̂ > Rmax
            R̂ = Rmax
        calculate new a and b values
        next1 = current-time + T0
    end

    if current-time > next2
        p ⇐ p + a(q − q0) − b(qold − q0)
        next2 = current-time + T
    end

The algorithm updates the coefficients of the PI controller every T0 time units, while the drop probability is updated every T time units. Here we do not focus on how to choose the T0 value, but in our simulations we found performance to be insensitive to the value chosen.
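The two-timer structure of the algorithm can be sketched in Python as follows. This is a minimal sketch rather than the implementation used in the simulations. The class and attribute names are hypothetical. Clamping p to [0, 1], remembering the previous queue sample, and falling back to Rmax when no packets have been marked are assumptions added to make the sketch runnable. The previous-sample term is subtracted, which is what the Tustin discretization of a PI controller yields for positive a and b.

```python
import math
from dataclasses import dataclass
from typing import Optional


def pi_coefficients(n_hat: float, r_hat: float, capacity: float, period: float):
    """Return the PI coefficients (a, b) of (2.9)."""
    ratio = 2.0 * n_hat / (r_hat * capacity)
    w_g = 2.0 * n_hat / (r_hat ** 2 * capacity)
    k = (2.0 * n_hat) ** 2 / (r_hat * capacity) ** 3 * math.sqrt(1.0 + ratio ** 2)
    return 0.5 * k * (2.0 + period * w_g), 0.5 * k * (2.0 - period * w_g)


@dataclass
class ScpiController:
    capacity: float                 # C, packets/sec
    q_ref: float                    # target queue length q0, packets
    period: float                   # T, probability update interval
    coef_period: float              # T0, coefficient update interval
    r_max: float                    # upper limit Rmax on the RTT estimate
    a: float = 0.0
    b: float = 0.0
    p: float = 0.0
    q_old: Optional[float] = None   # previous queue sample (assumption)
    next_coef: float = 0.0
    next_prob: float = 0.0

    def step(self, now: float, queue_len: float, n_hat: float, p_hat: float) -> float:
        if now > self.next_coef:
            if p_hat > 0.0:
                r_hat = n_hat / self.capacity * (math.sqrt(2.0 / p_hat) - 1.0)  # (2.8)
                r_hat = min(r_hat, self.r_max)
            else:
                r_hat = self.r_max  # assumption: no marks observed yet
            self.a, self.b = pi_coefficients(n_hat, r_hat, self.capacity, self.period)
            self.next_coef = now + self.coef_period
        if now > self.next_prob:
            previous = queue_len if self.q_old is None else self.q_old
            self.p += self.a * (queue_len - self.q_ref) - self.b * (previous - self.q_ref)
            self.p = min(max(self.p, 0.0), 1.0)  # keep p a valid probability
            self.q_old = queue_len
            self.next_prob = now + self.period
        return self.p
```

Calling `step` once per arriving packet (or per timer tick) with the current time, the queue length and the latest estimates of N and p reproduces the behaviour described above: the coefficients change only every T0 seconds, while p is adjusted every T seconds.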

2.2.2 SCPI with Fairness (SCPI-F)


may have significantly less bandwidth than their fair share. The well-known equation for the bandwidth (B) of a TCP connection is given by,

\[ B = \frac{W}{R} = \frac{1}{R}\sqrt{\frac{1}{2p}} \tag{2.10} \]

Another problem with the SCPI scheme is its assumption that all users respond to congestion notifications. However, this might not be the case for connections using UDP, or for malicious users that might have modified their TCP stack. In order to overcome these problems we implement SCPI in a per-flow queueing scheme (SCPI-F) where each connection (or flow) is served in a round-robin fashion. This not only protects TCP connections from unresponsive connections but also provides better fairness. In this section, for simplicity, fixed packet sizes are assumed for all connections, but deficit round-robin (DRR) [59] can be used when this assumption does not hold.

In the previous sections it is shown that some parameters of the network, such as mean RTT and marking/dropping probability, can be estimated at the bottleneck node. In this section the same methods are used to estimate the RTT of individual connections. Then the new estimator for connection i is given by,

\[ \hat{R}_i = \frac{1}{C_i}\left(\sqrt{\frac{2}{p_i}} - 1\right) \tag{2.11} \]

where $C_i$ is the bandwidth used by the connection and $p_i$ is the packet marking probability of connection i. These values can be found at the bottleneck by counting the number of packets (or bytes if the packet size is not fixed) sent and marked from this connection.


From (2.10) it is seen that in order to give all connections their fair share the following condition must hold: $p_1 R_1^2 = p_2 R_2^2 = \dots = p_N R_N^2$. To achieve this, the $p_i$ values should be in the form given below,

\[ p_i = A\,\frac{p}{R_i^2} = p A_i, \quad \forall i = 1, \dots, N \tag{2.12} \]

where A is a normalization constant and $A_i = A/R_i^2$. It is necessary to put an upper limit on $pA_i$, as it can get very large when a connection has a very small RTT with respect to the average, which may cause the connection to “starve”. This limit is set to 0.2, which gave satisfactory results.

Also the $p_i$ values have to satisfy the following condition to keep the total marking probability at p, which was set by the SCPI controller using the average RTT value,

\[ \sum_{i=1}^{N} p_i\,\frac{C}{N} = pC \tag{2.13} \]

The normalization constant, A, can be found by using (2.12) in (2.13), which gives $A = N\left(\sum_{i=1}^{N} 1/R_i^2\right)^{-1}$.

In order for the system to adapt to changes in the network conditions, the $A_i$ values are updated at some interval, T00. This interval is important because now the individual marking probabilities are also estimated, and the number of packets sent or marked from a connection can be too small to estimate its marking probability correctly. Also, in order to prevent the estimate of the marking probability from being zero, $A_i$ is set to unity when the number of marked packets in T00 is zero for connection i.
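The per-flow marking probabilities of (2.12) and (2.13) can be computed as in the following sketch. It assumes the per-connection RTT estimates are already available as a list; the function name is hypothetical, and the rule that sets $A_i$ to unity when no marked packets are observed is left out for brevity.

```python
def per_flow_marking(p: float, rtts: list, cap: float = 0.2) -> list:
    """Return p_i = p * A / R_i**2 for every flow, limited to `cap`.

    p    : aggregate marking probability set by the SCPI controller
    rtts : estimated RTT of each connection, in seconds
    cap  : upper limit on p * A_i (0.2 in the text)
    """
    n = len(rtts)
    inverse_squares = [1.0 / (r * r) for r in rtts]
    a_norm = n / sum(inverse_squares)   # A = N * (sum of 1/R_i^2)^-1
    return [min(p * a_norm * w, cap) for w in inverse_squares]
```

With p = 0.01 and RTTs of 0.1 s and 0.2 s, the sketch returns 0.016 and 0.004: the short-RTT flow is marked four times as often, the product $p_i R_i^2$ is the same for both flows, and the mean of the two values is still 0.01 as (2.13) requires.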

Figure 2.3: The single bottleneck topology used in the simulations.

2.3 Simulation Results

We first investigate the performance of the new schemes and compare it with that of the regular PI controller and RED using the ns-2 [60] network simulation package. We assume N users, each using TCP/Reno utilizing the ECN bit, are connected to a router with link capacity C (packets/sec) and the router is connected to the sinks. All packets are 500 bytes, and the buffer size is 800 packets. Unless otherwise stated, C = 3750 packets/sec, q0 = 100 packets, T = 1/160 seconds and T0 = T00 = 3 seconds. The coefficients for the regular PI controller are set to the values used in [19], $a = 1.822 \times 10^{-5}$ and $b = 1.816 \times 10^{-5}$. The topology used in the simulations is shown in Fig. 2.3. In all simulations a warm-up period (not shown in the plots) is used to let the simulations start from steady-state. Some of the network parameters are taken from previous papers in order to be able to compare their methods with our new methods, but different scenarios are also considered.

2.3.1 SCPI Controller

We first investigate the effect of changing the number of users (N). The propagation delays d1, d2, . . . dN are uniformly distributed between 2 and 72 ms, and da is set to 3

Figure 2.4: Plots of queue length (packets) vs. time (seconds) are shown for load changes (N) at 120 and 220 seconds, for the regular PI controller (top) and the self-configuring PI controller (bottom).

at 220 N is decreased back to 100. The results are shown in Fig. 2.4. From the figure it is seen that the PI controller converges very slowly after an increase or decrease in N, especially after a decrease. On the other hand the self-configuring PI controller handles both situations very well and converges back to the target queue length very quickly.

In the second simulation, we investigate the effect of changing RTT. The propagation delays $d_1, d_2, \dots, d_N$ are uniformly distributed between 40 and 80 ms, and at time 75 all the delays are decreased by 20 ms (for example, due to a link change). The number of connections is 100 and does not change during the simulation. The results are shown in Fig. 2.5. From the figure it is again seen that the self-configuring scheme responds quickly to the change and is able to keep the queue length at the target value. However, the regular PI controller takes longer to reach the target queue length.

Figure 2.5: Plots of queue length vs. time (seconds) are shown for changing RTT, for the regular PI controller (top) and the self-configuring PI controller (bottom).

a large contributor to the variance of the end-to-end delay. We observed this variance for different numbers of connections. As shown in Fig. 2.6 the variance decreases for both controllers with increasing number of connections. However, the self-configuring controller has lower variance in all cases which results in a reduction in end-to-end delay variation.

Controller    SCPI    PI, R = 0.1    PI, R = 0.12
Time (s)      < 3     17             35

Table 2.1: Convergence times for SCPI and PI schemes.

Figure 2.6: Plots of the variance of the queue length (packets) are shown for different numbers of connections, both for regular and self-configuring PI controllers. For each point five independent samples are used, and 95% confidence intervals are also shown.

Figure 2.7: Plots of queue length vs. time (seconds) are shown for SCPI (top), PI (R = 0.1 s) (middle) and PI (R = 0.12 s) (bottom). The RTT is increased at time 90 seconds by 30 ms.

Controller    SCPI    PI, R = 0.1    PI, R = 0.12
Time (s)      < 5     15             30

Table 2.2: Convergence times for SCPI and PI schemes.

In the next simulation, we use the same network setting except the propagation delays are uniformly distributed between 50 and 90 ms and this time we decrease the propagation delays by 30 ms. Again, convergence times and a plot of time versus queue length can be seen in Table 2.2 and Fig. 2.8 respectively. From these we see that only the SCPI can adapt to the new conditions quickly.

2.3.2 SCPI with Fairness (SCPI-F)

Figure 2.8: Plots of queue length vs. time (seconds) are shown for SCPI (top), PI (R = 0.1 s) (middle) and PI (R = 0.12 s) (bottom). The RTT is decreased at time 90 seconds by 30 ms.

controller. To measure the fairness we use Jain’s metric [61],

\[ F \equiv \frac{\left(\sum_{i=1}^{N} B_i\right)^2}{N \sum_{i=1}^{N} B_i^2} \tag{2.14} \]

where $B_i$ is the bandwidth of connection i. From this it is seen that as fairness increases, F approaches unity.
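Jain's index in (2.14) is straightforward to compute from measured per-connection bandwidths. The helper below is a small sketch with a hypothetical name, not code from the simulations.

```python
def jain_index(bandwidths: list) -> float:
    """Jain's fairness index: (sum B_i)^2 / (N * sum B_i^2)."""
    total = sum(bandwidths)
    squares = sum(b * b for b in bandwidths)
    if squares == 0:
        raise ValueError("at least one bandwidth must be non-zero")
    return total * total / (len(bandwidths) * squares)
```

Equal shares give an index of 1, while a single connection taking all the bandwidth among N connections gives 1/N, so the index directly reflects how evenly the link is shared.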

In the first simulation, the network parameters are C = 37500 packets/sec, q0 = 200 packets, T = 1/1600 seconds and T0 = T00 = 3 seconds. The propagation delays are set such that $d_1 = \dots = d_{N/2} = 5$ ms, $d_{N/2+1} = \dots = d_N = 25$ ms and $d_a = 5$ ms. In this

           RED                 SCPI                SCPI-F
N = 100    .836 (±3×10^-3)     .840 (±4×10^-3)     .982 (±4×10^-3)
N = 200    .843 (±4×10^-3)     .843 (±4×10^-3)     .983 (±9×10^-4)
N = 300    .848 (±2×10^-3)     .844 (±2×10^-3)     .964 (±3×10^-3)

Table 2.3: Fairness indexes for RED, SCPI and SCPI-F are shown. The numbers in parentheses show the 95% confidence intervals.

same so we only show the results for SCPI. The fairness indexes for all cases for different numbers of users are shown in Table 2.3, where the numbers in parentheses show the 95% confidence intervals. These results are obtained by using 10 independent simulations.

From this it is seen that both RED and SCPI are biased against long RTTs, while SCPI-F significantly improves the fairness. However, as expected, for SCPI-F fairness decreases as the number of users increases. This is mainly because when the number of users increases, the number of packets serviced for each connection decreases (as $B_i = 1/N$), which causes estimation errors in both $p_i$ and $R_i$. This problem can be solved by increasing T00. In addition to the fairness indexes, Fig. 2.9 shows the average bandwidth of each connection for the N = 100 case during a 100 second interval. From this it is again seen that for both RED and SCPI the first 50 connections (flows with low RTTs) have much higher bandwidth than the rest of the connections, while SCPI-F distributes the bandwidth almost evenly.

In the next simulation we test the response of the system to an abrupt RTT change. In this example we use 100 connections where $d_1, \dots, d_{100}$ are uniformly distributed between 11 and 20 ms and $d_a = 0.1$ ms. After reaching the steady-state, we increase $d_1, \dots, d_{50}$ by 10 ms and decrease the rest of the propagation delays by the same amount.

Figure 2.9: The average bandwidth (bytes/sec) used by each connection is shown for RED (top), SCPI (middle) and SCPI-F (bottom).

Figure 2.10: Fairness indexes vs. time (seconds) are shown for RED (top), SCPI (middle) and SCPI-F (bottom).


Chapter 3

Modified TCP for a Stable and Fair Network


In a different approach [24], the window size is adjusted by the available bandwidth feedback sent from the network. By doing this the senders adjust their sending rate according to the available bandwidth in the network. The same idea is used in [25] for an IP network inter-connected to a rate-controlled network.

In this chapter we do not focus only on the AQM scheme used at the intermediate nodes; we also make some small additions to the protocol used by the end-users. The result is a robust overall system where the queue lengths are stable and the resources are distributed fairly among the competing TCP connections. We achieve this by adding four new fields to the TCP header, which enable intermediate nodes and end-users to exchange information about their respective states. At the intermediate nodes this information is used to adjust the AQM scheme, while the end-users utilize the information sent by the intermediate nodes to adjust the algorithms that determine the amount of window size increase.

We restrict our focus to connections using TCP in a single bottleneck scenario. By single bottleneck we mean that for each connection, most of the dropping/marking of the packets is done by one intermediate node and the other dropping/marking that takes place is negligible. The approach presented in this chapter can be directly applied to a class-based DiffServ-like architecture where the UDP, short-lived, and long-lived TCP connections are classified into separate service queues [63].


3.1 The New Algorithm

In this section we first present a more general form of (2.1), and using this we propose some modifications to TCP and the adaptive PI controller that result in a more robust and fair network.

In TCP’s congestion avoidance algorithm the window size is increased by one packet at every RTT, and when congestion is detected the window size is halved. We now generalize this by adding two parameters, α and β: each RTT the window is increased by α packets, and after receiving a marked packet the window size is multiplied by β, where β < 1. Under these changes (2.1) becomes,

\[ \dot{W}(t) = \frac{\alpha}{R(t)} - \beta\,\frac{W(t)\,W(t - R(t))}{R(t - R(t))}\,p(t - R(t)) \]

\[ \dot{q}(t) = \frac{W(t)}{R(t)}\,N(t) - C \tag{3.1} \]

From this we can easily see that setting α to unity and β to 0.5 yields the regular TCP equations.

At theoretical equilibrium both the average window size and queue size must be constant ($\dot{W} = 0$ and $\dot{q} = 0$). So by setting both equations in (3.1) to zero we obtain,

\[ W_0^2 p_0 = \frac{\alpha}{\beta}, \qquad W_0 = \frac{R_0 C}{N} \tag{3.2} \]

where $(W_0, q_0, p_0)$ is the operating point at equilibrium and $R_0$ is the average RTT ($q_0/C$ + average propagation delay). Then by combining the equations in (3.2) we find $R_0$ by using,

\[ R_0 = \frac{N}{C}\sqrt{\frac{\alpha}{\beta p_0}} \tag{3.3} \]
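As a worked check of (3.3), the two equilibrium conditions in (3.2) can be combined step by step. The numerical values in the last line (N = 100, C = 3750 packets/sec, $p_0$ = 0.02) are illustrative assumptions, not results from the simulations.

```latex
\begin{align*}
W_0 = \frac{R_0 C}{N}
  \;&\Rightarrow\; \left(\frac{R_0 C}{N}\right)^2 p_0 = \frac{\alpha}{\beta}
  \;\Rightarrow\; R_0 = \frac{N}{C}\sqrt{\frac{\alpha}{\beta p_0}}, \\
\alpha = 1,\ \beta = 0.5
  \;&\Rightarrow\; R_0 = \frac{N}{C}\sqrt{\frac{2}{p_0}}
  = \frac{100}{3750}\sqrt{\frac{2}{0.02}} \approx 0.267\ \text{s}.
\end{align*}
```

With the regular TCP values α = 1 and β = 0.5, the expression reduces to the same form as (2.6).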


3.1.1 Adaptive PI Controller

Using the α-increase β-decrease the first equation in (2.2) becomes,

\[ P_1(s) = \frac{\beta R_0 C^2}{N^2}\;\frac{1}{s + \dfrac{2N\alpha}{R_0^2 C}} \tag{3.4} \]

We can use this in (2.2) and digitize it with Tustin’s transformation [58] and get the following equations for the PI controller,

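\[ a = \frac{K}{2}\left(\frac{R_0 W_0}{\alpha} + T\right), \qquad b = \frac{K}{2}\left(\frac{R_0 W_0}{\alpha} - T\right) \tag{3.5} \]

where $K = \frac{w_g (2N)^2}{(R_0 C)^3}\sqrt{1 + \left(2N/(R_0 C)\right)^2}\;\alpha/\beta$, $w_g = 2N/(R_0^2 C)$ and T is the sampling period.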

In a non-adaptive PI controller, $R_0$, C and N are set to worst case values, and using these the PI controller coefficients, a and b, are calculated. Of course, using fixed values may cause the system to become sluggish or unstable when the network conditions (number of users, average RTT, etc.) change. Adaptive controllers attempt to solve this problem by tracking changes in the network [20, 21]. From (3.5) we can see that in order to modify the parameters a and b, we need $R_0$, C and N or at least their estimates (we use $\hat{X}$ to denote the estimate of X). Although these adaptive controllers are shown to work better than the non-adaptive ones, it has been shown in Chapter 2 that estimating the RTT is very important, especially in a multiple bottleneck scenario. In [20, 21] RTT is assumed to be known or set to a fixed value, and only in Chapter 2 is (3.3) used to estimate the average RTT. This can easily be achieved when packet marking by other intermediate nodes is negligible (single-bottleneck), since $p_0$ can be estimated by,

ˆ

p0 =

number of dropped or marked packets

Figure 3.1: Single bottleneck topology used in the simulations. $C_i$ and $C$ are the link capacities, $d_i$ and $d$ are the propagation delays of the links, and $p_m$ is the random marking probability of the packets after leaving the queue.

Plugging this into (3.3) gives an estimate of the average RTT, $\hat{R}_0$. However, if the packets are also marked by some other node this gives an incorrect estimate for $p_0$, which results in a poor estimate of $R_0$.

We now consider a simple example to demonstrate how $\hat{R}_0$ is affected. We use the simple topology shown in Fig. 3.1, where the packets are also randomly marked with probability $p_m$ after leaving the bottleneck, with parameters C = 3750 packets/sec and $R_0$ = 0.22 sec. The results for different values of N are shown in Fig. 3.2. From the figure we can see that with increasing $p_m$, the error in $\hat{R}_0$ also increases, which causes the controller to behave very sluggishly.

In order to solve this problem we add a new field (RTT-field) to the TCP header in which the TCP sources put their own RTT estimates. When a packet arrives at an intermediate node, this RTT value is used to update the average RTT estimate. We implemented this by using a list at the router which keeps the user identification and its corresponding RTT. Parameters a and b are updated every T0 seconds, and then the list is reset. The average RTT, $\hat{R}_0$, and the number of users, $\hat{N}$, can be estimated

Figure 3.2: Estimated RTT ($\hat{R}_0$) for different values of N (200, 300 and 400) and $p_m$.

clearing the list every time but we are not focusing on these approaches here. The link capacity (C) is assumed to be constant and known — this assumption can be relaxed by using the estimation method in [20].
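The per-user RTT list kept at the router for the RTT-field can be sketched as follows. The class and method names are hypothetical, and taking the number of list entries as the estimate of N and their mean as the estimate of the average RTT is an assumption of this sketch.

```python
class RttList:
    """Per-user RTT list kept at the router between coefficient updates."""

    def __init__(self):
        self._rtt_by_user = {}

    def record(self, user_id, rtt: float) -> None:
        """Store the most recent RTT-field value seen for this user."""
        self._rtt_by_user[user_id] = rtt

    def estimates_and_reset(self):
        """Return (n_hat, r_hat) and clear the list, as done at each update.

        n_hat is the number of distinct users seen (assumption) and r_hat
        is the mean of their reported RTTs, or None if no packet arrived.
        """
        n_hat = len(self._rtt_by_user)
        r_hat = sum(self._rtt_by_user.values()) / n_hat if n_hat else None
        self._rtt_by_user.clear()
        return n_hat, r_hat
```

The router would call `record` for every arriving packet and `estimates_and_reset` whenever the coefficients a and b are recalculated; resetting the list lets connections that have stopped sending drop out of the estimates.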

3.1.2 Fairness

We define the fairness criterion as follows. Assuming N connections that have the same bottleneck link, each should have 1/N th of the bandwidth independent of the connection’s respective RTT or the number of hops. Jain’s fairness metric [61] is given by,

\[ F \equiv \frac{\left(\sum_{i=1}^{N} r_i\right)^2}{N \sum_{i=1}^{N} r_i^2} \tag{3.7} \]

where $r_i$ is the bandwidth of connection i. From this we see that as F approaches


is because the bandwidth corresponding to other connections are limited by some other node (their own bottleneck nodes).

From the above discussions we see that the PI controller (adaptive or non-adaptive) does not discriminate between different users and drops all the packets with the same probability, $p_0$. In [14] it has been shown that the bandwidth of a TCP connection i ($B_i$) is given by $W_i/R_i$, where $W_i$ is the average window size and $R_i$ is the average RTT of connection i. Using this and (3.3) we can find the following relation between $B_i$ and $R_i$,

\[ B_i = \frac{1}{R_i}\sqrt{\frac{\alpha/\beta}{p_{0,i}}} \tag{3.8} \]

where $p_{0,i}$ is the probability of marking a packet from connection i. To simplify, we consider a single-bottleneck scenario as shown in Fig. 3.1 ($p_m = 0$), so we can assume that $p_{0,i} = p_0$. From (3.8) we see that the bandwidth of a connection at the bottleneck is inversely proportional to its RTT. This results in an unfair distribution of the bandwidth among the competing connections, as the connections with higher RTTs have lower bandwidths. In order to solve this problem we change the parameter α at each TCP connection ($\alpha_i$) such that all connections are treated fairly. In addition, we can modify the parameter β, but since the window size is stored as an integer, this may cause large rounding errors. Thus, we chose to use α and set β = 0.5 as in regular TCP. So for connection i, the window size is increased by $\alpha_i$/RTT at every received ACK and the window size is halved when a marked ACK is received.

Now let’s assume at a bottleneck node there are N connections with RTTs $R_1, R_2, \dots, R_N$. From (3.8) we see that if every connection sets its corresponding value for $\alpha_i$ as $(R_i/k)^2$, where $k^2 = \frac{1}{N}\sum_{i=1}^{N} R_i^2$, then every user has equal bandwidth. Also, the mean


shown by plugging $\alpha_i$ in (3.8),

\[ B_i = \frac{1}{R_i}\sqrt{\frac{(R_i/k)^2}{2p_0}} = \sqrt{\frac{1}{2k^2 p_0}} \tag{3.9} \]

From this expression, we see that all connections have the same bandwidth independent of their respective RTT. Averaging all the $\alpha_i$ shows that their mean is equal to one,

\[ \frac{1}{N}\sum_{i=1}^{N}\alpha_i = \frac{1}{N}\sum_{i=1}^{N}(R_i/k)^2 = \frac{1}{k^2}\,\frac{1}{N}\sum_{i=1}^{N}R_i^2 = \frac{1}{\frac{1}{N}\sum_{i=1}^{N}R_i^2}\;\frac{1}{N}\sum_{i=1}^{N}R_i^2 = 1 \]
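Both properties, equal bandwidth and a mean α of one, can be checked numerically with a short sketch. The function names and the RTT values in the usage note are illustrative assumptions; the bandwidth expression follows the form of (3.9), and its constant factor does not affect the equality being checked.

```python
import math


def alpha_values(rtts: list) -> list:
    """Return alpha_i = (R_i / k)^2 with k^2 the mean of the squared RTTs."""
    k_squared = sum(r * r for r in rtts) / len(rtts)
    return [r * r / k_squared for r in rtts]


def bandwidth(rtt: float, alpha: float, p0: float) -> float:
    """Bandwidth of one connection in the form used in (3.9)."""
    return math.sqrt(alpha / (2.0 * p0)) / rtt
```

For RTTs of 0.05, 0.1 and 0.2 s the α values are roughly 0.14, 0.57 and 2.29, which average to one, and the three resulting bandwidths coincide for any common $p_0$.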

However, even if all the connections use a common intermediate node this does not mean that they also have the same bottleneck, so the average α can be different than unity, as each connection adjusts its $\alpha_i$ value with respect to the value of k received from its bottleneck node. Thus, another field is needed in the TCP header (Alpha-field) to carry the $\alpha_i$ from the end-users to the intermediate nodes. At the intermediate nodes the average is computed to yield α, which is needed to calculate the PI coefficients a and b. This idea is similar to the one used in the Constant-Rate algorithm, but in the Constant-Rate algorithm a constant is used instead of 1/k. However, the problem with the Constant-Rate algorithm is that if the constant is chosen too small, the window size increases very slowly, which leads to a sluggish system. On the other hand, if the parameter is too large, the window size increases too fast, which results in instability.

In addition, two more fields (Max-p and k-field) are needed to determine the bottleneck for each node and to send the values of k from the bottleneck nodes to the corresponding end-users.

Each intermediate node can calculate an estimate of its marking probability using (3.6), and if kpˆ2


updates Max-p with its own estimate of packet marking probability and k-field with its k value. From (3.9) we can see that the bandwidth of a connection is proportional to the $k\hat{p}_0^2$ value, so the node with the largest value for $k\hat{p}_0^2$ is the bottleneck node. While Max-p is not needed in a single hop scenario, it is very important when there are multiple intermediate nodes, where it forces the end-user to adjust its window size with respect to the fair share at its bottleneck node.

We can summarize the new fields in the TCP header as follows:

• RTT-field: The end-users put their RTT estimates in every outgoing packet and the intermediate nodes use this information to calculate k.

• Alpha-field: The end-users put their respective values for α in every outgoing packet and the intermediate nodes use this value in calculating their new PI coefficients.

• Max-p: The intermediate nodes put their respective values for $k\hat{p}_0^2$ in this field if the value already there is smaller. This field is set to zero when the packet is first generated by the end-user.

• k-field: This field holds k, and is updated only if Max-p is updated.
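The handling of the Max-p and k-field entries at an intermediate node can be sketched as below. The class, field and function names are hypothetical and do not describe a real TCP header layout, and the compared quantity is taken to be k times the squared marking-probability estimate, which is how the expression is read in this sketch.

```python
from dataclasses import dataclass


@dataclass
class FairnessHeader:
    rtt: float            # RTT-field, written by the end-user
    alpha: float          # Alpha-field, written by the end-user
    max_p: float = 0.0    # Max-p, zero when the packet is first generated
    k: float = 0.0        # k-field, updated only together with Max-p


def process_at_node(header: FairnessHeader, node_k: float, node_p_hat: float) -> FairnessHeader:
    """Overwrite Max-p and k-field if this node's value is the larger one."""
    value = node_k * node_p_hat ** 2
    if header.max_p < value:
        header.max_p = value
        header.k = node_k
    return header
```

After the packet has crossed every intermediate node, the k-field holds the k value of the node with the largest entry, which is the node the text treats as the bottleneck for that connection.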

3.2 Simulation Results


except we set $min_{th}$ = 50 and $max_{th}$ = 150 in order to achieve an average queue length around 100 packets. Unless otherwise stated, all connections use TCP/Reno with the ECN bit, all packets are 500 bytes, and the buffer size of the intermediate nodes is 800 packets.

We first consider the single-bottleneck topology shown in Fig. 3.1 to demonstrate the performance of all three schemes both in transient and steady-state. Secondly, we look at a more complicated topology which incorporates multiple intermediate nodes.

3.2.1 Case I

In this case we use the topology shown in Fig. 3.1. In this figure $p_m$ is the probability that other intermediate nodes mark a previously unmarked packet. The network parameters used in the simulations are $C_1 = C_2 = \dots = C_N = C = 15$ Mbps and $p_m = 0$.

The first simulation investigates the steady state behavior of the network. We use two groups of connections where $d_1 = \dots = d_{50} = 5$ ms and $d_{51} = \dots = d_{100} = 70$ ms. We ran the simulation for 500 seconds and calculated the fairness indexes by dividing the simulation time into 40 second intervals and calculating the indexes for each individual interval. Then we used the average of 10 consecutive intervals to calculate the fairness indexes. In this simulation we set all the delays equal to be able to calculate the fairness index. Using (3.7) we find that if regular TCP were used, F would be equal to 0.86. The fairness indexes for all three AQM schemes are presented in Table 3.1. The values in parentheses show the 95% confidence intervals. From this table we see that neither RED nor the regular PI controller reduces the unfairness caused by RTT differences. However, the full PI controller results in almost perfect fairness among all users.

AQM    RED                PI                 Full PI
F      0.8379 (±.0033)    0.8501 (±.0037)    0.9819 (±.001)

Table 3.1: Simulation results for the fairness indexes for Case I.

Figure 3.3: The bandwidths (packets/second) of all connections are shown for RED, the regular PI controller and the full PI controller.

RTTs (connections from 51 to 100) as expected and the full PI controller manages to distribute the bandwidth equally among all users. Fig. 3.4 displays plots of queue length versus time. We see that RED severely oscillates around the desired queue length (100 packets). While the full PI controller and the regular PI controller achieve the desired queue length, the regular PI controller is a little slower. These results are expected as the regular PI controller uses fixed parameters which may cause the controller to be sluggish in different network conditions.


Figure 3.4: Plots of queue length vs. time (seconds) are shown for RED, the regular PI controller and the full PI controller. Only 200 seconds of the simulations are shown for clarity of the figures.

transmitting and restart at time 100. This simulation shows how each AQM scheme responds to sudden changes in the network load. In Fig. 3.5 queue length versus time plots are shown for all three schemes. From this we again see that RED is highly oscillatory. The regular PI controller converges to the desired queue length but is sluggish compared with the full PI controller. As expected, the full PI controller modifies its parameters with changing network conditions, which results in faster convergence. It is also important to note that only the full PI controller is able to distribute the bandwidth equally among all users.

Now we assume that the other nodes can also mark the packets. To simulate this we increase pm to 0.01 between 30 and 60 seconds. We use 100 connections and d1, d2. . . d100 ∼ U[60,80] ms. In this simulation we did not present the RED results

Figure 3.5: The queue length (packets) vs. time (seconds) plots are shown for all three schemes.

time plot for this case. From both of these plots we note that the regular PI controller is slow in responding to the changes while the full PI controller quickly reduces the drop probability by 0.01 and keeps the queue length at the desired level (100 packets).

3.2.2 Case II

In this case we consider the more complex topology shown in Fig. 3.8. We are not claiming that this is a very realistic topology, but we believe that it gives a good idea of how our new algorithms perform under more complicated scenarios. In this system we have three groups of TCP connections containing $N_1$, $N_2$ and $N_3$ connections. The first and third groups are connected to sink1 and the second group uses sink2 as its destination. We set all link capacities, except $C_1$, $C_2$ and $C_3$, to large values such that congestion can only occur at those three links.

Figure 3.6: The queue length versus time (seconds) plots are shown.


Figure 3.8: Multiple-bottleneck topology used in the simulations. The $C_i$ are the link capacities, and the $d_{j,l}$ are the propagation delays of the links.

AQM              RED               PI                Full PI
F (N2 = 50)      0.818 (±.0037)    0.84 (±.0056)     0.958 (±.0067)
F (N2 = 100)     0.909 (±.0018)    0.915 (±.0026)    0.986 (±.0005)

Table 3.2: Simulation results for the fairness indexes for different N2 values.

$N_2 = N_3 = 50$ connections. At time 50 we add 50 new connections to $N_2$ and disconnect these at time 110 seconds. We also used uniformly distributed delays for the connections, $d_{1,1}, \dots, d_{1,50} \sim U[40,60]$ ms, $d_{2,1}, \dots, d_{2,100} \sim U[1,9]$ ms and $d_{3,1}, \dots, d_{3,50} \sim U[20,40]$ ms. We ran the simulation for 180 seconds and the results for links $C_1$ and $C_3$ are shown in Fig. 3.9. From these results we see that RED cannot stabilize either of the queues. The regular PI controller can control the entire system, but it responds slowly to the changes. We again see that the full PI controller is not only able to keep the queues at the desired levels but also has good transient response.

Figure 3.9: The queue lengths of both nodes (node1 on the left and node2 on the right) are shown versus time (seconds).

These results show that the full PI controller results in a more fair distribution of the bandwidth among the connections in both cases.

In the next simulation we use the same topology with the same delays but we decrease the link capacity of $C_2$ and $C_3$ to 15 Mbps. The number of users is $N_1 = 50$, $N_2 = 100$ and $N_3 = 50$. In this configuration the bottleneck link for connections $N_1$ and $N_3$ becomes $C_3$, and for $N_2$ the bottleneck link is $C_2$. The queue length versus time plots are shown in Fig. 3.10. From these plots we see that while RED oscillates around the desired queue length, the regular PI and the full PI controller manage to regulate both queues. Results for Node1 are not shown, as it is not a bottleneck for any of the connections. In Table 3.3 we present the fairness indexes for Node2 (bottleneck for connections $N_1$ and $N_3$). We

AQM    RED               PI                Full PI
F      0.833 (±.0031)    0.863 (±.0035)    0.973 (±.0015)

Table 3.3: Simulation results for the fairness indexes for Case II.


Chapter 4

Congested Node Control Network for TCP Connections

In a network such as the Internet, many flows simultaneously share the limited resources of the network, such as bandwidth and buffer space. From the network point of view the goal is to sustain fairness while having the bandwidth fully utilized at all times. The definition of fairness significantly changes the fair bandwidth share of each flow. The most commonly used fairness definition in networking is max-min fairness [64]. The objective is to choose the fair shares of the flows such that a flow’s bandwidth may be increased only at the expense of another flow that already has less bandwidth. Another approach, called proportional fairness, is introduced in [65]. This approach maximizes an objective function using utility functions under capacity constraints.
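For a single shared link, max-min fair shares can be computed with the standard progressive-filling procedure sketched below. The function name and the per-flow demand values are illustrative assumptions; a network-wide allocation over several links is not covered.

```python
def max_min_shares(capacity: float, demands: list) -> list:
    """Max-min fair allocation of one link's capacity among flows.

    Flows are served in order of increasing demand. A flow whose demand is
    below the current equal share gets its full demand, and the unused part
    is redistributed among the remaining flows.
    """
    shares = [0.0] * len(demands)
    remaining = capacity
    active = sorted(range(len(demands)), key=lambda i: demands[i])
    while active:
        fair = remaining / len(active)
        smallest = active[0]
        if demands[smallest] <= fair:
            shares[smallest] = demands[smallest]
            remaining -= demands[smallest]
            active.pop(0)
        else:
            for i in active:       # every remaining flow is limited by the link
                shares[i] = fair
            break
    return shares
```

With a capacity of 10 and demands of 2, 2.6, 4 and 5, the first two flows receive their full demands and the last two receive 2.7 each. Raising either of the last two shares would require lowering a share that is already no larger, which is exactly the max-min condition.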


fair share they can use. A survey of several methods that use different algorithms to calculate the available bandwidth can be found in [67].

Although the methods mentioned above may work well for sources that can modify their sending rate to a target level, window-based protocols such as TCP do not have that capability and respond only to marked or lost packets (similar to single-bit congestion notification). To solve this problem, a method called Generalized Window Advertising is proposed in [23]. In this method, the buffer space available in the network is sent to the end-users to regulate their window size. However, this method requires large buffer sizes in the network. To address this drawback, another method is proposed in [24], in which the network sends available bandwidth and propagation delay information to the end-users. In a different context, where an IP network connects to an ATM network, a similar technique, Explicit Window Adaptation, is used [25]. An important difference is that this technique does not require any change at the end-user nodes: the network modifies the window advertisement field in the ACK packets. In a similar method, a CSFQ [26] network is considered in which a new header is used to send the available packet information to the end-user [27].

All of the methods mentioned above either require a change at the end-user nodes or have the intermediate nodes modify the TCP packet. However, TCP is widely deployed, and changing the end-user nodes is not realistic. In addition, modifying the advertised window size at an intermediate node is undesirable because it is a layer violation. In this chapter we propose a new scheme based on a PI controller that neither requires any change at the end-users nor causes a layer violation. In this algorithm, the available bandwidth for each flow is found and the packet dropping probability is calculated accordingly at the intermediate nodes.
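To illustrate how a PI controller can turn a queue measurement into a packet dropping probability, the following minimal sketch implements the commonly used discrete-time PI update for AQM. It is not the per-flow scheme proposed in this chapter; the class name, the coefficients a and b, and the reference queue length are hypothetical values chosen only for the example.

```python
class PIDropController:
    """Discrete-time PI controller for AQM (illustrative sketch).

    Update rule at each sampling instant k:
        p[k] = p[k-1] + a * (q[k] - q_ref) - b * (q[k-1] - q_ref)
    with p clamped to [0, 1] so that it is a valid probability.
    """

    def __init__(self, q_ref, a, b):
        self.q_ref = q_ref      # desired queue length (packets)
        self.a = a              # coefficient on the current error (assumed value)
        self.b = b              # coefficient on the previous error (assumed value)
        self.p = 0.0            # current dropping probability
        self.prev_error = 0.0   # queue error at the previous sample

    def update(self, q_len):
        """Take one queue-length sample and return the new drop probability."""
        error = q_len - self.q_ref
        p = self.p + self.a * error - self.b * self.prev_error
        self.p = min(1.0, max(0.0, p))
        self.prev_error = error
        return self.p


# Hypothetical parameters and queue samples (in packets).
ctrl = PIDropController(q_ref=100, a=0.002, b=0.001)
for q in [150, 150, 100]:
    print(q, round(ctrl.update(q), 4))
```

Because a is larger than b, a persistent queue error keeps pushing the probability in the same direction (the integral action), so the probability settles only when the queue sits at the reference length.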


AQM techniques for TCP connections. Although it is shown that these AQM algorithms work well under a wide range of network conditions, they all share a limiting assumption: all packets dropped from a connection must be dropped by only one intermediate node (e.g., switch, router). This assumption limits the use of the PI controller in a realistic network environment, where connections may have mo