Scheduling for QoS Management
Outline
• What is Queue Management and Scheduling?
• Goals of scheduling
• Fairness (Conservation Law/Max-min fair share)
• Various scheduling techniques
• Research directions in scheduling
What is Scheduling?
• Packets from multiple flows compete for the same outgoing link
• Which packets should be given preference?
• How many packets should be transmitted from a flow?
• Simple solution: first come, first served
• Complex solution: Provide QoS guarantees
Scheduling Goals
• Sharing bandwidth
• Fairness to competing flows
• Meeting bandwidth guarantees (max and min)
• Meeting loss guarantees (multiple level)
• Meeting delay guarantees (multiple level)
• Reducing delay variations
Some Definitions
• Flow
– Packets that share the same source and destination addresses, the same source and destination ports, and the same protocol identifier are considered to belong to one flow
• Work conserving scheduler
– It is never idle while any of the queues has a packet waiting to be served
– A non-work-conserving server, by contrast, may remain idle, wasting bandwidth, in order to reduce the burstiness of traffic entering a downstream network element
– A work conserving server obeys the conservation law
Conservation Law
• Definition: The sum of the mean queuing delays received by the set of multiplexed connections, weighted by their share of the link's load, is independent of the scheduling discipline (Kleinrock)
$$\sum_{i=1}^{N} \rho_i q_i = \text{Const}, \qquad \rho_i = \lambda_i x_i$$
• Where:
– $\rho_i$ = mean utilization of flow i
– $\lambda_i$ = mean arrival rate of flow i
– $x_i$ = mean service time of packets from flow i
– $q_i$ = mean wait time of flow i at the scheduler
– $N$ = number of flows
Example of Conservation Law
• A flow can receive lower delay from a work conserving scheduler only at the expense of another flow
• Example: two sources i and j through a router
– i generates 15 Mbps, j generates 45 Mbps
– Outgoing link speed: 155 Mbps
• FCFS scheduling
– Mean queuing delay of 1ms (tw,i) to each
• Another scheduling discipline
– Mean queuing delay of i: tw,i= 0.5ms
– What is the mean queuing delay for j (tw,j)?
– ρi = 15/155, ρj = 45/155
– ρi * 1.0 + ρj * 1.0 = ρi * 0.5 + ρj * tw,j, so tw,j = 1 + 0.5 * (15/45) ≈ 1.17 ms
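The arithmetic in this example can be checked with a short Python sketch. The function name and its parameters are illustrative assumptions, not part of the slides; it simply solves the two-flow conservation law for the unknown delay.

```python
def delay_of_other_flow(rate_i, rate_j, link_rate, fcfs_delay, new_delay_i):
    """Solve rho_i*q_i + rho_j*q_j = Const for q_j (two flows only).

    Rates share one unit (here Mbps); delays share one unit (here ms).
    """
    rho_i = rate_i / link_rate          # share of link load used by flow i
    rho_j = rate_j / link_rate          # share of link load used by flow j
    const = (rho_i + rho_j) * fcfs_delay  # value of the weighted sum under FCFS
    return (const - rho_i * new_delay_i) / rho_j


tw_j = delay_of_other_flow(15, 45, 155, fcfs_delay=1.0, new_delay_i=0.5)
print(f"tw,j = {tw_j:.3f} ms")
```

Halving the delay of flow i costs flow j only about 0.17 ms here, because j carries three times the load of i and the delays are weighted by load.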
Max-Min Fair Share
• Scope
– Fair share allocation of resources
• How it works
– Satisfies the smallest demand first
– Distributes the remaining resources equally among the remaining competing flows
• This scheme guarantees that a flow either gets what it asks for or gets no less than any other competing flow
Max-Min Fair Share
• Let’s assume:
– k competing flows, numbered 1, 2, …, k
– Their demands are x1 ≤ x2 ≤ … ≤ xk, and the total resource is R units
• How it works:
– Initially every flow is offered R/k units
– If R/k > x1, the flow with the lowest demand (1) takes only x1, and R/k – x1 goes back to the resource pool
– The remaining k-1 flows are then each offered R/k + (R/k – x1)/(k-1)
– The process iterates until either:
• All resources are exhausted, or
• All demands have been met
Example of MMFS
• Consider:
– ATM network with outgoing link capacity of 155Mbps
– 5 competing sources with bandwidth demands: (1) 23, (2) 27, (3) 35, (4) 45, (5) 55 Mbps
• Initially
– The resource is divided equally: 155/5 = 31 Mbps each
– The first source (1) needs only 23 Mbps of its 31 Mbps share
• Then
– The remaining 8 Mbps (31-23) are divided equally among the remaining 4 sources (2 Mbps each), giving each 33 Mbps
– The second source needs only 27 Mbps; the residual 6 Mbps (33-27) are divided among the remaining 3 sources (2 Mbps each), giving each 33+2 = 35 Mbps
– The algorithm stops here, since the remaining sources all need 35 Mbps or more and nothing is left to redistribute
– Final allocation: 23, 27, 35, 35, 35 Mbps (total 155 Mbps)
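The iteration described above can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: the function name `max_min_fair_share` is an assumption, and demands and capacity are taken to be in the same unit.

```python
def max_min_fair_share(demands, capacity):
    """Return the max-min fair allocation for each demand (same order as input)."""
    allocation = [0.0] * len(demands)
    # Visit flows from smallest to largest demand.
    unsatisfied = sorted(range(len(demands)), key=lambda i: demands[i])
    remaining = float(capacity)
    while unsatisfied and remaining > 0:
        share = remaining / len(unsatisfied)   # equal split of what is left
        smallest = unsatisfied[0]
        if demands[smallest] <= share:
            # Demand fully met; the unused part stays in the pool.
            allocation[smallest] = float(demands[smallest])
            remaining -= demands[smallest]
            unsatisfied.pop(0)
        else:
            # Nobody left can be fully satisfied: everyone gets the equal share.
            for i in unsatisfied:
                allocation[i] = share
            remaining = 0.0
            unsatisfied = []
    return allocation


print(max_min_fair_share([23, 27, 35, 45, 55], 155))
```

With the demands from the example, the sketch yields 23, 27, 35, 35 and 35 Mbps, matching the step-by-step result.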
Scheduling Disciplines
• First come first serve (FCFS)
• Priority (PQ)
• Round robin (RR) / Weighted round robin (WRR)
• Deficit round robin (DRR)
• Weighted fair queuing (WFQ)
• Class based queuing (CBQ)
First Come First Serve
• Packets enqueued into a common buffer
• Server serves packet from front of queue
• No fair sharing of bandwidth
– A greedy source can occupy most of the queue and cause delay to other flows using the same queue
– TCP flows (congestion sensitive) are penalized relative to UDP flows (no congestion control)
• No flow isolation
• No priority or QoS guarantee
FCFS example
(a) No buffer occupied
(b) Buffer full, packet dropped
Priority Queuing
• Multiple queues with priority 0 to n-1
• Priority 0 served first
• Priority i served only if 0 to i-1 empty
• Highest priority – lowest delay/loss, highest bandwidth
• Possible starvation of lower class
• FCFS can be used in each queue
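A minimal Python sketch of the strict priority rule above may help. The class `PriorityScheduler` and its method names are assumptions made for illustration; each priority level is an FCFS queue, and level 0 is always drained first.

```python
from collections import deque


class PriorityScheduler:
    """Strict priority queuing: level 0 is highest, each level is FCFS."""

    def __init__(self, levels):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, priority, packet):
        self.queues[priority].append(packet)

    def dequeue(self):
        # Serve level i only if levels 0..i-1 are all empty.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None  # nothing waiting


sched = PriorityScheduler(levels=3)
sched.enqueue(2, "low-1")
sched.enqueue(0, "high-1")
sched.enqueue(1, "mid-1")
sched.enqueue(0, "high-2")
order = [sched.dequeue() for _ in range(4)]
print(order)
```

As long as level 0 keeps receiving packets, `dequeue` never reaches the lower levels, which is the starvation risk noted above.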
Priority Queue example
The number of queues depends on the supported priority levels of the protocol
Generalized Processor Sharing
• Ideal work conserving scheme
• Flows kept in separate queues
• Serve infinitesimal amount of data from each queue
– Serve all active queues in finite time
• Weight can be associated with each queue
– Each queue is served in proportion to its weight
• Achieves max-min fair share
– If there are K active flows with equal weights, each one gets a 1/K share of the resource
– Also achieves weighted max-min fair share
Generalized Processor Sharing
• In GPS terminology, a connection is called backlogged when it has data present in its queue
• Let's assume that there are K flows to be served by a server implementing GPS with weights w(1), ..., w(K). The service rate of the ith flow in the interval [τ, t] is denoted R(i, τ, t). For any flow i that is backlogged in the interval [τ, t] and for any other flow j, the following holds:
$$\frac{R(i, \tau, t)}{R(j, \tau, t)} \geq \frac{w(i)}{w(j)}$$
Generalized Processor Sharing
• Max-min fair share is achieved by allocating the residual resource so that it is shared by the backlogged connections in proportion to their weights
• GPS is an idealized scheme that cannot be implemented exactly, because it serves an infinitesimal amount of data at a time
• Variations can be implemented in real systems
Round Robin
• Flows kept in separate queues
• Serve one packet from each non-empty queue
– Can be seen as GPS where one packet replaces infinitesimal data
• Fair share
– Load balancing among flows
– No advantage to being greedy
• What if packet size is variable? No bandwidth guarantee
– A queue with large packets gains more bandwidth (the server spends longer on it)
– Differential treatment, or allocation of a specific bandwidth to specific queues, is not possible
Weighted Round Robin
• Modification to RR
– Allows for variable-length packets
– Serves n packets from a queue in each round, where n depends on the queue's weight
– n is adjusted to give the queue a specific fraction of the link share
• Assume 3 ATM sources (small cell size) with weights 0.75, 1.0 and 1.5. If these weights are normalised to integer values, the sources are served 3, 4 and 6 cells respectively in each round
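The normalisation of weights to integer cell counts can be sketched as below. The helper name `cells_per_round` is hypothetical, and the sketch assumes equal-size cells (as with ATM), so it does not apply the per-packet-size adjustment that variable-length packets need.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def cells_per_round(weights):
    """Scale fractional weights to the smallest integers with the same ratios."""
    # Going through str() keeps decimal weights such as 0.75 exact (3/4).
    fracs = [Fraction(str(w)) for w in weights]
    common = reduce(lcm, (f.denominator for f in fracs), 1)
    counts = [int(f * common) for f in fracs]
    divisor = reduce(gcd, counts)
    return [c // divisor for c in counts]


print(cells_per_round([0.75, 1.0, 1.5]))
```

For the three ATM sources this gives 3, 4 and 6 cells per round, as stated on the slide.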
Weighted Round Robin
• Needs to know packet size a priori
– Large packets receive more than allocated weight
– Need to adjust weight depending on the mean packet size
• Example:
– Serial link with 500-byte MTU, Ethernet with 1500-byte MTU and FDDI with 4500-byte MTU
– Weights: 0.33, 0.66, 1.0
– Weights normalized by packet size: 6, 4, 2 packets per round
• Fairness problem at small time scale
– In the example above, WRR is not fair on a time scale shorter than a 9000-byte transmission time (72 µs at 1000 Mbps)
Deficit Round Robin
• Improves WRR
– Serves variable length packets
– No need to know packet size a priori
• How it works
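– Each queue has its own quantum and its own deficit_counter
– On each visit to a queue, add the quantum to its deficit_counter
– If the packet at the head of the queue is no larger than the deficit_counter, serve it and subtract its size from the deficit_counter; otherwise leave it for a later round and keep the accumulated deficit_counter
– If the queue has no more outstanding packets, reset its deficit_counter (Why? So that an idle queue cannot accumulate credit over a long period)
– Set quantum to minimum MTU of all incoming links
• Serves several packets in one visit if their total size fits within the deficit_counter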
DRR Example 1
Quantum = 500 for all queues (may have different quantum for each queue)
DRR Example 2
Packet 1 of q1 gets served (500 ≥ 500) => deficit counter = 0
Packet 1 of q3 gets served (500 ≥ 200) => deficit counter = 300
Packet 1 of q4 gets served (500 ≥ 400) => deficit counter = 0 (reset: no outstanding packets, so credits do not accumulate over a long period)
DRR Example 3
Packet 2 of q1 not served (500 < 700) => deficit counter = 500
Packet 2 of q3 gets served (300 + 500 > 500) => deficit counter = 300; a 400-byte packet (packet 3) is still outstanding
DRR Example 4
Packet 2 of q1 gets served (500 + 500 > 700) => deficit counter = 0
Packet 3 of q3 gets served (300 + 500 > 400) => deficit counter = 0
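The three rounds in DRR Examples 2 to 4 can be replayed with a small Python sketch. The queue contents below are an assumption reconstructed from the packet sizes quoted in the examples (q2 is taken to be empty, since the examples never mention it), and the function name `deficit_round_robin` is illustrative.

```python
from collections import deque


def deficit_round_robin(queues, quantum):
    """Return the service order as (round, queue name, packet size) tuples.

    queues: dict mapping a queue name to a list of packet sizes in bytes.
    quantum: bytes of credit added on each visit to a backlogged queue (> 0).
    """
    pending = {name: deque(sizes) for name, sizes in queues.items()}
    deficit = {name: 0 for name in queues}
    served = []
    round_no = 0
    while any(pending.values()):
        round_no += 1
        for name, queue in pending.items():
            if not queue:
                continue  # empty queues earn no credit
            deficit[name] += quantum
            # Serve head-of-line packets while the credit covers them.
            while queue and queue[0] <= deficit[name]:
                size = queue.popleft()
                deficit[name] -= size
                served.append((round_no, name, size))
            if not queue:
                deficit[name] = 0  # reset: no credit hoarding while idle
    return served


# Assumed queue contents, inferred from the sizes quoted in the examples.
example = {"q1": [500, 700], "q2": [], "q3": [200, 500, 400], "q4": [400]}
for entry in deficit_round_robin(example, quantum=500):
    print(entry)
```

The 700-byte packet of q1 waits through round 2 and is served in round 3, once its deficit counter has grown to 1000.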
Deficit Round Robin
• Set the quantum to serve at least an MTU of the link
– Ex 1500 bytes for Ethernet
• Fairness problem at smaller time scale
– Shorter than a packet time
Weighted Fair Queuing
• Packets are tagged with a value identifying the time at which the last bit of the packet would be transmitted in a simulated GPS system
• Packet with lowest tag value transmitted by scheduler
• Uses complex finish time calculation
• Hard to implement with variable packet size
• QoS guarantees possible (a flow gets bandwidth in proportion to its weight)
$$\text{Min throughput of flow } i = \frac{R \, w(i)}{\sum_j w(j)}$$
WFQ Delay Bounds
• Delay can be bounded if flows can be policed (token bucket)
• Flows regulated by token bucket are put in different queues
• Each queue has assigned weight
• With token bucket policing, assume that initially the token bucket is full and a burst of b_i packets arrives for a flow of class i. The last packet to complete service will suffer a maximum delay of d_max given by
$$d_{max} = \frac{b_i}{R \, w(i) / \sum_j w(j)}$$
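The minimum throughput and the delay bound can be evaluated with a short Python sketch. The function names are hypothetical, R is assumed to be the link service rate, and the numbers in the usage lines are made-up illustration values, not figures from the slides. The burst must be expressed in the same data unit as the rate (for example Mbit with Mbps, or packets with packets per second).

```python
def wfq_min_throughput(link_rate, weights, i):
    """Guaranteed rate of flow i: R * w(i) / sum_j w(j)."""
    return link_rate * weights[i] / sum(weights)


def wfq_max_delay(burst, link_rate, weights, i):
    """Worst-case delay of the last packet of a burst: b_i / guaranteed rate."""
    return burst / wfq_min_throughput(link_rate, weights, i)


# Hypothetical values: 155 Mbps link, three queues with weights 1, 2, 2.
weights = [1, 2, 2]
rate_0 = wfq_min_throughput(155, weights, 0)   # Mbps guaranteed to flow 0
delay_0 = wfq_max_delay(0.31, 155, weights, 0)  # seconds for a 0.31 Mbit burst
print(rate_0, delay_0)
```

Under these assumed values flow 0 is guaranteed one fifth of the link, and the bound scales linearly with the burst size allowed by the token bucket.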
WFQ Delay with Token bucket
WFQ - Finish Time Calculation
• The following equation shows the finish time calculation, where R(t) is called the round number, P_c^m is the time required to transmit the mth packet from the cth connection, and w(c) is the weight of connection c.
$$F_c^m = \max\left(F_c^{m-1}, R(t)\right) + \frac{P_c^m}{w(c)}$$
WFQ - Round Number
• This is the number of rounds a bit-by-bit round robin scheduler (used in place of GPS's non-implementable infinitesimal data) has completed at a given time
• The round number is a variable whose rate of increase is inversely proportional to the number of active queues to be served. The more queues to serve, the longer a round takes to complete
WFQ – Example
• Assume:
– Three equally weighted connections: i, j, k
– Link service rate = 1 unit/s
• Packet details:
– P1: arrival time 0, connection i, size = 2 units
– P2: arrival time 1, connection j, size = 2 units
– P3: arrival time 2, connection i, size = 3 units
– P4: arrival time 2, connection j, size = 1 unit
– P5: arrival time 3, connection k, size = 4 units
– P6: arrival time 4, connection i, size = 1 unit
WFQ – Example
• Finish time calculation
– P1: F1i = max(0, 0.0) + 2 = 2, where R(0) = 0, first packet of i
– P2: F1j = max(0, 1.0) + 2 = 3, where R(1) = 1, first packet of j
– P3: F2i = max(2, 1.5) + 3 = 5, where R(2) = 1.5, F1i = 2
– P4: F2j = max(3, 1.5) + 1 = 4, where R(2) = 1.5, F1j = 3
– P5: F1k = max(0, 2.0) + 4 = 6, where R(3) = 2.0, first packet of k
– P6: F3i = max(5, 2.33) + 1 = 6, where R(4) = 2.33, F2i = 5
• Rate of round number is controlled by number of active connections
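The finish numbers above can be reproduced with the following Python sketch. It is a simplified model under stated assumptions, not a production WFQ: the link rate is fixed at 1 unit/s so that packet size equals transmission time, a connection counts as active while its latest finish number exceeds the round number, and the round number is frozen while no connection is active. All function names are illustrative.

```python
def advance_round(round_no, t_from, t_to, last_finish, weights):
    """Advance the round number from time t_from to time t_to."""
    t = t_from
    while t < t_to:
        active = {c: f for c, f in last_finish.items() if f > round_no}
        if not active:
            return round_no  # assumption: round number frozen while idle
        slope = 1.0 / sum(weights[c] for c in active)  # link rate = 1 unit/s
        next_finish = min(active.values())
        t_hit = t + (next_finish - round_no) / slope
        if t_hit >= t_to:
            return round_no + (t_to - t) * slope
        # A connection goes inactive before t_to: step there, then recompute.
        round_no, t = next_finish, t_hit
    return round_no


def wfq_finish_numbers(packets, weights):
    """packets: (arrival time, connection, size). Returns (conn, R, F) tuples."""
    last_finish = {}
    round_no, clock = 0.0, 0.0
    results = []
    for arrival, conn, size in sorted(packets, key=lambda p: p[0]):
        round_no = advance_round(round_no, clock, arrival, last_finish, weights)
        clock = arrival
        finish = max(last_finish.get(conn, 0.0), round_no) + size / weights[conn]
        last_finish[conn] = finish
        results.append((conn, round_no, finish))
    return results


packets = [(0, "i", 2), (1, "j", 2), (2, "i", 3), (2, "j", 1), (3, "k", 4), (4, "i", 1)]
weights = {"i": 1, "j": 1, "k": 1}
for conn, r, f in wfq_finish_numbers(packets, weights):
    print(f"{conn}: R = {r:.2f}, F = {f:.2f}")
```

The inner loop of `advance_round` recomputes the active set each time a connection drops out, which is a small instance of the per-arrival work that makes the round number costly to maintain.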
WFQ – Round Number Calculation
• Iterated deletion problem
– Inaccurate estimation of active connections
– Done for every packet arrival