SIP Server Overload Control:
Design and Evaluation
Charles Shen and Henning Schulzrinne, Columbia University
Erich Nahum
Session Initiation Protocol (SIP)
[Figure: SIP call flow between UA, Proxy, Proxy and UA: INVITE, 100 Trying, 180 Ringing, 200 OK, ACK, Media, BYE]
§ Application layer signaling protocol for managing sessions in the Internet
§ Runs on top of common transport layers, e.g., UDP, TCP and SCTP
§ Typical usage: voice-over-IP call setup, instant messaging, presence, conferencing
SIP Server Overload Problem
§ Many causes of SIP server overload
Natural disaster and emergency-induced call volume (earthquake)
Predictable special events (Mother’s Day)
Flash crowds: American Idol, “Free tickets to the third caller”
Denial of service attacks
§ Simply dropping requests on overload?
Simple message dropping induces more messages due to retransmission (especially for SIP over UDP)
E.g., Timer A for INVITE retransmission starts at T1 = 500 ms and doubles after each retransmission
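The retransmission amplification can be made concrete with a short sketch. It assumes the RFC 3261 defaults for an INVITE client transaction over UDP: Timer A starts at T1 = 500 ms and doubles on every firing, and Timer B ends the transaction at 64 × T1. The function name and its parameters are illustrative only.

```python
def invite_send_times(t1=0.5, timer_b_factor=64):
    """Times (in seconds) at which an unanswered INVITE is sent over UDP.

    Assumes RFC 3261 defaults: Timer A starts at T1 and doubles each time
    it fires; Timer B (64 * T1) terminates the transaction.
    """
    deadline = timer_b_factor * t1   # Timer B: transaction timeout
    times = [0.0]                    # original transmission
    interval = t1                    # current Timer A value
    t = interval
    while t < deadline:
        times.append(t)              # retransmission when Timer A fires
        interval *= 2                # exponential backoff
        t += interval
    return times


schedule = invite_send_times()
print(schedule)       # [0.0, 0.5, 1.5, 3.5, 7.5, 15.5, 31.5]
print(len(schedule))  # 7
```

Under these assumptions, a single dropped INVITE that is never answered is sent up to seven times before the transaction times out at 32 s, which is why silently dropping messages increases the load on an already overloaded server.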
SIP Server Overload Problem (Cont.)
§ Rejecting excessive requests upon overload?
SIP 503 (Service Unavailable) response code used to reject individual requests
– overall sending rate is not reduced
– rejecting costs CPU cycles comparable to accepting requests!
503 (Service Unavailable) with Retry-After?
– client completely shut off during the period specified
– switching the rate on/off may cause oscillation
§ Trying an alternative server?
Feedback-based SIP Overload Control
SIP Overload Feedback Control Design Considerations
Requirements
§ Approaching ideal performance
§ Few “tweak” control parameters
Design decisions
§ SIP session as basic control unit
§ Characterizing SIP session
check number of INVITEs accepted
§ Dynamic session backlog estimation
count both INVITEs and non-INVITEs for current session backlog
§ Active source estimation
directly tracking each currently active SE sending incoming load
[Figure: goodput vs. load, with the ideal goodput curve]
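Window-based Feedback Control Algorithms
Win-disc
§ Window size decrement: upon receiving session request (INVITE)
§ Window size adjustment algorithm: every control interval
§ Tuning parameters: budget queuing delay, control interval, measurement interval
Win-cont
§ Window size decrement: upon receiving session request (INVITE)
§ Window size adjustment algorithm: every message arrival
§ Tuning parameters: budget queuing delay, measurement interval
Win-auto
§ Window size decrement: upon receiving session request (INVITE)
§ Window size adjustment algorithm: after processing new INVITE request
§ Tuning parameters: N/A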
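A minimal sketch of the window bookkeeping at the receiving entity (RE), assuming the self-clocking behavior listed for win-auto: a slot is consumed when an INVITE is received, and a slot is returned after a new INVITE has been processed. The class name, the method names and the one-slot increment are assumptions made for illustration, not the authors' implementation.

```python
class AutoWindow:
    """Per-SE window kept at the RE (hypothetical helper, not from the slides)."""

    def __init__(self, initial_slots):
        self.slots = initial_slots

    def on_invite_received(self):
        """Consume one slot; return False if no slot is left.

        An SE that respects the window would not send in that state.
        """
        if self.slots <= 0:
            return False
        self.slots -= 1
        return True

    def on_invite_processed(self):
        """Return one slot once the RE has processed an INVITE (assumed step of 1)."""
        self.slots += 1


win = AutoWindow(initial_slots=2)
accepted = [win.on_invite_received() for _ in range(3)]
win.on_invite_processed()
accepted.append(win.on_invite_received())
print(accepted)  # [True, True, False, True]
```

Because slots come back only as fast as the RE finishes work, the sending rate follows the processing rate without any tuning parameter, which matches the N/A entry for win-auto.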
Rate-based Feedback Control Algorithms
Rate-occ
§ Rate adjustment algorithm (every control interval): request acceptance ratio
§ Tuning parameters: budget CPU occupancy, control interval, measurement interval, OCC tuning parameters
Rate-abs
§ Rate adjustment algorithm (every control interval): acceptable rate
§ Tuning parameters: budget queuing delay, control interval, measurement interval
Simulation Assumptions and Metrics
§ RFC 3261-compatible simulator built on OPNET
§ Exponential call inter-arrival times
§ Standard seven-message call flow
§ 72 cps RE service capacity; 3000 cps rejection rate
§ UAs and SEs have infinite capacity
§ UDP transport, no link delay or loss
§ Piggybacked feedback
§ Goodput = # of calls whose INVITE-to-ACK delay is below 10 s
§ Delay = time from INVITE sent to ACK (200 OK) received
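The goodput metric can be sketched as follows. The 10 s bound comes from the slide. Dividing by the RE capacity, so that full utilization reads as goodput 1, is an assumption based on the "unit goodput" wording and the 0 to 1 axis of the plots. The function name and arguments are illustrative.

```python
def normalized_goodput(call_delays, duration_s, capacity_cps, max_delay_s=10.0):
    """Successful calls per second, divided by the RE capacity.

    call_delays: INVITE-to-ACK delay in seconds for each attempted call,
                 or None if the call never completed.
    Normalizing by capacity_cps is an assumption, not stated on the slide.
    """
    successful = sum(
        1 for d in call_delays if d is not None and d < max_delay_s
    )
    return successful / duration_s / capacity_cps


# Five attempted calls in one second on a (hypothetical) 5 cps server:
delays = [0.2, 0.5, None, 12.0, 9.9]
g = normalized_goodput(delays, duration_s=1.0, capacity_cps=5)
print(g)
```

In this toy input three calls complete within 10 s, one never completes and one takes 12 s, so only three of the five attempts count toward goodput.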
No Feedback Control
Simple drop
§ message dropped when queue full
Threshold rejection
§ queue length configured with a high and a low threshold value
high threshold: new INVITE rejected but other messages processed
low threshold: new INVITE processing restored
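The two-threshold behavior can be sketched as a queue with hysteresis. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, queue length is counted in messages, and a rejected INVITE is only reported as "rejected" (a real server would answer it with a 503 response).

```python
from collections import deque


class ThresholdQueue:
    """Message queue that rejects new INVITEs between a high and a low mark."""

    def __init__(self, high, low):
        assert low < high
        self.high, self.low = high, low
        self.queue = deque()
        self.rejecting = False

    def _update_state(self):
        n = len(self.queue)
        if n >= self.high:
            self.rejecting = True    # high threshold reached
        elif n <= self.low:
            self.rejecting = False   # drained to low threshold

    def on_arrival(self, msg, is_new_invite):
        self._update_state()
        if is_new_invite and self.rejecting:
            return "rejected"        # other messages are still processed
        self.queue.append(msg)
        return "queued"

    def on_service(self):
        msg = self.queue.popleft() if self.queue else None
        self._update_state()
        return msg


q = ThresholdQueue(high=3, low=1)
results = [q.on_arrival(f"INVITE-{i}", True) for i in range(5)]
ack_result = q.on_arrival("ACK", False)      # non-INVITE still accepted
q.on_service(); q.on_service()               # queue length 4 -> 2
still_rejected = q.on_arrival("INVITE-5", True)
q.on_service()                               # queue length 2 -> 1 (low mark)
restored = q.on_arrival("INVITE-6", True)
print(results, ack_result, still_rejected, restored)
```

The gap between the two thresholds keeps the server from flipping between accepting and rejecting on every message.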
Similar congestion collapse, but for different reasons:
§ Simple Drop
only 1/3 of INVITEs arrive at the callee
all 180 Ringing and most of the 200 OK also dropped due to queue overflow
Threshold Rejection
[Figure: goodput vs. load]
Sensitivity to Budget Queuing Delay and Control Interval
§ Small queuing delay (< ½ T1 timer) avoids timeout and gives best results
§ Example results for win-disc
delay budget (DB) <= 200 ms, control interval (CI) = 200 ms
goodput degraded by 25% for DB = 500 ms
§ Similar results for win-cont and rate-abs
§ Sensitivity to control interval
smaller CI is better
§ Example results for win-disc
at DB = 200 ms, CI <= 200 ms is sufficient to achieve unit goodput in our scenario
[Figure: goodput vs. load for DB = 200 ms, 300 ms, 400 ms, 500 ms and 600 ms]
[Figure: goodput vs. load for CI = 200 ms, 500 ms and 1 s]
Impact of Control Interval across Algorithms
§ Comparing CI for win-disc, rate-abs and rate-occ
at DB = 200 ms
§ Both win-disc and rate-abs close to unit goodput except CI = 1 s with heavy load
§ win-disc more sensitive to CI than rate-abs
§ rate-occ not as good as the other two
[Figures: goodput of win-disc, rate-abs and rate-occ at CI = 14 ms, 100 ms, 200 ms and 1 s]
Best Performance Comparison across Algorithms
All except rate-occ reach unit goodput
§ no retransmissions
§ server always busy processing messages
§ each single message part of a successful session
rate-occ < unit goodput
§ artificial 85% CPU limit
§ occupancy too indirect
§ extremely small CI improves performance at heavy load but incurs problems
[Figure: goodput vs. load for rate-occ1, rate-occ2, win-cont, win-disc and rate-abs]
Fairness
User-centric fairness
§ equal success rate for each individual user
§ implementation: divide RE capacity proportionally to original SE load arrivals
§ applicability example: “Free ticket to the third caller”
Provider-centric fairness
§ each provider (SE) gets the same aggregate share of total capacity
§ implementation: divide RE capacity equally among SEs
§ applicability example: equal-share SLA
Customized fairness
§ any allocation as pre-specified by SLA, …
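The two built-in fairness notions differ only in how the RE capacity is split among upstream SEs. The sketch below illustrates the arithmetic. The function names, the SE names and the load values are hypothetical, and the equal split does not redistribute capacity that an SE leaves unused.

```python
def user_centric(capacity, loads):
    """Split capacity in proportion to each SE's original offered load."""
    total = sum(loads.values())
    if total == 0:
        return {se: 0.0 for se in loads}
    return {se: capacity * load / total for se, load in loads.items()}


def provider_centric(capacity, loads):
    """Split capacity equally among SEs, regardless of offered load."""
    share = capacity / len(loads)
    return {se: share for se in loads}


# Hypothetical offered loads in calls per second, with a 72 cps RE.
loads = {"se1": 36, "se2": 72, "se3": 108}
user_alloc = user_centric(72, loads)
provider_alloc = provider_centric(72, loads)
print(user_alloc)      # {'se1': 12.0, 'se2': 24.0, 'se3': 36.0}
print(provider_alloc)  # {'se1': 24.0, 'se2': 24.0, 'se3': 24.0}
```

With the proportional split every caller sees the same success rate (one third in this example), whichever SE the call enters through. With the equal split each SE gets the same 24 cps, so callers behind the lightly loaded SE succeed more often.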
Dynamic Load Performance with Provider Centric Fairness
§ Realistic server-to-server overload situations are likely short periods of bulk load accompanied by source arrivals or departures
§ Example result using rate-abs algorithm
§ Each upstream SE gets a close-to-equal share of RE capacity
§ Fast dynamic transition
[Figure: load vs. time (sec) for ua1, ua2 and ua3]
[Figure: goodput vs. time (sec) for ua1, ua2 and ua3]
User Centric Fairness
§ Double feed architecture
Provide incoming load
§ Example using win-cont algorithm
§ Upstream SEs share RE capacity proportionally
§ Fast dynamic transition
[Figure: load vs. time (sec) for ua1, ua2 and ua3]
[Figure: goodput for ua1, ua2 and ua3]
Win-auto
§ Source arrival transition time could be noticeably longer
§ Hard to enforce explicit fairness
no processing intervention
§ Still achieves aggregate unit goodput
[Figure: load vs. time (sec) for ua1, ua2 and ua3]
[Figure: goodput vs. time (sec) for ua1, ua2 and ua3]