Internet Protocols
Fall 2005
Lectures 21-22
HTTP
Outline
• HTTP
• HTTP and TCP Interactions
• Web Caching
– Protocol Mechanisms
– Reverse and Forward Caching
• CDNs
Web and HTTP
First some jargon
• A Web page consists of objects
• An object can be an HTML file, JPEG image, Java applet, audio file, …
• A Web page consists of a base HTML file, which includes several referenced objects
• Each object is addressable by a URL
• Example URL: www.someschool.edu/someDept/pic.gif
HTTP overview
HTTP: hypertext transfer protocol
• Web’s application layer protocol
• client/server model
– client: browser that requests, receives, “displays” Web objects
– server: Web server sends objects in response to requests
• HTTP 1.0: RFC 1945
[Figure: a PC running Explorer and a Mac each exchange HTTP requests and HTTP responses with a server running the Apache Web server]
HTTP overview (continued)
Uses TCP:
• client initiates TCP connection (creates socket) to server, port 80
• server accepts TCP connection from client
• HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
• TCP connection closed
HTTP is “stateless”
• server maintains no information about past client requests
Protocols that maintain “state” are complex!
• past history (state) must be maintained
• if server/client crashes, their views of “state” may be inconsistent, must be reconciled
HTTP request message
• two types of HTTP messages: request, response
• HTTP request message:
– ASCII (human-readable format)
– request line (GET, POST, HEAD commands):
    GET /somedir/page.html HTTP/1.1
– header lines:
    Host: www.someschool.edu
    User-agent: Mozilla/4.0
    Connection: close
    Accept-language:fr
– carriage return, line feed (an empty line)
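The layout above (request line, header lines, then an empty line) can be sketched in a few lines of Python. This is only an illustration of the message format, not a full HTTP implementation; the helper names build_request and parse_request are made up for this example.

```python
def build_request(method, path, headers, version="HTTP/1.1"):
    """Assemble a request: request line, header lines, then an empty line."""
    lines = [f"{method} {path} {version}"]
    lines += [f"{name}: {value}" for name, value in headers]
    # Every line ends in CR LF; the extra CR LF marks the end of the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_request(message):
    """Split a request back into (method, path, version, headers)."""
    head = message.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


request = build_request(
    "GET",
    "/somedir/page.html",
    [
        ("Host", "www.someschool.edu"),
        ("User-agent", "Mozilla/4.0"),
        ("Connection", "close"),
        ("Accept-language", "fr"),
    ],
)
method, path, version, headers = parse_request(request)
```

Header names are lower-cased while parsing because HTTP header names are case-insensitive.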
Method types
HTTP/1.0
• GET
• POST
– Appends information
• HEAD
– asks server to leave requested object out of response
– Used to obtain metadata about response
HTTP/1.1
• GET, POST, HEAD
• PUT
– uploads file in entity body to path specified in URL field
• DELETE
– deletes file specified in the URL field
Uploading form input
Post method:
• Web page often includes form input
• Input is uploaded to server in entity body
URL method:
• Uses GET method
• Input is uploaded in URL field of request line:
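As a rough illustration of the two upload methods, the same form input can be placed either in the URL of a GET request line or in the entity body of a POST. The form fields and the /animalsearch path below are hypothetical; urlencode is the standard-library helper for this encoding.

```python
from urllib.parse import urlencode

# Hypothetical form input entered by the user.
form = {"q": "monkeys", "fruit": "banana"}
encoded = urlencode(form)

# URL method: the input travels in the URL field of the request line.
get_request_line = f"GET /animalsearch?{encoded} HTTP/1.1"

# Post method: the same input travels in the entity body instead.
post_request_line = "POST /animalsearch HTTP/1.1"
post_body = encoded
post_headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "Content-Length": str(len(post_body)),
}
```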
HTTP response message
status line (protocol, status code, status phrase):
    HTTP/1.1 200 OK
header lines:
    Connection: close
    Date: Thu, 06 Aug 1998 12:00:15 GMT
    Server: Apache/1.3.0 (Unix)
    Last-Modified: Mon, 22 Jun 1998 …...
    Content-Length: 6821
    Content-Type: text/html
data, e.g., requested HTML file:
    data data data data data ...
HTTP response status codes
In first line in server->client response message.
A few sample codes:
200 OK
– request succeeded, requested object later in this message
301 Moved Permanently
– requested object moved, new location specified later in this message (Location:)
400 Bad Request
– request message not understood by server
404 Not Found
– requested document not found on this server
505 HTTP Version Not Supported
HTTP connections
Non-persistent HTTP
• At most one object is sent over a TCP connection.
• HTTP/1.0 uses non-persistent HTTP
Persistent HTTP
• Multiple objects can be sent over a single TCP connection between client and server.
• HTTP/1.1 uses persistent connections by default
Non-persistent HTTP
Suppose user enters URL www.someSchool.edu/someDepartment/home.index (contains text, references to 10 jpeg objects)
1a. HTTP client initiates TCP connection to HTTP server (process) at www.someSchool.edu on port 80
1b. HTTP server at host www.someSchool.edu waiting for TCP connection at port 80. “accepts” connection, notifying client
2. HTTP client sends HTTP request message (containing URL) into TCP connection socket. Message indicates that client wants object someDepartment/home.index
3. HTTP server receives request message, forms response message containing requested object, and sends message into its socket
Non-persistent HTTP (cont.)
4. HTTP server closes TCP connection.
5. HTTP client receives response message containing html file, displays html. Parsing html file, finds 10 referenced jpeg objects
6. Steps 1-5 repeated for each of 10 jpeg objects
Response time modeling
Definition of RTT: time for a small packet to travel from client to server and back.
Response time:
• one RTT to initiate TCP connection
• one RTT for HTTP request and first few bytes of HTTP response to return
• file transmission time
total = 2RTT + transmit time
[Figure: client/server timeline showing initiate TCP connection (RTT), request file (RTT), time to transmit file, file received]
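The total above can be evaluated for concrete numbers. The sketch below assumes the transmit time is simply file size divided by link bandwidth (no slow start, no loss); the RTT, file size and bandwidth values are made up for illustration.

```python
def response_time(rtt_s, file_bytes, bandwidth_bps):
    """total = 2 RTT + transmit time, for one object on a fresh connection."""
    transmit_s = file_bytes * 8 / bandwidth_bps
    return 2 * rtt_s + transmit_s


# Hypothetical numbers: 100 ms RTT, 10,000-byte file, 1 Mbit/s link.
total_s = response_time(rtt_s=0.1, file_bytes=10_000, bandwidth_bps=1_000_000)
# The transmit time is 0.08 s here, so the total is roughly 0.28 s,
# and most of it is spent waiting on round trips rather than sending data.
```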
Persistent HTTP
Non-persistent HTTP issues:
• requires 2 RTTs per object
• OS must allocate host resources for each TCP connection
• but browsers often open parallel TCP connections to fetch referenced objects
Persistent HTTP
• server leaves connection open after sending response
• subsequent HTTP messages between same client/server are sent over the open connection
Persistent without pipelining:
• client issues new request only when previous response has been received
• one RTT for each referenced object
Persistent with pipelining:
• default in HTTP/1.1
• client sends requests as soon as it encounters a referenced object
• as little as one RTT for all the referenced objects
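A back-of-the-envelope comparison of the three modes, counting only round trips. The sketch assumes a single connection at a time (no parallel connections), ignores transmission time, and takes the best case of one RTT for all pipelined objects; the function name is made up for this example.

```python
def rtts_to_fetch_page(num_referenced, mode):
    """Round trips to fetch a base HTML file plus its referenced objects."""
    if mode == "non-persistent":
        # 2 RTTs (TCP setup + request) for the base file and for every object.
        return 2 * (1 + num_referenced)
    base = 2  # one TCP setup plus the request for the base HTML file
    if mode == "persistent":
        # One RTT per referenced object on the already-open connection.
        return base + num_referenced
    if mode == "pipelined":
        # Best case: all referenced objects are requested back to back.
        return base + (1 if num_referenced else 0)
    raise ValueError(f"unknown mode: {mode}")


# The page from the earlier example: one HTML file with 10 jpeg objects.
counts = {
    mode: rtts_to_fetch_page(10, mode)
    for mode in ("non-persistent", "persistent", "pipelined")
}
```

For the 10-object page this gives 22, 12 and 3 round trips respectively.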
User-server interaction: authorization
Authorization: control access to server content
• authorization credentials: typically name, password
• stateless: client must present authorization in each request
– Authorization: header line in each request
– if no Authorization: header, server refuses access, sends WWW-Authenticate: header line in response
Message exchange between client and server:
1. client -> server: usual http request msg
2. server -> client: 401: authorization req., WWW-Authenticate:
3. client -> server: usual http request msg + Authorization: <cred>
4. server -> client: usual http response msg
5. client -> server: usual http request msg + Authorization: <cred>
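One common way to encode <cred> is the HTTP Basic scheme, where the name and password are joined with a colon and base64-encoded. The slides do not say which scheme is meant, so treat Basic, the realm string, and the function names below as assumptions for illustration.

```python
import base64


def basic_credentials(name, password):
    """Value of the Authorization: header under the Basic scheme."""
    token = base64.b64encode(f"{name}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def serve(request_headers, accepted_credentials):
    """Stateless check: every request must carry its own credentials."""
    cred = request_headers.get("authorization")
    if cred not in accepted_credentials:
        # Refuse access and tell the client how to authenticate.
        return 401, {"WWW-Authenticate": 'Basic realm="example"'}
    return 200, {}


cred = basic_credentials("Aladdin", "open sesame")
first = serve({}, {cred})                        # no credentials yet
second = serve({"authorization": cred}, {cred})  # retried with credentials
```

Base64 is an encoding, not encryption, so Basic credentials are readable by anyone who can observe the request.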
Cookies: keeping “state”
Many major Web sites use cookies
Four components:
1) cookie header line in the HTTP response message
2) cookie header line in HTTP request message
3) cookie file kept on user’s host and managed by user’s browser
4) back-end database at Web site
Example:
– Susan accesses Internet always from same PC
– She visits a specific e-commerce site for first time
– When initial HTTP request arrives at site, site creates a unique ID and creates an entry in backend database for ID
Cookies: keeping “state” (cont.)
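Message exchange between client and server:
1. client -> server: usual http request msg (client cookie file holds ebay: 8734)
2. server creates ID 1678 for user and an entry in backend database
3. server -> client: usual http response + Set-cookie: 1678 (client cookie file now holds amazon: 1678, ebay: 8734)
4. client -> server: usual http request msg, cookie: 1678; server accesses backend database and takes cookie-specific action
5. server -> client: usual http response msg
One week later (client cookie file still holds amazon: 1678, ebay: 8734):
6. client -> server: usual http request msg, cookie: 1678; server accesses backend database and takes cookie-specific action
7. server -> client: usual http response msg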
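The four components can be mimicked with two small classes: a site that hands out IDs and keeps a backend table, and a browser that keeps a cookie file. This is a toy model with dictionaries standing in for HTTP headers; the class names and the "visits" field are invented for the sketch.

```python
import itertools


class Site:
    """Web site with a backend database keyed by cookie ID."""

    def __init__(self, first_id):
        self._ids = itertools.count(first_id)
        self.backend = {}

    def handle(self, request_headers):
        cookie = request_headers.get("cookie")
        response_headers = {}
        if cookie not in self.backend:
            # First visit: create a unique ID and a backend entry for it.
            cookie = str(next(self._ids))
            self.backend[cookie] = {"visits": 0}
            response_headers["set-cookie"] = cookie
        # Cookie-specific action: here, just count the visits for this ID.
        self.backend[cookie]["visits"] += 1
        return response_headers


class Browser:
    """Browser that manages the cookie file on the user's host."""

    def __init__(self, cookie_file=None):
        self.cookie_file = dict(cookie_file or {})

    def visit(self, site_name, site):
        headers = {}
        if site_name in self.cookie_file:
            headers["cookie"] = self.cookie_file[site_name]
        response = site.handle(headers)
        if "set-cookie" in response:
            self.cookie_file[site_name] = response["set-cookie"]


amazon = Site(first_id=1678)
browser = Browser({"ebay": "8734"})
browser.visit("amazon", amazon)  # first visit: server sends Set-cookie: 1678
browser.visit("amazon", amazon)  # one week later: browser sends cookie: 1678
```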
Cookies (continued)
Cookies and privacy:
• cookies permit sites to learn a lot about you
• you may supply name and e-mail to sites
• search engines use redirection & cookies to learn yet more
• advertising companies obtain info across sites
What cookies can bring:
• (primitive) authorization
• Shopping carts
• Recommendations
• User session state (Web e-mail)
Conditional GET: client-side caching
• Goal: don’t send object if client has up-to-date cached version
• client: specify date of cached copy in HTTP request
If-modified-since: <date>
• server: response contains no object if cached copy is up-to-date:
HTTP/1.0 304 Not Modified
Message exchange between client and server:
– object not modified: HTTP request msg with If-modified-since: <date>, HTTP response HTTP/1.0 304 Not Modified
– object modified: HTTP request msg with If-modified-since: <date>, HTTP response HTTP/1.0 200 OK <data>
Outline
• HTTP
• HTTP and TCP Interactions
• Web Caching
– Protocol Mechanisms
– Reverse and Forward Caching
• CDNs
Background
• TCP standardized in 1980
• Early TCP applications very different from HTTP
– Telnet uses a single long lasting connection
– FTP uses separate control and data connections
– FTP connections longer than HTTP connections
TCP Timers
• TCP uses timers to trigger protocol operations
– Retransmission timers
– Repeating slow-start phase
– Reclaiming state from terminated connections
• HTTP is an interactive application
– Timers translate to additional delay perceptible by end-user
The effect of retx timer
[Figure: client retransmits SYN several times before a SYN-ACK is received]
• Initial RTO is 3 sec
– RTO doubles for each lost packet
– Losing two SYNs means additional delay of 9 sec
• Congestion (still) causes lost packets
– Edge links, inter-domain links
– SYNs lost on busy web server
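The 9-second figure follows from the doubling rule: the first lost SYN costs 3 sec and the second costs 6 sec. A small sketch of that arithmetic, assuming the RTO simply doubles with no upper bound:

```python
def syn_retransmission_delay(lost_syns, initial_rto=3.0):
    """Extra connection-setup delay when the first `lost_syns` SYNs are lost."""
    delay = 0.0
    rto = initial_rto
    for _ in range(lost_syns):
        delay += rto  # wait one full RTO before retransmitting
        rto *= 2      # RTO doubles for each lost packet
    return delay


delays = [syn_retransmission_delay(n) for n in range(4)]
# 0, 1, 2 and 3 lost SYNs add 0, 3, 9 and 21 seconds respectively.
```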
The effect of retx timer (II)
• Frustrated user stops and reloads page
– Reduced end-user perceived delay
– At the expense of higher load on web-server and network
The effect of retx timer (III)
• After connection is established negative effect of RTO is reduced
– RTO converges to “correct” values
– Duplicate ACKs
• However!
– Most HTTP responses are 8-12KB
– Connection may not exit from slow-start phase
– Window is not big enough to generate duplicate ACKs
Slow-Start restart
• Persistent connections reduce the impact of short HTTP transfers
– TCP window opens to “correct” value
• Connection might be idle for some time
– During this interval congestion levels might have changed
• Slow Start restart mechanism
– After connection is idle for some time (a few RTTs) reduce cwnd to initial value
Reducing penalty of slow start restart
• Different solutions
– Disable slow start restart at the web server -> increased network congestion
– Gradually decrease the cwnd
TIME_WAIT state
• Busy web servers have lots of TCP connections -> would like to reclaim them ASAP
• Server has to wait for 2*MSL (1-4min) before releasing connection
Solutions
• Reduce the overhead of keeping connections in TIME_WAIT state
– Memory and timers
• Change TCP
– Move the burden of TIME_WAIT to receiver of FIN
• Change HTTP
– Allow client to close connection (after server suggestion)
Aborting HTTP connections
• Users abort HTTP transfers by pressing “stop” button
• No “Abort” method in HTTP
– Browser needs to terminate TCP connection
– Have to go through 3-way handshake and slow-start again…
– What happens if request changes state on HTTP server???
Nagle’s algorithm
• TCP sender transmits at most one small packet per RTT.
– Small means smaller than the MSS (1460 bytes)
– Avoid high overhead from interactive applications (e.g. telnet)
Effect on HTTP
• Web server sends response using two write()’s
– One for header, one for body
– Assume response < MSS
– TCP sends first segment
– Second segment has to wait for RTT
• Unless HTTP server closes the connection
– Second segment is sent only after client sends ACK
Effect on HTTP (II)
• Disable Nagle’s algorithm
– setsockopt(TCP_NODELAY)
• Negative effect
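In Python the same socket option is set through the standard socket module. The sketch below only creates a socket and flips the option; it does not connect anywhere, and the helper name is made up for this example.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 lets small segments go out without waiting for an ACK.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

A server can also avoid the stall without touching the option by sending the header and body in a single write, so that only one small segment is produced.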
Delayed ACKs
• Motivation
– Sending ACKs increases protocol overhead
• Solution
– Don’t send ACK for each packet
– Piggyback
• Delay transmission of ACK in hope that data will be sent (200ms)
Interaction with HTTP traffic
• It is unlikely that both hosts would be transmitting data
[Figure: client/server timeline with segments labeled Request, ACK, Header, Body, ACK, Body]
Multiplexing TCP connections
• Motivations for parallel connections
– Browser downloading embedded images
– Proxy acting on behalf of multiple clients
– Higher throughput by transmitting
Multiple connections effects
• Problems
– Unfairness to other clients
– Higher network and server load
– Higher user-perceived latency
• Remedies
– Removing performance incentives for parallel connections (network and end-host solutions)
– Provide alternatives to parallel connections
Outline
• HTTP
• HTTP and TCP Interactions
• HTTP Caching
– Protocol Mechanisms
– Reverse and Forward Caching
• CDNs
Why web caching?
• Client-server architecture is inherently not scalable
– Proxies: a level of indirection
• Reduce client response time
– Direct and indirect effect
– Less load on the server:
• Server does not have to over-provision for slashdot effect
• Reduce network bandwidth usage
– Wide area vs. local area use
– These two objectives are often in conflict
• May do exhaustive local search to avoid using wide area bandwidth
• Prefetching uses extra bandwidth to reduce client response time
• New: DDoS Protection
– dissipate attack over massive resources
Cache Placement
• Reverse Caches
– Cache documents close to the server -> decrease server load
– Done by content providers
• Forward Caches
– Cache documents close to the clients -> reduce network traffic and decrease latency
HTTP/1.0 Caching
• Exploit locality of reference
• A modifier to the GET request:
– If-modified-since – return a “not modified” response if resource was not modified since specified time
• A response header:
– Expires – specify to the client for how long it is safe to cache the resource
• A request directive:
– No-cache – ignore all caches and get resource directly from server
• These features can be best taken advantage of with HTTP proxies
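The If-modified-since check on the server (or proxy) side amounts to one date comparison. The sketch below uses the standard-library HTTP date parser; the modification time and the placeholder body are hypothetical, and a real server would also handle malformed dates.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get(last_modified, if_modified_since=None, body=b"<data>"):
    """Return (status line, body) for a GET with an optional If-modified-since."""
    if if_modified_since is not None:
        cached_at = parsedate_to_datetime(if_modified_since)
        if last_modified <= cached_at:
            # Cached copy is up to date: send no object.
            return "HTTP/1.0 304 Not Modified", b""
    return "HTTP/1.0 200 OK", body


# Hypothetical modification time for the resource.
last_modified = datetime(1998, 6, 22, 9, 0, 0, tzinfo=timezone.utc)

fresh = conditional_get(last_modified, "Thu, 06 Aug 1998 12:00:15 GMT")
stale = conditional_get(last_modified, "Mon, 01 Jun 1998 00:00:00 GMT")
plain = conditional_get(last_modified)
```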
Cache Hierarchies
• Use hierarchy to scale a proxy
– Why?
• Larger population = higher hit rate (fewer compulsory misses)
• Larger effective cache size
– Why is population for single proxy limited?
• Performance, administration, policy, etc.
• NLANR cache hierarchy
– Most popular
– 9 top level caches
– Internet Cache Protocol (ICP) based
– Squid/Harvest proxy
ICP (Internet cache protocol)
• Simple protocol to query another cache for content
• Uses UDP – why?
• ICP message contents
– Type – query, hit, hit_obj, miss
– Other – identifier, URL, version, sender address
– Special message types used with UDP echo port
• Used to probe server or “dumb cache”
• Query and then wait till time-out (2 sec)
[Figure sequence: Squid cache hierarchy with a client, a parent cache and child caches. A Web page request arrives from the client, ICP Query messages are sent to peer caches, and the peers answer with ICP MISS or ICP HIT before the Web page request is forwarded.]
Optimal Cache Mesh Behavior
• Ideally, want the cache mesh to behave as a single cache with equivalent capacity and processing capability
• ICP: many copies of popular objects created – capacity wasted
• More than one hop needed for searching object
Hinting
• Have proxies store content as well as metadata about contents of other proxies (hints)
– Minimizes number of hops through mesh
– Size of hint cache is a concern – size of key vs. size of document
• Having hints can help consistency
– Makes it possible to push updated documents or invalidations to other caches
• How to keep hints up-to-date?
– Not critical – incorrect hint results in extra lookups, not incorrect behavior
Summary Cache
• Primary innovation – use of compact representation of cache contents
– Typical cache has 80 GB of space and 8KB objects -> 10 M objects
– Using 16byte MD5 -> 160 MB per peer
– Solution: Bloom filters
• Delayed propagation of hints
– Waits until threshold %age of cached documents are not in summary
Bloom Filters
• Proxy contents are summarized as an M-bit value
• Each page stored contributes k hash values in range [1..M]
– Bits corresponding to the k hashes set in summary
• Check for page = if all k hash bits corresponding to a page are set in summary, it is likely that proxy has the page
• Tradeoff -> false positives
– Larger M reduces false positives
– What should M be? 8-16 * number of pages seems to work well
– What about k? Is related to (M/number of pages) -> 4 works for above M
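A minimal Bloom filter along these lines is sketched below. It derives the k hash values by salting SHA-256 with the hash index, which is a convenience chosen for this example rather than what any particular proxy uses, and it numbers bit positions 0..M-1 instead of 1..M. The example URLs are made up.

```python
import hashlib


class BloomFilter:
    """M-bit summary of a set of URLs using k hash values per URL."""

    def __init__(self, m_bits, k_hashes):
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, url):
        # Assumption for this sketch: k hashes come from salting one hash.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{url}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, url):
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, url):
        # True means "likely cached"; False means "definitely not cached".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(url)
        )


pages = [f"http://example.com/page{i}" for i in range(1000)]
summary = BloomFilter(m_bits=8 * len(pages), k_hashes=4)  # M = 8 * pages, k = 4
for url in pages:
    summary.add(url)
no_false_negatives = all(summary.might_contain(url) for url in pages)
```

With M = 8 bits per page the summary for 10 M objects is about 10 MB, compared with 160 MB for a list of 16-byte digests.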
Problems with caching
• Over 50% of all HTTP objects are uncacheable.
• Sources:
– Dynamic data -> stock prices, frequently updated content
– CGI scripts -> results based on passed parameters
– SSL -> encrypted data is not cacheable
• Most web clients don’t handle mixed pages well -> many generic objects transferred with SSL
– Cookies -> results may be based on passed data
– Hit metering -> owner wants to measure # of hits for revenue, etc., so, cache busting
CDN
• Replicate content on many servers
• Challenges
– How to replicate content
– Where to replicate content
– How to find replicated content
– How to choose among known replicas
– How to direct clients towards replica
• DNS, HTTP 302 (redirect) response, anycast, etc.
Server Selection
• Service is replicated in many places in network
• How to direct clients to a particular server?
– As part of routing -> anycast, cluster load balancing
– As part of application -> HTTP redirect
– As part of naming -> DNS
• Which server?
– Lowest load -> to balance load on servers
– Best performance -> to improve client performance
• Based on Geography? RTT? Throughput? Load?
Routing Based
• Anycast
– Give service a single IP address
– Each node implementing service advertises route to address
– Packets get routed from client to “closest” service node
• Closest is defined by routing metrics
• May not mirror performance/application needs
Routing Based
• Cluster load balancing
– Router in front of cluster of nodes directs packets to server
– Can only look at global address (L3 switching)
– Often want to do this on a connection by connection basis – why?
• Forces router to keep per connection state
• L4 switching – transport headers, port numbers
– How to choose server
• Easiest to decide based on arrival of first packet in exchange
• Primarily based on local load
• Can be based on later packets (e.g. HTTP GET request) but makes system more complex
L-7 switching
• Interpret requests, content-aware switches
• Have to do the initial hand-shake
• Different proxies for different content-types
• Load balancing vs locality
• Locality means all requests (even to a popular object) serviced by a single proxy
Application Based
• HTTP supports simple way to indicate that Web page has moved
• Server gets GET request from client
– Decides which server is best suited for particular client and object
– Returns HTTP redirect to that server
• Can make informed application specific decision
• May introduce additional overhead -> multiple connection setup, name lookups, etc.
• While good solution in general HTTP Redirect has some design flaws – especially with current
Naming Based
• Client does name lookup for service
• Name server chooses appropriate server address
• What information can it base decision on?
– Server load/location -> must be collected
– Name service client
• Typically the local name server for client
• Round-robin
– Randomly choose replica
– Avoid hot-spots
• [Semi-]static metrics
– Geography
– Route metrics
How Akamai Works-I
• Clients fetch html document from primary server
– E.g. fetch index.html from cnn.com
• URLs for replicated content are replaced in html (Akamaization)
– E.g. <img src="http://cnn.com/af/x.gif"> replaced with <img src="http://a73.g.akamaitech.net/7/23/cnn.com/af/x.gif">
• Client is forced to resolve
How Akamai Works-II
• How is content replicated?
• Akamai only replicates static content
– Serves about 7% of the Internet traffic!
• Modified name contains original file
• Akamai server is asked for content
– First checks local cache
– If not in cache, requests file from primary server and caches file
How Akamai Works-III
• Root server gives NS record for akamai.net
• Akamai.net name server returns NS record for
g.akamaitech.net
– Name server chosen to be in region of client’s name server
– TTL is large
• G.akamaitech.net nameserver chooses server in region
– Should try to choose server that has file in cache - How to choose?
– Uses aXYZ name and consistent hash
– TTL is small
Hashing
• Advantages
– Let the CDN nodes be numbered 1..m
– Client uses a good hash function to map a URL to 1..m
– Say hash(url) = x, so client fetches content from node x
– No duplication – not being fault tolerant.
– One hop access
– Any problems?
• What happens if a node goes down?
• What happens if a node comes back up?
• What if different nodes have different views?
Robust hashing
• Consider 90 documents on nodes 1..9; node 10, which was dead, is alive again
• % of documents in the wrong node?
– 10, 19-20, 28-30, 37-40, 46-50, 55-60, 64-70, 73-80, 82-90
– Disruption coefficient = ½
– Unacceptable, use consistent hashing – idea behind Akamai!
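The list of misplaced documents can be reproduced with a short calculation. The sketch assumes documents 1..90 are split across the nodes in equal contiguous ranges (10 per node with 9 nodes, 9 per node with 10 nodes), which is an assumption that happens to match the numbers on the slide; the function names are made up.

```python
def node_for(doc, num_nodes, num_docs=90):
    """Node (1..num_nodes) holding document `doc` under equal contiguous ranges."""
    per_node = num_docs // num_nodes
    return (doc - 1) // per_node + 1


def moved_documents(num_docs=90, nodes_before=9, nodes_after=10):
    """Documents whose node changes when the number of live nodes changes."""
    return [
        doc
        for doc in range(1, num_docs + 1)
        if node_for(doc, nodes_before, num_docs)
        != node_for(doc, nodes_after, num_docs)
    ]


moved = moved_documents()
disruption = len(moved) / 90
# 45 of the 90 documents end up on a different node, so the
# disruption coefficient is 0.5 even though only one node was added.
```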
Consistent Hash
• “view” = subset of all hash buckets that are visible
• Desired features
– Balanced – in any one view, load is equal across buckets
– Smoothness – little impact on hash bucket contents when buckets are added/removed
– Spread – small set of hash buckets that may hold an object regardless of views
– Load – across all views # of objects assigned to hash bucket is small
Consistent Hash – Example
• Construction
• Assign each of C hash buckets to random points on mod 2^n circle, where hash key size = n.
• Map object to random position on circle
• Hash of object = closest clockwise bucket
[Figure: circle with positions 0, 4, 8 and 12 marked and a bucket at position 14]
• Smoothness -> addition of bucket does not cause much movement between existing buckets
• Spread & Load -> small set of buckets that lie near object
• Balance -> no bucket is responsible for large number of objects
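The construction can be sketched with a sorted list of bucket positions and a binary search. This is a minimal illustration, not Akamai's implementation: it places each bucket at a single point, uses SHA-256 to stand in for the "random" positions, and the node and URL names are invented.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Buckets and objects hashed onto a circle of size 2^n."""

    def __init__(self, n_bits=32):
        self.size = 2 ** n_bits
        self._points = []  # sorted bucket positions on the circle
        self._owner = {}   # position -> bucket name

    def _position(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % self.size

    def add_bucket(self, name):
        pos = self._position("bucket:" + name)
        if pos in self._owner:
            raise ValueError("position collision for " + name)
        bisect.insort(self._points, pos)
        self._owner[pos] = name

    def bucket_for(self, obj):
        if not self._points:
            raise LookupError("no buckets on the ring")
        pos = self._position("object:" + obj)
        # Closest clockwise bucket: first bucket at or after the object,
        # wrapping around to the first point on the circle.
        i = bisect.bisect_left(self._points, pos)
        if i == len(self._points):
            i = 0
        return self._owner[self._points[i]]


ring = ConsistentHashRing()
for n in range(1, 10):
    ring.add_bucket(f"node{n}")
urls = [f"http://example.com/doc{d}" for d in range(1, 91)]
before = {url: ring.bucket_for(url) for url in urls}

ring.add_bucket("node10")  # the dead node comes back
after = {url: ring.bucket_for(url) for url in urls}
moved = [url for url in urls if before[url] != after[url]]
```

Every document that moves lands on node10; nothing is shuffled between the nine existing buckets, which is the smoothness property. Practical designs usually give each bucket several points on the circle to even out the load.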
How Akamai Works
[Figure: numbered message flow (steps 1-12) among the end-user, cnn.com (content provider), the DNS root server, an Akamai server, the Akamai high-level DNS server, the Akamai low-level DNS server and the closest Akamai server. Labeled requests: Get index.html, Get /cnn.com/foo.jpg, Get foo.jpg.]
Akamai – Subsequent Requests
[Figure: the same participants for subsequent requests, with only steps 1, 2, 7, 8 and 9 shown. Labeled request: Get index.html.]