Network Technologies
Glenn Strong
Department of Computer Science School of Computer Science and Statistics
Trinity College, Dublin
What Happens When Browser Contacts Server I
Top view: The browser determines the hostname (the address that was clicked on or typed), e.g. www.adobe.com. The host name is a mnemonic: it assists human memory.
The browser asks the DNS (Domain Name Service) for the IP address of www.adobe.com
DNS - a distributed database that translates hostnames to IP addresses
DNS replies with the corresponding IP address. For example 192.150.14.120
The browser then makes a TCP connection to port 80 on 192.150.14.120
The browser handshakes with the web server process running on 192.150.14.120. This sets up a reliable connection between the two machines.
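The lookup and connection steps above can be sketched with Python's standard socket module. This is a minimal sketch, not what a real browser does internally: the function names resolve and connect are made up for illustration, and only IPv4 is considered.

```python
import socket


def resolve(hostname: str) -> str:
    """Ask the system resolver (and so, indirectly, DNS) for an IPv4 address."""
    # getaddrinfo returns tuples of (family, type, proto, canonname, sockaddr).
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return infos[0][4][0]  # sockaddr is (ip, port); keep the ip string


def connect(hostname: str, port: int = 80, timeout: float = 5.0) -> socket.socket:
    """Resolve the name, then open a TCP connection to the given port.

    The TCP handshake happens inside sock.connect().
    """
    ip = resolve(hostname)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    sock.connect((ip, port))
    return sock
```

Calling connect("www.adobe.com") would need network access and may return a different address from the one quoted above, since DNS records change over time.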
What Happens When Browser Contacts Server II
The web server running on 192.150.14.120 retrieves index.html from its file system.
It attaches some HTTP header information and passes the file stream to its TCP process. Encapsulation again.
The browser receives the file and closes the TCP connection.
The browser reads the HTML and renders the page.
TCP and Ports
We've already mentioned that a server machine might have several server processes running on it, each providing a different type of service - HTTP, FTP, Telnet, SMTP
Each server process can be identified by the port at which it listens on that machine
What exactly is a port?
We use a port to set up a TCP connection
A port is a programming term specifying a logical connection place.
In TCP/IP, it allows a client program to specify a particular server process on a computer in a network
Port numbers are from 0 to 65535
Ports 0 to 1023 are reserved for use by certain privileged services. See RFC 1700
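A small sketch of the port ranges: a port number is a 16-bit unsigned integer, so the largest valid value is 65535, and the well-known (privileged) ports are those below 1024. The helper names and the table of example services are illustrative; the port numbers listed are the standard assignments for those services.

```python
MAX_PORT = 2**16 - 1        # 65535: port numbers are 16-bit unsigned integers
WELL_KNOWN_LIMIT = 1024     # ports 0..1023 are the reserved, privileged range

# Standard port assignments for the services mentioned earlier.
WELL_KNOWN_SERVICES = {"ftp": 21, "telnet": 23, "smtp": 25, "http": 80}


def is_valid_port(port: int) -> bool:
    """True if port fits in the 16-bit port number space."""
    return 0 <= port <= MAX_PORT


def is_reserved_port(port: int) -> bool:
    """True if port lies in the well-known range 0..1023."""
    return 0 <= port < WELL_KNOWN_LIMIT
```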
Ports
When taken together, the source port, the destination port, and the source and destination IP addresses uniquely identify the connection belonging to an application process.
This allows several processes to communicate without the signals becoming mixed up
This is all carried out by the transport layer
It identifies the data destined for each process by examining the source/destination IP address and source/destination port number of each incoming segment
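One way to picture this demultiplexing is a table keyed by the four values. The sketch below is a toy model, not how an operating system implements its transport layer; the class names and the client addresses used in the usage note are invented for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class ConnectionKey(NamedTuple):
    """The four values that together identify one TCP connection."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class Demultiplexer:
    """Toy transport layer: files each segment's payload under its connection."""

    def __init__(self):
        # One queue of payloads per connection key.
        self.queues = defaultdict(list)

    def deliver(self, src_ip, src_port, dst_ip, dst_port, payload):
        key = ConnectionKey(src_ip, src_port, dst_ip, dst_port)
        self.queues[key].append(payload)
```

Two clients that both use source port 5000 and both talk to port 80 on the same server still land in separate queues, because their source IP addresses differ.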
HTTP
OK - we've introduced what happens when a client contacts a web server
After it establishes a TCP connection, we saw that it issues a GET command followed by the file name it requires and version of HTTP it is running
HTTP is the application layer protocol for retrieving web pages.
The browser uses TCP as the transport mechanism to send HTTP information between client and server.
HTTP is the Hypertext Transfer Protocol
The WWW's application layer protocol
Client/server model
client: browser that requests, receives, displays WWW objects
server: WWW server sends objects in response to requests
HTTP Example
Suppose a user enters the URL: www.tcd.ie/Library/index.htm
1 HTTP client initiates TCP connection to HTTP server (process) at www.tcd.ie. Port 80 is default for HTTP server.
2 HTTP server at host www.tcd.ie waiting for TCP connection at port 80 accepts connection, notifying client.
3 HTTP client sends http request message (containing URL) into TCP connection socket.
4 HTTP server receives request message, forms response message containing requested object (Library/index.htm), sends message into socket.
5 HTTP server closes TCP connection.
6 HTTP client receives response message containing HTML file, displays HTML. Parsing the HTML file, the browser finds 10 referenced jpeg objects
How many TCP connections do we need to establish?
Non-persistent connection: one object in each TCP connection, so 11 connections in total (one for the HTML file plus one for each jpeg).
Some browsers create multiple TCP connections simultaneously - one per object
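The arithmetic behind the answer can be written down directly. This is only a counting sketch under the non-persistent model: the function names are made up, and "rounds" here means batches of connections opened one after another when the browser allows a fixed number of simultaneous connections.

```python
import math


def connections_needed(referenced_objects: int) -> int:
    """Non-persistent HTTP: one connection for the HTML file, one per object."""
    return 1 + referenced_objects


def fetch_rounds(referenced_objects: int, parallel: int = 1) -> int:
    """Sequential batches of connections when `parallel` may be open at once.

    The HTML file must be fetched first, since it names the other objects.
    """
    if parallel < 1:
        raise ValueError("parallel must be at least 1")
    return 1 + math.ceil(referenced_objects / parallel)
```

For the page with 10 jpeg objects, connections_needed(10) gives 11 in every case; opening connections in parallel reduces the number of rounds, not the number of connections.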
A Note on Virtual Hosting
I've mentioned servers in the context of a single machine hosting a single web domain.
However, a single machine might have several host names associated with its IP address.
This is called virtual hosting. It is made possible by HTTP/1.1.
Every HTTP/1.1 request includes a Host header field which specifies the domain name associated with the request
This allows the server to know which part of its file system has been allocated to the domain name.
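A sketch of how a server might use the Host field to pick a part of its file system. The host names and directory paths in the table are hypothetical, and real servers configure this mapping in their own configuration files rather than in code like this.

```python
from typing import Optional

# Hypothetical mapping from domain name to the directory allocated to it.
VIRTUAL_HOSTS = {
    "www.example-one.test": "/srv/www/example-one",
    "www.example-two.test": "/srv/www/example-two",
}


def host_from_request(request_text: str) -> Optional[str]:
    """Return the value of the Host header, or None if it is absent."""
    lines = request_text.split("\r\n")
    for line in lines[1:]:          # skip the request line
        if line == "":              # blank line ends the headers
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip().lower()
    return None


def document_root(request_text: str) -> Optional[str]:
    """Pick the directory for this request based on its Host header."""
    host = host_from_request(request_text)
    if host is None:
        return None
    host = host.split(":")[0]       # drop an optional :port suffix
    return VIRTUAL_HOSTS.get(host)
```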
Offering a virtual host facility is often misnamed as offering a virtual server facility
A virtual server is a facility whereby you are given complete control of a server process
Essentially, you are running your own remote server, giving you:
Complete configuration control
Your own IP address
Back to the HTTP Protocol
The TCP transport service for HTTP:
Client initiates TCP connection (creates socket) to server, port 80
Server accepts TCP connection from client
HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and WWW server (HTTP server)
TCP connection closed
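The four steps above can be tried out on one machine with Python's standard library. This is a minimal sketch under some assumptions: the handler class, the page body and the function names are invented, and the server listens on an ephemeral port on 127.0.0.1 rather than port 80 so that no privileges or network access are needed.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: answers every GET with a small fixed HTML page."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(host: str, port: int, path: str) -> bytes:
    """Client side: open a socket, send one request, read until the close."""
    with socket.create_connection((host, port), timeout=5) as sock:
        request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:            # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo() -> bytes:
    """Run one request-response exchange against a local server."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)   # port 0: pick any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch("127.0.0.1", server.server_address[1], "/index.html")
    finally:
        server.shutdown()
        server.server_close()
```

The bytes returned by demo() are the raw response message: a status line, header lines, a blank line, then the HTML.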
Note that each request-response transaction is independent:
HTTP is “stateless”. Server maintains no information about past client requests
Protocols that maintain “state” are complex!
past history (state) must be maintained
HTTP Message Format: Request
There are two types of HTTP message: request and response
The HTTP request message is ASCII text, in a human-readable format:
GET /somedir/page.html HTTP/1.1
Host: www.virtualhost.com
Connection: close
User-agent: Mozilla/4.0
Accept: text/html, image/gif, image/jpeg
Accept-language: fr
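Because the request is plain text, it can be taken apart with ordinary string operations. The sketch below is illustrative and not a complete HTTP parser: the function name is made up, and it assumes lines end in CRLF and that each header fits on a single line.

```python
EXAMPLE_REQUEST = (
    "GET /somedir/page.html HTTP/1.1\r\n"
    "Host: www.virtualhost.com\r\n"
    "Connection: close\r\n"
    "User-agent: Mozilla/4.0\r\n"
    "Accept: text/html, image/gif,image/jpeg\r\n"
    "Accept-language:fr\r\n"
    "\r\n"
)


def parse_request(raw: str):
    """Split a request into (method, target, version, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")    # blank line ends the headers
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body
```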
HTTP Message Format: Request
There are several different kinds of request; the previous example was a GET; here is a POST
POST /somedir/script.php HTTP/1.0
User-Agent: Mozilla/4.0
Content-Type: application/x-www-form-urlencoded
Content-Length: 51

username=fred&password=mumble&sweeties=cola+bottles

Here a block of data is sent along with the request. This is a common way to submit form data (in a GET request the form contents are sent as part of the URI).
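The form encoding and the Content-Length value can be reproduced with urllib.parse.urlencode from the standard library. This is a sketch of how such a message could be assembled; the helper names build_post and get_uri are made up for illustration.

```python
from urllib.parse import urlencode


def build_post(path: str, fields: dict) -> str:
    """Assemble a POST request whose body carries the form fields."""
    body = urlencode(fields)            # spaces become '+', pairs joined by '&'
    lines = [
        f"POST {path} HTTP/1.0",
        "User-Agent: Mozilla/4.0",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body.encode('ascii'))}",   # length of the body in bytes
    ]
    return "\r\n".join(lines) + "\r\n\r\n" + body


def get_uri(path: str, fields: dict) -> str:
    """The same fields sent GET-style, as part of the URI."""
    return f"{path}?{urlencode(fields)}"
```

With the fields username=fred, password=mumble and sweeties="cola bottles", the encoded body is 51 bytes long, matching the Content-Length header in the example.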
HTTP Message Format: Reply
The response sent by the server consists of a status line (the protocol status code and status phrase), followed by header lines and the data requested (e.g. an HTML file).
HTTP/1.1 200 OK
Connection: close
Date: Thu, 06 Apr 2004 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Mar 2004 ...
Content-Length: 6821
HTTP Reply Status Codes
The status code appears in the first line of the server-to-client response message. A few sample codes:
200 OK
request succeeded, requested object later in this message
301 Moved Permanently
requested object moved, new location specified later in this message (Location:)
400 Bad Request
request message not understood by server
404 Not Found
requested document not found on this server
505 HTTP Version Not Supported
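A client reads the code from the status line and can decide what to do from its first digit alone. The sketch below uses the standard http.HTTPStatus enumeration for the phrases; the function names are made up, and the class names are the usual descriptions of the 1xx to 5xx ranges.

```python
from http import HTTPStatus


def parse_status_line(line: str):
    """Split a status line into (version, numeric code, phrase)."""
    version, code, phrase = line.split(" ", 2)
    return version, int(code), phrase


def status_class(code: int) -> str:
    """Describe a status code by its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")


def standard_phrase(code: int) -> str:
    """Look up the standard phrase for a code, e.g. 404 -> 'Not Found'."""
    return HTTPStatus(code).phrase
```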
User-Server Interaction: Authentication
HTTP is stateless. So how do we:
control access to server documents?
serve content specifically tailored to a particular user?
We need some form of authentication. The goal of authentication is to control access to server documents.
stateless: client must present authorization in each request
authorization: typically name and password
Authorization: header line in request
if no authorization presented, server refuses access, sends WWW-Authenticate: header line in response
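As an illustration, here is a sketch using the Basic scheme, in which the name and password are joined with a colon and base64-encoded. The choice of Basic is an assumption (the notes do not name a scheme), and the function names, the realm string and the plain dictionary of accounts are invented for the example. Base64 is an encoding, not encryption, so this scheme only protects the password when the connection itself is encrypted.

```python
import base64


def basic_authorization_header(name: str, password: str) -> str:
    """Client side: build the Authorization header line (Basic scheme)."""
    token = base64.b64encode(f"{name}:{password}".encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"


def check_request(headers: dict, accounts: dict):
    """Server side: return (status code, extra response headers).

    `accounts` maps user name to password; a real server would not
    store plain-text passwords like this.
    """
    challenge = {"WWW-Authenticate": 'Basic realm="documents"'}
    value = headers.get("Authorization")
    if value is None or not value.startswith("Basic "):
        return 401, challenge               # nothing presented: refuse and challenge
    try:
        decoded = base64.b64decode(value[len("Basic "):]).decode("utf-8")
    except ValueError:                      # covers bad base64 and bad UTF-8
        return 401, challenge
    name, _, password = decoded.partition(":")
    if name in accounts and accounts[name] == password:
        return 200, {}
    return 401, challenge
```

Because HTTP is stateless, the client has to send the same Authorization line again with every later request.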
User-Server Interaction: Cookies [RFC 2109]
Server sends cookie to client in response, as part of the headers
Set-cookie: UniqueID=1243af54e
Client presents cookie in later requests as part of the request headers
Cookie: UniqueID=1243af54e
Server matches presented cookie with server-stored cookies. This is used for:
authentication
remembering user preferences, previous choices
If the cookie has options (for example an expiry date or a path), they follow the value in the same header line, separated by semicolons
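The matching step can be sketched with http.cookies.SimpleCookie from the standard library. The handle_request function, the shape of the session record and the use of a random hex string as the ID are assumptions made for this example; only the cookie name UniqueID comes from the notes.

```python
import secrets
from http.cookies import SimpleCookie


def handle_request(request_headers: dict, sessions: dict):
    """Return (response headers, session record) for one request.

    `sessions` is the server-side store, keyed by cookie value.
    """
    cookie = SimpleCookie()
    cookie.load(request_headers.get("Cookie", ""))
    morsel = cookie.get("UniqueID")
    if morsel is not None and morsel.value in sessions:
        # Known client: no new cookie needed, reuse the stored record.
        return {}, sessions[morsel.value]
    # First visit (or unknown ID): create a record and hand out a cookie.
    unique_id = secrets.token_hex(8)
    sessions[unique_id] = {"preferences": {}}
    return {"Set-Cookie": f"UniqueID={unique_id}"}, sessions[unique_id]
```

On the first request the server answers with a Set-Cookie header; when the client sends the same name and value back in a Cookie header, the server finds the stored record again.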
How Amazon.com Redirects your Browser and Sets a Cookie
When you type www.amazon.com into your browser for the first time, several things occur before you receive a web page
Firstly, Amazon.com (like many large sites) does not keep its front page at index.html. Your browser is redirected to the front page by an HTTP 302 response
When your browser then requests this new page, the Amazon server checks to see if a cookie has already been set
If not, it sends your browser another 302 response, this time with a cookie
Your browser sets the cookie
Your browser requests the new Amazon front page, this time returning the cookie ID
The Amazon server at last sends you the front page of its site.
After the cookie has been set, your browser will include a cookie field in every request it makes to www.amazon.com
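The sequence of redirects can be imitated with a toy server function and a client loop. Everything specific here is hypothetical: the path /front-page, the cookie name and value, and the server function itself are stand-ins for illustration and say nothing about how the real site is implemented.

```python
FRONT_PAGE = "/front-page"      # hypothetical location of the real front page


def server(path: str, cookies: dict):
    """Toy stand-in for the site: returns (status, headers, body)."""
    if path == "/":
        return 302, {"Location": FRONT_PAGE}, ""
    if path == FRONT_PAGE and "session-id" not in cookies:
        # No cookie yet: redirect again, this time setting one.
        return 302, {"Location": FRONT_PAGE, "Set-Cookie": "session-id=abc123"}, ""
    if path == FRONT_PAGE:
        return 200, {}, "<html>front page</html>"
    return 404, {}, ""


def browse(start_path: str = "/", max_redirects: int = 5):
    """Toy browser: follows 302 responses and stores any cookie it is given."""
    cookies = {}
    path = start_path
    trace = []
    for _ in range(max_redirects + 1):
        status, headers, body = server(path, cookies)
        trace.append((path, status))
        if "Set-Cookie" in headers:
            name, _, value = headers["Set-Cookie"].partition("=")
            cookies[name] = value
        if status != 302:
            return trace, cookies, body
        path = headers["Location"]
    raise RuntimeError("too many redirects")
```

Running browse() produces three requests: the first two are answered with 302, and the third, which carries the cookie, is answered with 200 and the page.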
Client Caching: Conditional GET
Goal: don't send object if client has up-to-date stored (cached) version
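What if the cached version is stale? Client does a quick check:
Client specifies last-modified date of cached copy in an HTTP request
If-modified-since: <date>
If the object has not been modified since that date, the server's response contains no object, only a 304 Not Modified status line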
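The server's side of this check can be sketched as a date comparison. In HTTP the answer to a conditional GET for an unchanged object is status 304 (Not Modified) with no body. The function name is made up, and the sketch assumes the server knows the object's modification time as a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def conditional_get_status(last_modified: datetime,
                           if_modified_since: Optional[str]) -> int:
    """Return 304 if the client's cached copy is still current, else 200.

    last_modified must be a timezone-aware datetime.
    """
    if if_modified_since is None:
        return 200                      # ordinary GET: send the object
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200                      # unparseable date: ignore the header
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)   # assume GMT
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304                      # not modified: no object in the response
    return 200
```

format_datetime(some_datetime, usegmt=True) produces a date string in the form used by these headers, which is handy when trying the function out.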
Web Caches (Proxy Server)
Goal: satisfy client request without involving origin server
User sets browser so that it accesses the web via the proxy server
The proxy has a cache of pages already downloaded
Client sends all HTTP requests to web cache
if object at web cache, web cache immediately returns object in HTTP response
else proxy server requests object from origin server, then returns http response to client
Note: client first opens TCP connection to proxy server and sends HTTP request over connection.
If proxy doesn't have the item, it opens a TCP connection to remote web server and sends request for the item
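The hit-or-miss logic of the proxy can be sketched as a dictionary in front of a fetch function. The class name and the fetch_from_origin callable are invented for illustration; a real web cache would also decide when a stored copy is too old to reuse, which this sketch leaves out.

```python
class WebCache:
    """Toy proxy cache: answers from its store, else asks the origin server."""

    def __init__(self, fetch_from_origin):
        # fetch_from_origin(url) stands in for opening a TCP connection
        # to the remote web server and requesting the item.
        self._fetch = fetch_from_origin
        self._store = {}

    def get(self, url: str):
        """Return (response, hit), where hit says whether the cache had it."""
        if url in self._store:
            return self._store[url], True       # served without the origin server
        response = self._fetch(url)
        self._store[url] = response             # keep a copy for later requests
        return response, False
```

The first request for a URL goes through to the origin server; a repeat request for the same URL is answered from the cache.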