
2. WEB SITES’ DESIGN SURVEY

2.3. Hyper Text Transfer Protocol

The Hyper Text Transfer Protocol (HTTP) is the protocol of the World Wide Web (WWW) and the most popular protocol on the Internet today. Like most other Internet protocols, HTTP requires both a client and a server to transfer data. The transfer is carried over the Transmission Control Protocol/Internet Protocol (TCP/IP) suite, which is used by web servers and also links the computers connected to the Internet. An HTTP server is also known as an HTTP daemon. It is a program that listens for HTTP requests on a certain port of a machine (the default is 80). The term also denotes the physical computer that stores the requested documents. When the client opens a TCP/IP connection, it transmits the request for a document and waits for a response from the server. Finally, when the request-reply sequence is completed, the socket is closed.
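The request-reply sequence described above can be sketched with plain TCP sockets. This is a minimal illustration, not a real web server: the function names (`serve_once`, `fetch`, `demo`), the HTML body and the `HTTP/1.0` version string are assumptions made for the example, and the daemon binds to a free port on the local machine instead of port 80.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Toy HTTP daemon: accept one connection, answer it, close the socket."""
    conn, _addr = listener.accept()
    with conn:
        conn.recv(4096)  # read the (small) request; its content is ignored here
        body = b"<html><body>hello</body></html>"
        head = (
            "HTTP/1.0 200 OK\r\n"
            "Content-Type: text/html\r\n"
            f"Content-Length: {len(body)}\r\n"
            "\r\n"
        ).encode("ascii")
        conn.sendall(head + body)
    # Leaving the with-block closes the connection (end of request-reply).


def fetch(host: str, port: int, path: str) -> bytes:
    """Client side: open a TCP connection, send a request, read until close."""
    with socket.create_connection((host, port), timeout=5) as sock:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the socket
                break
            chunks.append(data)
    return b"".join(chunks)


def demo() -> bytes:
    """Run the toy daemon on a free local port and fetch one document."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    worker.start()
    try:
        return fetch("127.0.0.1", port, "/index.html")
    finally:
        worker.join(timeout=5)
        listener.close()
```

Calling `demo()` returns the raw bytes of the reply: the response head followed by the HTML body.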

The request is defined by Uniform Resource Identifiers (URIs) and Uniform Resource Locators (URLs). These short strings are addresses of information resources: documents, services, electronic mailboxes, images, downloadable files, etc. They make resources available under an access method such as HTTP or the File Transfer Protocol (FTP), and the resources may be reached with a single mouse click. Uniform Resource Identifier (URI) is the general name for a string that refers to a resource. A Uniform Resource Locator, or just URL, is a kind of URI and is associated with such popular URI schemes as HTTP, FTP and mailto.

The structure of the URL is hierarchical. The first part specifies the protocol used to transmit the resource. The remaining parts indicate the path, the file name and other symbols specific to the requested file:

1. protocol name, e. g. HTTP,
2. domain name, e. g. www.mii.lt,
3. port address, e. g. :80 (HTTP default),
4. directory where the requested file is located,
5. name of the requested file,
6. internal link or anchor (#).

A typical example of a URL, http://www.mii.lt/index.php?siteaction=personnel.browse&, has the following meaning. The protocol used is http, the host is www in the domain mii.lt, which is in Lithuania, and the requested resource is “index.php?siteaction=personnel.browse&”, that is, the file index.php followed by a query string.

Another example is ftp://ftp.leo.org/pub/program.exe. This URL is interpreted as follows. The protocol is ftp, and the resource is on the machine named ftp, which is part of the “.org” domain. The resource is located in the “pub” directory and the file is “program.exe”.
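The decomposition of the two example URLs can be reproduced with the `urlsplit` function from Python's standard `urllib.parse` module. The helper name `describe` and the keys of the returned dictionary are chosen only for this sketch.

```python
from urllib.parse import urlsplit


def describe(url: str) -> dict:
    """Split a URL into the parts discussed in the text."""
    parts = urlsplit(url)
    # Everything before the last "/" of the path is the directory,
    # everything after it is the file name.
    directory, _, filename = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,            # None when the URL states no port
        "directory": directory or "/",
        "file": filename,
        "query": parts.query,
        "anchor": parts.fragment,
    }


web = describe("http://www.mii.lt/index.php?siteaction=personnel.browse&")
ftp = describe("ftp://ftp.leo.org/pub/program.exe")
```

Here `web["file"]` is `index.php` with the query string `siteaction=personnel.browse&` kept separately, and `ftp["directory"]` is `/pub`. Neither URL states a port, so the protocol default applies.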

An HTTP connection requires the participation of both a web server and a client. The next sections explain how the client-side request and the server-side response work.

2.3.1. Client Side HTTP Request

The browser starts the exchange by sending a request for the file at the location defined by the URL. Since the URL is an address, it tells the web browser where and how to reach the desired location. The following actions are performed when a web site is accessed (Jeffry Dwight et al. 1996):

1) The browser decodes the host part of the URL and contacts the web server.

2) The browser passes the rest of the URL (directory and file name) to the server; the internal anchor after # is kept by the browser.

3) The server translates the URL into a path and file name.

4) The server sends the file/page to the client’s browser.

5) The server closes the connection.

6) The browser displays the document/page.

An example of a client-side HTTP request follows. The requested page is http://www.mii.lt/index.php?siteaction=personnel.view&id=217:

    GET /index.php?siteaction=personnel.view&id=217 HTTP/1.1
    User-Agent: Mozilla/4.0+(compatible;+MSIE+5.5;+Windows+98;+Win+9x+4.90;+sao)
    Host: www.mii.lt
    Accept: image/gif, image/jpeg, */*

The request contains the method, the requested document, the HTTP protocol version that the client uses (HTTP/1.1), the name and version of the client software/browser, the server host, and the types of objects that the client can accept.
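How a browser might assemble such a request from a URL can be sketched as follows. The function name `build_get_request`, the default `User-Agent` value and the use of `HTTP/1.1` as the version string are assumptions of this sketch; a real browser sends many more header fields.

```python
from urllib.parse import urlsplit


def build_get_request(url: str, user_agent: str = "example-client/0.1") -> str:
    """Build the text of a GET request for the given URL."""
    parts = urlsplit(url)
    # The request line carries only the path and query, not the host.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [
        f"GET {target} HTTP/1.1",   # method, requested document, version
        f"Host: {parts.netloc}",    # the server host taken from the URL
        f"User-Agent: {user_agent}",
        "Accept: image/gif, image/jpeg, */*",
        "",                         # an empty line ends the request head
        "",
    ]
    return "\r\n".join(lines)


request = build_get_request(
    "http://www.mii.lt/index.php?siteaction=personnel.view&id=217"
)
```

The host name moves from the URL into the `Host` field, while the path and query string form the request target on the first line.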

2.3.2. Server Side HTTP Response

The server-side response to the request described above looks like this:

    HTTP/1.1 200 OK
    Date: Thu, 05 Sep 2007 16:17:52 GMT
    Server: Apache/1.1.1
    Content-type: text/html
    Content-length: 1538
    Last-modified: Mon, 05 Oct 2006 01:23:50 GMT

The first line contains the HTTP protocol version and the response code (200 OK means that the page was retrieved successfully). The following lines give the date and time of the retrieval, the web server name and version, the type of the downloaded page, the size of the downloaded object, and the date and time of the page’s last modification. After the head of the response, the content of the page is sent. However, only the HTML text itself is received in this response. Images, sounds and other files referenced inside the HTML are retrieved from the web server through additional HTTP connections. Linked pages are likewise retrieved from their original locations through additional connections, when the user clicks the links with the mouse.
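The head of a response can be taken apart in a few lines of code. The sketch below is a simplified reading of the format, not a complete HTTP parser: the function name `parse_response_head` is hypothetical, the sample reuses the header values of the example response, and `HTTP/1.1` is assumed as the version string.

```python
SAMPLE = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Apache/1.1.1\r\n"
    "Content-type: text/html\r\n"
    "Content-length: 1538\r\n"
    "\r\n"
    "<html>...</html>"
)


def parse_response_head(raw: str):
    """Return (version, status code, reason phrase, headers) of a response."""
    head = raw.split("\r\n\r\n", 1)[0]   # the head ends at the first empty line
    lines = head.split("\r\n")
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers


version, code, reason, headers = parse_response_head(SAMPLE)
```

After the call, `code` is the integer 200 and `headers["content-length"]` holds the size of the object that follows the head.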

2.3.3. HTTP Proxy Request

Sometimes proxy/cache servers exist on the client’s side. A proxy server is a system that sits between an application (such as Internet Explorer, Netscape Communicator or similar) and a web server. It intercepts the requests to the web server to see whether it can answer them itself, which reduces the number of requests that go out to the Internet. This is possible because the proxy server caches the files it downloads from the Internet: requested web pages are stored in a cache. When a previously accessed page is requested again, the proxy can retrieve it from the cache rather than from the original web server. Therefore, the time for retrieving the web page is saved and Internet traffic is reduced.
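The caching behaviour can be illustrated with a small in-memory model. The class `CachingProxy`, the counter `origin_requests` and the stand-in function `fake_origin` are hypothetical names for this sketch; a real proxy also has to decide when a cached page has expired, which is not modelled here.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy proxy: answers repeated requests from its cache."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # how to reach the real web server
        self._cache: Dict[str, bytes] = {}   # URL -> stored page
        self.origin_requests = 0             # requests that left for the Internet

    def get(self, url: str) -> bytes:
        if url not in self._cache:
            # First request for this page: go to the original web server.
            self.origin_requests += 1
            self._cache[url] = self._fetch(url)
        # Later requests for the same page are served from the cache.
        return self._cache[url]


def fake_origin(url: str) -> bytes:
    """Stand-in for the original web server (assumption for the example)."""
    return b"<html>page for " + url.encode("ascii") + b"</html>"


proxy = CachingProxy(fake_origin)
first = proxy.get("http://www.mii.lt/index.php")
second = proxy.get("http://www.mii.lt/index.php")
```

Both calls return the same page, but only the first one reaches the origin server, so `proxy.origin_requests` stays at 1.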