Chapter 4: Application Protocols

Academic year: 2021

Lehrstuhl für Informatik 4

Kommunikation und verteilte Systeme

Page 1

Chapter 4: Application Protocols

[Figure: course overview. The layers (Physical Layer, Data Link Layer, Network Layer, Transport Layer, Session Layer, Presentation Layer, Application Layer) are shown together with the chapters of the lecture: Chapter 2: Computer Networks, Chapter 3: Internet Protocols, Chapter 4: Application Protocols.]

Layer 5: Session Layer

Layer 5 is the lowest of the application-oriented layers; it controls dialogs, i.e. the exchange of related information:

• Synchronization of partner instances by synchronization points: data may have been transferred correctly and still have to be partially retransmitted (e.g. if the sender crashes in the middle of the transmission). Therefore, synchronization points can be set on layer 5 at arbitrary times during the communication process. If a connection breaks down, the entire transmission does not have to be repeated; it can resume at the last synchronization point.

• Dialog management during half-duplex transmission: layer 5 controls the order in which the communication partners are allowed to send their data.

• Connection establishment, data transmission, and connection termination for layers 5 to 7.

• Use of different tokens for assigning transmission authorizations, for connection termination, and for setting synchronization points.

Layer 6: Presentation Layer

Layer 6 hides the use of different data structures and differences in their internal representation

⇒ The data is guaranteed to have the same meaning for the sender and the receiver

• Adapt character codes
– ASCII: 7-bit American Standard Code for Information Interchange
– EBCDIC: 8-bit Extended Binary Coded Decimal Interchange Code

• Adapt number notation
– Word sizes of 32/40/56/64 bits
– Little Endian (byte 0 of a word is on the right) vs. Big Endian (byte 0 is on the left)

⇒ Abstract Syntax Notation One (ASN.1) as transfer syntax

Substantial tasks of layer 6:

1.) Negotiation of the transfer syntax

2.) Mapping of the local data to the transfer syntax
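The byte-order problem that layer 6 has to hide can be made visible with a minimal sketch. It uses Python's standard `struct` module; the value `0x01020304` is an arbitrary example and not taken from the slides.

```python
import struct

value = 0x01020304  # arbitrary 32-bit example value

# "<" packs little endian (least significant byte first),
# ">" packs big endian (most significant byte first).
little = struct.pack("<I", value)
big = struct.pack(">I", value)

print(little.hex())  # 04030201
print(big.hex())     # 01020304

# A receiver that assumes the wrong byte order reads a different number.
misread = struct.unpack("<I", big)[0]
print(hex(misread))  # 0x4030201
```

Agreeing on one transfer syntax means that both sides convert between their local representation and a single agreed byte order, instead of guessing the partner's internal format.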

Codes

Source coding generally converts the representation of messages into a sequence of code words

→ Efficient coding

→ Remove redundancies

• Codes are meaningful only if they are uniquely decodable, i.e. each sequence of characters that consists of code words can be divided in exactly one way into a sequence of code words.

• In communication, instantaneously (immediately) decodable codes are important, i.e. a character sequence built from code words can be decoded unambiguously from its beginning, code word by code word, without looking at the following characters.

• Prefix code: no code word may be a prefix of another.

• Example: C = {0, 10, 011, 11111} is uniquely decodable, but not instantaneously decodable (0 is a prefix of 011).

• For each uniquely decodable code there exists an instantaneously decodable code which is not “longer”.
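Whether a code is a prefix code can be checked mechanically. The following is a minimal sketch; the function name `is_prefix_code` is chosen here for illustration. It only tests the prefix property, not unique decodability in general.

```python
def is_prefix_code(code_words):
    """Return True if no code word is a prefix of another one."""
    words = sorted(code_words)
    # After sorting, a word that is a prefix of any other word is also
    # a prefix of its direct successor, so comparing neighbours suffices.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


print(is_prefix_code(["0", "10", "011", "11111"]))  # False: "0" is a prefix of "011"
print(is_prefix_code(["0", "10", "110", "111"]))    # True
```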

What is information?

Definition: The mean information content (entropy) of a character is defined by

$$H = \sum_{i=1}^{N} p_i \log_a \frac{1}{p_i} = -\sum_{i=1}^{N} p_i \log_a p_i$$

with N - number of different characters

pi - frequency of character i (i = 1, …, N)

a - base of the logarithm (a = 2 gives the entropy in bits)

Figuratively speaking: the entropy indicates how surprised we are by which character comes next.

Example 1: Given: 4 characters

All N = 4 characters are equally frequent (pi = 0.25 ∀i)

⇒ Entropy:

$$H = \sum_{i=1}^{4} 0.25 \cdot \log_2 4 = \log_2 4 = 2 \ \text{[bit]}$$

→ No coding can do better than 2 bits per character

Example 2: Given: 4 characters

The first character has the frequency p1 = 1, thus p2 = p3 = p4 = 0

⇒ Entropy:

$$H = 1 \cdot \log_2 \frac{1}{1} + 3 \cdot \lim_{p \to 0} p \log_2 \frac{1}{p} = 0 + 0 = 0$$

→ The entropy is 0 [bit], i.e. because only character 1 is ever transferred, it does not even have to be coded and transferred.
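The two entropy examples can be recomputed with a few lines of Python. This is a minimal sketch; the function name `entropy` is an illustrative choice, and terms with probability 0 are skipped because their limit contributes 0.

```python
import math


def entropy(probabilities, base=2):
    """Mean information content: sum of p * log(1/p) over all characters."""
    return sum(p * math.log(1 / p, base) for p in probabilities if p > 0)


# Example 1: four equally frequent characters
print(round(entropy([0.25, 0.25, 0.25, 0.25]), 2))  # 2.0

# Example 2: only the first character ever occurs
print(round(entropy([1.0, 0.0, 0.0, 0.0]), 2))      # 0.0
```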

The entropy indicates the minimum number of bits needed for coding.

A good approximation to that theoretical minimum (for the mean code word length) is obtained with a binary tree whose leaves are the characters to be coded.

Huffman code (a prefix code)

Precondition: the frequency of occurrence of all characters is known.

Principle: more frequent characters get shorter code words than rarer ones

1.) List all characters as well as their frequencies

2.) Select the two list elements with the smallest frequencies and remove them from the list

3.) Make them the leaves of a tree whose root carries the sum of both probabilities; place the tree into the list

4.) Repeat steps 2 and 3 until the list contains only one element

5.) Mark all edges: parent → left child with “0”, parent → right child with “1”

The code words result from the paths from the root to the leaves

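The characters A, B, C, D and E are given with the probabilities

p(A) = 0.27, p(B) = 0.36, p(C) = 0.16, p(D) = 0.14, p(E) = 0.07

⇒ Entropy: 2.13

Construction of the tree (numbered in the order of the merge steps):

1. p(E) = 0.07 (edge 0) and p(D) = 0.14 (edge 1) → p(ED) = 0.21
2. p(C) = 0.16 (edge 0) and p(ED) = 0.21 (edge 1) → p(CED) = 0.37
3. p(A) = 0.27 (edge 0) and p(B) = 0.36 (edge 1) → p(AB) = 0.63
4. p(CED) = 0.37 (edge 0) and p(AB) = 0.63 (edge 1) → p(ADCEB) = 1.00

Resulting Code Words:

w(A) = 10, w(B) = 11, w(C) = 00, w(D) = 011, w(E) = 010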
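The Huffman construction for this example can be sketched with a priority queue. This is one possible implementation, not the one from the lecture: the name `huffman_code` and the convention that the less frequent subtree gets the edge “0” are assumptions made here, chosen so that the result matches the code words given for this example.

```python
import heapq
from itertools import count


def huffman_code(frequencies):
    """Build a Huffman code for a dict {character: frequency}."""
    tiebreak = count()  # keeps heap entries comparable when frequencies are equal
    heap = [(p, next(tiebreak), {char: ""}) for char, p in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Remove the two elements with the smallest frequencies ...
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        # ... make them children of a new node (edges "0" and "1") ...
        merged = {char: "0" + word for char, word in left.items()}
        merged.update({char: "1" + word for char, word in right.items()})
        # ... and put the new tree with the summed probability back.
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


frequencies = {"A": 0.27, "B": 0.36, "C": 0.16, "D": 0.14, "E": 0.07}
code = huffman_code(frequencies)
mean_length = sum(frequencies[c] * len(w) for c, w in code.items())
print(code)
print(round(mean_length, 2))  # 2.21
```

The mean code word length of 2.21 bits lies slightly above the entropy of 2.13 bits, which illustrates that the Huffman code approximates the theoretical minimum without reaching it here.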

Frequency of Characters and Character Sequences (English language)

Codes like the Huffman code are not necessarily limited to individual characters. Depending on the application, it can be more meaningful to code whole character strings directly. Example: the English language (frequencies in percent).

Letter  %      | Digram  %     | Trigram  %
E       13.05  | TH      3.16  | THE      4.72
T        9.02  | IN      1.54  | ING      1.42
O        8.21  | ER      1.33  | AND      1.13
A        7.81  | RE      1.30  | ION      1.00
N        7.28  | AN      1.08  | ENT      0.98
I        6.77  | HE      1.08  | FOR      0.76
R        6.64  | AR      1.02  | TIO      0.75
S        6.46  | EN      1.02  | ERE      0.69
H        5.85  | TI      1.02  | HER      0.68
D        4.11  | TE      0.98  | ATE      0.66
L        3.60  | AT      0.88  | VER      0.63
C        2.93  | ON      0.84  | TER      0.62
F        2.88  | HA      0.84  | THA      0.62
U        2.77  | OU      0.72  | ATI      0.59
M        2.62  | IT      0.71  | HAT      0.55
P        2.15  | ES      0.69  | ERS      0.54
Y        1.51  | ST      0.68  | HIS      0.52
W        1.49  | OR      0.68  | RES      0.50
G        1.39  | NT      0.67  | ILL      0.47
B        1.28  | HI      0.66  | ARE      0.46
V        1.00  | EA      0.64  | CON      0.45
K        0.42  | VE      0.64  | NCE      0.43
X        0.30  | CO      0.59  | ALL      0.44
J        0.23  | DE      0.55  | EVE      0.44
Q        0.14  | RA      0.55  | ITH      0.44
Z        0.09  | RO      0.55  | TED      0.44

Arithmetic Coding

Characteristics:

• Achieves the same optimality (coding rate) as Huffman coding

• Difference to Huffman: the entire data stream has an assigned probability, which results from the probabilities of the contained characters. A character is coded with consideration of all previous characters.

• The data are coded as an interval of real numbers between 0 and 1. Each value within the interval can be used as code word.

• The minimum length of the code is determined by the assigned probability.

• Disadvantage: the data stream can be decoded only as a whole.

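Code data ACAB with pA = 0.5, pB = 0.2, pC = 0.3

The interval [0, 1) is divided in the order A, B, C and refined with every character:

Start:      A [0, 0.5), B [0.5, 0.7), C [0.7, 1)
After A:    AA [0, 0.25) with pAA = 0.25, AB [0.25, 0.35) with pAB = 0.1, AC [0.35, 0.5) with pAC = 0.15
After AC:   ACA [0.35, 0.425) with pACA = 0.075, ACB [0.425, 0.455) with pACB = 0.03, ACC [0.455, 0.5) with pACC = 0.045
After ACA:  ACAA [0.35, 0.3875) with pACAA = 0.0375, ACAB [0.3875, 0.4025) with pACAB = 0.015, ACAC [0.4025, 0.425) with pACAC = 0.0225

ACAB can be coded by any binary number from the interval [0.3875, 0.4025). The required length is -log2(pACAB) = 6.06, rounded up to 7 bits, e.g. 0.0110010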
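The interval for ACAB can be recomputed step by step. The following minimal sketch uses exact fractions to avoid rounding effects; the function name `arithmetic_interval` is chosen for illustration, and the order A, B, C of the subintervals is taken from the example.

```python
import math
from fractions import Fraction


def arithmetic_interval(message, probabilities):
    """Return the interval [low, high) that represents the message."""
    low, width = Fraction(0), Fraction(1)
    for character in message:
        # Sum the probabilities of all characters placed before this one.
        offset = Fraction(0)
        for other, p in probabilities.items():
            if other == character:
                break
            offset += p
        low += width * offset
        width *= probabilities[character]
    return low, low + width


probabilities = {"A": Fraction("0.5"), "B": Fraction("0.2"), "C": Fraction("0.3")}
low, high = arithmetic_interval("ACAB", probabilities)
print(float(low), float(high))             # 0.3875 0.4025

bits = math.ceil(-math.log2(high - low))   # interval width equals pACAB
print(bits)                                # 7

code_word = Fraction(0b0110010, 2 ** 7)    # binary 0.0110010
print(low <= code_word < high)             # True
```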

Layer 7: Application Layer

Collection of often used communication services

• Identification of communication partners

• Detection of the availability of communication partners

• Authentication

• Negotiation of the grade of the transmission quality

• Synchronization of cooperating applications

Application Protocols in the TCP/IP Reference Model

Application           Protocol
WWW                   HTTP
File Transfer         FTP
Virtual Terminal      Telnet
E-Mail                SMTP
Name Service          DNS
Network Management    SNMP
File Transfer         TFTP

Transport protocols: TCP, UDP

Internet protocols: IP, ICMP, IGMP, ARP, RARP

Layer 1/2: Ethernet, Token Ring, Token Bus, Wireless LAN

Application Protocols in the TCP/IP Reference Model

Protocols of the application layer are common communication services

Protocols of the application layer are defined for special purposes and specify

• The types of the sent messages

• The syntax of the message types

• The semantics of the message types

• Rules that define when and how an application process sends a message or responds to it

Usually: Client/Server structure. Processes on the application layer use TCP(UDP)/IP sockets

DNS - Domain Name System

IP addresses are difficult for humans to remember, but computers can deal with them perfectly.

Symbolic names are simpler for humans to handle, but computers unfortunately cannot deal with them.

[Figure: the name metatron.informatik.rwth-aachen.de, made up of the top-level domain de and the labels rwth-aachen and informatik.]

DNS - Concept

1. DNS manages the mapping of logical computer names to IP addresses (and further services)

2. DNS is a distributed database, i.e. the individual segments are subject to local control

3. The structure of the name space of the database reflects the administrative organization of the Internet

4. The data of each local area are available in the entire network by means of a Client/Server architecture

5. Robustness and speed of the system are achieved by replication and caching of the naming data

6. Main components:

– Name Server: server which manages information about a part of the database

– Resolver: client which requests naming information from the server

[Figure: resolvers send requests to name servers and receive responses.]

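DNS - Architecture

[Figure: components shown are the user program, the resolver, the shared database, the name server with its master files, remote name servers and a remote resolver. Labelled interactions: user requests and user responses between user program and resolver; requests and responses between resolver or name server and the remote systems; administrative requests and responses between name servers; references to and updates of the shared database.]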

• To structure all information, the database can be represented as a tree

• Each node of the tree is marked with a label, which identifies it relative to its parent node

• Each (internal) node is the root of a sub-tree

• Each of those sub-trees represents a domain

• Each domain can be divided into sub-domains

[Figure: name space tree with the root “”, the generic top-level domains com, edu, gov, mil and country domains such as se and de. Further nodes shown: Oxford, cs, and below de the domain rwth-aachen with its sub-domain informatik.]

• The name of a domain consists of the sequence of labels (separated by “.”) beginning with the root of the domain and going up to the root of the whole tree

• The leaf nodes store the IP addresses associated with the names given by the label sequence

[Figure: path “” → de → rwth-aachen → informatik → metatron]

Logical name: metatron.informatik.rwth-aachen.de

Associated IP address: 137.226.12.221

• Each domain can be managed by a different organization

• The responsible organization can split a domain into sub-domains and delegate the responsibility for them to other organizations

• The parent domain manages pointers to the roots of the sub-domains in order to be able to forward requests to them

• The name of a domain corresponds to the domain name of its root node

[Figure: the root “” with the top-level domains com, edu, gov and mil is managed by the Network Information Center; the domain berkeley.edu below edu is managed by UC Berkeley.]

• The names of the domains serve as index for the database

• Each computer in the network has a domain name which refers to further information concerning the computer

[Figure: name tree below the root “” with the nodes ca, or, nv, ba, la, oakland and rinkon; the node rinkon carries the IP address 192.2.18.44.]

The data associated with a domain name are stored in so-called Resource Records (RR)

• Computers can have one or more secondary names, so-called Domain Name Aliases

• Aliases are pointers from one domain name to another one (the canonical domain name)

[Figure: name tree below the root “” and the domain us with the nodes ca, or, nv, ba, la, oakland, rinkon and mailhub. rinkon has the IP address 192.2.18.44. For the alias mailhub no IP address is stored, but a logical name: rinkon.ba.ca.us.]

The inverted tree represents the Domain Name Space

• The depth of the tree is limited to 127 levels

• Each label of a domain name can have up to 63 characters

• A label of the length 0 is reserved for the root node (“”)

• The Fully Qualified Domain Name (FQDN) is the absolute domain name, which is declared with reference to the root of the tree

Example: informatik.rwth-aachen.de.

• Domain names which are declared not with reference to the root of the tree, but with reference to another domain, are called relative domain names
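The rules for the name space can be turned into a small check. This is a minimal sketch, not a complete validator: the function name `labels_of_fqdn` is hypothetical, and the 63-character limit is applied per label, which is how RFC 1035 defines it.

```python
MAX_LABEL_LENGTH = 63   # limit per label (RFC 1035)
MAX_DEPTH = 127         # limit on the depth of the tree


def labels_of_fqdn(name):
    """Split a fully qualified domain name into its labels, leaf first."""
    if not name.endswith("."):
        raise ValueError(f"{name!r} is a relative domain name")
    # The trailing dot stands for the root node with the empty label.
    labels = name[:-1].split(".") if name != "." else []
    if len(labels) > MAX_DEPTH:
        raise ValueError("name has too many levels")
    for label in labels:
        if not 1 <= len(label) <= MAX_LABEL_LENGTH:
            raise ValueError(f"invalid label length: {label!r}")
    return labels


print(labels_of_fqdn("informatik.rwth-aachen.de."))
# ['informatik', 'rwth-aachen', 'de']
```

A name without the trailing dot, such as `informatik.rwth-aachen.de`, is rejected here because it could be relative to another domain.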

• A domain consists of all computers whose domain name is within the domain

• Leaves of the tree represent individual computers and refer to network addresses, hardware information and mail routing information

• Internal nodes of the tree can describe both a computer and a domain

• Domains are often denoted relatively or by their level:

– Top-Level Domain: child of the root node

– First Level Domain: child of the root node (top-level domain)

– Second Level Domain: child of a first level domain

– etc.

• Originally the name space was divided into seven top-level domains:

1. com: commercial organizations
2. edu: educational organizations
3. gov: government organizations
4. mil: military organizations
5. net: network organizations
6. org: non-commercial organizations
7. int: international organizations

• Additionally, each country got its own top-level domain

• In the meantime, the name space has been extended by further top-level domains

• Within the individual top-level domains, different conventions for name structuring apply:

– Australia: edu.au, com.au, etc.

– UK: co.uk (for commercial organizations), ac.uk (for academic organizations), etc.

– Germany: completely unstructured

• Information about the name space is stored in name servers

• Name servers manage the whole information for a certain part of the name space; this part is called a zone

• The information about a zone is loaded either from a file or from another name server

• The name server has the authority for the zone

• A name server can be responsible for several zones

• Domain and zone are different concepts:

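[Figure: the root “” with the top-level domains edu, com and org; below edu the nodes berkeley, nwu and purdue. The edu domain is shown together with the edu zone, the berkeley.edu zone and the purdue.edu zone, which are separated from the edu zone by delegation.]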

• Zones are (except within the lowest levels of the tree) smaller than domains, therefore name servers have to manage less name information

Zones

There are no guidelines on how domains are divided into zones. Each domain can choose its own division.

Some zones (e.g. edu) do not manage IP addresses. The only information they store are references to other zones

• Name resolution: in general, the mapping of names to addresses

• The term Name Resolution also designates the process in which a name server searches the name space for data for which it is not responsible

• For the search, a name server needs the domain name and the addresses of the root name servers

• A name server can ask a root name server about each name in the name space

• Root name servers know the responsible servers for each top-level domain

• On request, a root name server can return the names and addresses of the name servers responsible for the top-level domain of the searched name

• The top-level name server in turn manages references to the name servers which are responsible for the second level domains

• If no additional information is available, each search begins with the root name servers

Iterative Name Resolution

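[Figure: name space tree with the root “”, the top-level domains au, sg and nz, the second level domains gov, sa and edu below au, and the nodes ips and gbrmpa below gov. A resolver contacts its name server, which queries the other name servers one after the other:]

1. Resolver → name server: request for the address of girigiri.gbrmpa.gov.au
2. Name server → root name server: request for the address of girigiri.gbrmpa.gov.au; response: reference to the au name server
3. Name server → au name server: request for the address of girigiri.gbrmpa.gov.au; response: reference to the gov.au name server
4. Name server → gov.au name server: request for the address of girigiri.gbrmpa.gov.au; response: reference to the gbrmpa.gov.au name server
5. Name server → gbrmpa.gov.au name server: request for the address of girigiri.gbrmpa.gov.au; response: address of girigiri.gbrmpa.gov.au
6. Name server → resolver: response with the address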
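Iterative resolution can be illustrated with a toy simulation that runs entirely in memory. Everything in this sketch is hypothetical: the table `TOY_SERVERS`, the function names, and the address 192.0.2.1, which is a documentation placeholder and not the real address of the host.

```python
# Each toy name server maps a name suffix either to a referral
# (the next server to ask) or to an address.
TOY_SERVERS = {
    "root": {"au": ("referral", "au")},
    "au": {"gov.au": ("referral", "gov.au")},
    "gov.au": {"gbrmpa.gov.au": ("referral", "gbrmpa.gov.au")},
    "gbrmpa.gov.au": {"girigiri.gbrmpa.gov.au": ("address", "192.0.2.1")},
}


def query(server, name):
    """Ask one server; it answers with a referral, an address or an error."""
    for suffix, answer in TOY_SERVERS[server].items():
        if name == suffix or name.endswith("." + suffix):
            return answer
    return ("error", "name not found")


def resolve_iteratively(name, start="root"):
    """Follow referrals from server to server until an address is returned."""
    server, contacted = start, []
    while True:
        contacted.append(server)
        kind, value = query(server, name)
        if kind == "address":
            return value, contacted
        if kind == "error":
            raise LookupError(value)
        server = value  # referral: ask the next server ourselves


address, contacted = resolve_iteratively("girigiri.gbrmpa.gov.au")
print(address)    # 192.0.2.1
print(contacted)  # ['root', 'au', 'gov.au', 'gbrmpa.gov.au']
```

The asking side does all the work: every server only answers with the best reference it has, and the loop decides whom to ask next.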

Distinction between recursive and iterative requests, and accordingly between recursive and iterative name resolution

• In case of recursive resolution, a resolver sends a recursive inquiry to a name server

• The name server must answer either with the searched information or an error message, i.e. the name server may not refer to another name server

• If the addressed name server is not responsible for the searched information, it must contact other name servers

• The name server can start a recursive or iterative inquiry; usually it will use an iterative inquiry

• With the inquiry, the name server tries to shorten the resolution process by directing the inquiry to the most suitable name server regarding the searched information (i.e. if known, a server on a lower level is contacted instead of the root name server)

Root Name Server

• Requests which a name server cannot answer are handed upward in the tree

• Name servers on the upper levels are heavily loaded

• Inquiries which go into another zone often run over the root name server

• Thus, the root name server must always be available

• Therefore: replication - there are 13 instances of the root name server, more or less distributed over the whole world

Problem: very central placement of the servers!

• Information in the database is indexed by names

• Mapping a name to an address is simple

• Mapping an address to a name is more difficult to realize (complete search of the name space)

• Solution:

– Place a special area in the name space which uses addresses as labels: the in-addr.arpa domain

– Nodes in this domain are labelled in accordance with the usual notation for IP addresses (four octets separated by dots)

– The in-addr.arpa domain has 256 sub-domains, each of which again has 256 sub-domains, …

– On the fourth level, the resource records assigned to the octet refer to the domain name of the computer or the network with the indicated address

– The IP address appears backwards because the name is read beginning with the leaf node (IP address: 15.16.192.152 ⇒ sub-domain: 152.192.16.15.in-addr.arpa)
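Building the in-addr.arpa name from an IPv4 address is a simple string operation. The helper `reverse_name` below is written for illustration; Python's standard `ipaddress` module offers the same result through the `reverse_pointer` attribute, which is used here as a cross-check.

```python
import ipaddress


def reverse_name(ipv4_address):
    """Return the in-addr.arpa domain name for a dotted IPv4 address."""
    octets = ipv4_address.split(".")
    if len(octets) != 4:
        raise ValueError("expected four octets separated by dots")
    # The most specific octet becomes the leftmost label.
    return ".".join(reversed(octets)) + ".in-addr.arpa"


print(reverse_name("15.16.192.152"))
# 152.192.16.15.in-addr.arpa
print(ipaddress.ip_address("15.16.192.152").reverse_pointer)
# 152.192.16.15.in-addr.arpa
```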

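[Figure: path “” → arpa → in-addr → 15 → 16 → 192 → 152 in the name space; each of the four levels below in-addr has the labels 0 to 255. The node 152.192.16.15.in-addr.arpa refers to the hostname winnie.corp.hp.com.]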

• Caching is the process of buffering name information in a name server which is not responsible for that information. For further requests the information is already present, and the name resolution process is sped up

• Not only the information about the requested hosts is stored, but also all information about other name servers used in the resolution process

• The Time to Live (TTL) indicates how long data may be buffered

• The TTL limits how long outdated information can be used

– Small TTL gives a high consistency

– Large TTL gives a faster resolution of a name
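The effect of the TTL can be shown with a very small cache. This is a sketch under simplifying assumptions: the class `TtlCache` is hypothetical, stores one address per name, and takes the clock as a parameter so that the passing of time can be simulated.

```python
import time


class TtlCache:
    """Buffer name information until its time to live has expired."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def put(self, name, address, ttl):
        # Remember the address together with its expiry time.
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            # Outdated entry: drop it so that the name is resolved again.
            del self._entries[name]
            return None
        return address


now = [0.0]                                   # simulated clock in seconds
cache = TtlCache(clock=lambda: now[0])
cache.put("metatron.informatik.rwth-aachen.de", "137.226.12.221", ttl=60)

now[0] = 30.0
print(cache.get("metatron.informatik.rwth-aachen.de"))  # 137.226.12.221
now[0] = 61.0
print(cache.get("metatron.informatik.rwth-aachen.de"))  # None
```

A small TTL makes the second case happen sooner (higher consistency, more queries); a large TTL keeps answers in the cache longer (faster resolution, possibly stale data).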

DNS Protocol

DNS defines only one protocol format, which is used both for inquiries and for responses:

Identification                 | Flag
Number of Questions            | Number of Answers RR
Number of Authority RR         | Number of Additional RR
Questions (variable number of RR)
Answers (variable number of RR)
Authority (variable number of RR)
Additional information (variable number of RR)

• Identification: 16 bits for the unique identification of an inquiry, to match requests and responses

• Flag: 4 bits, marking (1) request/response, (2) authoritative/not authoritative, (3) iterative/recursive, (4) recursion possible

• “Number of …”: number of contained inquiries or data records

• Questions: names to be resolved

• Answers: resource records answering the previous inquiry

• Authority: identification of the responsible name servers passed along the way

• Additional information: further data on the inquiry. If the name searched is only an alias, the resource record belonging to the correct name is placed here
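The fixed part of a DNS message can be packed and unpacked with the `struct` module. Note that this sketch follows RFC 1035 rather than the simplified slide: there, the flags occupy a 16-bit field that contains the four markers named above (QR, AA, RD, RA) next to further bits. The function names are illustrative.

```python
import struct

HEADER_FORMAT = "!6H"   # six 16-bit fields in network byte order (12 bytes)

QR_RESPONSE = 0x8000    # (1) request/response
AA = 0x0400             # (2) authoritative answer
RD = 0x0100             # (3) recursion desired
RA = 0x0080             # (4) recursion available


def build_query_header(identification, recursion_desired=True, questions=1):
    flags = RD if recursion_desired else 0
    # Answer, authority and additional counts are 0 in a query.
    return struct.pack(HEADER_FORMAT, identification, flags, questions, 0, 0, 0)


def parse_header(data):
    ident, flags, questions, answers, authority, additional = struct.unpack(
        HEADER_FORMAT, data[:12]
    )
    return {
        "identification": ident,
        "is_response": bool(flags & QR_RESPONSE),
        "authoritative": bool(flags & AA),
        "recursion_desired": bool(flags & RD),
        "recursion_available": bool(flags & RA),
        "counts": (questions, answers, authority, additional),
    }


header = build_query_header(0x1234)
print(header.hex())          # 123401000001000000000000
print(parse_header(header)["counts"])  # (1, 0, 0, 0)
```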

Evolution of the WWW

World Wide Web (WWW)

Access to linked documents, which are distributed over several computers in the Internet

History of the WWW

• Origin: 1989 in the nuclear research laboratory CERN in Switzerland.

• Developed to exchange data, figures, etc. between a large number of geographically distributed project partners via the Internet.

• First text-based version in 1990.

• First graphical interface (Mosaic) in February 1993, followed by Netscape, Internet Explorer…

The Client/Server model is used:

Client (a Browser)

• Presents the currently loaded WWW page

• Permits navigating in the network (e.g. by clicking on a hyperlink)

• Offers a number of additional functions (e.g. external viewers or helper applications).

• Usually, a browser can also be used for other services (e.g. FTP, e-mail, news, …).

Server

• Process which manages WWW pages.

• Is addressed by the client, e.g. by indicating a URL (Uniform Resource Locator = logical address of a web page). The server sends the requested page (or file) back to the client.

WWW, HTML, URL and HTTP

• WWW stands for World Wide Web and means the world-wide cross-linking of information and documents.

• The standard protocol used between a web server and a web client is the HyperText Transfer Protocol (HTTP).

– uses TCP port 80

– defines the allowed requests and responses

– is an ASCII protocol

• Each web page is addressed by a unique URL (Uniform Resource Locator) (e.g. http://www-i4.informatik.rwth-aachen.de/education/tcpip).

• The standard language for web documents is the HyperText Markup Language (HTML).

HTTP - Message Format

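GET http://server.name/path/file.type

The request consists of the command (GET) and the URL; the URL is made up of the protocol (http), the server domain name, the path name and the file name.

Example: GET http://www.informatik.rwth-aachen.de/info/general.html

– command: GET
– protocol: http
– server domain name: www.informatik.rwth-aachen.de
– path name: info
– file name: general.html

Instructions on a URL are

• GET: Load a web page

• HEAD: Load only the header of a web page

• PUT: Store a web page on the server

• POST: Append something to the request passed to the web server

• DELETE: Delete a web page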
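The decomposition of a URL into protocol, server domain name, path name and file name can be reproduced with Python's standard library. This is a minimal sketch; the variable names are chosen for illustration.

```python
import posixpath
from urllib.parse import urlsplit

url = "http://www.informatik.rwth-aachen.de/info/general.html"
parts = urlsplit(url)

# Split the path into the directory part and the file name.
path_name, file_name = posixpath.split(parts.path)

print(parts.scheme)    # http
print(parts.hostname)  # www.informatik.rwth-aachen.de
print(path_name)       # /info
print(file_name)       # general.html
```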

Loading of Web Pages

[Figure: a PC running a browser, a DNS server and a WWW server, connected by a TCP/IP network. Sequence of messages:]

1. Browser asks DNS for the IP address of the server
2. DNS answers
3. Browser opens a TCP connection to port 80 of the computer
4. Browser sends the command GET /info/general.html
5. WWW server sends back the file general.html
6. Connection is terminated

Example: Call of the URL http://www.informatik.rwth-aachen.de/material/general.html

1. The browser determines the URL (which was clicked or typed).

2. The browser asks the DNS for the IP address of the server www.informatik.rwth-aachen.de.

3. DNS answers with 137.226.116.241.

4. The browser opens a TCP connection to port 80 of the computer 137.226.116.241.

5. Afterwards, the browser sends the command GET /material/general.html

6. The WWW server sends back the file general.html.

7. The connection is terminated.

8. The browser analyzes the WWW page general.html and presents the text.

9. If necessary, each picture is loaded over a new connection to the server (the address is included in the page general.html in the form of a URL).

Note! Step 9 applies only to HTTP/1.0! With the newer version HTTP/1.1, all referenced pictures are loaded before the connection is terminated (more efficient for pages with many pictures).

HTTP Request Header

method sp URL sp version cr lf
header field name : value cr lf
header field name : value cr lf
:
header field name : value cr lf
cr lf
Data

sp: space; cr lf: carriage return and line feed

Request line: mandatory part, e.g.

GET path/file.type

Header lines: optional, further information on the host/document, e.g.

Host: www.rwth-aachen.de
Accept-language: fr
User-agent: Opera /6.0

Entity Body: optional; further data if the client transmits data (POST method)
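The request format can be assembled byte by byte. The following sketch only builds the message and does not open a network connection; the function name `build_request` and the default version string are assumptions made for the example.

```python
def build_request(method, path, headers, version="HTTP/1.1", body=b""):
    """Assemble request line, header lines, blank line and entity body."""
    lines = [f"{method} {path} {version}"]                 # method sp URL sp version
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"                 # cr lf after each line, then a blank line
    return head.encode("ascii") + body


request = build_request(
    "GET",
    "/material/general.html",
    {"Host": "www.informatik.rwth-aachen.de", "Accept-language": "fr"},
)
print(request.decode("ascii"))
```

Because HTTP is an ASCII protocol, the same bytes could be typed by hand into a TCP connection to port 80 of the server.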

HTTP Response Header

version sp status code sp phrase cr lf
header field name : value cr lf
header field name : value cr lf
:
header field name : value cr lf
cr lf
Data

Status line: status code and phrase indicate the result of an inquiry and an associated message, e.g.

200 OK
400 Bad Request
404 Not Found

Groups of status messages:

1xx: Only for information
2xx: Successful inquiry
3xx: Further activities are necessary
4xx: Client error (syntax)
5xx: Server error

Entity Body: inquired data

HEAD method: the server answers, but does not transmit the inquired data (debugging)

Proxy Server

A proxy is an intermediate entity used by several browsers. It takes over tasks of the browsers (complexity) and of the servers for more efficient page loading!

[Figure: browsers speak HTTP with the proxy; the proxy communicates, e.g. via HTTP, with the servers in the Internet.]

Caching of WWW pages

• A proxy temporarily stores the pages loaded by browsers. If a browser requests a page which is already in the cache, the proxy checks whether the page has changed since it was stored. If not, the page can be returned from the cache. If yes, the page is loaded normally from the server and stored in the cache again, replacing the old version.

Support when using additional protocols

• A browser also enables access to FTP, News, Gopher or telnet servers etc.

• Instead of implementing all protocols in the browser, they can be implemented in the proxy. The proxy then “speaks” HTTP with the browser and e.g. FTP with an FTP server.

Integration into a Firewall

• The proxy can deny access to certain web pages (e.g. in schools).

Early systems

A simple file transmission took place, with the convention that the first line contains the address of the receiver of the file.

Problems

E-Mail to groups, structuring of the e-mail, delegation of the administration to a secretary, file editor as user interface, no mixed media

Solution

X.400 as standard for e-mail transfer. However, this specification was too complex and badly designed. What became generally accepted was a simpler system, cobbled together “by a handful of computer science students”: the Simple Mail Transfer Protocol (SMTP).

Electronic Mail: E-Mail

[Figure: several User Agents connected through Message Transfer Agents across the Internet.]

An e-mail system generally consists of two subsystems:

• User Agent (UA, normal e-mail program)

– Usually runs on the computer of the user and helps during the processing of e-mails
– Creation of new and answering of old e-mail
– Receipt and presentation of e-mail
– Administration of received e-mail

• Message Transfer Agent (MTA, e-mail server)

– Usually runs in the background (around the clock)
– Delivery of e-mail which is sent by User Agents
– Intermediate storage of messages for users or other Message Transfer Agents

Structure of an E-Mail

For sending an e-mail, the following information is needed from the user:

• Message (usually normal text + attachments, e.g. Word file, GIF image…)

• Destination address (in general in the form mailbox@location, e.g. [email protected])

• Possibly additional parameters concerning e.g. priority or security

E-Mail formats: two standards in use

• RFC 2822

• MIME (Multipurpose Internet Mail Extensions)

With RFC 2822 an e-mail consists of

• a simple “envelope” (created by the Message Transfer Agent based on the data in the e-mail header),

• a set of header fields (each one a line of ASCII text),

• a blank line, and

• the message body.

E-Mail Header

Header          Meaning
To:             Address of the main receiver (possibly several receivers or also a mailing list)
Cc:             Carbon copy, e-mail addresses of less important receivers
Bcc:            Blind carbon copy, a receiver which is not shown to the other receivers
From:           Person who wrote the message
Sender:         Address of the actual sender of the message (possibly different from the “From” person)
Received:       One entry per Message Transfer Agent on the path to the receiver
Return-Path:    Path back to the sender (usually only the e-mail address of the sender)
Date:           Transmission date and time
Reply-To:       E-mail address to which answers are to be addressed
Message-Id:     Unique identification number of the e-mail (for later references)
In-Reply-To:    Message-Id of the message to which the answer is directed
References:     Other relevant Message-Ids
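The header fields above can be tried out with Python's standard `email` package. The following is only a minimal sketch: all addresses, the subject, and the Message-ID are made-up placeholders, not values from the lecture. It builds a message, serializes it, and splits it at the first blank line into header block and body.

```python
from email import message_from_string
from email.message import EmailMessage

# Build a message; every address and ID below is a placeholder (assumption).
msg = EmailMessage()
msg["From"] = "alice@example.org"
msg["To"] = "bob@example.com"
msg["Cc"] = "carol@example.com"
msg["Subject"] = "Meeting"
msg["Message-ID"] = "<1234@example.org>"
msg.set_content("See you at ten.")

# Serialized form: header fields, one blank line, then the body.
raw = msg.as_string()
header_block, _, body = raw.partition("\n\n")

# Parsing the text again gives access to the individual header fields.
parsed = message_from_string(raw)
recipients = [parsed["To"], parsed["Cc"]]

print(header_block)
print("---")
print(body)
```

Note that `set_content` also adds MIME headers (such as `Content-Type`) on its own, so the printed header block contains more fields than were set explicitly.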


E-Mail Header

RFC 822: only suitable for messages of pure ASCII text without special characters. Nowadays additionally required:

• E-Mail in languages with special characters (e.g. French or German)
• E-Mail in languages not using the Latin alphabet (e.g. Russian)
• E-Mail in languages not using an alphabet at all (e.g. Japanese)
• E-Mail not consisting completely of pure text (e.g. audio or video)

MIME keeps the RFC 2822 format, but additionally defines
• a structure in the Message Body (by using additional headers), and
• coding rules for non-ASCII characters.

Header                       Meaning
MIME-Version:                Identifies the MIME version used
Content-Description:         String which describes the contents of the message
Content-Id:                  Unique identifier for the contents
Content-Transfer-Encoding:   Coding which was selected for the contents of the e-mail (some networks understand e.g. only ASCII characters). Examples: base64, quoted-printable
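To illustrate the two example codings named for Content-Transfer-Encoding, here is a small sketch using Python's standard `base64` and `quopri` modules. The sample text and the choice of ISO-8859-1 as character set are assumptions made for the illustration only.

```python
import base64
import quopri

# German text with special characters, encoded as ISO-8859-1 bytes (assumption).
text = "Schöne Grüße".encode("iso-8859-1")

# base64 maps arbitrary bytes onto a 64-character ASCII alphabet.
b64 = base64.b64encode(text).decode("ascii")

# quoted-printable keeps ASCII characters readable and escapes the rest as =XX.
qp = quopri.encodestring(text).decode("ascii")

print(b64)
print(qp)

# Both codings are reversible: decoding yields the original bytes again.
restored_b64 = base64.b64decode(b64)
restored_qp = quopri.decodestring(qp)
```

base64 is the usual choice for binary data such as images, while quoted-printable suits text that is mostly ASCII, since it stays largely readable.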


MIME

MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="8323328-2120168431-824156555=:325"

--8323328-2120168431-824156555=:325
Content-Type: TEXT/PLAIN; charset=US-ASCII

A picture is in the appendix

--8323328-2120168431-824156555=:325
Content-Type: IMAGE/JPEG; name="picture.jpg"
Content-Transfer-Encoding: BASE64
Content-ID: <PINE.LNX.3.91.960212212235.325B@localhost>
Content-Description:

/9j/4AAQSkZJRgABAQEAlgCWAAD/2wBDAAEBAQEBAQEBAQEBAQEBAQIBAQEBA
QIBAQECAgICAgICAgIDAwQDAwMDAwICAwQDAwQEBAQEAgMFBQQEBQQEBAT/
2wBDAQEBAQEBAQIBAQIEAwIDBAQEBA
[…]
KKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAoooo
AKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiig
AooooAD//Z
--8323328-2120168431-824156555=:325--
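A message with the same structure as the example (a text part plus a JPEG attachment) can be produced with Python's standard `email.message.EmailMessage`. This is a sketch only: the attachment bytes are a short placeholder rather than a real image, and the library picks its own boundary string.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("A picture is in the appendix")

# Placeholder bytes standing in for real JPEG data (assumption).
fake_jpeg = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + b"\x00" * 16

# Adding an attachment turns the message into multipart/mixed.
msg.add_attachment(fake_jpeg, maintype="image", subtype="jpeg",
                   filename="picture.jpg")

parts = list(msg.iter_parts())
content_types = [part.get_content_type() for part in parts]
encoding = parts[1]["Content-Transfer-Encoding"]

print(msg.get_content_type())   # type of the whole message
print(content_types)            # types of the individual parts
print(encoding)                 # coding chosen for the binary part
```

Printing `msg.as_string()` shows the boundary lines and the base64 block in the same layout as the example above.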


E-Mail over POP3 and SMTP

Simple Mail Transfer Protocol (SMTP)

– Sending e-mails over a TCP connection (port 25)
– SMTP is a simple ASCII protocol
– Without checksums, without encryption
– The receiving machine is the server and opens the communication
– If the server is ready to receive, it signals this to the client. The client then states from whom the e-mail comes and who the receiver is. If the receiver is known to the server, the client sends the message and the server confirms the receipt.

Post Office Protocol version 3 (POP3)

– Get e-mails from the server over a TCP connection, port 110

– Commands for logging in and out, message download, deleting messages on the server (maybe without transferring them to the client)

– Only copies e-mails from the remote server to the local system



E-Mail over POP3 and SMTP

• User 1: writes an e-mail
• Client 1 (UA 1): formats the e-mail, produces the receiver list, and sends the e-mail to its mail server (MTA 1)
• Server 1 (MTA 1): sets up a connection to the SMTP server (MTA 2) of the receiver and sends a copy of the e-mail
• Server 2 (MTA 2): produces the header of the e-mail and places the e-mail into the appropriate mailbox
• Client 2 (UA 2): sets up a connection to the mail server and authenticates itself with username and password (unencrypted!)
• Server 2 (MTA 2): sends the e-mail to the client
• Client 2 (UA 2): formats the e-mail
• User 2: reads the e-mail

[Figure: User Agents and Message Transfer Agents; mail is sent via SMTP (User Agent to MTA, and MTA to MTA across the Internet) and fetched by the User Agent via POP3]


SMTP - Command Sequence

The partners (here from abc.com to beta.edu) communicate in text form as follows:

S: 220 <beta.edu> Service Ready            /* Receiver is ready */
C: HELO <abc.com>                          /* Identification of the sender */
S: 250 <beta.edu> OK                       /* Server announces itself */
C: MAIL FROM:<[email protected]>           /* Sender of the e-mail */
S: 250 OK                                  /* Sending is permitted */
C: RCPT TO:<[email protected]>             /* Receiver of the e-mail */
S: 250 OK                                  /* Receiver known */
C: DATA                                    /* The data are following */
S: 354 Start mail input; end with “<crlf>.<crlf>” on a line by itself
C: From: Krogull@…. <crlf>.<crlf>          /* Transfer of the whole e-mail, including all headers */
S: 250 OK
C: QUIT                                    /* Terminating the connection */
S: 221 <beta.edu> Server Closing

S = server, receiving MTA / C = client, sending MTA
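The order of the reply codes in this dialog can be reproduced with a toy model. The sketch below is not a real SMTP server: the function name `smtp_replies` and the sample addresses are hypothetical, no network is involved, and it does not check that commands arrive in a sensible order. It only maps each client line to the reply code used in the sequence above.

```python
def smtp_replies(client_lines):
    """Return the reply codes a toy receiving MTA gives to the client lines."""
    replies = [220]                  # the server speaks first: "Service Ready"
    in_data = False
    for line in client_lines:
        if in_data:
            if line == ".":          # a lone dot ends the message
                in_data = False
                replies.append(250)  # message accepted
            continue                 # message lines themselves get no reply
        upper = line.upper()
        if upper.startswith(("HELO", "MAIL FROM:", "RCPT TO:")):
            replies.append(250)      # OK
        elif upper == "DATA":
            in_data = True
            replies.append(354)      # start mail input
        elif upper == "QUIT":
            replies.append(221)      # server closing
            break
        else:
            replies.append(500)      # command not recognized
    return replies


# Sample session; the addresses are placeholders (assumption).
session = [
    "HELO abc.com",
    "MAIL FROM:<sender@abc.com>",
    "RCPT TO:<receiver@beta.edu>",
    "DATA",
    "From: sender@abc.com",
    "",
    "Hello!",
    ".",
    "QUIT",
]
codes = smtp_replies(session)
print(codes)
```

For real mail transfer, Python's standard `smtplib` module implements the client side of this dialog.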


[Figure: PC client (UA) and POP3 server (MTA) exchange commands and replies over a TCP connection (port 110) across a TCP/IP network; the server opens with a greeting]

POP3

Get e-mails from the server by means of POP3:

• Authorizing phase: USER name, PASS string
• Transaction phase: STAT, LIST [msg], RETR msg, DELE msg, NOOP, RSET, QUIT

Minimal protocol with only two command types:
• Copy e-mails to the local computer
• Delete e-mails from the server


POP3 Protocol

Authorizing phase
• user identifies the user
• pass is the user's password
• +OK or -ERR are the possible server answers

Transaction phase
• list lists the message numbers and the message sizes
• retr requests a message by its number
• dele deletes the corresponding message

S: +OK POP3 server ready
C: user alice
S: +OK
C: pass hungry
S: +OK user successfully logged in
C: list
S: 1 498
S: 2 912
S: .
C: retr 1
S: <message 1 contents>
S: .
C: dele 1
C: retr 2
S: <message 2 contents>
S: .
C: dele 2
C: quit
S: +OK
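The two phases can be modelled with a small in-memory class. This is a hedged sketch, not a POP3 implementation: the class `ToyPop3Server` and its `handle` method are hypothetical names, there is no network, and the replies follow the abbreviated form of the transcript above, with `+OK` also returned for `dele` as an assumption.

```python
class ToyPop3Server:
    """In-memory model of the authorizing and transaction phases."""

    def __init__(self, user, password, messages):
        self.user = user
        self.password = password
        self.messages = dict(enumerate(messages, start=1))  # number -> text
        self.pending_user = None
        self.authorized = False
        self.deleted = set()         # messages marked for deletion

    def handle(self, line):
        """Process one client line and return the server's reply lines."""
        cmd, _, arg = line.partition(" ")
        cmd = cmd.lower()
        if not self.authorized:      # authorizing phase
            if cmd == "user":
                self.pending_user = arg
                return ["+OK"]
            if cmd == "pass":
                if self.pending_user == self.user and arg == self.password:
                    self.authorized = True
                    return ["+OK user successfully logged in"]
                return ["-ERR invalid login"]
            return ["-ERR not logged in"]
        if cmd == "list":            # transaction phase from here on
            sizes = [f"{n} {len(text)}" for n, text in self.messages.items()
                     if n not in self.deleted]
            return sizes + ["."]
        if cmd in ("retr", "dele"):
            number = int(arg) if arg.isdigit() else 0
            if number not in self.messages or number in self.deleted:
                return ["-ERR no such message"]
            if cmd == "retr":
                return [self.messages[number], "."]
            self.deleted.add(number)
            return ["+OK"]
        if cmd == "quit":            # marked messages are removed on quit
            for number in self.deleted:
                del self.messages[number]
            return ["+OK"]
        return ["-ERR unknown command"]


# Mailbox with two messages of 498 and 912 characters, as in the transcript.
server = ToyPop3Server("alice", "hungry", ["a" * 498, "b" * 912])
for command in ["user alice", "pass hungry", "list", "dele 1", "list", "quit"]:
    print("C:", command)
    for reply in server.handle(command):
        print("S:", reply[:40])
```

Against a real server, Python's standard `poplib` module provides the corresponding client commands.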


IMAP as POP3 “Variant”

Enhancement of POP3: IMAP (Internet Message Access Protocol)
• TCP connection over port 143
• E-Mails are not downloaded and stored locally, but remain on the server
• The client performs all actions remotely. This is suitable for users who need access to their e-mails from different hosts
• The protocol is more complex than POP3: it can set up and manage remote mailboxes

Meanwhile, many operators of web pages also offer e-mail services: gmx, web.de, yahoo, …
Here, HTTP again serves as the protocol for access to the e-mails. The management is similar to IMAP, except that the client is integrated into the web server.


Conclusion

IP is the core protocol which enables the Internet
• IPv4 still in use
• Lots of “helper” protocols, e.g. RSVP for connection-oriented communication, DHCP for mobile devices, …
• IPv6 would make lots of things easier, but migration is hard…

TCP and UDP as two different transport protocols
• TCP is connection-oriented, UDP is connectionless
• Several other protocols like RTP to fill the gap between them for today’s needs

Application protocols
• Nothing to do with “physical” communication
• Dealing more with “contents” of communication
• All based on the client/server paradigm
