• No results found

Distributed Systems Principles and Paradigms

N/A
N/A
Protected

Academic year: 2021

Share "Distributed Systems Principles and Paradigms"

Copied!
11
0
0

Loading.... (view fulltext now)

Full text

(1)

Distributed Systems

Principles and Paradigms

Chapter 12

(version October 15, 2007)

Maarten van Steen

Vrije Universiteit Amsterdam, Faculty of Science Dept. Mathematics and Computer Science

Room R4.20. Tel: (020) 598 7784

E-mail:[email protected], URL: www.cs.vu.nl/∼steen/

01 Introduction 02 Architectures 03 Processes 04 Communication 05 Naming 06 Synchronization

07 Consistency and Replication 08 Fault Tolerance

09 Security

10 Distributed Object-Based Systems 11 Distributed File Systems

12 Distributed Web-Based Systems 13 Distributed Coordination-Based Systems

00 – 1 /

Distributed Web-Based Systems

Essence: The WWW is a huge client-server system with millions of servers; each server hosting thousands ofhyperlinkeddocuments: Client machine Browser OS Server machine Web server

1. Get document request (HTTP) 3. Response

2. Server fetches document from local file

Documents are generally represented in text (plain

text, HTML, XML)

Alternative types: images, audio, video, but also

applications (PDF, PS)

• Documents may contain scripts that are executed

by the client-side software

(2)

Multi-tiered Architectures

Observation:Already very soon, Web sites were or-ganized into three tiers:

Web server CGI process Database server CGI

program 1. Get request

3. Start process to fetch document

5. HTML document created HTTP request handler 6. Return result 4. Database interaction

12 – 2 Distributed Web-Based Systems/12.1 Architecture

Web Services

Observation: At a certain point, people started

rec-ognizing that it is was more than justuser ↔site

in-teraction: sites could offerservicesto other sites⇒

standardizationis then badly needed.

Service description (WSDL) Client machine Client application Stub Server application Stub Communication subsystem Communication subsystem SOAP

Service description (WSDL)Service description (WSDL)

Directory service (UDDI)

Publish service Look up a service Generate stub from WSDL description Server machine Generate stub from WSDL description

(3)

Clients: Web browsers

Observation: browsers form the Web’s most

impor-tant client-side sofware. Theyusedto be simple, but

that is long ago.

User interface Browser engine Rendering engine Network comm. HTML/XML parser Displa y bac k end Client-side script interpreter

12 – 4 Distributed Web-Based Systems/12.2 Processes

Apache Web Server

Observation: More than 70% of all Web sites are based on Apache. The server is internally organized more or less according to the steps needed to process an HTTP request:

Hook Hook Hook Hook

Function

... ... ...

Module Module Module

Apache core Functions called per hook

Link between function and hook

Request Response

(4)

Server Clusters (1/2)

Essence: To improve performance and availability, WWW servers are often clustered in a way that is transparent to clients: Front end Web server Web server Web server Web server Request Response Front end handles all incoming requests and outgoing responses

LAN

Problem: The front end may easily get overloaded, so that special measures need to be taken.

Transport-layer switching: Front end simply passes the TCP request to one of the servers, taking some performance metric into account.

Content-aware distribution: Front end reads the con-tent of the HTTP request and then selects the best server.

12 – 6 Distributed Web-Based Systems/12.2 Processes

Server Clusters (2/2)

Question: Why can content-aware distribution be so much better? Switch Client Web server Web server Distributor Distributor Dis-patcher

1. Pass setup request

to a distributor 2. Dispatcher selectsserver 3. Hand of f TCP connection 4. Inform switch Setup request Other messages 5. Forward other messages 6. Server responses

(5)

Communication (1/2)

Essence:Communication in the Web is generally based on HTTP; a relatively simple client-server transfer pro-tocol having the following request messages:

Operation

Description

Head Request to return the header of a document Get Request to return a document to the client Put Request to store a document

Post Provide data that are to be added to a docu-ment (collection)

Delete Request to delete a document

12 – 8 Distributed Web-Based Systems/12.3 Communication

Communication (2/2)

Header C/S Contents

Accept C The type of documents the client can handle Accept-Charset C The character sets are acceptable for the client

Accept-Encoding

C The document encodings the client can handle

Accept-Language

C The natural language the client can handle Authorization C A list of the client’s credentials

WWW-Authenticate

S Security challenge the client should respond to Date C+S Date and time the message was sent

ETag S The tags associated with the returned document Expires S The time for how long the response remains valid From C The client’s e-mail address

Host C The TCP address of the document’s server If-Match C The tags the document should have If-None-Match C The tags the document should not have

If-Modified-Since

C Tells the server to return a document only if it has been modified since the specified time

If-Unmodified-Since

C Tells the server to return a document only if it has not been modified since the specified time Last-Modified S The time the returned document was last modified Location S A document reference to which the client should

redirect its request

Referer C Refers to client’s most recently requested document Upgrade C+S The application protocol sender wants to switch to Warning C+S Information about status of the data in the message

(6)

SOAP

Simple Object Access Protocol: Based on XML, this is the standard protocol for communication be-tween Web services.

• SOAP isboundto an underlying protocol (i.e., it

is not independent from itscarrier)

Conversational exchange style: Send a

docu-mentone way, get a filled-in response back.

RPC-style exchange:Used toinvoke a Web ser-vice.

12 – 10 Distributed Web-Based Systems/12.3 Communication

A Note on XML

Observation:XML has the advantage of allowing self-describing documents. Full stop (i.e., it introduces performance problems and is not meant to be read by human beings) env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope"> <env:Header> <n:alertcontrol xmlns:n="http://example.org/alertcontrol"> <n:priority>1</n:priority> <n:expires>2001-06-22T14:00:00-05:00</n:expires> </n:alertcontrol> </env:Header> <env:Body> <m:alert xmlns:m="http://example.org/alert"> <m:msg>Pick up Mary at school at 2pm</m:msg> </m:alert>

</env:Body> </env:Envelope>

(7)

Naming: URL

URL:Uniform Resource Locator tells how and where

to access a resource.

Scheme Host name Pathname

Scheme Host name Port Pathname

Scheme Host name Port Pathname http http http :// :// :// www.cs.vu.nl www.cs.vu.nl 130.37.24.11 : : 80 80 /home/steen/mbox /home/steen/mbox /home/steen/mbox (a) (b) (c) Examples: http HTTP http://www.cs.vu.nl:80/globe mailto Mail mailto:[email protected]

ftp FTP ftp://ftp.cs.vu.nl/pub/minix/README file Local file file:/edu/book/work/chp/11/11 data Inline data data:text/plain;charset=iso-8859-7,

%e1%e2%e3 telnet Remote login telnet://flits.cs.vu.nl tel Telephone tel:+31201234567

modem Modem modem:+31201234567;type=v32

12 – 12 Distributed Web-Based Systems/12.4

Synchronization: WebDAV

Problem: There is a growing need for collaborative auditing of Web documents, but bare-bones HTTP can’t help here. Solution: Web Distributed Authoring and Versioning.

• Supports exclusive and shared write locks, which

operate on entire documents

• A lock is passed by means of a lock token; the

server registers the client(s) holding the lock

Clients modify the document locally and post it

back to the server along with the lock token

Note:There is no specific support for crashed clients holding a lock.

(8)

Web Proxy Caching

Basic idea:Sites install a separateproxy serverthat handles all outgoing requests. Proxies subsequently cache incoming documents. Cache-consistency pro-tocols:

• Always verify validity by contacting server

• Age-based consistency:

Texpire=α·(TcachedTlast modi f ied) +Tcached

• Cooperative caching, by which you first check your

neighbors on a cache miss:

Web proxy Web server Web proxy Web proxy Cache Cache Cache Client Client Client Client Client Client Client Client Client 2. Ask neighboring proxy caches

1. Look in local cache

HTTP Get request

3. Forward request to Web server

12 – 14 Distributed Web-Based Systems/12.6 Consistency and Replication

Replication in Web Hosting

Systems

Observation:By-and-large, Web hosting systems are adopting replication to increase performance. Much research is done to improve their organization.

Fol-lows the lines ofself-managingsystems:

Web hosting system

Metric estimation Analysis +/-Reference input Initial configuration

Uncontrollable parameters (disturbance / noise)

Observed output Measured output Adjustment triggers Corrections Replica placement Consistency enforcement Request routing

(9)

Handling Flash Crowds

Observation:We needdynamic adjustmentto

bal-ance resource usage. Flash crowdsintroduce a

se-rious problem:

(a) (b)

(c) (d)

2 days 2 days

6 days 2.5 days

12 – 16 Distributed Web-Based Systems/12.6 Consistency and Replication

Server Replication

Content Delivery Network: CDNs act as Web host-ing services to replicate documents across the Inter-net providing their customers guarantees on high avail-ability and performance (example: Akamai).

Origin server Client CDN server CDN DNS server Regular DNS system Cache

1. Get base document 2. Document with refs to embedded documents

6. Get embedded documents (if not already cached)

5. Get embedded documents 7. Embedded documents Return IP address client-best server DNS lookups 3 4

Question: How would consistency be maintained in this system?

(10)

Replication of Web Apps. (1/3)

Observation:Replication becomes more difficult when dealing with databses and such. No single best solu-tion.

Authoritative database

Schema Schema

Server query Server

response

full/partial data replication

full schema replication/ query templates Content-blind cache Content-aware cache Database copy Client

Edge-server side Origin-server side

Assumption:Updates are carried out atorigin server, and propagated to edge servers.

12 – 18 Distributed Web-Based Systems/12.6 Consistency and Replication

Replication of Web Apps. (2/3)

Authoritative database

Schema Schema

Server query Server

response

full/partial data replication

full schema replication/ query templates Content-blind cache Content-aware cache Database copy Client

Edge-server side Origin-server side

Full replication: high read/write ratio, often in

combination withcomplex queries.Note:

replica-tion may possibly speed-down performance when R/W ratio goes down.

Partial replication: high read/write ratio, but in

combination withsimple queries

(11)

Replication of Web Apps. (3/3)

Authoritative database

Schema Schema

Server query Server

response

full/partial data replication

full schema replication/ query templates Content-blind cache Content-aware cache Database copy Client

Edge-server side Origin-server side

Content-aware caching:Check for queries at

lo-cal database, and subscribe for invalidations at

the server. Works good with range queries and

complex queries.

Content-blind caching:Simply cache the result

of previous queries. Works great withsimple queries

that address unique results (e.g., no range queries). 12 – 20 Distributed Web-Based Systems/12.6 Consistency and Replication

Security: TLS (SSL)

Transport Layer Security: Modern version of the the Secure Socket Layer (SSL), which “sits” between transport layer and application protocols. Relatively simple protocol that can support mutual authentica-tion using certificates:

Client Server [ K [ K + + S C CA CA ] ] ([ R ]C KS+ ) Possibilities Choices 1 2 3 4 5

References

Related documents

full schema replication/ query templates Content-blind cache Content-aware cache Database copy Client Edge-server side Authoritative database Schema Web server query Origin-server

Six different possibilities were evaluated on acceptance by the respondents (general budget, new roads, improve public transport, abandon existing car taxation, lower fuel taxes,

Treatments that have accepted off-label for treatment of substance misuse within NHS Fife Addiction Services are:  Baclofen for use in alcohol dependence..  Diazepam

Soil heat flux was measured at a depth of 0.06m below the soil surface using two soil heat flux plates (Radiation and Energy Balance Systems, Inc., Seattle, Wash.) on either side

Because you not only have God’s written WORD to read, you have the Holy Spirit living inside you to quicken that WORD and fulfill the promise Jesus made in John 16: “When he,

The accuracy of the Bolot reward value for the synthetic traces was studied by plotting the actual percentage loss after reconstruction against the percentage loss calculated by

Christopher Westra is the author of &#34;I Create Reality - Beyond Visualization:&#34; How You Can Use Holographic Creation to Manifest Your Desires!. To find out why some

Abstract:  On  the  basis  of  experimental  studies,  the  operational  power  of  four  borehole