Academic year: 2021

Distributed Systems

Principles and Paradigms

Maarten van Steen

VU Amsterdam, Dept. Computer Science

[email protected]

Chapter 12: Distributed Web-Based Systems

Version: December 10, 2012

Distributed Web-Based Systems 12.1 Architecture

Distributed Web-based systems

Essence

The WWW is a huge client-server system with millions of servers, each server hosting thousands of hyperlinked documents.

Documents are often represented in text (plain text, HTML, XML)

Alternative types: images, audio, video, applications (PDF, PS)

Documents may contain scripts, executed by client-side software

Figure: a browser on the client machine (on top of the client OS) talks to a Web server on the server machine.

1. Get document request (HTTP)

2. Server fetches document from local file

3. Response

Multi-tiered architectures

Observation

Very early on, Web sites were organized into three tiers.

Figure: three tiers consisting of a Web server (with its HTTP request handler), a CGI process running a CGI program, and a database server.

1. Get request

3. Start process to fetch document

4. Database interaction

5. HTML document created

6. Return result
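
The three tiers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain function call stands in for starting a CGI process, an in-memory SQLite database stands in for the database server, and the names make_database, cgi_program, and handle_request are hypothetical.

```python
import sqlite3

def make_database():
    """Third tier: an in-memory database holding document content."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE docs (name TEXT PRIMARY KEY, body TEXT)")
    db.execute("INSERT INTO docs VALUES ('index', 'Hello from the database tier')")
    db.commit()
    return db

def cgi_program(db, name):
    """Second tier: application logic that queries the database and builds HTML."""
    row = db.execute("SELECT body FROM docs WHERE name = ?", (name,)).fetchone()
    if row is None:
        return 404, "<html><body>Not found</body></html>"
    return 200, f"<html><body>{row[0]}</body></html>"

def handle_request(db, path):
    """First tier: the request handler maps a URL path to a program invocation.
    A real Web server would start a separate CGI process here."""
    name = path.strip("/") or "index"
    return cgi_program(db, name)
```

Calling handle_request(make_database(), "/") returns status 200 together with an HTML page built from the database row; an unknown path returns status 404.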

Web services

Observation

At a certain point, people started recognizing that there was more to the Web than just user ↔ site interaction: sites could offer services to other sites ⇒ standardization is then badly needed.

Figure: a client application on the client machine and a server application on the server machine each use a stub on top of a communication subsystem, and the two sides communicate through SOAP. The service description (WSDL) is published in a directory service (UDDI), where the client looks up a service. Both sides generate their stub from the WSDL description.

Distributed Web-Based Systems 12.2 Processes

Apache Web server

Observation: More than 52% of all 185 million Web sites run Apache.

The server is internally organized more or less according to the steps needed to process an HTTP request.

Figure: the Apache core takes a request through a sequence of hooks and produces a response. Modules provide functions, each function is linked to a hook, and the core calls the functions registered per hook.
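
The hook idea can be illustrated with a toy core in Python. This is not Apache's actual API: the class Core, the three phase names, and the two module functions are assumptions made up for the sketch. It only shows how functions supplied by modules are linked to hooks and then called by the core in phase order.

```python
from collections import defaultdict

class Core:
    """Toy core that runs request-processing phases (hooks) in a fixed order."""
    HOOKS = ("translate_name", "check_access", "handler")  # hypothetical phases

    def __init__(self):
        self._functions = defaultdict(list)

    def register(self, hook, function):
        """Link a module's function to one of the hooks."""
        if hook not in self.HOOKS:
            raise ValueError(f"unknown hook: {hook}")
        self._functions[hook].append(function)

    def process(self, request):
        """Call every registered function, hook by hook, and return the response."""
        for hook in self.HOOKS:
            for function in self._functions[hook]:
                function(request)
        return request.get("response")

# Two hypothetical module functions.
def translate(request):
    request["file"] = "/var/www" + request["path"]

def serve(request):
    request["response"] = "contents of " + request["file"]
```

Because the core, not the module, fixes the order of the phases, the functions can be registered in any order and still run in request-processing order.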

Server clusters

Essence

To improve performance and availability, WWW servers are often clustered in a way that is transparent to clients.

Figure: a front end and several Web servers connected by a LAN. The front end handles all incoming requests and outgoing responses.

Server clusters

Problem

The front end may easily get overloaded, so that special measures need to be taken.

Transport-layer switching: Front end simply passes the TCP request to one of the servers, taking some performance metric into account.

Content-aware distribution: Front end reads the content of the HTTP request and then selects the best server.
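
The difference between the two approaches can be sketched as two selection functions. Both policies are assumptions chosen for illustration: the load metric is taken to be a number per server (for example open connections), and "best server" is approximated by hashing the requested path so that the same document always lands on the same server.

```python
import hashlib

def transport_layer_switch(loads):
    """Pick a server from a load metric alone, without reading the request."""
    return min(loads, key=loads.get)

def content_aware_distribute(path, servers):
    """Pick a server from the requested path, so that requests for the same
    document go to the same server and are likely served from its cache."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]
```

With loads {"s1": 5, "s2": 2, "s3": 9}, the transport-layer switch picks "s2" regardless of what is requested, whereas the content-aware distributor needs the HTTP request before it can decide.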

Server Clusters

Question

Why can content-aware distribution be so much better?

Figure: a client, a switch, two distributors, a dispatcher, and two Web servers. The setup request and the other messages of a connection are handled as follows.

1. Pass setup request to a distributor

2. Dispatcher selects server

3. Hand off TCP connection

4. Inform switch

5. Forward other messages

6. Server responses

Distributed Web-Based Systems 12.6 Consistency and Replication

Web proxy caching

Basic idea

Sites install a separate proxy server that handles all outgoing requests. Proxies subsequently cache incoming documents. Cache-consistency protocols:

Always verify validity by contacting server

Age-based consistency: $T_{\text{expire}} = \alpha \cdot (T_{\text{cached}} - T_{\text{last modified}}) + T_{\text{cached}}$
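
The age-based rule translates directly into code. In this sketch the function names and the default value of alpha are assumptions; the slide does not prescribe a value for α. Times are plain numbers, for example seconds since some epoch.

```python
def expiry_time(t_cached, t_last_modified, alpha=0.2):
    """Age-based expiry: the older the document was when it was cached,
    the longer the cached copy is assumed to stay valid."""
    return alpha * (t_cached - t_last_modified) + t_cached

def is_fresh(now, t_cached, t_last_modified, alpha=0.2):
    """True while the cached copy may be served without contacting the server."""
    return now < expiry_time(t_cached, t_last_modified, alpha)
```

For a document last modified at time 0 and cached at time 1000, alpha = 0.5 gives an expiry time of 0.5 · (1000 − 0) + 1000 = 1500.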

Web proxy caching

Basic idea (cont'd)

Cooperative caching, by which you first check your neighbors on a cache miss

Figure: three Web proxies, each with its own cache and its own group of clients, in front of a Web server. On an HTTP Get request:

1. Look in local cache

2. Ask neighboring proxy caches

3. Forward request to Web server
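
The three lookup steps of cooperative caching can be sketched as follows. The Proxy class is hypothetical, a dictionary stands in for the Web server, and consistency checks are left out entirely.

```python
class Proxy:
    """Toy cooperative proxy cache; `origin` is a dict standing in for the Web server."""

    def __init__(self, name, origin):
        self.name = name
        self.origin = origin
        self.cache = {}
        self.neighbors = []

    def get(self, url):
        # 1. Look in local cache.
        if url in self.cache:
            return self.cache[url], "local cache"
        # 2. Ask neighboring proxy caches.
        for neighbor in self.neighbors:
            if url in neighbor.cache:
                self.cache[url] = neighbor.cache[url]
                return self.cache[url], f"neighbor {neighbor.name}"
        # 3. Forward request to the Web server.
        self.cache[url] = self.origin[url]
        return self.cache[url], "origin server"
```

With two proxies that are each other's neighbors, the first request for a document goes to the origin server, a later request at the other proxy is answered by the neighbor, and a repeat of that request is answered locally.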

Replication in Web hosting systems

Observation

By and large, Web hosting systems are adopting replication to increase performance. Much research is being done to improve their organization, which follows the lines of self-managing systems.

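Figure: a feedback control loop around the Web hosting system. Metric estimation turns the observed output into a measured output, analysis compares it (+/-) with the reference input, and the resulting adjustment triggers lead to corrections in replica placement, consistency enforcement, and request routing. The system starts from an initial configuration and is subject to uncontrollable parameters (disturbance / noise).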

Handling flash crowds

Observation

We need dynamic adjustment to balance resource usage. Flash crowds introduce a serious problem.

Figure: four panels, (a) to (d). Time spans shown: 2 days, 2 days, 6 days, and 2.5 days.

Server replication

Content Delivery Network

CDNs act as Web hosting services that replicate documents across the Internet, providing their customers guarantees on high availability and performance (example: Akamai).

Figure: a client, the origin server, a CDN server with a cache, the CDN DNS server, and the regular DNS system.

1. Get base document

2. Document with refs to embedded documents

3, 4. DNS lookups; return IP address of client-best server

5. Get embedded documents

6. Get embedded documents (if not already cached)

7. Embedded documents
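
The DNS-based redirection step can be sketched as a lookup that returns a different address depending on who asks. Everything here is an assumption for illustration: "client-best" is taken to mean lowest estimated latency, clients are grouped into coarse regions, and the host name, addresses (from documentation ranges), and latencies are made up.

```python
def resolve(hostname, client_region, replicas, latency_ms):
    """Return the IP address of the replica with the lowest estimated
    latency to the client's region (one possible notion of 'client-best')."""
    candidates = replicas[hostname]
    return min(candidates, key=lambda ip: latency_ms[(client_region, ip)])

# Hypothetical data: documentation-range IP addresses and made-up latencies.
REPLICAS = {"cdn.example.com": ["192.0.2.10", "198.51.100.20"]}
LATENCY_MS = {
    ("eu", "192.0.2.10"): 15, ("eu", "198.51.100.20"): 120,
    ("us", "192.0.2.10"): 110, ("us", "198.51.100.20"): 20,
}
```

The same host name thus resolves to 192.0.2.10 for a client in region "eu" and to 198.51.100.20 for a client in region "us".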

Replication of Web applications

Observation

Replication becomes more difficult when dealing with databases and such. No single best solution.

Assumption

Updates are carried out at the origin server, and propagated to edge servers.

Replication of Web applications: normal

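Figure: the client contacts a Web server on the edge-server side, which runs application logic on top of a content-blind cache, a content-aware cache, or a database copy (with schema). Queries go to, and responses come back from, the origin-server side, where a Web server with application logic uses the authoritative database and its schema. Between the two sides: full/partial data replication and full schema replication/query templates.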

Replication of Web applications

Alternative solutions

Full replication: high read/write ratio, often in combination with complex queries.

Partial replication: high read/write ratio, but in combination with simple queries.

Content-aware caching: Check for queries at local database, and subscribe for invalidations at the server. Works well with range queries and complex queries.

Content-blind caching: Simply cache the result of previous queries. Works great with simple queries that address unique results (e.g., no range queries).
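
Content-blind caching is the simplest of the four to sketch: the edge server stores query results under the literal query text and its parameters, without understanding the query. The class name and the hit counter are assumptions for this illustration, an in-memory SQLite database stands in for the authoritative database, and invalidation on updates is deliberately left out.

```python
import sqlite3

class ContentBlindCache:
    """Caches results keyed by (query text, parameters); it never inspects the query."""

    def __init__(self, origin_db):
        self.origin_db = origin_db   # stands in for the authoritative database
        self.results = {}
        self.hits = 0

    def query(self, sql, params=()):
        key = (sql, tuple(params))
        if key in self.results:
            self.hits += 1
            return self.results[key]
        # Cache miss: forward the query to the origin and remember the result.
        rows = self.origin_db.execute(sql, params).fetchall()
        self.results[key] = rows
        return rows
```

Only an identical query with identical parameters produces a hit. A range query that overlaps an earlier one is still a miss, which is why this approach suits simple queries that address unique results.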

Question

What can be said about replication vs. performance?

Replication Web apps.: full/partial replication

Figure: the edge-server/origin-server architecture of the normal case (same components and labels), repeated here for full/partial replication.

Replication Web apps.: content-aware caching

Figure: the edge-server/origin-server architecture of the normal case (same components and labels), repeated here for content-aware caching.

Replication Web apps.: content-blind caching

Figure: the edge-server/origin-server architecture of the normal case (same components and labels), repeated here for content-blind caching.
