4/8/2013 CSC 257/457 - Spring 2013 1

Scalable Internet Services and Load Balancing

Kai Shen

Internet Services

The Internet brings ubiquitous connectivity

Internet‐based applications/services are accessible to online users through the Internet

Applications can be designed and launched quickly and are immediately open to many potential users

Trends:

Centralization of applications/services

Scalability requirements: many simultaneous user accesses, large amounts of hosted data, …

Search Engine as an Example: Step 1 – Crawling

Crawling – get all these Web pages out there:

First retrieve some root pages;

Parse their content and follow hyperlinks to retrieve more pages;

Depth‐first search or breadth‐first search?  
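To make the depth-first versus breadth-first question concrete, here is a minimal sketch of the crawl loop. It assumes a hypothetical in-memory link graph (`TOY_WEB`) in place of real HTTP fetching and parsing; the only difference between the two strategies is which end of the frontier the next URL is taken from.

```python
from collections import deque

# Hypothetical toy "web": each URL maps to the hyperlinks found on that page.
TOY_WEB = {
    "root": ["a", "b"],
    "a": ["c", "root"],
    "b": ["c", "d"],
    "c": [],
    "d": ["a"],
}


def crawl(roots, fetch_links, breadth_first=True):
    """Return the pages in the order they are retrieved."""
    frontier = deque(roots)
    seen = set(roots)  # a URL is enqueued at most once
    order = []
    while frontier:
        # Queue (oldest first) gives breadth-first; stack (newest first) gives depth-first.
        url = frontier.popleft() if breadth_first else frontier.pop()
        order.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order


def toy_fetch(url):
    return TOY_WEB.get(url, [])


bfs_order = crawl(["root"], toy_fetch)
dfs_order = crawl(["root"], toy_fetch, breadth_first=False)
print(bfs_order)  # ['root', 'a', 'b', 'c', 'd']
print(dfs_order)  # ['root', 'b', 'd', 'c', 'a']
```

Breadth-first order reaches pages close to the roots first, while depth-first order follows one chain of links as far as it goes before backing up.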

Performance Analysis for Crawling

What are the resources involved?

CPU processing for TCP/HTTP protocol handling and the parsing of page content

writing to disk storage

network bandwidth to remote web sites

Assume average page size 10KB

raw processing power of a single CPU: 1000 pages/sec

I/O to a single disk: 100 seeks/sec → up to 100 pages/sec

network bandwidth from/to the Internet:

T1 link (1.5Mbit/s) → 12 pages/sec

T3 link (45Mbit/s) → 360 pages/sec
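The analysis above amounts to finding the slowest resource. The sketch below takes the per-resource rates from the slide and reports the bottleneck for each link type. It also computes the raw upper bound for each link, assuming "10KB" means 10,000 bytes and ignoring all protocol overhead. The slide's link figures (12 and 360 pages/sec) are lower than these raw bounds; the slides do not say why, and extra per-page overhead is only one possible explanation.

```python
PAGE_BYTES = 10_000  # assumption: "10KB" read as 10,000 bytes


def raw_pages_per_sec(link_bits_per_sec, page_bytes=PAGE_BYTES):
    """Upper bound on pages/sec over a link, ignoring protocol overhead."""
    return link_bits_per_sec / (page_bytes * 8)


def bottleneck(limits):
    """Return (resource, pages_per_sec) for the slowest resource."""
    name = min(limits, key=limits.get)
    return name, limits[name]


t1_raw = raw_pages_per_sec(1.5e6)  # 18.75
t3_raw = raw_pages_per_sec(45e6)   # 562.5

# Per-resource limits in pages/sec, as given on the slide.
with_t1 = bottleneck({"cpu": 1000, "disk": 100, "network": 12})
with_t3 = bottleneck({"cpu": 1000, "disk": 100, "network": 360})
print(with_t1)  # ('network', 12)
print(with_t3)  # ('disk', 100)
```

With a T1 link the network limits the crawler; with a T3 link the single disk becomes the limit, which is one motivation for spreading the crawl over several machines.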

Parallel Crawling

Challenge

Avoid redundant crawling
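One common way to avoid redundant crawling is to give every URL exactly one owning crawler, chosen by hashing. The slides do not name a specific scheme, so the sketch below is an assumption: it hashes the host name, so that all pages of a site land on the same crawler, and each crawler forwards links it does not own. The function names are hypothetical.

```python
import hashlib
from urllib.parse import urlsplit


def owner(url, num_crawlers):
    """Map a URL to the index of the crawler responsible for it."""
    host = urlsplit(url).netloc.lower()
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    # A stable hash (unlike built-in hash()) gives the same answer on every machine.
    return int.from_bytes(digest[:8], "big") % num_crawlers


def split_links(my_index, links, num_crawlers):
    """Separate discovered links into ones to crawl locally and ones to forward."""
    mine, forward = [], []
    for link in links:
        (mine if owner(link, num_crawlers) == my_index else forward).append(link)
    return mine, forward


links = ["http://example.com/a", "http://EXAMPLE.com/b", "http://example.org/"]
mine, forward = split_links(owner(links[0], 4), links, 4)
```

Because ownership depends only on the URL, two crawlers that discover the same link independently agree on who should fetch it, and each crawler only needs a local "seen" set for its own share.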

Search Engine as an Example: Step 2 – Indexing

Indexing

Crawled raw web pages are not easy to search.

We index them into formats that are easy to search.

As part of indexing, we give each page an ID using a hash function.

Fast intersection?

Computer: Page #123, Page #357, …

Networks: Page #124, Page #468, …
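The word-to-pages structure above is an inverted index, and a multi-word query needs the intersection of two page lists. A minimal sketch follows, assuming the lists are kept sorted by page ID so they can be intersected in one linear merge pass. The page contents are hypothetical toy data (page #357 is deliberately placed in both lists so that the intersection is non-empty), and page IDs are plain integers rather than hash values.

```python
def build_index(pages):
    """Build word -> sorted list of page IDs from page_id -> text."""
    index = {}
    for page_id in sorted(pages):  # visiting IDs in order keeps each list sorted
        for word in set(pages[page_id].lower().split()):
            index.setdefault(word, []).append(page_id)
    return index


def intersect(a, b):
    """Intersect two sorted ID lists in O(len(a) + len(b)) time."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


pages = {
    123: "computer architecture",
    124: "networks and routing",
    357: "computer networks",
    468: "networks",
}
index = build_index(pages)
both = intersect(index["computer"], index["networks"])
print(both)  # [357]
```

Keeping the lists sorted is what makes the intersection fast: neither list has to be searched repeatedly.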

Search Engine as an Example: Step 3 – Online Search

[Figure: online search architecture. Components: index server, page server, web server/query handler, firewall, Internet, local-area network]

Scalability, reliability

Partitioning and Replication 

[Figure: partitioned and replicated architecture. Components: index servers (partition 1), index servers (partition 2), page servers, web server/query handlers, local-area network, firewall/router, Internet]
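A minimal sketch of how a query handler might use partitioned and replicated index servers: the query goes to one replica of every partition, and the partial results are merged. The slides do not describe the actual dispatch logic, so the data layout (each partition as a list of identical replica dictionaries) and the random replica choice are assumptions.

```python
import random


def search(partitions, word, rng=random):
    """Query one replica of each partition and merge the partial results.

    partitions: list of partitions; each partition is a list of replicas,
    and every replica of a partition holds the same word -> page IDs shard.
    """
    hits = []
    for replicas in partitions:
        shard = rng.choice(replicas)  # any replica of this partition can answer
        hits.extend(shard.get(word, []))
    return sorted(hits)


shard1 = {"computer": [123]}
shard2 = {"computer": [357], "networks": [468]}
partitions = [
    [shard1, dict(shard1)],  # partition 1, two replicas
    [shard2, dict(shard2)],  # partition 2, two replicas
]
result = search(partitions, "computer")
print(result)  # [123, 357]
```

Partitioning splits the data so each server holds less of it; replication adds copies so more queries can be served at once and a failed replica can be skipped.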

Load Balancing over Internet Servers

Popular sites like Google or Facebook receive tens or hundreds of millions of hits per day.

A large number of replicated servers are used at these sites.

Key question: how to balance client requests over these servers?

Goals:

Performance

Ease of deployment

Load Balancing on Internet Servers: Technique 1 ‐ DNS Rotation

[Figure: DNS rotation. The DNS server for Facebook.com answers successive "IP address of Facebook.com?" queries with different web server addresses (128.111.1.2, then 128.111.1.3). The web servers 128.111.1.2, 128.111.1.3, and 128.111.1.4 sit behind a firewall/router connected to the Internet.]

Discussions on DNS Rotation

Advantages

Requires almost no change to the existing Internet architecture

Problems

DNS Caching

Rigid load balancing policy 

can’t balance based on runtime load changes

slow or no adjustment in response to failures
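The rotation and the caching problem can both be seen in a small simulation. This is a sketch under simplifying assumptions: `RotatingDNS` and `CachingResolver` are hypothetical classes, no real DNS protocol is involved, and the cache never expires (a real resolver would honor a TTL).

```python
from itertools import cycle


class RotatingDNS:
    """Authoritative server that hands out its addresses in rotation."""

    def __init__(self, addresses):
        self._next = cycle(addresses)

    def resolve(self):
        return next(self._next)


class CachingResolver:
    """Client-side resolver that reuses the first answer it received."""

    def __init__(self, dns):
        self._dns = dns
        self._cached = None

    def resolve(self):
        if self._cached is None:
            self._cached = self._dns.resolve()
        return self._cached


dns = RotatingDNS(["128.111.1.2", "128.111.1.3", "128.111.1.4"])
answers = [dns.resolve() for _ in range(4)]
print(answers)  # ['128.111.1.2', '128.111.1.3', '128.111.1.4', '128.111.1.2']

cached = CachingResolver(dns)
cached_answers = [cached.resolve() for _ in range(3)]
print(cached_answers)  # ['128.111.1.3', '128.111.1.3', '128.111.1.3']
```

Every client behind the caching resolver is sent to the same server until the cached entry is replaced, regardless of that server's load or whether it is still up.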

Load Balancing on Internet Servers: Technique 2 – Cooperative Offloading

[Figure: cooperative offloading. Web servers for Facebook.com (128.111.1.2, 128.111.1.3, 128.111.1.4) behind a firewall/router connected to the Internet.]

Discussions on Cooperative Offloading

Can be combined with DNS rotation.

Advantages:

More flexible policy is possible

More responsive to runtime workload and server failures (to a certain degree)

Problems:

Need software changes on servers

Longer delay
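A minimal sketch of the offloading decision on one server. The slides do not specify how load is measured or how a request is handed to a peer, so the load table, the threshold, and the "redirect" action (for example an HTTP redirect, which would account for the longer delay) are all assumptions.

```python
def handle(self_addr, loads, threshold):
    """Decide what the server at self_addr does with an incoming request.

    loads: address -> current number of active requests, as known to this server.
    Returns ("serve", addr) or ("redirect", addr).
    """
    if loads[self_addr] < threshold:
        return ("serve", self_addr)
    target = min(loads, key=loads.get)  # least-loaded server we know about
    if target == self_addr:
        return ("serve", self_addr)     # everyone else is at least as busy
    return ("redirect", target)


loads = {"128.111.1.2": 6, "128.111.1.3": 9, "128.111.1.4": 2}
busy = handle("128.111.1.3", loads, threshold=5)
idle = handle("128.111.1.4", loads, threshold=5)
print(busy)  # ('redirect', '128.111.1.4')
print(idle)  # ('serve', '128.111.1.4')
```

The decision runs on the web servers themselves, which is why this technique needs server software changes, and a redirected client pays for an extra round trip.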

Cooperative Offloading with TCP Handoff 

[Pai et al., ASPLOS 1998]

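[Figure: TCP handoff. Web servers for CNN.com (128.111.1.2, 128.111.1.3, 128.111.1.4) behind a firewall/router connected to the Internet. Packets are labeled with the client IP ("clt IP") and the server addresses 1.3 and 1.4.]

What does 1.3 do?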

What does 1.4 do?

What does client do?

Where is the TCP state?

All packets in a TCP connection must offload to one server!

Cooperative Offloading vs. TCP Handoff

Software changes on the servers

Delays

Load Balancing on Internet Servers: Technique 3 – Load Balancing Router

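[Figure: load balancing router. Web servers for CNN.com (128.111.1.2, 128.111.1.3, 128.111.1.4) behind a firewall and an LB router with address 128.111.1.1, connected to the Internet. Packets are labeled with the client IP ("clt IP") and the addresses 1.1 and 1.2.]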

All packets in a TCP connection must offload to one server.
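One way a router can keep all packets of a TCP connection on one server is a connection table keyed by the client's address and port. The sketch below is an assumption about how this could be done, not a description of any particular product: `LBRouter` is a hypothetical class, new connections are assigned by simple rotation, and only the lookup logic is modeled (no packet rewriting).

```python
class LBRouter:
    """Tracks which backend server owns each client connection."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.table = {}  # (client_ip, client_port) -> server address
        self._rr = 0     # rotation counter for new connections

    def route(self, client_ip, client_port, fin=False):
        """Return the server for this packet; fin=True ends the connection."""
        key = (client_ip, client_port)
        if key not in self.table:
            # First packet of a connection: pick a server and remember it.
            self.table[key] = self.servers[self._rr % len(self.servers)]
            self._rr += 1
        server = self.table[key]
        if fin:
            del self.table[key]  # free the entry once the connection closes
        return server


router = LBRouter(["128.111.1.2", "128.111.1.3", "128.111.1.4"])
first = router.route("10.0.0.1", 5000)
again = router.route("10.0.0.1", 5000)
other = router.route("10.0.0.2", 5000)
last = router.route("10.0.0.1", 5000, fin=True)
```

Here `first`, `again`, and `last` all name the same server, while `other` (a different connection) is sent to the next server in the rotation.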

More About Load Balancing Router

Load balancing policies in LB routers (Goal: transparency, plug‐and‐play)

Simple rotation

Least number of active requests

Shortest response time
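The three policies differ only in what the router looks at when choosing a server. The sketch below assumes the router keeps a per-server record of active requests and average response time; the `stats` layout, field names, and numbers are hypothetical.

```python
from itertools import count


def simple_rotation(servers):
    """Return a picker that cycles through the servers, ignoring load."""
    counter = count()
    return lambda: servers[next(counter) % len(servers)]


def least_active(stats):
    """Pick the server with the fewest requests currently in progress."""
    return min(stats, key=lambda s: stats[s]["active"])


def shortest_response(stats):
    """Pick the server with the lowest recent average response time."""
    return min(stats, key=lambda s: stats[s]["avg_ms"])


stats = {
    "128.111.1.2": {"active": 4, "avg_ms": 20.0},
    "128.111.1.3": {"active": 1, "avg_ms": 35.0},
    "128.111.1.4": {"active": 3, "avg_ms": 12.0},
}
pick = simple_rotation(list(stats))
rotation = [pick() for _ in range(4)]
print(rotation)                  # ['128.111.1.2', '128.111.1.3', '128.111.1.4', '128.111.1.2']
print(least_active(stats))       # 128.111.1.3
print(shortest_response(stats))  # 128.111.1.4
```

Simple rotation needs no state about the servers; the other two react to runtime load but require the router to observe when requests start and finish.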

How deep do we look into the network protocol stack?

Network layer (IP)?

Transport layer (TCP/UDP)?

Application layer?

Summary

Scalable Internet servers

partitioning

replication

Load balancing for Internet servers

DNS rotation

Cooperative offloading (with TCP handoff)

Load balancing router

Changes required on the components:

DNS server??

Web server??

Client??

Router??
