• No results found

Simplest Scalable Architecture

N/A
N/A
Protected

Academic year: 2021

Share "Simplest Scalable Architecture"

Copied!
43
0
0

Loading.... (view fulltext now)

Full text

(1)

Cluster Computing

Simplest Scalable Architecture

(2)

Cluster Computing

Many types of Clusters

(form HP’s Dr. Bruce J. Walker) • High Performance Clusters

– Beowulf; 1000 nodes; parallel programs; MPI

• Load-leveling Clusters

– Move processes around to borrow cycles (eg. Mosix)

• Web-Service Clusters

– LVS; load-level tcp connections; Web pages and applications

• Storage Clusters

– parallel filesystems; same view of data from each node

• Database Clusters

– Oracle Parallel Server;

• High Availability Clusters

(3)

Cluster Computing

Many types of Clusters

(form HP’s Dr. Bruce J. Walker) • High Performance Clusters

– Beowulf; 1000 nodes; parallel programs; MPI

• Load-leveling Clusters

– Move processes around to borrow cycles (eg. Mosix)

• Web-Service Clusters

– LVS; load-level tcp connections; Web pages and applications

• Storage Clusters

– parallel filesystems; same view of data from each node

• Database Clusters

– Oracle Parallel Server;

• High Availability Clusters

– ServiceGuard, Lifekeeper, Failsafe, heartbeat, failover clusters

NOW type architectures

(4)

Cluster Computing

NOW Approaches

• Single System View • Shared Resources • Virtual Machine

(5)

Cluster Computing

Shared System View

• Loadbalancing clusters • High availability clusters • High Performance

– High throughput – High capability

(6)
(7)

Cluster Computing

NOW Philosophies

• Commodity is cheaper • In 1994 1 MB RAM was

– $40/MB for a PC

(8)

Cluster Computing

NOW Philosophies

• Commodity is faster 89-90 91-92 32 MHz SS-1 ~91 92-93 50MHz i860 92-93 93-94 150 MHz Alpha WS year MPP year CPU

(9)

Cluster Computing

Network RAM

• Swapping to disk is extremely expensive

– 16-24 ms for a page swap on disk

• Network performance is much higher

(10)

Cluster Computing

Network RAM

0 100 200 300 400 500 600 1 32 64 128 Problem Size m

in 32MB+diskNetwork RAM

(11)

Cluster Computing

NOW or SuperComputer?

$5M 21 ”+NOW protocol $5M 205 ”+Parallel FS $5M 2211 ”+ATM $4M 27374 RS6000 (256) $30M 27 C-90 (16) Cost Time Machine

(12)

Cluster Computing

NOW Projects

• Condor • DIPC • MOSIX • GLUnix • PVM • MUNGI • Amoeba

(13)

Cluster Computing

The Condor System

• Unix and NT

• Operational since 1986

• More than 1300 CPUs at UW-Madison • Available on the web

• More than 150 clusters worldwide in academia and industry

(14)

Cluster Computing

What is Condor?

• Condor converts collections of

distributively owned workstations and dedicated clusters into a

high-throughput computing facility.

• Condor uses matchmaking to make sure that everyone is happy.

(15)

Cluster Computing

What is High-Throughput

Computing?

• High-performance: CPU cycles/second under ideal circumstances.

– “How fast can I run simulation X on this machine?”

• High-throughput: CPU cycles/day (week, month, year?) under non-ideal circumstances.

– “How many times can I run simulation X in the next month using all available machines?”

(16)

Cluster Computing

What is High-Throughput

Computing?

• Condor does whatever it takes to run your jobs, even if some machines…

– Crash! (or are disconnected) – Run out of disk space

– Don’t have your software installed – Are frequently needed by others

(17)

Cluster Computing

A Submit Description File

# Example condor_submit input file

# (Lines beginning with # are comments)

# NOTE: the words on the left side are not # case sensitive, but filenames are! Universe = vanilla

Executable = /home/wright/condor/my_job.condor Input = my_job.stdin

Output = my_job.stdout Error = my_job.stderr Arguments = -arg1 -arg2

InitialDir = /home/wright/condor/run_1 Queue

(18)

Cluster Computing

What is Matchmaking?

• Condor uses Matchmaking to make sure that work gets done within the constraints of both users and owners.

• Users (jobs) have constraints:

– “I need an Alpha with 256 MB RAM”

• Owners (machines) have constraints:

– “Only run jobs when I am away from my desk and never run jobs owned by Bob.”

(19)

Cluster Computing

Process Checkpointing

• Condor’s Process Checkpointing

mechanism saves all the state of a process into a checkpoint file

– Memory, CPU, I/O, etc.

• The process can then be restarted from right where it left off

• Typically no changes to your job’s source code needed – however, your job must be relinked with Condor’s Standard Universe support library

(20)

Cluster Computing

Remote System Calls

• I/O System calls trapped and sent back to submit machine

• Allows Transparent Migration Across Administrative Domains

– Checkpoint on machine A, restart on B

• No Source Code changes required • Language Independent

• Opportunities for Application Steering

– Example: Condor tells customer process “how” to open files

(21)

Cluster Computing

DIPC

• DIPC – Distributed – Inter – Process – Communication

• Provides Sys V IPC in distributed environments (including SHMEM)

(22)

Cluster Computing

MOSIX and its characteristics

• Software that can transform a Linux cluster of x86 based workstations and servers to run

almost like an SMP

• Has the ability to distribute and redistribute the processes among the nodes

(23)

Cluster Computing

MOSIX

• Dynamic migration added to the BSD kernel

• Uses TCP/IP for communication between workstations

(24)

Cluster Computing

MOSIX

• All processes start their life at the users workstation

• Migration is transparent and preemptive

• Migrated processes use local resources as much as possible and the resources on

(25)

Cluster Computing

Process Migration in MOSIX

User-level Kernel Link Layer User-level Kernel Link Layer Deputy Remote Loca l proc ess

(26)
(27)

Cluster Computing

(28)

Cluster Computing

GLUnix

• Global Layer Unix

• Pure user-level layer that takes over the role of the operating system from the point of the user

• New processes can then be placed where there is most available memory (CPU)

(29)

Cluster Computing

PVM

• Provides a virtual machine on top of existing OS on a network

• Processes can still access the native OS resources

• PVM supports heterogeneous environments!

(30)

Cluster Computing

PVM

• The primary concern of PVM is to provide

– Dynamic process creation

– Process management – including signals – Communication between processes

(31)

Cluster Computing

PVM

PVM Demon PVM Task PVM Task PVM Demon PVM Task PVM Task PVM Demon PVM Task PVM Task PVM

(32)

Cluster Computing

MUNGI

• Single Address Space Operating system • Requires 64 bit architecture

• Designed as an object based OS

• Protection is implemented as capabilities, to ensure scalability MUNGI uses

(33)

Cluster Computing

Amoeba

• The computer is modeled as a network of resources

• Processes are started where they best fit • Protection is implemented as capability

lists

• Amoeba is centered around an efficient broadcast mechanish

(34)

Cluster Computing

(35)

Cluster Computing

Programming NOW

• Dynamic load balancing • Dynamic orchestration

(36)

Cluster Computing

Dynamic Load Balancing

• Base your applications on redundant parallelism

• Rely on the OS to balance the application over the CPUs

• Rather few applications can be orchestrated in this way

(37)

Cluster Computing

Barnes Hut

• Galaxy simulations are still quite

interresting

• Basic formula is:

• Naïve algorithm is O(n2) 2 2 1 r m m G

(38)

Cluster Computing

(39)

Cluster Computing

Barnes Hut

(40)

Cluster Computing

(41)

Cluster Computing

Dynamic Orchestration

• Divide your application into a job-queue • Spawn workers

• Let the workers take and execute jobs from the queue

• Not all applications can be orchestrated in this way

• Does not scale well – job-queue process may become a bottleneck

(42)

Cluster Computing

Parallel integration

∫ ∫ ∫

2 1 2 1 2 1

)

,

,

(

x

x

y

y

z

z

z

y

x

f

(43)

Cluster Computing

Parallel integration

• Split the outer integral

• Jobs = range(x1, x2, interval)

• Tasks = integral with x1 = Jobsi, x2=Jobsi+1; for i in len(Jobs -1)

References

Related documents

The algorithm is designed to provide information-theoretic (that is, unconditional) security by the use of a one-time pad like scheme so that no intermediate or unintended node

• Enhancing International Business Education at Southern University and A&M College – A Comprehensive Program to expand and Strengthen International Business and Small

You’ll also learn how to manipulate cells based on the active cell and how to create a new range from overlapping ranges.. The

The given approach involves performing load, sta- bility and stress test using workload generators to provide a synthetic workload: web stressing tools use scripts de- scribing

Scientists from the Nijmegen Centre for Molecular Life Sciences (NCMLS) designed an intensive training programme that sprouts from its three main research themes and is

By consenting and agreeing to receive such electronically transmitted notifications and/or communications from NCCI and/or the assigned carrier, the undersigned Applicant

In this study, besides coupling GIS and ML techniques, we also focused on several issues: (a) how to employ diverse available data (mostly open access) that can be used as

A given constant time for the other activities of a picker, the routing with the minimum travel time will be achieved if the sequence of the lines within the order picking batch