RESEARCH GOAL. Enable BOINC to efficiently support apps that require interprocess communication.

Academic year: 2021

(1)

BOINC Workshop 11

Hien Nguyen, Eshwar Rohit

University of Houston

Supervisors:

Dr. Jaspal Subhlok, University of Houston

Dr. David P. Anderson, SSL, U.C. Berkeley

(2)

RESEARCH GOAL

Enable BOINC to efficiently support apps that require interprocess communication.

Goals:

Easier programming for communicating applications

Reduce execution time (not increase throughput)

(3)

Example Applications

REMD Protein Folding application

Each process runs a standard molecular simulation at a different temperature.

Temperature assigned to each process at each step:

        P1   P2   P3   P4   P5   P6   P7   P8
STEP-1  270  280  290  300  310  320  330  340
STEP-2  280  270  290  300  320  310  340  330
STEP-3  280  290  270  300  320  340  310  330

(4)

Example Applications

Or many other applications:

Differential equation solvers (grid) (synchronous)

Game playing with alpha/beta pruning (asynchronous)

Search application.

…..

Suitable applications: low to moderate amount and frequency of communication.

(5)

DIFFICULTIES

[Diagram: job execution on a fast host and a slow host up to a synchronization point; several hosts are marked X.]

Slow down overall execution speed

Worse as number of hosts increases or

(6)

OUTLINE

1. Volpex Dataspace

IPC for volunteer environment

2. Ensuring Efficiency

Process management

Host selection

3. Experiments And Evaluation

4. Future Work

(7)

Volpex Dataspace

Dataspace: a global shared space that processes can use to exchange information without temporal or spatial coupling.

[Diagram: one process sends Put(ABC, 800) to the Volpex Dataspace Server; another sends Get(ABC, ?) and receives 800.]
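The Put/Get exchange above can be sketched in a few lines of Python. This is a toy in-memory stand-in, not the Volpex Dataspace Server: the class name `Dataspace`, the method names, and the choice that `get` blocks until a value exists are all assumptions made for illustration.

```python
import threading


class Dataspace:
    """Toy in-memory stand-in for a dataspace server (illustrative only)."""

    def __init__(self):
        self._items = {}
        self._cond = threading.Condition()

    def put(self, tag, value):
        # Store the value under its tag and wake any waiting readers.
        with self._cond:
            self._items[tag] = value
            self._cond.notify_all()

    def get(self, tag, timeout=None):
        # Assumption: a get blocks until some process has put this tag.
        with self._cond:
            found = self._cond.wait_for(lambda: tag in self._items, timeout)
            if not found:
                raise TimeoutError(f"no value for tag {tag!r}")
            return self._items[tag]


space = Dataspace()

# The writer does not know who will read (no spatial coupling) and may
# finish before the reader even starts (no temporal coupling).
writer = threading.Thread(target=space.put, args=("ABC", 800))
writer.start()
writer.join()

result = space.get("ABC", timeout=1.0)
```

The reader only needs the tag `"ABC"`; it never addresses the writer directly, which is the decoupling the slide refers to.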

(8)

Volpex Dataspace – Fault Tolerance

[Diagram: replicated processes issue Put(ABC, 800) and Get(ABC, ?) and receive 800; one instance is marked X.]

Volpex DSS is unique in supporting redundant Put/Get operations
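One way a server could tolerate redundant Put/Get calls from replicas of the same logical process is sketched below. The slides do not describe the mechanism, so everything here is an assumption: the `(proc_id, op_no)` key that identifies a logical operation, applying each logical put once, and logging the value returned to the first replica so that later replicas of the same get see the same answer.

```python
class RedundantDataspace:
    """Toy server that tolerates replicated callers (illustrative only)."""

    def __init__(self):
        self._items = {}        # tag -> current value
        self._put_seen = set()  # (proc_id, op_no) of puts already applied
        self._get_log = {}      # (proc_id, op_no) -> value given to first replica

    def put(self, proc_id, op_no, tag, value):
        # Apply each logical put once, however many replicas send it.
        key = (proc_id, op_no)
        if key in self._put_seen:
            return False
        self._put_seen.add(key)
        self._items[tag] = value
        return True

    def get(self, proc_id, op_no, tag):
        # The first replica to issue this logical get fixes the answer;
        # slower replicas receive the same logged value.
        # A missing tag raises KeyError in this sketch.
        key = (proc_id, op_no)
        if key not in self._get_log:
            self._get_log[key] = self._items[tag]
        return self._get_log[key]


dss = RedundantDataspace()
first = dss.put(proc_id=1, op_no=0, tag="ABC", value=800)   # replica A of process 1
second = dss.put(proc_id=1, op_no=0, tag="ABC", value=800)  # replica B, same logical put
fast = dss.get(proc_id=2, op_no=0, tag="ABC")               # fast replica of process 2
dss.put(proc_id=1, op_no=1, tag="ABC", value=900)           # process 1 moves on
slow = dss.get(proc_id=2, op_no=0, tag="ABC")               # slow replica, same logical get
```

In this sketch the slow replica still reads 800 even though the tag has since been overwritten, so all replicas of a process follow the same execution path.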

(9)

Related Work: Volpex MPI

Volpex MPI:

An MPI library designed for executing parallel applications in a volunteer environment.

Direct communication between processes.

Key Features

o Controlled redundancy

o Receiver based direct communication

o Distributed sender based logging

More detail: “VolpexMPI: an MPI Library for Execution of Parallel Applications on Volatile Nodes” by Troy LeBlanc, Rakhi Anand, Edgar Gabriel, and Jaspal Subhlok.
(10)

ENSURING EFFICIENCY

A parallel program executes at the speed of its slowest process.

Process management

Simultaneous process starting

Failure and recovery

Replica management

Checkpoint/restart

Host selection.

(11)

Job execution scheme

[Diagram: hosts, the BOINC Scheduler, and the Volpex Dataspace Server; one host is marked X. Message labels: Get work, Put data item, Get data item, Get checkpoint.]

(12)

PROCESS MANAGEMENT

Simultaneous process starting:

All processes start computation together

Volpex jobs have the highest (infinite) priority: they cannot be interrupted by other jobs.

While waiting for all processes of a Volpex job to be ready, a host can run other finite-priority volunteer jobs.

(13)

PROCESS MANAGEMENT

Failure and recovery

A dead instance is spotted by the heartbeat mechanism: process instances regularly send heartbeats to the Volpex DSS.

A slow instance is detected when it attempts to commit a very old checkpoint.

“Hot Spare” replaces the dead/slow
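The two detection rules on this slide can be sketched as follows. This is a minimal illustration, not the Volpex DSS code: the class `HeartbeatMonitor`, the 30-second timeout, and the "more than 2 steps behind" rule for an old checkpoint are hypothetical values chosen for the example.

```python
class HeartbeatMonitor:
    """Toy detector for dead and slow process instances (illustrative only)."""

    def __init__(self, timeout_s, max_checkpoint_lag):
        self.timeout_s = timeout_s                    # silence longer than this means dead
        self.max_checkpoint_lag = max_checkpoint_lag  # steps behind the newest checkpoint
        self.last_beat = {}                           # instance id -> time of last heartbeat
        self.latest_step = 0                          # newest checkpoint step seen so far

    def heartbeat(self, instance, now):
        self.last_beat[instance] = now

    def dead_instances(self, now):
        # An instance is presumed dead if its last heartbeat is too old.
        return sorted(i for i, t in self.last_beat.items()
                      if now - t > self.timeout_s)

    def commit_checkpoint(self, instance, step):
        """Return True if accepted, False if the instance is flagged as slow."""
        if self.latest_step - step > self.max_checkpoint_lag:
            return False
        self.latest_step = max(self.latest_step, step)
        return True


monitor = HeartbeatMonitor(timeout_s=30, max_checkpoint_lag=2)
monitor.heartbeat("p1-a", now=0)
monitor.heartbeat("p1-b", now=0)
monitor.heartbeat("p1-b", now=40)            # only replica b keeps reporting

dead = monitor.dead_instances(now=45)
accepted = monitor.commit_checkpoint("p1-b", step=5)
rejected = monitor.commit_checkpoint("p1-a", step=1)  # far behind the newest checkpoint
```

Times are passed in explicitly so the example runs without a real clock; a server would use its own timestamps.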

(14)

PROCESS MANAGEMENT

Hot spare policy:

[Diagram: BOINC Scheduler, Volpex Dataspace Server, and a host marked X. Label: Fast replacement.]

(15)

PROCESS MANAGEMENT

Checkpointing:

A process instance commits and uploads checkpoints to the Volpex DSS (which stores only the latest checkpoint for each process).

Volpex_time_to_checkpoint()

Volpex_checkpoint(char* checkpoint)

Restarted process instance requests
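A checkpoint/restart loop in this style might look like the sketch below. It is a Python analogue, not the Volpex API: `CheckpointStore` and `run` are hypothetical names, the fixed checkpoint interval stands in for whatever `Volpex_time_to_checkpoint()` decides, and the comma-separated string stands in for the `char*` passed to `Volpex_checkpoint`.

```python
class CheckpointStore:
    """Keeps only the latest checkpoint per process, as the slide describes."""

    def __init__(self):
        self._latest = {}

    def commit(self, proc_id, data):
        self._latest[proc_id] = data  # overwrites any older checkpoint

    def fetch(self, proc_id):
        return self._latest.get(proc_id)


def run(proc_id, store, total_steps, checkpoint_every, fail_at=None):
    """Sum 0..total_steps-1, resuming from the stored checkpoint if one exists."""
    saved = store.fetch(proc_id)
    if saved is None:
        step, total = 0, 0
    else:
        step, total = (int(x) for x in saved.split(","))
    while step < total_steps:
        if step == fail_at:
            raise RuntimeError("simulated host failure")
        total += step
        step += 1
        if step % checkpoint_every == 0:              # stand-in for Volpex_time_to_checkpoint()
            store.commit(proc_id, f"{step},{total}")  # stand-in for Volpex_checkpoint(char*)
    return total


store = CheckpointStore()
try:
    run(0, store, total_steps=10, checkpoint_every=3, fail_at=7)
except RuntimeError:
    pass
after_failure = store.fetch(0)   # last checkpoint committed before the failure

resumed = run(0, store, total_steps=10, checkpoint_every=3)
```

The restarted run repeats only the work done after the last committed checkpoint, which is why the choice of checkpoint interval matters.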

(16)

HOST SELECTION POLICY

Criteria for selecting volunteer hosts to assign to a Volpex job:

CPU speed

Memory capacity

Disk space

Upload bandwidth

(17)

HOST SELECTION POLICY

Future availability prediction based on:

Last-value predictor: availability in the last hour

Predictability: the number of availability changes in the past 2 weeks.

In essence: select hosts that change availability very rarely.

Method partly based on: Exploiting Non-Dedicated
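The selection criteria from these two slides can be combined into a simple filter-and-rank step, sketched below. The field names, units, and threshold values are hypothetical, and treating "availability in the last hour" as a yes/no flag is an assumption; the slides do not say how the scheduler encodes it.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    cpu_gflops: float
    memory_gb: float
    disk_gb: float
    upload_mbps: float
    available_last_hour: bool   # input to the last-value predictor
    changes_last_2_weeks: int   # availability changes; fewer means more predictable


@dataclass
class Limits:
    min_cpu_gflops: float
    min_memory_gb: float
    min_disk_gb: float
    min_upload_mbps: float
    max_changes: int


def select_hosts(hosts, limits, count):
    """Keep hosts that pass every threshold, then prefer the most predictable."""
    eligible = [
        h for h in hosts
        if h.cpu_gflops >= limits.min_cpu_gflops
        and h.memory_gb >= limits.min_memory_gb
        and h.disk_gb >= limits.min_disk_gb
        and h.upload_mbps >= limits.min_upload_mbps
        and h.available_last_hour                   # last-value prediction: stays available
        and h.changes_last_2_weeks <= limits.max_changes
    ]
    eligible.sort(key=lambda h: h.changes_last_2_weeks)
    return eligible[:count]


hosts = [
    Host("A", 3.0, 8, 100, 10, True, 4),
    Host("B", 1.0, 8, 100, 10, True, 1),     # CPU too slow
    Host("C", 3.5, 16, 200, 20, False, 0),   # not available in the last hour
    Host("D", 2.5, 8, 50, 5, True, 2),
    Host("E", 4.0, 8, 100, 10, True, 40),    # availability changes too often
]
limits = Limits(min_cpu_gflops=2.0, min_memory_gb=4, min_disk_gb=20,
                min_upload_mbps=2, max_changes=10)
chosen = [h.name for h in select_hosts(hosts, limits, count=2)]
```

Ranking by the number of availability changes reflects the slide's summary: prefer hosts whose availability rarely changes.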

(18)

IMPLEMENTATION STATUS

Volpex utilities: for scientists to submit, abort, or query the status of a Volpex job.

Modified BOINC scheduler: includes host selection for Volpex jobs.

Modified Volpex DSS: handles new

(19)

EXPERIMENTS AND EVALUATION

Experiment Scheme:

Application: Sieve of Eratosthenes, REMD Protein Folding

Number of processes: 10 for Sieve and 16 for Folding

Level of replication: 1, 2, 3

Pool of around 80 nodes: On campus & global

Host Selection Policies:

o Random/Naïve

o Minimum threshold on CPU, memory, disk space, upload bandwidth, and predicted availability

(20)

Sieve of Eratosthenes

[Plot: Random Host Selection, 10 Processes. Axes: Time (s) and Experiments. Series: 1 Replica, 2 Replicas, 3 Replicas.]

(21)

Sieve of Eratosthenes

[Plot: On Campus hosts with IP filtering, 10 Processes. Axes: Time (s) and Experiment. Series: 1 Replica, 2 Replicas, 3 Replicas.]

(22)

Sieve of Eratosthenes

Minimum threshold for host selection helps stabilize the execution time even without replication.

[Plot: Threshold based Host Selection, 10 Processes. Axes: Time (s) and Experiment. Series: 1 Replica, 2 Replicas, 3 Replicas.]

(23)

REMD for Protein Folding

[Plot: Axes: Time (minutes) and Experiment. Series: Rep = 1, random selection; Rep = 2, random selection; Rep = 1, threshold based selection; Rep = 2, threshold based selection.]

(24)

FUTURE WORK

More experiments:

Higher number of processes

More applications

Communication pattern (local/global, synch/asynch)

Size and frequency of communication

More in-depth study on:

Host selection policy

Optimal checkpoint interval

Granting credits to hosts:

Granting credit to hot spares

(25)

Do you have an application?

We would be happy to help you employ Volpex for your application.

Students and postdocs interested in working on Volpex are welcome to contact us.

Our team contacts:

Dr. Jaspal Subhlok

Dr. David Anderson

Dr. Edgar Gabriel

Hien Nguyen

Eshwar Rohit

Rakhi Anand

LaToya Green

(26)
(27)

IMPLEMENTATION STATUS

[Diagram: job flow between the BOINC Server, the Database, the BOINC Scheduler, the Volpex Dataspace Server, and a hot spare group; one host is marked X. Labels: submit job specs; create job & WU; scheduler request; scheduler reply; dynamically create result from WU; heartbeat; checkpoint; request procID; get procID; procID of failed instance; replacement.]

http://
