• No results found

LOAD BALANCING DISTRIBUTED OPERATING SYSTEMS, SCALABILITY, SS Hermann Härtig

N/A
N/A
Protected

Academic year: 2021

Share "LOAD BALANCING DISTRIBUTED OPERATING SYSTEMS, SCALABILITY, SS Hermann Härtig"

Copied!
34
0
0

Loading.... (view fulltext now)

Full text

(1)

Hermann Härtig

LOAD BALANCING

(2)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

ISSUES

starting points

independent Unix processes and

block synchronous execution

who does it

load migration mechanism (MosiX)

management algorithms (MosiX)

information dissemination

decision algorithms

(3)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

EXTREME STARTING POINTS

independent OS processes

block synchronous execution (HPC)

sequence: compute - communicate

all processes wait for all other processes

often: message passing 


for example Message Passing Library (MPI)

3

(4)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

SIMPLE PROGRAM MULTIPLE DATA

all processes execute same program

while (true) 


{ work; exchange data (barrier)}

common in 


High Performance Computing:


Message Passing Interface (MPI)


library

(5)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

DIVIDE AND CONQUER

5

node 1

CPU #2

CPU #1

node 2

CPU #2

CPU #1

part 1

part 2

part 3

part 4

result 1

result 2

result 3

result 4

result

problem

(6)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

MPI BRIEF OVERVIEW

Library for message-oriented parallel

programming

Programming model:

Multiple instances of same program

Independent calculation

Communication, synchronization

(7)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

MPI STARTUP & TEARDOWN

MPI program is started on all processors

MPI_Init(), MPI_Finalize()

Communicators (e.g., MPI_COMM_WORLD)

MPI_Comm_size()

MPI_Comm_rank()

: “Rank” of process within

this set

Typed messages

Dynamically create and spread processes

using MPI_Spawn() (since MPI-2)

(8)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

MPI EXECUTION

Communication

Point-to-point

Collectives

Synchronization

Test

Wait

Barrier

8

MPI_Send(

void* buf,

int count,

MPI_Datatype,

int dest,

int tag,

MPI_Comm comm

)

MPI_Recv(

void* buf,

int count,

MPI_Datatype,

int source,

int tag,

MPI_Comm comm,

MPI_Status *status

)

MPI_Bcast(

void* buffer,

int count,

MPI_Datatype,

int root,

MPI_Comm comm

)

MPI_Reduce(

void* sendbuf,

void *recvbuf,

int count

MPI_Datatype,

MPI_Op op,

int root,

MPI_Comm comm

)

MPI_Barrier(

MPI_Comm comm

)

MPI_Test(

MPI_Request* request,

int *flag,

MPI_Status *status

)

MPI_Wait(

MPI_Request* request,

MPI_Status *status

)

(9)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

BLOCK AND SYNC

9

blocking call

non-blocking call

synchronous

communication

asynchronous

communication

returns when message

has been delivered

returns immediately,

following test/wait

checks for delivery

returns when send

buffer can be reused

returns immediately,

following test/wait

checks for send buffer

(10)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

EXAMPLE

10

int rank, total;

MPI_Init();

MPI_Comm_rank(MPI_COMM_WORLD, &rank);

MPI_Comm_size(MPI_COMM_WORLD, &total);

MPI_Bcast(...);

/* work on own part, determined by rank */

if (id == 0) {

for (int rr = 1; rr < total; ++rr)

MPI_Recv(...);

/* Generate final result */

} else {

MPI_Send(...);

}

(11)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

AMDAHLS’ LAW

interpretation for parallel systems:

P: section that can be parallelized

1-P: serial section

N: number of CPUs


(12)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

AMDAHL’S LAW

Serial section: 


communicate, longest sequential section


Parallel, Serial, possible speedup:

1ms, 100 μs: 1/0.1 → 10

1ms, 1 μs: 1/0.001 → 1000

10 μs, 1 μs: 0.01/0.001 → 10

...

(13)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

WEAK VS. STRONG SCALING

Strong:

accelerate same problem size

Weak:

extend to larger problem size

(14)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

WHO DOES IT

application

run-time library

operating system

14

(15)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

SCHEDULER: GLOBAL RUN QUEUE

15

immediate approach: global run queue

all ready processes

(16)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

SCHEDULER: GLOBAL RUN QUEUE

… does not scale

shared memory only

contended critical section

cache affinity

separate run queues with


explicit movement of processes

(17)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

OS/HW & APPLICATION

Operating System / Hardware:


“All” participating CPUs: active / inactive

Partitioning (HW)

Gang Scheduling (OS)

Within Gang/Partition:


Applications balance !!!

(18)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

HW PARTITIONS & ENTRY QUEUE

18

Application

Application

request

queue

(19)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

HW PARTITIONS

optimizes usage of network

takes OS off critical path

best for strong scaling

burdens application/library with balancing

potentially wastes resources

current state of the art in High

Performance Computing (HPC)

(20)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

TOWARDS OS/RT BALANCING

20

work item

time

(21)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

EXECUTION TIME JITTER

21

work item

time

(22)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

SOURCES FOR EXECUTION JITTER

Hardware ???

Application

Operating system “noise”

(23)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

OPERATING SYSTEM “NOISE”

Use common sense to avoid:

OS usually not directly on the critical path, 


BUT OS controls: interference via interrupts, caches,

network, memory bus, (RTS techniques)

avoid or encapsulate side activities

small critical sections (if any)

partition networks to isolate traffic of different

applications (HW: Blue Gene)

do not run Python scripts or printer daemons in parallel

(24)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

SPLITTING BIG JOBS

24

work item

time

Barrier

(25)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

SMALL JOBS (NO DEPS)

25

work item

time

Barrier

(26)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

BALANCING AT LIBRARY LEVEL

Programming Model

many (small) decoupled work items

overdecompose 


create more work items than active units

run some balancing algorithm

Example: CHARM ++

(27)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

BALANCING AT SYSTEM LEVEL

create many more processes

use OS information on run-time and

system state to balance load

examples:

create more MPI processes than nodes

run multiple applications

(28)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

CAVEATS

added overhead

additional communication between

smaller work items (memory & cycles)

more context switches

OS on critical path 


(for example communication)

(29)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

BALANCING ALGORITHMS

required:

mechanism for migrating load

information gathering

decisions

MosiX system as an example

-> Barak’s slides now

(30)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

FORK IN MOSIX

30

Deputy

Remote

fork() syscall request

Deputy

(child)

Remote

(child) fork()

Establish a new link

fork()

Reply from fork()

(31)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

PROCESS MIGRATION IN MOSIX

31

Process

migdaemon

migration request

Deputy

Remote

fork()

Send state, memory maps, dirty pages Transition ack Finalize migration Migration completed Ack

(32)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

SOME PRACTICAL PROBLEMS

flooding


all processes jump to one new empty node


=> decide immediately before migration

commitment 


extra communication, piggy packed

ping pong


if thresholds are very close, processes

moved back and forth


=> tell a little higher load than real

(33)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

PING PONG

33

Node 1

Node 2

One process two nodes

Scenario:

compare load on nodes 1 and 2

node 1 moves process to node 2

Solutions:

add one + little bit to load 


average over time

Solves short peaks problem as well

(short cron processes)

(34)

Hermann Härtig, TU Dresden, Distributed OS, Load Balancing

LESSONS

execution time jitter matters (Amdahl)

approaches: partition ./. balance

dynamic balance components:


migration of load, information, decision


who, when, where

References

Related documents

Abstract: With the increase in the number of concurrent users on the Internet, the load balancing problem in distributed systems is becoming more significant.To the best of

Load Balancing for Distributed Stream Processing Engines. Muhammad Anis

Based on the delay probing experiments and the performance of the dynamic load- balancing strategies in different environments, we proposed a dynamic and adaptive load-balancing

We formulate the load balancing problem for single class job distributed systems as a cooperative game among computers and we provide a method for computing the solution.. We prove

This paper introduces load balancing algorithm for distributed multi-agent systems using all system resources to reduce algorithm execution time and cost of agent

The paper presents what a Distributed system is. It also overviews the issues to be taken into consideration for developing the strategies for load balancing. Strategies for

we classify the dynamic distributed load balancing algorithms for heterogeneous distributed computer systems into three policies: Queue Adjustment Policy (QAP), Rate

This work proposed an approach called DDLB (Distributed Dynamic Load Balancing) policy, this makes an attempt to balance load by splitting proc- esses into separate jobs and