DYNAMIC ROTATING LOAD BALANCING ALGORITHM IN DISTRIBUTED SYSTEMS
ISSA OTOUM, Al-Zaytoonah University, issaotoum2k@yahoo.com
AL DAHOUD ALI, Al-Zaytoonah University, aldahoud@alzaytoonah.edu.jo
ROSE SULEIMAN, Neelain University, rosesuleiman@yahoo.com
ABSTRACT
Load balancing in a distributed system is an important process for reducing delays and improving response times, which in turn speeds up applications and the delivery of results. Different approaches to load balancing have different advantages and disadvantages. ‘Classical’ approaches are mostly efficient, but in many circumstances the overhead incurred by the balancing itself is too high, and they become ineffective.
This paper proposes a Dynamic Rotating Load Balancing Algorithm for distributed systems.
In the simulations carried out to test it, the new algorithm shows much lower overhead and faster response times than the classical approaches. It also remains scalable and efficient regardless of the size of the network used.
INTRODUCTION
This paper is a study of dynamic load balancing in a distributed system.
It compares the various approaches to load balancing in distributed systems, with a focus on dynamic approaches; some processes are discussed in detail to clarify the factors contributing to load and the mechanisms used in any balancing process.
The paper explains the new algorithm and how it is applied, and discusses the simulation used to compare the classical approaches with the newly proposed one.
It also gives the results, the conclusions drawn from the study, and further proposals.
The tables and diagrams at the end of the study compare the various approaches and present the results of the simulations done as part of the study; the references are listed after them.
Key Words: Dynamic Load Balancing, Distributed System
1- DYNAMIC LOAD DISTRIBUTION
1-1. Load distribution
Load distribution seeks to improve the performance of a distributed system, usually in terms of response time or resource availability, by allocating workload amongst a set of cooperating hosts.
This division of system load can take place statically or dynamically.
1-2. Dynamic load distribution
Dynamic load distribution is designed to overcome the problems of unknown or uncharacterizable workloads, non-pervasive scheduling and runtime variation (any situation where the availability of hosts, the composition of the workload or the interaction of human beings can alter resource requirements or availability). Dynamic load distribution systems typically monitor the workload and the hosts for any factors that may affect the choice of the most appropriate assignment, and distribute jobs accordingly. This ability to react at run time is what distinguishes dynamic from static load distribution, and it is the source of both its power and the interest in it.
The objectives of this paper lie entirely within the domain of dynamic load balancing. For brevity, the more general term load distribution is used from here on to mean only the dynamic form. [1, 2, 7, 11]
The Degree of Load Distribution
Load Sharing: This is the coarsest form of load distribution. Load may only be placed on idle hosts, and can be viewed as binary, where a host is either idle or busy.
Load Balancing: Where load sharing is the coarsest form of load distribution, load balancing is the finest. Load balancing attempts to ensure that the workload on each host is within a small degree (or balance criterion) of the workload present on every other host in the system.
Load Leveling: Load leveling occupies the ground between the two extremes of load sharing and load balancing. Rather than trying to obtain a strictly even distribution of load across all hosts, or simply utilizing idle hosts, load leveling seeks to avoid congestion on any one host.
Schemes such as MOSIX, which could be considered load balancing systems, are in fact load leveling systems, because their balancing phase occurs only periodically.
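The three degrees can be contrasted by the condition under which each would allow a job to move from one host to another. The following is only a sketch: the function name mayTransfer and the parameters congestionLimit and balanceCriterion are hypothetical, and loads are assumed to be queue lengths.

```cpp
#include <cassert>

// The three degrees of load distribution described above.
enum class Degree { Sharing, Leveling, Balancing };

// Decides whether a job may move from a sender to a receiver.
// congestionLimit and balanceCriterion are illustrative parameters,
// not values taken from the paper.
bool mayTransfer(Degree degree, int senderLoad, int receiverLoad,
                 int congestionLimit, int balanceCriterion) {
    assert(senderLoad >= 0 && receiverLoad >= 0);
    switch (degree) {
        case Degree::Sharing:
            // Load may only be placed on an idle host, and the sender
            // must have at least one job waiting behind the running one.
            return receiverLoad == 0 && senderLoad > 1;
        case Degree::Leveling:
            // Relieve a congested host without congesting the receiver.
            return senderLoad > congestionLimit && receiverLoad < congestionLimit;
        case Degree::Balancing:
            // Keep the two hosts within the balance criterion of each other.
            return senderLoad - receiverLoad > balanceCriterion;
    }
    return false;
}
```

With a congestion limit of 5 and a balance criterion of 1, a sender holding 4 jobs and a receiver holding 2 would transfer under load balancing but not under load sharing or load leveling.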
1-3. Previous Load Distribution Taxonomies
There are numerous existing taxonomies for the classification of load distribution, including those of Wang and Morris, Casavant and Kuhl, and Jacqmot and Milgrom.
2- THE PROPOSED NEW ALGORITHM:
A network is made up of nodes connected together in a certain configuration; the configuration will not matter for our purposes here.
The nodes are arranged logically from 1 to n, where n is the total number of nodes.
We can view the network, regardless of its size, as consisting of adjacent pairs or triplets of nodes; the nodes that are nearest each other form a group. This breaks the large network up into a number of small networks, and the load balancing can then be done within each small network (consisting of only two or three nodes).
This grouping reduces both the number of messages exchanged and the physical distances between the nodes that exchange them, since group members are chosen to be adjacent. That lowers the overhead of load balancing and so makes for a more efficient process, with faster response times and quicker task completion.
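As a rough illustration of why grouping cuts message traffic, assume (the paper does not state this) that load information is exchanged all-to-all, so that every node sends its load to every other node it balances with. With n nodes split into clusters of size k, and n a multiple of k, the message counts per balancing round would be:

```latex
\begin{align*}
M_{\text{all}} &= n(n-1) \\
M_{\text{clustered}} &= \frac{n}{k}\,k(k-1) = n(k-1) \\
n = 12,\ k = 3:\quad M_{\text{all}} &= 132, \qquad M_{\text{clustered}} = 24
\end{align*}
```

Under this assumption the clustered count grows linearly with n, while the all-to-all count grows quadratically.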
To do the load balancing in a dynamic manner, we need to set the criteria for starting the load balancing process, and we need a mechanism for changing the configuration of the network each time load balancing is done.
2-1. Criteria for Load Balancing:
1) When a certain number of tasks (queue length) is reached at the node, i.e. a threshold.
2) Periodically.
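The two criteria can be combined into a single check, sketched below. The struct TriggerConfig, the function shouldBalance and the idea of passing explicit timestamps are assumptions made for illustration; the paper does not specify how the trigger is coded.

```cpp
#include <cassert>

// Hypothetical trigger settings; names and values are illustrative only.
struct TriggerConfig {
    int queueThreshold;  // criterion 1: queue length that starts balancing
    double period;       // criterion 2: seconds between periodic rounds
};

// Returns true when either criterion is met: the queue has reached the
// threshold, or a full period has elapsed since the last balancing round.
bool shouldBalance(const TriggerConfig& cfg, int queueLength,
                   double now, double lastBalanceTime) {
    assert(cfg.queueThreshold > 0 && cfg.period > 0.0);
    bool thresholdReached = queueLength >= cfg.queueThreshold;
    bool periodElapsed = (now - lastBalanceTime) >= cfg.period;
    return thresholdReached || periodElapsed;
}
```

A node would call shouldBalance each time a task arrives or a timer fires, and start a balancing round when it returns true.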
2-2. Selection of nodes:
Any number of nodes can make up a group, but limiting the size to only 2 or 3 gives the advantages of low overhead and reduced job thrashing, as well as scalability, robustness, stability and efficiency.
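The selection of nodes, for either 2 or 3 nodes per cluster, can be done using the following code, shown here in C++:

    // nodes, groups and clustersize are set beforehand; C[group][member]
    // holds the node numbers (1..nodes) of each cluster, indexed from 1.
    int cycle = 1;
    while (true) {
        for (int i = 0; i < groups; i++) {
            for (int j = 0; j < clustersize; j++) {
                int x = i * clustersize + j + cycle;
                if (x > nodes)
                    C[i + 1][j + 1] = x % nodes;  // wrap past the last node
                else
                    C[i + 1][j + 1] = x;
            }
        }
        // The load balancing code is placed in this area.
        if (++cycle > nodes) cycle = 1;  // rotate the grouping
    }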
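A self-contained version of the same rotation can be written as a function that returns the clusters for one cycle. This is a sketch: the names formClusters and nextCycle are hypothetical, and it assumes that the number of nodes is a multiple of the cluster size, which the paper does not state explicitly.

```cpp
#include <cassert>
#include <vector>

// Returns the clusters for a given rotation cycle. Nodes are numbered
// 1..nodes and cycle runs from 1 to nodes. Assumes nodes is a multiple
// of clusterSize.
std::vector<std::vector<int>> formClusters(int nodes, int clusterSize, int cycle) {
    assert(nodes > 0 && clusterSize > 0 && nodes % clusterSize == 0);
    assert(cycle >= 1 && cycle <= nodes);
    int groups = nodes / clusterSize;
    std::vector<std::vector<int>> clusters(groups, std::vector<int>(clusterSize));
    for (int i = 0; i < groups; ++i) {
        for (int j = 0; j < clusterSize; ++j) {
            int x = i * clusterSize + j + cycle;
            // x never exceeds 2 * nodes - 1, so x % nodes is never 0 here.
            clusters[i][j] = (x > nodes) ? x % nodes : x;
        }
    }
    return clusters;
}

// Advances the rotation, wrapping back to 1 after the last cycle.
int nextCycle(int cycle, int nodes) {
    return (cycle + 1 > nodes) ? 1 : cycle + 1;
}
```

For 6 nodes in pairs, cycle 1 gives the clusters (1,2), (3,4), (5,6) and cycle 2 gives (2,3), (4,5), (6,1), so each node is paired with a different neighbor on the next round.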
In this configuration, the load over a group consisting of nodes i, j, k is:
NetLoad = int((load i + load j) / 2), for 2-node groups
NetLoad = int((load i + load j + load k) / 3), for 3-node groups
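The NetLoad formula generalizes to any cluster size as a truncated integer average. In the sketch below the function name netLoad is hypothetical, and loads are assumed to be non-negative integers such as queue lengths.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Integer average of the loads in one cluster, matching
// NetLoad = int(sum of loads / cluster size).
int netLoad(const std::vector<int>& clusterLoads) {
    assert(!clusterLoads.empty());
    int total = std::accumulate(clusterLoads.begin(), clusterLoads.end(), 0);
    // Integer division truncates, like int(...) in the formula.
    return total / static_cast<int>(clusterLoads.size());
}
```

For example, a pair with loads 4 and 7 has a NetLoad of 5, and a triplet with loads 3, 4 and 9 also has a NetLoad of 5.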
2-3. Load Balancing steps:
Step 1:
Calculation of local load:
We need to calculate the local load (i.e. the load at each node).
Factors of load:
Load ($L$) is directly proportional to:
Average queue length, $Q_{avg}$.
Average response time, $t_{resp,avg}$.
Average waiting time, $t_{w,avg}$.
Load ($L$) is inversely proportional to:
Number of nodes, $n$.
Mathematically:
$$L \propto \frac{Q_{avg}\, t_{resp,avg}\, t_{w,avg}}{n}$$
Multiplying by a constant $c$ makes this an equation:
$$L = c\,\frac{Q_{avg}\, t_{resp,avg}\, t_{w,avg}}{n}$$
For each job:
$$t_{resp} = t_{release} - t_{arrival}$$
$$t_{w} = t_{seize} - t_{arrival}$$
Averaged over the $n_j$ jobs:
$$t_{resp,avg} = \frac{\sum t_{resp}}{n_j} = \frac{\sum (t_{release} - t_{arrival})}{n_j}$$
$$t_{w,avg} = \frac{\sum t_{w}}{n_j} = \frac{\sum (t_{seize} - t_{arrival})}{n_j}$$
$Q_{avg}$ of a node = the sum of jobs at that node at time $m$.
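The local load formula can be evaluated directly from per-job timestamps. The sketch below assumes a hypothetical JobRecord struct holding the arrival, seize and release times of each job, and takes the average queue length and the constant c as inputs, since the paper does not give a value for c.

```cpp
#include <cassert>
#include <vector>

// Timestamps of one job at a node (hypothetical record layout).
struct JobRecord {
    double arrival;  // t_arrival: when the job reaches the node
    double seize;    // t_seize: when the job gets the processor
    double release;  // t_release: when the job finishes
};

// Computes L = c * Q_avg * t_resp_avg * t_w_avg / n for one node.
double localLoad(const std::vector<JobRecord>& jobs, double avgQueueLength,
                 int nodeCount, double c) {
    assert(!jobs.empty() && nodeCount > 0);
    double respSum = 0.0;
    double waitSum = 0.0;
    for (const JobRecord& job : jobs) {
        respSum += job.release - job.arrival;  // t_resp of this job
        waitSum += job.seize - job.arrival;    // t_w of this job
    }
    double jobCount = static_cast<double>(jobs.size());
    double respAvg = respSum / jobCount;
    double waitAvg = waitSum / jobCount;
    return c * avgQueueLength * respAvg * waitAvg / nodeCount;
}
```

With two jobs whose response times are 3 s and 5 s and whose waiting times are both 1 s, an average queue length of 2, four nodes and c = 1, the function returns 2 * 4 * 1 / 4 = 2.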
The simulation assumptions and constants, taken from experimental values reported in previous studies (Zhou, Kara; 1994), are:
* Tasks arrive at the nodes according to a Poisson distribution.
* Task size follows an exponential distribution.
Time to send a message: 0.00001 s (10 µs)
Time to receive a message: 0.00001 s (10 µs)
Time to send a job: 0.00005 s (50 µs)
Time to receive a job: 0.00005 s (50 µs)
Time to send a result: 0.00001 s (10 µs)
Time to receive a result: 0.00001 s (10 µs)
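A workload with these two properties could be generated with the C++ standard library as sketched below. The struct Task, the function generateTasks and its parameters are hypothetical, and the paper does not describe how its simulation draws random numbers. The sketch relies on the fact that Poisson arrivals with a given rate have exponentially distributed inter-arrival times with the same rate.

```cpp
#include <cassert>
#include <random>
#include <vector>

// One simulated task (hypothetical layout).
struct Task {
    double arrivalTime;  // seconds since the start of the run
    double size;         // service demand in seconds
};

// Generates `count` tasks with Poisson arrivals at rate `arrivalRate`
// (tasks per second) and exponentially distributed sizes with mean
// `meanSize` seconds. The seed makes a run repeatable on one platform.
std::vector<Task> generateTasks(int count, double arrivalRate,
                                double meanSize, unsigned seed) {
    assert(count >= 0 && arrivalRate > 0.0 && meanSize > 0.0);
    std::mt19937 rng(seed);
    std::exponential_distribution<double> interArrival(arrivalRate);
    std::exponential_distribution<double> taskSize(1.0 / meanSize);
    std::vector<Task> tasks;
    tasks.reserve(count);
    double clock = 0.0;
    for (int i = 0; i < count; ++i) {
        clock += interArrival(rng);  // time of the next arrival
        double size = taskSize(rng);
        tasks.push_back(Task{clock, size});
    }
    return tasks;
}
```

The same task list can then be fed to each balancing approach, so that all of them are compared on an identical workload.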
Step 2:
Calculation of total load:
1. For the central load balancing approach:
Check the load of all nodes and distribute load accordingly.
2. For the distributed load balancing approach:
Check the tables of loads and distribute load accordingly.
3. For the rotating algorithm:
Loads are calculated locally, and the loads in each cluster are added and averaged; if the threshold is exceeded, load is distributed by transferring jobs to neighboring nodes within the cluster, and then the rotation of nodes is done.
Step 3:
Trigger of load balancing: 2 mechanisms:
1) Threshold.
2) Periodically.
Step 4: Response Times
Response times were calculated by the simulation code for different numbers of nodes and for each of the balancing approaches.
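The balancing done inside one cluster by the rotating algorithm (Step 2, item 3) might look like the sketch below. Several points are assumptions rather than details from the paper: the function name balanceCluster, the reading of "threshold exceeded" as any single node's queue length being above the threshold, and the rule that leftover jobs go to the first nodes of the cluster.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Evens out the queue lengths of one cluster in place and returns the
// number of jobs that had to be transferred. Nothing happens unless at
// least one node is above the threshold (an assumed interpretation).
int balanceCluster(std::vector<int>& loads, int threshold) {
    assert(!loads.empty());
    bool overloaded = false;
    for (int load : loads) {
        if (load > threshold) overloaded = true;
    }
    if (!overloaded) return 0;

    int size = static_cast<int>(loads.size());
    int total = std::accumulate(loads.begin(), loads.end(), 0);
    int base = total / size;   // the cluster's NetLoad
    int extra = total % size;  // jobs that cannot be split evenly
    int moved = 0;
    for (int k = 0; k < size; ++k) {
        int target = base + (k < extra ? 1 : 0);
        if (loads[k] > target) moved += loads[k] - target;  // jobs sent away
        loads[k] = target;
    }
    return moved;
}
```

After every cluster has been balanced this way, the grouping is rotated so that the next round pairs each node with a different neighbor.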
Considerations:
To compare:
1) Centralized approach.
2) Distributed approach.
3) New algorithm.
We have to consider the following issues first:
The different approaches have different overheads depending on various aspects of the system, and each has advantages and disadvantages. A completely unbiased comparison is difficult, because the components of the overhead and the other delay factors are not the same for each approach. However, if all aspects of the simulations used for the comparison are kept constant, apart from those that are inherently different, the bias is kept to a minimum.
Some variables may apply only to some approaches and not to others.
Results of the simulation:
A simulation was constructed to do the following:
1) Perform as a network with multiple processors, i.e. consisting of n nodes; n can be varied for the purpose of the study.
2) A set task was given equally to all approaches; the task was divided into smaller tasks with exponentially distributed sizes, and these were then distributed to the nodes set up in the simulation.
3) The process of load balancing was carried out for each approach in the simulation.
4) The results obtained were plotted as response times vs. number of nodes for all the simulation runs, to compare the respective response times.
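Mathematical Considerations:
$$\text{Arrival rate} = \frac{\text{Load} \times \text{Number of processors}}{\text{Required number of processors} \times \text{Average execution time}}$$
The load is directly proportional to:
Arrival rate
Mean service time (= total service time / number of tasks)
The load is inversely proportional to the number of processors.
So the mathematical expression describing the load is:
$$\text{Load} \propto \frac{\text{Arrival rate} \times \text{Mean service time}}{\text{Number of processors}}$$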
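The arrival rate relation can be used in both directions: to find the arrival rate that produces a target load, or to recover the load from a measured arrival rate. The function names below are hypothetical, and the second function is simply the first one solved for the load.

```cpp
#include <cassert>

// Arrival rate = (load * processors) / (requiredProcessors * avgExecutionTime)
double arrivalRate(double load, int processors, int requiredProcessors,
                   double avgExecutionTime) {
    assert(processors > 0 && requiredProcessors > 0 && avgExecutionTime > 0.0);
    return (load * processors) / (requiredProcessors * avgExecutionTime);
}

// The same relation rearranged for the load.
double loadFromArrivalRate(double rate, int processors, int requiredProcessors,
                           double avgExecutionTime) {
    assert(processors > 0 && requiredProcessors > 0 && avgExecutionTime > 0.0);
    return (rate * requiredProcessors * avgExecutionTime) / processors;
}
```

For a target load of 0.5 on 8 processors, with tasks that each need 1 processor for 2 s on average, the arrival rate is 0.5 * 8 / (1 * 2) = 2 tasks per second.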
The tasks were split into jobs by the exponential distribution method.
The results obtained are given in the following graphs:
Graph1: Rotational-2nodes, Central, Distributed & No Balancing
ANALYSIS OF RESULTS:
The results show the following:
1) Plotting the average response times from the simulation against the number of nodes for no load balancing, the central approach, the distributed approach and the rotational approach with 2 nodes per cluster, we see that the rotational approach gives the best response times, followed by the central approach and then the distributed approach; all three are better than no load balancing. [Graph1]
2) Plotting the average response times from the simulation against the number of nodes for no load balancing, the central approach, the distributed approach and the rotational approach with 3 nodes per cluster, we see that the rotational approach again gives the best response times, followed by the central approach and then the distributed approach; all three are better than no load balancing. [Graph2]
Graph3: Rotational-2nodes, Rotational-3nodes & No Balancing
3) Plotting the average response times from the simulation against the number of nodes for the rotational approach with 3 nodes per cluster and with 2 nodes per cluster, we see that the response times are better with 3 nodes per cluster than with 2, and both are better than no load balancing. [Graph3]
4) To see the effect of increasing the number of nodes and how the behavior changes, Graph 4 shows that as the number of nodes increases, the average response times for the distributed and central approaches start increasing after a certain point, until eventually they reach a point where no balancing is better. The rotational approach, by contrast, continues to give better response times regardless of the number of nodes used. [Graph4]
Conclusions:
1) The simulations show that the proposed new algorithm has much lower overhead and faster response times than the classical approaches.
2) It is also scalable and efficient regardless of the size of the network used: as the number of nodes increases it still gives good response times, which is not the case for the other approaches.
3) This approach gives better results because it has virtually no overhead compared with the classical approaches.
4) Using clusters of 3 nodes gives better performance than 2 nodes per cluster. This is because 3 nodes are better at handling the local load than 2 nodes, owing to the simple mathematical rules involved.