Load Balancing of Virtual Machine Resources in Cloud Using Genetic Algorithm


Chandrasekaran K.¹ and Usha Divakarla²

National Institute of Technology Karnataka, Surathkal
e-mail: ¹[email protected]; ²[email protected]

Abstract. In cloud computing, load balancing is mostly achieved through VM migration. When entire VM resources are migrated, the migration cost becomes a problem because of the large granularity of VM resources, the large amount of data transferred during migration, and the suspension of the service. The goal of this project is therefore to design and implement a genetic algorithm for scheduling Virtual Machine resources in a cloud computing environment. The algorithm uses the current system state to achieve load balancing at deployment time, which reduces the need for Virtual Machine migration. It computes the influence on the system of deploying a Virtual Machine resource before actually deploying it, and then selects the solution with the least load imbalance.

Keywords: cloud computing, load balancing, virtual machine resource.

1. Introduction

Cloud Computing is an emerging computing technology that is rapidly consolidating itself as the next big step in the development and deployment of an increasing number of distributed applications.

It is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. Cloud computing supports Service Oriented Architecture (SOA) and Internet of Services (IoS) type applications, and offers fault tolerance, high scalability, availability, flexibility, reduced information technology overhead for the user, reduced cost of ownership, on-demand services, etc.

Central to these issues is the establishment of an effective load balancing algorithm. In cloud computing, load balancing is mostly achieved through VM migration [1]. When entire VM resources are migrated, the migration cost becomes a problem because of the large granularity of VM resources, the large amount of data transferred during migration, and the suspension of the service [2,3].

2. Literature Survey

A thorough literature survey was carried out on cloud computing [4,5], load balancing and genetic algorithms. Existing open-source cloud computing technologies were also explored in order to study how a genetic algorithm can be implemented using those technologies, and existing load balancing algorithms were reviewed.

Cloud computing definition [6]

“Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”

2.1 Load balancing

Load balancing is an even division of processing work between two or more computers, CPUs, network links or storage devices, ultimately delivering faster service with higher efficiency. It is accomplished through software, hardware or both, and it often uses multiple servers that appear to be a single computer system (also known as computer clustering). It is the process of improving the performance of a parallel and distributed system through a redistribution of load among the processors. A distributed system provides resource sharing as one of its major advantages, which gives better performance and reliability than a traditional system under the same conditions. One of the research issues in parallel and distributed systems is the development of effective techniques for distributing workload over multiple processors. The main goal is to distribute the jobs among processors so as to maximize throughput and resource utilization, maintain stability, and tolerate faults. Local scheduling, performed by the operating system, consists of the distribution of processes to the time-slices of the processor. Global scheduling, on the other hand, is the process of deciding where to execute a process in a multiprocessor system. It may be carried out by a single central or master processing element, or it may be distributed among the processing elements.

2.1.1 Load balancing schemes

Load balancing algorithms can be classified into static and dynamic approaches.

Static load balancing algorithm

Static load balancing algorithms assume that a priori information about all the characteristics of the jobs, the computing resources and the communication network is known and provided. Load balancing decisions are made deterministically or probabilistically at compile time and remain constant during runtime. The static approach is attractive because it is simple and requires minimal runtime overhead. However, it has two major disadvantages. Firstly, the workload distribution of many applications cannot be predicted before program execution. Secondly, it assumes that the computing resources and communication network are all known in advance and remain constant. Such an assumption may not apply to a distributed environment. As the static approach cannot respond to the dynamic runtime environment, it may lead to load imbalance on some resources and significantly increase the job response time [7–13].


Dynamic load balancing algorithms

Dynamic load balancing algorithms [18–21] attempt to use runtime state information to make better-informed decisions when sharing the system load. Dynamic schemes are widely used in modern load balancing methods because of their robustness and flexibility.

Most dynamic load balancing algorithms can be characterized by the following common parameters:

Centralized vs. decentralized

An algorithm is centralized if the parameters necessary for making the load balancing decision are collected at, and used by, a single resource, i.e., only one resource acts as the central controller and all the remaining resources act as slaves. The centralized approach is more beneficial when the communication cost is less significant, e.g., in a shared-memory multiprocessor environment.

Its limitations are that it has a single point of failure and does not scale. In the decentralized approach, by contrast, all the resources are involved in making the load balancing decision. Decentralized algorithms are more scalable and have better fault tolerance.

Cooperative vs. non-cooperative

An algorithm is said to be cooperative if the distributed components that constitute the system cooperate in the decision-making process. Otherwise, it is non-cooperative.

Adaptive vs. non-adaptive

If the parameters of the algorithm can change while the algorithm is running, the algorithm is said to be adaptive (to the changes in the environment in which it is running). Otherwise, it is non-adaptive.

Sender-initiated vs. receiver-initiated

In a sender-initiated (source-initiated) algorithm, an overloaded node starts negotiations with the other nodes for a potential process migration. If a negotiation is started by an underloaded node, the algorithm is said to be receiver-initiated (destination-initiated).

Preemptive vs. non-preemptive

If a process that has started its execution can be transferred to some other node, the algorithm is called preemptive. If, on the other hand, only those processes that are in the ready queue but have not yet received CPU service can be considered for migration, the algorithm is called non-preemptive.

2.1.2 Load balancing policies

An algorithm for the load balancing problem can be broadly categorized in terms of four policies.

They are:

Location policy

It is the policy that governs the search for a suitable node for migration. The common technique followed here is polling, on a broadcast, random, nearest-neighbor or roster basis.

Transfer policy

It is the policy that determines whether a node is suitable for participating in a process migration. One common technique is the threshold policy, where a node participates in a negotiation only when its load is less than (in a receiver-initiated algorithm) or greater than (in a sender-initiated algorithm) a threshold value.

Selection policy

It is the policy that deals with the selection of the process to be migrated. The common factors which must be considered are the cost of migration (communication time, memory, computational requirement of the process, etc.) and the expected gain of migration (overall speedup of the system, etc.).

Information policy

It is the component of the algorithm that decides what information regarding the state of the other nodes in the system is gathered and managed, and how and when. Such policies can be grouped into demand-driven, periodic, and state-change-driven policies.
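As an illustration of the threshold technique mentioned under the transfer policy, the check can be sketched in a few lines of Python. This is a minimal sketch, not code from this work: the function name `should_participate`, the role labels and the threshold values used in the example are hypothetical.

```python
def should_participate(load: float, threshold: float, initiated_by: str) -> bool:
    """Threshold transfer policy: decide whether a node joins a negotiation.

    initiated_by is "sender" (overloaded nodes look for a target) or
    "receiver" (underloaded nodes offer spare capacity).
    All names here are illustrative assumptions.
    """
    if initiated_by == "sender":
        return load > threshold
    if initiated_by == "receiver":
        return load < threshold
    raise ValueError("initiated_by must be 'sender' or 'receiver'")


# Hypothetical example: with a threshold of 0.7, a node at 0.9 load takes part
# in a sender-initiated negotiation but not in a receiver-initiated one.
overloaded_joins = should_participate(0.9, 0.7, "sender")
overloaded_offers = should_participate(0.9, 0.7, "receiver")
```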

2.2 Genetic algorithm

A genetic algorithm is a search and optimization technique premised on the evolutionary ideas of natural selection and genetics [14–17].

Selection

Chromosomes are selected from the population to be parents for crossover. The problem is how to select these chromosomes. There are many methods for selecting the best chromosomes, for example roulette wheel selection, Boltzmann selection, tournament selection, rank selection, steady-state selection and others.


Crossover

Crossover is a genetic operator that combines (mates) two chromosomes (parents) to produce a new chromosome (offspring). The idea behind crossover is that the new chromosome may be better than both of the parents if it takes the best characteristics from each of them. Crossover occurs during evolution according to a user-definable crossover probability. There are many crossover operator types, for example one point, two point, multi point, arithmetic, heuristic.

Mutation

Mutation is a genetic operator that alters one or more gene values in a chromosome from its initial state. This can result in entirely new gene values being added to the gene pool. With these new gene values, the genetic algorithm may be able to arrive at a better solution than was previously possible. Mutation is an important part of the genetic search as it helps to prevent the population from stagnating at a local optimum. Mutation occurs during evolution according to a user-definable mutation probability. This probability should usually be set fairly low (0.01 is a good first choice).

If it is set too high, the search will turn into a primitive random search. There are many mutation operator types, for example, flip bit, boundary, uniform, non-uniform, Gaussian.
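To make the generic operators concrete, the following Python sketch shows one-point crossover and flip-bit mutation on binary chromosomes. It illustrates the classic textbook operators only, not the problem-specific operators designed in Section 3; the function names and the use of plain lists of 0/1 genes are assumptions made for the example.

```python
import random


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length chromosomes at a random cut point."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must have the same length (at least 2)")
    cut = rng.randint(1, len(parent_a) - 1)  # cut strictly inside the chromosome
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b


def flip_bit_mutation(chromosome, mutation_prob, rng):
    """Flip each binary gene independently with probability mutation_prob."""
    return [1 - gene if rng.random() < mutation_prob else gene
            for gene in chromosome]


rng = random.Random(0)  # fixed seed only to make the example repeatable
child_a, child_b = one_point_crossover([0] * 6, [1] * 6, rng)
mutated = flip_bit_mutation(child_a, 0.01, rng)
```

With a mutation probability near 1.0 almost every gene is flipped on every call, which is the "primitive random search" behaviour the text warns about.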

3. Genetic Algorithm Design and Implementation

The basic genetic algorithm and its implementation are explained below.

Mathematical formulation

Consider the set of physical machines P = {P1, P2, . . . , Pn}, where n is the number of nodes in the cloud, and, on physical machine Pi, the set of virtual machines V = {V1, V2, . . . , Vmi}, where mi is the number of virtual machines on Pi. The cloud has one cloud controller and several nodes, each hosting multiple virtual machines.

The load of a physical machine can usually be obtained by adding the loads of the VMs running on it. Therefore the load of physical machine Pi is

$$P_i = \sum_{j=1}^{m_i} V_j.$$

Let V be the virtual machine that currently needs deploying. After V is arranged to a physical machine, the load of every physical machine will be

$$P'_i = \begin{cases} P_i + V, & \text{for the machine on which } V \text{ is deployed,} \\ P_i, & \text{for the others.} \end{cases}$$

The load on the cloud after VM V is arranged to physical machine Pi is

$$C = \frac{1}{n} \sum_{i=1}^{n} P'_i.$$
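The formulation above can be checked with a small numerical sketch. The Python below assumes that each VM load is a single number and that the cloud is given as a list of per-machine VM load lists; the function names and the sample loads are hypothetical and only serve to illustrate the definitions.

```python
def loads_after_deploy(pm_vm_loads, new_vm_load, target):
    """Return the load of every machine if the new VM is placed on `target`.

    pm_vm_loads[i] holds the loads of the VMs already running on machine i,
    so the current load of machine i is the sum of that list.
    """
    current = [sum(vm_loads) for vm_loads in pm_vm_loads]
    return [load + new_vm_load if i == target else load
            for i, load in enumerate(current)]


def cloud_load(pm_loads):
    """Average load over all physical machines (the quantity C)."""
    return sum(pm_loads) / len(pm_loads)


# Hypothetical loads in percent: machine 0 runs two VMs, machine 2 is empty.
existing = [[20, 10], [40], []]
after = loads_after_deploy(existing, new_vm_load=30, target=2)  # [30, 40, 30]
average = cloud_load(after)
```

Note that C itself does not depend on which machine receives the VM, since the total load is the same in every case; what changes with the placement is how far each machine's load lies from C.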

Genetic algorithm design

The detailed description of the Genetic Algorithm used is as given below.


Figure 1. VM scheduler.

Genome coding

The classic genetic algorithm encodes the gene structure of a chromosome in binary.

In this problem, however, there is a one-to-many mapping between physical machines and VMs.

Therefore, it is best to represent a chromosome as a tree or a multidimensional list. Every solution is represented as one tree or multidimensional list, as shown in Figure 1: the scheduling and managing node of the system on the first level is the root node, the N nodes on the second level stand for the physical machines, and the M nodes on the third level stand for the VMs on a given physical machine.

A multidimensional list is used to implement the encoding, as it is the most appropriate data structure for the project.
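One possible shape for such an encoding is sketched below in Python: the outer list plays the role of the root node, its index identifies the physical machine, and each inner list holds the VMs placed on that machine. The VM identifiers and the helper `host_of` are hypothetical; the actual layout used in the implementation is not given in the text.

```python
# One candidate solution (chromosome): outer index = physical machine,
# inner list = identifiers of the VMs placed on that machine.
solution = [
    ["vm1", "vm4"],  # physical machine 0
    ["vm2"],         # physical machine 1
    ["vm3", "vm5"],  # physical machine 2
]


def host_of(candidate, vm_id):
    """Return the index of the physical machine that holds vm_id."""
    for pm_index, vms in enumerate(candidate):
        if vm_id in vms:
            return pm_index
    raise KeyError(vm_id)


vm_counts = [len(vms) for vms in solution]  # number of VMs per machine
```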

Fitness function

The fitness function is

$$f(S) = \frac{100}{\sum_{i=1}^{n} |C - P_i|}.$$

The lower the differences |C − Pi|, the higher the value of fitness.
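A direct transcription of this fitness function into Python might look as follows. It is a sketch under two assumptions that the text does not state: the machine loads passed in are those that would result from the candidate solution, and a perfectly balanced solution (zero total deviation) is given infinite fitness. A practical implementation might instead add a small constant to the denominator so that selection probabilities stay finite.

```python
def fitness(pm_loads, scale=100.0):
    """Fitness of a solution: scale divided by the total deviation from the mean load."""
    mean_load = sum(pm_loads) / len(pm_loads)            # C
    deviation = sum(abs(mean_load - p) for p in pm_loads)
    if deviation == 0:
        return float("inf")  # assumption: perfect balance is treated as best possible
    return scale / deviation


# Hypothetical loads: the more even distribution scores higher.
even = fitness([10, 20, 30])    # total deviation 20
skewed = fitness([0, 30, 30])   # total deviation 40
```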

Selection

The roulette wheel method is applied to select chromosomes for reproduction. The selection probability of solution i is

$$P_i(S) = \frac{f_i(S)}{\sum_{j=1}^{n} f_j(S)},$$

where fi(S) is the fitness of solution no. i and n is the size of the genome.

First, the fitness of each individual in the current population is found using the fitness function, and the individual with the highest fitness is retained in the child population. Then the selection probability of each individual is computed from its fitness value.

Lastly, individuals are selected by rotating the wheel, so that an individual with high fitness has a higher probability of being selected while those with low fitness still have a chance of being chosen.
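The selection steps just described can be sketched in Python as below. The function names are hypothetical, fitness values are assumed to be finite and non-negative with a positive sum, and the fixed random seed is there only to make the example repeatable.

```python
import random


def selection_probabilities(fitnesses):
    """Selection probability of each individual: its share of the total fitness."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]


def fittest(population, fitnesses):
    """Individual with the highest fitness (retained unchanged in the child population)."""
    best_index = max(range(len(population)), key=fitnesses.__getitem__)
    return population[best_index]


def roulette_select(population, fitnesses, rng):
    """Spin the wheel once: each individual owns a slice proportional to its fitness."""
    spin = rng.random() * sum(fitnesses)
    running = 0.0
    for individual, f in zip(population, fitnesses):
        running += f
        if spin < running:
            return individual
    return population[-1]  # guard against floating-point round-off


rng = random.Random(42)
population = ["S1", "S2", "S3"]
scores = [1.0, 1.0, 8.0]
elite = fittest(population, scores)
parents = [roulette_select(population, scores, rng) for _ in range(4)]
```

Because the wheel is spun on cumulative fitness, low-fitness individuals keep a non-zero chance of being chosen, which matches the behaviour described above.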


Crossover

The idea behind crossover is that the new chromosome may be better than both of the parents if it takes the best characteristics from each of them.

The crossover operator is as follows:

Select two parental individuals S1 and S2 according to the selection strategy.

Combine the two parental individuals to form a new individual solution S0, which keeps the VM placements that are the same in both parents and discards those that differ.

Distribute the VMs whose placements differ between the two parents to the least-loaded nodes in the physical machine set, one at a time, until all of them have been placed, since the objective is to generate the solution with the best load balancing.
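The crossover steps above can be sketched in Python on the list-of-lists encoding. Several details are assumptions, since the text does not fix them: both parents are taken to contain the same set of VMs, each VM load is a single number looked up in a hypothetical `vm_load` dictionary, and the VMs that differ are redistributed in the order in which they appear in the first parent.

```python
def crossover(parent_a, parent_b, vm_load):
    """Keep placements shared by both parents, then place the remaining VMs
    one by one on the currently least-loaded physical machine."""
    n = len(parent_a)
    # Placements on which both parents agree are inherited unchanged.
    child = [[vm for vm in parent_a[i] if vm in parent_b[i]] for i in range(n)]
    placed = {vm for vms in child for vm in vms}
    # VMs whose placement differs between the parents still need a machine.
    pending = [vm for vms in parent_a for vm in vms if vm not in placed]
    loads = [sum(vm_load[vm] for vm in vms) for vms in child]
    for vm in pending:
        target = min(range(n), key=loads.__getitem__)  # least-loaded machine
        child[target].append(vm)
        loads[target] += vm_load[vm]
    return child


# Hypothetical parents that agree on v1 and v3 but disagree on v2.
vm_load = {"v1": 10, "v2": 20, "v3": 30}
s1 = [["v1", "v2"], ["v3"], []]
s2 = [["v1"], ["v3", "v2"], []]
s0 = crossover(s1, s2, vm_load)  # v2 ends up on the empty third machine
```

Redistributing heavier VMs first would be another reasonable choice; the order used here is only one option.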

Mutation

According to the mutation probability, individuals are selected for mutation. From the parental solution, any two physical machines (two dimensions) are selected, and one or more virtual machines are swapped between them to form a new solution.
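A minimal Python sketch of this swap mutation is given below. It makes two simplifying assumptions that go beyond the text: exactly one VM is exchanged per mutation (the text allows one or more), and only machines that currently hold at least one VM are eligible for the swap. The function names are hypothetical.

```python
import copy
import random


def swap_mutation(solution, rng):
    """Pick two physical machines and swap one VM between them."""
    mutant = copy.deepcopy(solution)  # leave the parental solution untouched
    candidates = [i for i, vms in enumerate(mutant) if vms]
    if len(candidates) < 2:
        return mutant  # nothing to swap
    a, b = rng.sample(candidates, 2)
    i = rng.randrange(len(mutant[a]))
    j = rng.randrange(len(mutant[b]))
    mutant[a][i], mutant[b][j] = mutant[b][j], mutant[a][i]
    return mutant


def maybe_mutate(solution, mutation_prob, rng):
    """Apply the swap mutation with probability mutation_prob."""
    if rng.random() < mutation_prob:
        return swap_mutation(solution, rng)
    return solution


rng = random.Random(0)
parent = [["a"], ["b"], []]
mutant = swap_mutation(parent, rng)  # the two single VMs trade places
```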

Genetic algorithm implementation

The experiment was performed on four machines, one as the front end, i.e., the cloud controller, and the other three as cluster nodes, using OpenNebula as the IaaS software to build the cloud. All the machines have the same configuration: Intel i7 processor, 8 GB RAM and 1 TB hard disk.

To interact with an OpenNebula cloud, the OpenNebula Cloud API (OCA) is available for different languages such as Java, Python and Ruby. These bindings are designed as wrappers for the XML-RPC methods, with some basic helpers, so using them still requires familiarity with the XML-RPC API and the XML formats returned by the OpenNebula core. The XML-RPC API was therefore used directly to implement this project, since it is the most straightforward approach. The XML-RPC API offers many methods for interacting with the cloud controller. Each request carries a session string, which has to be formed from the contents of the ONE_AUTH file set during OpenNebula configuration; with the default 'core' auth driver this is username:password.

These methods were used to implement the designed genetic algorithm:

one.vm.allocate

one.vm.action (hold)

one.vm.action (release)

one.vm.deploy

As shown in Figure 2, the one.vm.allocate method allocates a VM, which enters the pending state. The one.vm.action(hold) method is then called immediately to prevent OpenNebula's internal scheduling algorithm from assigning the VM to a node, and the VM moves to the hold state. Finally, once the genetic algorithm has found the optimal solution, the one.vm.action(release) and one.vm.deploy methods are used to deploy the VM to the node selected by that solution.

Figure 2. Virtual machine life cycle.

Virtual machine resources

VM resources are cpu, memory, storage, network bandwidth. The designed genetic algorithm was implemented considering cpu load and memory load.

Each host is characterized by a d-dimensional vector called the host’s vector of capacities:

H = (h1, h2, . . . , hd).

Each dimension represents the host’s capacity corresponding to a different resource such as CPU utilization, memory utilization, or disk bandwidth.

Similarly, each VM is represented by its vector of demands: V = (v1, v2, . . . , vd).

The load of the node is calculated as follows:

$$\text{Load of node} = \text{Volume} = \sum_{i} w_i \, v_i,$$

where $w_i$ is the weight assigned to resource i and

$$w_i = \frac{\sum_{vm} v_i}{h_i},$$

which is the ratio between the total demand for resource i and the capacity of the host.

In simplified form, it is as given below:

System load (load of physical machine in cloud) = l ∗ (cpu load) + m ∗ (memory load),

where l, m ∈ [0, 1]; l represents the cpu weightage and m represents the memory weightage. The weights must be set according to the type of application, i.e., cpu-bound or memory-bound, as in the calculation above.
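The simplified load formula translates into a one-line computation. The Python sketch below uses hypothetical load figures (in percent) and the weight pairs that appear later in the experiments; the function name and the range check are assumptions.

```python
def system_load(cpu_load, memory_load, cpu_weight, memory_weight):
    """Weighted load of a physical machine: l * cpu load + m * memory load."""
    for weight in (cpu_weight, memory_weight):
        if not 0.0 <= weight <= 1.0:
            raise ValueError("weights must lie in [0, 1]")
    return cpu_weight * cpu_load + memory_weight * memory_load


# Hypothetical machine at 80% cpu load and 40% memory load.
balanced = system_load(80, 40, 0.5, 0.5)      # equal weightage
cpu_bound = system_load(80, 40, 0.9, 0.1)     # cpu-oriented application
memory_bound = system_load(80, 40, 0.1, 0.9)  # memory-oriented application
```

For this machine the cpu-oriented weighting gives the highest load value, since its cpu is the busier resource.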

4. Results and Analysis

This section presents the results of the designed and implemented GA in comparison with other algorithms, followed by an analysis.

Results

48 VMs with different cpu and memory configurations, all running Linux as the operating system, were created, and the algorithms were run under stable and variant load conditions. For the variant load condition, artificial load was generated on the physical machines (80% on the first machine, 40% on the second and 10% on the third) to test all three algorithms. The results are as follows:

Load of physical machines under stable load condition with cpu and memory weightage

As seen in Figure 3, genetic algorithm performs better than round-robin and greedy algorithm under stable load conditions and when cpu weightage and memory weightage are equal to 0.5.

Figure 3. Load of physical machines under stable load condition with l= 0.5 and m = 0.5.

Table 1. Virtual machines allocation.

Physical m/c    Greedy    Round-Robin    Genetic Algorithm
1               48        16             0
2               0         16             22
3               0         16             26

Load of physical machines under variant load condition with cpu and memory weightage

Table 1 shows the virtual machines allocation for variant load condition with memory and cpu weightage equal for greedy, round-robin and genetic algorithm.

As shown in Figure 4, GA performs best, followed by the round-robin algorithm and then the greedy algorithm, for the variant load condition with equal cpu and memory weightage.

Load of physical machines under variant load condition with cpu weightage (l) = 0.9 and memory weightage (m) = 0.1, i.e., for cpu-oriented application

The Virtual Machines allocation is shown in Table 2 for greedy, round-robin and genetic algorithm under variant load condition with cpu weightage= 0.9 and memory weightage = 0.1.

As shown in Figure 5, GA does better load balancing as compared to round-robin and greedy algorithm under variant load condition with cpu weightage = 0.9 and memory weightage = 0.1, i.e., for cpu-bound application.

Figure 4. Load of physical machines under variant load condition with l= 0.5 and m = 0.5.

Table 2. Virtual machines allocation.

Physical m/c    Greedy    Round-Robin    Genetic Algorithm
1               48        16             0
2               0         16             8
3               0         16             40

Figure 5. Load of physical machines under variant load condition with l = 0.9 and m = 0.1.

Load of physical machines under variant load condition with cpu weightage (l) = 0.1 and memory weightage (m) = 0.9, i.e., for memory-oriented application

The Virtual Machines allocation is shown in Table 3 for greedy, round-robin and genetic algorithm under variant load condition with cpu weightage= 0.1 and memory weightage = 0.9.

As shown in Figure 6, GA does better load balancing as compared to round-robin and greedy algorithm under variant load condition with cpu weightage = 0.1 and memory weightage = 0.9, i.e., for memory-bound application.

Analysis

It is seen that the designed and implemented Genetic Algorithm outperforms the Greedy and Round-Robin algorithms in every initial load condition and application type tested, i.e., cpu-bound or memory-bound. In short, GA allocates VMs such that it achieves better load balancing. Because it takes the current system load into account when allocating new virtual machines in the cloud, it balances load in a way that reduces the need for VM migrations.

The parameters of the GA are crossover rate, mutation rate and population size. The experiment gave its best results with a population size of about 50, a crossover rate of about 0.6 and a mutation rate of about 0.3.

These three parameters are dependent on one another. If one is varied, the others also need to be varied in order to get optimal solutions within reasonable time. The experiment gave better results for population sizes in the range (15, 60). Outside this range, it either generated suboptimal solutions or took more time to generate the optimal solution, which shows that, depending on the size of the problem instance, the population size should be kept within a certain range to obtain an optimal solution in reasonably good computing time.

Table 3. Virtual machines allocation.

Physical m/c    Greedy    Round-Robin    Genetic Algorithm
1               48        16             0
2               0         16             8
3               0         16             40

Figure 6. Load of physical machines under variant load condition with l = 0.1 and m = 0.9.

5. Conclusion and Future Work

The conclusions derived from the proposed method are given below.

Conclusion

The study shows that various algorithms exist for load balancing in distributed environments. In a cloud computing environment, most of the load balancing work is done through VM migration. Unlike process migration, VM migration involves a huge data transfer that consumes a lot of bandwidth. The service also becomes slow during VM migration, which is costly for companies. So there is a need for load balancing in the cloud that reduces VM migrations. The proposed GA schedules VMs such that it achieves load balancing and reduces the need for VM migrations, as it allocates VMs to physical machines intelligently using the fitness function. It calculates the load a node would have after a VM is deployed on it, before actually deploying the VM, and finds the solution that gives the best load balancing. It was compared with the Greedy and Round-Robin algorithms, which are available in Eucalyptus.

Future work

The proposed solution does the load balancing considering cpu and memory load. It can be further expanded to include network I/O load and storage I/O load in the load calculation of a node.

Apart from resource monitoring and load balancing, the proposed solution can be enhanced to include VM consolidation, to save power and hence electricity costs, and a thermal component, since cooling costs for data centers are also large. Thus, complete VM management software can be developed that covers all these requirements, which conflict with one another, and that can be configured according to the current requirements of the cloud provider.

References

[1] Clark, C., Fraser, K. and Hand, S.: Live Migration of Virtual Machines[C]. Proceedings of the 2nd Int’l Conference on Networked Systems Design and Implementation, Berkeley, CA, USA (2005).

[2] Borja Sotomayor, Kate Keahey, Ian Foster and Tim Freeman: Enabling Cost-Effective Resource Leases with Virtual Machines. In Hot Topics session in ACM/IEEE International Symposium on High Performance Distributed Computing 2007 (HPDC 2007) (2007).

[3] Cherkasova, L., Gupta, D. and Vahdat, A.: When Virtual is Harder than Real: Resource Allocation Challenges in Virtual Machine Based Environments. Technical Report HPL-2007-25, February (2007).

[4] David Chappell: A Short Introduction to Cloud Platforms – An Enterprise Oriented View.

http://www.davidchappell.com/CloudPlatforms–Chappell.pdf (2011).

[5] Charkravati, A. K.: Cloud computing – Challenges and Opportunities.

http://www.cdap.in/html/pdf/articles/AKCcloud.pdf (2011).

[6] Mell, P. and Grance, T.: The NIST Definition of Cloud Computing, 2009.

http://csrc.nist.gov/groups/SNS/cloud-computing/cloud-def-15.doc (2011).

[7] Albert Y. Zomaya and Yee-Hwei Teh: Observations on Using Genetic Algorithms for Dynamic Load-Balancing. IEEE Transactions on Parallel and Distributed Systems, 12(9), September (2001).

[8] Sandeep Tayal: Tasks Scheduling Optimization for the Cloud Computing Systems. International Journal of Advanced Engineering Sciences and Technologies, 5(2), 111–115, February (2011).

[9] Martin Randles, David Lamb and Taleb-Bendiab, A.: A Comparative Study into Distributed Load Balancing Algorithms for Cloud Computing. In 24th International Conference on Advanced Information Networking and Applications Workshops (2010).

[10] Rich Lee and Bingchiang Jeng: Load-Balancing Tactics in Cloud. International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (2011).

[11] Iman Barazandeh and Seyed Mortazavi: Two Hierarchical Dynamic Load Balancing Algorithms in Distributed Systems. Second International Conference on Computer and Electrical Engineering (2009).

[12] Milan Soklic: Simulation of Load Balancing Algorithms: A Comparative Study. ACM-SIGCSE Bulletin, December (2002).

[13] Martin Randles, David Lamb and Taleb Bendiab, A.: A Comparative Study into Distributed Load Balancing Algorithms for Cloud Computing (2010).

[14] Marek Obitko: Genetic Algorithm Tutorials, 1998.

http://www.obitko.com/tutorials/genetic-algorithms/index.php (2011).

[15] Matthew Wall: Introduction to Genetic Algorithms.

http://lancet.mit.edu/~mbwall/presentations/IntroToGAs/P001.html (2011).

[16] John H. Holland: Genetic Algorithms. http://www2.econ.iastate.edu/tesfatsi/holland.gaintro.htm. (2011).

[17] Goldberg, E.: The Existential Pleasures of Genetic Algorithms. In Genetic Algorithms in Engineering and Computer Science, Winter G ed. New York, Wiley, 23–31 (1995).

[18] Nakrani, S. and Tovey, C.: On Honey Bees and Dynamic Server Allocation in Internet Hosting Centers. Adaptive Behavior, 12, 223–240 (2004).

[19] Rahmeh, O. A., Johnson, P. and Taleb-bendiab, A.: A Dynamic Biased Random Sampling Scheme for Scalable and Reliable Grid Networks (2008).

[20] Tateson, R., Halloy, J., Shackleton, M. and Deneubourg, J. L.: Aggregation Dynamics in Overlay Networks and their Implications for Self-Organized Distributed Applications. Comput. J. 52, 397–412, July (2009).

[21] Lu, Y., Xie, Q., Kliot, G., Geller, A., Larus, J. R. and Greenberg, A.: Join-idle-queue: A Novel Load Balancing Algorithm for Dynamically Scalable Web Services, Perform. Eval., 68, 1056–1071, November (2011).