ADAPTIVE CLOUD SCHEDULING


A dissertation submitted to The University of Manchester for the degree of Master of Science in the Faculty of Engineering and Physical Sciences

2014

Abdelkhalik Elsaid Mohamed Mosa


Table of Contents

Abstract
Declaration
Copyright
Acknowledgements
Chapter 1 : Introduction
1.1 Motivation
1.2 Research Aims, Objectives and Scope
1.3 Methodology
1.4 Contributions
1.5 Dissertation Organization
Chapter 2 : Background and Related Work
2.1 Cloud Computing
2.1.1 Cloud Deployment Models
2.1.2 Cloud Services Architecture
2.1.3 Cloud Computing Enabling Technologies
2.2 Virtualization Technology
2.3 Green Computing
2.4 Related Work
2.4.1 Heuristic Approach
2.4.2 Utility Functions
2.5 Cloud Computing Simulation Tools
2.5.1 Existing Cloud Simulators
2.5.2 CloudSim
Chapter 3 : Design
3.2 Input, Processing and Output Model
3.3 Utility Function Definition
3.4 Cost Model Development
3.4.1 The Finite Discrete Markov Chain Prediction Model
3.4.2 Modelling CPU Utilization using Markov Chain
3.4.3 Prediction CPU Utilization based on VM utilization
3.4.4 Calculating Energy Consumption
3.4.5 Calculating Possible Sources of SLA Violation
3.5 Optimization Algorithm
3.5.1 Representation design
3.5.2 Initial Population
3.5.3 Evaluation/Fitness function
3.5.4 Genetic operators
3.5.5 Convergence
Chapter 4 : Implementation
4.1 Steps for creating a basic cloud datacentre
4.1.1 Initializing the CloudSim Package
4.1.2 Creating the Data Centre
4.1.3 Creating the Cloud Broker
4.1.4 Creating the List of the Virtual Machines
4.1.5 Creating the cloudlets
4.1.6 Starting the Simulation
4.1.7 Stopping the Simulation
4.1.8 Printing the results
4.2 Implementing the Adaptive VMs Assignment
4.2.1 Finding Source and Destination Hosts
4.4 Integrating the utility based strategy with CloudSim
4.4.1 The Class Hierarchy
4.4.2 Elements of the Adaptive cloud scheduling system
4.5 Configuring the Experiments
4.6 Conclusion of the Implementation
Chapter 5 : Evaluation
5.1 Performance Metrics
5.2 Experiments Setup
5.2.1 Experiment 1
5.2.2 Experiment 2
5.2.3 Experiment 3
5.3 Conclusion
Chapter 6 : Conclusion and Future Ideas
6.1 Conclusion and Discussion
6.2 Future Work
6.2.1 Improving the Cost Model
6.2.2 Considering all Computing Resources
6.2.3 Multi-objective Optimization
6.2.4 Generalized Framework for Adaptive Cloud Scheduling
6.2.5 Improving the search
Bibliography


List of Figures

Figure 2-1: Cloud Deployment Models, from [10]
Figure 2-2: Cloud architecture stack diagram, from [10]
Figure 2-3: Operating System virtualization, from [10]
Figure 2-4: Hypervisor based virtualization, from [10]
Figure 2-5: Hosted Virtualization
Figure 2-6: Current State, Action, and Possible State, from [34]
Figure 2-7: The Action policies example
Figure 2-8: The goal policies example
Figure 2-9: Utility policy example
Figure 2-10: CloudSim architecture, from [5]
Figure 3-1: Green cloud system Architecture, from [45]
Figure 3-2: Input, Processing and Output of the Adaptive Scheduling Problem
Figure 3-3: CPU Utilization Transition State Diagram
Figure 3-4: CPU Utilization Prediction Algorithm Using Markov Model
Figure 3-5: The pseudo-code for computing CPU Utilization
Figure 3-6: Power Consumption according to different utilizations, from [31]
Figure 3-7: Predicted Energy Cost
Figure 3-8: Violation cost depending on the number of VMs in violation
Figure 3-9: Pseudo-code for Calculating the Cost of PDM
Figure 3-10: General framework for the evolutionary algorithm, from [49]
Figure 3-11: Solution vector representation
Figure 4-1: Steps for creating a basic cloud datacentre
Figure 4-2: The class hierarchy of the cloud adaptive scheduling problem
Figure 4-3: Monitoring, Analysis, Planning and Execution (MAPE) loop design model
Figure 5-1: Overall SLA violation to energy consumption after running configurations 1.1 of the first experiment, 10 times using the utility and the heuristics based approaches
Figure 5-2: Overall SLA violation to energy consumption after running configurations 1.2 of the first experiment, 10 times using the utility and the heuristics based approaches
Figure 5-3: Overall SLA violation to energy consumption after running configurations 1.3 of the first experiment, 10 times using the utility and the heuristics based approaches
Figure 5-4: Overall SLA violation to energy consumption running “Configuration 2.2” 10 times using the utility and the heuristics based approaches
Figure 5-5: Overall SLA violation to energy consumption running “Configuration 2.3” 10 times, using the utility and the heuristics based approaches
Figure 5-6: Overall SLA violation to energy consumption running “Configuration 3.1” and “configuration 3.2” 10 times using the utility and the heuristics


List of Tables

Table 5-1: Running “Configuration 1.1” 10 times using the utility based approach
Table 5-2: Running “Configuration 1.1” 10 times using the heuristics based approach
Table 5-3: Running “Configuration 1.2” 10 times using the utility based approach
Table 5-4: Running “Configuration 1.2” 10 times using the utility based approach
Table 5-5: The results of allocating 150 VMs to 150 hosts after running the experiment 10 times using the utility based approach
Table 5-6: The results of allocating 150 VMs to 150 hosts after running the experiment 10 times using the heuristics based approach
Table 5-7: Summary results of experiment 1
Table 5-8: The results of allocating 150 VMs to 100 hosts after running “Configuration 2.2” 10 times using the utility based approach
Table 5-9: The results of allocating 150 VMs to 100 hosts after running “Configuration 2.2” 10 times using the heuristics based approach
Table 5-10: The results of allocating 200 VMs to 100 hosts after running “Configuration 2.3”, 10 times using the utility based approach
Table 5-11: The results of allocating 200 VMs to 100 hosts after running “Configuration 2.3” 10 times using the heuristics based approach
Table 5-12: Summary results of experiment 2
Table 5-13: The results of allocating 50 VMs to 50 hosts with energy cost of “3” after running “Configuration 3.2” 10 times using the utility based approach
Table 5-14: Summary results of experiment 3


List of Codes

Code 4-1: Building the initial population of assignments
Code 4-2: Selecting parents for mutation
Code 4-3: Selecting parents for cross over
Code 4-4: The getMutated() method
Code 4-5: The getCrossover() method
Code 4-6: The population for the next generation
Code 4-7: calculation of the energy cost and the violation cost


Abstract

Cloud computing plays a significant role in today’s computing by delivering computing resources as pay-as-you-go services over the Internet. Many organizations and individuals all over the world rely on cloud environments to support their applications, platforms, and even infrastructure. As a result of the huge demand for cloud services, cloud providers have had to build enormous data centres to meet users’ growing needs. However, these huge data centres consume great amounts of power, which not only contributes to their operating costs but also increases carbon dioxide emissions. Energy-efficient algorithms are required to minimize the operating costs of data centres and to build green, energy-aware cloud environments.

Our goal in this work is to design and evaluate an optimized adaptive resource allocation and management algorithm that dynamically assigns virtual machines to existing hosts in the cloud data centre. This algorithm aims not only to reduce energy consumption but also to meet the agreed qualities of service (QoS). Existing work followed a heuristic approach for managing the energy-performance trade-off. In contrast, this work uses utility functions together with optimization to decide which VMs should be allocated to which physical hosts while achieving the desired goal. This work describes in detail the selection of the utility properties, the creation of the utility function, and the relevant optimization technique for maximizing the required utility. The proposed technique is validated by analysing and evaluating its performance using the CloudSim framework.


Declaration

No portion of the work referred to in this dissertation has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.


Copyright

i. The author of this dissertation (including any appendices and/or schedules to this dissertation) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

ii. Copies of this dissertation, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has entered into. This page must form part of any such copies made.

iii. The ownership of certain Copyright, patents, designs, trade marks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the dissertation, for example graphs and tables (“Reproductions”), which may be described in this dissertation, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

iv. Further information on the conditions under which disclosure, publication and commercialisation of this dissertation, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://documents.manchester.ac.uk/display.aspx?DocID=487), in any relevant Dissertation restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.manchester.ac.uk/library/aboutus/regulations) and in The University’s Guidance for the Presentation of Dissertations.


Acknowledgements

First and foremost, praises and thanks to Allah SWT, the Almighty, who has granted me countless blessings, the wisdom and perseverance during this research project, and indeed, throughout my life.

I express my deepest gratitude to my supervisor, Professor Norman Paton, for his great mentorship and wise supervision. Without your continuing support and encouragement, this dissertation would never have become a reality.

To my beloved parents, whom I love from the bottom of my heart more than anyone in this world: I appreciate all the sacrifices you made to raise me. I cannot find words to express my gratitude to my beloved brothers (Mahmoud, Mohamed, Kamal, Yasser, Salah and Emad) and sisters (Fatema and Amany). I am also grateful to my dearest wife Walaa for putting up with my late hours, my spoiled weekends and my bad temper. I love you.

Finally, I would like to thank Dr. Ahmed Sobhy and Dr. Tarek Gaber, I cannot forget your help and support. Moreover, love and thanks are extended to Eng. Mostafa Zayed, Dr. Abdulrahman Alghamdi, Eng. Abdullah Al-Ahmari, Dr. Mohamed El-Sawy


Chapter 1 : Introduction

This chapter presents the motivation for this research and sets out the aims and objectives of the project. It also provides an outline that summarizes what will be covered in the following chapters.

1.1 Motivation

Cloud computing is a new computing paradigm for delivering computing resources and services over the Internet [1]. Organizations, business owners and even individuals have started using cloud services extensively instead of building their own data centres to provide the required services. Due to the high demand for cloud services, cloud providers have had to build large-scale data centres to meet cloud users’ needs. For example, according to [2], in 2012 Amazon EC2 had more than 454,000 servers in 7 different regions all over the world, and this number is continuously increasing. These giant data centres, with hundreds of thousands of servers, consume great amounts of energy. This energy consumption has resulted in a notable increase in the datacentres’ operating costs, which also affects cloud users. Moreover, it harms the environment through the increased carbon dioxide emissions from these datacentres. As a result, reducing energy consumption in cloud datacentres has become a goal for cloud providers, both for their own benefit and for the environment.

Reducing wasted energy involves two parallel actions. The first is increasing the power efficiency of the computing nodes in the cloud infrastructure. The second is improving resource utilization, which can be done by deploying efficient resource monitoring and scheduling algorithms. Improving the infrastructure efficiency is a hardware issue, so it is out of the scope of this work. Building efficient resource scheduling algorithms that reduce energy consumption while meeting the service level agreement (SLA) is the main goal of this work.

Many of the datacentre hosts are continuously working even though they are underutilized. The average CPU utilization is less than 50% [3], and even if a server is completely idle, it still consumes about 70% of the maximum power that the server normally consumes [4]. Therefore, efficient resource allocation and management algorithms are required to play their role in alleviating the energy consumption problem and the resulting CO2 emissions. These algorithms should switch underutilized servers into sleep mode after migrating the virtual machines from these underutilized servers to other servers that are not underutilized. By turning these unused servers into sleep mode, the amount of energy consumed and CO2 emissions will be reduced. Moreover, the cloud providers’ return on investment (ROI) will increase as the total energy cost will be reduced. In addition to that, the resource allocation algorithm will migrate virtual machines from overloaded servers to meet the required quality of service (QoS), which achieves the desired level of user satisfaction.
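The saving from consolidation can be illustrated with a small worked example. The sketch below assumes a linear power model in which an idle host draws 70% of its peak power (the figure quoted above from [4]) and power rises linearly with CPU utilization. The 250 W peak value, the names host_power and total_power, and the assumption that a sleeping host draws no power are illustrative assumptions, not values taken from this dissertation.

```python
# Illustrative linear power model; all names and the 250 W peak are assumptions.
IDLE_FRACTION = 0.7   # an idle host draws about 70% of peak power, as quoted from [4]
PEAK_WATTS = 250.0    # hypothetical peak power of one host


def host_power(utilization: float) -> float:
    """Power drawn by an active host at the given CPU utilization (0.0 to 1.0)."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    idle = IDLE_FRACTION * PEAK_WATTS
    return idle + (PEAK_WATTS - idle) * utilization


def total_power(utilizations: list[float]) -> float:
    """Total power of the active hosts; sleeping hosts are assumed to draw nothing."""
    return sum(host_power(u) for u in utilizations)


# Two hosts at 30% each, versus one host at 60% with the other put to sleep.
before = total_power([0.3, 0.3])
after = total_power([0.6])
saving = before - after
```

Under these assumptions the same work is done in both cases, and the saving equals the idle power of the host that was put to sleep, which is why migrating VMs away from underutilized hosts is attractive.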

To sum up, the expected benefits of developing efficient resource monitoring and scheduling algorithms for the cloud users, cloud providers, and the environment were the driving force for conducting this research.

1.2 Research Aims, Objectives and Scope

The aim of the project is to design, implement and evaluate an optimized energy-aware adaptive resource scheduling algorithm. This algorithm should dynamically assign virtual machines to physical hosts in the cloud data centre. This adaptive assignment takes into account both minimizing power consumption and meeting the negotiated service level agreements (SLAs). The proposed strategy depends on the deployment of utility functions and evolutionary algorithms for finding an effective and efficient assignment of VMs. This assignment will be rated against the utility “fitness” function.

To achieve the aims of this research, the following objectives need to be accomplished:

1. Identifying and analysing existing techniques for the dynamic allocation of virtual machines to the physical hosts in cloud computing environments.

2. Selecting the properties that are required for the utility definition. This will be followed by defining the utility function that aims to capture the utility of an assignment without violating the cloud user’s and cloud provider’s constraints. This will inform the definition of the associated cost model for computing energy consumption and SLA violation costs.

3. The implementation of a search over the space of possible assignments of VMs to physical hosts, using genetic algorithms.

4. Extending the current CloudSim toolkit [5] by setting up the experiment and integrating it with the implementation of the optimization algorithm.

5. Evaluating and comparing the proposed utility-based policy with existing heuristics “action-based” techniques in [6] using performance metrics.

This research is concerned with the problem of dynamically allocating virtual machines to hosts using a utility based policy. In contrast, the task of dynamically allocating workload to the virtual machines is out of the scope of this research.

1.3 Methodology

The following research methodology was followed for achieving the research aim.

1. Background reading and literature review:

Background reading and a review of the literature were conducted to identify what has been previously addressed and how it was addressed. The reading involved reviewing research papers, journal articles and book chapters that describe the required strategies and techniques for addressing the research problem. The background reading was also crucial for understanding cloud computing concepts, the cloud deployment models and the cloud services architecture. Moreover, this reading clarified how efficient cloud computing environments can help in creating a green and energy-efficient environment.

The background reading was followed by a review of the state of the art in dynamic cloud scheduling techniques. This included developing a precise understanding of the heuristic approach, proposed in [6] and [31], for the dynamic consolidation and deconsolidation of VMs to save energy and meet the SLA. This approach was thoroughly reviewed as it is the one with which the results of this research will be compared. The review and the background reading were important so as not to re-invent the wheel. Moreover, they enabled building a solid background and a deep understanding of the technical concepts related to the research and to cloud computing generally.

2. Reviewing existing policies for handling self-management systems:

Understanding different policies used for achieving the self-management in cloud computing environments is crucial to this research, as adaptive cloud scheduling involves monitoring and self-management of the cloud datacentre. The review conducted showed that there are three major policy types that can be used for building autonomic computing systems [7]. These policies are either rule-based action policies, goal policies or utility functions based policies. The utility function based policy is the self-management policy that has been deployed for monitoring and managing the cloud datacentre in this research.

3. Choosing and understanding the simulation toolkit:

A survey of currently used cloud simulation tools was conducted. The survey showed that there are a number of cloud simulation tools, such as MDCSim, Green Cloud, iCanCloud and CloudSim. A comparison among these tools showed that iCanCloud is currently the most powerful cloud modelling and simulation tool [8], as it provides more features than any of the other toolkits listed. However, the proposed strategy in this research will be simulated using the CloudSim toolkit, so that the results of this project can be easily compared to the results found in the heuristic approach [6], which was implemented using CloudSim.

4. Understanding the heuristic approach and testing its results:

The heuristic approach in [6] was closely studied. The experiments reported for this approach were checked and analysed by rerunning them using the CloudSim toolkit. Moreover, the values of the performance metrics were thoroughly reviewed and reported.


5. Developing the proposed strategy using utility functions:

The first step in the development was finding where the implementation of the utility based approach should be hooked into the existing CloudSim toolkit. A genetic algorithm, guided by the utility function, was used to find a robust assignment that respects the constraints. The implementation of the genetic algorithm started by building the initial population of VM-to-host assignments and selecting parents for the mutation and crossover operations. This was followed by the implementation of the mutation and crossover functions. The utility function is implemented according to the description and design given in Chapter 3. CloudSim was extended to support the utility based approach by integrating the optimization algorithm with the cloud environment.

6. Project evaluation:

After setting up the experiment, application workloads (“cloudlets”) are simulated using synthetic data. The next step was choosing and defining the performance metrics. This was followed by defining the experiments to be conducted and the objectives of each. The objective of the first experiment is to assess the effectiveness of the utility based strategy in a lightly loaded datacentre. In this experiment the number of VMs is equal to the number of hosts to which they are allocated. The second experiment appraises the impact of a larger number of VMs per physical machine on both the energy consumption and the overall SLA violations. The last experiment assesses the impact of different cost ratios on the strategy. General conclusions are drawn for all the conducted experiments, with supporting graphs that compare the results of the proposed approach with those of the heuristic approach [6].
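The genetic-algorithm stages named in step 5 above (initial population, parent selection, crossover, mutation, next generation) can be sketched in a few lines. This is a minimal illustration in Python rather than the CloudSim-based Java implementation: the problem data, the cost weights, the toy utility function and the truncation selection scheme are all assumptions made for this example and are not the utility function or operators defined in Chapters 3 and 4.

```python
import random

# Hypothetical problem data: CPU demand of each VM and capacity of each host.
VM_DEMAND = [20, 35, 10, 40, 25, 15]
HOST_CAPACITY = [100, 100, 100]
ENERGY_COST = 1.0      # assumed cost per active host
VIOLATION_COST = 5.0   # assumed cost per overloaded host


def utility(assignment):
    """Toy fitness (higher is better): penalize active and overloaded hosts."""
    load = [0] * len(HOST_CAPACITY)
    for vm, host in enumerate(assignment):
        load[host] += VM_DEMAND[vm]
    active = sum(1 for x in load if x > 0)
    overloaded = sum(1 for x, cap in zip(load, HOST_CAPACITY) if x > cap)
    return -(ENERGY_COST * active + VIOLATION_COST * overloaded)


def mutate(assignment, rng):
    """Move one randomly chosen VM to a randomly chosen host."""
    child = list(assignment)
    child[rng.randrange(len(child))] = rng.randrange(len(HOST_CAPACITY))
    return child


def crossover(a, b, rng):
    """Single-point crossover of two parent assignments."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(generations=50, pop_size=20, seed=0):
    """Return the best assignment found; assignment[i] is the host of VM i."""
    rng = random.Random(seed)
    n_vms, n_hosts = len(VM_DEMAND), len(HOST_CAPACITY)
    # Initial population: random VM-to-host assignments.
    population = [[rng.randrange(n_hosts) for _ in range(n_vms)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives and acts as parents.
        population.sort(key=utility, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=utility)


best = evolve()
```

Because the surviving parents are carried over unchanged, the best utility in the population never decreases from one generation to the next. Other selection schemes and operators would fit the same skeleton.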

1.4 Contributions

The contributions of this research can be classified into five different areas:

1. Survey and literature review:

The first contribution involved a survey of the state of the art in dynamic resource allocation techniques in the cloud computing environments.


2. Utility definition:

The second contribution was in the definition of the utility function and identifying the utility properties. This utility function represents the objective of the optimization problem.

3. Cost model design:

The third contribution was the cost model, which predicts the percentage of CPU utilization. The predicted CPU utilization is used to compute both the energy consumption cost and the SLA violation (SLAV) cost, which are crucial for calculating the utility.

4. Metaheuristic optimization:

The fourth contribution was the design and implementation of the genetic algorithm that searches over the space of all possible assignments of VMs to physical hosts. This genetic algorithm seeks to find an effective assignment rated against fitness criteria.

5. Performance Evaluation:

The last contribution was the evaluation of the performance metrics resulting from running the proposed utility based approach. The evaluation also involved analysing the results and comparing them to the results of the heuristics based approach [6].
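To give a sense of the size of the search space mentioned in the fourth contribution, the sketch below counts the possible assignments under a simplifying assumption: every VM is placed on exactly one host and host capacity limits are ignored, so n VMs and m hosts give m to the power n assignments. The function name assignment_count is introduced here for illustration only.

```python
def assignment_count(num_vms: int, num_hosts: int) -> int:
    """Ways to place each VM on exactly one host, ignoring capacity limits."""
    return num_hosts ** num_vms


small = assignment_count(3, 2)     # small enough to enumerate by hand
large = assignment_count(50, 50)   # 50 VMs on 50 hosts
digits = len(str(large))           # number of decimal digits in that count
```

Here small is 8, while the count for 50 VMs and 50 hosts has 85 decimal digits. Exhaustive enumeration is therefore impractical even for modest datacentres, which motivates a metaheuristic search.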

1.5 Dissertation Organization

The dissertation contains six chapters. Chapter 1 has presented the motivation behind this research, its aims, objectives and scope, followed by the methodology and the main contributions. The remainder of the dissertation is structured as follows:

• Chapter 2 – Background and Related Work. This chapter examines the general background related to cloud computing and reviews energy-efficient computing. This is followed by a review of current dynamic cloud scheduling techniques and a review of utility functions and how they can be used to solve our problem. Finally, it surveys currently used cloud simulation tools and reviews CloudSim in depth.

• Chapter 3 – Design. This chapter describes the problem and the system model, in addition to the definition of the utility function and the design of the cost model. Moreover, it covers the representation design and the design of the genetic algorithm.

• Chapter 4 – Implementation. This chapter describes all the steps required for implementing the genetic algorithm and the utility function. The implementation also involves setting up the experiment using CloudSim and integration of the implemented algorithm with CloudSim.

• Chapter 5 – Evaluation. This chapter presents all the parameters required for setting up the experiment and the definition of the performance metrics. An analysis of the simulation results follows the running of the experiment.

• Chapter 6 – Conclusion and future work. This chapter summarizes the results obtained and draws conclusions. Finally, it lists a number of future ideas that need further research.


Chapter 2 : Background and Related Work

This chapter describes the general background and the literature review required as a starting point for this research. The background involves reviewing all information required for understanding cloud computing and its related technologies. This general background will be followed by a review of the previous work related to dynamic resource allocation techniques in cloud computing environments.

2.1 Cloud Computing

Cloud computing is a new computing paradigm in which computing resources (hardware, platform, and application software) are provided as elastic and on-demand services over the Internet [1]. In this computing model, all computations and data storage are done by remote hosts located in the cloud providers’ data centres. These remote data centres, which use virtualization technology to consolidate multiple virtual servers on physical servers, are what is technically referred to as the “cloud” [9]. Cloud computing offers many advantages over conventional datacentres. For example, it provides computing resources on demand and in an elastic manner, so that resources can be increased or decreased according to cloud consumers’ needs. Furthermore, cloud computing eliminates up-front commitments by cloud users, simplifies server operation and management, and improves resource utilization through the virtualization of physical servers [43].


2.1.1 Cloud Deployment Models

The cloud computing environment is called a public, or sometimes external, cloud when the cloud services are delivered on a pay-as-you-go basis to the public [10]. Therefore, in a public cloud, the cloud providers and users belong to different organizations or companies. Many international companies such as Amazon, IBM, Microsoft and Oracle provide public cloud services. Due to security considerations such as availability, data privacy and others, some organizations build their own private or internal clouds, which are accessible only by authorized users within the organization and are not publicly accessible to other, non-authorized users. In contrast to the public cloud, both cloud users and providers in a private cloud belong to the same organizational entity. In between, the hybrid cloud provides some services in-house by making use of a private cloud, while other services are provided by public clouds. On the one hand, hybrid clouds use private clouds to support security-critical services. On the other hand, non-critical services are performed by the public cloud to take advantage of its scalability and cost effectiveness [9], [10]. Figure 2-1, which is taken from [10], shows the different cloud deployment models.


2.1.2 Cloud Services Architecture

The architecture of cloud computing services is represented by a layered or stack model, in which each layer has specific functions and provides its own services [10]. The number of these layers differs between proposals, and new layers may be added over time to provide specific services. Regardless of the number of layers, cloud environments generally follow the Everything as a Service (XaaS) paradigm, where X is a variable that refers to the category of the provided service. According to [10], the cloud environment may support four distinct services in four main layers, namely Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS), and finally Human as a Service (HaaS). Figure 2-2, from [10], shows the cloud services layered architecture.

As shown in Figure 2-2, the IaaS layer consists of two sub-layers, namely the resource set and the infrastructure services. The resource set sub-layer represents the physical and virtualized resources such as processing, storage, memory, and bandwidth. The infrastructure services sub-layer provides services for the infrastructure, such as storage services as in Amazon S3 [11], Dropbox [12] and Google Big Table [13], or virtual server recovery as in Bluelock virtual recovery [14].


The PaaS layer provides developer-oriented services such as programming and execution environments and database management systems (DBMSs). Well-known examples of PaaS include Google App Engine [15] and Facebook platform [16].

SaaS provides applications aimed at end users, such as office suites, image processing or customer relationship management (CRM) applications. SaaS relieves end users from installing and updating software. Examples of SaaS include Google Docs [17], Adobe Photoshop Express [18] and Salesforce.com [19].

Finally, the HaaS layer shows that the cloud computing paradigm includes not only IT services but also services provided by people. The main category of the HaaS layer is crowdsourcing, where work can be done online using a crowd of people. Crowdsourcing enables the provision of services that are either impossible for computers or cannot be done easily or accurately by them, such as design services and accurate full text translation. Amazon Mechanical Turk [20] is an example of a marketplace that offers crowdsourcing services.

2.1.3 Cloud Computing Enabling Technologies

Technically speaking, cloud computing represents a natural evolution of computing paradigms rather than a revolution. Cloud computing makes use of a number of existing technologies such as distributed computing, autonomic computing, virtualization, web services, service oriented architecture (SOA) and Web 2.0. Among these technologies, virtualization is considered the most important, as it represents the most significant difference between a traditional data centre and a cloud data centre [21]. Because of its importance, and its relevance to this project on “Adaptive Cloud Scheduling”, virtualization technology will be discussed in some detail.

2.2 Virtualization Technology

Cloud computing would not be feasible without virtualization technology. Virtualization is a technique that breaks the physical computing resources in a host/server down into a number of fully isolated virtual environments or machines with different operating systems and applications [10]. Hardware virtualization helps reduce hardware costs and energy consumption, as one physical server can be logically used to create a group of virtual servers, which in turn increases resource utilization. Virtualization provides a number of tangible benefits for cloud providers, cloud users, and the environment. For cloud providers, virtualized servers use the physical resources more efficiently by fully utilizing the existing resources, which reduces operational costs. In addition, the management of virtual servers can be automated so that no user intervention is required to allocate virtual machines to physical hosts. For cloud users, the process of building servers and running applications becomes easier and faster. Moreover, cloud users pay only for the resources they use, which reduces the need for overprovisioning to cope with peak demand. Finally, cloud computing helps reduce power consumption, which in turn helps reduce CO2 emissions. Virtualization technology can be applied at different levels, such as the operating system, platform (either full virtualization or paravirtualization), storage, and application levels [10].
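The utilization gain from placing several virtual servers on one physical server can be made concrete with a small sketch. It uses a simple first-fit placement purely for illustration. First-fit is not the allocation strategy proposed in this dissertation, and the VM demands, the capacity of 100 and the function name first_fit are hypothetical.

```python
def first_fit(vm_demands, host_capacity):
    """Place each VM on the first host with enough spare capacity.

    Returns a list of hosts, each given as the list of VM demands placed on it.
    A new host is opened only when the VM fits on none of the existing ones.
    """
    hosts = []
    for demand in vm_demands:
        if demand > host_capacity:
            raise ValueError("VM does not fit on any host")
        for placed in hosts:
            if sum(placed) + demand <= host_capacity:
                placed.append(demand)
                break
        else:
            hosts.append([demand])
    return hosts


vm_demands = [30, 20, 50, 10, 40]   # hypothetical CPU demands, in % of one host
hosts = first_fit(vm_demands, host_capacity=100)

# Average utilization with one dedicated server per VM versus consolidated hosts.
dedicated_utilization = sum(vm_demands) / (100 * len(vm_demands))
consolidated_utilization = sum(vm_demands) / (100 * len(hosts))
```

With these numbers the five workloads fit on two hosts instead of five, raising average utilization from 30% to 75%.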

Operating system virtualization creates multiple identical isolated containers that use the same operating system kernel. These containers are also called jails, virtual private servers (VPS) and virtualization engines (VE). This technique is commonly used in virtual hosting environments, as the resulting overhead is small compared to other virtualization techniques. Figure 2-3, from [10], illustrates operating system virtualization.

In contrast to operating system virtualization, platform virtualization enables users to run different operating systems simultaneously. There are two types of platform virtualization: full virtualization and paravirtualization. Full virtualization emulates the entire virtual machine and deploys either a hypervisor or a hosted architecture. In the hypervisor architecture, also called bare-metal, the virtualization layer (the hypervisor or virtual machine monitor (VMM)) is installed directly above the hardware, as shown in Figure 2-4 from [10]. In the hosted architecture, the VMM is installed over the host operating system and the guest OSs are located above the VMM, as shown in Figure 2-5. VMware Workstation and Oracle VirtualBox are two examples of desktop virtualization. In contrast to OS virtualization, the hosted architecture allows for creating a number of virtual machines with different operating systems. In paravirtualization, the guest OS communicates with the hypervisor layer, which requires modifications to the guest operating system kernel.

Figure 2-4: Hypervisor based virtualization, from [10]

[Diagram layers: Hardware; Host Operating System; Virtual Machine Manager (VMM); Guest OS 1 and Guest OS 2; Applications]

In storage virtualization, multiple network storage devices are pooled together to be logically seen as a single storage device. This pooling of the storage resources makes backup and recovery easier because the distributed storage is managed centrally. Network virtualization allows the creation of a number of virtual local area networks (VLANs) from a single physical network. This type of virtualization makes cloud resources appear in the local network. In addition, cloud services can be accessed via virtual IPs instead of real ones. Application virtualization insulates the application software from the OS on which it is running [10]. It allows applications written for one OS to run on another. For instance, to run MS Windows applications on a Unix OS, the user can use software such as “Wine” [22].

Virtualization allows consolidating a number of VMs into one physical host instead of running multiple independent hosts. This VM consolidation makes creating new servers easier and possible on demand [23]. Furthermore, it contributes to minimizing energy consumption in cloud environments compared to traditional data centres. However, minimizing energy consumption in the IaaS layer of a cloud environment is still a research challenge. Solving this challenge efficiently contributes to building a green cloud computing environment [21].

2.3 Green Computing

Green computing, also called green IT or energy-efficient computing, aims at building energy-efficient computers or any other technology that is environment-friendly [21], [24]. Moreover, green computing is also concerned with reducing the energy consumption of computing resources by implementing energy-efficient techniques. These techniques aim to reduce the energy consumed by all computing resources, including CPU, storage, cooling systems, interfaces, and even network devices in the cloud data centre. Improving energy efficiency across all of these resources contributes to creating a green cloud computing environment [21].

Green cloud computing solutions are for the good of cloud users, cloud providers, and the environment. Building such solutions reduces energy consumption and operating costs, and hence saves money. Moreover, the amount of carbon dioxide emitted will be reduced, which has a positive impact on the environment.

2.4 Related Work

Cloud resource scheduling is the process of allocating either VMs onto physical hosts or workload onto VMs, according to constraints given by both the cloud users and providers [25]. According to one of the existing classifications, cloud scheduling techniques can be either static or dynamic [26]. Static scheduling requires information about all resources and tasks to be available by the time the application is scheduled. In addition, the resources are assumed to be available all the time. In dynamic scheduling, by contrast, the scheduler does not have any knowledge about the resources in advance, which means that any failure or change of resources is considered and handled through rescheduling.

Various proposals have addressed scheduling in the cloud and other distributed environments. Cardosa et al. [27] addressed the allocation of virtual machines to physical hosts with the goal of minimizing energy consumption in virtualized heterogeneous computing environments. The authors made use of existing features in virtualization technologies such as Xen [28] and VMware [29]. Existing VMMs already provide a number of handy parameters such as min, max, and shares. The min and max parameters specify the minimum and maximum amount of resources that can be allocated to the VM, while the shares parameter defines the amount of physical resources that can be shared among overloaded VMs. Previous allocation techniques did not take advantage of these parameters. Experiments showed that making use of these parameters improves the data centre utility by 47%. In their approach, the amount of resources allocated to VMs can be fine-tuned based upon power consumption and application utilities. The first problem with this work is that it only handles static allocation, which means that the resources assigned to the VMs cannot be adjusted at run-time. In addition, it does not uphold rigid SLAs and requires prior knowledge of the priorities of the applications to define the shares parameter. Finally, the CPU was the only resource taken into account when making allocation decisions.


Verma et al. [30] have implemented a cost-aware technique for the dynamic allocation of applications to virtual machines. The cost-aware technique handles both the power and migration costs. The authors have applied heuristics for the bin packing problem. They depended upon continuous optimization to handle the balance between the power consumption and performance. However, the proposed algorithms didn’t support strict SLA requirements, and the violation of the service level agreements can occur because of the workload variability. In addition, this work addressed the applications-to-VMs placement problem but didn’t handle the VMs-to-hosts allocation problem.

Despite the considerable effort invested in the resource allocation techniques in [27] and [30], they either concentrated on application placement or did not strictly handle the required SLA. Beloglazov et al. [31] developed an energy-aware VM allocation technique by following a heuristic approach that manages the energy-performance trade-off. This heuristic method is primarily based upon the analysis of historical data on the resource usage of the virtual machines. The work followed a divide-and-conquer approach, dividing the problem into four smaller sub-problems. This heuristic “action based” approach will be described in more detail, as it is the main work against which the proposed project will be compared.

2.4.1 Heuristic Approach

The first sub-problem addressed by Beloglazov et al. [31] was when to migrate a virtual machine. They divided this sub-problem further into host overload detection and host under-load detection. An overloaded host requires a number of its virtual machines to be transferred to another, non-overloaded host. This migration process helps in meeting the service level agreement (SLA) by avoiding performance degradation. On the other hand, all virtual machines in an underutilized host should be migrated to another host. After this migration, the emptied host should be switched into sleep mode to minimize energy consumption, which also improves the resource utilization of the host or hosts to which the virtual machines are migrated.


2.4.1.1 Host Overload/Underload Detection

Three basic techniques are used to determine when a host is considered overloaded [31]. The first one uses static thresholds for both the upper and lower bounds of resource utilization. The overall CPU utilization by all VMs in a host should always be between the upper and lower limits. This technique can be easily applied and may be effective in environments with static workload. However, this static threshold will not be appropriate for environments with dynamic, changeable and unpredicted workloads.

The other two techniques adjust the utilization threshold value based upon the applications’ workload patterns. The first one is based upon an adaptive utilization threshold using either Median Absolute Deviation (MAD) or Interquartile Range (IQR). MAD is a measure of statistical dispersion that is more robust than standard deviation and variance, as it is more resistant to outliers [31]. The second technique relies on either Local Regression (LR) or Robust Local Regression (LRR). According to the authors’ evaluation, the local regression-based algorithm was better than the other techniques.
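The MAD-based adaptive threshold can be illustrated with a short sketch. This is not the implementation evaluated in [31]: it assumes that the upper threshold takes the form 1 − s·MAD, and the safety parameter s, the function names and the sample utilization histories are hypothetical values chosen for the example.

```python
from statistics import median

def mad(values):
    """Median absolute deviation: median of |x - median(values)|."""
    m = median(values)
    return median(abs(x - m) for x in values)

def adaptive_upper_threshold(history, safety=2.5):
    """Assumed form of the threshold: 1 - safety * MAD(history)."""
    return 1.0 - safety * mad(history)

def is_overloaded(history, safety=2.5):
    """Compare the latest utilization sample with the adaptive threshold."""
    return history[-1] > adaptive_upper_threshold(history, safety)

# Hypothetical CPU utilization histories (fractions of host capacity)
stable = [0.50, 0.52, 0.51, 0.49, 0.50]
volatile = [0.20, 0.90, 0.40, 0.95, 0.85]

print(is_overloaded(stable))    # stable workload, high threshold
print(is_overloaded(volatile))  # variable workload, lower threshold
```

Under these assumptions, a more variable history lowers the threshold, so the volatile host is flagged at 85% utilization while the stable host at 50% is not.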

The authors proposed a straightforward algorithm for host under-load detection. The algorithm tries to move all virtual machines in the current host to other hosts. If this migration is possible, the host is considered under-loaded; otherwise, the host is kept active.

2.4.1.2 VM Selection

The second sub-problem was which VMs the algorithm should select for migration from the overloaded host. The authors proposed three different techniques for VM selection, namely the minimum migration time policy (MMT), the random selection policy (RS) and the maximum correlation policy (MC). Under the MMT policy, the selected VM is the one that can be migrated faster than any other virtual machine in the overloaded host, i.e. the one with the least migration time. The random selection policy chooses the virtual machine to be migrated according to a uniformly distributed discrete random variable whose values represent the set of VMs allocated to the host [31]. The maximum correlation policy selects the virtual machines whose CPU utilization has the highest correlation with that of the other VMs on the same host. This correlation is calculated by applying the multiple correlation coefficient. According to the authors’ evaluation, MMT was the best policy for VM selection.
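The MMT policy can be sketched as follows. The sketch assumes that migration time is estimated as the memory used by the VM divided by the available network bandwidth; this estimate, the `Vm` record and the sample values are assumptions made for illustration and are not taken from the text above.

```python
from dataclasses import dataclass

@dataclass
class Vm:
    name: str
    ram_mb: float  # memory currently used by the VM (assumed unit: MB)

def migration_time(vm, bandwidth_mbps):
    """Assumed estimate: seconds needed to copy the VM's RAM over the link."""
    return vm.ram_mb * 8 / bandwidth_mbps

def select_vm_mmt(vms, bandwidth_mbps):
    """Return the VM with the smallest estimated migration time."""
    return min(vms, key=lambda vm: migration_time(vm, bandwidth_mbps))

# Hypothetical VMs on an overloaded host with a 1000 Mbit/s link
vms = [Vm("vm-a", 2048), Vm("vm-b", 512), Vm("vm-c", 1024)]
chosen = select_vm_mmt(vms, bandwidth_mbps=1000)
print(chosen.name)
```

With a shared link, the policy reduces to choosing the VM with the smallest memory footprint, which is why `vm-b` is selected here.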

2.4.1.3 VM Placement

The third sub-problem was the placement of the VMs selected for migration from the overloaded and underutilized hosts. The VM placement problem is an instance of the bin packing problem. The authors solved this problem by modifying the best fit decreasing (BFD) algorithm [32] so that it is power-aware, and named the modified algorithm Power Aware BFD (PABFD). The PABFD algorithm works as follows: the algorithm sorts the list of virtual machines in decreasing order of their CPU utilization. Each virtual machine in the list is then allocated to the host that will yield the lowest increase in power consumption after the allocation. By applying this algorithm, the more energy-efficient machines are chosen.
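The PABFD steps described above can be sketched as follows. The linear power model between idle and peak power, the `Host` fields and all numeric values are assumptions introduced for this example; they are not taken from [31].

```python
from dataclasses import dataclass, field

@dataclass
class Host:
    name: str
    capacity_mips: float
    idle_watts: float   # hypothetical power draw at 0% utilization
    max_watts: float    # hypothetical power draw at 100% utilization
    used_mips: float = 0.0
    vms: list = field(default_factory=list)

    def power(self, used_mips):
        """Assumed linear power model between idle and peak power."""
        u = used_mips / self.capacity_mips
        return self.idle_watts + (self.max_watts - self.idle_watts) * u

def pabfd(vm_demands, hosts):
    """Place VMs (name -> MIPS demand) in decreasing order of demand,
    each on the feasible host whose power draw increases the least."""
    placement = {}
    ordered = sorted(vm_demands.items(), key=lambda kv: kv[1], reverse=True)
    for vm, mips in ordered:
        best, best_delta = None, float("inf")
        for h in hosts:
            if h.used_mips + mips > h.capacity_mips:
                continue  # this host cannot fit the VM
            delta = h.power(h.used_mips + mips) - h.power(h.used_mips)
            if delta < best_delta:
                best, best_delta = h, delta
        if best is None:
            raise ValueError(f"no host can fit {vm}")
        best.used_mips += mips
        best.vms.append(vm)
        placement[vm] = best.name
    return placement

hosts = [Host("h1", 2000, 100, 200), Host("h2", 3000, 90, 180)]
placement = pabfd({"vm1": 1500, "vm2": 1000, "vm3": 500}, hosts)
print(placement)
```

In this example every VM lands on `h2`, because each additional MIPS costs less power there than on `h1`; `h1` stays empty and becomes a candidate for being switched off.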

2.4.1.4 Switching idle hosts off

The fourth and last sub-problem was deciding which hosts should be turned on or off, and when. Switching an idle host off saves the power that the host/server would otherwise draw while entirely idle, which is about 70% of its peak power consumption. Some idle hosts should be reactivated and switched on in case of any violation of the SLA.

The work in [6], [31] used heuristics for the adaptive cloud scheduling problem. In this work, an alternative approach is deployed for finding an efficient and robust schedule for the same problem. The proposed approach utilizes utility functions along with genetic algorithms; therefore, the following section discusses utility functions and how they can be used for creating self-managing systems.

2.4.2 Utility Functions

Utility functions provide a common framework for creating self-managed and self-optimized autonomic computing systems by capturing the preferences of an agent [33]. This agent can be either a human being or software that acts on a human’s behalf, and its preferences are expressed in terms of a multi-attribute utility function. The agent selects the state that maximizes the utility, that is, the state with the largest utility value [34].

The adaptive cloud scheduling problem is a management and self-optimization problem, where the cloud datacentre manager should manage VM provisioning according to the desired objectives. Utility functions are a well-known method for representing an agent’s preferences in autonomic computing systems [34]. They are relevant for autonomic computing systems as they focus on the desired state by providing a clear and straightforward basis for decision making. Autonomic computing problems are usually addressed in real-world systems using one of three kinds of policy: rule-based action policies, goal policies, and utility functions.

Figure 2-6 from [34] exhibits a general framework that can be applied to the three previously stated policies. To understand how this framework acts, suppose that we have a system that has a number of states. Each state “S” is represented as a vector of attributes. The current state transitions to a new state “σ” based upon the action “a” taken. This means that different actions applied to the current state may lead to different new states.

Action policies are typically represented in the form of IF (condition) THEN (action), where the condition is the current state of the system. This type of policy doesn’t explicitly specify the new state that the system will reach after applying the action. To illustrate this concept, suppose that the goal is to minimize energy consumption while meeting the SLA in the cloud datacentre. Host utilization reflects energy consumption, as energy consumption is proportional to host utilization [44]. We assume that a host that is 100% utilized is over-utilized, and some VMs need to be migrated to another host to meet the SLA. Moreover, if the host utilization is less than or equal to 30%, for example, the host is underutilized and wastes power. In this case, all VMs in the under-utilized host should be migrated to another host and the host switched to sleep mode. The pseudo-code in Figure 2-7 demonstrates an example of using action policies in autonomic computing environments.

IF (HostUtilization >= 100%) THEN
    migrateSomeVMstoAnotherHost()
ELSE IF (HostUtilization <= 30%) THEN
    migrateAllVMstoAnotherHost()
END IF

Figure 2-7: The Action policies example

Where HostUtilization represents the percentage of host utilization, migrateSomeVMstoAnotherHost() migrates VMs from the host until it is no longer over-loaded, so that the SLA is met, and migrateAllVMstoAnotherHost() migrates all VMs from an under-utilized host to save energy. The heuristics-based approach presented in [6] and [31] utilized action policies for the adaptive cloud scheduling problem.

Goal policies do not specify exactly what should be done in the current state, and instead only specify the desired outcome. The system computes the action that will cause it to move from the current state to the state with the desired properties [34]. The pseudo-code in figure 2-8, demonstrates how goal policies can be used in autonomic computing environments.

30% <= HostUtilization <= 100%

Figure 2-8: The goal policies example

Therefore, only the desired state needs to be specified and the system will determine how to reach this state. In effect, goal policies perform a binary classification of the system state: each state is either accepted or rejected according to the goal policy. In the example in Figure 2-8, a conflict arises when the goal cannot be satisfied even though the current state is the best one available. For example, suppose that the utilization of a host is 20%. This state will be rejected by the goal policy, although it might be the only possible state because the VMs that account for the 20% utilization cannot be migrated to another host.
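The binary classification performed by a goal policy can be made concrete with a minimal sketch. The function names and the host utilization values are hypothetical; the bounds are those of the goal in Figure 2-8.

```python
def satisfies_goal(host_utilization, low=0.30, high=1.00):
    """Goal from Figure 2-8: 30% <= HostUtilization <= 100%."""
    return low <= host_utilization <= high

def classify(hosts):
    """Label each host state as accepted or rejected by the goal policy."""
    return {
        name: "accepted" if satisfies_goal(u) else "rejected"
        for name, u in hosts.items()
    }

# Hypothetical host utilizations (fractions of capacity)
states = classify({"h1": 0.65, "h2": 0.20})
print(states)
```

The host at 20% utilization is rejected outright, and the policy carries no information about whether any better state is actually reachable.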

Utility function policies can be viewed as an extension of goal policies, but the desired state need not be specified in advance. In contrast to goal policies, the desired state is computed by repeatedly selecting the state with the highest utility from the feasible ones. This means that utility functions do not perform the kind of binary classification done by goal policies. Figure 2-9 shows how the same problem previously addressed with action and goal policies can be expressed as a utility function.

$$\mathit{Utility}(a,t) = \mathit{Income}(a,t) - \mathit{TotalEnergyCost}(a,t) - \mathit{PredictedViolationCost}(a,t)$$

$$\mathit{Income}(a,t) = \sum_{VM \in a} \mathit{IncomePerVM}(VM, \mathit{costPerCPU})$$

Figure 2-9: Utility policy example

Where a is the assignment of the list of VMs to the hosts list in the datacentre, and t is the time spent in the assignment. The calculation of both the TotalEnergyCost(a,t) and PredictedViolationCost(a,t) is based on the host and VM utilization level. The utility is expressed in terms of monetary values or any other value. The objective is finding a robust assignment that maximizes the overall profit.
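How a utility policy selects an assignment can be illustrated with a small exhaustive search. The cost constants, the per-host energy charge and the overload-based violation count are simplifying assumptions made only for this sketch; they are not the cost model developed in this dissertation.

```python
from itertools import product

# All values below are hypothetical and chosen only for illustration.
HOSTS = {"h1": 1000, "h2": 1000}             # capacity in MIPS
VMS = {"vm1": 600, "vm2": 300, "vm3": 200}   # demand in MIPS
INCOME_PER_VM = 1.0        # income per VM per interval
ENERGY_COST_ACTIVE = 0.5   # cost of keeping one host switched on
VIOLATION_COST = 2.0       # penalty per VM placed on an overloaded host

def utility(assignment):
    """Utility = income - energy cost - predicted violation cost."""
    load = {h: 0 for h in HOSTS}
    for vm, host in assignment.items():
        load[host] += VMS[vm]
    income = INCOME_PER_VM * len(assignment)
    energy = ENERGY_COST_ACTIVE * sum(1 for used in load.values() if used > 0)
    violations = sum(1 for vm, h in assignment.items() if load[h] > HOSTS[h])
    return income - energy - VIOLATION_COST * violations

def best_assignment():
    """Enumerate every VM-to-host assignment and keep the highest utility."""
    candidates = (
        dict(zip(VMS, hosts)) for hosts in product(HOSTS, repeat=len(VMS))
    )
    return max(candidates, key=utility)

best = best_assignment()
print(best, utility(best))
```

Packing all three VMs onto one host would save energy but overload it, so the highest-utility assignment spreads the VMs over both hosts. Exhaustive enumeration is only workable for toy instances like this one.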

Theoretically, goal policies and utility function policies are more relevant to self-managing systems and autonomic computing than action policies, as they focus on the required state rather than the current state [34]. Moreover, utility functions are considered to be better than goal policies as they allow more flexible behaviour. Goal policies, by contrast, cannot express fine-grained agent preferences, as they only perform a binary classification of the current state of the system. This means that a goal policy accepts a state as long as it satisfies the policy, even if there is a better state that it should consider instead.

2.5 Cloud Computing Simulation Tools

With the background reading and literature review complete, and utility functions chosen for the adaptive cloud scheduling problem, the next consideration is how to implement and evaluate the research. Implementing a real cloud computing environment for testing the proposed solution is not simple and costs time and money. Moreover, evaluating the performance of cloud scheduling using different applications and service models under different conditions is a challenge in a real cloud environment.

Generally, simulation tools are widely used to simulate the behaviour of real devices and environments. There are many simulation tools used for research in computer networks, such as NS-2 [35] and OPNET [36], and in grid systems, such as GridSim [37] and MicroGrid [38]. Simulation tools make the implementation and evaluation of work in cloud computing environments feasible and easier. Furthermore, anyone who knows how to use the simulation tool can reproduce the tests.

2.5.1 Existing Cloud Simulators

Currently, there are a number of cloud simulation tools, such as CloudSim [5], GreenCloud [39], iCanCloud [40], and MDCSim [41]. CloudSim, GreenCloud and iCanCloud are open source, while MDCSim is commercial. CloudSim uses the Java programming language, GreenCloud uses C++/OTcl, MDCSim uses C++/Java, and iCanCloud uses C++. CloudSim, GreenCloud, and MDCSim support neither parallel experiments nor models of public cloud providers. In contrast, iCanCloud supports parallel experiments and provides models for the Amazon public cloud. According to the comparison in [40], iCanCloud is the most powerful of the four. However, CloudSim will be used for the implementation of this project so that the project results can be easily compared with the results of the heuristic approach [6], which was also implemented using CloudSim. The following section discusses the process of creating a cloud environment using the CloudSim toolkit.

2.5.2 CloudSim

CloudSim is a cloud modelling and simulation tool that is used for simulating both the cloud computing infrastructure and the cloud services. Technically speaking, CloudSim is an open source library that was built using the Java programming language. CloudSim was built at the University of Melbourne by the Computer Science and Software Engineering Department in the Cloud Computing and Distributed Systems (CLOUDS) Laboratory [42]. Developers can extend or replace existing Java classes, and new algorithms and scenarios can be added.

2.5.2.1 CloudSim Architecture

CloudSim was created based on a layered architecture with three basic layers namely, user code, CloudSim and the core simulation layer. This layered architecture makes it easier to add new classes or update existing ones. The CloudSim core simulation engine is located at the bottom of the CloudSim stack. This engine is responsible for processing different simulation events and creating main entities of the cloud such as the data centre, host, VM, and the broker. In addition, this layer is also responsible for managing queues and handling communication between existing entities. The CloudSim layer is located in the middle of the stack, with functionalities such as the allocation of existing VMs to hosts besides the allocation of CPU, memory, storage and bandwidth resources. All the work in this project will be done on the “CloudSim layer”. Finally, the user code layer is at the top of the stack which allows the cloud user to apply different cloud scenarios such as the number of hosts and VMs in addition to the number and size of applications based upon the user requirements. Figure 2-10, from [5], shows the CloudSim architecture.


2.5.2.2 Modelling the Cloud

The first step towards modelling a complete cloud environment is the creation of the required data centres. These data centres, with their physical hosts and VMs, model the cloud infrastructure. CloudSim provides the Datacenter class for creating the cloud environment data centres. CloudSim defines the characteristics of a data centre using the DatacenterCharacteristics class. This class represents the resource properties such as the resource architecture, the management policy (either time-shared or space-shared), the operating system, the cost and the time zone where the resource is located. After creating the data centre and defining its characteristics, the policy for allocating VMs to physical hosts should be defined. CloudSim provides a class named VmmAllocationPolicy which selects a host from the host list for VM deployment [5]. CloudSim uses the Host class for modelling the physical server. This class has a number of attributes that define the host capabilities, such as the number of available CPU cores, the amount of RAM, storage, and bandwidth. The VmScheduler class defines how the host CPU cores will be allocated to the VMs using either time or space sharing. The time-shared policy dynamically distributes the capacity of the existing cores among the VMs, whereas the space-shared policy assigns specific CPU cores to specific VMs.

Virtualization technology is at the heart of cloud computing, and CloudSim uses the Vm class for modelling the virtual machine. This class stores the VM characteristics such as the number of CPU cores per VM, the memory size, priority, and the virtual machine manager (VMM). The CloudletScheduler class is used for scheduling the application workload onto the available CPU resources. The cloud application services are represented by the Cloudlet class. Each cloudlet has its own size or length.

Furthermore, CloudSim models the cloud market based on a layered approach which is represented by cost metrics for both the IaaS and SaaS models. Moreover, CloudSim models the network behaviour, dynamic entities creation, federation of clouds, and datacentre power consumption.

To sum up, this chapter discussed the required background reading related to cloud computing and green cloud computing environments. Furthermore, the literature on adaptive cloud scheduling and the existing cloud simulation toolkits have been reviewed. This groundwork informs the choice of strategy and tools for solving the research problem. The decision is to make use of utility functions for solving the adaptive VMs-to-hosts assignment problem, and to use the CloudSim toolkit for simulating the research problem.

The next step should be about choosing the utility attributes, defining the utility function and designing the cost model. Everything related to the design is described in the following chapter.


Chapter 3 : Design

The next step that should follow the background reading and the review of the related work is designing the strategy that will be used for solving the problem. Designing the strategy is the purpose of this chapter, which begins with describing the green cloud computing environment architecture followed by the input, processing, and output (IPO) model of the adaptive cloud scheduling problem. The goal of the design is delineating the approach that will be followed for solving the problem. This approach begins with defining the utility function and designing the cost model. The final step is designing the genetic algorithm which searches the search space for an efficient assignment that maximizes the utility.

3.1 System architecture

Cloud architects are seeking to build a green cloud computing architecture that efficiently saves energy consumption and doesn’t violate SLAs. Figure 3-1, from [45], shows a general system architecture of a green cloud computing environment that can be set up to provide an energy-aware resource scheduling solution. This architecture essentially consists of four different layers ranging from the cloud consumers to the physical hosts in the datacentre.


The first layer represents the cloud consumers or their brokers who request cloud services from the cloud providers. In this architecture, the cloud consumer could be different from the cloud user, as the cloud consumer might be any organization that hosts its applications at a cloud provider so that these applications are accessible to the cloud users.

The second layer, the green service allocator, represents the interface between cloud consumers and the physical infrastructure of the cloud provider. This layer is divided into two sub-layers: the first represents the interface to the consumer, while the second is an interface to the cloud resources.

The consumer interface sub-layer is responsible for negotiating the service level agreement (SLA) with the cloud consumer and determining the penalties resulting from any violations of the SLA. The agreement involves the service cost, the expected quality of service (QoS) and the penalties in case of SLA violations. The consumer interface sub-layer analyses the consumer requests through the service analyser component, and the result of this analysis is either accepting or rejecting these requests for cloud services. Moreover, it assigns privileges and prioritizes users according to their characteristics through the consumer profiler component. Finally, it calculates the cost of the provided services through the pricing component.

The cloud interface sub-layer schedules the computing resources and services by the service scheduler component. The resource utilization and the associated costs are monitored by the accounting component and it makes use of historical data for improving the scheduling process. This sub-layer is also responsible for tracking energy consumption, through the energy monitor component, and this tracking information can be used later for determining whether a VM should be switched on, off, or turned into any other power saving modes such as the sleep mode. Most of the research in service scheduling and monitoring in cloud computing is performed in the sub-layers of the green service allocator layer.

The third layer represents the collection of VMs built above the physical servers in the cloud datacentre. These VMs are dynamically created, deleted and migrated among running hosts according to consumers’ needs. The last layer is the physical infrastructure of the cloud computing environment, which consists of a number of hosts. Energy-efficient adaptive resource scheduling should decide when and which physical servers should be turned on or off to save energy while meeting SLAs, which is the ultimate goal of this research.

3.2 Input, Processing and Output Model

Figure 3-2 exhibits the input, processing, and output (IPO) model, which provides a general schema of the proposed solution to the adaptive cloud scheduling problem.

Input: N physical hosts, each characterized by parameters such as CPU capacity, defined in million instructions per second (MIPS), RAM, storage and bandwidth. Cloud consumers submit requests for the allocation of M VMs characterised by the same parameters as the hosts but with different values. Cloud users or their brokers submit cloud applications with different requirements defined by the cloudlets.

Processing: The processing is divided into two main parts: the initial cloud scheduling algorithm and the adaptive cloud scheduling algorithm. The initial cloud scheduling is responsible for receiving new requests from the cloud consumers, and either accepts or rejects these requests according to the availability of resources. The initial allocation allocates VMs according to the resources required by the cloud consumer. The initial allocation of VMs to hosts can be seen as a bin packing problem where the physical hosts with variable capabilities represent bins with variable sizes, and the VMs that need to be allocated represent the items that need to be placed in the bins. This problem can be solved using the best fit decreasing (BFD) algorithm [32].
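The initial allocation can be sketched as a best fit decreasing pass over the requested VMs. The sketch considers a single resource (CPU in MIPS) and uses hypothetical VM demands and host capacities; a fuller model would also check RAM, storage and bandwidth.

```python
def best_fit_decreasing(vm_demands, host_capacities):
    """Place VMs largest-first, each in the feasible host that would be
    left with the least free capacity; unplaceable requests are rejected."""
    remaining = dict(host_capacities)
    placement = {}
    ordered = sorted(vm_demands.items(), key=lambda kv: kv[1], reverse=True)
    for vm, demand in ordered:
        feasible = [h for h, free in remaining.items() if free >= demand]
        if not feasible:
            placement[vm] = None  # request rejected: no host has capacity
            continue
        host = min(feasible, key=lambda h: remaining[h] - demand)
        remaining[host] -= demand
        placement[vm] = host
    return placement

# Hypothetical requests (MIPS) and host capacities (MIPS)
placement = best_fit_decreasing(
    {"vm1": 700, "vm2": 500, "vm3": 300, "vm4": 900},
    {"h1": 1000, "h2": 1200},
)
print(placement)
```

Here `vm4` fills most of `h1`, `vm1` and `vm2` fill `h2` exactly, and `vm3` is rejected because neither host has 300 MIPS left, which mirrors the accept-or-reject behaviour of the initial scheduler.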

The adaptive allocation uses the output of the initial allocation as its input to produce an adaptive scheduling according to resource utilization rather than the anticipated resource requirements used in the initial allocation. The adaptive allocation makes use of utility functions to make a robust and efficient assignment that maximizes the overall utility. A genetic algorithm is used to search for the assignment that maximizes the utility function. The details of the utility function definition, its attributes and the cost model development are presented in Section 3.3.


Output: The output is the adaptive VMs-to-hosts assignment according to the utilization and SLA requirements, together with the values of performance metrics such as the energy consumption and the percentage of SLA violation.

3.3 Utility Function Definition

The utility function specifies the self-managing policy adopted for the adaptive cloud scheduling problem. The overall goal of the utility is maximizing the benefit of the adaptive allocation of VMs by minimizing energy consumption and minimizing any sources of violation to the negotiated SLA. As a result, the properties of the utility function will be the total amount of energy consumption (E) and the percentage of SLA violation (SLAV). The high level definition of the utility of the assignment of the VMs list to the hosts list is formulated as follows in (1):

$$\mathit{Utility}(a, t) = \mathit{PredictedEnergyCost}(a, t) + \mathit{PredictedViolationCost}(a, t) + \mathit{PDMCost}(a, t) \tag{1}$$

where a is a vector representing the assignment of the list of VMs to the list of hosts in the datacentre; t is the total time of this assignment, which is the same as the scheduling interval; and PredictedEnergyCost(a,t) is the cost of the energy consumed due to the assignment. Any violation of the SLA exposes the cloud provider to a penalty that must be paid to the cloud users. This utility definition computes two different violation costs. The first is PredictedViolationCost(a,t), which represents the cost of SLA violation and is computed by counting the number of VMs that are in violation due to the assignment. The second is PDMCost(a,t), which is the penalty due to the degradation in VM performance resulting from the migration of VMs among hosts.

Maximizing the utility is achieved by minimizing the sources of cost given by the terms of the definition in (1). Therefore, the utility should be inversely related to the sum of the different costs, which gives the final definition of the utility function in (2):

$$\mathit{Utility}(a, t) = \frac{1}{\mathit{PredictedEnergyCost}(a, t) + \mathit{PredictedViolationCost}(a, t) + \mathit{PDMCost}(a, t)} \tag{2}$$

The total energy cost in the datacentre is the sum of the predicted energy costs of the individual hosts. The algorithms for calculating the predicted energy cost, the predicted violation cost and the cost of performance degradation due to migration are presented in the cost model development, Section 3.4 and its subsections.
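The inverse relationship in (2) can be illustrated with a minimal sketch. It assumes that the three cost terms have already been computed by the cost model and are expressed in the same unit; the function name, the parameter names and the numeric values are hypothetical and are not taken from the dissertation's implementation.

```python
def utility(energy_cost: float, violation_cost: float, pdm_cost: float) -> float:
    """Utility of an assignment as the inverse of its total predicted cost.

    The three arguments stand for PredictedEnergyCost(a, t),
    PredictedViolationCost(a, t) and PDMCost(a, t) for one assignment a
    over one scheduling interval t.
    """
    total_cost = energy_cost + violation_cost + pdm_cost
    if total_cost <= 0:
        # Assumption: a running datacentre always has a positive energy cost,
        # so a non-positive total indicates invalid input.
        raise ValueError("total cost must be positive")
    return 1.0 / total_cost


# Two hypothetical candidate assignments: the cheaper one has the higher utility.
candidate_a = utility(energy_cost=2.0, violation_cost=1.5, pdm_cost=0.5)
candidate_b = utility(energy_cost=1.0, violation_cost=0.5, pdm_cost=0.5)
print(candidate_a, candidate_b)
```

Because the utility is strictly decreasing in the total cost, a search procedure that maximizes this value selects the assignment with the lowest combined energy, violation and migration cost.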

3.4 Cost Model Development

The cost model involves the prediction of the amount of energy consumed due to the allocation. Predicting CPU utilization is crucial for calculating the expected energy consumption [4], [44].

The proposed algorithm for computing the expected CPU utilization is based on the percentage of VM utilization at the time of the allocation. In addition, the Markov chain prediction model can be used together with the proposed algorithm to predict the future level of CPU utilization. The following section therefore discusses Markov chains and how they can be used for predicting CPU utilization, which in turn is used for calculating energy consumption.
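The idea of predicting the next utilization level with a finite discrete Markov chain can be sketched as follows. This is only an illustration under stated assumptions: utilization is discretized into four equal-width states, transition probabilities are estimated by counting observed transitions, and the prediction is the most probable next state. The number of states, the function names and the sample history are hypothetical, and the dissertation's own model is the one described in the following section.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def discretize(utilization: float, n_states: int = 4) -> int:
    """Map a utilization in [0, 1] to a state index in 0..n_states-1."""
    return min(int(utilization * n_states), n_states - 1)


def estimate_transitions(states: List[int]) -> Dict[int, Dict[int, float]]:
    """Estimate transition probabilities from consecutive observed states."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return {
        s: {n: c / sum(row.values()) for n, c in row.items()}
        for s, row in counts.items()
    }


def predict_next(transitions: Dict[int, Dict[int, float]], current: int) -> int:
    """Return the most probable next state, or the current state if unseen."""
    row = transitions.get(current)
    if not row:
        return current
    return max(row, key=row.get)


# Hypothetical CPU utilization history of one host (fractions of capacity).
history = [0.10, 0.30, 0.35, 0.60, 0.30, 0.55, 0.80]
states = [discretize(u) for u in history]
transitions = estimate_transitions(states)
predicted = predict_next(transitions, current=1)
print(states, predicted)
```

The predicted state could then be mapped back to a representative utilization level and passed to an energy model, which is the role the prediction plays in the cost model.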
