ISBN 978-952-60-6913-5 (printed) ISBN 978-952-60-6912-8 (pdf) ISSN-L 1799-4934 ISSN 1799-4934 (printed) ISSN 1799-4942 (pdf) Aalto University School of Science Department of Computer Science www.aalto.fi
Virtual Machine Management for Efficient Cloud Data Centers with Applications to Big Data Analytics
Nguyen Trung Hieu
DOCTORAL DISSERTATIONS 140/2016
A doctoral dissertation completed for the degree of Doctor of Science (Technology) to be defended, with the permission of the Aalto University School of Science, at a public examination held at the lecture hall T2 of the school on 31 August 2016 at 12 noon.
Aalto University School of Science
Supervising professor
Assistant Professor Mario Di Francesco, Aalto University, Finland
Thesis advisor
Assistant Professor Mario Di Francesco, Aalto University, Finland
Preliminary examiners
Associate Professor Adlen Ksentini, University of Rennes 1, France
Associate Professor Dijiang Huang, Arizona State University, USA
Opponent
Assistant Professor Hong-Linh Truong, TU Wien, Austria
Aalto University publication series DOCTORAL DISSERTATIONS 140/2016 © Nguyen Trung Hieu
ISBN 978-952-60-6913-5 (printed) ISBN 978-952-60-6912-8 (pdf) ISSN-L 1799-4934 ISSN 1799-4934 (printed) ISSN 1799-4942 (pdf) http://urn.fi/URN:ISBN:978-952-60-6912-8 Unigrafia Oy Helsinki 2016 Finland
Abstract
Aalto University, P.O. Box 11000, FI-00076 Aalto www.aalto.fi
Author
Nguyen Trung Hieu
Name of the doctoral dissertation
Virtual Machine Management for Efficient Cloud Data Centers with Applications to Big Data Analytics
Publisher School of Science
Unit Department of Computer Science
Series Aalto University publication series DOCTORAL DISSERTATIONS 140/2016
Field of research Computer Science and Engineering
Manuscript submitted 19 January 2016
Date of the defence 31 August 2016
Permission to publish granted (date) 15 June 2016
Language English
Monograph Article dissertation Essay dissertation
Abstract
Infrastructure-as-a-Service (IaaS) cloud data centers offer computing resources in the form of virtual machine (VM) instances as a service over the Internet. This allows cloud users to lease and manage computing resources based on the pay-as-you-go model. In such a scenario, the cloud users run their applications on the most appropriate VM instances and pay for the actual resources that are used. To support the growing service demands of end users, cloud providers are now building an increasing number of large-scale IaaS cloud data centers, consisting of many thousands of heterogeneous servers. The ever increasing heterogeneity of both servers and VMs requires efficient management to balance the load in the data centers and, more importantly, to reduce the energy consumption due to underutilized physical servers. To achieve these goals, the key aspect is to eliminate inefficiencies while using computing resources.
This dissertation investigates the VM management problem for efficient IaaS cloud data centers. In particular, it considers VM placement and VM consolidation to achieve effective load balancing and energy efficiency in cloud infrastructures. VM placement allows cloud providers to allocate a set of requested or migrating VMs onto physical servers with the goal of balancing the load or minimizing the number of active servers. While addressing the VM placement problem is important, VM consolidation is even more important to enable continuous reorganization of already-placed VMs on the smallest possible number of servers. It helps create idle servers during periods of low resource utilization by taking advantage of live VM migration provided by virtualization technologies. Energy consumption is then reduced by dynamically switching idle servers into a power saving state. As VM migrations and server switches consume additional energy, the frequency of VM migrations and server switches needs to be limited as well.
This dissertation concludes with a sample application of distributed computing to big data analytics.
Keywords Virtual Machine (VM) consolidation, VM placement, VM migration, Multiple resource prediction, Data centers, Cloud computing, Big data analytics
ISBN (printed) 978-952-60-6913-5
ISBN (pdf) 978-952-60-6912-8
ISSN-L 1799-4934
ISSN (printed) 1799-4934
ISSN (pdf) 1799-4942
Location of publisher Helsinki
Location of printing Helsinki
Year 2016
Preface
This work has been carried out between September 2012 and August 2016 with the Distributed Systems, Mobile Computing and Security group (formerly Data Communication Software) at the Department of Computer Science, Aalto University School of Science, Finland. This doctoral dissertation would not have been completed without the support and guidance of a number of people and organizations during my studies toward a doctoral degree.
First of all, I would like to address special thanks to my former supervisor, Professor Antti Ylä-Jääski, who has given me the opportunity to undertake a PhD and financially supported the first part of my doctoral studies in the Department of Computer Science at Aalto University. I am grateful to Professor Mario Di Francesco, who was my instructor from the beginning of my doctoral studies and was appointed as my supervisor in September 2013. His expertise, professional conduct, and devotion to research allowed me to complete my doctoral degree. I would also like to thank Professor Sangtae Ha for hosting my research visit to the University of Colorado at Boulder, USA, between November 2015 and March 2016.
I would like to thank the other professors in the Department of Computer Science, Aalto University, for their profound knowledge, fascinating courses, and fruitful discussions during my doctoral studies.
I would like to sincerely thank Professor Adlen Ksentini and Professor Dijiang Huang, who served as the official preliminary examiners of this dissertation. I would also like to extend my gratitude to the dissertation opponent, Professor Hong-Linh Truong from TU Wien. Their valuable comments helped me improve the quality of this dissertation.
I would also like to extend my gratitude and thanks to the department secretaries and laboratory managers for supporting and creating an excellent working environment. I would like to thank Laura Kuusisto-Noponen, Maarit Vuorio, Kristiina Hallaselkä, Emma Holmlund, Katri Seitsonen and Jaakko Kotimäki for their support during my doctoral studies. It was because of their sincere help that I was able to concentrate on research.
Many thanks to all the past and current members of the Distributed Systems, Mobile Computing and Security group at the Aalto University School of Science. In particular, I thank Professor Tuomas Aura, Sanja Scepanovic, Pranvera Kortoçi, Vu Ba Tien Dung, Vu Hoang Nam and Ming Li for their friendship and help during my doctoral studies. I also thank my Vietnamese friends for sharing not only the happiness but also the difficulties of my life over several years abroad.
I appreciate the financial support from the Academy of Finland, the Helsinki Doctoral Education Network in Information and Communications Technology (HICT), the Flexible Spaces Services activity of the EIT ICT Labs, the Ulla Tuominen Foundation, and Google Inc.
I am always thankful to my parents as well as my younger sister and brother, for their endless love, unconditional support, and encouragement during my studies. I thank my wife Cao Hoang Thanh Nha for her love, inspiration, patience, and for making my life filled with happiness. They always wanted me to achieve this goal and supported me in every possible way. I am dedicating this dissertation to them.
Espoo, July 6, 2016
Contents
Preface
Contents
List of Publications
Author’s Contribution
List of Abbreviations
List of Tables
List of Figures
1. Introduction and Motivation
1.1 Research Problems and Questions
1.2 Methodology
1.3 Contributions
1.4 Thesis Organization
2. Efficient Virtual Machine Placement
2.1 Case Study: Energy Efficiency of Data Centers
2.2 Virtual Machine Placement
2.3 System Model and Considered Metrics
2.4 VM Placement for Balanced Resource Utilization
2.4.1 The MAX–BRU Algorithm
2.4.2 Summary of Results
3. Efficient Virtual Machine Consolidation
3.1 Overloaded and Underloaded Host Management
3.2 Virtual Machine Consolidation
3.3.1 Multiple Resource Selection
3.3.2 Multiple Usage Prediction
3.3.3 Overloaded and Underloaded Host Detection
3.3.4 VM Selection and Placement under Migration
3.3.5 The VMCUP–M Algorithm
3.3.6 Summary of Results
4. An Application of Distributed Computing to Big Data
4.1 Distributed Semantic Analysis
4.1.1 Pre–Processing of Wikipedia Data
4.2 A Big Data Application
4.2.1 Word Semantic Relatedness
4.2.2 Summary of Results
5. Conclusion
5.1 Contributions
5.2 Future Research Directions
Bibliography
List of Publications
This thesis consists of an overview and of the following publications which are referred to in the text by their Roman numerals.
I Nguyen Trung Hieu, Mario Di Francesco and Antti Ylä-Jääski. A Virtual Machine Placement Algorithm for Balanced Resource Utilization in Cloud Data Centers. In Proceedings of the 7th IEEE International Conference on Cloud Computing (CLOUD), Anchorage, Alaska, USA, pages 474-481. DOI: 10.1109/CLOUD.2014.70, 27 June - 2 July 2014.
II Nguyen Trung Hieu, Mario Di Francesco and Antti Ylä-Jääski. A Multi–Resource Selection Scheme for Virtual Machine Consolidation in Cloud Data Centers. In Proceedings of the 6th IEEE International Conference on Cloud Computing Technology and Science (CloudCom), Singapore, pages 234-239. DOI: 10.1109/CloudCom.2014.130, September 15-18 2014.
III Nguyen Trung Hieu, Mario Di Francesco and Antti Ylä-Jääski. Virtual Machine Consolidation with Usage Prediction for Energy–Efficient Cloud Data Centers. In Proceedings of the 8th IEEE International Conference on Cloud Computing (CLOUD), New York, USA, pages 750-757. DOI: 10.1109/Cloud.2015.104, 27 June - 2 July 2015.
IV Nguyen Trung Hieu, Mario Di Francesco and Antti Ylä-Jääski. Virtual Machine Consolidation with Multiple Usage Prediction for Energy–Efficient Cloud Data Centers. IEEE Transactions on Services Computing, Under review, 14 pages, March 2016.
V Nguyen Trung Hieu, Mario Di Francesco and Antti Ylä-Jääski. Extracting Knowledge from Wikipedia Articles through Distributed Semantic Analysis. In Proceedings of the 13th ACM International Conference on Knowledge Management and Knowledge Technologies (i-KNOW), Graz, Austria, pages 188-195. DOI: 10.1145/2494188.2494195, September 04-06 2013.
Author’s Contribution
Publication I: “A Virtual Machine Placement Algorithm for Balanced Resource Utilization in Cloud Data Centers”
The author of this dissertation is the primary contributor of the publication. He proposed the original idea of balancing resource utilization, designed the virtual machine placement algorithm and performed the evaluation by simulation.
Publication II: “A Multi–Resource Selection Scheme for Virtual Machine Consolidation in Cloud Data Centers”
The author of this dissertation is the primary author of this publication. He proposed the multiple resource selection scheme, designed the balanced multiple–resource utilization algorithm and performed the evaluation by simulation.
Publication III: “Virtual Machine Consolidation with Usage Prediction for Energy–Efficient Cloud Data Centers”
The author of this dissertation is the primary author of the publication. He proposed the original idea for virtual machine consolidation with usage prediction. He designed an efficient usage prediction approach, designed the consolidation algorithm and performed the evaluation by simulation.
Publication IV: “Virtual Machine Consolidation with Multiple Usage Prediction for Energy–Efficient Cloud Data Centers”
The author of this dissertation is the primary author of this publication. He proposed the original idea and the corresponding problem formulation, designed the multiple usage prediction as well as the virtual machine consolidation algorithm with prediction. He also performed the evaluation by simulation.
Publication V: “Extracting Knowledge from Wikipedia Articles through Distributed Semantic Analysis”
The author of this dissertation is the primary author of this publication. He proposed the original idea of using distributed computing for fast processing, designed the semantic relatedness metric and performed the experimental evaluation.
List of Abbreviations
IaaS Infrastructure–as–a–Service
VM(s) Virtual Machine(s)
VMM Virtual Machine Manager
CPU Central Processing Unit
QoS Quality of Service
SLA Service Level Agreement
RQ(s) Research Question(s)
I/O Input/Output
OS Operating System
KVM Kernel–based Virtual Machine
FIFO First In First Out
GCD Google Cluster Data
U Resource Utilization
Û Average Resource Utilization
B Resource Balance
B̂ Average Resource Balance
RH Resource Hottest
RT Resource Temperature
RC Resource Correlation
Max–BRU Maximized and Balanced Resource Utilization
MRS Multiple Resource Selection
BRMU Balanced Multiple Resource Utilization
UP Usage Prediction
VMCUP VM Consolidation with Usage Prediction
MUP Multiple Usage Prediction
VMCUP–M VM Consolidation with Multiple Usage Prediction
OHD–MUP Overloaded Host Detection with Multiple Usage Prediction
THR Static Threshold
THR–MUP Static Threshold with MUP
MAD Median Absolute Deviation
IQR Interquartile Range
LR Local Regression
LR–MUP Local Regression with MUP
FF First–Fit
BF Best–Fit
WF Worst–Fit
NF Next–Fit
FFD First–Fit Decreasing
PABFD Power–aware Best Fit Decreasing
PABFD–MUP Power–aware Best Fit Decreasing with MUP
DRR Dynamic Round Robin
MRT Minimum Resource Temperature
MMT Minimum Migration Time
MC Maximum Correlation
MU Minimum Utilization
RS Random Selection
BG Black–box and Gray–box
BG–MUP Black–box and Gray–box with MUP
VSR Volume–to–Size Ratio
RRV Resource Requirement Vector
TCV Total Resource Capacity Vector
UCV Utilized Capacity Vector
BFVD Best–Fit VectorDot
FFVD First–Fit VectorDot
WFVD Worst–Fit VectorDot
RBFVD Relaxed Best–Fit VectorDot
MM Market Mechanism
LAJ Latest Arrival Job
BL Backfill Lowest
BB Backfill Balance
CPULoad CPU Load–aware
mPP Min Power Parity
mPPH Min Power Placement with History
pMaP Balance between Power and Migration Cost
WSRel Word Semantic Relatedness
TF–IDF Term Frequency – Inverse Document Frequency
ESA Explicit Semantic Analysis
TSA Temporal Semantic Analysis
LSA Latent Semantic Analysis
LSAC LSA@CU Boulder
M&C The Miller and Charles’ dataset
R&G The Rubenstein and Goodenough’s dataset
List of Tables
1.1 How the publications address the research questions.
2.1 Statistics of machines in the Google Cluster Data.
3.1 The multiple–resource and multiple–step usage prediction (m = 1, d ∈ D and K = 3) in Publication IV.
4.1 Pearson’s correlation coefficient of the different approaches
List of Figures
2.1 Server power usage at varying utilization levels of server platforms from SPEC [118], i.e., (a) from idle to peak performance and (b) active power ratio at idle and at 30% utilization (relative to the 100% utilization).
2.2 Number of active servers as a function of the number of VM requests (Publication I) for the: (a) Amazon EC2 and (b) normal datasets.
2.3 Resource balance ratio as a function of the number of VM requests (Publication I) for the: (a) Amazon EC2 and (b) normal datasets.
3.1 Host management with MRS in Publication II: (a) overloaded server and (b) underloaded server detection.
3.2 Prediction of CPU resource usage in Google Cluster Data (Publication IV): (a) one–step prediction and (b) six–step prediction.
3.3 Prediction of memory resource usage in Google Cluster Data (Publication IV): (a) one–step prediction and (b) six–step prediction.
3.4 Host management with MUP (Publication IV): (a) overloaded server and (b) underloaded server detection.
3.5 CPU resource usage measured every five minutes over 24 hours of a cloud server in our university (Publication IV).
3.6 Impact of MUP on the average number of hot and cold spots per data center (Publication IV) for the: (a) GCD and (b) PlanetLab workloads.
3.7 Impact of MUP on the average number of active machines per data center (Publication IV) for the: (a) GCD and (b) PlanetLab workloads.
3.8 Number of active physical servers as a function of the number of VM requests (Publication II) for the different workloads compared with: (a) BG–DVol and (b) VectorDot and MM.
3.9 Number of active servers for MRT and MMT as a function of time (Publication IV) for the: (a) GCD and (b) PlanetLab workload traces.
3.10 Number of active servers for VMCUP–M and BG as a function of time (Publication IV) for the: (a) random and (b) GCD workload traces.
3.11 Resource utilization ratio as a function of the number of VM requests (Publication II) for the different workloads compared with: (a) BG–DVol and (b) VectorDot and MM.
3.12 Energy consumption of VMCUP–M for the GCD workload trace (Publication IV) with the: (a) THR and (b) LR overloaded host detection schemes.
3.13 Energy consumption of VMCUP–M for the PlanetLab workload trace (Publication IV) with the: (a) THR and (b) LR overloaded host detection schemes.
3.14 Energy consumption of VMCUP–M and BG under the THR and LR overutilized host detection approaches (Publication IV) with the: (a) random and (b) GCD workload trace.
3.15 Number of migrations per VM under the THR and LR algorithms with different VM selection policies (Publication IV) for the: (a) GCD and (b) PlanetLab workload traces.
3.16 Number of power state changes per data center under the THR and LR algorithms with different VM selection policies (Publication IV) for the: (a) GCD and (b) PlanetLab workload traces.
3.17 Number of migrations per VM of VMCUP–M and BG under the THR and LR algorithms (Publication IV) for the: (a) random and (b) GCD workload traces.
3.18 Number of power state changes per data center of VMCUP–M and BG under the THR and LR algorithms (Publication IV) for the: (a) random and (b) GCD workload traces.
3.19 SLA compliance under the THR and LR algorithms with different VM selection policies (Publication IV) for the: (a) GCD and (b) PlanetLab workload traces.
3.20 SLA compliance of VMCUP–M and BG under the THR and LR algorithms (Publication IV) for the: (a) random and (b) GCD workload traces.
4.1 The Distributed Semantic Analysis (DSA) system for
1. Introduction and Motivation
Infrastructure–as–a–Service (IaaS) cloud data centers, including Amazon EC2 [3], IBM Cloud [61], Google Compute Engine [47], and Rackspace [116], offer several types of virtual machines (VMs) that differ in their amount of resources based on the pay–as–you–go model [21, 138]. This allows cloud users to run their applications on the most appropriate VM instances and pay for the actual resources that are used [2]. To support the growing service demands of their users, cloud providers have recently begun to deploy an increasing number of large–scale IaaS cloud data centers, thus resulting in a huge energy consumption [91]. As reported by Analytics Press, energy consumption by data centers worldwide increased by about 56% from 2005 to 2010, and in 2010 likely accounted for between 1.1% and 1.5% of the total electricity use [70]. Additionally, the energy–related costs accounted for roughly 42% of the total costs of a data center [54].
IaaS cloud data centers currently consist of many thousands or even millions of heterogeneous servers and each server may host a set of heterogeneous VMs. For instance, Rackspace increased its total server count to 110,453 in the third quarter of 2014, up from 107,657 servers at the end of the previous quarter, and the number of servers continuously grows [69, 115]. Amazon EC2 had approximately 40,000 servers and launched 80,000 VMs daily in 2011 [33] and it has been estimated that one and a half million servers were running millions of VMs in 2014 [136]. Google was estimated to have around 1.8 million servers as of January 2012 and 2.3 million servers by early 2013 [63, 123]. Furthermore, the number of VM requests deployed in a cloud data center each day can be very large; it has been estimated that approximately 360,000 VM requests were deployed within 24 hours in a single data center in 2013 [9]. These numbers may be even larger today as cloud computing is much more popular than two years ago. Therefore, the ever increasing heterogeneity of both the physical servers and the VMs needs to be managed efficiently in order to achieve the following key goals: maximize resource utilization and reduce the energy costs [24, 58, 134, 150].
To address the problem of high energy use in IaaS cloud data centers, it is necessary to eliminate inefficiencies while using computing resources. This may be achieved by improving resource allocation and management [5, 74, 94, 103, 114]. However, such resources may also vary over time due to dynamic workloads that require resizing, creating, and/or terminating VMs. Furthermore, computing resources consist of multiple types (or dimensions), including CPU, memory, disk, and network bandwidth, and all need to be considered while designing energy–efficient mechanisms for resource management [149]. As a consequence, if the owners of cloud data centers cannot effectively schedule and reallocate heterogeneous VM instances and resource types, some hosts might become overloaded while other hosts might be underutilized. Eventually, such an unbalanced use of hosts could result in unnecessary activation of servers, thus consuming huge amounts of electrical energy and resulting in high operating costs [50, 87]. Moreover, when the actual VM resource utilization after VM placement is considered, an increase in the workload of some already–placed VMs may cause the corresponding physical servers to be overloaded, possibly affecting the quality of service (QoS) experienced by the hosted applications. In fact, the QoS level offered to cloud users needs to fulfill the service level agreement (SLA) of the cloud provider [11, 17]. On the other hand, physical servers may become underloaded due to a decrease in the VM workloads, but would still contribute to significant amounts of power consumption in data centers. In such a scenario, it is beneficial to move all VMs to other servers and switch the underloaded machine into a power–saving state (e.g., suspend) to save energy [16, 90].
One method to improve resource utilization and reduce the energy consumption is VM placement [59, 78, 79, 83]. In most scenarios, when cloud users submit their VM requests, some physical server(s) in the IaaS cloud data center will be selected to deploy the required VMs [82, 102]. In particular, VM placement allows cloud providers to allocate a set of VMs to physical servers with the goal of minimizing the number of active machines needed to accommodate the VMs [52, 60, 80, 97]. In this regard, choosing the most appropriate target machine in a large pool of physical servers to create the requested VMs provides a strong motivation for cloud providers to maximize their operational efficiency [65].
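As an illustration of the placement problem described above, the following minimal sketch places VM requests with a greedy First–Fit rule over two resource dimensions (CPU and memory) and counts the servers that end up active. It is not the algorithm proposed in this dissertation; the `Server` class, the `first_fit` function, and all capacities and request sizes are hypothetical values chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """A physical server with two resource dimensions (hypothetical model)."""
    cpu_capacity: float
    mem_capacity: float
    cpu_used: float = 0.0
    mem_used: float = 0.0
    vms: list = field(default_factory=list)

    def fits(self, cpu, mem):
        # A VM fits only if no resource dimension would be overcommitted.
        return (self.cpu_used + cpu <= self.cpu_capacity
                and self.mem_used + mem <= self.mem_capacity)

    def host(self, name, cpu, mem):
        self.cpu_used += cpu
        self.mem_used += mem
        self.vms.append(name)


def first_fit(requests, servers):
    """Place each (name, cpu, mem) request on the first server with room.

    Returns the number of active servers, i.e., those hosting at least one VM.
    """
    for name, cpu, mem in requests:
        target = next((s for s in servers if s.fits(cpu, mem)), None)
        if target is None:
            raise RuntimeError(f"no server can host {name}")
        target.host(name, cpu, mem)
    return sum(1 for s in servers if s.vms)


# Illustrative data: three identical servers and four VM requests.
servers = [Server(cpu_capacity=8, mem_capacity=16) for _ in range(3)]
requests = [("vm1", 4, 4), ("vm2", 4, 8), ("vm3", 2, 4), ("vm4", 2, 2)]
active = first_fit(requests, servers)
print(active, [s.vms for s in servers])
```

In this example the third server stays empty and could remain in a power–saving state. First–Fit ignores how evenly the resource dimensions are used on each server, so CPU may be exhausted while memory is left idle.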
While addressing the VM placement problem is important to minimize the number of active servers starting from the VM submission, VM consolidation enabled by virtualization technologies [99, 122] is even more important to support continuous consolidation of already–placed VMs on the smallest possible number of physical servers [29, 35, 89, 110]. Virtualization allows multiple VMs to be placed into the same physical server and each VM may run multiple application tasks, thus ensuring that the server is optimally utilized while reducing the energy consumption [76, 101, 126]. Multiple resource types (i.e., CPU, memory, storage, and network bandwidth) may be dynamically provisioned for a VM according to the current resource requirements [1, 102]. This enables the consolidation of VMs in the minimum number of servers to switch off unused machines, with the goal of reducing the total power consumption [12, 104, 145]. By taking advantage of virtualization technologies, cloud providers can increase the energy efficiency of the cloud data centers and scale the costs of the offered virtualized resources.
Another capability provided by virtualization is live migration, which is the ability to transfer a VM between physical servers with little or no migration downtime during the process [27, 56, 142]. By using live migration, VMs may be dynamically consolidated into a few physical servers, then unused machines (i.e., those that do not host any VMs) may be switched off [12, 36, 80, 144, 145]. This approach helps improve the resource utilization and allows energy savings in compliance with the SLA [49]. VM consolidation with live migration is closely related to the problem of: (1) determining when a server is overloaded (i.e., a hot spot), then migrating candidate VMs from such a server to maintain a certain QoS; and (2) determining when a server is underloaded (i.e., a cold spot), then migrating all VMs from such a server to minimize energy consumption. Idle hosts are automatically switched to a low–power mode to reduce the energy consumption. When required, the low–power hosts are reactivated to accommodate newly–created VMs or VMs being migrated.
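The two detection steps above can be sketched with static utilization thresholds applied to the currently observed load of each host. This is only an illustrative baseline under stated assumptions: the `classify_host` function, the threshold values (0.8 and 0.2), and the host measurements are hypothetical and are not taken from the dissertation.

```python
def classify_host(utilization, upper=0.8, lower=0.2):
    """Label a host from its current per-resource utilization.

    utilization maps a resource name to a fraction in [0, 1].
    The thresholds are assumed values used only for illustration.
    """
    # Hot spot: at least one resource exceeds the upper threshold.
    if any(u > upper for u in utilization.values()):
        return "hot"
    # Cold spot: every resource is below the lower threshold.
    if all(u < lower for u in utilization.values()):
        return "cold"
    return "normal"


# Illustrative measurements for three hosts.
hosts = {
    "h1": {"cpu": 0.92, "mem": 0.40},
    "h2": {"cpu": 0.10, "mem": 0.15},
    "h3": {"cpu": 0.55, "mem": 0.30},
}
labels = {name: classify_host(u) for name, u in hosts.items()}
print(labels)
```

A hot host would then have some of its VMs migrated away, whereas a cold host would have all of its VMs migrated so that it can be switched to a low–power mode. The sketch looks only at a single observation per host.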
However, it is challenging to decide whether a host is overloaded or underutilized due to the diverse set of user applications and the variability of the VM workloads with time, especially in a cloud data center with millions of machines. Because existing solutions base their decisions purely on the last observed utilization, they may cause unnecessary migrations, thus increasing the overhead: the energy for VM migration, the performance degradation of the hosted applications, and extra network communications [37, 88, 125, 147, 148]. For those reasons, even though live migration is a suitable solution for managing VM populations, it is important to avoid unnecessary VM migrations. For instance, commercial IaaS platforms such as Amazon EC2 and Microsoft Azure do not use VM migration at all. In any case, hot and cold spots should be carefully determined, in order to limit the frequency of VM migrations. During migration of VMs, if there is no active physical server with sufficient resources available, an inactive server is automatically started and the selected VMs are allocated to such a machine. In addition, when a host is underutilized, all VMs from such a host are selected for migration if they can be consolidated into other hosts without causing overutilization. Idle servers are then switched to a low–power state to save energy. However, switching the power state of a host from idle to a low–power state and vice versa consumes additional energy [50, 51, 71, 85, 133]. Therefore, while VM migrations and server switches are essential for power reduction, it is even more important to avoid massive migrations and limit power state switches.
This doctoral dissertation is motivated by the limitations of the current VM management algorithms implemented in the OpenStack, OpenNebula, and Eucalyptus cloud management middlewares [57, 107, 113]. Such IaaS platforms use very simple VM management algorithms, i.e., round robin, greedy First–Fit, and load–aware policies, and do not enable energy–efficient cloud infrastructures. Therefore, VM management solutions for large–scale data centers must be designed to effectively take power consumption into account while guaranteeing the application QoS level. Accordingly, the scientific contribution of this dissertation is VM management for energy–efficient IaaS cloud data centers. In particular, to achieve energy efficiency in cloud infrastructures, two key VM management algorithms (specifically, for VM placement and VM consolidation) are proposed. They are capable of: (1) limiting the number of active physical servers; (2) creating idle servers, then transitioning such idle servers into a power–saving state and reactivating them once required; and (3) minimizing the number of VM migrations and server switches along with the number of activated servers. In addition, big data applications run on large clusters within data centers, and the related energy costs make their processing time an extremely critical component. MapReduce [30] and its open–source implementation Hadoop [39] have emerged as the leading computing platforms for big data analytics. Reducing the processing time of big data applications based on Hadoop MapReduce leads to a significant decrease in the overall operating cost of a data center. This dissertation further addresses data–intensive applications by designing a distributed computing system for big data analytics, specifically, for semantic analysis. The proposed system helps reduce the execution time for running data–intensive processes, thus improving the system performance and, as a side effect, reducing the energy consumption.
1.1 Research Problems and Questions
This dissertation tackles the above–mentioned research challenges by proposing algorithms for managing VMs with the goal of optimizing the performance of an IaaS cloud data center in terms of balanced use of resources and reduced energy consumption. In particular, the following research problems are considered:
• Energy Efficiency. Energy efficiency is a major concern for IaaS cloud data center providers. By taking advantage of virtualization technologies, cloud providers can increase the energy efficiency of their infrastructures by minimizing the number of active servers along with the number of VM migrations. Furthermore, energy saving may be achieved by switching idle servers to a power–saving state while minimizing the number of power state changes (i.e., between on and off). In large–scale cloud data centers, improving the energy efficiency by only a few percent may save millions of dollars in electricity costs [67].
• Scalability. IaaS cloud data centers to date consist of millions of heterogeneous physical servers and VMs. To improve the utilization of physical resources, the management of IaaS clouds would ideally employ optimal VM placement and VM consolidation, which are known to be NP–hard problems [53, 129]. Thus, a practical management of such large–scale data centers requires highly scalable VM management algorithms. Consequently, designing VM placement and consolidation algorithms that deal with an ever growing number of heterogeneous servers and VMs is a challenge with a high level of complexity.
• Performance. In IaaS clouds, heterogeneous resources such as computation and storage are provisioned on–demand by cloud providers in the form of VMs. This allows cloud users to run their applications on the most appropriate VMs and pay for the actual resources that are used. Specifically, IaaS clouds are suitable for deploying high performance computing [32, 100], scientific workflows [121], and social network applications [25]. Furthermore, VM management algorithms should guarantee adequate performance by avoiding overloads of any resources under VM placement and VM consolidation.
In more detail, the following research questions (RQs) related to the previous research problems are investigated and answered in this dissertation:
• RQ1: How to consider multiple types of resources and balance the load among them? VM management for energy–efficient IaaS cloud data centers should take into account multiple resource types simultaneously, since CPU is not the only critical resource in cloud data centers. In fact, memory and network bandwidth may also become a bottleneck, possibly causing violations of the SLA. Furthermore, multiple applications of various types may have different demands in terms of resources. For instance, a request for compute–intensive applications (e.g., weather forecast and big data analytics) needs more CPU or memory resources and a request for database and memory caching applications (e.g., web hosting and online banking) needs more I/O resources. Therefore, VM management algorithms should consider multiple types of resources and spread the load among them to help resolve one of the most compelling issues in cloud data centers: underutilization of physical servers due to an unbalanced use of resources across multiple dimensions.
• RQ2: When to migrate virtual machines? The VM consolidation problem consists of two basic phases: (1) determining when a server is overloaded, then migrating the potential VMs from such a server to maintain a certain QoS; and (2) determining when a server is underloaded, then migrating all VMs from such a server to minimize energy consumption. In a large IaaS cloud data center with millions of physical machines, the variability of already–placed VM workloads over time makes it challenging to decide whether a host is overloaded or underutilized. Consequently, more efficient overload and underload management schemes are needed to correctly make decisions on VM migration. In other words, hot and cold spots should be reliably determined across multiple resources to limit the frequency of VM migrations.
• RQ3: Which and how many virtual machines should be selected for migration? When a server is overloaded, it is challenging to determine which and how many VMs should be selected for migration to suitable hosts. As migration is expensive, VM selection plays an important part in limiting the number of VM migrations. The problem consists of selecting one or more potential VMs for migration to reduce the resource load of the considered servers.
• RQ4: Where to place a virtual machine just created or under migration? The target physical server should be correctly selected in large cloud data centers to allocate newly–created VMs or those under migration. As a result of VM allocation requests or migrations, the target servers may become overloaded during periods of high resource utilization. Consequently, an overloaded host management scheme is required to detect overload situations. Such a scheme also supports VM placement algorithms in deciding which physical machines are the most suitable to accommodate VMs being submitted or migrated. The problem here consists of selecting a target server based not only on the least increase in power consumption but also on its utilization stability. In other words, a selected server should not become overloaded in the period following the placement of VMs.
• RQ5: When and how many servers should be switched to a low–power state and vice versa? Switching the power state of a host from idle to off or from inactive to on consumes additional energy. In a dynamic environment such as cloud data centers where the resource needs of VMs vary over time, power state switches are essential for reducing energy consumption. However, it is even more important to limit the switching frequency.
• RQ6: How to design a distributed–based approach for big data applications? To provide a scalable and efficient approach to processing large amounts of data, it is necessary to study the performance and the energy efficiency of cloud computing jobs in terms of computation time.
To answer the research questions mentioned above, this dissertation develops a set of algorithms for VM placement (Publication I) and VM consolidation (Publication II, Publication III, and Publication IV). In addition, this dissertation implements a scientific application to perform computationally–heavy tasks (Publication V). The research questions are answered in these publications as shown in Table 1.1.
Publications | I | II | III | IV | V
Research problem | Energy Efficiency and Scalability (I to IV) | Performance (V)
RQ1: How to consider multiple types of resources and balance the load among them? | √ | √ | – | √ | –
RQ2: When to migrate virtual machines? | – | √ | √ | √ | –
RQ3: Which and how many virtual machines should be selected for migration? | – | √ | √ | √ | –
RQ4: Where to place a virtual machine just created or under migration? | √ | √ | √ | √ | –
RQ5: When and how many servers should be switched to a low–power state and vice versa? | – | – | √ | √ | –
RQ6: How to design a distributed–based approach for big data applications? | – | – | – | – | √
Table 1.1. How the publications address the research questions.
1.2 Methodology
This doctoral dissertation tackles the previously–described research questions by employing the following methods and tools.
• Infrastructure–as–a–Service (IaaS) allows customers (e.g., cloud users) to lease and manage virtual resources (e.g., CPU, memory, storage, and network bandwidth) over the Internet in the form of VM instances [13, 151]. Some well–known public IaaS clouds include Amazon EC2 [3], Google Compute Engine [47], and Rackspace [116]. Moreover, a number of open–source IaaS cloud management systems have been developed including CloudStack [40], Eucalyptus [57, 105], Nimbus [106], OpenNebula [107], and OpenStack [113]. This doctoral dissertation mainly focuses on the IaaS model.
• Virtualization is widely deployed in large–scale data centers and has become the foundation of cloud computing due to its ability to isolate co–located application workloads and its efficiency for resource multiplexing [146]. This technology allows IaaS infrastructures to create several VMs on a physical server; therefore, it reduces the amount of hardware in use and improves the utilization of resources. The virtualization layer for hypervisor–based virtualization is placed between the hardware and the operating system (OS). It is implemented by a Virtual Machine Monitor (VMM) [122], which controls resource multiplexing and manages the allocation of physical resources (e.g., CPU, memory, storage, and I/O devices) to the VMs. There are two major types of implementation of hypervisor–based virtualization [140]: full virtualization (e.g., VMware Workstation [141], VirtualBox [108], Kernel–based Virtual Machine (KVM) [66], and Microsoft Hyper–V [93]) and paravirtualization (e.g., Xen Paravirtualization [6]). This doctoral dissertation focuses on hypervisor–based virtualization; therefore, the term virtualization throughout the dissertation refers to this category.
• Virtual machine placement is the process of selecting the appropriate physical server in large cloud data centers to accommodate newly–created or migrated VMs. The goal of VM placement is to assign the VMs to servers in such a way that the number of used servers is minimized.
• Virtual machine consolidation is enabled by virtualization technologies [55, 71]. This allows cloud providers to create multiple VM instances on a single physical server, thus improving resource utilization and creating idle servers. The reduction in energy consumption is achieved by switching such idle hosts to low–power states (e.g., suspend, sleep, hibernation, or shutdown) during periods of low utilization, thus eliminating idle power consumption.
• Live virtual machine migration is a method to transfer VMs between physical hosts over a local or wide area network without shutting these VMs down [120, 31, 148]. VM migration may be performed automatically by a cloud management system. For instance, multiple VMs can be consolidated onto fewer servers for energy saving purposes. However, this process may affect the performance of applications running on a VM, the source and destination hosts, other VMs, and the network during a migration [130]. VM live migration can be categorized into three approaches: pre–copy, post–copy, and a combination of both. This doctoral dissertation primarily focuses on the pre–copy approach to live VM migration, which is the chosen method for performing live migration in the Xen hypervisor [27].
• Multiple usage prediction utilizes machine learning techniques (e.g., neural networks and linear regression) to predict future resource utilization in the cloud with respect to time [62]. However, training neural network models takes significant time, which depends on the size of the input as well as the frequency of predictions. Therefore, it is important to determine effective learning algorithms for consolidating VMs in dynamic environments, such as cloud data centers with millions of heterogeneous machines and heterogeneous resource types. This dissertation utilizes the Multiple Linear Regression technique [143] to forecast the future resource utilization in terms of multiple resource types (i.e., CPU, memory, storage, and network bandwidth) based on historical data.
• CloudSim is the simulation tool used to evaluate the effectiveness of the proposed schemes in a practical cloud scenario [22]. This dissertation extends CloudSim to handle both multi–resource types and energy–awareness.
• Virtual machine utilization follows workload traces from both real–world public workloads and synthetic workloads with different available resource types, including PlanetLab VMs (CPU and memory) [111], Google Cluster Data (CPU and memory) [137], the Real Parallel Workloads from Production Systems (CPU and memory) [119], Amazon EC2 (CPU, memory, and storage), and the uniform and normal distributions (CPU, memory, storage, and network bandwidth) [12, 82]. Specifically, the usage of each type of resource in the real–world public datasets is collected every five minutes based on an empirical evaluation of the considered workloads (additional considerations on this aspect are provided in Chapter 5). According to the resource types available in the considered workloads, this doctoral dissertation sets the resource dimension to D = 2, D = 3, and D = 4, then adopts |D|–dimensional VM placement and VM consolidation in the simulations.
• Hadoop MapReduce is a software framework which allows cloud users to solve computationally–intensive problems. In particular, MapReduce is a simple programming model to run distributed computation on very large data sets using large clusters of commodity machines [30]. Its open–source implementation Hadoop [39] includes two main components: the Hadoop Distributed File System (HDFS) and the MapReduce distributed computing paradigm.
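The multiple linear regression forecasting listed among the methods above can be illustrated with a minimal sketch. It is not the implementation used in the publications: the choice of regressors (the current CPU and memory utilization of a host), the function names `fit_mlr` and `predict`, and the sample values are all assumptions made for illustration. The coefficients are obtained from the normal equations with a small Gaussian elimination routine, so that only the Python standard library is needed.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: regressors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_mlr(rows, targets):
    """Least-squares fit of target = b0 + b1*x1 + ... via the normal equations."""
    xs = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * y for x, y in zip(xs, targets)) for i in range(p)]
    return solve(xtx, xty)


def predict(beta, row):
    """Evaluate the fitted linear model on one observation."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


# Hypothetical history: (CPU, memory) utilization now -> CPU utilization at the next step.
history = [(0.2, 0.5), (0.4, 0.3), (0.6, 0.7), (0.8, 0.4), (0.5, 0.9)]
next_cpu = [0.30, 0.36, 0.54, 0.58, 0.53]
beta = fit_mlr(history, next_cpu)
print("forecast for (0.5, 0.5):", predict(beta, (0.5, 0.5)))
```

A multi–step forecast could be obtained by feeding each prediction back as an input, although whether the publications proceed this way is not stated in this chapter.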
1.3 Contributions
This dissertation summarizes five peer–reviewed publications whose contribution is briefly described below.
Publication I proposes a multi–resource VM placement algorithm for maximizing the utilization and balancing the load across different types of resources in cloud data centers. The research goal is to allocate a set of VMs to physical machines such that the number of servers required to accommodate the VMs is minimized. This is achieved by implementing a multi–resource VM placement algorithm. However, most of the existing solutions only consider a limited number of resource types (in many cases only the CPU), thus resulting in an unbalanced load or in the unnecessary activation of physical servers. To address this limitation, Publication I investigates the use of multiple resource–constraint metrics that help find the most suitable server for deploying VMs in large cloud data centers. Accordingly, a multi–resource VM placement algorithm called Max–BRU is proposed for spreading the load across multiple dimensions. Max–BRU is especially attractive for the VM placement problem due to its polynomial worst–case time complexity in the number of VM deployment requests. The Max–BRU algorithm is evaluated through simulation experiments and is compared with eight other state–of–the–art approaches: Greedy First–Fit [57, 65, 68, 82, 84], Load–aware [28, 107], VectorDot [125], Market Mechanism [147], the Min–Min and Max–Min heuristics [43, 53, 127], the algorithm proposed in [26], and an extension of the volume metric introduced in [144]. Simulation results demonstrate that Max–BRU obtains a more balanced use of resources than the state of the art. As a consequence, it makes a more efficient use of multiple resources, thus reducing the number of required active physical servers in cloud data centers.
Publication II proposes a multi–resource selection (MRS) scheme for consolidating VMs in cloud data centers. Additionally, an efficient VM consolidation algorithm called BMRU is proposed for balancing the usage of resources across multiple dimensions. The VM consolidation algorithm is important to enable continuous consolidation of already–placed VMs on the least number of physical servers. Therefore, the proposed BMRU algorithm integrates with MRS and uses the current utilization of multiple types of resources to characterize and classify a physical server. This is particularly important to avoid an unbalanced load over multiple types of resources and further increase the resource utilization of a data center. To save energy, the BMRU algorithm detects idle physical servers, transitions them into a power–saving state (e.g., suspend), and reactivates them once required. Before this can be achieved, underloaded host detection and VM migration are performed along with VM consolidation. Both mechanisms aim at placing VMs on the least number of servers and releasing lightly–utilized machines. To evaluate BMRU, a custom simulator has been written in Java; a 4–dimensional VM consolidation scheme has then been evaluated under synthetic workloads (considering CPU, memory, storage, and network bandwidth) and a 2–dimensional VM consolidation scheme under the real–world Google Cluster Data [137] and RIKEN Integrated Cluster of Clusters [119] workloads (considering CPU and memory). The experimental results show that the proposed approach obtains a more balanced use of multiple resources, thus increasing the energy efficiency of a data center by minimizing the number of active physical servers.
Publication III proposes a VM consolidation with usage prediction (VMCUP) algorithm for a more efficient detection of overloaded and underloaded servers to avoid unnecessary VM migrations and server switches. The key idea of the proposed approach is to predict the short–term usage of a single computing resource (i.e., the CPU utilization in a five–minute horizon) and use the current and predicted usage metrics for a reliable characterization of overloaded and underloaded servers. In addition, the proposed solution is aligned with green computing strategies, in which physical machines may be powered off to save energy. The proposed algorithm is evaluated in a practical cloud scenario by extending the CloudSim simulation toolkit [22]. VMCUP is compared to four well–known overutilized host detection approaches in [12]: static threshold (THR), the median absolute deviation (MAD), the interquartile range (IQR), and a dynamic threshold based on local regression (LR). Extensive experiments performed on the Google Cluster Data [137] and the PlanetLab VMs [111] workload traces show that VM consolidation embedding usage prediction reduces the energy consumption while limiting both the number of migrations and the power state changes. As a consequence, it improves the performance of a data center with a better compliance with the service level agreement (SLA).
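The joint use of current and predicted usage described for VMCUP can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm of Publication III: the decision rule (a host is flagged only when both the current and the predicted CPU utilization cross a threshold), the least–squares trend predictor, and the threshold values `hot=0.8` and `cold=0.2` are all hypothetical.

```python
def linear_trend_forecast(history):
    """Fit a least-squares line through the samples and evaluate it one step ahead.

    Stand-in predictor for illustration only; the result is clamped to [0, 1].
    """
    n = len(history)
    if n < 2:
        return history[-1]
    mean_t = (n - 1) / 2
    mean_u = sum(history) / n
    cov = sum((t - mean_t) * (u - mean_u) for t, u in enumerate(history))
    var = sum((t - mean_t) ** 2 for t in range(n))
    predicted = mean_u + (cov / var) * (n - mean_t)
    return min(1.0, max(0.0, predicted))


def classify_host(history, hot=0.8, cold=0.2):
    """Classify a host from its recent CPU utilization samples (fractions of capacity).

    Assumed rule: both the latest sample and the forecast must agree before the
    host is treated as overloaded or underloaded.
    """
    current = history[-1]
    predicted = linear_trend_forecast(history)
    if current > hot and predicted > hot:
        return "overloaded"
    if current < cold and predicted < cold:
        return "underloaded"
    return "normal"


# A host above the hot threshold whose load is falling is left alone,
# which avoids a migration that a purely threshold-based rule would trigger.
print(classify_host([0.95, 0.90, 0.85, 0.82]))
print(classify_host([0.50, 0.60, 0.70, 0.85]))
```

Requiring agreement between the two signals is one simple way to filter out transient spikes; other combinations are possible.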
Publication IV extends the VMCUP approach presented in Publication III to predict the long–term utilization over multiple types of resources (i.e., CPU, memory, disk, and network bandwidth). This publication makes the following contributions: (1) it adapts the previously proposed usage prediction (UP) in Publication III to enable multiple usage prediction (MUP), in terms of both resource types and the horizon employed to predict future utilization; (2) it introduces a VM selection policy, namely the minimum resource temperature (MRT), that migrates a VM based on the joint impact of multiple resources; (3) it embeds the MUP scheme into the power–aware best fit decreasing (PABFD) solution through a new VM placement algorithm called PABFD–MUP so as to select a target physical server based not only on the least increase in power consumption but also on its utilization stability, predicted by using the MUP scheme; and (4) it proposes an algorithm for VM consolidation with multiple usage prediction (VMCUP–M) for energy–efficient IaaS cloud data centers. The key idea of the proposed algorithms is to apply both multiple–resource and multiple–step prediction in underloaded and overloaded host management, VM selection, and VM placement under migration. Specifically, the joint use of current and predicted resource utilization over multiple dimensions allows for a reliable characterization of overloaded and underloaded servers, thereby reducing both the load and the power consumption after consolidation. In addition, MUP helps limit the number of active servers, the number of VM migrations, and the number of power state changes, thus achieving a better compliance with the SLA. VMCUP–M has a polynomial time complexity in the number of VMs to be allocated in the data center. The CloudSim simulation toolkit [22] is extended to handle multiple types of resources and energy–aware simulations. VMCUP–M is then implemented on top of such an extended version of CloudSim and is compared with the following existing approaches in [12]: (1) overutilized host detection: static and dynamic hot thresholds; (2) VM selection: minimum migration time (MMT), maximum correlation (MC), minimum utilization (MU), and random selection (RS); and (3) VM placement: power–aware best fit decreasing. Furthermore, VMCUP–M is compared to the multiple–resource black–box and gray–box (BG) scheme introduced in [144] through our own implementation based on extending the volume metric across two resources, i.e., CPU and memory. Extensive experiments performed on the 1,600 VMs from Google Cluster Data [137] and the 1,473 VMs from PlanetLab [111] workload traces show that VM consolidation embedding multiple usage prediction reduces the energy consumption while limiting the number of active physical servers, the number of migrations, and the number of power state changes, thus increasing the performance of a cloud data center with a better compliance with the SLA.
Publication V proposes Distributed Semantic Analysis (DSA), a big data system that integrates distributed computing with semantic analysis, thereby allowing cloud users to efficiently process large amounts of data (e.g., huge amounts of Wikipedia articles). This publication targets application–aware approaches that consider high–level application requirements in terms of QoS (e.g., response time) while performing computationally–heavy tasks. In particular, Hadoop MapReduce is used to build an environment that can easily be scaled through virtualization to split the source data and process them in parallel [30, 39]. Extensive experiments are carried out on a Hadoop MapReduce cluster to determine the execution time for analyzing Wikipedia data and then to evaluate the performance of the proposed DSA system. Accordingly, DSA can significantly improve the energy efficiency of a real IaaS cloud by reducing the computation time to analyze big data.
1.4 Thesis Organization
The rest of this dissertation is organized as follows. Chapter 2 discusses efficient VM placement, while Chapter 3 focuses on VM consolidation for saving energy in cloud data centers. Chapter 4 presents an application of distributed computing to big data analytics. Chapter 5 concludes the dissertation with a summary of the main contributions and a discussion of future research directions. Finally, the original papers are provided at the end of the dissertation.
2. Efficient Virtual Machine Placement
This doctoral dissertation is about efficient Infrastructure–as–a–Service (IaaS) cloud data centers, with a focus on energy efficiency. To provide a context for the reader concerning the motivations of the dissertation, this chapter first outlines the key components for energy–efficient cloud data centers. The state of the art on the algorithms for energy–efficient virtual machine (VM) placement is then reviewed. Such a review shows that existing VM placement algorithms mostly consider a limited number of resource dimensions (e.g., a single CPU resource) and have several limitations. These include neglecting multiple types of resources as well as the heterogeneity of physical servers and VMs. To address these limitations, the main contributions in this chapter are to implement and evaluate a VM placement solution which: (1) considers multiple types of resources such as CPU, memory, storage, and network bandwidth; and (2) balances the load across different types of resources while placing VMs.
This chapter presents research done in Publication I. In particular, the Max–BRU VM placement algorithm for maximized balanced resource utilization over multiple dimensions is summarized. The Max–BRU algorithm is evaluated through a custom simulator written in Java. According to the experimental results from both synthetic and real–world workloads, the proposed algorithm significantly reduces the number of active servers, thus minimizing the power consumption of IaaS cloud data centers while, at the same time, balancing the usage of multiple types of resources.
2.1 Case Study: Energy Efficiency of Data Centers
The largest energy consumption within a typical data center is represented by the servers, which account for about 60% of the data center’s overall consumption [54]. This means that the energy efficiency of the servers is the key component for an energy–efficient data center. Unfortunately, such servers are heterogeneous and mostly operate at low resource utilization levels [7, 8, 44]. To show the heterogeneity of the servers, I first analyze the CPU and memory capacities from the Google Cluster Data dataset [137], consisting of 12,583 machines. The statistics of the servers are shown in Table 2.1, which provides the exact resource capacity of the machines across different types of resources, thereby giving an indication of the heterogeneity of servers in a data center. The data show that physical machines have diverse CPU and memory capacities; specifically, 30.708% of the servers belong to architecture 2 and 7.971% of the servers to architecture 3.
Architecture | Number of servers | CPU | Memory | Server population (%)
1 | 6728 | 0.50 | 0.50 | 53.469
2 | 3864 | 0.50 | 0.25 | 30.708
3 | 1003 | 0.50 | 0.75 | 7.971
4 | 795 | 1.00 | 1.00 | 6.318
5 | 126 | 0.25 | 0.25 | 1.001
6 | 54 | 0.50 | 0.13 | 0.429
7 | 5 | 0.50 | 0.03 | 0.040
8 | 4 | 0.50 | 0.97 | 0.032
9 | 3 | 1.00 | 0.50 | 0.024
10 | 1 | 0.50 | 0.06 | 0.008
Table 2.1. Statistics of machines in the Google Cluster Data.
Garraghan et al. [44] presented the average CPU and memory utilization for the top four architectures in Table 2.1 (98.466% of the server population). The results confirm that the average CPU and memory utilization over all the servers are below 48% and 50%, respectively. A study by IBM researchers in 2012 [15] reported that the utilization of CPU, memory, and disk was 18%, 78%, and 75%, respectively, over several thousands of servers and a two–year period. Such results confirm that, while the CPU is seemingly underutilized, memory and disk may already operate at high resource utilization levels. In this scenario, it is challenging to improve the utilization of the CPU alone without considering other resource types.
Underutilization of the servers in IaaS cloud data centers is a primary source of inefficiency. Accordingly, the authors in [8] showed that Google servers operate most of the time at between 10% and 50% of their maximum utilization levels. At such utilization levels, a server consumes almost half of the energy it consumes at full utilization. Furthermore, a server still consumes up to 60% of its peak power when it is completely idle [7, 34, 77]. Figure 2.1a illustrates this through the power consumption of four different server platforms from the Standard Performance Evaluation Corporation (SPEC) [118] against their utilization levels (0% means the server is completely idle while 100% means the server is fully utilized). The results indeed demonstrate that the considered servers consumed up to 60% of peak power at 0% utilization. Figure 2.1b shows the relative efficiency of the 18 servers (labeled A to R; Avg is the average) in the SPEC power benchmark (data from 2015) when running idle and at 30% utilization compared to the values obtained at a 100% utilization level. The results confirm that idle servers consume about 20% of their peak power on average and that servers operating at a low utilization (i.e., 30%) consume around 43% of the peak power on average. Overall, the study showed that much of the energy consumption of a data center is due to the underutilization of servers. Precisely, energy is wasted due to the low average resource utilization, typically ranging between 10% and 50%, where servers have their worst energy efficiency.
[Figure 2.1: (a) average active power (W) versus server utilization (%) for four server platforms: Tyan Computer Corporation B8228Y190X2-045V4H (Dec. 04, 2012), Hitachi, Ltd. HA8000/RS110 (DL2) (Mar. 06, 2013), Huawei Technologies Co., Ltd RH2288H V2 (May 07, 2014), and DEPO Computers DEPO Race X340H (Dec. 19, 2014); (b) average active power ratio (%) for 18 server platforms (A to R, plus the average) from SPEC in 2015, showing power usage at idle and at 30% utilization relative to power usage at 100%.]
Figure 2.1. Server power usage at varying utilization levels of server platforms from SPEC [118], i.e., (a) from idle to peak performance and (b) active power ratio at idle and at 30% utilization (relative to the 100% utilization).
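The averages reported for Figure 2.1b can be turned into a short back–of–the–envelope calculation. The sketch below rests on idealized assumptions that are not taken from the dissertation: useful work is proportional to utilization, a switched–off server draws no power, and consolidated servers can be filled to 100% utilization. The function names are hypothetical.

```python
import math


def relative_efficiency(utilization, power_ratio):
    """Work delivered per unit of power, relative to a fully utilized server."""
    return utilization / power_ratio


def consolidation_saving(n_servers, utilization, power_ratio):
    """Fraction of power saved by packing the same load onto fully utilized servers.

    Idealized: the remaining servers are switched off and draw nothing.
    """
    load = n_servers * utilization          # aggregate load in "full servers"
    needed = math.ceil(load - 1e-9)         # guard against floating-point noise
    before = n_servers * power_ratio        # power in units of one server's peak
    after = float(needed)                   # each packed server runs at peak power
    return 1.0 - after / before


# A server at 30% utilization drawing 43% of its peak power.
print(relative_efficiency(0.30, 0.43))
# Ten such servers consolidated onto three fully utilized ones.
print(consolidation_saving(10, 0.30, 0.43))
```

In practice servers are not driven to full utilization, since that would conflict with the goal of avoiding overloads, so the saving computed this way is an upper bound under the stated assumptions.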
The key aspect for energy–efficient IaaS cloud data centers is to improve the efficiency of the servers. This is possible by balancing the usage of resources across multiple dimensions, thus helping to increase server utilization. Furthermore, reducing the number of active physical servers significantly decreases the overall energy consumption. This is achieved by optimizing the assignment of VMs to physical machines in IaaS cloud data centers. Energy consumption is further reduced by dynamically switching servers to a low–power mode during idle times.
Accordingly, this dissertation specifically focuses on managing VMs for energy–efficient IaaS clouds through two main methods: VM placement and VM consolidation (the details of the latter can be found in Chapter 3). VM placement aims at balancing the load across different types of resources, thus limiting the number of active servers, while VM consolidation makes it possible to create idle servers that can be switched into a power–saving state. Both approaches try to improve the server utilization through more energy–efficient operating modes within a cloud data center.
In the following, the first major contribution of the dissertation with focus on VM placement for efficient cloud data centers (Publication I) is discussed after reviewing the related work on VM placement.
2.2 Virtual Machine Placement
Virtual machine placement is the process of selecting the appropriate physical servers in large cloud data centers to accommodate newly–created or migrated VMs. The goal of VM placement is to assign the VMs to physical servers in such a way as to allow efficient usage of resources [14, 26, 60, 86, 125]. Finding the optimal allocation of VMs to physical servers has long been known to be an NP–hard problem [53, 129]. As a consequence, many heuristic algorithms have been proposed to find a feasible solution within a reasonable time [19]. A well–known mapping algorithm is greedy First–Fit (FF), which is used in the Eucalyptus cloud management middleware [57, 68, 84]. The First–Fit algorithm allocates each VM request to the first physical server which satisfies the demands of the VM for all resources, with the servers sorted according to a predefined metric (e.g., available resources or power efficiency). An improved version of the FF algorithm is called First–Fit–Decreasing (FFD), which sorts the VMs in decreasing order according to their resource demands. FFD uses no more than 11/9 · OPT + 1 bins, where OPT is the number of bins provided by the optimal solution [96]. Other greedy algorithms have also been devised to solve the problem of VM placement, including Best–Fit (BF), Worst–Fit (WF), and Next–Fit (NF). Besides, OpenNebula [28, 107] uses a load–aware policy that first selects the physical server with the least used CPU for allocating VM requests. Other heuristic–based resource allocation algorithms are Min–Min and Max–Min, which have been largely adopted in the literature [43, 53, 127] to assign computation resources in such a way as to guarantee a target processing time. As a consequence, the available CPU capacity becomes the main allocation criterion affecting the task completion time. The major limitation of these algorithms is that they focus on meeting only one objective, namely utility of resources (greedy and load–aware), or on managing a one–dimensional resource (Min–Min and Max–Min).
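The First–Fit and First–Fit–Decreasing heuristics described above can be sketched in a few lines. This is a minimal illustration rather than the Eucalyptus implementation: it assumes identical servers, demands expressed as fractions of server capacity in each dimension, and, for FFD, ordering VMs by the sum of their demands, which is only one of several possible ways to rank multi–dimensional requests.

```python
def fits(demand, free):
    """True if the demand fits in the remaining capacity in every dimension."""
    return all(d <= f + 1e-12 for d, f in zip(demand, free))


def first_fit(vms, capacity):
    """Place each VM on the first server with enough room; open a new one otherwise.

    Returns the server index chosen for each VM, in input order.
    """
    servers = []      # remaining capacity of every active server
    placement = []
    for demand in vms:
        if not fits(demand, capacity):
            raise ValueError("VM larger than an empty server")
        for idx, free in enumerate(servers):
            if fits(demand, free):
                break
        else:
            servers.append(list(capacity))
            idx = len(servers) - 1
        servers[idx] = [f - d for f, d in zip(servers[idx], demand)]
        placement.append(idx)
    return placement


def first_fit_decreasing(vms, capacity):
    """First-Fit applied after sorting VMs by decreasing total demand (assumed key)."""
    order = sorted(range(len(vms)), key=lambda i: sum(vms[i]), reverse=True)
    placed = first_fit([vms[i] for i in order], capacity)
    result = [0] * len(vms)
    for position, original_index in enumerate(order):
        result[original_index] = placed[position]
    return result


# Hypothetical (CPU, memory) demands as fractions of one server.
vms = [(0.4, 0.2), (0.4, 0.2), (0.6, 0.3), (0.6, 0.3)]
capacity = (1.0, 1.0)
print("FF servers used: ", max(first_fit(vms, capacity)) + 1)
print("FFD servers used:", max(first_fit_decreasing(vms, capacity)) + 1)
```

In this small example, sorting the requests first lets the large VMs be paired with the small ones, so FFD activates fewer servers than plain FF.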
In [80, 81], computing capacity was used as the main allocation criterion for VM creation and load minimization. Different algorithms, i.e., Greedy Worst–Fit, Sequential, Knapsack, and Time–bound Knapsack, were then presented. The experimental results using synthetic benchmarks show that Time–bound Knapsack achieves an average load of 72%, whereas the three other algorithms all result in loads above 80%. In [84], two algorithms called Dynamic Round Robin (DRR) and a combined DRR with First–Fit (Hybrid) were proposed, then compared with both the Round Robin and Power Save scheduling strategies in Eucalyptus. Simulation results showed that DRR and Hybrid decrease the power requirements by 56.5% and 55.9% compared with Round Robin, respectively. DRR and Hybrid also result in 3% less power consumption on average, compared with Power Save. A major limitation of these approaches is that, in practice, different VM requests may have different demands in terms of resource types. For instance, a request for general–purpose applications includes a balanced amount of CPU, memory, storage and network resources; a request for compute–intensive applications needs more CPU resources; and a request for database and memory caching applications needs more memory resources. Therefore, solutions that consider only one resource as the main allocation criterion may be efficient for some resource dimensions while underutilizing others, thus resulting in higher actual costs.
Most of the existing work on VM placement in data centers fails to take advantage of the Min, Max, and Share parameters [23]. Particularly, the Min parameter ensures that a VM receives the minimum amount of resources when powered on, while the Max parameter sets the maximum amount of resources for a VM running a low–priority application. On the other hand, the Share parameter advises the Virtual Machine Monitor (VMM) on how to distribute resources among contending VMs. These parameters allow VMs to run heterogeneous applications, wherein the amount of resources allocated to a VM may be adjusted based on available resources, power costs, and application utilities. The same work [23] also introduced a suite of algorithmic techniques, called GreedyMax, GreedyMinMax, and ExpandMinMax, to address the Min–, Max–, and Share–aware placement problem. The authors then proposed a VM placement algorithm called PowerExpandMinMax. Simulation results on a range of large synthetic data center setups and a small real data center testbed have shown that leveraging such parameters may improve the overall utility of the considered data centers by 47% or more. The limitation of this work is that it considered CPU utilization only and was limited to homogeneous servers.
Some works in the literature actually considered multiple resources for VM placement [26, 52, 75, 97, 125]. For instance, a vector–based scheme was employed by the HARMONY system for load balancing [125]. In such solutions, resources are normalized along each dimension and represented as vectors, e.g., the resource requirement vector of VMs (RRV), the total resource capacity vector of servers (TCV), and the utilized capacity vector of servers (UCV). The VectorDot metric, for instance, was defined as the dot product of RRV and UCV and was proposed as a basis for choosing the target physical server for placing a VM. Accordingly, given the RRV of an accepted VM, the placement scheme selects the physical server whose UCV gives the lowest dot product. Evaluation results on simulated data center environments of various sizes and a real data center testbed have shown that VectorDot in combination with Best–Fit (BFVD), First–Fit (FFVD), Worst–Fit (WFVD), and RelaxedBestFit (RBFVD) achieves a more balanced resource usage compared to traditional greedy algorithms. For instance, the average system load obtained by using VectorDot is quite high (i.e., 70%). A different metric based on the vector–based approach, i.e., the cosine of the angle between UCV and TCV, was also introduced in [26] to rank physical servers. In detail, the authors proposed a scheme for balancing the utilization of multiple types of resources by minimizing the angle between UCV and TCV among all physical servers. Simulation results using randomly–generated data have shown that the cosine of the angle makes the VM deployments much more balanced in multiple resource utilization among the servers compared to the method used in Sandpiper [144] (see also the discussion below).
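The two vector–based metrics can be sketched as follows. The code is a simplified illustration rather than the implementation of [125] or [26]: vectors are assumed to be already normalized to fractions of capacity, capacity (feasibility) checks are omitted, and the function names and sample values are hypothetical.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def vector_dot_choice(rrv: Sequence[float],
                      ucvs: List[Sequence[float]]) -> int:
    """Index of the server whose UCV yields the lowest dot product with RRV."""
    return min(range(len(ucvs)), key=lambda i: dot(rrv, ucvs[i]))


def cosine(ucv: Sequence[float], tcv: Sequence[float]) -> float:
    """Cosine of the angle between UCV and TCV (1.0 means perfectly aligned)."""
    norm = math.sqrt(dot(ucv, ucv)) * math.sqrt(dot(tcv, tcv))
    return dot(ucv, tcv) / norm if norm else 0.0


# Dimensions: (CPU, memory), as fractions of server capacity.
rrv = [0.2, 0.1]                      # VM that mostly needs CPU
ucvs = [[0.8, 0.2], [0.3, 0.6]]       # server 0 is CPU-loaded, server 1 is not
target = vector_dot_choice(rrv, ucvs)

tcv = [1.0, 1.0]
balanced = cosine([0.5, 0.5], tcv)    # usage proportional to capacity
skewed = cosine([0.8, 0.2], tcv)      # CPU used far more than memory
```

The CPU–oriented request is steered to server 1, whose CPU is lightly used, and the cosine is larger (i.e., the angle is smaller) for the server whose usage is proportional to its capacity.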
A market–based solution for multiple–resource load balancing was presented in [147] in the context of job scheduling. For such a scenario, a Market Mechanism (MM) scheme was introduced to balance multiple resources among servers with heterogeneous capacities. The authors proposed a selection policy built on top of a pricing model to calculate the cost per resource of a job. The policy chooses the server whose utilization across different types of resources gives the lowest cost value. Simulation results using randomly–generated data with uniform distribution have shown that MM performs better than Latest Arrival Job (LAJ), Backfill Lowest (BL), and Backfill Balance (BB) in terms of load balancing. Furthermore, MM maintains a high server utilization (i.e., 80–85%). The work in [147] targets both homogeneous and heterogeneous physical servers and considers CPU, memory, and network bandwidth.
Sandpiper [144] combined three dimensions into a single volume metric, defined as the product of a server's CPU, memory, and network utilization. The higher the utilization of multiple resources, the higher the value of the volume metric. The same work [144] also introduced a black–box and gray–box (BG) strategy that adopts the volume metric to select target physical servers. In particular, servers are sorted by ascending volume and then considered for allocation of VMs under migration. The implementation of Sandpiper based on Xen has demonstrated that it may respond to network, CPU, or memory hotspots and may collocate VMs that stress different resources to improve overall system utilization (i.e., the aggregate CPU and network utilizations on both servers fall below 50%). Different VM placement algorithms have then been implemented to improve the resource utilization of each dimension by taking the volume metric as the main optimization objective [97, 128, 135]. However, simple extensions of well–known VM placement solutions may cause low resource utilization and unbalance the load, thus requiring more active servers than are really needed and increasing operational costs.
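A minimal sketch of volume–based target selection is given below. It takes the product–of–utilizations description above at face value (other formulations of the volume are possible), and the server names and utilization values are hypothetical.

```python
from typing import Dict, List

Utilization = Dict[str, float]


def volume(util: Utilization) -> float:
    """Volume as the product of CPU, memory, and network utilization
    (each expressed as a fraction in [0, 1])."""
    return util["cpu"] * util["mem"] * util["net"]


def rank_targets(servers: Dict[str, Utilization]) -> List[str]:
    """Candidate target servers sorted by ascending volume."""
    return sorted(servers, key=lambda name: volume(servers[name]))


servers = {
    "s1": {"cpu": 0.9, "mem": 0.8, "net": 0.7},   # heavily loaded
    "s2": {"cpu": 0.2, "mem": 0.3, "net": 0.1},   # lightly loaded
    "s3": {"cpu": 0.5, "mem": 0.5, "net": 0.5},
}
candidates = rank_targets(servers)
```

The least loaded server comes first in the ranking and would be considered first as a migration target.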
2.3 System Model and Considered Metrics
The target system is an IaaS environment, i.e., a large–scale cloud data center consisting of $M$ heterogeneous physical servers, denoted as $P = \{p_1, p_2, \ldots, p_M\}$. Each server is characterized as $p = \langle p_i, vm, \hat{r}_d \rangle$ with multiple types of resources, $d \in \{1, \ldots, D\}$, $D \in \mathbb{N}$, where: $p_i$ is the unique identifier of a server; $vm$ is a set of VM instances that are allocated in $p$; and $\hat{r}_d = \{\hat{r}_1, \hat{r}_2, \ldots, \hat{r}_D\}$ describes the type and amount of the $d$–th resource consumed, where each dimension corresponds to a given type of physical resource (i.e., CPU, memory, storage, and network bandwidth).
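The server tuple introduced above may be represented, for illustration, by a simple record type. The sketch below is one possible encoding under stated assumptions: the field names, the choice of four resource types (D = 4), and the `allocate` helper are hypothetical and not part of the formal model.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumed resource dimensions, d = 1..D with D = 4.
RESOURCE_TYPES = ("cpu", "memory", "storage", "network")


@dataclass
class Server:
    server_id: str                                   # p_i: unique identifier
    vms: List[str] = field(default_factory=list)     # vm: VMs allocated in p
    used: Dict[str, float] = field(                  # r_hat: consumed amount
        default_factory=lambda: {r: 0.0 for r in RESOURCE_TYPES})

    def allocate(self, vm_id: str, demand: Dict[str, float]) -> None:
        """Record a VM on this server and add its demand to the usage vector."""
        self.vms.append(vm_id)
        for resource, amount in demand.items():
            self.used[resource] += amount


server = Server("p1")
server.allocate("vm1", {"cpu": 2.0, "memory": 4.0})
```

After the call, the usage vector of the server reflects the CPU and memory consumed by the VM, while the other dimensions remain at zero.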