
Profiling-based Workload Consolidation and Migration in Virtualized Data Centres

Kejiang Ye, Zhaohui Wu, Chen Wang, Bing Bing Zhou, Weisheng Si, Xiaohong Jiang, and Albert Y. Zomaya, Fellow, IEEE

Abstract—Improving the energy efficiency of data centers has become increasingly important due to the significant amount of power needed to operate them. An important method for achieving energy efficiency is server consolidation supported by virtualization. However, server consolidation may significantly degrade workload performance due to virtual machine (VM) co-location and migration. Reducing this performance degradation is therefore a critical issue. In this paper, we propose a profiling-based server consolidation framework which minimizes the number of physical machines (PMs) used in data centers while maintaining satisfactory performance of various workloads. Inside this framework, we first profile the performance losses of various workloads under two situations: running in co-location and experiencing migrations. We then design two modules: (1) a consolidation planning module which, given a set of workloads, minimizes the number of PMs by an integer programming model, and (2) a migration planning module which, given a source VM placement scenario and a target VM placement scenario, minimizes the number of VM migrations by a polynomial time algorithm. Based on the workload performance profiles, both modules can keep the performance losses of various workloads below configurable thresholds. Our workload profiling experiments are conducted with real data center workloads, and our experiments on the two modules validate the integer programming model and the polynomial time algorithm.

Index Terms—Virtual machine, server consolidation, live migration, cloud computing, energy efficiency.


1 INTRODUCTION

The energy efficiency of data centers, the essential infrastructure for clouds, has increasingly become a problem. In 2012, the New York Times conducted a survey of energy waste in data centers and found that only 6% to 12% of the electricity powering servers was used to perform computations while the remaining electricity was simply wasted [1]. The waste is mainly caused by the low utilization of physical servers [2]. This inefficiency not only leads to high energy bills for data centers, but also incurs high cooling and floor space costs and harms the environment. As the popularity of cloud computing grows, it is increasingly important to improve the energy efficiency of cloud data centers.

• Kejiang Ye, Zhaohui Wu* and Xiaohong Jiang are with the College of Computer Science, Zhejiang University, Hangzhou 310027, China. E-mail: {yekejiang, wzh, jiangxh}@zju.edu.cn. *Zhaohui Wu is the corresponding author.

• Chen Wang is with CSIRO Computational Informatics, PO Box 76, Epping, NSW 1710, Australia. E-mail: [email protected].

• Bing Bing Zhou and Albert Y. Zomaya are with the Centre for Distributed and High Performance Computing, School of Information Technologies, University of Sydney, NSW 2006, Australia. E-mail: {bing.zhou, albert.zomaya}@sydney.edu.au.

• Weisheng Si is with the School of Computing, Engineering, and Mathematics, University of Western Sydney, Penrith, NSW 2751, Australia. E-mail: [email protected].

In recent years, server consolidation supported by virtualization [3] has been the main approach to reduce the number of physical machines (PMs) used in data centers to improve energy efficiency. Virtual machine (VM) migration [4] provides the enabling technology for server consolidation. However, server consolidation and VM migration bring two major challenges:

1) Consolidation can incur considerable performance degradation of co-located VMs due to competition for shared resources such as caches and networks. The level of performance degradation varies when VMs running different workloads are co-located. It is challenging to minimize the overall degradation given a set of VMs running various types of workloads.

2) Migrating virtual machines with different workloads [4] may incur different levels of migration overhead, which also degrades the performance of workloads. Reducing the overall migration overhead during server consolidation is not a trivial task.

Several works have addressed these challenges [5–8]. However, they usually aim to improve resource utilization based on metrics such as CPU utilization and pay little attention to the performance loss caused by co-locating and migrating workloads. Specifically, these works have the following limitations:

1) The resource utilization metrics do not correlate closely with workload performance. The relationship between resource allocation and workload performance is complicated. Furthermore, most of the existing works only use CPU utilization to design the consolidation strategy, which fails to consider the characteristics of memory-intensive and I/O-intensive workloads.

2) In a virtualization environment, the virtual machine monitor (VMM) can be a factor that affects the performance of workloads running in VMs. Fig. 1 shows the consolidation of several VMs running Web servers. As the number of VMs consolidated onto a PM increases, the average Web server throughput decreases drastically even though the overall CPU utilization of the PM is low. In this case, resource contention in the VMM driver domain (Dom0) and on network bandwidth has a big impact on the performance of workloads.

3) Migrating VMs running different workloads incurs different levels of overhead, e.g., migrating a memory-intensive VM may incur higher migration overhead than migrating a CPU-intensive VM. To the best of our knowledge, most of the existing works do not consider this factor.

[Figure 1: average performance of each VM (Web server throughput, bytes/sec), average CPU utilization of each VM, and CPU utilization of Dom0, plotted against the number of consolidated Web Server VMs (1 to 8).]

Fig. 1. Motivation Example. (The PM is configured with 16 cores and 32GB DRAM; each VM is configured with 1 VCPU and 1GB DRAM; Dom0 is configured with 16 cores by default; the number of clients for each VM is 100.)

1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.

In this paper, we study the server consolidation problem with a focus on reducing the performance loss of workloads instead of simply increasing resource utilization. By conducting experiments in a testbed with several typical kinds of data center workloads, namely CPU-intensive, Disk-intensive, Memory-intensive and Network-intensive workloads, we establish profiles of the performance losses of these four kinds of workloads under two situations: (1) running in co-location with other workloads and (2) experiencing a migration. Then, based on these two types of profiles, we design two modules, the consolidation planning module and the migration planning module, which minimize the number of PMs and the number of VM migrations respectively while keeping the performance loss of each kind of workload below the threshold specified by users. In brief, the contributions of this paper can be summarized as follows:

1) We establish profiles of the performance losses for typical kinds of workloads in data centers. To do so, we conducted comprehensive experiments on every possible combination of different basic workloads running on a single PM and on the migration of each kind of workload. Note that this paper uses four basic kinds of workloads as an example. For more complex workload types, experiments can be done similarly. Since this is a one-off effort, it is affordable for data centers.

2) In the consolidation planning module, we formulate the problem of minimizing the number of PMs during server consolidation, while keeping the performance loss of each workload under a certain threshold, as an integer linear program, thus obtaining optimal results.

3) In the migration planning module, we pose a new problem: how to change a current workload placement on PMs into a new workload placement on PMs with the minimum number of VM migrations while keeping the performance loss of each workload under a certain threshold. We provide an optimal algorithm for this problem with complexity $O(n^3)$, where $n$ is the number of PMs used in the current workload placement.

4) By combining the consolidation planning and migration planning modules, we can handle both static and dynamic server consolidations in data centers (more details are given in Section 3).

The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 describes our profiling-based server consolidation framework. Section 4 details our experiments to obtain the performance profiles for four basic kinds of typical workloads in data centers. Section 5 and Section 6 present our consolidation planning module and migration planning module respectively. Section 7 evaluates these two modules and Section 8 concludes this paper.

2 RELATED WORK

There has been some previous work on server consolidation in data centers [5–13].

Mills et al. [11] developed an objective method to facilitate the comparison of different virtual machine placement algorithms in the cloud. Xu et al. [12] used the stable matching framework to decouple policies from mechanisms when mapping virtual machines to physical servers and presented a general resource management architecture called Anchor. Di et al. [13] formulated the resource allocation problem as a convex optimization problem and proposed a new Cloud architecture, namely self-organizing cloud (SOC), which can connect a large number of desktop computers on the Internet by a P2P network. Speitkamp et al. [5] studied the static consolidation problem with a mathematical programming approach. Srikantaiah et al. [9] modelled consolidation as a modified bin-packing problem. These works focus on the initial VM deployment or static consolidation problem based on resource utilization and do not consider VM migration overhead.


[…] placement framework called pMapper to minimize the power under specific requirements. Hermenier et al. [10] propose a consolidation manager called Entropy to perform dynamic consolidation. Ferreto et al. [7] give a migration control strategy for dynamic consolidation that avoids migrating virtual machines with steady capacity. Beloglazov et al. [6] use dynamic consolidation to solve the problem of overloaded hosts based on a Markov chain model. These works focus on improving resource utilization, particularly CPU utilization, and give little consideration to the performance of workloads.

Recently, several works have considered the impact of workload types on consolidation efficiency [15–17]. Gong et al. employ signal processing techniques to extract the consolidated workload pattern from utilization trace data in order to improve resource utilization [16]. Zhan et al. use the differences between heterogeneous workloads to reduce peak resource consumption [15]. Carrera et al. [17] use virtualization techniques to consolidate batch and transactional workloads together. Lee et al. [18, 19] and Liu et al. [20] present efficient consolidation methods for parallel workloads. These works mainly aim at improving resource utilization or managing peak resource consumption. In contrast, we use typical data center workloads to perform the consolidation, and our goal is to reduce the number of physical servers while satisfying the workload performance constraints.

Compared to existing work, our work focuses on the performance of consolidated workloads rather than resource utilization, and also takes migration cost into account in the consolidation plan. In addition, an optimal migration planning algorithm has not been studied in this context in previous works.

3 PROFILING-BASED SERVER CONSOLIDATION FRAMEWORK

[Figure 2: workloads and profiling data feed Consolidation Planning, which outputs the target consolidation scenario (static consolidation); the source consolidation scenario and the target consolidation scenario feed Migration Planning, which outputs a set of workload migrations (dynamic consolidation).]

Fig. 2. Profiling-based Server Consolidation Framework.

Our profiling-based server consolidation framework first involves experiments to profile the performance of typical workloads in data centers. As the workload types and their configurations are very diverse in the real world, we only profile four basic types of workloads as an example: CPU-intensive, Memory-intensive, Disk-intensive and Network-intensive workloads. Other workloads can be approximately classified into these four basic types. Note that the profiling data always depends on the testing environment and configuration. When the environment or the workloads change, the profiling data should be re-measured and updated. The profiling methods and performance metrics are detailed in Section 4.

Based on the performance profiles, our framework provides two modules: the Consolidation Planning Module and the Migration Planning Module. To clearly describe these two modules, we first define the concepts of consolidation case and consolidation scenario. A consolidation case (or simply case) is defined as a set¹ of workloads running on a single PM. For instance, {File Server, File Server, Web Server} is a consolidation case consisting of three workloads. A consolidation scenario is defined as a set of consolidation cases. For instance, { {Java Server}, {File Server, Web Server}, {File Server, Web Server}, {Java Server, Database Server, Web Server} } is a consolidation scenario consisting of four cases. Since each consolidation case runs on one PM, each case in a consolidation scenario also corresponds to a PM that runs this case.
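These two definitions map naturally onto multisets. As a minimal sketch (the names `make_case` and `scenario` are illustrative assumptions, not part of the framework), a case can be modelled as a multiset of workload names and a scenario as a list of cases, one per PM:

```python
from collections import Counter

def make_case(*workloads):
    """A consolidation case: a multiset of the workloads running on one PM."""
    return Counter(workloads)

# The example scenario from the text: four cases, hence four PMs.
scenario = [
    make_case("Java Server"),
    make_case("File Server", "Web Server"),
    make_case("File Server", "Web Server"),
    make_case("Java Server", "Database Server", "Web Server"),
]

num_pms = len(scenario)                                  # one PM per case
num_vms = sum(sum(case.values()) for case in scenario)   # one VM per workload
per_type = sum(scenario, Counter())                      # VMs per workload type
```

Using a multiset rather than a plain set keeps repeated workloads distinct, so a case such as {File Server, File Server, Web Server} still counts three VMs.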

As shown in Fig. 2, the Consolidation Planning Module takes the following as its input: (1) a set of workloads to place on PMs and (2) empirical workload profiling data. After solving the integer linear program to be presented in Section 5, this module outputs the optimal consolidation scenario that minimizes the number of PMs while maintaining the performance loss of each workload below the threshold specified by users.

The Migration Planning Module takes the following as its input: (1) the consolidation scenario before invoking the Consolidation Planning Module (hereafter, the source consolidation scenario) and (2) the optimal consolidation scenario returned by the Consolidation Planning Module (hereafter, the target consolidation scenario). After calling our migration planning algorithm to be presented in Section 6, this module outputs the set of workload migrations that transforms the source consolidation scenario into the target consolidation scenario while minimizing the number of VM migrations.

With these two modules, our framework can handle two kinds of consolidation circumstances: static consolidation and dynamic consolidation. In static consolidation, new workloads are placed on PMs for execution and no migrations of workloads among PMs are involved. In dynamic consolidation, existing workloads on PMs are re-consolidated and migrated to other PMs if needed. Dynamic consolidation is desirable since existing workloads in data centers keep changing their status, such as finishing or […] re-consolidate.

1. In this paper, we use a generalized notion of set that allows its elements to appear more than once.

As illustrated in Fig. 2, in static consolidation only the Consolidation Planning Module is invoked, whereas in dynamic consolidation the Consolidation Planning Module and the Migration Planning Module are invoked sequentially, with the former feeding its output to the latter. By supporting both static and dynamic consolidation, our framework can keep data centers running energy-efficiently at all times.

4 WORKLOAD PERFORMANCE PROFILING

In this section, we describe our experiments to establish the performance profiles of four basic types of workloads in virtualized data centers.

4.1 Two Types of Overheads

Co-location Overhead. The co-location overhead is a result of resource contention among VMs. It is known that resources like cache and networking are not well isolated in current hypervisor implementations [21]. When requests to a resource from co-located VMs exceed the capacity of the physical resource at any time, the performance of the requestors is affected. Because different workloads have different resource request patterns, a mechanism is needed to co-locate workloads that are least likely to compete for the same resources.

Migration Overhead. Dynamic consolidation requires the support of VM live migration techniques [4]. Live migration incurs overhead for the workload running in the migrating virtual machine. The overhead includes the service downtime and the data transmission time. The downtime is usually less than 100 ms [4]. The transmission time may be further stretched by network bandwidth contention if multiple VMs are migrated simultaneously. Other overheads incurred at the beginning and the end of a migration process may also affect the workload performance [22]. VMs running different workloads are likely to have different migration overheads.

4.2 Profiling Setup

Experimental Environment. Our testbed consists of two Dell T710 servers and an NFS storage server, which are interconnected by 1 Gigabit Ethernet. Each Dell T710 server has 16 cores (64-bit Xeon E5620 processors at 2.40GHz) and 32GB DRAM. We use CentOS 5.6 with kernel version 2.6.18-238.12.1.el5xen in Domain 0, and Xen 3.3.1 as the hypervisor [3]. Each virtual machine is configured with 1 VCPU and 1024MB DRAM. All virtual machine images are stored on the NFS storage server.

Benchmark Workloads. The following four common data center workloads are used in our experiments: Java Server as the CPU-intensive workload, File Server as the Disk-intensive workload, Database Server as the Memory-intensive workload and Web Server as the Network-intensive workload. They are commonly used as the standard workloads in server consolidation benchmarks, such as Intel vConsolidate [23], VMware VMmark [24], SPECvirt_sc2010 [25], etc.

• Java Server: We use the SPECjbb2005 benchmark [26] as the Java Server workload.

• File Server: We use IOzone benchmark [27] as the File Server workload. The workload reads and writes a file with the size of 1 GB.

• Database Server: We use Sysbench OLTP [28] as the Database Server workload. MySQL is used as the back-end database and table size is configured with 100,000 records.

• Web Server: We use Webbench [29] as the Web Server workload. Apache HTTP Server is used as the server daemon. We create 100 clients to concurrently access each VM running Web Server.

Experimental Precision. As optimal consolidation and migration require accurate profiling information, each measurement experiment was performed three times with the same configuration, and the final results are the averages of the three runs.

4.3 Consolidation Performance Profiling

In the following, we profile the performance of different consolidation cases. We assume that there are $n$ different types of workloads and each workload runs in a VM. The total number of consolidation cases containing exactly $k$ VMs is $C^{k}_{n+k-1}$. If each PM has a capacity of running up to $m$ VMs simultaneously, the total number of consolidation cases is $\sum_{k=1}^{m} C^{k}_{n+k-1} = C^{m}_{n+m} - 1$. As an example, when $n = 4$ and $m = 4$, i.e., there are four types of workloads and each PM supports up to four VMs, the total number of consolidation cases is $C^{4}_{8} - 1 = 69$.

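The matrix $A$ shows these consolidation cases:

$$
A =
\begin{array}{l|cccc}
 & \text{Java} & \text{File} & \text{DB} & \text{Web} \\ \hline
Case_{1} & 1 & 0 & 0 & 0 \\
Case_{2} & 0 & 1 & 0 & 0 \\
Case_{3} & 0 & 0 & 1 & 0 \\
Case_{4} & 0 & 0 & 0 & 1 \\
Case_{5} & 2 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
Case_{69} & 0 & 0 & 0 & 4
\end{array}
\tag{1}
$$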

In this matrix, $Case_{1}$ has only one Java Server VM running in one PM, $Case_{5}$ has two Java Server VMs co-located in a PM, and $Case_{69}$ has four Web Server VMs co-located in a PM.
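The case count and the rows of $A$ can be reproduced by enumerating multisets of workload types. The sketch below is illustrative only: the function name `enumerate_cases` and the lexicographic enumeration order are assumptions, although this order does agree with the labels the text gives for Case 1, Case 5, Case 23 and Case 69.

```python
from itertools import combinations_with_replacement
from math import comb

WORKLOADS = ("Java", "File", "DB", "Web")  # column order of matrix A

def enumerate_cases(n_types, max_vms):
    """Return the rows of A: one count vector per multiset of 1..max_vms VMs."""
    rows = []
    for k in range(1, max_vms + 1):
        # Each multiset of k workload types is one consolidation case.
        for combo in combinations_with_replacement(range(n_types), k):
            rows.append(tuple(combo.count(t) for t in range(n_types)))
    return rows

A = enumerate_cases(len(WORKLOADS), 4)

# The closed form C(n+m, m) - 1 agrees with the enumeration for n = m = 4.
assert len(A) == comb(4 + 4, 4) - 1 == 69

# Rows are 0-indexed here, so Case i is A[i - 1].
print(A[0], A[4], A[22], A[68])
```

With this ordering, `A[22]` is one Java, one DB and one Web VM, and `A[68]` is four Web VMs.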

Fig. 3 shows the detailed performance of the four workloads in each of the 69 consolidation cases. The performance of each workload is normalized against its peak performance obtained by running on a dedicated PM. As shown in the figure, all the workloads incur performance loss to a certain degree. However, some consolidation cases achieve better performance than others, e.g., the performance loss of all workloads is less than 5% in $Case_{23}$ (shown by point A in Fig. 3). In comparison, the workload performance of consolidation case $Case_{69}$ (shown by point B in Fig. 3) is only 14.01% of its peak performance. This is because the network bandwidth becomes a bottleneck and a large share of CPU resources is occupied by Dom0 to process the network I/O.

[Figure 3: normalized performance (0.0 to 1.0) of the Java Server, File Server, Database Server and Web Server workloads plotted against the consolidation case number; point A marks Case 23 (1java+1db+1web) and point B marks Case 69 (4web).]

Fig. 3. Consolidation Profiling Result. (A: 1 Java Server VM, 1 Database Server VM and 1 Web Server VM consolidated together; B: 4 Web Server VMs consolidated together)

TABLE 1
Migration Performance Profiling.

| | Java Server (bops) | File Server (bytes/s) | Database Server (trans/sec) | Web Server (bytes/s) |
|---|---|---|---|---|
| No-Migration | 20648.79 | 790148 | 366.46 | 2688607 |
| Migration | 16744.84 | 366001 | 239.96 | 2028013 |
| Performance Loss | 18.91% | 53.68% | 34.52% | 24.57% |

TABLE 2
Migration Performance Constraints.

| No. | Migration Performance Constraint (MC) | Forbidden Workloads |
|---|---|---|
| 1 | 81.09% < MC ≤ 100% | All of the Java, File, Database and Web Servers are forbidden to migrate |
| 2 | 75.43% < MC ≤ 81.09% | File, Database and Web Servers are forbidden to migrate |
| 3 | 65.48% < MC ≤ 75.43% | File and Database Servers are forbidden to migrate |
| 4 | 46.32% < MC ≤ 65.48% | File Server is forbidden to migrate |
| 5 | 0 < MC ≤ 46.32% | No forbidden workloads; all workloads are free to migrate |

Co-location Performance Constraint. As users often have performance requirements for their workloads, each workload is subject to certain performance constraints that can be derived from the corresponding SLA between a user and the resource manager. In our case, the SLA metric of a workload is defined as the percentage of the workload's peak performance when running on a dedicated PM. The SLA metric defines the VM co-location performance constraint. When the co-location performance constraint for Web Server is set to 50%, the user expects the Web Server performance to be above 50% of its peak performance. According to the experimental results above, we can easily exclude the consolidation cases that do not satisfy this requirement, such as $Case_{34}$, $Case_{54}$, $Case_{64}$, $Case_{68}$ and $Case_{69}$.
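The exclusion step amounts to a simple filter over the profiling data. The following sketch is only an illustration: the dictionary layout and the function name `satisfies` are assumptions, and apart from the 14.01% figure reported for Case 69 every performance value is a made-up placeholder rather than a measurement from the paper.

```python
# Normalized performance per case, as a fraction of dedicated-PM peak.
# Only the Case 69 value (0.1401) is reported in the text; the others
# are hypothetical placeholders chosen to match the qualitative results.
profile = {
    "Case23": {"Java": 0.97, "DB": 0.96, "Web": 0.96},  # hypothetical values
    "Case34": {"Web": 0.40},                            # hypothetical value
    "Case69": {"Web": 0.1401},                          # reported in the text
}

def satisfies(case_perf, thresholds):
    """True if every workload present in the case meets its threshold."""
    return all(perf >= thresholds.get(w, 0.0) for w, perf in case_perf.items())

thresholds = {"Web": 0.50}  # the 50% Web Server constraint from the example
allowed = [c for c, perf in profile.items() if satisfies(perf, thresholds)]
excluded = [c for c in profile if c not in allowed]
```

Workloads without a stated threshold default to 0.0 here, meaning they are unconstrained; that default is a design choice of the sketch.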

4.4 Migration Performance Profiling

We also profile the migration performance using the Virt-LM benchmark [30]. Table 1 compares the performance of each workload with one migration to that without migration. We find that different workloads incur different migration overheads. The migration of a File Server VM incurs high overhead, resulting in more than 50% performance loss for the File Server compared with the case without migration. This is mainly because the File Server continuously reads and writes a 1 GB file, which triggers many swap operations. These swap operations produce a large amount of dirty data in memory, and transferring this dirty data from the source PM to the target PM incurs significant overhead.


[…] the quality of service for performance-sensitive workloads. During dynamic consolidation, workloads that are unlikely to meet the constraint will therefore not be migrated. Based on the migration performance data, a set of migration performance constraints (MC) is shown in Table 2. For example, if a user specifies that a workload's performance during migration should not drop below 70% of the workload's peak performance when running on a dedicated PM, File Servers and Database Servers do not qualify for migration because their performance losses during migration are 53.68% and 34.52% respectively, both above the 30% loss that this constraint allows.
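The loss figures in Table 1 and the forbidden sets in Table 2 follow directly from the measured throughputs. As a minimal sketch (the names `TABLE1`, `migration_performance` and `forbidden_workloads` are illustrative assumptions), the derivation can be written as:

```python
# Throughput without migration and with one migration, taken from Table 1.
TABLE1 = {
    "Java": (20648.79, 16744.84),      # bops
    "File": (790148, 366001),          # bytes/s
    "Database": (366.46, 239.96),      # trans/sec
    "Web": (2688607, 2028013),         # bytes/s
}

def migration_performance(workload):
    """Performance during migration as a fraction of the no-migration value."""
    base, migrating = TABLE1[workload]
    return migrating / base

def forbidden_workloads(mc):
    """Workloads whose performance during migration falls below constraint mc."""
    return sorted(w for w in TABLE1 if migration_performance(w) < mc)

# Performance loss in percent, rounded as in Table 1.
losses = {w: round(100 * (1 - migration_performance(w)), 2) for w in TABLE1}
```

For instance, `forbidden_workloads(0.70)` yields the Database and File Servers, matching row 3 of Table 2.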

5 CONSOLIDATION PLANNING

In this section, we present our integer programming model that considers both co-location performance constraint and migration performance constraint.

5.1 Consolidation Planning Modelling

As described earlier, if there are $n$ types of different workloads and each PM supports up to $m$ VMs, the total number of different consolidation cases is $N = C^{m}_{n+m} - 1$. We use $X_i$ to denote the initial number of PMs running a mix of workloads represented by consolidation case $Case_i$, and $Y_i$ to denote the optimized number of PMs running the same mix of workloads represented by consolidation case $Case_i$. $A_{i,k}$ in Equation (1) is the number of VMs running workload $k$ in consolidation case $Case_i$. We use $P_{i,k}$ to denote the co-location performance of workload $k$ in consolidation case $Case_i$. $P_{i,k}$ is obtained through the profiling described above. We use $P_k$ to denote the consolidation performance constraint defined in the SLA on workload $k$.

In addition, due to the migration performance constraint, there are a few combinations of workloads that cannot be migrated. The total number of such combinations is $l = C^{f}_{n+f} - 1$ $(0 \le f \le m)$, in which $n$ is the total number of workload types and $f$ is the total number of forbidden workload types. For example, when a PM can run up to four VMs simultaneously, there are $C^{2}_{6} - 1 = 14$ combinations of forbidden workloads if two workloads do not satisfy the migration performance constraint. If all four workload types are forbidden to migrate, no new consolidation plan can be produced. We use a matrix $F$ to represent these combinations, in which $F_{i,j}$ is the number of forbidden workload combination $j$ in consolidation case $Case_i$.

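As a result, the consolidation planning problem can be formulated as the following optimization problem:

$$
\begin{aligned}
\min \quad & z = \sum_{i=1}^{N} Y_i && (2) \\
\text{s.t.} \quad & \forall k \in [1, n], \ \sum_{i=1}^{N} A_{i,k} X_i = \sum_{i=1}^{N} A_{i,k} Y_i, && (3) \\
& \forall j \in [1, l], \ \sum_{i=1}^{N} F_{i,j} X_i = \sum_{i=1}^{N} F_{i,j} Y_i, && (4) \\
& \forall i \in [1, N], \ \forall k \in [1, n], \ P_{i,k} \ge P_k, && (5) \\
& \forall i \in [1, N], \ Y_i \in \mathbb{N}. && (6)
\end{aligned}
$$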

The objective function is to minimize the number of PMs needed to run all VMs.

Constraint 1 ensures that each VM in the initial plan is deployed onto one PM in the new plan, i.e., the total number of VMs of each workload type is the same before and after the consolidation.

Constraint 2 ensures that the consolidation satisfies the migration performance constraint, which requires that the number of PMs running a forbidden workload combination stays the same during the consolidation process. Fig. 4 shows an example of the effect of this constraint. There are four types of workloads in total. We assume that the File Server VM (in yellow) and the Database Server VM (in green) are forbidden to migrate because of the migration performance constraint. In other words, only the Java Server and Web Server workloads are allowed to migrate. To ensure this constraint is satisfied, our method maintains the same number of PMs that run the following two forbidden workload combinations: one with a File Server VM running, and one with a File Server VM and a Database Server VM running.

Constraint 3 ensures that the workload performance in any consolidation case satisfies the consolidation performance constraint.

Constraint 4 ensures that the number of PMs is an integer.

Note that the solution is applicable to the static VM allocation problem as well. To do so, Constraint 1 can be modified to $\sum_{i=1}^{N} A_{i,k} X_i = C_k$ and Constraint 4 removed, where $C_k$ is a constant denoting the number of VMs running workload $k$.

[Figure 4: a source placement on Physical Machines A to E and a target placement on Physical Machines A' to E', each hosting Java Server, File Server, DB Server and Web Server VMs. The forbidden combinations containing "File Server" or "DB Server" are: File Server; File Server + DB Server.]

Fig. 4. Migration Performance Constraint Example.

5.2 Consolidation Plan Selection

The consolidation method described above may produce multiple plans. Fig. 5 shows an example where four VMs are deployed and the co-location performance constraint is set to 80% of their peak performance when running on a dedicated PM. Consolidation plans (a) and (b) both need two PMs and both satisfy the consolidation performance constraint. However, plan (b) achieves better average performance than plan (a). Selecting plan (b) rather than plan (a) benefits both the resource provider and the users. We give a selection method that uses the PM number produced by consolidation planning as a new constraint and optimizes the average consolidation performance. The new constraint is $\sum_{i=1}^{N} Y_i = Y_{min}$, and the objective function is as follows:

$$
\max \quad z' = \frac{\sum_{k=1}^{n} \sum_{i=1}^{N} P_{i,k} A_{i,k} Y_i}{\sum_{k=1}^{n} \sum_{i=1}^{N} A_{i,k} Y_i} \tag{7}
$$

The final consolidation plan is therefore computed through two steps: 1) computing the minimal number of PMs for consolidating existing VMs; 2) selecting a consolidation plan with the best consolidation performance if multiple feasible plans are produced in step 1.
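The two-step procedure can be illustrated on a toy instance. The sketch below is not an integer programming solver: it searches the same feasible set exhaustively, which is only practical for tiny inputs. All numbers are hypothetical, the migration constraint (4) is omitted, and constraint (5) is read as "a case may be used only if every workload it contains meets its threshold", which is an interpretation rather than something the text states explicitly.

```python
from itertools import product

# Toy instance (all numbers hypothetical): n = 2 workload types, m = 2 VMs per PM.
A = [(1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]    # rows of A: VMs per type in each case
P = [(1.00, 1.00), (1.00, 1.00), (0.85, 1.00),  # P[i][k]: normalized performance of
     (0.90, 0.88), (1.00, 0.82)]                # type k in case i (1.0 where absent)
X = [2, 2, 0, 0, 0]                             # initial plan: four PMs, one VM each
P_MIN = (0.80, 0.80)                            # per-type thresholds P_k

n, N = len(P_MIN), len(A)
# Left-hand side of constraint (3): VMs of each type in the initial plan.
demand = [sum(A[i][k] * X[i] for i in range(N)) for k in range(n)]

def case_allowed(i):
    """Constraint (5), applied only to workloads actually present in case i."""
    return all(P[i][k] >= P_MIN[k] for k in range(n) if A[i][k] > 0)

def feasible(Y):
    """Constraints (3) and (5) for a candidate plan Y."""
    if any(Y[i] > 0 and not case_allowed(i) for i in range(N)):
        return False
    return all(sum(A[i][k] * Y[i] for i in range(N)) == demand[k] for k in range(n))

def avg_perf(Y):
    """Objective (7): VM-weighted average normalized performance."""
    num = sum(P[i][k] * A[i][k] * Y[i] for i in range(N) for k in range(n))
    den = sum(A[i][k] * Y[i] for i in range(N) for k in range(n))
    return num / den

# Constraint (6): each Y_i is a non-negative integer, bounded by the VM count.
plans = [Y for Y in product(range(sum(demand) + 1), repeat=N) if feasible(Y)]
y_min = min(sum(Y) for Y in plans)                               # step 1, objective (2)
best = max((Y for Y in plans if sum(Y) == y_min), key=avg_perf)  # step 2
```

In this instance two plans reach the minimum of two PMs; step 2 prefers the plan that pairs one VM of each type on each PM because its average performance is higher.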

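[Figure 5: plan (a) places VMs A and B on one PM and VMs C and D on another, with an average performance of 85%; plan (b) places VMs A and C on one PM and VMs B and D on another, with an average performance of 86.5%. Co-location performance constraint: >= 80%.]

Fig. 5. Consolidation Decision Selection Example.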

6 MIGRATION PLANNING

With the above consolidation planning, we obtain the target consolidation scenario that minimizes the number of PMs required. As a subsequent step, we still need to transform the source consolidation scenario into the target consolidation scenario. In doing this, we aim to minimize the number of VM migrations since the number of VM migrations is an important factor affecting the total migration cost. In this section, we first formulate our migration planning problem for minimizing the number of VM migrations, and then solve it by constructing a mapping to the classical Linear Sum Assignment Problem (LSAP). For an LSAP, many polynomial time algorithms have been proposed. In our work, we pick the Hungarian algorithm with the augmenting path technique [31], which is commonly used to solve an LSAP in practice and has a complexity of $O(n^3)$, where $n$ is the number of rows in the cost matrix of an LSAP.

6.1 The Migration Planning Problem

We formulate our migration planning problem as follows. Given the source consolidation scenario containing $s$ cases and the target consolidation scenario containing $t$ cases, where $s \ge t$ since the required PM number will generally be reduced by consolidation planning, find the set of VM migrations among the $s$ PMs that contain those $s$ source cases such that (1) $t$ PMs out of those $s$ PMs contain the $t$ cases in the target scenario and (2) the number of VM migrations in this set is minimal.

Note that a VM migration here means the movement of a VM from one PM to another PM. For example, to change the PM that contains a single Java Server to the PM that contains a single File Server, we altogether need two VM migrations: moving a File Server into the PM that currently holds the single Java Server, and then moving the Java Server out of its current PM. Here we assume that a PM always has enough memory to hold moving-in VMs, so that temporary third party storage is not needed.

6.2 Mapping to the Linear Sum Assignment Problem

A classical problem in combinatorial optimization, the Linear Sum Assignment Problem (LSAP) [31] is generally described as follows. Suppose $n$ jobs are to be assigned to $n$ workers, where the assignment of job $i$ $(1 \le i \le n)$ to worker $j$ $(1 \le j \le n)$ incurs a cost $c_{ij}$ and each worker can only take one job. The task is to find the assignment of jobs that minimizes the total cost of completing all jobs. In other words, given the $n \times n$ cost matrix $C$ containing the entries $c_{ij}$, the problem is to select $n$ entries in $C$ such that exactly one entry in each row and one entry in each column are selected and the sum of these $n$ entries is minimized.

Our basic idea of mapping a migration planning problem to an LSAP is as follows. Consider a set of VM migrations that transforms a given source consolidation scenario into a given target consolidation scenario. After this set of migrations is completed, a PM containing a source case $S$ will contain a target case $T$. Based on this observation, we say that case $T$ is assigned to case $S$ if a PM changes its case from $S$ to $T$ after the transformation. Thus, a set of VM migrations actually produces an assignment solution that assigns every case in the target scenario to a case in the source scenario. Further, we define the cost of assigning $T$ to $S$ as the number of VM migrations needed to change $S$ to $T$. With this cost definition, when we minimize the total cost of assigning the cases in the target scenario to the cases in the source scenario, we minimize the total number of VM migrations required to transform the source scenario into the target scenario.

Since the number of jobs is equal to the number of workers in an LSAP, we need to address the following issue to make the above idea actually work: the number of cases $s$ in the source scenario can be larger than the number of cases $t$ in the target scenario. Our solution is simple: we add $s - t$ cases containing no VMs to the target scenario, so that the numbers of source cases and target cases both become $s$, without affecting the minimum number of VM migrations required.
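The cost definition and the padding step can be sketched in a few lines of Python. This is a hedged illustration, not our C++ implementation: representing a case as a list of workload names, and the helper names `migrations_to_change` and `build_cost_matrix`, are assumptions made for this example.

```python
from collections import Counter


def migrations_to_change(source_case, target_case):
    """Number of VMs that must leave or enter a PM to turn source_case into target_case."""
    src, tgt = Counter(source_case), Counter(target_case)
    moved_out = sum((src - tgt).values())  # VMs present now but not wanted
    moved_in = sum((tgt - src).values())   # VMs wanted but not present
    return moved_out + moved_in


def build_cost_matrix(source_cases, target_cases):
    """Pad the target scenario with empty cases and return the s x s cost matrix."""
    s, t = len(source_cases), len(target_cases)
    if s < t:
        raise ValueError("expected at least as many source cases as target cases")
    padded_targets = list(target_cases) + [[] for _ in range(s - t)]
    return [[migrations_to_change(S, T) for T in padded_targets]
            for S in source_cases]


# Hypothetical scenario: three source PMs, two target PMs.
sources = [["java"], ["file"], ["java", "db"]]
targets = [["file"], ["java", "java", "db"]]
cost_matrix = build_cost_matrix(sources, targets)
print(cost_matrix)
```

Changing a PM that holds a single Java Server into one that holds a single File Server costs two migrations under this definition, which matches the example in Subsection 6.1.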

6.3 Our Algorithm for Solving the Migration Planning Problem

Using the above mapping method, our algorithm for solving a migration planning problem is detailed below.

Input: (1) source consolidation scenario with $s$ cases denoted by $S_i$ ($1 \le i \le s$) and (2) target consolidation scenario with $t$ cases denoted by $T_j$ ($1 \le j \le t$).

Output: A set of VM migrations with the minimal number of migrations in it.

Begin:

1) If $s > t$, add $s - t$ empty cases to the target scenario.

2) Construct an LSAP with an $s \times s$ cost matrix $C$, in which each entry $c_{ij}$ equals the number of migrations needed to change $S_i$ to $T_j$.

3) Call the Hungarian algorithm with the augmenting path technique to solve the above LSAP.

4) Suppose the solution to the above LSAP assigns $T_{\sigma(i)}$ to $S_i$ for each $i$ ($1 \le i \le s$), where $\sigma$ is a permutation of $\{1, \dots, s\}$. Obtain the set of migrations by putting together the migrations needed for assigning each $T_{\sigma(i)}$ to $S_i$.

End

Note that in the cost matrix constructed in step 2, each VM migration conducted for changing cases (say moving a VM from PM X to PM Y) is actually considered in the case changing at both PM X and PM Y, so the final minimum number of VM migrations will be equal to half of the minimum cost found by the Hungarian algorithm. Similarly, in step 4 of this algorithm, each VM migration will appear in the migrations at both PM X and PM Y, and we only include it once in our final set of migrations.
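Step 3 and the halving rule can be illustrated with the sketch below. It is a compact textbook formulation of the $O(n^3)$ Hungarian method using shortest augmenting paths and dual potentials, written in Python for readability; it is not the C++ code used in our experiments, and the function name `hungarian` and the 3 x 3 example are assumptions. The example matrix corresponds to three hypothetical source PMs (1 Java; 1 File; 1 Java + 1 DB) and targets (1 File; 2 Java + 1 DB; Empty).

```python
def hungarian(cost):
    """Solve an n x n LSAP in O(n^3) time.

    Returns (assignment, total), where assignment[i] is the column chosen
    for row i. Indices are 1-based internally; column 0 is a dummy.
    """
    n = len(cost)
    INF = float("inf")
    u = [0] * (n + 1)      # row potentials
    v = [0] * (n + 1)      # column potentials
    match = [0] * (n + 1)  # match[j] = row matched to column j (0 = free)
    way = [0] * (n + 1)    # previous column on the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    reduced = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if reduced < minv[j]:
                        minv[j], way[j] = reduced, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:  # reached a free column
                break
        while j0 != 0:          # flip the matching along the path
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[match[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return assignment, total


# Cost matrix for the hypothetical three-PM scenario described above.
cost = [[2, 2, 1],
        [0, 4, 1],
        [3, 1, 2]]
assignment, total = hungarian(cost)
num_migrations = total // 2  # each migration is counted at both PMs
print(assignment, total, num_migrations)
```

In this small scenario the only migration is the Java Server moving from the first PM to the third, which is counted once at each of the two PMs and therefore contributes 2 to the LSAP cost.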

6.4 Considering the Migration Performance Constraint

In the above algorithm description, we do not consider the situation that a VM can be forbidden to move due to the migration performance constraint. This subsection presents our measure to deal with this situation.

Specifically, our migration planning module first references the migration performance constraint to determine those VMs that cannot be moved (we call them forbidden VMs hereafter). Then, cases in both the source and target consolidation scenarios are classified according to the subset of forbidden VMs that they contain. That is, cases containing the same subset of forbidden VMs belong to the same category. For instance, if 'File Server' is a forbidden VM according to the migration performance constraint, then one category can comprise the cases that contain no forbidden VMs, and another category can comprise the cases that contain exactly one 'File Server'. Because a forbidden VM cannot move, a target case $T$ can be assigned to a source case $S$ only when $T$ and $S$ are in the same category. Based on this, we divide the entire migration planning problem into several subproblems, with each subproblem dealing with the migration planning for one category of cases. We solve each subproblem by building a cost matrix within its category of cases and mapping it to an LSAP, which is done by calling the algorithm presented in Subsection 6.3. Finally, we combine the migrations returned from solving each subproblem to obtain the final set of migrations.
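The classification step can be sketched as follows. This is an illustrative Python fragment under stated assumptions: cases are lists of workload names, the category key is the multiset of forbidden VMs in a case, and the names `category_of` and `split_by_category` are hypothetical.

```python
from collections import Counter, defaultdict


def category_of(case, forbidden_types):
    """Key describing which forbidden VMs (and how many of each) a case holds."""
    counts = Counter(vm for vm in case if vm in forbidden_types)
    return tuple(sorted(counts.items()))


def split_by_category(source_cases, target_cases, forbidden_types):
    """Group source and target cases that hold the same forbidden VMs."""
    groups = defaultdict(lambda: {"sources": [], "targets": []})
    for case in source_cases:
        groups[category_of(case, forbidden_types)]["sources"].append(case)
    for case in target_cases:
        groups[category_of(case, forbidden_types)]["targets"].append(case)
    return dict(groups)


# Hypothetical scenario in which File Server VMs may not migrate.
forbidden = {"file"}
sources = [["file"], ["file", "db"], ["java"], ["db", "web"]]
targets = [["file", "db", "web"], ["java", "file", "db"], [], []]  # already padded
subproblems = split_by_category(sources, targets, forbidden)
for key, group in subproblems.items():
    print(key, len(group["sources"]), len(group["targets"]))
```

Each group then yields its own square cost matrix. In this sketch the empty padding cases fall into the category without forbidden VMs, so both categories have as many targets as sources.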

To help understand the above steps, Fig. 6 shows an example in which the source consolidation scenario contains 21 cases and the target consolidation scenario contains 14 cases. We increase the number of target cases to 21 by adding 7 Empty cases. We have two categories in this example, so we divide the migration planning problem into two subproblems and solve them separately. After applying the Hungarian algorithm to the two cost matrices shown in this example, we find that the minimum number of VM migrations is 21 and obtain the corresponding set of VM migrations, which is also shown in Fig. 6.

6.5 Complexity of Our Algorithm

We analyze the complexity of our algorithm by considering each of its steps presented in Subsection 6.3. In step 1, $s - t$ cases are added, so its complexity is $O(s)$. In step 2, each entry of the cost matrix is calculated. Since the matrix has $s^2$ entries and the calculation for each entry requires $O(m \log m)$ time, where $m$ is the maximum number of VMs that a case can contain, the complexity of step 2 is $O(s^2 m \log m)$. In step 3, the Hungarian algorithm incurs a complexity of $O(s^3)$. In step 4, the total number of assignments is $s$ and obtaining the migrations for each assignment requires $O(m \log m)$ time, so the complexity of step 4 is $O(s m \log m)$. Combining the complexity of all four steps, the total complexity of our solution is $O(s^3)$, since $m$ is generally much less than $s$.
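The combination of the four terms can be written out explicitly. The condition $m \log m = O(s)$ in the last line is our reading of "much less than" and is stated here as an assumption:

```latex
\begin{aligned}
T(s, m) &= \underbrace{O(s)}_{\text{step 1}}
         + \underbrace{O(s^2 m \log m)}_{\text{step 2}}
         + \underbrace{O(s^3)}_{\text{step 3}}
         + \underbrace{O(s\, m \log m)}_{\text{step 4}} \\
        &= O\!\left(s^3 + s^2 m \log m\right) \\
        &= O(s^3) \qquad \text{when } m \log m = O(s).
\end{aligned}
```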

7 EVALUATION

We implement the consolidation planning in LINGO [32], which is commonly used for building and solving linear, nonlinear and integer optimization models. We implement the migration planning using the Hungarian algorithm in C++.

7.1 Consolidation with Co-location Performance Constraint

In this section, we focus on the consolidation performance without involving live migration. It can be called static consolidation.

Fig. 7 shows the consolidation results with various co-location performance constraints and VM input sizes. In this experiment, 4,000 VMs (1,000 Java Server VMs, 1,000 File Server VMs, 1,000 Database Server VMs, and 1,000 Web Server VMs) are deployed into PMs. From Fig. 7(a), we can see that when the co-location performance constraint changes from 99% to 1%, the required PM number decreases significantly. For example, 2,834 PMs are needed when the co-location performance constraint is 99%, while the number decreases to 2,000 and 1,100 when the co-location performance constraint is relaxed to 90% and 80%

1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.

Fig. 6. Migration Planning Example.

Initial consolidation (21 PMs): S1: 1file; S2: 1file+1db; S3: 2java+1file; S4: 1java+1file+1db; S5: 1java+1file+1web; S6: 1java; S7: 1db; S8: 1web; S9: 2java; S10: 1java+1db; S11: 2db; S12: 3java; S13: 2java+1db; S14: 1java+2db; S15: 1java+1db+1web; S16: 3db; S17: 2db+1web; S18: 3java+1db; S19: 2java+1db+1web; S20: 1java+3db; S21: 3db+1web.

Target consolidation (14 PMs): T1: 1file+1db+1web; T2-T5: 1java+1file+2db; T6-T11: 2java+2db; T12-T13: 2java+2web; T14: 1java+2db+1web; T15-T21: Empty.

Input migration cost matrix C (mapping to the LSAP). Identical target columns are merged.

Subproblem-1: Free Case (S6-S21, T6-T21)

       T6-T11  T12-T13  T14  T15-T21
S6        3       3      3      1
S7        3       5      3      1
S8        5       3      3      1
S9        2       2      4      2
S10       2       4      2      2
S11       2       6      2      2
S12       3       3      5      3
S13       1       3      3      3
S14       1       5      1      3
S15       3       3      1      3
S16       3       7      3      3
S17       3       5      1      3
S18       2       4      4      4
S19       2       2      2      4
S20       2       6      2      4
S21       4       6      2      4

Subproblem-2: Forbidden Case (S1-S5, T1-T5)

       T1  T2-T5
S1      2    3
S2      1    2
S3      4    3
S4      2    1
S5      2    3

Optimal migration set (source → target: cost), obtained by the Hungarian algorithm:

Forbidden case: S1→T3: 3; S2→T5: 2; S3→T2: 3; S4→T4: 1; S5→T1: 2; sum = 11.

Free case: S6→T18: 1; S7→T15: 1; S8→T21: 1; S9→T12: 2; S10→T16: 2; S11→T17: 2; S12→T13: 3; S13→T11: 1; S14→T8: 1; S15→T19: 3; S16→T20: 3; S17→T14: 1; S18→T9: 2; S19→T6: 2; S20→T10: 2; S21→T7: 4; sum = 31.

Total number of migrations = (11 + 31)/2 = 21.

Fig. 7. Consolidation with Co-location Performance Constraint. (a) PM Number: PM number versus co-location performance constraint, for input VM numbers of 4,000, 400 and 40. (b) PM Number under Various Input Sizes: PM number versus input number of each VM workload, for performance constraints of 99%, 90% and 80%. (c) Average Workload Performance: normalized performance versus co-location performance constraint, comparing the performance constraint, consolidation with the co-location performance constraint, and random consolidation without constraint (lower boundary). (d) Average Workload Performance under Various Input Sizes: normalized performance versus input number of each VM workload (10 to 100,000), for performance constraints of 99%, 90%, 80% and 70%.

decreases, more VMs can be consolidated to run together. However, as the consolidation density increases, there is little room for further optimization. When the co-location performance constraint is set to 77%, 1,000 PMs are needed to run all of the 4,000 VMs, which means every PM already hosts the maximum number of VMs. As a result, relaxing the constraint further does not reduce the PM number.

Fig. 7(b) shows the relationship between VM number and PM number produced by the consolidation planning. The PM number increases nearly linearly as the number of VMs increases from 10 to 1,000,000 under different performance constraints.

Fig. 7(c) shows how the consolidation performance achieved by our consolidation planning method changes with the performance constraint. The consolidation method with performance constraints is able to guarantee the average workload performance. Our consolidation method clearly outperforms the random consolidation method, which cannot handle performance constraints, by up to 32.14%. Note that a decrease of the performance constraint does not necessarily mean a decrease of workload performance. For example, as indicated by the arrow in the figure, the average workload performance at a constraint level between 77% and 69% is lower than that with a more relaxed constraint, such as 68%. This is because, at a relaxed performance constraint, some workloads may benefit from certain consolidation cases our method produces. In this case, consolidation case Case60 (2 Database Servers and 2 Web Servers) is produced at the 68% constraint level. The performance of Database Server in Case60 is 0.8696 and the performance of Web Server in Case60 is 0.6885. The high Database Server performance improves the overall average workload performance.

Fig. 7(d) shows the impact of the number of VMs. As the VM number increases, the average workload performance remains stable. This indicates that our consolidation planning method scales well and performs well even with 400,000 VMs.
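As a worked check of the Case60 example discussed above, and assuming that the average workload performance of a case is the unweighted mean over its four VMs (an assumption; the averaging rule is not restated in this section), the two reported values give:

```latex
\[
\bar{p}_{\mathrm{Case60}}
  = \frac{2 \times 0.8696 + 2 \times 0.6885}{4}
  = \frac{3.1162}{4}
  \approx 0.7791
\]
```

Under the further assumption that the constraint applies to each workload individually, the Web Server value of 0.6885 lies below 0.69 but above 0.68, which would be consistent with Case60 first becoming admissible at the 68% level.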

7.2 Consolidation with Migration Performance Constraint

In this section, we take the migration performance into account to evaluate our consolidation planning method. In this case, the input is an existing consolidation plan in which VMs running various workloads are already placed in PMs. We perform the consolidation planning to optimize this plan through VM live migration. We use two different input plans: one without a co-location constraint (random) and the other with a co-location constraint.

7.2.1 Consolidation without Initial Performance Constraints

Fig. 8 shows the consolidation results without an initial co-location performance constraint. In this experiment, the initial consolidation plan consists of 69 different consolidation cases (from Case1 to Case69), each with 10 instances (690 PMs in total and 2,240 VMs, with 560 Java Server VMs, 560 File Server VMs, 560 Database Server VMs and 560 Web Server VMs).

Fig. 8(a) shows the changes of PM number in the consolidation, from which we have the following findings: 1) The initial input has an impact on the final consolidation result. As shown in the figure, when there is no migration performance constraint, if the workload performance constraint is set as high as 95%, we may need more PMs (1,120) than the initial ones (690) to guarantee the workload performance. 2) When we take the migration performance into account in the consolidation planning, i.e., VMs running certain workloads cannot be migrated (as analyzed in Table 2), we cannot always satisfy both the migration performance constraint and the co-location performance constraint. For example, when the co-location performance constraint is set to 80% and the File Server is not allowed to migrate due to the migration performance constraint, we cannot get a feasible consolidation solution. This explains why several incomplete curves appear in the figure. 3) A stricter migration performance constraint may require more PMs in order to maintain the workload performance. For example, when the co-location performance constraint is set to 10%, 650 PMs are needed to satisfy the migration performance constraint if File Server, Database Server and Web Server are all forbidden to migrate, whereas only 560 PMs are needed if only File Server and Database Server are forbidden to migrate.

Fig. 8(b) shows the average workload performance under different co-location performance constraints. We have the following observations: 1) The consolidation results always satisfy the co-location performance constraint. Moreover, our consolidation planning method can achieve better performance than the constraint requires. 2) The migration performance constraint may affect the average workload performance of the final consolidation result. For example, when the co-location performance constraint is 30%, a migration performance constraint that forbids File Server from migrating requires 560 PMs. Under the same co-location performance constraint, a migration performance constraint that forbids both File Server and Database Server from migrating requires the same number of PMs (see Fig. 8(a)). However, the former case obtains 86.92% performance, while the latter obtains a lower performance of 84.54%. This is because the room for optimization is smaller when more VMs are forbidden to migrate.

7.2.2 Dynamic Consolidation with Initial Performance Constraints

In this section, we evaluate how the consolidation planning method can improve an initial plan that already satisfies a given performance constraint. Fig. 9 shows the consolidation results with an initial performance constraint. In this experiment, we select the valid consolidation cases that meet the co-location performance constraint as the input. Each valid case has 10 instances. For example, when the

Fig. 8. Consolidation with Migration Performance Constraint (without Initial Performance Constraint). (a) PM Number; (b) Average Workload Performance. Both panels are plotted against the co-location performance constraint for the initial input and for four settings: no migration constraint (MC), MC with File Server forbidden, MC with File+DB Servers forbidden, and MC with File+DB+Web Servers forbidden. Panel (b) also shows the performance constraint itself.

Fig. 9. Consolidation with Migration Performance Constraint (with Initial Performance Constraint). (a) PM Number; (b) Average Workload Performance. Both panels are plotted against the co-location performance constraint for the initial input and for the same four migration constraint settings as in Fig. 8. Panel (b) also shows the performance constraint itself.

the input consolidation cases satisfying this constraint as input. The PM number of initial consolidation plan varies with different co-location performance constraints.

Fig. 9(a) shows the change of PM number under different co-location performance constraints. We have the following observations: 1) Different from Fig. 8, all migration performance constraints in this experiment can produce a consolidation plan that satisfies the co-location performance constraint, because the input plan already satisfies the constraint by itself. 2) The stricter the migration performance constraint is, the fewer PMs can be removed. For example, when the co-location constraint is 40%, the initial input PM number is 640. The PM number can be reduced by 19.84% when only File Server is a forbidden workload, and by 6.25% when File Server, Database Server and Web Server are all forbidden workloads.
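The percentages in this example translate into PM counts as follows. The counts are derived here from the stated initial PM number and percentages, with rounding to whole PMs; they are not read from the figure:

```latex
\[
\begin{aligned}
\text{File Server forbidden:} \quad
  & 640 \times 0.1984 \approx 127
  \;\Rightarrow\; 640 - 127 = 513 \text{ PMs}, \\
\text{File+DB+Web Servers forbidden:} \quad
  & 640 \times 0.0625 = 40
  \;\Rightarrow\; 640 - 40 = 600 \text{ PMs}.
\end{aligned}
\]
```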

Fig. 9(b) shows the average workload performance under different constraints. From the figure, we find that: 1) The consolidation plan satisfies both the co-location performance constraint and the migration performance constraint. 2) An interesting phenomenon is that when the co-location performance constraint is 75%, the average workload performance produced by the new consolidation plan with no migration performance constraint is worse than the initial average workload performance, as indicated by the arrow in the figure. The reason is that the PM number is not the same in the two plans. The input PM number is 420 when the co-location performance constraint is 75%, while the PM number is 313 without a migration performance constraint. As a result, the initial workload performance can be better, but the new plan uses far fewer PMs while still satisfying the performance constraint.

7.3 Migration Planning

We compare our optimal migration method with a base migration method. In the base migration method, all the VM workloads in the source consolidation scenario are treated as a set of new workloads. We traverse all the target PMs; when a target PM needs certain workloads, we fetch those workloads from the source PMs and deploy them to that target PM. This is similar to a VM deployment problem. Note that only the movable VMs that satisfy the migration performance constraint can migrate from one PM to another according to the consolidation plan.

Fig. 10. The Results of Migration Planning Algorithm. (a) No Migration Constraint; (b) Migration Constraint with File Server Forbidden; (c) Migration Constraint with File+DB Servers Forbidden; (d) Migration Constraint with File+DB+Web Servers Forbidden. Each panel plots the number of VM migrations against the co-location performance constraint (99% to 10%) for the base migration method and the optimal migration method.

Fig. 10 shows the efficiency of our migration planning algorithm. We have the following observations: 1) The migration performance constraint affects the number of VM migrations. When the migration performance constraint becomes stricter, both the base migration method and the optimal migration method need fewer migration steps. For example, when the co-location performance constraint is set to 50%, the number of VM migrations is 1,500 for the base method and 694 for the optimal method in the case of “No Migration Constraint” (see Fig. 10(a)), while in the case of “Migration Constraint with File Server Forbidden” (see Fig. 10(b)), the number of VM migrations is 950 for the base method and 490 for the optimal method. We can observe a similar phenomenon in the other two cases. 2) Our optimal migration planning method can reduce the overall number of VM migrations by more than 50% on average. In the case of “No Migration Constraint” (Fig. 10(a)), our method reduces the number of VM migrations by 60.10% to 80.00%, with an average reduction of 68.37%. Similarly, the average reduction is 59.49%, 52.55% and 58.27% in the cases of “Migration Constraint with File Server Forbidden” (Fig. 10(b)), “Migration Constraint with File+DB Servers Forbidden” (Fig. 10(c)) and “Migration Constraint with File+DB+Web Servers Forbidden” (Fig. 10(d)), respectively.

7.4 Discussion

Workload profiling is the foundation of our consolidation planning and migration planning methods. Currently, we use benchmark applications as workloads in our experiments. However, a benchmark (such as IOzone) always runs the application under high load, whereas in real virtualized data centers the workload may change dynamically according to user demand. Nevertheless, a high load exposes the resource needs of an application and can therefore reflect its software characteristics more accurately.

Our proposed optimal consolidation planning and migration planning method is general enough to apply to both static and dynamic consolidation scenarios. Static consolidation deals only with the co-location performance constraint, while dynamic consolidation deals with both the co-location performance constraint and the migration performance constraint. Further, our method is able to handle the consolidation of large scale systems that consist of tens of thousands of VMs.

performance and migration performance. When the workload performance constraint becomes stricter, more PMs are needed to maintain the workload performance. When the migration performance is set to a high value, i.e., more workloads are forbidden to migrate, the room for consolidation optimization becomes smaller and the PM number will be larger. It also affects the workload performance improvement.

8 CONCLUSION

Server consolidation is an effective means to reduce the energy consumption and improve the utilization of physical servers in modern virtualized data centers. Live migration can be used to implement dynamic resource allocation to workloads. However, both server consolidation and live migration incur non-trivial overheads and have impact on the performance of user workloads.

In this paper, we proposed a profiling-based framework for server consolidation to address this problem. First, we established profiles of workload performance by conducting comprehensive experiments to quantify the co-location performance and migration performance with real workloads. Second, we gave an optimal solution to consolidate VMs under the performance constraints derived from the profiles. Third, we raised a new problem of how to transform the source consolidation scenario into the target consolidation scenario with the minimum number of VM migrations, and provided an optimal solution.

Extensive experimental results showed that our proposed consolidation framework greatly reduced both the number of PMs used and the number of VM migrations. In the static consolidation scenario, our method reduced the number of physical machines by 29.15% to 75% and improved workload performance by up to 32.14% compared with the random consolidation method. In the dynamic consolidation without initial performance constraint, our method reduced the number of physical machines by up to 18.84% and improved the workload performance by up to 22.05%. In the dynamic re-consolidation with initial performance constraint, our method reduced the number of physical machines by up to 28.89% and improved the workload performance by up to 7.69%. Our method also reduced the average number of VM migrations by 59.67% compared with the base migration method, which significantly reduced the migration overhead.

ACKNOWLEDGMENTS

This work is supported by the National High Technology Research 863 Major Program of China (No. 2011AA01A207) and the National Natural Science Foundation of China (No. 61272128).

REFERENCES

[1] The New York Times, “The cloud factories: Power, pollution and the internet.” [Online]. Available: http://www.nytimes.com/2012/09/23/technology/data-centers-waste-vast-amounts-of-energy-belying-industry-image.html

[2] L. Barroso and U. Holzle, “The case for energy-proportional computing,” Computer, vol. 40, no. 12, pp. 33–37, 2007.

[3] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, “Xen and the art of virtualization,” in ACM SIGOPS Operating Systems Review, vol. 37, no. 5. ACM, 2003, pp. 164–177.

[4] C. Clark, K. Fraser, S. Hand, J. Hansen, E. Jul, C. Limpach, I. Pratt, and A. Warfield, “Live migration of virtual machines,” in Proceedings of the 2nd conference on Symposium on Networked Systems Design & Implementation-Volume 2. USENIX Association, 2005, pp. 273–286.

[5] B. Speitkamp and M. Bichler, “A mathematical programming approach for server consolidation problems in virtualized data centers,” Services Computing, IEEE Transactions on, vol. 3, no. 4, pp. 266–278, 2010.

[6] A. Beloglazov and R. Buyya, “Managing overloaded hosts for dynamic consolidation of virtual machines in cloud data centers under quality of service constraints,” Parallel and Distributed Systems, IEEE Transactions on, vol. http://doi.ieeecomputersociety.org/10.1109/TPDS.2012.240, 2012.

[7] T. Ferreto, M. Netto, R. Calheiros, and C. De Rose, “Server consolidation with migration control for virtualized data centers,” Future Generation Computer Systems, vol. 27, no. 8, pp. 1027–1034, 2011.

[8] A. Verma, P. Ahuja, and A. Neogi, “pmapper: power and migration cost aware application placement in virtualized systems,” in Proceedings of the 9th ACM/IFIP/USENIX International Conference on Middleware. Springer-Verlag New York, Inc., 2008, pp. 243–264.

[9] S. Srikantaiah, A. Kansal, and F. Zhao, “Energy aware consolidation for cloud computing,” in Proceedings of the 2008 conference on Power aware computing and systems. USENIX Association, 2008, pp. 10–10.

[10] F. Hermenier, X. Lorca, J. Menaud, G. Muller, and J. Lawall, “Entropy: a consolidation manager for clusters,” in Proceedings of the 2009 ACM SIGPLAN/SIGOPS international conference on Virtual execution environments. ACM, 2009, pp. 41–50.

[11] K. Mills, J. Filliben, and C. Dabrowski, “Comparing vm-placement algorithms for on-demand clouds,” in Cloud Computing Technology and Science (CloudCom), 2011 IEEE Third International Conference on. IEEE, 2011, pp. 91–98.

[12] H. Xu and B. Li, “Anchor: A versatile and efficient framework for resource management in the cloud,” Parallel and Distributed Systems, IEEE Transactions on, vol. 24, no. 6, pp. 1066–1076, 2013.

[13] S. Di and C.-L. Wang, “Dynamic optimization of multi-attribute resource allocation in self-organizing clouds,” Parallel and Distributed Systems, IEEE Transactions on, vol. 24, no. 3, pp. 464–478, 2013.

[14] A. Verma, G. Dasgupta, T. Nayak, P. De, and R. Kothari, “Server workload analysis for power minimization using consolidation,” in Proceedings of the 2009 conference on USENIX Annual technical conference. USENIX Association, 2009, pp. 28–28.

[15] J. Zhan, L. Wang, X. Li, W. Shi, C. Weng, W. Zhang, and X. Zang, “Cost-aware cooperative resource provisioning for heterogeneous workloads in data centers,” Computers, IEEE Transactions on, vol. http://doi.ieeecomputersociety.org/10.1109/TC.2012.103, 2012.

[16] Z. Gong and X. Gu, “Pac: Pattern-driven application consolidation for efficient cloud computing,” Signature, vol. 1, no. 2, pp. 2–2, 2010.

[17] D. Carrera, M. Steinder, I. Whalley, J. Torres, and E. Ayguadé, “Autonomic placement of mixed batch and transactional workloads,” Parallel and Distributed Systems, IEEE Transactions on, vol. 23, no. 2, pp. 219–231, 2012.

[18] Y. Lee and A. Zomaya, “Energy conscious scheduling for distributed computing systems under different operating conditions,” Parallel and Distributed Systems, IEEE Transactions on, vol. 22, no. 8, pp. 1374–1381, 2011.

[19] Y. Lee and A. Zomaya, “Energy efficient utilization of resources in cloud computing systems,” The Journal of Supercomputing, vol. 60, no. 2, pp. 268–280, 2012.

[20] X. Liu, C. Wang, B. Zhou, J. Chen, T. Yang, and A. Zomaya, “Priority-based consolidation of parallel workloads in the cloud,” Parallel and Distributed Systems, IEEE Transactions on, vol. http://doi.ieeecomputersociety.org/10.1109/TPDS.2012.262, 2012.

[21] Y. Koh, R. Knauerhase, P. Brett, M. Bowman, Z. Wen, and C. Pu, “An analysis of performance interference effects in virtual environments,” in Performance Analysis of Systems & Software, 2007. ISPASS 2007. IEEE International Symposium on. IEEE, 2007, pp.

energy modeling for live migration of virtual machines,” in Proceedings of the 20th international symposium on High performance distributed computing. ACM, 2011, pp. 171–182.

[23] P. Apparao, R. Iyer, X. Zhang, D. Newell, and T. Adelmeyer, “Characterization & analysis of a server consolidation benchmark,” in Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments. ACM, 2008, pp. 21–30.

[24] V. Makhija, B. Herndon, P. Smith, L. Roderick, E. Zamost, and J. Anderson, “Vmmark: A scalable benchmark for virtualized systems,” VMware Inc, CA, Tech. Rep. VMware-TR-2006-002, 2006.

[25] SPECvirt_sc2010 Benchmark. [Online]. Available: http://www.spec.org/virt_sc2010/

[26] SPECjbb2005: Java Server Benchmark. [Online]. Available: http://www.spec.org/jbb2005/

[27] IOzone Filesystem Benchmark. [Online]. Available: http://www.iozone.org/

[28] SysBench: a system performance benchmark. [Online]. Available: http://sysbench.sourceforge.net/

[29] WebBench. [Online]. Available: http://cs.uccs.edu/~cs526/webbench/webbench.htm

[30] D. Huang, D. Ye, Q. He, J. Chen, and K. Ye, “Virt-lm: a benchmark for live migration of virtual machine (abstracts only),” ACM SIGMETRICS Performance Evaluation Review, vol. 39, no. 3, pp. 18–18, 2011.

[31] R. E. Burkard and E. Çela, “Linear assignment problems and extensions.”

[32] LINDO Systems: Optimization Software. [Online]. Available: http://www.lindo.com/

Kejiang Ye received his B.Sc. degree in software engineering from Zhejiang University, Hangzhou, China, in 2008. He is currently a Ph.D. candidate in computer science at Zhejiang University and also a visiting Ph.D. student at the University of Sydney, Australia. His research interests include virtualization and cloud computing, performance/energy evaluation, modelling and optimization. He is a student member of IEEE.

Chen Wang received his Ph.D. from Nanjing University. He is a senior research scientist at CSIRO (Commonwealth Scientific and Industrial Research Organisation) ICT Centre, Australia. His research interests are primarily in distributed, parallel and trustworthy systems. His current work focuses on resource management in cloud computing, accountable distributed systems and demand response algorithms in the smart grid.

Zhaohui Wu received his B.Sc. and Ph.D. degrees in computer science from Zhejiang University, Hangzhou, China, in 1988 and 1993, respectively. He is currently a Professor at the Department of Computer Science, Zhejiang University. His research interests include grid computing, service computing, distributed artificial intelligence, and pervasive computing. Professor Wu is a Standing Council Member of the China Computer Federation and is a senior member of IEEE.

Bing Bing Zhou received the B.Sc. degree from Nanjing Institute of Technology, China and the Ph.D. degree in Computer Science from Australian National University. He is currently an associate professor at the University of Sydney. His research interests include parallel/distributed computing, Grid and cloud computing, peer-to-peer systems, parallel algorithms, and bioinformatics. He has a number of publications in leading international journals and conference proceedings. His research has been funded by the Australian Research Council through several Discovery Project grants.

Weisheng Si received the BS, MS, and PhD degrees in computer science from Peking University, University of Virginia, and University of Sydney, respectively. He is now a lecturer in the School of Computing, Engineering, and Mathematics, University of Western Sydney. Prior to this, he was a postdoctoral researcher at National ICT Australia (NICTA). His research interests include routing in wireless networks, graph theory, and green networking. He is a member of the IEEE.

Xiaohong Jiang received her B.Sc. and M.Sc. degrees in computer science from Nanjing University, China and the Ph.D. degree from Zhejiang University, Hangzhou, China, in 2003. She is an associate professor at the Department of Computer Science, Zhejiang University. Her research focuses on distributed systems, virtual environment, cloud computing, and data service.

Albert Y. Zomaya is currently the Chair Professor of High Performance Computing & Networking and Australian Research Council Professorial Fellow in the School of Information Technologies, The University of Sydney. He is also the Director of the Centre for Distributed and High Performance Computing which was established in late 2009.

He is the author/co-author of seven books, more than 450 publications in technical journals and conferences, and the editor of nine books and 11 conference volumes. He is currently the Editor in Chief of the IEEE Trans. on Computers and serves as an associate editor for 20 journals including some of the leading journals in the field. Professor Zomaya was the Chair of the IEEE Technical Committee on Parallel Processing (1999–2003) and currently serves on its executive committee. He also serves on the advisory board of the IEEE Technical Committee on Scalable Computing and the advisory board of the Machine Intelligence Research Labs. Professor Zomaya served as General and Program Chair for more than 60 events and served on the committees of more than 500 ACM and IEEE conferences. He delivered more than 130 keynote addresses, invited seminars and media briefings.

Professor Zomaya is a Fellow of the IEEE, AAAS, the Institution of Engineering and Technology (U.K.), a Distinguished Engineer of the ACM and a Chartered Engineer (CEng). He received the 1997 Edgeworth David Medal from the Royal Society of New South Wales for outstanding contributions to Australian Science. He is also the recipient of the IEEE Computer Society's Meritorious Service Award and Golden Core Recognition in 2000 and 2006, respectively. Also, he received the IEEE TCPP Outstanding Service Award and the IEEE TCSC Medal for Excellence in Scalable Computing, both in 2011. His research interests are in the areas of parallel and distributed computing and complex systems.
