Business-Driven Long-term Capacity Planning
for SaaS Applications
David Candeia, Ricardo Araújo Santos and Raquel Lopes
Abstract—Capacity planning is one of the activities carried out by Information Technology departments over the years. It aims at estimating the amount of resources needed to offer a computing service. This activity contributes to achieving high Quality of Service levels and also to pursuing better economic results for companies. In the Cloud Computing context, one plausible scenario is to have Software-as-a-Service (SaaS) providers that build their IT infrastructure by acquiring resources from Infrastructure-as-a-Service (IaaS) providers. SaaS providers can reduce operational costs and complexity by buying instances from a reservation market, but then need to predict the number of instances needed in the long term. This work investigates how important capacity planning is in this context and how simple business-driven heuristics for long-term capacity planning impact the profit achieved by SaaS providers. Simulation experiments were performed using synthetic e-commerce workloads. Our analysis shows that the proposed heuristics increase SaaS provider profit by 9.6501% per year, on average. Analysing these results, we demonstrate that capacity planning is still an important activity that contributes to increasing SaaS provider profit. Besides, good capacity planning may also avoid a bad reputation due to unacceptable performance, which is a gain that is very hard to measure.
Index Terms—Capacity Planning, Cloud Computing, Software-as-a-Service.
1 INTRODUCTION
INFORMATION Technology (IT) infrastructure management is a discipline that aims at achieving stability and control of an IT infrastructure [1]. IT management is important to meet QoS requirements and to achieve an efficient use of the infrastructure. Decisions made in this discipline have an impact on the business bottom line of the infrastructure owner and, because of this, IT infrastructure management evolved to consider business aspects [1].
Planning the amount of computing resources needed to deliver a computing service (i.e., application) is one of the activities of an IT infrastructure management platform, which is called capacity planning. Before Cloud Computing, capacity planning typically involved over-provisioning of the IT infrastructure [2] as a common solution to deal with workload peaks.
Cloud Computing has brought some novelties: providers offer computing services in different markets; clients can buy computing services and start them quickly. Three main types of services are commonly presented [3]: Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS); all these services are acquired in a pay-per-use manner. In this paper, we consider a scenario in which a SaaS provider is a consumer of one IaaS provider [4].
IaaS providers typically offer several types of virtual instance configurations: different amounts of CPU units, memory and storage. Furthermore, IaaS providers usually offer virtual instances in two different markets, each one with different charging strategies and QoS offered: (i) in the on-demand market, available instances of the IaaS provider are acquired with no long-term commitments by paying a usage fee for each usage charging period (typically an hour). The IaaS provider does not guarantee that the consumer will receive the amount of instances required; (ii) in the reservation market, instances can be reserved for longer future intervals (typically greater than one year) by paying an upfront reservation fee. As the IaaS consumer uses reserved instances, she pays a discounted usage fee (compared to the on-demand usage fee) for each usage charging period. The IaaS provider ensures that reserved instances will be available whenever the IaaS consumer wishes to use them within the reservation interval.
David Candeia is with Instituto Federal de Educação, Ciência e Tecnologia da Paraíba, Campina Grande, Paraíba, Brazil. E-mail: [email protected]. Raquel Lopes and Ricardo Araújo Santos are with Universidade Federal de Campina Grande - UFCG. E-mail: [email protected], [email protected]
Manuscript received May 28, 2014
Cloud Computing also brought some challenges for capacity planning. First, a capacity planner now has access to several types of instances and markets to build the IT infrastructure. With so many options, the capacity planning algorithm may achieve better results, but at the cost of higher complexity. Second, the capacity planner has to deal with the uncertainty of future workload prediction, which is typically a very hard task to accomplish. This task is especially hard for SaaS providers that have contracts that may span from one week to some months and users that may quit whenever they want, as in an open system. It is hard to predict the real usage of the system by each tenant. Finally, it is also a challenge to model the impact that capacity planning decisions may have on the business. Impacts on the QoS of the services offered may lead to service level agreement (SLA) violations, which may lead to penalties and a bad reputation, or even to losing important consumers. All these aspects make capacity planning in the Cloud Computing context a nontrivial problem.
Considering the growing number of applications running in the cloud, whether academic or industrial, it is important to study capacity planning heuristics and strategies. Good capacity planning reserves resources that are expected to be well used in the reservation market, in order to achieve cost reductions, instead of only using the more expensive resources from the on-demand market. For resources with low usage, the on-demand market might be the best choice. Also, IaaS providers ensure that reserved resources are available whenever they are needed, contributing to improving the availability of the SaaS application being offered. Capacity planners can combine resources from the on-demand and reservation markets in order to improve application QoS and reduce costs.
Fig. 1. A workload example
Imagine an application with a varying resource demand, as presented in the curve of Figure 1. It is clear that if 32 instances are reserved, many instances will not be efficiently used, which results in a waste of money. Certainly, the best decision is to reserve between 14 and 18 instances from the reservation market, and to acquire more instances from the on-demand market when needed. Although the rationale behind this idea is apparently very simple, in practice it is difficult to achieve such a good heuristic, mainly because it is hard to predict the real demand of the applications. Besides, there is the risk of having requests for instances not satisfied by the on-demand market.
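The trade-off described above can be illustrated with a minimal sketch. The hourly demand values, the prices and the function name `total_cost` are hypothetical and chosen only for illustration; they are not taken from Figure 1 or from any provider's price list.

```python
def total_cost(demand, reserved, upfront_fee, reserved_hourly, on_demand_hourly):
    """Cost of serving `demand` (instances needed in each hour) when
    `reserved` instances are reserved and the rest comes from on-demand."""
    cost = reserved * upfront_fee  # upfront fee is paid for every reserved instance
    for needed in demand:
        used_reserved = min(needed, reserved)
        used_on_demand = needed - used_reserved
        cost += used_reserved * reserved_hourly + used_on_demand * on_demand_hourly
    return cost


# Hypothetical demand: a base load of 14 to 18 instances and a short peak of 32.
demand = [14] * 60 + [18] * 30 + [32] * 10

# Hypothetical prices; evaluate every reservation level from 0 to 32 instances.
costs = {
    r: total_cost(demand, r, upfront_fee=2.0, reserved_hourly=0.03, on_demand_hourly=0.10)
    for r in range(0, 33)
}
best = min(costs, key=costs.get)
```

Under these assumed numbers, reserving for the peak is more expensive than reserving for the base load and buying the peak on demand, which mirrors the argument in the text.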
This work focuses on capacity planning for long future intervals (e.g., one year), evaluating heuristics that use a workload prediction to define how many instances must be acquired from the reservation market. We propose two heuristics based on concepts from the literature and compare them with four other heuristics, one of them being an optimal solution. Analysing the results, we show the importance of performing capacity planning, the improvements in the SaaS provider profit (mean values of 10.04% and 9.25%) and the small number of SLA violations achieved by the proposed heuristics.
1.1 Contributions and Structure of the Paper
The major contributions of this work are three-fold: (i) we propose a utility model to guide and evaluate capacity planning. Our model considers the pay-as-you-go aspect of Cloud Computing, the revenue of the SaaS provider and the costs related to offering a SaaS application; (ii) we propose two capacity planning heuristics that combine common concepts from the literature, the proposed utility model, instance acquisition from the on-demand and reservation markets, and the usage of different reservation markets. Our focus is on evaluating the combination of these points in both simple proposed heuristics. One of the heuristics also performs a simple evaluation of the possibility of using instances of different types; (iii) we compare the proposed heuristics to four reference heuristics using simulation experiments and a synthetic e-commerce workload based on real-world parameters [5] [6].
The rest of this paper is organized as follows: Section 2 discusses the related work. Section 3 presents the utility model proposed. Section 4 presents the two capacity planning heuristics proposed. Section 5 presents the simulation environment used in our experiments and Section 6 contains the evaluation of proposed heuristics. Finally, Section 7 discusses conclusions and possible future research directions.
2 RELATED WORK
Capacity planning studies have presented, throughout the years, a set of streams in the literature, each with its particular features. Recently, the study of elasticity solutions was highlighted [7]. A first analysis can separate studies into two worlds: reactive and predictive approaches [7]. Reactive approaches act only after a condition is satisfied, while predictive approaches anticipate system load to estimate the amount of necessary resources.
Considering workload prediction, we can distinguish long-term and short-term capacity planning solutions. On the one hand, long-term capacity planning [8], [9], [10], [11] uses workload predictions for a long future interval of time (e.g., a year) and estimates the amount of computing resources needed to deliver the application during that interval. In this work, we assume that long-term capacity planning results can guide the acquisition of instances in the reservation market of an IaaS provider. On the other hand, short-term capacity planning [12], [13], [14], [15], [16] uses workload predictions for managing resources for a short future interval of time (e.g., an hour), optimising the amount that will be acquired or released.
The estimation of the amount of necessary resources can be based on operational metrics or business metrics. Capacity planning based on operational metrics focuses on meeting operational metric targets. These targets can be defined in terms of availability [17], response time [18], CPU usage, power consumption [7] or a combination of such metrics [19].
Using only operational metrics to plan an infrastructure, without considering metrics such as cost and revenue, can lead to an infrastructure configuration that meets operational targets but is economically infeasible. In order to deal with this problem, business-driven IT management solutions [1] aim at combining operational and business metrics in the decision-making. Capacity planning solutions based on business metrics can consider infrastructure costs [20] [21] [14] [15] [10] [11] [16], loss incurred due to consumer defection [8] [9], SLA violations [8] [9] [7] and business profit/revenue [22] [23] [24] [25] [26] when planning IT infrastructures. Being higher-level models, business models enable the merging of operational metrics, such as power consumption, with business metrics [26].
Our study resembles other capacity planning studies as we consider business metrics in the capacity planning. However, some aspects distinguish our study from the others: (i) we captured the profit model of a real SaaS provider using a linear utility function that combines the revenue for the different plans with the different penalties incurred due to SLA violations; (ii) to the best of our knowledge, there are no studies that combine profit evaluation and the capacity planning of SaaS providers using instances acquired from different markets of an IaaS provider; (iii) the proposed heuristics combine this utility model with concepts of resource utilization and Queueing Theory to produce good results. So, our work can be seen as complementary to previous studies.
3 UTILITY MODEL
Utility [27] is a microeconomics concept used to state the preference of agents (i.e., service providers and their consumers), with higher values typically stating greater preference. Agents use such preferences to guide their behavior: they attempt to achieve the outcomes they most prefer. A utility function maps a space of outcomes onto utility values. A utility function can combine different aspects and metrics, simplifying the evaluation and choice process done by agents.
The utility model proposed maps the profit of a SaaS provider, obtained as a result of offering an application, onto a utility value. A capacity planning agent can use this model to build a capacity plan that maximizes the utility value, which translates to maximizing the SaaS provider profit. Our utility model combines three main components: (i) the revenue obtained from charging consumers that use the SaaS application; (ii) the cost of buying instances from an IaaS provider; (iii) penalties related to SLA violations.
3.1 Revenue Model
The utility model proposed considers that SaaS providers can offer one or many plans to their consumers, so that each consumer chooses the plan that best fits its needs and contracts it in order to use the SaaS application. As a result of evaluating current SaaS providers, the revenue model developed aims at covering the main aspects discovered: (i) SaaS consumers are typically charged periodically (e.g., per month or year); (ii) each application has its usage restrictions specified in the plans offered by the provider; (iii) a contract established between a SaaS provider and a SaaS consumer defines the provider's reimbursement rules.
A SaaS provider develops and offers an application $A$ to a set $U$ of SaaS consumers, $U = \{u_1, u_2, \ldots, u_{|U|}\}$. In order to offer this application, the provider builds a portfolio of plans $P = \{p_1, p_2, \ldots, p_{|P|}\}$, where each plan $p_j$ aims at meeting the demand of a specific class of consumers, so it is expected that $|P| < |U|$. Each consumer $u_k \in U$ chooses and signs a contract related to a plan $p_j \in P$ in order to use application $A$.
After signing the contract of plan $p_j$, a consumer $u_k$ can use application $A$ for an interval $[n_k^b, n_k^e]$, where $n_k^e \geq n_k^b$ (for example, if a plan $p_j$ is semiannual then $n_k^e - n_k^b = 6$ months). For simplicity, all plans offered by a SaaS provider are accounted for using the same fixed usage period duration (e.g., one month), and period $n$ with value 1 marks the launch of the application. Also, we consider that new consumers can only enter the system just before a new usage period $n$ starts. As time passes, $n$ is incremented to indicate the current usage period.
By the time $u_k$ signs the contract (i.e., in $n_k^b$), the SaaS provider must configure and deploy application $A$ to serve this specific consumer. To this end, the SaaS consumer $u_k$ can be requested to pay a configuration fee $I_j^b$. This fee depends on the plan $p_j$ contracted. During the term of $p_j$, as well as in subsequent intervals in which the consumer renews the contract, the configuration fee $I_j^b$ is not charged again. The provider will only charge this fee again if the consumer changes the contracted plan. The function $i_k^b : \mathbb{N}^+ \to \{0, I_j^b\}$ given by
$$i_k^b(n) = \begin{cases} I_j^b & \text{if } n = n_k^b \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$
defines whether consumer $u_k$ must pay a configuration fee in a usage period $n$.
In order to use application $A$ during each usage period $n$, the consumer $u_k$ must pay a usage fee $I_j$ to the SaaS provider. This fee should be enough to cover the SaaS provider's costs of acquiring the resources needed to offer application $A$ to consumer $u_k$. The fee $I_j$ depends on the plan $p_j$ contracted by consumer $u_k$. Regardless of whether the consumer remains in her plan or changes to a new one, the fee $I_j$ is always charged. The function $i_k^{us} : \mathbb{N}^+ \to \{0, I_j\}$ given by
$$i_k^{us}(n) = \begin{cases} I_j + e_{j,n} & \text{if } n_k^b \leq n \leq n_k^e \\ 0 & \text{otherwise} \end{cases} \qquad (2)$$
defines the usage fee paid by a consumer $u_k$ in a usage period $n$, where $e_{j,n}$ is the extra fee described below.
Each plan $p_j$ defines the set of resources that can be used by a SaaS consumer while using application $A$. For example, it is common that during a usage period $n$ a consumer $u_k$ uses a certain amount of storage and transfers a certain amount of data over the network. Each plan $p_j$ defines limits for each computing resource a consumer can use during a usage period $n$. If during a usage period $n$ a consumer $u_k$ exceeds such limits, the SaaS provider charges $u_k$ an extra fee $e_{j,n}$. This extra fee is proportional to the amount of extra resources used by the consumer.
Finally, a plan $p_j$ is associated with a service level agreement (SLA), represented here as $SLA_j$. For simplicity, an SLA is defined by the tuple $\langle A_{MIN}, T_{MAX} \rangle$, whose values must hold for each usage period $n$. $A_{MIN}$ represents the lowest required availability for application $A$ and $T_{MAX}$ represents a response time percentile accepted for request processing (e.g., 95% of requests must be processed within 8 seconds). According to SaaS provider business evaluations, it may be feasible to define an SLA for each plan $p_j$ in order to offer a higher quality of service to plans that contribute more to the business. Furthermore, if the SaaS provider violates $SLA_j$, a function $M_j(n)$ indicates, for a certain usage period $n$, the penalty that the provider must pay to the corresponding consumer. The value of the penalty is proportional to the intensity of the violation and may be defined differently for each plan offered. The payment of penalties is included in the cost model presented in Section 3.2.
Given the above aspects of plans, it is necessary to define how the SaaS provider charges each consumer $u_k$. During a usage period $n$ the SaaS provider must account for the resource consumption of each consumer $u_k$. With this accounting, the provider calculates the amount of resources used within the plan limits and the amount of extra resources used.
The revenue obtained by a SaaS provider from the payment of a consumer $u_k$, in any period $n$, is given as a combination of usage and configuration fees:
$$i_k(n) = i_k^b(n) + i_k^{us}(n) \qquad (3)$$
Evaluating the set $U$ of consumers that contracted application $A$, we can calculate the total revenue obtained by a SaaS provider in a usage period $n$ as:
$$i(n) = \sum_{k=1,\, u_k \in U}^{|U|} i_k(n) \qquad (4)$$
The revenue obtained by a SaaS provider during an interval $D$ of usage periods, where $D = [n^b, n^e]$ and $n^e \geq n^b$, is given by the function $\iota : [n^b, n^e] \to \mathbb{R}^+$, where $n^b, n^e \in \mathbb{N}^+$:
$$\iota(D) = \sum_{n=n^b}^{n^e} i(n) \qquad (5)$$
We can use the revenue model presented above to calculate the revenue obtained by a SaaS provider in a past interval $D$, or even to estimate the revenue in a future interval $D$. In the latter case, it is necessary to characterize the future set $P$ of SaaS provider plans and to estimate the future workload that will be submitted by a set $U$ of estimated future consumers.
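The revenue model of Equations (1) to (5) can be sketched in a few lines of Python. This is only an illustrative reading of the equations: the class names `Plan` and `Consumer`, the field names and all fee values are hypothetical, and the extra fee $e_{j,n}$ is assumed to be given per period rather than derived from resource accounting.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    usage_fee: float   # I_j, charged in every usage period of the contract
    config_fee: float  # I_j^b, charged once when the contract starts


@dataclass
class Consumer:
    plan: Plan
    start: int         # n_k^b, first usage period of the contract
    end: int           # n_k^e, last usage period of the contract
    extra_fees: dict   # period n -> extra fee e_{j,n} (assumed precomputed)


def consumer_revenue(c, n):
    """i_k(n) = i_k^b(n) + i_k^us(n), following Equations (1) to (3)."""
    config = c.plan.config_fee if n == c.start else 0.0
    usage = (c.plan.usage_fee + c.extra_fees.get(n, 0.0)) if c.start <= n <= c.end else 0.0
    return config + usage


def period_revenue(consumers, n):
    """i(n): total revenue in usage period n, Equation (4)."""
    return sum(consumer_revenue(c, n) for c in consumers)


def interval_revenue(consumers, n_b, n_e):
    """iota(D) for D = [n_b, n_e], Equation (5)."""
    return sum(period_revenue(consumers, n) for n in range(n_b, n_e + 1))


# Hypothetical plans and consumers.
basic = Plan(usage_fee=10.0, config_fee=5.0)
gold = Plan(usage_fee=30.0, config_fee=20.0)
consumers = [
    Consumer(basic, start=1, end=3, extra_fees={}),
    Consumer(gold, start=2, end=3, extra_fees={3: 4.0}),
]
revenue = interval_revenue(consumers, 1, 3)
```

In this example the first consumer contributes 15 + 10 + 10 and the second contributes 0 + 50 + 34 over the three periods.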
3.2 Cost Model
As a SaaS provider acquires computing resources from an IaaS provider to build its IT infrastructure, we consider that the following costs exist: (i) costs related to using acquired resources; (ii) costs related to reserving resources. Besides these costs, a SaaS provider spends money as a result of SLA violations, which may lead to the payment of penalties to SaaS consumers.
Each IaaS provider offers a set $O$ of resource classes. These resource classes can be, for example: (i) virtual instances; (ii) storage resources; (iii) data transfer resources. Each resource class $o \in O$ defines a set $S_o$ of resource types offered in this class. Each resource type $s \in S_o$ is associated with a usage cost $c_s$, which is the price of one minimal charge unit of the type. For example, considering the Amazon EC2¹ service, a small instance ($s = small$ and $o = $ virtual instances) has a usage cost $c_{small} = \$0.06$² for each hour of usage. We consider that all resources from the same class $o$ are charged according to the same minimal charge unit (e.g., for each hour).
The IaaS provider has an accounting system that registers, for each period $n$ and for each SaaS provider, the amount of resources used, as well as their types. This system is then queried to report total resource consumption. For each resource type $s$ and each period $n$, counters $a_s^n$ are incremented every time the SaaS provider uses a resource of type $s$ within period $n$. For example, suppose that during the first accounting period, $n = 1$, a large instance ($s = large$) has been used for 10 hours. In this scenario, the counter $a_{large}^1$ would be 10.
The cost of the SaaS provider associated with IaaS resource usage in a period $n$, whether obtained in the on-demand or reservation markets, is defined by the function $c^a : \mathbb{N}^+ \to \mathbb{R}^+$ given by:
$$c^a(n) = \sum_{o \in O} \left[ \sum_{s \in S_o} a_s^n \cdot c_s \right] \qquad (6)$$
Even the use of reserved resources can be accounted for in the equation above, since those resources are related to a type $s$ and a usage cost $c_s$ representing the fees charged in the reservation market.
Besides the costs related to resource usage, the SaaS provider has another cost related to reserving resources in advance from the IaaS provider. A reservation of resources of type $s \in S_o$ is always associated with an amount of resources reserved ($r_s$), an upfront reservation fee ($f_s$) and the interval in which such resources will be available for use. Thus, we can define a reservation contract as:
$$V = \langle o, s, r_s, f_s, n_s^b, n_s^e \rangle$$
where $n_s^b$ and $n_s^e$ indicate, respectively, the period at which resources become available and the time limit to use such resources. It is noteworthy that the interval $[n_s^b, n_s^e]$ should be defined considering the periods in which the SaaS provider will be using resources to offer its application. The set $\gamma$ represents the set of reservation contracts established between the SaaS provider and the IaaS provider. It is important to remember that current IaaS providers only offer the possibility of reserving processing resources (i.e., virtual instances), but the cost model proposed here is flexible enough to consider other classes of resources that may be available for reservation in the future.
¹ http://aws.amazon.com/ec2/previous-generation/
² Amazon EC2 service values in 2013.
Upfront reservation fees paid by the SaaS provider can be amortized over the interval $[n_s^b, n_s^e]$. Thus, each period $n$ has a cost component related to the amortization of the reservation contracts defined in $\gamma$. This cost component can be calculated using the function $c^v : \mathbb{N}^+ \to \mathbb{R}^+$ given by:
$$c^v(n) = \begin{cases} \sum_{\langle o, s, r_s, f_s, n_s^b, n_s^e \rangle \in \gamma} \dfrac{f_s \cdot r_s}{n_s^e - n_s^b} & \text{if } n_s^b \leq n \leq n_s^e \\ 0 & \text{otherwise} \end{cases} \qquad (7)$$
Having defined the cost components related to resource usage, $c^a(n)$, and resource reservations, $c^v(n)$, the total cost of a SaaS provider in a period $n$ can be calculated using the function $c : \mathbb{N}^+ \to \mathbb{R}^+$ given by:
$$c(n) = c^a(n) + c^v(n) + p(n) \qquad (8)$$
where $p(n) = \sum_{u_k \in U} M_j(n)$ represents all penalties paid by a SaaS provider to its consumers in a period $n$. A provider must pay a penalty to a consumer $u_k$ whenever $SLA_j$, established in the contracted plan $p_j$, is violated. An SLA violation, as mentioned in Section 3.1, is related to availability or response time violations, according to the restrictions established in the plan $p_j$ contracted by the consumer. The function $M_j(n)$ can be used to model different penalty values according to violation intervals or to model single penalty values.
Finally, it is possible to evaluate the total cost of a SaaS provider in an interval $D$, where $D = [n^b, n^e]$ and $n^e \geq n^b$, using the function $\alpha : [n^b, n^e] \to \mathbb{R}^+$, where $n^b, n^e \in \mathbb{N}^+$:
$$\alpha(D) = \sum_{n=n^b}^{n^e} c(n) \qquad (9)$$
We can use the cost model presented here to calculate a SaaS provider cost in a past interval $D$, or even to estimate the cost in a future interval $D$. In the latter case, it is necessary to estimate the future workload and the usage of resources from an IaaS provider in each period $n$.
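As a rough illustration of Equations (6) to (9), the cost model can be sketched as follows. The function names, the flattening of resource classes into a single dictionary of types, the contract layout `(r_s, f_s, n_b, n_e)` and all prices are assumptions made for this example; penalties $M_j(n)$ are assumed to be supplied as already computed values.

```python
def usage_cost(counters, unit_costs):
    """c^a(n): counters maps each resource type s to its counter a_s^n."""
    return sum(units * unit_costs[s] for s, units in counters.items())


def reservation_cost(contracts, n):
    """c^v(n): amortized upfront fees of the contracts active in period n.
    Each contract is (r_s, f_s, n_b, n_e) and needs n_e > n_b, because
    Equation (7) divides by n_e - n_b."""
    return sum(
        f_s * r_s / (n_e - n_b)
        for (r_s, f_s, n_b, n_e) in contracts
        if n_b <= n <= n_e
    )


def period_cost(counters, unit_costs, contracts, penalties, n):
    """c(n) = c^a(n) + c^v(n) + p(n), Equation (8)."""
    return usage_cost(counters, unit_costs) + reservation_cost(contracts, n) + sum(penalties)


def interval_cost(periods, unit_costs, contracts):
    """alpha(D): periods maps n -> (counters, penalties), Equation (9)."""
    return sum(
        period_cost(counters, unit_costs, contracts, penalties, n)
        for n, (counters, penalties) in periods.items()
    )


# Hypothetical data: one large instance type priced at 0.25 per hour, and one
# contract reserving 2 instances for an upfront fee of 60 each over periods 1..13.
unit_costs = {"large": 0.25}
contracts = [(2, 60.0, 1, 13)]
periods = {
    1: ({"large": 10}, [1.5]),  # 10 hours of usage, one penalty of 1.5
    2: ({"large": 4}, []),      # 4 hours of usage, no SLA violation
}
total = interval_cost(periods, unit_costs, contracts)
```

With these numbers the amortized reservation component is 60 * 2 / 12 = 10 per period, so the two periods cost 14 and 11 respectively.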
3.3 Utility Model
The utility function³ proposed in this work is defined in terms of the profit achieved by a SaaS provider. Once the revenue model and the cost model are defined, the utility function of a SaaS provider in an interval $D$, where $D = [n^b, n^e]$ and $n^e \geq n^b$, is defined by the function $\upsilon : [n^b, n^e] \to \mathbb{R}$, where $n^b, n^e \in \mathbb{N}^+$:
$$\upsilon(D) = \iota(D) - \alpha(D) \qquad (10)$$
We can use this utility function to evaluate the utility obtained by a SaaS provider in a past interval D, and also to estimate the future utility of a SaaS provider (in a business-driven capacity planning). During the capacity planning, the function υ allows an agent to estimate SaaS provider utility over a set of possible capacity plans and select the most beneficial plan to the business.
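The selection step described above amounts to computing Equation (10) for every candidate plan and keeping the maximum. A minimal sketch, in which the plan names and the estimated revenue and cost values are purely hypothetical:

```python
def utility(revenue, cost):
    """upsilon(D) = iota(D) - alpha(D), Equation (10)."""
    return revenue - cost


def best_plan(candidates):
    """candidates maps a plan name to (estimated revenue, estimated cost)."""
    return max(candidates, key=lambda name: utility(*candidates[name]))


# Hypothetical estimates for three capacity plans over the same interval D.
candidates = {
    "reserve_10": (1000.0, 700.0),
    "reserve_15": (1000.0, 640.0),
    "reserve_30": (1000.0, 820.0),
}
chosen = best_plan(candidates)
```

Because the utility can be negative, a plan with the highest revenue is not necessarily the one selected.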
4 CAPACITY PLANNING HEURISTICS
We propose two capacity planning heuristics: (i) one based on instance utilization (UT); and (ii) one based on Queueing Theory (QN). Both heuristics receive as input the prediction of a future workload for a time interval $D$, where $D$ is the interval being planned. This prediction can be obtained from historical data of the SaaS application execution. Both heuristics consider more than one reservation market, each market offering a better cost depending on how much the reserved instances are used. Both heuristics use the utility model and the workload prediction to produce a capacity plan indicating the amount and type of instances to reserve in each reservation market.
4.1 Heuristic based on Instance Utilization - UT
This heuristic focuses on evaluating a trace of instance usage. The trace indicates the amount of instances used and the corresponding amount of hours during which these instances were used (e.g., a SaaS provider used 19 instances of type small for 1000 hours and 5 instances of type small for 1500 hours). This trace must be consistent with the future workload prediction. UT uses this trace as input of the algorithm presented in Algorithm 1.
If a trace of instance usage does not exist, one can be produced by simulating the processing of the predicted workload. We consider that this simulation uses a workload prediction composed of request arrival times and processing demands. The processing demand estimation considers a base instance. In the simulation, a Dynamic Provisioning System (DPS) periodically (i.e., hourly) acquires on-demand instances from an IaaS provider. The simulated DPS is based on the behavior of a real DPS. We assume that the DPS chooses the correct type and amount of instances in order to meet the SLAs established between the SaaS provider and its clients. After simulation, for example, a trace may indicate that 20 instances were used for 300 hours and 30 instances were used for 5000 hours.
³ A more detailed version of the utility model can be found at http://www.lsd.ufcg.edu.br/~davidcmm/utilitymodel
Once we have a trace of instance usage, we have to adapt it for Algorithm 1. We consider that if 2 instances were used for 20 hours and 3 instances were used for 20 hours, then, in fact, 2 instances were used for 40 hours. When 3 instances are acquired to process the workload, we consider that we can keep the 2 instances previously acquired and add another instance to meet the workload demand. In this example, Algorithm 1 would receive two tuples indicating instance usage: ⟨40, 2⟩, indicating that 2 instances were used for 40 hours, and ⟨20, 3⟩, indicating that 3 instances were used for 20 hours.
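The adaptation step can be sketched as a cumulative sum over the raw trace, from the largest amount of instances down to the smallest. The function name `adapt_trace` and the tuple layout `(hours, amount)` are choices made for this example only.

```python
def adapt_trace(raw_trace):
    """raw_trace: list of (hours, amount), meaning that exactly `amount`
    instances were in use for `hours` hours.
    Returns (hours, amount) tuples sorted in ascending order of amount, where
    hours counts every hour in which at least `amount` instances were in use."""
    hours_by_amount = {}
    for hours, amount in raw_trace:
        hours_by_amount[amount] = hours_by_amount.get(amount, 0) + hours

    adapted = []
    cumulative = 0
    # Walk from the largest amount down, so each level inherits the hours
    # of all larger levels.
    for amount in sorted(hours_by_amount, reverse=True):
        cumulative += hours_by_amount[amount]
        adapted.append((cumulative, amount))
    adapted.reverse()
    return adapted


adapted = adapt_trace([(20, 2), (20, 3)])
```

For the example in the text this yields the tuples ⟨40, 2⟩ and ⟨20, 3⟩, already in the ascending order of amount that Algorithm 1 expects.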
UT uses the cost model proposed in Section 3.2 and the instance usage trace to plan the infrastructure. For each instance type $s$ and each reservation market, UT calculates the minimal utilization rate that makes a reserved instance cheaper than an on-demand instance (line 3). A utilization rate represents the percentage of a time interval (e.g., 50% of the reservation interval) in which the instance is used. This rate is calculated based on the usage costs ($c_s$) of the on-demand and reservation markets and on the reservation fee ($f_s$). For example, UT can find that a small instance should be used for at least 50% of the reservation interval in reservationMarket1 in order to be cheaper than an on-demand small instance. Also, UT can find that a small instance should have a utilization rate of at least 70% in reservationMarket2. Using this information, UT sorts reservation markets from the one with the lowest minimal utilization rate to the one with the highest (line 4).
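The text does not state the formula for the minimal utilization rate. One plausible reading, assumed here, is the break-even point at which the upfront fee plus the discounted usage fee equals the on-demand cost for the same number of hours, that is, $u \geq f_s / (H \cdot (c_s^{od} - c_s^{res}))$ for a reservation interval of $H$ hours. The sketch below uses that assumption; the function name, the market parameters and all prices are hypothetical.

```python
def minimal_utilization(upfront_fee, reserved_hourly, on_demand_hourly, interval_hours):
    """Break-even utilization rate (assumed formula, not taken from the paper):
    upfront_fee + u * H * reserved_hourly = u * H * on_demand_hourly."""
    saving_per_hour = on_demand_hourly - reserved_hourly
    if saving_per_hour <= 0:
        return float("inf")  # reserving never pays off in this market
    return upfront_fee / (saving_per_hour * interval_hours)


# Hypothetical markets for one instance type over a 1000-hour interval.
markets = {
    "reservationMarket2": {"upfront_fee": 35.0, "reserved_hourly": 0.05},
    "reservationMarket1": {"upfront_fee": 20.0, "reserved_hourly": 0.06},
}
rates = {
    name: minimal_utilization(m["upfront_fee"], m["reserved_hourly"], 0.10, 1000)
    for name, m in markets.items()
}
# Sort markets from the lowest to the highest minimal utilization rate.
sorted_markets = sorted(rates, key=rates.get)
```

With these assumed prices the two markets break even at roughly 50% and 70% utilization, matching the values used in the example in the text.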
In the next step, UT calculates, for each instance type $s$ and amount of instances used (amount), obtained from the trace, the average utilization rate of such instances (line 7). UT looks for the reservation market with the greatest minimal utilization rate that is lower than or equal to the average instance utilization rate (lines 8 to 12). This market is selected as bestMarket (line 10) and will be used to reserve instances. For example, suppose that reservationMarket1 has a minimal utilization rate of 50% and reservationMarket2 has a minimal utilization rate of 70%. Also, suppose that the average utilization rate for 10 instances of type small is 90%. Analysing these values, UT reserves these instances in reservationMarket2 for the interval being planned.
After choosing the market that offers the best costs for reserving amount instances of type $s$, UT evaluates the amount of instances to reserve. Instances of the same type can be reserved in different reservation markets. To consider this, UT calculates the amount of instances of type $s$ to reserve in bestMarket as the difference between the current amount of instances being evaluated (amount) and the total amount of instances of type $s$ already reserved (lines 13 and 14). After evaluating the whole set of instance usage data, UT has a capacity plan (capacityPlan) containing the type and amount of instances to reserve in each reservation market (line 17).
Algorithm 1. UT reservation algorithm.
1: function UTRESERVATION
Input: Sets ($cons_s$) containing tuples ⟨usage, amount⟩ indicating the amount of hours used by each amount of instances of type $s$ acquired. Tuples are sorted in ascending order of amount of instances used.
Input: A set (reservationMarkets) containing the reservation markets that can be used to reserve instances.
Output: UT returns a capacity plan (capacityPlan) containing the type, amount of instances to be reserved and reservation markets to be used.
2:   for all $s$ in $type_1, type_2, \ldots, type_n$ do
3:     Calculate the minimal utilization rate ($minimalUtilization_s^{market}$) for each market in reservationMarkets
4:     Sort reservationMarkets, in ascending order, according to $minimalUtilization_s^{market}$
5:     totalReserved ← 0
6:     for all ⟨usage, amount⟩ in $cons_s$ do
7:       instancesUtilization ← usage / (planning interval length in hours)
8:       for all market in reservationMarkets do
9:         if instancesUtilization ≥ $minimalUtilization_s^{market}$ then
10:          bestMarket ← market
11:        end if
12:      end for
13:      capacityPlan[bestMarket][s] += amount − totalReserved
14:      totalReserved += amount − totalReserved
15:    end for
16:  end for
17:  return capacityPlan
18: end function
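Algorithm 1 can be transcribed into Python roughly as follows. This is a sketch under stated assumptions rather than the authors' implementation: the minimal utilization rates are taken as precomputed inputs, the data layout is invented for the example, and tuples for which no market qualifies are skipped (left to the on-demand market), a case that Algorithm 1 does not spell out.

```python
from collections import defaultdict


def ut_reservation(cons, minimal_utilization, interval_hours):
    """cons: instance type -> list of (usage_hours, amount), ascending by amount.
    minimal_utilization: instance type -> {market name: minimal utilization rate}.
    Returns {market name: {instance type: amount of instances to reserve}}."""
    capacity_plan = defaultdict(lambda: defaultdict(int))
    for s, usage_tuples in cons.items():
        rates = minimal_utilization[s]
        # Lines 3-4: markets in ascending order of minimal utilization rate.
        markets = sorted(rates, key=rates.get)
        total_reserved = 0
        for usage, amount in usage_tuples:
            utilization = usage / interval_hours          # line 7
            best_market = None
            for market in markets:                        # lines 8-12
                if utilization >= rates[market]:
                    best_market = market                  # last match has the highest rate
            if best_market is None:
                continue  # assumption: not worth reserving, use on-demand instead
            capacity_plan[best_market][s] += amount - total_reserved  # line 13
            total_reserved = amount                       # line 14
    return {market: dict(plan) for market, plan in capacity_plan.items()}


# Hypothetical input, reusing the 50% and 70% rates from the text.
rates = {"small": {"reservationMarket1": 0.5, "reservationMarket2": 0.7}}
cons = {"small": [(900, 10), (600, 14), (200, 20)]}
plan = ut_reservation(cons, rates, interval_hours=1000)
```

In this example the first 10 instances (90% utilization) go to reservationMarket2, the next 4 (60% utilization) go to reservationMarket1, and the remaining 6 (20% utilization) are not reserved.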
4.2 Heuristic based on Queue Networks - QN
The QN heuristic uses Queueing Theory concepts [28] such as mean arrival rate and mean service time. These concepts are used to model the IT infrastructure that will be used to process the workload. We consider that instances are used to process requests and that queues are formed according to the workload submitted. The steps of the algorithm are presented in Algorithm 2.
Instead of using information about each request to be submitted in the predicted workload, QN uses a workload summary. This summary contains estimates for each hour of the future workload. Each hourly estimate is composed of: mean request arrival rate ($\bar{\lambda}$); mean request service time ($\bar{S}$); mean number of users ($N$); mean user think time ($Z$); instance utilization rate target ($\rho$). The utilization rate target $\rho$ represents the maximum utilization expected for an instance, for example, a maximum utilization of 70%. The workload summary can be estimated considering historical workload traces and workload growth estimates.
Using the workload summary (especially $\bar{\lambda}$ and $\bar{S}$), QN estimates the total CPU demand ($T$) needed to process the future workload (line 2). Also, QN considers the cost model proposed in Section 3.2 to find the minimal utilization rate ($minimalUtilization_s^{market}$) that makes a reserved instance cheaper than an on-demand instance. To do this, QN relates the usage costs ($c_s$) of the on-demand and reservation markets and the reservation fee ($f_s$) to find $minimalUtilization_s^{market}$ (line 5).
Using T and $minimalUtilization_s^{market}$, QN calculates the largest number of reserved instances of type s that could be used to process the workload (lines 6 to 10). This value limits the number of instances in the plans that will be evaluated. For example, suppose that for a certain workload $MAX_{small} = 10$ and $MAX_{large} = 3$. QN evaluates all 44 capacity plans resulting from the combinations of 0 to 10 small instances and 0 to 3 large instances (11 × 4 = 44).
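The enumeration of candidate plans in this example can be sketched in a few lines of Python. This is only an illustration: the function name `build_possible_plans` and the dictionary representation of a plan are assumptions of this sketch, not part of the simulator described in the paper.

```python
from itertools import product


def build_possible_plans(max_per_type):
    """Enumerate every combination of reserved-instance counts.

    max_per_type maps an instance type to its upper bound MAX_s.
    Each plan is a dict {type: amount} with 0 <= amount <= MAX_s.
    (Hypothetical helper, used only for illustration.)
    """
    types = sorted(max_per_type)
    ranges = [range(max_per_type[t] + 1) for t in types]
    return [dict(zip(types, combo)) for combo in product(*ranges)]


# MAX_small = 10 and MAX_large = 3, as in the example above.
plans = build_possible_plans({"small": 10, "large": 3})
print(len(plans))  # 11 * 4 = 44
```

Since the number of plans is the product of (MAX_s + 1) over all instance types, the search space grows quickly when more types or larger bounds are considered.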
After choosing the plans to evaluate, QN estimates the utility of each of these plans (lines 14 to 33). For each hour of the predicted workload, QN determines the number of instances to be used from the on-demand and reservation markets. First, QN distributes arriving requests (according to the mean arrival rate $\bar{\lambda}$) among reserved instances, calculating the number of incoming requests that can be processed without violating the response time established in the SLA and without exceeding the instance utilization rate target ρ (lines 17 to 20).
If not all requests can be processed using reserved instances, QN assumes that on-demand instances can be acquired. The throughput of these instances is used to find the number of on-demand instances needed to process the remaining requests (lines 21 and 22). QN also considers a risk (line 23) that the on-demand market denies service (i.e., cannot provide the number of on-demand instances needed). In this case, some requests are not processed and the SLA might be violated (line 24). We consider that the on-demand market can deny service for two reasons: (i) the SaaS provider has reached the limit of instances that can be acquired from the IaaS provider; (ii) the IaaS provider does not have enough instances to offer (see footnote 4).
After estimating the amount of instances to be used from the on-demand and reservation markets, QN looks for the best reservation market to buy such instances (lines 26 to 28). QN calculates the cost of acquiring reserved instances in each reservation market using the cost model proposed in Section 3.2. Then, QN chooses the market that gives the lowest cost.
Finally, QN calculates an estimated utility value for the capacity plan being evaluated using the utility model presented in Section 3.3 (line 29). After estimating the utility of this plan, QN checks whether it is the plan with the greatest utility so far (lines 30 to 32) in order to return it as the plan to be used in the infrastructure (line 34).
Footnote 4: Such risk of instance denial is real for current IaaS players, as can be seen, for example, at Amazon AWS: http://aws.amazon.com/ec2/purchasing-options/
Algorithm 2. QN reservation algorithm.
1: function QNRESERVATION
Input: A set (reservationMarkets) containing the reservation markets that can be used to reserve instances
Input: A summary (predictedWorkload) of the predicted workload for an interval D containing, for each hour of the predicted workload: $\bar{\lambda}$, $\bar{S}$, N, Z
Input: Instances utilization target: ρ
Output: QN returns a capacity plan (capacityPlan) containing the type, amount of instances to be reserved and reservation markets to be used.
2:   $T = \sum_{m=1}^{\text{planning interval hours}} \bar{S}_m \cdot \bar{\lambda}_m$
3:   for all market in reservationMarkets do
4:     for all s in $type_1, type_2, \ldots, type_n$ do
5:       Calculates $minimalUtilization_s^{market}$ based on $c_s^{on\text{-}demand}$, $c_s^{market}$, $f_s$
6:       $demand_s^{market} \leftarrow \lfloor T / (minimalUtilization_s^{market} \cdot \text{planning interval hours}) \rfloor$
7:     end for
8:     if $demand_s^{market} \geq MAX_s$ then
9:       $MAX_s \leftarrow demand_s^{market}$
10:    end if
11:  end for
12:  possiblePlans ← builds all possible capacity plans with amount of instances from 0 to $MAX_s$
13:  capacityPlan ← null
14:  for all plan ∈ possiblePlans do
15:    utility[plan] ← 0
16:    for all hour in predictedWorkload do
17:      for all s in $type_1, type_2, \ldots, type_n$ do
18:        $resReq_s$ ← the amount of requests processed using reserved instances (instance utilization limited to ρ)
19:        $reservedHours_s$ += $\lceil resReq_s \cdot \bar{S}_m \rceil$
20:      end for
21:      onDemReq ← the amount of requests processed using on-demand instances
22:      onDemandHours += $\lceil onDemReq \cdot \bar{S}_m \rceil$
23:      notProcessed += the amount of requests not processed in current hour
24:      violations += the amount of requests that violated the SLA in current hour
25:    end for
26:    for all s in $type_1, type_2, \ldots, type_n$ do
27:      Choose market ∈ markets that gives the lowest cost for $reservedHours_s$
28:    end for
29:    utility[plan] = estimateReceipt() − estimateCost(reservedHours, onDemandHours) − estimatePenalties(notProcessed, violations)
30:    if utility[plan] ≥ utility[capacityPlan] then
31:      capacityPlan ← plan
32:    end if
33:  end for
34:  return capacityPlan
35: end function
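Lines 2 and 6 of Algorithm 2 can be illustrated with the following minimal Python sketch. It assumes that the mean service time is expressed in hours per request and the mean arrival rate in requests per hour, so that T is measured in instance-hours; the units, the function names and the sample workload are assumptions of this sketch. The utilization thresholds of 75% and 28% are the values reported later in the paper for the heavy and light markets.

```python
import math


def total_cpu_demand(summary):
    """Line 2: T = sum over hours of mean service time * mean arrival rate.

    summary is a list of (mean_service_time_h, arrival_rate_per_h) tuples,
    one tuple per hour of the planning interval (assumed units).
    """
    return sum(service * rate for service, rate in summary)


def max_reserved(total_demand, minimal_utilization, planning_hours):
    """Line 6: floor(T / (minimalUtilization * planning interval hours))."""
    return math.floor(total_demand / (minimal_utilization * planning_hours))


# Hypothetical flat workload: 1800 requests per hour, 7.2 s (0.002 h) each,
# over a planning interval of one year (8760 hours).
hours = 8760
summary = [(0.002, 1800)] * hours
T = total_cpu_demand(summary)

print(max_reserved(T, 0.75, hours))  # upper bound for a 75% threshold
print(max_reserved(T, 0.28, hours))  # upper bound for a 28% threshold
```

A lower minimal utilization yields a larger upper bound, because instances that are used less intensively still pay off in that market.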
5 SIMULATOR
5.1 Simulation Model
The proposed heuristics were evaluated through simulation experiments. Existing simulators, such as CloudSim (footnote 5), were not used for two reasons: (i) the difficulty of adapting them to support the proposed utility model (Section 3); (ii) the amount of detail they include that is not the focus of this work (e.g., virtual machine allocation models and energy consumption models). Instead, we developed an extension (footnote 6) of the SaaSim framework (footnote 7) [29], following the Verification & Validation techniques proposed by [30].
Our simulation model considers a SaaS provider offering one application to its consumers. The SaaS provider's IT infrastructure is composed of virtual instances acquired from an IaaS provider. The simulation has two main phases: (i) capacity planning, in which one heuristic is used to build a capacity plan; and (ii) workload execution, in which the workload is processed using the instances reserved in the capacity planning phase and any extra on-demand instances that might be acquired.
In the first phase, a capacity planning heuristic considers a workload prediction for a future interval D to build the capacity plan. A perfect workload predictor might not be available in a real scenario, so to model the predictor precision we consider a prediction error related to the number of SaaS clients submitting requests to a SaaS provider. For example, an error of 10% means that if the real workload is composed of 100 clients, the predictor estimates a workload composed of 110 clients. On the other hand, an error of −10% means that if the real workload is composed of 100 clients, the predictor estimates a workload composed of 90 clients.
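The prediction error described above amounts to scaling the real number of clients. A minimal sketch follows, assuming the error is given as a signed fraction; the function name is hypothetical.

```python
def predicted_clients(real_clients, error):
    """Number of clients seen by the predictor for a signed relative error.

    error = 0.10 overestimates by 10%, error = -0.10 underestimates by 10%.
    """
    return round(real_clients * (1 + error))


print(predicted_clients(100, 0.10))   # 110
print(predicted_clients(100, -0.10))  # 90
```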
The second phase aims at evaluating the capacity planning performed in the first phase. We simulate workload processing using reserved and on-demand instances acquired from an IaaS provider. At the end of the simulation, we calculate the utility obtained by the SaaS provider as a result of using the capacity plan produced. It is important to remember that the workload of each SaaS consumer is an aggregation of requests submitted by end users.
We consider that, as requests arrive to be processed, a weighted round-robin load balancer distributes them among the available instances. The round-robin policy used considers the number of virtual CPUs in each instance and distributes requests proportionally to that number (i.e., an instance with 2 CPUs receives twice as many requests as an instance with one CPU).
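One simple way to obtain such a proportional distribution is sketched below. It is not the SaaSim implementation: the instance names are hypothetical, and the schedule simply repeats each instance once per virtual CPU, whereas smoother interleavings are possible.

```python
from itertools import cycle


def weighted_round_robin(instances):
    """Return an endless iterator of instance names, weighted by vCPU count.

    instances maps a (hypothetical) instance name to its number of vCPUs.
    """
    schedule = [name for name, cpus in instances.items() for _ in range(cpus)]
    return cycle(schedule)


balancer = weighted_round_robin({"small-1": 1, "large-1": 2})
assigned = [next(balancer) for _ in range(6)]
print(assigned.count("large-1"), assigned.count("small-1"))  # 4 2
```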
Footnote 5: http://www.cloudbus.org/cloudsim/
Footnote 6: Available at http://code.google.com/p/saasim-david
Footnote 7: Available at http://github.com/ricardoas/saasim
Fig. 2. System Model: general view of queues and request processing
Each instance processes requests according to a consolidated model [31]. An instance can process m requests in parallel, controlled by a set of m tokens that represent available threads (Figure 2). As a request arrives, it acquires a token and enters the processing queue. If no tokens are available, the request waits in a backlog queue, which follows a first-come first-served (FCFS) policy, until a token is available. If the backlog is full, the request is discarded. The processing queue follows a time-sharing policy (see footnote 8).
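The admission logic of this model (tokens, FCFS backlog, discard when full) can be sketched as follows. The class and method names are assumptions of this sketch, and the time-sharing of the CPU among requests in the processing queue is deliberately left out.

```python
from collections import deque


class Instance:
    """Admission control of one instance: m tokens plus a bounded backlog."""

    def __init__(self, tokens, backlog_size):
        self.free_tokens = tokens          # available threads
        self.backlog_size = backlog_size   # maximum number of waiting requests
        self.processing = []               # requests holding a token
        self.backlog = deque()             # FCFS waiting queue

    def arrive(self, request):
        """Admit, enqueue or discard an arriving request."""
        if self.free_tokens > 0:
            self.free_tokens -= 1
            self.processing.append(request)
            return "processing"
        if len(self.backlog) < self.backlog_size:
            self.backlog.append(request)
            return "backlog"
        return "discarded"

    def finish(self, request):
        """Release the token of a finished request to the oldest waiter, if any."""
        self.processing.remove(request)
        if self.backlog:
            self.processing.append(self.backlog.popleft())
        else:
            self.free_tokens += 1


inst = Instance(tokens=2, backlog_size=1)
print([inst.arrive(r) for r in ("r1", "r2", "r3", "r4")])
inst.finish("r1")
print(inst.processing)
```

In this run the first two requests obtain tokens, the third waits in the backlog and the fourth is discarded; once `r1` finishes, `r3` takes over its token.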
Besides processing demand, each request has a data transfer demand. Also, each SaaS consumer has a storage demand related to hosting its application and user records. We assume that the IaaS provider meets these two demands regardless of the choice and negotiation of instances. The associated cost is calculated according to the model presented in Section 3.2.
Web applications typically present a variable workload [32], so a DPS is used to control the number of instances in the short term. We consider a perfect DPS that knows the future workload and uses this information to buy instances from the IaaS provider. Although this simplification is unrealistic, it allows us to focus on evaluating the quality of the capacity planning performed.
5.2 Simulation Model Instance
A preliminary full factorial design pointed to workload prediction error and capacity planning heuristic as the main factors that influence SaaS provider profit. Experiments conducted later also pointed to the on-demand denial of service risk as another important factor. Our experiments explored several combinations of these factors, while other variables received fixed values. Our analysis is not exhaustive and different levels could be used for these fixed variables, but we are confident that our approach was sufficient to evaluate the trends of the heuristics and to suggest ideas for future work.
Footnote 8: A request can use the CPU for an interval ∆, typically very small, and then the CPU is allocated to another request in the processing queue. Thus, all requests are processed simultaneously and delays related to contention are captured.
TABLE 1
SaaS provider monthly fees
Plan      Price
Bronze    $24.95
Gold      $79.95
Diamond   $299.95
Well-known IaaS and SaaS providers were used as the basis to instantiate the utility model proposed in Section 3. For the revenue model, three plans offered by BigCommerce in 2011 were considered (Table 1): Bronze, Gold and Diamond. BigCommerce charges its consumers monthly (n equals 1 month). A contribution margin of 30% was chosen for each plan, in line with what is practiced in the market (see footnote 9).
Regarding the SLA, the availability ($A_{MIN}$) and response time limit ($T_{MAX}$) were instantiated in ranges. We consider that the SaaS provider establishes a response time limit ($T_{MAX}$) of 2 seconds. If processing a request takes more than 2 seconds, we consider that the request was lost. If less than 0.1% of the requests are lost, due to response time or availability problems, the SaaS provider does not pay any penalty to its consumer. If less than 1% of the requests are lost, the provider pays a penalty corresponding to 25% of the value of the contracted plan. If less than 5% of the requests are lost, the provider pays a penalty corresponding to 50% of the value of the plan. Finally, if more than 5% of the requests are lost, the penalty corresponds to the whole value of the contract.
Regarding the cost model, the simulated IaaS provider was based on the prices of the Amazon EC2 service in 2013. Three instance types were considered: small (1 virtual CPU), large (2 virtual CPUs) and xlarge (4 virtual CPUs). The only difference considered between these three types is the number of virtual CPUs. After an instance is requested from an IaaS provider there is a period, considered here to be 5 minutes [21], to start the instance and the application. We also considered three reservation markets: light utilization, medium utilization and heavy utilization. Each of these markets offers a better cost according to the usage of reserved instances. Hourly usage costs of such instances (Table 2) and upfront reservation fees (Table 3) are presented.
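The SLA penalty tiers described in this section can be written as a small step function. This is a sketch: the function name is hypothetical, and the text does not say which tier applies when exactly 5% of the requests are lost, so the sketch assumes the full penalty in that case.

```python
def sla_penalty(lost_fraction, plan_price):
    """Penalty owed to one consumer, given the fraction of lost requests.

    Assumption of this sketch: exactly 5% lost falls into the last tier.
    """
    if lost_fraction < 0.001:      # less than 0.1% lost: no penalty
        return 0.0
    if lost_fraction < 0.01:       # less than 1% lost: 25% of the plan value
        return 0.25 * plan_price
    if lost_fraction < 0.05:       # less than 5% lost: 50% of the plan value
        return 0.50 * plan_price
    return plan_price              # otherwise: the whole plan value


# Hypothetical plan price of $100 to keep the arithmetic obvious.
for lost in (0.0005, 0.005, 0.03, 0.10):
    print(lost, sla_penalty(lost, 100.0))
```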
Footnote 9: http://biz.yahoo.com/p/sum qpmd.html
TABLE 2
Virtual instances usage price, per hour, for each IaaS provider market
IaaS provider market   Small    Large    Xlarge
On-demand              $0.06    $0.24    $0.48
Light                  $0.034   $0.136   $0.271
Medium                 $0.021   $0.084   $0.168
Heavy                  $0.014   $0.056   $0.112
TABLE 3
Virtual instances upfront fees for a one year reservation
IaaS provider market   Small    Large    Xlarge
Light                  $61      $243     $486
Medium                 $139     $554     $1108
Heavy                  $169     $676     $1352
We consider that one of the reasons the on-demand market denies service is that the IaaS provider does not have enough instances to offer. To model this aspect, we consider an on-demand denial of service risk, which represents the probability of not being served when requesting an instance from the on-demand market. We consider that the capacity planning interval D has three types of intervals according to workload variations over the interval [5]. For each type of interval we associate a denial of service risk, and intervals with higher workload present a higher denial of service risk. We consider two risk scenarios: (i) risks of 1%, 5% and 10%; (ii) risks of 5%, 10% and 50%. The first scenario represents an IaaS provider that cares about its reputation and the quality of the service offered, while the second scenario represents a provider that does not care so much about its reputation.
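The denial of service risk can be pictured as a Bernoulli trial per requested instance. The sketch below assumes that each instance request is denied independently with the given probability; the paper does not state this level of detail, so both the independence assumption and the function name are illustrative.

```python
import random


def request_on_demand(needed, risk, rng):
    """Return how many of `needed` on-demand instances are granted.

    Each request is denied independently with probability `risk`
    (an assumption of this sketch).
    """
    return sum(1 for _ in range(needed) if rng.random() >= risk)


rng = random.Random(42)  # fixed seed so that runs are repeatable
granted = request_on_demand(1000, 0.10, rng)
print(granted)  # expected to be close to 900 for a 10% risk
```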
We simulated a workload corresponding to an interval of 1 year (i.e., D = 1 year). A total of 100 SaaS consumers were uniformly distributed among the three plans offered by the SaaS provider. The workload prediction errors initially considered were −20%, 0% and 20%. For each combination of these variable levels, a total of 70 different synthetic workloads were simulated to calculate confidence intervals of 95%.
Arlitt et al. [5] show that an e-commerce workload has some peaks during the day (between 9:00 and 21:00) and during the week (some days have more and others less load than typical days). A workload peak corresponds to twice the mean number of requests, while lighter periods correspond to 50% of the mean number of requests. These invariants were combined with the SaaS plans' prices, contribution margin and usage limits to calculate the request arrival rate of each SaaS provider plan during a year (Table 4). Moreover, some special events (e.g., Christmas) cause peak loads compared to typical weeks [5]. The workloads used in the simulations were generated by GEIST [33], while workload predictions were derived from these workloads. GEIST generates a workload assuming a Poisson distribution as the marginal distribution of the arrival process and then adds multifractal and self-similarity properties. Finally, we considered request processing demands based on [6].
TABLE 4
Requests arrival rate for a typical week of the workload
Workload days   Bronze        Gold          Diamond
Typical day     0.058 req/s   0.176 req/s   0.650 req/s
Peak day        0.117 req/s   0.350 req/s   1.300 req/s
Light day       0.029 req/s   0.090 req/s   0.325 req/s
6 EVALUATION
The QN and UT heuristics were compared to four reference strategies/heuristics: 1) a baseline strategy that only uses instances acquired from the on-demand market, named ON; 2) a heuristic that reserves 20% of the instances needed to process the workload peak, using only small instances from the heavy utilization reservation market, named ST (see footnote 10); 3) a heuristic (COHR'), based on [10], that uses the three reservation markets considered and a prediction of the number of instances to be used. It tests a set of possible capacity plans containing from 0 to an upper bound number of instances in order to choose the capacity plan with the lowest estimated cost; 4) an optimal strategy, named OP, that knows the exact number of instances that will be used by the DPS to process the future workload. It tests a set of capacity plans containing from the smallest to the largest number of instances that will be used by the DPS and chooses the capacity plan with the best estimated utility value.
Our analysis focuses on two metrics: 1) the SaaS provider utility (Section 3.3); and 2) the gain, in percentage, obtained by each heuristic in comparison to the utility obtained by our baseline strategy (ON). This gain is given by:
$$gain(\upsilon_A(D), \upsilon_{ON}(D)) = 100 \cdot \frac{\upsilon_A(D) - \upsilon_{ON}(D)}{|\upsilon_{ON}(D)|} \qquad (11)$$
First, we verify the feasibility of the capacity planning performed by the evaluated heuristics/strategies. The null hypothesis $\upsilon_{ST}(D) = \upsilon_{UT}(D) = \upsilon_{QN}(D) = \upsilon_{COHR'}(D) = \upsilon_{ON}(D)$ was rejected according to the analysis of variance (ANOVA) performed, with a p-value of 1.952e−12. A post-hoc analysis was performed to evaluate whether any heuristic obtained utilities similar to the ones obtained by ON. We concluded that $\upsilon_{UT}(D), \upsilon_{QN}(D), \upsilon_{COHR'}(D), \upsilon_{ST}(D) > \upsilon_{ON}(D)$. The evaluated heuristics present gains that differ from each other and from zero, so they increase the utility of the SaaS provider in comparison to using ON.
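Equation 11 translates directly into code. The utility values in the sketch below are hypothetical and chosen only to show why the denominator uses the absolute value: a heuristic that improves on a negative baseline utility still gets a positive gain.

```python
def gain(utility_a, utility_on):
    """Percentage gain of heuristic A over the ON baseline (Equation 11)."""
    return 100 * (utility_a - utility_on) / abs(utility_on)


print(gain(1100.0, 1000.0))   # 10.0
print(gain(-900.0, -1000.0))  # 10.0 (baseline runs at a loss, A loses less)
```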
According to Shapiro-Wilk normality tests, the heuristics' utilities are normally distributed while the gains are not. So, in order to compare heuristics, we performed Student's t-tests for utilities and Wilcoxon signed-rank tests for gains. Analysing the results of these tests (Table 5), we observe that QN and UT always present the best results. QN performs better than the other heuristics when the on-demand market risk or the workload prediction error increases. QN is the only heuristic that directly uses the on-demand market risk, so as the risk increases QN perceives that the on-demand market cannot be trusted to provide instances and tries to reserve more instances (Figure 3) in order to avoid denying service to application end users. The other heuristics, however, do not notice this need in the simulated scenarios and do not increase the number of instances to be reserved.
Footnote 10: The ST heuristic reserves 20% of the number of instances needed to process the workload peak since 20% is an expected utilization for an infrastructure that is planned for a workload peak [2].
TABLE 5
T-tests and Wilcoxon tests results
Prediction error   Risks of 1%, 5% and 10%    Risks of 5%, 10% and 50%
-20%               UT > QN > ST ≥ COHR'       QN > UT > ST > COHR'
0%                 UT > QN > ST > COHR'       QN > UT > COHR' > ST
20%                QN > UT > ST > COHR'       QN > UT > ST > COHR'
TABLE 6
Average gains for different prediction error levels
Heuristic   Prediction error of -20%   Prediction error of 0%   Prediction error of 20%
QN          [8.83%; 9.05%]             [10.2%; 10.43%]          [10.27%; 10.36%]
UT          [10.23%; 10.28%]           [10.64%; 10.68%]         [9.24%; 9.32%]
ST          [7.70%; 7.83%]             [8.89%; 8.97%]           [8.99%; 9.09%]
COHR'       [4.64%; 4.74%]             [7.12%; 8.09%]           [4.21%; 4.31%]
Since the capacity planning of the evaluated heuristics increased the SaaS provider utility in comparison to the ON strategy, the next step of the post-hoc analysis consisted of quantifying the gains obtained. Table 6 presents the gains obtained by each heuristic at different workload prediction errors for the scenario of risks of 1%, 5% and 10%. Although the best (10.66%, error of 0%) and worst (4.5%, error of 20%) average gains obtained seem small, they represent larger absolute savings the higher the profit of the SaaS provider.
The difference in the heuristics' gains can be explained by analysing reservations, presented in Figure 3, and instance utilization (i.e., the percentage of the reservation interval in which the instance was used), presented in Figure 4. We can observe that QN and UT instances were reserved in two markets (heavy and light utilization) and that, in both markets, instances were heavily used. Since the lowest expected utilization is 28% for the light market, 50% for the medium market and 75% for the heavy market, reserved instances were cheaper than on-demand instances. UT reserved a larger number of well-used instances in the heavy market, thus obtaining the best cost reductions and greater utilities. QN obtained good cost reductions by reserving different instance types (i.e., small, large and xlarge). QN's variation in instance types reduced the absolute number of instances reserved and increased their usage.
Fig. 3. Instances reserved by each heuristic for a prediction error of 0%: (a) risks of 1%, 5% and 10%; (b) risks of 5%, 10% and 50%. Each panel shows reserved CPUs (0 to 50) per heuristic (COHR', QN, ST, UT), broken down by instance type (large, small, xlarge).
As expected, the ST heuristic obtained cost reductions in comparison to ON since small reserved instances were well used. However, more instances could have been reserved instead of being acquired from the on-demand market in order to achieve better cost reductions. The instances reserved by the COHR' heuristic were also well used, but more instances could have been reserved. Moreover, in the evaluated scenarios the COHR' choice of reservation markets could be improved (e.g., instances that were reserved in the light market could be reserved in the medium or heavy markets for better cost reductions and utility improvements).
By analysing the optimal heuristic (OP) results, we could observe that OP obtained average gains of 10.72% and 11.51% for risks of 1%, 5%, 10% and risks of 5%, 10%, 50%, respectively (see footnote 11). In order to compare OP with the other heuristics, we computed an efficiency as the ratio between the gain obtained by each heuristic and the gain obtained by OP. Comparing the heuristics and OP for workload prediction errors of 0% (the best scenario for the heuristics), UT achieves an efficiency of 99.42% (±0.16%) for risks of 1%, 5%, 10% and QN achieves an efficiency of 96.19% (±1.08%) for risks of 5%, 10%, 50%. These values were calculated with a confidence interval of 95%. So, in such scenarios QN and UT achieve utilities that are close to the utilities obtained by OP. Table 7 presents the efficiencies achieved by the QN, UT and ST heuristics considering the whole set of scenarios simulated.
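The efficiency metric is a plain ratio of gains. The sketch below applies it to the average gains quoted in the text for a prediction error of 0% and risks of 1%, 5% and 10% (10.66% for the best heuristic and 10.72% for OP). Dividing the averages gives roughly 99.44%, close to the reported 99.42%; the small difference is presumably because the paper averages per-run efficiencies, which is an assumption on our part.

```python
def efficiency(heuristic_gain, optimal_gain):
    """Efficiency (%) of a heuristic: its gain divided by the gain of OP."""
    return 100 * heuristic_gain / optimal_gain


# Average gains taken from the text; the ratio of averages is only an
# approximation of the per-run efficiency reported in the paper.
ut_efficiency = efficiency(10.66, 10.72)
print(f"{ut_efficiency:.2f}%")
```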
Footnote 11: Confidence intervals are [10.706%; 10.747%] for risks of 1%, 5% and 10% and [9.37%; 13.655%] for risks of 5%, 10%, 50%.
TABLE 7
Heuristics Average Efficiencies
Heuristic   Risks of 1%, 5% and 10%   Risks of 5%, 10% and 50%
QN          [85.5497%; 88.2974%]      [74.1320%; 79.3194%]
UT          [85.5053%; 87.6603%]      [64.5329%; 68.2585%]
ST          [74.9753%; 76.9642%]      [45.4674%; 49.8758%]
By analysing such efficiencies, we can observe that as the on-demand risk increases UT and ST are the heuristics most affected. In such scenarios, improvements in the heuristics can be investigated in order to obtain greater gains and efficiencies. In all scenarios the worst percentage of requests lost was 0.047% for risks of 1%, 5% and 10% and 0.05% for risks of 5%, 10% and 50%.
Finally, considering the difficulty of workload prediction, we performed a sensitivity analysis of the workload prediction error (Figure 5). This analysis attempted to reflect the possibility of using predictors that result in different prediction errors. We considered the following prediction errors: −40%, −20%, 0%, 20% and 40%. As expected, the analysis demonstrated that reducing the workload prediction error improves the gains obtained by the evaluated heuristics. Also, the QN heuristic deals better with positive prediction errors (i.e., overestimating the workload) due to its more conservative prediction. This is an important conclusion since providers may try to overestimate instance consumption in order to reduce the risk of denying service to end users. For these providers, QN is a better choice.
6.1 Discussion about the heuristics
There are some reasons why one heuristic makes better decisions than others, and we discuss some of them here. Firstly, ON does not use reserved instances, which may reduce its chance of being the best in terms of cost. At least one instance must always be allocated to the application, otherwise the application would be unavailable, and a reserved instance that is used all the time is cheaper than an on-demand instance.
Secondly, ST is a very simple heuristic that considers only the peak load to make its capacity planning decision. In contrast, the QN and UT heuristics consider details of the load history, such as average demand, waiting times, resource usage, etc. With more detailed information it is possible to make better decisions. For instance, if the peak load is 20 times greater than the average load and it happens for just a couple of minutes during the year, ST will make a bad decision, allocating far more nodes than really needed. QN and UT are able to identify such surges, making decisions that are more appropriate given the workload history patterns. Besides, both QN and UT have a kind of what-if engine inside, which simulates decisions made by the DPS considering the past workload. With this engine QN and UT may search for the best capacity plan according to the business utility value. ST does not consider different possibilities; it is essentially deterministic, based only on the previous peak load and on small instances.
Fig. 4. Instances utilization for a workload prediction error of 0% and risks of 1%, 5% and 10%: (a) QN; (b) UT; (c) ST; (d) COHR'. Each panel plots the utilization (0 to 1) of each instance, grouped by market (heavy, light, medium, on-demand) and by instance type (large, small, xlarge).
QN considers the on-demand market risk, so as the risk increases QN tries to reserve more instances, almost keeping the gains obtained (Figure 5b) while other heuristics obtain lower utilities. This indicates that it is important to consider this risk, especially if it is high.
COHR' does not consider the on-demand market risk, and the adaptations performed to verify the cost improvement of exchanging small instances for large or xlarge instances resulted in instances from the heavy market with utilizations lower than the threshold of 75%, which is inefficient.
Finally, QN and UT are less efficient than OP mainly due to prediction errors. For instance, UT with no prediction errors leads to a utility very similar to the optimal one. The same occurs with QN.
Fig. 5. Sensitivity analysis of workload prediction error: (a) risks of 1%, 5% and 10%; (b) risks of 5%, 10% and 50%. Each panel plots the gain in comparison to the ON heuristic (%) against the prediction error (%) for COHR', QN, ST and UT.
6.2 Validity Threats
In order to enable the investigation of the proposed problem some simplifications were made, resulting in validity threats. Regarding external validity, a synthetic e-commerce workload was used. Request arrivals were generated using an outdated workload generator (GEIST) [33] since more recent workload generators based on recent workload studies were not found. Although our utility model covers many IaaS and SaaS providers' business models, our experiments were based on information from one IaaS provider and one SaaS provider and do not account for the sensitivity of the results to their pricing choices. Regarding construct validity, we modeled the SaaS application as a black-box single-tier application, and user sessions were not considered.
7 CONCLUSIONS AND FUTURE WORK
Analysing our simulation experiments using synthetic e-commerce workloads, we demonstrated that capacity planning should not be neglected when offering a SaaS application deployed on instances acquired from an IaaS provider. We developed a utility model that considers business aspects related to offering a SaaS application. This model guides the capacity planning performed by the proposed heuristics, QN and UT. The proposed heuristics were compared to three other solutions: (i) a baseline strategy that uses only on-demand instances (ON); (ii) a heuristic (COHR') based on [10]; and (iii) a heuristic that considers the workload peak to determine the number of instances to reserve (ST). According to our results, all heuristics improve SaaS provider utility in comparison to ON. QN and UT present the best results, improving SaaS provider utility, on average, by 10.04% and 9.25%, respectively. Also, these heuristics lose 0.05% of the requests in the worst cases.
Our sensitivity analysis demonstrated that workload prediction errors influence the results obtained by the evaluated heuristics. Large SaaS providers tend to have huge amounts of historical data and invest in good prediction techniques. As a consequence, they get small prediction errors and achieve better capacity planning results. However, smaller SaaS providers may not have access to such possibilities, obtaining larger errors and failing to get the best out of capacity planning heuristics.
The simplifications considered result in validity threats that should be explored in future work. We plan to use real e-commerce workloads to validate the results obtained by each heuristic considered in this work. Although we used synthetic workloads in this work, the proposed utility models as well as the generated synthetic e-commerce workloads were based on information from real IaaS and SaaS providers. So, we believe that an overview of the heuristics' behavior could be established. A more detailed application model can be considered using many tiers and user sessions. Finally, improvements in the heuristics can be investigated, especially for scenarios of large workload prediction errors.
ACKNOWLEDGMENTS
The authors would like to thank Siqi Shen [10] for providing specifications about the original COHR heuristic.
REFERENCES
[1] A. Moura, J. Sauve, and C. Bartolini, “Business-driven it management-upping the ante of it: exploring the linkage between it and business to improve both it and business results,” Communications Magazine, IEEE, vol. 46, no. 10, pp. 148–153, 2008.
[2] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, “Above the Clouds: A Berkeley View of Cloud Computing Cloud Computing: An Old Idea Whose Time Has (Finally) Come,” Computing, pp. 07–013, 2009.
[3] L. Vaquero, L. Rodero-Merino, J. Caceres, and M. Lindner, “A break in the clouds: towards a cloud definition,” ACM SIGCOMM Computer Communication Review, vol. 39, no. 1, pp. 50–55, 2008.