Cloud Analytics for Capacity Planning and Instant VM Provisioning
Yexi Jiang
Florida International University
Advisor: Dr. Tao Li
Presentation Outline
• Background
• Cloud Capacity Prediction
– Predict provisioning resource demand
– Estimate de-provisioning requests
– Experimental evaluation results
• Instant Cloud Provisioning
– Predict VM provisioning demand
– Experimental evaluation results
Background
• What is Cloud Analytics?
– Rapidly identify cloud resource or application trouble spots so you can solve the problem.
• What is the objective of cloud analytics?
– The cloud platform itself.
• What can cloud analytics do?
– Workload analysis
– System fault diagnostics
Smart Cloud Enterprise trace data
• 5 months, 35k+ requests, 120+ image types, 20+ features per record
• Important features: Image Name, Owner, Start Time, End Time, ID
Aggregating the Raw Data
Measurement                     Weekly   Daily    Hourly
Coefficient of Variation (CV)   0.5606   0.7915   1.2249
Skewness                        0.3295   1.5644   5.4464
Kurtosis                        1.62     5.8848   52.4103
• Weekly: cannot reflect real capacity
• Daily: just right
• Hourly: too irregular
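The three measurements in the table can be computed for any request-count series. The sketch below is only an illustration: the function names `shape_stats` and `aggregate` are hypothetical, the hourly counts are synthetic rather than trace data, and it assumes population moments with Pearson (non-excess) kurtosis, which may differ from the convention used for the table.

```python
import random
from statistics import fmean, pstdev

def shape_stats(counts):
    """Return (CV, skewness, kurtosis) of a series of request counts."""
    mean = fmean(counts)
    std = pstdev(counts)  # population standard deviation
    z = [(c - mean) / std for c in counts]
    cv = std / mean
    skewness = fmean([v ** 3 for v in z])
    kurtosis = fmean([v ** 4 for v in z])  # Pearson kurtosis (assumed convention)
    return cv, skewness, kurtosis

def aggregate(hourly_counts, hours_per_bucket):
    """Sum hourly counts into coarser buckets, dropping an incomplete tail."""
    n = (len(hourly_counts) // hours_per_bucket) * hours_per_bucket
    return [sum(hourly_counts[i:i + hours_per_bucket])
            for i in range(0, n, hours_per_bucket)]

# Synthetic hourly request counts for 20 weeks (illustration only).
rng = random.Random(0)
hourly = [rng.randint(0, 5) for _ in range(20 * 7 * 24)]

for name, width in [("hourly", 1), ("daily", 24), ("weekly", 24 * 7)]:
    cv, skew, kurt = shape_stats(aggregate(hourly, width))
    print(f"{name:>6}: CV={cv:.3f} skewness={skew:.3f} kurtosis={kurt:.3f}")
```

Running the same statistics at several granularities is one way to compare how regular each aggregated series is before choosing a prediction interval.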
Cost of Data Centers
• 31% of the cost is related to power.
• As hardware prices continue to fall, this proportion will increase further.
• The US EPA estimates that energy usage at data centers doubles every five years (7.4 billion in 2011).
Motivation
• Reduce power cost via capacity prediction
[Figure: Cost of the Cloud Provider, comparing Prepared Resource, Predicted Resource, and Real Requirement]
Candidate Time Series
• Capacity time series
– Non-stationary
– Difficult to model directly
• Provisioning/de-provisioning time series
– Obvious temporal pattern
Basic Idea
• Capacity = (# existing VMs) + (# provisioning) - (# de-provisioning)
[Diagram: Existing VMs in cloud + Predicted Provisioning - Predicted De-provisioning = Predicted Capacity]
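The capacity identity can be rolled forward over several periods. This is a minimal sketch, assuming each period's result becomes the "existing VMs" count for the next period; the function name `predict_capacity` and the numbers are hypothetical.

```python
def predict_capacity(existing_vms, predicted_provisioning, predicted_deprovisioning):
    """Apply capacity = existing + provisioning - de-provisioning, period by period."""
    capacity = []
    current = existing_vms
    for arrivals, departures in zip(predicted_provisioning, predicted_deprovisioning):
        current = current + arrivals - departures
        capacity.append(current)
    return capacity

# Hypothetical inputs: 100 running VMs and three periods of predictions.
forecast = predict_capacity(100, [12, 8, 15], [5, 10, 7])
print(forecast)  # [107, 105, 113]
```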
Predicting Provisioning Demands
• Ensemble method for time series prediction
• Individual prediction techniques used:
– Moving Average. Naïve predictor.
– Auto Regression. Linear predictor.
– Neural Network. Non-linear predictor.
– Gene Expression Programming. Genetic algorithm.
– Support Vector Machine. Linear predictor with non-linear kernel.
• Dynamic weighted linear combination
• Weight update
– $w_p^{(t)}$: weight of predictor $p$
– $v_p$: predicted value of individual predictor $p$
– $c_p^{(t)}$: cost of predictor $p$ at time $t$
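A dynamic weighted linear combination can be sketched as follows. The combination step follows directly from the symbols above; the weight update, however, is an assumption: the slides do not show the actual formula, so an exponential down-weighting by relative cost is used here as a stand-in. The names `combine`, `update_weights`, and `eta` are hypothetical, as are the numbers.

```python
import math

def combine(weights, predictions):
    """Weighted linear combination of the individual predictions."""
    return sum(w * v for w, v in zip(weights, predictions))

def update_weights(weights, costs, eta=1.0):
    """Shrink each weight according to its predictor's relative cost, then renormalize.

    ASSUMPTION: this exponential rule is a stand-in, not the update from the slides.
    """
    max_cost = max(costs)
    if max_cost == 0:
        return list(weights)  # every predictor was exact; keep the weights
    scaled = [w * math.exp(-eta * c / max_cost) for w, c in zip(weights, costs)]
    total = sum(scaled)
    return [s / total for s in scaled]

# Hypothetical round: three predictors, equal initial weights.
weights = [1 / 3, 1 / 3, 1 / 3]
predictions = [10.0, 14.0, 30.0]
actual = 12.0

estimate = combine(weights, predictions)
costs = [abs(v - actual) for v in predictions]  # absolute error used as the cost here
weights = update_weights(weights, costs)
print(estimate, weights)
```

After the update, the predictor with the largest error carries the smallest weight in the next round, which is the behaviour a dynamic ensemble relies on.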
Cloud Prediction Cost
• Over-prediction: cost of resource waste (R function).
• Under-prediction: cost of SLA penalty (T function).
• Property: Non-negative, Monotonic.
• Total cost: $C = R(v(t), \tilde{v}(t)) + T(v(t), \tilde{v}(t))$
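One possible shape for such an asymmetric cost is sketched below. The linear forms and the unit costs are assumptions chosen only so that both functions are non-negative and monotonic in the size of the error; the slides do not give the actual R and T definitions. The sketch also assumes the first argument is the real demand and the second the predicted demand.

```python
def waste_cost(actual, predicted, unit_waste=1.0):
    """R: cost of resources prepared but not used (over-prediction). Assumed linear."""
    return unit_waste * max(predicted - actual, 0.0)

def sla_cost(actual, predicted, unit_penalty=4.0):
    """T: penalty for demand that was not prepared for (under-prediction). Assumed linear."""
    return unit_penalty * max(actual - predicted, 0.0)

def prediction_cost(actual, predicted):
    """C = R + T; at most one of the two terms is non-zero for a given prediction."""
    return waste_cost(actual, predicted) + sla_cost(actual, predicted)

print(prediction_cost(10, 13))  # over-prediction by 3 VMs -> 3.0
print(prediction_cost(10, 7))   # under-prediction by 3 VMs -> 12.0
```

With a higher unit penalty than unit waste, the same absolute error costs more when demand is under-predicted, which pushes a cost-aware predictor toward slight over-provisioning.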
Prediction Result
• Ensemble has the best average performance.
Predicting De-provisioning
• Use the life span CDF F(x) of VMs to estimate the number of de-provisioning requests
• Estimation of distribution: step-wise function.
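A step-wise estimate of F(x) and its use for de-provisioning can be sketched as below. The empirical CDF is a standard step function; the way it is turned into an expected count (conditioning on each running VM's current age) is an assumption about how the estimate might be computed, and the names and data are hypothetical.

```python
from bisect import bisect_right

def make_lifespan_cdf(lifespans):
    """Empirical step-wise CDF: F(x) = fraction of observed life spans <= x."""
    ordered = sorted(lifespans)
    n = len(ordered)

    def cdf(x):
        return bisect_right(ordered, x) / n

    return cdf

def expected_deprovisioning(ages, cdf, horizon=1):
    """Expected number of running VMs that terminate within `horizon` time units.

    A VM of age a ends in (a, a + horizon] with conditional probability
    (F(a + horizon) - F(a)) / (1 - F(a)).
    """
    total = 0.0
    for age in ages:
        survival = 1.0 - cdf(age)
        if survival > 0:
            total += (cdf(age + horizon) - cdf(age)) / survival
        else:
            total += 1.0  # older than every observed VM: assume it ends now
    return total

F = make_lifespan_cdf([1, 2, 2, 5, 10])  # hypothetical life spans in days
running_ages = [0, 1, 4]                 # hypothetical ages of running VMs
print(expected_deprovisioning(running_ages, F))
```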
De-provisioning evaluation
Test data: last 60 days.
Test methods:
1. No preparation at all (None)
2. Always prepare the maximum capacity (Maximum)
3. Time series prediction (Time Series)
4. Life span distribution regardless of image type
– 60 days of data (Dist 60)
– 90 days of data (Dist 90)
• The global distribution estimation method outperforms the time series prediction method.
Motivation
• Problem: Existing clouds are not “instant” and are not suitable for mid-job scaling or urgent tasks.
• VM preparation is fast, but patching, security assurance, manual processes, and other steps take time.
• Known solutions:
– Prepare an extremely large number of VMs of different types. Wastes resources.
– Ask customers to provide a schedule. Impractical.
• Our Idea: Make good use of customers' historical requests to infer
Core Idea
Model and predict demands → Predict Results → Pre-provision at suitable time → Wait for Requests → Assign VMs to customers
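The pre-provisioning loop can be replayed on past data to count how many requests are served instantly and how many prepared VMs go unused. This is a minimal sketch for a single VM type, assuming unused VMs are not carried over to the next period; the function name `simulate` and the numbers are hypothetical.

```python
def simulate(predicted, actual):
    """Replay one VM type: pre-provision to the prediction, then serve requests.

    Returns (instant, waited, wasted): requests served from the prepared pool,
    requests that had to wait for on-demand provisioning, and prepared VMs
    left unused. ASSUMPTION: leftovers are not reused in later periods.
    """
    instant = waited = wasted = 0
    for prepared, requested in zip(predicted, actual):
        served = min(prepared, requested)
        instant += served
        waited += requested - served
        wasted += prepared - served
    return instant, waited, wasted

# Hypothetical predictions and real request counts for three periods.
result = simulate(predicted=[3, 5, 2], actual=[4, 4, 2])
print(result)  # (9, 1, 1)
```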
Focus on individual types
• No obvious temporal patterns for individual image types. Ensemble is still required.
Focus on popular VM types
1) About 10% (12) of the 124 VM types account for more than 80% of requests.
2) An inflection point divides the VM types into a popular group and a rare group.
3) Requests for rare image types appear randomly.
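A simple way to separate popular from rare types is sketched below. Note the swapped-in technique: instead of locating the inflection point mentioned above, this sketch uses a fixed cumulative-share threshold, which is an assumption made for brevity. The image names and counts are hypothetical.

```python
from collections import Counter

def popular_types(requests, coverage=0.8):
    """Smallest set of most-requested image types covering `coverage` of all requests."""
    counts = Counter(requests)
    total = sum(counts.values())
    chosen, covered = [], 0
    for image, n in counts.most_common():
        if covered >= coverage * total:
            break
        chosen.append(image)
        covered += n
    return chosen

# Hypothetical request log: one image name per request.
log = ["rhel"] * 50 + ["sles"] * 35 + ["win"] * 10 + ["db2"] * 5
print(popular_types(log))  # ['rhel', 'sles']
```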
Experimental Evaluation
• The ensemble method has the best performance in reducing waiting time and resource waste.
Conclusion
• Capacity Prediction
– The demand for cloud capacity can be estimated by predicting provisioning and de-provisioning requests
– Use the time series ensemble method for provisioning prediction
– Use the VM life span model for de-provisioning prediction
• Instant cloud provisioning
– Pre-provision VMs before requests arrive
– Predict VM provisioning requests using the time series ensemble method
– The average provisioning fulfillment time can be reduced by 85%+
• Future work
– Improve prediction with user profiles
Thank you
• Related Papers:
– Intelligent Cloud Capacity Management. (NOMS 2012)
– ASAP: A Self-Adaptive Prediction System for Instant Cloud Resource Demand Provisioning. (ICDM 2011)
• Patent: