IaaS-Clouds in the MaDgIK Sky
Konstantinos Tsakalozos
PhD candidate
Research Topics
1. Nefeli: Hint-based deployment of virtual infrastructures
2. How profit maximization drives resource allocation in highly scalable infrastructures
3. MigrateFS: towards a true share-nothing cloud
4. Tackling the cloud's heterogeneity
Nefeli, VM placement
The Idea behind Nefeli:
The Virtual Infrastructure consumer/user is aware of
operation and data flows among VMs. Can we
harvest this information to tackle performance
bottlenecks?
BUT: The physical cloud infrastructure must
Interfacing with Nefeli
The consumer/user expresses a set of
constraints/hints describing an ideal
deployment
Nefeli takes these user constraints/wishes
under consideration when VMs are mapped to
physical machines (PMs)
Consider VMs holding Database replicas. They
have to be deployed on different PMs.
Consider VMs producing excessive network traffic.
Constraints
User constraints
VMs to be co-deployed, spread
across physical machines (PM),
favored against others, data gravity
Administrative constraints
Offload a PM, Power save
Solver: Simulated annealing
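A minimal sketch of how a simulated annealing solver could search for a placement that respects spread and co-deployment hints. This is an illustration under stated assumptions, not Nefeli's actual solver: the names `cost` and `anneal`, the violation-count cost function, and the geometric cooling schedule are all hypothetical, and PM capacity is ignored for brevity.

```python
import math
import random

def cost(placement, spread_groups, colocate_groups):
    """Count violated hints (assumed cost model): lower is better."""
    violations = 0
    for group in spread_groups:      # VMs that should sit on different PMs
        used = [placement[vm] for vm in group]
        violations += len(used) - len(set(used))
    for group in colocate_groups:    # VMs that should share one PM
        violations += len({placement[vm] for vm in group}) - 1
    return violations

def anneal(vms, pms, spread_groups, colocate_groups,
           steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Return (best placement, its cost) after a fixed number of moves."""
    rng = random.Random(seed)
    placement = {vm: rng.choice(pms) for vm in vms}
    current = cost(placement, spread_groups, colocate_groups)
    best, best_cost = dict(placement), current
    for i in range(steps):
        # Geometric cooling from t_start down towards t_end (assumed schedule).
        temp = t_start * (t_end / t_start) ** (i / steps)
        vm = rng.choice(vms)
        old_pm = placement[vm]
        placement[vm] = rng.choice(pms)          # propose moving one VM
        candidate = cost(placement, spread_groups, colocate_groups)
        delta = candidate - current
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
            if current < best_cost:
                best, best_cost = dict(placement), current
        else:
            placement[vm] = old_pm               # reject: undo the move
    return best, best_cost
```

For example, `anneal(["db1", "db2", "web1", "web2"], ["pm1", "pm2", "pm3"], [["db1", "db2"]], [["web1", "web2"]])` asks for the two database replicas to be spread and the two web VMs to be co-deployed.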
Runtime Interaction
The consumer/user expresses a set of states
for her infrastructure. These states “activate”
different constraints.
States are “trapped”. Nefeli migrates VMs to
accommodate user wishes
Active hints may change over time offering a
Nefeli vs other placement policies
Simulation measuring the end node throughput
Random VM placement, Balanced VM placement,
Nefeli in a real cloud
Nefeli achieves a 17% improvement on the time required to
have video and audio transcoding complete, compared to
default OpenNebula 1.2.
2. Resource allocation in highly
scalable infrastructures
Highly scalable frameworks:
The more resources consumed the higher the
performance
Scale linearly?
Clouds, seemingly endless resources
Performance guarantees?
How many resources (e.g., Satellites, VMs)
should we use for a scalable infrastructure?
Clouds... It is all about money
Cost: Pay for the resources you consume.
Revenue: Sell products coming from the processing taking
place within the cloud
Budget Function: Response time to revenue
Pay more -> Reduce response time -> Increase your
Finding the maximum profit point
Max profit B changes at runtime.
Why?
Some cloud resources are shared
among users (Disk, Net I/O, CPU)
Workloads (processing time)
change based on input
To specify B’ we assume re-occurring user’s
workloads
• DB loads day-night
• Index updates
Finding the maximum profit point
Re-occurring user workload:
In each iteration compute MR
and MC
We increase or decrease the
number of VMs used
accordingly, so that MR == MC
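A minimal sketch of the MR/MC iteration, assuming MR (marginal revenue) and MC (marginal cost) can be estimated as the change in revenue and cost when one VM is added. The function name `find_max_profit`, the callables `revenue` and `cost`, and the exact step policy (double the step while the direction is unchanged, then fall back to single-VM steps after the first overshoot) are assumptions made for illustration; the talk only states that adjustments are exponential far from the target and linear close to it.

```python
import math

def find_max_profit(revenue, cost, start=1, max_vms=1024):
    """Search for the VM count where marginal revenue meets marginal cost."""
    n, step, linear, last = start, 1, False, 0
    for _ in range(4 * max_vms):                 # hard bound on iterations
        mr = revenue(n + 1) - revenue(n)         # marginal revenue at n
        mc = cost(n + 1) - cost(n)               # marginal cost at n
        direction = 1 if mr > mc else -1         # grow while MR > MC
        if last and direction != last:
            if linear:
                # Crossed the MR == MC point while moving one VM at a time.
                return n if direction == -1 else n + 1
            linear, step = True, 1               # overshoot: switch to linear
        elif last and not linear:
            step *= 2                            # still far away: go faster
        nxt = min(max(n + direction * step, 1), max_vms)
        if nxt == n:                             # pinned at a boundary
            return n
        n, last = nxt, direction
    return n

# Hypothetical budget: concave revenue, linear cost per VM.
best = find_max_profit(lambda n: 100 * math.sqrt(n), lambda n: 5 * n)
```

With these assumed functions the profit `100 * sqrt(n) - 5 * n` peaks at 100 VMs, which is where the search stops.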
B’ “too far away” from B:
• increase/decrease VMs exponentially
When B’ close to B:
Applications - Evaluation
Used by the cloud provider
Cost: cloud’s operational cost,
Revenue: per VM
Used by each consumer separately
Revenue: the degree of satisfaction the service
offers
Resources shared proportionally to the money
Evaluation - Two users
Evaluated using
Real infrastructures elastic Hadoop/Condor
Simulated for large infrastructures
• A single user computing Pi over and over again
• Exponential and linear VM adjustments
• Second user entering the
3. A true share nothing cloud
Suspend/resume VM migration is a showstopper
for load balancing
You must have shared storage facilities
Shared storage is:
A single point of failure
Performance bottleneck
Clouds are based on commodity hardware to
Migrate FS. Why?
Distributed file systems:
Scaling issues
Have relaxed semantics
Offer much more than what clouds need
Migration operation
Sync VM disk image between target and source PM
Sync VM RAM between target and source PM
Instantly suspend VM from source and resume it to the
Migrate FS prototype
Two modes of operation:
“I need to move VM v from PM A to PM B in less
than t seconds”
“I need to move VM v from PM A to PM B with
guaranteed VM I/O performance”
Respect SLAs
At any time you can get an estimate on the time
the migration will take (depends on the I/O load
of the VM)
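A rough sketch of how such an estimate could depend on the VM's I/O load. It assumes a simple iterative pre-copy model, which is an assumption of this example and not a description of the Migrate FS estimator: each sync round re-sends the data the VM dirtied during the previous round, so the rounds form a geometric series that sums to image size divided by (link rate minus dirty rate). The function and parameter names are hypothetical.

```python
import math

def estimate_sync_seconds(image_bytes, link_bytes_per_s, dirty_bytes_per_s):
    """Estimated time to converge under the assumed pre-copy model."""
    if dirty_bytes_per_s >= link_bytes_per_s:
        return math.inf          # the VM writes faster than we can copy
    return image_bytes / (link_bytes_per_s - dirty_bytes_per_s)

def max_write_rate_for_deadline(image_bytes, link_bytes_per_s, deadline_s):
    """Highest VM write rate that still meets a 'less than t seconds' target."""
    return max(0.0, link_bytes_per_s - image_bytes / deadline_s)

# Hypothetical numbers: 10 GB image, 100 MB/s link, VM dirtying 20 MB/s.
eta = estimate_sync_seconds(10_000_000_000, 100_000_000, 20_000_000)
limit = max_write_rate_for_deadline(10_000_000_000, 100_000_000, 200)
```

Under this model the two modes of operation are two readings of the same relation: fix the deadline and throttle the VM's writes to `limit`, or leave the VM's I/O untouched and report `eta`.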
4. Handling Heterogeneity
How we dealt with heterogeneity
Organize physical nodes into “sites”
Specially crafted VMs to boot in multiple “sites”
Universal instantiation configuration schema
Heterogeneity: a challenge
Sky computing: Cloud of clouds
Load Balancing in IaaS-Clouds
Load balancing through VM migration
Live migration: almost no downtime
Copy RAM while the VM is online
Requirement: PMs share storage, compatible
hypervisors
Suspend-resume: have to copy memory and disk
content before resuming
Load balancing is itself a costly (time &
resources) operation
VM Scheduling - Placement
Physical and virtual infrastructure properties
Resource availability, VM requirements (CPU, RAM,
network)
Topology: “distance” from repositories, neighboring nodes
Future load balancing prospects
User provided hints/constraints
System properties: Compatibility (kernel, virtualization),
Features (high availability, RAID)
Two Phase VM Scheduling
How to form a site:
Load balancing prospects. Favor site formation among
PMs allowing live migration. When live-migration-enabled
nodes are not enough, allow suspend/resume
migration
Resources of the site must be more than the requested
Site formation is formulated as a constraint satisfaction
problem
VM-to-PM mapping is also a constraint satisfaction
problem (Nefeli)
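A simplified sketch of the first phase (site formation). The talk formulates this as a constraint satisfaction problem; the greedy heuristic below is a swapped-in stand-in for illustration only. The `PM` record, its fields, and `form_site` are hypothetical names, and CPU count is used as the only resource.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PM:
    name: str
    free_cpus: int
    live_migration: bool   # can this node take part in live migration?

def form_site(pms, requested_cpus):
    """Pick PMs for a site, preferring live-migration-capable nodes."""
    # Live-migration nodes first, larger nodes first within each class;
    # suspend/resume-only nodes are used only when the others run out.
    ordered = sorted(pms, key=lambda pm: (not pm.live_migration, -pm.free_cpus))
    site, total = [], 0
    for pm in ordered:
        if total > requested_cpus:     # site already exceeds the request
            break
        site.append(pm)
        total += pm.free_cpus
    # The site's resources must be more than what was requested.
    return site if total > requested_cpus else None

nodes = [PM("a", 4, True), PM("b", 8, False), PM("c", 2, True)]
small_site = form_site(nodes, 5)
large_site = form_site(nodes, 10)
```

In this example a request for 5 CPUs is served by the two live-migration nodes alone, while a request for 10 CPUs has to pull in the suspend/resume-only node as well. The second phase, the VM-to-PM mapping inside the chosen site, is not shown.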
Elastic Solver
• Consume resources from the cloud – fill out underutilized, isolated physical nodes
• Simulated annealing easily parallelizable through simultaneous executions
• More resources better site formation and VM-to-PM
Results?
Reduction of the search space yields:
Improvements in the time consumed
No degradation in the VM-to-PM quality when
Related work
[Tsak11] K. Tsakalozos, H. Kllapi, E. Sitaridi, M. Roussopoulos, D. Paparas and A. Delis,
“Flexible Use of Cloud Resources through Profit Maximization and Price Discrimination”, ICDE 2011 Hannover, Germany, April 2011.
[Tsak10] K. Tsakalozos, M. Roussopoulos, V. Floros and A. Delis, “Nefeli: Hint-based Execution
of Workloads in Clouds”, ICDCS 2010, Genoa, Italy, June 2010.
[TsakF] K. Tsakalozos, M. Roussopoulos, and A. Delis, “VM Placement in non-Homogeneous
IaaS-Clouds”, under review.
J. O. Kephart and D. M. Chess, “The Vision of Autonomic Computing”, IEEE–Computer, vol. 36,
no. 1, pp. 41–50, 2003.
K. Lee, N. Paton, R. Sakellariou, and A. Fernandes, “Utility Driven Adaptive Workflow
Execution,” in Proc. of the 2009 9th IEEE/ACM Int. Symposium on Cluster Computing and the Grid, Shanghai, PR China.
J. O. Kephart and R. Das, “Achieving Self-Management via Utility Functions,” IEEE Internet
Computing 2007.
D. Grosu and A. Das, “Auctioning resources in Grids: model and protocols: Research Articles,”
K. Subramoniam, M. Maheswaran, and M. Toulouse, “Towards a MicroEconomic Model for
Resource Allocation,” in IEEE Canadian Conference on Electrical and Computer Engineering. IEEE Press, 2002.
H. R. Varian, Intermediate Microeconomics : A Modern Approach, 7th ed. W. W. Norton and
Company, Dec. 2005, ch. 25, Monopoly
Yingwei Luo, Binbin Zhang, Xiaolin Wang, Zhenlin Wang, Yifeng Sun, Haogang Chen, "Live and
incremental whole-system migration of virtual machines using block-bitmap," Cluster
Computing, 2008 IEEE International Conference on , vol., no., pp.99-106, Sept. 29 2008-Oct. 1 2008
Robert Bradford, Evangelos Kotsovinos, Anja Feldmann, and Harald Schioberg. 2007. Live
wide-area migration of virtual machines including local persistent state. In Proceedings of the 3rd international conference on Virtual execution environments (VEE '07).
Keahey, K., Tsugawa, M., Matsunaga, A., Fortes, J., "Sky Computing," IEEE Internet
Computing, Sept.-Oct. 2009
F. Hermenier, X. Lorca, J.-M. Menaud, G. Muller, and J. Lawall, “Entropy: a consolidation
manager for clusters,” in Proceedings of the 2009 ACM SIGPLAN/SIGOPS international conference on Virtual Execution Environments, ser. VEE ’09.