Server Virtualization Techniques
Agenda
• Define Server Virtualization
• The Server Virtualization Spectrum
• Server virtualization solutions
• Similarities and differences
• OS Issues
• Note:
• “Virtualization” (V12N) is really a misnomer when applied to some
of the HW technologies. A better general term would be “Workload Containment” (WC)
• V12N is one kind of WC...HW partitioning is another…
EPA Report to Congress –
Server & Data Center Energy Efficiency
• Data center energy use more than doubled 2000-2006.
• The power and cooling infrastructure accounts for 50% of
data center total energy consumption.
• The energy used by the nation’s servers and data
centers in 2006:
> 61 billion kilowatt-hours (kWh)
> 1.5% of total U.S. electricity consumption!
> Total electricity cost of about $4.5 billion.
> Equal to the electricity used by 5.8 million average US households
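A quick sanity check of the figures above, as a minimal sketch. The variable names are illustrative, and the derived values are simply what the quoted numbers imply, not figures taken from the report.

```python
# Figures quoted on the slide (EPA report, year 2006).
energy_kwh = 61e9        # 61 billion kWh used by servers and data centers
cost_usd = 4.5e9         # about $4.5 billion total electricity cost
households = 5.8e6       # 5.8 million average US households
share_of_us = 0.015      # 1.5% of total US electricity consumption

# Values implied by the quoted figures.
price_per_kwh = cost_usd / energy_kwh          # average price paid
kwh_per_household = energy_kwh / households    # annual use per household
us_total_kwh = energy_kwh / share_of_us        # national consumption

print(f"implied price: {price_per_kwh * 100:.1f} cents/kWh")
print(f"implied household use: {kwh_per_household:,.0f} kWh/year")
print(f"implied US total: {us_total_kwh / 1e12:.2f} trillion kWh")
```

The implied price is roughly 7.4 cents per kWh and the implied household use is a little over 10,000 kWh per year, so the four bullets are mutually consistent.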
Motivating factors - consolidation
• Improving the utilization of computing resources
> Translation: better return on the money spent
• One way to do this is server consolidation
• Consolidation requires (at least) the following
> 1. Non-interference of independent workloads
> with security and performance management
> 2. Resource management (to ensure service levels)
> Capacity planning is rediscovered as a systems discipline!
> 3. Resource accounting (to pay for shared resources)
Server Virtualization
• Goals:
> Run multiple application environments on the same
machine at the same time without allowing them to
interact (except normal inter-machine interaction, e.g. network communication)...that is:
> workload and security separation
> Platform abstraction (emulation of hardware)
• In other words, convince the applications that they are on
separate and/or different systems, even though they are sharing a system...
> ...actually, CPU time sharing and OS process
Server Virtualization Context
Decouple the hard connection between this application on
this OS instance on this box…essential for Cloud Computing
• Virtualization has been used since the 1960s (mainframes)
> Now a mainstream technology available on multiple platforms
• Renewed emphasis due to changed economics and needs
> Server sprawl has gotten out of hand, energy costs have skyrocketed
• Different styles of V12N with different benefits and limitations
• Provide the Illusion of a dedicated computer for multiple OS instances:
> Partitioning: hardware and/or firmware capability
> Virtual machines: host OS (“hypervisor”) software (VMware, Xen, ...)
> Containers, zones, vServers: light-weight, single-OS virtualization
• NOTE: Grid, J2EE, cloud computing, service-oriented-architecture apps
Other Motivators / Use Cases
• There are other important use cases for virtualization
> Upgrade OS version or patch level with concurrent operation
> Migrate from one OS to another on same server
> Coexist different OSes for different types of work
> Provide separate fault, security, admin domains of same OS level
> Relieve scalability constraints of a given OS via multiple instances
> Use legacy OS on newer systems
> examples: run NT4 on current x86, Solaris 8 on SPARC T2/T3
> Development, sandbox, play-pen in congenial environment
> Flexible, rapid redeployment of workloads to servers
Hardware Virtualization
[Figure: layered stack. Hardware at the bottom, the VMM above it, then several VMs, each running its own OS (e.g., Linux, Win32) and applications.]
• A simple (simplistic) picture!
• VMM = Virtual Machine Monitor (Hypervisor)
• But implementation is complex.
• Virtual Machines (VMs) can be:
V12N Terminology
• interpretation: Performing instructions written in a programming language (e.g., Perl, Python, Ruby, Java bytecodes, x86 machine code)
• emulation: Imitating the behavior of one system (e.g., interpreter) using the resources of another
• virtualization: The abstraction of computing resources (e.g., memory, CPU)
• virtual machine: “an efficient, isolated duplicate of a real machine”
Credit: Popek/Goldberg, “Formal Requirements for Virtualizable Third Generation Architectures”
Resource Virtualization
• E.g., the CPU
> Run queue: app, app, app, app
> Round-robin pre-emptive scheduling
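A minimal sketch of round-robin pre-emptive scheduling over a run queue. The app names, burst times, and quantum are hypothetical, and a real scheduler also handles priorities, blocking, and timer interrupts.

```python
from collections import deque


def round_robin(burst_times, quantum):
    """Return the sequence of (app, time_slice) pairs that get the CPU."""
    queue = deque(burst_times.items())   # the run queue
    timeline = []
    while queue:
        app, remaining = queue.popleft()
        run = min(quantum, remaining)    # pre-empt after one quantum
        timeline.append((app, run))
        if remaining > run:              # unfinished: back of the queue
            queue.append((app, remaining - run))
    return timeline


timeline = round_robin({"app1": 3, "app2": 5, "app3": 2}, quantum=2)
for app, run in timeline:
    print(app, run)
```

Each app sees what looks like its own CPU, just a slower one. That is the sense in which time sharing already virtualizes the processor.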
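• E.g., memory
> Linear address = directory index (10 bits) + table index (10 bits) + page offset (12 bits)
> cr3 points to the page directory; the directory entry gives the page table base
> The page table entry gives the page frame; the offset selects the memory cell within the page
Credit: Intel i486 reference manual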
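The two-level lookup can be sketched as follows, assuming a 32-bit linear address split 10/10/12 as in the figure. The dictionaries stand in for the in-memory page directory and page tables; the table name and the frame number are made up for illustration.

```python
def split_linear_address(addr):
    """Split a 32-bit linear address into (directory, table, offset)."""
    directory = (addr >> 22) & 0x3FF   # top 10 bits index the page directory
    table = (addr >> 12) & 0x3FF       # next 10 bits index the page table
    offset = addr & 0xFFF              # low 12 bits select a byte in the page
    return directory, table, offset


def translate(addr, page_directory, page_tables):
    """Walk the directory, then the table, and return the physical address."""
    directory, table, offset = split_linear_address(addr)
    table_id = page_directory[directory]   # "page table base"
    frame = page_tables[table_id][table]   # "page frame"
    return (frame << 12) | offset


# Hypothetical mappings: directory entry 1 points at table "pt_a",
# whose entry 3 maps to page frame 0x1A2.
page_directory = {1: "pt_a"}
page_tables = {"pt_a": {3: 0x1A2}}

physical = translate(0x00403025, page_directory, page_tables)
print(hex(physical))
```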
Application Virtualization
[Figure, packaging: run the application installer; the virtualization layer intercepts its system calls (resource requests) at the system interface and generates simulated files and simulated registry settings; the package is linked with the manager to produce Application + V Layer .EXE.]
[Figure, running: the application runs under the manager, on the virtualization layer (VL), on the operating system.]
• Highly portable. Run anywhere.
• Application leaves no footprint on host (just user preferences).
• Application can be streamed.
• “Isolation” is voluntary.
Credit: www.anandtech.com
Bytecode Interpreters
Java Virtual Machine (JVM)
[Figure: Java program on the JVM, on the Operating System, on the Hardware.]
• ~200 JVM instructions (bytecodes)
• Emulation OR Just-In-Time compilation
• An “imaginary” machine, except for picoJava HW
• In 2006, >4 billion JVM devices
• Java marketing: “write once, run anywhere!”
– Java mockery: “write once, debug everywhere!” (forgot who said that)
• Microsoft .NET Common Language Runtime (CLR) is similar but more generic.
• Safe:
> Strongly typed
> “Verified” on execution: valid opcodes, jump targets, type discipline
> Garbage collected memory
> Sandboxed
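A toy stack-machine interpreter illustrates the verify-then-interpret idea. The five opcodes here are invented for this sketch and are not JVM bytecodes; the verifier only checks opcodes and jump targets, not type discipline.

```python
# Hypothetical opcodes for a toy stack machine (not real JVM bytecodes).
PUSH, ADD, MUL, JMPZ, HALT = range(5)
VALID_OPCODES = {PUSH, ADD, MUL, JMPZ, HALT}


def verify(program):
    """Reject programs with unknown opcodes or out-of-range jump targets."""
    for opcode, arg in program:
        if opcode not in VALID_OPCODES:
            raise ValueError(f"invalid opcode {opcode}")
        if opcode == JMPZ and not 0 <= arg < len(program):
            raise ValueError(f"invalid jump target {arg}")


def run(program):
    """Verify, then interpret one instruction at a time until HALT."""
    verify(program)
    stack, pc = [], 0
    while True:
        opcode, arg = program[pc]
        pc += 1
        if opcode == PUSH:
            stack.append(arg)
        elif opcode == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif opcode == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif opcode == JMPZ:
            if stack.pop() == 0:
                pc = arg
        elif opcode == HALT:
            return stack.pop()


# (2 + 3) * 4
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None), (HALT, None)]
result = run(program)
print(result)
```

A JIT compiler would replace the `while` loop with generated native code for hot sequences; the verification step stays the same.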
Hardware Virtualization
• Clearly the different VMs must be separate and secure; Why?
Hardware Virtualization
(HW Server View)
Terminology
• Guest OS: runs only on VMM
• Host OS: runs only on HW
• Domain: virtual machine on VMM
[Figure, four panels (1 to 4): full virtualization and para-virtualization, each shown as type 1 (VMM directly on HW, with an I/O VMM in dom0 and guest OSes with their applications above) and type 2 (VMM on a host OS on HW).]
• Issue: deprivileging
> x86 CPU modes: ring 0, ring 1, ring 2, ring 3
> Native: guest OS (kernel) in ring 0, applications in ring 3
> Under a VMM: VMM in ring 0, guest OS (kernel) in ring 1, applications in ring 3
Hardware Virtualization
Device Driver Placement
[Figure: device driver placement for type 1 (VMM on HW, with an I/O VMM in dom0) and type 2 (VMM on a host OS) configurations. Guests see an emulated device; requests are either redirected to a device driver in the VMM, I/O VMM, or host OS, or passed through to a device driver that talks to the device directly.]
VMM Formal Requirements
(summary of Popek and Goldberg, 1974 CACM)
For machines having: 1) user/supervisor modes, 2) location-bounds register, and 3) a trapping mechanism.
[Figure: memory as cells 0, 1, 2, 3, … q-1, …, n, n+1, n+2, n+3, n+4, … A “user” program runs in mode “u” with PC=0 and bounds (n, 4); a trap switches to mode “s” with PC=2 and bounds (0, q-1).]
• Sensitive instructions: change or depend on memory map or mode
• Privileged instructions: trap when executed in user mode
Popek-Goldberg Theorem: if {sensitive instructions} ⊆ {privileged instructions}, then a Virtual Machine Monitor (VMM) can be built having 3 properties:
• Efficiency: most instructions run directly.
• Resource Control: the VMM allocates all resources.
• Equivalence: the user program mostly believes it runs on the hardware.
VMM components: Dispatcher, Allocator, Instruction Interpreter
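The condition in the theorem is a subset test, which can be sketched directly. The instruction sets below are a small illustrative sample of classic x86, not a complete classification, and the function names are made up for this sketch.

```python
def is_virtualizable(sensitive, privileged):
    """Popek-Goldberg condition: every sensitive instruction must trap."""
    return set(sensitive) <= set(privileged)


def offending_instructions(sensitive, privileged):
    """Sensitive instructions that run in user mode without trapping."""
    return sorted(set(sensitive) - set(privileged))


# Illustrative sample only: SGDT reads the descriptor table register
# (sensitive) but does not trap in user mode on classic x86.
privileged = {"LGDT", "HLT", "MOV_TO_CR3"}
sensitive = {"LGDT", "MOV_TO_CR3", "SGDT"}

print(is_virtualizable(sensitive, privileged))
print(offending_instructions(sensitive, privileged))
```

Any instruction in the offending list has to be handled some other way, for example by binary translation, para-virtualization, or hardware extensions.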
Making x86 Virtualizable
Using Binary Translation
[Figure: guest OS kernel in ring 1, VMM in ring 0 (if needed). “Running” basic blocks A, B, C of instructions, each ending in a call/jmp/ret, are copied into a translation cache (also in memory); a block containing SGDT is rewritten to use a call.]
1 Identify the “next” block by scanning instructions for a jump/call/etc (that ends a basic block).
2 Copy a newly-encountered basic block to the cache.
3 Binary translate any prohibited instruction into a sequence that emulates it “safely.”
4 Run/rerun translated block at full speed.
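The four steps can be sketched on a toy instruction stream. Instructions are plain strings, the `emulate_...` call targets are hypothetical, and a real translator works on machine code and also patches branch targets.

```python
BLOCK_ENDERS = {"jmp", "call", "ret"}   # instructions that end a basic block
PROHIBITED = {"sgdt"}                   # sensitive but non-trapping

translation_cache = {}                  # block start index -> translated block


def next_block(code, start):
    """Step 1: scan forward to the instruction that ends the basic block."""
    end = start
    while end < len(code) and code[end] not in BLOCK_ENDERS:
        end += 1
    return code[start:end + 1]          # include the terminator if present


def translate(block):
    """Step 3: replace each prohibited instruction with a safe emulation call."""
    return [f"call emulate_{ins}" if ins in PROHIBITED else ins for ins in block]


def fetch_translated(code, start):
    """Steps 2 and 4: translate on first use, then reuse the cached copy."""
    if start not in translation_cache:
        translation_cache[start] = translate(next_block(code, start))
    return translation_cache[start]


guest_code = ["mov", "add", "sgdt", "mov", "call", "mov", "ret"]
first = fetch_translated(guest_code, 0)
again = fetch_translated(guest_code, 0)   # served from the cache
print(first)
```

The translation cost is paid once per block; later executions run the cached copy, which is why step 4 can proceed at full speed.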
Intel 64 (Intel version of x86-64) contains ~595 instructions.
• General Purpose: 167
> Data transfer 32, Arithmetic 18, Logical 4, Shift/rotate 9, Bit/byte 23, Control transfer 31, String 18, I/O 8, Enter/leave 2, Flag control 11, Segment register 5, Misc 6
• Floating Point: 94
> Data 17, Arithmetic 26, Compare 14, Transcendental 8, Constants 7, Control 20, State management 2
• SIMD
> MMX 47, SSE 62, SSE2 69, SSE3 13, SSSE3 32, SSE4 54
• System: 34
• 64-bit mode: 10
• VT-x Extensions: 12
• Safe mode: 1
• Hardware extensions make the instruction set virtualizable
Making x86 Virtualizable
Intel Virtual Machine Extensions (VMX)
[Figure: CPU state transitions. VMXON and VMXOFF (ring 0) turn VMX operation on and off. The VMM runs in VMX root mode; VMLAUNCH and VMRESUME enter VMX non-root mode; VM exits (“side effects” of guest execution, or an explicit VMCALL) return to the VMM. VMX non-root mode keeps the original ring 0 to ring 3 structure.]
• Legacy software runs in the expected rings, hopefully unaware.
> “there is no software-visible bit…indicates…VMX non-root operation”, Intel 64 manual.
• VMX non-root mode is deprivileged (very configurable).
• Many instructions cause fault-like VM exits:
– interrupts
– I/O events
– page table management
– privileged instructions, etc.
• VMM handles faults
• VM exit rate determines performance
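A back-of-the-envelope model of why the exit rate dominates, as a sketch. The exit rate, per-exit cost, and clock speed below are hypothetical numbers chosen for illustration, not measurements.

```python
def exit_overhead_fraction(exits_per_second, cycles_per_exit, cpu_hz):
    """Fraction of CPU time spent handling VM exits instead of guest code."""
    return exits_per_second * cycles_per_exit / cpu_hz


# Hypothetical workload: 50,000 exits/s at 2,000 cycles each on a 2 GHz core.
overhead = exit_overhead_fraction(50_000, 2_000, 2e9)
print(f"{overhead:.1%} of CPU time goes to VM exit handling")
```

Because the overhead is linear in the exit rate, a workload that exits ten times as often (heavy I/O or frequent page table updates, for example) pays ten times the cost.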
How Complex is Virtualization?
[Figure: source lines of code (60,000 to 55,000,000) versus year (1990 to 2008). Operating systems: Debian Linux, Windows 3.1, Windows 95, Windows NT, Windows 2k, Red Hat Linux. Virtualization systems: Bochs, Kaffe, Xen, Qemu, VirtualBox.]
VMM code counts generated using David A. Wheeler's “SLOCCount” tool. Windows estimate from Bruce Schneier.
VMM Implementation Quality
Should Not be Assumed
• In 2007, Tavis Ormandy subjected 6 virtualization systems to guided random testing of their invalid instruction handling and I/O emulation:
> Bochs, QEMU, VMware, Xen, Anonymous 1, Anonymous 2
• All of the systems failed the tests, most with “arbitrary execution” failures.
• Device emulation was a particular area of vulnerability.
Reference: “An Empirical Study into the Security Exposure to Hosts of Hostile Virtualized Environments,” by Tavis Ormandy. taviso.decsystem.org/virtsec.pdf
Nevertheless…
• Virtualization is now a pervasive technology
• Used in majority of data centers
• VMware on x86 has greatest market share…
– Competitors:
> Microsoft Hyper-V
> Xen (Open Source, Citrix, Oracle OVM)
> Linux KVM
Virtualization Approaches
[Figure: spectrum of approaches from multiple OS's to a single OS: Hard Partitions (hardware assignment into Domains A, B, C, each with its own OS and app), Hypervisors (HW support?), Hosted Virtualization (VM layer), OS Virtualization (apps share one OS). Rows mark which layers are shared in each approach: interconnect; CPU, memory; OS kernel; OS features; applications.]
Software Hypervisors
[Figure: spectrum of approaches (Hard Partitions, Hypervisors with HW support, Hosted Virtualization, OS V12N) and the layers each one shares.]
• Some competing technologies
> Type 1 – alone on the hardware
> VMware ESX, KVM,
> Xen / Citrix / Oracle OVM,
> Microsoft Hyper-V
> Type 2 – on an OS (“Hosted V12N”)
> VirtualBox
> Parallels Workstation
> VMware Fusion (for OS X)
> Microsoft Virtual Server
User Mode Linux Overview
System Call Interception
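• Provides a self-contained environment
> Identical to the hosting Linux kernel
> Processes have no access to host resources that were not explicitly provided
[Figure: the host OS kernel at the bottom; the guest OS kernel (UML) runs above it and uses ptrace to intercept system calls from VM user process 1 and VM user process 2. Guest kernel and user processes together form the virtual machine.]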
Linux KVM
http://www.linux-kvm.org
• Kernel-based Virtual Machine for Linux on x86 hardware containing virtualization extensions (Intel VT or AMD-V)
• Loadable kernel module
• Included in mainline Linux, as of 2.6.20
can run multiple virtual machines running
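Whether a Linux host has the required extensions can be read from the CPU flags: `vmx` marks Intel VT-x and `svm` marks AMD-V. A minimal sketch follows; the function name and the sample text are illustrative, and on a real host the input would be the contents of /proc/cpuinfo.

```python
def hw_virtualization_support(cpuinfo_text):
    """Return which virtualization extension the CPU flags advertise, if any."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, rest = line.partition(":")
            flags.update(rest.split())
    if "vmx" in flags:
        return "Intel VT-x"
    if "svm" in flags:
        return "AMD-V"
    return None


# Shortened sample in /proc/cpuinfo format (illustrative only).
sample = "processor\t: 0\nflags\t\t: fpu vme de pse msr vmx sse sse2\n"
print(hw_virtualization_support(sample))

# On a Linux host (assumption: /proc is mounted):
# with open("/proc/cpuinfo") as f:
#     print(hw_virtualization_support(f.read()))
```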
VMotion-like technology lets you “move” live, running virtual
machines from one host to another while maintaining continuous service availability.
“Live Migration” also available on other V12N platforms
Xen, SPARC T2/LDoms, IBM Power, ...
What are the technical challenges to implementing this?
HW? OS? Applications?
Operating System Virtualization
For example: Solaris Containers
• Single OS instance (“Global Container”)
> Appearance of many OS instances...
> ...but not really
> Minimal performance impact
[Figure: a single Solaris instance spanning the CPUs, memory, and I/O of one server and hosting many zones.]
Impact of VMs on Virtual Memory?
• How is virtual memory virtualized if each guest OS in every VM manages its own set of page tables?
• VMM separates real and physical memory
> Makes real memory a separate, intermediate level between virtual memory and physical memory
> Guest OS maps virtual memory to real memory via its page tables, and VMM page tables map real memory to physical memory
• Rather than pay an extra level of indirection on every memory access, the VMM maintains a shadow page table that maps directly from the guest virtual address space to the physical address space of the HW
• VMM must trap any attempt by guest OS to change its page
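A minimal sketch of the shadow page table idea, assuming the VMM intercepts every guest page table write. Page numbers stand in for addresses, and the class and method names are hypothetical.

```python
class ShadowPager:
    """Keeps a shadow table (guest virtual -> physical) in step with the guest."""

    def __init__(self, real_to_physical):
        self.real_to_physical = dict(real_to_physical)  # VMM's own mapping
        self.guest_table = {}    # guest virtual page -> guest "real" page
        self.shadow = {}         # guest virtual page -> physical page

    def on_guest_page_table_write(self, vpage, rpage):
        """Called when a trapped guest write maps vpage to rpage."""
        self.guest_table[vpage] = rpage
        # Compose the two levels once, at update time.
        self.shadow[vpage] = self.real_to_physical[rpage]

    def translate(self, vpage):
        """Memory accesses use the shadow table: a single lookup."""
        return self.shadow[vpage]


pager = ShadowPager({5: 42, 7: 13})
pager.on_guest_page_table_write(0, 5)
pager.on_guest_page_table_write(1, 7)
print(pager.translate(0), pager.translate(1))
```

The trade-off is visible in the code: translation stays one lookup, but every guest page table update now costs a trap into the VMM.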
V12N Myths
• V12N is easy
> extra layer of training & expertise required
• Applications run the same under V12N
> performance, installation, licensing, support can be different
> applications are written assuming non-virtualized OS services, APIs needed
• V12N requires no planning
> provision/deploy/destroy ease tempts oversimplification, lack of logging
• V12N reduces IT infrastructure complexity
> more, not less, complex; VMs may be hard to locate without rules
• V12N saves money
> HW reduction is real, but other costs can increase (mgt SW, training)
• “Operating systems are dead”
> Hypervisors are OS's...some merging of features & responsibilities may
V12N Myths
• V12N increases availability & reliability
> but HA and proper failover architecture requirements and methods needed
• V12N enhances security
> VM security not yet well understood, investigation harder
• V12N can be used everywhere
> not where performance & scalability are priorities
• Organizations can exploit V12N immediately
> not without planning, deployment & management training
Future of Virtualization
• Although it originated decades ago, it's relatively “new” to the modern, multi-system data center and low- & mid-range UNIX/Linux/MS servers and workstations
> and to a certain extent, new to university CS curricula
• Many new uses...and problems, too
Future of Virtualization
• Desktop virtualization becoming prominent
> growing use of “thin” desktops for security, ease of desktop management
> back to centralized computing model!!!
> why did IT move away from centralized?
• VM management issues
> tools still in development for provisioning, monitoring, patching, securing, “moving”, ...
> “VM sprawl” starting to occur
> debugging problems difficult
> non-deterministic architectures
Future of Virtualization
• Virtualized “appliances”
> preconfigured databases, web servers, app servers, thin client servers, etc.
> encapsulates OS & application/service
• VM standardization
> OVF standard under development
> goal is to enable fully portable VMs and their deployment/management
• High Availability
> solutions & SLAs still needed
Future of Virtualization
• Continuing trend of HW-assisted V12N
> Intel & AMD virtualization accelerators
> see AMD’s Rapid Virtualization Indexing
• Competition & self-serving predictions
> Big 3 on Intel/AMD: Microsoft, VMware, Xen
> “Operating systems are dead” (VMware)
• Architectural design skills needed