
Academic year: 2020


(1)

Case Study: Xen, a VMM based on Paravirtualization

(2)

Performance and security isolation

• The run-time behavior of an application is affected by other applications running concurrently on the same platform and competing for CPU cycles, cache, main memory, disk and network access. Thus, it is difficult to predict the completion time!

• Performance isolation - a critical condition for QoS guarantees in shared computing environments.

• A VMM is a much simpler and better specified system than a traditional operating system. Example - Xen has approximately 60,000 lines of code; Denali has only about half, 30,000.

Xen /ˈzɛn/ is a hypervisor using a microkernel design, providing services that allow multiple computer operating systems to execute on the same computer hardware concurrently.

• The University of Cambridge Computer Laboratory developed the first versions of Xen. The Xen community develops and maintains Xen as free and open-source software, subject to the requirements of the GNU General Public License (GPL), version 2. Xen is currently available for the IA-32, x86-64 and ARM instruction sets.

• The security vulnerability of VMMs is considerably reduced as the systems expose a much smaller number of privileged functions.

(3)

Computer architecture and virtualization

• Conditions for efficient virtualization

– A program running under the VMM should exhibit a behavior essentially identical to that demonstrated when running on an equivalent machine directly.

– The VMM should be in complete control of the virtualized resources.

– A statistically significant fraction of machine instructions must be executed without the intervention of the VMM.

• Two classes of machine instructions:

– Sensitive - require special precautions at execution time:

Control sensitive - instructions that attempt to change either the memory allocation or the privileged mode.

Mode sensitive - instructions whose behavior is different in the privileged mode.

– Innocuous - not sensitive.
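
The classification above can be sketched in a few lines of Python. The classical result of Popek and Goldberg is that a trap-and-emulate VMM can be built when every sensitive instruction is also privileged, that is, traps when executed outside privileged mode. The three-entry table below is only an illustrative sample chosen for this sketch (POPF is the usual x86 example of a sensitive instruction that does not trap); a real analysis would cover the whole instruction set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    name: str
    privileged: bool         # traps when executed outside privileged mode
    control_sensitive: bool  # tries to change memory allocation or the mode
    mode_sensitive: bool     # behaves differently in privileged mode

    @property
    def sensitive(self) -> bool:
        return self.control_sensitive or self.mode_sensitive


def non_virtualizable(instructions):
    """Return the names of sensitive instructions that do not trap."""
    return [i.name for i in instructions if i.sensitive and not i.privileged]


# Illustrative sample only, not a full description of any instruction set.
SAMPLE = [
    Instruction("ADD", privileged=False, control_sensitive=False, mode_sensitive=False),
    Instruction("MOV CR3", privileged=True, control_sensitive=True, mode_sensitive=False),
    Instruction("POPF", privileged=False, control_sensitive=False, mode_sensitive=True),
]

offenders = non_virtualizable(SAMPLE)
print("trap-and-emulate is sufficient:", not offenders)
print("sensitive but not privileged:", offenders)
```

An empty offender list would mean the VMM sees every sensitive instruction as a trap; any entry in the list has to be handled another way, for example by modifying the guest OS.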

(4)

Full virtualization and paravirtualization

Full virtualization – a guest OS can run unchanged under the VMM as if it were running directly on the hardware platform.

– Requires a virtualizable architecture.

– Example: VMware.

Paravirtualization - a guest operating system is modified to use only instructions that can be virtualized. Reasons for paravirtualization:

– Some aspects of the hardware cannot be virtualized.

– Improved performance.

– Present a simpler interface.

Examples: Xen, Denali.

(5)

Full virtualization and paravirtualization


Hardware abstractions are sets of routines in software that emulate some

(6)

History

In the early 2000s, the lack of hardware support for virtualization in the first-generation x86 architectures of AMD (Advanced Micro Devices) and Intel became a disadvantage.

In 2005, Intel released two Pentium 4 models that support VT-x.

In 2006, AMD introduced Pacifica on several Athlon 64 models.

(7)

x86 poses some problems

Certain x86 instructions were impossible to truly ‘virtualize’ in the classical sense.

For example, the ‘smsw’ instruction can be executed at any privilege level and in any processor mode, revealing to software the current hardware status (e.g., the PE flag of CR0).

Intel’s Vanderpool Project endeavored to remedy this (using new processor modes).

(8)

VT-x

Virtualization Technology for x86 CPUs

VT-x has two modes of operation and two operations for transitioning from one to the other (VM entry and VM exit).

Two new processor execution modes:

VMX ‘root’ mode (for VM Managers)

VMX ‘non-root’ mode (for VM Guests)

Ten new hardware instructions

A six-part VMCS data-structure
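
The two modes and the transitions between them can be written as a small state machine. This is a sketch, not a model of the hardware: the event labels `VM_ENTRY` and `VM_EXIT` and the `VMX_OFF` state name are chosen here for illustration, while VMXON, VMXOFF, VMLAUNCH and VMRESUME are the instruction names used in these slides.

```python
from enum import Enum


class Mode(Enum):
    VMX_OFF = "outside VMX operation"
    ROOT = "VMX root (VM manager)"
    NON_ROOT = "VMX non-root (VM guest)"


# Allowed transitions: (current mode, event) -> next mode
TRANSITIONS = {
    (Mode.VMX_OFF, "VMXON"): Mode.ROOT,
    (Mode.ROOT, "VM_ENTRY"): Mode.NON_ROOT,   # caused by VMLAUNCH or VMRESUME
    (Mode.NON_ROOT, "VM_EXIT"): Mode.ROOT,
    (Mode.ROOT, "VMXOFF"): Mode.VMX_OFF,
}


def step(mode, event):
    """Return the mode reached from `mode` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(mode, event)]
    except KeyError:
        raise ValueError(f"{event} is not valid in mode {mode.name}") from None


def run(events, mode=Mode.VMX_OFF):
    for event in events:
        mode = step(mode, event)
    return mode


final = run(["VMXON", "VM_ENTRY", "VM_EXIT", "VM_ENTRY", "VM_EXIT", "VMXOFF"])
print(final.name)
```

The table makes the asymmetry explicit: a guest can only return to the monitor, and only the monitor can start or stop VMX operation.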

(9)

VT-x

Cloud Computing: Theory and

(10)

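Interaction of VMs and VMM

[Figure: the VM Monitor (host) enters VMX operation with VMXON and leaves it with VMXOFF. A VM Entry transfers control from the monitor to VM #1 or VM #2 (guests); a VM Exit returns control to the monitor.]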

(11)

VT-d, a new virtualization architecture

I/O MMU virtualization gives VMs direct access to peripheral devices.

VT-d supports:

DMA address remapping: address translation for device DMA transfers.

Interrupt remapping: isolation of device interrupts and routing of interrupts to VMs.

I/O device assignment: an administrator can assign devices to a VM in any configuration.

Reliability features: DMA and interrupt errors that may otherwise corrupt memory and impact VM isolation are reported and recorded.


(12)

VMCS

Virtual Machine Control Structure

A six-part data-structure (fits in a page-frame)

One VMCS for each VM, one for the Monitor

CPU is told physical address of each VMCS

Software must first “initialize” each VMCS

Then no further direct access to a VMCS

Access is indirect (via VMX instructions)

(13)

Six logical groups

Organization of contents in the VMCS:

The ‘Guest-State’ area

The ‘Host-State’ area

The VM-execution Control fields

The VM-exit Control fields

The VM-entry Control fields

The VM-exit Information fields
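
A toy model may help to show what "indirect access" to a VMCS means: software selects a current VMCS and then reads or writes fields only through VMREAD- and VMWRITE-style operations. Everything below is a simplification made for this sketch. Real hardware addresses fields by numeric encodings rather than by (area, name) pairs, the field name `rip` is only a label, and the sixth area name, `exit_information`, follows Intel's manuals.

```python
class VMCS:
    """Toy VMCS: fields grouped by area, reachable only through the CPU model."""

    AREAS = (
        "guest_state", "host_state", "execution_controls",
        "exit_controls", "entry_controls", "exit_information",
    )

    def __init__(self):
        self._fields = {area: {} for area in self.AREAS}


class ToyCPU:
    def __init__(self):
        self._current = None          # the VMCS selected by vmptrld

    def vmptrld(self, vmcs):
        """Make `vmcs` the current VMCS."""
        self._current = vmcs

    def vmwrite(self, area, field, value):
        self._area(area)[field] = value

    def vmread(self, area, field):
        return self._area(area)[field]

    def _area(self, area):
        if self._current is None:
            raise RuntimeError("no current VMCS; call vmptrld first")
        return self._current._fields[area]


cpu = ToyCPU()
guest = VMCS()
cpu.vmptrld(guest)
cpu.vmwrite("guest_state", "rip", 0x1000)
print(hex(cpu.vmread("guest_state", "rip")))
```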

(14)

The ten VMX instructions

VMXON and VMXOFF

VMPTRLD and VMPTRST

VMCLEAR

VMWRITE and VMREAD

VMLAUNCH and VMRESUME

VMCALL

(15)

Xen Architecture

Virtual machine layer

Hypervisor layer

Hardware/physical layer

Hardware or physical layer:

Physical hardware components including memory, CPU, network cards, and disk drives.

Hypervisor layer:

Thin layer of software that runs on top of the hardware. The Xen hypervisor gives each virtual machine a dedicated view of the hardware.

Virtual machine layer:

(16)

VM Techniques (1) - Full-Virtualization

• Technical aspects:

Full virtualization is a virtualization technique that provides a particular kind of virtual machine environment: a complete simulation of the underlying hardware. It abstracts the underlying physical system entirely and creates a complete virtual system in which the guest operating system can execute.

In such an environment, any software capable of executing on the raw hardware can run in the virtual machine, including, in particular, any operating system (the guest operating system). No modification is required in the guest operating system or application; neither is even aware that it is running within a virtualized environment.

• Typical solutions for full virtualization:

❑ Commercial: VMware ESX, Microsoft Virtual Server, Citrix XenServer.

(17)

Full-Virtualization - Continued

• Advantages:

❑ The operating system does not need to be modified in order to run in a virtualized environment.

❑ A virtual machine can be moved smoothly and easily to a different virtualization system.

Example: converting a VMware guest image into a Xen image

• Disadvantages:

❑ Incurs a performance and resource penalty on VMs.

(18)

VM Techniques (2) - Para-Virtualization

• Technical aspects:

Para-virtualization is a virtualization technique that attempts to provide most services directly from the underlying hardware instead of abstracting it. Para-virtualization allows for near-native performance.

Para-virtualization requires that a guest operating system be modified to support virtualization. This typically means that guest operating systems are limited to open source systems such as Linux.

• Typical solution of Para-Virtualization:

• Commercial: Sun Solaris container.

(19)

Para-Virtualization - Continued

• Advantages:

❑ A para-virtualized guest system comes closer to native performance than a fully virtualized guest.

❑ Para-virtualized guests do not need the latest CPU virtualization support.

• Disadvantages:

❑ Requires that a guest operating system be modified to support virtualization. This typically means that guest operating systems are limited to open source systems such as Linux.

(20)

Xen - a VMM based on paravirtualization

The goal of the Cambridge group - design a VMM capable of scaling to about 100 VMs running standard applications and services without any modifications to the Application Binary Interface (ABI).

Linux, Minix, NetBSD, FreeBSD, NetWare, and OZONE can operate as paravirtualized Xen guest OS running on x86, x86-64, Itanium, and ARM architectures.

Xen domain - ensemble of address spaces hosting a guest OS and applications running under the guest OS. Runs on a virtual CPU.

– Dom0 - dedicated to execution of Xen control functions and privileged instructions.

– DomU - a user domain.

(21)
(22)

Xen: Approach and Overview

Xen: paravirtualization

Exposes some of the underlying hardware to the guest OS

Better performance

Requires modifications to the guest OS

(23)

Xen implementation on x86 architecture

• Xen runs at privilege Level 0, the guest OS at Level 1, and applications at Level 3.

• The IDE interface was originally designed for rotating HDD (Hard Disk Drives) in the PC system. IDE DOM (Disk-On-Module).

• An IDE DOM is a small flash storage module which plugs directly to the IDE connector of the host motherboard.

• The x86 architecture does not support either the tagging of TLB entries or the software management of the TLB. Thus, address space switching, when the VMM activates a different OS, requires a complete TLB flush; this has a negative impact on the performance.

A translation lookaside buffer (TLB) is a cache that memory management hardware uses to improve virtual address translation speed.

• Solution - load Xen in a 64 MB segment at the top of each address space and delegate the management of hardware page tables to the guest OS with minimal intervention from Xen. This region is not accessible or re-mappable by the guest OS.

• Xen schedules individual domains using the Borrowed Virtual Time (BVT) scheduling algorithm.

(24)

Memory Management

Depends on the hardware support

Software-managed TLB

Associate address-space IDs with TLB tags

Allow coexistence of OSes

Avoid TLB flushing across OS boundaries

x86 does not have a software-managed TLB

Xen exists at the top 64MB of every address space

Avoids TLB flushing when a guest OS enters or exits Xen

Each OS can only map to memory it owns

Writes are validated by Xen

(25)

CPU

x86 supports 4 levels of privilege

0 for the OS, and 3 for applications

Xen downgrades the privilege of guest OSes

System-call and page-fault handlers are registered with Xen

“fast handlers” for

(26)

Device I/O

Xen exposes a set of simple device abstractions

The Cost of Porting an OS to Xen

Privileged instructions

Page table access

Network driver

Block device driver

(27)

Control Management

Separation of policy and mechanism

Domain0 hosts the application-level

management software

Creation and deletion

(28)

Control Transfer: Hypercalls and Events

Hypercall: synchronous calls from a domain to Xen

Analogous to system calls

Events: asynchronous notifications from Xen to domains
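
The difference between the two control-transfer mechanisms can be illustrated with a minimal sketch. The classes and the `get_domain_id` call are hypothetical names invented for this example and are not part of the Xen interface; the point is only that a hypercall returns its result to the caller immediately, while an event is merely marked pending and handled later by the domain.

```python
from collections import deque


class ToyDomain:
    def __init__(self, domid):
        self.domid = domid
        self.pending = deque()   # events delivered but not yet handled
        self.handled = []

    def run_event_handlers(self):
        """Called when the domain next gets to process its pending events."""
        while self.pending:
            self.handled.append(self.pending.popleft())


class ToyXen:
    def __init__(self):
        # Hypothetical hypercall table: name -> function(domain, *args)
        self._hypercalls = {"get_domain_id": lambda dom: dom.domid}

    def hypercall(self, dom, name, *args):
        """Synchronous: the caller receives the result before it continues."""
        return self._hypercalls[name](dom, *args)

    def send_event(self, dom, event):
        """Asynchronous: only queue the notification for the domain."""
        dom.pending.append(event)


xen = ToyXen()
dom = ToyDomain(7)
result = xen.hypercall(dom, "get_domain_id")   # returns at once
xen.send_event(dom, "timer")
handled_before = list(dom.handled)             # still empty: nothing ran yet
dom.run_event_handlers()
print(result, handled_before, dom.handled)
```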

(29)

Data Transfer: I/O Rings

(30)

CPU Scheduling

Borrowed virtual time scheduling

Allows temporary violations of fair sharing to favor recently-woken domains
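
A minimal sketch of the idea behind borrowed virtual time, under simplifying assumptions: each domain accumulates virtual time in inverse proportion to its weight, the scheduler picks the domain with the smallest effective virtual time, and a recently woken domain may subtract a "warp" value to jump ahead temporarily. The numbers and domain names are made up for illustration, and limits on how long a domain may warp are left out.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    name: str
    weight: float           # larger weight -> larger share of the CPU
    avt: float = 0.0        # actual virtual time
    warp: float = 0.0       # virtual time the domain may borrow
    warping: bool = False   # set when the domain has just woken up

    @property
    def evt(self) -> float:
        """Effective virtual time used for the scheduling decision."""
        return self.avt - (self.warp if self.warping else 0.0)


def pick_next(domains):
    return min(domains, key=lambda d: d.evt)


def run_slice(domain, ticks):
    # Virtual time advances more slowly for domains with a larger weight.
    domain.avt += ticks / domain.weight


batch = Domain("batch", weight=1.0)
web = Domain("web", weight=1.0, warp=50.0)

for _ in range(3):
    run_slice(batch, 10)             # batch.avt becomes 30
web.avt = batch.avt + 20             # made-up state: web is ahead of its fair share

first = pick_next([batch, web]).name     # fair sharing favours batch
web.warping = True                       # web wakes up and borrows virtual time
second = pick_next([batch, web]).name    # warp lets web run first
print(first, second)
```

The second decision is the "temporary violation of fair sharing": the woken domain runs ahead of a domain that has consumed less virtual time.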

(31)

Time and Timers

Xen provides each guest OS with

Real time (since machine boot)

Virtual time (time spent for execution)

Wall-clock time

Each guest OS can program a pair of alarm timers

Real time

(32)

Virtual Address Translation

No shadow page tables (as used by VMware)

Xen provides constrained but direct MMU updates

All guest OSes have read-only access to page tables
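
The "constrained but direct" update path can be sketched as follows: the guest asks the hypervisor to change a page-table entry, and the hypervisor applies the change only if the guest owns the target frame. The class, method names and frame numbers are hypothetical and chosen for this example; they do not reproduce the Xen hypercall interface.

```python
class ToyHypervisor:
    def __init__(self, frame_owner):
        self.frame_owner = dict(frame_owner)   # machine frame number -> domain id
        self.page_tables = {}                  # domain id -> {virtual page: frame}

    def mmu_update(self, domid, vpage, frame):
        """Apply a page-table update requested by a guest, after validation."""
        if self.frame_owner.get(frame) != domid:
            raise PermissionError(f"domain {domid} does not own frame {frame}")
        self.page_tables.setdefault(domid, {})[vpage] = frame

    def read_page_table(self, domid):
        # Guests can read their page tables but not write them: return a copy.
        return dict(self.page_tables.get(domid, {}))


xen = ToyHypervisor({100: 1, 101: 1, 200: 2})
xen.mmu_update(1, 0x10, 100)          # accepted: domain 1 owns frame 100

try:
    xen.mmu_update(1, 0x11, 200)      # rejected: frame 200 belongs to domain 2
    rejected = False
except PermissionError:
    rejected = True

print(xen.read_page_table(1), rejected)
```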

(33)

Physical Memory

Reserved at domain creation time

(34)

Network

Virtual firewall-router attached to all domains

Round-robin packet scheduler

To send a packet, enqueue a buffer descriptor into the transmit ring

Use scatter-gather DMA (no packet copying)

A domain needs to exchange page frames to avoid copying

(35)

Disk

Only Domain0 has direct access to disks

Other domains need to use virtual block devices

Use the I/O ring

Reorder requests prior to enqueuing them on the ring

If permitted, Xen will also reorder requests to improve performance

(36)

Dom0 components

XenStore – a Dom0 process.

– Supports a system-wide registry and naming service.

– Implemented as a hierarchical key-value storage.

– A watch function informs listeners of changes of the key in storage they have subscribed to.

– Communicates with guest VMs via shared memory using Dom0 privileges.

Toolstack - responsible for creating, destroying, and managing the resources and privileges of VMs.

– To create a new VM, a user provides a configuration file describing memory and CPU allocations and device configurations.

– Toolstack parses this file and writes this information in XenStore.
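
A hierarchical key-value store with watches, as described for XenStore above, can be sketched in a few lines. This is a toy model: the class, the paths under `/vm/...` and the configuration keys are hypothetical and do not reproduce the actual XenStore layout or API.

```python
class ToyStore:
    """Hierarchical key-value store with watches on path prefixes."""

    def __init__(self):
        self._data = {}
        self._watches = []   # list of (path prefix, callback)

    def watch(self, prefix, callback):
        self._watches.append((prefix.rstrip("/"), callback))

    def write(self, path, value):
        self._data[path] = value
        # Inform every listener subscribed to this key or one of its ancestors.
        for prefix, callback in self._watches:
            if path == prefix or path.startswith(prefix + "/"):
                callback(path, value)

    def read(self, path):
        return self._data[path]

    def ls(self, path):
        """List the immediate children of a path."""
        base = path.rstrip("/") + "/"
        return sorted({key[len(base):].split("/")[0]
                       for key in self._data if key.startswith(base)})


store = ToyStore()
seen = []
store.watch("/vm/demo", lambda path, value: seen.append((path, value)))

# A toolstack-like step: write a parsed (made-up) configuration into the store.
config = {"memory": "512", "vcpus": "2"}
for key, value in config.items():
    store.write(f"/vm/demo/{key}", value)
store.write("/vm/other/memory", "256")   # outside the watched subtree

print(seen)
print(store.ls("/vm"), store.ls("/vm/demo"))
```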

(37)
(38)

Xen abstractions for networking and I/O

Each domain has one or more Virtual Network Interfaces (VIFs) which support the functionality of a network interface card. A VIF is attached to a Virtual Firewall-Router (VFR).

Split drivers have a front-end in the DomU and the back-end in Dom0; the two communicate via a ring in shared memory.

Ring - a circular queue of descriptors allocated by a domain and accessible within Xen. Descriptors do not contain data; the data buffers are allocated out-of-band by the guest OS.
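
The ring described above can be sketched as a fixed-size circular queue with a producer index and a consumer index. The sketch is one-directional and simplified: descriptor fields such as `buffer_id` are names invented here, the buffers live in an ordinary dictionary standing in for memory allocated by the guest, and the shared-memory and notification machinery is left out.

```python
class Ring:
    """Fixed-size circular queue of descriptors with free-running indices."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.prod = 0   # advanced by the producer
        self.cons = 0   # advanced by the consumer

    def put(self, descriptor):
        if self.prod - self.cons == self.size:
            return False                       # ring is full
        self.slots[self.prod % self.size] = descriptor
        self.prod += 1
        return True

    def get(self):
        if self.cons == self.prod:
            return None                        # ring is empty
        descriptor = self.slots[self.cons % self.size]
        self.cons += 1
        return descriptor


# Data buffers are kept outside the ring; descriptors only refer to them.
buffers = {0: b"first packet", 1: b"second packet"}

ring = Ring(size=2)
accepted = [ring.put({"buffer_id": i, "length": len(buffers[i])}) for i in (0, 1)]
overflow = ring.put({"buffer_id": 0, "length": 0})     # refused: ring is full
received = [buffers[ring.get()["buffer_id"]] for _ in range(2)]
print(accepted, overflow, received)
```

Keeping the data out of the ring means the queue entries stay small and fixed in size, whatever the packet length.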

Two rings of buffer descriptors, one for packet sending and one for packet receiving, are supported.

To transmit a packet:

– a guest OS enqueues a buffer descriptor to the send ring,

– then Xen copies the descriptor and checks safety,

– copies only the packet header, not the payload, and

(39)
(40)

Xen 2.0

• Optimization of:

Virtual interface - takes advantage of the capabilities of some physical NICs, such as checksum offload.

I/O channel - rather than copying a data buffer holding a packet, each packet is allocated in a new page and then the physical page containing the packet is re-mapped into the target domain.

Virtual memory - takes advantage of the superpage and global page mapping hardware on Pentium and Pentium Pro processors. A superpage entry covers 1,024 pages of physical memory and the address translation mechanism maps a set of contiguous pages to a set of contiguous physical pages. This helps reduce the number of TLB misses.
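
The arithmetic behind the superpage claim is easy to check. The sketch assumes the standard 4 KiB base page of 32-bit x86, and uses a 64 MB region only as an example size; the helper name is made up for this illustration.

```python
PAGE_SIZE = 4 * 1024           # assumed 4 KiB base page on 32-bit x86
PAGES_PER_SUPERPAGE = 1024     # figure quoted in the text

superpage_size = PAGE_SIZE * PAGES_PER_SUPERPAGE


def tlb_entries_needed(region_bytes, page_bytes):
    """Number of TLB entries needed to map a region with pages of one size."""
    return -(-region_bytes // page_bytes)   # ceiling division


region = 64 * 1024 * 1024      # example region of 64 MB
small = tlb_entries_needed(region, PAGE_SIZE)
large = tlb_entries_needed(region, superpage_size)

print(superpage_size // (1024 * 1024), "MiB per superpage")
print(small, "entries with 4 KiB pages,", large, "with superpages")
```

Fewer entries for the same region means more of it stays covered by the TLB at once, which is why superpages reduce TLB misses.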


(41)
(42)
(43)

Performance comparison of virtual machines

Compare the performance of Xen and OpenVZ with that of a standard operating system, plain vanilla Linux.

The questions examined are:

– How does the performance scale with the load?

– What is the impact of a mix of applications?

– What are the implications of the load assignment on individual servers?

The main conclusions:

– The virtualization overhead of Xen is considerably higher than that of OpenVZ; this is due primarily to L2-cache misses.

– The performance degradation when the workload increases is also noticeable for Xen.

(44)

The setup for the performance comparison of a native Linux system with OpenVZ, and the Xen systems. The applications are a web server and a MySQL database server. (a) The first

(45)

The darker side of virtualization

In a layered structure, a defense mechanism at some layer can be disabled by malware running at a layer below it.

It is feasible to insert a rogue VMM, a Virtual-Machine Based Rootkit (VMBR), between the physical hardware and an operating system.

Rootkit - malware with privileged access to a system.

The VMBR can enable a separate malicious OS to run surreptitiously and make this malicious OS invisible to the guest OS and to the applications running under it.

Under the protection of the VMBR, the malicious OS could:

– observe the data, the events, or the state of the target system.

– run services, such as spam relays or distributed denial-of-service attacks.

(46)
(47)

References
