Hardware virtualization technology and its security

Academic year: 2021


Dr. Qingni Shen

Peking University

(2)

Main Points

VMM technology

Intel VT technology

(3)

Virtual Machine Monitors (VMMs)

A VMM is a software layer

Allows many virtual machines to share the hardware

Allows unmodified software to run directly

...

[Figure: virtual machines VM0, VM1, ..., VMn, each running a guest OS and its applications, on top of the Virtual Machine Monitor (VMM), which runs on the platform hardware (processor/chipset, memory, I/O devices)]

(4)

Purpose of Virtualization

Workload Isolation

Workload Consolidation

Workload Migration

Workload Embedding

[Figure: four panels, one per purpose, showing how applications and OSes are placed on one or more VMMs and hardware platforms]

(5)

Virtualization Usage Models

Legacy software support

Test

The active partition

Manageable

Server consolidation

Failure recovery architecture

High elastic data center

Manageable

[Figure: the usage models are grouped into CLIENT and SERVER rows, and each is tagged with the capabilities it relies on: isolation, consolidation, migration, embedding]

(6)

What is Intel VT technology

Formerly known by the codenames Vanderpool* & Silvervale*

• VT is a collection of hardware enhancements

• VT is designed to simplify virtualization software

• VT brings new value and new opportunities

• VT-x and VT-i are the first VT technologies implemented in Intel processors and chipsets

• VT-x is the CPU virtualization enhancement for IA-32

(7)

Main components of Intel-VT

Intel VT, designed by Intel Corporation, is a hardware-assisted virtualization solution. It includes:

VT-x/VT-i for the CPU

VT-d for the chipset

(8)

Core function of VT-x/VT-i

Intel VT FlexPriority (flexible priority)

Intel VT FlexMigration (flexible migration)

Intel VT extended page tables

(9)

Intel VT FlexPriority

While executing a task, the processor receives requests, or interrupts, from other devices or applications that need attention. To minimize the performance impact, a special register in the processor, the task priority register (TPR), tracks the priority of the current task, so that only interrupts with a higher priority than the running task are serviced immediately. Intel FlexPriority creates a virtual copy of the TPR that the guest OS can read, and in some cases modify, without VMM intervention. This yields a significant performance improvement for 32-bit OSes that access the TPR frequently. (For instance, application performance on Windows Server* 2000 improves by 35%.)
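The priority check described above can be sketched in a few lines. This is only an illustration, assuming the local APIC convention that bits 7:4 of both the interrupt vector and the TPR value form the priority class; the function names are hypothetical and the sketch ignores the priority of interrupts already in service.

```python
def priority_class(value: int) -> int:
    """Return bits 7:4 of an 8-bit interrupt vector or TPR value."""
    return (value >> 4) & 0xF


def is_serviced_now(vector: int, tpr: int) -> bool:
    """An interrupt is serviced only if its class is above the TPR class."""
    return priority_class(vector) > priority_class(tpr)


if __name__ == "__main__":
    tpr = 0x40  # the running task masks priority classes 0 through 4
    for vector in (0x31, 0x45, 0x51):
        state = "serviced" if is_serviced_now(vector, tpr) else "held pending"
        print(f"vector {vector:#04x}: {state}")
```

A guest that raises and lowers its TPR often would otherwise trap to the VMM on every such access, which is the cost the virtual TPR copy avoids.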

(10)

Intel VT FlexMigration

An important advantage of virtualization is that running applications can be migrated between physical machines without downtime. Intel VT FlexMigration aims to enable seamless migration between current and future servers based on Intel processors, even if the newer systems include an enhanced instruction set. With this technology, the management software can establish a consistent instruction set across all servers in a migration pool, so that workloads migrate seamlessly. The result is a more flexible, unified server resource pool that works across hardware generations.
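One way to picture the "consistent instruction set" for a migration pool is as the intersection of the feature sets of all hosts in the pool. The sketch below is a simplified model under that assumption; the host names and feature labels are made up for illustration and this is not how any particular management product implements it.

```python
def common_feature_set(pool: dict[str, set[str]]) -> set[str]:
    """Return the CPU features that every host in the pool supports."""
    if not pool:
        return set()
    return set.intersection(*pool.values())


if __name__ == "__main__":
    # Hypothetical hosts from two hardware generations.
    pool = {
        "host-old": {"sse2", "sse3"},
        "host-new": {"sse2", "sse3", "sse4.1"},
    }
    # Guests are shown only the common features, so they can run on any host.
    print(sorted(common_feature_set(pool)))
```

A guest that never sees `sse4.1` cannot come to depend on it, so moving it back to the older host stays safe.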

(11)

Challenges in developing a VMM

The OS and applications should not know that they are sharing CPU resources with others

The VMM should be able to protect itself from threats posed by guest software

The VMM should be able to keep the software stacks in different VMs isolated from each other

The VMM should be able to present a virtual hardware platform interface to guest software

[Figure: VM0 and VM1, each with a guest OS and its applications, running on the VM monitor above the platform hardware]

(12)

CPU virtualization on the current IA architecture requires complex software design.

Software solution: guest deprivileging

• Run the guest OS above ring 0, so that sensitive instructions fault

• Run the VMM in ring 0 to handle the faults raised during guest OS operation

Virtualization holes in the IA architecture:

• Ring aliasing • Non-trapping instructions • Excessive faulting

• Interrupt virtualization • Context switching of CPU state • Address space compression

Complex software techniques: • Source code modification • Binary code modification

[Figure: VM0 and VM1, each with a guest OS and its applications, running on the VM monitor above the platform hardware]

(13)

Intel® Virtualization Technology

VT closes the virtualization holes and removes the need for complex software techniques

The VMM is able to take control before guest software executes privileged instructions

Guest software runs in a new mode with reduced privilege:

• Applications still run in ring 3 • The guest OS runs in ring 0, but deprivileged

• The VMM runs in a new mode with full privilege

[Figure: VM0 and VM1, each with a guest OS and its applications, running on the VM monitor above the platform hardware]

(14)

An overview of VT-x

Operation modes

Guest OS / VMM transitions

Virtual-machine control structure (VMCS)

Causes of VM exits

(15)

Operation mode

VMX root mode:

• Has full privileges, for running the VMM

VMX non-root mode:

• Has a subset of privileges, for running guest software

• Reduces guest software privileges without relying on the ring level

(16)

VMX operation mode

Root operation mode

The VMM runs in root operation mode

Non-root operation mode

Guest software runs in non-root operation mode

(17)

VM Entry and VM Exit

VM Entry

• From the VMM into the guest

• Loads the VM state from the VMCS and enters non-root mode

• VMLAUNCH performs the initial entry

• VMRESUME re-enters a virtual machine that has already been launched

VM Exit

• From the guest into the VMM

• Enters VMX root mode

• Saves the guest state into the VMCS

• Loads the VMM state from the VMCS

[Figure: VM0 and VM1, each with a guest OS and its applications, above the VM monitor and the physical host hardware; VM entry arrows point up into the VMs and VM exit arrows point down into the VM monitor]
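The entry and exit rules above can be modeled as a small state machine. The sketch below is a toy model, not a driver for real hardware: the class and method names are hypothetical, and it captures only the mode switch and the rule that VMLAUNCH is for the first entry while VMRESUME is for re-entry.

```python
class VmxError(Exception):
    """Raised when a modeled VMX instruction is used in the wrong state."""


class Vmcs:
    def __init__(self, name):
        self.name = name
        self.launched = False    # set by the first successful VMLAUNCH
        self.exit_reason = None  # filled in on every VM exit


class LogicalProcessor:
    def __init__(self):
        self.mode = "root"       # the VMM runs in root mode
        self.current_vmcs = None

    def vmptrld(self, vmcs):
        self._require_root()
        self.current_vmcs = vmcs

    def vmlaunch(self):
        self._enter(expect_launched=False)
        self.current_vmcs.launched = True

    def vmresume(self):
        self._enter(expect_launched=True)

    def vm_exit(self, reason):
        if self.mode != "non-root":
            raise VmxError("a VM exit can only occur in non-root mode")
        self.current_vmcs.exit_reason = reason
        self.mode = "root"       # control returns to the VMM

    def _require_root(self):
        if self.mode != "root":
            raise VmxError("instruction is only valid in root mode")

    def _enter(self, expect_launched):
        self._require_root()
        if self.current_vmcs is None:
            raise VmxError("no current VMCS")
        if self.current_vmcs.launched != expect_launched:
            raise VmxError("wrong entry instruction for this VMCS")
        self.mode = "non-root"   # the guest starts running


if __name__ == "__main__":
    cpu, vm1 = LogicalProcessor(), Vmcs("VM 1")
    cpu.vmptrld(vm1)
    cpu.vmlaunch()
    cpu.vm_exit("CPUID")
    cpu.vmresume()
    print(cpu.mode, vm1.exit_reason)
```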

(18)

VT-x Operation

[Figure: IA-32 operation, with ring 0 and ring 3]

(19)

[Figure: the same ring 0 and ring 3, now labeled VMX root operation]

(20)

[Figure: VMLAUNCH enters VM 1, which has its own ring 0 and ring 3 in VMX non-root operation]

(21)

[Figure: a VM exit returns from VM 1 to VMX root operation]

(22)

[Figure: VMRESUME re-enters VM 1 from VMX root operation]

(23)

[Figure: VMLAUNCH starts further guests, VM 2 through VM n, each with its own ring 0 and ring 3 in VMX non-root operation]

(24)

[Figure: each guest has its own control structure: VMCS1 for VM 1, VMCS2 for VM 2, ..., VMCSn for VM n]

(25)

Virtual Machine Control Structure (VMCS)

A VMCS is a control structure stored in memory

Only one VMCS is active at a time

VMCS contents:

VM execution, exit, and entry controls

Guest and host state

VM exit information fields

The VMCS format currently has no uniform standard, so different designs may define it differently

VMPTRLD: loads the pointer to the VMCS

(26)

Virtual machine control structure (VMCS)

For VMX operation, Intel defines the VMCS. This structure can only be manipulated with the VMCLEAR, VMPTRLD, VMREAD, and VMWRITE instructions.

• a) Guest-state area: the processor state used when the processor switches from root mode to non-root mode;

• b) Host-state area: the processor state used when the processor switches from non-root mode to root mode;

• c) VM-execution control fields: determine which events force the processor to exit from non-root operation to root operation while a VM is running;

• d) VM-exit control fields: control what happens when a VM exits from non-root operation;

• e) VM-entry control fields: control what happens when a VM enters non-root operation;

• f) VM-exit information fields: record the reason when a VM exits from non-root operation to root operation.

(27)

Reasons for VM exits

Exits on paging state, so that the VMM can manage the page tables

Accesses to CR3, the INVLPG instruction (TLB invalidation)

Page faults

CR0/CR4 accesses

State that needs virtualization

CPUID, RDMSR, WRMSR, RDPMC, RDTSC, MOV DRx

Exceptions and I/O accesses

32-entry exception bitmap, I/O-port access bitmap

Control of asynchronous events

When guest interrupts are blocked, the VMM should handle the situation

Detect guest state in order to facilitate VM scheduling
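The two bitmaps mentioned above can be read as simple bit tests. The following sketch assumes one bit per exception vector in a 32-bit exception bitmap and one bit per I/O port in a byte array covering all 65536 ports; the function names are hypothetical, and refinements such as the extra page-fault filtering controls are left out.

```python
def exception_causes_exit(vector: int, exception_bitmap: int) -> bool:
    """Bit n of the 32-bit bitmap selects a VM exit for exception vector n."""
    if not 0 <= vector < 32:
        raise ValueError("exception vectors are 0..31")
    return bool((exception_bitmap >> vector) & 1)


def io_access_causes_exit(port: int, io_bitmap: bytes) -> bool:
    """One bit per I/O port; a set bit means the access exits to the VMM."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("I/O ports are 0..0xFFFF")
    return bool(io_bitmap[port >> 3] & (1 << (port & 7)))


if __name__ == "__main__":
    PAGE_FAULT = 14
    bitmap = 1 << PAGE_FAULT            # intercept page faults only
    print(exception_causes_exit(PAGE_FAULT, bitmap))

    io_bitmap = bytearray(65536 // 8)   # all ports pass through by default
    io_bitmap[0x3F8 >> 3] |= 1 << (0x3F8 & 7)   # intercept port 0x3F8
    print(io_access_causes_exit(0x3F8, bytes(io_bitmap)))
```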

(28)

Benefits: VT helps improve VMMs

VT reduces dependence on the guest OS

No need for binary patching or translation

Provides support for legacy systems

VT improves robustness

No need for complex software techniques

Simpler VMM

Smaller Trusted Computing Base (TCB)

VT improves performance

(29)

Device Virtualization (VT-d)

For servers, I/O is an important component. Greater CPU computing power speeds up data processing only if the data reaches the CPU smoothly. As a result, whether for storage, networking, graphics cards, or memory, I/O capability is a critical part of enterprise-level architecture.

Without VT-d, the VMM must take part directly in every interaction with I/O, which not only slows down data transfer but also increases the processor's workload because of the frequent VMM activity. VT-d gives the guest OS a mechanism for direct access to real hardware, which greatly reduces the workload of the server processor.

(30)

Current way of virtualization

I/O device emulation: the VMM emulates an I/O device for the guest, fully reproducing the device's functionality so that the guest can use the corresponding real driver. This approach provides excellent compatibility (regardless of whether the device physically exists), but emulation noticeably hurts performance.

Additional software interface: this model resembles I/O emulation, but the VMM software provides a set of direct device interfaces to the VM in order to make virtualization more efficient. It is a bit like DirectX on Windows: it offers better performance than I/O emulation, but reduces compatibility.

(31)
(32)
(33)

Design of VT-d

The key to I/O virtualization is handling DMA and interrupt requests (IRQs).

Intel VT-d is a hardware-assisted virtualization technology based in the north bridge. The DMA virtualization hardware and IRQ virtualization hardware built into the north bridge greatly enhance the reliability, flexibility, and performance of I/O.

Traditional IOMMUs (I/O memory management units) distinguish devices by memory address range. This is easy to build, but it makes DMA isolation hard to implement. VT-d therefore updates the IOMMU architecture to support multiple DMA protection domains, which is what finally achieves DMA virtualization. This is also called DMA remapping.

(34)

I/O devices generate many interrupt requests, so I/O virtualization must separate these requests correctly and route them to the right virtual machines. Traditional devices raise interrupts in two ways: through the I/O interrupt controller, or through MSIs (message signaled interrupts), which are issued directly as DMA write requests. Because the target memory address has to be embedded in the DMA request, this architecture requires full access to all memory addresses and cannot provide interrupt isolation.

VT-d's interrupt-remapping architecture solves this problem by redefining the MSI format. The new MSI is still a DMA write request, but instead of embedding the target memory address it carries a message ID. By maintaining a table structure, the hardware can tell the VM domains apart by their message IDs. The interrupt-remapping architecture implemented by VT-d supports all I/O interrupt sources, including IOAPICs, and all types of interrupt, such as ordinary MSI and extended MSI-X.
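The table lookup behind interrupt remapping might be modeled as follows. This is a simplified sketch: the entry fields, the check of the requesting device, and all names are illustrative assumptions rather than the actual layout of the hardware table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemapEntry:
    domain: str          # VM domain that owns this interrupt
    vector: int          # vector delivered to the target CPU
    dest_cpu: int        # destination processor
    allowed_source: str  # device permitted to use this entry


def remap_interrupt(table, message_id, source):
    """Look up a message ID; return the entry, or None if it is blocked."""
    entry = table.get(message_id)
    if entry is None or entry.allowed_source != source:
        return None  # unknown ID or wrong device: the interrupt is dropped
    return entry


if __name__ == "__main__":
    # Hypothetical table set up by the VMM.
    table = {7: RemapEntry("VM1", vector=0x41, dest_cpu=2, allowed_source="nic0")}
    print(remap_interrupt(table, 7, "nic0"))   # delivered to VM1's vector
    print(remap_interrupt(table, 7, "disk0"))  # blocked: entry is not for disk0
```

Because the device supplies only an index, the vector and destination it can reach are whatever the VMM wrote into the table, not whatever the device chooses to encode.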

(35)

DMA Remapping

DMA remapping provides hardware isolation for device accesses to memory. Every device is assigned to a specific domain, and each domain has its own I/O page tables. When a device attempts to access system memory, the DMA-remapping hardware intercepts the access, decides whether to allow it, and determines the real address at the same time. I/O page table entries that are used frequently are cached. The DMA-remapping mechanism can be configured independently for every device.
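The intercept, check, and translate steps can be sketched with a dictionary standing in for each domain's I/O page table. The page size, class names, and permission strings below are assumptions made for the example; real hardware uses multi-level tables and a translation cache.

```python
PAGE_SIZE = 4096


class DmaFault(Exception):
    """Raised when a DMA access is not permitted."""


class Domain:
    def __init__(self, name):
        self.name = name
        self.io_page_table = {}  # I/O page number -> (host page number, perms)

    def map_page(self, io_page, host_page, perms):
        self.io_page_table[io_page] = (host_page, perms)


class DmaRemapper:
    def __init__(self):
        self.device_domain = {}  # device ID -> Domain

    def assign(self, device, domain):
        self.device_domain[device] = domain

    def translate(self, device, io_addr, access):
        """Check a DMA access ('r' or 'w') and return the host address."""
        domain = self.device_domain.get(device)
        if domain is None:
            raise DmaFault(f"{device} is not assigned to any domain")
        entry = domain.io_page_table.get(io_addr // PAGE_SIZE)
        if entry is None or access not in entry[1]:
            raise DmaFault(f"{device}: {access} access to {io_addr:#x} denied")
        host_page, _ = entry
        return host_page * PAGE_SIZE + io_addr % PAGE_SIZE


if __name__ == "__main__":
    vm1 = Domain("VM1")
    vm1.map_page(io_page=0x10, host_page=0x200, perms="rw")
    iommu = DmaRemapper()
    iommu.assign("nic0", vm1)
    print(hex(iommu.translate("nic0", 0x10 * PAGE_SIZE + 0x34, "w")))
```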

(36)

Interrupt Remapping

Interrupt remapping provides the functions of remapping and routing the interrupt requests from I/O devices.

(37)

New design of IOMMU

The IOMMU manages device access to system memory. It sits between the peripheral devices and the host, translates the addresses in device requests into system memory addresses, and checks the appropriate permissions on each access.

With an IOMMU, every device can be assigned to a protection domain, which defines the I/O page translations used by every device in the domain and specifies the access permissions of every I/O page. For virtualization, the VMM can place all the devices assigned to a specific guest OS in the same protection domain, which creates a set of address translations and access restrictions for the devices used by that guest OS.

(38)

Two kinds of new device virtualization based on VT-d

Direct assignment of I/O devices: a physical I/O device is assigned directly to a VM. In this model, the drivers inside the VM communicate directly with the hardware device, with little or no involvement from the VMM. For the sake of system robustness, hardware virtualization support is needed to isolate and protect the hardware resources so that only the designated VM can use them. The hardware also needs to provide multiple I/O container partitions so that it can serve multiple VMs simultaneously.

This model almost completely eliminates the need to run drivers in the VMM.

The CPU is an example: although it is not an I/O device in the usual sense, it is allocated to VMs in this way, while the CPU resources remain under the management of the VMM.

Shared I/O devices: this model extends the direct assignment model. It places high demands on the device, which must expose multiple function interfaces, each of which can be assigned to a VM independently. This model provides very high virtualization performance.

(39)
(40)

Network Virtualization (VT-c)

Intel VT-c further optimizes the network for virtualization. In essence, this set of technologies works like a post office: it sorts all the received letters, packages, and envelopes, and delivers them to their respective destinations. By implementing these functions in dedicated network chips, Intel VT-c significantly increases delivery speed and reduces the workload of the VMM and the server processor. VT-c includes:

Virtual Machine Device Queue (VMDq)

(41)

VMDq

In a traditional server virtualization environment, the VMM must classify every individual data packet and deliver it to its assigned VM, which consumes many processor cycles. With VMDq, this function is performed by dedicated hardware in the Intel server network card, and the VMM is only responsible for delivering the presorted groups of packets to the appropriate guest OS. This reduces I/O latency and frees more processor cycles for business applications. Intel VT-c can more than double I/O throughput, so that virtualized applications can approach the throughput of the native host. Each server can consolidate more applications with fewer I/O bottlenecks.

(42)

Network virtualization model

Currently, all VM software with network capabilities has built-in virtual switches, and most of it also provides router functions on that basis. The aim is to connect multiple virtual machines into one or more networks, just as a real switch or router would.

(43)
(44)

Structure of VMDq

VMDq provides a classification/sorting engine that operates at layer 2 of the ISO OSI seven-layer model and implements part of the functionality of a switch. To deliver adequate performance it must use multiple buffer queues, so a network card that supports VMDq also supports the RSS receive-side extension.

The layer 2 classification/sorting engine is implemented in hardware on the network card that supports VMDq. It uses the MAC address or VLAN to send each packet to the queue of the designated VM (this queue is called a pool). The VMM software that performs the virtual switch task then only needs to do a simple data copy at the end, which greatly improves the efficiency of the virtual network.

A network card that supports VMDq queues usually supports RSS queues too. For example, the Intel 82576EB network card supports 8 VM queues and 16 RSS queues. These are essentially 16 send/receive queue pairs, which means every VM can be assigned two pairs.
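The layer 2 sorting step can be illustrated in software, even though on a VMDq card it happens in hardware. In the sketch below, the packet fields, the pool names, and the fallback pool for unmatched traffic are all assumptions made for the example; the last lines restate the queue arithmetic from the 82576EB example.

```python
from collections import defaultdict


def classify(packets, pool_of, default_pool="vmm"):
    """Sort packets into per-VM pools keyed by (destination MAC, VLAN)."""
    pools = defaultdict(list)
    for packet in packets:
        key = (packet["dst_mac"], packet.get("vlan"))
        pools[pool_of.get(key, default_pool)].append(packet)
    return dict(pools)


if __name__ == "__main__":
    # Hypothetical filter table programmed by the VMM.
    pool_of = {
        ("52:54:00:00:00:01", 10): "vm1",
        ("52:54:00:00:00:02", 20): "vm2",
    }
    packets = [
        {"dst_mac": "52:54:00:00:00:01", "vlan": 10, "payload": b"a"},
        {"dst_mac": "52:54:00:00:00:02", "vlan": 20, "payload": b"b"},
        {"dst_mac": "ff:ff:ff:ff:ff:ff", "vlan": 10, "payload": b"c"},
    ]
    for pool, queued in classify(packets, pool_of).items():
        print(pool, len(queued))

    QUEUE_PAIRS, VM_POOLS = 16, 8
    print(QUEUE_PAIRS // VM_POOLS, "queue pairs per VM")
```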

(45)

Diagram of VMDq Acceleration Structure

(46)

Virtual Machine Direct Connection (VMDc)

With the aid of the PCI-SIG single root I/O virtualization (SR-IOV) standard, VM direct connection (VMDc) gives VMs direct access to the network I/O hardware and thus improves performance significantly.

As mentioned before, Intel VT-d supports a direct communication channel between a guest OS and an I/O port. SR-IOV extends this by supporting multiple communication channels per I/O port. For example, a single Intel 10 Gigabit server network card can give each of 10 guest OSes a protected, private 1 Gb/s link. These links bypass the VMM switch, which further improves I/O performance and reduces the workload of the server processors.

(47)

Security Analysis of VT-d

Hardware virtualization addresses security problems of virtual systems and provides better isolation of the system hardware resources.

But the hardware is complicated, so some security problems remain to be solved. Meanwhile, attackers have already discovered some loopholes in hardware virtualization.

(48)

Attack Scenario

Assume a virtual system that builds a driver domain with the aid of Intel VT-d. Driver domains are similar to traditional VMs, but they are given access to selected devices such as a network card or a disk controller.

An attacker can attempt to gain complete control of the whole system by means of such a driver domain.

In this attack scenario, we suppose that the attacker has already gained full control of a driver domain.

(49)
(50)

MSI (Message Signaled Interrupts)

MSI format (from the Intel developer manual):

All three attacks described later use I/O devices to generate MSIs in order to carry out the attack.
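Since the format figure did not survive, the sketch below decodes the fields of an MSI that matter for the attacks. It assumes the x86 compatibility format without interrupt remapping (address in the 0xFEExxxxx range, destination ID in address bits 19:12, vector in data bits 7:0, delivery mode in data bits 10:8); the function name is hypothetical and the layout should be checked against the Intel manual.

```python
def decode_msi(address: int, data: int) -> dict:
    """Split an MSI address/data pair into the fields a CPU acts on."""
    if (address >> 20) & 0xFFF != 0xFEE:
        raise ValueError("not an MSI address (expected 0xFEExxxxx)")
    return {
        "dest_id": (address >> 12) & 0xFF,   # target processor
        "vector": data & 0xFF,               # interrupt vector
        "delivery_mode": (data >> 8) & 0x7,  # 0 means fixed delivery
    }


if __name__ == "__main__":
    # A device write of 0x0011 to 0xFEE01000: vector 0x11 to processor 1.
    print(decode_msi(0xFEE01000, 0x0011))
```

The point for the attack scenario is that both values are chosen by the device, so whoever controls the device controls the vector and the target processor.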

(51)

1) Threat based on SIPI construction

The SIPI (Start-up Inter-Processor Interrupt) is a key function of any multiprocessor (or multi-core) system based on Intel processors. At startup, the BIOS uses SIPIs to initialize all processors and distribute tasks to them.

When the system starts, only one processor, called the bootstrap processor (BSP), is active, and its job is to initialize the other processors so that they work properly.

(52)

A SIPI tells the target processor to start executing special boot code at the address 0xVV000, where VV is the vector carried by the SIPI. For the SIPI to take effect, the target CPU must first be sent an INIT interrupt, which resets the CPU and puts it in the wait-for-SIPI state. Under normal circumstances, the BSP sends SIPIs to all the other processors.

The only mechanism for sending a SIPI is the local advanced programmable interrupt controller (local APIC).
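The relationship between the SIPI vector and the start address is a shift by 12 bits, which a short sketch makes concrete (the function name is hypothetical):

```python
def sipi_start_address(vector: int) -> int:
    """Return the address 0xVV000 where the target CPU starts executing."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("the SIPI vector is one byte")
    return vector << 12


if __name__ == "__main__":
    print(hex(sipi_start_address(0x9A)))
```

Whoever can choose the vector therefore chooses which 4 KB page below 1 MB the target processor begins executing from.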

(53)

SIPI format (from the Intel developer manual)

(54)
(55)

2) System call injection attack

[Figure: a driver domain, CPU#0 to CPU#2, the hypervisor, and a NIC; labels: vector 0x82, hypercall]

(56)

3) #AC-based injection attack

#AC can be used to confuse the stack layout of the exception handler.

The #AC exception is the only exception that meets the following two requirements:

Its vector value is greater than 15, so it can be delivered by MSI;

It is handled as an exception that stores an error code on the stack.

(57)

[Figure: stack layout for normal delivery of the #AC exception, from low to high addresses: ErrorCode, RIP, CS, RFLAGS, RSP, SS]

(58)

If a device sends an MSI with the vector value 0x11 (#AC), the #AC handler is triggered on the target CPU. Because the handler expects an error code on top of the stack, it misinterprets the other values on the stack. In this case, CS is interpreted as RIP, RFLAGS is treated as CS, and so on.

When an exception handler finishes, it executes the IRET instruction to pop the saved register values and jump back to CS:RIP, which means that the handler actually returns to RFLAGS:CS.
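The one-slot shift described above can be reproduced with a list standing in for the stack. This is a toy model, assuming the frame order RIP, CS, RFLAGS, RSP, SS from low to high addresses and a handler that always discards one slot as the error code; the register values are arbitrary.

```python
FRAME_FIELDS = ["RIP", "CS", "RFLAGS", "RSP", "SS"]


def push_frame(state, error_code=None):
    """Build the stack frame, lowest address first.

    A real #AC exception also pushes an error code; an MSI with the same
    vector is delivered as a plain interrupt and pushes none.
    """
    frame = [state[name] for name in FRAME_FIELDS]
    if error_code is not None:
        frame.insert(0, error_code)
    return frame


def ac_handler_iret(frame):
    """Model the #AC handler: drop the 'error code', then IRET."""
    stack = list(frame)
    stack.pop(0)  # the handler assumes this slot is the error code
    restored = {}
    for name in FRAME_FIELDS:
        # None marks a slot that lies beyond the frame that was pushed.
        restored[name] = stack.pop(0) if stack else None
    return restored


if __name__ == "__main__":
    state = {"RIP": 0x1000, "CS": 0x10, "RFLAGS": 0x202, "RSP": 0x7000, "SS": 0x18}
    print(ac_handler_iret(push_frame(state, error_code=0)))  # genuine #AC
    print(ac_handler_iret(push_frame(state)))                # MSI-injected
```

In the injected case the restored RIP holds the old CS value and the restored CS holds the old RFLAGS value, which is the RFLAGS:CS return target described above.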

(59)
(60)

