Challenges in x86 Virtualization - Adaptive Resource Relocation in Virtualized Heterogeneous Cl

3.2 Challenges in x86 Virtualization

Unlike mainframes, x86 machines were not designed to support full virtualization [33, 3]. Operating systems written for the x86 architecture are designed to run directly on bare-metal hardware, so they naturally assume that they fully own the computer hardware. As shown in Figure 3.1(a), the x86 architecture offers four levels of privilege, known as Ring 0, 1, 2 and 3.

While user-level applications typically run in Ring 3, the operating system needs direct access to the memory and hardware and must execute its privileged instructions in Ring 0. Virtualizing the x86 architecture requires placing a virtualization layer under the operating system (which expects to be in the most privileged Ring 0) to create and manage the virtual machines that deliver shared resources. Further complicating the situation, some sensitive instructions cannot be virtualized effectively because they have different semantics when they are not executed in Ring 0. The difficulty of trapping and translating these sensitive and privileged instruction requests at runtime is what originally made x86 virtualization look so difficult, until VMware introduced full virtualization in 1999 [19].
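The problem with sensitive instructions can be illustrated with a small model. The sketch below is not taken from any hypervisor; `ToyCPU`, `GeneralProtectionFault` and `run_deprivileged_guest` are hypothetical names, and the model assumes an I/O privilege level of 0. It contrasts a privileged instruction (CLI), which faults outside Ring 0 and can therefore be caught by a virtualization layer, with POPF, which outside Ring 0 silently skips the interrupt-flag update and gives the virtualization layer nothing to catch.

```python
from dataclasses import dataclass


class GeneralProtectionFault(Exception):
    """Raised when a privileged instruction runs outside Ring 0."""


@dataclass
class ToyCPU:
    # Simplified model (assumption): one interrupt flag, I/O privilege level 0.
    ring: int = 0
    interrupts_enabled: bool = True

    def cli(self):
        # Privileged: faults outside Ring 0, so a VMM can trap and emulate it.
        if self.ring != 0:
            raise GeneralProtectionFault("CLI outside Ring 0")
        self.interrupts_enabled = False

    def popf(self, new_interrupt_flag):
        # Sensitive but not privileged: outside Ring 0 the interrupt-flag
        # update is dropped silently and no fault is raised.
        if self.ring == 0:
            self.interrupts_enabled = new_interrupt_flag


def run_deprivileged_guest():
    """Run a guest kernel that has been moved out of Ring 0 (here, to Ring 1)."""
    cpu = ToyCPU(ring=1)
    events = []
    try:
        cpu.cli()
    except GeneralProtectionFault:
        events.append("cli trapped")      # the VMM gets a chance to emulate
    cpu.popf(False)                       # guest believes interrupts are now off
    events.append("popf ignored" if cpu.interrupts_enabled else "popf applied")
    return events
```

In this model the guest's view of the interrupt flag and the real flag diverge after `popf`, which is the kind of mismatch that plain trap-and-emulate cannot detect.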

In the case of ‘Full virtualization’, the virtual machine or the guest operating system is presented with a complete simulation of the underlying hardware. It relies on binary translation to trap and virtualize the execution of certain sensitive, non-virtualizable instructions. With this approach, critical instructions are discovered and replaced with traps into the VMM to be emulated in software. Binary translation results in a large performance overhead in comparison to other virtualization techniques [33]. VMware Workstation [19], Oracle VirtualBox [17] and Microsoft Virtual PC [7] are well-known commercial implementations of full virtualization. Full virtualization is also referred to as software virtualization.

Paravirtualization involves modifying the OS kernel to replace non-virtualizable instructions with ‘hypercalls’ that communicate directly with the virtual machine monitor (VMM or the hypervisor). This results in a significant performance improvement compared to the full virtualization technique [33, 107]. The downside of paravirtualization is that proprietary operating systems such as Windows XP, Vista and 7 cannot be paravirtualized because their source code is not available. The open source Xen project [11] is an example of paravirtualization: it virtualizes the processor and memory using a modified Linux kernel and virtualizes the I/O using custom guest OS device drivers. A typical use of the hardware rings in a hypervisor environment is shown in Figure 3.1(b).
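The difference between the two techniques can be sketched with a toy instruction stream. This is an illustrative model only and not how VMware or Xen are implemented: `ToyVMM`, `binary_translate`, the `hypercall:` prefix and the hypercall name `set_flags` are hypothetical, and `SENSITIVE` lists only a few of the sensitive x86 instructions. Under full virtualization the VMM scans unmodified guest code at run time and redirects sensitive instructions to its emulator; under paravirtualization the guest kernel was modified in advance, so it calls the VMM explicitly and no scan is needed.

```python
# Illustrative subset of sensitive, non-privileged x86 instructions.
SENSITIVE = {"popf", "sgdt", "smsw"}


class ToyVMM:
    """Hypothetical VMM that records what it was asked to handle."""

    def __init__(self):
        self.log = []

    def emulate(self, insn):
        self.log.append(f"emulated {insn}")

    def hypercall(self, name):
        self.log.append(f"hypercall {name}")


def binary_translate(block):
    """Tag every instruction of a guest block: run natively or trap to the VMM."""
    return [("vmm_trap" if insn in SENSITIVE else "native", insn) for insn in block]


def run_full_virtualization(guest_code, vmm):
    # The guest is unmodified; the VMM inspects every instruction at run time.
    for kind, insn in binary_translate(guest_code):
        if kind == "vmm_trap":
            vmm.emulate(insn)
    return vmm.log


def run_paravirtualized(guest_code, vmm):
    # The guest kernel source was changed ahead of time, so sensitive
    # operations already appear as explicit hypercalls.
    for insn in guest_code:
        if insn.startswith("hypercall:"):
            vmm.hypercall(insn.split(":", 1)[1])
    return vmm.log
```

For example, `run_full_virtualization(["mov", "popf", "add"], ToyVMM())` routes only `popf` through the emulator, while the paravirtualized guest would contain `"hypercall:set_flags"` in place of `popf`.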

[Figure 3.1 panels. Labels recovered from the figure: (a) Ring 0 to Ring 3 with Kernel, Device Drivers, Device Drivers, Applications; (b) Ring 0 to Ring 3 with Hypervisor, Domain 0 and U, Applications; (c) Ring -1 and Ring 0 to Ring 3 with Hypervisor, Domain 0 and U, Applications.]

(a) Privilege rings for the x86, along with their common uses.

(b) Privilege rings for the x86 in paravirtualization.

Figure 3.1: Privilege rings in the x86 architecture [courtesy [21]].

Processor vendors (Intel and AMD) have also started to provide support for virtualization. The Intel VT [13] and AMD-V [10] architectures are already on the market and enable Xen and other VMMs to host unmodified guest operating systems. As depicted in Figure 3.1(c), privileged and sensitive calls are set to automatically trap to the hypervisor, removing the need for either binary translation or paravirtualization.

The guest state is stored in Virtual Machine Control Structures (VT-x) or Virtual Machine Control Blocks (AMD-V). Processors with Intel VT and AMD-V became available in 2006, so only newer systems contain these hardware-assist features. Because of the high overhead of transitions between the hypervisor and the guest, hardware-assisted virtualization is slow compared to paravirtualization [19].
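The trap-to-hypervisor cycle can be sketched as a loop over a toy guest program. This is a simplified model and not the VT-x or AMD-V programming interface: `GuestStateArea` stands in for the per-guest control structure, `run_guest` is a hypothetical name, and the `INTERCEPTED` set is an assumption, since which instructions cause a transition is largely configurable on real hardware.

```python
from dataclasses import dataclass, field


@dataclass
class GuestStateArea:
    """Stand-in for the per-guest control structure that holds saved guest state."""
    instruction_pointer: int = 0
    exit_reasons: list = field(default_factory=list)


# Assumed configuration: these instructions transfer control to the hypervisor.
INTERCEPTED = {"cpuid", "hlt"}


def run_guest(program, state):
    """Run a guest program and return the number of guest-to-hypervisor transitions."""
    while state.instruction_pointer < len(program):
        insn = program[state.instruction_pointer]
        if insn in INTERCEPTED:
            # Transition to the hypervisor: the guest state is kept in the
            # control structure while the hypervisor handles the instruction.
            state.exit_reasons.append(insn)
        # Resume the guest at the next instruction.
        state.instruction_pointer += 1
    return len(state.exit_reasons)
```

Every entry in `exit_reasons` corresponds to one round trip between guest and hypervisor, which is where the transition overhead mentioned above is paid.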

A number of virtualization products (VMMs) are now available for various operating systems. Among them, VMware [19], Xen [11, 33], Microsoft's Hyper-V [14] and Oracle's VirtualBox [17] are worth mentioning. The mainstream Linux kernel also includes the Kernel-based Virtual Machine (KVM).

In this research we have used Xen for the virtualization of our compute farm, mainly for the following reasons:

1. Xen is an open source hypervisor licensed under GPL (ver 2.0) enabling the developers to view and modify the code.

2. Currently, paravirtualization is the fastest virtualization technique available [106, 3]. Other open source attempts were either in their infancy (like KVM) or did not support paravirtualization or the migration of virtual machines.

3. VMware does not allow the performance benchmarks and results of its hypervisor to be published in the public domain unless approved by VMware (clause 3.3 of End Users Licensing Agreement) [1]. In any case, VMware is a closed source product and therefore not applicable to our research.

4. Microsoft Hyper-V had not been released when we started our research work; moreover, Hyper-V is a closed source product.

[Figure 3.2 block labels: Hardware (SMP, MMU, Network IF, IDE etc.); Hypervisor (Control IF, Event Ch., Virtual MMU, Virtual CPU, ...); Ported OS to Xen (Management domain Dom0), Management Software, Backend, Native Device Drivers; Ported OS to Xen (Paravirtualized), Unmodified Application, Front end device drivers; Unmodified OS (Fully virtualized), Unmodified Application, Front end device drivers.]

Figure 3.2: Xen environment block diagram [courtesy [33]].

In the next section, we briefly discuss Xen’s architecture.

3.3 Xen Architecture

Xen is an open source Virtual Machine Monitor that enables running multiple operating systems on a single machine [33]. The overall Xen environment consists of several items that work together to provide the virtualization experience [106]. Some critical components of the Xen environment are:

• Hypervisor

• Domain 0 (Dom 0 or Management Domain)

• Domain U (Dom U - can either be paravirtualized or fully virtualized instance of an OS)

• Management Software

In document Adaptive Resource Relocation in Virtualized Heterogeneous Clusters (Page 49-51)