3.6 Virtualization software
3.6.5 Comparison between virtualization software
A high-level comparison is given in table 3.1. All virtualization products in the table, except Xen, are installed within a host operating system; Xen is installed directly on the hardware. Most products provide two techniques for virtualization on x86 architectures. All virtualization software in the table can use hardware support for virtualization on x86 architectures.
¹ http://virtualizationreview.com/Blogs/Mental-Ward/2009/02/KVM-BareMetal-Hypervisor.aspx
                            VirtualBox  VMware Workstation  Xen                 KVM
Hypervisor type             Hosted      Hosted              Native, bare-metal  Hosted
Dynamic binary translation  X           X
Paravirtualization                                          X
Hardware support            X           X                   X                   X
Chapter 4 Nested virtualization
The focus of this thesis lies with nested virtualization on x86 architectures. Nested virtualization is the execution of a virtual machine inside another virtual machine. In the case of multiple nesting levels, one can also speak of recursive virtual machines. Initial research on the properties of recursive virtual machine architectures was published in 1973 and 1975 [40, 41]. These works refer to the virtualization used in mainframes so that users could work simultaneously on a single mainframe. Multiple use cases come to mind for nested virtualization.
• A possible use case for nested x86 virtualization is the development of test setups for research purposes. Research in cluster¹ and grid² computing requires extensive test setups, which might not be available. The latest developments in the research of grid and cluster computing make use of virtualization at different levels. Virtualization can be used for all, or certain, components of a grid or cluster. It can also be used to run applications within the grid or cluster in a sandbox environment. If certain performance limitations are not an issue, virtualizing all components of such a system can eliminate the need to acquire the entire test setup. Because these virtualized components, e.g. Eucalyptus³ or OpenNebula⁴, might use virtualization for running applications in a sandbox environment, two levels of virtualization are used. Nesting the physical machines of a cluster or grid as virtual machines on one physical machine can offer security, fault tolerance, legacy support, isolation, resource control, consolidation, etc.
¹ A cluster is a group of interconnected computers working together as a single, integrated computer resource [42, 43].
² There is no strict definition of a grid. In [44], Bote-Lorenzo et al. list a number of attempts to create a definition. Ian Foster created a three-point checklist that combines the common properties of a grid [45].
³ http://www.eucalyptus.com
• A second possible use case is the creation of a test framework for hypervisors. Just as virtualization allows testing and debugging an operating system by deploying the OS in a virtual machine, nested virtualization allows testing and debugging a hypervisor inside a virtual machine. It eliminates the need for a separate physical machine on which a developer can test and debug a hypervisor.
• Another possible use case is the use of virtual machines inside a server rented from the cloud⁵. Such a server is itself virtualized so that the cloud vendor can make optimal use of its resources. For example, Amazon EC2⁶ offers virtual private servers, which are virtual machines running on the Xen hypervisor. Hence, if a user wants to use virtualization software inside this server, nested x86 virtualization is needed to make that setup work.
As explained in chapter 2, virtualization on the x86 architecture is not straightforward. This has resulted in the emergence of several techniques, which are described in chapter 3. These techniques can be combined in many different ways to nest virtual machines. A nested setup can use the same technique for both hypervisors, but it can also use one technique for the first level hypervisor and a different one for the nested hypervisor. Hence, if we divide the techniques into three major groups (dynamic binary translation, paravirtualization and hardware support), there are nine possible combinations for nesting a virtual machine inside another virtual machine. In the following sections, the theoretical possibilities and requirements for each of these combinations are given. The results of nested virtualization on x86 architectures are given in chapter 5.
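The count of nine follows from pairing each of the three technique groups for the first level hypervisor with each of the three groups for the nested hypervisor. A minimal sketch of that enumeration is given below; the names `TECHNIQUES` and `nesting_combinations` are chosen for illustration only and do not come from any virtualization product.

```python
from itertools import product

# The three major technique groups named in the text.
TECHNIQUES = (
    "dynamic binary translation",
    "paravirtualization",
    "hardware support",
)


def nesting_combinations(techniques=TECHNIQUES):
    """Return every (first level technique, nested technique) pair."""
    return list(product(techniques, repeat=2))


if __name__ == "__main__":
    for first_level, nested in nesting_combinations():
        print(f"first level: {first_level:<28} nested: {nested}")
```

The pairs are ordered, so a setup with paravirtualization below and hardware support on top is counted separately from the reverse setup.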
Figure 4.1: Layers in a nested virtualization setup with hosted hypervisors.
To prevent confusion about which hypervisor or guest is meant, some terms are introduced. In a nested virtualization setup, there are two levels of virtualization, see figure 4.1. The first level, referred to as L1, is the layer of virtualization that is used in a non-nested setup. Thus, this level is the virtualization layer that is closest to the hardware. The terms L1 or bottom layer indicate the first level of virtualization, e.g. the L1 hypervisor is the hypervisor that is used in the first level of virtualization. The second level, referred to as L2, is the new layer of virtualization, introduced by the nested virtualization. Hence, the terms L2, nested or inner indicate the second level of virtualization, e.g. the L2 hypervisor is the hypervisor that will be installed inside the L1 guest.
⁵ Two widely accepted definitions of the term cloud can be found in [46] and [47].
4.1 Dynamic binary translation
This section focuses on L1 hypervisors that use dynamic binary translation for nested virtualization on x86 architectures. Such a hypervisor can run in a host operating system or directly on the hardware. The hypervisor can be VirtualBox (see subsection 3.6.1), a VMware product (see subsection 3.6.2) or any other hypervisor using dynamic binary translation. The nested hypervisor can be any hypervisor, resulting in three major combinations. Each combination uses a nested hypervisor that allows virtualization through a different technique. The nested hypervisor will be installed in a guest virtualized by the L1 hypervisor. In the first combination, the nested hypervisor again uses dynamic binary translation. In the second combination, a hypervisor using paravirtualization is installed in the guest. The last combination is a nested hypervisor that uses hardware support.
It should be theoretically possible to nest virtual machines using dynamic binary translation as L1. When using dynamic binary translation, no modifications are needed to the hardware or to the operating system, as pointed out in section 3.1. Code running in ring 0 will actually run in ring 1, but the guest is not aware of this.
Dynamic binary translation: The first combination nests an L2 hypervisor inside a guest virtualized by an L1 hypervisor, where both hypervisors are based on dynamic binary translation. The L2 hypervisor will be running in guest ring 0. Since the hypervisor will not be aware that its code is actually running in ring 1, it should be possible to run a hypervisor in this guest.
The nested hypervisor will have to take care of the memory management in the L2 guest. It will have to maintain the shadow page tables for its guests, see subsection 3.1.3. The hypervisor uses these shadow page tables to translate the L2 virtual memory addresses to what it believes are real memory addresses. In reality, these translated addresses lie in the virtual memory range of the L1 guest and can be converted to real memory addresses by the shadow page tables maintained by the L1 hypervisor. The memory architecture in a nested setup is illustrated in figure 4.2. For an L1 guest, there are two levels of address translation, as shown in figure 3.1. A nested guest has three levels of address translation, resulting in the need for shadow tables in the L2 hypervisor.
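The three levels of address translation can be illustrated with a small sketch. It is a simplified model, not the implementation of any hypervisor: page tables are plain dictionaries from page number to frame number, a page size of 4 KiB is assumed, and all names and mappings (`translate`, `build_shadow_table`, the example tables) are hypothetical.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


def translate(table, address):
    """Translate an address through one table (page number -> frame number)."""
    page, offset = divmod(address, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset


def build_shadow_table(guest_table, backing_map):
    """Collapse two translation levels into one shadow table.

    guest_table maps a page to a frame as the guest sees it; backing_map
    maps that frame to the frame actually backing it one level lower.
    """
    return {page: backing_map[frame] for page, frame in guest_table.items()}


# Hypothetical example mappings for the three levels.
l2_guest_table = {0: 5, 1: 7}       # L2 virtual page -> L2 "physical" page
l2_hypervisor_map = {5: 2, 7: 3}    # L2 "physical" page -> page in the L1 guest
l1_hypervisor_map = {2: 40, 3: 41}  # page in the L1 guest -> real machine frame

# Shadow table kept by the L2 hypervisor: L2 virtual -> page in the L1 guest.
l2_shadow = build_shadow_table(l2_guest_table, l2_hypervisor_map)
# Shadow table kept by the L1 hypervisor: L2 virtual -> real machine frame.
l1_shadow = build_shadow_table(l2_shadow, l1_hypervisor_map)

address = 1 * PAGE_SIZE + 0x10
step_by_step = translate(
    l1_hypervisor_map,
    translate(l2_hypervisor_map, translate(l2_guest_table, address)),
)
assert translate(l1_shadow, address) == step_by_step
```

In this model each shadow table removes one level of translation, which is why a nested guest needs a shadow table in the L2 hypervisor in addition to the one in the L1 hypervisor.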
Figure 4.2: Memory architecture in a nested situation.
Paravirtualization: The second combination uses paravirtualization as technique [...] binary translation for the L2 hypervisor. The hypervisor using paravirtualization will be running in guest ring 0 and is not aware that it is actually running in ring 1. This should make it possible to nest an L2 hypervisor based on paravirtualization within a guest virtualized by an L1 hypervisor using dynamic binary translation.
Hardware supported virtualization: The virtualized processor that is available to the L1 guest is based on the x86 architecture in order to allow current operating systems to work in the virtualized environment. However, are the extensions for virtualization on x86 architectures (see sections 3.3 and 3.4) also included? In order to use an L2 hypervisor based on hardware support within the L1 guest, the L1 hypervisor should virtualize or emulate the virtualization extensions of the processor. A virtualization product that is based on hardware supported virtualization needs these extensions. If the extensions are not available, the hypervisor cannot be installed or activated. If the L1 hypervisor provides these extensions, chances are that it requires a physical processor with the same extensions. It might be possible for hypervisors based on dynamic binary translation to provide the extensions without a processor that supports hardware virtualization. However, all current processors have these extensions. Therefore, it is very unlikely that developers will incorporate functionality that provides the hardware support to the guest without a processor with hardware support for x86 virtualization.
Memory management for the L2 guest cannot rely on hardware support alone, because second generation hardware support provides only two levels of address translation, whereas a nested guest needs three. The L1 hypervisor should provide the EPT or NPT functionality to the guest together with the first generation hardware support, but it will have to use a software technique to implement this MMU functionality.
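Whether an L1 hypervisor exposes the virtualization extensions can be checked from inside the L1 guest. The sketch below assumes a Linux guest, where `/proc/cpuinfo` typically lists `vmx` (Intel VT-x) or `svm` (AMD-V) among the CPU flags and, where reported, `ept` or `npt` for second generation hardware support. The function names are hypothetical, and the exact lines and flags shown may differ between kernel versions.

```python
def virtualization_flags(cpuinfo_text):
    """Return the virtualization-related flags found in /proc/cpuinfo-style text."""
    wanted = {"vmx", "svm", "ept", "npt"}
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Assumption: flags are listed on "flags" lines; some kernels also
        # report VMX features on a separate "vmx flags" line.
        if key.strip() in ("flags", "vmx flags"):
            found |= wanted & set(value.split())
    return found


def can_run_hardware_supported_hypervisor(cpuinfo_text):
    """True if first generation hardware support (vmx or svm) is visible."""
    return bool({"vmx", "svm"} & virtualization_flags(cpuinfo_text))


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as cpuinfo:
            text = cpuinfo.read()
    except OSError:
        text = ""  # not a Linux system, or the file is unreadable
    print(sorted(virtualization_flags(text)))
```

If neither `vmx` nor `svm` is reported inside the L1 guest, a nested hypervisor based on hardware support would not be expected to install or activate there.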
4.2 Paravirtualization
The situation for nested virtualization is quite different when a hypervisor based on paravirtualization is used as the bottom layer. The most popular example of a hypervisor based on paravirtualization is Xen (see subsection 3.6.3). There are again three combinations. In the first combination, the nested hypervisor is, like the bottom layer hypervisor, based on paravirtualization. In the second combination, a hypervisor based on dynamic binary translation is used as the nested hypervisor. In the last combination, a hypervisor based on hardware support is nested in the paravirtualized guest. The main difference is that the L1 guest is aware of the virtualization.
Dynamic binary translation and paravirtualization: The paravirtualized guest is aware of the virtualization and should use the hypercalls provided by the hypervisor. The guest's operating system should be modified to use these hypercalls, thus all code in the guest that runs in kernel mode needs these modifications in order to work in the paravirtualized guest. This has major consequences for a nested virtualization setup. A nested hypervisor can only work in a paravirtualized environment if it is modified to work with these hypercalls. A native, bare-metal hypervisor should be adapted so that all ring 0 code is changed. For a hosted hypervisor, this means that the module that is loaded into the kernel of the host operating system must be modified to work in the paravirtualized environment. Hence, companies that develop virtualization products need to actively make their hypervisors compatible with running inside a paravirtualized guest.
Memory management of the L2 guests is done by the nested hypervisor. The page tables of the L1 guests are directly registered with the MMU, so the nested hypervisor can use the hypercalls to register its page tables with the MMU. A nested hypervisor based on paravirtualization might allow an L2 guest to register its page tables directly with the MMU, while a nested hypervisor based on dynamic binary translation will maintain shadow tables.
Hardware supported virtualization: Hardware support for x86 virtualization is an exceptional case for paravirtualization as well. The L1 hypervisor should provide the extensions for the hardware support to the guests, probably by means of hypercalls. Modified hypervisors based on hardware support can then use the hardware extensions. Second generation hardware support can also only be used if it is provided by the L1 hypervisor, together with first generation hardware support.
In conclusion, nested virtualization with paravirtualization as the bottom layer needs modifications to the nested hypervisor, whereas nested virtualization with dynamic binary translation as the bottom layer does not need these changes. On the other hand, the guests know that they are virtualized, which might influence the performance of the L2 guests in a positive way. The nested virtualization will not work unless support is actively introduced. It is unlikely that virtualization software developers are willing to incorporate these modifications in their hypervisors, since the benefits do not outweigh the cost of the implementation.