Models For Modeling and Measuring the Performance of a Xen Virtual Server


Measuring and Modeling the Performance of the Xen VMM

Jie Lu, Lev Makhlis, Jianjiun Chen BMC Software Inc.

Waltham, MA 02451

Server virtualization technology provides an alternative for server consolidation by creating a set of logical resources that share underlying physical resources. The Xen virtual machine monitor, a popular virtualization solution in the Linux world, supports execution of multiple guest operating systems with unprecedented levels of performance and resource isolation.

Performance modeling of virtual servers faces challenges of obtaining meaningful measures as the operating system deals with virtual resources. This paper presents a practical approach for measuring and modeling the performance of Xen.

1. Background

Server virtualization technology provides an alternative for server consolidation by creating a set of logical resources that share underlying physical resources.

Virtual servers enable dynamic tuning by moving resources dynamically, make better use of resources through sharing, provide high availability by isolating guest operating systems, reduce overall costs through higher utilization, and increase flexibility through rapid provisioning.

There are many server virtualization approaches.

Table 1 compares the most popular commercial virtualization products.

All these approaches enable the execution of multiple guest operating systems on the same computer hardware (except Solaris Container, which is a single OS image). IBM micro-partition (SPLPAR) employs a hardware-based hypervisor provided by its Power 5 server. VMware ESX Server provides hardware abstraction using a proprietary kernel extension. Microsoft Virtual Server provides virtual machines by simulating the complete hardware. Solaris Container is a single OS image, which provides application isolation with its zones. Xen provides a software-based hypervisor using the technique of paravirtualization.

The term hypervisor comes from the mainframe world, which first introduced the virtualization concept. It is a mechanism that allows multiple operating systems to share the underlying hardware resources.

With virtualization techniques, a business needs to plan for effective consolidation of business services, minimize risk in deploying virtual servers, identify and resolve performance issues before they impact business users, and assure that applications perform well when using virtualized resources. All these needs call for effective performance modeling and capacity planning tools and methods for virtual servers.

Product                       Type        Hypervisor  Hardware platform  Guest OS                        Modify guest OS
IBM micro-partition (SPLPAR)  System      Firmware    Power 5            AIX, Linux, i5/OS and Windows   Yes
VMware ESX Server             Full        Software    x86                Windows and Linux               No
Xen                           Para, Full  Software    x86                Linux and Windows               Yes (for Para)
MS Virtual Server             Emulation   No          x86                Windows and Linux               No
Solaris Container             OS          No          SPARC and x86      Solaris                         N/A

Table 1. Comparison of popular commercial virtualization products

The major challenge we face is obtaining meaningful performance measures. The most critical performance measures, such as CPU utilization and I/O rate, obtained from standard OS-provided facilities are often based on virtual resources. Conventional tools, which assume physical measures, can therefore produce misleading results.

Many papers have been published discussing performance modeling and capacity planning issues for virtualization from various aspects. In this paper, we focus on measuring the performance of a Xen virtual server and applying the metrics to performance modeling and capacity planning. In Section 2, we give a brief introduction to the architecture of Xen. We then discuss how to obtain performance measures from Xen in Section 3. Finally, we present our performance modeling method via a series of experiments in Section 4.

2. Xen introduction

Xen is an open source virtual machine monitor (VMM) developed by the Computer Laboratory at the University of Cambridge. It is designed for x86 to support execution of multiple guest operating systems with unprecedented levels of performance and resource isolation [1].

The traditional x86 architecture lacks the technology to support full virtualization. Solutions like VMware’s ESX server implement full virtualization at the cost of sacrificing performance. In contrast with full virtualization, paravirtualization offers better performance by presenting a virtual machine abstraction that is similar but not identical to the underlying hardware. Therefore, it requires modifications to the guest operating system. However, it virtualizes all architectural features required by the existing standard application binary interface (ABI), and hence no modifications are required to guest applications.

In order to protect the hypervisor from OS misbehavior, guest OSes must be modified to run at a lower privilege level. Xen operates at a higher privilege level than the supervisor code of the guest OSes and thus acts as the hypervisor. Privileged instructions are paravirtualized by requiring them to be validated and executed within Xen. Any attempt by a guest OS to directly execute a privileged instruction fails in the processor, either silently or by taking a fault, since only Xen executes at a sufficiently privileged level.

In Xen, the guest OSes are responsible for allocating and managing the hardware page tables. They have direct read access to hardware page tables, but updates are batched and validated by Xen.

Guest OSes install a fast handler for system calls, to allow direct calls from an application into its guest OS and avoid indirecting through Xen on every call. Each guest OS registers a descriptor table for exception handlers. Hardware interrupts are replaced with a lightweight event system. Each guest OS has a timer interface and is aware of both real and virtual time.

Xen provides a high-performance communication mechanism for passing buffer information vertically through the system, while allowing Xen to efficiently perform validation checks. It exposes a set of clean and simple device abstractions. I/O data is transferred to and from each domain via Xen, using shared-memory, asynchronous buffer-descriptor rings.

Figure 1 from [2] illustrates the overall system structure of a Xen virtual server.

The initial domain, Domain0, is created at boot time. It is responsible for hosting the application-level management software. Domain0 is a virtual machine like the other guest domains and runs on virtual CPUs and virtual memory. However, only Domain0 is permitted to use the privileged control interface to the hypervisor. The control interface provides the ability to create and terminate other domains. It also controls the scheduling parameters, physical memory allocations, and access to physical disks and network devices.

Paravirtualization requires modifications to the guest OS kernel. Xen provides patched Linux kernels (XenLinux) for both versions 2.4 and 2.6. As Xen becomes more popular, many Linux distributions, such as RHEL 5 and SLES 10, already include or will include Xen in their packages.

Xen also supports full virtualization on processors with Intel VT or AMD Pacifica technology. In this configuration, it does not require modifying the guest OS, which makes it possible to host proprietary operating systems, such as Windows.

3. Data visibility

Traditional performance measures are obtained from the operating system. Xen is hosting multiple guest OSes. Each OS only has a partial view of the physical system. To each guest, it appears that it is the only operating system using the hardware. Effective performance modeling requires a picture of the whole server.

[Figure 1 diagram: VM0 (Domain0) runs a XenLinux guest OS with the device manager and control software, native device drivers, and back-end drivers. VM1 and VM2 run XenLinux guests and VM3 runs an unmodified guest OS (WinXP), each with unmodified user software and front-end device drivers. All sit on the Xen Virtual Machine Monitor (control interface, safe hardware interface, event channel, virtual CPU, virtual MMU), which runs on the hardware (SMP, MMU, physical memory, Ethernet, SCSI/IDE). Other labels in the diagram: VT-x, x86_32, x86_64, IA64, AGP, ACPI, PCI, SMP.]

Moreover, all guests and Domain0 within a Xen virtual server run on virtual CPUs and virtual memory. Therefore the guest operating systems are not aware of the actual physical configurations of the server. For example, the virtual CPUs can be configured to a different number than the available physical (or logical if Hyper-Threading is enabled) CPUs in the server.

With full virtualization, the CPU time reported by standard Linux facilities is based on virtual CPU time and thus cannot be used for performance modeling and capacity planning.

In performance modeling, the layer closest to the physical hardware resources has the best view of system activity. The hypervisor keeps track of the actual CPU usage and other resources used by the individual guests. These metrics are the most reliable statistics for measuring system activity.

At user level, Xen provides the xm tool as the primary management software running in Domain0 to access the data at the hypervisor layer. At low level, there are three basic approaches to obtain the data. We discuss them in the following sections.

3.1 Xen daemon

The Xen daemon (Xend) can be used to obtain the detailed configuration data. Xend is the node control daemon that performs system management functions related to virtual machines. It forms a central point of virtualized resources control and must be running in order to start and manage virtual machines. Xend must be run as root because it needs access to privileged system management functions.

An HTTP interface is provided for communicating with Xend that allows one to pass commands to the daemon. There are four ways to connect to Xend:

• SXP over a UNIX domain socket (/var/lib/xend/xend-socket);

• SXP over TCP (port 8000);

• XML-RPC over a UNIX domain socket (/var/run/xend/xmlrpc.sock);

• XML-RPC over TCP (port 8005).

These services can be enabled or disabled in the configuration file /etc/xen/xend-config.sxp. As of Xen 3.0.2, XML-RPC over UDS is the only service that is enabled by default.

In the first two cases above, Xend responds with data in S-expression format [3].
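Since the SXP transports reply with S-expressions, a collector has to parse them before the configuration data can be used. The following is a minimal parsing sketch, not Xend's own code. The sample reply and its field names (domain, domid, name, memory, vcpus) are assumptions made for illustration, and the parser does not handle quoted strings that contain spaces.

```python
def parse_sexpr(text):
    """Parse one S-expression into nested Python lists of string atoms."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume the closing parenthesis
            return items
        if token == ")":
            raise ValueError("unexpected closing parenthesis")
        return token

    result = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return result


def sexpr_to_dict(expr):
    """Turn (tag (key value) ...) into (tag, {key: value}) for flat pairs."""
    tag, fields = expr[0], expr[1:]
    pairs = {f[0]: f[1] for f in fields if isinstance(f, list) and len(f) == 2}
    return tag, pairs


# Hypothetical reply; real Xend output carries more fields.
sample = "(domain (domid 1) (name vm1) (memory 256) (vcpus 2))"
tag, info = sexpr_to_dict(parse_sexpr(sample))
```

All atoms are kept as strings here; converting numeric fields is left to the caller, which knows which fields are numeric.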

Figure 1. Xen 3.0 architecture


3.2 XenStore

The XenStore is an information storage space shared between domains. It is meant for configuration and status information rather than for large data transfers.

Each domain gets its own path in the store. The appropriate drivers are notified when values are changed in the store. XenStore can be accessed in one of the following ways:

• Through procfs, using the file /proc/xen/xenbus;

• Through xenstored, using UNIX domain socket /var/run/xenstored/socket or /var/run/xenstored/socket_ro.

The protocol for both methods is the same, which is based on read() and write() calls. It is encapsulated in the xenstore library, which is a standard Xen component. Utilities such as xenstore-list and xenstore-read use the library to provide command line access to XenStore data.

3.3 Hypervisor calls

Xen also provides hypervisor calls to access the physical configurations as well as the CPU usage based on real time. A user space program running on Domain0 can perform hypervisor calls as ioctl commands on the special file /proc/xen/privcmd.

This can also be accomplished using the xenctrl library, a standard component of Xen.

The end result of performance modeling is individual workload response time. The response time is driven by two factors: service time and usage. An application running in a full-virtualized guest OS may still have acceptable response time even if it reports high CPU usage. In this case, it is the physical CPU usage reported by the hypervisor that drives the response time, not the usage reported by the guest OS.

Therefore, it is important to collect data from Domain0, which is the only domain that has access to the global view of the host.
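To make the dependence of response time on service time and usage concrete, the sketch below uses the textbook open single-server approximation R = S / (1 - U). This formula is an assumption chosen for illustration; it is not necessarily the model used by the tools in this paper, and the numbers are hypothetical.

```python
def response_time(service_time, utilization):
    """Open single-server queue approximation: R = S / (1 - U)."""
    if service_time < 0:
        raise ValueError("service_time must be non-negative")
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


# Hypothetical inputs: 2 seconds of service time per transaction.
service_time = 2.0
# Utilization of the physical CPU as seen by the hypervisor (assumed value).
physical_utilization = 0.5
# Utilization the guest OS would report against its virtual CPU (assumed value).
guest_reported_utilization = 0.75

estimate_from_physical = response_time(service_time, physical_utilization)
estimate_from_guest = response_time(service_time, guest_reported_utilization)
```

With these inputs the two estimates differ by a factor of two, which illustrates why the choice of utilization source matters for the model.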

The overall system configuration can be obtained via the hypervisor call DOM0_PHYSINFO. It provides the following critical information for capacity planning:

• CPU topology: number of threads per core, number of cores per socket, number of sockets per node, and number of nodes in the box;

• CPU speed in KHz;

• Memory: total pages and free pages.
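The fields listed above are enough to derive the capacity figures a planner usually wants. The sketch below is illustrative only: the class and field names are hypothetical stand-ins for the values returned by the hypervisor call, the 4 KiB page size is an assumption that holds for typical x86 configurations, and the sample host is invented.

```python
from dataclasses import dataclass

PAGE_SIZE_BYTES = 4096  # assumption: 4 KiB pages


@dataclass
class PhysInfo:
    """Hypothetical container mirroring the fields listed in the text."""
    threads_per_core: int
    cores_per_socket: int
    sockets_per_node: int
    nr_nodes: int
    cpu_khz: int
    total_pages: int
    free_pages: int

    def logical_cpus(self):
        # Logical CPU count is the product of the topology levels.
        return (self.threads_per_core * self.cores_per_socket
                * self.sockets_per_node * self.nr_nodes)

    def total_memory_mib(self):
        return self.total_pages * PAGE_SIZE_BYTES / 2**20

    def used_memory_fraction(self):
        return 1.0 - self.free_pages / self.total_pages


# Invented host: 1 node, 1 socket, 2 cores, 2 threads per core, 4 GiB RAM.
host = PhysInfo(threads_per_core=2, cores_per_socket=2, sockets_per_node=1,
                nr_nodes=1, cpu_khz=2_400_000,
                total_pages=1_048_576, free_pages=262_144)
```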

Using the hypervisor call DOM0_GETDOMAININFOLIST, Domain0 can get a list of all domains with the following parameters:

• Numeric ID;

• Status (dying | shut down | paused | blocked | running);

• Memory pages allocated to the domain, and its memory limit;

• Number of virtual CPUs (VCPUs);

• Aggregate CPU usage, in nanoseconds.

It is also possible to get the CPU usage for each individual VCPU by using the hypervisor call DOM0_GETVCPUINFO. The CPU usage reported for a domain is calculated as the sum over that domain’s VCPUs.
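Because the CPU usage counters are cumulative nanosecond totals, utilization has to be derived from the difference between two samples. The sketch below shows one way to do this; the function names and sample values are hypothetical, and normalizing by the number of physical CPUs is an assumption about how utilization is defined.

```python
NS_PER_SECOND = 1_000_000_000


def domain_cpu_ns(vcpu_ns):
    """Aggregate domain CPU time as the sum over its VCPUs."""
    return sum(vcpu_ns)


def domain_cpu_utilization(cpu_ns_start, cpu_ns_end, interval_seconds,
                           physical_cpus):
    """Fraction of total physical CPU capacity used between two samples."""
    if interval_seconds <= 0 or physical_cpus <= 0:
        raise ValueError("interval and CPU count must be positive")
    used_ns = cpu_ns_end - cpu_ns_start
    capacity_ns = interval_seconds * NS_PER_SECOND * physical_cpus
    return used_ns / capacity_ns


# Invented samples for a two-VCPU domain, taken 10 seconds apart.
start = domain_cpu_ns([3_000_000_000, 1_000_000_000])
end = domain_cpu_ns([5_000_000_000, 2_000_000_000])
utilization = domain_cpu_utilization(start, end, interval_seconds=10,
                                     physical_cpus=2)
```

Here the domain consumed 3 CPU-seconds out of 20 available, so it used 15% of the host's CPU capacity during the interval.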

3.4 Device data

Individual PCI devices can be assigned to a given domain, called a driver domain, to allow that domain direct access to the PCI hardware. Normally, Domain0 is the only driver domain.

The driver domain is the only domain that sees physical block devices. Other guest domains see virtual block devices (VBDs) instead. VBDs can be backed by an entire physical device (either a disk or a disk partition), by a logical volume, or by a file. Such information is part of the domain configuration data. Standard Linux facilities report virtual disk I/O statistics for VBDs in guest domains other than the driver domain. The physical disk I/O statistics for physical block devices are reported in the driver domain only.

Similarly, the driver domain is the only domain that sees both virtual and physical network adapters. Other guest domains see only virtual adapters (VIFs). In each domain, virtual adapters are given names eth0, eth1, etc., as real Ethernet adapters would be. Each virtual adapter is connected to a peer virtual adapter as follows: ethX in domain #Y is connected to vifY.X in Domain0. This includes eth0 in Domain0, which is connected to vif0.0. Physical adapters are visible in the driver domain as peth0, peth1, etc. However, their MAC addresses are visible on ethX instead of pethX.

Virtual adapters in guest domains are assigned virtual MAC addresses that are part of the domain configuration. This enables the correlation of data collected from within a guest OS to data collected from hypervisor.
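The ethX-to-vifY.X naming rule described above is simple enough to encode directly, which helps when joining network statistics collected in Domain0 with those collected inside a guest. The helper names below are hypothetical; only the naming convention itself comes from the text.

```python
import re

_VIF_PATTERN = re.compile(r"^vif(\d+)\.(\d+)$")


def backend_vif_name(domain_id, eth_index):
    """Name of the Domain0 peer for ethX in domain #Y, that is vifY.X."""
    return f"vif{domain_id}.{eth_index}"


def guest_interface(vif_name):
    """Map a vifY.X name back to (domain id, guest interface name)."""
    match = _VIF_PATTERN.match(vif_name)
    if match is None:
        raise ValueError(f"not a vif name: {vif_name!r}")
    return int(match.group(1)), f"eth{match.group(2)}"
```

For example, backend_vif_name(3, 1) yields "vif3.1", and guest_interface("vif0.0") yields (0, "eth0"), matching the Domain0 case mentioned above.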

4. Performance analysis and modeling

4.1 Data collecting


Traditional performance analysis collects performance metrics from the operating system. A data collector runs as an application within the OS. In our experiments, we run a regular data collector within each OS image. It collects performance metrics for system configuration and statistics, disk configuration and statistics, network configuration and statistics, and process statistics.

In addition to the regular metrics from the OS, we also collect Xen-specific metrics via hypervisor calls in Domain0. These include the configuration and statistics of each domain hosted by Xen.

The configuration data is sampled every hour, while the statistics data is sampled every 10 seconds. All data is summarized into 15-minute spills.
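The reduction of 10-second samples into 15-minute spills can be sketched as a simple bucketing step. The text does not say how the samples are summarized, so averaging is an assumption here, as are the function name and the synthetic samples.

```python
from collections import defaultdict

SPILL_SECONDS = 15 * 60


def summarize_into_spills(samples, spill_seconds=SPILL_SECONDS):
    """Average (timestamp_seconds, value) samples per spill interval.

    Returns a dict mapping each spill's start time to the mean value.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        spill_start = timestamp - timestamp % spill_seconds
        buckets[spill_start].append(value)
    return {start: sum(values) / len(values)
            for start, values in sorted(buckets.items())}


# Synthetic data: one sample every 10 seconds for 30 minutes.
samples = [(t, 20.0 if t < 900 else 40.0) for t in range(0, 1800, 10)]
spills = summarize_into_spills(samples)
```

Each spill here covers 90 samples, so the two spills hold the means of the first and second 15-minute windows.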

4.2 Baseline

We start our study with a controlled workload running on a dedicated Linux system, and obtain CPU Utilization, as shown in Figure 2.

[Chart: CPU Utilization of Workloads on Standalone Linux. Y-axis: CPU utilization (%), 0.00 to 35.00; X-axis: time, 12:00 to 13:45; series: Others, PD-Perform-Agent, work.]

Figure 2. CPU utilization of standalone Linux

We use BMC® Performance Assurance® for Servers in our study. The estimated response time closely matches the response time measured by the workload itself, as illustrated in Figure 3. This demonstrates that the traditional performance analysis and modeling tool for regular Linux (or UNIX in general) works well for standalone systems.

4.3 Traditional method does not work

Next we set up a Xen virtual server using Fedora Core 5, a preview of RHEL 5. It is configured with two guest domains in addition to Domain0. We start with a simple configuration that can be easily extended to other configurations.

[Chart: Response Time of Workload "work". Y-axis: response time (sec), 21.00 to 23.50; X-axis: time, 12:00 to 13:45; series: Evaluated, Measured.]

Figure 3. Response time of standalone Linux

We first run the same workload in one of the guests, and leave the other idle. Both the CPU utilization and throughput remain nearly the same as on the standalone system, although the response time increases slightly. We then run the same workload in both guests. With the added competition, the CPU utilization decreases and the response time increases significantly, as illustrated in Figures 4 and 5.

[Chart: CPU Utilization of VM1 with Competition. Y-axis: CPU utilization (%), 0.00 to 40.00; X-axis: time, 18:00 to 19:30; series: Others, PD-Perform-Agent, work.]

Figure 4. CPU utilization of VM1 with competition

[Chart: Response Time Comparison. Y-axis: 0.00 to 70.00; X-axis: 1 to 8; series: Standalone, Without Comp, With Comp.]

Figure 5. Response time comparison

When we use the same analysis method as was used on the standalone system, the results are far from the actual measurements, especially when there is competition among guests. Figures 6 and 7 demonstrate the difference between estimated and measured response time.

[Chart: Response Time of work@VM1 without Competition. Y-axis: Response Time (%), 0 to 40; X-axis: time, 20:00 to 21:45; series: Evaluated, Measured.]

Figure 6. Response time of VM1 without competition

[Chart: Response Time of work@VM1 with Competition. Y-axis: 0.00 to 80.00; X-axis: time, 18:00 to 19:45; series: Evaluated, Measured.]

Figure 7. Response time of VM1 with competition

Clearly, the traditional performance analysis and modeling method for regular systems does not work in the virtual server environment.

4.4 Consolidating workloads

In older versions of XenLinux, the standard Linux accounting was based on virtual CPU time, and therefore could not be used for performance modeling.

As of 3.0.2, Xen implements “steal” (involuntary wait) time accounting in Linux under the paravirtualization configuration. As a result, the standard Linux accounting in the Xen kernel tracks real time instead of virtual time. Thus, the performance data collected from within the guest OS may be used for performance modeling.

In this case, the involuntary wait time becomes an important metric because it keeps track of the time when there is a process in a ready state but the CPU is not available. It indicates the performance impact due to the competition from other domains. Figure 8 shows the relationship between the CPU time and the involuntary wait time.
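Inside a Linux guest, involuntary wait is commonly exposed as the "steal" field, the eighth counter on the aggregate cpu line of /proc/stat. The sketch below computes the steal share between two snapshots of that line. It assumes the guest kernel exports the field there, and the snapshot values are invented.

```python
def parse_cpu_line(line):
    """Return the numeric counters from the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(x) for x in fields[1:]]
    if len(values) < 8:
        raise ValueError("kernel does not report a steal counter")
    return values


def steal_fraction(before_line, after_line):
    """Share of elapsed CPU time spent in involuntary wait (steal)."""
    before = parse_cpu_line(before_line)
    after = parse_cpu_line(after_line)
    # Use the first eight counters only: user, nice, system, idle,
    # iowait, irq, softirq, steal. Later guest counters overlap with user.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    return deltas[7] / total if total else 0.0


# Invented snapshots of the cpu line, in clock ticks.
before = "cpu 100 0 50 800 10 0 0 40"
after = "cpu 160 0 70 1400 20 0 0 350"
fraction = steal_fraction(before, after)
```

In this invented interval, 310 of 1000 elapsed ticks were stolen, so the guest was ready to run but had no physical CPU for 31% of the time.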

[Chart: Involuntary Wait Time vs. CPU Time. Y-axis: 0% to 40%; X-axis: time, 18:00 to 19:45; series: VM1 CPU, VM1 IV wait, VM2 CPU, VM2 IV wait, Domain0 CPU.]

Figure 8. Involuntary wait time vs. CPU time

As a modeling workaround, we may build a performance model for each individual guest using existing tools. Next, we revise the configuration of Domain0 to match the capacity of the entire physical server. We then move all workloads from each individual guest to the updated system of Domain0. In this way, all workloads from various isolated guests are competing together. This method produces reasonable results as shown in Figure 9.

[Chart: Response Time of work@VM1 with Competition. Y-axis: 0 to 70; X-axis: time, 18:00 to 19:45; series: Evaluated, Measured.]

Figure 9. Response time using workaround

4.5 Modeling at hypervisor level

The previous workaround does not apply to fully-virtualized domains that run unmodified kernels and see virtual CPU time only. Moreover, it requires a performance model for every guest running on the virtual server. Thus, a data collector has to run on each guest. This would incur considerable overhead when a large number of guests run together.

Since the most commonly used configuration of a virtual server is to have each application running on an isolated virtual machine, there is not much need for workload characterization at the process level. Rather, each virtual machine (guest) can be seen as a workload running on the physical server. Hence, it is not necessary to instrument every guest OS; one only needs to collect performance data at the hypervisor level via Domain0.

When we ran the workloads in the Xen guests for the experiment in the previous section, we collected Xen-specific data as well. Figure 10 illustrates the CPU utilization for each domain. Although the data matches the data collected within each guest, the two come from entirely different sources. Furthermore, we still have accurate CPU accounting for fully-virtualized domains, while the accounting from the OS would be “virtual”.

[Chart: Overall CPU Utilization from Hypervisor. Y-axis: 0% to 70%; X-axis: time, 18:00 to 21:30; series: Domain0, VM1, VM2.]

Figure 10. Overall CPU utilization from hypervisor

Xen is mostly comparable with VMware ESX Server, especially under full virtualization. There are existing performance analysis and modeling tools for VMware ESX Server that are proven and work well.

With the configuration and statistics data for each domain hosted by Xen, we use BMC® Performance Assurance® for Virtual Servers to model Xen. Figures 11 and 12 show the breakdown of CPU degradation of the two guests based on the performance model built with hypervisor level data. These metrics provide the quantified indicator on the performance impact among guests within the same virtual server.

[Chart: CPU Degradation of VM1. Y-axis: CPU degradation (%), 0 to 100; X-axis: time, 18:00 to 21:30; series: CPU degradation, CPU degradation by other guests.]

Figure 11. CPU degradation of VM1

[Chart: CPU Degradation of VM2. Y-axis: CPU degradation, 0 to 120; X-axis: time, 18:00 to 21:30; series: CPU degradation, CPU degradation by other guests.]

Figure 12. CPU degradation of VM2

5. Conclusion

Xen provides an alternative virtualization solution. We have demonstrated that traditional performance analysis and modeling methods do not work in a virtualized environment. We then presented a practical approach for measuring and modeling the performance of a Xen virtual server. Correlating the data from the hypervisor with data from within a guest OS provides both top-down and bottom-up views of performance on a Xen server.

References

[1] Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt and Andrew Warfield, Xen and the Art of Virtualization, Proceedings of the ACM Symposium on Operating Systems Principles, 2003.

[2] Ian Pratt, Keir Fraser, Steven Hand, Christian Limpach and Andrew Warfield, Xen 3.0 and the Art of Virtualization, The 6th Free and Open source Software Developers' European Meeting, Brussels, 2006.

[3] S-expression, http://en.wikipedia.org/wiki/S-expression.

[4] The Xen virtual machine monitor, http://www.cl.cam.ac.uk/Research/SRG/netos/xen.

[5] XenSource, http://www.xensource.com/.