VBLOCK™ SOLUTION FOR KNOWLEDGE WORKER ENVIRONMENTS WITH VMWARE VIEW 4.5
Version 2.0
February 2013
Copyright 2013 VCE Company, LLC. All Rights Reserved.
Contents
Introduction
About this document
Solution overview
Objectives
Audience
Feedback
Technology overview
Vblock™ Systems
Vblock System 720
Vblock System 320
Vblock System 200
Vblock System 100
VMware View 4.5
Architecture overview
Design considerations
Unified Computing System configuration
LAN cloud configuration
VMware virtual infrastructure
VMware vSphere ESX/ESXi server
VMware vSphere advanced parameters
VMware View 4.5
Solution validation
Test environment design
Define a workload profile
Define a run profile
Test procedures
Test results
Application response times
vSphere Server utilization
Cisco
EMC
Appendix A: Characteristics of knowledge workers
Introduction
About this document
The challenges related to traditional desktop deployment and day-to-day administration include lost laptops containing corporate data, security breaches related to viruses or hackers, and simply ensuring IT resources can maintain the required service level agreements (SLA). In addition to the challenges of operational management, IT must also consider implications of broader system-wide issues such as compliance, corporate governance, and business continuity strategies.
Organizations are turning to virtual desktop technologies to address the operational and strategic issues related to traditional corporate desktop environments. VMware View provides a virtual desktop infrastructure (VDI) environment that is secure, cost-effective, and easy to deploy. VMware View can also meet the demanding needs of different types of user profiles, whether on the corporate LAN or on the WAN. Combining VMware View with Vblock™ Systems ensures a high-quality user experience, which in turn drives acceptance of the virtual desktop deployment within organizations.
This document describes the solution architecture for VMware View 4.5 for managing virtual desktops on Vblock Systems. Specifically, this solution is built on the Vblock System 700.
Solution overview
Today, enterprise environments, which consist of various types of users such as knowledge workers, task workers, and power users, are planning, testing, and transitioning from physical desktops and laptops to a virtual desktop environment. The workload characteristics of these users vary across enterprise environments. The I/O profiles required for virtual desktops may differ based on the user type, use type, and applications installed on the virtual desktop. Task workers generate workloads that are mostly network-centric. Knowledge workers and power users, on the other hand, generate workloads that exercise the compute and storage components of the infrastructure supporting the virtual desktop environment.
The Vblock™ Solution for Knowledge Worker Environments with VMware View 4.5 provides an efficient path to meet the increased demands of knowledge workers while maximizing performance for users and balancing the storage and compute resources. The solution allows:
§ The consolidation of the desktop environment into one infrastructure behind the firewall, making it easy to update the operating system, patch applications, ensure compliance, perform application migrations, and provide support from a central location. You can deliver a consistent user experience for employees, whether they are at corporate locations, conducting travel, or located at a remote office. Using this solution, you will spend less time reacting to regulatory compliance and security issues and more time adding value to your business.
Objectives
This solution architecture paper presents an overview of a VMware View 4.5 solution on the Vblock System. It illustrates a virtual desktop solution designed for knowledge workers and addresses the following aspects:
§ An architectural overview
§ Information for configuring Vblock for deploying View 4.5
§ Design considerations
§ Performance validation results
Audience
This document is for sales engineers, field consultants, advanced services specialists, and customers who will configure and deploy a virtual desktop solution that centralizes data and applications to provide desktops as a managed service.
Feedback
To suggest documentation changes and provide feedback on this paper, send email to
[email protected]. Include the title of this paper, the name of the topic to which your comment
Technology overview
This section summarizes technologies used in the solution.
Vblock™ Systems
The Vblock System from VCE is the world's most advanced converged infrastructure—one that optimizes infrastructure, lowers costs, secures the environment, simplifies management, speeds deployment, and promotes innovation. The Vblock System is designed as one architecture that spans the entire portfolio, includes best-in-class components, offers a single point of contact from initiation through support, and provides the industry's most robust range of configurations.
Vblock System 720
The Vblock System 720 is an enterprise, service provider class mission-critical system in the Vblock System 700 family, for the most demanding IT environments—supporting enterprise workloads and SLAs that run thousands of virtual machines and virtual desktops. It is architecturally designed to be modular, providing flexibility and choice of configurations based on demanding workloads. These workloads include business-critical enterprise resource planning (ERP), customer relationship management (CRM), and database, messaging, and collaboration services. The Vblock System 720 leverages the industry’s best director-class fabric switch, the most advanced fabric based blade server, and the most trusted storage platform. The Vblock System 720 delivers greater configuration choices, 2X performance and scale from prior generations, flexible storage options, denser compute, five 9s of availability, and converged network and support for a new virtualization platform that accelerates time to service and reduces operations costs.
Vblock System 320
Vblock System 200
The Vblock System 200 is right-sized to meet the capacity, workload, and space requirements of mid-sized data centers and distributed enterprise remote offices. By leveraging the Vblock System 200, companies experience the repeatability, architecture standardization, implementation flexibility, and business results synonymous with Vblock Systems.
With pre-defined, variable configurations, the Vblock System 200 balances real workload requirements with fastest time to value, reducing risk and complexity. The Vblock System 200 is designed to:
§ Bring the power and benefits of the Vblock System family into a value-focused solution
§ Deliver core IT services (file/print and domain) for mid-sized data centers and distributed enterprise remote locations
§ Provide development/test and co-location data center support
§ Efficiently handle mixed workload requirements for mid-sized data centers
§ Offer business applications with data segregation requirements (such as eDiscovery and eArchive) with predictable performance and operational characteristics
Vblock System 100
The Vblock System 100 is right-sized to meet the capacity, workload, and space requirements of mid-sized data centers and distributed enterprise remote offices. By leveraging the Vblock System 100, companies experience the repeatability, architecture standardization, and business results synonymous with Vblock Systems.
With pre-defined fixed configurations, the Vblock System 100 is designed to:
§ Bring the power and benefits of the Vblock System family into a value-focused solution
§ Deliver core IT services (file/print and domain) for mid-sized data centers and distributed enterprise remote locations in industries such as healthcare and advanced manufacturing
§ Offer dedicated local instance business application support including VDI, SharePoint, and Exchange
VMware View 4.5
Using VMware View’s virtual desktop infrastructure technologies, which include VMware View Manager’s administrative interface, desktops can be quickly and easily provisioned using templates. The technology permits rapid creation of virtual desktop images from one master image, enabling administrative policies to be set, and patches and updates applied to virtual desktops in minutes, without affecting user settings, data, or preferences.
The table lists VMware View 4.5 key components and their descriptions.
Component | Description
View Connection Server | Acts as the broker for client connections. It authenticates users through Active Directory and then directs that request to the virtual desktop.
View Client | Client software for accessing the virtual desktop from a Windows PC, a Mac PC, or a tablet. The administrator can configure the client to allow users to select a display protocol such as PCoIP or RDP.
View Agent | Enables discovery of the virtual machine used as the template for virtual desktop creation. Additionally, the agent communicates with the View Client to provide features such as access to local USB devices, printing, and monitoring connections.
View Manager | An enterprise-class desktop management solution that streamlines the management, provisioning, and deployment of virtual desktops. The View Manager is installed at the same time as the Connection Server, and allows the user to administer the Connection Server. For this solution architecture a single instance of the View Manager was used for deploying and managing both the 768 and 1,536 desktop environments.
Centralized Virtual Desktops | A method for managing virtual desktops that enables remote sites to access virtual desktops residing on server hardware in the data center.
View Composer | An optional tool that uses VMware Linked Clone technology, which
Figure 1 illustrates the VMware View physical architecture.
Architecture overview
Figure 2 shows the Vblock System infrastructure for the VMware View 4.5 solution architecture topology.
Figure 2. Vblock System 700 and VMware View 4.5 solution architecture
Vblock System 700 with VMAX storage contains hardware and software components as listed in the table below:
Hardware | Software
Cisco Unified Computing System with B200 M2 Series Blades with 3.33 GHz Intel Xeon 6 core CPU, 96 GB RAM | Cisco UCS Manager
Cisco Nexus 5000 Series Switch | EMC Symmetrix Management Console
EMC Symmetrix VMAX Storage | EMC PowerPath/VE
Note: Required in addition to the above components is an environment with Active Directory, DNS, DHCP, and
Design considerations
Unified Computing System configuration
The following are the configuration details of the service profile template used to create service profiles for deploying VMware vSphere ESXi 4.1.
Note: This solution architecture assumes that EMC Ionix Unified Infrastructure Manager (UIM) was pre-configured on the Vblock System according to the logical build guide.
LAN cloud configuration
§ VLANs
Figure 3 shows the list of VLANs configured in the Vblock and usable by UIM. The VLan16 VLAN was used for accessing the Internet and View Clients. The VLan20 VLAN was used for the vMotion network.
Figure 3. VLANs global to the Vblock
§ WWN pools
Figure 4 shows the WWN pools defined in UIM.
§ SAN pin group
One SAN pin group was created in the UCS Manager using the port channels configured to segregate traffic for a two-chassis configuration. For scaling, one more SAN pin group and segregated traffic were created between the remaining chassis. This configuration optimized traffic and was transparent to UIM. Figure 5 shows the SAN pin group.
Figure 5. SAN pin group
§ Storage Array (EMC Symmetrix VMAX) configuration
A single-engine VMAX system in the Vblock System was used for testing the virtual desktop deployment. The ESXi clusters, which contained hosts from a single chassis, were mapped to four front-end ports of the VMAX system through the pool definition. All the desktops on the ESXi clusters were laid out on Fibre Channel disks at the array back-end. These were presented as virtually provisioned LUNs to the front end of the VMAX system, and then presented to UIM as a storage pool.
§ VMAX LUNs
Figure 6. VMAX LUNs
§ Symmetrix Virtual Provisioning pool and LUNs
A single Virtual Provisioning pool named vcevdipool was created from Fibre Channel (FC) back-end storage devices, which were FC disks of 15K RPM 450 GB capacity. The thin pool consisted of 48 data devices for the 768 desktop deployment and 96 data devices for the 1,536 desktop deployment. The pool was used for storing the linked clone user desktops. For testing a configuration of 1,536 user desktops, 24 FC LUNs were created for use by UIM using the vcevdipool with RAID 5 protection. Figure 7 shows the details of the Virtual Provisioning pool and the LUNs.
§ VMAX (VMware data stores)
A second pool was created for use in UIM to provide the data store capacity. Figure 8 shows the pool details from a vCenter perspective.
Figure 8. VMware data stores
VMware virtual infrastructure
VMware vSphere ESX/ESXi server
Figure 9. ESXi clusters: Infra and VBLOCK-2A
VMware vSphere advanced parameters
No specific advanced parameters were tuned for this testing. All the VAAI parameters were set to their default values.
Data stores
A total of 24 data stores were used for storing the linked clones. The size of each data store was 939 GB. The data store VCE-VDI-Infrastructure was used specifically to store gold images of Windows 7 desktops and the swap files for all the virtual machines. Figure 10 shows the 24 data stores used.
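The data store figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the 1,536 linked clones are spread evenly across the 24 data stores; the function name and the returned keys are illustrative only and do not correspond to any VMware or EMC tool.

```python
# Figures taken from the text: 24 data stores of 939 GB each, 1,536 linked clones.
DATASTORE_COUNT = 24
DATASTORE_SIZE_GB = 939
DESKTOP_COUNT = 1536


def datastore_layout(desktops: int, datastores: int, size_gb: float) -> dict:
    """Return per-data-store density and an upper bound on space per clone.

    Assumption: clones are distributed evenly across the data stores.
    """
    if datastores <= 0:
        raise ValueError("datastores must be positive")
    clones_per_datastore = desktops / datastores
    return {
        "clones_per_datastore": clones_per_datastore,
        "total_capacity_gb": datastores * size_gb,
        # Upper bound only: assumes the data store holds nothing but linked clones.
        "max_gb_per_clone": size_gb / clones_per_datastore,
    }


layout = datastore_layout(DESKTOP_COUNT, DATASTORE_COUNT, DATASTORE_SIZE_GB)
print(layout)
```

Under that assumption the sketch yields 64 clones per data store, which agrees with the 64 virtual machines per data store stated in the test profile.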
VMware View 4.5
In this environment, a single View Connection Server handled 1,536 virtual desktops. Figure 11 illustrates the integration of vCenter Server with VMware View 4.5 and also shows View Composer as enabled.
Figure 11. vCenter Server integration with VMware View 4.5
Figure 13 shows the configuration details for the event database configured for logging all the occurring events.
Figure 13. Configuration details of event database
Virtual desktop pools
For testing the virtual desktop environment, two desktop pools were created to accommodate 1,536 users. For ease of scaling, two pools of 768 users were used. The test was run with one pool of 768 desktops on a single UCS chassis and with two pools for a total of 1,536 desktops. Figure 14 shows the list of pools.
Solution validation
Test environment design
This section describes the test environment, the choice of tools used for validation, and the procedures used for the workload characterization.
The solution validation used VMware View Planner and defined a workload and run profile for running virtual desktops on the Vblock System 700.
Note: View Planner is a virtual desktop workload generator and sizing tool that enables the measurement of the performance characteristics of the various components of the VDI deployment, such as ESX servers, virtual desktops, and the back-end storage.
The following two steps were involved in using View Planner:
1. Define a workload profile
2. Define a run profile
Define a workload profile
The first step was to define a workload that specified the applications to run. These applications ranged from simple applications such as a web browser to applications such as Microsoft Excel, Word, and so forth. For this solution, a list of applications and operations representative of a knowledge worker was chosen. Refer to Appendix A: Characteristics of knowledge workers for more information.
Define a run profile
Next, we created a run profile to associate the workload profile with a set of virtual machines that executed the defined workload. The table below shows the parameters specified when creating the run profile:
Parameter | Description
Number of virtual machines | Number of virtual machines to run the workload.
Run type | The run type can be local, remote, or passive.
§ The local run type runs the workload locally on the virtual desktop.
§ The remote run type executes the workload through a client such as a physical desktop connected to a virtual desktop.
§ The passive run type runs the workload on several virtual desktops using a single physical or virtual desktop client.
Display protocol | Specified as either RDP or PCoIP and is associated with a set of Active Directory groups. Each group consists of a set of users on the virtual machines.
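The run profile parameters in the table above can be captured in a small data structure for planning purposes. The following is a minimal sketch: the `RunProfile` class and its field names are hypothetical and are not part of View Planner, and the values passed in the example are illustrative rather than the settings used in this validation.

```python
from dataclasses import dataclass

# Allowed values as listed in the run profile table.
RUN_TYPES = {"local", "remote", "passive"}
DISPLAY_PROTOCOLS = {"RDP", "PCoIP"}


@dataclass(frozen=True)
class RunProfile:
    """Hypothetical container for run profile parameters (not a View Planner API)."""

    vm_count: int
    run_type: str
    display_protocol: str

    def __post_init__(self) -> None:
        # Reject values that the table does not list.
        if self.vm_count <= 0:
            raise ValueError("vm_count must be positive")
        if self.run_type not in RUN_TYPES:
            raise ValueError(f"run_type must be one of {sorted(RUN_TYPES)}")
        if self.display_protocol not in DISPLAY_PROTOCOLS:
            raise ValueError(
                f"display_protocol must be one of {sorted(DISPLAY_PROTOCOLS)}"
            )


# Illustrative values only.
profile = RunProfile(vm_count=768, run_type="remote", display_protocol="PCoIP")
print(profile)
```

Validating the values at construction time keeps a mistyped run type or protocol from propagating into a test plan.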
Test procedures
This section covers the user workload environment and simulation used in the validation. This solution assumes a virtual desktop user to be a high-end knowledge worker whose characteristics are described in Appendix A: Characteristics of knowledge workers. Each virtual desktop is equipped to run a workload that simulates typical user behavior, using an application set commonly found and used across a broad array of desktop environments. Each virtual desktop user was created on a 64-bit Windows 7 desktop with 2 GB of RAM. The View Planner workload profile was defined as an application set consisting of Microsoft Office 2007 applications, Adobe Acrobat Reader, Internet Explorer, WinZip, Mozilla Firefox, and Windows Media Player. The think time for this workload was defined to be 20 seconds and the number of iterations was defined as 8. Based on the think time and the words per minute used for this validation, this workload can be considered equivalent to that of a high-end knowledge worker. Two workloads were run:
§ 768 desktop users on one UCS chassis and other common hardware
§ 1,536 desktop users on two similar UCS chassis and other common hardware
The latter workload was executed to show the linear scaling capabilities of the Vblock System 700. The table below describes the profile used for testing:
Component Description
UCS blades § The UCS B-Series Blade Server Chassis used in this testing and validation effort was capable of supporting eight half-width blade servers at full capacity.
§ Each blade had two 10G converged network adapters to carry network and storage data.
§ Each chassis was populated with eight B200 M2 blades running the Intel Xeon 5600 CPU chipset and 96 GB of RAM.
§ The eight blades in each chassis were configured as a VMware HA cluster managed by a single VMware vCenter Server 4.1.
§ The virtual desktops were hosted on data stores, which were mapped to LUNs on the VMAX storage array.
VMAX system § The VMAX system was set up so that the configuration could scale with the number of desktops.
Data stores and linked clones § Twelve to 24 data stores (12 data stores for 768 desktops and 24 data stores for 1,536 desktops) were created and used to store the Windows 7 64-bit linked clones.
§ All of the linked clones were created from the same parent virtual machine.
§ This configuration resulted in 64 virtual machines on each data store as per VMware’s best practice recommendation.
ESXi Clusters § A separate cluster consisting of two ESXi hosts was used to host the common infrastructure components, such as Active Directory, DNS, DHCP, and View Manager, needed for an enterprise desktop environment.
§ Each desktop infrastructure service was implemented as a virtual machine running Windows 2008 R2.
§ The infrastructure cluster also hosted load clients (also known as "witness" clients). These witness clients had the View Client 4.1 installed in a Windows XP virtual machine.
§ An automated approach was used to launch a View Client on each witness virtual machine and then establish a session with a corresponding Windows 7 desktop.
Test results
This section describes the test results for the ESXi host memory and CPU utilization, and the storage metrics, using both IOPS and response time.
The most critical metric in a virtual desktop validation is the user application response time. In this testing, the system was optimized so that the user response time for any application activity was less than 3 seconds. As expected, when the number of users was increased by adding an additional blade, the application response time increased only marginally. The validation effort also focused on ensuring that the back-end storage array response time was lower than 1 ms, in order to ensure exceptional I/O performance characteristics. The CPU and memory characteristics provide information on the behavior of a representative blade used in the validation.
Application response times
Figures 17 and 18 show the overall application response times as measured by the open and close times of the various Microsoft Office applications. When 768 desktop sessions were hosted on the UCS chassis, it meant that 96 Windows 7 desktops were running the user simulation workload on each blade in the chassis. Similarly, the 1,536 desktop sessions show the response time
Figure 17. Application response time with 768 users
Figure 18. Application response time with 1,536 users
vSphere Server utilization
The charts below show the resource utilization of one representative blade server during the steady state. Similar characteristics were observed on the remaining blades of the test harness.
CPU utilization
Figure 19. CPU utilization with 768 users
Figure 20. CPU utilization with 1,536 users
Memory utilization
Figures 21 and 22 illustrate the memory utilization of a representative ESXi server from the cluster hosting the virtual desktops. Each UCS blade was configured with 96 GB of physical RAM. With each blade running 96 desktop sessions, memory was over-committed. ESXi allows for the over-commitment by using techniques such as page sharing, ballooning, and swapping to disk. All of these memory over-commitment techniques were observed during the testing and validation effort. Ballooning and swapping activity were observed in each test case and were found to be the same as the desktop environment scaled from 768 desktops on one UCS chassis to 1,536 desktops on two UCS chassis.
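The degree of over-commitment follows directly from the figures in this paper. The following is a minimal sketch using the 2 GB desktop size and the eight blades per chassis from the test procedures; it deliberately ignores hypervisor and per-virtual-machine memory overhead, which is a simplifying assumption.

```python
# Figures taken from the text.
BLADES_PER_CHASSIS = 8
DESKTOPS_PER_CHASSIS = 768
DESKTOP_RAM_GB = 2    # configured RAM per Windows 7 desktop
BLADE_RAM_GB = 96     # physical RAM per B200 M2 blade

# Desktop sessions hosted on each blade.
desktops_per_blade = DESKTOPS_PER_CHASSIS // BLADES_PER_CHASSIS

# Memory configured across all desktops on one blade.
# Assumption: hypervisor and per-VM overhead are ignored.
configured_gb = desktops_per_blade * DESKTOP_RAM_GB

# Ratio of configured guest memory to physical memory.
overcommit_ratio = configured_gb / BLADE_RAM_GB

print(f"{desktops_per_blade} desktops per blade")
print(f"{configured_gb} GB configured on {BLADE_RAM_GB} GB physical")
print(f"over-commit ratio {overcommit_ratio:.1f}:1")
```

A ratio above 1 means the host relies on page sharing, ballooning, or swapping whenever the desktops actively touch more memory than is physically present.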
Note: Do not be concerned that the ballooning of memory appears to be very high in the above figures. User experience in desktop environments is affected only if the swapping activity is high. In our test
Figure 21. Memory utilization with 1,536 users
Figure 22. Memory utilization with 768 users
Storage metrics
VMAX front-end IOPS
Figures 23-28 plot the VMAX front-end I/O characteristics with 768 and 1,536 users.
Figure 23. VMAX front-end IOPS (4 front-end ports) with 768 users
Figure 25. VMAX front-end throughput with 768 users
Figure 27. VMAX front-end % busy with 1,536 users
Backend storage
Figures 29-32 show the corresponding statistics from the back end of the VMAX storage array. As the number of virtual desktops is scaled from 768 to 1,536, the back-end disk statistics scale almost linearly: the back-end I/Os increase by about 80% while the % disk busy remains the same at about 35%. This is because more disks were added when the environment was scaled from 768 to 1,536 desktops.
Figure 30. VMAX back-end % busy (average) with 768 users
Figure 32. VMAX back-end disk % busy (average) with 1,536 users
The steady-state IOPS on the VMAX front-end ports are approximately 573 for 768 users and 1,250 for 1,536 users. The utilization (% busy) of the front-end ports is an average of 15% for 768 users and 25% for 1,536 users. Note that the front-end utilization and IOPS increase significantly when all the desktops are started at the same time (boot storm event).
The back-end IOPS scale linearly as the number of desktops doubles, while the disk utilization remains almost the same. This is because 48 additional disks were added when the desktop environment was scaled from 768 to 1,536 desktops. From the above charts, it is clear that as the environment was scaled from 768 desktops to 1,536 desktops by adding a new UCS chassis and storage resources, the IOPS and other storage operations on the VMAX system scaled linearly.
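The scaling claim can be quantified from the steady-state front-end values reported above. The following is a minimal sketch that compares the growth in each metric with the growth in user count; the helper name `scaling_factor` is illustrative, and only the reported averages are used, not the underlying chart data.

```python
def scaling_factor(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return after / before


# Reported steady-state values for 768 and 1,536 users.
users = (768, 1536)
front_end_iops = (573, 1250)
front_end_busy_pct = (15, 25)

user_factor = scaling_factor(*users)
iops_factor = scaling_factor(*front_end_iops)
busy_factor = scaling_factor(*front_end_busy_pct)

# Front-end IOPS per desktop at each scale point.
iops_per_desktop = [iops / count for iops, count in zip(front_end_iops, users)]

print(f"users x{user_factor:.2f}")
print(f"front-end IOPS x{iops_factor:.2f}")
print(f"front-end % busy x{busy_factor:.2f}")
print([round(value, 2) for value in iops_per_desktop])
```

A metric factor close to the user factor indicates linear scaling. Here the front-end IOPS grow slightly faster than the user count, while port utilization grows more slowly.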
Scaling considerations
In this solution architecture, we have seen that virtual desktop environments can be scaled by either of the following:
Key findings
Following are some key findings:
§ CPU utilization scales linearly as additional users are added with an additional chassis and additional storage resources.
§ These tests were performed with Windows 7 desktop virtual machines configured with 2 GB of RAM, which is the high-end configuration for a virtual desktop. As such, memory resources are over-committed and there is ballooning and swapping activity on each vSphere host.
§ From a practical standpoint, a typical enterprise desktop deployment will see a mixture of desktop operating systems, each configured differently. As such, a lower CPU and memory utilization can be expected in such deployments.
§ The storage architecture utilized during this validation effort leveraged advanced features of the VMAX array. These features keep capacity and performance available for secondary storage capabilities, including backups, replication, and maintenance.
§ The Vblock System provides flexibility at the hardware layer by allowing a choice of CPU and memory configurations that can benefit the underlying VDI.
§ The Vblock System with VMAX storage is thus an ideal platform for running virtualized desktops and provides operational efficiencies that will help drive down TCO and increase ROI.
The above results demonstrate the following:
§ Scaling
- A Vblock System 700 with two UCS chassis, four front-end VMAX ports, and 96 disks is capable of running 1,536 Windows 7 desktop sessions running the knowledge worker user profile workload with View Planner. Scaling up to 3,072 desktops can be achieved by adding two more UCS chassis along with four more front-end ports and 96 more disks. With a single-engine Symmetrix VMAX, one can implement a virtual desktop environment of 3,000 desktops.
- Further scaling can be achieved by adding more VMAX engines and UCS chassis. A fully configured Vblock System 700 with eight UCS chassis and six VMAX engines can support up to 6,144 concurrent desktops. This is based on the assumption that each B200 M2 blade in the UCS chassis can support 96 desktops/users with 100% concurrency and applications that have high I/O requirements.
- If further scaling is desired on a single blade in the chassis, then architects should consider using the next-generation B230 blades, which have the next-generation Intel Westmere CPU as well as expanded memory.
§ Level of concurrency
- If the level of concurrency is lower, for example if only 80% of users are concurrent in an enterprise environment due to the geographic location of users, or if many users run applications that do not place high demands on the disk I/O subsystem, then an even higher number of desktops can be supported.
§ Component failures
- Knowledge workers have high expectations from the disk I/O subsystem, and this solution will not accommodate any failures such as a blade failure or bursts of I/Os from a few desktops in the architecture.
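The scaling figures above follow from the validated density of 96 desktops per blade and eight blades per chassis. The following is a minimal sizing sketch. The function `max_desktops` is hypothetical, and its treatment of concurrency (dividing active capacity by the concurrent fraction) is an assumption made here for illustration, not a result validated in this paper. It also ignores storage front-end ports, disk counts, and any headroom for component failures.

```python
# Validated figures from the text.
BLADES_PER_CHASSIS = 8
DESKTOPS_PER_BLADE = 96


def max_desktops(chassis: int, concurrency_pct: int = 100) -> int:
    """Estimate provisionable desktops for a number of UCS chassis.

    Assumption: capacity is limited only by active sessions, so a lower
    concurrency percentage allows proportionally more provisioned desktops.
    """
    if chassis <= 0:
        raise ValueError("chassis must be positive")
    if not 0 < concurrency_pct <= 100:
        raise ValueError("concurrency_pct must be in the range 1-100")
    active_capacity = chassis * BLADES_PER_CHASSIS * DESKTOPS_PER_BLADE
    # Integer arithmetic avoids floating-point rounding in the estimate.
    return active_capacity * 100 // concurrency_pct


for chassis in (2, 4, 8):
    print(chassis, "chassis:", max_desktops(chassis), "desktops at 100% concurrency")

print("2 chassis at 80% concurrency:", max_desktops(2, concurrency_pct=80))
```

At 100% concurrency the sketch reproduces the 1,536, 3,072, and 6,144 desktop figures quoted above. Any estimate at lower concurrency should be confirmed by testing before it is used for sizing.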
Conclusion
The solution architecture for VMware View 4.5 on Vblock System 700 enables organizations to quickly and predictably deploy centrally managed, secure, server-hosted virtual desktops. The Vblock System 700 validated infrastructure environment is a key enabler because the compute, network, and storage environments are tightly coupled, which in turn drives IT simplicity and greatly eases manageability. The technical validation of VMware View 4.5 on Vblock System 700 confirms the capability to host a large number of virtual desktops running knowledge worker–type workloads, with a high degree of concurrency that is characteristic of large enterprise environments.
Next steps
References
VMware
§ Workload Considerations for Virtual Desktop Reference Architectures
http://www.vmware.com/go/view4rawc
§ VMware vSphere 4 Guidance for Modifying vSphere’s Fairness/Throughput Balance
http://kb.vmware.com/kb/1020233
Cisco
§ Cisco Unified Computing System
http://www.cisco.com/go/unifiedcomputing
§ Cisco Data Center and Virtualization
http://www.cisco.com/go/datacenter
§ Cisco Validated Designs
http://www.cisco.com/go/designzone
EMC
§ EMC Symmetrix VMAX Family
http://www.emc.com/storage/Symmetrix-vmax/symmetrix-vmx.htm
§ EMC PowerPath/VE
Appendix A: Characteristics of knowledge workers
The table below outlines the various characteristics of a knowledge worker who is the primary user type for the virtual desktop environment described in this paper.
Characteristic | Knowledge Worker
CPU (cumulative) of moment of decline | 60-65%
I/O pattern (cumulative) R/W | 40/60
IOPS (per user) range | 7-59
IOPS (per user) average | 22
Block size average | 16K
Block size standard deviation | 4K
Application mix (number of apps) | 15+
Number of apps open concurrently | Many
Sizing difficulty | High
Applications and operations representative of knowledge workers
Figure 33 depicts the applications and the operations performed on them by a knowledge worker. Since the operations in a desktop typically happen at discrete intervals of time, often in bursts that can consume a lot of CPU cycles and memory, it is desirable not to have all desktops execute the same sequence of operations, as this is not representative of a typical VDI deployment and can cause resource over-commitment. To avoid synchronized swimming among desktops, View Planner randomizes the execution sequence in each desktop so that they are doing different things at any given instant of time and the load is evenly spread out.
Figure 34 depicts the sequence of workloads among a set of desktops.
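The idea of randomizing the execution sequence per desktop can be illustrated as follows. This is a minimal sketch and not View Planner's actual implementation: the function name, the seed parameter, and the use of a simple shuffle are assumptions made here for illustration. The application names come from the workload profile described in the test procedures.

```python
import random

# Application set named in the workload profile.
APPLICATIONS = [
    "Microsoft Word",
    "Microsoft Excel",
    "Adobe Acrobat Reader",
    "Internet Explorer",
    "WinZip",
    "Mozilla Firefox",
    "Windows Media Player",
]


def execution_sequences(desktop_count: int, applications: list, seed: int = 0) -> dict:
    """Give each desktop its own shuffled copy of the application list.

    Hypothetical helper: a fixed seed makes the plan reproducible between runs.
    """
    rng = random.Random(seed)
    sequences = {}
    for desktop_id in range(desktop_count):
        order = list(applications)  # copy so the input list is not modified
        rng.shuffle(order)
        sequences[desktop_id] = order
    return sequences


sequences = execution_sequences(desktop_count=8, applications=APPLICATIONS)
for desktop_id, order in sequences.items():
    print(desktop_id, order)
```

Because each desktop works through the same applications in a different order, bursts from any one application are less likely to coincide across desktops.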
ABOUT VCE
VCE, formed by Cisco and EMC with investments from VMware and Intel, accelerates the adoption of converged infrastructure and cloud-based computing models that dramatically reduce the cost of IT while improving time to market for our customers. VCE, through the Vblock Systems, delivers the industry's only fully integrated and fully virtualized cloud infrastructure system. VCE solutions are available through an extensive partner network, and cover horizontal applications, vertical industry offerings, and application development environments, allowing customers to focus on business innovation instead of integrating, validating, and managing IT infrastructure.
For more information, go to www.vce.com.