5.6
Microsoft Hyper-V 2008 R2 / SCVMM 2012
5,000 Users
Table of Contents
EXECUTIVE SUMMARY
ABBREVIATIONS AND NAMING CONVENTIONS
KEY COMPONENTS
SOLUTIONS ARCHITECTURE
USERS AND LOCATIONS
Enterprise Campus – Datacenter (3,775 users)
Large Branch Office (525 Users)
Small Branch Office (150 users)
Remote Access Users (600 Users)
NETWORK INFRASTRUCTURE
Network Design
Datacenter and Remote Office LAN Network Architecture Overview
WAN Network Architecture Overview
STORAGE INFRASTRUCTURE
Storage Planning
Storage Deployment
Infrastructure Storage
VDI Storage
COMMON INFRASTRUCTURE
Infrastructure Deployment Methodology
Physical Common Infrastructure
Virtualized Common Infrastructure
Common Infrastructure Services
Overview
DNS
DHCP
Hyper-V and SCVMM Infrastructure
XenDesktop Infrastructure
MODULAR VDI INFRASTRUCTURE
Infrastructure Deployment Methodology
Modular Block Overview
Modular Block Sizing
Modular Deployment Design
Modular Block Infrastructure
Hyper-V Virtualization for VDI
SCVMM for VDI
SCVMM Host
SCVMM Cluster
SCVMM Library File Server
SCVMM SQL
Provisioning Services (PVS) for VDI
PVS Server Networking
PVS Farm
PVS File Servers
PVS SQL
User Profile Management
Multi-Site Infrastructure
Branch Offices
Session Launching
Performance Capturing
In-Session Workload Simulation
RESULTS
PERFORMANCE RESULTS
Performance Results – Boot Storm
PVS for VDA Performance (Boot Storm)
SCVMM for VDA Performance (Boot Storm)
Performance Results – Test Run
Hyper-V for VDA Performance (Test Run)
PVS for VDA Performance (Test Run)
SCVMM for VDA Performance (Test Run)
SCVMM for VDA Library Server File Server Performance (Test Run)
Multi-Site Performance (Test Run)
ADDITIONAL TESTING: PVS WITH PERSONAL VDISK (PVD)
Objective
Results
LESSONS LEARNED
CONCLUSIONS
APPENDIX
DOCUMENTS REFERENCES
HARDWARE CONFIGURATION
Active Directory Physical Domain Controller Configuration:
Hyper-V Cluster – Pool Specifications
Hyper-V Host for Virtual Desktops Configuration (Host #1):
Hyper-V Host for Virtual Desktops Configuration:
PVS Servers Configuration for each Modular Block:
SCVMM Host for Virtual Desktops Configuration:
HARDWARE SPECIFICATIONS
Servers
Storage Systems
Network Switches
Junipers Network Appliances
AGEE Network Appliances
SDX Network Appliances
BRVPX Network Appliances
Repeater Network Device
MULTI-SITE PERFORMANCE (TEST RUN)
BR VPX - Performance / Utilization
Executive Summary
This reference architecture documents the design, deployment, and validation of a Citrix Desktop Virtualization solution that leveraged the best-of-breed hardware and software vendors in a real-world environment. This design included Microsoft Hyper-V and SCVMM 2012, HP blade servers, NetApp storage arrays, and Cisco networking. Five modular blocks (or 5,050 virtual desktops) consisted of users divided into the following groups: 3,775 in the Datacenter, 675 in 2 Branch Offices, and 600 remote users.
This Desktop Virtualization reference architecture was built with the following goals:
• Leverage a modular design to allow for linear scalability and growth by adding additional modular blocks
• Design an environment to be highly available and resilient
• Architect a virtual desktop solution that can support users in different geographic locations, such as branch office workers and remote access users
This Citrix Desktop Virtualization reference architecture was tested using the industry-standard Login VSI benchmark at the medium workload. Below are the notable high-level findings from the deployment:
• Citrix XenDesktop 5.6 delivered a resilient solution with Hyper-V and SCVMM 2012 at 5,000 hosted VDI desktops.
• SCVMM has the capability to support 2,000 desktops per host. When deploying a clustered SCVMM 2012 server, we found that supporting 1,000 VMs in our environment was the optimal configuration for minimal impact on deployment time.
• Citrix Personal vDisk (PvD) in a Hyper-V clustered environment provided the benefits of desktop personalization while avoiding the increased server utilization of dedicated desktops.
• Hyper-V failover clustering proved to be a robust infrastructure that was highly available when a node failed during testing.
• The Citrix modular block architecture was validated to provide linear scalability of the VM architecture. This allows an environment to scale to large numbers by duplicating simple modular blocks.
• HP blade servers were able to easily support a large-scale deployment of virtual desktops offering a balance of power, efficiency, and performance for the end customer.
Abbreviations and Naming Conventions
AG Access Gateway
AGEE Access Gateway Enterprise Edition
BR Branch Repeater
CIDR Classless Inter-Domain Routing
CSV Cluster Shared Volume
HDX High Definition Experience
GPO Group Policy Object
ISL Inter-Switch Link
KMS Key Management Server
NS NetScaler
NTP Network Time Protocol
PvD Personal vDisk
PVS Citrix Provisioning Services
SCVMM Microsoft System Center Virtual Machine Manager
UPM Citrix User Profile Manager
VDA Virtual Desktop Agent
VDI Virtual Desktop Infrastructure
vDisk Virtual Disk (Provisioning Services Streamed VM image)
VPX Virtual Appliance
Key Components
• Software
  • VDI Desktop Broker: Citrix XenDesktop 5.6
  • VDI Desktop Provisioning: Citrix PVS 6.1 w/ Hotfix 3
  • Endpoint Client: Citrix Receiver for Windows 3.1
  • User Profile Management: Citrix User Profile Manager 4.1
  • VDI Personalization: Citrix Personal vDisk 5.6.5
  • Workload Generator: Login VSI 3.6
  • Virtual Desktop OS: Microsoft Windows 7 SP1 x86
  • Hypervisor Management: Microsoft SCVMM 2012 Update Rollup 2
  • Database Server: Microsoft SQL 2008 R2
  • Server Operating System: Microsoft Windows Server 2008 R2 SP1
  • VDI Hypervisor: Microsoft Windows Server 2008 R2 SP1 with Hyper-V Role
• Hardware
  • Blade Servers
    • HP BL460c G6: Infrastructure
    • HP BL460c G7: Virtual Desktop Hypervisors
  • Network Appliances
    • WAN Optimization: Citrix Branch Repeater 8800, SDX, and VPX
    • Remote Access: Citrix NetScaler / Access Gateway Enterprise
  • Network Devices
    • Backbone Switch: Cisco Nexus 7010K 8x32/10G
    • Workgroup Switches: HP H3C 5820, 5801
    • Firewall/Routers: Juniper Router SRX 240
  • Storage Systems - NetApp FAS Series Storage
    • FAS3240: Infrastructure / User Profile Storage
    • FAS3270: Virtual Desktop Storage
    • FAS3270: PVS and SCVMM Infrastructure
Solutions Architecture
Designing a Solutions Architecture to achieve the required scale involved a significant amount of planning for the systems and components in the environment. The first step in creating the conceptual architecture was to determine the number of users in the environment and the required services. Each of the sections below contains a description of the major elements of the environment, as well as sizing and design considerations for those elements.
We also wanted to build a common block architecture that could be used to scale the solution more easily. The modular block is a concept that will be used throughout this document. A modular block is defined as a set of virtual desktops along with all of the components required to run that particular set of desktops. The high-level architecture shown in Figure 2 illustrates the components involved.
Common infrastructure is the infrastructure put into place to run the entire project, or even datacenter. It consists primarily of services that are probably already in place in an environment, including AD, DNS, NTP, DHCP and other services. In our environment, this included a cluster of hypervisors to provide those services and an instance of SCVMM to manage that cluster.
The shared XenDesktop Infrastructure components are those that are required for virtual desktops to execute in the environment, but that can be scaled and used for any number of virtual desktops deployed. For example, a Citrix License Server must exist in an environment, but this can easily be used for a deployment of any size. In this case, it included provisioning servers for the desktops as well.
The modular blocks, in this environment, consist of the hypervisor hosts for the virtual desktops along with the System Center Virtual Machine Manager (SCVMM) management servers for those desktops. Each block of 1,010 desktops can be simply added in to the common and shared infrastructure as needed.
The multi-site infrastructure also scales across multiple blocks, and will scale up as additional users are added and as the appliances allow.
This Reference Architecture includes the following components:
• Users and Locations
Figure 2: High-Level Infrastructure Architecture
Users and Locations
Enterprise Campus – Datacenter (3,775 users)
The datacenter location required virtual desktop services with centralized system and desktop image management. The components selected to deliver these services were Citrix XenDesktop virtual desktops streamed by Provisioning Services 6.1, virtualized on Hyper-V, managed by Microsoft SCVMM 2012 with Update Rollup 2 (UR2), and with shared storage hosted on NetApp FAS3200 series Storage. NetScaler AGEE and Branch Repeater SDX appliances were selected to provide remote access and acceleration services for all remote branch and telecommuting users.
Large Branch Office (525 Users)
Users in the large branch office location needed secure and accelerated remote access to the datacenter-based virtual desktops. While having virtual desktops at a branch office of this size is a possibility, one of the requirements was ease of management and redundancy of infrastructure. The easiest way to meet that requirement was to have all virtual desktops be maintained in the datacenter. Components selected to provide connection acceleration services for these remote desktops utilize Citrix Branch Repeater technology: BR 8800-series appliances at the branch location and BR-SDX appliances at the Datacenter.
Small Branch Office (150 users)
Remote Access Users (600 Users)
A Citrix NetScaler with Access Gateway appliance was chosen to provide secure remote-access services because of its simple integration with a Citrix XenDesktop VDI infrastructure. Remote-access users connect to a NetScaler/Access Gateway appliance using the Citrix Receiver application, the same client that all other users use to connect to the infrastructure.
Network Infrastructure
The next consideration for this environment is the network architecture and design. The network architecture included, but was not limited to, creation of IP address requirements, VLAN configurations, required IP services, and server network configurations. Considerations regarding IP allocation, IP block assignment, and WAN Routing were extremely important in ensuring that the design maintained its modularity while still being able to scale appropriately.
Network Design
When planning the network environment, one must determine how many nodes are needed at the beginning of the project and how many might be added throughout the lifetime of the project. Using this information, we can begin to plan the IP address blocks.
It is desirable to employ a modular approach to network VLAN design. Separating traffic by type simplifies VDI IP planning and alleviates bandwidth concerns. If possible, create a separate VLAN for each major type of traffic. For example: a Storage VLAN for storage traffic (that is, iSCSI, NFS, or CIFS), DMZs for certain external incoming traffic types, a server management VLAN (which may include Lights-Out capabilities and log-gathering mechanisms), and Guest VLANs for virtual desktops. This type of design approach keeps Layer-2 broadcasts to a minimum while not overutilizing the CPU and memory resources of network devices.
Design Considerations:
• To provision 1,000 desktops and accommodate growth in chunks of 200 desktops, aggregating multiple /24 networks (254 hosts each) is a more flexible approach than using larger /23 (510 hosts) IP blocks.
• To grow the VDI IP network environment in blocks of 400-500 users at a time, consider a larger /23 network. Allocate blocks of IP addresses according to what can be served logically from the Virtual Desktop administrator’s perspective (gradual growth and scalability) along with what can be provisioned within the company’s IT network-governance policies.
• To account for overhead and future headcount growth, as well as covering IP needs for services, allocate additional IP addresses as you grow. When the network design is planned with growth and a buffer in mind, blocks can be aggregated in chunks of /24s, /23s, and /22s (1,022 hosts) as needed. In addition, CIDR supernetting of the IP blocks can be utilized as required.
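The block-size arithmetic in these considerations can be checked with a short sketch. It uses Python's standard `ipaddress` module. The `10.10.x.0` address range is a hypothetical example, not the addressing used in this environment. Host counts exclude the network and broadcast addresses.

```python
import ipaddress

def usable_hosts(network: str) -> int:
    """Return the number of assignable host addresses in an IPv4 block."""
    net = ipaddress.ip_network(network)
    # Subtract the network address and the broadcast address.
    return net.num_addresses - 2

# Usable hosts for the block sizes discussed above.
for prefix in (24, 23, 22, 21):
    print(f"/{prefix}: {usable_hosts(f'10.10.0.0/{prefix}')} hosts")

# Four contiguous /24 blocks (hypothetical range) can be supernetted
# into a single /22 as the environment grows.
blocks = [ipaddress.ip_network(f"10.10.{i}.0/24") for i in range(4)]
supernet = list(ipaddress.collapse_addresses(blocks))
print(supernet)
```

Allocating contiguous /24 blocks from the start is what makes later aggregation into a /23 or /22 possible without renumbering.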
Datacenter and Remote Office LAN Network Architecture Overview
switch also provided all routing, switching, and security services for the rest of the environment. H3C/HP 5820 10GbE switches served other 10GbE service ports. Also, 1GbE ports were served by H3C/HP 5801 1GbE switches with 10GbE uplinks.
For Branch Office sites, workgroup switching and routing were required. The 1GbE ports required were provided by H3C/HP 5801 1GbE switches, which incorporated 10GbE uplinks to the core infrastructure.
WAN Network Architecture Overview
The planning for the multisite WAN test environment included planning for VLAN and other network-based services; supporting both virtualized and physical systems; providing for WAN, DMZ, and firewall connectivity. Planning also included placing network service appliances, such as Branch Repeater and NetScaler systems, in correct, logical network locations.
The solutions environment WAN routing at the datacenter was provided by a single Cisco core switch, as mentioned above. The environment also required appliance-to-appliance ICA optimization for access to the XenDesktop virtual desktops. To meet this requirement, we deployed BR-SDX appliances at the Datacenter and BR appliances (8800-series and virtual appliances) at each of the Branch Office locations. Branch Site Layer-3 routing and edge services were provided by Juniper SRX240 full service Router/Firewall devices.
A Branch Repeater 8800-series appliance was selected for the large branch office (525 users), and a Branch Repeater virtual appliance (VPX) was selected for the 150 users at Branch Office 2. In the Datacenter, a Branch Repeater SDX appliance Model 13505 was used to allow for a single connection point for remote Branch Repeater connections at the Datacenter.
WAN simulation and load generation, including WAN-byte traversal visibility, was provided by Apposite Linktropy 1GbE based WAN simulator appliances inserted between the remote sites and the Datacenter site. No reduction in bandwidth was introduced in the test environment.
For remote access users, ICA Proxy and VPN services were required. To meet this requirement, a NetScaler appliance with an Access Gateway Enterprise Edition license was deployed in the datacenter. LACP (802.3ad) was used for all ISLs between all devices.
Network Design Considerations:
• Each network appliance is limited by the number of connections it supports; most network appliances list the maximum number of TCP connections that they support. In the Citrix VDI environment, the ICA connection capacity of the Remote Access and WAN Acceleration devices needs to be considered. It is necessary to match this capacity with the site user requirements, while including a buffer for future site growth.
• To optimize storage communications in the environment, we recommend using a dedicated VLAN for server-to-storage connections.
Synthetic NICs on 2020 virtual desktop VMs per farm; as a result, two /21 VLANs were created across two modular VDI blocks.
• A storage VLAN was created for our environment and was sized large enough to provide IP addresses for all of our VDI hypervisor blades. It was configured so that each blade used two storage NICs bound via MPIO.
• Consider separating heavy network traffic in a dedicated VLAN so that it does not interfere with other traffic. In our environment, the virtual desktop PXE Boot traffic was separated based on the PVS servers servicing each modular VDI block.
• Uplinks between all network switches at Layer 2 should employ 802.3ad LACP Ethernet aggregation for increased throughput and resiliency.
Storage Infrastructure
Shared Storage is one of the most critical components in a large scale VDI environment. Scale, end user satisfaction and overall performance greatly depend on the storage systems deployed and their capabilities. Hardware and software features employed in the design of the storage layer architecture also impact these areas. As shown below, the Storage Layer touches and shares I/O with every other common block in the VDI server architecture.
Figure 4: Storage Infrastructure
Storage Planning
Storage planning consists of two major sections: capacity planning and performance planning. If you have chosen a network attached storage (NAS) implementation, you must also account for additional network impact.
• Storage capacity planning is the projection of disk space assignment and allocation on the storage appliances as well as the projection of the required space based on known requirements. Single Server Scalability tests with application workloads specific to your company’s operation can help you start your sizing diligence. This is a very common first step in any VDI storage implementation, as everyone’s storage needs are unique.
• Storage performance planning takes into account your physical disk assignment and allocation while balancing the required disk IOPS for acceptable end user response. Storage processor CPU utilization (while running at full load) within the storage device is also an important data point. Every individual environment has its own requirements and bill of materials. Unfortunately, there is no concept of “one size fits all” in storage calculations.
Sizing Considerations:
• When planning your NetApp RAID groups, the size and continuity of the RAID group member sizes change as you add more shelves of disk. There are two suggestions to consider:
o Never exceed 24 disk members per RAID group. This is per guidance from NetApp support.
o Whenever possible, keep all of your RAID groups the same size so that the data stripe length is consistent across your aggregates.
• Consider the disk size, media type, spindle motor speed and disk cache of the disks employed in the array when calculating the projected storage implementation.
• Free space percentage of the provisioned storage units should be included in the capacity consumption. Free space is needed to allow sufficient seek time. In addition, some features of your storage subsystem, such as snapshots and deduplication, may require planning for additional space.
• Consider the amount of resource the Storage Processors utilize regarding the memory and CPU capacity to serve the disk shelves IO load.
Design Considerations:
• Storage performance is greatly affected by the number of aggregate spindles in the storage array. With NetApp storage, a critical detail is the sizing of the RAID groups in proportion to the number of disks available for use. For NetApp, when employing single-parity RAID 4, there is a limitation of 16 disks per RAID group. When employing 64-bit RAID-DP (double-parity RAID 6), the disk design is limited to 24 disks per RAID group before performance limitations are imposed on the design. All RAID groups should be as close to the same size as possible. In addition, when calculating storage availability, remember that there is a RAID penalty: usable capacity goes down because spindles are used for RAID parity.
• When using iSCSI, the overall Ethernet overhead of the protocol must be taken into consideration. The load of Ethernet encapsulation and decapsulation can be extremely high when aggregated at this scale. We recommend the use of a dedicated TCP offload engine (TOE) card to maximize performance.
• Implement the best practices of the storage vendor chosen in regard to LUN types, PIT copy practices, drivers, firmware, operating system releases, and other details that affect the host-to-storage relationship.
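As a rough illustration of the RAID group guidance above, the following sketch splits a pool of disks into the fewest evenly sized RAID groups that respect the per-group limits quoted in the text, and counts the disks left for data after parity. The function name and the example disk counts are hypothetical. Hot spares are ignored. The parity counts assume one parity disk per RAID 4 group and two per RAID-DP group.

```python
import math

# Per-group disk limits quoted in the text above.
MAX_GROUP_SIZE = {"raid4": 16, "raid_dp": 24}
# Parity disks consumed by each RAID group (assumption stated in the lead-in).
PARITY_DISKS = {"raid4": 1, "raid_dp": 2}

def plan_raid_groups(total_disks: int, raid_type: str = "raid_dp"):
    """Return (group sizes, data disk count) for evenly sized RAID groups."""
    max_size = MAX_GROUP_SIZE[raid_type]
    # Fewest groups that keep every group at or below the limit.
    groups = math.ceil(total_disks / max_size)
    # Spread disks so that group sizes differ by at most one.
    base, extra = divmod(total_disks, groups)
    sizes = [base + 1] * extra + [base] * (groups - extra)
    data_disks = total_disks - groups * PARITY_DISKS[raid_type]
    return sizes, data_disks

print(plan_raid_groups(60, "raid_dp"))
print(plan_raid_groups(50, "raid_dp"))
```

Spreading the remainder across groups keeps every group within one disk of the others, which is the "as close to the same size as possible" goal stated above.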
Storage Deployment
It was found that a single NetApp FAS3270 storage system could reliably host storage for at least 2,020 XenDesktop 5.6 virtual desktop VMs virtualized on Hyper-V using the iSCSI storage protocol.
Infrastructure Storage
NetApp FAS3240 storage systems were utilized for the common infrastructure storage in this environment. These storage systems hosted iSCSI LUNs for the common infrastructure Hyper-V Cluster, SQL database storage, and User Profile storage. They also hosted NFS volumes for test client storage.
Sizing Considerations:
• The LUN for the Infrastructure Server VMs should be large enough to host several large fixed-VHD VMs. The need for fixed VHDs is brought about by the combination of Hyper-V 2008 R2 SP1 and NetApp storage: all VHDs should be fixed, and dynamic VHDs should not be used with iSCSI storage on a Windows 2008 R2 SP1 server. The Datacenter Infrastructure VM LUN was assigned 1TB to support 20 virtualized infrastructure server VMs at 40GB each with 35% free space, based on Microsoft and NetApp best practices.
• The SCVMM SQL Server LUN was assigned 203 GB to host five SCVMM 2012 databases (each database was allocated 30GB) with 35% free space.
• The PVS SQL Server LUN was assigned 160GB to store three PVS farm databases, and also included 35% free space.
• User Profile LUNs were assigned 100GB each and were shared via iSCSI. In addition, each LUN was assigned to a specific Modular Block. This size is based on small user profiles (less than 50MB per user) present in our environment. Users accessed these LUNs as Windows Shares off of an infrastructure Windows File Server and not directly on the storage via iSCSI.
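The SCVMM SQL Server LUN figure above can be reproduced with a small calculation. This is a minimal sketch that assumes "35% free space" means 35% added on top of the allocated data. That reading is inferred from the 203GB figure and is not stated explicitly in the text. The function name is hypothetical.

```python
import math

def lun_size_gb(data_gb: float, headroom: float = 0.35) -> int:
    """Return a LUN size in whole GB after adding free-space headroom.

    Assumes headroom is a fraction added on top of the data size.
    """
    return math.ceil(data_gb * (1 + headroom))

# Five SCVMM 2012 databases at 30GB each, plus 35% headroom.
scvmm_sql_lun = lun_size_gb(5 * 30)
print(scvmm_sql_lun)  # 203
```

The same function can be rerun with a different `headroom` value if your storage vendor recommends a different free-space target.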
Design Considerations:
• Windows Server 2008 R2 SP1-based Hyper-V VMs have .BIN files equal to the amount of RAM assigned to the VM. The .BIN files produce minimal IOPS activity, and should be assigned to thin-provisioned and/or lower-cost SAN storage if available.
VDI Storage
NetApp FAS3270s were utilized for the VDI storage.
Sizing Information for this environment:
VDA Storage
VDA LUN space calculation:
VDA RAM: 1GB = Hyper-V .BIN file size (when possible, put this on thin-provisioned storage)
VDA Write Cache: 4GB (this must be large enough to contain the page file as well as difference data)
Total VM space: 5GB each
Total space required: 505 VMs * 5GB = 2525GB
LUN Size*: 2525GB + 28% = 3234GB
*For the purpose of consistency among all of the storage we were using, the VDA LUN size was chosen to be 3234GB, which contains approximately 28% free space. In addition, 1GB of RAM per VDA was selected based on the amount utilized and for maximum scalability. VDA RAM size needs to be determined for your specific user environment.
PVS File Server Storage
• Each of the three PVS File Servers was assigned a 550GB iSCSI LUN, which was shared as a Windows file share and mounted by the PVS server farm from a UNC location. Each PVS iSCSI LUN was assigned to a PVS File Server dedicated to a 2,020 virtual desktop PVS Farm.
PVS LUN Size calculation:
Host virtual desktop vDisks: 2 x 40GB = 80GB
Backup location for virtual desktop write cache during failure of shared storage (up to 160MB per virtual desktop): 2020*160/1000 = 323GB
Total Size: 80GB + 323GB, with 35% free space = 544GB (rounded up to 550GB)
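The PVS LUN calculation can be written out as a short sketch so that it can be rerun for a different farm size. The function and parameter names are hypothetical. Treating the 35% free space as a fraction added on top of the data, and rounding up to the next 50GB, are assumptions chosen because they reproduce the 544GB and 550GB figures above.

```python
import math

def pvs_lun_size_gb(desktops: int,
                    vdisk_count: int = 2,
                    vdisk_gb: float = 40,
                    cache_mb_per_desktop: float = 160,
                    headroom: float = 0.35,
                    round_to_gb: int = 50):
    """Return (raw size in GB, provisioned size in GB) for a PVS file server LUN."""
    vdisk_total_gb = vdisk_count * vdisk_gb
    # Write-cache backup space, converted from MB to GB (1000 MB per GB, as in the text).
    cache_backup_gb = desktops * cache_mb_per_desktop / 1000
    raw_gb = (vdisk_total_gb + cache_backup_gb) * (1 + headroom)
    # Round up to the next multiple of round_to_gb (assumed rounding step).
    provisioned_gb = math.ceil(raw_gb / round_to_gb) * round_to_gb
    return raw_gb, provisioned_gb

raw, provisioned = pvs_lun_size_gb(2020)
print(f"raw: {raw:.1f}GB, provisioned: {provisioned}GB")
```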
SCVMM Library Storage
• The SCVMM Library iSCSI LUN was assigned 675GB to provide centralized storage for the virtual desktop template and to hold backups of critical environment VMs. The LUN size was chosen to provide enough storage for at least ten 40GB VMs with free space.
Cluster Quorum LUN
• A 2GB Cluster Quorum LUN was assigned to each Failover Cluster to serve as Quorum Witness storage LUN.
Design Considerations
• Before creating the storage architecture for a large scale environment, you need to collect storage utilization data such as Disk Space and IOPS utilization for each of the environment component types mentioned above, as well as for the Processor, network interfaces, Disk Space, and Disk Performance of the storage device and its components.
• It is recommended to use network interface cards (NIC) that provide maximum performance and caching capabilities. In this test environment, we deployed Intel 10GbE cards (NetApp X1117A-r6) and this resulted in higher network throughput with lower system CPU load due to higher buffer memory.
• NetApp LUN-type and format-allocation units are important for best performance of the cluster’s shared storage. In this test environment, Windows_2008 LUN-type and 64KB allocation-unit size were found to be the best model.
Common Infrastructure
The next step of creating the Solutions Architecture was the planning and preparation of the Common Infrastructure.
Figure 5: Common Infrastructure
The common infrastructure was made up of the systems that provided core services to the entire environment. These were a mixture of physical and virtualized systems.
Infrastructure Deployment Methodology
The infrastructure Operating System, Features, Roles, Software, IP information, and other configuration settings were centrally managed and deployed with HP Insight Control server. HP Insight Control allows for streamlined and consistent deployment to a large number of servers. The Common Infrastructure functions were hosted on HP BL460c G6 blade servers, which were managed by Insight Control.
Physical Common Infrastructure
Resiliency and performance requirements of many infrastructure services mandated that the servers be physical rather than virtual. Additionally, per Microsoft best practices, two physical Domain Controllers were required to maintain the functionality of the Failover Cluster that supported the virtual Domain Controllers. All physical infrastructure servers ran on HP BL460c G6 servers.
Microsoft publishes best practices for running SCVMM both on physical servers and in a virtualized environment, including the option of running VMM as a highly available VM instead of relying on physical clusters. Because a number of existing documents and tests already explore virtualized SCVMM servers, we decided to use the physical server option. Both designs are valid and supported by Microsoft.
The following software/services ran on physical machines for the Common Infrastructure:
• Active Directory / DNS / DHCP / NTP
• Microsoft Windows 2008 R2 SP1 Enterprise Edition with Hyper-V Role (This is the same code base as Microsoft Hyper-V Server 2008 R2 SP1, but includes the GUI management capabilities. We fully expect our test results to match results from that operating system choice as well.)
• Microsoft SCVMM 2012 Update Rollup 2
• Microsoft SQL 2008 R2
Virtualized Common Infrastructure
In order to make the best use of system resources and to take advantage of virtualization, some common infrastructure components were virtualized. Active Directory virtualized systems added resiliency to the already existing physical services and spread the active directory load during the Boot Storm and logon processes. All virtual machines in the infrastructure ran on Windows Server 2008 R2 SP1 with the Hyper-V Role hosted on HP BL460c G6 server blades.
The following software/services were virtualized to support the Common Infrastructure:
• Active Directory / DNS
• XenDesktop Controllers (Desktop Delivery Controllers, or DDCs)
• Citrix License Server
Design Considerations:
• Ensure that virtualized systems participate in the NTP process.
• Disable the Hyper-V host time synchronization service for virtual guests. This was applied on all virtualized Active Directory Domain Controllers and XenDesktop Controllers (using host time sync was a duplication of effort, since the hosts were already participating in the NTP time sync process).
Common Infrastructure Services
Overview
• Active Directory
o Two Active Directory DCs in the Datacenter
Two Physical: these servers provided DC, DHCP, DNS and NTP services
• Hyper-V 2008 R2
o One Hyper-V Cluster with 6 Nodes
• SCVMM 2012
• SQL 2008 R2
• XenDesktop
o Two XenDesktop Controllers
• Citrix Licensing
o One Citrix License Server
DNS
DNS services are critical for both Active Directory services and XenDesktop communications. DNS services in both the Datacenter and the Branch Offices were used to fulfill name resolution requests and support local Active Directory requests. Active Directory Integrated Zones were created for both forward- and reverse-lookup zones. A reverse-lookup zone was created for each VLAN/Subnet. This was required to allow 2-way communications between XenDesktop and the VDAs.
Design Considerations:
• For Windows and Microsoft Office activation, if a Key Management System (KMS) is employed, DNS entries for the KMS service need to be added.
DHCP
One DHCP server resided in the Datacenter and one in each of the Branch Offices. These DHCP servers were used to provide IP addresses and specific configuration options used by PVS to allow the virtual desktops to locate and boot from the PVS server.
Additional specific Scope Options used:
• Option 066 Boot Server Host Name: Set to IP address of the PVS TFTP server
• Option 067 Bootfile Name: Set to ardbp32.bin – PVS boot file name
Design Considerations:
Hyper-V and SCVMM Infrastructure
Virtualization in the Common Infrastructure was performed via Microsoft Hyper-V 2008 R2 SP1 and managed by SCVMM 2012.
SCVMM Servers
The infrastructure virtual machines were hosted on a Hyper-V Cluster that was managed by SCVMM. SCVMM was deployed as an active-passive clustered application on a two-node Failover Cluster. SCVMM clustering was deployed to provide resilient management of the critical hypervisor cluster for the common infrastructure. The SCVMM servers for the Common Infrastructure were the same HP BL460c G6 blades as the SCVMM servers for the VDI.
Design Considerations:
• SCVMM servers can be virtualized, and Microsoft has guidelines about deploying SCVMM in a virtual fashion; these can be found on Microsoft’s TechNet website.
• Two SCVMM servers clustered in an active-passive Failover Cluster create a resilient Hyper-V management system, allowing for a single host failure.
• Our Common Infrastructure SCVMM Cluster used a dedicated database that was hosted on the VDI SCVMM database SQL 2008 R2 Cluster.
Hyper-V Cluster
A Hyper-V Failover Cluster was deployed for the purpose of virtualizing the common and XenDesktop infrastructure systems. All virtual machines that provided common infrastructure services ran on this cluster of Hyper-V 2008 R2 SP1 servers in High Availability mode. This cluster consisted of HP BL460c G6 servers with NetApp-presented iSCSI LUNs for Cluster Shared Storage.
Sizing Considerations:
• The number of servers selected for the infrastructure Hyper-V Cluster should be chosen based on the infrastructure VM count and number of CPUs assigned to them and to allow for host failures.
Design Considerations:
XenDesktop Infrastructure
Figure 6: Shared XenDesktop Infrastructure
XenDesktop Controllers
Each XenDesktop controller was virtualized on Windows Hyper-V 2008 R2 SP1 and configured with four vCPUs. This configuration allowed for sufficient performance to manage the assigned virtual desktop VMs, as well as log-on requests for the users during the test. Virtualized XenDesktop controllers were hosted on the common infrastructure Hyper-V Cluster.
XenDesktop Site
The XenDesktop site was configured to contain the following components:
• One connection per SCVMM Cluster
• One host record per Hyper-V Cluster, two per Connection
• Two Machine Catalogs per Hyper-V Cluster
o Catalog for regular virtual desktop VMs
o Catalog for remote access virtual desktop VMs
• One Primary Desktop Group per Modular Block, with additional desktop group for remote access users. This configuration also facilitates sizing tests.
This configuration facilitated management and troubleshooting in a large, multi-connection and multi-cluster XenDesktop VDI environment.
This ensured that optimal performance was achieved. This host configuration allows for the starting of 80 virtual desktop VMs every minute, which is five per Hypervisor host, for connection with two 8-Node Clusters.
Host Configuration Advanced Settings:
Max active actions: 100
Max new actions per minute: 80
Max power actions as percentage of desktops: 8%
Max Personal vDisk power actions as percentage: 25%
For the full environment, these settings allowed the VMs to start at the rates listed below. Overall, this allows starting all 5050 virtual desktop VMs in 12 to 13 minutes.
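The boot-time estimate above can be reproduced with simple arithmetic. The following is a minimal sketch, not a sizing tool; it assumes five host connections (one per Modular Block SCVMM cluster, as described elsewhere in this document) and that every connection sustains the configured "Max new actions per minute" for the whole boot storm. The function name is hypothetical.

```python
def boot_storm_minutes(total_vms: int, connections: int, new_actions_per_minute: int) -> float:
    """Estimate the minutes needed to issue a power-on action for every VM."""
    starts_per_minute = connections * new_actions_per_minute
    return total_vms / starts_per_minute


# Values taken from the text: 80 new actions per minute per host connection,
# each connection covering two 8-node clusters (16 hosts), 5,050 desktops total.
# ASSUMPTION: five host connections, one per Modular Block.
HOSTS_PER_CONNECTION = 2 * 8
NEW_ACTIONS_PER_MINUTE = 80
CONNECTIONS = 5
TOTAL_VMS = 5050

starts_per_host = NEW_ACTIONS_PER_MINUTE / HOSTS_PER_CONNECTION
minutes = boot_storm_minutes(TOTAL_VMS, CONNECTIONS, NEW_ACTIONS_PER_MINUTE)

print(f"VM starts per host per minute: {starts_per_host:.0f}")
print(f"Estimated minutes to start all desktops: {minutes:.1f}")
```

Under these assumptions the estimate falls inside the 12 to 13 minute window quoted above; real boot storms also depend on PVS, storage, and registration times, which this sketch ignores.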
Sizing Considerations:
• XenDesktop components should be virtualized on Hyper-V in an N+1 configuration for resiliency and hosted on a resilient Hyper-V Cluster
• Our performance results indicated that two XenDesktop Controllers could support at least the number of desktops in this environment.
• The CPU and RAM parameters should be determined during environment sizing tests based on observed utilization of the XenDesktop controllers.
Design Considerations:
• The host advanced settings should be adjusted based on the number of hosts, number of virtual desktops on the host connection, and on the desired boot time for the environment.
• To ensure that the PVS farm can support the required boot speed, it is important to perform a Boot Storm validation of the host advanced settings in a PVS environment. This will validate whether the hypervisor and storage subsystems can sustain the increased load during the boot storm.
Additional information can be found in the Appendix under “XenDesktop Site Details”.
SQL for XenDesktop
The SQL environment that supported this XenDesktop environment was configured in a database synchronous mirroring SQL configuration with Principal, Mirror, and Witness servers. This model allowed for a high performance resilient configuration for the XenDesktop site database. Please note that SQL clustering can also be used to create a resilient configuration, or a virtual SQL server can be configured as an HA VM.
Deployment Note:
• The configuration of the SQL environment was done with the following steps:
1. Create the site database on the Principal server.
2. Back up the database.
3. Restore the database to the Mirror server set with the “No Recovery Mode” option. Failure to set this option will result in error messages regarding lack of access when attempting to start the mirror.
Citrix License Server
A single virtualized Citrix License Server was used to provide all of the needed licensing to the Citrix components, such as EServer, XenDesktop, and PVS. One license file, XenDesktop Platinum, was utilized for the entire environment.
Sizing Considerations:
• A single virtual machine running Citrix License Server was able to support the entire environment with no issues. The VM was assigned 1 vCPU and 2GB RAM.
Design Considerations:
• The Citrix License Server should be deployed as a highly available VM to allow for a Hypervisor host failure.
• It is recommended to back up the Citrix Licensing Server VM for easy recovery in the event of failure.
Virtual Desktop VM
Figure 7: VDI Environment
XenDesktop virtual desktops in this VDI environment were deployed with PVS as provisioned (streamed) VMs on Windows Server 2008 R2 SP1 with Hyper-V Role hypervisors. All virtual desktop VMs were configured to be highly available in Hyper-V Failover Clusters and were configured with Fixed VHD drives to improve storage performance.
• Virtual Desktop VM
o 1 vCPU, 1GB RAM, 2GB Page File, 4GB Write Cache Fixed VHD
o Windows 7 x86
o Citrix XenDesktop Virtual Desktop Agent 5.6
o User Profile Manager 4.1
Figure 8: VDA Network Configuration
Virtual desktop VMs were configured with two network interfaces:
• Legacy NIC for PXE boot process. This is required for Virtual Desktop VMs to boot from PVS and is specific to Hyper-V deployments.
• Synthetic NIC for optimized network communications in Windows (optional)
During the virtual machine creation, virtual desktop VM network interfaces were configured with static MAC addresses. The environment utilized the non-PVD virtual desktop model; its storage configuration is shown in the diagram below.
Modular VDI Infrastructure
Figure 10: Modular VDI Block #1
With the common infrastructure in place, the next step was to design, create, and configure the infrastructure that would support the first Modular Block. Once the first Modular Block was established and tested, additional Modular Blocks were deployed. This section outlines the modular VDI Infrastructure, Modular Block sizing, and configuration of the Hyper-V environment, SCVMM, PVS, and SQL.
Infrastructure Deployment Methodology
Modular Block Overview
Modular VDI Block hardware contains virtual desktops, Hyper-V servers, SCVMM servers and PVS servers.
Figure 11: Modular VDI Block Infrastructure
Modular Block Sizing
When scaling up a Virtual Desktop Infrastructure environment from a single server to a modular block, it is important to understand your equipment along with its capacity and performance capabilities. This is important not only as a single component, but also as an orchestrated system working together. Finding the optimal design involves multiple cycles of testing, troubleshooting, and resolution of issues that will eventually lead to a design where every component within the system has the ability to handle the load.
The first step in this process was to determine a single server’s VM density. Based on this, the next step was to calculate the single-cluster VM density numbers. A single cluster consisted of 8 servers. We then progressed to a single-chassis VM density evaluation (in our case, with HP blades, a chassis held 16 blade servers) that consisted of two clusters; this was defined as a “Modular Block” of virtual desktops. For this project, each of the blocks was built to support 1,010 users. It is important to note that moving up from a single server to a higher scale was an iterative process that required testing the environment; finding, isolating, and resolving the bottlenecks; and scaling the environment up until the goal was achieved. This resulted in the final Modular Block design and architecture.
To ensure failover in the virtual desktop environment, all Hyper-V Clusters were configured with high availability. For this reason, all hosts were run at approximately 70% of maximum capacity, so that the VMs running on a failed host could be redistributed to the other hosts within the cluster. These sizing numbers would have been greater had the goal of this test been to run at full server capacity. Based on the server models used in this project, the workload that was leveraged, and the configuration of the systems, it was found that a single HP BL460c G7 server could accommodate 90 virtual desktops at maximum load during single server scalability testing.
Modular Deployment Design
With the number of virtual desktops defined at 1,010 per modular block, the infrastructure was designed to support that size deployment. Each Block would have XenDesktop, PVS, SCVMM, Hyper-V and NetApp Shared Storage. Each of these components needed to meet the performance and capacity requirements.
The environment was configured to include a dedicated clustered SCVMM in each Modular Block instead of a shared SCVMM for two Blocks. The PVS farm scaled as expected. It was found that with a single SCVMM servicing a single Modular Block of 2 Hyper-V clusters and 1,010 virtual machines, performance was at the expected level. As noted earlier, non-clustered SCVMM servers have been shown to support up to 2,000 virtual desktops.
With the architecture implemented and tested, the remaining three Blocks were deployed and prepared for a full five Block environment of 5,050 virtual desktops.
Design Considerations:
• All components of a modular block should be performance validated at the scale of the single block before merging with other blocks.
Modular Block Infrastructure
Hyper-V Virtualization for VDI
Hyper-V Host
This environment deployed HP BL460c G7 servers with Windows 2008 R2 SP1 and Failover and Hyper-V Roles for virtualizing Citrix XenDesktop virtual desktops.
• The Hyper-V network was configured on the hosts prior to the creation of the Cluster and the addition of SCVMM.
o VDA Virtual Network: configured as External on a matching NIC
o PXE Virtual Network: configured as External on a matching NIC
• The Network architecture for the Hyper-V Clusters was configured, segregating traffic for specific functions including Management, Storage, VDA, and PXE.
o Management: Cluster and RDP management. A separate VLAN was used for management operations.
o Storage: Two NICs allowed for iSCSI MPIO access to NetApp. A separate VLAN was used for storage traffic.
o VDA: Dedicated VLAN for VDA synthetic NICs. A Hyper-V virtual NIC was created with no IP address due to its use for host management.
o PXE: Dedicated VLAN for PVS streaming and PXE operations for the virtual desktops. A Hyper-V virtual NIC was created with no IP address due to its use for host management.
The hypervisor network configuration is shown in the diagram below:
Hyper-V Cluster
VDI Hyper-V Clusters were configured leveraging eight HP BL 460c G7 servers per Cluster. The 8-Node Cluster was chosen as the cluster size, allowing for two Clusters per HP C7000 chassis. The Cluster connects to two NetApp iSCSI LUNs, one as a Cluster Shared Volume for VDA storage and one as Cluster Quorum Witness.
Figure 14: Hyper-V Cluster
Calculations for sizing VDI Hyper-V Cluster in our environment:
1. A single Hyper-V Server maximum load after testing with LoginVSI, and the workload as we configured it, was found to be ~90 VDA VMs.
2. A single Server at 80% capacity for a production environment was found to be 72 VMs, which equals 80% of 90.
3. A single cluster has eight hosts.
4. To allow for a single host failure within a cluster and maintain the 80% capacity per server, we reduced the server count by 1 (n-1, where n is the number of hosts in a cluster). This resulted in a desired single Cluster VM density of 72 * (8-1) = 504.
5. Based on these measurements, a single cluster will start with 505 VMs, which results in 63 VMs per host when all hosts are up and running. In the event of a host failure, each host will end up with 72 VMs, representing the maximum 80% target load on each server.
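The sizing steps above can be sketched as a short calculation. This is an illustrative sketch only; the function and variable names are hypothetical, and the inputs (90 VMs per host at maximum load, an 80% target, 8 hosts, one tolerated failure, 505 VMs deployed) are the figures stated in the list.

```python
def cluster_vm_density(max_vms_per_host: int, target_load_pct: int,
                       hosts: int, tolerated_failures: int = 1) -> int:
    """VMs a cluster can carry while keeping each surviving host at the target load."""
    vms_per_host_at_target = max_vms_per_host * target_load_pct // 100
    return vms_per_host_at_target * (hosts - tolerated_failures)


MAX_VMS_PER_HOST = 90   # single-server maximum found with the tested workload
TARGET_LOAD_PCT = 80    # production ceiling per host
HOSTS = 8               # nodes per cluster
DEPLOYED_VMS = 505      # VM count actually chosen per cluster

density = cluster_vm_density(MAX_VMS_PER_HOST, TARGET_LOAD_PCT, HOSTS)
normal_per_host = DEPLOYED_VMS // HOSTS         # all eight hosts running
failed_per_host = DEPLOYED_VMS // (HOSTS - 1)   # one host down

print(f"Desired cluster density: {density} VMs")
print(f"VMs per host, all hosts up: {normal_per_host}")
print(f"VMs per host, one host failed: {failed_per_host}")
```

Integer division is used because the text quotes whole VMs per host; with 505 VMs a few hosts carry one VM more than the quoted figure.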
Sizing Considerations:
• The VDA Hyper-V Cluster VM count was chosen as 505 VMs to allow for a single hypervisor host failure and maximum of 80% load per host.
Design Considerations:
• Hyper-V Cluster VM density sizing was determined by a process that started with single server and then full Cluster sizing. These estimates are highly dependent on the hardware specifications of the hypervisor hosts.
• The Cluster should be validated with exactly the same network and storage configurations that will be deployed in scaling the VDI environment.
• The first Node in each Cluster was configured with File services and a Windows Share created for C:\ClusterStorage. This is the share for Cluster Shared Volumes (CSVs), and is required for Citrix XenDesktop 5.6.
SCVMM for VDI
SCVMM in the XenDesktop and Hyper-V VDI processes provides virtualization management functionality for the Hyper-V clusters and is the single point of connection between the XenDesktop controllers and the virtualization systems. It is also used during VM deployment as the connection point for the XenDesktop Setup Wizard running on the PVS server.
Figure 15: Hyper-V and SCVMM and XenDesktop Topology
SCVMM Host
SCVMM Cluster Host network configuration is shown in the following figure:
Figure 16: SCVMM Host Network Configuration
Design Considerations:
• The SCVMM Cluster Quorum File Share should have a dedicated Windows File Share per cluster on a file server in the environment.
• The SCVMM host hardware configuration needs to be based on performance during both deployment and large-scale power operations on both virtual desktops and virtualization hosts, as they can be different numbers.
SCVMM Cluster
Figure 17: SCVMM for VDI in Modular Block
Each SCVMM cluster configuration contained the following items:
• A MAC Address Pool per Hyper-V cluster
Design Considerations:
• In order for the 2-Node Clusters to function properly, the “Domain Computers” Active Directory group should be added to the security settings of the file share used for the witness.
• To ensure there are no MAC address conflicts between the Clusters, a Host Group with specific MAC address assignments should be used.
• Consider creating special RunAs accounts to be used by PVS and XenDesktop to be able to easily identify SCVMM job originators in the job log.
SCVMM Library File Server
All VDI SCVMM servers were connected to a dedicated SCVMM Library file server. Following the Microsoft best practices for highly available SCVMM configuration, the SCVMM library was configured as a 2-Node Failover Cluster for Windows File Server application, and the storage was set up as a Cluster-Shared NetApp iSCSI LUN.
The hosts used were the HP BL460c G6 blades with Windows 2008 R2 SP1 configured with Failover and File Services Roles.
The Virtual Desktop template used for deployment by the PVS XenDesktop Setup Wizard resided in the SCVMM Library. Additional free space for critical VM backups was also allocated to the SCVMM Library LUN.
The following shows the configuration of the Virtual Desktop VM Template used for the deployment:
RAM 1GB of Static vRAM
CPU 1 vCPU
Legacy NIC Used for PXE traffic
Synthetic NIC Used for VDA communication
Hard Disk Assigned a Static 4 GB VHD
Boot Mode PXE
Availability Highly available
SCVMM SQL
A Cluster utilizing shared storage with a Quorum configuration of Node and Disk Majority supported the Microsoft SQL database.
Design Considerations:
• The SQL Database should be designed to allow for host redundancy.
Provisioning Services (PVS) for VDI
The decision to implement either physical servers or virtual servers for these components should be based on your specific requirements.
PVS Server Networking
Each PVS server was deployed with 2 NICs, one for management and another for streaming:
• A 1Gbps NIC assigned to the Management VLAN was used for server to server communication
• A 9Gbps NIC assigned to a VDA PXE VLAN was used for streaming network traffic. The TFTP service was configured to run on this NIC on all PVS servers
Figure 18: PVS Host Network Configuration
Each PVS server was configured for optimum performance for the scale deployed. To support 2K desktops per farm, the following calculations and PVS server network and advanced settings were implemented:
VMs = Ports * Threads
Ports Configured 6910-6960 (50 ports)
Threads per Port = 16
Total Threads = 50*16 = 800
Other PVS Server settings configured for improved performance:
Buffers per Thread = 32 Buffers
Device Booting = 1000 devices
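The thread calculation above can be expressed as a small sketch. The helper name is hypothetical; the port count (50) and threads per port (16) are the values stated in the text, and the three-servers-per-farm figure is taken from the PVS Farm description in this document. Treating one thread as one concurrently served VM follows the "VMs = Ports * Threads" rule of thumb quoted above and is an approximation, not a hard limit.

```python
def pvs_stream_threads(ports: int, threads_per_port: int) -> int:
    """Total streaming threads on one PVS server (VMs = Ports * Threads)."""
    return ports * threads_per_port


PORTS = 50             # stated count for the configured 6910-6960 range
THREADS_PER_PORT = 16
SERVERS_PER_FARM = 3   # from the PVS Farm description
FARM_TARGET_VMS = 2000 # "2K desktops per farm"

per_server = pvs_stream_threads(PORTS, THREADS_PER_PORT)
farm_total = per_server * SERVERS_PER_FARM

print(f"Threads per PVS server: {per_server}")
print(f"Threads across the farm: {farm_total}")
print(f"Farm covers the 2K target: {farm_total >= FARM_TARGET_VMS}")
```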
PVS Farm
Each PVS farm was comprised of three servers supporting two Modular Blocks (2020 virtual desktop VMs) per farm or 700-800 streams per PVS server. Each PVS Farm was configured with a shared vDisk store on a dedicated PVS File Server for the vDisk storage. The below diagram shows the PVS farm structure.
Figure 19: PVS for VDI Farm Structure
The configuration for the PVS farm for each Modular Block, as shown on the diagram:
• Site: <Environment Name>
o Servers (each server was configured identically)
Log events to the server's Windows Event Log – Enabled
Stores - <shared storage path>
Options – Active Directory password updates (30 Days) – Enabled
Logging – Logging level was set to “Info”
o vDisk Pool
o vDisk Update Management (not configured)
o Device Collection
<Modular Block 1>
<Modular Block 2>
<Modular Block 2> – Remote Users
o Views (not configured)
• Storage
o Shared Storage
Site: <Environment Name>
Servers: Host1, Host2, and Host3 Enabled
Path: <path to shared storage>
• Boot Strap Settings were configured on each PVS server to load balance boot process to all servers in the PVS farm
In this configuration, if one of the PVS farm hosts fails, the remaining PVS servers will still be able to support the 2 Blocks.
There were two vDisks in each PVS farm, one for the Datacenter virtual desktops and one for the Remote Access virtual desktops with additional configuration.
The configuration for the vDisk for the virtual desktops is as follows:
• vDisk Details
o Name: <image name>
o Size: 40GB
o Mode: Cache on Device Hard Drive
• vDisk settings
• General
o Access mode: Standard Image (multi-device, read-only access)
o Cache Type: Cache on device hard drive
o Enable Active Directory machine account password management – Enabled
o Enable streaming of this vDisk – Enabled
• Identification: (default values)
• Auto update: Not configured
PVS XenDesktop Setup Wizard was deployed to create virtual desktop VMs and used a Template from SCVMM to create the virtual desktops VMs. Special registry settings were applied to optimize the VM creation process.
When using the PVS XenDesktop Setup Wizard to deploy the virtual desktops, the following settings were used to reduce the time required:
[HKEY_CURRENT_USER\Software\Citrix\ProvisioningServices\VdiWizard\Max_VM_CREATE_THREADS_PER_HYPERVISOR] set to “2” (this specifies two VM creations per Hyper-V host, so that only two write operations are issued per Cluster Shared Volume LUN per host)
• Two PVS servers are required to support two Modular Blocks. The third PVS server should be added to the PVS farm for resiliency.
• A centralized Windows File Share vDisk store should be stored on a shared NetApp iSCSI LUN to allow for best vDisks read performance both during deployment and the boot processes.
Design Considerations:
• The number of servers in the PVS farm is based on the number of VMs that a specific PVS server configuration can support and the number of the VMs the total PVS farm needs to stream, plus one for resiliency, in case of a host failure.
• Having a DNS entry for the Windows file server or storage system hosting the PVS store is a requirement, because an IP address cannot be used to point to a storage location.
• Consider using Windows File Share for shared vDisk store to get the benefit of Windows share caching technology. This will help improve storage performance.
• Consider using KMS setting with PVS vDisks if the image contains Microsoft Windows 7 or later and Office 2010 or later, which are activated using Microsoft KMS Volume Licensing method.
PVS File Servers
Each PVS farm was configured with a shared vDisk store configured to connect to a dedicated Windows File Server share, which was a shared NetApp iSCSI LUN. Dedicated PVS File Servers with dedicated iSCSI LUN were configured for each PVS farm to provide the required performance. This reduces storage contention or file locking of the vDisks among the PVS farms.
PVS SQL
The SQL Database should be designed to allow for host redundancy.
User Profile Management
The user accounts were created and organized based on a per-Modular Block design. Profile management was then configured based on that design. This allowed for better overall management of the user accounts and policies associated with those accounts.
Citrix User Profile Manager 4.1 was leveraged for user profiles. Each block was configured with a dedicated network share for profile storage and was placed on a CIFS share located on a NetApp FAS 3240. The user setting for profile management was assigned by Group Policy Objects (GPO), which aligned configurations specific to each modular block. Profiles for this project were configured as follows:
• Desktops without Personal vDisk: Streamed – enable delete cache on logoff
• Desktops with Personal vDisk: Streamed – Disable delete cache on logoff
Design Considerations:
• For this environment, separate CIFS shares and a separate UPM settings GPO were used for each of the five Modular Blocks.
• Consider validating storage performance and protocol selection for user profiles to match the environment requirements. In the test environment, iSCSI LUNs shared as Windows File Shares were chosen to meet performance requirements.
Multi-Site Infrastructure
Figure 20: Multi-site Infrastructure
The multi-site design with remote access was implemented with the intention of replicating the production environment of an Enterprise-level organization. The organization would have infrastructures comprised of a back-end Datacenter and geographically remote business locations that required access to resources in the Datacenter. We also took providing access for telecommuters into account. This was accomplished by including a Datacenter, two Branch Offices, and a Remote Access entry point for telecommuters.
Branch Offices
Branch offices were designed with segregated LAN and WAN networks. The WAN was created with both routers and firewalls on either side to emulate a production environment. The branch locations leveraged either a Branch Repeater VPX running as a VM on XenServer or a physical Branch repeater 8820 appliance. Both of these branch office types connect to a Branch Repeater VPX appliance set in the central Datacenter.
Remote Access Users
than a full VPN tunnel. The ICA proxy configuration facilitated the configuration of a VPN tunnel that allows for ICA connections, XenApp and XenDesktop, via the SSL protocol.
Design Considerations:
• A major consideration was monitoring and reporting device state. The Citrix Command Center was leveraged to report on all the Citrix Network Devices, which allowed a single unified console to manage, monitor, and troubleshoot the entire global application delivery infrastructure. The Command Center also incorporates up-to-date, proprietary SNMP counters that were used for reporting purposes.
Test Methodology
Test Milestones
Many tests were performed during different phases of the project. After completion of each test run, Performance and Test Reports were generated and analyzed. The purpose of each of the phases was to find possible bottlenecks that might be caused by one or more components. These components were adjusted prior to the next higher-scale phase, which ultimately led to the full 5k-scale multisite test.
Test phases:
• Single Server Scale: A single server was tested to confirm the maximum load that a single server could support and to validate the environment.
• Single Cluster Scale: A single Cluster consisting of eight Nodes was tested to obtain Single Cluster performance data.
• Single Modular Block Scale: A single chassis consisting of two Clusters (16 Nodes) was tested to obtain Single Chassis performance data.
• Dual Modular Block Scale: All modular blocks hosted by a single storage system were tested to determine single storage-performance data.
• Appliance Full Scale Tests: These were multiple tests performed to validate that the network appliances assigned to branch and remote sites could handle the number of user sessions assigned to those sites. The tests were performed for each type or model of network appliance deployed, such as Branch Repeaters or Access Gateway.
• Full 5K-Scale Multisite: This was the final phase of testing and consisted of all Chassis, Clusters, and Servers. This test utilized multiple storage units and spanned multiple sites. This testing represented the primary goal of the project.
Test Tools
Testing consisted of multiple-user ICA sessions launched from Citrix Receiver clients to XenDesktop Virtual Desktop VMs virtualized on Hyper-V, and launching user activity with Login VSI 3.6 medium workload.
Login VSI analyzer was used to record user experience data for all user sessions.
Session Launching
Citrix Receiver was utilized to launch ICA sessions to virtual desktops. In this test environment virtualized clients were utilized. These clients were configured to launch multiple ICA sessions each.
Orchestration tasks:
• Launching an ICA client session to a XenDesktop VDA
• Starting the in-session workload
• Logging off the user at the end of the test
Launcher Details:
Datacenter Site
• Client Launchers: 170
• Number of VDA: 3775
• Sessions Per Client: 22
BR 8800 Site
• Client Launchers: 30
• Number of VDA: 525
• Sessions Per Client: 17
BR VPX Site
• Client Launchers: 9
• Number of VDA: 150
• Sessions Per Client: 16
AG Site
• Client Launchers: 35
• Number of VDA: 600
• Sessions Per Client: 17
Figure 21: Client Launcher Details for Sites
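As a cross-check of the launcher details, the sessions-per-client figures can be derived from the launcher and VDA counts. This is a minimal sketch using only the numbers listed above; the dictionary layout and variable names are hypothetical, and the use of floor division is an assumption about how the published values were rounded.

```python
# (client launchers, number of VDA) per site, as listed in the launcher details
sites = {
    "Datacenter": (170, 3775),
    "BR 8800": (30, 525),
    "BR VPX": (9, 150),
    "AG": (35, 600),
}

# ASSUMPTION: the published "Sessions Per Client" values are rounded down.
sessions_per_client = {
    name: vdas // launchers for name, (launchers, vdas) in sites.items()
}
total_vdas = sum(vdas for _, vdas in sites.values())

for name, sessions in sessions_per_client.items():
    print(f"{name}: {sessions} sessions per client launcher")
print(f"Total VDAs across all sites: {total_vdas}")
```

Because the division leaves a remainder at every site, some launchers would need to carry one more session than the rounded figure to reach the full VDA count.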
Performance Capturing
The Windows Performance Monitor was used to collect data for all major systems during each test run. This allowed for near real-time performance monitoring during the test and in-depth historical data analysis after the test completed. All pertinent data was collected and centrally stored, including general system metrics. Citrix internal tools were leveraged for capturing specific data, such as XML Brokering time and user logon time.
In-Session Workload Simulation
Login VSI is a publicly available tool from Login VSI B.V. that provides an in-session workload representative of a typical user. A predefined Medium Workload was used for this test as described in the link.
In addition, the workload script was running on the server-side desktop and the generated traffic stream was almost completely downstream.
The test was configured to have each user execute two Login VSI loops, each 12-15 minutes long. The following defines the Login VSI Medium Load loop:
• The workload emulated a medium knowledge user executing Office, IE and PDF.
• Once a session is started, the medium workload repeats every 12 minutes.
• During each loop, the response time is measured every two minutes.
• The medium workload opens up to five apps simultaneously.
• The type rate is 160ms for each character.
• Approximately two minutes of idle time is included to simulate real-world users.
For additional detail, see the Login Consultants VSI Admin Guide at http://www.loginvsi.com/workloads, and under http://www.loginvsi.com/images/adminguide/mediumworkload.png
Results
Performance Results
During the execution of the test, performance was monitored, and data was gathered after each test to gauge the load on each component in the environment. Performance data was gathered during boot storms and test runs.
Performance Results – Boot Storm
Boot Storm performance data was captured while spinning up the 5000 virtual desktops in preparation for the Test Run. The Boot Storm starts when XenDesktop powers on the first virtual desktop and ends when all desktops are registered.
PVS for VDA Performance (Boot Storm)
During boot storm, we captured the following results:
• Maximum CPU load: 50%-55%
• The Average PVS Sent Bandwidth per VM from PVS was 629KBps.
Figure 22: PVS for VDA Performance – Boot Storm
Note: Modular Block 6 was not deployed in the environment; thus, the load on the third PVS server farm is only 1010 VMs.
SCVMM for VDA Performance (Boot Storm)
During boot storm, we captured the following results:
• CPU Utilization: 17%
• Bandwidth Utilization: 10MBps
The results validated our design assumption that a single SCVMM can handle a Modular Block of 1010 Desktops.
System Role CPU Peak (%) Total RAM (MB) RAM Used (MB) RAM %
R1E03C1B13 SCVMM - Block 1 1.2% 98304 5021 5.1%
R3E07C3B13 SCVMM - Block 1 11.6% 98304 6578 6.7%
R1E03C2B13 SCVMM - Block 2 17.6% 98304 6653 6.8%
R3E02C2B13 SCVMM - Block 2 1.9% 98304 5275 5.4%
R1E03C1B05 SCVMM - Block 3 1.6% 98304 5268 5.4%
R3E07C3B05 SCVMM - Block 3 12.8% 98304 7031 7.2%
R1E03C2B05 SCVMM - Block 4 12.9% 98304 7224 7.3%
R3E02C2B05 SCVMM - Block 4 1.7% 98304 5288 5.4%
R3E02C2B03 SCVMM - Block 5 1.1% 98304 5027 5.1%
R3E07C3B11 SCVMM - Block 5 17.3% 98304 6391 6.5%
Figure 23: SCVMM for VDA Performance – Boot Storm
Performance Results – Test Run
This section covers the Test Run performance data that was captured on the systems while running the full-scale virtual desktops test.
Hyper-V for VDA Performance (Test Run)
The test run validated that the 8-server cluster was sized adequately as a CPU utilization of 70-80% was captured. The test data also shows that the cluster is able to maintain all VMs running if one host fails. During the test, one of the hosts failed, and VMs were distributed to the other seven nodes within the cluster. Hyper-V RAM utilization was as expected.
The table below shows the data for a single Modular Block. The overall average CPU Utilization for all Blocks was 73.48% with a Maximum of 87.35%.
Virtualization (Hyper-V VDA Host) Role CPU Peak (%) RAM Available RAM Used RAM %
R1E04C3B01 Hyper-V - Block 1 (4C3P1) 77.49 196608 81532 70.85%
R1E04C3B02 Hyper-V - Block 1 (4C3P1) 77.56 196608 78734 66.80%
R1E04C3B03 Hyper-V - Block 1 (4C3P1) 80.99 196608 78747 66.81%
R1E04C3B04 Hyper-V - Block 1 (4C3P1) 72.72 196608 79649 68.10%
R1E04C3B05 Hyper-V - Block 1 (4C3P1) 75 196608 82124 71.73%
R1E04C3B06 Hyper-V - Block 1 (4C3P1) 69.08 196608 78732 66.79%
R1E04C3B07 Hyper-V - Block 1 (4C3P1) 71.5 196608 78721 66.78%
R1E04C3B08 Hyper-V - Block 1 (4C3P1) 70.42 196608 76496 63.69%
R1E04C3B09 Hyper-V - Block 1 (4C3P2) 75.55 196608 80704 69.63%
R1E04C3B10 Hyper-V - Block 1 (4C3P2) 79 196608 78563 66.55%
R1E04C3B11 Hyper-V - Block 1 (4C3P2) 72.26 196608 79813 68.34%
R1E04C3B12 Hyper-V - Block 1 (4C3P2) 79.27 196608 78691 66.73%
R1E04C3B13 Hyper-V - Block 1 (4C3P2) 79.82 196608 78737 66.80%
R1E04C3B14 Hyper-V - Block 1 (4C3P2) 68.68 196608 78486 66.44%
R1E04C3B15 Hyper-V - Block 1 (4C3P2) 69.09 196608 80452 69.26%
R1E04C3B16 Hyper-V - Block 1 (4C3P2) 74.76 196608 80814 69.79%
Figure 24: Hyper-V for VDA Performance
PVS for VDA Performance (Test Run)
System Role CPU Peak (%) Total RAM (MB) RAM Used (MB) RAM %
R1E03C1B03 PVS for VDA - Block 1&2 9.9% 98304 6356 6.5%
R1E03C2B03 PVS for VDA - Block 1&2 12.2% 98304 7126 7.2%
R3E07C3B03 PVS for VDA - Block 1&2 11.4% 98304 6067 6.2%
R1E03C1B11 PVS for VDA - Block 3&4 12.8% 98304 6817 6.9%
R1E03C2B11 PVS for VDA - Block 3&4 12.4% 98304 7013 7.1%
R3E02C2B11 PVS for VDA - Block 3&4 11.4% 98304 6517 6.6%
R1E03C2B06 PVS for VDA - Block 5 6.2% 98304 6409 6.5%
R3E02C2B06 PVS for VDA - Block 5 7.5% 98304 6767 6.9%
R3E07C3B06 PVS for VDA - Block 5 5.1% 98304 6383 6.5%
Figure 25: PVS for VDA Performance
SCVMM for VDA Performance (Test Run)
The test results show low CPU and RAM utilization on the SCVMM for VDA servers, as there were limited requests to the SCVMM servers during the test run.
System Role CPU Peak (%) Total RAM (MB) RAM Used (MB) RAM %
R1E03C1B13 SCVMM - Block 1 1.2% 98304 5018 5.1%
R3E07C3B13 SCVMM - Block 1 6.8% 98304 6330 6.4%
R1E03C2B13 SCVMM - Block 2 10.4% 98304 6465 6.6%
R3E02C2B13 SCVMM - Block 2 2.0% 98304 5278 5.4%
R1E03C1B05 SCVMM - Block 3 1.9% 98304 5265 5.4%
R3E07C3B05 SCVMM - Block 3 8.4% 98304 6905 7.0%
R1E03C2B05 SCVMM - Block 4 15.0% 98304 7238 7.4%
R3E02C2B05 SCVMM - Block 4 2.0% 98304 5286 5.4%
R3E02C2B03 SCVMM - Block 5 1.2% 98304 5023 5.1%
R3E07C3B11 SCVMM - Block 5 10.8% 98304 6362 6.5%
SCVMM for VDA Library Server File Server Performance (Test Run)
The test results show that there is minimal utilization on the SCVMM Library servers.
System Role CPU Peak (%) Total RAM (MB) RAM Used (MB) RAM %
R1E03C1B10 SCVMM - Infrastructure 0.9% 98304 5401 5.5%
R3E02C2B10 SCVMM - Infrastructure 1.3% 98304 5507 5.6%
Figure 27: SCVMM for VDA Library Server File Server Performance
Multi-Site Performance (Test Run)
The results for the Branch Repeater VPX showed a 3.81 to 1 compression ratio or about 73.75% over the life of the test run.
The graphs below show the send and receive comparisons for both compressed and non-compressed traffic. Notice that the receive traffic is more prevalent since the virtual desktop traffic is being received by the endpoints at the branch office.
Figure 28: BR VPX Receive Stats
Figure 29: BR VPX Send Stats
With the SDX appliance on the Datacenter side that is accepting the accelerated connections, note that the SDX has higher send traffic that correlates to the receive traffic on the branch side.
The next branch office type leveraged a physical Branch Repeater (BR) 8820. This appliance showed similar results to the VPX, but can handle more connections because it is a more powerful appliance. The test run showed a 3.48 to 1 compression ratio, or about 71.3%. The graphs below show the send and receive comparisons for both compressed and non-compressed traffic. Similar to the VPX statistics above, the BR 8820 showed the following:
Figure 32: BR 8820 Receive stats
Figure 33: BR 8820 Send stats
Figure 34: BR SDX Receive stats
Figure 35: BR SDX Send stats
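The percentages quoted with the compression ratios above follow from the relationship reduction = 1 - 1/ratio. The sketch below shows that conversion for the two reported ratios; the function name is hypothetical, and the interpretation of the percentage as bandwidth reduction is an assumption that happens to match the published numbers.

```python
def bandwidth_reduction_pct(compression_ratio: float) -> float:
    """Convert an N-to-1 compression ratio into a percentage traffic reduction."""
    return (1 - 1 / compression_ratio) * 100


vpx_reduction = bandwidth_reduction_pct(3.81)      # Branch Repeater VPX
br8820_reduction = bandwidth_reduction_pct(3.48)   # Branch Repeater 8820

print(f"BR VPX:  {vpx_reduction:.2f}% reduction")
print(f"BR 8820: {br8820_reduction:.1f}% reduction")
```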
Additional Testing: PVS with Personal vDisk (PvD)
Deployed in a separate parallel environment in order to perform PvD-related tests, the virtual desktop for the PvD configuration was streamed from PVS, and utilized 5.5-10.5 GB on Cluster shared storage.
Please note that the following PvD data cannot be used as comparison to our non-PvD environment as it was deployed with a newer version of PvD with performance enhancements.
Figure 36: Streamed VDA Storage Structure with PVD Configuration
Objective
A separate environment that was utilized for the initial sizing of the environment for this project was also used to do additional testing using PVS with Personal vDisk. The goal of this additional test was to determine the maximum number of desktops that could successfully pass a test using PvD in the same environment as Non-PvD.