This section describes the test results for ESXi host memory and CPU utilization, and for the storage metrics, expressed as both IOPS and response time.
The most critical metric in a virtual desktop validation is the user application response time. In this testing, the system was optimized so that the user response time for any application activity was less than 3 seconds. As expected, when the number of users was increased by adding another blade, the application response time increased only marginally. The validation effort also focused on keeping the back-end storage array response time below 1 ms to ensure exceptional I/O performance. The CPU and memory characteristics describe the behavior of a representative blade used in the validation.
Application response times
Figures 17 and 18 show the overall application response times as measured by the open and close times of the various Microsoft Office applications. With 768 desktop sessions hosted on the UCS chassis, 96 Windows 7 desktops were running the user simulation workload on each blade in the chassis. Similarly, the results for 1,536 desktop sessions show the response time characteristics when a second UCS chassis was added, so that the number of Windows 7 desktops running on each blade remained the same. The increased load on the front-end ports of the storage array increased the application response time, as illustrated by the charts in the figures below. Response times increased by an average of 15% when the number of desktops increased from 768 to 1,536.
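The blade counts implied by these figures can be checked with a short calculation. The following is a minimal sketch that uses only the session counts and the 96-desktops-per-blade density stated above; the function name `blades_required` is illustrative and not part of any tool used in the validation.

```python
# Density and session counts are taken from the text above.
DESKTOPS_PER_BLADE = 96


def blades_required(total_desktops: int, per_blade: int = DESKTOPS_PER_BLADE) -> int:
    """Return the number of blades needed to host total_desktops, rounding up."""
    return -(-total_desktops // per_blade)  # ceiling division on integers


if __name__ == "__main__":
    print(blades_required(768))    # 8 blades (one chassis in this test)
    print(blades_required(1536))   # 16 blades (two chassis in this test)
```

Because the per-blade density stays at 96 in both test cases, any change in response time between the two runs comes from shared resources such as the storage front end rather than from the blades themselves.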
Figure 17. Application response time with 768 users
Figure 18. Application response time with 1,536 users
vSphere Server utilization
The charts below show the resource utilization of one representative blade server during the steady state. Similar characteristics were observed on the remaining blades of the test harness.
CPU utilization
Figures 19 and 20 illustrate the CPU utilization of a representative ESXi server. Each server hosted 96 user desktops. It is evident from these figures that the CPU utilization was high and close to 95% in
Figure 19. CPU utilization with 768 users
Figure 20. CPU utilization with 1,536 users
Memory utilization
Figures 21 and 22 illustrate the memory utilization of a representative ESXi server from the cluster hosting the virtual desktops. Each UCS blade was configured with 96 GB of physical RAM. With each blade running 96 desktop sessions, memory was over-committed. ESXi supports over-commitment through techniques such as page sharing, ballooning, and swapping to disk. All of these techniques were observed during the testing and validation effort. Ballooning and swapping activity were observed in each test case and remained the same as the desktop environment scaled from 768 desktops on one UCS chassis to 1,536 desktops on two UCS chassis.
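To make the over-commitment concrete, the sketch below computes a simple over-commit ratio (configured guest memory divided by physical RAM). The 96 GB and 96 desktops per blade come from the text; the per-desktop memory size is not stated in this section, so the 2 GB value used here is purely hypothetical. The calculation also ignores hypervisor and per-VM overhead.

```python
PHYSICAL_RAM_GB = 96      # from the text: RAM per UCS blade
DESKTOPS_PER_BLADE = 96   # from the text: desktop sessions per blade


def break_even_ram_per_vm_gb(physical_ram_gb: float, vm_count: int) -> float:
    """Per-VM memory above which the host is over-committed (overhead ignored)."""
    return physical_ram_gb / vm_count


def overcommit_ratio(vm_count: int, ram_per_vm_gb: float, physical_ram_gb: float) -> float:
    """Ratio of total configured guest memory to physical RAM; > 1.0 means over-committed."""
    return (vm_count * ram_per_vm_gb) / physical_ram_gb


if __name__ == "__main__":
    # 96 GB shared by 96 desktops leaves 1 GB each before over-commitment starts.
    print(break_even_ram_per_vm_gb(PHYSICAL_RAM_GB, DESKTOPS_PER_BLADE))  # 1.0

    # HYPOTHETICAL per-desktop size, chosen only for illustration.
    hypothetical_ram_per_desktop_gb = 2.0
    print(overcommit_ratio(DESKTOPS_PER_BLADE, hypothetical_ram_per_desktop_gb, PHYSICAL_RAM_GB))  # 2.0
```

Any ratio above 1.0 means ESXi must reclaim memory through page sharing, ballooning, or swapping, which matches the behavior described above.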
Note: Do not be concerned that the ballooning of memory appears to be very high in the above figures. User experience in desktop environments is affected only if the swapping activity is high. In our test
Figure 21. Memory utilization with 1,536 users
Figure 22. Memory utilization with 768 users
Storage metrics
Figures 23-28 plot the VMAX front-end I/O characteristics with 768 and 1,536 users.
Figure 23. VMAX front-end IOPS (4 front-end ports) with 768 users
Figure 24. VMAX front-end % busy with 768 users
15% Front End Busy
Figure 25. VMAX front-end throughput with 768 users
25% Front End Busy
Figure 27. VMAX front-end % busy with 1,536 users
Figure 28. VMAX front-end throughput with 1,536 users
Backend storage
Figures 29-32 show the corresponding statistics from the back end of the VMAX storage array. As the number of virtual desktops is scaled from 768 to 1,536, the back-end disk statistics scale almost linearly: the back-end I/Os increase by about 80%, while the % disk busy remains the same at about 35%. Disk utilization stays flat because more disks were added when the environment was scaled from 768 to 1,536 desktops.
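One way to read "almost linearly" is as a scaling efficiency: the observed growth in back-end I/O divided by the growth in desktops. The sketch below uses only the figures quoted above (desktops doubling from 768 to 1,536 and back-end I/O rising by about 80%); the function name `scaling_efficiency` is an illustrative assumption.

```python
def scaling_efficiency(load_growth_factor: float, io_growth_factor: float) -> float:
    """Observed I/O growth relative to load growth; 1.0 would be perfectly linear."""
    return io_growth_factor / load_growth_factor


if __name__ == "__main__":
    desktops_before, desktops_after = 768, 1536          # from the text
    load_growth = desktops_after / desktops_before       # 2.0x desktops
    io_growth = 1.0 + 0.80                               # "about 80%" more back-end I/O

    print(f"{scaling_efficiency(load_growth, io_growth):.2f}")  # 0.90
```

A value near 0.9 is consistent with the "almost linear" description; the 80% figure is itself approximate, so the result should be read as a rough indicator only.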
Figure 29. VMAX back-end IOPS with 768 users
35% Disk busy
Figure 30. VMAX back-end % busy (average) with 768 users
Figure 31. VMAX back-end IOPS with 1,536 users
35% Disk busy
Figure 32. VMAX back-end disk % busy (average) with 1,536 users
The steady-state IOPS on the VMAX front-end ports are approximately 573 for 768 users and 1,250 for 1,536 users. The utilization (% busy) of the front-end ports is an average of 15% for 768 users and 25% for 1,536 users. Note that the front-end utilization and IOPS increase significantly when all the desktops are started at the same time (boot storm event).
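For a rough per-desktop view of the front-end load, the steady-state figures above can be normalized by the number of desktops. This is a minimal sketch using only the numbers quoted in the preceding paragraph; the helper name `iops_per_desktop` is illustrative.

```python
def iops_per_desktop(total_iops: float, desktops: int) -> float:
    """Average steady-state front-end IOPS attributable to one desktop."""
    return total_iops / desktops


if __name__ == "__main__":
    # Steady-state front-end figures quoted in the text.
    small = {"desktops": 768, "iops": 573, "busy_pct": 15}
    large = {"desktops": 1536, "iops": 1250, "busy_pct": 25}

    print(f"{iops_per_desktop(small['iops'], small['desktops']):.3f}")  # 0.746
    print(f"{iops_per_desktop(large['iops'], large['desktops']):.3f}")  # 0.814

    # Growth in front-end IOPS and port utilization as desktops double.
    print(f"{large['iops'] / small['iops']:.2f}")          # 2.18
    print(f"{large['busy_pct'] / small['busy_pct']:.2f}")  # 1.67
```

These are steady-state averages only; as noted above, a boot storm drives both IOPS and port utilization well above these values.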
The back-end IOPS scale nearly linearly as the number of desktops doubles, while the disk utilization remains almost the same. This is because 48 additional disks were added when the desktop environment was scaled from 768 to 1,536 desktops. The above charts show that as the environment was scaled from 768 desktops to 1,536 desktops by adding a new UCS chassis and storage resources, the IOPS and other storage operations on the VMAX system scaled nearly linearly.
Scaling considerations
In this solution architecture, we have seen that virtual desktop environments can be scaled by either of the following:
• Scaling the Vblock System by adding Cisco UCS blades and EMC VMAX engines
• Using the next generation of UCS blades, which offer better performance characteristics