
Solution validation and performance results

This section details the solution validation and performance test results:

• About performance results

• Exchange performance validation tools

• Monitoring Exchange performance

• Monitoring the Exchange server’s health

• Solution validation with Exchange Jetstress

• Performance results with Jetstress–servers

• XtremCache acceleration test results

• Storage array performance results with Jetstress

• Solution validation with Exchange LoadGen

• Analyzing the LoadGen test results

• LoadGen test results summary

• Performance with XtremCache data deduplication

• Deduplication test results summary

About performance results

Performance results are highly dependent on workload, specific application requirements, and system design and implementation. Relative system performance will vary as a result of these and other factors. Therefore, we suggest that you do not use this workload as a substitute for a specific customer application benchmark when you are planning for capacity and evaluating product decisions.

All performance data contained in this document was obtained in a rigorously controlled environment. Results obtained in other operating environments may vary significantly. EMC Corporation does not warrant or represent that a user can or will achieve similar performance expressed in transactions per minute.

Exchange performance validation tools

To evaluate how XtremCache improves Exchange 2010 performance, we performed a full end-to-end solution validation using standard tools for Exchange performance validation, including both Jetstress and LoadGen. Each tool is described briefly below.

Exchange Jetstress

Jetstress 2010 works with the Microsoft Exchange Server 2010 database engine to simulate the Exchange database and log disk I/O load. It simulates the Exchange database and log file loads produced by a specific number of users.

Jetstress is primarily used to verify the performance and stability of a disk subsystem before an Exchange 2010 server is put into production. Jetstress helps verify that your disk subsystem meets or exceeds the performance criteria you establish. After successful completion of the Jetstress disk performance and stress tests in a non-production environment, you will have verified that your Exchange 2010 disk subsystem is adequately sized (in terms of the performance criteria you establish) for the user count and user profiles you have established.

Accelerating Microsoft Exchange 2010 Performance with EMC XtremCache

Jetstress is also helpful in identifying any issues or bottlenecks in your I/O stack, from the Windows Storport drivers to the SAN and hypervisor infrastructure, prior to production deployment.

Note: You will never see the workload that Jetstress creates in your production environment.

Jetstress runs at 100 percent user concurrency with the maximum load to the storage subsystem. It is highly unlikely for this condition to occur in your production environment.

EMC strongly recommends that you use Jetstress to validate storage reliability and performance prior to deploying your Exchange servers in a production environment.

You can download Jetstress from the Microsoft Download Center: http://go.microsoft.com/fwlink/?LinkId=178616

Jetstress documentation describes how to configure and execute an I/O validation or evaluation on your server hardware.

In this solution, we used Jetstress to evaluate how the VNX storage subsystem was affected by the Jetstress workload and how XtremCache acceleration on the database LUNs helped to reduce the read I/O to the back-end storage. We also used Jetstress to identify the best settings for the page and max_io parameters of the XtremCache device.

Exchange Load Generator

While the Jetstress tool tests the performance of the Exchange storage subsystem before it is placed in the production environment, it does not test how MAPI user activity affects server CPU and memory across the entire Exchange infrastructure. For this purpose, we used the Exchange Load Generator (LoadGen) tool.

LoadGen is designed to produce a simulated client workload against a test Exchange deployment. Although this workload is different from the workload you will see in your production environment, it can be used to evaluate how Exchange performs and to test the overall solution. It can also be used to analyze the effect of various configuration changes on Exchange behavior and performance while the system is under load. LoadGen is a useful tool for administrators when they are sizing servers and validating a deployment plan.

Note: LoadGen requires full deployment of the Exchange environment for validation testing.

You should perform all LoadGen validation testing in an isolated lab environment where there is no connectivity to production data.

Also, ensure that you plan adequate time to prepare your test environment, because the database initialization process can take a long time depending on the mailbox size and number of users in your test Exchange deployment. In our lab environment, for example, populating 15,000 users with 1.5 TB of mailbox data took three weeks.

For additional information about using Jetstress and LoadGen, refer to the Microsoft TechNet Library topic Tools for Performance and Scalability Evaluation.

Monitoring Exchange performance

When validating whether a Mailbox server is properly sized, you should focus on processor, memory, storage, and Exchange application health.

When validating performance on Exchange 2010 Mailbox servers, EMC recommends that you use the key counters with the target benchmark values shown in Table 7. For a full list of all Exchange 2010 performance counters, review the Microsoft TechNet Library topic Performance and Scalability Counters and Thresholds.

Table 7. Key Exchange performance counters

Counter                                                    Target
MSExchange Database\I/O Database Reads Average Latency     <20 ms
MSExchange Database\I/O Database Writes Average Latency    <20 ms
MSExchange Database\I/O Log Writes Average Latency         <10 ms
MSExchangeIS\RPC Averaged Latency                          <10 ms on average
MSExchangeIS\RPC Requests                                  <70 at all times
Processor(_Total)\% Processor Time                         <80%
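
As a rough illustration of how the targets in Table 7 might be applied to collected counter samples, the following Python sketch compares a set of samples against each target. The sample values and the `check_counters` helper are hypothetical. The choice to average the latency and CPU samples while checking RPC Requests against its peak is also an assumption made for this example, not something the tested solution prescribes.

```python
from statistics import mean

# Targets follow Table 7. Each entry is (limit, reducer): latency and CPU
# samples are averaged, while RPC Requests must stay under its limit at all
# times, so its peak is checked. The reducer choice is an assumption.
TARGETS = {
    r"MSExchange Database\I/O Database Reads Average Latency": (20.0, mean),
    r"MSExchange Database\I/O Database Writes Average Latency": (20.0, mean),
    r"MSExchange Database\I/O Log Writes Average Latency": (10.0, mean),
    r"MSExchangeIS\RPC Averaged Latency": (10.0, mean),
    r"MSExchangeIS\RPC Requests": (70.0, max),
    r"Processor(_Total)\% Processor Time": (80.0, mean),
}


def check_counters(samples):
    """Return {counter: (observed, limit, passed)} for each sampled counter."""
    results = {}
    for counter, (limit, reducer) in TARGETS.items():
        values = samples.get(counter)
        if not values:
            continue  # counter was not collected in this run
        observed = reducer(values)
        results[counter] = (observed, limit, observed < limit)
    return results


# Hypothetical samples, for illustration only.
samples = {
    r"MSExchange Database\I/O Database Reads Average Latency": [12.0, 18.0, 25.0],
    r"MSExchangeIS\RPC Requests": [10, 75, 20],
}

for counter, (observed, limit, passed) in check_counters(samples).items():
    status = "OK  " if passed else "FAIL"
    print(f"{status} {counter}: {observed:.1f} (target <{limit:g})")
```

In this sketch a single read-latency spike above 20 ms does not fail the check because the average stays below the target, whereas a single RPC Requests sample above 70 does.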

For Exchange Jetstress, concentrate on the disk latencies and IOPS achieved by your storage subsystems; with LoadGen, consider CPU, memory, and RPC latencies on the Exchange Mailbox servers and other Exchange roles.

Note: In this solution, we focused only on the performance of the Mailbox server role and not the Hub Transport and Client Access roles.

Monitoring the Exchange server’s health

Even if there are no obvious issues with processor, memory, or disk, EMC recommends that you monitor the standard application health counters to ensure that the Exchange Mailbox server is in a healthy state.

The MSExchangeIS\RPC Averaged Latency counter provides the best indication of whether other counters with high database latencies are actually impacting Exchange’s health and client experience. Often, high RPC-averaged latencies are associated with a high number of RPC requests, which should be less than 70 at all times.

Solution validation with Exchange Jetstress

To validate Exchange performance with XtremCache, we configured Exchange Jetstress to run on three Exchange virtual machines with all databases configured in a single VNX storage pool, simulating a 15,000-user workload (similar to a DAG database switchover condition). In the first test, we established our baseline, where XtremCache was not enabled on the database LUNs. In the second test, we enabled XtremCache on the database LUNs to accelerate Exchange performance.


Performance results with Jetstress–servers

After both tests completed, we analyzed the results and compared the I/O and disk latency performance between the two tests on the Exchange servers and the back-end storage array. For the performance comparison, we measured only the achieved IOPS for the Exchange database LUNs. These IOPS include:

• Database IOPS (user IOPS): database reads and database writes

• BDM IOPS: IOPS associated with the Background Database Maintenance process

XtremCache acceleration test results

Figure 17 and Figure 18 display the results of the tests. We observed the following:

• Aggregate IOPS from the three Mailbox servers improved by 26 percent, from 2,812 IOPS to 3,545 IOPS.

• Read IOPS increased by 34 percent, from 1,388 IOPS to 1,862 IOPS.

• Write IOPS increased by 33 percent, from 851 IOPS to 1,118 IOPS.

• Read latencies decreased by 3.2 ms.
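
The percentages in the list above can be recomputed from the quoted IOPS values. The following Python sketch is only a sanity check; the `percent_change` helper is a name introduced here. Under this simple formula the write figures work out to roughly 31 percent rather than the quoted 33 percent, which may reflect rounding or unrounded measurements behind the published numbers.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# (baseline, with XtremCache) IOPS as quoted in the list above,
# summed across the three Mailbox servers.
jetstress_iops = {
    "aggregate": (2812, 3545),
    "read": (1388, 1862),
    "write": (851, 1118),
}

for name, (baseline, accelerated) in jetstress_iops.items():
    print(f"{name}: {percent_change(baseline, accelerated):.1f}%")
```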

The marginal decrease in read latencies occurred because Jetstress does not fully utilize server memory. Jetstress is designed to stress the back-end storage system by bypassing local server resources. When the Exchange LoadGen workload is used, as described later in this white paper, the beneficial effect of XtremCache is more apparent.

Figure 17. Exchange performance with XtremCache: Jetstress results (IOPS)

Figure 18. Exchange performance with XtremCache: Jetstress results (latencies)

Storage array performance results with Jetstress

On the VNX storage array, we observed the following results when XtremCache was enabled on the Exchange database LUNs in the virtual machines. Read IOPS decreased because of XtremCache acceleration on the host, while write IOPS increased, so the total IOPS produced by the VNX RAID 1/0 NL-SAS storage pool remained almost unchanged. Bandwidth increased slightly and disk utilization rose as a result of the array processing 50 percent more writes.

The following results summarize the data presented in Figure 19.

• A 16.5 percent decrease in read IOPS to the back-end storage array, because XtremCache offloads the reads from the array to the server.

• A 50 percent increase in write IOPS to the back-end storage array, because XtremCache offloads the reads from the array to the server, allowing more write activity to be processed by write-through cache.

• A 15 percent increase in disk utilization due to the array processing more writes.

• A 6.5 percent increase in bandwidth (MB/s) due to the increased write activity processed by the array.
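
To see how a 16.5 percent drop in reads and a 50 percent rise in writes can leave total array IOPS almost unchanged, consider the Python sketch below. The baseline read and write values are hypothetical, chosen only to illustrate the arithmetic; they are not taken from Figure 19. The two changes roughly cancel when baseline reads are about three times baseline writes.

```python
def array_total_change(read_iops, write_iops, read_drop=0.165, write_gain=0.50):
    """Percent change in total array IOPS after applying the quoted shifts."""
    before = read_iops + write_iops
    after = read_iops * (1 - read_drop) + write_iops * (1 + write_gain)
    return (after - before) / before * 100.0


# Hypothetical baseline: 3,000 read IOPS and 1,000 write IOPS at the array.
print(f"{array_total_change(3000, 1000):+.2f}% change in total IOPS")
```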

Figure 19. VNX storage performance with Exchange Jetstress

Solution validation with Exchange LoadGen

To validate the effectiveness of XtremCache with Exchange 2010 under a MAPI workload, we performed multiple tests with different LoadGen workload profiles. We installed Exchange 2010 and initialized 15,000 users with mailboxes of up to 1 GB, which resulted in a 15 TB total working set (5 TB per Exchange server virtual machine). For testing purposes, we performed a DAG database switchover that resulted in three Exchange virtual machines, each hosting 5,000 active users, configured from a single VNX storage pool with 48 2 TB NL-SAS drives in a RAID 1/0 configuration.
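
The sizing figures in the preceding paragraph follow from simple arithmetic, sketched below in Python. The `sizing_summary` helper is hypothetical. The sketch assumes decimal units (1 TB = 1,000 GB), every mailbox at the full 1 GB, and a RAID 1/0 usable capacity of half the raw capacity with no allowance for formatting or system overhead.

```python
def sizing_summary(users=15_000, mailbox_gb=1.0, servers=3, drives=48, drive_tb=2.0):
    """Back-of-the-envelope capacity figures for the LoadGen test bed."""
    working_set_tb = users * mailbox_gb / 1000.0
    return {
        "working_set_tb": working_set_tb,
        "per_server_tb": working_set_tb / servers,
        # RAID 1/0 mirrors every drive, so usable capacity is half of raw.
        "raid10_usable_tb": drives * drive_tb / 2.0,
    }


print(sizing_summary())
```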

We ran each LoadGen workload simultaneously on three Mailbox servers with 5,000 users per Exchange Mailbox server and with the Outlook profile in cached mode. Each LoadGen test ran for 8 hours with a full workload.

As stated earlier, the baseline profile for this solution is 150 messages/user/day. With each test, we increased the workload to 250 messages/user/day and then to 300 messages/user/day. We conducted these tests without changing the virtual machines’ configuration or the back-end VNX storage. After establishing a baseline for each workload profile, we reran the tests with XtremCache enabled on the Exchange database LUNs.

Note: To simplify the comparison analysis between workloads, we configured each Exchange virtual machine with six vCPUs and 32 GB of RAM. Although only four vCPUs were required for a 150-message/user/day workload profile, eight vCPUs were required for 250- and 300-message/user/day workloads.

Analyzing the LoadGen test results

To ensure accuracy in our analysis, we used the following values for results comparison:

• IOPS value: We calculated the results by adding database read I/O, database write I/O, and BDM I/O for all three Exchange virtual machines, each running a 5,000-user workload. The BDM process ran periodically on each database during the eight-hour LoadGen runs.

• Latencies: We used database read and write latency performance counters from each Exchange Mailbox server.
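
The IOPS aggregation described above amounts to summing three components per server and then summing across servers. The Python sketch below illustrates this under stated assumptions: the `ServerSample` type, the `solution_iops` helper, and all sample values are hypothetical and are not measurements from this solution.

```python
from dataclasses import dataclass


@dataclass
class ServerSample:
    db_reads: float   # database read I/O per second
    db_writes: float  # database write I/O per second
    bdm: float        # Background Database Maintenance I/O per second

    @property
    def total(self):
        return self.db_reads + self.db_writes + self.bdm


def solution_iops(samples):
    """Total IOPS across all Exchange virtual machines."""
    return sum(sample.total for sample in samples)


# Hypothetical per-server values, for illustration only.
servers = [
    ServerSample(db_reads=300, db_writes=200, bdm=150),
    ServerSample(db_reads=310, db_writes=190, bdm=150),
    ServerSample(db_reads=290, db_writes=210, bdm=150),
]
print(solution_iops(servers))
```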

After performing all tests, we observed a consistent reduction in read latencies and an increase in user IOPS with all workload types when XtremCache was enabled to accelerate the database LUNs. Even the 300-message workload, which experienced read latencies of over 20 ms without XtremCache, became a normal steady workload with reduced latencies and increased IOPS when XtremCache was enabled. The failure of this workload without XtremCache was expected, because the storage and Exchange virtual machine resources were originally designed for 150-message workloads. Figure 20 provides additional details for each test performed.

Figure 20. Exchange 2010 performance with XtremCache and LoadGen workload

LoadGen test results summary

The highlights of the observed test results are as follows:

• A 150-message/user/day workload achieved a 51 percent reduction in read latencies (by 6.4 ms) and a 14.6 percent increase in user IOPS (by 224 IOPS).

• A 250-message/user/day workload achieved a 69.3 percent reduction in read latencies (by 11.1 ms) and a 12.8 percent increase in user IOPS (by 275 IOPS).

• A 300-message/user/day workload achieved a 56.8 percent reduction in read latencies (by 12.5 ms) and a 12 percent increase in user IOPS (by 346 IOPS).
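
Because each highlight quotes both a relative and an absolute change, the approximate baseline and accelerated values can be recovered from them. The Python sketch below does this; the `implied_values` helper is a name introduced here, and the results are only as precise as the rounded figures quoted above. For the 300-message profile the implied baseline read latency comes out at roughly 22 ms, which is consistent with the "over 20 ms" figure mentioned earlier in this section.

```python
def implied_values(percent, absolute, increase):
    """Recover (baseline, with_cache) from a relative and an absolute change."""
    baseline = absolute / (percent / 100.0)
    with_cache = baseline + absolute if increase else baseline - absolute
    return baseline, with_cache


# (messages/user/day, latency reduction %, reduction in ms,
#  user IOPS increase %, increase in IOPS), as quoted in the list above.
quoted = [
    (150, 51.0, 6.4, 14.6, 224),
    (250, 69.3, 11.1, 12.8, 275),
    (300, 56.8, 12.5, 12.0, 346),
]

for profile, lat_pct, lat_ms, iops_pct, iops_abs in quoted:
    lat_before, lat_after = implied_values(lat_pct, lat_ms, increase=False)
    iops_before, iops_after = implied_values(iops_pct, iops_abs, increase=True)
    print(
        f"{profile} msgs/user/day: "
        f"read latency {lat_before:.1f} -> {lat_after:.1f} ms, "
        f"user IOPS {iops_before:.0f} -> {iops_after:.0f}"
    )
```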

Performance with XtremCache data deduplication

To validate Exchange performance with XtremCache inline data deduplication, we performed validation on one Exchange virtual machine with 5,000 users. We performed a series of LoadGen tests, each running for 8 hours, with multiple workload profiles to see the effect of data deduplication. We monitored the XtremCache statistics to determine the appropriate deduplication ratio for each workload. With the LoadGen workloads that we generated, we observed that a 30 percent deduplication ratio would be more effective than the default 20 percent. Figure 21 shows the deduplication ratio observed during testing.

Note: As stated earlier, the LoadGen workload does not represent the actual workload that will be seen in your production environment. The results observed and recommendations provided here are based on lab configuration and results only. Ensure that you configure your environment based on your workload requirements and characteristics.

Figure 21. XtremCache statistics with data deduplication

Deduplication test results summary

Figure 22 and Figure 23 show the XtremCache data deduplication test results with multiple workload profiles for the Exchange 2010 Mailbox server, demonstrating:

• Decreased Exchange server CPU utilization with each workload

• Slightly increased write latencies due to XtremCache analysis and processing of duplicate data

Figure 22. Exchange server CPU utilization with XtremCache data deduplication

Figure 23. Exchange server disk latencies with XtremCache data deduplication

Analysis of the back-end VNX storage array shows that when deduplication was enabled on the server, the writes to the VNX array were reduced. In Figure 24, you can see that the write activity was reduced from 90 IOPS to about 65 IOPS for one of the database LUNs, a reduction of about 27.8 percent.

Figure 24. Exchange database LUN performance with XtremCache data deduplication


Conclusion

Summary

This solution validates the use of EMC XtremCache with Microsoft Exchange Server 2010. XtremCache is a server-based cache. Introducing XtremCache into your physical or virtual infrastructure does not require changes to the application or storage system layouts. Because XtremCache is a caching solution rather than a storage solution, moving data is unnecessary. Therefore, your data is not at risk of becoming inaccessible if the server or the XtremSF PCIe card fails. XtremCache is designed to minimize CPU overhead in the server by offloading flash management operations from the host CPU onto the XtremSF PCIe card.

Based on observations during our performance validation, XtremCache has proven to be highly scalable and reliable in virtualized environments. It can relieve I/O processing pressure from the storage system and boost the disk read operations driven by the host.

XtremCache increases the overall Exchange application IOPS and significantly reduces disk latencies with minimal impact on system resources. Using XtremCache enables customers to configure Exchange for both performance and low cost without making trade-offs.

Based on this solution validation, managing and monitoring XtremCache in a vSphere environment is easy. After being configured, XtremCache requires no user intervention and continuously adapts to meet the application workload requirements.

Findings

From our results, we conclude that:

• XtremCache can reduce Exchange database read response times.

• With an optimized VNX storage system, XtremCache can offload read I/O processing from the storage array while reducing disk latencies, which enables higher transactional throughput. It can address hotspots in the data center and alleviate possible storage bottlenecks.

• The XtremCache host driver has minimal impact on server and virtual machine system resources. During our testing, the system resources were mostly consumed by the Exchange Mailbox server workload, and the XtremCache driver overhead was negligible.

• With XtremCache data deduplication enabled, we observed a significant reduction in write activity going to the back-end VNX storage system. XtremCache can offload write I/O processing from the storage array, reducing disk latencies and bandwidth, thus enabling higher transactional throughput.

• The initial warm-up period for XtremCache with Exchange simulated workloads varies for each environment. In this solution, we observed the effect of XtremCache almost immediately after it was enabled. It reached a steady state in approximately 30 minutes for all Exchange accelerated database LUNs with 15 TB of data.

