
Chapter 7 Solution Validation and Testing

This chapter presents the following topics:

• Overview

• Performance criteria and methodologies

• Mixed workload test results

• TimeFinder SnapVX test results

Overview

This chapter validates the performance of SQL Server 2014 in a virtualized environment on the VMAX3 storage array with mission-critical combined workloads (OLTP/DSS).

This chapter also describes how we implemented TimeFinder SnapVX and validated the performance impact during snapshot creation, and the functionality of the snapshot recovery.

Performance criteria and methodologies

This section describes the performance criteria and methodologies used to validate the solution.

Because the SLO is a desired level of performance required by the storage workload, we needed to note the expected response-time criteria listed in Table 9 before running the tests.

Table 9. Pre-test response-time criteria for SLOs

Service level  Behavior                          OLTP: expected  OLTP: range  DSS: expected  DSS: range
Diamond        Emulates flash drive performance  0.8 ms          0–3 ms       2.3 ms         1–6 ms
Gold           Emulates 15k RPM performance      5.0 ms          3–10 ms      6.5 ms         4–12 ms
Silver         Emulates 10k RPM performance      8.0 ms          6–15 ms      9.5 ms         7–17 ms

Note: These results do not fully demonstrate the overall performance capabilities of VMAX3, which can achieve results higher than those achieved for this solution’s requirements and configuration. This solution was designed to meet very specific customer-driven requirements using a subset of the available VMAX3 hardware configuration.

Table 10 lists the test scenarios we used to validate this solution.

Table 10. Test scenarios

Test number: 1
Scenario: Performance test with OLTP workload
Description: Three hosts, each containing one active SQL Server instance. Ran OLTP workloads on four databases of 1 TB, 500 GB, 250 GB, and 50 GB. Each instance covered one of the SLO levels (Diamond, Platinum, Gold).

Test number: 2
Scenario: Performance test with DSS workload
Description: Two hosts, each containing one standalone SQL Server instance, with one 1.5 TB database on each node. Ran DSS workloads on the database with Gold SLO level.

Test number: 3
Scenario: Snapshot protection test with SnapVX
Description: Ran the OLTP workload against the four database sets on Gold SLO level and created snapshots every hour through Unisphere for VMAX. A maximum of eight snapshots was created during the test. Compared the performance during each snapshot period.

Scenario: Recovery test through snapshot
Description: Mounted one of the database snapshots to a second host for repurposing through Unisphere for VMAX3 to measure the recovery time.

To simulate workloads in a real-world OLTP and DSS environment, we used the following tools:

OLTP workload tool: Derived from an industry-standard, modern OLTP benchmark. It simulated the activity of a stockbroker trading system, such as managing customer accounts, executing customer trade orders, and other transactions within the financial markets. The majority of the I/O size was 8k with a 90:10 read/write ratio.

OLAP workload tool: Derived from an industry-standard DSS or OLAP benchmark. It simulated system functionality representative of complex business analysis applications for a wholesale supplier, through a set of 22 queries that were given a realistic context. The majority of the I/O size was between 64k and 512k, with a 100 percent read ratio.

The detailed test methodology was:

1. Ran the performance test against three OLTP instances on Gold, Platinum, and Diamond SLO levels respectively and reached a steady state.

2. Based on the previous step, ran the performance test against two DSS instances on a Gold SLO level, and reached a steady state. Monitored each SLO and host behavior for both OLTP and DSS. Measured and recorded the performance.

3. Ran the baseline performance test again against a Gold OLTP instance and reached a steady state. Created SnapVX snapshots every hour against the storage group for the Gold OLTP databases. Measured and recorded the performance during the snapshot period.

4. Performed the recovery test against one of the Gold OLTP database snapshots and mounted the snapshot on another host. Measured and monitored the recovery time.

The validation test results are highly dependent on workload, specific application requirements, design, and implementation. Relative system performance varies because of these and other factors. Therefore, the workloads used to validate this solution should not be used as a substitute for a specific customer application benchmark when critical capacity planning and product evaluation decisions are contemplated.

All performance data in this guide was obtained in a rigorously controlled environment. Results obtained in other operating environments may vary.

EMC Corporation does not warrant or represent that a user can or will achieve similar performance expressed in transactions per minute.

Note: The database metric transactions per second (TPS) is described and used within the test results. Because transactions differ greatly between database environments, these values should be used only as a reference and for comparative purposes within the test results.

Mixed workload test results

This section describes the test results for mixed workload validation for both OLTP and DSS environments. Table 11 shows the CPU and memory configuration for the test environment.

Table 11. CPU and memory reservation

Item                OLTP                                                    DSS
CPU reservation     24                                                      32
Memory reservation  32 GB virtual machines (30 GB reserved for SQL Server)  128 GB virtual machines (120 GB reserved for SQL Server)

To determine the performance of the SQL Server mixed workload on VMAX3, we used performance monitors, including Windows Perfmon, VMware ESXTOP, and EMC Unisphere for VMAX3, to measure and record the statistics.

The key metrics for OLTP in the mixed workload test are:

• Throughput in IOPS (transfers per second)

• Throughput in TPS

• Processor time percentage

The key metrics for DSS in the mixed workload test are:

• Bandwidth

• Processor time percentage

The key metrics on VMAX3 are:

• SLO response time in milliseconds

• VMAX3 front-end and back-end utilization as a percentage

• Utilization for each disk technology as a percentage

We ran a traditional OLTP workload continuously against three OLTP SQL Server instances on Gold, Platinum, and Diamond SLO levels. At the same time, we applied an industry-standard DSS workload to simulate a typical data warehousing environment and drive high bandwidth. The DSS workload was generated by eight consecutive query sets, each containing 22 T-SQL queries. The total test duration was about 10 hours.

Table 12 summarizes the high-level performance results for the mixed workload test.

Table 12. Summary of mixed workload test results

OLTP workload                            DSS workload
Host IOPS          107,310               Host bandwidth                     3,406 MB/s
TPS                8,574                 Avg. DSS query-set execution time  1 hour and 18 minutes
SLO response time  Diamond: < 2.1 ms     SLO response time                  < 12 ms
                   Platinum: < 2.8 ms
                   Gold: < 4.0 ms

As shown in Table 12, we achieved over 107,000 IOPS and 8,500 TPS in total for the three OLTP instances on different SLO levels while the DSS workloads were running. The SLO response time was kept within the expected compliance range, at a very low level.

We also achieved over 3,400 MB/s of bandwidth with the standard DSS workloads on the Gold SLO level, with each 22-query set of the DSS database completed within 1 hour and 18 minutes. The SLO response time for each DSS instance was kept within 12 ms.

Note: The test results were based on a specific number of disk resources for the VMAX3 system, which demonstrated both efficient and balanced utilization of the VMAX3 hardware resources.

Figure 14 shows the detailed IOPS and TPS results for the OLTP validation. We applied a similar workload profile on three different OLTP instances, and the results showed a highly scalable performance for each SLO level.

Figure 14. IOPS and TPS results for OLTP

For the Gold OLTP instance, which simulates a heavy I/O transactional environment, the corresponding SLO level rivals the performance of 15k SAS disks. We achieved over 27,000 IOPS and 2,322 TPS with host latency kept within 5 ms.

For the Platinum OLTP instance, which simulates a mission-critical OLTP workload, the corresponding SLO level delivers performance between that of 15k SAS and flash disks. We achieved over 33,000 IOPS and 2,759 TPS with host latency kept within 3 ms.

For the Diamond OLTP instance, which pursued extreme performance on VMAX3, the corresponding SLO level rivals the performance of pure flash disks. We achieved over 44,000 IOPS and 3,493 TPS with host latency kept to about 2 ms.
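As a quick cross-check, the per-instance figures above can be summed and compared with the totals in Table 12. The following is only an illustrative sketch: the variable names are ours, and the IOPS values are the rounded lower bounds quoted in the text ("over 27,000" and so on), so their sum is a floor rather than the measured total.

```python
# Per-instance OLTP results quoted in the text.
# IOPS values are "over N" lower bounds, not exact measurements.
oltp_results = {
    "Gold": {"iops_floor": 27_000, "tps": 2_322},
    "Platinum": {"iops_floor": 33_000, "tps": 2_759},
    "Diamond": {"iops_floor": 44_000, "tps": 3_493},
}

# Totals reported in Table 12.
TABLE_12_HOST_IOPS = 107_310
TABLE_12_TPS = 8_574

total_tps = sum(r["tps"] for r in oltp_results.values())
total_iops_floor = sum(r["iops_floor"] for r in oltp_results.values())

print(f"Sum of per-instance TPS: {total_tps}")
print(f"Sum of per-instance IOPS floors: {total_iops_floor}")
print(f"TPS matches Table 12: {total_tps == TABLE_12_TPS}")
print(f"IOPS floor consistent with Table 12: {total_iops_floor <= TABLE_12_HOST_IOPS}")
```

The per-instance TPS values add up exactly to the Table 12 total, and the sum of the IOPS lower bounds stays below the reported host IOPS, as expected for rounded-down figures.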

Figure 15 shows the SLO response time on VMAX3 during the performance test. The workloads were started before hour zero. After nearly two hours, the performance entered a steady state.

Figure 15. SLO response time on VMAX3

We kept all SLO response times within the corresponding SLO compliance range.

• For the Gold OLTP, the average SLO response time was less than 4 ms, which is compliant within the range of 3 to 10 ms.

• For the Platinum OLTP, the average SLO response time was slightly above 2.5 ms, which is compliant within the range of 2 to 7 ms.

• For the Diamond OLTP, the average SLO response time was about 2 ms, which is compliant within the range of 0 to 3 ms.
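The compliance check described above amounts to a simple range test. The sketch below is a minimal illustration, not part of the tested solution: the function name is hypothetical, the averages are stand-ins for the approximate values quoted in the text ("less than 4 ms", "slightly above 2.5 ms", "about 2 ms"), and treating the range as inclusive at both ends is our assumption.

```python
# Compliance ranges (ms) quoted in the text for each OLTP SLO level.
slo_ranges_ms = {"Gold": (3.0, 10.0), "Platinum": (2.0, 7.0), "Diamond": (0.0, 3.0)}

# Illustrative stand-ins for the approximate averages in the text;
# these are not measured data points.
approx_avg_ms = {"Gold": 4.0, "Platinum": 2.5, "Diamond": 2.0}


def is_compliant(response_ms: float, low_ms: float, high_ms: float) -> bool:
    """Return True when the response time lies inside the inclusive range."""
    return low_ms <= response_ms <= high_ms


for level, (low, high) in slo_ranges_ms.items():
    ok = is_compliant(approx_avg_ms[level], low, high)
    print(f"{level}: ~{approx_avg_ms[level]} ms in [{low}, {high}] ms -> {ok}")
```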

Table 13 shows the processor time on each OLTP instance. The CPU utilization on each OLTP instance was kept within 75 percent.

Table 13. OLTP instance processor time

SQL Server instance Processor time Target

Gold OLTP 31% Less than 75%

Platinum OLTP 41% Less than 75%

Diamond OLTP 49% Less than 75%

The results demonstrate that VMAX3 can easily handle over 107,000 IOPS and 8,500 TPS (depending on the transaction types) even under mixed OLTP and DSS workloads. Because the VMAX3 SLO feature sets the response-time target, each storage group is serviced within the specified compliance range, ensuring the performance level preset by the user.

On top of the OLTP workloads, we applied DSS workloads throughout the validation test. These consisted of eight identical query sets, giving a total test duration of about 10 hours. We monitored the bandwidth achieved on each SQL Server data warehousing instance and recorded the completion time of each query set.

The workload applied to the two Gold DSS instances is derived from an industry-standard OLAP benchmark to simulate a real-world data warehousing system. The actual bandwidth achieved in the solution was highly dependent on the queries derived from the DSS benchmark.

Table 14 shows the detailed test results for the DSS part of the mixed workload validation. The average bandwidth results were 1,670 MB/s and 1,736 MB/s for each DSS instance. The CPU utilization was over 60 percent, still within the target range.

Table 14. Test results for DSS workload

SQL Server instance  Avg. bandwidth (MB/s)  Processor time  Target         Avg. duration for each query set
Gold DSS 1           1,670                  61.2%           Less than 75%  1 hour and 18 minutes
Gold DSS 2           1,736                  63.1%           Less than 75%  1 hour and 17 minutes
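The DSS figures can be related to the summary values reported earlier with a little arithmetic. The sketch below is illustrative only and uses our own variable names: it adds the two per-instance bandwidth averages and multiplies the average query-set duration by the eight query sets run on each instance.

```python
from datetime import timedelta

# Per-instance averages for the two Gold DSS instances.
bandwidth_mb_s = {"Gold DSS 1": 1_670, "Gold DSS 2": 1_736}
avg_query_set = {
    "Gold DSS 1": timedelta(hours=1, minutes=18),
    "Gold DSS 2": timedelta(hours=1, minutes=17),
}
QUERY_SETS = 8  # eight consecutive query sets per instance

total_bandwidth = sum(bandwidth_mb_s.values())
print(f"Combined bandwidth: {total_bandwidth} MB/s")

for name, duration in avg_query_set.items():
    total = duration * QUERY_SETS
    hours = total.total_seconds() / 3600
    print(f"{name}: {hours:.1f} hours for {QUERY_SETS} query sets")
```

The combined bandwidth equals the 3,406 MB/s host bandwidth in Table 12, and eight query sets at roughly 1 hour and 18 minutes each is consistent with the total test duration of about 10 hours.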

Figure 16 shows the response time on each storage group of Gold SLO level DSS on the VMAX3 array. The SLO target was to keep the response time within a range of 3 to 12 ms for the data warehousing database.

Figure 16. Gold SLO level DSS response times

As Figure 16 shows, for most of the test period the response time of each storage group stayed at around 5 to 8 ms under full monitoring by the new VMAX3 SLO feature.

At the eighth hour of the validation test, the response time for Gold DSS 1 hit 12.3 ms, which exceeded the 12 ms threshold. VMAX3 automatically detected this occurrence and reallocated resources to keep the response time within the SLO target range. The response time dropped back within the next hour, without affecting the overall performance of the OLTP or DSS workloads.

We monitored the status of both the front-end and back-end of the VMAX3 array throughout the mixed workload test. We designed the entire system to provide full and balanced utilization on VMAX3.

The front-end CPU utilization was less than 15 percent, while the back-end CPU utilization was close to 20 percent. This demonstrates that VMAX3 is capable of handling a heavier workload.

In this solution, we equipped the VMAX3 array with the following usable disks: 64 flash and 160 10k SAS. The average disk utilization for the two disk technologies was over 90 percent for flash disks and 85 percent for SAS disks, as shown in Figure 17.

This confirms that the available disks were used close to maximum efficiency while maintaining the SLOs.
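To put the two utilization figures on a single scale, they can be combined into a count-weighted average. This is a rough sketch under stated assumptions: flash utilization is reported only as "over 90 percent", so 90 is used as a floor, and weighting by disk count is our own choice rather than a metric reported in the tests.

```python
# Disk counts and average utilization quoted in the text.
# Flash utilization is "over 90 percent", so 90.0 is a lower bound.
disks = {
    "flash": {"count": 64, "util_pct": 90.0},
    "10k SAS": {"count": 160, "util_pct": 85.0},
}

total_disks = sum(d["count"] for d in disks.values())
weighted_util = sum(d["count"] * d["util_pct"] for d in disks.values()) / total_disks

print(f"Total disks in use: {total_disks}")
print(f"Count-weighted average utilization: {weighted_util:.1f}%")
```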

Figure 17. Disk heat map

Note: In Figure 17, the disks in black are hot spares and NL_SAS drives, which are not used.

Figure 18 and Figure 19 show the detailed disk utilizations.

Figure 18. SAS disk utilization

Figure 19. Flash disk utilization

Based on the results shown in this section, we confirmed the following:

• The front-end and back-end directors achieved a balanced performance.

• The VMAX3 configuration used in this solution was not stressed by our workload, and still had enough headroom to support a heavier workload.

• We achieved favorable performance based on an efficient disk configuration that was almost fully utilized.

TimeFinder SnapVX test results

This section describes the results for performance testing with snapshot creation and recovery testing.

In this solution, we selected one OLTP storage group based on a Gold SLO level, which contained four databases (1 TB, 500 GB, 250 GB, 50 GB). We ran a standard modern OLTP workload on each database as the baseline. When the baseline performance entered a stable state, we enabled the scheduled task in Unisphere for VMAX3 to create hourly snapshots of this storage group. This simulated a multiple-database repurposing scenario. We set the maximum number of snapshots to eight. We measured the performance throughout the test.

During snapshot creation, we measured the performance impact as follows:

• Throughput in IOPS

• Throughput in TPS

• SLO response time (ms)

Figure 20 shows the results for the snapshot creation test.

Figure 20. Snapshot creation test results

For the baseline, we achieved 29,898 IOPS and 2,861 TPS, while keeping the host latency within 4 to 5 ms for all four databases.

When the hourly snapshots were created, we found that during each period, the performance result was maintained at about 30,000 IOPS and 2,900 TPS for the duration of the entire test, while the host latency was still kept within 4 to 5 ms.

As the number of snapshots increased, the performance was still not affected. The average snapshot creation time was within five seconds for four databases of 1.8 TB in total size.
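One way to quantify "not affected" is the relative change of each metric against the baseline. The sketch below is illustrative only: the helper name is hypothetical, and the snapshot-period values are the rounded "about 30,000 IOPS and 2,900 TPS" quoted in the text rather than exact measurements.

```python
def pct_change(baseline: float, observed: float) -> float:
    """Percentage change of observed relative to baseline."""
    return (observed - baseline) / baseline * 100.0


# Baseline values are the measured figures; the snapshot-period values
# are the rounded approximations quoted in the text.
baseline = {"iops": 29_898, "tps": 2_861}
during_snapshots = {"iops": 30_000, "tps": 2_900}

for metric in baseline:
    change = pct_change(baseline[metric], during_snapshots[metric])
    print(f"{metric.upper()}: {change:+.2f}% relative to baseline")
```

With these rounded inputs, both metrics stay within about 1.5 percent of the baseline, which is in line with the conclusion that hourly snapshot creation did not degrade performance.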

Figure 21 shows the SLO response time results throughout the eight hours of testing.

The latency was kept within 2.0 to 2.2 ms, without performance degradation.

Figure 21. SLO response times for test period

From these test results, we came to the following conclusions for the snapshot validation:

• SnapVX had no measurable performance impact on the SQL Server OLTP workloads in this test.

• The number of snapshots (up to the eight tested) did not affect the performance of the SQL Server databases.

• A SnapVX snapshot can be created within seconds and is then ready to use.

The recovery test was performed from the previously created snapshots. We mounted the oldest snapshot of the 1 TB database, which had the largest data changes, to a second host and measured the time consumed. We opted to link the target in no copy mode to be as space-efficient as possible. Table 15 shows the test results when creating snapshot links to target devices and mounting to the second host.

Table 15. Test results for snapshot recovery

Task                                                   Recovery time         Total database size  Data changes
Snap link to target in no copy mode and mount to host  2 minutes 54 seconds  1 TB                 397 GB

Notes:

No copy: Creates a temporary, space-saving snapshot of only the changed data on the snapshot's storage resource pool. Target volumes linked in this mode will not retain data after the links are removed. This is the default mode.

Copy: Creates a permanent, full-volume copy of the data on the target volume's storage resource pool. Target volumes linked in this mode will retain data after the links are removed.

During testing, the VMAX3 snapshot was linked to the target device in no copy mode and mounted to a host with a SQL Server instance installed. The results show that the database, with a total size of 1 TB and 397 GB of data changes, can be recovered and mounted to the host in under three minutes.
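The Table 15 figures can be restated in seconds and as a share of changed data. This is a small illustrative calculation with our own variable names; treating 1 TB as 1,024 GB is an assumption, because the source does not state whether decimal or binary units were used.

```python
# Figures from Table 15.
recovery_seconds = 2 * 60 + 54  # 2 minutes 54 seconds
database_gb = 1_024             # assumption: 1 TB taken as 1,024 GB
changed_gb = 397

changed_fraction = changed_gb / database_gb

print(f"Recovery time: {recovery_seconds} s ({recovery_seconds / 60:.1f} min)")
print(f"Changed data: {changed_fraction:.1%} of the database")
```

Under that unit assumption, a little under 40 percent of the database had changed since the snapshot was taken, and the link-and-mount operation still completed in less than 180 seconds.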
