DEPLOYMENT BEST PRACTICE FOR
MICROSOFT SQL SERVER WITH
VMAX
3
SLO MANAGEMENT
EMC
®
VMAX
®
Engineering White Paper
ABSTRACT
With the introduction of the third generation VMAX disk arrays, Microsoft SQL Server administrators have a new way to deploy a wide range of applications in a single high-performance, high capacity, self-tuning storage environment that can
dynamically manage each application’s performance requirements with minimal effort. January, 2015
To learn more about how EMC products, services, and solutions can help solve your business and IT challenges, contact your local representative or authorized reseller, visit www.emc.com, or explore and compare products in the EMC Store
Copyright © 2015 EMC Corporation. All Rights Reserved.
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.
The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.
Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.
TABLE OF CONTENTS
EXECUTIVE SUMMARY ... 4
AUDIENCE ... 4
VMAX
3PRODUCT OVERVIEW ... 5
VMAX
3Overview ... 5
VMAX
3and Service Level Objective (SLO) based provisioning ... 6
STORAGE DESIGN PRINCIPLES FOR MIROSOFT SQL SERVER ON VMAX
310
Storage connectivity considerations ... 10
Physical Host connectivity considerations ... 10
Number and size of host devices considerations ... 11
Virtual Provisioning and thin devices considerations ... 11
Microsoft SQL Server and data skew ... 12
Microsoft SQL Server data types and the choice of SLO ... 12
Host I/O Limits and multi-tenancy ... 15
Using cascaded storage groups ... 16
SQL SERVER DATABASE PROVISIONING ... 18
Storage provisioning tasks with VMAX
3... 18
Provisioning SQL Server database storage with Unisphere ... 19
Provisioning SQL Server database storage with Solutions Enabler CLI ... 24
MICROSOFT SQL SERVER SLO MANAGEMENT TEST USE CASES ... 25
Test Configuration ... 25
Test Overview... 26
Use case 1 – Single database run with change in SLO ... 27
Use case 2 – Competing database runs with changein SLO ... 28
Use case 3 – All flash configuration with data files and transaction log ... 30
EXECUTIVE SUMMARY
VMAX3 family of storage arrays is the next major step in evolving VMAX hardware and software targeted to meet new industry
challenges of scale, performance, and availability. At the same time, VMAX3 has taken a leap in making complex operations of
storage management, provisioning, and setting performance goals, simple to execute and manage.
VMAX3 family of storage arrays come pre-configured from factory to simplify deployment at customer sites and minimize time to first
I/O. Each array uses Virtual Provisioning to allow the user easy and quick storage provisioning. While VMAX3 can ship as an all-flash
array with the combination of EFD1 (Enterprise Flash Drives) and large cache that accelerates both writes and reads even farther, it
also excels in providing FAST2 (Fully Automated Storage Tiering) enabled performance management based on service level goals
across multiple tiers of storage. VMAX3 new hardware architecture comes with more CPU power, larger persistent cache, and a new
Dynamic Virtual Matrix dual InfiniBand fabric interconnect that creates an extremely fast internal memory-to-memory and data-copy fabric.
Many enhancements were introduced to VMAX3 replication software to support new capabilities such as TimeFinder SnapVX local
replication to allow for hundreds of snapshots that can be incrementally refreshed or restored, and can cascade any number of times. Also SRDF remote replication software adds new features and capabilities that provide more robust remote replication support. VMAX3 adds the ability to connect directly to Data Domain system so database backups can be sent directly from the primary storage
to the Data Domain system without having to go through the host first. VMAX3 also offers embedded file support (eNAS) via a new
hypervisor layer new to VMAX3, in addition to traditional block storage.
This white paper explains the basic VMAX3 design changes with regard to storage provisioning and performance management, and
how they simplify the management of storage and affect Microsoft SQL Server database layout decisions. It explains the new FAST architecture for managing SQL Server databases performance using Service Level Objectives (SLOs) and provides guidelines and best practices for its use.
AUDIENCE
This white paper is intended for database and system administrators, storage administrators, and system architects who are responsible for implementing, managing, and maintaining SQL Server databases and VMAX3 storage systems. It is assumed that
readers have some familiarity with Microsoft SQL Server and the EMC VMAX3 family of storage arrays, and are interested in achieving
higher database availability, performance, and ease of storage management.
1 Enterprise Flash Drives (EFD) are SSD Flash drives that are designed for high performance and resiliency suited for enterprise applications. 2 Fully Automated Storage Tiering (FAST) allows VMAX3 storage to automatically and dynamically manage performance service level goals across the
VMAX
3PRODUCT OVERVIEW
VMAX
3OVERVIEW
The EMC VMAX3 family of storage arrays are built on the strategy of simple, intelligent, modular storage, and incorporates a Dynamic
Virtual Matrix interface that connects and shares resources across all VMAX3 engines, allowing the storage array to seamlessly grow
from an entry-level configuration into the world’s largest storage array. It provides the highest levels of performance and availability featuring new hardware and software capabilities.
The newest additions to the EMC VMAX3 family, VMAX 100K, 200K and 400K, deliver the latest in Tier-1 scale-out multi-controller
architecture with consolidation and efficiency for the enterprise. With enhanced hardware and software, the new VMAX3 array
provides unprecedented performance and scale. It offers dramatic increases in floor tile density (GB/Ft2) with engines and high
capacity disk enclosures for both 2.5" and 3.5" drives, consolidated in the same system bay. Figure 1 shows possible VMAX3
components. Refer to EMC documentation and release notes to find the most up to date supported components.
In addition, VMAX3 arrays can be configured as either hybrid or all- flash arrays. All VMAX3 models come pre-configured from the
factory to significantly shorten the time from installation to first I/O. See sample configuration below:
Figure 1 VMAX3 storage array3
VMAX3 engines provide the foundation to the storage array. Each fully redundant engine contains two VMAX3 directors and redundant
interfaces to the new Dynamic Virtual Matrix dual InfiniBand® fabric interconnect. Each director consolidates front-end, global memory, and back-end functions, enabling direct memory access to data for optimized I/O operations. Depending on the array chosen, up to eight VMAX3 engines can be interconnected via a set of active fabrics that provide scalable performance and high
availability. New to VMAX3 design, host ports are no longer mapped directly to CPU resources. CPU resources are allocated as needed
using pools (front-end, back-end, or data services pools) of CPU cores which can service all activity in the VMAX3 array. This ‘shared’
multi-core architecture reduces I/O path latencies by facilitating system scaling of processing power without requiring additional drives or front-end connectivity.
VMAX3 arrays introduce the industry’s first open storage and hypervisor converged operating system, HYPERMAX OS. It combines
industry-leading high availability, I/O management, data integrity validation, quality of service, and storage tiering and data security with an open application platform. HYPERMAX OS features a real-time, non-disruptive, storage hypervisor that manages and protects embedded data services (running in virtual machines)by extending VMAX high availability to these data services that traditionally
• 1 – 8 redundant VMAX3 Engines
• Up to 4 PB usable capacity
• Up to 256 FC host ports
• Up to 16 TB global memory (mirrored)
• Up to 384 Cores, 2.7 GHz Intel Xeon E5-2697-v2
• Up to 5,760 drives
• SSD Flash drives 200/400/800 GB 2.5”/3.5”
• 300 GB – 1.2 TB 10K RPM SAS drives 2.5”/3.5”
• 300 GB 15K RPM SAS drives 2.5”/3.5”
All storage in the VMAX3 array is virtually provisioned. VMAX Virtual Provisioning enables users to simplify storage management and
increase capacity utilization by sharing storage among multiple applications and only allocating storage as needed from a shared pool of physical disks known as Storage Resource Pool (SRP). The array uses the dynamic and intelligent capabilities of Fully Automated Storage Tiering (FAST) to meet specified Service Level Objectives (SLOs) throughout the lifecycle of each application. VMAX3 SLOs and SLO provisioning are new to the VMAX3 family, and tightly integrated with EMC FAST software to optimize agility
and array performance across all drive types in the system. While VMAX3 can ship in an all-flash configuration, when purchased with
hybrid drive types as a combination of flash and hard drives, EMC FAST technology can improve application performance and at the same time reduce cost by intelligently using a combination of high-performance flash drives with cost-effective high-capacity hard disk drives.
For local replication, VMAX3 adds a new feature to TimeFinder software called SnapVX, which provides support for a greater number
of snapshots. Unlike previous VMAX snapshots, SnapVX snapshots do not require the use of dedicated target devices. SnapVX allows for up to 256 snapshots per individual source. These snapshots can copy (referred to as link-copy) their data to new target devices and re-link to update just the incremental data changes of previously linked devices. For remote replication, SRDF adds new capabilities and features to provide protection for SQL Server databases and applications. All user data entering VMAX3 is T10 DIF
protected, including replicated data and data on disks.
VMAX
3AND SERVICE LEVEL OBJECTIVE (SLO) BASED PROVISIONING
Introduction to FAST in VMAX
3With VMAX3, FAST (Fully Automated Storage Tiering) is enhanced to include both intelligent storage provisioning and performance
management, using Service Level Objectives (SLOs). SLOs automate the allocation and distribution of application data to the correct data pool (and therefore storage tier) without manual intervention. Simply choose the SLO (for example, Platinum, Gold, or Silver), that best suites the application requirements. SLOs are tied to expected average I/O latency for both reads and writes and therefore, both the initial provisioning and application on-going performance are automatically measured and managed based on compliance to storage tiers and performance goals. FAST continuously samples the storage activity every 10 minutes and when necessary moves data at FAST’s sub-LUN granularity, which is 5.25MB (42 extents of 128KB). SLOs can be dynamically changed at any time (be promoted or demoted), and FAST continuously monitors and adjusts data location at the sub-LUN granularity across the available storage tiers to match the performance goals provided. All this is done automatically, within the VMAX3 storage array, without having
to deploy complex application ILM4 strategy or use host resources for migrating data due to performance needs.
VMAX
3FAST Components
Figure 2 depicts the elements of FAST that form the basis for SLO based management, as described below.
Physical disk group provides grouping of physical storage (Flash, or hard disk drives) based on drive types. All drives in a disk group have the same technology, capacity, form factor, and speed. The disk groups are pre-configured at the factory, based on the specified configuration requirements at the time of purchase.
Data Pool is a collection of RAID protected internal devices (also known as TDATs, or thin data devices) that are carved out of a single physical disk group. Each data pool can belong to a single SRP (see definition below), and provides a tier of storage based on its drive technology and RAID protection. Data pools can allocate capacity for host devices or replications. Data pools are also pre-configured at the factory to provide optimal RAID protection and performance.
Storage Resource Pool (SRP) is a collection of data pools that provides FAST a domain for capacity and performance
management. By default, a single default SRP comes factory pre-configured. Additional SRPs can be created with an EMC service engagement. The data movements performed by FAST are done within the boundaries of the SRP and are covered in detail later in this paper.
Storage Group (SG) is a collection of host devices (LUNs) that consume storage capacity from the underlying SRP. Because both FAST and storage provisioning operations are managed at a storage group level, storage groups can be cascaded (hierarchical) to
Host devices (LUNs) are the components of a storage group. In VMAX3 all host devices are virtual and at the time of creation can
be fully allocated or thin. Virtual means that they are a set of pointers to data in the data pools, allowing FAST to manage the data location across data pools seamlessly. Fully allocated means that the device’s full capacity is reserved in the data pools even before the host has access to the device. Thin means that although the host sees the LUN with its full reported capacity, in reality no capacity is allocated from the data pools until explicitly written to by the host. All host devices are natively striped across the data pools where they are allocated with granularity of a single VMAX3 track size, which is 128KB.
Service Level Objectives (SLO) provides a pre-defined set of service levels (such as Platinum, Gold or Silver) that can be supported by the underlying SRP. Each SLO has a specific performance goal and some have also tier5-compliance goals from the
underlying SRP that FAST will work to satisfy. For example, Bronze SLO will attempt to have no data on EFDs and Platinum SLO will attempt to have no data on 7.2k rpm drives. An SLO defines an expected average response time target for a storage group. By default, all host devices and all storage groups are attached to the Optimized SLO, (which will assure I/Os are serviced from the most appropriate data pool for their workload), but in cases where more deterministic performance goals are needed, specific SLOs can be specified.
Figure 2 VMAX3 architecture and service level provisioning
5 Storage tiers refer to combinations of disk drive and RAID protection that create unique storage service levels. For example Flash (SSD) drives with
Service Level Objectives (SLO) and Workload Types Overview
Each storage resource pool (SRP) contains a set of known storage resources, as seen in Figure 2. Based on the available resources in the SRP, HYPERMAX OS will offer a list of available Service Level Objectives (SLOs) that can be met using this particular SRP, as shows in Table 1. This assures SLOs can be met, and that SRPs aren’t provisioned beyond their ability to meet application requirements.
Note: When certain drive types are not present in the storage, the associated SLOs will not be shown. For example, if no 7.2 RPM drives are available in the system, Bronze SLO will not be offered. Likewise, if EFDs are not present in the SRP, the Diamond SLO will not be offered.
Note: Since SLO’s are tied to the available drive types, it is important to plan the requirements for a new VMAX3 system carefully.
EMC works with customers using a new and easy to use Sizer tool to assist with this task.
Table 1 Service Level Objectives
SLO Minimum required drive combinations to list SLO
Performance expectation
Diamond EFD Emulating EFD performance
Platinum EFD and (15K or 10K) Emulating performance between 15K drive and EFD Gold EFD and (15K or 10K or 7.2K) Emulating 15K drive performance
Silver EFD and (15K or 10K or 7.2K) Emulating 10K drive performance Bronze 7.2K and (15K or 10K) Emulating 7.2K drive performance Optimized Any System optimized performance
By default, no specific SLO needs to be selected as by default all data in the VMAX3 storage array receives Optimized SLO. System
Optimized SLO meets the performance and compliance requirements by dynamically placing the most active data in the highest performing tier and less active data in low performance high capacity tiers.
Note: Optimized SLO offers an optimal balance of resources and performance across the whole SRP, based on I/O load, type of I/Os, data pool utilization, and available capacities in the pools. It will place the most active data on higher performing storage and least active data on the most cost-effective storage. If data pools’ capacity or utilization is stressed it will attempt to alleviate it by using other pools.
However, when specific storage groups (database LUNs) require a more deterministic SLO, one of the other available SLOs can be selected. For example, a storage group holding critical SQL Server data files can receive a Diamond SLO while the logs can be put on Platinum. A less critical application can be fully contained in Gold or Silver SLO. Refer also to MICROSOFT SQL SERVER DATA TYPES AND THE CHOICE OF SLO section later in the paper.
Once an SLO is selected (other than Optimized), it can be further qualified by a Workload type: OLTP or DSS, where OLTP workload is focused on optimizing performance for small block I/O and DSS workload is focused on optimizing performance for large block I/O. The Workload Type can also specify whether to account for any overhead associated with replication (local or remote). The workload type qualifiers for replication overhead are OLTP_Rep and DSS_Rep, where ‘Rep’ denotes replicated.
Understanding SLO Definitions and Workload Types
Each SLO is effectively a reference to an expected response-time range (minimum and maximum allowed latencies) for host I/Os, where a particular Expected Average Response Time is attached to each SLO and workload combination. The Solutions Enabler CLI or Unisphere for VMAX can list the available service levels and workload combinations, as seen in Figure 3 (see command line syntax example to list available SLOs in the Appendix). They only list the expected average latency, not the range of values.
Without a workload type, the latency range is the widest for its SLO type. When a workload type is added, the range is reduced, due to the added information. When Optimized is selected (which is also the default SLO for all storage groups, unless the user assigns another), the latency range is in fact the full latency spread that the SRP can satisfy, based on its known and available components.
Figure 3 Unisphere shows available SLOs
Important SLO considerations:
• Because an SLO references a range of target host I/O latencies, the smaller the spread the more predictable is the result. It is therefore recommended to select both an SLO as well as a workload type. For example: Platinum SLO with OLTP workload and no replications.
• Because an SLO references an Expected Average Response Time, it is possible for two applications executing a similar workload and set with the same SLO to perform slightly different. This can happen if the host I/O latency still falls within the allowed range. For that reason it is recommended to use a workload type together with an SLO when a smaller range of latencies is desirable.
Note: SLOs can be easily changed using Solutions Enabler or Unisphere for VMAX online. Also, when necessary to add additional layers of SLOs, the Storage Group (SG) can be easily changed into cascaded SG so each child or the parent can receive its appropriate SLO.
STORAGE DESIGN PRINCIPLES FOR MIROSOFT SQL SERVER ON VMAX
3 VMAX3 storage provisioning has become much simpler than previous releases. Since VMAX3 physical disk groups, data pools, andeven the default SRP come pre-configured from factory based on inputs to the Sizer tool that helps size them correctly, the only thing required is configure connectivity between your hosts and the VMAX3 and then start provisioning host devices.
The following sections discuss the principles and considerations for storage connectivity and provisioning for Microsoft SQL Server.
STORAGE CONNECTIVITY CONSIDERATIONS
When planning storage connectivity for performance and availability it is recommended to ‘go-wide before going deep’, which means it is better to connect storage ports across different engines and directors6 than to use all the ports on a single director. In this way,
even in a case of a component failure, the storage can continue to service host I/Os.
New to VMAX3 is dynamic core allocation. Each VMAX3 director provides services such as front-end connectivity, backend
connectivity, or data management. Each such service has its own set of cores on each director that are pooled together to provide CPU resources which can be allocated as necessary. For example, even if host I/Os arrive via a single front-end port on the director, the front-end pool with all its CPU cores will be available to service that port. As I/Os arriving to other directors will have their own core pools, again, for best performance and availability it is recommended to connect each host to ports on different directors before using additional ports on the same director.
PHYSICAL HOST CONNECTIVITY CONSIDERATIONS
Host connectivity considerations include two aspects. The first is the number and speed of the HBA ports (initiators) and the second is the number and size of host devices.
HBA ports considerations:
Each HBA port (initiator) creates a path for I/Os between the host and the SAN switch which then continues to the VMAX3 storage. If
a host was to only use a single HBA port it will have a single I/O path that has to serve all I/Os. Such design is not advisable as a single path doesn’t provide high-availability, and also risks a potential bottleneck during high I/O activity for the lack of additional ports for load-balancing.
A better design provides each database server at least two HBA ports, preferably on two separate HBAs. The additional ports provide more connectivity and also allow multipathing software like EMC PowerPath or Microsoft Multipath I/O (MPIO), to load-balance and failover across HBA paths.
Each path between host and storage device creates a SCSI device representation on the host. For example, two HBA ports going to two VMAX front-end adapter ports with a 1:1 relationship create 3 presentations for each host device: one for each path and another that the multipathing software creates as an “EMC SYMMETRIX Multi-Path Disk Device” (PowerPath System Devices). If each HBA port was zoned and masked to both FA ports (1: many relationship) there will be 5 SCSI device representations for each host device (one for each path combination + pseudo device).
While modern operating systems can manage hundreds of devices, it is not advisable or necessary, and it burdens the user with complex tracking and storage provisioning management overhead. It is therefore recommended to have enough HBA ports to support workload concurrency, availability, and throughput, but use 1:1 relationships to storage front-end ports, and not have each HBA port zoned and masked to all VMAX front-end ports. Such approach provides enough connectivity, availability, and concurrency, yet reduces the complexity of the host registering lots of SCSI devices unnecessarily.
6 Each VMAX3 engine has two redundant directors
NUMBER AND SIZE OF HOST DEVICES CONSIDERATIONS
VMAX3 introduces the ability to create host devices with a capacity from a few megabytes to multiple terabytes. With the native
striping across the data pools that VMAX3 provides, the user may be tempted to create only a few very large host devices. Think
about the following example: a 1TB Microsoft SQL Server database can reside on a 1 x 1TB host device, or perhaps on 10 x 100GB host devices. While either option satisfies the capacity requirement, it is recommended to use reasonable number of host devices and size. In the example above, if the database capacity was to rise above 1TB, it is likely that the DBA will want to add another device of the same capacity even if they didn’t need 2TB in total. Therefore, large host devices create very large building blocks when additional storage is needed.
Secondly, each host device creates its own host I/O queue at the operating system. Each such queue can service a tunable, but limited, number of I/Os that can be transmitted simultaneously. If, for example, the host had 4 HBA ports, and a single 1TB LUN (using the previous example again), with multipathing software it will have only 4 paths available to queue I/Os. A high level of database activity will generate more I/Os than the queues can service, resulting in artificially elongated latencies. In this example two or more host devices are advisable to alleviate such an artificial bottleneck. Host software such as EMC PowerPath, or Windows perfmon can help in monitoring host I/O queues to make sure the number of devices and paths is adequate for the workload. Another benefit of using multiple host devices is that internally, the storage array can use more parallelism when operations such as FAST data movement or local and remote replications take place. By performing more copy operations simultaneously, the overall operation takes less time.
While there is no one magic number to the size7 and number of host devices, we recommend finding a reasonable low number that
offers enough concurrency, provides an adequate building block for capacity when additional storage is needed, and doesn’t become too large to manage.
VIRTUAL PROVISIONING AND THIN DEVICES CONSIDERATIONS
All VMAX3 host devices are Virtually Provisioned (also known as Thin Provisioning), meaning they are merely a set of pointers to
capacity allocated at 128KB extent granularity in the storage data pools. However, to the host they look and respond just like regular LUNs. Using pointers enables FAST to move the application data between the VMAX3 data pools without affecting the host. It also
allows better capacity efficiency for TimeFinder snapshots by sharing of extents when data doesn’t change between snapshots. Virtual provisioning offers a choice of whether to fully allocate the host device capacity, or allow it to do allocation on-demand. A fully
allocated device consumes all its capacity in the data pool on creation, and therefore, there is no risk that future writes may fail if the
SRP has no capacity left8. On the other hand, allocation on-demand allows over-provisioning, meaning that although the storage
devices are created and look to the host as available with their full capacity, actual capacity is only allocated in the data pools when host writes occur. This is a common cost saving practice.
Allocation on-demand is suitable in situations when: • Application’s capacity growth rate is unknown, and
• The user prefers to not commit large amounts of storage ahead of time, as it may never get used, and • The user prefers to not disrupt host operations at a later time by adding more devices.
Therefore, if ‘allocation on demand’ is leveraged capacity will only be physically assigned as it is needed to meet application requirements.
7 Follow best practice to Quick format NTFS volumes using Microsoft recommended 64KB allocation unit size for data, logs, TEMPDB.
8 FAST allocates capacity in the appropriate data pools based on the workload and SLO. However, when a data pool is full, FAST may use other pools in
In general when data files are created, Microsoft SQL Server pre-allocates capacity, by writing to every page, with contiguous zeros. When allocation on-demand is used, it is best to deploy a strategy where database capacity is grown over time based on actual need. For example, if SQL Server was provisioned with a thin device of 2TB, rather than immediately creating data files of 2TB and
consuming all its space, the DBA should use auto-growth feature that consume only the capacity as needed.
Note: In cases where Windows Instant File Initialization (IFI) is involved, allocation of datafiles happens in a thin-pool friendly way. Areas of a disk under which a sparse file is defined, as created by Instant File Initialization, are not zeroed. As table and index information is written to a fully initialized data file, areas of the database become allocated and used by non-zero user data. SQL Server automatically uses Windows Instant File Initialization (IFI), provided the service account under which the SQL Server service is running has Perform volume maintenance tasks permission under the local security policy. By default, only administrators have this permission. Information regarding the Instant File Initialization functionality (IFI) is provided in the Microsoft SQL Server Books online product documentation.
MICROSOFT SQL SERVER AND DATA SKEW
SQL Server databases perform both random and sequential I/O operations. Skewed data access is a phenomenon that is present in most databases. Skewing is when a small number of SQL Server LUNs, or the data files resident on those LUNs, receive a large percentage of the I/O from the applications. Data access skewing also commonly occurs in the database environments as most recently created data is typically accessed more frequently. As data ages the relative importance becomes lower and hence activity level on that also drops. This skewed type of activity can be transient, that is to say lasting only a short period of time, or persistent and lasting much longer periods of time. Skewed, persistent access to storage allocations makes those allocations good candidates for promotion to higher-performing storage tier. Skewing by its nature means that there are SQL Server data structures that are receiving lower activity than others. If the storage allocations servicing those structures are persistently receiving fewer requests than others, then they are good candidates for FAST to down-tier. By using sub-LUN level FAST, bottlenecks based on skewed workloads to a limited set of spindles are avoided completely.
MICROSOFT SQL SERVER DATA TYPES AND THE CHOICE OF SLO
The following sections describe considerations for various SQL Server data types and selection of SLOs to achieve desired performance.
Planning SLO for SQL Server databases
VMAX3 storage arrays can support many enterprise applications, together with all their replication needs and auxiliary systems (such
as test, development, reporting, patch-testing, and others). With FAST and Service Level Objective (SLO) management, it is easy to provide the right amount of resources to each such environment with ease, and modify it as business priorities or performance needs change over time. This section discusses some of the considerations regarding different SQL Server data types and SLO assignment for them.
When choosing SLO for the SQL Server databases, consider the following:
• While FAST operates at a sub-LUN granularity to satisfy SLO and workload demands, the SLO is set at a storage group granularity (a group of devices). It is therefore important to match the storage group to sets of devices of equal application and business priority.
• Consider that with VMAX3 all writes go to the cache, which is persistent, and thus uses lazy writer to the backend storage.
Therefore, unless other reasons are in play (such as synchronous remote replications, long I/O queues, or a system that is over-utilized), write latency should always be very low (cache-hit), regardless of the SLO or disk technology storing the
In general, EMC recommends for mission critical databases to separate the data files, the log files, TEMPDB to distinct sets of devices or child storage groups will allow setting different SLOs, thereby allowing I/O latency and manageability:
• SQL Server data files: The primary data file (.mdf), secondary data files (.ndf), and a transaction log file (.ldf), are all associated with every SQL Server database. Microsoft SQL Server database environments are created by defining one or more filegroups (the PRIMARY file group will always exist). Each file group is subsequently defined to exist on one or more data files located on NTFS volumes. Due to the proportional fill mechanism used by SQL Server, data files within a given filegroup will generate almost identical I/O patterns. This highly correlated workload is primary motivation for the best practice recommendation to utilize multiple data files within a filegroup to distribute I/O load.
• SQL Server transaction logs: Active transaction log files are formatted when a Microsoft SQL Server database is created. At the time of creation, every single page of the log files is written to so that the log files become fully provisioned when they are initialized. The log file contains mostly contiguous zeroes when first written, but over time transactions are written to the log files and later truncated based on log backup or checkpoint operations. Transaction logs are written to for all activities such as data modifications, data loads, index rebuilds. Storage allocations used for SQL Server transaction log files are based on a different style of I/O activity. Pre-written space on the devices provides optimum write performance, as the allocations are dedicated and available for the SQL Server transaction log file. Overall performance of a highly active SQL Server database servicing an Online Transaction Processing (OLTP) workload can be adversely affected by long latencies to the transaction log. Usually latency counters less than 10ms is good, while anything over 50ms indicates I/O bottleneck.9
Latencies result from the Write Ahead Logging (WAL) feature of SQL Server. WAL ensures that SQL Server returns to a transactionally consistent state even after a server or system outage. The WAL feature persists all updates and inserts to the transaction log before committing them to the data files. The combination of the current state of the data files and the transaction log allows SQL Server to recover in the event of a failure. This functionality requires that a commit for a transaction must be persisted to the log file before the transaction returns a status to the user process. Longer latencies for the log writes will reduce user process performance. Transaction log writes are small I/O which are cached by VMAX3 and
also coalesced by VMAX3 for optimized disk writes. Frequency of the backup and truncation of the transaction logs would
result in large sequential reads so latency of such I/O will determine overall performance of the database.
• SQL Server TEMPDB: TEMPDB is a system database that is used to maintain temporary sort areas, stored procedures, etc. Of all the system databases, TEMPDB can be the destination of significant I/O load. The usage cycle of Microsoft SQL Server TEMPDB storage deployed on virtually provisioned storage is interesting to follow and understand. Initially, as TEMPDB is simply another SQL Server database the rules for data files and log file apply. The transaction log files will be fully allocated at creation, and data files will be allocated on an as-needed basis. The TEMPDB system database is utilized by SQL Server to provide semi-persistent storage for certain operations. The amount of activity to the TEMPDB data and log files may be significant in certain environments, depending on the style of Transact-SQL statements being executed and certain SQL Server functionality. SQL Server recreates the TEMPDB storage when a new instance of the SQL Server environment is initiated. Even if there is no viable persistent data stored within TEMPDB, the thin extents will remain allocated, when SQL Server instance is restarted, and the TEMPDB is re-initialized. The total thin extent allocation will be based on the maximum TEMPDB utilization, but will never exceed the maximum size of the TEMPDB10 itself. Depending on the usage of TEMDB – for
large search, batch jobs, small table transactions or online indexing operations, the TEMPDB may experience large
sequential IO or small random I/O workload. Most of the usage patterns of TEMPDB will follow very low amount of locality of reference hence so such I/O may experience low cache hit ratios and hence higher disk IOs making disk latencies very critical for TEMPDB operations. Keeping TEMPDB in a separate storage group will allow for appropriate SLO selection for TEMPDB.
The following section will address SLO considerations for these data types.
SLO considerations for SQL Server data files
A key part of performance planning for the SQL Server database is understanding the business priority of the application it serves, and with large databases it can also be important to understand the structure of the associated data files. A default SLO can be used for the whole database for simplicity, but when more control over database performance is necessary, a distinctive SLO should be used, together with a workload type.
The choice of workload type is rather simple; for databases focused on sequential reads/writes, a DSS type should be used. For
databases that either focus on transactional applications (OLTP), or mixed workloads such as both transactional and reporting, an
OLTP type should be used. If storage remote replication is used (SRDF), add: with Replications to the workload type.
When to use Diamond SLO: Diamond SLO is only available when EFDs are available in the SRP. It tells FAST to move all the allocated storage extents in that storage group to EFDs, regardless of the I/O activity to them. Diamond provides the best read I/O latency as flash technology is best for random reads. Diamond is also popular for mission critical databases servicing many users, where the system is always busy, or even when each group of users start their workload intermittently and expect high-performance with low-latency. By having the whole storage group using EFDs, it doesn’t matter when a user becomes active to provide them with best performance.
When to use Bronze SLO: Bronze SLO doesn’t allow the storage group to leverage EFDs, regardless of the I/O activity. It is a good choice for databases that don’t require stringent performance and should let more critical applications utilize capacity on EFDs. For example, databases can use a Bronze SLO when their focus is development, test, and reports. Another use for Bronze SLO is for gold copies of the database.
When to use Optimized SLO: Optimized SLO is a good default when FAST should make the best decisions based on actual workload and for the storage array as a whole. Because Optimized SLO uses the widest range of allowed I/O latencies, FAST will attempt to give the active extents in the storage group the best performance (including EFDs if possible). However, if there are competing workloads with explicit SLO, they may get priority for the faster storage tiers, based on the lower latency requirements other SLOs have.
When to use Silver, Gold, or Platinum SLO: as explained earlier, each SLO provides a range of allowed I/O latency that FAST will work to maintain. Provide the SLO that best fits the application based on business and performance needs. Refer to Table 1 and Figure 3 to determine the desirable SLO.
SLO considerations for SQL Server log files
Typically the I/O generated to a transaction log is sequential write stream, with occasional read entries when user transactions abort. Sizing I/O capacity for a LUN that serves as storage for a transaction log is therefore different from that required for a data file. SQL Server uses a write-ahead logging mechanism (WAL), where change descriptions (log records) related to the current transaction are first committed to the transaction log file, before it is written to the data file. Once there are sufficient number of dirty pages, the lazy writer or the checkpoint process then write it to the disk. The checkpoint runs periodically to make sure that the dirty pages in cache are kept to a small number, if you need to recover the database. VMAX3 provides support for transaction log write operations
from global memory. All write operations are acknowledged as complete when they are saved to cache. All cache is mirrored and is protected by the vaulting process should the need arise. Write operations received by the VMAX3 environment are persisted to disk,
and provide full support for SQL Server WAL requirements. Since the cache acknowledges all write operations, VMAX3 depends less
on the physical storage required to service the write stream. VMAX3 may coalesce multiple discrete write operations into a small
number of larger write operations, which improves the efficiency of transaction log writes while reducing the I/O demand. Since all writes in VMAX3 go to cache, the SLO has limited effect on log performance. Sizing and performance for the transaction log volumes
is generally easily met, by setting the storage group to any SLO since they are write latency critical, almost instantly acknowledged by VMAX3 cache, not backend storage technology.
SLO considerations for SQL Server TEMPDB
TEMPDB usage is an important aspect for more customer configurations, and the considerations described for the user database will apply to the TEMPDB as well.
SLO considerations for All-Flash workloads
When a workload either requires a predictable low-latency / high-IOPS performance, or perhaps when many users with intermittent workload peaks use a consolidated environment, each requiring high-performance during their respective activity time, an All-Flash performance is suitable. All-Flash deployment is also suitable when data center power and floor-space are limited, and high performance consolidated environment is desirable.
Note: unlike All-Flash appliances, VMAX3 offers a choice of a single EFD tier, or multiple tiers. Since most databases require
additional capacity for replicas, test/dev environments and other copies of the production data, consider a hybrid array for these replicas, and simply assign the production data to the Diamond SLO.
SLO considerations for noisy neighbor and competing workloads
In highly consolidated environments, many databases and applications compete for storage resources. FAST can provide each with the appropriate performance when specific SLO and workload types are specified. By using different SLOs for each such application (or group of applications), it is easy to manage such a consolidated environment, and modify the SLOs when business requirements change. Refer to the next section for additional ways of controlling performance in a consolidated environment.
HOST I/O LIMITS AND MULTI-TENANCY
The Host I/O Limits quality of service (QOS) feature was introduced in the previous generation of VMAX arrays but it continues to offer VMAX3 customers the option to place specific IOPs or bandwidth limits on any storage group, regardless of the SLO assigned to
that group. Assigning a specific Host I/O limit for IOPS, for example, to a storage group with low performance requirements can ensure that a spike in I/O demand will not saturate its storage, cause FAST inadvertently to migrate extents to higher tiers, or overload the storage, affecting performance of more critical applications. Placing a specific IOPs limit on a storage group will limit the total IOPs for the storage group, but it does not prevent FAST from moving data based on the SLO for that group. For example, a storage group with Gold SLO may have data in both EFD and HDD tiers to satisfy the I/O latency goals, yet limited to the IOPS provided by Host I/O limit.
USING CASCADED STORAGE GROUPS
VMAX3 offers cascaded Storage Groups (SGs) wherein multiple child storage groups can be associated with a single parent storage
group for ease of manageability and for storage provisioning. This provides flexibility by associating different SLOs to individual child storage groups to manage service levels for various application objects, and using the cascaded storage groups for storage
provisioning.
The following Figure 4 shows an SQL server using Single storage group, whereas Figure 5 shows an example of a cascaded storage group.
Figure 5 Cascaded storage group
As shown in Figure 5 above, SQL Server datafiles storage group is set to use Gold SLO whereas SQL Server transaction log storage group is set to use Bronze SLO. Both storage groups are part of a cascaded storage group SQL_DB_SG, which can be used to provision all the database devices to the host, or multiple hosts in the case of a cluster.
SQL SERVER DATABASE PROVISIONING
STORAGE PROVISIONING TASKS WITH VMAX
3Since VMAX3 comes pre-configured with data pools and a Storage Resource Pool (SRP), what is left to do is to create the host
devices, and make them visible to the hosts by an operation called device masking.
Note: Remember that zoning at the switch sets the physical connectivity that device masking defines more closely. Zoning needs to be set ahead of time between the host initiators and the storage ports that will be used for device masking tasks.
Device creation is an easy task and can be performed in a number of ways: 1) Using Unisphere for VMAX3 UI Intuitive Provisioning Wizards
2) Using Solutions Enabler CLI to create a Masking View Device masking is also an easy task and includes the following steps:
1) Creation of an Initiator Group (IG). Initiator group is the list of host HBA port WWNs to which the devices will be visible. 2) Creation of Storage Group (SG). Since storage groups are used for both FAST SLO management and storage provisioning
review the discussion on cascaded storage groups earlier.
3) Creation of a Port Group (PG). Port group is the group of VMAX3 front-end ports where the host devices will be mapped and
visible.
4) Creation of a Masking View (MV). Masking view brings together a combination of SG, PG, and IG.
Device masking controls host access to storage enabling a secure multi-tenant environment. For example, storage ports can be shared across many servers, but only the masking view determines which devices of the server will have access to which ports.
PROVISIONING SQL SERVER DATABASE STORAGE WITH UNISPHERE
This section covers storage provisioning for Microsoft SQL Server databases using Unisphere for VMAX.
Creation of a host Initiator Group (IG)
Provisioning storage requires creation of host initiator groups by specifying the host HBA WWN ports. To create host IG select the appropriate VMAX3 storage, then select Hosts tab. Select from the list of initiator WWNs, as shown in Figure 6.
Creation of Storage Group (SG)
A storage group defines a group of one or more host devices. Using the SG creation screen, a storage group name is specified and new storage devices can be created and placed into the storage group together with their initial SLO. If more than one group of devices is requested, each group creates a child SG and can take its own unique SLO. Storage group creation screen is seen in Figure 7.
Select host(s)
In this step the hosts to which the new storage will be provisioned are selected. This is done by selecting an IG (host HBA ports), as shown in Figure 8
Creation of Port Group (PG)
A Port Group defines which of the VMAX front-end ports will map and mask the new devices. A new port group can be created, or an existing one selected, as seen in Figure 9 . Unisphere will automatically select the ports where your host or cluster initiators are logged in.
Creation of Masking View (MV)
At this point Unisphere has all that is needed to create a masking view. As outlined in Figure 10 the Storage Group, Initiator Group, and Port Group are presented, and a masking view name is entered. VMAX automatically maps and masks the devices in the Storage Group to the Microsoft SQL servers.
PROVISIONING SQL SERVER DATABASE STORAGE WITH SOLUTIONS ENABLER CLI
The following is a provisioning example using VMAX3 Solutions Enabler CLI to create storage devices and mask them to the host.
Create devices for SQL Server Database
Creating 4x1TB thin devices for SQL Server Database. The output of the command includes the new device IDs. The full capacity of the devices can be pre-allocated as shown below.
# symconfigure -sid 536 -cmd "create dev count=4,size=1024 GB, preallocate size=ALL, emulation=FBA, config=tdev ;" commit
...
New symdevs: 00020:00024
Mapping and masking devices to host
<Create Child Storage Groups DATA and Transaction logs >
# symaccess -sid 536 create –name SQL_OLTP1_DATA –type storage –devs 020:023
# symaccess -sid 536 create –name SQL_OLTP1_Logs –type storage –devs 024:024
<Create Parent Storage Group and add Childs>
# symaccess -sid 536 create –name SQL_OLTP1 –type storage –sg OLTP1_DATA, OLTP1_Logs
<Create host initiator group using a text file containing WWNs of HBA ports> # symaccess -sid 536 create –name SQL_OLTP1_IG –type initiator –file wwn.txt
<Create port group specifying the VMAX3 FA ports>
# symaccess -sid 536 create –name SQL_OLTP1_PG –type port –dirport 1D:4,1D:5,2D:5,2D:8
<Create masking view to complete the mapping and masking>
MICROSOFT SQL SERVER SLO MANAGEMENT TEST USE CASES
TEST CONFIGURATION
This section covers examples of using Microsoft SQL Server databases with SLO management.
Test overview
The use cases covered here are:
• Single database performance using different SLOs for the SQL Server Data files.
• Multiple competing database performance using different SLOs for SQL Server Data files. • All Flash Array (AFA) configuration with both SQL Server Data files and Transaction log on EFDs
Databases Configuration details
The following tables show the use cases test environment. Table 2 shows the VMAX3 storage environment, Table 3 shows the host
environment, and Table 4 shows the databases storage configuration.
Table 2 Test storage environment
Configuration aspect Description
Storage array VMAX 200K with 1-engine
HYPERMAX OS 5977.249
Drive mix(excluding spares) • 16 200GB-EFDs - RAID5 (3+1) • 64 300GB-15K HDD - RAID1 • 32 1TB-7K HDD - RAID6 (6+2)
Table 3 Test host environment
Configuration aspect Description
Microsoft SQL Server SQL Server 2014 Enterprise Edition 64-bit Windows Windows Server 2012 R2 64-bit
Multipathing EMC Powerpath 5.7 SP4 64-bit Hosts 2 x Cisco C240, 96 GB memory
Table 4 Test database configuration
Database Thin devices (LUNs)
LUN layout SRP Start SLO
Table 5 SQL Database layout details
Database
SQL DB details
Mount point SQL Server File Groups SQL Server Data files Total SQL Data files size
OLTP1 C:\OLTP1_Data1 FIXED_FG, GROWING_FG, SCALING_FG MSSQL_OLTP_root.mdf, Fixed_1.ndf, Growing_1.ndf, Scaling_1.ndf 378G
C:\OLTP1_Data2 Growing_2.ndf, Fixed_2.ndf,
Scaling_2.ndf 370G C:\OLTP1_Data3 Growing_3.ndf, Fixed_3.ndf,
Scaling_3.ndf 370G C:\OLTP1_Logs OLTP1_log.ldf OLTP2 C:\OLTP2_Data1 FIXED_FG, GROWING_FG, SCALING_FG MSSQL_OLTP_root.mdf, Fixed_1.ndf, Growing_1.ndf, Scaling_1.ndf 378G
C:\OLTP2_Data2 Growing_2.ndf, Fixed_2.ndf,
Scaling_2.ndf 370G C:\OLTP2_Data3 Growing_3.ndf, Fixed_3.ndf,
Scaling_3.ndf 370G
C:\OLTP2_Logs OLTP2_log.ldf
TEST OVERVIEW
General test notes:
• OLTP1 and OLTP2 were configured to run a 90/10 read/write ratio OLTP workload derived from an industry standard. No special database tuning was done as the focus of the test was not on achieving maximum performance and rather comparative differences of a standard database workload.
• DATA and LOG Storage Groups were cascaded into a common parent storage group for ease of provisioning and performance management.
• Data collection included storage performance metrics using Solutions Enabler and Unisphere, host performance stats using Windows Perfmon, and EMC Powerpath stats.
High level test overview:
• In the first use case of a single database (OLTP1) run, SLOs for SQL Server Datafiles were set to Gold for 4 hours of test run and then changed to Platinum SLO, for the next 4 hours. The SQL Server transaction log storage group was left on Bronze since, as explained earlier, transaction LOG workload is focused on writes, which are always handled by the VMAX3 cache,
and therefore affected to a lesser degree by the SLO latency.
• In the second use case we have two competing databases SQL Server Databases OLTP1 and OLTP2, where one of the SQL Server Databases (OLTP1), transitioned from Gold SLO to Platinum SLO, while the other OLTP2 database remained at Gold.
Figure 11 Test Configuration Layout details
USE CASE 1 – SINGLE DATABASE RUN WITH CHANGE IN SLO
Test Scenario:
Apply Gold SLO to the SQL Server datafiles storage group and gather performance statistics to analyze the effect of SLO on database transaction rate, by noting down SQL Batch Requests/sec. Next transition the SQL Server datafiles storage group to Platinum SLO, and then note down the SQL Batch Requests/sec.During this test the SQL Server transaction log storage group remained on Bronze SLO.
Objectives:
The purpose of this test case is to understand how database performance can be controlled by changing the SLO on SQL Server datafiles.
Test execution steps:
1. Run an OLTP workload on OLTP1 SQL Server Database with SQL Server Datafiles on Gold SLO, and SQL Server transaction log storage groups on Bronze SLO. Run the test for 4 hours, and at the end of the test note down the SQL Server Database Response time and SQL Batch Requests/sec.
2. Now change the SLO for the SQL Server datafiles storage group to Platinum SLO and then gather performance statistics. Test Results:
Table 6 shows the test results of use case 1, including the database transaction rate as measured in SQL Batch Requests/sec and the SQL Server Database Response time(ms).
Table 6 shows the SQL Server Database Batch Requests/sec changes, and SQL Server Database Response time improves as a direct effect of changes in SLO.
Figure 12 SQL Batch Request changes as a direct effect of changes in SLO on Data file storage group
As shown in Figure 12, and Table 6 since VMAX3 promoted active data extents to high performance storage tiers, including more EFD
capacity as the SLO changed from Gold to Platinum, the transaction rate increased. IO latencies were reduced with more EFD allocations. With Gold SLO, SQL Server database experienced a latency of 3.34 ms which improved to 0.96 ms with Platinum SLO. Corresponding transaction rate jumped from 1591 on Gold SLO to almost 4152 with Platinum SLO – 2.6x improvements in transaction rate.
USE CASE 2 – COMPETING DATABASE RUNS WITH CHANGEIN SLO
Test Scenario:
Here we have 2 competing workloads running on two SQL Server Databases OLTP1 and OLTP2. Apply Gold SLOs to both SQL Server datafiles storage groups and gather performance statistics. Next transition one of the SQL OLTP storage groups containing SQL to Platinum SLO, while the other OLTP workload remains at Gold SLO. Note down the corresponding SQL transaction rate in SQL Batch Requests/sec.During the run the SQL Server transaction log storage groups for both the databases remained on Bronze SLO. Objectives:
VMAX3 provides specific performance targets for different SLOs. The purpose of this test case is to understand how competing SQL
databases interact and affect performance when changing SLO. Test execution steps:
1. Run an OLTP workload on OLTP1 SQL Database and OLTP2 SQL Database, keeping the SQL Datafiles on Gold SLO, for both and SQL transaction log storage groups on Bronze SLO. Run the test run 4 hours, and at the end of the test note down the
Test Results:
Table 7 shows the test results of use case 2, including the database transaction rate in SQL Batch Requests/sec and the SQL Response time(ms).
Table 7 Use case 2 results, shows the SQL Batch Requests/sec changes, and SQL Response time improves with changes in SLO for OLTP1, while OLTP2 maintains its performance characteristics.
Figure 13 Use case 2 SQL Batch Request changes, SQL Reponse time changes for OLTP1, and OLTP2 SQL databases.
With Gold SLO, OLTP1 and OLTP2 SQL database experienced a latency of 5.9ms-6.1ms which improved to 1.98 ms when OLTP1 was transitioned to Platinum SLO. Corresponding SQL Server Batch Requests/sec for OLTP1 also jumped from 527 on Gold SLO to almost 1147 with Platinum SLO – 2x improvements in transaction rate.
USE CASE 3 – ALL FLASH CONFIGURATION WITH DATA FILES AND TRANSACTION LOG
Test Scenario:
This test used an All Flash configuration, by placing the SQL database and its associated storage groups on a Diamond SLO (EFD only). A similar scenario can happen without an SLO by purchasing an All-Flash VMAX3.
Objectives:
Provides an All Flash configuration for low latency and high performance. This can be accomplished by customers by either
purchasing an All-Flash VMAX3, or using the Diamond SLO in a hybrid VMAX3 array. This test case used Diamond SLO for SQL Server
Data files and SQL Server transaction LOG storage groups. Test case execution steps:
1. Set the SQL Server Data files storage group and SQL Server transaction storage group SLO to Diamond. 2. Run the OLTP workload and gather performance statistics.
Results
Table 8 shows SQL Server Batch Requests/sec and SQL Response time using all flash configuration. As we can see from, Figure 14 Diamond SLO provides predictable performance by offering very high transaction rate at a low latency for OLTP workloads.
Table 8 Use case 3 results, shows the SQL statistics while using Diamond SLO for both SQL Server Datafiles and log (equivalent to an All Flash configuration).
CONCLUSION
VMAX3 provides a platform for Microsoft SQL Server databases that is easy to provision, manage, and operate with the application
performance needs in mind. The purpose of this paper was to describe some of the changes in the platform, how they relate to Microsoft SQL Server database deployments, and provide a few examples as to how SLO management helps with performance management.
REFERENCES
• EMC VMAX3 Family with HYPERMAX OS Product Guide
• Unisphere for VMAX Documentation set
APPENDIX
SOLUTIONS ENABLER CLI COMMANDS FOR SLO MANAGEMENT AND MONITORING
List available SLOs in the array
Availability of the SLO depends on the available drive types in the VMAX3 SRP. The following command can be used to list the
available SLO in the array and their expected average latency.
# symcfg –sid 536 list –slo
The output will contain the list of SLOs available to choose along with approximate response times: SERVICE LEVEL OBJECTIVES
Symmetrix ID : 000196700536 Approx Resp Time Name (ms) --- --- Optimized N/A Diamond 0.8 Platinum 3.0 Gold 5.0 Silver 8.0 Bronze 14.0
Example of creating a Cascaded Storage group from a Single Storage Group
Change SLO for a storage groups
Set the desired SLO and workload type for the a Child SG
# symsg –sid 536 –sg OLTP1_DATA set –slo Bronze –wl oltp
Review thin device allocations in the data pools
The thin device allocations can be checked for a given storage group