MAXIMIZING AVAILABILITY OF MICROSOFT SQL SERVER 2012 ON VBLOCK SYSTEMS

(1)

Maximizing Availability of Microsoft SQL Server 2012 on Vblock

Systems Table of Contents

1

MAXIMIZING AVAILABILITY OF

MICROSOFT SQL SERVER 2012

ON VBLOCK™ SYSTEMS

January 2013

(2)

2

Introduction ...4 Business case...4 Solution overview ...4 Key results ...5 Objectives ...5 Scope ...5 Audience...6 Feedback ...6 Technology overview ...7 Vblock Systems ...7 Compute components ...7 Network components ...7 Storage components...8 Virtualization components ...8 EMC PowerPath/VE ...8

Microsoft Windows Server 2008 R2 with Service Pack 2 ...8

Microsoft SQL Server 2012 ...9 Architecture overview ...10 Physical architecture ...10 Logical architecture ...11 Hardware resources ...11 Software resources ...12 Design considerations ...13

Determining high-availability and disaster recovery needs ...13

Choosing a SQL Server 2012 AlwaysOn design ...14

Designing the overall solution architecture ...14

Sizing and configuring infrastructure components ...16

Compute configuration ...16

Network configuration ...16

Storage configuration...16

Virtualization configuration ...19

Solution validation ...20

(3)

3

Workload generation ...21

Distance emulation ...22

Test methodology ...23

Test scenarios ...23

Creating and configuring the Availability Group...23

Test results ...25

Site failure ...25

Network failure...27

Storage array failure ...28

Server/node failure ...30

Conclusion ...31

Next steps ...31

(4)

4

Introduction

In today’s dynamic business environment, continuous uptime and fast recovery from adverse

conditions are vital to meeting service-level agreements (SLAs) for high availability of business critical applications and application data. Planning for business disruptions and minimizing their impact is critical for enterprises, government agencies, and service providers to reduce revenue losses and compliance risks, including potential lawsuits. Designing and deploying a database like Microsoft SQL Server 2012, in conjunction with a highly available converged infrastructure helps organizations to avert these risks and control deployment and maintenance costs.

This solution architecture paper describes the VCE solution for supporting a geographically dispersed SQL Server 2012 environment protected by AlwaysOn technology with multi-subnet support at synchronous distances of 0, 100, and 300 kilometers.

Business case

Because of the demand for increasingly higher availability service levels, many organizations are upgrading or migrating to SQL Server 2012 to leverage its AlwaysOn features for high availability and disaster recovery. SQL Server 2012 AlwaysOn is an integrated, flexible, and cost-efficient solution that provides redundancy within and across data centers, fast application failover for maximum availability, and data protection of critical applications.

These migrations can replace older infrastructures, which are difficult to manage and expensive to support. Building an infrastructure by separately deploying and managing compute, storage, and virtualization resources for databases and other business systems has been complex,

time-consuming, and costly—until now.

Converged infrastructures that take advantage of virtualized databases like SQL Server are rapidly transforming the way organizations deploy and manage their IT assets and resources. With a converged infrastructure, IT has a single flexible, scalable platform to manage, which reduces system complexity and IT operational overhead. With fewer servers consuming less space, power, and cooling, IT organizations also see significant capital and operational expense savings. For this reason, deploying SQL Server 2012 and taking advantage of its AlwaysOn functionality on a leading

converged infrastructure helps organizations achieve business continuity objectives at a lower risk and cost.

Solution overview

The VCE solution is built on geographically dispersed SQL Server 2012 environments deployed on Vblock™ Systems protected by AlwaysOn technology in a primary and disaster recovery configuration across two data centers. Deploying SQL Server 2012 on Vblock Systems enhances the benefits of the AlwaysOn high-availability and disaster recovery features. Vblock Systems immediately deliver a pre-engineered, tested, and validated converged infrastructure that supports your high-availability requirements while reducing the total cost of ownership (TCO) and improving time to market.

(5)

5

By combining the power of the Vblock converged infrastructure and a virtualized SQL Server 2012, your organization can:

 Demonstrate business-critical levels of high performance and availability between data centers  Ensure site resiliency and meet recovery objectives when faced with application, storage,

network, and node failures

 Achieve business-continuity SLAs end-to-end at a lower risk and cost

Key results

Testing and validation of the VCE solution demonstrated the following:

 Using SQL Server 2012 AlwaysOn Availability Groups, the Recovery Time Objective (RTO) of less than one hour, and the Recovery Point Objective (RPO) of zero data loss were met in all failure scenarios at distances of 0, 50,100, and 300 kilometers.

 The Vblock System 300 can easily service SQL Server 2012 mission-critical online transaction processing (OLTP) production workloads while providing high availability through a combination of the Availability Groups feature and the storage capabilities of the Vblock converged

infrastructure.

 Deploying SQL Server 2012 on a pair of Vblock Systems is simple and efficient, and provides improved data protection under typical transaction loads.

Objectives

The primary objective of this solution architecture paper is to demonstrate how to reduce the impact of hardware and software failures by using virtualized SQL Server 2012 for high availability and disaster recovery. This document includes best practice for facilitating SQL Server 2012 deployments on Vblock Systems to maximize continuous availability of the application and ensure business continuity during disasters.

Scope

The scope of this solution architecture is to:

 Validate the Vblock System for critical, enterprise-level, SQL Server 2012 deployments in a disaster recovery configuration featuring AlwaysOn Availability Groups.

 Demonstrate the high availability of SQL Server 2012 on Vblock Systems during disaster recovery.

 Provide best practices for a SQL Server 2012 deployment on the Vblock System.

(6)

6

Audience

The primary audience of this solution architecture paper is technical practitioners, such as data center managers, database architects, and database administrators (DBAs). In addition, IT management teams would also be interested in reviewing this document as part of a product evaluation.

Feedback

To suggest documentation changes and provide feedback on this paper, contact

(7)

7

Technology overview

The following sections summarize the technologies used in the VCE solution:  Vblock Systems

 EMC PowerPath/VE

 Microsoft Windows Server 2008 R2 with Service Pack 2  Microsoft SQL Server 2012

For more information about the overall test environment including the distance emulator, load generation tools, and test measurement tools, refer to the Test environment design section.

Vblock Systems

Growing business challenges demand an IT infrastructure that is swift, agile, and economical— enabling organizations to adapt to change quickly and cost-effectively. VCE designs and delivers Vblock Systems, which seamlessly integrate leading compute, network, and storage technologies. Through intelligent discovery, awareness, and automation, Vblock Systems provide the highest levels of virtualization and application performance.

The Vblock System 300 is an agile and efficient data center-class system, providing modular and scalable performance. It features a high-density, compact fabric switch; tightly-integrated, fabric-based blade servers; and best-in-class, unified storage. Organizations look to the platform for unified storage, to support high-bandwidth applications, and for ease of use. Available in several models, this solution uses the Vblock 300EX.

Each Vblock model has a base configuration, which is a minimum set of compute and storage components as well as fixed network resources. Within the base configuration, certain hardware aspects can be customized. Together, the components offer balanced CPU, I/O bandwidth, and storage capacity relative to the compute and storage arrays in the system.

For more information, go to http://www.vce.com/vblock.

Compute components

The compute components in Vblock Systems are built on the Cisco Unified Computing System (UCS) line of products. Components include one or more blade server chassis, compute blades, I/O

modules, and the fabric interconnects that connect the unified fabric to the rest of the environment.

Network components

(8)

8

Storage components

The Vblock Systems used in this solution contain VNX5300 storage arrays. The VNX5300 is a high-end, enterprise storage array that supports multiple drive technologies, including Flash, serial-attached SCSI, SAS, and nearline SAS (NL-SAS) drives, and a full range of RAID types. Intel Xeon processors power the VNX series for intelligent storage that automatically and efficiently scales in performance, while ensuring data integrity, security, and high availability for mission-critical SQL Server environments.

Virtualization components

Virtualization components include VMware ESXi, VMware vCenter Server, and VMware vSphere. VMware vCenter Server provides a scalable and extensible platform that forms the foundation for virtualization management. It centrally manages VMware vSphere environments, allowing IT administrators control over the virtual environment.

For more information on VMware vCenter Server, go to http://www.vmware.com/products/vcenter-server/overview.html.

EMC PowerPath/VE

EMC PowerPath/VE performs multipathing for the storage arrays on the Vblock System.

PowerPath/VE works in combination with SAN storage to intelligently manage Fibre Channel (FC), iSCSI, and Fibre Channel over Ethernet (FCoE) I/O paths. Installed on the vSphere host,

PowerPath/VE scales to the maximum number of virtual machines on the host. The virtual machines do not have PowerPath/VE installed nor are they aware that PowerPath VE is managing them. PowerPath/VE dynamically load balances I/O requests and automatically detects, and recovers from, path failures. PowerPath/VE standardizes path management across heterogeneous physical and virtual environments and ensures application uptime.

Microsoft Windows Server 2008 R2 with Service Pack 2

For this solution, the cluster feature available with Windows Server 2008 R2 with Service Pack (SP) 2 was used. A failover cluster is a group of independent computers that work together to increase the availability of applications and services. The clustered servers, called nodes, are connected by physical cables and software. If one of the cluster nodes fails, another node begins to provide service (a process known as failover). Users experience a minimum of disruptions in service.

Microsoft cluster technologies guard against three specific types of failure:

 Application and service failures, which affect application software and essential services.  System and hardware failures, which affect hardware components such as CPUs, drives,

memory, network adapters, and power supplies.

(9)

9

Microsoft SQL Server 2012

Microsoft SQL Server is a database management and analysis system for e-commerce,

line-of-business, and data warehousing solutions. The latest version, SQL Server 2012, adds new high availability and disaster recovery solutions through AlwaysOn Availability Groups.

The new SQL Server 2012 AlwaysOn feature provides flexible design choices for selecting an appropriate high-availability and disaster recovery solution for applications. SQL Server AlwaysOn was developed for applications that require high uptime, need protection against failures within a data center and adequate redundancy against failures across data centers.

SQL Server 2012 AlwaysOn provides availability at either the application database or instance level:  Availability Groups is a new feature that helps protect application databases from planned and

unplanned downtime.

The Availability Groups feature provides an enterprise-level alternative to database mirroring. An availability group supports a failover environment for a set of read-write primary databases and one to four sets of corresponding secondary databases. Optionally, secondary databases can be made available for read-only access and/or some backup operations.

 Failover Cluster Instance provides availability for the entire SQL Server instance by enabling failover of the instance rather than individual database(s).

The Failover Cluster Instance feature enhances existing SQL Server Failover Clustering with support for multisite clustering across subnets for improved site protection, a flexible failover policy for better control over instance failover, and improved diagnostics for faster

(10)

10

Architecture overview

VCE solutions are designed to reflect and validate real-world deployments. This paper describes a disaster recovery solution for SQL Server 2012 on the Vblock System using AlwaysOn Availability Groups. The following sections describe the physical and logical architecture of the solution.

Physical architecture

The solution uses Windows Server in a Windows Server Failover Cluster (WSFC) on two Vblock 300EX Systems. One Vblock 300EX System serves as the primary data center site and the other Vblock System as the disaster recovery data center site The Vblock 300EX System is optimal for the storage, compute, hypervisor, and network requirements of this solution. Each of the ESX hosts resides on a VNX5300 storage array connected through MDS 9148 and Nexus 5010. Figure 1 shows the physical architecture of the solution.

(11)

11

Logical architecture

Figure 2 shows the logical architecture of the solution.

Figure 2. Logical architecture

Hardware resources

Table 1 shows the hardware resources used in this solution.

Table 1. Solution hardware resources

Equipment Quantity Configuration

Compute 6 Cisco UCS B200 M2 blade server packs (12-core, 96 GB memory)

Network switches 4 Cisco Nexus 5548

Storage platform 2 EMC VNX5300

Aggregation SAN switch 4 Cisco MDS 9148 Virtualization 2  VMware vSphere 5.0

 VMware vCenter Server 5.0

(12)

12

Software resources

Table 2 lists the software used in this solution.

Table 2. Solution software resources

Description Quantity Version Purpose

Windows Server 2008 R2 6 2008 R2 x64 SP1 Server operating system Microsoft SQL Server 2012

Enterprise Edition 2 2012 Database server

EMC PowerPath/VE 2 5.5.1 Advanced multipathing

(13)

13

Design considerations

To meet high-availability and disaster recovery requirements using SQL Server 2012 AlwaysOn Availability Groups, it is important for an organization to:

 Determine high-availability and disaster recovery needs.  Choose a SQL Server 2012 AlwaysOn design.

 Design an overall solution architecture.  Size and configure infrastructure components.

 Follow best practices for SQL Server 2012 from VMware, Cisco, and EMC

The following sections discuss some of the design considerations for the implementation of Availability Groups. For more information about prerequisites, restrictions, and other recommendations, refer to us/library/ff878487.aspx and

http://msdn.microsoft.com/en-us/library/hh510238.aspx.

Determining high-availability and disaster recovery needs

Disaster recovery is the ability to recover data in case the production system is damaged, destroyed or becomes unavailable for an undeterminable period. One of the first things to do when creating a disaster recovery plan is to determine how much data and time you can afford to lose. The primary measures to use to determine high availability and disaster recovery needs including data recovery metrics, are:

 High availability—Refers to a system or component that is continuously operational for a desirable length of time. Availability can be measured relative to "100% operational" or "never failing." Disaster recovery is a subset of business continuity. High availability can be a concept within a data center or across data centers.

 Disaster recovery—Refers to a strategy preparing for recovery and continuation of the technology infrastructure vital to an organization after a natural or human-induced disaster. Disaster recovery is also a subset of business continuity. Recovery Point Objective and Recovery Time Objective are the key metrics to measure disaster recovery readiness.  Recovery Point Objective (RPO)—Describes a point in time from which data can be restored.

For instance, if there is data corruption, loss, or unavailability, what point in time can a valid copy of the data be restored from?

Example

“The database is corrupted. Can we restore from 15 minutes ago?" This would be an RPO of 15 minutes.

 Recovery Time Objective (RTO)—The amount of time in which you want data or hardware restored after a data corruption or hardware failure.

Example

(14)

14

Figure 3. Using RPO and RTO to determine data recovery needs

The disaster recovery goal of the VCE solution is recovery in less than one hour with zero data loss.

Choosing a SQL Server 2012 AlwaysOn design

SQL Server 2012 AlwaysOn provides flexible design choices for selecting an appropriate solution for your application that provides protection against failures within a data center and adequate

redundancy across data centers.

Microsoft has identified the following design patterns as the most commonly used by their customers for high availability and disaster recovery:

 Multi-site Failover Cluster Instance for a local high-availability and disaster recovery solution  Availability Groups for a local high-availability and disaster recovery solution

 Failover Cluster Instance for local high availability, and Availability Groups for a disaster recovery solution

For an overview of these design patterns for high availability and disaster recovery, refer to http://msdn.microsoft.com/en-us/sqlserver/gg490638).

Designing the overall solution architecture

Setting up Availability Groups requires:

 At least two servers (one primary replica and one secondary replica) running Microsoft Windows Server 2008 R2 Enterprise Edition

 Microsoft SQL Server 2012 Enterprise Edition

 Windows Server Failover Clustering (WSFC) configured between the hosts

Table 3 lists important Microsoft prerequisites and requirements to consider when designing a high-availability and disaster recovery solution. For more information, refer to

(15)

15

Table 3. Prerequisites and Requirements

Component Prerequisites and requirements

Windows Server 2008 R2  Availability Groups are not supported on domain controllers.  Each host computer must be a node in a WSFC cluster.

 A WSFC node can host only one availability replica for a given Availability Group. On a given WSFC node, one or more instances of SQL Server can host availability replicas for many Availability Groups.

Ask your database administrators how many WSFC nodes are required to support the availability replicas of the planned Availability Groups.  Every computer on which a server instance hosts an availability replica

must possess sufficient disk space for all the databases in the Availability Group.

SQL Server 2012  The instances of SQL Server that host availability replicas for a given Availability Group must reside on separate nodes of a single WSFC cluster.  Each server instance must be running the Enterprise Edition of SQL Server

2012.

 Enable the AlwaysOn Availability Groups feature on each server instance that will host an availability replica for any availability group.

Availability Groups  For a given Availability Group, availability replicas must be hosted by server instances running on different nodes of the same WSFC cluster. However, virtual machines on the same physical computer can each host an availability replica for the same Availability Group because each virtual machine acts as a separate computer.

 Each Availability Group supports one primary replica and up to four secondary replicas.

 The number of databases and Availability Groups you can put on a computer (virtual or physical) depends on the hardware and workload, but there is no enforced limit.

 Thoroughly test your environment with a production-like workload to ensure it can handle peak workload capacity within your application service level agreements (SLAs). Consider load under failure conditions as well as expected response times.

Availability Database  Must be a read-write database. Read-only databases cannot be added to Availability Groups.

 Must be a multi-user database.

 Must not belong to any existing Availability Group.  Must not be configured for database mirroring.

The infrastructure used for testing and validation was built on a Windows Server 2008 R2 Enterprise Edition operating system platform with a four-node WSFC consisting of a two-node cluster on the primary data center site and a two-node cluster on the secondary data center site using a stretched, active-active failover cluster configuration.

(16)

16

Sizing and configuring infrastructure components

The following sections describe how the components of the converged infrastructure on the Vblock 300EX were configured for validation of the solution. Before you begin, check the latest software and hardware requirement for SQL Server 2012. Current requirements are listed at

http://technet.microsoft.com/en-us/library/ms143506.aspx.

Compute configuration

For the Vblock solution, two Vblock 300EX Systems were used to represent the primary and secondary (disaster recovery) data center sites. Each Vblock contained four Cisco UCS B200 M2 blade server packs. Each blade server had a 12-core CPU and 96 GB of memory.

Network configuration

Microsoft strongly recommends that you use the same network links for communications between Windows Server Failover Cluster (WSFC) cluster members and communications between availability replicas. Using separate network links can cause unexpected behaviors if some of the links fail (even intermittently).

To improve throughput, manageability, application separation, high availability, and security, virtual local area networks (VLANs) were used to segregate the network traffic of various types.

Storage configuration

SQL AlwaysOn Availability Groups require separate, non-shared storage between data centers. The following sections describe the storage configuration for the solution.

RAID groups

In accordance with best practices from Microsoft, EMC, and VMware, the SQL Server database configuration was split into two RAID groups for data and transaction log data.

For application data, RAID 5 disks were used. RAID 5 was chosen for two main reasons:

 Lower cost of data protection compared to RAID 10, with RAID 5 requiring only 20 percent of the total usable capacity for storing parity data

 High-read performance because all drives contribute to the delivery of data

To ensure availability of transaction log data in disaster recovery situations, RAID 10 disk groups were used for transaction log data.

Table 4 shows the RAID group configurations in the VNX5300 storage arrays.

(17)

17

RAID Group

No. No. of Disks Disk Type Disk Size (GB) RPM RAID Type Purpose

10 5 SAS 600 10K RAID

(4+1) ESXi boot LUNs, VM Boot and Swap

239, 240 2 SAS 600 10K Hot

Spare Hot Spare Storage pools

To optimize read performance, a RAID 5 storage pool was used for Microsoft SQL Server application data. A separate RAID 10 storage pool was used for the SQL Server transaction logs. Table 5 shows the storage pool configurations, which are identical in both Vblock Systems.

Table 5. Storage pools

Pool Name No. of

Disks Disk Type Disk Size (GB) RPM RAID Type Purpose

MSSQL_R5 10 SAS 600 10K RAID 5 For SQL Server DB

MSSQL_R10 8 SAS 600 10K RAID 10 For SQL Server

Transaction Log Disk storage

To maximize performance and availability, best practices from Microsoft, EMC, Cisco, and VMware were used to configure the virtualized storage to the physical infrastructure.

(18)

18

Table 6. Disk storage layout—Vblock 300EX 1

Drive

Letter Storage (GB) Hard Disk No. Purpose ESXi Disk Type

VMware

Datastore VNX Array LUN Size (GB)

VNX Storage Pool

C: 100 Hard Disk 1 Boot and

Swap vmdk vm_boot_swap 600 RG #10 D: 1024 Hard Disk 2 DB vmdk sql_db 2500 MSSQL_DB L: 700 Hard Disk 3 Log vmdk sql_log 1700 MSSQL_LOG C: 100 Hard Disk 1 Boot and

Swap vmdk vm_boot_swap 600 RG #10 C: 100 Hard Disk 1 Boot and

Swap vmdk vm_boot_swap 600 RG #10 C: 1100 Hard Disk 1 Boot and

(19)

19

Table 7. Disk storage layout—Vblock 300EX 2

Drive

Letter Storage (GB) Hard Disk # Purpose ESXi Disk Type

VMware

Datastore VNX Array LUN Size (GB)

VNX Storage Pool

C: 100 Hard Disk 1 Boot and

Swap vmdk vm_boot_swap 600 RG #10 D: 1024 Hard Disk 2 DB vmdk sql_db 2500 MSSQL_DB L: 700 Hard Disk 3 Log vmdk sql_log 1700 MSSQL_LOG C: 100 Hard Disk 1 Boot and

Swap vmdk vm_boot_swap 600 RG #10 D: 1024 Hard Disk 2 DB vmdk sql_db 2500 MSSQL_DB L: 700 Hard Disk 3 Log vmdk sql_log 1700 MSSQL_LOG

Virtualization configuration

The following table describes the virtual machines created to validate the solution. Data Center Site Virtual Machine Role Primary Windows Server 2008 R2 SP2

domain controller (ms-dc1) Provides DNS, Active Directory, and DHCP services SQL Server 2012 (ms-sql-db1) Provides database services

SQL Server 2012 (ms-sql-db2) Provides database services Cluster File Share Witness

(ms-sql-wns) Provides cluster FSW services SQL Server 2012 Native Client

(ms-sql-client1) Provides access to data in the SQL Server 2012 database TPC-E (MSTPCE1) Provides test workload data

TPC-E (MSTPCE2) Provides test workload data

(20)

20

Solution validation

The testing of this solution validated the use of the high-availability and disaster recovery capabilities of SQL 2012 AlwaysOn Availability Groups with the Vblock System to enable business continuity of Microsoft operations.

Test environment design

The solution infrastructure was deployed on Windows Server 2008 R2 in a four-node Windows Server Failover Cluster (WSFC) over two Vblock 300EX Systems. One Vblock 300EX System served as the primary data center site and the other Vblock System served as the secondary data center site. The data center sites were connected by a simulated WAN.

A SQL Server 2012 instance was deployed on each node in the WSFC configuration. The SQL Server instances share an AlwaysOn Availability Group for high-availability and disaster recovery purposes. With this configuration, the primary data center site can fail over to the secondary data center site easily while maintaining availability and minimizing costly downtime.

Two additional virtual machines were created on the primary data center site for TPC-E, which was used to generate the application test workload. A distance emulator between the two Vblock Systems emulated a WAN for testing over distance.

Figure 4 shows the test environment.

(21)

21

Workload generation

To test high availability and disaster recovery, VCE ran OLTP-like workloads against the SQL Server 2012 database environment in parallel with a simulated real world failure event and verified that the application and database were online and available during the failure. Workloads were generated using TPC Benchmark™ E (TPC-E).

TPC-E consists of a set of transactional operations designed to exercise system functionalities in a manner representative of complex OLTP application environments. To help users relate to the components of the benchmark, these transactional operations have been given a real-world context, depicting the activity of a brokerage firm. The workload is centered on the activity of processing brokerage trades and uses a schema that is divided in four sets of tables.

TPC-E models the activity of brokerage firm that must manage customer accounts, execute customer trade orders, and be responsible for the interactions of customers with financial markets. The

brokerage firm interacts with financial markets to execute orders on behalf of the customers and updates relevant account information.

The number of customers defined for the brokerage firm can be varied to represent the workloads of different size businesses.

This benchmark is composed of a set of transactions that are executed against three sets of database tables that represent market data, customer data, and broker data. A fourth set of tables contains generic dimension data such as zip codes. Figure 5 shows the key components of the environment:

(22)

22

Distance emulation

Since the link between the production and disaster recovery sites is often a factor in the failure of a solution, it is important to understand the capacity, latencies, and utilization of the link when designing a disaster recovery solution. To characterize the effects that network latencies and bandwidth can have on the VCE SQL Server Availability Groups solution, multiple failure tests were performed at varying emulated distances.

In general, network latencies correspond to different distances. Round-trip latencies can vary from zero milliseconds (ms) to 250 ms or higher. Table 8 shows distances and their corresponding link latencies. The numbers shown in the table are estimates and can vary depending on link quality, network hops, and backbone hops.

Table 8. Distance and Average Link Latency

Approximate distance One-way latency (ms) Round-trip latency (ms)

50 km 0.25 0.5 500 km 2.5 5 1,000 km 5 10 5,000 km 25 50 10,000 km 50 100 20,000 km 100 200 22,500 km 125 250

(23)

23

Test methodology

The following test methodology was used:

 Primary database tables with data for up to 75,000 customers were created and populated using Microsoft TPC-E.

 TPC-E tools generated the OLTP-like workloads used for testing purposes.

 Various failure scenarios were emulated while workloads were run against the SQL Server 2012 databases.

 Successful automatic and manual failover of the SQL Server database from the primary to the secondary site was verified for each failure scenario.

 Baseline testing was conducted at 0 km between the primary and secondary sites. Failover testing was conducted with emulated distances of 50 km, 100 km, and 300 km between the primary site and the secondary site.

 Replication was set to synchronous-commit mode at all distances. Under synchronous-commit mode, before committing transactions, a synchronous-commit primary replica waits for a synchronous-commit secondary replica to acknowledge that it has finished hardening the log. Synchronous-commit mode ensures that once a given secondary database is synchronized with the primary database, committed transactions are fully protected.

 CPU, memory, storage, and network utilization was monitored before, during, and after each test to ensure a full production workload was running during peak activity.

 To ensure successful failover and failback, row counts for three key tables (ADDRESS, BROKER, and DAILY_MARKET) were checked before, during, and after failover.

Test scenarios

The following failure scenarios were used to test the solution:



Data center site failure

 Network failure _{Storage array failure} _{Server/node failure}

Creating and configuring the Availability Group

Using the Availability Group Wizard, an Availability Group was configured with a primary replica (MS-SQL-DB1\PROD01) and one secondary replica (MS-SQL-DB2\PROD01) on the primary (production) data center site, and two secondary replicas (MS-SQL-DB3\PROD03 and MS-SQL-DB4\PROD04) on the secondary (disaster recovery) data center site. This provides redundancy in case of multiple failures on primary and secondary sites.

(24)

24

Figure 6. SQL Server AlwaysOn Availability Group configuration

Automatic Failover Mode is an option that provides high availability by ensuring that the database is quickly made available again after the loss of the primary replica. Failover takes place without manual intervention in the event of an unplanned disaster.

To configure an availability group for automatic failover, the current primary replica and one secondary replica must be set to synchronous-commit mode with automatic failover. Synchronized mode

provides for little to no data loss but requires more available network bandwidth and is limited to latency requirements per SQL Server AlwaysOn limits.

To maximize performance and minimize data loss for disaster recovery, all replicas were configured for synchronous-commit mode with automatic failover.

(25)

25

Figure 7. Failover and failback configuration

Test results

The following sections describe results for failover of the SQL Server 2012 AlwaysOn Availability Group in the event of:

 Site failure  Network failure  Storage failure  Server/node failure

Site failure

(26)

26

First, baseline site failover testing was conducted without any load at a distance of 0 km. After

baseline testing, failover tests were conducted with a full production workload and emulated distances of 50 km, 100 km, and 300 km between the primary and secondary sites. Table 9 summarizes the test results.

Table 9. Site failure results

Distance Load/No load Automatic

failover time Manual failover time Failback time Result

Baseline No load ~ 2.5 minutes ~ 3 minutes — Zero data loss 50 km Load < 2 minutes ~ 3 minutes ~ 3 minutes Zero data loss 100 km Load ~ 5 minutes ~ 7 minutes ~ 6 minutes Zero data loss 300 km Load ~ 6 minutes ~ 9 minutes ~ 8 minutes Zero data loss A row count query was issued on the primary and secondary sites to verify zero data loss before, during, and after failover testing. The row counts for the ADDRESS, BROKER, and DAILY_MARKET financial transaction tables on the primary and secondary sites matched, which means no data was lost (Figure 8).

(27)

27

Network failure

To emulate network failure, the network adapter was disabled from a cluster node virtual machine (Figure 9). This test was performed with a full production workload and an emulated distance of 100 km between the primary and secondary sites. Table 10 summarizes the test results.

Figure 9. Network failure emulation Table 10. Network failure results

(28)

28

Figure 10. Network failure row count query and results

Storage array failure

To emulate a storage array failure, storage was removed from the vb319-esxi-01 and vb319-esxi-02 storage groups on the VNX array on the primary site (Figure 11).

(29)

29

Figure 11. Storage failure emulation Table 11. Storage failure results

(30)

30

Server/node failure

To emulate a server or node failure, the blade servers on which the cluster node virtual machines were installed on the primary site were powered off. Table 12 summarizes the test results.

Table 12. Server/node failure results

100 km Load 6 minutes 7.4 minutes — Zero data loss A row count query was issued on the primary and secondary sites to verify zero data loss before, during, and after failover testing. The row counts for the ADDRESS, BROKER, and DAILY_MARKET financial transaction tables on the primary and secondary sites matched, which means no data was lost.

(31)

31

Conclusion

This solution architecture paper provides a validated business continuity solution for Microsoft SQL Server 2012 on the Vblock System using the high-availability and disaster recovery capabilities of AlwaysOn Availability Groups. Testing and validation demonstrated that the solution provides data protection for SQL Server 2012 that meets SLA requirements for recovery from a broad range of failures.

The key findings of testing of this solution are as follows:

 Using SQL Server 2012 AlwaysOn Availability Groups, the Recovery Time Objective (RTO) of less than one hour, and the Recovery Point Objective (RPO) of zero data loss were met in all forced manual and automatic failover scenarios at distances of 0, 50,100, and 300 kilometers.  Through a combination of the Availability Groups feature and the storage capabilities of the

Vblock converged infrastructure, the Vblock System 300 can easily service SQL Server 2012 mission-critical online transaction processing (OLTP) production workloads while providing high availability.

 Zero data loss is possible under typical customer high-volume workloads by following best practices from VMware, Cisco, and EMC for SQL Server 2012 mission-critical database environments that include high-volume update, read, and write transactional data.

By combining the power of the Vblock converged infrastructure and a virtualized SQL Server 2012, an organization can demonstrate business-critical levels of high performance and availability of database platforms, while accelerating the time to deployment and simplifying IT operations.

Next steps

(32)

32

References

 Introducing SQL Server AlwaysOn (http://msdn.microsoft.com/en-us/sqlserver/gg490638)  Hardware and Software Requirements for Installing SQL Server 2012

(http://msdn.microsoft.com/en-us/library/ms143506.aspx)

 Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server)(http://msdn.microsoft.com/en-us/library/ff878487.aspx)

 Microsoft SQL Server on VMware Best Practices Guide

(http://www.vmware.com/files/pdf/sql_server_best_practices_guide.pdf)

(33)

ABOUT VCE

VCE, formed by Cisco and EMC with investments from VMware and Intel, accelerates the adoption of converged infrastructure and cloud-based computing models that dramatically reduce the cost of IT while improving time to market for our customers. VCE, through the Vblock Systems, delivers the industry's only fully integrated and fully virtualized cloud infrastructure system. VCE

solutions are available through an extensive partner network, and cover horizontal applications, vertical industry offerings, and application development environments, allowing customers to focus on business innovation instead of integrating, validating, and managing IT infrastructure.

For more information, go to http://www.vce.com.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." VCE MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OR MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.