• No results found

SQL Server AlwaysOn. Michal Tinthofer 11. Praha What to avoid and how to optimize, deploy and operate.

N/A
N/A
Protected

Academic year: 2021

Share "SQL Server AlwaysOn. Michal Tinthofer 11. Praha What to avoid and how to optimize, deploy and operate."

Copied!
74
0
0

Loading.... (view fulltext now)

Full text

(1)

Praha 2013

11.

ročník největší odbornéIT konference v ČR!

Michal Tinthofer

|Michal.Tinthofer@Woodler.eu|

SQL Server AlwaysOn

(2)

Overview

Introduction to AlwaysOn & Availability Groups

How to set up AlwaysOn

Managing and tuning AlwaysOn

Monitoring Availability Groups

Common issues

(3)

to AlwaysOn & Availability Groups

(4)

SQL Server Always On Solutions

 AlwaysOn ≠ Availability Groups

 AlwaysOn = (SQL Server Failover Cluster Instances + Availability Groups)  Availability Groups ≠ Database Mirroring

(5)

SQL Server Always On Solutions

Availability Groups • Non-Shared Storage Solution • (Group of) Database level HA • (Group of) Database level DR

• DR replica can be Active Secondary

• Databases must be in FULL recovery model Failover Cluster Instance

(for local HA) + Availability groups (for DR)

• Combined Shared and Non-Shared Storage • Instance level HA

• (Group of) Database level DR

• DR replica can be Active Secondary

(6)
(7)
(8)

The logical topology of a representative AlwaysOn solution

Windows Server Failover Clustering (WSFC) Cluster

WSFC Quorum Witness Remote File Share (Optional)

Network Subnet A Network Subnet B

Node B1 SQL Server Instance 3 Storage Node B2 SQL Server Instance 4 Storage Instance Network Name Instance Network Name WSFC Configuration WSFC Configuration Node A3 SQL Server Instance 2 Storage Instance Network Name WSFC Configuration Node A2 Node A1 SQL Server Failover Cluster Instance 1 Shared Storage Instance Network Name WSFC Configuration WSFC Configuration

AlwaysOn Availability Group

Availability Group Virtual Network Name

(9)

Set Up AlwaysOn

(10)

Pre-installation Planning:

 1.1 Synchronous or asynchronous replication?

 Up to three synchronous  Total of four replicas + primary

 1.2 Where should be Automatic failover?

 Two Automatic failover partners (primary and secondary)

 2.1 Decision is made by number of nodes and votes used to

prevent cluster split brain

 Odd: Node majority

 Even: Node and file share majority

 How many nodes need to be up for the cluster to stay online, and which nodes should vote.

 3.1. How many databases do you want in each group?

 3.2. Do you want to move groups around for load balancing?

 3.3. Do you need to separate reporting databases from the

OLTP databases

 Report db can be read-only replica of production OLTP?

 3.4. If just one database on an instance fails, what do you want to happen?

 3.5. How many AG you need based on db priority and function

 Auto Failover and synch replica or manual failover and asynch replica.

 4. If any other servers require restores from your new 2012 instance, they should get 2012 first.

 (Think development, disaster recovery, or reporting servers.)

 5. Review non-database stuff installed on old servers.

 Plan for SSIS, SSAS, SSRS, DTS packages, logins, Agent jobs, etc.

1. Choose number and location of replicas

2. Choose Quorum model (based on number of replicas)

3. Design Availability Groups

4. Plan DR strategy with 2k12 SQL servers only

(11)

Windows Configuration

 Create a single instance account for all SQL Servers

 You can use different accounts

per node if NO Kerberos auth. will ever be used for listener.

 Use same drive letters for across all nodes (No C:\ drive please )  Use static IP addresses for all

servers & subnets.

 If SQL Failover Cluster instance (FCI) will be used:

 Reserve IP also for SQL instance &

Cluster management

 Configure DTC if clustered

 Achieve network redundancy between nodes before SQL installation.

 No separate heartbeat network necessary

 Use more NICs with separate IPs or Team them.

 Where backups will be done?

 Choose preferred & secondary server

 What if failover to DR site occurs ?

 Plan for both sites (DR & HA)  Use windows update to get

servers up to date.

(12)

Installation

Single domain should be used!

Install Windows prerequisite hotfixes for Availability

Groups

 http://msdn.microsoft.com/en-us/library/ff878487.aspx#WinHotfixes  Install the hotfix to allow nodes with 0 votes

-http://support.microsoft.com/kb/249403

This is important to do on all nodes

Open Windows firewall ports 1433,1434 and 5022 on

all nodes.

Enable Instant File Initialization.

.NET Framework 3.5.1 feature and the Failover

(13)

Cluster Installation step by step

Install Failover

(14)

Cluster Installation step by step

Create a primary

cluster via

(15)

Cluster Installation step by step

Pass validation

tests

SCSI SAN must

support SCSI-3

Persistent

Reservation

(16)

Cluster Installation step by step

Specify cluster name for MSCS service

(17)

Cluster Installation step by step

Windows Cluster

created

With two nodes

for Primary site

(18)

AlwaysOn

(19)

Cluster Installation settings

Create folder on

shared server

 Will be used for new quorum settings  R/W access

 Go to cluster

management and

change quorum mode

Servers will need to exchange backup

(20)

Cluster Installation settings

 Quorum is managed by the WSFC, irrespective of the number of SQL Server instances, number of nodes, number of availability groups

 Important goal: Design to ensure

 Unavailability of the DR site (or the node at DR site) , or loss of network connectivity between sites should not impact the

quorum of the WSFC  Two steps:

 Node votes: First decide which nodes should have a vote

(21)

Primary Data Center Disaster Recovery Data Center

SQL Server Primary

SQL Server Secondary

Windows Server Failover Cluster (single WSFC crossing two data centers)

Availability Group SQL Server Secondary Synchronous Asynchronous File Share

 Use this quorum model with a protected file share witness.

 The Fileshare Witness always has 1 vote.

Cluster Installation settings

(22)

 Add an additional voting node to the WSFC in the primary data center, and then use the this quorum model.

Cluster Installation settings

Node Majority

Primary Data Center Disaster Recovery

Data Center

SQL Server Primary

SQL Server Secondary

Windows Server Failover Cluster (single WSFC crossing two data centers)

Availability Group

SQL Server Secondary

Synchronous

Asynchronous

(23)

Cluster Installation settings

Pick Node and file share majority

(24)

Cluster Installation settings

Confirm new quorum settings

 Important design goal is to ensure that:

(25)

AlwaysOn

(26)

Cluster Installation settings

(27)

Cluster Installation settings

Add shared disk to cluster

Suooprted

SCSI

iSCSI

(28)

SQL Clustered Instance setup

Prefer default instance over named

Use single AD service account for all instances

 You will be able to use Kerberos authentication in future

Also use same collation on all instances

Use same drive letters on all nodes

Apply latest Service Pack for SQL Server 2012

Enable AlwaysOn HADR via Pwshell or ConfMgr.

Prepare SAN storage

Confirm network routing between subnets

Configure SQL Instance as you do usually after new installation

(29)

SQL Clustered Instance setup

(30)

SQL Clustered Instance setup

(31)

SQL Clustered

Instance setup

(32)

SQL Clustered Instance setup

Pick features and

(33)

SQL Clustered Instance setup

Pick name of cluster

resource group

There will be stored

resources for PROD

cluster

Then choose available

shared storage

And IP address on

(34)

SQL Clustered Instance setup

Add accounts to

sysadmin role

Optionally you can

make AD group which

will hold all service

accounts of AG

(35)

SQL Clustered Instance setup

Set Service account for

new instance

Finish installation and

(36)

SQL Clustered Instance setup

(37)

SQL Clustered Instance setup

Choose existing FCI

Then confirm network

interface used for

communication

Fill password for service

(38)

SQL Clustered Instance setup

After successful

installation you

will be asked for

SP

Patch the server

(39)

AlwaysOn

(40)

Enable AlwaysON on PROD FCI

Via Configuration Manager

Or Powershell

(41)

DR Cluster Installation

(42)

DR Cluster Installation

Some resources cannot be used on those

DR nodes

Such as:

Shared drives

PROD LAN

(43)

DR Cluster Installation

 Remove Cluster Votes for offsite nodes.

 Also remove nodes from possible &

preferred owners on PROD FCI resources

 We need to separate

(44)

AlwaysOn

(45)

DR Cluster Installation

Install second FCI

 This will be our DR site

Pick features and

name

NW name

 Hostname of instance

ID Instance

 Depends on Type

Pick name of cluster

resource group

(46)

DR Cluster Installation

Then choose available

shared storage

And IP address on

which DR instance

will listen

Set Service account

(47)

DR Cluster Installation

 TEMPDB on Local Disk

 Enables use of local storage for

TEMPDB

 Can use solid state storage to

improve performance of TEMPDB-heavy workloads

 Saves money on storage

replication licensing

 Reduces cross-data center storage

replication traffic

(48)

DR Cluster Installation

Same as on PROD

FCI add second

node on DR FCI

(49)

DR Cluster Configuration

Enable AlwaysOn on DR FCI

Remove PROD nodes from possible

(50)

FCI Configuration

If named instance used, set static port on All nodes TCP/IP

(51)

FCI Configuration

Again ensure that correct possible & preferred owners

are set on all resources in cluster application

(52)

AlwaysOn

(53)

AlwaysOn (Availability groups) setup

Databases should be:

 In Full Recovery mode  Full backup must exist

FCI+AG using manual

(54)

Special Case: Automatic Failover for DR

Use of 3rd Data Center and synchronous mode

Primary Data Center Disaster Recovery

Data Center SQL Server

Primary

SQL Server Secondary

Windows Server Failover Cluster

Availability Group

Synchronous File Share

(55)

AlwaysOn (Availability groups) setup

(56)

AlwaysOn (Availability groups) setup

Build your backup strategy for DR

Create AG Listener to provide centralized Client management.

(57)

AlwaysOn (Availability groups) setup

Select Initial

Synchronization method

Don’t forget on AG

prerequisites

(58)

AlwaysOn (Availability groups) setup

Fixing AG prerequisites

In this case folder structure

should be same

Create same folders on

(59)

AlwaysOn (Availability groups) setup

Validation scripting

and process

Logins

Endpoints

 “Hard_endpoint”

Start XE session

 “AlwaysOn_Health”

Create AG

Add Listener

Join Members

Backup/Restore dbs

Wait 5 min to start

(60)

AlwaysOn

(61)

Result: FCI +AG

Three separate FCIs

Production

 Holds PROD SQL service

Disaster Recovery

 Holds DR SQL service

AlwaysOn

 Holds AG Listener

Two separate LANs

Nodes are evenly

(62)

 Disaster = Primary site is down

 Manual Process involved to bring database service online on the DR site  Force Quorum on the secondary in the DR site

 Start-ClusterNode –Name "DRNODE1" –FixQuorum

 If still running Stop-ClusterNode –Name "DRNODE1"

 Execute FORCE SERVICE ALLOW DATA LOSS

 ALTER AVAILABILITY GROUP [AlwaysOnAG] FORCE_FAILOVER_ALLOW_DATA_LOSS;

 Adjust quorum model and/or node votes

(Get-ClusterNode "DRNode1").NodeWeight=1  (Get-ClusterNode "DRNode2").NodeWeight=1  (Get-ClusterNode "PrimaryNode1").NodeWeight=0  (Get-ClusterNode "PrimaryNode2").NodeWeight=0

 Reporting: Get-ClusterNode | fl NodeName, NodeWeight

(63)

 Adjust quorum model node votes to 1 for primary site nodes  Resume synchronization for all paused AlwaysOn databases  Optionally switch to synchronous mode

 To ensure transaction safety  Failback to primary site

 Switch back to asynchronous mode

 Adjust quorum model node votes to 0 for dr site nodes

(64)

How to failover via GUI or TSQL

(65)

SQL Server Failover Cluster Instance DNS Settings

(66)

RegisterAllProvidersIP setting for AG listener

Reducing client recovery latency after failover

 Connection strings that set MultiSubnetFailover to true/yes

AlwaysOn Availability Groups sets the RegisterAllProvidersIP property to 1 in order to reduce re-connection time after a failover

 Clients will connect to all IP addresses simultaneously  Available providers

.NET Framework 3.5 SP1 + connectivity patch

 SQL Native Client 11.0 ODBC

 Microsoft JDBC driver 4.0 for SQL Server

 Legacy strings

 change RegisterAllProvidersIP to 0

 The active IP address (instead of all) is listed in the Client Access Point in the WSFC cluster, reducing latency for legacy clients.

 Change HostRecordTTL to 300

(67)
(68)

Testing MultiSubnetFailover

Same process as ReadOnlyRouting in my previous

(69)
(70)

FCI monitoring

 On SQL 2k8

 ResDLL query @@Servername

 If two times no SQL response = resource down, then failover

 On SQL 2k12

 ResDLL query SP_SERVER_DIAGNOSTICS  Continuous send of data from SQL instance

 Is more precise

 Better configuration of failover policy

(71)

Flexible Failover Policy

FailureConditionLevel (0 to 5)

5 – Failover or restart on any qualified failure

4 – Failover or restart on moderate SQL Server errors 3 – Failover or restart on critical SQL Server errors 2 – Failover or restart on SQL Server unresponsive 1 – Failover or restart on SQL Server down

0 – No Automatic Failover or restart

User sets new Cluster properties

HealthCheckTimeout and FailureConditionLevel

(72)

Flexible Failover Policy

Configuration can be viewed and reconfigured from SQL

server or Cluster resource

 HealthCheckTimeout  Default 60sec  Min 15sec  Speed of Responses  FailureConditionLevel  Default 3

 Failover or restart on critical SQL Server errors

 Should be enough

 User Configurable

 Diagnostics are always captured

 Instance root LOG\%SQLDIAG%

(73)

Managing SQL server diagnostics

(74)

Q&A

Thank You !!

References

Related documents

Application for 5-Day Field Experience: Dual Certification – Special Education PK-8 and Early Childhood Education PK-4. EARLY CHILDHOOD EDUCATION

You create a synthetic workload dataset with an included tool, load it into the instance via a new test database (DS2), and then run an included load testing tool to simulate orders

 VM HA + Single instance SQL Server for pre-production  VM HA + SQL Server Cluster for production (pre-v2012)  VM HA + SQL Server AlwaysOn Availability

Note: As noted previously, if SQL is installed without using this instance name, the Autodesk Vault Server installation will create its own instance of SQL Server called

To verify the failover clustering instance name, start Cluster Administrator, select the cluster group with the SQL Server 2005 resources, and look next to the SQL Network

Components, Click Next to proceed. Reporting Services will not be the part of SQL Failover Cluster.. E).The Instance configuration window will prompt for SQL Server Network

At the SQL Server Instance step of the wizard, choose to install a new instance of SQL Server or use an existing instance (either local or remote). At the Service Settings step of

Scientists can have many incentives to move, citing both salary and career progression, as the quality of their research environment, availability of funding, or the opportunity