Database High Availability. Solutions 2010

(1)

DB2 9 DBA certification

exam 731

Database High Availability

Solutions 2010

P.O. Box 200, 5520 AE Eersel, The Netherlands Tel.:(+31) 497-530190, Fax: (+31) 497-530191

Disclaimer

 The information contained in this presentation is based on techniques, algorithms, and documentation published by the several authors and companies, and in addition is the result of research. It is therefore subject to change at any time without notice or warning.

 The information contained in this presentation has not been submitted to any formal tests or review and is distributed on an “As is” basis without any warranty, either expressed or implied.  The use of this information or the implementation of any of these techniques is a client

responsibility and depends on the client’s ability to evaluate and integrate them into the client’s operational environment.

 While each item may have been reviewed for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere.

 Clients attempting to adapt these techniques to their own environments do so at their own risks.  Foils, handouts, and additional materials distributed as part of this presentation or seminar

should be reviewed in their entirety.

Note: This presentation gives you an overview of techniques used by database vendors. It can not be used for making company decisions regarding high availability without further studies.

(2)

Trademarks

This presentation contains many trademarks in use by database vendors if we are aware of a trademark we put it in capitals.

Agenda

What is downtime Techniques in use SQL Server Cluster

DB2 Data Sharing / PureScale vs Oracle RAC Wise Words

(3)

What is downtime?

Terminology in use

Term

Business Risk

Solution

Data Recovery

Downtime and

_{Data loss}

Redundant data

High

Availability

Downtime

Redundant

system

components

Disaster

Recovery

Permanent Data

loss and

"unable to continue"

Redundant

systems

and facilities

Not investing in hardware, software and knowledge

(4)

About Downtime and Data loss



Permitted downtime?



Acceptable data/transaction loss (if any)?



Mean time to recovery?

• Difference for "normal down" and disaster?



How much damage is done after how much time? $$?



Who is first in case of disaster?

Note:

Database uptime ≠ application availability

• Application failures

• Hardware Outages (Power, Network, etc.)

Uptime SLA Downtime_{per Year} Downtime_{per Month} 99.9% 8.76 hours 43.8 minutes

99.99% 52.6 minutes 4.38 minutes

99.999% 5.26 minutes 0.438 minutes

What is causing downtime?

Downtime

Unplanned

down

Hardware

Failure

Data

Corruption

Planned

down

Data / Appl

Changes

System

Upgrades

(5)

Unplanned: Hardware Failure



Storage subsystem

• Disk or Controller…

• Firmware or driver problem



Network

• Often causing a partial down (difficult to measure)

• Often rely on third party (SLA!)



Server

• Cluster Support:

Oracle RAC / Microsoft MSCS / IBM Sysplex and PureScale • Where is the backup / cluster server?

• Can virtual server be a solution?



Power Outage

• Environment change, too many requests (unstable grid) etc.

• Third party, difficult to SLA

Unplanned down Hardware Failure Data Corruption

Unplanned: Site failure



Site Failure

• Complete Server room down (e.g. fire)

• Can always happen because you depend on external party (e.g. power)



Can you handle?

• Network changes

• Workload (different config)

• Fail-back situation



Isn't there a hidden Single Point of Failure

• E.g. glassfiber back-up in the same bundle

(6)

Unplanned: Data Corruption



Human error

• Biggest problem difficult because many

scenarios which need different solutions



Logical Corruption

• Difficult to detect (sometimes after years strange data emerges)

• Can you detect the cause (e.g. program) and how much data is affected?

– Special techniques to go back in time and select data again – Can you re-process the data / transactions?

Planned: Data / Application changes



Application upgrades

• Schema changes still difficult

– Still a market for vendor tools

– DB2 versioning has performance impact

• Running systems need to shutdown in order to reload

– E.g. middleware transaction refresh

• How are we testing / accepting this

– If test fails, how do we undo the change?



Data maintenance

• Offline REORG (sometimes needed)

• Roll-in / Roll-out data

Planned down Data/Appl Changes System Upgrades

(7)

Planned: System Upgrades



Hardware upgrades

• Growth

• End of life cycle / vendor support

• Redundant is not always hot-swap



Software upgrades

• Operating system, middleware, DBMS etc.

– Wise to combine upgrades?

– What if the new combination is not stable



How to respond to vendor patches

• Needed? What if we don't

• Policy? One size fits all?

Planned down Data/Appl Changes System Upgrades

Down vs Solution

(8)

Partial down



Most common down

• System does work but certain function are unavailable

– Examples:

 Transactions with certain input abort  Certain location cannot connect  Single database corrupt



End-user to database goes through many layers

• How to report?



Layered approach can buy to downtime

• Let user work with front-end as if it is real-time

• Bring down backend for maintenance and queue requests

– E.g. online banking without full history or real-time transactions

How long does it take to fix it…

0 2 4 6 8 10 Time N ee d ed

Recovery Time

Breakdown of recovery

 Many problems the analysis takes a (very) long time

• Human errors

• Corruptions

 Many companies suffer a knowledge problem

 How to fix it

• Creating the scenario

• Testing the scenario?

 Speed of recovery itself

• Best parameters

(9)

Techniques in use

to minimize downtime

Database Backups



Traditional backup types

• Full backup – Online / Offline (z/OS Sharelevel)

• Incremental & Differential backup

• Include log backup

• Any other than Full backup is substitute for log!



Disk is better than tape

• First backup to disk (separate physical disk volume)

• Detect exceptions encountered during backup

• Verify backup files

• Copy backup files to tape, remote disk or storage manager (TSM)



Data retention policy for backup files

• What are you going to do with these backups?

(10)

Backup Retention Policy



Location of backup files



Duration of retention



Protection of sensitive data!

• Sarbanes/Oxley (SOX)

• HIPAA

• Internal policies for data management and protection



Access to backups from offsite data storage

• Often the weakest link in security scenario's

Snapshot technology



Two techniques

• Real disk mirroring

• Share disk after snap until an update is done



How useful is the snapped disk

• Was DBMS aware of the snap? Was it up?

• Did the DBMS participate in the snap?



Snap can be extremely useful

• As backup

• As a fallback (e.g. after failed upgrade)



Be careful

(11)

Rebuild database (restore & rollforward)



DB2 LUW more flexible than z/OS

• Include log with backup

• Differential and delta backups



Used for:

• Redirected restore (problem investigation)

• Simple scenario's which allow for down-time

• As a safety net when all else fails



Less safe than you might imagine

• Backup just reads files (does not analyze them)

Replication



Create offline solution

• Based on master / slave

• Can be a "High Availability" solution



All database support this (sometimes very advanced)

• Often horizontal / vertical segmented



Mostly row based but sometimes optimized

• E.g. MySQL DRBD is like RAID1 over network



Replication does not take care of:

• IP takeover

• Heartbeat & automate takeover

• Slave becoming master

• Fail back and resync

(12)

Replication using Log Shipping



DB2 for LUW HADR

• Many limitation compared to DataGuard



Xkoto Gridscale for DB2

• More options than HADR



Data Guard

• Suports both log shipping and SQL shipping

• Very mature / flexible product



DB2 for z/OS Trackersite

• Very basic (really in use?)



Microsoft SQL Server mirror

• Witness server automates the takeover

• Xkoto has also gridscale for SQL Server

Clustering solution



Withdraw a "node" from the solution

• Network problem (IP needs re-routed)

• Failover might not be active

• Can it be automated?

– Heartbeat or Timeout – Client re-route – Split Brain problem



What happens to running Units Of Work?

• Locked data or other node backout?

• Take-over of transaction / restart of transaction / fail transaction



Controlled "failure" for planned down (e.g. upgrade)



Fail back / Insert node into the cluster

(13)

Shared nothing vs shared everything



Shared Everything

• DB2 Data Sharing & PureScale (HA & Perf) • Oracle RAC (HA & Perf)

– Does not share memory, only disk

• Microsoft SQL Server Cluster (HA)

– Based on hardware / operating system solution

• Sybase IQ ??



Shared Nothing

• Microsoft SQL Server mirror (HA)

• Oracle Data Guard (HA)

• DB2 HADR (HA)

• MySQL Cluster (HA) MySQL Replication (Perf)

– Both have many limitations, still large system use them

• Perf: Terradata, Postgres (Greenplum), Netezza , DB2 LUW DPF

High Availabilty Replication v.s. Cluster



Both can have Single Point of Failure (SPF)



Wrong configuration can destroy data



SAN/NAS I/O overhead when shared storage

• With RAID SAN is no longer SPF

• Make sure network to the SAN is not SPF



Replication is easy to break

• Inconsistent data (e.g. middle transaction)

(14)

Split Brain condition



Due to communications failures nodes are separated

• Missing heartbeat is really down or communication failure



If multiple nodes control of the cluster, then it's called a

split-brain condition

• If this happens, then bad things will happen



Special software solutions are needed to 100% secure a

down of the other node(s)

• This software can become Single Point of Failure

(15)

Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 29 Client PCs Node A Node B Disk cabinet A Disk cabinet B Heartbeat SQL Passive _Node Failure Occurs! SCSI Reserve Broken SQL fails over and is available to clients SQL New Reservation Established

SQL Server cluster: Failover clustering

Mirror Principal Witness Log Application SQL Server SQL Server 2 2 4 5 1

Data Log Data

3

>2 >3

Commit

(16)

SQL Server Comparison

Database Mirroring

 Scope: user Database

 Standard hardware

 Very fast failover (seconds)

 OS flexible (e.g. 32/64)

 Independent storage

 Reporting on mirror (Read-Only)

 Geographic separation OK

Failover Clustering

 Scope: Full instance

 Certified hardware

 Automatic failover (minutes)

 Enterprise OS

 Shared storage

 Standby not available

 Servers co-located (site failure!)

Compare DB2 to Oracle

DB2 Data Sharing / PureScale

 Single system image

 Dynamic workload management

 Software / Hardware Solution

 No Single Point of Failure

ORACLE RAC

 High Speed inter-system links

 Lots of communications

 No Global cache

• Cache Fusion / Interconnect

• Passes data around a lot

 Extra communication overhead

Data Log Log Log DB2 DB2 DB2

(17)

Components in a RAC Cluster



Global Cache Services (GCS)

• Manage Data Page Synchronization

• Sends DATA to other nodes



Global Enqueue Service (GES)

• Manages Global Locks for non-data pages

Oracle RAC in action – cache fusion

Example how Oracle RAC moves data around

Instance A Instance B Instance C

Instance D

Master Node

8741 1

2 3 _Read

into buffer cache 8741

Want to read

data block 8741

Do READ

(18)

Oracle RAC in action – cache fusion

Example how Oracle RAC moves data around

Instance A Instance B Instance C

Instance D Master Node 8741 8741 Want to read data block 8741 1 2 8741 3 Please send 8741 to node C Send 8741

4 data block 8741Received from Node B

Data no longer comes from disk, owners forward them. Multiple copies around

Oracle RAC in action – cache fusion

Example how Oracle RAC moves data around

Instance A Instance C Instance D Master Node 8741 8741 1 3 8741 6 Flush PI data block 8741

4 _{data block 8741}Write 5 8741 Flush PI data block 8741 6 2 Forward 8741 to node B PI = Previous Image

Owners have to write, after write caches are flushed New read requests moves the data again around

(19)

What happens if RAC node fails



Crash recovery by other node

• Freeze GCS, not allowing updates to database anymore

• Data block remaster and recover the pages using redo/undo

• Invalidate blocks recovered



False node failure detection

• Can have split brain problem

• Oracle Custer software

– Heartbeat based, Single Point of Failure (Yes, says IBM)

(20)

RAC vs Purescale Scalability



IBM PureScale benchmark:

• 95% scalability  32 members • 81% scalability  112 members



Oracle RAC??

• No figures…

• License forbids publication of measurements

Even Larry admits…

eWEEK (www.eweek.com) 31-Oct-2003:

I make fun of a lot of other databases – all other

databases, in fact, except the mainframe version of

DB2. It's a first-rate piece of technology.

Larry Ellison, Oracle's Founder and CEO

(21)

Wise words…



Prepare for failure… DISASTER WILL HAPPEN



Ensure that no important data is lost

• Think of the different types of unavailability (there is no golden bullet)



Keep It Simple, Stupid (KISS)



Complexity is the enemy of reliability

• Saving on education is like stopping a watch to save time…



Automate as much as possible

• Careful you still understand it, so document it (incl. what if scenario)



Test it! Frequently!! Use good scenarios!



Audit it

• You need a devile's advocate to find the holes…

New technology

Source: channelinsider.com

If you can see it an touch it then it is: physical

If you cannot see it but you can touch it then it is: transparent If you can see it but not touch it then it is: virtual

If you cannot see it nor touch it then… It's gone!

(22)

Database High Availability. Solutions 2010

DB2 9 DBA certification

exam 731