DB2 9 DBA certification
exam 731
Database High Availability
Solutions 2010
P.O. Box 200, 5520 AE Eersel, The Netherlands Tel.:(+31) 497-530190, Fax: (+31) 497-530191
E-mail: [email protected]
Disclaimer
The information contained in this presentation is based on techniques, algorithms, and documentation published by the several authors and companies, and in addition is the result of research. It is therefore subject to change at any time without notice or warning.
The information contained in this presentation has not been submitted to any formal tests or review and is distributed on an “As is” basis without any warranty, either expressed or implied. The use of this information or the implementation of any of these techniques is a client
responsibility and depends on the client’s ability to evaluate and integrate them into the client’s operational environment.
While each item may have been reviewed for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere.
Clients attempting to adapt these techniques to their own environments do so at their own risks. Foils, handouts, and additional materials distributed as part of this presentation or seminar
should be reviewed in their entirety.
Note: This presentation gives you an overview of techniques used by database vendors. It can not be used for making company decisions regarding high availability without further studies.
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 3
Trademarks
This presentation contains many trademarks in use by database vendors if we are aware of a trademark we put it in capitals.
Agenda
What is downtime Techniques in use SQL Server Cluster
DB2 Data Sharing / PureScale vs Oracle RAC Wise Words
What is downtime?
Terminology in use
Term
Business Risk
Solution
Data Recovery
Downtime and
Data loss
Redundant data
High
Availability
Downtime
Redundant
system
components
Disaster
Recovery
Permanent Data
loss and
"unable to continue"
Redundant
systems
and facilities
Not investing in hardware, software and knowledgeCopyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 7
About Downtime and Data loss
Permitted downtime?
Acceptable data/transaction loss (if any)?
Mean time to recovery?
• Difference for "normal down" and disaster?
How much damage is done after how much time? $$?
Who is first in case of disaster?
Note:Database uptime ≠ application availability
• Application failures
• Hardware Outages (Power, Network, etc.)
Uptime SLA Downtimeper Year Downtimeper Month 99.9% 8.76 hours 43.8 minutes
99.99% 52.6 minutes 4.38 minutes
99.999% 5.26 minutes 0.438 minutes
What is causing downtime?
Downtime
Unplanned
down
Hardware
Failure
Data
Corruption
Planned
down
Data / Appl
Changes
System
Upgrades
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 9
Unplanned: Hardware Failure
Storage subsystem
• Disk or Controller…• Firmware or driver problem
Network
• Often causing a partial down (difficult to measure)
• Often rely on third party (SLA!)
Server
• Cluster Support:
Oracle RAC / Microsoft MSCS / IBM Sysplex and PureScale • Where is the backup / cluster server?
• Can virtual server be a solution?
Power Outage
• Environment change, too many requests (unstable grid) etc.
• Third party, difficult to SLA
Unplanned down Hardware Failure Data Corruption
Unplanned: Site failure
Site Failure
• Complete Server room down (e.g. fire)
• Can always happen because you depend on external party (e.g. power)
More than just a database problem
• All data and hardware is involved
Can you handle?
• Network changes• Workload (different config)
• Fail-back situation
Isn't there a hidden Single Point of Failure
• E.g. glassfiber back-up in the same bundleUnplanned down Hardware Failure Data Corruption
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 11
Unplanned: Data Corruption
Human error
• Biggest problem difficult because many
scenarios which need different solutions
Logical Corruption
• Difficult to detect (sometimes after years strange data emerges)
• Can you detect the cause (e.g. program) and how much data is affected?
– Special techniques to go back in time and select data again – Can you re-process the data / transactions?
Unplanned down Hardware Failure Data Corruption
Planned: Data / Application changes
Application upgrades
• Schema changes still difficult– Still a market for vendor tools
– DB2 versioning has performance impact
• Running systems need to shutdown in order to reload
– E.g. middleware transaction refresh
• How are we testing / accepting this
– If test fails, how do we undo the change?
Data maintenance
• Offline REORG (sometimes needed)
• Roll-in / Roll-out data
Planned down Data/Appl Changes System Upgrades
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 13
Planned: System Upgrades
Hardware upgrades
• Growth
• End of life cycle / vendor support
• Redundant is not always hot-swap
Software upgrades
• Operating system, middleware, DBMS etc.
– Wise to combine upgrades?
– What if the new combination is not stable
How to respond to vendor patches
• Needed? What if we don't
• Policy? One size fits all?
Planned down Data/Appl Changes System Upgrades
Down vs Solution
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 15
Partial down
Most common down
• System does work but certain function are unavailable
– Examples:
Transactions with certain input abort Certain location cannot connect Single database corrupt
End-user to database goes through many layers
• How to report?
Layered approach can buy to downtime
• Let user work with front-end as if it is real-time• Bring down backend for maintenance and queue requests
– E.g. online banking without full history or real-time transactions
How long does it take to fix it…
0 2 4 6 8 10 Time N ee d ed
Recovery Time
Breakdown of recovery Many problems the analysis takes a (very) long time
• Human errors
• Corruptions
Many companies suffer a knowledge problem
How to fix it
• Creating the scenario
• Testing the scenario?
Speed of recovery itself
• Best parameters
Techniques in use
to minimize downtime
Database Backups
Traditional backup types
• Full backup – Online / Offline (z/OS Sharelevel)
• Incremental & Differential backup
• Include log backup
• Any other than Full backup is substitute for log!
Disk is better than tape
• First backup to disk (separate physical disk volume)
• Detect exceptions encountered during backup
• Verify backup files
• Copy backup files to tape, remote disk or storage manager (TSM)
Data retention policy for backup files
• What are you going to do with these backups?Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 19
Backup Retention Policy
Location of backup files
Duration of retention
Protection of sensitive data!
• Sarbanes/Oxley (SOX)• HIPAA
• Internal policies for data management and protection
Access to backups from offsite data storage
• Often the weakest link in security scenario'sSnapshot technology
Two techniques
• Real disk mirroring• Share disk after snap until an update is done
How useful is the snapped disk
• Was DBMS aware of the snap? Was it up?• Did the DBMS participate in the snap?
Snap can be extremely useful
• As backup• As a fallback (e.g. after failed upgrade)
Be careful
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 21
Rebuild database (restore & rollforward)
DB2 LUW more flexible than z/OS
• Include log with backup• Differential and delta backups
Used for:
• Redirected restore (problem investigation)
• Simple scenario's which allow for down-time
• As a safety net when all else fails
Less safe than you might imagine
• Backup just reads files (does not analyze them)Replication
Create offline solution
• Based on master / slave• Can be a "High Availability" solution
All database support this (sometimes very advanced)
• Often horizontal / vertical segmented
Mostly row based but sometimes optimized
• E.g. MySQL DRBD is like RAID1 over network
Replication does not take care of:
• IP takeover• Heartbeat & automate takeover
• Slave becoming master
• Fail back and resync
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 23
Replication using Log Shipping
DB2 for LUW HADR
• Many limitation compared to DataGuard
Xkoto Gridscale for DB2
• More options than HADR
Data Guard
• Suports both log shipping and SQL shipping
• Very mature / flexible product
DB2 for z/OS Trackersite
• Very basic (really in use?)
Microsoft SQL Server mirror
• Witness server automates the takeover• Xkoto has also gridscale for SQL Server
Clustering solution
Withdraw a "node" from the solution
• Network problem (IP needs re-routed)• Failover might not be active
• Can it be automated?
– Heartbeat or Timeout – Client re-route – Split Brain problem
What happens to running Units Of Work?
• Locked data or other node backout?• Take-over of transaction / restart of transaction / fail transaction
Controlled "failure" for planned down (e.g. upgrade)
Fail back / Insert node into the cluster
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 25
Shared nothing vs shared everything
Shared Everything
• DB2 Data Sharing & PureScale (HA & Perf) • Oracle RAC (HA & Perf)
– Does not share memory, only disk
• Microsoft SQL Server Cluster (HA)
– Based on hardware / operating system solution
• Sybase IQ ??
Shared Nothing
• Microsoft SQL Server mirror (HA)
• Oracle Data Guard (HA)
• DB2 HADR (HA)
• MySQL Cluster (HA) MySQL Replication (Perf)
– Both have many limitations, still large system use them
• Perf: Terradata, Postgres (Greenplum), Netezza , DB2 LUW DPF
High Availabilty Replication v.s. Cluster
Both can have Single Point of Failure (SPF)
Wrong configuration can destroy data
SAN/NAS I/O overhead when shared storage
• With RAID SAN is no longer SPF
• Make sure network to the SAN is not SPF
Replication is easy to break
• Inconsistent data (e.g. middle transaction)
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 27
Split Brain condition
Due to communications failures nodes are separated
• Missing heartbeat is really down or communication failure
If multiple nodes control of the cluster, then it's called a
split-brain condition
• If this happens, then bad things will happen
Special software solutions are needed to 100% secure a
down of the other node(s)
• This software can become Single Point of Failure
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 29 Client PCs Node A Node B Disk cabinet A Disk cabinet B Heartbeat SQL Passive Node Failure Occurs! SCSI Reserve Broken SQL fails over and is available to clients SQL New Reservation Established
SQL Server cluster: Failover clustering
Mirror Principal Witness Log Application SQL Server SQL Server 2 2 4 5 1
Data Log Data
3
>2 >3
Commit
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 31
SQL Server Comparison
Database Mirroring
Scope: user Database
Standard hardware
Very fast failover (seconds)
OS flexible (e.g. 32/64)
Independent storage
Reporting on mirror (Read-Only)
Geographic separation OK
Failover Clustering
Scope: Full instance
Certified hardware
Automatic failover (minutes)
Enterprise OS
Shared storage
Standby not available
Servers co-located (site failure!)
Compare DB2 to Oracle
DB2 Data Sharing / PureScale
Single system image
Dynamic workload management
Software / Hardware Solution
No Single Point of Failure
ORACLE RAC
High Speed inter-system links
Lots of communications
No Global cache
• Cache Fusion / Interconnect
• Passes data around a lot
Extra communication overhead
Data Log Log Log DB2 DB2 DB2
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 33
Components in a RAC Cluster
Global Cache Services (GCS)
• Manage Data Page Synchronization• Sends DATA to other nodes
Global Enqueue Service (GES)
• Manages Global Locks for non-data pagesOracle RAC in action – cache fusion
Example how Oracle RAC moves data around
Instance A Instance B Instance CInstance D
Master Node
8741 1
2 3 Read
into buffer cache 8741
Want to read
data block 8741
Do READ
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 35
Oracle RAC in action – cache fusion
Example how Oracle RAC moves data around
Instance A Instance B Instance C
Instance D Master Node 8741 8741 Want to read data block 8741 1 2 8741 3 Please send 8741 to node C Send 8741
4 data block 8741Received from Node B
Data no longer comes from disk, owners forward them. Multiple copies around
Oracle RAC in action – cache fusion
Example how Oracle RAC moves data around
Instance A Instance C Instance D Master Node 8741 8741 1 3 8741 6 Flush PI data block 8741
4 data block 8741Write 5 8741 Flush PI data block 8741 6 2 Forward 8741 to node B PI = Previous Image
Owners have to write, after write caches are flushed New read requests moves the data again around
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 37
What happens if RAC node fails
Crash recovery by other node
• Freeze GCS, not allowing updates to database anymore
• Data block remaster and recover the pages using redo/undo
• Invalidate blocks recovered
False node failure detection
• Can have split brain problem• Oracle Custer software
– Heartbeat based, Single Point of Failure (Yes, says IBM)
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 39
RAC vs Purescale Scalability
IBM PureScale benchmark:
• 95% scalability 32 members • 81% scalability 112 members
Oracle RAC??
• No figures…
• License forbids publication of measurements
Even Larry admits…
eWEEK (www.eweek.com) 31-Oct-2003:
I make fun of a lot of other databases – all other
databases, in fact, except the mainframe version of
DB2. It's a first-rate piece of technology.
Larry Ellison, Oracle's Founder and CEO
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 41
Wise words…
Prepare for failure… DISASTER WILL HAPPEN
Ensure that no important data is lost
• Think of the different types of unavailability (there is no golden bullet)
Keep It Simple, Stupid (KISS)
Complexity is the enemy of reliability
• Saving on education is like stopping a watch to save time…
Automate as much as possible
• Careful you still understand it, so document it (incl. what if scenario)
Test it! Frequently!! Use good scenarios!
Audit it
• You need a devile's advocate to find the holes…
New technology
Source: channelinsider.com
If you can see it an touch it then it is: physical
If you cannot see it but you can touch it then it is: transparent If you can see it but not touch it then it is: virtual
If you cannot see it nor touch it then… It's gone!
Copyright © KBCE b.v., 2010 – All Intellectual Rights Reserved 43