DB
2
®Data
Recovery
and
High
Availability
Guide
and
Reference
DB2Version9forLinux,UNIX,andWindows
DB
2
®Data
Recovery
and
High
Availability
Guide
and
Reference
DB2Version9forLinux,UNIX,andWindows
Beforeusingthisinformationandtheproductitsupports,besuretoreadthegeneralinformationunderNotices.
EditionNotice
ThisdocumentcontainsproprietaryinformationofIBM.Itisprovidedunderalicenseagreementandisprotected bycopyrightlaw.Theinformationcontainedinthispublicationdoesnotincludeanyproductwarranties,andany statementsprovidedinthismanualshouldnotbeinterpretedassuch.
YoucanorderIBMpublicationsonlineorthroughyourlocalIBMrepresentative. v
Toorderpublicationsonline,gototheIBMPublicationsCenteratwww.ibm.com/shop/publications/order
v
TofindyourlocalIBMrepresentative,gototheIBMDirectoryofWorldwideContactsatwww.ibm.com/
planetwide
ToorderDB2publicationsfromDB2MarketingandSalesintheUnitedStatesorCanada,call1-800-IBM-4YOU (426-4968).
WhenyousendinformationtoIBM,yougrantIBManonexclusiverighttouseordistributetheinformationinany wayitbelievesappropriatewithoutincurringanyobligationtoyou.
Contents
About
this
book
.
.
.
.
.
.
.
.
.
.
. vii
Whoshouldusethisbook . . . vii
Howthisbookisstructured. . . vii
Part
1.
Data
recovery
.
.
.
.
.
.
.
. 1
Chapter
1.
Developing
a
good
backup
and
recovery
strategy
.
.
.
.
.
.
.
.
. 3
Developingabackupandrecoverystrategy . . . . 3
Decidinghowoftentobackup . . . 6
Storageconsiderationsforrecovery. . . 8
Keepingrelateddatatogether. . . 9
Backupandrestoreoperationsbetweendifferent operatingsystemsandhardwareplatforms . . . . 9
Crashrecovery . . . 10
Crashrecovery-details . . . 12
Recoveringdamagedtablespaces . . . 12
Recoveringtablespacesinrecoverabledatabases 12 Recoveringtablespacesinnon-recoverable databases . . . 13
Reducingtheimpactofmediafailure. . . 14
Reducingtheimpactoftransactionfailure . . . 16
Recoveringfromtransactionfailuresina partitioneddatabaseenvironment . . . 16
Recoveringfromthefailureofadatabase partitionserver . . . 19
Recoveringindoubttransactionsonthehost whenDB2ConnecthastheDB2Syncpoint Managerconfigured . . . 20
Recoveringindoubttransactionsonthehost whenDB2ConnectdoesnotusetheDB2 SyncpointManager. . . 21
Disasterrecovery . . . 22
Versionrecovery. . . 23
Rollforwardrecovery . . . 24
Incrementalbackupandrecovery . . . 27
Incrementalbackupandrecovery-details . . . . 28
Restoringfromincrementalbackupimages. . . 28
Limitationstoautomaticincrementalrestore . . 30
Monitoringtheprogressofbackup,restoreand recoveryoperations. . . 32
Understandingrecoverylogs . . . 33
Recoverylogdetails . . . 35
Logmirroring . . . 35
ReducingloggingwiththeNOTLOGGED INITIALLYparameter . . . 36
Configurationparametersfordatabaselogging 37 Configuringdatabaseloggingoptions . . . . 45
Logfilemanagement . . . 46
Logfileallocationandremoval. . . 48
Logfilemanagementthroughlogarchiving . . 49
Logarchivingusingdb2tapemgr . . . 51
Archivinglogfilestotape . . . 52
Blockingtransactionswhenthelogdirectoryfile isfull . . . 53
Ondemandlogarchive . . . 54
Includinglogfileswithabackupimage . . . . 54
Howtopreventlosinglogfiles. . . 56
Understandingtherecoveryhistoryfile . . . 56
Recoveryhistoryfile-garbagecollection . . . . 58
Garbagecollection . . . 58
Understandingtablespacestates . . . 60
Enhancingrecoveryperformance . . . 61
Chapter
2.
Database
backup
.
.
.
.
. 63
Backupoverview . . . 63
Displayingbackupinformation. . . 65
Privileges,authorities,andauthorizationrequiredto usebackup . . . 66
Usingbackup. . . 66
Backinguptotape . . . 68
Backinguptonamedpipes . . . 70
BACKUPDATABASE . . . 71
db2Backup-Backupadatabaseortablespace . . 76
Backupsessions-CLPexamples . . . 84
Optimizingbackupperformance . . . 84
Automaticdatabasebackup . . . 85
Enablingautomaticbackup . . . 86
Compatibilityofonlinebackupandotherutilities 87
Chapter
3.
Database
restore
.
.
.
.
. 89
Restoreoverview . . . 89
Privileges,authorities,andauthorizationrequiredto userestore. . . 90
Usingrestore . . . 90
Usingincrementalrestoreinatestandproduction environment . . . 92
Redefiningtablespacecontainersduringarestore operation(redirectedrestore) . . . 94
Restoringtoanexistingdatabase . . . 95
Restoringtoanewdatabase. . . 96
Redefinetablespacecontainersbyrestoringa databaseusinganautomaticallygeneratedscript. . 97
Performingaredirectedrestoreusingan automaticallygeneratedscript . . . 99
RESTOREDATABASE . . . 100
db2Restore-Restoreadatabaseortablespace . . 115
Restoresessions-CLPexamples . . . 127
Optimizingrestoreperformance . . . 129
Databaserebuild . . . 130
Rebuild-details . . . 134
Choosingatargetimagefordatabaserebuild 134 Restrictionsfordatabaserebuild . . . 137
Rebuildingadatabaseusingselectedtablespace images . . . 137
Rebuildingselectedtablespaces . . . 139
Rebuildingapartitioneddatabase . . . 140
Rebuildandincrementalbackupimages . . . 142
Rebuildandtablespacecontainers . . . 143
Rebuildsessions-CLPexamples . . . 145
Chapter
4.
Rollforward
recovery
.
.
. 155
Rollforwardoverview . . . 155
Privileges,authorities,andauthorizationrequired touserollforward . . . 157
Usingrollforward . . . 157
Rollingforwardchangesinatablespace . . . . 159
Recoveringadroppedtable . . . 163
Recoveringdatawiththeloadcopylocationfile 165 Synchronizingclocksinapartitioneddatabase environment. . . 166
Client/servertimestampconversion. . . 167
ROLLFORWARDDATABASE . . . 168
db2Rollforward-Rollforwardadatabase . . . . 177
Rollforwardsessions-CLPexamples . . . 187
Chapter
5.
Database
recover
.
.
.
.
. 191
Recoveroverview . . . 191
Privileges,authorities,andauthorizationrequired touserecover . . . 192
Usingrecover . . . 192
Client/servertimestampconversion. . . 193
RECOVERDATABASE . . . 193
db2Recover-Restoreandrollforwardadatabase 199
Part
2.
High
availability
.
.
.
.
.
. 205
Chapter
6.
Introducing
high
availability
and
failover
support
.
.
. 207
Highavailability . . . 207
Highavailabilitythroughlogshipping . . . 209
Highavailabilitythroughonlinesplitmirrorand suspendedI/Osupport . . . 210
Onlinesplitmirrorhandling . . . 212
Usingasplitmirrortocloneadatabase . . . 212
Usingasplitmirrorasastandbydatabase . . 213
Usingasplitmirrorasabackupimage. . . . 214
FaultmonitorfacilityforLinuxandUNIX. . . . 215
db2fm-DB2faultmonitor . . . 219
Chapter
7.
High
availability
disaster
recovery
(HADR)
.
.
.
.
.
.
.
.
.
. 221
Highavailabilitydisasterrecoveryoverview . . . 221
Systemrequirementsforhighavailabilitydisaster recovery(HADR) . . . 222
Installationandstoragerequirementsforhigh availabilitydisasterrecovery . . . 224
Restrictionsforhighavailabilitydisasterrecovery (HADR) . . . 226
Standbydatabasestatesinhighavailabilitydisaster recovery(HADR) . . . 226
Synchronizationmodesforhighavailability disasterrecovery(HADR) . . . 229
Replicatedoperationsforhighavailabilitydisaster recovery(HADR) . . . 232
Non-replicatedoperationsforhighavailability disasterrecovery(HADR) . . . 233
Highavailabilitydisasterrecovery(HADR) commandsoverview . . . 234
Highavailabilitydisasterrecovery(HADR) management . . . 237
Initializinghighavailabilitydisasterrecovery (HADR) . . . 238
STARTHADR . . . 240
db2HADRStart-Starthighavailabilitydisaster recovery(HADR)operations . . . 242
Stoppinghighavailabilitydisasterrecovery (HADR) . . . 244
STOPHADR . . . 246
db2HADRStop-Stophighavailabilitydisaster recovery(HADR)operations . . . 247
Databaseconfigurationforhighavailability disasterrecovery(HADR) . . . 249
Databaseactivationanddeactivationinhigh availabilitydisasterrecovery(HADR) . . . 254
Automaticclientrerouteandhighavailability disasterrecovery(HADR) . . . 255
Indexloggingandhighavailabilitydisaster recovery(HADR) . . . 256
Logarchivingconfigurationforhighavailability disasterrecovery(HADR) . . . 257
Clustermanagersandhighavailabilitydisaster recovery(HADR) . . . 258
Switchingdatabaserolesinhighavailability disasterrecovery(HADR) . . . 259
PerforminganHADRfailoveroperation . . . . 261
TAKEOVERHADR . . . 264
db2HADRTakeover-Instructadatabasetotake overasthehighavailabilitydisasterrecovery (HADR)primarydatabase . . . 266
Reintegratingadatabaseafteratakeoveroperation 268 Performingarollingupgradeinahighavailability disasterrecoveryenvironment. . . 269
Monitoringhighavailabilitydisasterrecovery (HADR) . . . 270
Highavailabilitydisasterrecovery(HADR) performance. . . 271
Chapter
8.
Cluster
support
on
AIX
.
. 275
HighAvailabilityClusterMulti-Processingsupport 275
Chapter
9.
Cluster
support
on
the
Windows
operating
system
.
.
.
.
. 281
MicrosoftClusterServersupport . . . 281
Chapter
10.
Cluster
support
for
the
Solaris
Operating
Environment
.
.
.
. 285
ClustersupportfortheSolarisoperatingsystem 285 SunCluster3.0support . . . 287
VERITASClusterServersupport . . . 290
Part
3.
Appendixes
.
.
.
.
.
.
.
. 295
Appendix
A.
How
to
read
the
syntax
diagrams
.
.
.
.
.
.
.
.
.
.
.
.
. 297
Appendix
B.
Warning,
error
and
completion
messages
.
.
.
.
.
.
.
. 301
Appendix
C.
Additional
DB2
commands
.
.
.
.
.
.
.
.
.
.
.
. 303
Systemcommands. . . 303
db2adutl-ManagingDB2objectswithinTSM 303 db2ckbkp-Checkbackup . . . 310
db2ckrst-Checkincrementalrestoreimage sequence . . . 314
db2flsn-Findlogsequencenumber . . . . 316
db2inidb-Initializeamirroreddatabase . . . 317
db2mscs-SetupWindowsfailoverutility . . 319
db2rfpen-Resetrollforwardpendingstate . . 322
CLPcommands . . . 323 ARCHIVELOG . . . 323 INITIALIZETAPE . . . 325 LISTHISTORY . . . 326 PRUNEHISTORY/LOGFILE . . . 329 REWINDTAPE . . . 330
SETTAPEPOSITION . . . 331
UPDATEHISTORY . . . 332
Appendix
D.
Additional
APIs
and
associated
data
structures
.
.
.
.
. 335
db2ArchiveLog-Archivetheactivelogfile . . . 335
db2HistoryCloseScan-Endthehistoryfilescan 337 db2HistoryGetEntry-Getthenextentryinthe historyfile . . . 338
db2HistoryOpenScan-Startahistoryfilescan . . 341
db2HistoryUpdate-Updateahistoryfileentry . . 345
db2Prune-Deletethehistoryfileentriesorlog filesfromtheactivelogpath . . . 348
db2ReadLogNoConn-Readthedatabaselogs withoutadatabaseconnection. . . 350
db2ReadLogNoConnInit-Initializereadingthe databaselogswithoutadatabaseconnection . . . 353
db2ReadLogNoConnTerm-Terminatereadingthe databaselogswithoutadatabaseconnection . . . 355
db2ReadLog-Extractslogrecords . . . 356
db2HistoryData . . . 360
SQLU_LSN . . . 366
Appendix
E.
Recovery
sample
programs
.
.
.
.
.
.
.
.
.
.
.
.
. 367
SampleprogramswithembeddedSQL. . . 367
Appendix
F.
Cross-node
recovery
with
the
db2adutl
command
and
the
logarchopt1
and
vendoropt
database
configuration
parameters
.
.
.
.
.
. 397
Appendix
G.
Tivoli
Storage
Manager
403
ConfiguringaTivoliStorageManagerclient . . . 403ConsiderationsforusingTivoliStorageManager 404
Appendix
H.
Tivoli
Space
Manager
Hierarchical
Storage
Management
support
for
partitioned
tables
.
.
.
. 407
Appendix
I.
User
exit
for
database
recovery
.
.
.
.
.
.
.
.
.
.
.
.
. 409
Sampleuserexitprograms . . . 409
Callingformat . . . 410
Errorhandling . . . 411
Appendix
J.
Backup
and
restore
APIs
for
vendor
products
.
.
.
.
.
.
.
. 413
DB2APIsforbackupandrestoretostorage managers. . . 413
Operationaloverview . . . 413
Operationalhintsandtips . . . 418
Invokingabackuporarestoreoperationusing vendorproducts . . . 419
sqluvint-Initializeandlinktoavendordevice 421 sqluvget-Readdatafromavendordevice . . . 426
sqluvput-Writedatatoavendordevice . . . . 427
sqluvend-Unlinkavendordeviceandreleaseits resources . . . 429
sqluvdel-Deletecommittedsession. . . 430
db2VendorQueryApiVersion-Getthesupported levelofthevendorstorageAPI . . . 431
db2VendorGetNextObj-Getnextobjectondevice 432 DB2_info . . . 434 Vendor_info . . . 437 Init_input . . . 438 Init_output . . . 439 Data . . . 440 Return_code . . . 440
APIsforcompressedbackups . . . 441
DB2APIsforusingcompressionwithbackup andrestoreoperations . . . 441
Appendix
K.
DB2
Database
technical
information
.
.
.
.
.
.
.
.
.
.
.
. 445
OverviewoftheDB2technicalinformation . . . 445
Documentationfeedback . . . 445
DB2technicallibraryinhardcopyorPDFformat 446 OrderingprintedDB2books . . . 448
DisplayingSQLstatehelpfromthecommandline processor. . . 449
AccessingdifferentversionsoftheDB2 InformationCenter . . . 450
Displayingtopicsinyourpreferredlanguageinthe DB2InformationCenter. . . 450
UpdatingtheDB2InformationCenterinstalledon yourcomputerorintranetserver. . . 451
DB2tutorials . . . 453
DB2troubleshootinginformation. . . 453
TermsandConditions . . . 454
Appendix
L.
Notices
.
.
.
.
.
.
.
. 455
Trademarks . . . 457
Index
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. 459
About
this
book
Thisbookprovidesdetailedinformationabout,and showsyouhow touse,the IBM DB2databasebackup, restore,and recoveryutilities. Thebookalsoexplains theimportanceofhighavailability,anddescribes DB2failoversupportonseveral platforms.
Who
should
use
this
book
Thismanualisfordatabaseadministrators,application programmers,andother DB2 databaseuserswhoareresponsiblefor,or whowanttounderstand, backup, restore,andrecovery operationsonDB2databasesystems.
Itisassumedthatyouarefamiliar withtheDB2databasesystem,Structured Query Language(SQL), andwiththeoperatingsystemenvironment inwhichthe DB2 databasesystemisrunning.Thismanualdoesnotcontaininstructions for installingDB2,whichdependonyour operatingsystem.
How
this
book
is
structured
Thefollowingtopics arecovered:
Data Recovery
Chapter1,“Developinga goodbackupandrecovery strategy”
Discussesfactors toconsiderwhenchoosing databaseandtable space recovery methods,includingbackingupandrestoringadatabaseortable space, andusingrollforwardrecovery.
Chapter2,“Databasebackup”
DescribestheDB2backuputility,usedtocreatebackupcopiesofa database ortablespaces.
Chapter3,“Databaserestore”
DescribestheDB2restoreutility,usedtorecreatedamagedorcorrupted databasesortablespacesthatwerepreviouslybackedup.
Chapter4,“Rollforward recovery”
DescribestheDB2rollforwardutility, usedtorecovera databaseby applyingtransactionsthatwererecordedinthedatabaserecoverylogfiles.
Chapter5,“Databaserecover”
DescribestheDB2recoverutility,whichperformstheneccessaryrestore and rollforwardoperationstorecovera databasetoa specifiedtime,based oninformationfoundintherecoveryhistoryfile.
HighAvailability
Chapter6,“Introducinghighavailabilityandfailoversupport”
Presents anoverviewofthehighavailabilityfailoversupportthatis providedbyDB2.
Chapter7,“High availabilitydisasterrecovery(HADR)”
Discussestheconceptsandproceduresrequiredtosetupandmanage a highavailability disasterrecovery (HADR)environment.
Chapter8,“Cluster supportonAIX”
DiscussesDB2 supportforhighavailabilityfailoverrecovery onAIX, whichiscurrentlyimplementedthroughtheEnhancedScalability(ES) feature ofHighAvailabilityClusterMulti-processing (HACMP)forAIX.
Chapter9,“Cluster supportontheWindowsoperatingsystem”
DiscussesDB2 supportforhighavailabilityfailoverrecovery onWindows® operatingsystemswhichiscurrentlyimplementedthrough Microsoft® Cluster Server(MSCS).
Chapter10,“ClustersupportfortheSolarisOperatingEnvironment”
DiscussesDB2 supportforhighavailabilityfailoverrecovery intheSolaris OperatingEnvironment,whichiscurrentlyimplementedthrough Sun Cluster 3.0(SC3.0)or VeritasClusterServer (VCS).
Appendixes
AppendixA, “Howto readthesyntaxdiagrams”
Explainstheconventionsusedinsyntax diagrams.
AppendixB,“Warning,errorandcompletionmessages”
Provides informationaboutinterpretingmessages generatedbythe database managerwhenawarningorerrorconditionhasbeendetected.
AppendixC,“AdditionalDB2commands”
Describesrecovery-relatedDB2 commands.
AppendixD, “AdditionalAPIsandassociateddatastructures”
Describesrecovery-relatedAPIsandtheirdatastructures.
AppendixE,“Recoverysampleprograms”
Provides thecode listingforasampleprogramcontainingrecovery-related DB2APIsandembedded SQLcalls,andinformationonhowtousethem.
AppendixG, “TivoliStorageManager”
Provides informationabouttheTivoliStorageManager(TSM)product, whichyoucanusetomanagedatabase ortablespacebackupoperations.
AppendixH, “TivoliSpaceManagerHierarchicalStorageManagementsupport forpartitioned tables”
Provides informationaboutusingtheTivoli®SpaceManager Hierarchical StorageManager(HSM) clientprogramautomaticallymigrateseligiblefiles tosecondarystorage tomaintainspecificlevelsoffreespaceonlocalfile systems.
AppendixF, “Cross-noderecoverywiththedb2adutlcommand andthe logarchopt1andvendoroptdatabaseconfigurationparameters”
Provides examplesthatshowhowtoperformcross-node recoveryusing thedb2adutlcommand,andthelogarchopt1andvendoropt database configurationparameters.
AppendixI,“Userexitfordatabaserecovery”
Discusseshow userexitprogramscanbe usedwith databaselogfiles,and describes somesampleuser exitprograms.
AppendixJ,“BackupandrestoreAPIsforvendor products”
Describesthefunctionand useofAPIsthatenableDB2tointerfacewith othervendorsoftware.
Chapter
1.
Developing
a
good
backup
and
recovery
strategy
Thissectiondiscussesfactorstoconsider whenchoosingdatabase andtablespace recovery methods,includingbackingupandrestoringadatabaseor tablespace, and usingrollforward recovery.
Thefollowingtopics arecovered:
v “Developinga backupandrecoverystrategy”
v “Decidinghow oftentobackup”onpage6
v “Storageconsiderationsforrecovery”onpage8
v “Keepingrelateddatatogether”onpage9
v “Backupandrestoreoperationsbetweendifferentoperatingsystemsand
hardwareplatforms”onpage9 v “Crashrecovery”onpage10
v “Disasterrecovery”onpage22
v “Versionrecovery”onpage23
v “Rollforwardrecovery”onpage24
v “Incrementalbackupandrecovery”onpage27
v “Monitoringtheprogressofbackup, restoreandrecovery operations”onpage
32
v “Understandingrecoverylogs”onpage33
v “Understandingtherecovery historyfile”onpage56
v “Understandingtablespacestates”onpage60
v “Enhancingrecoveryperformance”onpage61
Developing
a
backup
and
recovery
strategy
Adatabasecanbecomeunusablebecauseof hardwareorsoftwarefailure,orboth. Youmight,at onetimeoranother,encounterstorage problems,power
interruptions,or applicationfailures,andeachfailurescenariorequiresadifferent recovery action.Protectyour dataagainstthepossibilityoflossbyhavingawell rehearsedrecovery strategyinplace.Some ofthequestionsthatyoushouldanswer whendevelopingyour recoverystrategyare:
v Willthedatabaseberecoverable?
v Howmuchtimecanbespentrecoveringthedatabase?
v Howmuchtimewillpass betweenbackupoperations?
v Howmuchstorage spacecanbeallocatedforbackupcopiesandarchivedlogs?
v Willtablespacelevelbackupsbe sufficient,or willfulldatabasebackupsbe
necessary?
v ShouldIconfigureastandby system,eithermanuallyorthroughhigh
availabilitydisasterrecovery (HADR)?
Adatabaserecovery strategyshouldensurethatallinformationisavailablewhen it isrequiredfordatabaserecovery.Itshouldinclude aregularschedulefortaking database backupsand,inthecaseofpartitioneddatabase environments,include backupswhen thesystem isscaled(whendatabase partitionservers ornodesare addedor dropped).Youroverallstrategyshouldalsoinclude proceduresfor
recoveringcommandscripts,applications,user-definedfunctions(UDFs),stored procedurecodeinoperatingsystemlibraries,and loadcopies.
Different recoverymethodsarediscussedinthesectionsthatfollow,andyouwill discover whichrecoverymethodisbestsuitedtoyourbusinessenvironment.
The conceptofadatabase backupisthesameasanyotherdatabackup: takinga copy ofthedataand thenstoringitonadifferentmediumincaseoffailureor damagetotheoriginal.Thesimplestcaseofabackupinvolvesshuttingdownthe database toensurethatnofurthertransactionsoccur,andthen simplybacking it up.Youcanthenrecreatethedatabaseifitbecomesdamagedorcorruptedinsome way.
The recreationofthedatabaseiscalledrecovery. Versionrecoveryistherestorationof a previousversionofthedatabase,usinganimagethatwas createdduringa backupoperation.Rollforwardrecoveryisthereapplicationof transactionsrecorded in thedatabase logfiles afteradatabase oratablespacebackupimagehasbeen restored.
Crashrecoveryistheautomaticrecovery ofthedatabaseifa failureoccursbeforeall of thechanges thatarepartofoneormore unitsofwork(transactions)are
completedand committed.Thisisdonebyrollingbackincompletetransactions and completingcommittedtransactionsthatwere stillinmemorywhenthecrash occurred.
Recoverylogfilesand therecovery historyfileare createdautomaticallywhena database iscreated (Figure1 onpage5). Theselogfilesareimportantifyouneed torecover datathatislostordamaged.
Eachdatabase includesrecoverylogs,whichareusedtorecoverfromapplicationor system errors.Incombinationwith thedatabase backups,theyareusedtorecover theconsistencyof thedatabase rightuptothepointintimewhentheerror occurred.
The recoveryhistory filecontainsa summaryofthebackupinformationthatcanbe usedtodeterminerecoveryoptions, ifallorpart ofthedatabase mustbe
recovered toagivenpointintime. Itisusedtotrackrecovery-relatedeventssuch asbackupandrestoreoperations,amongothers.Thisfileislocatedinthedatabase directory.
The tablespacechangehistoryfile,whichisalsolocatedinthedatabasedirectory, contains informationthatcanbe usedtodeterminewhichlogfilesarerequiredfor therecovery ofaparticulartablespace.
Youcannotdirectlymodify therecovery historyfileorthetablespacechange historyfile;however,youcandeleteentriesfromthefilesusingthePRUNE HISTORY command.Youcanalso usetherec_his_retentndatabaseconfiguration parameter tospecifythenumber ofdaysthatthesehistoryfileswillberetained.
Data thatiseasilyrecreatedcanbe storedinanon-recoverable database.This includesdatafromanoutsidesourcethatisusedforread-onlyapplications,and tablesthatarenotoftenupdated,forwhichthesmallamountofloggingdoesnot justify theaddedcomplexityof managinglogfilesand rollingforwardafter a restoreoperation.Ifboththelogarchmeth1and logarchmeth2databaseconfiguration parametersareset to“OFF”thenthedatabaseisNon-recoverable.Thismeansthat theonlylogsthatarekeptare thoserequiredfor crashrecovery.Theselogsare known asactivelogs,andtheycontaincurrenttransactiondata.Versionrecovery usingoffline backupsistheprimarymeansofrecovery foranon-recoverable database.(An offlinebackupmeansthatnootherapplication canusethedatabase whenthebackupoperationisin progress.)Sucha databasecanonlybe restored offline. Itisrestoredtothestateitwasinwhenthebackupimagewas takenand rollforward recoveryisnotsupported.
Data thatcannotbeeasilyrecreatedshouldbestored inarecoverable database. Thisincludesdatawhosesourceisdestroyedafter thedataisloaded,datathatis manuallyenteredintotables, anddatathatismodifiedbyapplicationprogramsor usersafter itisloadedintothedatabase.Recoverabledatabaseshavethelogarchmeth1
orlogarchmeth2 databaseconfigurationparametersset toavalue otherthan“OFF”. Active logsarestillavailable forcrashrecovery,but youalsohavethearchivedlogs,
whichcontaincommittedtransactiondata.Sucha databasecanonlybe restored offline. Itisrestoredtothestateitwasinwhenthebackupimagewas taken. However, withrollforward recovery,youcanrollthedatabaseforward(thatis, past thetimewhenthebackupimagewastaken)byusingtheactiveandarchivedlogs toeither aspecific pointintime,ortotheendoftheactivelogs.
Recoverabledatabasebackupoperationscanbe performedeitheroffline oronline
(online meaningthatotherapplicationscanconnecttothedatabaseduring the backupoperation).Onlinetablespacerestoreandrollforwardoperationsare supportedonlyifthedatabaseisrecoverable.Ifthedatabaseisnon-recoverable, database restoreandrollforwardoperationsmust beperformedoffline.Duringan online backupoperation,rollforward recoveryensuresthatalltablechanges are capturedand reappliedifthatbackupisrestored.
Ifyouhavearecoverable database,youcanbackup,restore,androllindividual table spacesforward,ratherthantheentiredatabase.Whenyoubackupa table
Equivalent physical object
Database
Table space change history file Recovery history file Recovery log files System Instance Database object or concept
spaceonline,itisstillavailable foruse,andsimultaneous updatesarerecordedin thelogs.Whenyouperform anonlinerestoreorrollforward operationona table space, thetablespaceitselfisnotavailableforuseuntiltheoperationcompletes, but usersare notprevented fromaccessingtablesinothertablespaces.
Automated backupoperations:
Sinceitcanbe time-consumingtodeterminewhetherand whentorun maintenanceactivitiessuchasbackupoperations,youcanusetheConfigure Automatic Maintenancewizardtodothisforyou. Withautomaticmaintenance, youspecifyyourmaintenanceobjectives,includingwhenautomaticmaintenance canrun. DB2®thenusesthese objectivestodetermineif themaintenanceactivities
need tobedoneand thenrunsonlytherequiredmaintenanceactivitiesduring the next availablemaintenancewindow(a user-definedtimeperiodfortherunningof automatic maintenanceactivities).
Note: Youcanstillperformmanualbackupoperationswhenautomatic
maintenanceisconfigured.DB2willonlyperform automaticbackup operationsiftheyare required.
Related concepts:
v “Crashrecovery”onpage10
v “Highavailabilitydisasterrecoveryoverview”onpage221
v “Rollforwardrecovery”onpage24
v “Versionrecovery”onpage23
Related reference:
v “logarchmeth1-Primarylogarchivemethod configurationparameter”in PerformanceGuide
v “rec_his_retentn-Recoveryhistoryretentionperiodconfigurationparameter”in PerformanceGuide
Deciding
how
often
to
back
up
Yourrecovery planshouldallowforregularlyscheduledbackupoperations, because backingupadatabaserequirestimeandsystemresources.Yourplan might includea combinationoffulldatabasebackupsandincremental backup operations.
Youshouldtakefulldatabasebackupsregularly,evenifyouarchivethelogs (whichallows forrollforwardrecovery).Torecovera database,youcanuseeither a fulldatabasebackupimagethatcontainsall ofthetable spacebackupimages,or youcanrebuildthedatabase usingselectedtablespaceimages.Tablespacebackup imagesare alsousefulforrecoveringfroman isolateddisk failureoranapplication error. Inpartitioneddatabaseenvironments,youonlyneedtorestorethetable spacesthatresideondatabasepartitionsthathavefailed.Youdonotneedto restoreall ofthetablespacesorallofthedatabasepartitions.
Althoughfulldatabasebackupsare nolongerrequiredfordatabaserecoverynow thatyoucanrebuildadatabasefromtable spaceimages,itisstillgoodpracticeto occasionallytakea fullbackupofyour database.
Youshouldalsoconsider notoverwritingbackupimagesandlogs,savingatleast twofulldatabasebackupimagesandtheirassociatedlogsasanextraprecaution.
Iftheamount oftimeneededtoapplyarchivedlogswhenrecoveringand rollinga veryactivedatabaseforwardisamajor concern,consider thecostofbackingup thedatabasemore frequently.Thisreducesthenumberofarchivedlogsyouneed toapply whenrollingforward.
Youcaninitiatea backupoperationwhilethedatabaseiseitheronline oroffline.If it isonline, otherapplicationsorprocessescanconnecttothedatabase,aswell as read andmodifydatawhilethebackupoperationisrunning. Ifthebackup operationisrunningoffline,otherapplicationscannotconnecttothedatabase.
Toreducetheamountof timethatthedatabaseisnotavailable, considerusing online backupoperations.Onlinebackupoperationsare supportedonlyif
rollforward recoveryisenabled.Ifrollforwardrecovery isenabled andyouhavea completesetof recoverylogs,youcanrestorethedatabase,shouldtheneed arise. Youcanonlyuseanonline backupimageforrecovery ifyouhavethelogsthat spanthetimeduring whichthebackupoperationwasrunning.
Offlinebackupoperationsare fasterthanonline backupoperations,sincethereis nocontention forthedatafiles.
Thebackuputilityletsyoubackupselectedtablespaces.IfyouuseDMStable spaces,youcanstore differenttypesofdataintheirown tablespacestoreducethe timerequiredforbackupoperations.Youcankeeptabledatainonetable space, longfieldandLOBdatainanothertablespace, andindexesinyetanothertable space. Ifyoudo thisanda diskfailureoccurs,itislikelytoaffectonlyone ofthe table spaces.Restoringorrollingforwardoneofthesetablespaceswilltakeless timethanitwouldhavetakentorestoreasingletablespacecontainingall ofthe data.
Youcanalsosave timebytakingbackupsofdifferenttablespacesatdifferent times,aslongasthechangestothemare notthesame.So,iflongfieldorLOB dataisnotchangedasfrequentlyastheotherdata,youcanbackupthesetable spacesless frequently.IflongfieldandLOBdataarenotrequiredforrecovery,you canalso considernotbackingupthetablespacethatcontains thatdata.IftheLOB datacanbe reproducedfroma separatesource,choosetheNOTLOGGED option whencreating oralteringatable toincludeLOBcolumns.
Note: Considerthefollowingifyoukeepyour longfielddata,LOBdata,and
indexesin separatetablespaces,butdonotbackthemuptogether:Ifyou backupatablespacethatdoesnotcontainallof thetabledata,youcannot performpoint-in-timerollforwardrecovery onthattablespace.Allthetable spacesthatcontainanytype ofdataforatablemust berolledforward simultaneouslytothesame pointintime.
Ifyoureorganizea table,youshouldbackuptheaffectedtablespacesafter the operationcompletes.Ifyouhavetorestorethetablespaces,youwillnothaveto rollforwardthroughthedatareorganization.
Thetimerequiredtorecoveradatabaseismadeupoftwoparts:thetimerequired tocompletetherestorationof thebackup; and,ifthedatabaseisenabledfor forwardrecovery,thetimerequiredtoapplythelogsduringtherollforward operation.Whenformulatingarecovery plan,youshouldtaketheserecovery costs and theirimpactonyour businessoperationsintoaccount. Testingyour overall recovery planwillassistyouindeterminingwhetherthetimerequiredtorecover thedatabaseisreasonablegivenyourbusinessrequirements.Following eachtest, youmightwanttoincreasethefrequencywith whichyoutakeabackup. If
rollforward recoveryispartofyour strategy,thiswillreducethenumber oflogs thatare archivedbetweenbackupsand,asa result,reducethetimerequiredtoroll thedatabaseforwardafter arestoreoperation.
Related concepts:
v “Developinga backupandrecoverystrategy”onpage3
v “Incrementalbackupandrecovery”onpage27
Related reference:
v “Configurationparametersfordatabase logging”onpage37
v AppendixI,“Userexit fordatabaserecovery,”onpage409
Storage
considerations
for
recovery
Whendecidingwhichrecoverymethodtouse, considerthestoragespacerequired.
The versionrecoverymethodrequiresspacetoholdthebackupcopyofthe database andtherestoreddatabase.Therollforwardrecovery methodrequires spacetoholdthebackupcopyofthedatabaseortablespaces,therestored database,and thearchiveddatabaselogs.
Ifa tablecontainslongfieldorlargeobject(LOB)columns,youshouldconsider placingthis dataintoaseparate tablespace.Thiswillaffectyourstorage space considerations, aswell asaffectyour planforrecovery.Withaseparate tablespace for longfieldandLOBdata,andknowingthetimerequiredtobackuplongfield and LOBdata,youmightdecidetousearecovery planthatonlyoccasionally savesa backupof thistablespace.Youcanalsochoose,whencreatingoralteringa table toincludeLOBcolumns,nottologchangestothose columns.Thiswill reducethesizeoftherequiredlogspaceandthecorresponding logarchive space.
Topreventmedia failurefromdestroyingadatabase andyourabilitytorestoreit, keepthedatabasebackup, thedatabase logs,and thedatabaseitselfondifferent devices.For thisreason,it ishighlyrecommendedthatyouusethenewlogpath
configurationparametertoputdatabase logsonaseparatedeviceoncethe database iscreated.
The databaselogscanuseupa largeamountofstorage.Ifyouplantousethe rollforward recoverymethod,youmustdecidehowtomanage thearchivedlogs. Yourchoicesarethefollowing:
v SpecifyalogarchivingmethodusingtheLOGARCHMETH1or
LOGARCHMETH2configurationparameters.
v Manuallycopythelogs toastoragedeviceordirectory otherthanthedatabase
logpath directoryaftertheyare nolongerintheactivesetoflogs.
v Useauser exitprogramtocopytheselogstoanotherstoragedeviceinyour
environment.
Related concepts:
v “Logfilemanagementthroughlogarchiving”onpage49
Related reference:
v “Configurationparametersfordatabase logging”onpage37
v “logarchmeth1-Primarylogarchivemethod configurationparameter”in PerformanceGuide | | | | |
v “logarchmeth2-Secondarylogarchive methodconfigurationparameter”in PerformanceGuide
Keeping
related
data
together
Intheprocessofdesigning yourdatabase,youwilldevelop anunderstandingof therelationshipsthatexistbetweentables.Theserelationshipscanbeexpressed: v Attheapplicationlevel, whentransactionsupdatemorethanonetable
v Atthedatabaselevel,wherereferentialintegrityexistsbetweentables,orwhere
triggersononetableaffectanothertable.
Youshouldconsiderthese relationshipswhendevelopingarecovery plan.Youwill wanttobackuprelatedsetsofdatatogether.Suchsetscanbeestablished ateither thetablespaceorthedatabaselevel.By keepingrelatedsetsofdatatogether,you canrecovertoa pointwhereallofthedataisconsistent.Thisisespecially
importantifyouwanttobe abletoperformpoint-in-timerollforward recoveryon table spaces.
Related concepts:
v “Crashrecovery”onpage10
v “Developinga backupandrecoverystrategy”onpage3
v “Disasterrecovery”onpage22
v “Versionrecovery”onpage23
Related tasks:
v “Usingrollforward”onpage157
Backup
and
restore
operations
between
different
operating
systems
and
hardware
platforms
DB2 databasesystemssupport somebackupand restoreoperationsbetween differentoperatingsystemsandhardwareplatforms.
ThesupportedplatformsforDB2 backupand restoreoperationscanbegrouped into oneofthreefamilies:
v Big-endianLinux
®and UNIX®
v Little-endianLinuxandUNIX
v Windows
Adatabasebackupfromoneplatformfamilycanberestoredonanysystemwithin thesameplatformfamily.For Windowsoperatingsystems,youcanrestorea database createdonDB2UDBV8ona DB2Version9database system.ForLinux and UNIXoperatingsystems,aslongastheendianness(bigendianorlittle endian)ofthebackupandrestoreplatformsisthesame,youcanrestorebackups thatwere producedonDB2UDBV8onDB2Version9.
Thefollowingtableshows eachoftheLinuxand UNIXplatformsDB2supports and indicateswhethertheplatformsarebigendianorlittleendian:
Table1.EndiannessofsupportedLinuxandUNIXoperatingsystemsDB2supports
Platform Endianness
AIX®
Table1.EndiannessofsupportedLinuxandUNIXoperatingsystemsDB2 supports (continued)
Platform Endianness
HP-UX bigendian
HPonIPF bigendian
SolarisOperatingenvironment bigendian LinuxonzSeries bigendian LinuxonPowerPC bigendian LinuxonIA-64 littleendian LinuxonAMD64andIntel®
EM64T littleendian 32-bitLinuxonx86 littleendian
The targetsystemmust havethesame(orlater)versionoftheDB2database productasthesourcesystem.Youcannotrestoreabackupcreatedononeversion of thedatabase producttoa systemrunningan earlierversionofthedatabase product. Forexample,youcanrestorea V8backuponaV9database system,but youcannotrestoreaV9backuponaV8 databasesystem.
Insituationswherecertainbackupandrestorecombinationsare notallowed,you canmovetablesbetweenDB2databasesusingothermethods:
v db2movecommand
v Exportutility followedbytheimportortheloadutilities
Related tasks:
v “Usingbackup”onpage66
v “Usingrestore”onpage90
Related reference:
v “db2move-Databasemovementtoolcommand”inCommand Reference
v “EXPORTcommand”inCommandReference
v “IMPORTCommand”inCommandReference
v “LOADcommand”inCommand Reference
Crash
recovery
Transactions(orunits ofwork)againsta databasecanbeinterruptedunexpectedly. Ifa failureoccursbeforeallofthechangesthatare partoftheunitofworkare completedand committed,thedatabaseisleftinaninconsistentand unusable state. Crashrecoveryistheprocessbywhichthedatabase ismovedbacktoa consistent andusablestate.Thisisdone byrollingbackincompletetransactions and completingcommittedtransactionsthatwere stillinmemorywhenthecrash occurred (Figure2 onpage11).Whena databaseisina consistentandusablestate, it hasattainedwhatisknownasa″pointof consistency″.
Atransaction failureresultsfroma severeerrororcondition thatcausesthe database orthedatabase managertoendabnormally.Partially completedunits of work,orUOWthathavenotbeenflushedtodiskatthetimeoffailure,leavethe database inaninconsistentstate. Followingatransactionfailure,thedatabase must be recovered.Conditionsthatcanresultintransactionfailureinclude:
v Apowerfailureonthemachine,causingthedatabase managerandthedatabase
partitionsonittogodown
v Ahardwarefailuresuchasmemorycorruption,ordisk,CPU,ornetwork
failure.
v Aserious operatingsystemerrorthatcausesDB2togodown
v Anapplicationterminatingabnormally.
Ifyouwanttherollbackof incompleteunits ofworktobedoneautomaticallyby thedatabasemanager,enabletheautomaticrestart (autorestart)database
configurationparameterbysettingittoON.(Thisisthedefaultvalue.) Ifyoudo notwantautomaticrestartbehavior,set theautorestart databaseconfiguration parameter toOFF.Asa result,youwillneedtoissuetheRESTARTDATABASE commandwhena databasefailureoccurs.IfthedatabaseI/Owassuspended before thecrashoccurred,youmust specifytheWRITE RESUMEoption ofthe RESTART DATABASEcommandinorderforthecrashrecoverytocontinue. The administrationnotificationlogrecordswhenthedatabase restartoperationbegins.
Ifcrash recoveryisappliedtoa databasethatisenabledforforwardrecovery(that is, thelogarchmeth1configurationparameterisnotset toOFF),andanerroroccurs during crashrecoverythatisattributabletoanindividual tablespace,thattable spacewillbetakenoffline,andcannotbeaccesseduntilit isrepaired. Crash recovery continues.Atthecompletionofcrashrecovery,theothertablespacesin thedatabasewillbeaccessible,andconnectionstothedatabase canbeestablished. However, ifthetablespacethatistakenofflineisthetablespacethatcontains the system catalogs,itmust berepairedbeforeanyconnectionswillbepermitted.
Related reference:
v “autorestart-Autorestartenable configurationparameter”inPerformanceGuide
v “logarchmeth1-Primarylogarchivemethod configurationparameter”in PerformanceGuide 1 2 3 4 rollback rollback rollback rollback Units of work Crash All four rolled back
TIME
Crash
recovery
-
details
Recovering
damaged
table
spaces
Adamagedtablespacehasoneormorecontainersthatcannotbeaccessed.Thisis oftencausedbymediaproblemsthatareeither permanent(forexample,abad disk),ortemporary(forexample,anoffline disk,oranunmountedfilesystem).
Ifthedamagedtablespaceisthesystem catalogtablespace,thedatabasecannot be restarted.Ifthecontainerproblemscannotbefixed leavingtheoriginaldata intact,theonlyavailable optionsare:
v Torestorethedatabase
v Torestorethecatalog tablespace.
Notes:
1. Tablespacerestoreisonlyvalidforrecoverable databases,becausethe
database mustberolledforward.
2. Ifyourestorethecatalogtablespace,youmustperform arollforward
operationtotheendoflogs.
Ifthedamagedtablespaceisnotthesystem catalogtablespace,DB2 attemptsto make asmuchofthedatabaseavailable aspossible.
Ifthedamagedtablespaceistheonlytemporarytablespace,youshouldcreatea new temporarytablespaceassoon asaconnectiontothedatabasecanbe made. Once created,thenew temporarytablespacecanbe used,andnormaldatabase operations requiringa temporarytablespacecanresume.Youcan,ifyouwant, droptheoffline temporarytablespace.Therearespecial considerationsfortable reorganizationusingasystemtemporary tablespace:
v Ifthedatabase orthedatabase managerconfigurationparameterindexrecisset
toRESTART,allinvalidindexesmust berebuiltduringdatabase activation;this includesindexesfromareorganizationthatcrashedduring thebuildphase. v Ifthereare incompletereorganizationrequestsinadamagedtemporarytable
space,youmighthavetosettheindexrecconfigurationparameter toACCESSto avoidrestart failures.
Related tasks:
v “Recoveringtablespacesinnon-recoverabledatabases”onpage13
v “Recoveringtablespacesinrecoverable databases”onpage12
Related reference:
v “RESTARTDATABASEcommand”inCommand Reference
v “RESTOREDATABASE”onpage100
Recovering
table
spaces
in
recoverable
databases
Whencrashrecovery isnecessary,thedamagedtablespacesistakenoffline andis notaccessible.Itisplacedinroll-forwardpending state.Iftherearenoadditional problems,a restartoperationwillsucceed inbringingthedatabase onlineeven with thedamagedtablespace.Onceonline, thedamagedtablespaceisunusable, but therestof thedatabase isusable. Tofixthedamagedtablespaceandmake it useable,followtheprocedurebelow.
Procedure:
Tomake thedamagedtablespaceuseable,useoneoftheproceduresthatfollow: v Method1
1. Fix thedamagedcontainers withoutlosingtheoriginaldata.
2. Completeatablespacerollforwardoperationtotheendofthelogs.
Note: Therollforwardoperationwillfirstattempttobringthetablespace
fromofflinetonormalstate. v Method2
1. Fix thedamagedcontainers withorwithoutlosingtheoriginaldata.
2. Perform atablespacerestoreoperation.
3. Completeatablespacerollforwardoperationtotheendofthelogsortoa
point-in-time.
Related concepts:
v “Recoveringdamagedtablespaces” onpage12
v “Understandingtablespacestates”onpage60
Related tasks:
v “Recoveringtablespacesinnon-recoverabledatabases”onpage13
Related reference:
v “RESTARTDATABASEcommand”inCommandReference
Recovering
table
spaces
in
non-recoverable
databases
Whencrashrecovery isneeded,but therearedamagedtable spaces,youcanonly successfullyrestartthedatabaseif thedamagedtablespacesare dropped.Ina non-recoverabledatabase,thelogsnecessarytorecover thedamagedtablespaces are notretained. Therefore,theonlyvalidaction againstsuchtablespacesisto dropthem.
Procedure:
Torestart adatabasewith damagedtablespaces:
1. Invokeanunqualifiedrestartdatabase operation.Itwillsucceedifthereareno
damagedtablespaces.Ifitfails(SQL0290N),lookin theadministration notificationlogfilefor acompletelistof tablespacesthatarecurrently damaged.
2. Ifyouarewillingtodropallofthedamagedtablespaces,initiateanother
restartdatabase operation,listing allofthedamagedtablespacesunderthe DROPPENDINGTABLESPACESoption.Ifa damagedtablespaceisincluded intheDROPPENDINGTABLESPACESlist,thetable spaceisputintodrop pending state,andyoumustdropthetablespaceaftertherecoveryoperation iscomplete.
Therestartoperationcontinueswithoutrecoveringthespecifiedtablespaces.If a damagedtablespaceisnotincludedintheDROPPENDINGTABLESPACES list,therestartdatabaseoperationfailswith SQL0290N.
Note: Includingatable spacenameintheDROPPENDINGTABLESPACESlist
doesnotmeanthatthetable spacewillbeindroppending state.Itwill beplacedinthisstateonlyif thetablespaceisfoundtobedamaged duringtherestartoperation.
3. Iftherestartdatabaseoperationissuccessful,invoketheLIST TABLESPACES
commandtofindoutwhichtablespacesareindroppending state.
4. IssueDROPTABLESPACEstatementstodropeachofthetablespacesthatare
indroppending state.Onceyouhavedonethis,youwillbeable toreclaimthe spacethatthedamagedtable spaceswereusingorrecreatethetablespaces. 5. Ifyouareunwillingtodropandlosethedatainthedamagedtable spaces,you
can:
v Fix thedamagedcontainers (withoutlosingtheoriginaldata).
v Reissue theRESTART DATABASEcommand.
v Perform adatabaserestoreoperation.
Related concepts:
v “Recoveringdamagedtablespaces” onpage12
v “Understandingtablespacestates”onpage60
Related tasks:
v “Recoveringtablespacesinrecoverable databases”onpage12
Related reference:
v “RESTARTDATABASEcommand”inCommand Reference
Reducing
the
impact
of
media
failure
Toreducetheprobabilityofmedia failure,and tosimplifyrecoveryfromthistype of failure:
v Mirrororduplicatethedisksthatholdthedataandlogs forimportant
databases.
v UseaRedundantArrayofIndependentDisks(RAID)configuration, suchas
RAIDLevel5.
v Ina partitioneddatabaseenvironment,setupa rigorousprocedureforhandling
thedataand thelogs onthecatalog partition.Becausethisdatabasepartitionis criticalformaintainingthedatabase:
– Ensurethatit residesonareliabledisk
– Duplicate it
– Makefrequentbackups
– Donotputuserdataonit.
Protecting
against
disk
failure
Ifyouareconcerned aboutthepossibilityofdamageddataorlogsduetoa disk crash,consider theuseofsomeformofdisk faulttolerance.Generally,this is accomplished throughtheuseof adiskarray,whichisaset ofdisks.
Adiskarray issometimesreferredtosimplyasa RAID(RedundantArrayof IndependentDisks).Diskarrayscanalsobe providedthroughsoftwareat the operatingsystem orapplicationlevel.Thepoint ofdistinctionbetweenhardware and softwarediskarrays ishow CPUprocessingofinput/output(I/O)requestsis
handled. Forhardwarediskarrays,I/Oactivity ismanagedbydiskcontrollers; for software diskarrays,thisisdonebytheoperatingsystem oranapplication.
Hardware diskarrays: Inahardwarediskarray, multipledisksare usedand
managedbyadiskcontroller,completewith itsown CPU.Allof thelogicrequired tomanage thedisksformingthisarrayiscontainedonthediskcontroller;
therefore, thisimplementationisoperatingsystem-independent.
Thereare severaltypesofRAID architecture,differinginfunction and performance, butonlyRAIDlevel1and level5 arecommonlyusedtoday.
RAID level1 isalso knownasdiskmirroringorduplexing.Diskmirroringcopies data(a completefile)fromonedisktoa seconddisk,usingasingledisk controller.
Disk duplexingissimilartodiskmirroring, exceptthatdisksareattachedtoa second diskcontroller(liketwoSCSI adapters).Dataprotectionisgood:Either disk canfail,and dataisstill accessiblefromtheotherdisk.Withdiskduplexing,a disk controllercanalso failwithoutcompromising dataprotection. Performanceis good,butthisimplementationrequirestwicetheusualnumber ofdisks.
RAID level5 involvesdataandparitystripingbysectors,acrossalldisks. Parityis interleavedwith data,ratherthanbeingstoredona dedicateddrive.Data
protectionisgood:Ifanydiskfails,thedatacanstillbeaccessedbyusing informationfromtheotherdisks,alongwith thestripedparityinformation.Read performance isgood,butwriteperformanceisnot.ARAIDlevel5 configuration requiresa minimumofthreeidenticaldisks. Theamountofdiskspacerequiredfor overheadvarieswiththenumberof disksinthearray.Inthecaseofa RAIDlevel 5 configurationwith5disks, thespaceoverheadis20percent.
WhenusingaRAID(but nota RAIDlevel0)diskarray,a faileddiskwillnot preventyoufromaccessingdataonthearray.Whenhot-pluggableor
hot-swappable disksare usedinthearray,a replacementdiskcanbeswapped with thefaileddiskwhilethearrayisinuse.WithRAIDlevel5,iftwodisksfail at thesametime, alldataislost(buttheprobabilityofsimultaneousdisk failuresis verysmall).
Youmight considerusingaRAID level1 hardwarediskarray orasoftwaredisk array foryourlogs,becausethisprovides recoverabilitytothepoint offailure,and offers goodwriteperformance,whichisimportantforlogs.Todo this,usethe
mirrorlogpathconfigurationparametertospecifyamirrorlogpathona RAIDlevel 1 filesystem.Incaseswhere reliabilityiscritical(because timecannotbelost recoveringdatafollowingadiskfailure),andwriteperformance isnotsocritical, consider usingaRAIDlevel5hardwarediskarray.Alternatively,ifwrite
performance iscritical, andthecostofadditionaldiskspaceisnotsignificant, consider aRAID level1 hardwarediskarray foryourdata,aswellasforyour logs.
For detailedinformationabouttheavailableRAID levels,visit thefollowingweb site: http://www.acnc.com/04_01_00.html
Softwarediskarrays: Asoftware diskarrayaccomplishesmuchthesameasdoes
a hardwarediskarray,butdisktrafficismanagedbyeithertheoperatingsystem, orbyanapplication programrunningontheserver.Likeotherprograms,the software arraymustcompeteforCPU andsystemresources.Thisisnota good option foraCPU-constrainedsystem,andit shouldbe rememberedthatoverall disk arrayperformanceisdependentontheserver’s CPUloadand capacity.
Atypicalsoftwaredisk arrayprovidesdiskmirroring.Althoughredundantdisks are required,asoftware diskarrayiscomparativelyinexpensivetoimplement, because costlydiskcontrollers arenotrequired.
CAUTION:
Havingtheoperatingsystem bootdriveinthediskarraypreventsyoursystem from startingifthatdrivefails.Ifthedrivefailsbeforethediskarrayis
running, thediskarraycannotallowaccesstothedrive.Abootdriveshouldbe separatefromthediskarray.
Related concepts:
v “Backupoverview”onpage63
v “Developinga backupandrecoverystrategy”onpage3
Reducing
the
impact
of
transaction
failure
Toreducetheimpactof atransactionfailure,tryto ensure: v Anuninterruptedpowersupplyfor eachDB2server
v Adequatediskspacefordatabase logsonalldatabasepartitions
v Reliablecommunicationlinksamongthedatabasepartitionserversina
partitioneddatabaseenvironment
v Synchronizationofthesystemclocksin apartitioneddatabaseenvironment.
Related concepts:
v “Synchronizingclocks inapartitioneddatabaseenvironment”onpage166
Recovering
from
transaction
failures
in
a
partitioned
database
environment
Ifa transactionfailureoccursina partitioneddatabaseenvironment,database recovery isusuallynecessaryonboththefailed databasepartitionserverandany otherdatabase partitionserverthatwasparticipating inthetransaction:
v Crashrecoveryoccursonthefaileddatabasepartitionserveraftertheantecedent
conditioniscorrected.
v Databasepartitionfailurerecoveryontheother(stillactive)databasepartition
serversoccursimmediatelyafter thefailurehasbeendetected.
Ina partitioneddatabaseenvironment,thedatabasepartitionserveronwhichan application issubmittedisthecoordinator partition,andthefirst agentthatworks for theapplication isthecoordinatoragent. Thecoordinatoragentisresponsible for distributingworktootherdatabasepartitionservers,anditkeepstrack of whichones areinvolvedinthetransaction.Whentheapplication issuesa COMMITstatementfora transaction,thecoordinatoragentcommitsthe transactionbyusingthetwo-phasecommit protocol.Duringthefirst phase,the coordinator partitiondistributesa PREPARErequesttoalltheotherdatabase partitionserversthatare participatinginthetransaction.Theseserversthen respond withoneofthefollowing:
READ-ONLY Nodatachangeoccurredatthis server
YES Datachangeoccurred atthisserver
NO Becauseofanerror, theserverisnotpreparedto commit
Ifoneof theservers respondswith aNO,thetransactionisrolledback.Otherwise, thecoordinatorpartitionbeginsthesecondphase.
Duringthesecondphase, thecoordinator partitionwrites aCOMMITlogrecord, then distributesaCOMMITrequestto alltheservers thatrespondedwith aYES. Afteralltheotherdatabasepartitionservershavecommitted,theysend an acknowledgmentoftheCOMMITtothecoordinatorpartition.The transactionis completewhenthecoordinatoragenthasreceivedallCOMMITacknowledgments fromall theparticipating servers.Atthispoint,thecoordinatoragentwritesa FORGETlogrecord.
Transaction
failure
recovery
on
an
active
database
partition
server
Ifanydatabasepartitionserverdetectsthatanotherserverisdown,allworkthat isassociatedwith thefailed databasepartitionserverisstopped:
v Ifthestill activedatabasepartitionserveristhecoordinatorpartitionforan
application,andtheapplicationwas runningonthefaileddatabasepartition server(andnotreadytoCOMMIT),thecoordinatoragentisinterruptedtodo failurerecovery.IfthecoordinatoragentisinthesecondphaseofCOMMIT processing,SQL0279Nisreturnedto theapplication,whichinturnlosesits databaseconnection.Otherwise,thecoordinatoragentdistributesa ROLLBACK requesttoallotherservers participatinginthetransaction,andSQL1229Nis returnedtotheapplication.
v Ifthefailed databasepartitionserverwasthecoordinatorpartitionforthe
application,agentsthatarestillworkingfortheapplication ontheactiveservers areinterruptedtodofailurerecovery.Thecurrenttransactionisrolledback locallyoneachserver,unlessithasbeen preparedandiswaitingforthe transactionoutcome.Inthissituation,thetransactionisleftin doubtonthe activedatabasepartitionservers,andthecoordinatorpartitionisnotawareof this(becauseitisnotavailable).
v Iftheapplication connectedtothefaileddatabase partitionserver(before it
failed),butneitherthelocaldatabasepartitionservernorthefailed database partitionserveristhecoordinatorpartition,agentsworking forthisapplication areinterrupted.Thecoordinatorpartitionwilleithersend aROLLBACK ora disconnectmessagetotheotherdatabasepartitionservers.Thetransactionwill onlybe indoubtondatabasepartitionserversthatarestill activeifthe
coordinatorpartitionreturnsSQL0279.
Anyprocess(suchasanagentordeadlockdetector)thatattemptstosenda requesttothefailedserverisinformedthatit cannotsend therequest.
Transaction
failure
recovery
on
the
failed
database
partition
server
Ifthetransactionfailurecausesthedatabase managertoendabnormally,youcan issue thedb2startcommandwiththeRESTARToptiontorestart thedatabase manager oncethedatabasepartitionhasbeenrestarted.Ifyoucannotrestart the database partition,youcanissue db2starttorestart thedatabase managerona differentdatabasepartition.
Ifthedatabasemanager endsabnormally,database partitionsontheservercanbe left inaninconsistentstate. Tomakethemusable,crashrecovery canbetriggered ona databasepartitionserver:
v Explicitly,throughtheRESTARTDATABASEcommand
v Implicitly,through aCONNECTrequestwhentheautorestartdatabase
Crash recoveryreappliesthelogrecordsintheactivelogfiles toensurethatthe effectsofall completetransactionsareinthedatabase.Afterthechangeshavebeen reapplied,all uncommittedtransactionsarerolledbacklocally,exceptforindoubt transactions.Therearetwotypesof indoubttransactioninapartitioneddatabase environment:
v Ona databasepartitionserverthatisnotthecoordinatorpartition,a transaction
isindoubt ifitispreparedbutnotyet committed.
v Onthecoordinatorpartition, atransactionisindoubt ifitiscommittedbutnot
yetloggedascomplete(thatis,theFORGETrecordisnotyetwritten).This situationoccurswhenthecoordinator agenthasnotreceivedalltheCOMMIT acknowledgmentsfromalltheserversthatworkedfortheapplication.
Crash recoveryattemptstoresolvealltheindoubttransactionsbydoingoneofthe following.Theactionthatistakendependsonwhetherthedatabasepartition serverwas thecoordinator partitionforan application:
v Iftheserverthatrestartedisnotthecoordinatorpartitionfortheapplication,it
sendsa querymessagetothecoordinator agenttodiscovertheoutcomeof the transaction.
v Iftheserverthatrestartedisthecoordinatorpartitionfortheapplication, it
sendsa messagetoalltheotheragents(subordinateagents)thatthecoordinator agentisstillwaitingforCOMMITacknowledgments.
Itispossiblethatcrashrecovery mightnotbeabletoresolve alltheindoubt transactions(forexample,someofthedatabasepartitionserversmight notbe available). Inthis situation,theSQLwarningmessageSQL1061Wisreturned. Becauseindoubttransactionsholdresources,suchaslocksand activelogspace,it ispossibletoget toapointwherenochanges canbe madetothedatabasebecause theactivelogspaceisbeingheldupbyindoubttransactions.For thisreason,you shoulddeterminewhetherindoubttransactionsremainaftercrashrecovery,and recover alldatabasepartitionserversthatarerequiredtoresolvetheindoubt transactionsasquickly aspossible.
Ifone ormoreserversthatarerequiredtoresolveanindoubttransactioncannotbe recovered intime,andaccessisrequiredtodatabasepartitions onotherservers, youcanmanuallyresolvetheindoubttransactionbymaking anheuristicdecision. YoucanusetheLIST INDOUBTTRANSACTIONScommandtoquery,commit, and rollbacktheindoubttransactionontheserver.
Note: TheLISTINDOUBTTRANSACTIONScommandisalsousedina
distributedtransactionenvironment.Todistinguishbetweenthetwotypes ofindoubttransactions,theoriginatorfieldintheoutputthatisreturnedby theLISTINDOUBTTRANSACTIONScommanddisplaysoneofthe following:
v DB2EnterpriseServerEdition,whichindicatesthatthetransaction
originatedinapartitioneddatabaseenvironment.
v XA,whichindicatesthatthetransactionoriginatedina distributed
environment.
Identifying
the
failed
database
partition
server
Whenadatabase partitionserverfails, theapplication willtypicallyreceiveoneof thefollowingSQLCODEs.Themethodfordetecting whichdatabasemanager failed dependsontheSQLCODEreceived:
SQL0279N
ThisSQLCODEisreceivedwhena databasepartitionserverinvolvedina transactionisterminatedduring COMMITprocessing.
SQL1224N
ThisSQLCODEisreceivedwhenthedatabasepartitionserverthatfailedis thecoordinatorpartitionforthetransaction.
SQL1229N
ThisSQLCODEisreceivedwhenthedatabasepartitionserverthatfailedis notthecoordinatorpartitionforthetransaction.
Determining whichdatabasepartitionserverfailedisatwo-stepprocess.The SQLCAassociatedwithSQLCODESQL1229Ncontainsthenodenumberof the serverthatdetectedtheerrorinthesixtharraypositionofthesqlerrdfield.(The node numberthatiswrittenfortheservercorrespondstothenodenumberinthe db2nodes.cfgfile.)Onthedatabasepartitionserverthatdetectstheerror,a messagethatindicatesthenodenumber ofthefailedserveriswrittentothe administrationnotificationlog.
Note: Ifmultiplelogicalnodesare beingusedona processor,thefailureofone
logicalnodecancauseotherlogicalnodesonthesameprocessortofail.
Related concepts:
v “Errorrecoveryduring two-phasecommit”inAdministrationGuide:Planning
v “Two-phasecommit”inAdministrationGuide: Planning
Related tasks:
v “Resolvingindoubttransactionsmanually” inAdministrationGuide:Planning
Related reference:
v “db2start-StartDB2 command”inCommandReference
v “LISTINDOUBTTRANSACTIONScommand”inCommandReference
Recovering
from
the
failure
of
a
database
partition
server
Procedure:
Torecover fromthefailureofa databasepartitionserver: 1. Correcttheproblemthatcausedthefailure.
2. Restartthedatabasemanagerbyissuingthedb2startcommandfromany
databasepartitionserver.
3. Restartthedatabasebyissuing theRESTARTDATABASEcommandonthe
failed databasepartitionserverorservers.
Related concepts:
v “Recoveringfromtransactionfailuresinapartitioneddatabase environment”on
page16
Related reference:
v “db2start-StartDB2 command”inCommandReference
Recovering
indoubt
transactions
on
the
host
when
DB2
Connect
has
the
DB2
Syncpoint
Manager
configured
Ifyour applicationhasaccessedahostoriSeries™databaseserverduring a transaction, thereare somedifferences inhowindoubttransactionsare recovered.
ToaccesshostoriSeriesdatabase servers,DB2Connectisused. Therecoverysteps differ ifDB2ConnecthastheDB2 SyncpointManagerconfigured.
Procedures:
The recoveryofindoubttransactionsathostoriSeriesserversisnormally
performedautomaticallybytheTransactionManager(TM)andtheDB2Syncpoint Manager (SPM).Anindoubttransactionat ahostoriSeriesserverdoesnothold anyresourcesatthelocalDB2location,butdoesholdresourcesat thehost or iSeries serveraslongasthetransactionisindoubtat thatlocation.Ifthe
administratorofthehostoriSeriesserverdeterminesthata heuristicdecisionmust be made,thentheadministratormightcontact thelocalDB2database
administrator(forexamplevia telephone)todeterminewhethertocommitorroll backthetransactionat thehostoriSeriesserver.Ifthis occurs,theLISTDRDA INDOUBTTRANSACTIONScommandcanbe usedtodeterminethestateofthe transactionattheDB2Connectinstance.Thefollowingstepscanbeusedasa guidelineformostsituationsinvolvinganSNAcommunicationsenvironment. 1. ConnecttotheSPMasshownbelow:
db2 => connect to db2spm
Database Connection Information
Database product = SPM0500
SQL authorization ID = CRUS
Local database alias = DB2SPM
2. IssuetheLISTDRDAINDOUBTTRANSACTIONScommandtodisplaythe
indoubttransactionsknowntotheSPM.Theexamplebelowshowsone indoubttransactionknown totheSPM.Thedb_name isthelocalaliasforthe hostoriSeriesserver.Thepartner_luisthefullyqualified lunameofthehost or iSeriesserver.Thisprovides thebestidentificationofthehostoriSeriesserver, andshouldbe providedbythecallerfromthehostoriSeriesserver.Theluwid providesa uniqueidentifierforatransactionandisavailableatall hostsand iSeriesservers.Ifthetransactioninquestionisdisplayed,then theuow_status fieldcanbeusedtodeterminetheoutcomeofthetransactionif thevalue isC (commit) orR(rollback).IfyouissuetheLISTDRDAINDOUBT
TRANSACTIONScommandwith theWITHPROMPTINGparameter,youcan commit,rollback,orforget thetransactioninteractively.
db2 => list drda indoubt transactions DRDA Indoubt Transactions:
1.db_name: DBAS3 db_alias: DBAS3 role: AR
uow_status: C partner_status: I partner_lu: USIBMSY.SY12DQA
corr_tok: USIBMST.STB3327L
luwid: USIBMST.STB3327.305DFDA5DC00.0001
xid: 53514C2000000017 00000000544D4442 0000000000305DFD A63055E962000000
00035F
3. Ifanindoubttransactionfor thepartner_luandfor theluwid isnotdisplayed,
orif theLISTDRDAINDOUBTTRANSACTIONScommandreturnsasfollows:
db2 => list drda indoubt transactions