TOKYO DELPHI
Chapter 15. Tivoli Storage Manager and databases
15.2 Backup techniques
There are many techniques you can use to back up data managed by a RDBMS.
These techniques are, at least at a conceptual level, common to most RDBMSs.
Often, a combination of techniques is used. The techniques we consider are shown in Figure 87 and Figure 88: disk mirroring, database export, offline backup, online backup, full database backup, partial database backup, and log file backup.
Figure 87. Backup techniques to be considered
Figure 88. Other backup techniques
Most DBs support online and offline
Subset of full database
Must ensure logical unit of recovery Must ensure consistent with database
Back up DB changes only Useful if large units of recovery Used with online backups
"Simulated" incremental backup
Tivoli Storage Manager and databases 165 15.2.1 Disk mirroring
Disk mirroring is a useful technique for maximizing the availability of your database. Mirroring is the process of writing the same data to multiple storage devices at the same time. If a media failure occurs, operations are automatically switched to a mirrored copy. Mirroring allows your users to continue working even though a media failure has occurred.
However, mirroring does not remove the need to back up databases. For example, disk mirroring will not allow you to restore a table that has been lost or damaged as result of user error. Also, although disk mirroring dramatically reduces the impact of media failures, there is still a risk of damage to both sides of the mirror.
Another backup technique related to disk mirroring has recently gained
popularity: breaking a disk mirror. This technique breaks the synchronization of one of the disk mirrors and takes a backup copy from that broken mirror. The database is still online and available to users with the remaining mirror or mirrors.
We call this technique "simulated online" because the backup is not really an online backup; the backup is taken from a nonfunctional broken mirror, not a running database.
There are several disadvantages to this technique. The backups and restores are being done without the database backup utilities and therefore are always full backups. Ensuring that all necessary data is included in the backups as well as figuring out what is needed for the database restore and recovery are the
responsibility of the administrator; it is not an automated procedure, and therefore it is more error prone and requires a higher skill level. In addition, the database is not online to users while the mirror is being broken; a quiesce of the file system and application is required before breaking the mirror. The time to re synchronize the broken mirror with the database after the backup can also be quite extensive;
Some customers, for large database systems, have estimated that it would take more than 24 hours to re synchronize the mirrors, which would not be acceptable if you assume a dailybackup strategy.
Customers considering the mirror-breaking approach for database backup are typically customers who do not have sufficient spare CPU cycles to run online database backups, and hence are trying to offload some CPU cycles to the storage subsystem. Serious consideration should be given to solving the processor CPU constraint instead of using the mirror-breaking technique, because of the disadvantages we have discussed above.
15.2.2 Database export
Export and import utilities operate on logical objects as opposed to physical objects. For example, you can use an export command to copy an individual table. At some later time, you might want to restore the table, in which case you would use the import command. Export and import are not designed as backup and restore utilities, but instead for moving data, for example, for workload balancing or migration. Export and import are often not integrated with the database’s logging capability, so it is up to you to institute the proper procedures to ensure database consistency. However, because export is usually the only utility that can access individual tables, you may need to use export if you have a requirement for keeping, for example, the last 30 days of each table.
15.2.3 Offline backup
Offline backup involves shutting the database down before you start the backup and restarting the database after the backup is complete. Offline backups are relatively simple to administer, however, they suffer from the obvious but significant disadvantage that neither users nor batch processes can access the database (read or write operations) while the backup is taking place. You need to schedule sufficient time to perform the backup to ensure that the periods when the database will be unavailable are acceptable to your users. Most databases do not require that you perform offline backups if you perform online backups; online backups (along with the log files) are sufficient to recover the database.
Some RDBMSs provide a "single-user mode" or "quiesced mode." You can think of this as an "almost offline" mode. A database administrator can still use the database, but general users cannot. With some RDBMSs, general users can stay connected to the database but they cannot use it (their transactions are queued).
Backup time is reduced with "almost-offline" mode because a full shutdown and restart of the database is not required.
15.2.4 Online backup
Most RDBMSs allow backups to be performed while the database is started and in use (both reads and writes). Clearly, if a database is being backed up while users are updating it, it is likely that the data backed up will be inconsistent. The RDBMSs that support online backup use log files during the recovery process to recover the database to a fully consistent state. This approach requires that you retain the RDBMS log files and indicate to the RDBMS when you are about to start the backup and when you have completed the backup.
Some RDBMSs allow you to quiesce activity on portions of the database (for example, a particular table space) so that a set of complete tables is temporarily
"frozen" in a consistent state. You then can back up the set of tables that has been "frozen". Once the backup has completed, you can reactivate the table space.
15.2.5 Full database backup
Full database backups involve making copies of all of the data files used to hold user data. In some database products, full database backups also include copies of the data files that hold tables used by the RDBMS itself, RDBMS log files, and any control files and parameter files that the RDBMS uses. Many RDBMSs allow you to perform full database backups when the database is either online or offline.
Some database products provide incremental backup, which only backs up changed database pages or blocks. This type of incremental backup is called a
"true" incremental backup, as opposed to a "simulated" incremental backup (described below as log file backup). The products that support a true incremental backup include Oracle 8, Informix, and a third-party database backup and recovery product, BMC DataTools SQL-BackTrack). Understanding what
incremental backup means for a database is a critical concept, so we will explore that in more detail later.
Tivoli Storage Manager and databases 167 15.2.6 Partial database backup
Many RDBMSs allow partial database backups when the database is online or offline. Partial database backups involve backing up a subset of the full database (such as a tablespace, or data files that make up a tablespace). It is often not the best approach to back up only a subset of a database, because you must ensure that what you back up represents a complete logical unit of recovery from the point of view of the application. Otherwise, you may introduce inconsistencies into the database upon recovery.
15.2.7 Log file backup
For some applications, the units of recovery are too large to be backed up on a daily basis, for example, performing a full daily backup. Sometimes the
constraining factor is the elapsed time that is available (the backup window).
Sometimes the load that the backup would place on the network would have an unacceptable impact on other processes and users.
In such situations it may be possible to capture only the changes to the database by backing up the RDBMS log files. This type of backup is sometimes referred to as an incremental backup (because you are not performing a full daily backup), but it is really a "simulated" incremental backup. Recovery in such situations is achieved by restoring the database from a full database backup (in some
circumstances restoring from a partial backup may be sufficient), restoring the log files, and applying the log files to the restored database (forward recovery). This is a longer recovery process than recovering from a full daily backup or a true incremental backup.
15.2.8 Restore techniques
We have talked in detail about a variety of backup techniques, but we really need to focus on restore techniques as well. As we have discussed, most RDBMS' provide full and partial online backups. Certain types of restores can also often be made while the database is online. Many RDBMSs allow you to restore parts of the database while the rest of the database is online and in use. Typically, these partial online restores are for user data, not system data. You cannot restore an entire database while it is online.
It is also important to distinguish between restoring a database and recovering a database. Restoring a database means bringing back the files that make up the database from your backup system repository. You do not necessarily have an operational database to use at this point. Recovering a database means bringing the restored database to the point of being fully operational, which may, for example, entail restoring log files and performing forward recovery.