Creating backups of data is the process of creating redundant copies as insurance against data loss. Losing data is undesirable if it has value to the owner. Some data, such as configuration files, can be recreated, but this takes time and human resources. Mission-critical data requires redundancy so that the business can recover quickly after an incident.
To create an efficient backup strategy, one must know the causes of data loss, so that the strategy can incorporate countermeasures against them. There are several causes; in his 2003 study, David Smith identifies six and estimates 4.6 million cases of data loss each year, based on data from Safeware, The Insurance Agency, Inc., and ONTRACK Data International, Inc.[24]. Hardware failure is the largest cause of data loss (40%), affecting approximately 2.5% of all computers annually. Human error accounts for 29% of the cases and affects 1.8% of all computers annually. The remaining four causes are software corruption, theft, computer viruses and hardware destruction. Examples are disk failure∗, spilling coffee on a computer†, a bug in the file system‡ and flood§.
Æleen Frisch[25] presents three ‘universally accepted’ truths about backup: 1) the system administrator is responsible for effective backup; 2) effective backup requires planning; and 3) the most effective strategies do not look at individual computers, but at networks. The first axiom implies centralisation and the need for a strategy. The second and third axioms require us to analyse the site we are working on and to answer questions such as ‘What data needs to be backed up?’; ‘How often does the data change?’; and ‘How might the data be lost?’ Answering these questions has led to a list of so-called ‘best practices’, where the key practices for effective backup are to centralise, automate, verify, and test restoration frequently.
∗ Hardware failure. † Human error. ‡ Software corruption. § Hardware destruction.
Backup systems incorporate many techniques to ensure high performance and space efficiency. Chervenak et al. list the following three choices in their survey of backup techniques: 1) full or incremental backup; 2) device-based or file-based (physical or logical); and 3) snapshots. Options concerning business value are: 1) on-line or off-line; 2) parallelism; 3) compression; 4) restoration; 5) media management∗; and 6) disaster recovery[26]. All three backup techniques are discussed below, together with on-line backups.
In the context of full and incremental backup, a full backup simply copies the entire file system to the backup device. The whole file system, or individual files, can later be restored. However, copying the whole file system is slow and consumes much space on the backup medium, which is especially wasteful if the number of changed files is low. A faster technique is incremental backup, which copies only the files that have changed since the ‘last’ backup. ‘Last’ is relative because it is configurable by defining backup ‘levels’. If a level 0 backup is a full backup, a level 1 backup copies the files which have changed since the full backup. A level 2 backup copies the files which have changed since the last level 1 backup, and so on. Incremental backups are faster and consume less space than full backups, but restoring is slower, as each level has to be applied in turn. The higher the level, the more has to be traced back. This problem can be mitigated by using more complex rotation schemes such as the ‘Towers of Hanoi’. The Towers of Hanoi scheme is based on the puzzle of the same name, and yields current, week-old, month-old, or even year-old copies of the data, without backing up changed files more than two times[27].
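The level mechanism can be illustrated with a small sketch. It assumes, as the classic `dump` utility does, that a level n backup copies everything modified since the most recent backup of a lower level; the function names, the integer timestamps and the `history` structure are hypothetical. The `hanoi_set` helper shows one common way to derive a Towers of Hanoi rotation (the number of trailing zero bits in the day number), which is an assumption here and not taken from the cited source.

```python
def baseline(level, history):
    """Timestamp of the most recent backup with a lower level, or None."""
    earlier = [when for lvl, when in history if lvl < level]
    return max(earlier) if earlier else None


def files_to_copy(mtimes, level, history):
    """Paths a backup at `level` would copy, given file modification times."""
    since = baseline(level, history)
    if since is None:  # level 0, or no earlier backup: copy everything
        return sorted(mtimes)
    return sorted(path for path, mtime in mtimes.items() if mtime > since)


def hanoi_set(day, sets):
    """Media set for a 1-based day: trailing zero bits of `day`, capped."""
    trailing_zeros = (day & -day).bit_length() - 1
    return min(trailing_zeros, sets - 1)


# Hypothetical data: modification times and two earlier backups.
mtimes = {"a.txt": 5, "b.txt": 12, "c.txt": 25}
history = [(0, 10), (1, 20)]  # level 0 at t=10, level 1 at t=20

print(files_to_copy(mtimes, 0, history))  # full backup: every file
print(files_to_copy(mtimes, 1, history))  # changed since the level 0 backup
print(files_to_copy(mtimes, 2, history))  # changed since the level 1 backup
print([hanoi_set(day, 3) for day in range(1, 9)])
```

With three media sets, set 0 is reused every second day, set 1 every fourth day and set 2 every eighth day, which is how the rotation keeps copies of increasing age.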
When doing a logical backup, the file system is read and the meta-data interpreted so that files are copied to the medium. The problem with this is that the physical blocks of a file might not be stored contiguously on the disk, which requires more seek time than reading the blocks in contiguous order. Physical backup systems duplicate the physical medium onto the backup medium. This is much faster and requires less CPU time than logical backup, but restoration has been thought to be slower because the files might not be stored contiguously on the backup medium[26].
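The difference between the two approaches can be sketched as follows. This is a simplified illustration, not how any particular backup product works: `logical_backup` goes through the file system interface and copies file by file, while `physical_backup` reads a device (here stood in for by an ordinary file) sequentially in fixed-size blocks without interpreting its contents. All names, sizes and paths are hypothetical.

```python
import os
import shutil
import tempfile


def logical_backup(src_dir, dst_dir):
    """Walk the directory tree and copy each file through the file system."""
    copied = []
    for root, _dirs, files in os.walk(src_dir):
        rel = os.path.relpath(root, src_dir)
        target = os.path.join(dst_dir, rel)
        os.makedirs(target, exist_ok=True)
        for name in files:
            shutil.copy2(os.path.join(root, name), os.path.join(target, name))
            copied.append(os.path.normpath(os.path.join(rel, name)))
    return sorted(copied)


def physical_backup(device_path, image_path, block_size=4096):
    """Copy the raw device sequentially, ignoring file boundaries."""
    blocks = 0
    with open(device_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            blocks += 1
    return blocks


with tempfile.TemporaryDirectory() as work:
    # Logical: a small directory tree with two files.
    src = os.path.join(work, "src")
    os.makedirs(os.path.join(src, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(src, rel), "w") as handle:
            handle.write("data")
    copied = logical_backup(src, os.path.join(work, "logical"))

    # Physical: a 10000-byte file stands in for a block device.
    device = os.path.join(work, "device.img")
    with open(device, "wb") as handle:
        handle.write(b"\0" * 10000)
    image = os.path.join(work, "physical.img")
    blocks = physical_backup(device, image)
    image_size = os.path.getsize(image)

print(copied, blocks, image_size)
```

The physical copy never looks at names or meta-data, which is why it needs little CPU time, and also why restoring a single file from such an image requires interpreting the file system afterwards.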
Hutchinson et al. have compared the performance of logical and physical backup strategies using Network Appliance’s WAFL file system[28], which was chosen because it implements both schemes. They conclude that physical backup and restoration can achieve higher throughput with less CPU consumption than logical backup[29]. Snapshots are discussed in section 2.3.
Concerning business value, the backup window† stands out as a major deciding factor when choosing a backup solution. Traditionally, backup software has yielded the most predictable results when run on off-line, or read-only, file systems. Physical backups are subject to inconsistencies in the file system, because file systems work asynchronously to increase performance. Data is buffered in memory before it is synchronised with the media, and physical backup solutions only see what is on the device. For these reasons, file systems have to be synchronised and off-line to ensure data consistency. Logical backup solutions, on the other hand, use the higher-level file system operations and see buffered data, but have other difficulties in turn.
∗ The paper actually discusses tape management, but the methods mentioned are not entirely limited to tape.
† The backup window is the time from when a backup starts until it is completed.
[Figure 2.4: Creating a snapshot using copy-on-write and updating the working copy. (a) Snapshot has been created. (b) Original data has been copied to the snapshot before writing an update.]
Backup software goes through different phases when doing a backup. The most basic arrangement is a scan phase followed by a dump∗ phase. The software first scans the file system to get an image of the directory structure; if files are then moved, or the structure is otherwise changed, during the dump phase, the backup may not be consistent with the source, and those files are not backed up. If the backup is incremental, the backup software will assume that such a file is already backed up, as the modification date of the missing file is older than the information recorded for the previously backed-up files[2]. Two solutions to these problems are to take the file system off-line or to use snapshots as the backup source; both will produce consistent backups.
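The inconsistency described above can be reproduced with a minimal sketch, assuming a naive two-phase tool that scans first and copies afterwards. The `scan` and `dump` functions and the file names are hypothetical; a file is moved between the two phases to simulate activity on a live file system.

```python
import os
import shutil
import tempfile


def scan(src_dir):
    """Scan phase: record the relative path of every file."""
    paths = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            paths.append(os.path.relpath(os.path.join(root, name), src_dir))
    return sorted(paths)


def dump(src_dir, dst_dir, paths):
    """Dump phase: copy the scanned paths; return those that have vanished."""
    missed = []
    for rel in paths:
        source = os.path.join(src_dir, rel)
        target = os.path.join(dst_dir, rel)
        if not os.path.exists(source):
            missed.append(rel)  # moved or deleted after the scan
            continue
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(source, target)
    return missed


with tempfile.TemporaryDirectory() as work:
    src = os.path.join(work, "src")
    dst = os.path.join(work, "backup")
    os.makedirs(src)
    os.makedirs(dst)
    for name in ("notes.txt", "report.txt"):
        with open(os.path.join(src, name), "w") as handle:
            handle.write("data")

    scanned = scan(src)

    # The file system changes between the scan and the dump phase.
    os.makedirs(os.path.join(src, "archive"))
    os.rename(os.path.join(src, "report.txt"),
              os.path.join(src, "archive", "report.txt"))

    missed = dump(src, dst, scanned)
    backed_up = sorted(os.listdir(dst))

print(scanned, missed, backed_up)
```

The moved file still exists under its new path, but because that path was not part of the scan it is absent from the backup. Running the two phases against a snapshot or an off-line file system removes the window in which the rename can occur.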