
2.8 Trends in file system metadata management

2.8.1 Techniques for data management and their impact on metadata

As previously discussed, the fraction of hot, frequently accessed data in a file system is often quite small. Hierarchical storage management (HSM) systems‡‡ make use of this fact to trade some access latency for storage capacity by migrating file data from its original location to other storage tiers. Depending upon the system, additional tiers use cheaper disk and/or tape storage. After a successful migration, the file data is deleted and a stub is left in its place that consists of metadata and sometimes a small part of the file. When a migrated file is accessed, the HSM system transparently loads the file back into its original location. HSM helps reduce the amount of data on a primary file system by moving rarely used data to cheaper storage. The amount of metadata, however, does not decrease: in fact, it even increases, as the information needed to restage file data (i.e., its new location) must also be recorded. Some systems place this information in the stub as data or metadata, whereas others use an external data store such as a database. In both cases, the fraction of metadata in an HSM file system will be higher than that in a corresponding normal file system.
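The migrate-and-restage cycle can be illustrated with a minimal in-memory sketch. It is not modeled on any particular HSM product: the names `ToyHSM`, `FileEntry`, `restage_location` and the `tier2:` location scheme are hypothetical, and the restage information is kept in the stub rather than in an external database.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FileEntry:
    """One file on the primary tier: metadata plus, optionally, its data."""
    metadata: dict
    data: Optional[bytes] = None            # None while the file is only a stub
    restage_location: Optional[str] = None  # extra metadata added by migration


@dataclass
class ToyHSM:
    primary: Dict[str, FileEntry] = field(default_factory=dict)  # path -> entry
    secondary: Dict[str, bytes] = field(default_factory=dict)    # location -> data

    def create(self, path: str, data: bytes) -> None:
        self.primary[path] = FileEntry(metadata={"size": len(data)}, data=data)

    def migrate(self, path: str) -> None:
        """Move file data to the cheaper tier and leave a stub behind."""
        entry = self.primary[path]
        location = f"tier2:{path}"           # hypothetical location scheme
        self.secondary[location] = entry.data
        entry.data = None                    # data leaves the primary tier
        entry.restage_location = location    # metadata grows by one field

    def read(self, path: str) -> bytes:
        """Transparently restage a migrated file before returning its data."""
        entry = self.primary[path]
        if entry.data is None:
            entry.data = self.secondary[entry.restage_location]
        return entry.data


hsm = ToyHSM()
hsm.create("/home/alice/report.txt", b"quarterly numbers")
hsm.migrate("/home/alice/report.txt")
restored = hsm.read("/home/alice/report.txt")
```

In this sketch the stub still carries the original metadata and additionally the restage location, which mirrors the observation that migration reduces data on the primary tier but not metadata.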

Compression

Transparent compression of file data is available in several local file systems (e.g., NTFS or ZFS [Sun05]) and can be added to others using in-band file compression at the protocol level [SA05] through the use of compression appliances. Current implementations of file system level compression operate only on single files, each of which retains all of its metadata. In contrast, archiving utilities such as TAR, JAR and ZIP merge files and their metadata into a single, possibly compressed file that also contains the directory structure. Unfortunately, even read-only access to archive files has only been implemented as a protocol handler for higher-level languages and is not available at the file system level. Merging multiple files into one could lead to a significant reduction in the metadata of seldom used and/or archived files.
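As an illustration of archive access from a higher-level language rather than from the file system, the following sketch uses Python's standard `tarfile` module to merge several files into one compressed archive held in memory and to read a single member back. The helper names `pack` and `read_member` and the sample paths are hypothetical.

```python
import io
import tarfile


def pack(files):
    """Merge {name: bytes} into one gzip-compressed tar archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)  # per-file metadata lives in the archive
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_member(archive, name):
    """Read-only access to one member without unpacking the whole archive."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        return tar.extractfile(name).read()


def list_members(archive):
    """Return the member names, i.e. the directory structure stored in the archive."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        return tar.getnames()


archive = pack({"docs/a.txt": b"alpha", "docs/b.txt": b"beta"})
names = list_members(archive)
content = read_member(archive, "docs/b.txt")
```

From the file system's point of view, the result is a single object with a single set of metadata, while the names and attributes of the merged files are only visible to code that understands the archive format.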

‡‡ Currently also known as Information Lifecycle Management systems.

Data de-duplication

A current approach used to reduce storage requirements is data de-duplication. Due to common workflow patterns, such as several users saving the same email attachment, a file with identical data can be stored in different locations in a file system. The purpose of de-duplication is to find and merge these files by linking all occurrences to a single, common data location. Identical data is recognized by first comparing the hash values of the data in question and then, if a match is found, comparing the data itself. The two basic types of de-duplication in file systems are block-based [KPCE06] and file-based de-duplication [Mic06]. In the former, merging occurs at the file system block level so that identical blocks in different files can be identified, but only within a single file system. File-based de-duplication requires that all of the data in the files is identical.¶¶ If data is written to a merged file, the affected part of it is split up again. Depending on the type of data, savings of 30% on standard home directories and 75% on VMware virtual machine disks were observed at LRZ.
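The two-step recognition of identical data (hash comparison first, full comparison on a match) can be sketched for the file-based case as follows. This is a minimal in-memory model under stated assumptions, not the mechanism of [KPCE06] or [Mic06]: the class `DedupStore`, its fields and the choice of SHA-256 are illustrative, and unreferenced data is never reclaimed.

```python
import hashlib


class DedupStore:
    """Toy file-based de-duplication: one stored copy per distinct content."""

    def __init__(self):
        self.blobs = {}    # blob id -> bytes (the shared data location)
        self.by_hash = {}  # digest -> list of blob ids with that digest
        self.files = {}    # path -> blob id (per-file metadata stays in place)

    def write(self, path, data):
        digest = hashlib.sha256(data).hexdigest()
        # Step 1: cheap candidate search by hash value.
        for blob_id in self.by_hash.get(digest, []):
            # Step 2: confirm with a full comparison of the data itself.
            if self.blobs[blob_id] == data:
                self.files[path] = blob_id   # link to the common location
                return blob_id
        # No identical data stored yet: allocate a new data location.
        blob_id = len(self.blobs)
        self.blobs[blob_id] = data
        self.by_hash.setdefault(digest, []).append(blob_id)
        self.files[path] = blob_id
        return blob_id

    def read(self, path):
        return self.blobs[self.files[path]]


store = DedupStore()
first = store.write("/home/u1/attachment.pdf", b"same payload")
second = store.write("/home/u2/attachment.pdf", b"same payload")
# Writing different data to one of the merged files splits it off again.
third = store.write("/home/u2/attachment.pdf", b"edited payload")
```

After the first two writes there are two entries in `files` but only one stored copy of the data, so the data volume shrinks while the number of metadata entries does not.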

The impact of de-duplication on the amount of metadata is similar to that of HSM: metadata for merged files remains in place and is augmented by additional internal metadata needed to manage de-duplication. It can therefore be assumed that de-duplication also does not reduce the amount of metadata.

Snapshots

A snapshot, which provides a point-in-time view of a previous state of file system data and metadata, is a space-efficient copy of a file system created by keeping old versions of modified blocks and all metadata. At the API level, snapshots can be accessed in exactly the same manner as the live file system by browsing through a file system tree. Depending upon the snapshot technology, the snapshot data may or may not belong to the original file system.

An important difference between a snapshot and a full copy of a file system is that a snapshot physically shares unchanged data blocks with the active file system. Snapshots can be created using several different techniques.

With a copy-on-write (COW) approach, the file system is not aware of the snapshot because the snapshot process takes place at the block device layer. This layer intercepts writes to a block device with an active snapshot, copies the original content of a changed block to a separate storage device and then writes the new data block. From the file system perspective there are no changes, but there is significant overhead from the read-write-write process. If the snapshot is to be deleted, the content of the separate device can simply be purged. To access the snapshot, the block device offers the 'snapshot' view as a virtual device that can be mounted in the same way as the original file system.
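The read-write-write sequence described above can be sketched with a toy block device. The class `CowBlockDevice` and its method names are hypothetical, a Python dictionary stands in for the separate storage device, and only a single active snapshot is modeled.

```python
class CowBlockDevice:
    """Toy block device with one optional copy-on-write snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live device contents
        self.snapshot_store = None   # separate store; None means no snapshot

    def create_snapshot(self):
        self.snapshot_store = {}     # initially empty: nothing has changed yet

    def write(self, index, data):
        snap = self.snapshot_store
        if snap is not None and index not in snap:
            # Read the original block and write it to the separate store,
            # but only on the first change after the snapshot was taken.
            snap[index] = self.blocks[index]
        self.blocks[index] = data    # then write the new data in place

    def read(self, index):
        return self.blocks[index]

    def read_snapshot(self, index):
        """Virtual 'snapshot' view: preserved copy if changed, else shared block."""
        if self.snapshot_store is None:
            raise RuntimeError("no active snapshot")
        return self.snapshot_store.get(index, self.blocks[index])

    def delete_snapshot(self):
        self.snapshot_store = None   # purge the separate store


dev = CowBlockDevice([b"A", b"B", b"C"])
dev.create_snapshot()
dev.write(0, b"X")
dev.write(0, b"Y")  # second write to the same block copies nothing
```

Unchanged blocks are served from the live device in the snapshot view, which corresponds to the physical sharing of unchanged data between the snapshot and the active file system.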

¶¶ While quite similar to hardlinks, the de-duplication process is implicit and does not require application

The second technique can only be used with log-based file systems such as WAFL [HLM02] or ZFS. In log-based file systems, the default behavior is not to overwrite existing data. The implementation of snapshots is therefore quite intuitive: at the time of a snapshot, the root pointer to the internal metadata, and thus the current block allocation data, is preserved until the snapshot is deleted. Although the write process does not change in any manner, the file system must determine whether a block is still part of a snapshot when releasing blocks. In this configuration, the existence of snapshots does not influence performance. One option for presenting snapshots to the API layer is as a special directory (.snapshot in WAFL).
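The root-pointer technique can be sketched with a toy file system that never overwrites data in place. This is a simplified model and not the WAFL or ZFS implementation: the class `LogFS` is hypothetical, the "root" is a plain dictionary mapping file names to block ids, and each file occupies exactly one block.

```python
class LogFS:
    """Toy log-based file system: writes always allocate new blocks."""

    def __init__(self):
        self.store = {}      # block id -> bytes, never overwritten in place
        self.next_id = 0
        self.root = {}       # current root: file name -> block id
        self.snapshots = {}  # snapshot label -> preserved root

    def write(self, name, data):
        block_id = self.next_id
        self.next_id += 1
        self.store[block_id] = data
        old = self.root.get(name)
        # Build a new root; a root preserved by a snapshot is left untouched.
        self.root = {**self.root, name: block_id}
        if old is not None:
            self._release(old)

    def _release(self, block_id):
        # A block may only be freed if no snapshot root still references it.
        if any(block_id in root.values() for root in self.snapshots.values()):
            return
        del self.store[block_id]

    def snapshot(self, label):
        self.snapshots[label] = self.root  # preserve the root pointer only

    def read(self, name, snapshot=None):
        root = self.root if snapshot is None else self.snapshots[snapshot]
        return self.store[root[name]]

    def delete_snapshot(self, label):
        old_root = self.snapshots.pop(label)
        live = set(self.root.values())
        for root in self.snapshots.values():
            live |= set(root.values())
        for block_id in set(old_root.values()) - live:
            del self.store[block_id]       # free blocks nobody references


fs = LogFS()
fs.write("db.dat", b"version 1")
fs.snapshot("nightly")
fs.write("db.dat", b"version 2")
old_view = fs.read("db.dat", snapshot="nightly")
```

The write path is identical with and without a snapshot; the only additional work is the reference check when a block is released, which matches the description above.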

The basic application of snapshots is the rapid creation of fully consistent views of a file system for backup or replication purposes. An example is the hot backup of a running database that is told to write to log files during the creation of a snapshot. The database file itself, which is captured by the snapshot, is then consistent from the perspective of the database. In contrast, normal 'copy'-based backups may suffer from a race condition between ongoing changes to the file system and the backup process. Data management operations such as copying, searching and replicating data on snapshots can rely on read-only semantics and do not have to track any changes in the file system.

In document Analyzing Metadata Performance in Distributed File Systems (Page 42-44)