With the exception of Ext2, all current Linux file systems support journaling. The journal is used to track changes of files as well as metadata. The goal of using a journal is to make sure that transactions are processed properly, especially if a power outage occurs. In that case, the file system will check the journal when it comes back up again and, depending on the journaling style that is configured, do a rollback of the original data or a check on the data that was open when the server crashed. Using a journal is essential on large file systems to which lots of files get written. Only if a file system is very small, or writes hardly ever occur on the file system, can you configure the file system without a journal.
Tip
An average journal takes about 40 MB of disk space. If you need to configure a very small file system, such as the 100 MB /boot partition, it doesn’t make sense to create a journal on it. Use Ext2 in those cases.
In Chapter 4, you read about the scheduler and how it can be used to reorder read and write requests. Using the scheduler can give you a great performance benefit. When using a journal, however, there is a problem: write commands cannot be reordered. The reason is that, to use reordering, data has to be kept in cache longer, whereas the purpose of a journal is to ensure data security, which means that data has to be written as soon as possible.
To avoid reordering, a journaling file system should use barriers. A barrier forces the disk cache to be flushed immediately, so that the journal gets updated properly. Barriers are enabled by default, but they may slow down the write process. If you want your server to perform write operations as fast as possible, and at the same time you are willing to take an increased risk of data loss, you should switch barriers off. To switch off barriers, add a mount option. Each file system needs a different option:
• XFS uses nobarrier.
• Ext3 uses barrier=0.
• ReiserFS uses barrier=none.
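As a minimal sketch of how these options end up on a mount command line, the following Python snippet maps each file system to the no-barrier option from the list above and builds the matching command. The device /dev/sdb1 and the mount point /data are hypothetical, and the command is only printed, never executed. The option names are the ones given above; check the documentation of your kernel version before relying on them.

```python
# Mount options that switch off write barriers, as listed in the text above.
NO_BARRIER_OPTION = {
    "xfs": "nobarrier",
    "ext3": "barrier=0",
    "reiserfs": "barrier=none",
}


def mount_command(fs_type, device, mount_point):
    """Return a mount command (as an argument list) with barriers switched off."""
    fs_type = fs_type.lower()
    option = NO_BARRIER_OPTION[fs_type]  # KeyError for file systems not listed
    return ["mount", "-t", fs_type, "-o", option, device, mount_point]


if __name__ == "__main__":
    # Hypothetical device and mount point; the command is printed, not run.
    print(" ".join(mount_command("ext3", "/dev/sdb1", "/data")))
```

To make the setting permanent, the same option string would go in the options column of the file system's line in /etc/fstab.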
Journaling offers three different modes. All of these are specified as options while mounting the file system, which allows you to use different journaling modes on different file systems:
• data=ordered: When using this option, only metadata is journaled and barriers are enabled by default. This way, data is forced to be written to hard disk as fast as possible, which reduces the chances of things going wrong. This journaling mode offers the best balance between performance and data security.
• data=writeback: If you want the best possible performance, use this option. This option only journals metadata, but does not guarantee data integrity. This means that, based on the information in the journal, when your server crashes, the file system can try to repair the data but may fail, in which case you will end up with the old data (dating from before the moment that you initialized the write action) after a system crash. This option at least guarantees fast recovery after a system crash, which is sufficient for many environments.
• data=journal: If you want the best guarantees for your data, use this option. When using this option, both data and metadata are journaled. This ensures the best data integrity, but gives bad performance because all data has to be written twice: first to the journal, and then to its final location on disk when it is committed. If you need this journaling option, you should always make sure that the journal is written to a dedicated disk. Every file system has options to accomplish that.
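The cost of writing everything twice can be illustrated with a deliberately simplified model. The sketch below assumes that every journaled byte is written once to the journal and once to its final location, and it ignores journal headers and commit records. The function name bytes_written and the example sizes are made up for illustration; they are not measurements of a real file system.

```python
# Which kinds of blocks go through the journal in each mode (per the text above).
JOURNALED = {
    "ordered": {"metadata"},
    "writeback": {"metadata"},
    "journal": {"metadata", "data"},
}


def bytes_written(mode, data_bytes, metadata_bytes):
    """Estimate total bytes written to disk for one transaction (toy model)."""
    journaled = JOURNALED[mode]
    total = data_bytes + metadata_bytes  # every block reaches its final location
    if "data" in journaled:
        total += data_bytes              # extra copy of the data in the journal
    if "metadata" in journaled:
        total += metadata_bytes          # extra copy of the metadata in the journal
    return total


if __name__ == "__main__":
    for mode in ("ordered", "writeback", "journal"):
        # Hypothetical transaction: 1000 bytes of data, 10 bytes of metadata.
        print(mode, bytes_written(mode, 1000, 10))
```

In this model, ordered and writeback cost the same number of bytes; the difference between them lies in the order in which data and metadata reach the disk, which the model does not capture.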
Indexing
When file systems were still small, no indexing was used. An index wasn’t necessary to get a file from a list of a few hundred files. Nowadays, directories can contain many thousands, sometimes even millions, of files; to manage so many files, an index is essential.
Basically, there are two approaches to indexing. The easiest approach is to add an index to a directory. This approach is used by the Ext3 file system: it adds an index to all directories and thus makes the file system faster when many files exist in a directory. However, this is not the best approach to indexing.
For optimal performance, it is better to work with a balanced tree (also referred to as b-tree) that is integrated into the heart of the file system itself. In such a balanced tree, every file is a node in the tree and every node can have child nodes. Because every file is represented in the indexing tree, the file system is capable of finding files very quickly, no matter how many files there are in a directory. Using a b-tree for indexing also makes the file system a lot more complicated. If things go wrong, the risk exists that you will have to rebuild the entire file system, and that can take a lot of time. In this process, you even risk losing all data on your file system. Therefore, when choosing a file system that is built on top of a b-tree index, make sure it is a stable file system. Currently, XFS and ReiserFS have an internal b-tree index. Of these two, ReiserFS isn’t considered a very stable file system, so it is better to use XFS if you want indexing.
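To get a feeling for why an index matters once directories grow large, the following sketch compares a plain scan of a file list with a lookup in a sorted list. The sorted list with binary search is only a stand-in for a tree index, not an implementation of a b-tree, and the file names are generated purely for illustration.

```python
def linear_lookup(names, target):
    """Scan an unindexed listing; return (found, entries_examined)."""
    for examined, name in enumerate(names, start=1):
        if name == target:
            return True, examined
    return False, len(names)


def indexed_lookup(sorted_names, target):
    """Binary search over a sorted listing; return (found, entries_examined)."""
    lo, hi, examined = 0, len(sorted_names), 0
    while lo < hi:
        mid = (lo + hi) // 2
        examined += 1
        if sorted_names[mid] == target:
            return True, examined
        if sorted_names[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid       # target can only be in the lower half
    return False, examined


if __name__ == "__main__":
    # Zero-padded names, so the generated list is already in sorted order.
    names = [f"file{i:07d}" for i in range(1_000_000)]
    target = names[-1]
    print("linear scan:", linear_lookup(names, target))
    print("indexed lookup:", indexed_lookup(names, target))
```

The scan has to examine every entry to find the last file, whereas the sorted lookup halves the search range at each step and so needs at most about 20 steps for a million entries.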