Modern hard disks are much larger in size than previously available, making larger capac- ities possible with fewer disks. However, as discussed earlier, purchasing a small number of large disks may not be the best option in terms of I/O performance and throughput. For example, consider a database that is 400 GB in size and performs a balanced mix of random read and write operations. The number of possible RAID and disk configurations for this database exceeds the space available for this entire book. However, a few potential configurations are as follows:
1. RAID 1 using 2 x 600 GB 15 K RPM disks 2. RAID 5 using 3 x 300 GB 15 K RPM disks 3. RAID 5 using 5 x 146 GB 15 K RPM disks 4. RAID 10 using 8 x 146 GB 15 K RPM disks 5. RAID 10 using 14 x 73 GB 15 K RPM disks.
Based on the above RAID configurations, the following I/O throughput rates would be theoretically possible based on a 64 K random I/O workload for SQL Server:
1. 185 IOPS at 11.5 MB/sec 2. 222 IOPS at 14 MB/sec 3. 345 IOPS at 22 MB/sec 4. 816 IOPS at 51 MB/sec 5. 1609 IOPS at 101 MB/sec.
Hopefully, the message is clear: the number of disk heads, and the RAID configuration, will have a direct and dramatic impact on the potential I/O capacity of the RAID array. Remember, though, that these numbers are theoretical, meaning that they are
solely based on the potential I/O capacity of the disks in a given configuration. They take no account of other factors that can and will have an impact on overall throughput, including:
• RAID controller cache size and configuration for read and write caching, which can
improve read-ahead pre-fetch if more cache is dedicated to reads, or absorb heavy bursts of write activity if more cache is dedicated to writes.
• RAID stripe size, which determines the amount of data that is written to, or read
from, a single disk in a stripe before advancing to the next disk in the stripe. • Partition alignment, which ensures that the starting offset of a disk partition is
aligned with the RAID stripe size and sector offset for the disk, so that read and write operations don't cross sector boundaries, incurring an additional I/O operation to complete. See Incorrect partition alignment, later in this chapter, for further details. • NTFS format allocation unit sizes; the 4 K default for NTFS is good for file servers
and the operating system drives, but not database data files which perform better using a 64 K allocation unit.
The only way to be sure that your selected disk configuration will cope gracefully with the workload placed on it by your databases is to perform proper benchmarking of the I/O subsystem, prior to usage. Never rely on theoretical calculations for your values of either IOPS or MB/s throughput. You need to simulate a realistic I/O workload, using multiple workers, and get the actual numbers.
A number of tools exist for measuring the I/O throughput of a given configuration, but the two most common tools used for benchmarking storage configurations for SQL Server are SQLIO and IOmeter.
SQLIO has to be one of the worst-named tools in the world, since it leads to the common misconception that it simulates the I/O workload of SQL Server. The truth is that the tool has absolutely nothing to do with SQL Server; it is simply an I/O stress tool that generates I/O based on the command-like parameters that are passed to the tool. IOmeter is also an I/O stress testing tool, originally developed by Intel and later released as an open source project. Of the two, IOmeter is the most flexible, and can generate mixed I/O workloads that more closely reflect what might be generated by SQL Server. IOmeter also has a graphical user interface that is used for configuring the tests and monitoring their progress.
Downloading SQLIO or IOmeter
SQLIO can be downloaded for free from Microsoft's website at http://www.microsoft.com/down- load/en/details.aspx?id=20163.
IOmeter can be downloaded for free from http://www.iometer.org/.
Microsoft provides a separate tool for testing the reliability and integrity of a disk config- uration, named SQLIOSim. This tool tests the storage using the same disk operations
that SQL Server would perform for reads, writes, checkpoints, backups, and read-ahead operations, to ensure that the storage meets the reliability requirements for SQL Server. Unlike SQLIO and IOmeter, SQLIOSim uses separate data and log files to simulate the reading and writing activity of SQL Server, using the same types of I/O patterns for each
file that would occur under normal operations. This tool should be used to validate that the I/O subsystem functions correctly under heavy loads, but it should not be used for performance benchmarking the configuration.
Downloading SQLIOSim
SQLIOSim can be downloaded from Microsoft's website at http://support.microsoft.com/ kb/231619/en-us.