In SAN environments, where the storage is attached through a network connection, one of the biggest limiting factors to storage performance is the bandwidth available for the connection. This is true regardless of whether the connection is Fiber Channel, or iSCSI, using traditional Ethernet for the connectivity.
When I first started working with SAN environments for SQL Server the prevailing technology in use was Fiber Channel, and the standard port speed was 2 Gb/sec for a single path. At the time, newer hardware that supported 4 Gb/sec connections for a single path was just being implemented in environments. Soon thereafter, 1 Gb/sec iSCSI began to appear in small-to-medium business environments, as a lower-cost implementation of shared storage for the enterprise.
For me, one of the hardest things to understand, when working with SANs, was the impact that the paths to the SAN had on storage performance, especially in multi-path environments where the expectation was that multiple paths to the SAN would improve the overall performance of the storage connections.
When using SAN multi-path technologies, the initial benchmarking and validation of the storage configuration is incredibly important. You need to verify whether or not the multiple paths to the SAN really are being used in conjunction with each other to improve the performance.
Typically, a 1 Gb/sec connection to a SAN will be capable of a maximum throughput of 90–95 MB/sec, assuming the underlying storage configuration will support this level of throughput. So, theoretically, two 1 Gb/sec connections should provide 180–190 Mb/sec of throughput. However, depending on the exact multi-path configuration, this may not be an accurate expectation.
A full discussion of multi-path configurations of SANs is beyond our remit, but the key point in determining the available bandwidth is appropriate benchmarking of the storage subsystem using tools like SQLIO or IOmeter. If the storage performance is
not meeting the expectations for the configuration, you will have to determine where exactly the bottleneck in the configuration lies, and continue troubleshooting from there. It could be that the number of disks underlying the storage array cannot support the throughput requirements, or that the SAN caching is the bottleneck, or that the multi- path configuration is not functioning as expected and you are only getting an actual single path to the SAN for I/O.
Summary
Proper configuration of the hardware, as well as the SQL Server database engine, is critical to the optimum performance of the overall system. Incorrect hardware configu- rations, generally in the disk I/O subsystem, often result in a system that fails to meet performance expectations.
The best strategy with regard to disk I/O configuration problems could be summarized as "avoid them as far as possible, through appropriate planning and testing."
• Don't consider only storage capacity when provisioning the disk subsystem; I/O performance and throughput are critical.
• Make sure your files are separated according to the type of I/O workload. Data files, log files, and tempdb files should all be on separate disks, and configured appropriately for the given workload type.
• Test the performance of your disk subsystem under realistic loads; it's the only way to be sure will meet your I/O performance and throughput requirements.
If your system does suffer from disk I/O issues, there are very few quick fixes. First, you need to ensure that you're not wasting I/O cycles, either through poorly designed queries, lack of indexing, or through hardware-related "bugs" such as disk partition misalignment. Beyond that, you probably need to add I/O capacity, or work out what component is causing the I/O bottleneck. If you are using DAS, this may be relatively straightforward. If you're using a SAN, it helps if you are friendly with your SAN administrator.
A CPU-bound system is relatively easy to spot, but not always as easy to diagnose. If you notice that one or more of the CPUs are working at close to maximum capacity, along with a dramatic decrease in server performance, then you've likely got a CPU issue. The CPU is involved in almost all SQL Server activity, from running queries to moving data in and out of memory, and so on, which means that an over-taxed CPU can have dire consequences.
Unfortunately, the source of CPU pressure is not always easy to pinpoint since what seems like a CPU problem may actually have its root cause elsewhere, such as insuf- ficient memory, causing SQL Server to constantly move data in and out of memory, or poorly written queries, inadequate indexing, or even inappropriate configuration option settings. The source of the CPU pressure may also be a non-SQL Server process running on the server.
Regardless of what caused the problem, the goal of the investigation stage of trouble- shooting excessive CPU utilization in SQL Server is to isolate the problem to a specific source. Generally, this will require the collection of multiple pieces of information, using tools such as Performance Monitor (PerfMon), SQLTrace and several of the SQL Server Dynamic Management Views.
Once it has been confirmed that the high CPU usage is due to the SQL Server process, and the problem has been isolated to a specific query (or set of queries), we can seek to alleviate the CPU pressure via design changes, such as tuning CPU-intensive queries, adding appropriate indexes, replacing ad hoc SQL with stored procedures to improve plan reuse and so on, or by tweaking SQL Server and Windows configuration settings.