In my experience, the plan cache in SQL Server 2005 and 2008 is one of the most underused assets for troubleshooting performance problems. As a part of the normal execution of batches and queries, SQL Server tracks the accumulated execution information for each plan stored in the plan cache, up to the point where the plan is flushed from the cache as a result of DDL operations, memory pressure, or general cache maintenance. The execution information stored in the plan cache can be found in the sys.dm_exec_query_stats DMV, as shown in the example query in Listing 1.6. This query lists the top ten statements based on the average number of physical reads that each statement performed per execution.
SELECT TOP 10
        s.execution_count ,
        s.statement_start_offset AS stmt_start_offset ,
        s.sql_handle ,
        s.plan_handle ,
        s.total_logical_reads / s.execution_count AS avg_logical_reads ,
        s.total_logical_writes / s.execution_count AS avg_logical_writes ,
        s.total_physical_reads / s.execution_count AS avg_physical_reads ,
        t.text
FROM    sys.dm_exec_query_stats AS s
        CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) AS t
ORDER BY avg_physical_reads DESC;
Listing 1.6: SQL Server execution statistics.
The information stored in the plan cache can be used to identify the most expensive queries based on physical I/O operations for reads and for writes, or based on different criteria, depending on the most problematic type of I/O for the instance, discovered as a result of previous analysis of the wait statistics and virtual file statistics.
Additionally, the sys.dm_exec_query_plan() function can be cross-applied using the plan_handle column from the sys.dm_exec_query_stats DMV to get the execution plan that is stored in the plan cache. By analyzing these plans, we can identify problematic operations that are candidates for performance tuning.
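As a minimal sketch of this technique, the statistics query can be given a second CROSS APPLY so that the cached plan is returned alongside each statement. The aliases and the choice of TOP 10 are illustrative assumptions, not a prescribed form.

```sql
-- Sketch: top ten statements by average physical reads, with their cached plans.
SELECT TOP 10
        s.execution_count ,
        s.total_physical_reads / s.execution_count AS avg_physical_reads ,
        t.text ,
        p.query_plan  -- XML showplan; can be NULL if the plan cannot be returned
FROM    sys.dm_exec_query_stats AS s
        CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) AS t
        CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) AS p
ORDER BY avg_physical_reads DESC;
```

The query_plan column is returned as XML, so it can be opened from the results grid in Management Studio or saved to a file for later analysis.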
Query performance tuning
A full discussion of query performance tuning is beyond the scope of this book. In fact, several notable books have been written on this topic alone, including "SQL Server 2008 Query Performance Tuning Distilled" (http://www.amazon.com/Server-Performance-Tuning-Distilled-Experts/dp/1430219025) and "Inside Microsoft SQL Server 2008: T-SQL Querying" (http://www.amazon.com/Inside-Microsoft®-SQL-Server®-2008/dp/0735626030).
The information in the sys.dm_exec_query_stats DMV can also be used to identify the statements that have taken the most CPU time, the longest execution time, or that have been executed the most frequently.
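For example, a sketch of a CPU-oriented variant might look like the following. The aliases are illustrative assumptions; total_worker_time and total_elapsed_time are reported by the DMV in microseconds.

```sql
-- Sketch: top ten statements by total CPU time.
-- total_worker_time and total_elapsed_time are reported in microseconds.
SELECT TOP 10
        s.execution_count ,
        s.total_worker_time ,
        s.total_worker_time / s.execution_count AS avg_worker_time ,
        s.total_elapsed_time / s.execution_count AS avg_elapsed_time ,
        t.text
FROM    sys.dm_exec_query_stats AS s
        CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) AS t
ORDER BY s.total_worker_time DESC;
-- Swap the ORDER BY column for s.total_elapsed_time or s.execution_count
-- to rank by duration or by execution frequency instead.
```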
In SQL Server 2008, two additional columns, query_hash and query_plan_hash, were added to the sys.dm_exec_query_stats DMV. The query_hash is a hash over the statement text that allows similar statements to be aggregated together. The query_plan_hash is a hash of the query plan shape that allows queries with similar execution plans to be aggregated together. Together, they allow the information contained in this DMV to be aggregated for ad hoc workloads, in order to determine the total impact of similar statements that have different compiled literal values.
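A rough sketch of such an aggregation, assuming SQL Server 2008 or later, groups the cached entries by query_hash. The column aliases and the choice to rank by CPU time are assumptions made for illustration.

```sql
-- Sketch (SQL Server 2008 and later): total impact of similar statements.
SELECT TOP 10
        s.query_hash ,
        COUNT(*) AS cached_entries ,
        COUNT(DISTINCT s.query_plan_hash) AS distinct_plan_shapes ,
        SUM(s.execution_count) AS total_executions ,
        SUM(s.total_worker_time) AS total_cpu_time ,
        SUM(s.total_physical_reads) AS total_reads
FROM    sys.dm_exec_query_stats AS s
GROUP BY s.query_hash
ORDER BY total_cpu_time DESC;
```

A high cached_entries value for a single query_hash suggests an ad hoc statement that is being compiled repeatedly with different literal values.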
Summary
This chapter has outlined my basic approach to investigating performance problems in SQL Server. This approach is more or less the same, regardless of whether it is a server I know well, or one I'm investigating for the first time, with no prior knowledge of the health and configuration of the SQL Server instance it houses. Based on the information gathered using this methodology, more advanced diagnosis of the identified problem areas can be performed, using information contained in the subsequent chapters in this book.
The most important point that I want to stress in this opening chapter is that no single piece of information in SQL Server should be used to pinpoint any specific problem. The art of taming an unruly SQL Server is the art of assembling the various pieces of the puzzle so that you have a complete understanding of what is going on inside of a server. If you focus only on what is immediately in front of you, you will, in most cases, miss the most important item, which is the true root cause of a particular problem in SQL Server.
Improper hardware configuration is a common cause of SQL Server performance and scalability problems. If you try to run SQL Server on workstation-grade hardware, you will run into problems, but even on server-grade hardware, performance problems can and will occur if one or more of the hardware components have been improperly sized and configured for the SQL Server workload.
SQL Server is very different from other applications in terms of its disk usage characteristics and storage requirements, and the disk subsystem is one of the most commonly undersized and poorly configured hardware components for SQL Server. Often the I/O workload can be substantially reduced by measures such as appropriate use of indexing, or ensuring queries don't read more data than strictly necessary. However, if I/O problems persist, as indicated, for example, by high latency values for the Physical Disk\Avg. Disk sec/Read and Physical Disk\Avg. Disk sec/Write PerfMon counters, or by specific I/O-related wait types in the DMVs, then you'll need to troubleshoot your disk I/O system. This task may be relatively straightforward, or grievously complex, depending on whether you're using simple, directly attached storage, or a vast and complex, enterprise-wide Storage Area Network (SAN).
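One way to check for persistent I/O latency from inside SQL Server, sketched here on the assumption that the sys.dm_io_virtual_file_stats function and the sys.master_files catalog view are the relevant sources, is to compute the average stall per read and per write for each database file. The aliases are illustrative.

```sql
-- Sketch: average I/O latency per database file.
-- Figures are cumulative, typically since the instance last started.
SELECT  DB_NAME(vfs.database_id) AS database_name ,
        mf.physical_name ,
        vfs.num_of_reads ,
        vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0) AS avg_read_latency_ms ,
        vfs.num_of_writes ,
        vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
        JOIN sys.master_files AS mf
            ON mf.database_id = vfs.database_id
           AND mf.file_id = vfs.file_id
ORDER BY avg_read_latency_ms DESC;
```

NULLIF guards against division by zero for files that have not yet been read or written. Because the values are cumulative, they are best compared across two snapshots rather than read in isolation.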
This chapter discusses the basics of disk I/O subsystem configuration for SQL Server, and the most common problems associated with incorrectly sized or incorrectly designed disk configurations, covering topics such as:
• Appropriate choice of hardware RAID level – there are many possible configurations, and the right one will depend largely on the nature of the SQL Server workload.
• Storage capacity versus throughput – 1 TB of database storage can be satisfied with a
• Workload type and distribution – specific considerations for data, log, and tempdb files.
• Common disk I/O problems – such as disk partition misalignment and network bandwidth issues in SANs.
By far the best way to deal with disk I/O issues is to work as hard as you can during the system design phase to prevent them from occurring. The only way to be sure that your disk I/O subsystem will meet your I/O performance and throughput requirements is to understand what those requirements are, and test the performance of your disk subsystem under realistic loads, using a tool such as SQLIO or IOmeter.