Optimising SQL Server CPU performance

(1)

SQL Server

Troubleshooting

performance problems

on a database system

can be an overwhelming

task. Knowing where

to look for trouble is

important, but more

crucial is knowing why

your system reacts the

way that it does to a

particular request.

A number of factors can

affect CPU utilisation

Zach Nichter

on a database server: compilation and recom-pilation of SQL statements, missing indexes, multithreaded operations, disk bottlenecks, memory bottlenecks, routine maintenance, and extract, transform and load (ETL) activ-ity, among others. CPU utilisation is not a bad thing in itself – performing work is what the CPU is there for. The key to healthy CPU utilisation is making sure that the CPU is spending its time processing what you want it to process and not wasting cycles on poor-ly optimised code or sluggish hardware.

Two paths leading to the same place

When viewed from a high level, there are two paths to identifying CPU performance problems. The first is reviewing the system’s hardware performance, an exercise that helps determine where to look when you head down the second path, reviewing the server’s query efficiency. This second path is usually more effective in identifying SQL Server performance issues. Unless you know exactly where your query performance issues lie, however, you should always start with a

At a glance:

Troubleshooting database performance issues Reviewing hardware causes

Using PerfMon to track database bottlenecks Evaluating query performance

Optimising SQL Server

CPU performance

(2)

lay some groundwork so that we can review both of these paths.

Laying the groundwork

A high-end dual-core processor will easily outperform the RAM in a machine, which will in turn be faster than an attached stor-age device. A good CPU can handle approxi-mately six times the throughput of current top-end DDR2 memory and about two times that of top-end DDR3 memory. Typical mem-ory throughput is more than 10 times that of the fastest fibre channel drives. In turn, hard disks can only perform a finite number of IOPS (input/output operations per sec-ond), a value which is entirely limited by the number of seeks per second a drive can per-form. To be fair, it is not typical that only a single storage drive is used to handle all of the storage needs on enterprise database sys-tems. Most setups today utilise Storage Area Networks (SANs) on enterprise database servers or larger RAID groups that can

nul-is that no matter what your setup looks like, disk and memory bottlenecks can affect the performance of your processors.

Because of the I/O speed differences, re-trieving data from disk is much more cost-ly than retrieving data from memory. A data page in SQL Server is 8Kb. An extent in SQL Server is made up of eight 8Kb pages, mak-ing it equivalent to 64Kb. This is important to understand because when SQL Server re-quests a particular data page from disk, it is not just the data page that is retrieved but the entire extent in which the data page re-sides. There are reasons that actually make this more cost-effective for SQL Server, but I won’t go into details here. Pulling a data page that is already cached from the buffer pool, at peak performance, should take un-der half a millisecond; retrieving a single ex-tent from disk should take between 2 and 4 milliseconds in an optimal environment. I typically expect a well-performing, healthy disk subsystem read to take between 4 and 10ms. Retrieving a data page from memory is generally between 4 and 20 times faster than pulling a data page from disk.

When SQL Server requests a data page, it checks the in-memory buffer cache before looking for the data page on the disk subsys-tem. If the data page is found in the buffer pool, the processor will retrieve the data and then perform the work required. This is called a soft page fault. Soft page faults are ideal for SQL Server because the data that is retrieved as part of a request must be in the buffer cache before it can be used. A data page that is not found in the buffer cache must be retrieved from the server’s disk subsystem. When the OS has to retrieve the data page from disk, it’s known as a hard page fault.

When correlating memory performance, disk performance and CPU performance, a common denominator helps us put ev-erything in perspective: throughput. In not-so-scientific terms, throughput is the measurement of how much data you can stuff down a finite pipe.

Path 1: System performance

There are really only a few methods for deter-mining if a server has a CPU bottleneck and Hyper-threading is a topic that is worth discussing a bit more because of

the way that it affects SQL Server. Hyper-threading actually presents two logical processors to the OS for each physical processor. Hyper-threading essentially leases time on the physical processors so that each processor ends up more fully utilised. The Intel Web site (microsoft.com/uk/intel) gives a far fuller description of how hyper-threading works.

On SQL Server systems, the DBMS actually handles its own extremely efficient queuing and threading to the OS, so hyper-threading only serves to overload the physical CPUs on systems with already high CPU utilisation. When SQL Server queues multiple requests to perform work on multiple schedulers, the OS has to actually switch the context of the threads back and forth on the physical processors to satisfy the requests that are being made even if the two logical processors are sitting on top of the same physical processor. If you are seeing Context Switches/sec higher than 5000 per physical processor you should strongly consider turning off hyper-threading on your system and retesting performance.

In rare cases, applications that experience high CPU utilisation on SQL Server can effectively use hyper-threading. Always test your applications against SQL Server with hyper-threading both turned on and off before implementing changes on your production systems.

(3)

SQL Server

there aren’t many potential causes of high CPU utilisation. Some of these issues can be tracked using PerfMon or a similar system-monitoring tool while others are tracked us-ing SQL Profiler or similar tools. Another method is to use SQL commands through Query Analyzer or SQL Server Management Studio (SSMS).

The philosophy I use when evaluating a systems performance is ‘Start broad, then focus deep’. Obviously, you can’t focus on problem areas until you’ve identified them. After you evaluate overall CPU utilisation with a tool like PerfMon, you can use it to look at a couple of very simple and easy-to-understand performance counters.

One of the most familiar performance counters is % Processor Time; when you’re in PerfMon, it’s highlighted as soon as you open the Add Counter window. % Processor Time is the amount of time the processors stay busy executing work. Utilisation on proces-sors is generally considered high when this value is 80 percent or higher for most of your peak operating time. It’s typical, and should be expected, that you will see spikes up to 100 percent at times even when the server is not operating with 80 percent utilisation.

Another counter you should review is Processor Queue Length, which can be found under the System performance ob-ject in PerfMon. Processor Queue Length shows how many threads are waiting to per-form work on the CPU. SQL Server manages its work through schedulers in the database engine, where it queues and processes its own requests. Because SQL Server manages its own work, it will only use a single CPU thread for each logical processor. This means that there should be minimal threads waiting in the processor queue to perform work on a system dedicated to SQL Server. Typically you shouldn’t have anything higher than five times the number of physical processors on a dedicated SQL Server, but I consider more than two times problematic. On serv-ers where the DBMS shares a system with other applications, you will want to review this along with the % Processor Time and Context Switches/sec performance counters (I’ll discuss context switches shortly) to deter-mine if your other applications or the DBMS needs to be moved to a different server.

When I see processor queuing along with high CPU utilisation, I look at the Com-pilations/sec and Re-ComCom-pilations/sec coun-ters under the SQL Server: SQL Statistics performance object (see Figure 1). Com-piling and recomCom-piling query plans adds to a system’s CPU utilisation. You should see val-ues close to zero for the Re-Compilations, but watch trends within your systems to determine how your server typically be-haves and how many compiles are normal. Recompiles can’t always be avoided, but que-ries and stored procedures can be optimised to minimise recompiles and to reuse que-ry plans. Compare these values to the actu-al SQL statements coming into the system through the Batch Requests/sec also found in the SQL Server: SQL Statistics performance object. If the compilations and recompila-tions per second comprise a high percentage of the batch requests that are coming into the system, then this is an area that should be reviewed. In some situations SQL developers may not understand how or why their code can contribute to these types of system re-source problems. Later in this article I’ll pro-vide some references to help you minimise this type of activity.

While you’re in PerfMon, check out the performance counter called Context Switches/sec (see Figure 2). This counter tells how many times threads have to be taken out of the OS schedulers (not SQL Figure 1 Selecting the counters to monitor

(4)

more frequent on database systems that are shared with other applications like IIS or other vendor application server compo-nents. The threshold that I use for Context Switches/sec is about 5000 times the num-ber of processors in the server. This value can also be high on systems that have hy-per-threading turned on and also have mod-erate to high CPU utilisation. When CPU utilisation and context switches both ex-ceed their thresholds regularly, this indicates a CPU bottleneck. If this is a regular occur-rence, you should start planning for the pur-chase of more or faster CPUs if your system is outdated. For further information, see the ‘Hyper-threading’ sidebar.

The SQL Server Lazy Writer (as it’s called in SQL Server 2000) or the Resource Monitor (as it’s called in SQL Server 2005) is another area to monitor when CPU utilisation is high. Flushing the buffer and procedure caches can add to CPU time via the resource thread called the Resource Monitor. The Resource Monitor is a SQL Server process that deter-mines which pages to keep and which pag-es need to be flushed from the buffer pool to disk. Each page in the buffer and

proce-when that page is placed into the cache. This cost value is decremented each time the Resource Monitor scans it. When a request requires cache space, the pages are flushed from memory based upon the cost associat-ed to each page; pages with the lowest val-ues are the first to be flushed. The Resource Monitor’s activity can be tracked through the Lazy Writes/sec performance counter under the SQL Server: Buffer Manager object with-in PerfMon. You should track how this value trends to determine what threshold is typi-cal on your system. This counter is usually re-viewed along with the Page Life Expectancy and Checkpoints/sec counters to determine whether there is memory pressure.

The Page Life Expectancy (PLE) counter helps determine memory pressure. The PLE counter shows how long a data page stays in the buffer cache. The industry-accepted threshold for this counter is 300 seconds. Anything less than a 300-second average over an extended period of time tells you that the data pages are being flushed from memory too often. This causes the Resource Monitor to work harder, which in turn forc-es more activity onto the procforc-essors. The

Performance counter Counter object Threshold Notes

% Processor Time Processor > 80% Potential causes include memory pressure, low query plan reuse, non-optimised queries.

Context Switches/sec System > 5000 x

proces-sors Potential causes include other applications on the server, more than one instance of SQL Server running on the same server, hyper-threading turned on.

Processor Queue Length System > 5 x processors Potential causes include other applications on the server, high compilations or recompilations, more than one instance of SQL Server running on the same server.

Compilations/sec SQLServer:SQL Statistics Trend Compare to Batch Requests/sec. Re-Compilations/sec SQLServer:SQL Statistics Trend Compare to Batch Requests/sec.

Batch Request/sec SQLServer:SQL Statistics Trend Compare with the Compilation and Re-Compilations per second. Page Life Expectancy SQLServer:Buffer Manager < 300 Potential for memory pressure.

Lazy Writes/sec SQLServer:Buffer Manager Trend Potential for large data cache flushes or memory pressure. Checkpoints/sec SQLServer:Buffer Manager Trend Evaluate checkpoints against PLE and Lazy Writes/sec. Cache Hit Ratio: SQL Plans SQLServer:Plan Cache < 70% Indicates low plan reuse.

Buffer Cache Hit Ratio SQLServer:Buffer Manager < 97% Potential for memory pressure. Figure 2 Performance counters to watch

(5)

SQL Server

PLE counter should be evaluated along with the Checkpoints Pages/sec counter. When a checkpoint occurs in the system, the dirty data pages in the buffer cache are flushed to disk, causing the PLE value to drop. The Resource Monitor process is the mechanism that actually flushes these pages to disk, so during these checkpoints you should also expect to see the Lazy Writes/sec value in-crease. If your PLE value goes up immediate-ly after a checkpoint is completed, you can ignore this temporary symptom. On the oth-er hand, if you find that you are regularly be-low the PLE threshold there is a very good chance that additional memory will alleviate your problems and at the same time release some resources back to the CPU. All of these counters are found in the SQL Server: Buffer Manager performance object.

Path 2: Query performance

Query plans are evaluated, optimised, com-piled and placed in the procedure cache when a new query is submitted to SQL Server. Each time a query is submitted to the server, the procedure cache is reviewed to at-tempt to match a query plan with a request. If one is not found then SQL Server will cre-ate a new plan for it, which is a potentially costly operation.

Some considerations for T-SQL CPU op-timisation are:

• Query plan reuse

• Reducing compiles and recompiles • Sort operations

• Improper joins • Missing indexes • Table/index scans

• Function usage in SELECT and WHERE clauses

• Multithreaded operations

So let’s pull this back into perspective a bit. SQL Server typically pulls data from both memory and from disk, and it’s not of-ten that you are working with just a single data page. More often, you have multiple parts of an application working on a record, running multiple smaller queries or joining tables to provide a full view of relevant data. In OLAP environments, your applications may be pulling millions of rows from one or two tables so that you can consolidate, roll

up and summarise data for a regional sales re-port. In situations like these, returning data can be measured in milliseconds if the data is in memory, but those milliseconds can turn into minutes when retrieving the same data from disk instead of RAM.

The first example is a situation with a high volume of transactions and plan reuse de-pends on the application. Low plan reuse causes a large number of compilations of SQL statements which in turn causes a great deal of CPU processing. In the second ex-ample, the heavy system resource utilisation can cause a system’s CPU to be overly active, as existing data has to be constantly flushed from the buffer cache to make room for the large volume of new data pages.

Consider a highly transactional system, where a SQL statement like the one shown below is executed 2000 times over a 15- minute period in order to retrieve shipping carton information. Without query plan re-use you could hypothetically have an indi-vidual execution time of around 450ms per

When tracing your SQL Server app, it pays to become familiar with the stored procedures used for tracing. Using the GUI interface (SQL Server Profiler) for tracing can increase system load by 15 to 25 percent. If you can use stored procedures in your tracing, this can drop by about half.

When I know that my system is bottlenecked somewhere and I want to determine which current SQL statements are causing problems on my server, I run the query that you see below. This query helps me get a view of individual statements and the resources that they are currently using, as well as statements that need to be reviewed for performance en-hancements. For more information on SQL traces, see msdn2.microsoft. com/ms191006.aspx.

SELECT

substring(text,qs.statement_start_offset/2 ,(CASE

WHEN qs.statement_end_offset = -1 THEN len(convert(nvarchar(max), text)) * 2 ELSE qs.statement_end_offset

END - qs.statement_start_offset)/2) ,qs.plan_generation_num as recompiles ,qs.execution_count as execution_count

,qs.total_elapsed_time - qs.total_worker_time as total_wait_time ,qs.total_worker_time as cpu_time

,qs.total_logical_reads as reads ,qs.total_logical_writes as writes FROM sys.dm_exec_query_stats qs

CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st LEFT JOIN sys.dm_exec_requests r

ON qs.sql_handle = r.sql_handle ORDER BY 3 DESC

(6)

query could probably run in about 2ms, bringing the total execution time down to about 5 seconds. USE SHIPPING_DIST01; SELECT Container_ID ,Carton_ID ,Product_ID ,ProductCount ,ModifiedDate FROM Container.Carton WHERE Carton_ID = 982350144;

Query plan reuse is critical for optimal performance on highly transactional systems and it is most often achieved by parameteris-ing your queries or stored procedures. Here are a few excellent resources for information on query plan reuse:

• Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005 (microsoft.com/technet/prodtechnol/ sql/2005/recomp.mspx)

• Optimizing SQL Server Stored Procedures to Avoid Recompiles (microsoft.com/uk/sqlserver)

• Query Recompilation in SQL Server 2000 (msdn2.microsoft.com/aa902682.aspx) One helpful place that has a wealth of in-formation is the SQL Server 2005 dynamic management views (DMVs). When CPU util-isation is high, there are a couple of DMVs that I use to help me determine if the CPU is being used appropriately or not.

with a means to determine each resource type or function that SQL Server uses and it mea-sures the amount of time the system waits because of that resource. The counters in this DMV are accumulative. This means that in order to get a clean look at what resourc-es could be affecting different areas of the system, you will first have to issue a DBCC SQLPERF(‘sys.dm_os_wait_stats’, CLEAR) command to reset all of the counters after you review the data for any outstanding is-sues. The sys.dm_os_wait_stats DMV is the equivalent of the database consistency check DBCC SQLPERF(WAITSTATS) command in SQL Server 2000. You can find out more about the different wait types in SQL Server Books Online at msdn2.microsoft.com/ ms179984.aspx.

It’s important to know that waits are typi-cal in a system even when everything is run-ning optimally. You just have to determine if the waits are being affected by a CPU bottle-neck. Signal waits should be as low as pos-sible in relation to the overall wait time. The amount of time a particular resource waits for a processor resource can be determined simply by subtracting the signal wait time from the total wait time; this value should not be greater than approximately 20 per-cent of the total wait time.

The sys.dm_exec_sessions DMV shows all of the open sessions on the SQL Server. This DMV provides a high-level view of the performance of each session and all of the work that each session has performed from its inception. This includes the total amount of time the session has spent waiting, total CPU usage, memory usage and a count of reads and writes. The DMV will also provide you with the login, login time, host machine and the last time the session made a request of SQL Server.

Using the sys.dm_exec_sessions DMV, you will be able to determine only the active ses-sions, so if you are seeing high CPU utilisa-tion this is one of the first places to look. Review the sessions that have a high CPU count first. Determine the application and the user that has been performing the work, then start to dig deeper. Pairing sys.dm_ exec_sessions with the sys.dm_exec_requests SELECT es.session_id ,es.program_name ,es.login_name ,es.nt_user_name ,es.login_time ,es.host_name ,es.cpu_time ,es.total_scheduled_time ,es.total_elapsed_time ,es.memory_usage ,es.logical_reads ,es.reads ,es.writes ,st.text FROM sys.dm_exec_sessions es

LEFT JOIN sys.dm_exec_connections ec ON es.session_id = ec.session_id LEFT JOIN sys.dm_exec_requests er ON es.session_id = er.session_id

OUTER APPLY sys.dm_exec_sql_text (er.sql_handle) st WHERE es.session_id > 50 -- < 50 system sessions ORDER BY es.cpu_time DESC

(7)

SQL Server

DMV can provide much of the information that is available through the sp_who and sp_who2 stored procedures. If you join this data together along with the sys.exec_sql_

text dynamic management function (DMF) through the sql_handle column, you can get the session’s currently running query. The snippet in Figure 3 shows how to pull this data together to help determine what is cur-rently happening on a server.

I find that this statement helps determine which applications you need to focus on. When I compare CPU, memory, reads, writes and logical reads for all of the sessions within an application and I determine that the CPU resource is much higher than other resourc-es that are utilised, I start focusing on those SQL statements.

To track SQL statements historically for an application I use SQL Server traces. You can get to these either through the SQL Server Profiler tool or through the trace system stored procedures to help evaluate what is going on. (See the sidebar ‘SP trac-ing’ for more information on this topic.) Profiler should be reviewed for statements with high CPU utilisation as well as hash and sort warnings, cache misses and other red flags. This can help you narrow in on scific SQL statements or a spescific time pe-riod that caused high resource utilisation. Profiler is capable of tracking things like the SQL statement text, execution plans, CPU usage, memory usage, logical reads, writes, caching of query plans, recompiles, ejection of query plans from the cache, cache misses, table and index scans, missing statistics, and many other events.

After I have collected data from either sp_ trace stored procedures or the SQL Server Profiler, I generally use a database, which is populated with trace data either after the fact or by setting the trace to write to the da-tabase. Populating the database after the fact can be done using the SQL Server system function called fn_trace_getinfo. The bene-fit of this approach is that I can query and sort the data in multiple ways to see what SQL statements used the most CPU or had the most reads, count how many recompiles occur, and many other things. Here’s an ex-ample of how this function is used to load a table with a profiler trace file. The default specifies that all of the trace files for that trace will be loaded in the order that they were created: SELECT * INTO trc_20070401 FROM fn_trace_gettable(‘S:\Mountpoints\TRC_20070401_ 1.trc’, default); GO Wrapping it up

As you’ve seen, high CPU utilisation doesn’t necessarily indictate that there is a CPU bottleneck. High CPU utilisation can be masking a number of other application or hardware bottlenecks too. Once you’ve iden-tified that you have high CPU utilisation despite your other counters looking healthy, you can start looking for the cause within the system, and isolate a solution (whether it be purchasing more CPUs or optimising your SQL code). And whatever you do, don’t give up! With the tips provided in this article, along with a bit of practice and research, op-timising CPU utilisation under SQL Server is an attainable execution plan.

For more information, visit the SQL Server Tech Center: http://technet.microsoft.com/

en-gb/sqlserver/default.aspx ■

Zach Nichter is a SQL Server

profes-sional with more than 10 years of experience. He has held a number of SQL Server support roles including DBA, team lead, manager and consultant. Currently, Zach is employed by Levi Strauss & Co. as the DBA Architect, focusing on SQL Server performance, monitoring and architecture as well as other strategic initia-tives. In addition, Zach is the author of a video blog found on www.sqlcatch.com.

Optimising SQL Server CPU performance

SQL Server

Troubleshooting

performance problems

on a database system

can be an overwhelming

task. Knowing where

to look for trouble is

important, but more

crucial is knowing why

your system reacts the

way that it does to a

particular request.

A number of factors can

affect CPU utilisation

Zach Nichter

Optimising SQL Server

CPU performance

Profiler can help you

narrow in on specific

SQL statements or a

specific time period

that caused high

resource utilisation