Parameter sniffing - Troubleshooting SQL Server A Guide for the Accidental DBA

Parameter sniffing is a process used by SQL Server when creating an execution plan for a stored procedure, function, or parameterized query. The first time the plan is compiled, SQL Server will examine, or "sniff", the input parameter values supplied, and use them, in conjunction with the column statistics, to estimate the number of rows that will be touched by the query. It then uses that estimate in its costing of various possible execution plans. A problem only arises if the values that were passed as input parameters on initial plan creation, result in a row count that is atypical of that which will result from future executions of the procedure. Parameter sniffing only occurs at the time a plan is compiled or recompiled, and all subsequent executions of the stored procedure, function, or parameterized query will use the same plan.

During the initial compile, only the values of the input parameters can be sniffed as any local variables will have no value. If a statement within the batch is recompiled, both parameter and variable values can be sniffed, as the variables will, by that time, have values.

By way of an example, we'll use the _{AdventureWorks database again, and specifically the} ShipDate in the Sales.SalesOrderHeader table. This column has a minimum date of 2011/07/08 and maximum date of 2004/08/07. Listing 3.10 shows a stored procedure to find all sales order numbers (also in this table) that are between two given ship dates.

CREATE PROCEDURE user_GetCustomerShipDates

( @ShipDateStart DATETIME , @ShipDateEnd DATETIME ) AS SELECT CustomerID ,

SalesOrderNumber FROM Sales.SalesOrderHeader

WHERE ShipDate BETWEEN @ShipDateStart AND @ShipDateEnd

Listing 3.10: The user_GetCustomerShipDates stored procedure.

This would be supported by a non-clustered index on _{ShipDate, as shown in Listing 3.11.}

CREATE NONCLUSTERED INDEX IDX_ShipDate_ASC ON Sales.SalesOrderHeader (ShipDate)

Listing 3.11: A non-clustered index on the ShipDate column.

Now, we'll execute the stored procedure twice, as shown in Listing 3.12, with the first query spanning a date range of several years, and so returning many rows, and the second one only covering a range of ten days. Be sure to retrieve the actual execution plan with the results.

DBCC FREEPROCCACHE

EXEC user_GetCustomerShipDates '2001/07/08', '2004/01/01' EXEC user_GetCustomerShipDates '2001/07/10', '2001/07/20'

Listing 3.12: Executing the user_GetCustomerShipDates stored procedure, with the large date range query first.

Note that we ran _{DBCC FREEPROCCACHE to clear the plan cache and ensure a new plan is} created. The plan is identical in each case, as shown in Figure 3.6

Figure 3.6: Execution plans for the user_GetCustomerShipDates stored procedure.

In the plan we see that optimizer has chosen not to use the non-clustered index on the ShipDate column, which we created especially for this procedure. The reason is that it is not a covering index (it doesn't include the _{SalesOrderNumber column) and the} number of rows that the optimizer estimated, based on the parameter values for the initial execution, along with the statistics, was too high to make the combination of index seek/key lookup optimal. Hence the optimizer ignores that index and scans a different one.

Now, run Listing 3.12 again, this time not returning the execution plan, but with

STATISTICS IO and STATISTICS TIME enabled. The reason we're doing a separate run is that returning the actual execution has an impact on the query's performance, so executions that return the execution plan should not also be used to check the query's performance statistics. The results are as follows (the separating headers were added manually, for clarity).

==FIRST EXECUTION (LARGE DATE RANGE)===

(Table 'SalesOrderHeader'. Scan count 1, logical reads 686, physical reads 0. SQL Server Execution Times:

CPU time = 16 ms, elapsed time = 197 ms. SQL Server Execution Times:

CPU time = 16 ms, elapsed time = 197 ms. ==SECOND EXECUTION (SMALL DATE RANGE)===

Table 'SalesOrderHeader'. Scan count 1, logical reads 686, physical reads 0. SQL Server Execution Times:

CPU time = 15 ms, elapsed time = 5 ms. SQL Server Execution Times:

CPU time = 15 ms, elapsed time = 5 ms.

The logical reads are shown as 686 reads for both, but it's the CPU time that is of far more interest here. It's worth noting that the majority of the elapsed time is actually the trans- mission and display of the rows, hence this time will be far larger for 17,000 rows (the first set) than for 40 rows (the second).

Now, we'll clear the cache again and flip the order of execution, so the shorter date range query is executed first, as shown in Listing 3.13. This means that, this time, the parameter sniffing process will result in a much lower estimated number of rows.

DBCC FREEPROCCACHE

EXEC user_GetCustomerShipDates '2001/07/10', '2001/07/20' EXEC user_GetCustomerShipDates '2001/07/08', '2004/01/01'

Listing 3.13: Executing the user_GetCustomerShipDates stored procedure, with the shorter date range query first.

As expected, the execution plan has changed dramatically, as shown in Figure 3.7. The smaller number of estimated rows leads the optimizer to use an index seek on our _{IDX_} ShipDate_ASC index, followed by a key lookup to retrieve the remaining rows.

Figure 3.7: New execution plans for the user_GetCustomerShipDates stored procedure.

Run Listing 3.13 again, this time without the execution plan but with the statistics, and you'll see that the plan works just fine for the first execution, but causes problems for the next one.

==FIRST EXECUTION (SMALL DATE RANGE)===

Table 'SalesOrderHeader'. Scan count 1, logical reads 127, physical reads 0, read- ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms. ==SECOND EXECUTION (LARGE DATE RANGE)===

Table 'SalesOrderHeader'. Scan count 1, logical reads 52429, physical reads 0. SQL Server Execution Times:

Now that's not a huge jump in CPU time, but remember that even the second execution is dealing with 17,000 rows in the result set. At larger row counts, this problem can have a very significant impact.

This is a classic, though small-scale, example of parameter sniffing working against us. The plan with the key lookups is only optimal for small row counts (typically < 1% of the table). Above that, the additional I/O and additional CPU required for the key lookups becomes very significant.

There are several different ways to tackle parameter sniffing problems, depending on the situation and on the version of SQL Server you're using.

In document Troubleshooting SQL Server A Guide for the Accidental DBA (Page 95-100)