If the result of a query is to be ordered by a column or group of columns, the query optimizer can use various techniques to sort it.
Suppose we execute the following query:
SELECT customer_no, balance FROM accounts ORDER BY customer_no
There are no indexes presently on the Accounts table, and so the estimated query execution plan must involve table scan. This is shown in Figure 4.49.
Figure 4.49: Graphical query execution plan showing an ORDER BY using a table scan The stream of rows resulting from the table scan is passed to a Sort operator. This sorts the output according to the ORDER BY clause of the query. SQL Server will attempt to perform the sort in memory, but, if the query has insufficient memory, disk space will be used. This situation can be detected by the Sort Warnings event in the SQL Profiler. Suppose we create a clustered index on the customer_no column and repeat the query. The estimated query
Figure 4.50: Graphical query execution plan showing an ORDER BY using a clustered index The Sort operator has disappeared, and the Table Scan operator has been replaced by the Clustered Index Scan operator. The query optimizer, knowing that the clustered index will return the data in key sequence order (customer_no column order), can scan the clustered index and therefore avoid any sort.
Suppose the clustered index on the customer_no column is replaced by a nonclustered index. The estimated query execution plan is shown in Figure 4.51.
Figure 4.51: Graphical query execution plan showing an ORDER BY using a table scan rather than a nonclustered index
Note that this is the same estimated query execution plan as that shown in Figure 4.49. In other words, the addition of the nonclustered index makes no difference—a table scan is still performed. This should come as no surprise, because of what we already know. The query optimizer has decided that a table scan and sort are less costly than the many data page requests that would result by scanning the nonclustered index leaf level and fetching the data pages. However, if the range of rows is reduced, the nonclustered index will be used. Suppose the following query is executed:
SELECT customer_no, balance FROM accounts WHERE customer_no BETWEEN 1000 AND 1100 ORDER BY customer_no
The estimated query execution plan is shown in Figure 4.52.
Figure 4.52: Graphical query execution plan showing an ORDER BY using a nonclustered index
There is no Sort operator, because the query optimizer knows that the leaf level of the nonclustered index is in account_no column order, so scanning it will return the data rows in the correct sequence.
Suppose we execute the following query:
SELECT customer_no, balance FROM accounts WHERE customer_no BETWEEN 1000 AND 1100 ORDER BY customer_no, balance
The estimated query execution plan is shown in Figure 4.53.
Figure 4.53: Graphical query execution plan showing an ORDER BY using a nonclustered index plus a sort
Since we only have a nonclustered index present on the customer_no column in the Accounts table, the query optimizer will need to place a Sort operator in the query plan to ensure that the data is sorted by the balance column also. A composite index will fix this. Let us create a nonclustered index on a composite key of the customer_no and balance columns. If we execute the query again, the Sort operator disappears, as shown in Figure 4.54.
Figure 4.54: Graphical query execution plan showing an ORDER BY using a composite nonclustered index
Suppose we now ask for the result to be sorted in descending order of balance, as follows: SELECT customer_no, balance FROM accounts
WHERE customer_no BETWEEN 1000 AND 1100 ORDER BY customer_no, balance DESC
We can see from the estimated query execution plan shown in Figure 4.55 that the Sort operator has returned.
Figure 4.55: Graphical query execution plan showing an ORDER BY using a composite nonclustered index and a descending order
Figure 4.55 illustrates an important fact. SQL Server can index scan equally well in both directions on a single-column index. This is because, as we know, the index pages at a given level are linked by next and prior pointers pointing to the next and previous index pages at that level.
However, if an ORDER BY clause specifies different directions for the key columns, a Sort operator is needed. If such a sort is a frequent requirement, the index can be defined with a mix of ascending and descending keys. In our example, we can create the index as follows: CREATE INDEX nciCustomerNoBalance
ON accounts (customer_no, balance DESC)
The index keys will now be stored in ascending order of customer number but descending order of balance. The resulting estimated query execution plan is shown in Figure 4.56.
Figure 4.56: Graphical query execution plan showing an ORDER BY using a composite nonclustered index and a descending index key
We can see that the Sort operator has again disappeared. Finally, let us return to our earlier query:
SELECT customer_no, balance FROM accounts ORDER BY customer_no
Again, we will ensure that only a nonclustered index on the customer_no column of the Accounts table is present. We know that the estimated query execution plan for this query involves a Table Scan operator, as was shown in Figure 4.50. The query optimizer decided that this plan was cheaper than using the nonclustered index. In fact, the query uses 1,570 logical reads.
In the case of a table scan and sort no rows will be returned until the result set has been sorted. If the nonclustered index was chosen by the query optimizer, the first row could be returned
immediately. This behavior can be forced with the FASTFIRSTROW query optimizer hint, as follows:
SELECT customer_no, balance FROM accounts WITH (FASTFIRSTROW)
ORDER BY customer_no The query plan is shown in Figure 4.57.
Figure 4.57: Graphical query execution plan showing an ORDER BY using a nonclustered index and a FASTFIRSTROW query optimizer hint
The query returns the first row instantly, but overall the query now uses 25,048 logical reads. The penalty, therefore, for this rapid return of the first row is usually a slower query response time overall caused by more logical reads.