• No results found

EXPLAIN DESCRIBES THE SELECTED ACCESS PATHS It is quite easy to identify suspicious access paths, particularly if the EXPLAIN

Reactive Index Design

EXPLAIN DESCRIBES THE SELECTED ACCESS PATHS It is quite easy to identify suspicious access paths, particularly if the EXPLAIN

output is stored in a table, thereby enabling easy accessibility. This is why the analysis of SQL statements, for which the optimizer has selected a suspi- cious access path, often starts the index improvement process. The following performance alarms that we have studied in some depth are quickly detected with EXPLAIN.

Full Table Scan or Full Index Scan

If the most suspicious access paths, scanning the whole index or the whole table, have not already been checked before cutover, now is the time to do it.

Sorting Result Rows

A sort of the result rows is the most useful alarm after that for full scans. There may be two reasons for the sort:

1. There is not an index that would make the sort unnecessary for the SELECT statement.

2. The access path chosen by the optimizer includes a redundant sort. In the first case, the sort can be eliminated by an improvement of the index (refer to Chapters 4 and 5). The second case will be discussed in Chapter 14.

Often a sort does no harm. For example, an application may have a thousand different SELECT statements with hundreds of them having a sort in their access

path; 90% of these statements may have excellent performance with the sort contributing an insignificant amount to the response time. Thus, the large number of false alarms may make this review method somewhat unproductive.

There are many database consultants who consider the sort to betheenemy. We believe that those consultants who emphasize death by random I/O are more credible.

Cost Estimate

The EXPLAIN feature of several DBMSs display the optimizer’s estimate for the local response time, or at least the CPU time, for the chosen access path. Some products, SQL Server 2000, for instance, show the estimated CPU time and I/O time for each step in the access path. In a simple case there might be several steps such as:

1. Retrieve index rows. 2. Sort pointers. 3. Retrieve table rows. 4. Sort result rows.

Experience has shown that it can be dangerous to place too much reliance on the optimizer’s cost estimates. After all, the optimizer is only producing the cost estimate in order to choose the fastest alternative access path available. The estimates sometimes give false alarms, yet don’t showall the alarms; but it is an easy and fast tool to use, and enables anearly check to be made. Consequently SQL statements with unusually high cost estimates—perhaps the 20 statements that have the highest cost estimates belonging to a new application—should be checked. It is quite likely that inadequate indexing or optimizer problems will be discovered.

Unfortunately, two serious issues limit the value of using the cost estimate in this way:

1. The optimizerestimate for the local response time may be very different from theactual value (refer to Chapter 14).

2. Whenever a predicate uses host variables (which is very common, of course), the optimizer estimate for the filter factor is based on the aver- age input, or even worse, a default value. To obtain the more valuable worst-caseestimate, the host variables in the EXPLAIN must be replaced by the worst input literals. This is a tedious operation that requires appli- cation knowledge.

EXPLAIN is an indispensable tool for analyzing optimizer problems. This is its raison d’ˆetre, not index design. Airport security does not only checksuspicious looking passengers at the airport!

DBMS-Specific EXPLAIN Options and Restrictions

The EXPLAIN features of many DBMSs have been enhanced significantly over the past 10 years. As well as a compact report producing a single line per execu- tion step, graphs are available at the statement level. Thesuspicious access paths may even be marked in some EXPLAIN outputs. The statistics on which the opti- mizer’s decision was based, together with a list of all the current indexes, and their characteristics, for each accessed table may be provided, which is extremely useful; without this, it would be necessary to access this information from the system tables separately. The best EXPLAINs have already integrated the next desirable step: to propose index improvements.

On the other hand, the current EXPLAINs still have a few irritating short- comings. DB2 for z/OS V7 does not report index screening (rejecting a row by using column values copied to the index row). This shortcoming does not directly affect index reviews, but it does mean that some optimizer problems are only revealed by measuring the number of table touches. Oracle 9i does not report the number of index columns involved in the screening process; in fact it doesn’t even report the number of matching columns either (the PLAN TABLE column SEARCH COLUMNS in Oracle 9i is not currently used). Oracle 9i does report the type of index scan: UNIQUE SCAN (one row), RANGE SCAN, FULL INDEX SCAN, or FAST FULL INDEX SCAN. Thus, the most suspicious access paths are revealed, but the lack of the MC value makes it more difficult to ana- lyze optimizer problems caused by difficult predicates (sometimes called index suppression in the Oracle literature).