optimizer_index_cost_adj
This is the most important parameter of all, and the default setting of 100 is incorrect for most Oracle systems. For OLTP systems, re-setting this parameter to a smaller value (between 10- to 30) may result in huge performance gains!
If you are having slow performance because the CBO first_rows optimizer mode is favoring too many full-table scans, you can reset the optimizer_index_cost_adj parameter to immediately tune all of the SQL in your database to favor index scans over full-table scans. This is a
"silver bullet" that can improve the performance of an entire database in cases where the database is OTLP and you have verified that the
full-table scan costing is too low.
It can also be enabled at the session level by using the alter session set optimizer_index_cost_adj = nn syntax. The
optimizer_index_cost_adj parameter is a great approach to whole-system SQL tuning, but you will need to evaluate the overall effect by slowly resetting the value down from 100 and observing the percentage of full-tale scans. You can also slowly bump down the value of optimizer_index_cost_adj when you bounce the database and then either use the access.sql scripts or reexamine SQL from the
STATSPACK stats$sql_summary table to see the net effect of index scans on the whole database.
Adjustments
We have seen that there are two assumptions built into the optimizer that are not very sensible.
- A single block read costs just as much as a multi-block read - (not really likely, particularly when running on file systems without direction) - A block access will be a physical disk read - (so what is the buffer cache for?)
Set the optimizer_index_caching to something in the region of the "buffer cache hit ratio." (You have to make your own choice about whether this should be the figure derived from the default pool, keep pool, or both).
Another method to define it:
col a1 head "avg. wait time|(db file sequential read)"
col a2 head "avg. wait time|(db file scattered read)"
col a3 head "new setting for|optimizer_index_cost_adj"
select a.average_wait a1, b.average_wait a2,
round( ((a.average_wait/b.average_wait)*100) ) a3 from
(select d.kslednam EVENT,
s.kslestim / (10000 * s.ksleswts) AVERAGE_WAIT from x$kslei s, x$ksled d
where s.ksleswts != 0 and s.indx = d.indx) a, (select d.kslednam EVENT,
s.kslestim / (10000 * s.ksleswts) AVERAGE_WAIT from x$kslei s, x$ksled d
where s.ksleswts != 0 and s.indx = d.indx) b where a.event = 'db file sequential read'
and b.event = 'db file scattered read';
Some results I have obtained from various combinations of hardware platform and IO sub-system.
avg. wait time avg. wait time new setting for (db file sequential read) (db file scattered read) optimizer_index_cost_adj -
.171659257 3.33033582 5
.13254 1.12365 12
.017605522 .104148241 17
1.29639067 2.06954043 63
.535133533 .397919802 134
.940889054 .509830001 185
.537904057 .145183814 370
In real life, this metric is only good enough to give a very rough indicator as to how fast the IO sub-system is. New-value settings below 100 indicate slow disks, anything above 100 might indicate the presence of fast or cache-backed disks (or abuse of the UNIX file system cache).
You have to exaggerate these results for it to have any real influence on the CBO. For example, if the above query suggests a new setting of 63%, you may have to go as low as 1% or 2% before the CBO will actually use an index. Conversely, a suggestion of 370% may need to be bumped up to around 3700% before a full-table or index fast-full scan is favoured.
Optimizer Modes
In Oracle there are four optimizer modes, all determined by the value of the optimizer_mode parameter. The values are rule, choose, all_rows and first_rows. The rule and choose modes reflect the obsolete rule-based optimizer so we will focus on the CBO modes.
The optimizer mode can be set at the system-wide level, for an individual session, or for a specific SQL statement:
alter system set optimizer_mode=first_rows_10;
alter session set optimizer_goal = all_rows;
select /*+ first_rows(100) */ from student;
Oracle offers several optimizer modes that allow you to choose your definition of the “best” execution plan for you:
optimizer_mode=first_rows This is a cost-based optimizer mode that will return rows as soon as possible, even if the overall query runs longer or consumes more computing resources than other plans. The first_rows optimizer_mode usually involves choosing an index scan over a table scan because index access will return rows quickly. Since the first_rows mode favors index scans over full-table scans, the first_rows mode is more appropriate for OLTP systems where the end-user needs to see small result sets as quickly as possible.
optimizer_mode=all_rows This is a cost-based optimizer mode that ensures that the overall computing resources are minimized, even if no rows are available until the entire query has completed. The all_rows access method often favors a parallel full-table scan over a full-index scan, and sorting over pre-sorted retrieval via an index. Because the all_rows mode favors full-table scans, it is best suited for Data Warehouse, decision support systems and batch-oriented databases where intermediate rows are not required for real-time viewing.
optimizer_mode=first_rows_n This is an Oracle9i optimizer mode enhancement that optimizes queries for am small, expected return set. The values are first_rows_1, first_rows_10, and first_rows_100 and first_rows_1000. The CBO uses the 'n' in first_rows_n as an important driver in determining cardinalities for query result sets. By telling the CBO, a priori, that we only expect a certain number of rows back from the query, the CBO will be able to make a better decision about whether to use an index to access the table rows.
While the optimizer_mode is the single most important factor in invoking the cost-based optimizer, there are other parameters that influence the CBO behavior.
Using histograms with the CBO
In some cases, the distribution of values within an index will effect the CBOs decision to use an index vs. perform a full-table scan. This happens when the value with a where clause has a disproportional amount of values, making a full-table scan cheaper than index access.
A column histogram should only be created when we have a highly-skewed column, where some values have a disproportional number of rows. In the real world, this is quite rare, and one of the most common mistakes with the CBO is the unnecessary introduction of histograms in the CBO statistics. The histograms signals the CBO that the column is not linearly distributed, and the CBO will peek into the literal value in the SQL where clause and compare that value to the histogram buckets in the histogram statistics.
As a general rule, histograms are used to predict the cardinality and the number of rows returned in the result set. For example, assume that we have a product_type index and 70% of the values are for the HARDWARE type. Whenever SQL with where product_type=’HARDWARE’
is specified, a full-table scan is the fastest execution plan, while a query with where product_type=’SOFTWARE’ would be fastest using index access.
Because histograms add additional overhead to the parsing phase of SQL, they should be avoided unless they are required for a faster CBO execution plan.
So how do we find those columns that are appropriate for histograms? One exciting feature of dbms_stats is the ability to automatically look for columns that should have histograms, and create the histograms. Again, remember that multi-bucket histograms add a huge parsing overhead to SQL statements, and histograms should ONLY be used when the SQL will choose a different execution plan based upon the column value.
To aid in intelligent histogram generation, Oracle uses the method_opt parameter of dbms_stats. There are also important new options within the method_opt clause, namely skewonly, repeat and auto.
method_opt=>'for all columns size skewonly' method_opt=>'for all columns size repeat' method_opt=>'for all columns size auto' Let’s take a close look at each method option.
The first is the “skewonly” option which very time-intensive because it examines the distribution of values for every column within every index. If dbms_stats discovers an index whose columns are unevenly distributed, it will create histograms for that index to aid the cost-based SQL optimizer in making a decision about index vs. full-table scan access. For example, if an index has one column that is in 50% of the rows, a full-table scan is faster than and index scan to retrieve these rows.
Histograms are also used with SQL that has bind variables and SQL with cursor_sharing enabled. In these cases, the CBO determines if the column value could affect the execution plan, and if so, replaced the bind variable with a literal and performs a hard parse.
begin
dbms_stats.gather_schema_stats(
ownname => 'SCOTT',
estimate_percent => dbms_stats.auto_sample_size, method_opt => 'for all columns size skewonly', degree => 7
);
end;
/
The auto option is used when monitoring is implemented (alter table xxx monitoring;) and creates histograms based upon data distribution and the manner in which the column is accessed by the application (e.g. the workload on the column as determined by monitoring). Using method_opt=>’auto’ is similar to using the gather auto in the option parameter of dbms_stats.
begin
dbms_stats.gather_schema_stats(
ownname => 'SCOTT',
estimate_percent => dbms_stats.auto_sample_size, method_opt => 'for all columns size auto', degree => 7
);
end;
/