• No results found

TUNING CONSIDERATIONS FOR DIFFERENT APPLICATIONS : Objectives :

• Use the available data access methods to tune the logical design of the database.

• Identify the demands of online transaction processing systems (OLTP)

• Identify the demands of decision support (DSS) systems

• Reconfigure systems on a temporary basis for particular needs.

OVERVIEW :

1. Data Design phase

2. Logical structure of the database Data Design phase

The database design process undergoes a normalization stage in which data is analysed to ensure that no reduldant data is held anywhere. However you amy need to denormalise it for performance reasons.

Another consideration is the avoidance of contention on data eg. Consider a database 1 terrabyte in size on which a thousand of user access only 0.5% of data. The hotspot could cause performance problems.

Logical structure of the database : This primarily concerns

The index design ensuring that the data is neither over nor under indexed (this doesn’t encounter the PK indexes, designed in the previous stage, but additional indexed to support the application).

The correct use of different types of indexes

- B-tree

- Bitmap

- Reverse

The appropriate use of sequences, clusters, index-organised tables The necessity for histograms collection for the CBO

The use of parallelised queries The optional use of partioning data OLTP :

• These are high throughput, Insert / updata intensive segments

• Contain large volumes of data that

- grow continuously

- are accessed concurrently by hundreds of users

• The tuning goals are - Availability

- Speed

- Concurrency - Recoverability DSS

- Perform queries on large amounts of data - Make heavy use of full table scans The tuning goals are

- High response time

When you design a DSS you must ensure that queries on large amounts of data can be performed within a reasonable time.

The parallel query is particularly designed for DSS.

Multipurpose Applications :

• Combination of OLTP and dSS

• Hybrid system rely on several configurations

Data Access Method :

To enhance performance, you can use the following data access methods

⇒ Indexes

Bitmap Reverse Key B-Tree

⇒ Index organized tables Clusters Histograms

B-Tree Index :

When to use B-Tree Indexes

B-Tree indexes improves the performance of queries that select a small percentage of rows from a table. As a general guidelines, you should create indexes on tables that are often queried for less than 10% to 15% of the table’s rows. This value may be higher in situations where all data can be retrieved from an index, or where the indexed columns can be used for joining to other tables.

INDEX_ORGANISED TABLES :

IOT is like a regular table with index on one or none of its columns, but instead of maintaining two separate segments for table and tree index, the database system maintains one single B-Tree index which contains both :

• The primary key value

• The other column value for the corresponding row performance Benefits :

- There is no duplication of the values for the primary key column (index and table column in indexed tables) less storage requirements

- Index organized tables provide faster key based access for queries involving exact match or range search or both.

When to create Index_Organised Tables:

IOT are suitable for frequent data access through primary key or through any key that is prefix of the primary key, such as in applications using inverted indexes used in text searches.

These indexes keep the value and all its locations together. Therefore each word has one entry and that entry records all the places where the word occurs. Indexed_organized tables are very efficient mechanisms for modeling inverted indexes.

Row Overflow :

PCTTHRESHOLD CLAUSE INCLUDING CLAUSE OVERFLOW CLAUSE Syntax for Index_organized Tables

SQL>CREATE TABLE scolt.sales

office_cd NUMBER (3)

qtr_end DATE

revenue NUMBER (10,2)

review VARCHA R 2 (1000)

CONSTRAINT sales_pk

PRIMARY KEY (office_cd), qtr_end)

ORGANISATION INDEX TABLE SPACE INDEX

PCTTHRESHOLD 20

INCLUDING review

OVERFLOW TABLE SPACE user_data

PCTTHRESHOLD CLAUSE :

This clause specifies the percentage of space reserved in the index block for an

index_organized table flow. If a row exceeds the size calculated based on this value, all columns after the column named in the INCLUDING Clause are moved to the overflow segment. If overflow is not specified then rows exceeding the threshold are rejected. PCTTHRESHOLD default to 50 and must be a value from 0 to 50.

Including clause : This clause specifies a column at which to divide an index organized table row into index and overflow portions. All columns that follow the INCLUDING keyword are stored n the overflow data segment. If this is not specified and a row size exceeds

PCTTHRESHOLD, all columns except the primary key columns will be moved to the overflow area. The column is either the name of the last primary key column or any non primary

keycolumn.

OVERFLOW CLAUSE AND SEGMENT : this clause specifies that index organized table data rows exceeding the specified threshold are placed in the data segment defined by the segments attributes, which specify the tablespace, storage and block utilization parameters.

DICTIONARY VIEWS ON IOT

SQL> select table name, table space name, iot name, iot type from DBA_TABLES SQL> select index name, index type, table space name, table name from dba indexes.

SQL> select segment name, table space name, segment type from dba segments CLUSTERS

Definition : A cluster is a group of one or more tables that share the same data blocks because they share common columns and are often used together in join queries.

Performance Benefits :

- DISK I/O is reduced and access tune improved for joins of clustered tables

- Each cluster key value is stored once for all the rows of the same key value therefore it uses less storage.

Performance Consideration : Full table scans are generally slower on clustered tables than on non-clustered tables.

Cluster Type :

- Index cluster - Hash cluster

Index cluster : An index cluster uses an index known as cluster index to maintain the data within the cluster.

• The cluster index must be available to store access or maintain data in an index cluster

• The cluster index is used to point to the block that contains the rows with a given key value.

• The structure of a cluster index is similar to that of normal index. Although a normal index does not sotre a NULL key value, cluster index store NULL keys. There is only one entry for each key value in the cluster index. Therefore they are likely to be smaller than a normal index on the same set of key values

• To store or retrieve rows from a cluster the Oracle server uses the cluster index to locate the first row that corresponds to given key value and then retrieves the rows for the given key.

• If several rows in an index cluster have the same cluster key, the cluster key is not repeated for each row. In a table with large no.of rows per key value, use of an index cluster may reduce the amount of space needed to store data.

Hash Cluster : A Hash Cluster uses a function to calculate the location of the row. The hash function uses the cluster key and can either user defined or system generated:

- when a row is inserted into a table in a Hash Cluster

• The hash key columns are used to compute a hash value.

• The row is stored based on the hash value.

The hash function is used to locate the row while retrieving the data from a hashed table.

For equality searches that use the cluster key a hash cluster can provide greater performance gains than an index cluster

• Only one segment to scan

• Very fast access to the row with the hash function providing the row address immediately.

Situations where clusters are Useful : When not to use clusters :

• Full scan is often executed on one of the clustered tables. This table is stored on more blocks than if it had been created alone.

• If the data from all tables with the same cluster key value exceeds more than one or two oracle blocks to access a row in a clustered table, oracle reads all blocks containing rows with the same value.

• Partitioning is not compatible with clustering.

When not to use hash clusters :

• If the table is constantly growing and if it is impractical to rebuild a new large hash cluster.

• If your application often performs full table scans and you had to allocate a great deal of space to the hash cluster in anticipation of the table growing.

OLTP Requirements : - Space Allocation :

• Avoid the performance load of dynamic space allocation allocate space explicitly to tables clusters and indexes.

• Check growth patterns regularly to find the rate at which extents are being allocated so that you can plan extents creation.

- Indexing :

• Indexing is critical to data retrieved in OLTP systems. DML statements on indexed tables need index maintenance and this is a significant performance overhead. So your indexing strategy must be closely geared to the real needs of the application.

• Indexing a foreign key helps child data to be modified w/o locking the parent data.

• B-Tree indexing is preferred to bitmap indexing because of locking issues affecting DML operations when a B-Tree index entry is locked a single row is locked.

• Reverse key indexes avoid frequent B-Tree block splits for sequence columns. They also make oracle parallel server application faster.

• You need to rebuild indexes regularly.

OLTP Requirements : Roll back segments : Short Transactions :

Transactions are likely to be short, which has consequences for Roll back segment configuration.

• They are unlikely to run out of roll back segment space.

• They need enough roll back segments to prevent contentions for transaction tables.

To get the correct number of roll back segments, you need to know about the transaction pattern. For ex. Consider a system with the following characteristics

• 170 users logged on

• each use executing three transactions per minute, on average.

• Each transaction lasting on e second on average

There is only a very small mathematical probability that at any given time more than 8 transactions are active. So 8 roll back segments are enough.

The roll back segment extent size can be relatively small (10k if these small transactions are the only ones on the system)

You will need to set MINEXTENTS to be atleast 10 for small databases and 20 for large databases because

• Dynamic extension of roll back segment is just as much of a performance issue as dynamic extension of tables.

• It also reduces the risk of hitting extents in use when wrapping roll back segments

OLTP Application Issues.

- Use constraints instead of application code - make sure that code is shared

- use bind variables rather than literals.

Integrity constraints :

If there is choice between keeping application logic in procedural code or using declarative constraints, bear in mind that constraints are always less expensive to process. Referential integrity and CHECK constraints are the main types to consider here.

Shared Code :

Otherwise, you need to make certain that code is shared by stored procedural objects, such as packages, procedures and functions.

Bind Variables :

You want to keep the overhead of parsing to a minimum. Try to ensure that the application code uses bind variables rather than literals.

DSS Requirement Storage Allocation :

• Set DB_BLOCK_SIZE to the maximum

• Set DB_FILE_MULTI_BLOCK_READ_COUNT carefully

• Ensure that extent sizes are multiples of this numner

• Run analyze regularly 1. DB_BLOCK_SIZE

You should normally set the DB_BLOCK_SIZE parameter to the maximum value supported for your platform. Even if ithis means recreating a large database, it almost certainly pays of because a large block size facilitates read extensive operations that are characteristic of DSS application.

2. DB_FILE_MULTI_BLOCK_READ_COUNT parameter :

It determines during fully table scans how many databases blocks are read with a single operating system read calls 8 or 8k block size, 4 or 16k block size.

A large value gives cheaper table scan cost and favours table scans over indexes.

DSS Requirement continued :

• Indexing

- Evaluate the need for indexes - use bitmap indexes when possible

- use index_organized tables for large data retrieval by pk

- generate histograms for data indexes that are distributed non uniformaly.

• Clustering

- Hash clusters for performance access

• Partitioning

Indexing : consider how you can minimize the space and performance overhead of index maintenance. Since, most queries use full table scans, you could :

• Dispense with indexes altogether

• Maintain them only for few tables which are accessed selectively

• Regularly generate histograms for data indexes that are distributed non-uniformly

• Choose bitmap indexes for queries or columns with few distinct values They offer much faster retrieval access

For bulk inserts and updates, you must set the sorting init.ora parameters appropriately : SORT_AREA_SIZE, BITMAP_MERGE_AREA_SIZE,

CREATE_BITMAP_AREA_SIZE

• Use index_organized tables for a faster key based access to tables data for queries involving exact match and for range search and complete row data retreval.

Clustering : Both types of clusters are to be considered and especially has clusters for their best access performance, excluding the tables growing regularly during bulk loads, except if you have the possibility to recreate the cluster.

Partitioning : Though an option, this feature must be considered for large amounts of data for which queries concentrate access on rows that were generated recently. It offers partition scan instead of full table scan.

DSS Application Issues :

• Parse time is less important

• Execution plan must be optional

- use parallel query feature

- tune carefully, using hints if appropriate - test an realistic amounts of data.

• Consider using PL/SQL functions to code logic into queries

• Bind variables are problematic.

Parse Time ;

The time taken to parse SELECT statements is likely to be very small proportion of the time taken to execute the query. Tuning the library cache is much less of an issue for DSS thaw for OLTP.

Your priority is an optional access path in the execution plan, small variations can cost minutes or hours. Developers must

• Use parallel zed queries which enable multiple processes to work together simultaneously to process a single SQL statement symmetric multiprocessor (SMP) clustered or

massively parallel processing (MPP) configurations gain the largest performance benefits because the operation can be effectively split among many CPUs on a single system.

• Use the explain plan command to tune SQL statements and hints to control access paths.

If your application logic uses bind variables you look the befit of this feature : The optimizer makes a blanket assumption about the selectivity.

Ex : if you use the following statement, the value of :1 is not known when the optimizer determines the execution plan.

SQL>SELECT*DROM big table Where number column = :1

The optimizer cannot calculate the no. of rows returned by this step exactly. Theis may mean that the server may use a sub-optional execution plan.

So although it is recommended strongly to use bind variable in OLTP systems with index access, they are not ideal for decision support systems since there are few opportunities that the same statement is executed twice.

Parameters for hybrid systems

• Memory use

- SHARED_POOL_SIZE

- DB_BLOCK_BUFFERS These parameter will have higher values for

- SOR_AREA_SIZE day time (i.e. OLTP)

• Parallel query

- reconfigure parameters for DSS.

Parallel query

• During the day both PARALLEL_MIN_SERVERS & PARALLEL_MAX_SERVERS could be set to zero to prevent parallelisation

• Full table scans of large tables can be restricted : use different day time profiles to limit LOGICAL_REDS_PER_CALL OR CPU_PER_CALL and assign these profiles to users.

• In off-peak time, you can reset these parameters to the appropriate number and reassign night time profiles to the DSS querying users.

SQL Tuning:

Objectives

• Use oracle tools to diagnose SQL statement performance

• Track and register module usage for packages, procedures and triggers

• Identify alternative SQL statements to enhance performance Overview:

• Application tuning is the most important part of tuning

• Database administrators :

- may not directly involved in application tuning

- must be familiar with the impact that poorly written SQl statements can have upon database performance