Most ORDBMSs, including Oracle [Ora 99a] [SMS+ 00], IBM DB2 [IBM 99] [CCF+ 99], or Informix IDS/UDO [Inf 98] [BSSJ 99], provide extensibility interfac- es in order to enable database developers to seamlessly integrate custom object types and predicates within the declarative DDL and DML. In this section, we shortly outline the extensible indexing and optimizing frameworks of modern ORDBMSs.
3.2.1 Extensible Indexing
An important requirement for applications is the availability of user-defined ac- cess methods. Extensible indexing frameworks proposed by Stonebraker [Sto 86] enable developers to register custom secondary access methods at the database server in addition to the built-in index structures. An object-relational indextype encapsu-
Extensible Query Language 33
lates stored functions for creating and dropping a custom index and for opening and closing index scans. The row-based processing of selections and update operations follows the iterator pattern [GHJV 95]. Thereby, the indextype complements the functional implementation of user-defined predicates. Figure 16 shows some basic indextype methods invoked by extensible indexing frameworks. Additional func- tions exist to support query optimization, and user-defined aggregates.
If the optimizer decides to include a custom index into the execution plan for a declarative DML statement, the appropriate indextype functions are called by the built-in query processor of the database server. Thereby, the maintenance and access of a custom index structure is completely hidden from the user, and the desired data independence is achieved. Furthermore, the framework guarantees any redundant index data to remain consistent with the user data.
Exemplarily, we create an object type CADOBJECT to encapsulate the data and semantics of three-dimensional CAD objects. Instances of this custom object type are stored as elements of relational tuples. Figure 17 depicts some of the required ob- ject-relational DDL statements in pseudo SQL thus abstracting from technical details which depend on the chosen product.
After having created a custom indextype CADINDEX for the intersect predicates on the type CADOBJECT, we can create an index CADIDX on the geom attribute of the CADOBJECTS table by submitting the usual DDL statement (cf. Figure 18).
By using either the functional or the index-based binding of the user-defined pred- icate intersect_boolean, object-relational queries can be expressed in the usual de- clarative fashion (cf. Figure 19a).
Function Task
index_create(),
index_drop() Create and drop a custom index. index_open(),
index_close() Open and close a custom index.
index_fetch() Fetch the next record from the index that meets the query predicate.
index_insert(), index_delete(), index_update()
Add, delete and update a record of the index.
The integration of ranked intersection queries is a little bit more complex, but can be achieved by means of ancillary operators [Ora 99b]. An ancillary operator, e.g.
Rank, has a functional implementation that has access to state generated by the index
based implementation of the primary operator, e.g intersect_ranked, occurring in the WHERE clause. By means of a common tag, e.g. 1, the ancillary operator and the primary operator are connected to each other (cf. Figure 19b). Note, the execution of this statement requires an index scan for the intersect_ranked operator [Ora 99b].
3.2.2 Extensible Optimizing
Query optimization is the process of choosing the most efficient way to execute a declarative DML statement. Object-relational database systems typically support rule-based and cost-based query optimization, whereby the cost-based approach is preferable to the rule-based approach when referencing user-defined methods as
// Type declaration
CREATE TYPE VOXEL AS OBJECT (x NUMBER, y NUMBER, z NUMBER); CREATE TYPE VOXELSET AS TABLE OF VOXEL;
CREATE TYPE CADOBJECT AS OBJECT ( voxels VOXELSET,
MEMBER FUNCTION intersect_boolean (o CADOBJECT) RETURN BOOLEAN,
MEMBER FUNCTION intersect_ranked (o CADOBJECT) RETURN NUMBER
) ;
// Type implementation // …
// Functional predicate binding
CREATE OPERATOR intersect_boolean (a CADOBJECT, b CADOBJECT) RETURN BOOLEAN BEGIN RETURN a.intersect_boolean(b); END; CREATE OPERATOR intersect_ranked (a CADOBJECT, b CADOBJECT) RETURN NUMBER BEGIN RETURN a.intersect_ranked (b); END;
// Table definition
CREATE TABLE CADOBJECTS
(id NUMBER PRIMARY KEY, geom CADOBJECT);
Extensible Query Language 35
predicates [BO 99][HS 93]. The extensible indexing framework comprises interfac- es to tell the built-in optimizer about the characteristics of a custom indextype. Figure 20 shows some cost-based functions, which can be implemented to provide the optimizer with feedback on the expected index behavior. The computation of custom statistics is triggered by the usual administrative SQL statements. With a cost model registered at the built-in optimizer framework, the cost-based optimizer is able
CREATE TYPE CADINDEX_IM AS OBJECT ( // Attributes
// ODCII-Functions
STATIC FUNCTION ODCIIndexCreate ..., STATIC FUNCTION ODCIIndexStart ..., MEMBER FUNCTION ODCIIndexFetch ..., MEMBER FUNCTION ODCIIndexClose ..., ....
// Additional Functions .... );
CREATE INDEXTYPE CADINDEX
FOR intersect_boolean(CADOBJECT, CADOBJECT), intersect_ranked (CADOBJECT, CADOBJECT) USING CADINDEX_IM;
CREATE INDEX CADIDX ON CADOBJECTS (geom) INDEXTYPE IS CADINDEX;
Figure 18: A custom index CADINDEX for CAD objects.
a) Implementation of indextype, b) Creation of indextype, c) Creation of an instance
a)
b)
c)
// Collision query
SELECT id FROM CADOBJECTS
WHERE intersect_boolean (geom, :query_obj);
// Ranked collision query
SELECT id, Rank (1) FROM CADOBJECTS db
WHERE intersect_ranked (geom, :query_obj,1)
Figure 19: SQL-statements for intersection queries. a) Boolean intersection queries, b)Ranked intersection queries a)
to rank the potential usage of a custom access method among alternative access paths. Thus, the system supports the generation of efficient execution plans for que- ries containing user-defined predicates. This approach preserves the declarative par- adigm of SQL, as it requires no manual query rewriting.
To sum up, the main advantages of extensible indexing and optimizing frame- works are:
• The maintenance and access of a custom index structure is completely hidden from the user, achieving thereby data independence.
• Any redundant index data remains consistent with the user data. The declarative paradigm of SQL is preserved.
• The index structure can be integrated into the cost-based query optimization pro- cess.