4.4 Object Database Systems
In the mid-eighties, a new class of database systems combining object-oriented programming languages with database technology became available. Recently, a standard for this class of database systems has been released [Cat93]. In accordance with the standard definition, we call these systems object database systems (ODBS). Meanwhile, a substantial number of systems such as GemStone [CM84], O2 [LRV88], Ontos [AHS91], Versant [Gor87], Orion⁸ [KBC+89] and ObjectStore [LLOW91] are available as products.
The common features that any ODBS must support were initially defined in [ABD+90]. Besides these mandatory features, a large number of extensions such as versions, views, active capabilities or schema updates have also been proposed. Most of them are important for PSDE construction and have been implemented in one ODBS or another. Among the systems that offer the greatest functionality is the O2 ODBS, since extensions required for PSDE construction have been added in the GOODSTEP project [GOO94]. As we have this system available for practical experiments, we present the review of this class of DBSs using the particular functionality offered by the O2 system and only mention other ODBSs when they have important differences. While doing so, we enhance and refine the arguments presented in [EKS93].
⁸ The product name of Orion was changed to ITASCA. As all of the concepts of the ITASCA product have
Data Definition and Manipulation Language
To implement project-wide abstract syntax graphs, node types are implemented as classes of the database schema. Classes in O2 are defined in the O2C data definition language. GemStone provides a dedicated language called OPAL, and Orion uses CommonLisp, whereas all other ODBSs use C++ as their data definition language. They could, therefore, equally well be considered as persistent C++ programming language implementations. Nodes are then represented by instances of these classes. They are objects whose instance variables represent edges and attributes. Navigation along these edges is done by dereferencing instance variables. It is a slight drawback that instance variables only support navigation in one direction. Therefore, whenever navigation in both directions is required, a pair of instance variables must be defined to implement an edge. For the implementation of multi-valued edges, type constructors such as lists (if the edges are ordered) or sets (otherwise) are used. Navigation is then expressed in terms of an object query language (OQL) or iteration primitives.
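The mapping from nodes and edges to classes and instance variables can be illustrated with a small sketch. It is written in Python rather than O2C, and the node types Procedure and Statement as well as all edge names are hypothetical; the intention is only to show a pair of instance variables implementing a bidirectional edge, and lists and sets implementing ordered and unordered multi-valued edges.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # eq=False: nodes are compared by identity, like database objects
class Statement:
    text: str                                 # attribute
    parent: Optional["Procedure"] = None      # single-valued edge (reverse of `body`)


@dataclass(eq=False)
class Procedure:
    name: str                                                  # attribute
    body: list[Statement] = field(default_factory=list)        # ordered multi-valued edge
    calls: set["Procedure"] = field(default_factory=set)       # unordered multi-valued edge
    called_by: set["Procedure"] = field(default_factory=set)   # reverse of `calls`

    def add_statement(self, stmt: Statement) -> None:
        # Maintain both instance variables of the bidirectional edge together.
        self.body.append(stmt)
        stmt.parent = self

    def add_call(self, callee: "Procedure") -> None:
        self.calls.add(callee)
        callee.called_by.add(self)


main = Procedure("main")
helper = Procedure("helper")
main.add_statement(Statement("x := 1"))
main.add_statement(Statement("helper()"))
main.add_call(helper)

# Navigation by dereferencing instance variables and by iteration.
statement_texts = [stmt.text for stmt in main.body]
```

Updating both directions inside one method keeps the pair of instance variables consistent, which is the bookkeeping that one-directional instance variables impose on the schema designer.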
Type compatibility in an O2C schema is checked at compile-time, which, compared with run-time type checking in OPAL, achieves better type safety and improved performance. The set of target nodes of a particular edge should, therefore, be restricted to those types of nodes that are allowed according to the syntax and static semantics of the language. Therefore, we exploit the type system provided by typed ODBSs (such as O2) to define the types of instance variables as a first step towards type safety.
For schema simplification, multiple inheritance is used to define common properties of nodes, such as outgoing syntactic or non-syntactic edges or attributes, only once in a super class. Subclasses of this super class then inherit the definition. Moreover, edges do not always connect nodes of the same type. Heterogeneous edges, i.e. edges leading to different types of nodes, can be implemented using polymorphism.
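A minimal sketch of these two ideas, again in Python and with hypothetical node types (Node, Named, Literal, Identifier, Assignment), may be helpful. The super classes define the shared properties once, and the edge value of an assignment is heterogeneous because it may lead to any subclass of Node.

```python
from typing import Optional


class Node:
    """Super class: properties common to every node type, defined once."""

    def __init__(self) -> None:
        self.parent: Optional["Node"] = None   # syntactic edge to the enclosing node

    def unparse(self) -> str:
        raise NotImplementedError


class Named:
    """Second super class: the name attribute shared by all named node types."""

    def __init__(self, name: str) -> None:
        self.name = name


class Literal(Node):
    def __init__(self, value: int) -> None:
        Node.__init__(self)
        self.value = value

    def unparse(self) -> str:
        return str(self.value)


class Identifier(Node, Named):   # multiple inheritance
    def __init__(self, name: str) -> None:
        Node.__init__(self)
        Named.__init__(self, name)

    def unparse(self) -> str:
        return self.name


class Assignment(Node):
    def __init__(self, target: Identifier, value: Node) -> None:
        Node.__init__(self)
        self.target = target
        self.value = value       # heterogeneous edge: any subclass of Node
        target.parent = self
        value.parent = self

    def unparse(self) -> str:
        # Polymorphic call: the target node decides how it is unparsed.
        return f"{self.target.unparse()} := {self.value.unparse()}"
```

For example, Assignment(Identifier("x"), Literal(1)) and Assignment(Identifier("x"), Identifier("y")) use the same edge to reach nodes of two different types.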
Integrity constraints are enforced by encapsulation, i.e. applications are not allowed to modify instance variables directly, but must use the methods defined. This form of encapsulation is only made possible by the computational completeness of the schema definition language. In any ODBS, persistence of objects is defined by reachability from a persistent object. The DDL of O2, therefore, includes the concept of names that can be defined in a schema. An object that is assigned to a name at run-time is called a named object. Named objects are persistent. A tool schema will, therefore, include a name whose type is a set of document root nodes. Then any node in a document's subgraph whose root node is included in the set becomes persistent, because all such nodes are reachable from the root node.
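Both encapsulation and persistence by reachability can be made concrete with a short Python sketch. It does not use any O2 interface: the class Node, the integrity constraint it enforces and the function persistent_nodes are assumptions made for illustration, and an ordinary set stands in for the object bound to a schema name.

```python
class Node:
    def __init__(self, label: str) -> None:
        self._label = label
        self._children: list["Node"] = []   # private by convention; use the methods below

    def add_child(self, child: "Node") -> None:
        # Integrity constraint (hypothetical): a node may not be its own child.
        if child is self:
            raise ValueError("a node cannot be its own child")
        self._children.append(child)

    def children(self) -> tuple["Node", ...]:
        return tuple(self._children)


def persistent_nodes(named_roots: set[Node]) -> set[Node]:
    """Return every node reachable from the set bound to the schema name."""
    reachable: set[Node] = set()
    pending = list(named_roots)
    while pending:
        node = pending.pop()
        if node not in reachable:
            reachable.add(node)
            pending.extend(node.children())
    return reachable
```

A node that is never linked into the subgraph of some root in the named set is not contained in the result, which corresponds to an object that does not become persistent.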
Schemas in O2 can be structured into different sub-schemas. Particular sub-schema components can be designated as exports, and other sub-schemas can then import these components. Hence the information hiding paradigm is applied not only to single classes but also at a more coarse-grained level to schemas, i.e. sets of classes.
Views
Various view definition facilities for ODBSs have recently been suggested [SLT91], [Ber92], [AB91], [HZ90]. A feature common to all of them is that they allow the definition of different interfaces for the same objects. For the O2 ODBS, the view mechanism proposed in [AB91] has been implemented [SAD94]. As the implementation is available for practical use, we now consider this view mechanism in more detail.
The view mechanism of O2 allows a tool builder to specify virtual schemas and virtual databases. A virtual schema definition is based on a conceptual schema called root schema. A virtual schema can hide classes defined in the root schema and can modify the interface of classes defined in the root schema. To achieve this modification, virtual classes can be defined on the basis of root class definitions. Objects contained in databases that instantiate the root schema are represented according to the respective virtual class definitions when they are accessed through a virtual schema. We then refer to these objects as virtual objects. Therefore, the virtual schema definition implicitly defines virtual databases.
Object-oriented views partly overcome the view update problem of relational databases. As argued in [SLT91], this is due to the concept of object identity. In contrast to relational databases, where the schema designer must designate unique key attributes to address tuples, ODBSs define object identity in a way that is transparent to the schema designer. In relational databases, views can be constructed that hide primary key attributes of base relations, and these views are then no longer updatable. In object-oriented views, virtual objects always store the object identity of their base object. It is, therefore, always defined to which base object an update of a virtual object must be propagated.
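The role of object identity in view updates can be sketched as follows. The sketch is in Python and does not reproduce the O2 view definition language; the classes Procedure and ProcedureView are hypothetical, and a Python object reference stands in for the stored object identity.

```python
class Procedure:
    """Root class in the conceptual schema."""

    def __init__(self, name: str, comment: str) -> None:
        self.name = name
        self.comment = comment


class ProcedureView:
    """Virtual class: exposes `comment` and hides `name`."""

    def __init__(self, base: Procedure) -> None:
        self._base = base   # reference standing in for the base object's identity

    @property
    def comment(self) -> str:
        return self._base.comment

    @comment.setter
    def comment(self, text: str) -> None:
        self._base.comment = text   # the update is applied to the base object
```

Although the view hides name, an assignment to comment through the view still reaches exactly one base object, because the target is determined by identity and not by the value of a visible attribute.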
The view mechanism of O2 is particularly suitable for a tool builder for defining different views on a project-wide abstract syntax graph, as was required in Section 3.3. As discussed in [Bec95], the overall structure of the graph can be defined in a conceptual schema using O2's schema definition language. Based on this schema, a number of virtual schemas can be defined for tools so that each tool is provided with its own view of the abstract syntax graph. For a class that implements a node type in the conceptual schema, the virtual schema for a tool includes a virtual class that shows only those edges that are of concern to the tool and hides all others. Node types that must not be seen at all can be hidden by not defining a virtual class for the respective class. Moreover, the virtual schema can hide those methods that implement modifications that ought not to be invoked by a tool. It can also add methods that have not been defined in the conceptual schema, for instance to implement different unparsing schemes or different parsers in different tools. Using the view mechanism in this way, a tool builder enables different tools to use different schemas particularly suited to their purposes, while sharing nodes in the project-wide abstract syntax graph with other tools.
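How two tools might share one node through different virtual classes can be sketched in Python. All names (ProcedureNode, ToolView, EditorView, CrossReferenceView) are hypothetical and the sketch covers read access only; it is meant to illustrate the hiding of edges and the addition of a method, not the way O2 implements virtual schemas.

```python
class ProcedureNode:
    """Class of the conceptual schema (hypothetical node type)."""

    def __init__(self, name, body, callers):
        self.name = name
        self.body = body          # ordered edge to statements (plain strings here)
        self.callers = callers    # non-syntactic edge to calling procedures


class ToolView:
    """Base for virtual classes: exposes only the edges listed in `visible`."""

    visible: frozenset = frozenset()

    def __init__(self, base):
        self._base = base

    def __getattr__(self, attr):
        # Reached only when normal lookup fails, i.e. for properties of the base node.
        if attr in type(self).visible:
            return getattr(self._base, attr)
        raise AttributeError(f"{attr!r} is hidden in this view")


class EditorView(ToolView):
    visible = frozenset({"name", "body"})

    def unparse(self):
        # Method added by the view; it is not part of the conceptual schema.
        return f"procedure {self.name}: " + "; ".join(self.body)


class CrossReferenceView(ToolView):
    visible = frozenset({"name", "callers"})


node = ProcedureNode("main", ["x := 1", "y := 2"], ["init"])
editor = EditorView(node)            # both views share the same base node
xref = CrossReferenceView(node)
```

Here the editor can unparse the procedure but cannot see callers, whereas the cross-reference tool sees callers but neither the body nor the unparsing method.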
Schema Updates
Almost all of the above-mentioned ODBSs support incremental updates to an already established schema. Only ObjectStore, Objectivity, Versant and O2, however, enable an existing database to migrate to a changed schema. Of these systems, the O2 ODBS offers the most sophisticated support for controlling migration after a schema update. We, therefore, discuss the problem of schema updates in ODBSs using the particular choices taken in the O2 system.
In the O2 system, changes to bodies of methods can be performed without any additional measures. The change is in place as soon as the transaction that performed the change is completed. Updates to signatures of methods can be done as well. O2 then recompiles all dependent method bodies. The change becomes effective if all dependent bodies have been successfully compiled and the transaction that performed the change has been completed. These schema updates do not affect the consistency of existing databases at all.
Unlike changes to a method definition, changes to instance variable declarations in classes will affect the database if it contains objects of these classes. To support such a change and have all existing objects of the respective class migrate to the new class definition, O2 offers conversion functions [HVZ90]. A conversion function is an O2C function with an input parameter of the type of the old class and a result of the type of the new class. After a change, such a conversion function can be associated with the modified class, and the database system executes this function for each object of the changed class before it is accessed the next time. The tool builder can determine when conversion functions are executed by choosing between an immediate and a lazy strategy. In the immediate strategy, all objects of the changed class are converted during the schema update transaction. In the lazy strategy, an object is only converted when a transaction is about to access it. Whichever strategy is chosen, the effect is always the same for tools operating on the changed schema [FMZ94a].
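The difference between the two strategies can be illustrated with a toy object store in Python. Nothing here is O2's actual interface: the classes ProcedureV1 and ProcedureV2, the function convert and the class Database with its methods are assumptions made for the sketch, and the default value "public" is invented.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProcedureV1:          # class before the schema update
    name: str


@dataclass
class ProcedureV2:          # class after an instance variable has been added
    name: str
    visibility: str


def convert(old: ProcedureV1) -> ProcedureV2:
    """Conversion function: old class type in, new class type out."""
    return ProcedureV2(old.name, visibility="public")


class Database:
    def __init__(self, objects: dict[int, object]) -> None:
        self._objects = dict(objects)
        self._old_class: Optional[type] = None
        self._conversion: Optional[Callable] = None
        self.conversions_run = 0

    def update_schema(self, old_class: type, conversion: Callable, immediate: bool) -> None:
        self._old_class, self._conversion = old_class, conversion
        if immediate:                       # immediate strategy: convert everything now
            for oid in list(self._objects):
                self._convert(oid)

    def _convert(self, oid: int) -> None:
        obj = self._objects[oid]
        if self._old_class is not None and isinstance(obj, self._old_class):
            self._objects[oid] = self._conversion(obj)
            self.conversions_run += 1

    def get(self, oid: int) -> object:
        self._convert(oid)                  # lazy strategy: convert on first access
        return self._objects[oid]
```

With immediate=True the cost of conversion is paid once in the schema update transaction, whereas with immediate=False it is spread over later accesses; in both cases get returns an object of the new class, so a tool cannot observe which strategy was chosen.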
These facilities can be exploited in order to consistently change the structure of an existing abstract syntax graph stored in O2. New types of nodes can be introduced by defining new classes. Edges can be added to or deleted from existing nodes by adding an instance variable to the class or deleting one from it. If an instance variable is added, a conversion function can determine the target node of the edge. Depending on the number of nodes, immediate or lazy conversion of the existing nodes can be performed.
Versions
Only a few of the available ODBSs provide support for versioning at all. ObjectStore and Versant support versions of objects by providing a predefined class from which other classes can inherit the property of being versioned. This can be used to maintain versions of single nodes but is of very limited use for the implementation of versions of subgraphs of the project-wide syntax graph.
Only Orion and O2 provide support for the versioning of composite objects. In Orion, composite objects are defined statically within the schema, whereas in O2 composite objects are determined in a more flexible way at run-time by including objects in a versionable container object. We, therefore, consider the O2 approach in more detail.