THE EXECUTION OF EVALUATION PLANS - On the Implementation of iDBPQL


5.3. THE EXECUTION OF EVALUATION PLANS Markus Kirchberg

[Figure 5.5: diagram of the local heap. Areas shown: REE Stack Areas (environment stacks ES with entries of the form n, OID; n, value; n, *ptr), Main Memory Object Store (objects t, u and v), Evaluation Plan Area, Shared Memory Area of the Persistent Object Store (object x), Shared Memory Area of the Remote Communication Module (objects y and z), Run-Time MetaData Area and DBS MetaData Area. Legend: object references realised as main memory pointers; object references represented by OIDs; references to associated metadata entries.]

Fig. 5.5. Local Heap with Embedded POS and RCM Shared Memory Areas.

- Values reside on stacks and queues that are maintained in REE stack areas.

- Objects t, u and v are physically located in the local main memory object store and have associated run-time metadata entries. Object v is a local transient object, which contains three references: one to the transient object t, one to the transient object u and one to the persistent object x.

References to objects, which are held in the main memory object store, from stacks and queues in local REE stack areas and references between objects in the main memory store are represented as main memory pointers.

- Object x is a local persistent object and is made accessible in the shared memory area of the persistent object store. References appear in the form of object identifiers. Special POScall primitives, which facilitate access to persistent objects, are introduced further below.

- Objects y and z reside in the shared main memory area of the remote communication module, i.e. these objects are 'on loan' from remote ODBS instances. References appear in the form of object identifiers. Access to loaned transient and persistent objects is enabled through the remote communication module. When objects are retrieved from other nodes, so is their metadata information. Object y is a transient object. Thus, its metadata information is added to the local run-time metadata catalogue. In contrast, object z is a persistent object that has its associated metadata information in the DBS metadata catalogue, in a new __schema entry that consists only of the information required to process all objects on loan from this particular schema.

The REE Stack Area. As soon as a new main evaluation plan is ready for processing, the first REE stack sub-area is created. This sub-area is the run-time component of our approach that is most similar to the two-stack abstract machine as defined in the SBA approach. Assuming that we only have one execution stream, the iDBPQL code that makes up the main evaluation plan, together with all evaluation plans that are invoked during processing, is evaluated in this sub-area. However, it is more common that simultaneous execution is utilised. Thus, multiple sub-areas exist within each REE stack area.

Every sub-area contains an environment stack (ES), which consists of frames (in SBA they correspond to sections). As its name suggests, the environment stack represents the environment in which an evaluation plan is executed. Scoping and binding are the two main tasks performed on this stack.

The environment stack can be regarded as a collection of name binders. These binders can appear either as singletons or as collections. They associate external names with transient or persistent entities. A binder is a triple (n, tr, e), where n is an external name, tr is its associated run-time type or class, and e is a transient or persistent iDBPQL entity (e.g. an object (reference), a value, a variable, an evaluation plan etc.). If we deal with a transient entity, e is a pointer to a memory area within the heap (recall the use of pointer swizzling techniques to enhance performance). In contrast, local persistent entities, which are made accessible in the shared memory area of the persistent object store, and migrated objects or objects located on remote ODBS instances are referenced using object identifiers (i.e. e is a value of type __OID).

Definition 5.5. A name binder (of internal type __binder) is a triple (n, tr, e) with the following properties:

- n is a String value of type (char *). It represents an external name, which is bound to the entity e.

- tr is a reference to a __typeInfo or __classInfo structure that identifies the run-time type or class, respectively, of the entity e.

- e is an iDBPQL entity. The internal type and the value of this entity are as follows:

      __OID __oid               if referencing a persistent or remote object;
  e = __object * __mmObject     if referencing an object in the heap; and
      __iDBPQLvalue __value     if holding a simple, structured, collection-type, or NULLable value.


Access to information that is captured by name binders is possible through the '.' (dot) operator. The identifying name, run-time type or bound entity of the top-most name binder on ES may be retrieved by executing top(ES).n, top(ES).tr or top(ES).e, respectively.
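As a minimal sketch of the binder triple (n, tr, e) and of the top(ES).n style of access, the following Python fragment may help. The class names Binder, OID and HeapRef, the use of a plain string for tr and the list-based stack are illustrative assumptions; they are not the internal iDBPQL types.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class OID:
    """Hypothetical stand-in for an object identifier (persistent or remote object)."""
    value: int


@dataclass
class HeapRef:
    """Hypothetical stand-in for a main memory pointer to an object in the heap."""
    target: Any


@dataclass
class Binder:
    n: str    # external name
    tr: str   # run-time type or class (a plain string here, by assumption)
    e: Any    # bound entity: an OID, a heap reference or a value


def top(stack: List[Binder]) -> Binder:
    """Return the top-most binder without removing it."""
    return stack[-1]


ES: List[Binder] = []
ES.append(Binder("x", "SomeClass", OID(4711)))                # persistent object, referenced by OID
ES.append(Binder("v", "SomeClass", HeapRef({"field": 1})))    # transient object in the heap
ES.append(Binder("count", "Natural", 3))                      # simple value

print(top(ES).n, top(ES).tr, top(ES).e)
```

The three entries mirror the three cases for e: an object identifier, a main memory reference and a plain value.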

The environment stack is divided into frames and sub-frames, which help with scoping and binding. With every behaviour invocation, a new frame is created. Similarly, whenever a new evaluation block is encountered, a new sub-frame is created. Frames and sub-frames group all run-time entities that are local to the respective behaviour implementation or evaluation block, respectively. During the process of binding, the top-most sub-frame (i.e. the most local (sub-)environment) is considered first. In the event that a name binder is not found, the next sub-frame or frame (see footnote 4) is visited. This approach is continued until the bottom of the stack, which describes the global environment, is reached. The evaluation of any request that is formulated in iDBPQL will always be able to locate a binder on ES. Otherwise, the evaluation plan together with all annotations and references is not well formed and should have been rejected by the compiler.

Each frame has a second stack associated with it, the result stack (RS). This differs from the SBA approach, which defines only one global result stack that holds results in the form of tables. Associating a result stack with a particular invocation makes it easier to share and re-use result values. For instance, if we encounter a sequence of identical invocations of a static method, we may perform the computation once, retain the result and then share it among all invocations. Nevertheless, the purpose of RS remains largely unchanged: it stores intermediate results and assists with the passing of results between frames.

RS stores intermediate results in the form of result queues (RQs). This is necessary to better support pipelining as well as simultaneous and distributed processing. In addition, supporting RQs allows for a more refined approach to how results are represented. While the SBA approach restricts results to a representation that corresponds to a table (or bag), we support the storage of results in the form of singletons or different types of collections (i.e. as a bag, set, list or array). Result queues are associated with evaluation steps. Each such step may have one or more corresponding result queues. These queues are used to exchange result values, store intermediate results or synchronise different forms of processing.

The Environment Stack (ES). In accordance with traditional programming languages and the SBA approach, (references to) run-time entities that are available at a given point in time during the evaluation procedure are maintained on the environment stack. The availability of these entities is determined by their appearance in the respective evaluation plan and a set of scoping rules. The latter adhere to the following principles:

1. A local name is given priority over an inherited, static or global name;

2. The local context of the implementation of a behaviour specification is hidden from the evaluation plans that it invokes; and

3. Nesting of run-time entities is not restricted.

4 While the next sub-frame is always the previous sub-frame, the same is not true for frames. To facilitate the

The first principle outlines that name binding results in a search starting from the top of ES. In the event that the name is located in the top sub-frame of ES, binding terminates successfully. Thus, the most local entity is bound to the given name. If the name is not found, the search continues with the next sub-frame and so on until all sub-frames of the top-most frame are considered. While sub-frames are never skipped, the same does not apply to frames themselves.

The second principle states that entities that are local to a particular implementation must be hidden from the view of other implementations. This relieves programmers from knowing details of the implementations of behaviour specifications that they utilise. As a result, frames are linked by prevScope pointers, which chain together those frames that may access one another's local variables. In the event that a name cannot be located in a particular frame, the search continues in the next frame encountered along the chain formed by the associated prevScope pointers.

Frames and sub-frames on the environment stack are nested according to block and behaviour invocation specifications in all evaluation plans that are encountered during the processing of a user request. The third principle implies that there is no restriction on the depth of the nesting of these block specifications and behaviour invocations.
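The binding procedure governed by these three principles can be sketched as follows. This is an illustration only: the Frame class, the dictionary-based sub-frames and the decision to link the invoked behaviour's prevScope pointer directly to the global environment are assumptions made for the example.

```python
from typing import Dict, List, Optional


class Frame:
    """One behaviour invocation: a stack of sub-frames plus a prevScope link."""

    def __init__(self, prev_scope: Optional["Frame"] = None) -> None:
        self.prev_scope = prev_scope                      # next frame to search, or None
        self.sub_frames: List[Dict[str, object]] = [{}]   # every frame has at least one sub-frame

    def open_block(self) -> None:
        self.sub_frames.append({})                        # new evaluation block, new sub-frame

    def close_block(self) -> None:
        self.sub_frames.pop()

    def declare(self, name: str, entity: object) -> None:
        self.sub_frames[-1][name] = entity                # bind in the most local sub-frame


def bind(frame: Frame, name: str) -> object:
    """Search sub-frames top-down, then follow prevScope; sub-frames are never skipped."""
    current: Optional[Frame] = frame
    while current is not None:
        for sub in reversed(current.sub_frames):
            if name in sub:
                return sub[name]
        current = current.prev_scope
    raise LookupError(f"no binder for {name!r}: the plan is not well formed")


global_env = Frame()
global_env.declare("limit", 100)

main_plan = Frame(prev_scope=global_env)
main_plan.declare("a", 1)
main_plan.open_block()
main_plan.declare("a", 2)        # shadows the outer 'a' (first principle)

# Invoked behaviour: its prevScope skips main_plan (second principle, by assumption)
callee = Frame(prev_scope=global_env)
callee.declare("b", 3)

print(bind(main_plan, "a"), bind(callee, "limit"))
```

In this sketch, the name a resolves to the innermost declaration within main_plan, whereas the invoked behaviour can reach the global name limit but not the caller's local a.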

[Figure 5.6: diagram. Left-hand side: the environment stack, composed of frames that are divided into sub-frames and linked by prevScope pointers. Right-hand side: the result stack.]

Fig. 5.6. Logical View of the Environment and Result Stacks.

Figure 5.6 (left-hand side) provides a logical view of the composition of the environment stack. At the bottom of the stack, there is the global environment. This includes definitions that are imported from the DBS metadata catalogue, type and class definitions associated with the run-time metadata catalogue and other definitions that are global to the respective user request. The frame above the global environment corresponds to the main evaluation plan. The third frame from the bottom holds entities that are local to an evaluation plan that has been invoked during the processing of the main evaluation plan. Similarly, the fourth and fifth frames (counting from the bottom) each hold entities that are local to an evaluation plan that has been invoked during the processing of the behaviour implementation described in the frame below. In accordance with the second principle, the figure also outlines the prevScope pointers, which ensure that local implementation entities remain hidden.

Additional pointers are associated with frames. For instance, a pointer that keeps track of the corresponding THIS object will be added. Corresponding additions are motivated and outlined throughout the remainder of this chapter. The common rationale behind the usage of additional pointers is mainly related to performance considerations.

As mentioned earlier, a new frame is created with every behaviour invocation. That is, a new scope is opened. Parameters supplied during the invocation procedure are maintained in the frame itself. During the evaluation process, new sub-frames are established whenever a new evaluation block is encountered. Since every evaluation plan consists of at least one evaluation block, every frame has at least one sub-frame. Variables local to a particular evaluation block are maintained in the respective sub-frame on ES.

The Result Stack (RS). A result stack is associated with each frame, i.e. with each behaviour invocation. Intermediate results as well as the behaviour's return value are maintained on RS. In fact, each result stack can be regarded as a stack of result queues as outlined in Figure 5.6 (right-hand side). The result queue at the bottom of RS serves a special purpose. We also refer to it as the return result queue. It facilitates the exchange of values that are returned as the result of a behaviour's invocation. The return result queue is always present on RS except for frames that correspond to behaviour implementations with no return type (e.g. object constructors) or with the VOID return type. All other result queues are associated with the evaluation of individual iDBPQL statements and expressions.
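The role of the return result queue can be illustrated with a small sketch. The function init_result_stack, the use of None for a missing return type and the representation of RS as a Python list of deques are assumptions made for this example; the sketch also does not model which evaluation process creates the queue.

```python
from collections import deque
from typing import Deque, List, Optional


def init_result_stack(return_type: Optional[str]) -> List[Deque[object]]:
    """Create RS for a new frame; push a return result queue unless nothing is returned."""
    rs: List[Deque[object]] = []
    if return_type is not None and return_type != "VOID":
        rs.append(deque())       # bottom of RS: the return result queue
    return rs


rs_ctor = init_result_stack(None)        # e.g. an object constructor: no return type
rs_void = init_result_stack("VOID")      # VOID return type
rs_func = init_result_stack("Natural")   # behaviour that returns a value

rs_func.append(deque())                         # result queue for a sub-evaluation
rs_func[-1].append(41)                          # intermediate result
rs_func[0].append(rs_func[-1].popleft() + 1)    # pass the final value to the return result queue
rs_func.pop()                                   # the sub-evaluation's queue is no longer needed

print(len(rs_ctor), len(rs_void), list(rs_func[0]))
```

Only the third result stack carries a return result queue; the other two start out empty.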

The size of a result queue is dynamic. Memory is allocated from the heap. In order to enable two evaluation procedures to exchange result values, the result stack and its result queues are created, maintained and accessed as follows:

- Upon the invocation of a behaviour implementation, a new scope is opened. That is, a new frame is placed on top of the environment stack. During this process, a result stack is initialised. If the behaviour's return descriptor is empty or of type VOID, then there is no return result queue and the initialisation of RS is complete. Otherwise, a return result queue is pushed onto RS (see footnote 5).

- A new result queue is created implicitly with every sub-evaluation or, in the event that the local evaluation procedure requires storage space for intermediate results, it is created explicitly.

5 The return result queue is always initialised by the evaluation process that invokes the behaviour. This way, the calling process retains access to the result queue even after the invoked behaviour has terminated.


- Each result queue has two associated handles that regulate access. On the one hand, results can be appended to or, in some cases, accessed from the end of the queue, the tail. On the other hand, results can be accessed from the front of the queue, the head.

If the result queue was created explicitly, both means of access are available from the current evaluation procedure. Otherwise, the evaluation procedure that invokes a behaviour retains read-only access to the front of the queue, while write access by means of appending result values is only available to the procedure that evaluates the behaviour implementation or the sub-evaluation routine, respectively.

- In order to support pipelining of results, a means of synchronisation between the corresponding two evaluation procedures is required. This is enabled by means of a status operator and a special result value that marks the end of the pipelining of result values.
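The interplay of the two access handles, the status operator and the end-of-pipeline marker can be sketched as follows. The class ResultQueue, the END marker object, the three status strings and the generator-based producer are hypothetical choices for this illustration; a real evaluation engine would run both procedures simultaneously.

```python
from collections import deque

END = object()   # hypothetical special value marking the end of a pipelined result stream


class ResultQueue:
    def __init__(self) -> None:
        self._items = deque()

    # Writer handle (tail): used by the procedure that evaluates the behaviour.
    def append(self, value) -> None:
        self._items.append(value)

    def finish(self) -> None:
        self._items.append(END)

    # Status operator: lets the reader synchronise with the writer.
    def status(self) -> str:
        if not self._items:
            return "waiting"
        return "done" if self._items[0] is END else "ready"

    # Reader handle (head): used by the invoking procedure.
    def take(self):
        if self.status() != "ready":
            raise RuntimeError("no result value available")
        return self._items.popleft()


def producer(rq, values):
    """Append one value per step so that the consumer can interleave with it."""
    for v in values:
        rq.append(v)
        yield
    rq.finish()
    yield


rq = ResultQueue()
steps = producer(rq, [10, 20, 30])
received = []
while rq.status() != "done":
    if rq.status() == "ready":
        received.append(rq.take())   # consume a pipelined result from the head
    else:
        next(steps)                  # let the producer run one more step

print(received)
```

The consumer never needs to know how many values will arrive; it simply reads from the head until the status operator reports the end marker.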

As we will see next, result queues may be accessed using both stack operators and queue operators. In the event that a stack operator is invoked upon a result queue, the queue is treated as a stack. For implicitly defined queues, there is only one end of the queue made accessible to the particular evaluation routine. This available end is regarded as the top of the (queue-)stack. Otherwise, if both ends are accessible to the evaluation routine, the tail-end is considered as the top.
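A small sketch may clarify which end of a queue acts as the top when stack operators are applied. The class QueueAsStack and the accessible parameter with its values "head", "tail" and "both" are assumptions introduced for illustration; only top and pop are shown.

```python
from collections import deque


class QueueAsStack:
    """View of a result queue through stack operators (illustrative only)."""

    def __init__(self, items, accessible="both"):
        self.items = deque(items)
        # If only one end is accessible, that end is the top.
        # If both ends are accessible, the tail-end is the top.
        self.top_is_tail = accessible in ("tail", "both")

    def top(self):
        return self.items[-1] if self.top_is_tail else self.items[0]

    def pop(self):
        return self.items.pop() if self.top_is_tail else self.items.popleft()


both_ends = QueueAsStack([1, 2, 3], accessible="both")
head_only = QueueAsStack([1, 2, 3], accessible="head")

print(both_ends.top(), head_only.top())
```

With both ends accessible the tail element 3 is on top, whereas the head-only view exposes the element 1 first.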

RQs have the ability to hold different types of results. For instance, the result of an evaluation process may just be a simple atomic value, the NULL value, a structured value, an object identifier, a main memory reference, a bag, a set, a list or an array. In order to determine the type of a result queue, a header is associated with every queue. The format of an RQ header is as follows:

- type ... a reference to a __typeInfo or __classInfo structure held in the run-time or DBS metadata catalogue;

- pipe ... a Boolean value that indicates whether or not result values are pipelined;

- eAccStep ... a Natural value that corresponds to an offset value. Such offset values are utilised by the eAccess array, which speeds up access-by-position to lists and arrays (a value of 0 implies that there is no such support); and

- eAccess[] ... an array of pointers to elements held on the result queue. The first pointer refers to the eAccStep-th element, the second pointer to the (2 * eAccStep)-th element and so on. The eAccess array is used for lists and arrays to implement access-by-position more efficiently and also to assist with the evaluation of operations such as searching, sorting etc.
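To illustrate how the eAccStep and eAccess header fields can shorten access-by-position, consider the following sketch over a singly linked queue. The classes Node and IndexedQueue, the 1-based positions and the hop counter returned by at are assumptions made for the example only.

```python
from typing import List, Optional


class Node:
    def __init__(self, value) -> None:
        self.value = value
        self.next: Optional["Node"] = None


class IndexedQueue:
    def __init__(self, values, e_acc_step: int) -> None:
        self.e_acc_step = e_acc_step          # 0 means: no access-by-position support
        self.e_access: List[Node] = []        # pointers to every e_acc_step-th element
        self.head: Optional[Node] = None
        self.length = 0
        tail: Optional[Node] = None
        for value in values:
            node = Node(value)
            if tail is None:
                self.head = node
            else:
                tail.next = node
            tail = node
            self.length += 1
            if e_acc_step > 0 and self.length % e_acc_step == 0:
                self.e_access.append(node)    # the (k * eAccStep)-th element

    def at(self, position: int):
        """Return (value, hops) for a 1-based position; hops counts the links followed."""
        if not 1 <= position <= self.length:
            raise IndexError(position)
        node, current = self.head, 1
        if self.e_acc_step > 0 and position >= self.e_acc_step:
            slot = position // self.e_acc_step            # nearest indexed element at or before position
            node, current = self.e_access[slot - 1], slot * self.e_acc_step
        hops = 0
        while current < position:
            node, current, hops = node.next, current + 1, hops + 1
        return node.value, hops


indexed = IndexedQueue(range(1, 101), e_acc_step=10)
plain = IndexedQueue(range(1, 101), e_acc_step=0)

print(indexed.at(57), plain.at(57))
```

For position 57, the indexed queue starts at the 50th element and follows 7 links, while the queue without an eAccess array follows 56 links from the head.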

Access to information maintained in an RQ's header is supported through the '.' (dot) operator. Queue elements are maintained according to the respective collection type. If, for instance, RQ holds a set of object references, it is ensured that no object is referenced twice from RQ.

Operations on Stacks and Queues. ES, RS and RQ support most of the common operations defined on stacks (i.e. empty, pop, push and top) and queues (i.e. head, tail, prev, and next). Furthermore, queues may be modified through the following operators: moveInfront, moveBehind, swap, cutInfront, cutBehind, append and merge. Details are as follows:

- void append(q1, q2), where q1 and q2 are pointers to result queues. The append operator simply adds all entities from q2 to the tail of q1.

- q2 * cutBehind(q1, e), where q1 and q2 are result queues and e is a pointer to an entry on q1. The cutBehind operator splits the given queue q1 into two parts: all elements in front of e remain on q1. Entry e and all following entries are moved to the new queue q2 of identical format. The return value of the cutBehind operator is a pointer to the newly created result queue.

- q2 * cutInfront(q1, e), where q1 and q2 are result queues and e is a pointer to an entry on q1. Similarly to the cutBehind operator, the cutInfront operator splits the given queue q1 into two parts. This time, however, all elements following entry e stay behind. Entry e and all preceding entries are moved to the new queue q2 and a pointer to this queue is returned as result.

- boolean empty(s), where s is either ES, RS or RQ. The empty operator tests whether or not the given stack or queue is empty.

- e head(q), where q is a result queue and e is an entry on q, similar to the top operator (with the exclusion of name binders). The head operator returns the entry
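The append, cutBehind and cutInfront operators described above can be sketched as follows. Python lists stand in for result queues, the function names are written in snake_case, entries are located by object identity to mimic a pointer to an entry, and append leaves q2 untouched; all of these are assumptions for illustration.

```python
from typing import List


def append(q1: List[object], q2: List[object]) -> None:
    """Add all entries of q2 to the tail of q1 (q2 itself is left unchanged here)."""
    q1.extend(q2)


def _position(q: List[object], e: object) -> int:
    for i, entry in enumerate(q):
        if entry is e:           # e identifies one entry, so compare by identity
            return i
    raise ValueError("e is not an entry of the queue")


def cut_behind(q1: List[object], e: object) -> List[object]:
    """Keep the entries in front of e on q1; move e and all following entries to a new queue."""
    i = _position(q1, e)
    q2 = q1[i:]
    del q1[i:]
    return q2


def cut_infront(q1: List[object], e: object) -> List[object]:
    """Keep the entries following e on q1; move e and all preceding entries to a new queue."""
    i = _position(q1, e)
    q2 = q1[: i + 1]
    del q1[: i + 1]
    return q2


q1 = ["a", "b", "c", "d", "e"]
q2 = cut_behind(q1, q1[2])       # q1 keeps a, b; q2 receives c, d, e
q3 = cut_infront(q2, q2[1])      # q3 receives c, d; q2 keeps e
append(q1, q3)                   # q1 becomes a, b, c, d

print(q1, q2, q3)
```

In both cut operators the entry e itself moves to the newly created queue, which matches the descriptions given in the list.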

In document Integration of database programming and query languages for distributed object bases : a dissertation presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems at Massey University (Page 175-199)