INDEX DATA STRUCTURES IN OBJECT-ORIENTED DATABASES


The Kluwer International Series on ADVANCES IN DATABASE SYSTEMS

Series Editor

Ahmed K. Elmagarmid

Purdue University, West Lafayette, IN 47907

Other books in the Series:

DATABASE CONCURRENCY CONTROL: Methods, Performance, and Analysis by Alexander Thomasian

ISBN: 0-7923-9741-X

TIME-CONSTRAINED TRANSACTION MANAGEMENT: Real-Time Constraints in Database Transaction Systems by Nandit R. Soparkar, Henry F. Korth, Abraham Silberschatz

ISBN: 0-7923-9752-5

SEARCHING MULTIMEDIA DATABASES BY CONTENT by Christos Faloutsos

ISBN: 0-7923-9777-0

REPLICATION TECHNIQUES IN DISTRIBUTED SYSTEMS by Abdelsalam A. Helal, Abdelsalam A. Heddaya, Bharat B. Bhargava

ISBN: 0-7923-9800-9

VIDEO DATABASE SYSTEMS: Issues, Products, and Applications by Ahmed K. Elmagarmid, Haitao Jiang, Abdelsalam A. Helal, Anupam Joshi, Magdy Ahmed

ISBN: 0-7923-9872-6

DATABASE ISSUES IN GEOGRAPHIC INFORMATION SYSTEMS by Nabil R. Adam and Aryya Gangopadhyay

ISBN: 0-7923-9924-2

The Kluwer International Series on Advances in Database Systems addresses the following goals:

• To publish thorough and cohesive overviews of advanced topics in database systems.

• To publish works which are larger in scope than survey articles, and which will contain more detailed background information.

• To provide a single point coverage of advanced and timely topics.

• To provide a forum for a topic of study by many researchers that may not yet have reached a stage of maturity to warrant a comprehensive textbook.


INDEX DATA STRUCTURES IN OBJECT-ORIENTED DATABASES

by

Thomas A. MUECK Martin L. POLASCHEK

Universität Wien, Vienna, Austria


SPRINGER SCIENCE+BUSINESS MEDIA, LLC


ISBN 978-1-4613-7849-5

ISBN 978-1-4615-6213-9 (eBook)

DOI 10.1007/978-1-4615-6213-9

Library of Congress Cataloging-in-Publication Data

A C.I.P. Catalogue record for this book is available from the Library of Congress.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher, Springer Science+Business Media, LLC.

Printed on acid-free paper.


PREFACE

Object-oriented database management systems (OODBMS) are used to implement and maintain large object databases on persistent storage. Regardless of whether the underlying database model follows the object-oriented, the relational or the object-relational paradigm, a key feature of any DBMS product is content-based access to data sets. On the one hand, this feature provides user-friendly query interfaces based on predicates to describe the desired data. On the other hand, it poses challenging questions regarding DBMS design and implementation as well as the application development process on top of the DBMS.

The reason for the latter is that the actual query performance depends on a technically meaningful use of access support mechanisms. In particular, if chosen and applied properly, such a mechanism speeds up the execution of predicate-based queries. In the object-oriented world, such queries may involve arbitrarily complex terms referring to inheritance hierarchies and aggregation paths. These features are attractive at the application level; however, they increase the complexity of appropriate access support mechanisms, which are known to be technically non-trivial even in the relational world.

In the field of databases and database management systems, such an access support mechanism for improved query performance relies on one or more underlying search data structures and is usually called an index. Informally, the central idea behind this kind of data structure application is to find the identifiers of all objects fulfilling a given query predicate without reading the objects from disk. The practical benefit of indexing large persistent object sets is therefore a significant reduction in the number of disk I/O operations, which yields a performance gain.
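The idea of answering a predicate from identifiers alone can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in chosen for illustration: `ObjectStore` simulates a persistent object store and counts reads, `ValueIndex` uses an in-memory dictionary in place of whatever search data structure a real index would use, and the attribute name `city` is invented. It is not taken from any OODBMS product.

```python
from collections import defaultdict


class ObjectStore:
    """Toy stand-in for a persistent object store; counts simulated disk reads."""

    def __init__(self):
        self._objects = {}
        self.reads = 0

    def put(self, oid, obj):
        self._objects[oid] = obj

    def read(self, oid):
        self.reads += 1  # each call models one disk I/O operation
        return self._objects[oid]

    def oids(self):
        return list(self._objects)


def scan(store, attr, value):
    """Answer `attr == value` without an index: every object must be read."""
    return {oid for oid in store.oids() if store.read(oid)[attr] == value}


class ValueIndex:
    """Maps an attribute value to the identifiers of the objects carrying it."""

    def __init__(self, attr):
        self.attr = attr
        self._entries = defaultdict(set)

    def insert(self, oid, obj):
        self._entries[obj[self.attr]].add(oid)

    def lookup(self, value):
        # Returns identifiers only; the store is never touched.
        return set(self._entries.get(value, ()))


store = ObjectStore()
index = ValueIndex("city")
for oid, obj in enumerate([{"city": "Vienna"}, {"city": "Graz"}, {"city": "Vienna"}]):
    store.put(oid, obj)
    index.insert(oid, obj)  # index maintained at insertion time

indexed_result = index.lookup("Vienna")
reads_with_index = store.reads
scan_result = scan(store, "city", "Vienna")
reads_with_scan = store.reads - reads_with_index
```

Both approaches return the same identifier set, but in this sketch the scan reads every stored object while the index lookup reads none. The price is that the index has to be maintained whenever an object is inserted.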

The purpose of this book is to provide technical information about current and future issues of search data structures used to index large object-oriented databases. The intended audience of this book includes all kinds of practitioners involved in OODBMS product selection, application-dependent database performance tuning and application development on top of object databases, as well as researchers and students interested in the technical issues of object-oriented databases. The only prerequisite for understanding the material presented in this book is a working knowledge of object-oriented modeling and programming concepts and a minimum knowledge of algebraic concepts such as sets.

After the introduction, two preparatory chapters follow: one presents the underlying database model as outlined in the ODMG-93 proposal [Cat96], and the other elaborates on the technical issues of search data structures and their use for indexing large data sets. The three subsequent chapters deal with major indexing topics in object-oriented databases, in particular type hierarchy indexing, aggregation path indexing, and the speedup of collection operations. The presentation concludes with a performance analysis example in the field of type hierarchy indexing.

A related issue not covered in this book is physical object clustering or, in other words, the mapping of object identifiers to physical storage addresses. Decoupling support for content-based access from physical object management and, therefore, the indexing component from an OODBMS's persistent object store provides a high degree of flexibility for both application programmers and system developers. The issues in the context of object clustering therefore form a separate research domain beyond the scope of this book.

Details about the indexing components of particular OODBMS products have been omitted from this book for two reasons. First, it is hardly possible to get detailed technical information on the indexing components from vendors, and second, even if this kind of information could be obtained, it would quickly become dated. It therefore seems more appropriate to describe the technical issues and solutions in this field and, in this way, to help the reader decide about products in the presence of timely and hopefully detailed information.

Acknowledgments

We thank our colleagues at the Abteilung für Data Engineering, Universität Wien, for hours of fruitful discussions and in particular our former roommate Erich Schikuta for introducing us to the versatile field of search data structures in the early days.

Also, we would like to thank Professor Ahmed K. Elmagarmid for supporting this book project.

This book would not exist without the continuing encouragement of the people at Kluwer Academic Publishers, especially Scott Delman and his staff.

Special thanks for being patient.

