
Figure 67: RR-tree for varying selectivity (runtime [sec.] over selectivity; curves: full table scan, R-tree unchanged, R-tree scan σ = 0, R-tree scan σ = 1/3).

In this chapter, we have shown how spatial query processing can be accelerated by means of statistics that are available for free, because they are either maintained by the cost models of the corresponding index structures or inherently available in the index. We have implemented our approach for the RI-tree, the RQ-tree and the RR-tree on top of the Oracle9i database system. In our experiments, we achieved speed-up factors of up to two orders of magnitude. This acceleration is due to the fact that we can dynamically switch between further use of the index structure and a linear scan. Our statistic-driven approach thus continuously adapts the access method to combine the best of these two worlds.

This statistic-based acceleration approach can be applied fruitfully to time-critical applications such as the digital mockup process, which is based on collision queries for complex spatial objects.


Chapter 6

Cost-based Decompositioning of Complex Spatial Objects

Modern database applications impose new requirements on efficient spatial query processing. Particular problems arise from the need for high resolutions for large spatial objects, including cars, space stations, planes and industrial plants, and from the design goal of using general-purpose database management systems in order to guarantee industrial strength. In the past two decades, various stand-alone spatial index structures have been proposed, but their integration into fully-fledged database systems is problematic. Most of these approaches are based on the decomposition of spatial objects, leading to replicating index structures. In contrast to common black-and-white decompositions, which suffer from the lack of intermediate solutions, we introduce gray containers as a new and general concept. These gray containers are stored in a spatial index structure. Additionally, we store the exact information of these gray containers in a compressed way. The gray containers are created by a cost-based decompositioning algorithm which takes the access probability and the decompression cost of the gray containers into account. We demonstrate the benefits of our new method for the RR-tree, the RQ-tree and the RI-tree as well as for spatial join processing. The experimental evaluation on real-world test data shows that our new concept leads to an acceleration of up to two orders of magnitude with respect to the overall query response time. The experiments also show that our generic approach is especially useful for high-resolution spatial data, which is becoming the standard case for modern spatial database applications.

6.1 Introduction

As a common and successful approach, spatial objects can be conservatively approximated by a set of voxels, i.e. cells of a grid covering the complete data space. By means of space filling curves, each voxel can be encoded by a single integer and, thus, an extended object is represented by a set of enumerated voxels. As a principal design goal, space filling curves achieve good spatial clustering properties, since cells in close spatial proximity are encoded by contiguous integers. As explained in Chapter 2, adjacent cell values can be grouped into black intervals, black tiles or black boxes, which are basic datatypes for spatial applications. By expressing spatial region queries as intersections of these spatial primitives, vital operations for CAD applications can be supported. As outlined in Chapter 3, a seamless and capable integration of spatial indexing into industrial-strength databases is essential.
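The encoding and grouping steps described above can be illustrated with a minimal sketch. It assumes a Z-order (Morton) curve as the space filling curve, which is only one possible choice; the function names `z_encode` and `black_intervals` are hypothetical and not taken from this thesis.

```python
from typing import Iterable, List, Tuple


def z_encode(x: int, y: int, z: int, bits: int) -> int:
    """Interleave the bits of (x, y, z) into one Z-order (Morton) value.

    Assumption: Z-order stands in for whatever space filling curve is used.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def black_intervals(codes: Iterable[int]) -> List[Tuple[int, int]]:
    """Group contiguous cell values into closed intervals [lo, hi]."""
    intervals: List[Tuple[int, int]] = []
    for c in sorted(set(codes)):
        if intervals and c == intervals[-1][1] + 1:
            # The value directly extends the last interval.
            intervals[-1] = (intervals[-1][0], c)
        else:
            intervals.append((c, c))
    return intervals


# A 2x2x2 block at the origin plus one isolated voxel on an 8x8x8 grid (bits = 3).
voxels = [(x, y, z) for x in range(2) for y in range(2) for z in range(2)]
voxels.append((4, 0, 0))
codes = [z_encode(x, y, z, bits=3) for x, y, z in voxels]
print(black_intervals(codes))  # [(0, 7), (64, 64)]
```

The eight voxels of the compact block collapse into a single interval, whereas the isolated voxel needs an interval of its own. This is the clustering effect the paragraph refers to.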

Besides the integration into fully-fledged database systems, a further new requirement for large spatial objects, including cars, planes or space stations, is a high approximation quality, which is primarily influenced by the resolution of the grid covering the data space. High-resolution spatial objects may consist of several hundreds of thousands of voxels. Although the voxels can further be grouped into black intervals, black tiles or black boxes, the number of resulting spatial primitives still remains very high. On the other hand, one-value approximations of spatially extended objects are often far too coarse. In many applications, GIS or CAD objects feature a very complex and fine-grained geometry. The rectilinear bounding box of the brake line of a car, for example, would cover the whole bottom of the indexed data space. A non-replicating storage of such data causes region queries to produce too many false hits, which have to be eliminated by subsequent filter steps.

In this chapter, we introduce a cost-based decompositioning algorithm for complex spatial objects which helps to find a balance between the two extremes of one-value approximations and the use of unreasonably many approximations. Our new approach takes into account both compression algorithms for the effective storage of decomposed spatial objects and the access probabilities of these decompositions.
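To make the two extremes tangible, the following sketch merges black intervals whenever the gap between them does not exceed a threshold. This fixed threshold is a deliberately simple stand-in and not the cost-based algorithm of this chapter, which weighs access probability against decompression cost instead. The names `merge_with_max_gap`, `dead_space` and `max_gap` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # closed interval [lo, hi] of cell values


def merge_with_max_gap(black: List[Interval], max_gap: int) -> List[Interval]:
    """Merge disjoint intervals whose gap is at most max_gap cells.

    Assumption: a fixed gap threshold replaces the cost model of the chapter.
    """
    merged: List[Interval] = []
    for lo, hi in sorted(black):
        if merged and lo - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], max(hi, merged[-1][1]))
        else:
            merged.append((lo, hi))
    return merged


def dead_space(black: List[Interval], merged: List[Interval]) -> int:
    """Number of covered cells that do not belong to the object."""
    covered = sum(hi - lo + 1 for lo, hi in merged)
    occupied = sum(hi - lo + 1 for lo, hi in black)
    return covered - occupied


black = [(0, 7), (10, 12), (64, 64), (70, 79)]
for max_gap in (0, 5, 100):
    merged = merge_with_max_gap(black, max_gap)
    print(max_gap, merged, dead_space(black, merged))
# 0   [(0, 7), (10, 12), (64, 64), (70, 79)]  0
# 5   [(0, 12), (64, 79)]                     7
# 100 [(0, 79)]                               58
```

With `max_gap = 0` every black interval is kept, which gives many exact approximations. With a very large threshold the object collapses into one coarse approximation that covers 58 empty cells. Intermediate thresholds trade the number of index entries against dead space.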

The remainder of this chapter is organized as follows. In Section 6.2, we briefly review the related work in the area of object decompositioning. In Section 6.3, we introduce gray container objects, which can be stored within a spatial index. Furthermore, we discuss in detail our cost-based grouping algorithm, which can be used together with arbitrary packing algorithms. In Section 6.4, we discuss how intersection queries based on compressed gray containers can be posed on top of the SQL engine. In Section 6.5, we adapt the presented techniques to the efficient processing of spatial joins. In Section 6.6, we present the empirical results, which are based on two real-world test data sets of our industrial partners, a German car manufacturer and an American plane producer, dealing with high-resolution voxelized CAD data. We summarize our work in Section 6.7 and close with a few final remarks on future work.

6.2 Related Work

In this section, we briefly discuss different aspects related to an effective decompositioning of complex spatial objects.

Complex Spatial Objects. Gaede pointed out that the number of voxels representing a spatially extended object depends exponentially on the granularity of the grid approximation [Gae 95]. Furthermore, the extensive analysis given in [MJFS 96] and [FJM 97] shows that the asymptotic redundancy of an interval- and tile-based decomposition is proportional to the surface of the approximated object. Thus, in the case of large high-resolution parts, e.g. wings of an airplane, the number of tiles or intervals can become unreasonably high.

Decompositioning Algorithm. In [SK 93], Kriegel and Schiwietz tackled the complex problem of “complexity versus redundancy” for 2D polygons. They investigated the natural trade-off between the complexity of the components and the redundancy, i.e. the number of components, with respect to its effect on efficient query processing. The presented, empirically derived root-criterion suggests decomposing a polygon consisting of n vertices into √n index entries. As this root-criterion was designed for 2D polygons and was not based on any analytical reasoning, it cannot be adapted to complex 3D objects. In this chapter, in contrast, we present an analytical cost-based decomposition approach which can be used for all kinds of spatially extended objects.