• No results found

Algorithmische Geometrie in der Praxis

N/A
N/A
Protected

Academic year: 2021

Share "Algorithmische Geometrie in der Praxis"

Copied!
103
0
0

Loading.... (view fulltext now)

Full text

(1)

DB2 on Demand | Feb 2008

Algorithmische

Algorithmische

Geometrie

Geometrie

in

in

der

der

Praxis

Praxis

Integration von räumlichen Daten und Operationen in relationale Datenbanksysteme

(2)

Important Disclaimer

Important Disclaimer

THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.

WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND

ACCURACY OF THE INFORMATION CONTAINED IN THIS

PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED.

IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE.

IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.

NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:

• CREATING ANY WARRANTY OR REPRESENTATION FROM IBM (OR ITS AFFILIATES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS); OR

• ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT GOVERNING THE USE OF IBM SOFTWARE.

(3)

Agenda

Agenda



Introduction



SQL/MM Spatial standard



DB2 Spatial Extender & DB2 z/OS Spatial Support Feature

– History

– Functionality

– Implementation details

– Spatial indexing



Current & future developments

– 3D support

– Graph, Network & Topology

– Integration Aspects

(4)

Introduction

(5)

What is Spatial Data

What is Spatial Data



Information about anything that can be located on Earth’s surface

– Natural  rivers, lakes, mountains, …

– Man-made  buildings, utility facility, …

– Cadastral  property boundary, voting districts, …

– Artificial  CAD/CAM



Represented by a geometry

– Point, linestring, polygon



Location and geometry defined by

– Coordinates  latitude/longitude; x/y

– Addresses  ‘Piazza Leonardo da Vinci 32, 20133 Milano, Italy’ – Geocode to get coordinates

– Name  ‘White House’

(6)

What is Spatial Data

What is Spatial Data

(cont.)

(cont.)



Spatial data is modeled as

raster or vector, and …



… organized as collections of

(7)

Spatial Applications

Spatial Applications



Traditional GIS (G is central)

– Natural resources, government, remote sensing

– Sophisticated processing, dynamic geographic features

Emphasis on geographic data



Business, commercial GIS (IS is central)

– Enterprise databases

– Simpler, static geography

– Emphasis on other attributes



Virtually every database has locations!

(8)

Products

Products



IBM DB2 Spatial and Geodetic Extender & IBM DB2 z/OS Spatial

Support Feature



IDS Spatial DataBlade, SpatialWare & Geodetic DataBlade



Oracle Spatial & Oracle Locator



MySQL Spatial

(9)

Use of Location Information...

Use of Location Information...

Demographic data

– Average income – Family size

– Average education level – Population density – Gender/race distribution – Crime data Geophysical – Flood plains – Rivers – Fault-lines – Mineral resources Meteorological – Weather conditions Government – Regulatory compliance

(10)

... which is Available At Little or No Cost

... which is Available At Little or No Cost

FBI - Crime data such as

– Robbery crimes per 100,000 people – Property crimes per 100,000 people – Car thefts per 100,000 people

FEMA - Disaster data such as

– 100-year and 500-year flood plain areas – Coastal Barrier Resources Act areas

USGS - Geophysical data such as

– Fault lines

– Known mineral deposits – Active mines

Census - Demographic data such as

– % of population 65 or older – % of population never married – Average household size

(11)

It

It

s just SQL . . .

s just SQL . . .

Business applications often require the combination of different types of data

CREATE TABLE HospitalRooms (

Name VARCHAR(128) NOT NULL, Equipment Document NOT NULL Where ST_Point NOT NULL );

-- "Show me all rooms available for the -- next five days within 45 miles of the -- patient’s location equipped with a -- defibrillator?"

SELECT h.name

FROM hospitalRooms AS h, patients AS p WHERE ST_Within(h.where,

ST_Buffer(p.location, 45, 'MILES') ) = 1 AND

p.name = 'Fred Flintstone' AND

Contains(h.equipment, 'defibrillator') = 1

(12)
(13)

Coordinate Systems

Coordinate Systems



Geographic – 3D

– On surface of the Earth (often latitude/longitude)

– Spheroid (datum, semi-major and semi-minor axis)

– Primeridian (e.g. Greenwich)

– Unit (e.g. degrees, radians)



Projected – 2D

– Geographic plus projection to 2D space

– Many different projection algorithms



„Abstract“

– Not related to Earth‘s surface

(14)

longitude latitude 0 -90 (90°W) +90 (90°N) R

(15)

Flattening the Earth

Flattening the Earth



Plane Geometry on lat-long

– Singularities and scale distortion at the poles

– Wrap-around at 180º longitude

– Poor location of lines



Local/Regional Projections

– Limited valid range

– Map edge-matching problems

– Non-uniform scale



Indexing: it gets worse!

– Multiple “bounding boxes” or complete loss of selectivity UTM 10 UTM 11 -90 -180 0 180 0 90

(16)

Planar Coordinates

Planar Coordinates

500,000 5,000,000 Northing Easting

(17)

Spatial Data in DBMS

Spatial Data in DBMS

SELECT sid, COUNT(*) AS count, AVG(income) AS avgIncome FROM stores AS s, customers AS c WHERE ST_distance(s.loc, c.loc,

'MILE') < 100 AND s.state = 'CA' AND income > 100000 GROUP BY sid;

"Tell me the number and average income of customers who make more than $100K and live within 100 miles of our California

stores."

text image

spatial

DB2_stores Spatial Index / Data DB2 View_customers

DBMS

cid loc income sid loc

(18)

Spatial/

Spatial/

OpenGIS

OpenGIS

Spatial SQL Functions

Spatial SQL Functions

ST_Distance(g1,g2)

?

SELECT a.policy_number, a.address FROM customers a, floodzones b

WHERE ST_Intersects(a.location,b.location) = 1 AND b.last_flood_year > 1950

ST_Intersects(g1,g2)

?

SELECT a.cust_number, a.address FROM customers a, stores b

WHERE ST_Distance(a.location,b.location) < 2000 AND b.last_order_year < 2001

(19)

SQL/MM Spatial standard

(20)

Environment

Environment



SQL99/SQL2003

defines many object-relational extensions

– structured types, UDFs, procedures, LOBs



SQL/MM Part 3: Spatial

defines SQL extensions for spatial data

– types, functions/methods, information schema (catalog)

(21)

Overview on SQL/MM Spatial

Overview on SQL/MM Spatial



Part 3 of SQL/MM

– Part 1 - Framework

– Part 2 - Full Text

– Part 3 - Spatial

– Part 5 - Still Image

– Part 6 - Data Mining



Derived from OGC Simple Feature Specification for SQL

– OpenGIS Consortium (OGC)

– OGC Simple Feature Specification for SQL – not really a standard, just a specification

(22)

Content

Content



Currently status

– 2nd version is International Standard since 2003

– 3rd version in Working Draft state (slow progress)



Based on SQL99

– Structured types, methods, ...



Standard does not define

– Indexing mechanisms

– Implementation issues

– number, names, and types of attributes of the types

– specific algorithms

(23)

ST_Point ST_Curve ST_LineString ST_Polygon ST_Surface ST_GeomCollection ST_Geometry ST_Geometry ST_MultiPoint ST_Curve ST_MultiLineString ST_MultiPolygon ST_Surface

Spatial Type Hierarchy (1)

(24)

Spatial Type Hierarchy (2)

Spatial Type Hierarchy (2)



Additional optional types

ST_CircularString

ST_CompoundCurve

(25)

Spatial Type Hierarchy

Spatial Type Hierarchy

Discussion

Discussion



Attempt to implement Composite Design Pattern

– originated from OGC class hierarchy

– pattern cannot be implemented in SQL



ST_Point unrelated to ST_MultiPoint,

ST_LineString unrelated to ST_MultiLineString,

ST_Polygon unrelated to ST_MultiPolygon

– further processing of results leads oftentimes to much more complex SQL statements



Optional types not properly mirrored between single-part and

multi-part subtrees of the hierarchy

(26)

ST_Empty ST_MultiPoint ST_Point ST_CircularString ST_MultiCurve ST_Geometry ST_Polygon ST_MultiPolygon

Spatial Type Hierarchy

Spatial Type Hierarchy

Possible Solution

Possible Solution

ST_MultiCircStrin g ST_MultiSurface ST_MultiLineStrin g ST_LineString

(27)

Spatial Methods

Spatial Methods

ST_Geometry ST_Point ST_LineString ST_CircularString ST_CompoundCurve ST_CurvePolygon ST_Polygon ST_GeomCollection ST_MultiPoint ST_MultiLineString ST_MultiPolygon ST_AsText ST_AsBinary ST_AsGML ST_Transform

ST_Dimension

ST_CoordDim

ST_GeometryType

ST_SRID

ST_IsEmpty

ST_IsSimple

ST_IsValid

ST_Boundary

ST_Envelope

ST_ConvexHull

ST_Buffer

ST_Intersection

ST_Union

ST_Difference

ST_SymDifference

ST_Length

ST_StartPoint

ST_EndPoint

ST_IsClosed

ST_IsRing

ST_NumPoints

ST_PointN

ST_NumCurves

ST_CurveN

ST_Area

ST_Perimeter

ST_Centroid

ST_PointOnSurface

ST_ExteriorRing

ST_InteriorRings

ST_NumInteriorRing

ST_InteriorRingN

ST_Distance

ST_Equals

ST_Relate

ST_Disjoint

ST_Intersects

ST_Touches

ST_Crosses

ST_Within

ST_Contains

ST_Overlaps

ST_X

ST_Y

(28)

Spatial Functionality (1)

Spatial Functionality (1)



Constructors

– well-known text representation (WKT)

– well-known binary representation (WKB)

– ESRI shape representation

– GML representation

– Coordinates (for ST_Point values)



Comparison functions (to be used in predicates)

– ST_Overlaps, ST_Within, ST_Disjoint, ST_EqualSRS, ST_Relate, ...

(29)

Spatial Functionality (2)

Spatial Functionality (2)



Return information about properties

– ST_Length, ST_X, ST_Is3D, ST_MinY, ST_NumPoints, ...



Derive/compute new geometries

– ST_Union, ST_Intersection, ST_MidPoint, ST_Buffer, ST_ConvexHull, ...



Functions involving 2 geometries are usually performed in 2D

space

(30)

Constructors

Constructors



Well-known Text Representation

point (10 10)

multipolygon (((1 1, 2 2, 1 2, 1 1)),((10 10, 10 20, 20 20, 20 10, 10 10)))



Well-known Binary Representation

x'010100000000000000000024400000000000002440‘



Geography Markup Language

<gml:Point><gml:coordinates>10, 10</gml:coordinates></gml:Point>  (Additional formats in products, e.g. Shape format)

(31)

Spatial Methods

Spatial Methods

Discussion

Discussion



Duplicated functionality

– ST_Overlaps, ST_Intersects, ST_Crosses

– ST_GeometryType & SPECIFIC_TYPE (SQL99)



Additional functionality already supported by products

– ST_ShortestPath (in current WD)

– Z/M coordinate support

– Other external data representations

(32)

Spatial Information Schema (Catalog)

Spatial Information Schema (Catalog)

ST_GEOMETRY_

COLUMNS

COLUMNS

(SQL2003)

is subset

ST_SPATIAL_

REFERENCE_SYSTEMS

ST_UNITS_OF_

MEASURE

ST_SIZINGS

associated

(33)

Spatial Information Schema

Spatial Information Schema

Discussion

Discussion



Merge ST_SIZINGS view with SIZINGS (SQL2003)

– additional facilities in SQL2003 needed, first



ST_SPATIAL_REFERENCE_SYSTEMS is rather primitive,

EPSG uses:

– Coordinate Axis Name, Coordinate Axis, Coordinate System, Coordinate Reference System, Coordinate Operation, Ellipsoid, Datum, Prime Meridian, Coord_Op Method, Coord_Op Parameter, Coord_Op Parameter Usage, Coord_Op Parameter Value,

(34)

DB2 LUW Spatial Extender &

DB2 LUW Spatial Extender &

DB2 z/OS Spatial Support Feature

(35)

Introduction

Introduction

Connectivity

Connectivity BackupBackup RestoreRestore

Server Components Server Components

DB2

DB2

DB2

Spatial Spatial Geodetic Geodetic Grid Grid

Your idea goes here

(36)

What is an Extender?

What is an Extender?

Types Types Functions Functions Casts Casts Aggregates Aggregates Indexes Indexes Tables Tables Client Code Client Code New New Extender/ Extender/ DataBlade DataBlade

(37)

DB2 z/OS Spatial Support

DB2 z/OS Spatial Support



No extender technology used



Deeply integrated into DB2 engine

Better performance

Better customer acceptance

(38)

– History –

(39)

Spatial Technology

Spatial Technology

A Joint IBM & ESRI Effort

A Joint IBM & ESRI Effort



IBM teamed up with Environmental Systems Research Institute

(ESRI)

– over 30 years of experience in spatial technology

– broad portfolio of GIS products and tools



Collaboration combined best of both partners

– ESRI provided code that does all the spatial calculations

– IBM threw in database knowledge and extended DB2‘s OR features to accomodate ESRI

(40)

ESRI SDE 3.0.2

ESRI SDE for DB2 UDB ESRI Spatial Tools DataJoiner with DB2 Spatial Extender SDE 3.0.2 Other SQL Applications DataJoiner Spatial Extender ESRI Spatial Tools

DB2 UDB 8.1 with the DB2 Spatial Extender Feature ESRI Spatial Tools Other SQL Applications ArcInfo V8 ArcSDE8 DB2 UDB 7.1 DB2 Spatial Extender ArcSDE 9

Evolution

Evolution

(41)

select * from customersc, indexi, featuref where i.xmin > xmin(:circle) and

i.ymin > ymin(:circle) and i.xmax < xmax(:circle) and i.ymax < ymax(:circle) and

i.oid = f.oid and i.oid = c.oid and c.income > 50000 GIS Application GIS Gateway RDBMS Proprietary GIS APIs SQL (ODBC, JDBC, etc.) (1) Spatial types (2) Spatial indexes (3) Spatial functions (4) Spatial predicates

(5) Spatial query composer (6) Spatial query rewriter

(1) Spatial meta tables

(2) "Hidden" tables for spatial data (3) "Hidden" table for spatial index

Table: customers

Constraint: within(loc, :circle)

Other constraint: income > 50000

Loosely Coupled Architecture

(42)

GIS Application GIS Gateway RDBMS Proprietary GIS APIs SQL (ODBC, JDBC, etc)

(1) Spatial query composer (1) Spatial meta tables (2) Spatial types

(3) Spatial functions (4) Spatial indexes (5) Spatial predicates

select * from customersc

where within(c.location, :circle) and c.income > 50000

Table: customers

Constraint: within(loc, :circle)

Other constraint:income > 50000

Integrated Architecture

Integrated Architecture

(43)

– Functionality –

(44)

Geocoder data Stored

Proc.

Structured

Types MethodUDF/

DB2 Applications Client functions Spatial Business Data DB2 Client SQL

Architecture

Architecture

(45)

Spatial Extender Features at a Glance (1)

Spatial Extender Features at a Glance (1)



Integrated into DB2 Control Center

Graphical user interface

Includes spatial management functions

– Enable/disable spatial DB

– Create/drop coordinate systems, spatial reference systems, and geocoders

– Register/unregister spatial columns

– Manage geocoding instances

(46)

Spatial Extender Features at a Glance (2)

Spatial Extender Features at a Glance (2)

 Extender is based on DB2’s OR features – User-defined structured types

– Spatial index extensions

– Grid index for all 2-dimensional data – Z-order index for point data in 2D space

 Spatial catalog views

– Coordinate systems & spatial reference systems

– Spatial columns

– Geocoders

 Geocoder framework

– Support to plug in custom geocoders

 Just define a UDF that does the geocoding

 Import/export for special spatial file formats – ESRI shapefiles

(47)

Spatial Extender Functions and Methods

Spatial Extender Functions and Methods

ST_Contains ST_Touches ST_Within ST_Overlaps ST_Intersects ST_Crosses ST_Disjoint ST_Relate ST_Equals ST_MBRIntersects ST_Intersection ST_Difference ST_Union ST_SymDifference ST_Buffer ST_PointOnSurface ST_Boundary ST_Envelope ST_Centroid ST_Perimeter ST_ConvexHull ST_MBR ST_Generalize ST_FindMeasure ST_MeasureBetween MBR aggregate Union aggregate ST_Geometry ST_Point ST_LineString ST_Polygon ST_GeomCollection ST_MultiPoint ST_MultiLineString ST_MultiPolygon ST_AsText ST_AsBinary ST_AsShape ST_AsGML ST_ToGeomColl ST_ToPoint ST_ToLineString ST_ToPolygon ST_ToMultiPoint ST_ToMultiLine ST_ToMultiPolygon ST_Transform ST_Is3D ST_IsMeasured ST_IsClosed ST_IsEmpty ST_IsRing ST_IsSimple ST_IsValid ST_StartPoint ST_MidPoint ST_EndPoint ST_CoordDim ST_Dimension ST_GeometryType ST_NumPoints ST_PointN ST_ExteriorRing ST_NumInteriorRing ST_InteriorRingN ST_NumGeometries ST_GeometryN ST_NumLineStrings ST_LineStringN ST_NumPolygons ST_PolygonN ST_NumPoints ST_PointN ST_Area ST_Distance ST_Length ST_Perimeter ST_AppendPoint ST_ChangePoint ST_RemovePoint ST_PerpPoints ST_X ST_Y ST_Z ST_M ST_MinX ST_MinY ST_MinZ ST_MinM ST_MaxX ST_MaxY ST_MaxZ ST_MaxM ST_Edge_GC_USA ST_EqualCoordsys ST_EqualSRS ST_GetIndexParms ST_SrsName ST_SrsId

(48)

Administrative Stored Procedures

Administrative Stored Procedures

 ST_ALTER_COORDSYS  ST_ALTER_SRS  ST_CREATE_COORDSYS  ST_CREATE_INDEX  ST_CREATE_SRS  ST_CREATE_SRS_2  ST_DROP_COORDSYS  ST_DROP_INDEX  ST_DROP_SRS  ST_IMPORT_SHAPE  ST_REGISTER_SPATIAL_COLUMN

(49)

Some Examples (1)

Some Examples (1)

CREATE TABLE customers ( name VARCHAR(100), address VARCHAR(200),

location ST_Point )

CREATE TABLE streets (

name VARCHAR(100),

location ST_LineString )

CREATE TABLE stores (

name VARCHAR(100), address VARCHAR(200), manager VARCHAR(100),

(50)

Some Examples (2)

Some Examples (2)

INSERT INTO customers

VALUES ( 'customer 1', 'address 1',

ST_Point(100, 200, 1) )

INSERT INTO streets

VALUES ( '1st street', ST_LineString( 'linestring ( 50 50, 100 100,

100 250, 200 250 )', 1) )

INSERT INTO stores

VALUES ( 'store 1', 'address 2', 'john',

(51)

Some Examples (3)

Some Examples (3)

 Distance between all stores and all customers, measured in meters SELECT s.location..ST_Distance(c.location, 'METER')

FROM stores AS s, customers AS c

 Find all street intersections, return intersecting point as GML – will exploit spatial index if possible

SELECT s1.name, s2.name,

s1.location..ST_Intersection(s2.location).. ST_AsGML()

FROM streets AS s1 JOIN streets AS s2

(52)

Some Examples (4)

Some Examples (4)

 Find all customers that live in an area around any store SELECT s.name, c.name, c.address

FROM stores AS s, customers AS c

WHERE s.location..ST_Buffer(5, 'KILOMETER').. ST_Contains(c.location) = 1

 Find closest store for all customers

WITH cust_store_dist(cust, store, dist) AS

( SELECT c.name, s.name, ST_Distance(c.location, s.location, 'METER')

FROM stores AS s, customers AS c ) SELECT o.cust, o.store

FROM cust_store_dist AS o

WHERE o.dist <= ALL ( SELECT i.dist

(53)

Geocoding

Geocoding

Address Data

Address Data

Customers

Street

Segments

Geocoder



Populate the spatial column with the location (point)

information

Cid Address City State Zip Location

1 4335 Queen Anne Drive San Jose CA 95129 …

--2 555 Bailey Avenue San Jose CA 95120 …

--… --… … …

(54)

--The

The

Geocoding

Geocoding

Process

Process

US Streets

Map Data

Geocoder 555 Bailey Avenue San Jose, CA 95141

Real Earth x,y Coordinates (Latitude, Longitude) DB2 UDB DB2 DB2 Spatial ADDRESS

(55)

– Implementation Details –

(56)

Integer Coordinates (1)

Integer Coordinates (1)



All coordinates are stored as positive 32-bit integers

– Integer calculation faster than floating points

– Less storage required

– Compression of coordinates



Floating points are converted to integers on the fly

(57)

Integer Coordinates (2)

Integer Coordinates (2)



Conversion floating point



integer

1. Substract offset

2. Multiply with scale factor

3. Add 0.5 for proper rounding

int = ( fp – offset ) * scale + 0.5



Conversion integer



floating point

(58)

Determining Offsets and

Determining Offsets and

Scale factors

Scale factors



Get maximum spatial extent of data

– Min/max X/Y/Z/M coordinates

– Enlarge area by 20% to accommodate for growth



Offset:

– Set to min X/Y/Z/M



Scale factor

– Set to MAX_INT / ( max X/Y/Z/M + offset )

– MAX_INT is 2,147,483,648



Tradeoff is between coordinate precision (determined by scale

(59)

Coordinate Compression

Coordinate Compression



Internally, coordinates are stored

relative

to previous point in

geometry

– Small integer values can be compressed better



Compression rate is data dependent

– Geometries that are small (relative to scale factor) compress better

(1, 2) (3, 6) [2, 4] (6, 7) [3, 1] (6, 3) [0, -4] (4, 4) [-2, 1] (3, 1) [-1, -3]

(60)

Added method support

Added method support



Version 7

– Only ST_function(geometry) or ST_function(geometry1, geometry2)



Version 8

– Now also geometry..ST_function() or

geometry1..ST_function(geometry2)



Reasons

– Standard compliance

– Ease of use

(61)

Catalog changes

Catalog changes

 Version 7

– Catalog tables highly inadequate (but actually standardized that way)

– Irrelevant information was shown – Relevant information was missing

– 1:1 dependency between base tables and views was mandatory – Specific naming conventions and limits (name lengths)

– Example:

geometry_columns(layer_catalog, layer_schema, layer_table, layer_column, geometry_type, srid)

 Version 8

– Implemented new catalog

– Changed SQL/MM standard to “comply” with the product

 Implementation issues – Backward compatibility

– Everything in the code was affected

(62)

Import/Export

Import/Export

Rewrite of Queries

Rewrite of Queries



Version 7

INSERT INTO table

VALUES (..., ST_GeomFromShape(blob), ...)

– Initialization of buffers inside DB2

– Initialization inside the function



Version 8

INSERT INTO table

SELECT ..., ST_GeomFromShape(data), ... FROM TABLE ( VALUES (..., blob, ...), VALUES (..., blob, ...), ... ) AS ( ..., data, ...)

– Initialization done only once

(63)

Exploitation of UDF features

Exploitation of UDF features



Use of SCRATCHPAD

– Carry internal buffers and already initialized objects from call to call

– All functions: construct coordinate system objects only once

Geocoder: caching of data results in much less file I/O

Several thousand memory operations to geocode a single address



Allow parallelization of function execution



Performance enhancements

– Improve parsers for WKT, WKB, GML, …

(64)

Error Handling & Tracing

Error Handling & Tracing



Version 7

– At least 5 different ways to handle error conditions – NLS issues due to hard-coded error messages

– No tracing available

– No first-fault data capture (FFDC)



Version 8

– Common architecture throughout the (IBM-controlled part of the) product

(65)

– Spatial Indexing –

(66)

The Need for a Spatial Index

The Need for a Spatial Index



Spatial queries are typically in 2-D space



Native B-Tree is insufficient

– B-Tree does not support structured types

– 0-D objects (points, e.g. location of buildings) can be indexed in B-tree based on X and Y coordinates

– But not with best possible performance

– Neither 1-D nor 2-D objects (lines, e.g. road segments, or polygons, e.g. lakes, counties) can be indexed this way

(67)

Function that can Exploit a Spatial Grid Index

Function that can Exploit a Spatial Grid Index



ST_Equals



ST_Disjoint



ST_Intersects, ST_Overlaps, ST_Crosses



ST_Touches



ST_Within, ST_Contains



ST_MBRIntersects

(68)

Function that can Exploit a Spatial Z

Function that can Exploit a Spatial Z

-

-

Order Index

Order Index



ST_ZEquals



ST_ZIntersects, ST_ZOverlaps, ST_ZCrosses



ST_ZWithin, ST_ZContains



ST_ZMBRIntersects



ST_ZTouches

(69)

A uniformly-spaced square indexing grid

each feature exists in one or more grid cells

multiple geometries can exist in a single grid cell, especially overlapping geometries

MBR of geometries is used for indexing speed up searches

Up to 3 levels of grids for the spatial index

spatial index is like a two-dimensional column index

Records which grid cells

each feature resides in

Geometries are indexed

by the grid level, the

overlapped grid cells, and

the MBR of the geometry

The Spatial Index Grids

The Spatial Index Grids

(70)

Spatial index search avoids access to the data

How the Spatial Index Works

How the Spatial Index Works

1. Determine all overlapping grid cells (on all levels) of search argument

 Eliminate all geometries that do not overlap any of those grid cells  Index scan returns set of candidates, possibly including false positives

2. Compare MBR of search agrument with candidates

 Eliminate all geometries where the MBR is already disjoint

 Set of candidates is reduced but might still contain false positives

3. Do an exact comparison

(71)

Find all Parcels that intersect the Flood Zone?

Remember:



The goal is to avoid comparing all geometries with the „flood zone“



Use a 3 level filtering process

Parcel Flood Zone

Let

(72)

1.

Eliminate geometries that do not overlap grid cells of

search argument

Eliminate all parcels that do not intersect with the 4 grid cells

Spatial Index Example

(73)

1.

Eliminate geometries that do not overlap grid cells of search argument

2.

Eliminate geometries by filtering with the MBR

Eliminate all parcels that do not intersect the MBR of the flood zone.

Spatial Index Example

(74)

1.

Eliminate geometries that do not overlap grid cells of search argument

2.

Eliminate geometries by filtering with the MBR

3.

Eliminate geometries by exact coordinate comparison

Spatial Index Example

(75)

Spatial Index Grid Levels

Spatial Index Grid Levels



Up to 3 grid levels possible in an index



Goal

– Reduce number of index entries

– Make grid cells as small as possible

– Higher resolution on index scans  better filtering in this step



Use multiple grid levels when geometries have skewed size

– Avoids lots of index entries for large geometries

– Geometry is promoted to next grid level if it covers more than 4 grid cells



Note

– Multiple grid levels result in multiple index scans

(76)

Choosing the Best Grid Size

Choosing the Best Grid Size



Grid size is size of grid cells (1 dimension)



Use Spatial Index Advisor (gseidx) to get

– Information on the spatial data (size etc.)

– Suggestions for grid sizes

– Index statistics for user-providedgrid sizes



Rules

– Make grid size as small as possible while

(77)

Spatial Index Statistics

Spatial Index Statistics



Use key generator function of grid index extension

Feed geometries into it

Returns table of all data stored in the index

Use SQL to evaluate data in that table



Simpler version: operate on MBR of the geometries

(78)

Rules for Index Exploitation

Rules for Index Exploitation



Spatial predicate must be used in WHERE clause



Spatial function must be on left-hand side of

comparison operator



Equality comparison must use integer constant 1



At least of of the function’s parameters must be an

indexed spatial column

SELECT * FROM customers c WHERE ST_Within(c.location, :BayArea) = 1 SELECT * FROM customers c WHERE ST_Distance(c.location, :SanJose) < 10

SELECT * FROM customers c WHERE ST_Length(c.location) > 10

SELECT * FROM customers c WHERE 1 = ST_Within(c.location, :BayArea) SELECT * FROM customers c WHERE ST_Within(c.location, :BayArea) = 2

(79)

– Demo –

(80)

Problem Statement

Problem Statement



A real estate website needs the capabilities that can

satisfy the following objectives:

Allow users to search property listings and display the search

results on a map

Allow users to attain detailed information about a particular

property

Allow users to drill down and find the spatial relationship

between the selected property and other point of interests

(81)

Implementation Overview

Implementation Overview

PHP Application HTTP Server

Internet

Google Map Server ESRI ArcWeb Map Server Web Browser ---AJAX / JavaScript Real Estate

Listing Data School Data Crime Data

(82)
(83)

Show the details of each result by clicking on a green pushpin on the map. At the query panel below, query the schools within 2 miles of the selected home and API > 700.

(84)
(85)
(86)

Current Developments

Current Developments

(87)

– 3D –

(88)

Modelling 3D Objects

Modelling 3D Objects



Extend spatial type hierarchy

– Solids and Polyhedrons and Collections

– Approximate curved surfaces with polygons

 



New methods for 3D types

– ST_Volume, ST_BoundingArea, ST_ExteriorShell, ST_InteriorShellN, and ST_NumFacets



Adjust existing methods to

– Operate on 3D geometries

(89)

Extend External Data Formats

Extend External Data Formats



Well-known text (WKT), well-known binary (WKB), and

Geography Markup Language (GML)



Boundary based representation

POLYHEDRON Z ((((0 0 0, 1 0 0, 0 0 1, 0 0 0)), ((1 0 0, 0 1 0,

0 0 1, 1 0 0)), ((0 0 1, 0 1 0, 0 0 0, 0 0 1)), ((0 0 0, 0 1 0, 1 0

0, 0 0 0))))



Indexed face set representation

POLYHEDRON Z INDEX ((0 0 0, 1 0 0, 0 0 1, 0 1 0), (((1, 2,

(90)

CGAL as 3D Computation Engine

CGAL as 3D Computation Engine



Scientific library implementing spatial algorithms with

optimal

theoretical

runtime



CGAL Polyhedron for true 3D objects



More general Nef-polyhedrons

All kinds of spatial objects can be described in R

3

(91)

3D Performance Tests

3D Performance Tests



Approximated spheres with growing complexity



Tested functionality

– Constructors & conversion to external data format – Storage and construction time

– Spatial comparison of two geometries

– Generation of new geometries – Intersection computation



Performance of traditional Spatial Extender not impacted besides

(92)

Some Performance Results

Some Performance Results

Construction Intersection

(93)

Summary or 3D with CGAL

Summary or 3D with CGAL



Performance is not acceptable despite optimal

theoretical runtimes



No native binary interface for I/O available



Extremely long construction times



Some simple optimization in CGAL show potential for

significant improvements

(94)

– Graph –

(95)

Integrating Graphs into DBMS

Integrating Graphs into DBMS



All geometries have inherent topological properties

Can be mapped to graph structures

Allows application of graph operations directly on geometries



Evaluate graph operations inside the DBMS

Applications do not need any additional graph functionality

(96)

Deriving Graphs from

Deriving Graphs from

LineStrings

LineStrings



Points become vertices

Intersections of linestrings without common point does not result in graph vertex



Line segments connecting points become edges



Reverse mapping requires identifier (primary key value or RID)



Retain point coordinates in vertices

(97)

Test Data for Prototype

Test Data for Prototype

89.5%

201,571

1,926,298

270,334

Wyoming

76.6%

1,689,279

7,214,599

2,259,314

Texas

74.9%

620,257

2,469,506

842,099

Pennsylvania

85.3%

330,786

2,248,623

451,326

New Mexico

68.5%

428,435

1,361,034

595,662

Michigan

35.4%

9,084

14,059

14,762

D.C.

68.1%

29,266

91,673

39,429

Delaware

66.8%

112,520

340,099

148,232

Connecticut

75.3%

1,200,783

4,867,271

1,589,712

California

savings

#vertices

#points

# streets

(98)

Graphs as Index Structures

Graphs as Index Structures



Usage in SQL statements

Finds linestrings participating in shortest path

Extracts relevant segments of linestrings

SELECT sp.line, t.street_name

FROM t, TABLE ( ST_ShortestPath( ST_Point(3.25, 3.25), ST_Point(5.5, 6.5),

'SAMPLE_GRAPH') ) AS sp(seq, line, id) WHERE sp.id = t.id

ORDER BY sp.seq



ST_ShortestPath automatically accesses graph

structure

(99)

– Integration Aspects –

(100)

Integration

Integration



Utilities

Export and Load/Import do not support spatial types and

spatial file formats



Language Bindings

No spatial types available in JDBC

Spatial data must be transferred as LOB and then converted



Distributed Database Systems

(101)

Summary

(102)

Summary

Summary



(Only) basic spatial functionality available



Standardized in SQL/MM

Implementations available by all major database vendors

More or less adherence to standard



Index support is mandatory



Spatial Support for z/OS much better integrated with

DB2 engine



Federated access on spatial data and replication

Different products = different data types and functions

Coordinate Systems

(103)

Questions?

References

Related documents