DB2 on Demand | Feb 2008
Algorithmische
Algorithmische
Geometrie
Geometrie
in
in
der
der
Praxis
Praxis
Integration von räumlichen Daten und Operationen in relationale Datenbanksysteme
Important Disclaimer
Important Disclaimer
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.
WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND
ACCURACY OF THE INFORMATION CONTAINED IN THIS
PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED.
IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE.
IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.
NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:
• CREATING ANY WARRANTY OR REPRESENTATION FROM IBM (OR ITS AFFILIATES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS); OR
• ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT GOVERNING THE USE OF IBM SOFTWARE.
Agenda
Agenda
Introduction
SQL/MM Spatial standard
DB2 Spatial Extender & DB2 z/OS Spatial Support Feature
– History
– Functionality
– Implementation details
– Spatial indexing
Current & future developments
– 3D support
– Graph, Network & Topology
– Integration Aspects
Introduction
What is Spatial Data
What is Spatial Data
Information about anything that can be located on Earth’s surface
– Natural rivers, lakes, mountains, …
– Man-made buildings, utility facility, …
– Cadastral property boundary, voting districts, …
– Artificial CAD/CAM
Represented by a geometry
– Point, linestring, polygon
Location and geometry defined by
– Coordinates latitude/longitude; x/y
– Addresses ‘Piazza Leonardo da Vinci 32, 20133 Milano, Italy’ – Geocode to get coordinates
– Name ‘White House’
What is Spatial Data
What is Spatial Data
(cont.)
(cont.)
Spatial data is modeled as
raster or vector, and …
… organized as collections of
Spatial Applications
Spatial Applications
Traditional GIS (G is central)
– Natural resources, government, remote sensing
– Sophisticated processing, dynamic geographic features
– Emphasis on geographic data
Business, commercial GIS (IS is central)
– Enterprise databases
– Simpler, static geography
– Emphasis on other attributes
Virtually every database has locations!
Products
Products
IBM DB2 Spatial and Geodetic Extender & IBM DB2 z/OS Spatial
Support Feature
IDS Spatial DataBlade, SpatialWare & Geodetic DataBlade
Oracle Spatial & Oracle Locator
MySQL Spatial
Use of Location Information...
Use of Location Information...
Demographic data
– Average income – Family size
– Average education level – Population density – Gender/race distribution – Crime data Geophysical – Flood plains – Rivers – Fault-lines – Mineral resources Meteorological – Weather conditions Government – Regulatory compliance
... which is Available At Little or No Cost
... which is Available At Little or No Cost
FBI - Crime data such as
– Robbery crimes per 100,000 people – Property crimes per 100,000 people – Car thefts per 100,000 people
FEMA - Disaster data such as
– 100-year and 500-year flood plain areas – Coastal Barrier Resources Act areas
USGS - Geophysical data such as
– Fault lines
– Known mineral deposits – Active mines
Census - Demographic data such as
– % of population 65 or older – % of population never married – Average household size
It
It
’
’
s just SQL . . .
s just SQL . . .
Business applications often require the combination of different types of data
CREATE TABLE HospitalRooms (
Name VARCHAR(128) NOT NULL, Equipment Document NOT NULL Where ST_Point NOT NULL );
-- "Show me all rooms available for the -- next five days within 45 miles of the -- patient’s location equipped with a -- defibrillator?"
SELECT h.name
FROM hospitalRooms AS h, patients AS p WHERE ST_Within(h.where,
ST_Buffer(p.location, 45, 'MILES') ) = 1 AND
p.name = 'Fred Flintstone' AND
Contains(h.equipment, 'defibrillator') = 1
Coordinate Systems
Coordinate Systems
Geographic – 3D
– On surface of the Earth (often latitude/longitude)
– Spheroid (datum, semi-major and semi-minor axis)
– Primeridian (e.g. Greenwich)
– Unit (e.g. degrees, radians)
Projected – 2D
– Geographic plus projection to 2D space
– Many different projection algorithms
„Abstract“
– Not related to Earth‘s surface
longitude latitude 0 -90 (90°W) +90 (90°N) R
Flattening the Earth
Flattening the Earth
Plane Geometry on lat-long
– Singularities and scale distortion at the poles
– Wrap-around at 180º longitude
– Poor location of lines
Local/Regional Projections
– Limited valid range
– Map edge-matching problems
– Non-uniform scale
Indexing: it gets worse!
– Multiple “bounding boxes” or complete loss of selectivity UTM 10 UTM 11 -90 -180 0 180 0 90
Planar Coordinates
Planar Coordinates
500,000 5,000,000 Northing EastingSpatial Data in DBMS
Spatial Data in DBMS
SELECT sid, COUNT(*) AS count, AVG(income) AS avgIncome FROM stores AS s, customers AS c WHERE ST_distance(s.loc, c.loc,
'MILE') < 100 AND s.state = 'CA' AND income > 100000 GROUP BY sid;
"Tell me the number and average income of customers who make more than $100K and live within 100 miles of our California
stores."
text image
spatial
DB2_stores Spatial Index / Data DB2 View_customers
DBMS
cid loc income sid loc
Spatial/
Spatial/
OpenGIS
OpenGIS
Spatial SQL Functions
Spatial SQL Functions
ST_Distance(g1,g2)
?
SELECT a.policy_number, a.address FROM customers a, floodzones b
WHERE ST_Intersects(a.location,b.location) = 1 AND b.last_flood_year > 1950
ST_Intersects(g1,g2)
?
SELECT a.cust_number, a.address FROM customers a, stores b
WHERE ST_Distance(a.location,b.location) < 2000 AND b.last_order_year < 2001
SQL/MM Spatial standard
Environment
Environment
SQL99/SQL2003
–
defines many object-relational extensions
– structured types, UDFs, procedures, LOBs
SQL/MM Part 3: Spatial
–
defines SQL extensions for spatial data
– types, functions/methods, information schema (catalog)
Overview on SQL/MM Spatial
Overview on SQL/MM Spatial
Part 3 of SQL/MM
– Part 1 - Framework
– Part 2 - Full Text
– Part 3 - Spatial
– Part 5 - Still Image
– Part 6 - Data Mining
Derived from OGC Simple Feature Specification for SQL
– OpenGIS Consortium (OGC)
– OGC Simple Feature Specification for SQL – not really a standard, just a specification
Content
Content
Currently status
– 2nd version is International Standard since 2003
– 3rd version in Working Draft state (slow progress)
Based on SQL99
– Structured types, methods, ...
Standard does not define
– Indexing mechanisms
– Implementation issues
– number, names, and types of attributes of the types
– specific algorithms
ST_Point ST_Curve ST_LineString ST_Polygon ST_Surface ST_GeomCollection ST_Geometry ST_Geometry ST_MultiPoint ST_Curve ST_MultiLineString ST_MultiPolygon ST_Surface
Spatial Type Hierarchy (1)
Spatial Type Hierarchy (2)
Spatial Type Hierarchy (2)
Additional optional types
–
ST_CircularString
–
ST_CompoundCurve
Spatial Type Hierarchy
Spatial Type Hierarchy
–
–
Discussion
Discussion
Attempt to implement Composite Design Pattern
– originated from OGC class hierarchy
– pattern cannot be implemented in SQL
ST_Point unrelated to ST_MultiPoint,
ST_LineString unrelated to ST_MultiLineString,
ST_Polygon unrelated to ST_MultiPolygon
– further processing of results leads oftentimes to much more complex SQL statements
Optional types not properly mirrored between single-part and
multi-part subtrees of the hierarchy
ST_Empty ST_MultiPoint ST_Point ST_CircularString ST_MultiCurve ST_Geometry ST_Polygon ST_MultiPolygon
Spatial Type Hierarchy
Spatial Type Hierarchy
–
–
Possible Solution
Possible Solution
ST_MultiCircStrin g ST_MultiSurface ST_MultiLineStrin g ST_LineString
Spatial Methods
Spatial Methods
ST_Geometry ST_Point ST_LineString ST_CircularString ST_CompoundCurve ST_CurvePolygon ST_Polygon ST_GeomCollection ST_MultiPoint ST_MultiLineString ST_MultiPolygon ST_AsText ST_AsBinary ST_AsGML ST_TransformST_Dimension
ST_CoordDim
ST_GeometryType
ST_SRID
ST_IsEmpty
ST_IsSimple
ST_IsValid
ST_Boundary
ST_Envelope
ST_ConvexHull
ST_Buffer
ST_Intersection
ST_Union
ST_Difference
ST_SymDifference
ST_Length
ST_StartPoint
ST_EndPoint
ST_IsClosed
ST_IsRing
ST_NumPoints
ST_PointN
ST_NumCurves
ST_CurveN
ST_Area
ST_Perimeter
ST_Centroid
ST_PointOnSurface
ST_ExteriorRing
ST_InteriorRings
ST_NumInteriorRing
ST_InteriorRingN
ST_Distance
ST_Equals
ST_Relate
ST_Disjoint
ST_Intersects
ST_Touches
ST_Crosses
ST_Within
ST_Contains
ST_Overlaps
ST_X
ST_Y
Spatial Functionality (1)
Spatial Functionality (1)
Constructors
– well-known text representation (WKT)
– well-known binary representation (WKB)
– ESRI shape representation
– GML representation
– Coordinates (for ST_Point values)
Comparison functions (to be used in predicates)
– ST_Overlaps, ST_Within, ST_Disjoint, ST_EqualSRS, ST_Relate, ...
Spatial Functionality (2)
Spatial Functionality (2)
Return information about properties
– ST_Length, ST_X, ST_Is3D, ST_MinY, ST_NumPoints, ...
Derive/compute new geometries
– ST_Union, ST_Intersection, ST_MidPoint, ST_Buffer, ST_ConvexHull, ...
Functions involving 2 geometries are usually performed in 2D
space
Constructors
Constructors
Well-known Text Representation
point (10 10)
multipolygon (((1 1, 2 2, 1 2, 1 1)),((10 10, 10 20, 20 20, 20 10, 10 10)))
Well-known Binary Representation
x'010100000000000000000024400000000000002440‘
Geography Markup Language
<gml:Point><gml:coordinates>10, 10</gml:coordinates></gml:Point> (Additional formats in products, e.g. Shape format)
Spatial Methods
Spatial Methods
–
–
Discussion
Discussion
Duplicated functionality
– ST_Overlaps, ST_Intersects, ST_Crosses
– ST_GeometryType & SPECIFIC_TYPE (SQL99)
Additional functionality already supported by products
– ST_ShortestPath (in current WD)
– Z/M coordinate support
– Other external data representations
Spatial Information Schema (Catalog)
Spatial Information Schema (Catalog)
ST_GEOMETRY_
COLUMNS
COLUMNS
(SQL2003)
is subset
ST_SPATIAL_
REFERENCE_SYSTEMS
ST_UNITS_OF_
MEASURE
ST_SIZINGS
associated
Spatial Information Schema
Spatial Information Schema
–
–
Discussion
Discussion
Merge ST_SIZINGS view with SIZINGS (SQL2003)
– additional facilities in SQL2003 needed, first
ST_SPATIAL_REFERENCE_SYSTEMS is rather primitive,
EPSG uses:
– Coordinate Axis Name, Coordinate Axis, Coordinate System, Coordinate Reference System, Coordinate Operation, Ellipsoid, Datum, Prime Meridian, Coord_Op Method, Coord_Op Parameter, Coord_Op Parameter Usage, Coord_Op Parameter Value,
DB2 LUW Spatial Extender &
DB2 LUW Spatial Extender &
DB2 z/OS Spatial Support Feature
Introduction
Introduction
Connectivity
Connectivity BackupBackup RestoreRestore
Server Components Server Components
DB2
DB2
DB2
Spatial Spatial Geodetic Geodetic Grid GridYour idea goes here
What is an Extender?
What is an Extender?
Types Types Functions Functions Casts Casts Aggregates Aggregates Indexes Indexes Tables Tables Client Code Client Code New New Extender/ Extender/ DataBlade DataBladeDB2 z/OS Spatial Support
DB2 z/OS Spatial Support
No extender technology used
Deeply integrated into DB2 engine
–
Better performance
–
Better customer acceptance
– History –
Spatial Technology
Spatial Technology
–
–
A Joint IBM & ESRI Effort
A Joint IBM & ESRI Effort
IBM teamed up with Environmental Systems Research Institute
(ESRI)
– over 30 years of experience in spatial technology
– broad portfolio of GIS products and tools
Collaboration combined best of both partners
– ESRI provided code that does all the spatial calculations
– IBM threw in database knowledge and extended DB2‘s OR features to accomodate ESRI
ESRI SDE 3.0.2
ESRI SDE for DB2 UDB ESRI Spatial Tools DataJoiner with DB2 Spatial Extender SDE 3.0.2 Other SQL Applications DataJoiner Spatial Extender ESRI Spatial Tools
DB2 UDB 8.1 with the DB2 Spatial Extender Feature ESRI Spatial Tools Other SQL Applications ArcInfo V8 ArcSDE8 DB2 UDB 7.1 DB2 Spatial Extender ArcSDE 9
Evolution
Evolution
select * from customersc, indexi, featuref where i.xmin > xmin(:circle) and
i.ymin > ymin(:circle) and i.xmax < xmax(:circle) and i.ymax < ymax(:circle) and
i.oid = f.oid and i.oid = c.oid and c.income > 50000 GIS Application GIS Gateway RDBMS Proprietary GIS APIs SQL (ODBC, JDBC, etc.) (1) Spatial types (2) Spatial indexes (3) Spatial functions (4) Spatial predicates
(5) Spatial query composer (6) Spatial query rewriter
(1) Spatial meta tables
(2) "Hidden" tables for spatial data (3) "Hidden" table for spatial index
Table: customers
Constraint: within(loc, :circle)
Other constraint: income > 50000
Loosely Coupled Architecture
GIS Application GIS Gateway RDBMS Proprietary GIS APIs SQL (ODBC, JDBC, etc)
(1) Spatial query composer (1) Spatial meta tables (2) Spatial types
(3) Spatial functions (4) Spatial indexes (5) Spatial predicates
select * from customersc
where within(c.location, :circle) and c.income > 50000
Table: customers
Constraint: within(loc, :circle)
Other constraint:income > 50000
Integrated Architecture
Integrated Architecture
– Functionality –
Geocoder data Stored
Proc.
Structured
Types MethodUDF/
DB2 Applications Client functions Spatial Business Data DB2 Client SQL
Architecture
Architecture
Spatial Extender Features at a Glance (1)
Spatial Extender Features at a Glance (1)
Integrated into DB2 Control Center
–
Graphical user interface
–
Includes spatial management functions
– Enable/disable spatial DB
– Create/drop coordinate systems, spatial reference systems, and geocoders
– Register/unregister spatial columns
– Manage geocoding instances
Spatial Extender Features at a Glance (2)
Spatial Extender Features at a Glance (2)
Extender is based on DB2’s OR features – User-defined structured types
– Spatial index extensions
– Grid index for all 2-dimensional data – Z-order index for point data in 2D space
Spatial catalog views
– Coordinate systems & spatial reference systems
– Spatial columns
– Geocoders
Geocoder framework
– Support to plug in custom geocoders
Just define a UDF that does the geocoding
Import/export for special spatial file formats – ESRI shapefiles
Spatial Extender Functions and Methods
Spatial Extender Functions and Methods
ST_Contains ST_Touches ST_Within ST_Overlaps ST_Intersects ST_Crosses ST_Disjoint ST_Relate ST_Equals ST_MBRIntersects ST_Intersection ST_Difference ST_Union ST_SymDifference ST_Buffer ST_PointOnSurface ST_Boundary ST_Envelope ST_Centroid ST_Perimeter ST_ConvexHull ST_MBR ST_Generalize ST_FindMeasure ST_MeasureBetween MBR aggregate Union aggregate ST_Geometry ST_Point ST_LineString ST_Polygon ST_GeomCollection ST_MultiPoint ST_MultiLineString ST_MultiPolygon ST_AsText ST_AsBinary ST_AsShape ST_AsGML ST_ToGeomColl ST_ToPoint ST_ToLineString ST_ToPolygon ST_ToMultiPoint ST_ToMultiLine ST_ToMultiPolygon ST_Transform ST_Is3D ST_IsMeasured ST_IsClosed ST_IsEmpty ST_IsRing ST_IsSimple ST_IsValid ST_StartPoint ST_MidPoint ST_EndPoint ST_CoordDim ST_Dimension ST_GeometryType ST_NumPoints ST_PointN ST_ExteriorRing ST_NumInteriorRing ST_InteriorRingN ST_NumGeometries ST_GeometryN ST_NumLineStrings ST_LineStringN ST_NumPolygons ST_PolygonN ST_NumPoints ST_PointN ST_Area ST_Distance ST_Length ST_Perimeter ST_AppendPoint ST_ChangePoint ST_RemovePoint ST_PerpPoints ST_X ST_Y ST_Z ST_M ST_MinX ST_MinY ST_MinZ ST_MinM ST_MaxX ST_MaxY ST_MaxZ ST_MaxM ST_Edge_GC_USA ST_EqualCoordsys ST_EqualSRS ST_GetIndexParms ST_SrsName ST_SrsId
Administrative Stored Procedures
Administrative Stored Procedures
ST_ALTER_COORDSYS ST_ALTER_SRS ST_CREATE_COORDSYS ST_CREATE_INDEX ST_CREATE_SRS ST_CREATE_SRS_2 ST_DROP_COORDSYS ST_DROP_INDEX ST_DROP_SRS ST_IMPORT_SHAPE ST_REGISTER_SPATIAL_COLUMN
Some Examples (1)
Some Examples (1)
CREATE TABLE customers ( name VARCHAR(100), address VARCHAR(200),
location ST_Point )
CREATE TABLE streets (
name VARCHAR(100),
location ST_LineString )
CREATE TABLE stores (
name VARCHAR(100), address VARCHAR(200), manager VARCHAR(100),
Some Examples (2)
Some Examples (2)
INSERT INTO customers
VALUES ( 'customer 1', 'address 1',
ST_Point(100, 200, 1) )
INSERT INTO streets
VALUES ( '1st street', ST_LineString( 'linestring ( 50 50, 100 100,
100 250, 200 250 )', 1) )
INSERT INTO stores
VALUES ( 'store 1', 'address 2', 'john',
Some Examples (3)
Some Examples (3)
Distance between all stores and all customers, measured in meters SELECT s.location..ST_Distance(c.location, 'METER')
FROM stores AS s, customers AS c
Find all street intersections, return intersecting point as GML – will exploit spatial index if possible
SELECT s1.name, s2.name,
s1.location..ST_Intersection(s2.location).. ST_AsGML()
FROM streets AS s1 JOIN streets AS s2
Some Examples (4)
Some Examples (4)
Find all customers that live in an area around any store SELECT s.name, c.name, c.address
FROM stores AS s, customers AS c
WHERE s.location..ST_Buffer(5, 'KILOMETER').. ST_Contains(c.location) = 1
Find closest store for all customers
WITH cust_store_dist(cust, store, dist) AS
( SELECT c.name, s.name, ST_Distance(c.location, s.location, 'METER')
FROM stores AS s, customers AS c ) SELECT o.cust, o.store
FROM cust_store_dist AS o
WHERE o.dist <= ALL ( SELECT i.dist
Geocoding
Geocoding
Address Data
Address Data
Customers
Street
Segments
Geocoder
Populate the spatial column with the location (point)
information
Cid Address City State Zip … Location
1 4335 Queen Anne Drive San Jose CA 95129 …
--2 555 Bailey Avenue San Jose CA 95120 …
--… --… … …
--The
The
Geocoding
Geocoding
Process
Process
US Streets
Map Data
Geocoder 555 Bailey Avenue San Jose, CA 95141
Real Earth x,y Coordinates (Latitude, Longitude) DB2 UDB DB2 DB2 Spatial ADDRESS
– Implementation Details –
Integer Coordinates (1)
Integer Coordinates (1)
All coordinates are stored as positive 32-bit integers
– Integer calculation faster than floating points
– Less storage required
– Compression of coordinates
Floating points are converted to integers on the fly
Integer Coordinates (2)
Integer Coordinates (2)
Conversion floating point
integer
1. Substract offset
2. Multiply with scale factor
3. Add 0.5 for proper rounding
int = ( fp – offset ) * scale + 0.5
Conversion integer
floating point
Determining Offsets and
Determining Offsets and
Scale factors
Scale factors
Get maximum spatial extent of data
– Min/max X/Y/Z/M coordinates
– Enlarge area by 20% to accommodate for growth
Offset:
– Set to min X/Y/Z/M
Scale factor
– Set to MAX_INT / ( max X/Y/Z/M + offset )
– MAX_INT is 2,147,483,648
Tradeoff is between coordinate precision (determined by scale
Coordinate Compression
Coordinate Compression
Internally, coordinates are stored
relative
to previous point in
geometry
– Small integer values can be compressed better
Compression rate is data dependent
– Geometries that are small (relative to scale factor) compress better
(1, 2) (3, 6) [2, 4] (6, 7) [3, 1] (6, 3) [0, -4] (4, 4) [-2, 1] (3, 1) [-1, -3]
Added method support
Added method support
Version 7
– Only ST_function(geometry) or ST_function(geometry1, geometry2)
Version 8
– Now also geometry..ST_function() or
geometry1..ST_function(geometry2)
Reasons
– Standard compliance
– Ease of use
Catalog changes
Catalog changes
Version 7
– Catalog tables highly inadequate (but actually standardized that way)
– Irrelevant information was shown – Relevant information was missing
– 1:1 dependency between base tables and views was mandatory – Specific naming conventions and limits (name lengths)
– Example:
geometry_columns(layer_catalog, layer_schema, layer_table, layer_column, geometry_type, srid)
Version 8
– Implemented new catalog
– Changed SQL/MM standard to “comply” with the product
Implementation issues – Backward compatibility
– Everything in the code was affected
Import/Export
Import/Export
–
–
Rewrite of Queries
Rewrite of Queries
Version 7
INSERT INTO table
VALUES (..., ST_GeomFromShape(blob), ...)
– Initialization of buffers inside DB2
– Initialization inside the function
Version 8
INSERT INTO table
SELECT ..., ST_GeomFromShape(data), ... FROM TABLE ( VALUES (..., blob, ...), VALUES (..., blob, ...), ... ) AS ( ..., data, ...)
– Initialization done only once
Exploitation of UDF features
Exploitation of UDF features
Use of SCRATCHPAD
– Carry internal buffers and already initialized objects from call to call
– All functions: construct coordinate system objects only once
– Geocoder: caching of data results in much less file I/O
Several thousand memory operations to geocode a single address
Allow parallelization of function execution
Performance enhancements
– Improve parsers for WKT, WKB, GML, …
Error Handling & Tracing
Error Handling & Tracing
Version 7
– At least 5 different ways to handle error conditions – NLS issues due to hard-coded error messages
– No tracing available
– No first-fault data capture (FFDC)
Version 8
– Common architecture throughout the (IBM-controlled part of the) product
– Spatial Indexing –
The Need for a Spatial Index
The Need for a Spatial Index
Spatial queries are typically in 2-D space
Native B-Tree is insufficient
– B-Tree does not support structured types
– 0-D objects (points, e.g. location of buildings) can be indexed in B-tree based on X and Y coordinates
– But not with best possible performance
– Neither 1-D nor 2-D objects (lines, e.g. road segments, or polygons, e.g. lakes, counties) can be indexed this way
Function that can Exploit a Spatial Grid Index
Function that can Exploit a Spatial Grid Index
ST_Equals
ST_Disjoint
ST_Intersects, ST_Overlaps, ST_Crosses
ST_Touches
ST_Within, ST_Contains
ST_MBRIntersects
Function that can Exploit a Spatial Z
Function that can Exploit a Spatial Z
-
-
Order Index
Order Index
ST_ZEquals
ST_ZIntersects, ST_ZOverlaps, ST_ZCrosses
ST_ZWithin, ST_ZContains
ST_ZMBRIntersects
ST_ZTouches
A uniformly-spaced square indexing grid
each feature exists in one or more grid cells
multiple geometries can exist in a single grid cell, especially overlapping geometries
MBR of geometries is used for indexing speed up searches
Up to 3 levels of grids for the spatial index
spatial index is like a two-dimensional column index
Records which grid cells
each feature resides in
Geometries are indexed
by the grid level, the
overlapped grid cells, and
the MBR of the geometry
The Spatial Index Grids
The Spatial Index Grids
Spatial index search avoids access to the data
How the Spatial Index Works
How the Spatial Index Works
1. Determine all overlapping grid cells (on all levels) of search argument
Eliminate all geometries that do not overlap any of those grid cells Index scan returns set of candidates, possibly including false positives
2. Compare MBR of search agrument with candidates
Eliminate all geometries where the MBR is already disjoint
Set of candidates is reduced but might still contain false positives
3. Do an exact comparison
Find all Parcels that intersect the Flood Zone?
Remember:
The goal is to avoid comparing all geometries with the „flood zone“
Use a 3 level filtering process
Parcel Flood Zone
Let
1.
Eliminate geometries that do not overlap grid cells of
search argument
Eliminate all parcels that do not intersect with the 4 grid cells
Spatial Index Example
1.
Eliminate geometries that do not overlap grid cells of search argument
2.
Eliminate geometries by filtering with the MBR
Eliminate all parcels that do not intersect the MBR of the flood zone.
Spatial Index Example
1.
Eliminate geometries that do not overlap grid cells of search argument
2.
Eliminate geometries by filtering with the MBR
3.
Eliminate geometries by exact coordinate comparison
Spatial Index Example
Spatial Index Grid Levels
Spatial Index Grid Levels
Up to 3 grid levels possible in an index
Goal
– Reduce number of index entries
– Make grid cells as small as possible
– Higher resolution on index scans better filtering in this step
Use multiple grid levels when geometries have skewed size
– Avoids lots of index entries for large geometries
– Geometry is promoted to next grid level if it covers more than 4 grid cells
Note
– Multiple grid levels result in multiple index scans
Choosing the Best Grid Size
Choosing the Best Grid Size
Grid size is size of grid cells (1 dimension)
Use Spatial Index Advisor (gseidx) to get
– Information on the spatial data (size etc.)
– Suggestions for grid sizes
– Index statistics for user-providedgrid sizes
Rules
– Make grid size as small as possible while
Spatial Index Statistics
Spatial Index Statistics
Use key generator function of grid index extension
–
Feed geometries into it
–
Returns table of all data stored in the index
–
Use SQL to evaluate data in that table
Simpler version: operate on MBR of the geometries
Rules for Index Exploitation
Rules for Index Exploitation
Spatial predicate must be used in WHERE clause
Spatial function must be on left-hand side of
comparison operator
Equality comparison must use integer constant 1
At least of of the function’s parameters must be an
indexed spatial column
SELECT * FROM customers c WHERE ST_Within(c.location, :BayArea) = 1 SELECT * FROM customers c WHERE ST_Distance(c.location, :SanJose) < 10
SELECT * FROM customers c WHERE ST_Length(c.location) > 10
SELECT * FROM customers c WHERE 1 = ST_Within(c.location, :BayArea) SELECT * FROM customers c WHERE ST_Within(c.location, :BayArea) = 2
– Demo –
Problem Statement
Problem Statement
A real estate website needs the capabilities that can
satisfy the following objectives:
–
Allow users to search property listings and display the search
results on a map
–
Allow users to attain detailed information about a particular
property
–
Allow users to drill down and find the spatial relationship
between the selected property and other point of interests
Implementation Overview
Implementation Overview
PHP Application HTTP ServerInternet
Google Map Server ESRI ArcWeb Map Server Web Browser ---AJAX / JavaScript Real EstateListing Data School Data Crime Data
Show the details of each result by clicking on a green pushpin on the map. At the query panel below, query the schools within 2 miles of the selected home and API > 700.
Current Developments
Current Developments
–
– 3D –
Modelling 3D Objects
Modelling 3D Objects
Extend spatial type hierarchy
– Solids and Polyhedrons and Collections
– Approximate curved surfaces with polygons
New methods for 3D types
– ST_Volume, ST_BoundingArea, ST_ExteriorShell, ST_InteriorShellN, and ST_NumFacets
Adjust existing methods to
– Operate on 3D geometries
Extend External Data Formats
Extend External Data Formats
Well-known text (WKT), well-known binary (WKB), and
Geography Markup Language (GML)
Boundary based representation
–
POLYHEDRON Z ((((0 0 0, 1 0 0, 0 0 1, 0 0 0)), ((1 0 0, 0 1 0,
0 0 1, 1 0 0)), ((0 0 1, 0 1 0, 0 0 0, 0 0 1)), ((0 0 0, 0 1 0, 1 0
0, 0 0 0))))
Indexed face set representation
–
POLYHEDRON Z INDEX ((0 0 0, 1 0 0, 0 0 1, 0 1 0), (((1, 2,
CGAL as 3D Computation Engine
CGAL as 3D Computation Engine
Scientific library implementing spatial algorithms with
optimal
theoretical
runtime
CGAL Polyhedron for true 3D objects
More general Nef-polyhedrons
–
All kinds of spatial objects can be described in R
33D Performance Tests
3D Performance Tests
Approximated spheres with growing complexity
Tested functionality
– Constructors & conversion to external data format – Storage and construction time
– Spatial comparison of two geometries
– Generation of new geometries – Intersection computation
Performance of traditional Spatial Extender not impacted besides
Some Performance Results
Some Performance Results
Construction Intersection
Summary or 3D with CGAL
Summary or 3D with CGAL
Performance is not acceptable despite optimal
theoretical runtimes
No native binary interface for I/O available
Extremely long construction times
Some simple optimization in CGAL show potential for
significant improvements
– Graph –
Integrating Graphs into DBMS
Integrating Graphs into DBMS
All geometries have inherent topological properties
–
Can be mapped to graph structures
–
Allows application of graph operations directly on geometries
Evaluate graph operations inside the DBMS
–
Applications do not need any additional graph functionality
Deriving Graphs from
Deriving Graphs from
LineStrings
LineStrings
Points become vertices
– Intersections of linestrings without common point does not result in graph vertex
Line segments connecting points become edges
Reverse mapping requires identifier (primary key value or RID)
Retain point coordinates in vertices
Test Data for Prototype
Test Data for Prototype
89.5%
201,571
1,926,298
270,334
Wyoming
76.6%
1,689,279
7,214,599
2,259,314
Texas
74.9%
620,257
2,469,506
842,099
Pennsylvania
85.3%
330,786
2,248,623
451,326
New Mexico
68.5%
428,435
1,361,034
595,662
Michigan
35.4%
9,084
14,059
14,762
D.C.
68.1%
29,266
91,673
39,429
Delaware
66.8%
112,520
340,099
148,232
Connecticut
75.3%
1,200,783
4,867,271
1,589,712
California
savings
#vertices
#points
# streets
Graphs as Index Structures
Graphs as Index Structures
Usage in SQL statements
–
Finds linestrings participating in shortest path
–
Extracts relevant segments of linestrings
SELECT sp.line, t.street_name
FROM t, TABLE ( ST_ShortestPath( ST_Point(3.25, 3.25), ST_Point(5.5, 6.5),
'SAMPLE_GRAPH') ) AS sp(seq, line, id) WHERE sp.id = t.id
ORDER BY sp.seq