The STC for Event Analysis:
Scalability Issues
Georg Fuchs
Gennady Andrienko
© Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS
Events
“Something [significant] happened somewhere, sometime”
Analysis goal and domain dependent, e.g.
“Object starts/stops moving”,
“Object property changes”,
“Earthquake with magnitude > 2 on Richter scale”
Visualization methods
Animated and dynamic query maps
Space-Time Cube (STC)
The Scalability Challenge
53,000 events
Analysis of Spatially Distributed Events:
Major Questions
How are the events distributed in space?
at one particular time moment, or
all events that occurred over a time period
How are the event occurrences distributed over time?
E.g., how does the overall event frequency vary?
How does the pattern of spatial distribution of the events change over time?
How are the events distributed in space + time? Are there any
spatio-temporal clusters?
Data structure:
Example: Earthquakes in Marmara region (western Turkey and around)
… … … …
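A minimal sketch of how one event record could be represented, assuming each earthquake carries a timestamp, a position, and a magnitude. The field names and the sample values are hypothetical; the actual table columns are not shown here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """One spatial event: something happened somewhere, sometime."""
    time: datetime    # when it happened
    lon: float        # where it happened (degrees east)
    lat: float        # where it happened (degrees north)
    magnitude: float  # thematic attribute (hypothetical field name)


def significant(events, min_magnitude=2.0):
    """Keep only events above a domain-dependent significance threshold."""
    return [e for e in events if e.magnitude > min_magnitude]


# Made-up sample records, for illustration only.
sample = [
    Event(datetime(1976, 1, 5, 3, 12), 28.9, 40.7, 2.4),
    Event(datetime(1976, 2, 11, 17, 40), 27.6, 40.2, 1.8),
    Event(datetime(1977, 6, 30, 9, 5), 29.3, 40.8, 3.1),
]
print(significant(sample))
```

The threshold mirrors the "magnitude > 2" example given earlier; other analysis goals would use a different predicate.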
Addressing the Scalability Challenge:
Optimized Rendering?
Full opacity
50% transparency
70% transparency
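A short worked calculation of why transparency alone may not remove occlusion. It assumes standard "over" alpha compositing, which is an assumption about the renderer and not something stated on the slides: when glyphs with opacity alpha overlap, the accumulated opacity is 1 - (1 - alpha)^n.

```python
def accumulated_opacity(alpha, layers):
    """Opacity of a pixel covered by `layers` glyphs of opacity `alpha`."""
    return 1.0 - (1.0 - alpha) ** layers


def layers_until(alpha, threshold=0.99):
    """Smallest number of overlapping glyphs whose accumulated opacity
    reaches `threshold`."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    remaining = 1.0  # fraction of the background still visible
    n = 0
    while 1.0 - remaining < threshold:
        remaining *= 1.0 - alpha
        n += 1
    return n


# 70% transparency corresponds to alpha = 0.3, 50% to alpha = 0.5.
for transparency in (0.5, 0.7):
    alpha = 1.0 - transparency
    print(transparency, layers_until(alpha), accumulated_opacity(alpha, 10))
```

Under this assumption, a dense region with tens of overlapping events saturates quickly even at 70% transparency.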
Analysis methods
Spatio-Temporal Aggregation
Addressing the Scalability Challenge
Spatio-temporal aggregation
Spatial aggregation:
by units of any territory division
E.g., cells of a regular grid
Temporal aggregation:
by time intervals
Occlusion is still a problem since ST-aggregates typically use larger glyphs (e.g., spheres) to convey the aggregated region + time interval!
Reduction of object/rendering primitive count
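The aggregation described above can be sketched as follows: a minimal example, assuming events are given as (longitude, latitude, time) tuples, a regular grid in degrees for the spatial units, and fixed-length time intervals. The function name, the cell size, and the sample values are illustrative assumptions.

```python
import math
from collections import Counter
from datetime import datetime


def aggregate(events, cell_deg, interval_days, t0):
    """Count events per (grid column, grid row, time interval)."""
    counts = Counter()
    for lon, lat, t in events:
        col = math.floor(lon / cell_deg)         # spatial unit: grid column
        row = math.floor(lat / cell_deg)         # spatial unit: grid row
        tbin = (t - t0).days // interval_days    # temporal unit: interval index
        counts[(col, row, tbin)] += 1
    return counts


# Made-up sample events, for illustration only.
events = [
    (28.91, 40.72, datetime(1976, 1, 5)),
    (28.95, 40.78, datetime(1976, 1, 20)),
    (28.93, 40.71, datetime(1976, 2, 15)),
    (28.40, 40.10, datetime(1976, 3, 1)),
]
aggregates = aggregate(events, cell_deg=0.5, interval_days=30,
                       t0=datetime(1976, 1, 1))
for key, n in sorted(aggregates.items()):
    print(key, n)
```

Four events collapse into three aggregates here; each aggregate would be drawn as one glyph sized or colored by its count, which is where the reduction in rendering primitives comes from.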
Event Density Calculations
In case of 2D maps: compute density surfaces
[Figure: density surfaces for 1976, 1977, and 1978]
Disclaimer: There are far more polished tools than the one used for these illustrations.
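A minimal sketch of a 2D density surface, assuming a Gaussian kernel evaluated at the centers of a regular grid. The kernel choice, the bandwidth, and the function name are assumptions for illustration; the tool used for the figures may compute density differently.

```python
import math


def density_surface(points, x_range, y_range, nx, ny, bandwidth):
    """Return ny rows of nx kernel density values at grid cell centers."""
    x0, x1 = x_range
    y0, y1 = y_range
    dx = (x1 - x0) / nx
    dy = (y1 - y0) / ny
    # Normalization of a 2D Gaussian kernel, averaged over all points.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    surface = []
    for j in range(ny):
        cy = y0 + (j + 0.5) * dy
        row = []
        for i in range(nx):
            cx = x0 + (i + 0.5) * dx
            total = sum(
                math.exp(-((cx - px) ** 2 + (cy - py) ** 2)
                         / (2.0 * bandwidth ** 2))
                for px, py in points
            )
            row.append(norm * total)
        surface.append(row)
    return surface


# Made-up points: three close together, one far away.
points = [(1.4, 1.6), (1.5, 1.5), (1.6, 1.4), (3.5, 3.5)]
surface = density_surface(points, (0.0, 4.0), (0.0, 4.0), 4, 4, bandwidth=0.5)
for row in surface:
    print(" ".join(f"{v:.3f}" for v in row))
```

Computing one such surface per time interval gives a sequence of maps; adding time as a third kernel dimension would give a density volume for the space-time cube.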
In case of 3D STC: is it worthwhile to look at volume visualization?
[Figure: volume visualization, © MathWorks]
Spatio-Temporal Clustering
Event Distribution in Space-Time
Finding clusters in Space-Time
This is what we are interested in!
We see that all but one of the events occurred very close to each other. We can conclude that this is indeed a spatio-temporal cluster and, hence, that there may be a relationship between these events.
We see that the events seem to split into two sequences with a certain time lapse between them.
• The number of clusters must be known in advance
• Returns convex shaped clusters
• Connection between events with a certain distance threshold.
• Difficult to parametrize.
• Extracts arbitrarily shaped clusters.
• Doesn't require a priori specification of the number of clusters.
Automated Detection of ST Event Clusters
Density based Clustering Algorithm
Cluster detection using density-based clustering
Parameters:
Spatial distance threshold = 10 km
Temporal distance threshold = 30 days
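A minimal sketch of density-based clustering with a spatio-temporal neighborhood, using the two thresholds above. It is a plain DBSCAN in which two events are neighbors only if they are within both the spatial and the temporal threshold. The minimum neighbor count `min_pts`, the haversine distance, and the sample data are assumptions; the slides do not state them.

```python
import math
from datetime import datetime, timedelta


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def st_neighbors(events, i, eps_km, eps_time):
    """Indices of events close to event i in both space and time."""
    lon, lat, t = events[i]
    return [j for j, (lo, la, tj) in enumerate(events)
            if abs(tj - t) <= eps_time
            and haversine_km(lon, lat, lo, la) <= eps_km]


def st_dbscan(events, eps_km=10.0, eps_time=timedelta(days=30), min_pts=3):
    """Return one label per event; -1 marks noise."""
    labels = [None] * len(events)
    cluster = 0
    for i in range(len(events)):
        if labels[i] is not None:
            continue
        seeds = st_neighbors(events, i, eps_km, eps_time)
        if len(seeds) < min_pts:
            labels[i] = -1            # noise for now, may become a border point
            continue
        labels[i] = cluster
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster   # border point of this cluster
            elif labels[j] is None:
                labels[j] = cluster
                nbrs = st_neighbors(events, j, eps_km, eps_time)
                if len(nbrs) >= min_pts:   # core point: keep expanding
                    seeds.extend(nbrs)
        cluster += 1
    return labels


# Made-up events: two bursts at the same place months apart, one isolated event.
events = [
    (29.00, 40.70, datetime(2000, 1, 1)),
    (29.01, 40.71, datetime(2000, 1, 2)),
    (29.02, 40.70, datetime(2000, 1, 4)),
    (29.00, 40.70, datetime(2000, 8, 1)),
    (29.01, 40.70, datetime(2000, 8, 3)),
    (29.02, 40.71, datetime(2000, 8, 5)),
    (27.00, 39.00, datetime(2000, 1, 2)),
]
labels = st_dbscan(events)
print(labels)
```

The two bursts share a location but end up in different clusters because they are more than 30 days apart, which a purely spatial clustering would not distinguish.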
Observations and caveats:
The space-time cube reveals an interesting pattern: a west-east shift of cluster locations over the studied time period
The number of detected clusters (108) exceeds the number of discernible colors, so different clusters are often colored very similarly
Automated Detection of ST Event Clusters
Density-based algorithms typically assume that the entire dataset fits into RAM at once
This might not hold during initial exploratory analysis
e.g., Flickr photo-taking: ~100,000,000 events
Proposed scalability extension to DBSCAN (EuroVA'12)
Scalable to large datasets that do not fit in RAM
Accounts for the spatio-temporal nature of the data
Improved execution time compared to DBSCAN
Scaling to extremely large event data – Extended DBSCAN
Extended DBSCAN
Spatio-temporal neighborhood parameters
Principal algorithm steps
Data is successively loaded from the database into main memory (RAM) in partially overlapping frames.
DBSCAN is applied to each frame independently using the ST-neighborhood criterion.
When clustering is completed, the clusters of consecutive frames are merged.
After merging, RAM occupied by old frames is released.
Merging process
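The principal steps and the merging process can be sketched as follows. This is an illustration under stated assumptions, not the published algorithm: frames are temporal windows, the "database" is an in-memory list, events are (x_km, y_km, t_days) tuples with planar coordinates, and clusters of consecutive frames are merged when they share at least one event in the overlap. All function and parameter names are hypothetical.

```python
import math


def dbscan(points, eps_km, eps_days, min_pts):
    """Plain DBSCAN with an ST-neighborhood; returns labels, -1 = noise."""
    def neighbors(i):
        x, y, t = points[i]
        return [j for j, (xj, yj, tj) in enumerate(points)
                if abs(tj - t) <= eps_days
                and math.hypot(xj - x, yj - y) <= eps_km]

    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1
            continue
        labels[i] = cluster
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster
            elif labels[j] is None:
                labels[j] = cluster
                nbrs = neighbors(j)
                if len(nbrs) >= min_pts:
                    seeds.extend(nbrs)
        cluster += 1
    return labels


def find(parent, key):
    """Union-find root lookup with path halving."""
    while parent[key] != key:
        parent[key] = parent[parent[key]]
        key = parent[key]
    return key


def framed_dbscan(database, frame_len, overlap, eps_km, eps_days, min_pts):
    """Cluster frame by frame; returns {event index: merged cluster key}."""
    if overlap >= frame_len:
        raise ValueError("overlap must be smaller than frame_len")
    parent = {}    # union-find over (frame number, local label) keys
    assigned = {}  # event index -> first cluster key; a real system
                   # would write this back to the database
    t_max = max(e[2] for e in database)
    start = min(e[2] for e in database)
    frame_no = 0
    while start <= t_max:
        # Step 1: load one frame (a real system would query the database).
        ids = [i for i, e in enumerate(database)
               if start <= e[2] < start + frame_len]
        # Step 2: cluster the frame independently.
        labels = dbscan([database[i] for i in ids], eps_km, eps_days, min_pts)
        # Step 3: merge with earlier clusters via events in the overlap.
        for i, lab in zip(ids, labels):
            if lab == -1:
                continue
            key = (frame_no, lab)
            parent.setdefault(key, key)
            if i in assigned:
                parent[find(parent, key)] = find(parent, assigned[i])
            else:
                assigned[i] = key
        # Step 4: the frame's data is no longer needed.
        del ids, labels
        start += frame_len - overlap
        frame_no += 1
    return {i: find(parent, key) for i, key in assigned.items()}


# Made-up data: a 20-day chain at one place, one isolated event,
# and a short separate burst elsewhere.
database = [(0.0, 0.0, float(t)) for t in range(20)]
database += [(100.0, 100.0, 5.0)]
database += [(50.0, 50.0, 2.0), (50.0, 50.0, 3.0), (50.0, 50.0, 4.0)]
result = framed_dbscan(database, frame_len=10, overlap=4,
                       eps_km=1.0, eps_days=1.5, min_pts=3)
print(len(set(result.values())), "clusters")
```

The chain spans several frames yet comes out as one cluster, because each pair of consecutive frames shares clustered events in their overlap. Events near frame borders may be classified differently than in a single global run, so the overlap should be chosen with the temporal threshold in mind.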
Use for visual exploration
The proposed algorithm can be used for visual analysis of large datasets.
2 million points / 17,200 GPS tracks, collected in one week.
Objective:
Detect traffic jams in the city.
Investigate the properties of the clusters.
Detection:
Spatio-temporal clusters of slow movement events
Remove noise (i.e., spurious slow movements)
Investigation:
Temporal distribution of these traffic jams
Convex hulls/prism representation
Fewer objects/glyphs to visualize
Spatial and/or temporal zooming can be applied
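The convex hull/prism representation mentioned above can be sketched like this: a minimal example, assuming each cluster is a list of (x, y, t) events in planar coordinates, and taking the prism to be the 2D convex hull of the positions extruded over the cluster's time span. The function names and this exact prism definition are assumptions.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def cluster_prism(events):
    """Summarize one cluster as (footprint polygon, start time, end time)."""
    footprint = convex_hull([(x, y) for x, y, _ in events])
    times = [t for _, _, t in events]
    return footprint, min(times), max(times)


# Made-up cluster: four corner events and one interior event.
cluster = [(0, 0, 1), (2, 0, 5), (2, 2, 3), (0, 2, 2), (1, 1, 4)]
footprint, t_start, t_end = cluster_prism(cluster)
print(footprint, t_start, t_end)
```

One prism replaces all events of a cluster, so the number of objects to draw depends on the number of clusters and not on the number of events.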
Use for visual exploration – convex hull cluster representation
Use for visual exploration – temporal zooming
Future Work
Combine temporal with spatial framing
Dynamic frame sizes according to local density distribution
Exploit inherent parallelism of independent frame clustering