Analyzing Trajectory Data
Gerasimos Marketos
Information Systems Lab (InfoLab)
University of Piraeus (UniPi), Greece
http://infolab.cs.unipi.gr
Free University of Bolzano
20/3/2009
Preliminaries
Trajectory Data Warehousing
Our framework for trajectory analysis
Reconstructing trajectories
ETL processing
Addressing the distinct count problem
Traffic Mining
Modeling traffic flow in a road network
Our framework for mining traffic patterns
Detecting traffic flow in a road network
G. Marketos: Analyzing Trajectory Data 3
The big picture (the GeoPKDD project)
Traffic Management Accessibility of services Mobility evolution Urban planning …. interpretation visualization trajectory reconstruction p(x)=0.02 warehouse interpretation visualization trajectory reconstruction p(x)=0.02 ST patterns Trajectory warehouse Privacy-aware Data mining Bandwidth/Power optimizationMobile cells planning … Public administration or business companies Telecommunication provider GeoKnowledge
Aggregative Location-based services Privacy enforcement Traffic Management Accessibility of services Mobility evolution Urban planning …. interpretation visualization trajectory reconstruction p(x)=0.02 warehouse interpretation visualization trajectory reconstruction p(x)=0.02 ST patterns Privacy-aware Data mining Bandwidth/Power optimization
Mobile cells planning …
Public administration or business companies
GeoKnowledge
Aggregative Location-based services
G. Marketos: Analyzing Trajectory Data 5
A tour on GeoPKDD
Mobility data management
Acquiring and storing trajectories in MODs
Location-aware querying
Trajectory indexing
Mobility data mining and the geographic KDD process
Trajectory warehousing and OLAP
Mobility data mining and reasoning
Visual analytics for mobility data
Privacy aspects on mobility data
Key questions (for this talk)
How to
store
and
query
trajectory data?
Database technology is extended: Moving Object Databases
How to
reconstruct
a trajectory from raw logs?
Position devices provide us information just about location points
and not about trajectories
How to
analyze
trajectory data
Data Warehousing technology is adapted to handle trajectory
data
Are there any
spatio-temporal patterns
in my data?
New Data Mining algorithms are needed so as to discover such
patterns
G. Marketos: Analyzing Trajectory Data 7
Mobility Data
Typical structure and size:
N;Time;Lat;Lon;Height;Course;Speed;PDOP;State;NSat
… 8;22/03/07 08:51:52;50.777132;7.205580; 67.6;345.4;21.817;3.8;1808;4 9;22/03/07 08:51:56;50.777352;7.205435; 68.4;35.6;14.223;3.8;1808;4 10;22/03/07 08:51:59;50.777415;7.205543; 68.3;112.7;25.298;3.8;1808;4 11;22/03/07 08:52:03;50.777317;7.205877; 68.8;119.8;32.447;3.8;1808;4 12;22/03/07 08:52:06;50.777185;7.206202; 68.1;124.1;30.058;3.8;1808;4 13;22/03/07 08:52:09;50.777057;7.206522; 67.9;117.7;34.003;3.8;1808;4 14;22/03/07 08:52:12;50.776925;7.206858; 66.9;117.5;37.151;3.8;1808;4 15;22/03/07 08:52:15;50.776813;7.207263; 67.0;99.2;39.188;3.8;1808;4 16;22/03/07 08:52:18;50.776780;7.207745; 68.8;90.6;41.170;3.8;1808;4 17;22/03/07 08:52:21;50.776803;7.208262; 71.1;82.0;35.058;3.8;1808;4 18;22/03/07 08:52:24;50.776832;7.208682; 68.6;117.1;11.371;3.8;1808;4 …Location data producers: GSM, GPS, WiFi
) , , ( ),..., , , ( 1 1 1 i i ini ini ini i i x y t x y t TLocation data (id, x, y, t)
are generated
Moving
Object
Database
trajectory data
(obj-id, traj-id, (x, y, t)
*)
are reconstructed
Trajectory stream manager +
Trajectory reconstruction
)
,
,
(
),...,
,
,
(
1
1
1
i
i
i
n
i
i
n
i
i
n
i
i
i
x
y
t
x
y
t
T
G. Marketos: Analyzing Trajectory Data 9
Moving Objects Databases
The traditional database technology has been extended into
Moving
Object Databases
(MODs) that handle modeling, indexing and
query processing issues for trajectories
Spatial and temporal dimensions are considered as first-class
citizens.
Both past and current (as well as anticipated future) positions of
moving objects are of interest.
Several prototype MODs
DOMINO (Wolfson
et al.) EDBT‟02
PLACE (Mokbel
et al.) VLDB‟04
SECONDO (Güting et. al.) ICDE‟05
Hermes MOD engine
Built-in ORACLE ORDBMS
Data model: absolute vs. relative location coordinates
Current location as a function in time over the starting
location
linear and arc movement functions
MOD management
Insert / Update / Delete a moving object or a segment of its
trajectory
Functions over trajectories or sets of trajectories
Indexing support
Supported indices: R-tree (for stationary data)
G. Marketos: Analyzing Trajectory Data 11
G. Marketos: Analyzing Trajectory Data 14
Hermes: trajectory data type
Primitive definition:
Unit_Function =
dx
i:double, y
i:double, x
e:double, y
e:double, x
c:double, y
c:double, v:double,
a:double, flag:TypeOfFunction , where
TypeOfFunction={ CONST, PLNML_1, ARC_<1..8> }
Unit_Moving_Point =
dp: Period SEC , m: Unit_Function
Moving_Point =
d{ tab: set Unit_Moving_Point
| …
constraints
…}
yy' xx' tt' t1 t2 t3 t4 t ε [t1, t2) -> Linear movement t ε [t2, t3) -> Arc movement t5 t ε [t3, t4) -> Const movement t ε [t4, t5) -> Linear movement Y X o YY' XX' (xt,yt,t) (xe,ye,te) (xi,yi,ti) (xc,yc)
t3
t1 t7
t11
t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12
TB-Tree support in Hermes MOD engine
TB-Tree Index (Pfoser et al., 2000)
Maintains the „trajectory‟ concept
Each node consists of segments
of a single trajectory
Nodes are linked together in a chain
Effective for trajectory-oriented queries
Implemented in Hermes using
G. Marketos: Analyzing Trajectory Data 16
Trajectory Data Warehousing &
OLAP
Trajectory data warehousing
The
analysis
of
trajectory data
raises opportunities for discovering
behavioral patterns that can be exploited in various applications
TDW should
extract aggregate information from MOD
support a variety of dimensions (temporal, spatial, thematic, …) and
measures (about space, time and their derivatives)
Storing
measures
associated with facts, concerning the
set of trajs
crossing the cell
aggregate
information in base cells
Challenges
design of a trajectory-oriented data cube
high volume and complex nature of data; special query processing
requirements
extensions of traditional aggregation techniques to produce summary
G. Marketos: Analyzing Trajectory Data 18
Reconstructed trajectory data are stored in MOD
Trajectory data warehousing
A
Trajectory Reconstruction process
is applied on the raw
time-stamped location data in order to generate trajectories, which are then
stored into a MOD
An
Extract-Transform-Load
(
ETL)
procedure
is activated that feeds
the data cube(s) with aggregate information on trajectories
The final step of the process offers
OLAP
(and, eventually, DM)
capabilities over the aggregated information contained in the trajectory
cube model
trajectory data analyst location data producersLocation data (id, x, y, t) are recorded
Aggregates are loaded in the data cube (ETL procedure) Trajectory
Data Cube MOD
Trajectory reconstruction module Analysis over aggregate data is performed (OLAP)
Related work
Data Warehousing (DW) is widely investigated for conventional,
non-spatial data.
Some research work has been done in the area of Spatial Data
Warehousing (Han et al., 1998).
(Shekhar et al., 2001) extend the idea of cube dimensions so
as to include spatial and non-spatial ones, and of cube
measures so as to represent space regions and/or calculate
numerical data
One step beyond Spatial DW is modeling a TDW (Pelekis et al.,
2007). A simple TDW model is presented in (Orlando et al.,
2007).
The notion of multiple aggregation paths is defined in (Jensen et
al., 2004)
The integration of spatial and temporal dimensions is proposed in
G. Marketos: Analyzing Trajectory Data 20
Conventional DWs
In conventional DWs, measures about entities (e.g. shopping
baskets, customers etc.) refer to a specific base cell that are
formed by the dimensions
distinct objects appear only once at the minimum abstraction
level of the cube
Time
S
tor
e
count
Count of transactions
TV
VCR
PC
1/1
2/1 3/1
…
101
201
301
sum
A transaction cannot be
Trajectory DW
The challenge: In TDW, measures regarding trajectories may
affect more than one base cell
the same object might affect measures in several base cell due to
its movement (!!)
Time
L
ocat
io
n
count
Count of trajectories
50 < Age
Age 30
30 < Age 50
12.00 13.00 14.00
…
R1
R2
R3
sum
A trajectory can be found
in both base cells (the
distinct counting problem)
G. Marketos: Analyzing Trajectory Data 22
Moving Object Database
Trajectory Data Warehouse
Dimensions:
Spatial,
Temporal, Object Profile
Measures:
count
(trajectories), count (users),
avg (distance traveled), avg
(travel duration), avg (speed),
avg (abs (acceler) )
Sample schemas
OBJECT_PROFILE_DIM PK OBJPROFILE_ID GENDER BIRTHYEAR PROFESSION MARITAL_STATUS DEVICE_TYPE FACT_TBL PK,FK3 INTERVAL_ID PK,FK2 PARTITION_ID PK,FK1 OBJPROFILE_ID COUNT_TRAJECTORIES COUNT_USERS AVG_DISTANCE_TRAVELED AVG_TRAVEL_DURATION AVG_SPEED AVG_ABS_ACCELER SPACE_DIM PK PARTITION_ID PARTITION_GEOMETRY DISTRICT CITY STATE COUNTRY TIME_DIM PK INTERVAL_ID INTERVAL_START INTERVAL_END HOUR DAY MONTH QUARTER YEAR DAY_OF_WEEK RUSH_HOUROBJECTS(object-id: identifier, description: text, gender: {M | F}, birth-date: date, profession: text, device-type: text)
RAW_LOCATIONS(object-id: identifier, timestamp: datetime, eastings-x: numeric, northings-y: numeric, altitude-z: numeric)
MOD_TRAJECTORIES (trajectory-id: identifier, object-id:
Our framework for trajectory analysis
We focus our research on three issues that are critical to
trajectory data warehousing:
The preprocessing phase that deals with the explicit
reconstruction
of the trajectories, which are then stored into a
MOD
Alternative ETL processes
that feed the trajectory data
warehouse
The solutions proposed on the challenging issue of
measure
G. Marketos: Analyzing Trajectory Data 24 y
t
x
Reconstructing trajectories
Collected raw data represent time-stamped geographical
locations
Apart from storing raw data in the MOD, we are also
interested in reconstructing trajectories:
Raw points arrive in bulk sets
We need a filter that decides if the new series of data is to be
appended to an existing trajectory or not:
Tolerance distance
Temporal gap
Spatial gap
Maximum speed
Maximum noise duration
t
y
Reconstructing trajectories: parameters
Tolerance distance
The tolerance of the transmitted time-stamped positions. In other
words, it is the
maximum distance between two consecutive
time-stamped positions
of the same object in order for the
object to be
considered
as
stationary
t
y y
t
G. Marketos: Analyzing Trajectory Data 26
Reconstructing trajectories: parameters
Tolerance distance
Temporal gap between trajectories
The
maximum allowed time interval
between two consecutive
time-stamped positions of the same trajectory for a single moving
object
t y y t x xtemporal gap
Reconstructing trajectories: parameters
Tolerance distance
Temporal gap between trajectories
Spatial gap between trajectories
The
maximum allowed distance
in 2D plane between two
consecutive time-stamped positions of the same trajectory
t
y y
t
x x
G. Marketos: Analyzing Trajectory Data 28
Reconstructing trajectories: parameters
Tolerance distance
Temporal gap between trajectories
Spatial gap between trajectories
Maximum speed
It is used in order to determine whether a reported time-stamped
position must be considered as
noise
and consequently
discarded from the output trajectory
t
y y
t
Reconstructing trajectories: parameters
Tolerance distance
Temporal gap between trajectories
Spatial gap between trajectories
Maximum speed
Maximum noise duration
The
maximum duration of a noisy part
of a trajectory. Any
sequence of noisy time-stamped positions of the same object will
result in a new trajectory given that its duration exceeds
noise
max
t
y y
t
G. Marketos: Analyzing Trajectory Data 30
Reconstructing trajectories: algorithm
Algorithm Trajectory-Reconstruction
(PartialTrajectories List, P Point, OId ObjectId) 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 32.
IF NOT PartialTrajectories.Contains(OId) THEN CTrajectory=New Trajectory;
CTrajectory.AddPoint(P);
PartialTrajectories.Add(CTrajectory);
ELSE
CTrajectory=PartialTrajectories(OId);
IF Distance(CTrajectory.LastPoint,P)<= DTOL THEN
IF P.T – CTrajectory.LastPoint.T > gapTime THEN
Report Ctrajectory.LastPoint;
CTrajectory.Id=CTrajectory.Id+1; CTrajectory.AddPoint(P);
ENDIF
ELSEIF Speed(CTrajectory.LastPoint,P)> Vmax THEN
IF P.T – CTrajectory.LastPoint.T > noisemax THEN
Report CTrajectory.Noise; ELSE
CTrajectory.AddNoise(P); ENDIF
ELSEIF Distance(CTrajectory.LastPoint,P)> gapspace THEN
Report Ctrajectory.LastPoint; CTrajectory.Id=CTrajectory.Id+1; CTrajectory.AddPoint(P);
ELSE
IF P.T – CTrajectory.LastPoint.T > gapTime THEN
Report Ctrajectory.LastPoint; CTrajectory.Id=CTrajectory.Id+1; CTrajectory.AddPoint(P); ELSE CTrajectory.AddPoint(P); ENDIF ENDIF ENDIF
As a first step (lines 1-6),
the algorithm checks
whether the object has been
processed
if so, it retrieves its partial
trajectory from the
corresponding list
otherwise, it creates a new
trajectory and adds it to list
Then (lines 7-31), it
compares the incoming
point P with the tail of the
partial trajectory (LastPoint)
by applying the trajectory
reconstruction parameters
Our framework for trajectory analysis
We focus our research on three issues that are critical to
trajectory data warehousing:
The preprocessing phase that deals with the explicit
reconstruction
of
the trajectories, which are then stored into a MOD
Alternative ETL processes
that feed the trajectory
data warehouse
The solutions proposed on the challenging issue of
measure
G. Marketos: Analyzing Trajectory Data 32
ETL processing: loading
Loading data into the dimension tables
straightforward
Loading data into the fact table
complex
Fill in the measures with the appropriate numeric values
In order to calculate the measures, we have to extract the
portions of the trajectories that fit into the base cells of the cube
We propose two alternative solutions to this problem:
cell-oriented
trajectory-oriented
y
ETL processing: measures
Measure
Formula
COUNT_
TRAJECTORIES
count all distinct trajectory ids that pass through
base cell (bc)
COUNT_USERS
count all the distinct object ids that pass through
bc
AVG_DISTANCE_
TRAVELED
AVG_TRAVEL_
DURATION
AVG_SPEED
AVG_ABS_
ACCELER
) ( _ ) ( _ ) ( _ _ bc ES TRAJECTORI COUNT bc DISTANCE SUM bc TRAVELED DISTANCE AVG bc TP i iTP
len
bc
DISTANCE
SUM
_
(
)
(
)
) ( _ ) ( _ ) ( _ _ bc ES TRAJECTORI COUNT bc DURATION SUM bc DURATION TRAVEL AVG bc TP i i TP lifespan bc DURATION SUM_ ( ) ( ) ) ( _ ) ( _ ) ( _ bc ES TRAJECTORI COUNT bc SPEED SUM bc SPEED AVG bc TP i i i lifespanTP TP len bc SPEED SUM ) ( ) ( ) ( _ ) ( _ ) ( _ _ ) ( _ _ bc ES TRAJECTORI COUNT bc ACCELER ABS SUM bc ACCELER ABS AVG TP speed TP speed ( ) ( )G. Marketos: Analyzing Trajectory Data 34
ETL processing: algorithms
Cell-oriented approach
(COA)
Search for the portions of trajectories
that they reside inside a
spatiotemporal cell
Perform a
spatiotemporal
range query
that returns the
portions of trajectories that
satisfy the range constraints
This is efficiently supported by
the
TB-tree
Decompose the trajectory portions
with respect to the user profiles they
belong to
Compute measures for this cell
Repeat for the next cells
x
y
COUNT
_
TRAJECTORIES
= 2
COUNT
_
USERS
= 2
COUNT
_
TRAJECTORIES
= 1
COUNT
_
USERS
= 1
…
ETL processing: algorithms
Trajectory-oriented approach
(TOA)
Discover the spatiotemporal cells
where each trajectory resides in
In order to avoid checking all
cells, use the trajectory MBR
Identify the cells that overlap with the
MBR and contain portions of the
trajectory
Compute measures for each cell
…
Repeat for the next trajectories
…
x
y
COUNT
_
TRAJECTORIES
= 2
COUNT
_
USERS
= 2
…
G. Marketos: Analyzing Trajectory Data 36
Our framework for trajectory analysis
We focus our research on three issues that are critical to
trajectory data warehousing:
The preprocessing phase that deals with the explicit
reconstruction
of
the trajectories, which are then stored into a MOD
Alternative ETL processes
that feed the trajectory data warehouse
The solutions proposed on the challenging issue of
The distinct count problem: definition
During the ETL process, measures can be computed in an
accurate way by executing MOD queries
Once the fact table has been fed, aggregate-only information is
stored inside the TDW (no trajectory / user ids)
When rolling up,
COUNT_USERS, COUNT_TRAJECTORIES
and,
hence, all other measures defined over
COUNT_TRAJECTORIES
are subject to the distinct count problem (Tao et al., 2004):
if an object remains in the query region
for several timestamps during the query
interval, instead of counting this object
once, it is counted multiple times in the
result
y
G. Marketos: Analyzing Trajectory Data
38
38The distinct count problem: solution
(1/3)
We store in the base cells (
C
(x,y),t,p
) a tuple of auxiliary
measures that help us correct the errors due to the duplicates
when rolling-up:
C
(x,y),t,p
.
Traj
: number of distinct trajectories of profile
p
intersecting the cell
C
(x,y),t,p
.
cross-x
: number of distinct trajectories of profile
p
crossing the
spatial
border between
C
(x-1,y),t,p
and
C
(x,y),t,p
C
(x,y),t,p
.
cross-y
: number of distinct trajectories of profile
p
crossing the
spatial
border between
C
(x,y-1),t,p
and
C
(x,y),t,p
C
(x,y),t,p
.
cross-t
: number of distinct trajectories of profile p crossing
the
temporal
border between
C
(x,y),t-1,p
and
C
(x,y),t,p
X
Y
T
The distinct count problem: solution
(2/3)
Let
C
(x’,y’),t’,p’
be a cell consisting of the union of two adjacent
cells (i.e.
C
(x,y),t.p
C
(x+1,y),t,p
)
In order to compute the
number of distinct trajectories
:
C
(x’,y’),t’,p’
.
Traj = C
(x,y),t,p
.
Traj + C
(x+1,y),t,p
.
Traj – C
(x+1,y),t,p
.
cross-x
application of the well-known Inclusion/Exclusion principle for
sets: A B = A + B
A B
BUT
in some cases it holds that
C
(x+1,y),t,p
.
cross-x
A B
G. Marketos: Analyzing Trajectory Data 40
The distinct count problem: solution
(3/3)
C
x,y,t,pC
x+1,y,t,pC
x,y+1,t,pC
x+1,y+1,t,p(a)
C
x,y,t,pC
x+1,y,t,pC
x,y+1,t,pC
x+1,y+1,t,p(b)
Correct!
Not Correct!
C
x,y,t,p.Traj
=
1
C
x+1,y,t,p.Traj
=
1
C
x+1,y,t,p.cross-x
=
1
C
x+1,y,t,p.cross-x
=
0
Compute the number of distinct trajectories:
Conclusions (for this part)
We explored solutions for the efficient and effective
development of trajectory warehouses
Related work does not include research work on the complete
flow of processes in a TDW
In particular, we discussed about techniques for:
the solution of the
trajectory reconstruction
problem
efficient
ETL support
for trajectory data
handling measure aggregation issues, with a special attention to
G. Marketos: Analyzing Trajectory Data 42
Future work
Extend the proposed framework by applying OLAP analysis
and DM techniques over the aggregated data
Examine new measures for TDW, specifically suited for
trajectories. Two examples:
A
typical trajectory
describing the trend of movement within a cell
G. Marketos: Analyzing Trajectory Data 44
The ‘traffic’ problem and challenges
Traffic in cities is a well-known problem:
the majority of people living in large cities and use cars for their
transportation
the number of cars moving in a city increases from year to year
Recently, the technology achievements allow us to collect
huge amount of traffic related data:
mobile phones, traffic cameras, sensors, GPS devices …
How can we exploit this huge volume of data so as to gain
insights on the traffic problem?
The focus of our work is to detect how the different road segments of the
Related work
(Liu et al, 2006), proposed a distributed traffic stream mining
system
they focus on description of the distributed traffic stream system,
rather on the discovery of traffic related patterns
(Li et al., 2007) proposed a technique for the discovery of hot
routes in a road network
They introduced a density-based algorithm, called FlowScan
The algorithm requires the trajectories of the objects that move
G. Marketos: Analyzing Trajectory Data 46
How can the road segments be related? - 1
How can the road segments be related? - 2
Syngrou Ave.
Poseidonos Ave.
G. Marketos: Analyzing Trajectory Data 48
How can the road segments be related? - 3
Mesogeion Ave.
Kifissias Ave.
Network Traffic
We consider a fixed network consisting of a set of
non-overlapping regions
regions = road intersections or landmarks of interest
R
4
R
3
G. Marketos: Analyzing Trajectory Data 50
Network Graph
The network is modeled as a directed graph G=(V,E)
nodes V
regions
edges E
direct connections between regions
R
3R
1R
7R
6R
11R
10R
9R
15R
14R
13R
4R
8R
12R
16R
2R
5Capturing traffic through sensors
Each edge e=(v, v‟) is equipped with sensor technology that
captures the movement from region v to region v‟.
Definition: The
traffic series of a sensor
s
S during a time
period [t
s
, t
e
] consists of the number of cars passed through
this sensor during this period, recorded at Δt intervals and
ordered in time:
TS
s= {v
i, t
i}, t
s≤ t
i≤ t
e, Δt=t
i-t
i-1the transmission rate
of the sensor
v’
v
time
150
100
80
…
90
t
st
s+Δ
t
et
t
s+2
Δt
…
G. Marketos: Analyzing Trajectory Data 52
Network Traffic
R
3R
1R
7R
6R
11R
10R
9R
15R
14R
13R
4R
8R
12R
16R
2R
5Similarity between edges: value distance
Let e
1
, e
2
be two network edges and let
TS
1
={(v
1i
,t
i
)}, TS
2
={(v
2i
,t
i
)}
be their corresponding traffic series, t
i
[t
s
, t
e
]
Definition: The
value-based distance
between e
1
, e
2
during
periods p and p‟ is given by the absolute distance of their
corresponding traffic series TS
1
, TS
2
at these periods:
)
,
(
)
,
(
1
p
2
p
'
value
1
p
2
p
'
value
e
e
dis
TS
TS
dis
G. Marketos: Analyzing Trajectory Data 54
Similarity between edges: shape distance
Value-based distance is not appropriate for shape similarity
task since it is sensible to:
different baselines
e.g. one series fluctuating around 100 and the other series fluctuating
around 30
different scales
e.g. one series having a small amplitude between 90 and 100 and the other
series having a larger amplitude between 20 and 40
To deal with this problem, different techniques can be
applied for comparing time series, e.g.:
Dynamic Time Warping that allows us to compare time series
either on the same or adjacent time periods
Correlation-Coefficient measure
Traffic propagation
traffic from e
12propagates
to e
23
This might indicate objects that continue moving
in a highway
Condition:
Traffic split/ spread
traffic from e
12splits into e
23and e
26
This might indicate objects that leave a highway
and follow different directions to their destination
Conditions:
Traffic merge
traffic to e
23merges traffic from e
12and e
62 This might indicate objects that enter a highway
from different directions
Conditions:
Traffic relationships
R
3R
1R
7R
6R
2R
5R
4R
8e
23e
12R
3R
1R
7R
6R
2R
5R
4R
8e
23e
12e
26R
3R
1R
7R
6R
2R
5R
4R
8e
23e
12e
62 value p e p e valueTS
TS
dis
(
,
')
23 12 shape p e e p e shapeTS
TS
dis
(
,
',)
62 23 12 value p e p e p e valueTS
TS
TS
dis
(
,
(
',
'))
26 23 12 shape p e e p e shapeTS
TS
dis
(
,
',)
62 12 23 p p pTS
TS
TS
dis
((
',
'),
,
)
G. Marketos: Analyzing Trajectory Data 56
Traffic relationships
Traffic sink
traffic in e
12is sank because there is
no propagation or split relationship
with its outgoing edges
Traffic source
traffic starts (source) from e
23because there is no propagation or
merge relationship with its incoming
edges
R
3R
1R
7R
6R
2R
5R
4R
8e
12R
3R
1R
7R
6R
2R
5R
4R
8e
23e
34Conclusions (for this part)
We consider the problem of mining traffic flow in a road
network monitored through sensors placed at regions of
interest (crossroads, etc.).
We employ edge similarity measures based on time series
comparison
G. Marketos: Analyzing Trajectory Data 58
Thank you!
Questions?
Acknowledgement:
Research partially supported by EU
under the GeoPKDD (Geographic Privacy-aware
Knowledge Discovery and Delivery)
References
Han, J., Stefanovic, N., and Koperski, K. Selective Materialization: An Efficient Method for Spatial Data Cube Construction. Proc. PAKDD, 1998.
Jensen, C.S., Kligys, A., Pedersen, T.B., Dyreson, C.E., and Timko, I. Multidimensional data modeling for location-based services, The VLDB Journal, 13:1–21, 2004.
Li, X., Han, J., Lee, J.-G. and Gonzalez, H.: “Traffic density-based discovery of hot routes in road networks.” in
Proc. 10th International Symposium on Spatial and Temporal Databases., 2007.
Liu, Y., Choudhary, A. N., Zhou, J., and Khokhar, A. A.: “A scalable distributed stream mining system for highway
traffic data.” in Proc. 10th European Conference on Principles and Practice of Knowledge Discovery in Databases, 2006, pp. 309–321.
Orlando, S., Orsini, R., Raffaetà, A., Roncato, A., and Silvestri, C. Spatio-Temporal Aggregations in Trajectory Data
Warehouses. Proc. DaWaK, 2007.
Pelekis, N., Frentzos, E., Giatrakos, N., Theodoridis, Y.: HERMES: aggregative LBS via a trajectory DB engine. SIGMOD Conference 2008: 1255-1258
Pelekis, N., Raffaetà, A., Damiani, M.-L., Vangenot, C., Marketos, G., Frentzos, E., Ntoutsi, I., and Theodoridis, Y.
Towards Trajectory Data Warehouses. Chapter in Mobility, Data Mining and Privacy: Geographic Knowledge Discovery. Springer-Verlag. 2008.
Pfoser, D., Jensen, C.S., and Theodoridis, Y. Novel Approaches to the Indexing of Moving Object Trajectories, Proc. VLDB, 2000.
Shekhar, S.., Lu, C., Tan, X., Chawla, S., Vatsavai, R. Map Cube: a Visualization Tool for Spatial Data Warehouses,
Chapter in Geographic Data Mining and Knowledge Discovery. Taylor and Francis, 2001.
Tao, T., and Papadias, D. Historical Spatio-Temporal Aggregation. ACM TODS, 23(1):61-102, 2005.
Tao, Y., Kollios, G., Considine, J., Li, F., and Papadias, D. Spatio-Temporal Aggregation Using Sketches. Proc. ICDE, 2004.
G. Marketos: Analyzing Trajectory Data 60
Selected literature on …
Mobility Data Modeling & MOD engines
Nikos Pelekis, Elias Frentzos, Nikos Giatrakos, Yannis Theodoridis: HERMES: aggregative LBS via a
trajectory DB engine. SIGMOD Conference 2008: 1255-1258
Nikos Pelekis, Yannis Theodoridis, Spyros Vosinakis, Themis Panayiotopoulos: Hermes - A Framework for
Location-Based Data Management. EDBT 2006: 1130-1134
Güting, R.H. et al. (2000) A Foundation for Representing and Querying Moving Objects. ACM Transactions
on Database Systems, 25(1):1-42.
Karimi, H. and X. Liu (2003) A Predictive Location Model for Location-Based Services, Proceedings of
ACM-GIS.
Pelekis, N. et al. (2005) An Oracle Data Cartridge for Moving Objects. UNIPI-ISL-TR-2005-01, Univ. of
Piraeus. Retrieved from http://infolab.cs.unipi.gr/
Schlieder, C. et al. (2001) Location Modeling for Intentional Behavior in Spatial Partonomies. Proceedings of
Location Modeling for Ubiquitous Computing Workshop.
Sistla, P. et al. (1997) Modeling and Querying Moving Objects. Proceedings of IEEE ICDE Conference. Wolfson, O. et al. (1998) Moving Objects Databases: Issues and Solutions. Proceedings of SSDBM
Selected literature on …
Mobility Data Warehousing
Han, J., Stefanovic, N., and Koperski, K. Selective Materialization: An Efficient Method for Spatial Data Cube
Construction. Proc. PAKDD, 1998.
Jensen, C.S., Kligys, A., Pedersen, T.B., Dyreson, C.E., and Timko, I. Multidimensional data modeling for
location-based services, The VLDB Journal, 13:1–21, 2004.
Leonardi, L., Orlando, S., Raffaetà, A., Roncato, A. and Silvestri, C. Frequent Spatio-Temporal Patterns in
Trajectory Data Warehouses. Proceedings of 24th Annual ACM Symposium on Applied Computing, 2009
Marketos, G., Frentzos, E., Ntoutsi, I., Pelekis, N., Raffaetà, A., and Theodoridis, Y. Building Real World
Trajectory Warehouses. Proc. MobiDE‟08, Vancouver, Canada
Orlando, S., Orsini, R., Raffaetà, A., Roncato, A., and Silvestri, C. Spatio-Temporal Aggregations in
Trajectory Data Warehouses. Proc. DaWaK, 2007.
Pelekis, N., Raffaetà, A., Damiani, M.-L., Vangenot, C., Marketos, G., Frentzos, E., Ntoutsi, I., and
Theodoridis, Y. Towards Trajectory Data Warehouses. Chapter in Mobility, Data Mining and Privacy: Geographic Knowledge Discovery. Springer-Verlag. 2008.
Pfoser, D., Jensen, C.S., and Theodoridis, Y. Novel Approaches to the Indexing of Moving Object
Trajectories, Proc. VLDB, 2000.
Shekhar, S., Lu, C., Tan, X., Chawla, S., Vatsavai, R. Map Cube: a Visualization Tool for Spatial Data
Warehouses, Chapter in Geographic Data Mining and Knowledge Discovery. Taylor and Francis, 2001.
Tao, Y., Kollios, G., Considine, J., Li, F., and Papadias, D. Spatio-Temporal Aggregation Using Sketches.
G. Marketos: Analyzing Trajectory Data 62
Selected literature on …
Mining Traffic Patterns
Horvitz, E., Apacible, J., Sarin, R., and Liao, L.: “Prediction, expectation, and surprise: Methods, designs,
and study of a deployed traffic forecasting service,” in In Twenty-First Conference on Uncertainty in Artificial Intelligence, 2005.
Li, X., Han, J., Lee, J.-G. and Gonzalez, H.: “Traffic density-based discovery of hot routes in road
networks.” in Proc. 10th International Symposium on Spatial and Temporal Databases., 2007.
Liu, Y., Choudhary, A. N., Zhou, J., and Khokhar, A. A.: “A scalable distributed stream mining system for
highway traffic data.” in Proc. 10th European Conference on Principles and Practice of Knowledge Discovery in Databases, 2006, pp. 309–321.
Nakata T., and Takeuchi, J.: “Mining traffic data from probe-car system for travel time prediction,” in Proc.
10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004, pp. 817–822.
Ntoutsi, I., Mitsou, N., Marketos, G.: Traffic mining in a road-network: How does the traffic flow? IJBIDM
3(1): 82-98 (2008)
Shekhar, S., Lu, C.-T., Chawla, S., and Zhang, P.: “Data mining and visualization of twin-cities traffic data,”