Analyzing Trajectory Data


Gerasimos Marketos

Information Systems Lab (InfoLab)

University of Piraeus (UniPi), Greece

http://infolab.cs.unipi.gr

Free University of Bolzano

20/3/2009

(2)

Preliminaries

Trajectory Data Warehousing

Our framework for trajectory analysis

Reconstructing trajectories

ETL processing

Addressing the distinct count problem

Traffic Mining

Modeling traffic flow in a road network

Our framework for mining traffic patterns

Detecting traffic flow in a road network

(3)

G. Marketos: Analyzing Trajectory Data 3

(4)

The big picture (the GeoPKDD project)

[Figure: the GeoPKDD process. Labels: Telecommunication provider; trajectory reconstruction; Trajectory warehouse; Privacy-aware Data mining; ST patterns (p(x)=0.02); interpretation, visualization; GeoKnowledge; Public administration or business companies. Applications: Traffic Management, Accessibility of services, Mobility evolution, Urban planning, Bandwidth/Power optimization, Mobile cells planning, Aggregative Location-based services, Privacy enforcement]

(5)


A tour on GeoPKDD

Mobility data management

Acquiring and storing trajectories in MODs

Location-aware querying

Trajectory indexing

Mobility data mining and the geographic KDD process

Trajectory warehousing and OLAP

Mobility data mining and reasoning

Visual analytics for mobility data

Privacy aspects on mobility data

(6)

Key questions (for this talk)

How to store and query trajectory data?

  Database technology is extended: Moving Object Databases

How to reconstruct a trajectory from raw logs?

  Positioning devices only provide location points, not trajectories

How to analyze trajectory data?

  Data Warehousing technology is adapted to handle trajectory data

Are there any spatio-temporal patterns in my data?

  New Data Mining algorithms are needed to discover such patterns

(7)


Mobility Data

Typical structure and size:

N;Time;Lat;Lon;Height;Course;Speed;PDOP;State;NSat
…
8;22/03/07 08:51:52;50.777132;7.205580; 67.6;345.4;21.817;3.8;1808;4
9;22/03/07 08:51:56;50.777352;7.205435; 68.4;35.6;14.223;3.8;1808;4
10;22/03/07 08:51:59;50.777415;7.205543; 68.3;112.7;25.298;3.8;1808;4
11;22/03/07 08:52:03;50.777317;7.205877; 68.8;119.8;32.447;3.8;1808;4
12;22/03/07 08:52:06;50.777185;7.206202; 68.1;124.1;30.058;3.8;1808;4
13;22/03/07 08:52:09;50.777057;7.206522; 67.9;117.7;34.003;3.8;1808;4
14;22/03/07 08:52:12;50.776925;7.206858; 66.9;117.5;37.151;3.8;1808;4
15;22/03/07 08:52:15;50.776813;7.207263; 67.0;99.2;39.188;3.8;1808;4
16;22/03/07 08:52:18;50.776780;7.207745; 68.8;90.6;41.170;3.8;1808;4
17;22/03/07 08:52:21;50.776803;7.208262; 71.1;82.0;35.058;3.8;1808;4
18;22/03/07 08:52:24;50.776832;7.208682; 68.6;117.1;11.371;3.8;1808;4
…
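As an illustration of how such a log line could be read, here is a minimal Python sketch. The `Fix` class and `parse_fix` function are hypothetical names introduced for this example, and the date format is assumed to be day/month/two-digit year (so `22/03/07` is read as 22 March 2007); the slides do not state the units of the numeric fields, so they are kept as plain numbers.

```python
from dataclasses import dataclass
from datetime import datetime

# Field order taken from the header line of the sample.
FIELDS = "N;Time;Lat;Lon;Height;Course;Speed;PDOP;State;NSat".split(";")


@dataclass
class Fix:
    n: int
    time: datetime
    lat: float
    lon: float
    height: float
    course: float
    speed: float
    pdop: float
    state: int
    nsat: int


def parse_fix(line: str) -> Fix:
    """Parse one semicolon-separated record (assumed dd/mm/yy timestamps)."""
    parts = [p.strip() for p in line.split(";")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    n, time, lat, lon, height, course, speed, pdop, state, nsat = parts
    return Fix(
        int(n),
        datetime.strptime(time, "%d/%m/%y %H:%M:%S"),
        float(lat), float(lon), float(height), float(course),
        float(speed), float(pdop), int(state), int(nsat),
    )


fix = parse_fix("8;22/03/07 08:51:52;50.777132;7.205580; 67.6;345.4;21.817;3.8;1808;4")
```

Only the (Lat, Lon, Time) part of each record is needed for the (id, x, y, t) tuples used in the rest of the talk.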

(8)

Location data producers: GSM, GPS, WiFi

Location data (id, x, y, t) are generated

Trajectory stream manager + Trajectory reconstruction

trajectory data (obj-id, traj-id, (x, y, t)*) are reconstructed

Moving Object Database

$T_i = (x_{i_1}, y_{i_1}, t_{i_1}), \ldots, (x_{i_{n_i}}, y_{i_{n_i}}, t_{i_{n_i}})$

(9)


Moving Objects Databases

The traditional database technology has been extended into Moving Object Databases (MODs) that handle modeling, indexing and query processing issues for trajectories

  Spatial and temporal dimensions are considered as first-class citizens.

  Both past and current (as well as anticipated future) positions of moving objects are of interest.

Several prototype MODs

  DOMINO (Wolfson et al.) EDBT'02

  PLACE (Mokbel et al.) VLDB'04

  SECONDO (Güting et al.) ICDE'05

(10)

Hermes MOD engine

Built-in ORACLE ORDBMS

Data model: absolute vs. relative location coordinates

  Current location as a function of time over the starting location

  linear and arc movement functions

MOD management

  Insert / Update / Delete a moving object or a segment of its trajectory

  Functions over trajectories or sets of trajectories

Indexing support

  Supported indices: R-tree (for stationary data)

(11)


(12)
(13)


Hermes: trajectory data type

Primitive definition:

  Unit_Function =_d (x_i: double, y_i: double, x_e: double, y_e: double, x_c: double, y_c: double, v: double, a: double, flag: TypeOfFunction), where TypeOfFunction = { CONST, PLNML_1, ARC_<1..8> }

  Unit_Moving_Point =_d (p: Period SEC, m: Unit_Function)

  Moving_Point =_d { tab: set Unit_Moving_Point | … constraints … }

[Figure: a moving point in (X, Y, t) space. t ∈ [t1, t2) -> Linear movement; t ∈ [t2, t3) -> Arc movement; t ∈ [t3, t4) -> Const movement; t ∈ [t4, t5) -> Linear movement. Marked points: (xi, yi, ti), (xe, ye, te), (xt, yt, t) and (xc, yc)]

(14)

TB-Tree support in Hermes MOD engine

[Figure: TB-tree nodes over the timestamps t1 to t12 of a trajectory]

TB-Tree Index (Pfoser et al., 2000)

  Maintains the 'trajectory' concept

  Each node consists of segments of a single trajectory

  Nodes are linked together in a chain

  Effective for trajectory-oriented queries

Implemented in Hermes using

(15)

Trajectory Data Warehousing & OLAP

(16)

Trajectory data warehousing

The analysis of trajectory data raises opportunities for discovering behavioral patterns that can be exploited in various applications

TDW should

  extract aggregate information from MOD

  support a variety of dimensions (temporal, spatial, thematic, …) and measures (about space, time and their derivatives)

Storing

  aggregate information in base cells

  measures associated with facts, concerning the set of trajectories crossing the cell

Challenges

  design of a trajectory-oriented data cube

  high volume and complex nature of data; special query processing requirements

  extensions of traditional aggregation techniques to produce summary

(17)

Trajectory data warehousing

A Trajectory Reconstruction process is applied on the raw time-stamped location data in order to generate trajectories, which are then stored into a MOD

An Extract-Transform-Load (ETL) procedure is activated that feeds the data cube(s) with aggregate information on trajectories

The final step of the process offers OLAP (and, eventually, DM) capabilities over the aggregated information contained in the trajectory cube model

[Figure: architecture. Location data producers: location data (id, x, y, t) are recorded. Trajectory reconstruction module: reconstructed trajectory data are stored in MOD. Aggregates are loaded in the Trajectory Data Cube (ETL procedure). Trajectory data analyst: analysis over aggregate data is performed (OLAP)]

(18)

Related work

Data Warehousing (DW) is widely investigated for conventional, non-spatial data.

Some research work has been done in the area of Spatial Data Warehousing (Han et al., 1998).

  (Shekhar et al., 2001) extend the idea of cube dimensions so as to include spatial and non-spatial ones, and of cube measures so as to represent space regions and/or calculate numerical data

One step beyond Spatial DW is modeling a TDW (Pelekis et al., 2007). A simple TDW model is presented in (Orlando et al., 2007).

The notion of multiple aggregation paths is defined in (Jensen et al., 2004)

The integration of spatial and temporal dimensions is proposed in

(19)

Conventional DWs

In conventional DWs, measures about entities (e.g. shopping baskets, customers etc.) refer to a specific base cell formed by the dimensions

  distinct objects appear only once at the minimum abstraction level of the cube

[Figure: data cube with axes Time (1/1, 2/1, 3/1), Store (101, 201, 301) and TV, VCR, PC, with sum margins; measure: count of transactions]

A transaction cannot be

(20)

Trajectory DW

The challenge: In TDW, measures regarding trajectories may affect more than one base cell

  the same object might affect measures in several base cells due to its movement (!!)

[Figure: data cube with axes Time (12.00, 13.00, 14.00), Location (R1, R2, R3) and age group (Age ≤ 30, 30 < Age ≤ 50, 50 < Age), with sum margins; measure: count of trajectories]

A trajectory can be found in both base cells (the distinct counting problem)

(21)

Sample schemas

Trajectory Data Warehouse

  Dimensions: Spatial, Temporal, Object Profile

  Measures: count (trajectories), count (users), avg (distance traveled), avg (travel duration), avg (speed), avg (abs (acceler))

  FACT_TBL (PK,FK3 INTERVAL_ID; PK,FK2 PARTITION_ID; PK,FK1 OBJPROFILE_ID; COUNT_TRAJECTORIES; COUNT_USERS; AVG_DISTANCE_TRAVELED; AVG_TRAVEL_DURATION; AVG_SPEED; AVG_ABS_ACCELER)

  OBJECT_PROFILE_DIM (PK OBJPROFILE_ID; GENDER; BIRTHYEAR; PROFESSION; MARITAL_STATUS; DEVICE_TYPE)

  SPACE_DIM (PK PARTITION_ID; PARTITION_GEOMETRY; DISTRICT; CITY; STATE; COUNTRY)

  TIME_DIM (PK INTERVAL_ID; INTERVAL_START; INTERVAL_END; HOUR; DAY; MONTH; QUARTER; YEAR; DAY_OF_WEEK; RUSH_HOUR)

Moving Object Database

OBJECTS(object-id: identifier, description: text, gender: {M | F}, birth-date: date, profession: text, device-type: text)

RAW_LOCATIONS(object-id: identifier, timestamp: datetime, eastings-x: numeric, northings-y: numeric, altitude-z: numeric)

MOD_TRAJECTORIES (trajectory-id: identifier, object-id:

(22)

Our framework for trajectory analysis

We focus our research on three issues that are critical to trajectory data warehousing:

  The preprocessing phase that deals with the explicit reconstruction of the trajectories, which are then stored into a MOD

  Alternative ETL processes that feed the trajectory data warehouse

  The solutions proposed on the challenging issue of measure

(23)

Reconstructing trajectories

Collected raw data represent time-stamped geographical locations

Apart from storing raw data in the MOD, we are also interested in reconstructing trajectories:

  Raw points arrive in bulk sets

  We need a filter that decides if the new series of data is to be appended to an existing trajectory or not:

    Tolerance distance

    Temporal gap

    Spatial gap

    Maximum speed

    Maximum noise duration

(24)

Reconstructing trajectories: parameters

Tolerance distance

  The tolerance of the transmitted time-stamped positions. In other words, it is the maximum distance between two consecutive time-stamped positions of the same object in order for the object to be considered as stationary

(25)

Reconstructing trajectories: parameters

Tolerance distance

Temporal gap between trajectories

  The maximum allowed time interval between two consecutive time-stamped positions of the same trajectory for a single moving object

(26)

Reconstructing trajectories: parameters

Tolerance distance

Temporal gap between trajectories

Spatial gap between trajectories

  The maximum allowed distance in 2D plane between two consecutive time-stamped positions of the same trajectory

(27)

Reconstructing trajectories: parameters

Tolerance distance

Temporal gap between trajectories

Spatial gap between trajectories

Maximum speed

  It is used in order to determine whether a reported time-stamped position must be considered as noise and consequently discarded from the output trajectory

(28)

Reconstructing trajectories: parameters

Tolerance distance

Temporal gap between trajectories

Spatial gap between trajectories

Maximum speed

Maximum noise duration

  The maximum duration of a noisy part of a trajectory. Any sequence of noisy time-stamped positions of the same object will result in a new trajectory given that its duration exceeds noise_max

(29)

Reconstructing trajectories: algorithm

Algorithm Trajectory-Reconstruction (PartialTrajectories List, P Point, OId ObjectId)

 1. IF NOT PartialTrajectories.Contains(OId) THEN
 2.   CTrajectory = New Trajectory;
 3.   CTrajectory.AddPoint(P);
 4.   PartialTrajectories.Add(CTrajectory);
 5. ELSE
 6.   CTrajectory = PartialTrajectories(OId);
 7.   IF Distance(CTrajectory.LastPoint, P) <= DTOL THEN
 8.     IF P.T - CTrajectory.LastPoint.T > gapTime THEN
 9.       Report CTrajectory.LastPoint;
10.       CTrajectory.Id = CTrajectory.Id + 1;
11.       CTrajectory.AddPoint(P);
12.     ENDIF
13.   ELSEIF Speed(CTrajectory.LastPoint, P) > Vmax THEN
14.     IF P.T - CTrajectory.LastPoint.T > noisemax THEN
15.       Report CTrajectory.Noise;
16.     ELSE
17.       CTrajectory.AddNoise(P);
18.     ENDIF
19.   ELSEIF Distance(CTrajectory.LastPoint, P) > gapspace THEN
20.     Report CTrajectory.LastPoint;
21.     CTrajectory.Id = CTrajectory.Id + 1;
22.     CTrajectory.AddPoint(P);
23.   ELSE
24.     IF P.T - CTrajectory.LastPoint.T > gapTime THEN
25.       Report CTrajectory.LastPoint;
26.       CTrajectory.Id = CTrajectory.Id + 1;
27.       CTrajectory.AddPoint(P);
28.     ELSE
29.       CTrajectory.AddPoint(P);
30.     ENDIF
31.   ENDIF
32. ENDIF

As a first step (lines 1-6), the algorithm checks whether the object has been processed

  if so, it retrieves its partial trajectory from the corresponding list

  otherwise, it creates a new trajectory and adds it to the list

Then (lines 7-31), it compares the incoming point P with the tail of the partial trajectory (LastPoint) by applying the trajectory reconstruction parameters
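The filter above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the Hermes implementation: the class and attribute names are hypothetical, "Report" is read as "emit the current trajectory and start a new one", a noisy sequence that outlasts `noise_max` is promoted to a new trajectory, and pending noise is forgotten once a valid point arrives. Coordinates are assumed planar and times are in seconds.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Point:
    x: float
    y: float
    t: float  # seconds


@dataclass
class Partial:
    traj_id: int = 0
    points: list = field(default_factory=list)
    noise: list = field(default_factory=list)


class Reconstructor:
    def __init__(self, d_tol, gap_time, gap_space, v_max, noise_max):
        self.d_tol, self.gap_time, self.gap_space = d_tol, gap_time, gap_space
        self.v_max, self.noise_max = v_max, noise_max
        self.partial = {}   # object id -> Partial (the PartialTrajectories list)
        self.finished = []  # reported trajectories: (object id, trajectory id, points)

    def _report(self, oid, cur, new_points):
        # Assumed meaning of "Report": emit the trajectory, then start the next one.
        self.finished.append((oid, cur.traj_id, cur.points))
        cur.traj_id += 1
        cur.points, cur.noise = new_points, []

    def add(self, oid, p):
        cur = self.partial.get(oid)
        if cur is None:                          # first point of this object
            self.partial[oid] = Partial(points=[p])
            return
        last = cur.points[-1]
        dt = p.t - last.t
        d = math.hypot(p.x - last.x, p.y - last.y)
        speed = d / dt if dt > 0 else math.inf
        if d <= self.d_tol:                      # within tolerance: stationary
            if dt > self.gap_time:
                self._report(oid, cur, [p])
            # otherwise P is not stored
        elif speed > self.v_max:                 # too fast: treat as noise
            if dt > self.noise_max:
                self._report(oid, cur, cur.noise + [p])
            else:
                cur.noise.append(p)
        elif d > self.gap_space or dt > self.gap_time:   # spatial or temporal gap
            self._report(oid, cur, [p])
        else:                                    # regular continuation
            cur.points.append(p)
            cur.noise = []                       # assumption: drop pending noise


r = Reconstructor(d_tol=1, gap_time=60, gap_space=500, v_max=30, noise_max=20)
for x, y, t in [(0, 0, 0), (10, 0, 1), (10.5, 0, 2), (1000, 0, 3), (20, 0, 4), (30, 0, 200)]:
    r.add("a", Point(x, y, t))
```

In this run the third point is dropped as stationary, the fourth is held as noise, and the last one opens a second trajectory because of the temporal gap.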

(30)


(31)

ETL processing: loading

Loading data into the dimension tables: straightforward

Loading data into the fact table: complex

  Fill in the measures with the appropriate numeric values

  In order to calculate the measures, we have to extract the portions of the trajectories that fit into the base cells of the cube

We propose two alternative solutions to this problem:

  cell-oriented

  trajectory-oriented

(32)

ETL processing: measures

Measure: Formula

COUNT_TRAJECTORIES: count all distinct trajectory ids that pass through base cell (bc)

COUNT_USERS: count all the distinct object ids that pass through bc

AVG_DISTANCE_TRAVELED: $\mathit{AVG\_DISTANCE\_TRAVELED}(bc) = \frac{\mathit{SUM\_DISTANCE}(bc)}{\mathit{COUNT\_TRAJECTORIES}(bc)}$, where $\mathit{SUM\_DISTANCE}(bc) = \sum_{TP_i \in bc} len(TP_i)$

AVG_TRAVEL_DURATION: $\mathit{AVG\_TRAVEL\_DURATION}(bc) = \frac{\mathit{SUM\_DURATION}(bc)}{\mathit{COUNT\_TRAJECTORIES}(bc)}$, where $\mathit{SUM\_DURATION}(bc) = \sum_{TP_i \in bc} lifespan(TP_i)$

AVG_SPEED: $\mathit{AVG\_SPEED}(bc) = \frac{\mathit{SUM\_SPEED}(bc)}{\mathit{COUNT\_TRAJECTORIES}(bc)}$, where $\mathit{SUM\_SPEED}(bc) = \sum_{TP_i \in bc} \frac{len(TP_i)}{lifespan(TP_i)}$

AVG_ABS_ACCELER: $\mathit{AVG\_ABS\_ACCELER}(bc) = \frac{\mathit{SUM\_ABS\_ACCELER}(bc)}{\mathit{COUNT\_TRAJECTORIES}(bc)}$, where $\mathit{SUM\_ABS\_ACCELER}(bc)$ is defined over $speed(TP)$ [remainder of the formula not recoverable]
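A small Python sketch of how these measures might be computed for one base cell. It assumes that each TP_i is a trajectory portion already clipped to the cell, represented here as a hypothetical `(traj_id, obj_id, points)` tuple with planar `(x, y, t)` samples; `cell_measures` is an illustrative name. The acceleration measure is left out.

```python
import math


def length(points):
    """len(TP): length of the polyline through the (x, y, t) samples."""
    return sum(math.dist(a[:2], b[:2]) for a, b in zip(points, points[1:]))


def lifespan(points):
    """lifespan(TP): time between the first and the last sample."""
    return points[-1][2] - points[0][2]


def cell_measures(portions):
    """portions: list of (traj_id, obj_id, [(x, y, t), ...]) inside one base cell."""
    n_traj = len({traj_id for traj_id, _, _ in portions})
    n_users = len({obj_id for _, obj_id, _ in portions})
    if n_traj == 0:
        return {"COUNT_TRAJECTORIES": 0, "COUNT_USERS": 0}
    sum_distance = sum(length(p) for _, _, p in portions)
    sum_duration = sum(lifespan(p) for _, _, p in portions)
    # Portions with zero lifespan are skipped to avoid dividing by zero.
    sum_speed = sum(length(p) / lifespan(p) for _, _, p in portions if lifespan(p) > 0)
    return {
        "COUNT_TRAJECTORIES": n_traj,
        "COUNT_USERS": n_users,
        "AVG_DISTANCE_TRAVELED": sum_distance / n_traj,
        "AVG_TRAVEL_DURATION": sum_duration / n_traj,
        "AVG_SPEED": sum_speed / n_traj,
    }


measures = cell_measures([
    (1, "u1", [(0, 0, 0), (3, 4, 5)]),     # length 5, lifespan 5
    (2, "u1", [(0, 0, 0), (0, 10, 10)]),   # length 10, lifespan 10
])
```

Here the same user contributes two trajectories, so the two counts differ.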

(33)

ETL processing: algorithms

Cell-oriented approach (COA)

  Search for the portions of trajectories that reside inside a spatiotemporal cell

    Perform a spatiotemporal range query that returns the portions of trajectories that satisfy the range constraints

    This is efficiently supported by the TB-tree

  Decompose the trajectory portions with respect to the user profiles they belong to

  Compute measures for this cell

  Repeat for the next cells

[Figure: trajectories over a grid in the x-y plane; example values: COUNT_TRAJECTORIES = 2, COUNT_USERS = 2]

(34)

ETL processing: algorithms

Trajectory-oriented approach (TOA)

  Discover the spatiotemporal cells in which each trajectory resides

    In order to avoid checking all cells, use the trajectory MBR

    Identify the cells that overlap with the MBR and contain portions of the trajectory

  Compute measures for each cell

  Repeat for the next trajectories

[Figure: trajectories over a grid in the x-y plane; example values: COUNT_TRAJECTORIES = 1, COUNT_USERS = 1 and COUNT_TRAJECTORIES = 2, COUNT_USERS = 2]
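The cell-discovery step of the trajectory-oriented approach can be sketched as follows. This is a minimal illustration, assuming a regular grid whose base cell `(i, j, k)` covers `[i*dx, (i+1)*dx] x [j*dy, (j+1)*dy] x [k*dt, (k+1)*dt]` and a trajectory given as linearly interpolated `(x, y, t)` samples; the function names are hypothetical. Candidate cells come from the MBR, and a standard slab test keeps those actually crossed by a segment.

```python
import math


def mbr(points):
    """Minimum bounding box of (x, y, t) samples as (lo, hi) corner tuples."""
    xs, ys, ts = zip(*points)
    return (min(xs), min(ys), min(ts)), (max(xs), max(ys), max(ts))


def segment_hits_box(a, b, lo, hi):
    """Slab test: does segment a->b intersect the closed box [lo, hi]?"""
    t_min, t_max = 0.0, 1.0
    for k in range(3):
        d = b[k] - a[k]
        if d == 0:
            if a[k] < lo[k] or a[k] > hi[k]:
                return False
        else:
            t1, t2 = (lo[k] - a[k]) / d, (hi[k] - a[k]) / d
            if t1 > t2:
                t1, t2 = t2, t1
            t_min, t_max = max(t_min, t1), min(t_max, t2)
            if t_min > t_max:
                return False
    return True


def cells_of_trajectory(points, size):
    """size = (dx, dy, dt) of a base cell; returns the crossed (i, j, k) cells."""
    lo, hi = mbr(points)
    ranges = [range(math.floor(lo[k] / size[k]), math.floor(hi[k] / size[k]) + 1)
              for k in range(3)]
    hits = []
    for i in ranges[0]:
        for j in ranges[1]:
            for k in ranges[2]:
                idx = (i, j, k)
                c_lo = tuple(idx[m] * size[m] for m in range(3))
                c_hi = tuple((idx[m] + 1) * size[m] for m in range(3))
                if any(segment_hits_box(a, b, c_lo, c_hi)
                       for a, b in zip(points, points[1:])):
                    hits.append(idx)
    return hits


# L-shaped trajectory: its MBR overlaps 9 cells, but only 5 are crossed.
cells = cells_of_trajectory([(0.5, 0.5, 0.5), (2.5, 0.5, 0.5), (2.5, 2.5, 0.5)], (1, 1, 1))
```

Because the boxes are closed, a trajectory running exactly along a cell border is attributed to both neighbouring cells; a real ETL job would need a tie-breaking rule.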

(35)


(36)

The distinct count problem: definition

During the ETL process, measures can be computed in an accurate way by executing MOD queries

Once the fact table has been fed, aggregate-only information is stored inside the TDW (no trajectory / user ids)

When rolling up, COUNT_USERS, COUNT_TRAJECTORIES and, hence, all other measures defined over COUNT_TRAJECTORIES are subject to the distinct count problem (Tao et al., 2004):

  if an object remains in the query region for several timestamps during the query interval, instead of counting this object once, it is counted multiple times in the result

(37)

The distinct count problem: solution (1/3)

We store in the base cells ($C_{(x,y),t,p}$) a tuple of auxiliary measures that help us correct the errors due to the duplicates when rolling-up:

  $C_{(x,y),t,p}$.Traj: number of distinct trajectories of profile p intersecting the cell

  $C_{(x,y),t,p}$.cross-x: number of distinct trajectories of profile p crossing the spatial border between $C_{(x-1,y),t,p}$ and $C_{(x,y),t,p}$

  $C_{(x,y),t,p}$.cross-y: number of distinct trajectories of profile p crossing the spatial border between $C_{(x,y-1),t,p}$ and $C_{(x,y),t,p}$

  $C_{(x,y),t,p}$.cross-t: number of distinct trajectories of profile p crossing the temporal border between $C_{(x,y),t-1,p}$ and $C_{(x,y),t,p}$

[Figure: axes X, Y, T]

(38)

The distinct count problem: solution (2/3)

Let $C_{(x',y'),t',p'}$ be a cell consisting of the union of two adjacent cells (i.e. $C_{(x,y),t,p} \cup C_{(x+1,y),t,p}$)

In order to compute the number of distinct trajectories:

$C_{(x',y'),t',p'}.\mathrm{Traj} = C_{(x,y),t,p}.\mathrm{Traj} + C_{(x+1,y),t,p}.\mathrm{Traj} - C_{(x+1,y),t,p}.\text{cross-x}$

application of the well-known Inclusion/Exclusion principle for sets: $|A \cup B| = |A| + |B| - |A \cap B|$

BUT in some cases it holds that $C_{(x+1,y),t,p}.\text{cross-x} \neq |A \cap B|$
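The roll-up rule and its failure case can be reproduced with a toy example. In this sketch a trajectory is simplified to the ordered list of grid cells it visits within one time interval and profile; `base_measures` and `rollup_x` are hypothetical helper names, not part of the original framework.

```python
from collections import defaultdict


def base_measures(trajectories):
    """trajectories: {traj_id: [(x, y), ...]} = ordered cells visited.

    Returns the per-cell auxiliary measures Traj and cross-x as two dicts.
    """
    traj = defaultdict(set)
    cross_x = defaultdict(set)
    for tid, cells in trajectories.items():
        for c in cells:
            traj[c].add(tid)
        for (x0, y0), (x1, y1) in zip(cells, cells[1:]):
            if y0 == y1 and abs(x1 - x0) == 1:
                # Border between C(x-1, y) and C(x, y) is stored on C(x, y).
                cross_x[(max(x0, x1), y0)].add(tid)
    return ({c: len(s) for c, s in traj.items()},
            {c: len(s) for c, s in cross_x.items()})


def rollup_x(traj, cross_x, x, y):
    """Distinct-trajectory estimate for the union of C(x, y) and C(x+1, y)."""
    return traj.get((x, y), 0) + traj.get((x + 1, y), 0) - cross_x.get((x + 1, y), 0)


# Case (a): the trajectory crosses the shared border directly.
traj_a, cross_a = base_measures({1: [(0, 0), (1, 0)]})
estimate_a = rollup_x(traj_a, cross_a, 0, 0)

# Case (b): the same two cells are visited, but via a detour through y = 1,
# so the shared border is never crossed and cross-x stays 0.
traj_b, cross_b = base_measures({1: [(0, 0), (0, 1), (1, 1), (1, 0)]})
estimate_b = rollup_x(traj_b, cross_b, 0, 0)
```

In both cases there is exactly one distinct trajectory; the estimate is right in case (a) and overcounts in case (b), which is the situation the "BUT" above refers to.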

(39)

The distinct count problem: solution (3/3)

[Figure: two cases, (a) and (b), over the four adjacent cells $C_{x,y,t,p}$, $C_{x+1,y,t,p}$, $C_{x,y+1,t,p}$, $C_{x+1,y+1,t,p}$, with $C_{x,y,t,p}$.Traj = 1 and $C_{x+1,y,t,p}$.Traj = 1. With $C_{x+1,y,t,p}$.cross-x = 1 the roll-up is marked "Correct!"; with $C_{x+1,y,t,p}$.cross-x = 0 it is marked "Not Correct!"]

Compute the number of distinct trajectories:

(40)

Conclusions (for this part)

We explored solutions for the efficient and effective development of trajectory warehouses

Related work does not include research work on the complete flow of processes in a TDW

In particular, we discussed techniques for:

  the solution of the trajectory reconstruction problem

  efficient ETL support for trajectory data

  handling measure aggregation issues, with a special attention to

(41)

Future work

Extend the proposed framework by applying OLAP analysis and DM techniques over the aggregated data

Examine new measures for TDW, specifically suited for trajectories. Two examples:

  A typical trajectory describing the trend of movement within a cell

(42)
(43)


The ‘traffic’ problem and challenges

Traffic in cities is a well-known problem:

  the majority of people living in large cities use cars for their transportation

  the number of cars moving in a city increases from year to year

Recently, technological advances have allowed us to collect huge amounts of traffic-related data:

  mobile phones, traffic cameras, sensors, GPS devices …

How can we exploit this huge volume of data so as to gain insights on the traffic problem?

The focus of our work is to detect how the different road segments of the

(44)

Related work

(Liu et al., 2006) proposed a distributed traffic stream mining system

  they focus on the description of the distributed traffic stream system, rather than on the discovery of traffic-related patterns

(Li et al., 2007) proposed a technique for the discovery of hot routes in a road network

  They introduced a density-based algorithm, called FlowScan

  The algorithm requires the trajectories of the objects that move

(45)


How can the road segments be related? - 1

(46)

How can the road segments be related? - 2

Syngrou Ave.

Poseidonos Ave.

(47)


How can the road segments be related? - 3

Mesogeion Ave.

Kifissias Ave.

(48)

Network Traffic

We consider a fixed network consisting of a set of non-overlapping regions

  regions = road intersections or landmarks of interest

[Figure: road network with marked regions, including R3 and R4]

(49)

Network Graph

The network is modeled as a directed graph G=(V,E)

  nodes V: regions

  edges E: direct connections between regions

[Figure: directed graph over the regions R1 to R16]

(50)

Capturing traffic through sensors

Each edge e=(v, v') is equipped with sensor technology that captures the movement from region v to region v'.

Definition: The traffic series of a sensor s ∈ S during a time period [t_s, t_e] consists of the number of cars that passed through this sensor during this period, recorded at Δt intervals and ordered in time:

$TS_s = \{(v_i, t_i)\},\quad t_s \le t_i \le t_e,\quad \Delta t = t_i - t_{i-1}$ (the transmission rate of the sensor)

[Figure: sensor on the edge from v to v' with a sample traffic series (150, 100, 80, 90) along the time axis from t_s to t_e, in steps of Δt]
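As a rough illustration of this definition, the sketch below builds a traffic series from the individual pass times logged by one sensor. The function name and the input format (a plain list of pass timestamps) are assumptions made for the example; each value v_i is stamped with the end t_i of its interval, and intervals are taken as half-open [t_{i-1}, t_i).

```python
def traffic_series(pass_times, t_s, t_e, dt):
    """Return [(v_i, t_i)]: cars counted in each dt-long interval of [t_s, t_e]."""
    n = int((t_e - t_s) // dt)          # number of complete intervals
    counts = [0] * n
    for t in pass_times:
        if t_s <= t < t_s + n * dt:     # ignore passes outside the period
            counts[int((t - t_s) // dt)] += 1
    return [(counts[i], t_s + (i + 1) * dt) for i in range(n)]


series = traffic_series([0, 1, 4, 5, 5, 9, 12], t_s=0, t_e=10, dt=5)
```

The pass at time 12 falls outside [t_s, t_e] and is not counted.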

(51)

Network Traffic

[Figure: the network graph over the regions R1 to R16]

(52)

Similarity between edges: value distance

Let $e_1$, $e_2$ be two network edges and let $TS_1 = \{(v_{1i}, t_i)\}$, $TS_2 = \{(v_{2i}, t_i)\}$ be their corresponding traffic series, $t_i \in [t_s, t_e]$

Definition: The value-based distance between $e_1$, $e_2$ during periods p and p' is given by the absolute distance of their corresponding traffic series $TS_1$, $TS_2$ at these periods:

$dis_{value}(e_1^{p}, e_2^{p'}) = dis_{value}(TS_1^{p}, TS_2^{p'})$
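The slide does not spell out the "absolute distance". One plausible reading, used in the sketch below purely as an assumption, is the sum of absolute differences between the values of the two series, aligned position by position; `value_distance` is a hypothetical name.

```python
def value_distance(ts1, ts2):
    """Sum of absolute differences between two equally long value sequences.

    ts1 holds the values of TS_1 over period p, ts2 those of TS_2 over p'.
    """
    if len(ts1) != len(ts2):
        raise ValueError("series must cover the same number of intervals")
    return sum(abs(a - b) for a, b in zip(ts1, ts2))


d = value_distance([150, 100, 80, 90], [140, 100, 90, 90])
```

Other choices, such as a mean or a Euclidean distance, would fit the same definition; only the thresholds used later would change.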

(53)

Similarity between edges: shape distance

Value-based distance is not appropriate for the shape similarity task since it is sensitive to:

  different baselines

    e.g. one series fluctuating around 100 and the other series fluctuating around 30

  different scales

    e.g. one series having a small amplitude between 90 and 100 and the other series having a larger amplitude between 20 and 40

To deal with this problem, different techniques can be applied for comparing time series, e.g.:

  Dynamic Time Warping, which allows us to compare time series either on the same or adjacent time periods

  Correlation-Coefficient measure
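To make the baseline and scale argument concrete, here is a minimal sketch of a correlation-based shape distance. Taking the distance as 1 minus the Pearson correlation coefficient is an assumption made for illustration; the slides only name the correlation coefficient as one option.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    if sa == 0 or sb == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sa * sb)


def shape_distance(a, b):
    """0 for identical shapes, 2 for exactly opposite shapes."""
    return 1.0 - pearson(a, b)


small = [90, 95, 100, 95]      # small amplitude, high baseline
large = [20, 30, 40, 30]       # larger amplitude, low baseline, same shape
mirrored = [40, 30, 20, 30]    # opposite shape

same_shape = shape_distance(small, large)
opposite_shape = shape_distance(small, mirrored)
```

The first two series are far apart in value but have the same shape, so their shape distance is essentially zero.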

(54)

Traffic relationships

Traffic propagation

  traffic from $e_{12}$ propagates to $e_{23}$

  This might indicate objects that continue moving on a highway

  Condition: [formula not recoverable; stated in terms of $dis_{value}$ between the traffic series of $e_{12}$ and $e_{23}$ at periods p and p']

Traffic split / spread

  traffic from $e_{12}$ splits into $e_{23}$ and $e_{26}$

  This might indicate objects that leave a highway and follow different directions to their destination

  Conditions: [formulas not recoverable; stated in terms of $dis_{shape}$ and $dis_{value}$ between the traffic series of $e_{12}$ and those of $e_{23}$, $e_{26}$ at periods p and p']

Traffic merge

  traffic to $e_{23}$ merges traffic from $e_{12}$ and $e_{62}$

  This might indicate objects that enter a highway from different directions

  Conditions: [formulas not recoverable; stated in terms of $dis_{shape}$ and $dis_{value}$ between the traffic series of $e_{12}$, $e_{62}$ and that of $e_{23}$ at periods p and p']

[Figure: three copies of the network over regions R1 to R8, highlighting $e_{12}$, $e_{23}$ (propagation), $e_{12}$, $e_{23}$, $e_{26}$ (split) and $e_{12}$, $e_{62}$, $e_{23}$ (merge)]

(55)

Traffic relationships

Traffic sink

  traffic in $e_{12}$ sinks because there is no propagation or split relationship with its outgoing edges

Traffic source

  traffic starts (source) from $e_{23}$ because there is no propagation or merge relationship with its incoming edges

[Figure: two copies of the network over regions R1 to R8, one highlighting the sink edge $e_{12}$, the other the source edge $e_{23}$ followed by $e_{34}$]
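Given the relationships already detected between edges, sinks and sources follow directly from the two definitions above. The sketch below assumes a hypothetical representation in which each detected relationship is a `(kind, from_edges, to_edges)` tuple and each edge is a pair of region names; it applies the definitions literally.

```python
def sinks_and_sources(edges, relations):
    """edges: set of (u, v); relations: list of (kind, from_edges, to_edges).

    kind is one of "propagation", "split" or "merge".
    """
    continues = set()   # edges whose traffic propagates or splits onward
    fed = set()         # edges whose traffic arrives by propagation or merge
    for kind, from_edges, to_edges in relations:
        if kind in ("propagation", "split"):
            continues.update(from_edges)
        if kind in ("propagation", "merge"):
            fed.update(to_edges)
    sinks = {e for e in edges if e not in continues}
    sources = {e for e in edges if e not in fed}
    return sinks, sources


e12, e23, e34 = ("R1", "R2"), ("R2", "R3"), ("R3", "R4")
sinks, sources = sinks_and_sources(
    {e12, e23, e34},
    [("propagation", [e12], [e23])],
)
```

An edge with no detected relationship at all, such as `e34` here, comes out as both a sink and a source under this literal reading.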

(56)

Conclusions (for this part)

We consider the problem of mining traffic flow in a road network monitored through sensors placed at regions of interest (crossroads, etc.).

We employ edge similarity measures based on time series comparison

(57)


Thank you!

Questions?

Acknowledgement: Research partially supported by EU under the GeoPKDD (Geographic Privacy-aware Knowledge Discovery and Delivery)

(58)

References

Han, J., Stefanovic, N., and Koperski, K. Selective Materialization: An Efficient Method for Spatial Data Cube Construction. Proc. PAKDD, 1998.

Jensen, C.S., Kligys, A., Pedersen, T.B., Dyreson, C.E., and Timko, I. Multidimensional data modeling for location-based services. The VLDB Journal, 13:1-21, 2004.

Li, X., Han, J., Lee, J.-G., and Gonzalez, H. Traffic density-based discovery of hot routes in road networks. Proc. 10th International Symposium on Spatial and Temporal Databases, 2007.

Liu, Y., Choudhary, A. N., Zhou, J., and Khokhar, A. A. A scalable distributed stream mining system for highway traffic data. Proc. 10th European Conference on Principles and Practice of Knowledge Discovery in Databases, 2006, pp. 309-321.

Orlando, S., Orsini, R., Raffaetà, A., Roncato, A., and Silvestri, C. Spatio-Temporal Aggregations in Trajectory Data Warehouses. Proc. DaWaK, 2007.

Pelekis, N., Frentzos, E., Giatrakos, N., and Theodoridis, Y. HERMES: aggregative LBS via a trajectory DB engine. Proc. SIGMOD Conference, 2008, pp. 1255-1258.

Pelekis, N., Raffaetà, A., Damiani, M.-L., Vangenot, C., Marketos, G., Frentzos, E., Ntoutsi, I., and Theodoridis, Y. Towards Trajectory Data Warehouses. Chapter in Mobility, Data Mining and Privacy: Geographic Knowledge Discovery. Springer-Verlag, 2008.

Pfoser, D., Jensen, C.S., and Theodoridis, Y. Novel Approaches to the Indexing of Moving Object Trajectories. Proc. VLDB, 2000.

Shekhar, S., Lu, C., Tan, X., Chawla, S., and Vatsavai, R. Map Cube: a Visualization Tool for Spatial Data Warehouses. Chapter in Geographic Data Mining and Knowledge Discovery. Taylor and Francis, 2001.

Tao, Y., and Papadias, D. Historical Spatio-Temporal Aggregation. ACM TOIS, 23(1):61-102, 2005.

Tao, Y., Kollios, G., Considine, J., Li, F., and Papadias, D. Spatio-Temporal Aggregation Using Sketches. Proc. ICDE, 2004.

(59)

Selected literature on …

Mobility Data Modeling & MOD engines

Nikos Pelekis, Elias Frentzos, Nikos Giatrakos, Yannis Theodoridis: HERMES: aggregative LBS via a trajectory DB engine. SIGMOD Conference 2008: 1255-1258

Nikos Pelekis, Yannis Theodoridis, Spyros Vosinakis, Themis Panayiotopoulos: Hermes - A Framework for Location-Based Data Management. EDBT 2006: 1130-1134

Güting, R.H. et al. (2000) A Foundation for Representing and Querying Moving Objects. ACM Transactions on Database Systems, 25(1):1-42.

Karimi, H. and X. Liu (2003) A Predictive Location Model for Location-Based Services, Proceedings of ACM-GIS.

Pelekis, N. et al. (2005) An Oracle Data Cartridge for Moving Objects. UNIPI-ISL-TR-2005-01, Univ. of Piraeus. Retrieved from http://infolab.cs.unipi.gr/

Schlieder, C. et al. (2001) Location Modeling for Intentional Behavior in Spatial Partonomies. Proceedings of Location Modeling for Ubiquitous Computing Workshop.

Sistla, P. et al. (1997) Modeling and Querying Moving Objects. Proceedings of IEEE ICDE Conference.

Wolfson, O. et al. (1998) Moving Objects Databases: Issues and Solutions. Proceedings of SSDBM

(60)

Selected literature on …

Mobility Data Warehousing

Han, J., Stefanovic, N., and Koperski, K. Selective Materialization: An Efficient Method for Spatial Data Cube Construction. Proc. PAKDD, 1998.

Jensen, C.S., Kligys, A., Pedersen, T.B., Dyreson, C.E., and Timko, I. Multidimensional data modeling for location-based services, The VLDB Journal, 13:1-21, 2004.

Leonardi, L., Orlando, S., Raffaetà, A., Roncato, A. and Silvestri, C. Frequent Spatio-Temporal Patterns in Trajectory Data Warehouses. Proceedings of 24th Annual ACM Symposium on Applied Computing, 2009

Marketos, G., Frentzos, E., Ntoutsi, I., Pelekis, N., Raffaetà, A., and Theodoridis, Y. Building Real World Trajectory Warehouses. Proc. MobiDE'08, Vancouver, Canada

Orlando, S., Orsini, R., Raffaetà, A., Roncato, A., and Silvestri, C. Spatio-Temporal Aggregations in Trajectory Data Warehouses. Proc. DaWaK, 2007.

Pelekis, N., Raffaetà, A., Damiani, M.-L., Vangenot, C., Marketos, G., Frentzos, E., Ntoutsi, I., and Theodoridis, Y. Towards Trajectory Data Warehouses. Chapter in Mobility, Data Mining and Privacy: Geographic Knowledge Discovery. Springer-Verlag. 2008.

Pfoser, D., Jensen, C.S., and Theodoridis, Y. Novel Approaches to the Indexing of Moving Object Trajectories, Proc. VLDB, 2000.

Shekhar, S., Lu, C., Tan, X., Chawla, S., Vatsavai, R. Map Cube: a Visualization Tool for Spatial Data Warehouses, Chapter in Geographic Data Mining and Knowledge Discovery. Taylor and Francis, 2001.

Tao, Y., Kollios, G., Considine, J., Li, F., and Papadias, D. Spatio-Temporal Aggregation Using Sketches.

(61)

Selected literature on …

Mining Traffic Patterns

Horvitz, E., Apacible, J., Sarin, R., and Liao, L.: "Prediction, expectation, and surprise: Methods, designs, and study of a deployed traffic forecasting service," in Twenty-First Conference on Uncertainty in Artificial Intelligence, 2005.

Li, X., Han, J., Lee, J.-G. and Gonzalez, H.: "Traffic density-based discovery of hot routes in road networks." in Proc. 10th International Symposium on Spatial and Temporal Databases, 2007.

Liu, Y., Choudhary, A. N., Zhou, J., and Khokhar, A. A.: "A scalable distributed stream mining system for highway traffic data." in Proc. 10th European Conference on Principles and Practice of Knowledge Discovery in Databases, 2006, pp. 309-321.

Nakata T., and Takeuchi, J.: "Mining traffic data from probe-car system for travel time prediction," in Proc. 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004, pp. 817-822.

Ntoutsi, I., Mitsou, N., Marketos, G.: Traffic mining in a road-network: How does the traffic flow? IJBIDM 3(1): 82-98 (2008)

Shekhar, S., Lu, C.-T., Chawla, S., and Zhang, P.: "Data mining and visualization of twin-cities traffic data,"
