www.ptvgroup.com Seite 1 www.ptvgroup.com
Lorenzo Meschini - CEO, PTV SISTeMA
COST TU1004 Final Conference
Paris, 11 May 2015
BIG DATA FOR MODELLING 2.0
ENHANCING MODELS WITH MASSIVE REAL DATA INTEGRATION
ENHANCING MODELS WITH MASSIVE MOBILITY DATA INTEGRATION
“A collection of data too massive to be handled efficiently by traditional
database tools and methods”
Big Data IS NOT only about non-trivial data sizes: it IS rooted in
the push to discover hidden, useful insights in data.
VOLUME is the sheer size of the data being collected
VELOCITY is the speed at which data flows into a business’s infrastructure, and the ability of software solutions to receive and process that data quickly
VARIETY refers to the different data formats coming into your platform, and the challenge of taking raw, (un)structured data and organizing it.
Three challenges besides data availability from a business point of view:
STORE: can you store the vast amounts of data being collected?
PROCESS: can you organize, clean, and analyze the data collected?
ACCESS: can you search and query this data in an organized manner?
Once you get beyond storage and management, you still have the enormous task of
creating actionable business intelligence (BI) from the datasets you’ve collected.
There are many types of analytic models and different ways of providing infrastructure for this process, but the analytics solution must scale, too.
Ultimately, analytics tools rely on a great deal of reasoning and analysis to extract patterns and insights from data, but this capacity means nothing to a business that cannot then turn them into actionable intelligence.
SOME STATISTICS (2015)
Big Data Plans are Underway for Most Organizations
RDBMS Still Dominates the Broader IT Industry
Almost All Orgs Expect Their Storage Needs to Grow Exponentially
The market already offers worldwide or continent-wide services and solutions based on individual vehicle and/or people mobility trajectories or movements
Raw data sources
Vehicle Trajectories from black boxes for insurance applications or vehicle location systems
Vehicle Trajectories from navigation systems
Crowd sourcing from Mobile phone apps
Localization of mobile phones
Offered services / products
Real time traffic monitoring & information
Performance measures
Maps
Speed profiles and travel times on road segments
Travel time matrices
Observed OD matrices
Trajectories
Public transport data are currently collected and stored on a local basis:
Raw data sources
Service plans
PT vehicle trajectories from AVL and AVM systems
PT events (delay/cancellation/rerouting)
Ticket issuing/collection
Crowd sourcing from Mobile phone apps
Services produced are currently often limited to the entities collecting the data
Real time information
Performance and Level of Service measures
Clearing
Service planning (schedule)
Some companies are trying to bring services to a global level
Aggregating local data
Challenges
Collecting data on PT worldwide: data are (owned?) by different authorities that won't provide them
Go multimodal: collecting bike and pedestrian counts
Mode of transport identification: car, bike and PT can be very similar in urban contexts
Same trip, several transport systems
Enablers
Open data
Crowd sourcing
Internet of Things
Opportunities
“Smart cities”
Big Data (historical) on PuT
Computer Science
Transportation Engineering
Pure “statistical/machine learning” approach
“Modelling” approach
+
Calibration by data
Modelling 2.0
DATA DRIVEN MODELS - TODAY
Input: the same raw FCD (floating car data) that today provide speed profiles
Output: calibrated traffic models + route choice
FCD raw trajectories
Optima
Data Driven
• Network graph
• Traffic zones
• Available flow counts
Demand
• OD matrices
Route choice
• Turning ratio (by destination zone)
Network attributes
• Free flow speeds
• Capacities
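As an illustration of the route choice output, turning ratios by destination zone can be counted directly from map-matched trajectories. The sketch below is a minimal example under stated assumptions, not the method used in Optima: the function name, the input layout (a destination zone plus an ordered list of link identifiers per trip) and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def turning_ratios(trajectories):
    """Estimate turning ratios by destination zone.

    Each trajectory is (destination_zone, [link_id, ...]), where the list is
    the map-matched sequence of links driven by one vehicle (assumed input).
    Returns {(destination_zone, from_link): {to_link: share}}.
    """
    counts = defaultdict(Counter)
    for dest, links in trajectories:
        # Count every observed movement from one link to the next.
        for from_link, to_link in zip(links, links[1:]):
            counts[(dest, from_link)][to_link] += 1

    ratios = {}
    for key, movements in counts.items():
        total = sum(movements.values())
        ratios[key] = {to_link: n / total for to_link, n in movements.items()}
    return ratios


# Hypothetical toy data: four trips over links a..e towards zones Z1 and Z2.
trips = [
    ("Z1", ["a", "b", "d"]),
    ("Z1", ["a", "b", "d"]),
    ("Z1", ["a", "c", "d"]),
    ("Z2", ["a", "c", "e"]),
]
ratios = turning_ratios(trips)
print(ratios[("Z1", "a")])
```

In this toy data set, two of the three trips bound for Z1 turn from link a onto link b, so the estimated ratio for that movement is 2/3. Real FCD samples would also need a minimum sample size per movement before the ratio can be trusted.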
DATA DRIVEN MODELS - TOMORROW
Input: the same raw FCD that today provide speed profiles
Output: calibrated traffic models + route choice
Multi modal trips
Optima
Data Driven
• Network & Service graph
• Traffic zones
• Multi modal flow counts
Demand
• OD matrices
• Modal split
Route choice
• Turning ratio (by destination zone)
Network attributes
• Free flow speeds
• Transit capacities
• Waiting times
• Access / Egress /
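To make the modal split output concrete, the sketch below derives mode shares per OD pair from observed multimodal trips. It assumes the mode of each trip has already been identified, which is itself an open challenge; the function name, the record layout and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def modal_split(trips):
    """Compute mode shares per OD pair.

    trips: iterable of (origin_zone, destination_zone, mode) records
    (assumed input; the mode is taken as already recognised).
    Returns {(origin, destination): {mode: share}}.
    """
    counts = defaultdict(Counter)
    for origin, dest, mode in trips:
        counts[(origin, dest)][mode] += 1

    split = {}
    for od_pair, modes in counts.items():
        total = sum(modes.values())
        split[od_pair] = {mode: n / total for mode, n in modes.items()}
    return split


# Hypothetical toy data.
observed = [
    ("Z1", "Z2", "car"),
    ("Z1", "Z2", "car"),
    ("Z1", "Z2", "PT"),
    ("Z1", "Z2", "bike"),
    ("Z2", "Z1", "PT"),
]
split = modal_split(observed)
print(split[("Z1", "Z2")])
```

Observed shares of this kind are biased towards whoever carries the tracking device or app, so in practice they would need reweighting before being used as calibration targets.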
DATA DRIVEN MODELS – FUNCTIONAL OVERVIEW
ASSIGNMENT MATRIX UPDATE
• Inputs: observed vehicle trajectories, zones (origins/destinations), graph, day type definitions
• Steps: map matching & speed calculation, assignment matrix estimation
• Intermediate results: link speeds by day type, splitting rates by destination and day type
• Output: assignment matrix by day type
OD MATRIX UPDATE
• Inputs: zones (origins/destinations), graph, assignment matrix by day type, initial OD matrix
• Step: OD matrix correction
• Output: corrected OD matrices by day type
GRAPH UPDATE
• Inputs: zones (origins/destinations), initial graph, corrected OD matrices by day type
• Step: speed, capacity and jam density correction
• Output: corrected graph
Also shown in the diagram: flow measures, link speeds by day type, observed matrices by day type
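The OD matrix update step can be illustrated with a simple multiplicative correction: each OD flow is scaled by the share-weighted average ratio between measured and modelled flows on the links it uses. This is a generic heuristic chosen for illustration, not necessarily the correction used in the tool chain described here; the function name, the data layout and the toy numbers are hypothetical.

```python
def correct_od(od, assignment, counts, iterations=50):
    """Scale an initial OD matrix towards measured link flows.

    od:         {od_pair: trips} initial OD matrix
    assignment: {link: {od_pair: share}} assignment matrix, i.e. the share
                of each OD flow that uses each link
    counts:     {link: measured_flow} flow measures on counted links
    """
    od = dict(od)  # work on a copy, leave the initial matrix untouched
    for _ in range(iterations):
        # Flows the current matrix would produce on each counted link.
        modelled = {
            link: sum(share * od[pair] for pair, share in assignment[link].items())
            for link in counts
        }
        for pair in od:
            num = den = 0.0
            for link, measured in counts.items():
                share = assignment[link].get(pair, 0.0)
                if share > 0 and modelled[link] > 0:
                    num += share * measured / modelled[link]
                    den += share
            if den > 0:  # OD pairs seen by no counted link stay unchanged
                od[pair] *= num / den
    return od


# Hypothetical toy case: two OD pairs, one private link each, one shared link.
AB, AC = ("A", "B"), ("A", "C")
initial_od = {AB: 100.0, AC: 100.0}
assignment = {"l1": {AB: 1.0}, "l2": {AC: 1.0}, "l3": {AB: 1.0, AC: 1.0}}
counts = {"l1": 120.0, "l2": 80.0, "l3": 200.0}

corrected = correct_od(initial_od, assignment, counts)
print(corrected)
```

With these consistent counts the matrix moves towards 120 trips for A-B and 80 for A-C. With inconsistent or sparse counts, a formulation that also penalises the distance from the initial matrix would usually be preferable.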
MODELLING 2.0 – AN EXAMPLE
Creation of a graph model for Transport Assignment
By running Big Data analysis tools (on FCD probes, for example) you discover that some streets should be included in the model because they are heavily used.
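A minimal sketch of this kind of check, assuming the probes have already been map-matched to a full street network: count how many trips use each link and flag the links that are missing from the model graph but carry a noticeable share of trips. The function name, the threshold and the toy data are hypothetical.

```python
from collections import Counter


def missing_busy_links(matched_links_per_trip, model_links, min_share=0.05):
    """Return links outside the model used by at least min_share of trips.

    matched_links_per_trip: list of link-id lists, one per probe trip
                            (assumed to be map-matched already)
    model_links:            set of link ids present in the model graph
    """
    usage = Counter()
    for links in matched_links_per_trip:
        usage.update(set(links))  # count each trip at most once per link

    n_trips = len(matched_links_per_trip)
    return sorted(
        link
        for link, n in usage.items()
        if link not in model_links and n / n_trips >= min_share
    )


# Hypothetical toy data: m1 and m2 are modelled, s1 and s2 are not.
probe_trips = [
    ["m1", "s1", "m2"],
    ["m1", "s1"],
    ["m2", "s2"],
    ["m1", "m2"],
]
candidates = missing_busy_links(probe_trips, {"m1", "m2"}, min_share=0.5)
print(candidates)
```

Here s1 is used by half of the trips and is flagged, while s2 stays below the threshold. A sensible threshold depends on the probe penetration rate and on the functional class of the streets involved.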
By running online Big Data tools you can update the parameters of your model in real time, for example the turn probabilities at a given intersection in the route choice model.
Big data can help improve the calibration and validation of all our models
Trip generation
Trip distribution
Mode choice
Route choice
Supply calibration
We need to conceive new calibration methodologies
Capacity and flow level recognition
Transport system & mode recognition
Path choice recognition
Lorenzo Meschini
CEO, PTV SISTeMA
Realtime Solutions Director, PTV Group