Real-time data, typified by frames of weather radar animations or positions of tube trains on a network, requires a workflow that runs from data acquisition through to the final visualisation. It is this programmable element that distinguishes this section from the previous sections, which dealt only with visualisation. Real-time data also brings the ability to run models on the data and to try “what if” scenarios with a view to prediction, both long-term and short-term “nowcasting”. Because automated sensors can produce very large quantities of real-time information, of which only a small portion is likely to be interesting, real-time systems also have to deal with information management and knowledge-directed visualisation. The programmable element comes from the fact that the map is now built by a piece of dynamic code rather than by a reference to a dataset. Programmable here can also imply explorable data, as there is now direct access to the data and the code controls how it is visualised.
Taking the weather radar example, in the UK this data is available from the Meteorological Office, requiring a capabilities request to ascertain the available observation times, followed by requests for the actual data:
Capabilities Request: http://datapoint.metoffice.gov.uk/public/data/layer/wxfcs/all/xml/capabilities?key=<mykey>
PNG Overlay Request for Radar: http://datapoint.metoffice.gov.uk/public/data/layer/wxfcs/Precipitation_Rate/png?RUN=2013-09-25T09:00:00Z&FORECAST=0&key=<mykey>
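The two requests above can be assembled programmatically. The following is a minimal sketch, assuming the URL layout shown in the text; the function names `capabilities_url` and `overlay_url` are hypothetical, and the layer name is written with an underscore on the assumption that it was lost in typesetting. No network request is made.

```python
from urllib.parse import urlencode

# Base path copied from the example URLs in the text.
BASE = "http://datapoint.metoffice.gov.uk/public/data/layer/wxfcs"


def capabilities_url(key: str) -> str:
    """URL of the capabilities document listing the available times."""
    return f"{BASE}/all/xml/capabilities?{urlencode({'key': key})}"


def overlay_url(layer: str, run: str, forecast: int, key: str) -> str:
    """URL of a single PNG overlay for `layer` at a given run and timestep."""
    # safe=":" keeps the colons of the ISO timestamp readable in the query.
    query = urlencode({"RUN": run, "FORECAST": forecast, "key": key}, safe=":")
    return f"{BASE}/{layer}/png?{query}"


if __name__ == "__main__":
    print(capabilities_url("<mykey>"))
    print(overlay_url("Precipitation_Rate", "2013-09-25T09:00:00Z", 0, "<mykey>"))
```

In practice the run time and forecast step passed to `overlay_url` would be taken from the capabilities response rather than hard-coded.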
4.5. Real-time and Programmable Maps
In this case, the returned data is a single PNG image overlay, but recently a WMTS tile server has been added which allows Google Maps and OpenLayers compatible tiles to be requested for use in web-based tiled maps. As with all these types of open data systems, there are “fair usage” policies, so the example in figure 4.32 showing 10,000 requests per minute would not qualify. This identifies another architectural requirement, namely a cache that decouples the load placed on the third-party system supplying the data from the full load taken by our own servers. There are numerous commercial and open-source solutions to this, for example Microsoft’s Application Request Routing (ARR) or HAProxy (www.haproxy.org), but this is also something that the MapTubeD system implements for its own tiles. A simple extension would be to add a new data source type that specifies a third-party tile server as a pass-through. This method would allow limits to be defined for each data source to ensure that the fair use policies of each one were never exceeded.
Applying this to the second example of real-time transport systems, data for all London Underground tube trains can only be queried once every three minutes (footnote 21). In this instance the raw data requires significant pre-processing to calculate vehicle positions. Real-time processing of Network Rail trains, London Underground tubes, and TfL buses and River Services was the subject of a two-month project called “Adaptive Networks for complex Transport Systems” (ANTS) in May 2012 (footnote 22). The model used here is to create a library to acquire and process the data, with common code for both off-line analysis of long-term data and real-time instantaneous views. A REST API then uses this library to provide the data to any internal systems that request it, using a 3-tier structure and a relational database allowing access to the last 24 hours’ worth of data. Figure 4.37 shows the flow of data from the API or stream through to the derived vehicle positions. This includes making comparisons to baseline running data from the archive, which is used to identify any late-running services. Because the data comes from three different sources in three different formats, this is as much an exercise in data fusion as it is real-time analytics. More information on the ‘ANTS’ system and real-time London transport systems is published in [Che+14].
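The per-source limit described above can be illustrated with a small caching wrapper. This is a sketch under stated assumptions, not the MapTubeD or ANTS implementation: the class name `ThrottledSource` and its parameters are hypothetical, and the 180-second interval in the usage note simply mirrors the three-minute query cycle quoted in the text.

```python
import time
from typing import Callable, Optional


class ThrottledSource:
    """Serve a cached response, refetching at most once per `min_interval` seconds."""

    def __init__(self, fetch: Callable[[], bytes], min_interval: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch                # hypothetical callable that hits the third party
        self._min_interval = min_interval  # fair-use limit for this data source, in seconds
        self._clock = clock                # injectable so the behaviour can be tested
        self._last_time: Optional[float] = None
        self._cached: Optional[bytes] = None

    def get(self) -> bytes:
        now = self._clock()
        stale = (self._last_time is None
                 or now - self._last_time >= self._min_interval)
        if stale:
            # The only place where load reaches the third-party system.
            self._cached = self._fetch()
            self._last_time = now
        return self._cached
```

A source built as `ThrottledSource(fetch_tube_data, 180.0)`, where `fetch_tube_data` is whatever function performs the upstream request, can then be called by any number of clients without exceeding one upstream request every three minutes.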
While the ‘ANTS’ library is used for processing the data, an additional web server component is required to enable other systems to access the data. The class diagram in figure 4.38 shows the implementation of the DataAPI. The pattern employed is to create an ‘ICityDataSource’ interface which all data sources implement to acquire data. A further ‘ICityChildDataSource’ interface is required for data sources which are derived from other data sources. For example, ‘Trackernet’ is the name of the TfL system providing open data on the status of the London Underground. The ‘TrackernetDataSource’ acquires this data via an HTTP request and derives positions of all the tube trains. The ‘TubeNumbers’ class is a child data source as it uses the data from the ‘TrackernetDataSource’ to count the number of tubes running on each line at that instant in time. The ‘CountdownDataSource’ wraps Countdown, the TfL system for bus tracking, which in this case is a stream API, with ‘BusNumbers’ giving the total number of buses running in London. ‘NetworkRailDataSource’ and ‘NetworkRailNumbers’ provide a similar breakdown of Network Rail trains by train operating company (TOC).
Figure 4.37: System diagram of data flow in the ANTS system.
21 The three-minute cycle is accurate as of August 2014.
22 The ANTS project was a 2 month project in May 2012 which was funded by Future ICT and resulted in a number of real-time
Figure 4.38: City Data source Class Diagram.
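The parent and child data source pattern can be sketched as follows. This is an illustrative Python rendering only: the class names echo those in the text, but the method name `acquire`, the constructor arguments and the record layout (a list of dictionaries with a `line` field) are assumptions, since the original interfaces are not reproduced here.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import Callable, Dict, List


class CityDataSource(ABC):
    """Stands in for 'ICityDataSource': anything that can acquire data."""

    @abstractmethod
    def acquire(self) -> List[Dict]:
        ...


class CityChildDataSource(CityDataSource):
    """Stands in for 'ICityChildDataSource': data derived from a parent source."""

    def __init__(self, parent: CityDataSource) -> None:
        self.parent = parent


class TrackernetDataSource(CityDataSource):
    def __init__(self, fetch: Callable[[], List[Dict]]) -> None:
        # Hypothetical: `fetch` hides the HTTP request and position derivation.
        self._fetch = fetch

    def acquire(self) -> List[Dict]:
        return self._fetch()


class TubeNumbers(CityChildDataSource):
    def acquire(self) -> List[Dict]:
        # Count the trains reported by the parent source on each line.
        counts = Counter(train["line"] for train in self.parent.acquire())
        return [{"line": line, "count": n} for line, n in sorted(counts.items())]
```

Because a child only depends on the `acquire` method of its parent, the bus and Network Rail counters described in the text could follow the same shape with a different parent source.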
LINQ to SQL to generate the data classes automatically. This system then becomes another architectural element for providing real-time data required by online visualisation systems, but only after an API has been added to provide a means for accessing the data. This was modelled on the ‘restSQL’ (footnote 23) idea of providing limited querying ability to SQL databases via HTTP REST. A query takes one of two forms: a select query, which requests filtered data, for example the numbers of tubes on the London Underground over the last 24 hours; or a file query, which requests a raw data file, for example one containing the latest tube positions for all lines. All data downloaded from data APIs is stored on the server as timestamped files for easy access. The syntax of the two types of query is shown in table 4.2. Finally, the system archives all the raw data it downloads on a local storage array for future data mining or other long-term analytics.
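The two query forms can be built with a pair of small helpers that follow the URI patterns of table 4.2. This is a sketch, not the API's own client: the function names `select_uri` and `file_uri` are hypothetical, the URIs are relative because no host is given in the text, and the column name `toc_id` assumes an underscore that appears to have been lost in typesetting.

```python
from typing import Optional
from urllib.parse import urlencode


def select_uri(table: str, fmt: str = "json",
               limit: Optional[int] = None, **filters: str) -> str:
    """Select query: api.svc/s/{table}/{json|xml}?column=value&limit={int}."""
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    params = dict(filters)          # column=value filters come first
    if limit is not None:
        params["limit"] = limit     # optional cap on the number of rows
    uri = f"api.svc/s/{table}/{fmt}"
    return f"{uri}?{urlencode(params)}" if params else uri


def file_uri(directory: str, pattern: str) -> str:
    """File query: api.svc/f/{dir}?pattern={wildcard}."""
    # safe="*" leaves the wildcard character unescaped.
    return f"api.svc/f/{directory}?{urlencode({'pattern': pattern}, safe='*')}"


if __name__ == "__main__":
    print(select_uri("trackernet", limit=12))
    print(select_uri("networkrail", limit=12, toc_id="86"))
    print(file_uri("countdown", "countdown_20140911_09*.csv"))
```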
Table 4.2: Real-time Transport API Web Services REST Syntax.

Select Query Syntax:
Usage                                          REST URI
select 24 hours from {table} as json           api.svc/s/{table}/json
select 24 hours from {table} as xml            api.svc/s/{table}/xml
filter based on table column matching value    api.svc/s/{table}/json?column=value
limit returned results to {int} rows           limit={int}
Examples:
1. Select the 12 most recent tube number counts for all lines: api.svc/s/trackernet/json?limit=12
2. Select the 12 most recent network rail number counts for train operating company (toc_id) 86: api.svc/s/networkrail/json?toc_id=86&limit=12

File Request Syntax:
Usage                                                            REST URI
return specific archive file                                     api.svc/f/{dir}?pattern={wildcard}
file pattern wild card to match any combination of characters    *
Examples:
1. Download latest national rail file containing all train positions: api.svc/f/nationalrail?pattern=nationalrail_*.csv
2. Download a buses (countdown) file for the latest available time between 09:00 and 09:59 on 11th September 2014: api.svc/f/countdown?pattern=countdown_20140911_09*.csv

Using the architecture just described as a new data source, it now becomes possible to create real-time visualisations based on London’s transport system at the current instant in time. Due to the complexity and nature of the data, the amount of useful information that can be extracted from a simple plot of vehicle positions is very limited. One example was during the tube strike of 29th April 2014, when Victoria Line trains were seen to be running as far south as Brixton while the official line from TfL was that they were turning around at Victoria. This type of obvious data is easy to extract but, for anything more complicated, a mathematical comparison between normal running and the current situation is required. This is where knowledge-directed visualisation becomes essential.
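One simple form that a comparison between normal running and the current situation could take is a z-score of the current train count on each line against archived counts. The source does not specify the comparison it uses, so the statistic, the function name `unusual_lines`, the threshold and the data layout below are all assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import Dict, List


def unusual_lines(baseline: Dict[str, List[int]], current: Dict[str, int],
                  threshold: float = 2.0) -> Dict[str, float]:
    """Return the z-score of every line whose current count is unusual.

    `baseline` maps a line name to archived counts for comparable times
    (at least two per line); `current` maps a line name to the latest count.
    """
    flagged: Dict[str, float] = {}
    for line, history in baseline.items():
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # no variation in the archive, so a z-score is undefined
        z = (current.get(line, 0) - mu) / sigma
        if abs(z) >= threshold:
            flagged[line] = z
    return flagged
```

Only the flagged lines would then need to be highlighted in the visualisation, which is one way of directing attention to the small portion of the data that is likely to be interesting.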
Visualisation of vehicle positions with three-minute resolution data also poses some problems due to the periods between data updates. Having positioned a tube on the network, no new data is available for another three minutes, so any real-time visualisation showing animated tubes is forecasting positions into the future based on the best information available at the time. The forecast uses the reported time to next station to predict the arrival, but it is perfectly possible that the new data will show the vehicle as stationary, in which case the forecast position will be wrong at the next data update. A first attempt at the real-time visualisation problem is available at https://github.com/maptube/FortyTwo. This is a JavaScript and WebGL visualisation in 3D showing real-time tubes and buses. The data is extracted from the ‘ANTS’ server described earlier and contains vehicle positions as latitude and longitude for the last time point.
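The forecasting step between updates can be sketched as a straight-line estimate driven by the reported time to the next station. This is a simplification under stated assumptions: it moves the vehicle along a straight line in latitude and longitude rather than along the track geometry, and the function name `forecast_position` and its parameters are hypothetical, not taken from the FortyTwo code.

```python
from typing import Tuple

LatLon = Tuple[float, float]


def forecast_position(last: LatLon, next_station: LatLon,
                      time_to_station: float, elapsed: float) -> LatLon:
    """Estimate where a vehicle is `elapsed` seconds after its last report.

    `time_to_station` is the reported number of seconds to the next station.
    The vehicle is held at the station once the predicted arrival has passed.
    """
    if time_to_station <= 0:
        return next_station  # already reported as being at the station
    # Fraction of the journey completed, clamped to the range [0, 1].
    f = min(max(elapsed / time_to_station, 0.0), 1.0)
    return (last[0] + f * (next_station[0] - last[0]),
            last[1] + f * (next_station[1] - last[1]))
```

If the next update reports the vehicle as stationary, the estimate is simply replaced by the new reported position, which is why the animated position can jump at an update.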
The visualisations in figure 4.39 fall into one of two programming paradigms, being based on either a key frame animation model or on agent-based modelling. The data contains positions of vehicles at specific points in time (key frames), so the first method simply animates positions based on linear interpolation. This is fine for archive data, but with real-time data, where the future position of a vehicle is not known, the technique is extended to use the vehicle’s expected arrival time at its next location. While this method works, its limitation becomes clear as time continues to move forward. When a vehicle arrives at its next location before the next data update has supplied a new estimated time of arrival, the simulation has to work out the vehicle’s route and estimate a new arrival time based on the data available. The key word here is ‘simulation’, as a real-time visualisation is a simulation of the network over time, based on behaviours built into the agents. In this case, the agents use a network graph of the London Underground, with times between stations taken from the official timetables or archive running data, to make arrival time estimates.
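The arrival time estimate made by an agent once it has run out of reported data can be sketched as a walk along its route, summing the inter-station running times. This is a minimal illustration, assuming the route through the network graph is already known and ignoring dwell time at stations; the function name `estimate_arrivals` and the running times in the usage example are hypothetical values, not timetable data.

```python
from typing import Dict, List, Tuple

# Seconds taken to run between two adjacent stations, keyed by (from, to).
RunTimes = Dict[Tuple[str, str], float]


def estimate_arrivals(route: List[str], run_times: RunTimes,
                      at_station: str, now: float) -> List[Tuple[str, float]]:
    """Estimate arrival times at the stations after `at_station` on `route`."""
    i = route.index(at_station)
    arrivals: List[Tuple[str, float]] = []
    t = now
    # Walk each remaining edge of the route, accumulating running time.
    for a, b in zip(route[i:], route[i + 1:]):
        t += run_times[(a, b)]
        arrivals.append((b, t))
    return arrivals


if __name__ == "__main__":
    route = ["Brixton", "Stockwell", "Vauxhall", "Pimlico"]
    times = {("Brixton", "Stockwell"): 120.0,   # illustrative values only
             ("Stockwell", "Vauxhall"): 130.0,
             ("Vauxhall", "Pimlico"): 90.0}
    print(estimate_arrivals(route, times, "Stockwell", now=1000.0))
```

The first estimated arrival gives the agent a new target for interpolation until the next data update replaces the estimate with a reported value.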
(a) Tubes, buses and trains using Processing.
(b) Tubes and buses in Chrome and WebGL.
(c) Tubes in Chrome and AgentScript.
(d) Tube strike day one.
(e) Tube animation using Autodesk 3D Studio Max.
(f) Bus animation using Autodesk 3D Studio Max.