The context for this work draws on experience of developing web-based mapping systems from when Google Maps was first released in February 2005 up until the present day. During this time, technological innovation has seen web-based mapping evolve from simple cartography and static, tiled images into handling vector geometry with dynamic data attributes. The ultimate evolution is a fully functional “WebGIS” with spatial analytics capabilities, but, first, there are a number of technological challenges to solve. Part of this thesis explores the fundamental architectures and algorithms that are required to make this a reality, while the more applied parts investigate how this can be used by researchers working on real-world problems.
In the process of making maps more accessible to the general public, tools for automatically making maps from data are now an integral part of the data preparation pipeline. These emerging, intelligent tools are one aspect of this research, along with the “Internet quantities” of data available and infinite provisioning. The ability to handle complex, inter-related geospatial data at scale using intelligent tools provides the primary motivation for the research presented in this thesis.
Real-time city data from application programming interfaces (APIs) and permanently connected streams are now available for London and other cities, providing information on tubes, buses, trains, bikes, weather and air quality. The move to make cities ‘Smart’, as defined in IBM’s publication, “Smarter Cities Series: A Foundation for Understanding IBM Smarter Cities” [Keh+11], is resulting in data about cities becoming more open. In city simulation, the concept of the ‘Digital Twin’ originally comes from production engineering, as defined in the article by Grieves, “Origins of the Digital Twin Concept” [Gri16]. He defines the Digital Twin as follows:
“It is based on the idea that a digital informational construct about a physical system could be created as an entity on its own. The digital information would be a “twin” of the information that was embedded within the physical system itself and be linked with that physical system through the entire lifecycle of the system.” (Grieves [Gri16])
This refers to the digital twin in the context of physical products, or devices, highlighting a “predictive” and an “interrogative” Digital Twin Environment for acting on digital twins. The predictive environment is designed to predict future behaviour, for example predicting component failures in specific instances of physical products in the real world, utilising known manufacturing tolerances. The interrogative environment is closer to the way digital twins of cities are used, and Grieves describes it as follows:
“Irrespective of where their physical counterpart resided in the world, individual instances could be interrogated for their current system state: fuel amount, throttle settings, geographical location, structure stress, or any other characteristic that was instrumented.” (Grieves [Gri16])
The examples given are all engineering related, though, citing “space exploration”, “next generation fighter aircraft” and “NASA vehicles”. Digital twins of cities are discussed by Batty in his editorial, “Digital twins” [Bat18]. He raises the topic of computer simulation in the context of Digital Twins and Smart Cities, arguing that “a computer model of a physical system can never be the basis of a digital twin for many elements of the real system are ignored in any such abstraction” [Bat18]. In general, the idea of the digital twin mirroring the original system is a modelling problem, depending on the model builder’s choice of key factors to include.
“Models are, by definition, simplifications of the real thing and in that sense, do not aim to replicate the original system in the same detail as that system.” (Batty [Bat18])
Where the idea of a digital twin is referenced in the context of real-time city data, this is not so much a functional twin as it is Grieves’ interrogative view of the system. The system here is the unknown element under investigation, with the interrogative view of the “fuel amount and throttle settings” providing the researcher’s only view into its internal operation. A city is also a connected system of systems, which resists the type of decoupling and isolated analysis which would be a first step in the engineering sciences. City systems cannot ordinarily be decoupled unless one element of the system fails, for example during a tube or bus strike. This is where long-term monitoring of the respective systems via data APIs can be useful in building a normal operating point as an aid to formulating a theory about how the system functions.
Central to this view of connected data systems are the fundamental algorithms which make the analysis possible. Computer architecture plays a role here, with parallelism enabling higher data throughput for any algorithms capable of exploiting it. Visualisation of the data is a primary aim, so different ways of enabling researchers to see their data in a spatial context are investigated. The quantities of spatial data now available on the Internet and the availability of real-time city data through this medium form the core of this thesis. The term “WebGIS” is used here to describe any form of spatial analysis or spatial visualisation using a web browser, however simplistic this might be. The idea is attractive from the point of view of lowering the bar to entry for non-expert users, while simultaneously simplifying the spatial analysis pipeline for everybody. From a computer architecture and algorithms point of view, this is an application of emerging and fast-developing technologies to a real-world problem.