Spatiotemporal Forecasting At Scale

(1)

Scholarship@Western

Electronic Thesis and Dissertation Repository

8-13-2019 11:00 AM

Spatiotemporal Forecasting At Scale

Rafael Felipe Nascimento de Aguiar The University of Western Ontario

Supervisor

Capretz, Miriam A. M.

The University of Western Ontario

Graduate Program in Electrical and Computer Engineering

A thesis submitted in partial fulfillment of the requirements for the degree in Master of Engineering Science

Follow this and additional works at: https://ir.lib.uwo.ca/etd

Part of the Software Engineering Commons

Recommended Citation Recommended Citation

Nascimento de Aguiar, Rafael Felipe, "Spatiotemporal Forecasting At Scale" (2019). Electronic Thesis and Dissertation Repository. 6316.

https://ir.lib.uwo.ca/etd/6316

This Dissertation/Thesis is brought to you for free and open access by Scholarship@Western. It has been accepted for inclusion in Electronic Thesis and Dissertation Repository by an authorized administrator of

(2)

Spatiotemporal forecasting can be described as predicting the future value of a variable given

when and where it will happen. This type of forecasting task has the potential to aid many

institutions and businesses in asking questions, such as how many people will visit a given

hospital in the next hour. Answers to these questions have the potential to spur significant

so-cioeconomic impact, providing privacy-friendly short-term forecasts about geolocated events,

which in turn can help entities to plan and operate more efficiently.

These seemingly simple questions, however, present complex challenges to forecasting

sys-tems. With more GPS-enabled devices connected every year, from smartphones to wearables

to IoT devices, the volume of collected spatiotemporal data that accompanies these questions

has exploded, following the Big Data trend. This thesis proposes a forecasting framework that

employs distributed computing in order to scale its internal components and overcome this

high data volume scenario. It also designs discretization components that allow for flexibility

in the framing of the forecasting questions. Furthermore, it devises a Geographically Global

Model (GGM) backed by an ensemble of Stochastic Gradient Boosted Trees, a collection of

Geographically Local Models (GLMs) backed by ARIMA models, and a non-linear blending

of those as part of its multistage machine learning pipeline in order to boost its performance

and stability.

The merit of the proposed research is evaluated in three experiments, each of which

com-prises millions of records, namely forecasting hourly taxi demand in the city of New York,

forecasting daily crime density in the city of Chicago, and forecasting hourly visits to places

of interest across Canada. The experimental results show the effectiveness of the proposed

Spatiotemporal Forecasting Frameworkin forecasting stable results across the three domains,

while also outperforming the naive baseline by at least 49.8% with respect to the SMAPE

residuals.

Keywords:Spatiotemporal, Forecasting, Big Data, Distributed Computing, Ensemble

Learn-ing, BlendLearn-ing, Local Learning

(3)