
Once geospatial trajectories have been collected, they can be processed and analysed to better understand people and their actions. This section presents several existing methods for processing raw geospatial trajectories to provide a foundation for understanding behaviour.

2.2.1 Reducing Uncertainty

Because trajectories are collected using different methods, each data point typically carries some amount of uncertainty. This uncertainty can be reduced through filtering, outlier detection, or tailored approaches such as map-matching, which uses known information about the environment to estimate the true location of the entity [Qiu et al., 2013; Zheng, 2015]. Typical filtering approaches used to smooth out noisy data include the Kalman filter [Cooper and Durrant-Whyte, 1994; Mohamed and Schwarz, 1999; Zheng and Zhou, 2011], the impulse response filter [Ge et al., 2000], the particle filter [Giremus et al., 2004; Wang et al., 2007], and moving average filters [Tsai et al., 2004].
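As an illustration of the simplest of these filters, the following is a minimal sketch of a centred moving average applied to a trajectory. It assumes points are given as planar (x, y) pairs; the function name `moving_average` and the `window` parameter are hypothetical and are not taken from the cited works.

```python
from typing import List, Tuple

# Assumption: points are (x, y) pairs in a planar coordinate system.
Point = Tuple[float, float]


def moving_average(points: List[Point], window: int = 3) -> List[Point]:
    """Smooth a trajectory with a centred moving average over `window` points."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(points)):
        # Shrink the window at either end so every point gets a smoothed value.
        lo = max(0, i - half)
        hi = min(len(points), i + half + 1)
        neighbours = points[lo:hi]
        mean_x = sum(p[0] for p in neighbours) / len(neighbours)
        mean_y = sum(p[1] for p in neighbours) / len(neighbours)
        smoothed.append((mean_x, mean_y))
    return smoothed
```

A larger window suppresses more noise but also blurs genuine movement, so in practice the window size would need to be chosen with the sampling rate of the data in mind.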

Map-matching techniques, which are most useful for vehicular trajectories, reduce uncertainty by using additional information about the world to determine the likely true location at which a trajectory point was recorded. This can be achieved by simply mapping the recorded point to the closest road [White et al., 2000], or by using more advanced filtering and estimation techniques (e.g. the Kalman filter mentioned earlier) [Goh et al., 2012; Ochieng et al., 2003; Pink and Hummel, 2008; Quddus et al., 2003].
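The closest-road form of map-matching can be sketched as a geometric projection. The example below assumes that roads are available as straight line segments in a planar coordinate system; the names `project_onto_segment` and `snap_to_nearest_road` are hypothetical, and the sketch is not the procedure of any cited work.

```python
import math
from typing import List, Tuple

# Assumption: planar coordinates, with each road given as a straight segment.
Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def project_onto_segment(p: Point, seg: Segment) -> Point:
    """Return the point on `seg` that is closest to `p`."""
    (ax, ay), (bx, by) = seg
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return (ax, ay)  # degenerate segment: both endpoints coincide
    # Parameter t locates the projection along the segment; clamp it to [0, 1]
    # so the result never falls beyond either endpoint.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_to_nearest_road(p: Point, roads: List[Segment]) -> Point:
    """Map a recorded point to the closest location on any road segment."""
    candidates = [project_onto_segment(p, seg) for seg in roads]
    return min(candidates, key=lambda q: math.hypot(q[0] - p[0], q[1] - p[1]))
```

Matching each point independently in this way ignores the order of the points, so a point near a junction may be snapped to the wrong road.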

2. Background and Related Work

2.2.2 Change-point Detection

Change-point detection can be applied to trajectories to identify the points at which significant change occurs, with the goal of partitioning the trajectory into subtrajectories. The criteria for selecting change-points vary with the goal of the process, but typically include monitoring for rapid changes in speed, acceleration, or direction. Subtrajectories segmented in this manner have been used for travel method identification, where the goal is to determine what transportation mode (e.g. walking, cycling, driving) was in use for different components of a journey [Liao et al., 2007b; Patterson et al., 2003; Zheng et al., 2008a,b].
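One simple criterion, a rapid change in speed, can be sketched as follows. The example assumes fixes of the form (x, y, t) in metres and seconds with strictly increasing timestamps; the threshold `max_speed_jump` and all function names are hypothetical, and this is only one of many possible criteria rather than the method of any cited work.

```python
import math
from typing import List, Tuple

# Assumption: each fix is (x, y, t) in metres, metres and seconds,
# and timestamps are strictly increasing.
Fix = Tuple[float, float, float]


def step_speeds(traj: List[Fix]) -> List[float]:
    """Speed over each step between consecutive fixes."""
    return [
        math.hypot(x1 - x0, y1 - y0) / (t1 - t0)
        for (x0, y0, t0), (x1, y1, t1) in zip(traj, traj[1:])
    ]


def change_points(traj: List[Fix], max_speed_jump: float = 2.0) -> List[int]:
    """Indices of fixes at which speed changes by more than `max_speed_jump`."""
    v = step_speeds(traj)
    # v[i] is the speed from fix i to fix i + 1, so a jump between
    # v[i - 1] and v[i] is attributed to fix i.
    return [i for i in range(1, len(v)) if abs(v[i] - v[i - 1]) > max_speed_jump]


def split_at(traj: List[Fix], points: List[int]) -> List[List[Fix]]:
    """Partition a trajectory into subtrajectories that share their change-points."""
    bounds = [0] + points + [len(traj) - 1]
    return [traj[a:b + 1] for a, b in zip(bounds, bounds[1:])]
```

For a journey that moves at 1 m/s for three steps and then at 10 m/s, the sketch reports a single change-point at the fix where the speed jumps, yielding one slow and one fast subtrajectory.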

2.2.3 Visit Extraction

Visits, also referred to as stops or stays, are periods of a trajectory where the entity is likely to have remained in a single location, for example a shop or house for trajectories associated with individuals [Ashbrook and Starner, 2003], or a parking garage or traffic queue for trajectories associated with vehicles [Yang et al., 2013]. The identification of these visits enables applications to reason about behaviour as a sequence of interactions with the environment [Andrienko et al., 2011; Ashbrook and Starner, 2002, 2003; Bamis and Savvides, 2011; Montoliu and Gatica-Perez, 2010]. After such interactions have been identified, we are left with a sequence of visits performed by the entity:

$$V = \{v^{(1)}, v^{(2)}, v^{(3)}, \ldots, v^{(n)}\}, \qquad v^{(i)} = (p^{(i)}, t^{(i)}, d^{(i)}) \tag{2.1}$$

where $V$ is the sequence of visits, with $v^{(i)}$ being an individual visit associated with a position, time and duration ($p^{(i)}$, $t^{(i)}$ and $d^{(i)}$ respectively). For some applications, the visits themselves can be ignored and only the periods of time between them are considered. This may be useful in applications such as exercise trackers, where stationary periods are not of interest.
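The visit tuple and the periods between visits can be sketched as a small data structure. This sketch assumes that the time of a visit denotes its start, which the definition above does not state; the class `Visit` and the function `periods_between` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Visit:
    position: Tuple[float, float]  # where the entity stayed
    time: float                    # assumption: the start time of the visit
    duration: float                # how long the entity stayed


def periods_between(visits: List[Visit]) -> List[Tuple[float, float]]:
    """Return (start, end) intervals separating consecutive visits."""
    ordered = sorted(visits, key=lambda v: v.time)
    # Each interval runs from the end of one visit to the start of the next.
    return [(a.time + a.duration, b.time) for a, b in zip(ordered, ordered[1:])]
```

For example, visits starting at times 0, 25 and 40 with durations 10, 5 and 20 leave two periods of movement, from 10 to 25 and from 30 to 40.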

tion conducted by Ashbrook and Starner [2002; 2003] into identifying locations meaningful to a user. From the collected data, Ashbrook and Starner observed that the data loggers used did not function well indoors, as a GPS signal was rarely available, and therefore treated periods of missing data as visits. This approach is limited in that it assumes that all missing data is caused by a visit, and that visits cannot occur while data is being collected. Indeed, the authors note that the data logging devices were prone to running out of battery power, which also caused a lack of data.

Building on this work, but assuming a constant flow of data even when indoors, algorithms have been proposed that aim to identify periods of low mobility from within trajectories. Relying on time and distance thresholds, such algorithms typically operate by identifying subtrajectories, or visits, whose spatial extent is smaller than a specified radius (or, sometimes, in which no two consecutive points are more than a specified distance apart) and whose duration exceeds some threshold [Andrienko et al., 2011, 2013; Hariharan and Toyama, 2004; Kang et al., 2004; Li et al., 2008; Zheng et al., 2010b, 2009; Zhou et al., 2014]. Montoliu and Gatica-Perez [2010] extend this technique by adding the constraint that the time between consecutive data points in the same visit must be bounded, with the aim of preventing periods of missing data from being contained within a visit: if data became unavailable at one time and became available again at a nearby coordinate some time later, it is not possible to state with certainty that the user remained stationary for the missing period. Another approach to visit extraction makes use of the speed or velocity of the user, where low speeds are considered indicative of a visit occurring [Lee et al., 2015; Palma et al., 2008].
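The threshold-based family of algorithms can be sketched as follows. This is a minimal illustration in the spirit of those approaches, not a reproduction of any single cited algorithm: it assumes fixes of the form (x, y, t) in a planar coordinate system, measures the radius from the first point of the candidate visit, and reports the centroid as the visit position. The function name and the parameters `radius`, `min_duration` and `max_gap` are hypothetical; `max_gap` illustrates a bound on the time between consecutive points.

```python
import math
from typing import List, Optional, Tuple

# Assumption: each fix is (x, y, t) in planar coordinates with increasing t.
Fix = Tuple[float, float, float]
# A visit is reported as (centroid position, start time, duration).
VisitTuple = Tuple[Tuple[float, float], float, float]


def extract_visits(traj: List[Fix], radius: float, min_duration: float,
                   max_gap: Optional[float] = None) -> List[VisitTuple]:
    """Find runs of points that stay within `radius` of their first point
    for at least `min_duration` (which should be positive)."""
    visits = []
    i, n = 0, len(traj)
    while i < n:
        j = i + 1
        while j < n:
            too_far = math.hypot(traj[j][0] - traj[i][0],
                                 traj[j][1] - traj[i][1]) > radius
            # Optional bound on the time between consecutive points, so that
            # a period of missing data cannot be hidden inside a visit.
            gap_too_long = (max_gap is not None
                            and traj[j][2] - traj[j - 1][2] > max_gap)
            if too_far or gap_too_long:
                break
            j += 1
        duration = traj[j - 1][2] - traj[i][2]
        if duration >= min_duration:
            run = traj[i:j]
            centroid = (sum(p[0] for p in run) / len(run),
                        sum(p[1] for p in run) / len(run))
            visits.append((centroid, traj[i][2], duration))
            i = j  # continue after the visit
        else:
            i += 1  # no visit starts here; try the next point
    return visits
```

Because the inner loop stops at the first point that violates a threshold, a single outlying point is enough to end a candidate visit.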

Although these techniques may overcome the issues caused by assuming that a loss of GPS signal is equivalent to a visit, they all suffer from a lack of resilience to noise. In the thresholding approach, a single noise point outside the visit radius will end a visit prematurely, and when considering velocity, it is likely that noise points will artificially increase the reported velocity of the user, thus also causing visits to be ended.

Aiming to overcome the drawbacks of existing approaches, by assuming noise in the dataset, Bamis and Savvides [2010] present the Spatio-Temporal Activity (STA) extraction algorithm. While the authors were specifically motivated by identifying activities that repeat in cycles through extraction and clustering, the first step of the algorithm, STA extraction, uses a definition of an activity that is identical to our definition of a visit, and thus performs visit extraction. The algorithm is similar to existing approaches in that it iterates over the trajectory points, but uses a weighted averaging filter over the spatial component to reduce the impact of noise before considering an activity to have ended. This technique, however, does have several drawbacks and assumptions relating to the data, for example, requiring evenly time-sliced data and a full data buffer before consideration of a visit can occur, consequently imposing a minimum bound on visit duration.

The topic of visit extraction is considered again later in Chapter 4, where an algorithm is proposed that aims to overcome the drawbacks of the approaches identified here. It is also considered in Chapter 5, where a novel approach to identifying land usage elements interacted with by users is presented, designed to replace traditional visit extraction for some domains.