Visual Analysis of Movement Behavior using Web Data for Context Enrichment


Robert Krueger∗ Dennis Thom∗ Thomas Ertl∗

Institute for Visualization and Interactive Systems (VIS), Universität Stuttgart, Germany

Figure 1: Our system for interactive visual analysis of semantically enriched movement data consists of two components: (top) the Geographic View, visualizing frequent destination clusters, routes, POIs (points of interest), and the degree of uncertainty by varying the color intensity of the icons; (bottom) the Temporal View, showing frequent daily temporal patterns. The most frequent patterns are shown on top. Color varies from green (frequent) to white (infrequent). The views are connected by brushing and linking.

ABSTRACT

With the increasing use of GPS devices, more and more location-based information is becoming accessible. Not only are more movements of people tracked, but POI (point of interest) information also becomes available in increasing geospatial density. To enable analysis of movement behavior, we present an approach to enrich trajectory data with semantic POI information and show how additional insights can be gained. Using a density-based clustering technique we extract 1,215 frequent destinations from ∼150,000 user movements in a large e-mobility database. We query available context information from Foursquare, a popular location-based social network, to enrich the destinations with semantic background. As GPS measurements can be noisy, often more than one possible destination is found, and movement patterns vary over time. Therefore we present highly interactive visualizations that enable an analyst to cope with the inherent geospatial and behavioral uncertainties.

Index Terms: Information Storage and Retrieval [H.3.3]: Information Search and Retrieval: Information filtering, Query formulation, Selection process; Database Applications [H.2.8]: Spatial databases and GIS; Information Interfaces and Presentation [H.5.2]: User Interfaces: GUI

∗ {forename.lastname}@vis.uni-stuttgart.de

1 INTRODUCTION

Nowadays, as data collection becomes easier and cheaper, more and more data is generated. For example, in the traffic domain there is an increasing use of GPS tracking devices to harvest location-based information. This information is used not only to support drivers but also for analysis purposes, for example in fleet management, engine analysis, or traffic simulation modeling. Mostly technical details are in scope, but there is more information in the data. To better understand urban dynamics, and to enable consumer acceptance analysis, there is an increasing interest in movement reasons, i.e. why people are moving, where they are going, for what purposes they use a product, and what they are doing at their destinations. However, such information cannot be accessed easily. For example, if a user stops at a specific point in the afternoon, there can be several reasons for this behavior, and the recorded GPS positions are often not enough to come up with an explanation. They need to be enriched with further information to support the data analyst in the sensemaking process. As location-enabled Web 2.0 data volumes are growing every day and more and more people are using location-based social media services to give information about POIs, this can be a suitable source for the necessary context data. Foursquare, for example, offers a web service providing detailed geolocated information about POIs, called venues, such as restaurants, bars, business locations, universities, sport parks, and public transport stations. Although mainly used for user-centered 'What is here' navigation applications, such services can also be used to enrich geospatial data with context information. In the case of movement behavior analysis, one can enrich the movement data with venue information and investigate the destinations of movements, i.e. which destinations the users visited, and thereby gain more insight into movement reasons.

However, there are several challenges that hinder a straightforward data enrichment. First, GPS measurements are often not exact, and more inaccuracies can occur while broadcasting and logging this information. Thus data can become noisy. Second, location information is dense in some areas and it is not clear which venues were visited. While mobile phones stay with the user, vehicles have to be parked somewhere and are often not moved all the way to the actual destination. Moreover, a high number of venues are not recorded by POI services, and thus many map locations cannot be resolved. In this paper we address these challenges by applying a stepwise data processing approach (see Figure 2). First, a clustering algorithm is used to detect frequently visited destinations and remove noise (a). Second, the frequently visited destinations are enriched with POI information from a web service (b). Finally, an analyst can examine movement behaviors in an interactive visual loop (c) and interpret the results (d).

Our techniques were developed and evaluated based on an electric mobility dataset generated in a long-term field study by a large German power company. Electric scooters, particularly suited for urban use, were given to 500 interested users, and their usage of the vehicles was recorded over the course of one year using on-board GPS devices.

While semantic enrichment of such trajectories with other data has recently gained more and more interest and important research has been done in this area, the process and decision of venue assignment and the handling of uncertainty in space and time have, to the best of our knowledge, not been addressed so far. Therefore, the primary contributions of this paper lie in the semantic enrichment of movements based on Web 2.0 data and the design of highly interactive visualizations that allow an insightful handling of both geospatial uncertainties resulting from location inaccuracy and temporal uncertainties resulting from changing behavior patterns.

The rest of this paper is structured as follows. In the next section we discuss related work in the field of movement data analysis and semantic enrichment as well as uncertainty visualization. Preprocessing and context enrichment of the data, i.e. the steps of clustering and semantic enrichment, are discussed in Section 3. In Section 4 we address uncertainty handling and visualization. The interactive analysis is described in Section 5. In Section 6 we demonstrate the applicability of our methods based on two case studies and give a conclusion and outlook in Section 7.

2 RELATED WORK

This paper presents an approach to analyze movements using semantic context information and visualizes uncertainties in geographical and temporal space. Thus, we first discuss related work in the field of visual movement/trajectory analysis in general (section 2.1), before focusing on semantic enrichment. While some works present techniques, applications and results to directly analyze data from location-based web services (section 2.2), only a few use such services to enrich movement data from different sources (section 2.3). Finally, we discuss related work in the more general field of spatial and temporal uncertainty visualization (section 2.4).

Figure 2: Analysis Pipeline: a) Extract frequent destinations, b) Data enrichment using a POI Service, c) Interactive Visual Analysis of POI results and movement patterns, d) Interpretation

2.1 Movement Analysis and Semantic Enrichment

Movement analysis is a well-established research area in the field of Visual Analytics (VA). In recent years many techniques and applications have been proposed. Andrienko et al. [6], [4], [3] and Schreck et al. [30] present approaches for automated trajectory aggregation and visualization. Hurter et al. [23] and Krueger et al. [26] investigate exploratory means using focus+context techniques, such as lenses, to visually group and filter trajectories. A visual interactive system for traffic congestion analysis was presented by Wang et al. [35]. While most of these works address macroscopic challenges, some approaches focus on micro pattern analysis. With TripVista, Guo et al. [21] presented a system to analyze trajectory data from crossroads, i.e. turns, speed and temporal behavior. While the research is mostly about the trajectory data itself (e.g. form, speed, direction), some methods have been proposed to enrich trajectory data in order to investigate movement reasons. The work of Gonzales et al. [18] suggests techniques to understand basic laws of human motion from individual human mobility patterns. They use a large dataset of 100,000 mobile phone users, calculate individual travel patterns and aggregate them into a single spatial probability distribution. Andrienko et al. [8] also examine and visualize locations from mobile phones based on call frequencies and guess whether the locations are work or home locations.

Alvares et al. [2] propose a generic data preprocessing model for semantic trajectory enrichment that first transforms trajectories into stops and movements and then adds geographic information. They also provide a brief survey of research in the field of semantic trajectory data models. Brakatsoulas et al. [10] formulate a data model to store and query trajectory semantics, such as intersections, distances, times, speed, and covered areas, for traffic data. A semantic enrichment for animal movement trajectories was presented by Spaccapietra et al. [32]. They propose modeling approaches that add to the data behavioral semantics (e.g. feeding), but also weather information such as wind, temperature, and sky conditions, and area conditions like mountains, water extents, or deserts.

2.2 Social Media Movement Analysis

With the increasing use of social media, more and more geo-located information is collected and often becomes available via web services. While there is a lot of research on location-based web data, this discussion focuses on POI and trip analysis.

Mashups [33] is a system to visualize lifelogs from Foursquare, Facebook, etc. on an interactive map. Similarly, Preotiuc-Pietro et al. [29] analyze the behavior of Foursquare users. In contrast to our work, Foursquare venues are not used as context to enrich other movement data, but to directly analyze the behavior of Foursquare users. They also use aggregation techniques to investigate periodic behavior of the users. Kling and Pozdnoukhov [24] investigate movement patterns from Foursquare and geo-located Twitter messages to extract semantics behind activities. Movements are visualized in several ways using streamgraphs, geographic heatmaps and word clouds. Fujsaka et al. [17] analyze movements of users based on geolocated Tweets and apply cluster analysis on frequent POIs. In contrast to our work, they focus on the Web 2.0 users and do not use the data as context to enrich trajectory data.

2.3 Semantic Enrichment Using Web Data

There is already some related work in which trajectories are semantically enriched with POI information. Parent et al. [28] present concepts for (1) the construction of trajectories from plain movement data, (2) semantic enrichment to enable desired interpretations of movements, and (3) the application of data mining techniques to extract knowledge, i.e. behavior patterns of movements. Andrienko et al. [7] focus on privacy-preserving movement analysis. They enrich trajectory data from mobile phones with information from a POI database and present results in a semantic space by visualizing flows between activity types. Semantic annotation based on a POI service was discussed by Guc et al. [20]. They propose a conceptual annotation model for semantically enriched trajectories and present the components and architecture of a graphical user interface. However, trajectory enrichment is only done based on user knowledge and none of the discussed services are applied. In another work, Andrienko et al. [5] suggest interpreting the meanings of personal POIs based on cyclic temporal patterns of visit times. Krueger et al. [25] show work in progress in which movement trajectories are enriched with context from location-based Twitter messages. These are visualized using geospatial tag clouds. In the context of Nokia's Mobile Data Challenge, Hoeferlin et al. [22] present a visual system for explorative analysis of user behavior from a mobile phone dataset. Although they mention querying POI services for context information, it is not the main focus of their work, and they do not address how those results are processed and integrated into an interactive analysis loop and into the visualization.

2.4 Uncertainty Visualization in Space and Time

Uncertainty visualization is a well-established research field. A good overview of existing techniques has been given by Griethe et al. [19]. Furthermore, MacEachren et al. [27] review progress in geographic uncertainty research towards visual tools and show frameworks suggesting different visual methods for uncertainty representation depending on data properties such as type and quality. Correa et al. [14] present a framework to support uncertainty analysis in the VA process along with some information visualizations for uncertain data. A visualization of uncertainty in lattices to support decision making is presented by Collins et al. [12]. They use a hybrid layout with varying color, size and transparency.

To the best of our knowledge, we are the first to enrich large movement data with Web 2.0 content in an automated way and to allow exploratory reasoning about the data using interactive visualizations. Furthermore, none of the research mentioned in section 2.3 uses visualization techniques to show the uncertainties resulting from context enrichment.

3 PREPROCESSING: SEMANTIC ENRICHMENT

Preprocessing itself is an important step in information visualization, but it often receives little attention. In recorded movement data, however, inaccuracies in space and time, ambiguities in movement and inconsistent measurements are part of the core problem, especially when data volumes are huge, and make preprocessing necessary. We use a dataset generated by 527 GPS-enabled vehicles as a basis for the development and evaluation of our methods. These vehicles were tracked every 30 seconds for two years, leading to an overall amount of about 150,000 trips and more than 8,200,000 measurements. By visualizing the trip destinations we found that most destinations build dense clusters over time, surrounded by some noise, as shown in Figure 3. Noise can occur for several reasons: First, there are destinations visited only a few times, so that no clear cluster appears. Second, there are measuring inaccuracies due to GPS signal interference, deficient receiver performance, data logging, or broadcasting issues. We choose to accept the information loss of some less dense destinations and focus on frequently visited ones. Later approaches could of course also focus on poorly attended areas to indicate outliers.

3.1 Determine Destination Clusters

Figure 3: Popular destinations build dense clusters (red) surrounded by noise (white). Map details are blurred for privacy reasons.

In Figure 3 it can be seen that the density in destination clusters is mostly similar and that there are clear borders between clusters and whitespace. For automated destination cluster detection we apply DBSCAN [16], a density-based clustering technique, with a geospatial distance measure. The advantages of DBSCAN compared to partition-based approaches like k-means are that (1) we do not need a-priori knowledge about the number of clusters, (2) DBSCAN implicitly filters out noise, and (3) it is able to detect clusters of varying shape (elongated, etc.). As a result we detect 1,215 clusters containing 105,808 of the 150,000 total trip endpoints. Thus, about 70% of the endpoints lie at places that have been visited often by a single user or at least once by several users. We assume that these will frequently be places of employment, shopping centers, homes, and service stations.
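The clustering step can be illustrated with a minimal, unoptimized sketch of DBSCAN using the haversine great-circle distance as the geospatial distance measure. The neighborhood radius `eps_m`, the density threshold `min_pts`, and the sample coordinates are hypothetical values chosen for illustration; the text does not state which parameters or which distance measure were actually used.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def region_query(points, i, eps_m):
    """Indices of all points within eps_m of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if haversine_m(points[i], q) <= eps_m]


def dbscan(points, eps_m, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps_m)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps_m)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(j_neighbors)
    return labels


# Hypothetical trip endpoints: two dense spots and one isolated endpoint.
endpoints = [
    (48.77580, 9.18290), (48.77585, 9.18295), (48.77578, 9.18288),
    (48.78300, 9.19000), (48.78305, 9.19004), (48.78298, 9.18997),
    (48.80000, 9.25000),
]
labels = dbscan(endpoints, eps_m=30.0, min_pts=3)
print(labels)
```

The pairwise distance scan is quadratic in the number of endpoints; for a dataset of the size described above, a spatial index would be the more realistic choice.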

3.2 Context Enrichment and Destination Assignment

To generate up-to-date context information about the frequently visited areas (clusters) we employ Foursquare as an additional data source. Foursquare is a location-based social networking service with dense information about POIs in city areas. Venue information from Foursquare is often quite recent, as the data is regularly updated by users, who can always add, delete, or update POIs (called venues in Foursquare). Furthermore, it delivers additional information such as the number of users that visited a venue and the number of individual check-ins from users, which can help to determine the recent prominence of a location. While there are other providers offering similar services, like Google, Microsoft and Facebook, the Foursquare API has fewer query limitations and the second highest POI density. The lower query restrictions were also an important factor in our decision, as they allow us to compute the data enrichment online (at runtime), enabling more interactive requests.

Table 1 compares three major POI APIs in terms of the overall number of POIs worldwide, the allowed data requests per minute and per day, caching policies, and the quality of the data structures.


Table 1: Approximated POI API limits and facts (taken from Constine [13], the Facebook API documentation [1] and public forum discussions).

Criteria          Google        Facebook        Foursquare
Number of POIs    50,000,000    missing         30,000,000
Requests/min      1,000         3,600           5,000
Requests/day      100,000       1,000,000       1,200,000
Caching           forbidden     lacking info    30 days
Structure         fair          fair            good

When querying POI services there are several parameters that influence search results. The most important is the geographic query radius for a given latitude/longitude pair. For the e-mobility dataset we determined a radius of 50 meters around the cluster centers to be a good range, as users often park their vehicles not directly at a venue but within short walking distance. For some clusters we did not get a venue. These clusters are either private destinations (e.g. home, residence of a friend), charging stations, or unknown. Figure 4 shows frequent destination clusters with different assigned destination types (venue, private, charging).

Charging Station: As our test dataset is about electric vehicles, we add another data source containing exact locations of power charging stations in that area. Each cluster with a distance ≤ 50 meters to such a station is handled as a charging cluster.

Home Location: Every scooter is assigned a home location based on the destination that is most frequently visited by the user. However, some scooters were used for special testing purposes and are more frequently parked at various special locations (e.g. repair and assembly shop). Therefore we apply cleaning measures to remove vehicles that are missing a clearly identifiable home location.
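The destination type assignment described above can be sketched as follows. This is an illustrative sketch, not the original implementation: the precedence of the checks (charging, then home, then venue), the `min_share` threshold used to decide whether a scooter has a clearly identifiable home, and all coordinates are assumptions. The `venues` list stands in for the result of a POI service query; no real API call is made.

```python
import math
from collections import Counter


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6_371_000.0 * math.asin(math.sqrt(a))


def home_cluster(trip_destinations, min_share=0.3):
    """Most frequently visited cluster id of one scooter.

    Returns None when no destination reaches min_share of all trips, i.e. when
    the scooter has no clearly identifiable home (min_share is hypothetical).
    """
    counts = Counter(trip_destinations)
    if not counts:
        return None
    cluster_id, n = counts.most_common(1)[0]
    return cluster_id if n / len(trip_destinations) >= min_share else None


def classify_cluster(center, is_home, charging_stations, venues, radius_m=50.0):
    """Assign a destination type to one cluster center (assumed check order)."""
    if any(haversine_m(center, s) <= radius_m for s in charging_stations):
        return "charging"
    if is_home:
        return "home"
    if any(haversine_m(center, v) <= radius_m for v in venues):
        return "venue"
    return "unknown"


# Hypothetical toy data: cluster centers, station and venue positions, trips.
centers = {1: (48.7758, 9.1829), 2: (48.7830, 9.1900),
           3: (48.7900, 9.2000), 4: (48.8000, 9.2500)}
charging_stations = [(48.78302, 9.19003)]
venues = [(48.79010, 9.20005)]
trips_per_scooter = {"s1": [1, 1, 1, 2, 3], "s2": [4, 2, 3, 1]}

home_ids = {home_cluster(t) for t in trips_per_scooter.values()} - {None}
cluster_types = {cid: classify_cluster(c, cid in home_ids, charging_stations, venues)
                 for cid, c in centers.items()}
print(cluster_types)
```

In this toy data, scooter `s2` visits every destination equally often and therefore gets no home location, which mirrors the cleaning step for test vehicles.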

Figure 4: Found clusters: venues nearby (red), home (green), charging stations (yellow). Map details are removed for privacy reasons.

Overall we were able to enrich 75% of the data (clusters). Of the 1,215 clusters, we identified 266 as home locations and 27 as electric scooter charging stations, and 607 clusters could be connected to venues via Foursquare. For some of these latter clusters in urban areas we got up to 42 possible venues, while clusters in more remote areas often had just one identifiable venue. Figure 5 illustrates the distribution of venues per cluster.

4 INTERACTIVE VISUAL MOVEMENT REASONING

Figure 5: Found venues per cluster, using a 50 meter query radius. For 607 of the 1,215 clusters venues were found.

After preprocessing (i.e. clustering and context enrichment, described in section 3) the analyst can start reasoning using an interactive system. As we want to characterize movements, we have to deal with the spatial and temporal dimensions at the same time. We therefore developed two components and integrated them in our movement analysis system: a geographic map view and a temporal view. The geographic map view shows routes and destinations for identified POIs, while the temporal view shows frequent temporal behavior patterns. In both components (Figure 1) the focus is on dealing with uncertainties.

4.1 Geographic Component

The map component shows frequent destinations (clusters) and corresponding context information and lets the analyst explore and investigate routes and destinations in an interactive fashion (see Figure 1). The context information, i.e. the venues from Foursquare, is categorized into three hierarchical levels. Each level aggregates its child categories into more comprehensive ones. The highest level contains 9 overview categories, as shown in Figure 6. Every venue belongs to one of these categories. For example, Ann's Blue Jeans Shop belongs to the subsubcategory Boutique, the subcategory Clothing Store and the category Shops and Services. Simply visualizing the most detailed venue information (420 different venue types), however, hampers fluent analysis and renders fast situation assessment impossible. Thus, we first show the main categories (Figure 6, left), which can then be interactively explored by changing the current category level to see more or fewer details. In doing so we follow the interactive visualization mantra "Overview first, then zoom and filter, details on demand" [31].

Figure 6: Foursquare category tree: each level summarizes its child nodes in parent categories. Leaf nodes are highly detailed venue types, such as Post Office.

However, only in a few cases can the category of a cluster be clearly determined. Often more than one venue is found (see Figure 5), and often these venues belong to different categories, so it is uncertain which venue was actually visited. This uncertainty has to be addressed and visualized. For example, if 30 of 40 found venues for a cluster are restaurants, there is a good chance that the area is a food district, and we can thus be quite confident that the scooter user went there to have a meal. Following the Foursquare category structure we define three degrees of detail for each category: main category (level 1), subcategory (level 2) and subsubcategory (level 3), while level 0 holds all found venues. For each level $l$ we can now determine the most likely category by comparing the number of venues of direct siblings (i.e. categories at the same level having the same parent category). Depending on the current detail level, the certainty $c$, $0 \le c \le 1$ ($c = 1$ is certain), of the determined category $cat$ at level $l$ is calculated as the ratio of the number of venues found in this category to the overall number of venues found for the destination cluster.

$$c_{cat_l} = \frac{\mathrm{count}(cat_l)}{\mathrm{count}(cat_0)} \qquad (1)$$
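As a worked illustration of Equation (1), the following sketch selects the most likely category per level by comparing direct siblings and divides its venue count by the number of all venues found for the cluster. The function name and the sample category paths are hypothetical and only loosely modeled on the Foursquare category tree; ties between siblings are resolved by first occurrence, which is an assumption.

```python
from collections import Counter


def most_likely_categories(venue_paths):
    """Return [(category, certainty)] for levels 1..3 of one destination cluster.

    venue_paths holds one (main, sub, subsub) tuple per venue found for the
    cluster. At each level only siblings under the category chosen one level
    up are compared; certainty is count(cat_l) / count(cat_0).
    """
    total = len(venue_paths)  # count(cat_0): all venues found for the cluster
    if total == 0:
        return []
    result = []
    candidates = list(venue_paths)
    for level in range(3):
        counts = Counter(path[level] for path in candidates)
        category, n = counts.most_common(1)[0]
        result.append((category, n / total))
        candidates = [p for p in candidates if p[level] == category]
    return result


# Hypothetical cluster with 40 venues, 30 of which are food related.
venue_paths = (
    [("Food", "Restaurant", "Italian Restaurant")] * 12
    + [("Food", "Restaurant", "Australian Restaurant")] * 8
    + [("Food", "Cafe", "Coffee Shop")] * 10
    + [("Shops and Services", "Clothing Store", "Boutique")] * 10
)
levels = most_likely_categories(venue_paths)
for level, (category, certainty) in enumerate(levels, start=1):
    print(level, category, round(certainty, 2))
```

For this sample the certainty drops from 30/40 at level 1 to 20/40 at level 2 and 12/40 at level 3, matching the observation that more detailed categories are less certain.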

Thus, with a higher level of detail, categories usually become more and more uncertain, since only a small number of the found venues belong to exactly that category, e.g. Australian Restaurant, while with lower detail it becomes more certain that most venues belong to a main category such as Food. Using the proposed calculations we can now visualize venue categories for each cluster at the current level by showing the cluster points and the corresponding category icons. Uncertainty is visualized using a gradient color mapping for the icons that ranges from blue (certainty 1) to white (certainty 0). To investigate categories and venues the analyst can place the mouse over an icon and drill down to lower category levels to get more specific information (Figure 7).
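A gradient color mapping of this kind can be sketched as a linear interpolation in RGB space. The exact endpoint colors and the choice of linear RGB interpolation are assumptions; the text only states that the scale runs from blue (certainty 1) to white (certainty 0).

```python
def certainty_color(c, low=(255, 255, 255), high=(0, 0, 255)):
    """Linearly interpolate an RGB color between low (c = 0) and high (c = 1)."""
    c = min(1.0, max(0.0, c))  # clamp to the valid certainty range
    return tuple(round(lo + c * (hi - lo)) for lo, hi in zip(low, high))


print(certainty_color(0.0))   # white: completely uncertain
print(certainty_color(0.75))  # fairly saturated blue
print(certainty_color(1.0))   # blue: certain
```

Passing a green `high` color would give the green-to-white frequency scale used in the temporal view.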

Figure 7: Found venues for a cluster in the inner city area, where venues are dense. From left to right: most likely main category, subcategory, subsubcategory with venue name. With higher detail certainty is decreasing.

4.1.1 Reducing Visual Clutter

While in outer city areas and suburbs distances between clusters are usually quite high for our e-mobility dataset, in the downtown area clusters can be much closer to each other, causing heavy visual clutter when an icon is to be displayed for each of them. Several grouping techniques have been proposed to allow aggregation of icons, and these are now quite common in interactive map visualizations. In our case, we group nearby icons together to form representative icons showing the number of aggregated POIs. To determine the certainty of the group icons we accumulate the certainties of the included venues and compute their average. As this is done based on pixel distance, it can easily be recomputed for all geographic zoom levels of the interactive map.

Figure 8 shows the map with detail view (top) and aggregated view (bottom).
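The icon aggregation can be illustrated with a simple greedy grouping in screen space. The grouping strategy (each icon joins the first group whose centroid lies within a pixel threshold), the threshold value, and the sample icons are assumptions, since the text does not name the grouping technique; only the averaging of certainties and the icon count per group follow the description above.

```python
import math


def group_icons(icons, max_dist_px):
    """Greedily group icons given as (x_px, y_px, certainty) tuples.

    Each icon joins the first group whose centroid is within max_dist_px,
    otherwise it starts a new group. Every group reports its centroid, the
    number of aggregated icons and the average certainty of its members.
    """
    groups = []
    for x, y, c in icons:
        for g in groups:
            if math.hypot(x - g["x"], y - g["y"]) <= max_dist_px:
                g["members"].append((x, y, c))
                n = len(g["members"])
                g["x"] = sum(m[0] for m in g["members"]) / n
                g["y"] = sum(m[1] for m in g["members"]) / n
                break
        else:  # no group was close enough
            groups.append({"x": x, "y": y, "members": [(x, y, c)]})
    return [
        {"x": g["x"], "y": g["y"], "count": len(g["members"]),
         "certainty": sum(m[2] for m in g["members"]) / len(g["members"])}
        for g in groups
    ]


# Hypothetical icon positions in pixels at the current zoom level.
icons = [(100, 100, 0.8), (110, 105, 0.4), (400, 300, 1.0)]
grouped = group_icons(icons, max_dist_px=32)
for g in grouped:
    print(g["count"], round(g["certainty"], 2))
```

Because the input is in pixel coordinates, calling the function again after projecting the clusters at a new zoom level is enough to update the grouping.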

4.1.2 Filtering

To investigate movements the analyst can either apply an area-based filter or filter the tracks on a per-scooter basis. Filtering by area is done by moving a lens magnifier tool over the map [26]. The lens filter includes all movements of scooters that have their home location in the lens area (see Figure 9). Only the destination clusters visited with these scooters are then visualized. In addition, the scooter tracks are shown on the map with a certain degree of opacity, making frequent routes appear with higher intensity. To filter for individual scooters, the analyst can select their home location by clicking on the corresponding Residence icon.

Figure 8: Map snippet showing clusters and found venues at category level 1. Top: Ungrouped icons may give better hints but cause clutter; Bottom: Aggregated icons avoid clutter. Map details are removed in the image for privacy reasons.

Figure 9: Area-based analysis. The map snippet shows the scooter home location and frequent destinations like shops and offices. The area was selected using a lens magnifier. The map is hidden for privacy reasons.

In urban areas often more than one venue is found for a given destination cluster, and our automated approach discussed before is used to select the most likely venue. However, sometimes the analyst may disagree with the system's choice, and it should thus be possible to integrate her knowledge. For example, the analyst may recognize that a cluster has been formed on the large parking area of a warehouse store, which also happens to house a small hot-dog stand. Because of the spatial proximity on the parking area, the system may have falsely suggested that most scooter users within the cluster went there for the hot-dog stand. Therefore the analyst should be allowed to change the icon and category on the current category level. Figure 10 shows that Shops and Services is preselected, since its certainty is the highest of all categories at that level. However, with expert knowledge about a certain area, the analyst may be aware that this destination was most likely a restaurant and switch to that category. The same procedure can be applied on each category level. For example, she may stay with the Shops and Services category but on a lower level select the hairdresser category instead of the preselected bank.

Figure 10: The analyst can integrate expert knowledge and change preselected categories on each level of detail.
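The area-based lens filter of this section can be sketched as a two-step selection: first the scooters whose home location lies inside the lens, then the destination clusters visited by those scooters. Modeling the lens as a geographic circle with a radius in meters, as well as the data layout and sample values, are assumptions made for illustration; the interactive lens may well operate in screen space.

```python
import math


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6_371_000.0 * math.asin(math.sqrt(a))


def lens_filter(homes, trips, lens_center, lens_radius_m):
    """Select scooters living inside the lens and count their destinations.

    homes: {scooter_id: (lat, lon)} home locations.
    trips: list of (scooter_id, destination_cluster_id) pairs.
    Returns (selected scooter ids, {cluster_id: number of visits}).
    """
    selected = {s for s, home in homes.items()
                if haversine_m(home, lens_center) <= lens_radius_m}
    visits = {}
    for scooter_id, cluster_id in trips:
        if scooter_id in selected:
            visits[cluster_id] = visits.get(cluster_id, 0) + 1
    return selected, visits


# Hypothetical homes and trips.
homes = {"s1": (48.7760, 9.1830), "s2": (48.8500, 9.3000)}
trips = [("s1", 7), ("s1", 7), ("s1", 9), ("s2", 3)]
selected, visits = lens_filter(homes, trips, lens_center=(48.7758, 9.1829),
                               lens_radius_m=500.0)
print(selected, visits)
```

The visit counts could drive the opacity of the drawn tracks, so that frequent routes appear with higher intensity.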

4.2 Temporal Component

To allow detailed analysis of frequent usage behavior of the scooters we also provide a temporal analysis component (lower half of Figure 1). If one or more scooters are selected on the map, for example to investigate the typical usage behavior within a specific suburb, frequently visited POIs are highlighted within a linear temporal view. The view represents cyclically repeating venue occurrences by aggregating them based on a definable timeframe, e.g. daily, weekly or restricted to weekends. For example, if the analyst is interested in daily behavior, the visualization subdivides the day into 24 sections. For each hour of the day, the most frequently visited POI is displayed on top, while less frequent ones are stacked below in descending order of frequency. This way the analyst can always identify the most probable location for a given hour of the day at the top. This aggregation is restricted to POIs that have been visited for a duration not exceeding 12 hours, in order to remove periods in which the scooter was not used for a long time. In addition to the vertical order we also map the frequency to a color scheme ranging from green (most frequent) to white (least frequent). Depending on the current level of detail (see section 4.1) the venues are aggregated according to their respective categories. Thus the temporal visualization unfolds on higher detail levels (i.e. shows more distinct destinations) and folds up when lower detail levels like the primary categories are selected.
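The daily aggregation can be sketched as a per-hour frequency count followed by a descending sort. How a stay is mapped to hours is not specified in the text; the sketch assumes that a stay counts once for every hour of the day it spans, with integer arrival hours and durations. The function name and the sample visits are hypothetical.

```python
from collections import Counter

MAX_STAY_H = 12  # longer stays are treated as idle periods and dropped


def hourly_ranking(visits):
    """Rank visited categories per hour of the day.

    visits: list of (category, arrival_hour, duration_hours) with integer hours.
    Returns 24 lists; entry h holds (category, count) pairs sorted by descending
    count, i.e. the stacking order of the temporal view for hour h.
    """
    counts = [Counter() for _ in range(24)]
    for category, arrival_hour, duration_h in visits:
        if duration_h > MAX_STAY_H:
            continue
        # Assumption: the stay occupies every hour from arrival until departure.
        for offset in range(max(1, duration_h)):
            counts[(arrival_hour + offset) % 24][category] += 1
    return [c.most_common() for c in counts]


# Hypothetical stays of the selected scooters.
visits = [
    ("Home", 20, 10),
    ("University", 9, 4),
    ("University", 10, 3),
    ("Professionals", 10, 1),
    ("Shops and Services", 17, 1),
    ("Home", 18, 30),  # exceeds 12 hours and is ignored
]
ranking = hourly_ranking(visits)
print(ranking[10])
print(ranking[22])
```

Running the same function on category names of a different detail level yields the folded or unfolded variants of the view.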

The map view and temporal view are interactively linked. When the analyst investigates the destinations in the temporal view, details of the aggregated venues (type, number, frequency) are shown and all corresponding destinations are consistently highlighted in the map, revealing their geographic position. Changes in the map view (e.g. selecting a more appropriate POI for a given destination) are also directly reflected in the temporal view.

The chosen visualization for temporal analysis has many similarities to common Streamgraph and Sankey diagram variations (see [34], [15], [9], [11]), where the categories are shown as continuous stacked sorted ribbons. Although it might be easier to follow the temporal changes for a given category in these visualizations, we experienced that they also lead to heavy clutter due to frequently crossing ribbons when the category order changes several times. Because it is more important for our domain to understand the probability order of POIs for a given time of the day than the temporal changes that a POI experiences, we chose to cut the ribbon for each segment and indicate the category by a corresponding symbol at this point.

5 CASE STUDY

To demonstrate the applicability of our methods we apply our approach to our evaluation dataset. In our first case study we want to investigate consumer acceptance in distinct city suburbs. The second study centers on individual product usage.

5.1 Motivation

To produce the dataset, the 500 study participants signed a contract stating that the complete usage of the scooters would be tracked and analyzed. While several vehicle-specific statistics were also recorded and evaluated, many questions remained unanswered, such as whether e-mobility was able to find its way into everyday life. Thus, in the first case study we want to investigate for what purposes the products (scooters) were used and when, e.g. as a daily means of transportation or for leisure. Before analyzing such highly aggregated data, it might also be important to focus on the usage of individual products, where venues can be validated on a more detailed and precise level. This is shown in the second study. Although users agreed to the analysis of their movements, this study still holds a high risk of violating their privacy. Therefore we anonymized the following results as well as possible.

5.2 Residence-Dependent Consumer Acceptance

To compare consumer acceptance in different city areas we use the lens magnifier and place it over a specific suburb. The lens is adjustable in size so that it can be enlarged or shrunk until it covers the area to be analyzed. In this case we select an area about 20 kilometers outside of the city. We want to investigate the range of uses that people have for their scooters. Besides some technical statistics for the current selection (e.g. average distance, number of trips and battery drainage) we also see the tracks of the users and the corresponding venues. Figure 11 shows that the main destinations fall into the categories Professionals, Shops and Services, and University. This reveals that these scooters were probably employed as a daily means of transportation, e.g. for work, shopping and education. The thickness and intensity of the highlighted routes also give a general idea of how frequently certain POIs were visited. The certainty coloring of the icons reveals that it is fairly certain that the selected users employ their scooters to head for the city's university, while office locations are visited less frequently. For more details, symbols can be examined by mouse-hover to show a tooltip with more information, and the level of detail can be changed, as described in Section 4.1.2. To investigate the discovered locations we can check other venues that were found for a destination by clicking on the icon. This way we discover that there is also a restaurant at the offices, but it is less frequently visited by Foursquare users and thus also less likely to be a frequent destination for the scooter users.

Figure 11: Map snippet. All users living in the filtered suburb are selected. Routes show movements. Icons show frequently visited POIs.

To investigate the periodic frequencies and temporal behavior we continue our analysis using the temporal view. It shows that for the selected scooters the home location covers most of the time, indicating that the scooters are not used every day (see Figure 12). Earlier investigations showed that the vehicles were rarely used during the winter months. One can also see that the university and business areas were primarily visited during office hours. In the afternoon other venues are also visited frequently, such as those in the category Outdoors. The users apparently also use their scooters to go shopping and for dinner. By mouse-hovering over the venue bars in the temporal view, a tooltip provides more detail about the venues, e.g. how often they were visited at this time of day during the year. We can now switch to a different temporal template, showing only weekend days. This reveals that shopping is a frequent activity for which the scooters were used during leisure time.

Figure 12: Temporal component showing frequent temporal patterns per hour on a daily basis. At night the selected scooters are usually at home, while during the day they are used to get to work and education. In the early evening shopping becomes more frequent.
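The two analysis steps of this case study, selecting scooters with the lens and aggregating their visits per hour of day, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the presented system: the circular lens test on home locations, the record layout (scooter id, timestamp, venue category), all function names and the sample coordinates and visits are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def select_by_lens(homes, center, radius_km):
    """Return the ids of scooters whose home location lies inside a circular lens."""
    return {
        scooter_id
        for scooter_id, (lat, lon) in homes.items()
        if haversine_km(lat, lon, center[0], center[1]) <= radius_km
    }


def hourly_pattern(visits, selected, weekend_only=False):
    """Count visits per hour of day and venue category for the selected scooters.

    visits: iterable of (scooter_id, datetime, category) tuples.
    weekend_only: mimics a weekend template by keeping Saturdays and Sundays only.
    """
    counts = defaultdict(Counter)
    for scooter_id, timestamp, category in visits:
        if scooter_id not in selected:
            continue
        if weekend_only and timestamp.weekday() < 5:  # Monday=0 ... Friday=4
            continue
        counts[timestamp.hour][category] += 1
    return counts


# Hypothetical sample data (assumption, not taken from the evaluation dataset).
homes = {"s1": (48.70, 9.00), "s2": (48.78, 9.18)}
visits = [
    ("s1", datetime(2013, 6, 3, 9, 15), "University"),          # a Monday
    ("s1", datetime(2013, 6, 4, 9, 40), "University"),          # a Tuesday
    ("s1", datetime(2013, 6, 8, 17, 5), "Shops and Services"),  # a Saturday
    ("s2", datetime(2013, 6, 3, 9, 0), "Professionals"),
]

selected = select_by_lens(homes, center=(48.70, 9.00), radius_km=3.0)
pattern = hourly_pattern(visits, selected)
weekend = hourly_pattern(visits, selected, weekend_only=True)
```

Enlarging or shrinking the lens corresponds to changing `radius_km`, and switching the temporal template corresponds to the `weekend_only` flag. Only the per-hour counts are computed here; how they are mapped to bars and colors in the temporal view is left out.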

5.3 Individual scooter usage

In our second use case we focus on individual product usage. We start by selecting a scooter home location, which highlights the corresponding routes, visited POIs and typical behavior, as can be seen in Figure 13. The map shows that the scooter is likely used to visit the shopping sites Penny Markt and ALDI, two supermarkets, and also Bauhaus, a hardware store. While users can charge their scooters at home, this scooter was also charged at a public charging station. Further analysis suggests that the scooter was employed to drive to the hospital, for dining (McDonalds) and for shopping at Stylecode. However, more sites are found in the vicinity of these POIs, all showing a lower certainty. There is also a business location in the northwest of the map, but with expert knowledge we are able to identify this location as a repair and assembly shop for the scooters. Examining the place Grillwagen (hot-dog stand) reveals that a different venue is also available here, another hardware store (Figure 13). We thus zoom into the map and indeed see a large OBI sign (OBI is a large German hardware store chain) on the map as well. Using our background knowledge we can infer that the scooter user went to this store rather than to the Grillwagen (Figure 13). Final investigations of the temporal behavior reveal that the user is not a frequent driver and that the scooter was seldom used as a daily means of transportation, but that it was used now and then, primarily for shopping.
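The idea that a venue visited less often by Foursquare users is a less likely destination can be illustrated with a small ranking sketch. It assumes that the certainty of a candidate venue is simply its share of all check-ins among the candidates found for one destination. This is a simplification for illustration and may differ from the weighting used in the system. The function name, the category labels and the check-in counts are invented.

```python
def rank_candidates(venues):
    """Rank candidate venues of one destination by their share of check-ins.

    venues: list of (name, category, checkins) tuples.
    Returns a list of (name, category, certainty), most certain first.
    """
    total = sum(checkins for _, _, checkins in venues)
    if total == 0:
        # No popularity information: treat all candidates as equally likely.
        return [(name, category, 1.0 / len(venues)) for name, category, _ in venues]
    ranked = [(name, category, checkins / total) for name, category, checkins in venues]
    return sorted(ranked, key=lambda venue: venue[2], reverse=True)


# Hypothetical check-in counts (assumption, not measured data).
candidates = [
    ("OBI", "Shops and Services", 8),
    ("Grillwagen", "Food", 12),
]
ranked = rank_candidates(candidates)
```

With these invented counts the hot-dog stand is ranked first. This mirrors the situation described above: the automatic ranking is only a suggestion, and the analyst can override it using background knowledge.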

Figure 13: Map snippet. An individual scooter is selected. Routes show movements. Icons visualize frequently visited POIs. Map details are removed for privacy protection.

6 CONCLUSION

In this work we presented a stepwise approach to generate insights from recorded vehicle movements. We first processed a massive movement dataset and extracted frequent and dense destination areas. We then enriched the data with context information using Foursquare, a popular POI web service. We employed an interactive visual analysis loop to gain insights from the enriched data and to cope with different types of uncertainty in the data. Our interactive visual analysis system consists of two linked components for (1) geographic information and (2) periodic temporal information of frequent movement behavior. We not only focused on uncertainty visualization in the geographical and temporal domain but also presented ideas on how to deal with the uncertainties interactively and integrate them into a fluent analysis. Future work will further improve the automated venue decisions based on temporal probabilities [29]. Considering time, one could also make better guesses about which venues were visited and, based on the number and popularity of available POIs, whether a user visited one or several of them. Moreover, we want to take more context data into account, such as weather information and traffic models. The enrichment results should also be validated. Therefore a dataset with a priori known destinations will be generated and used for evaluation. We also plan to apply our approach to movement data from other domains, such as floating car data or animal movements.
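One conceivable way to base venue decisions on temporal probabilities is to weight each candidate's popularity by how likely a visit is at the observed hour. The following sketch is an assumption for illustration only and does not describe the planned method: the function name, the multiplicative combination and all numbers are hypothetical.

```python
def temporal_posterior(popularity, hour_profile, hour):
    """Combine a popularity prior with an hour-of-day likelihood per venue.

    popularity: {venue: non-negative prior weight, e.g. a check-in share}
    hour_profile: {venue: {hour: probability of a visit at that hour}}
    Returns normalized scores {venue: probability} for the given hour.
    """
    prior_total = sum(popularity.values())
    if prior_total <= 0:
        raise ValueError("popularity weights must sum to a positive value")
    scores = {
        venue: weight * hour_profile.get(venue, {}).get(hour, 0.0)
        for venue, weight in popularity.items()
    }
    total = sum(scores.values())
    if total == 0:
        # The profiles rule out every candidate: fall back to the prior alone.
        return {venue: weight / prior_total for venue, weight in popularity.items()}
    return {venue: score / total for venue, score in scores.items()}


# Hypothetical priors and hourly visit profiles (assumption, not measured data).
popularity = {"restaurant": 0.6, "hardware store": 0.4}
hour_profile = {
    "restaurant": {12: 0.30, 15: 0.05},
    "hardware store": {12: 0.10, 15: 0.20},
}
afternoon = temporal_posterior(popularity, hour_profile, hour=15)
night = temporal_posterior(popularity, hour_profile, hour=3)
```

In this invented example the hardware store becomes the more likely destination for a mid-afternoon stop, although the restaurant has the higher overall popularity. For an hour without any profile information the result falls back to the popularity prior.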

ACKNOWLEDGEMENTS

This work has been conducted in the context of the project VASA (13N11254) funded by the German Federal Ministry of Education and Research (BMBF). Furthermore, it was supported by the graduate program Digital Media of the Universities of Stuttgart and Tübingen, and the Stuttgart Media University (HdM). The electric scooter data was kindly provided by EnBW Energie Baden-Württemberg AG. We would like to thank them for their collaboration.

REFERENCES

[1] Graph api rate limiting. http://www.facebook.com, 2013. https://developers.facebook.com/docs/reference/ads-api/api-rate-limiting/.

[2] L. O. Alvares, V. Bogorny, B. Kuijpers, J. A. F. de Macedo, B. Moelans, and A. Vaisman. A model for enriching trajectories with semantic geographical information. In Proceedings of the 15th annual ACM international symposium on Advances in geographic information systems, page 22. ACM, 2007.

[3] G. Andrienko and N. Andrienko. Spatio-temporal aggregation for visual analysis of movements. In Visual Analytics Science and Technology, 2008. VAST’08. IEEE Symposium on, pages 51–58. IEEE, 2008.

[4] G. Andrienko and N. Andrienko. Sammon’s projection for clustering complex geographical objects. et al.: GIScience, 2010.

[5] G. Andrienko and N. Andrienko. Visual Analytics of Movement. Springer-Verlag Berlin An, 2013.

[6] G. Andrienko, N. Andrienko, S. Rinzivillo, M. Nanni, D. Pedreschi, and F. Giannotti. Interactive visual clustering of large collections of trajectories. In Visual Analytics Science and Technology, 2009. VAST 2009. IEEE Symposium on, pages 3–10. IEEE, 2009.

[7] N. Andrienko, G. Andrienko, and G. Fuchs. Towards privacy-preserving semantic mobility analysis.

[8] N. Andrienko, G. Andrienko, H. Stange, T. Liebig, and D. Hecker. Visual analytics for understanding spatial situations from episodic movement data. KI-Künstliche Intelligenz, 26(3):241–251, 2012.

[9] Z. Beane. Movie box office charts, oct 2013.

[10] S. Brakatsoulas, D. Pfoser, and N. Tryfona. Modeling, storing and mining moving object databases. In Database Engineering and Applications Symposium, 2004. IDEAS’04. Proceedings. International, pages 68–77. IEEE, 2004.

[11] L. Byron and M. Wattenberg. Stacked graphs–geometry & aesthetics. Visualization and Computer Graphics, IEEE Transactions on, 14(6):1245–1252, 2008.

[12] C. Collins, S. Carpendale, and G. Penn. Visualization of uncertainty in lattices to support decision-making. In Proceedings of the 9th Joint Eurographics/IEEE VGTC conference on Visualization, pages 51–58. Eurographics Association, 2007.

[13] J. Constine. What are the pros and cons of each "places" api? http://www.quora.com, 2011. http://www.quora.com/What-are-the-pros-and-cons-of-each-Places-API.

[14] C. D. Correa, Y.-H. Chan, and K.-L. Ma. A framework for uncertainty-aware visual analytics. In Visual Analytics Science and Technology, 2009. VAST 2009. IEEE Symposium on, pages 51–58. IEEE, 2009.

[15] W. Cui, S. Liu, L. Tan, C. Shi, Y. Song, Z. Gao, H. Qu, and X. Tong. Textflow: Towards better understanding of evolving topics in text. Visualization and Computer Graphics, IEEE Transactions on, 17(12):2412–2421, 2011.

[16] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD, volume 96, pages 226–231, 1996.

[17] T. Fujisaka, R. Lee, and K. Sumiya. Discovery of user behavior patterns from geo-tagged micro-blogs. In Proceedings of the 4th International Conference on Uniquitous Information Management and Communication, page 36. ACM, 2010.

[18] M. C. Gonzalez, C. A. Hidalgo, and A.-L. Barabasi. Understanding individual human mobility patterns. Nature, 453(7196):779–782, 2008.

[19] H. Griethe and H. Schumann. The visualization of uncertain data: Methods and problems. In SimVis, pages 143–156, 2006.

[20] B. Guc, M. May, Y. Saygin, and C. Körner. Semantic annotation of gps trajectories. In 11th AGILE International Conference on Geographic Information Science, 2008.

[21] H. Guo, Z. Wang, B. Yu, H. Zhao, and X. Yuan. Tripvista: Triple perspective visual trajectory analytics and its application on microscopic traffic data at a road intersection. In Pacific Visualization Symposium (PacificVis), 2011 IEEE, pages 163–170. IEEE, 2011.

[22] B. Hoeferlin, M. Hoeferlin, and J. Raeuchle. Visual analytics of mobile data. In Nokia Mobile Data Challenge 2012 Workshop (2012), 2012.

[23] C. Hurter, B. Tissoires, and S. Conversy. FromDaDy: Spreading aircraft trajectories across views to support iterative queries. IEEE Transactions on Visualization and Computer Graphics (Proceedings InfoVis), 15(6):1017–1024, 2009.

[24] F. Kling and A. Pozdnoukhov. When a city tells a story: urban topic analysis. In Proceedings of the 20th International Conference on Advances in Geographic Information Systems, pages 482–485. ACM, 2012.

[25] R. Krüger, S. Lohmann, D. Thom, H. Bosch, and T. Ertl. Using social media content in the visual analysis of movement data. In 2nd Workshop on Interactive Visual Text Analytics, 2012.

[26] R. Krüger, D. Thom, M. Wörner, H. Bosch, and T. Ertl. Trajectorylenses–a set-based filtering and exploration technique for long-term trajectory data. In Computer Graphics Forum, volume 32, pages 451–460. Wiley Online Library, 2013.

[27] A. M. MacEachren, A. Robinson, S. Hopper, S. Gardner, R. Murray, M. Gahegan, and E. Hetzler. Visualizing geospatial information uncertainty: What we know and what we need to know. Cartography and Geographic Information Science, 32(3):139–160, 2005.

[28] C. Parent, S. Spaccapietra, C. Renso, G. Andrienko, N. Andrienko, V. Bogorny, M. L. Damiani, A. Gkoulalas-Divanis, J. Macedo, N. Pelekis, et al. Semantic trajectories modeling and analysis. ACM Computing Surveys (CSUR), 45(4):42, 2013.

[29] D. Preotiuc-Pietro and T. Cohn. Mining user behaviours: A study of check-in patterns in location based social networks.

[30] T. Schreck, J. Bernard, T. Von Landesberger, and J. Kohlhammer. Visual cluster analysis of trajectory data with interactive kohonen maps. Information Visualization, 8(1):14–29, 2009.

[31] B. Shneiderman. The eyes have it: A task by data type taxonomy for information visualizations. In Visual Languages, 1996. Proceedings., IEEE Symposium on, pages 336–343. IEEE, 1996.

[32] S. Spaccapietra, C. Parent, M. L. Damiani, J. A. de Macedo, F. Porto, and C. Vangenot. A conceptual view on trajectories. Data & knowledge engineering, 65(1):126–146, 2008.

[33] K. Takahashi, A. Shimojo, S. Matsumoto, and M. Nakamura. Mashmap: Application framework for map-based visualization of lifelog with location. In Information and Telecommunication Technologies (APSITT), 2012 9th Asia-Pacific Symposium on, pages 1–6. IEEE, 2012.

[34] E. R. Tufte and P. Graves-Morris. The visual display of quantitative information, volume 2. Graphics press Cheshire, CT, 1983.

[35] Z. Wang, M. Lu, X. Yuan, J. Zhang, and H. van de Wetering. Visual