6.3.1 Ownership data
Information about ownership is an important part of attribute data, as it shows who a specific plot belongs to. It was difficult to obtain because many people were reluctant to give their names; the primary reason was that a family's name can reveal information that may lead to people being killed, a risk that was particularly high during the sectarian war that arose in Iraq from 2003 onwards. To improve the collection of ownership data, the first step was to gain the trust and confidence of the communities regarding the purpose and possible benefits of applying the project in the community. The community was also reassured that personal data held by the researcher would be kept confidential and anonymised when reported. Consequently, although some of the volunteers were still not prepared to give ownership details, the majority were willing to do so. The VGI-collected ownership data were verified by crowd-source agreement, by presenting them in front of the community gatekeepers. These were community representatives who had already gained the trust of all community members, had been active in representing community needs, and had usually lived in the community for more than ten years. Other representative members of the community, including older residents with considerable knowledge who were able to attend a verification workshop after data collection, also contributed to this ‘crowd-sourcing’. Table 6.1 shows the results of using this crowd-sourcing technique to test the consistency of attribute data collected by VGI. A sample of ~20% of the total number of land parcels surveyed was used for this crowd-sourcing, as well as for further analysis.
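The verification step described above can be sketched as a simple agreement check. This is an illustrative sketch only: the record structure, field name `owner`, and sample values are assumptions, not the study's actual data model.

```python
# Hypothetical sketch: measure agreement between VGI-collected ownership
# attributes and the values confirmed by community gatekeepers during a
# verification workshop. Records are keyed by parcel id; field names
# are illustrative, not taken from the study.

def agreement_rate(vgi_records, verified_records, key="owner"):
    """Share of verified parcels whose VGI attribute matches the
    gatekeeper-confirmed value."""
    matched = sum(
        1 for pid, rec in verified_records.items()
        if pid in vgi_records and vgi_records[pid].get(key) == rec.get(key)
    )
    return matched / len(verified_records)

# The verified set would be the ~20% sample described in the text.
vgi = {1: {"owner": "A"}, 2: {"owner": "B"}, 3: {"owner": "C"}}
verified = {1: {"owner": "A"}, 3: {"owner": "D"}}
print(agreement_rate(vgi, verified))  # 0.5
```

A consistency table such as Table 6.1 could then report this rate per attribute.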
As part of the ISO standards, geographic information quality can be further assessed through qualitative quality indicators such as purpose, usage, and lineage. These indicators are mainly used to express a quality overview for the data. Purpose describes the intended usage of the dataset. Usage describes the application(s) in which the dataset has been utilised. Lineage describes the history of a dataset, from collection and acquisition through compilation and derivation to its form at the time of use (Van Oort and Bregt 2005, Hoyle 2001, Guinée 2002). In addition, where ISO-standardised measures and indicators are not applicable, we have found in the literature more abstract quality indicators used to indicate the quality of VGI: trustworthiness, credibility, text content quality, vagueness, local knowledge, experience, recognition, and reputation. Trustworthiness is a receiver judgment based on subjective characteristics such as reliability or trust (good ratings on the creations, and a higher frequency of usage of these creations, indicate this trustworthiness) (Flanagin and Metzger 2008). In assessing the credibility of VGI, the source of information plays a crucial role, as it is what credibility is primarily based upon. However, this is not straightforward: due to the non-authoritative nature of VGI, the source may be unavailable, concealed, or missing (a situation avoided by gatekeepers in authoritative data). Credibility was defined by Hovland et al. (1953) as the believability of a source or message, which comprises primarily two dimensions: trustworthiness (as explained above) and expertise. Expertise contains objective characteristics such as
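The trustworthiness proxies named above (ratings of a contributor's creations and how often those creations are used) could be combined into a single score. The following is a minimal sketch under assumed weights and normalisation; the literature cited does not prescribe a specific formula.

```python
# Illustrative only: one way to fold mean rating and usage frequency
# into a 0-1 trustworthiness score. The weights and the saturation
# constant are assumptions, not values from the cited sources.

def trustworthiness(ratings, uses, max_rating=5, w_rating=0.7, w_usage=0.3):
    """Combine the mean rating (normalised to 0-1) with a saturating
    usage term, so heavy reuse can only add a bounded amount."""
    if not ratings:
        return 0.0
    rating_part = (sum(ratings) / len(ratings)) / max_rating
    usage_part = uses / (uses + 10)  # approaches 1 as uses grows
    return w_rating * rating_part + w_usage * usage_part

print(round(trustworthiness([4, 5, 5], uses=40), 3))  # 0.893
```

Any real scoring scheme would need calibration against ground-truthed contributions.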
A lot of research has been carried out on event detection, so a next step might be a more in-depth analysis of event evolution over time and the identification of correlated events. Another strength of user-generated content is the implicitly contained information about the impact of events on people and their behaviour. Possible research questions include: which social groups react to an event, how, and where? Among other things, this requires the analysis and visualisation of people's reactions in terms of judgements, responses, attitudes and emotions. Related research might deal with developing methods to analyse the process of information spreading and opinion forming, and to understand the temporal evolution of people's relation and interaction with their physical contexts. All components should provide quality measures. The subjectivity and possible deliberate falsification of data require special methods of evaluation and derivation of confidence measures. This will be the basis for ensuring that reliable conclusions can be drawn from user-generated data and that errors or misstatements can be identified and minimised.
The organizational practices established while the project is being defined are critical to its success and acceptance. In order to solve a problem or achieve a desired transformation of everyday life in society, the organization must create value by making the information, knowledge or creativity of an online community useful, and achieving this requires getting people to participate. However, before securing participation, the organization must: (i) define and ensure people's understanding and knowledge of what to do and why; (ii) define procedures and instructions on how to do it; and (iii), since participation is voluntary, create an engagement plan in order to obtain an emotional commitment from participants that is expressed through actions. These organizational practices determine how the online community is managed and establish the collaborative workplace that should facilitate and lower barriers to participation. A strong organizational plan helps ensure that the project's conceptualization and its architecture of cooperation will be adopted, and also enables the participation processes that affect the evolution of the project and the community. In addition, facilitating participation serves to establish relationships that develop into new implementation agreements, forming new actors and rules, and ultimately generating the trust and identification that define the online community. These concerns affect a project's adoption and the decision as to whether or not to participate in it. Also of key importance is having a vision of what the organization needs to do in order to succeed in the future, and of its core values, top priorities and the driving forces that provide inspiration and direction to the project objectives and the online community. If the vision and values of the project match those of the participants, it is easier to build trust and obtain commitment to the project.
To sum up, in order to clarify and manage the expectations of the people involved in a VGI project and advance towards its success, the organization has to define desired results, parameters and guidelines, and decide how to manage human capital in order to achieve those results. In addition, in order to promote the project's success, it is important to establish accountability within the project by tracking its progress and performance. In this sense, some of the best practices in crowdsourcing projects proposed by Brabham [21] include committing to communicating the impact of contributions to the online community; being transparent, honest and responsive; and acknowledging participants. Subsequently, in order to foster motivation and trust, it can be helpful for the project to define a way to give back to the online community, such as publishing open databases, results, statistics, maps, services and tools, or giving credit and other types of benefits like awards or privileges for being part of the community.
This definition is necessary here, since there is no clear consensus in related work on the concept of VGI. On the one hand, there is the understanding of VGI in a strict sense, restricting it to user contributions that are made “with the intent of providing information about the geographic world” (Elwood et al. 2012). Examples of this interpretation of VGI are the co-creation of maps (e.g., OpenStreetMap, Wikimapia) and the collection of environmental data (e.g., the Christmas Bird Count). On the other hand, there are user contributions without the intrinsic motivation to publish geographic data, since users create tweets, posts, likes, reviews, photos or other content that has a spatial datum. Stefanidis et al. (2013) name such information Ambient Geospatial Information (AGI), but it is treated as VGI in a broader sense in this paper. Harvey (2013) suggests limiting the term VGI to data that was collected with a clear opt-in by users. He suggests the term Contributed Geographic Information (CGI) for all geographic data that was collected without that clear consent but with an opt-out possibility for users. Harvey names “cell phone tracking and RFID-equipped transport cards” as examples, but the use of such data outside the purpose it was collected for is obviously often in conflict with privacy or data protection regulations. Therefore, the definition of VGI given above, in contrast to non-voluntarily generated data, is preferred here.
Abstract. The use of Volunteered Geographic Information (VGI) in collecting, sharing and disseminating geospatially referenced information on the Web is increasingly common. The potential of this localized and collective information has been seen to complement the maintenance process of authoritative mapping data sources and to support the development of Digital Earth. The main barrier to the use of these data in supporting this bottom-up approach is the credibility (trust), completeness, accuracy, and quality of both the data input and the outputs generated. The only feasible approach to assessing these data at scale is an automated process. This paper describes a conceptual model of indicators (parameters) and practical approaches to automatically assess the credibility of information contributed through VGI, including map mashups, GeoWeb and crowd-sourced applications. Two main components are proposed to be assessed in the conceptual model: metadata and data. The metadata component comprises indicators for the hosting (websites) and the sources of data/information. The data component comprises indicators to assess absolute and relative data positioning, and attribute, thematic, temporal and geometric correctness and consistency. This paper suggests approaches to assess both components. To assess the metadata component, automated text categorization using supervised machine learning is proposed. To assess correctness and consistency in the data component, we suggest a matching validation approach using emerging technologies from Linked Data infrastructures and third-party review validation. This study contributes to the research domain that focuses on the credibility, trust and quality issues of data contributed by citizen web providers.
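The proposed metadata assessment step, automated text categorization with supervised machine learning, can be sketched with a small multinomial Naive Bayes classifier. The labels and training snippets below are invented for illustration; a real system would train on many labelled website or source descriptions.

```python
# Minimal supervised text categorization sketch (multinomial Naive
# Bayes with Laplace smoothing), standard library only. Labels and
# training texts are hypothetical examples, not the paper's data.
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (text, label). Returns word counts, word totals,
    label counts, and the shared vocabulary."""
    counts, totals, labels = defaultdict(Counter), Counter(), Counter()
    for text, label in docs:
        words = text.lower().split()
        counts[label].update(words)
        totals[label] += len(words)
        labels[label] += 1
    vocab = {w for c in counts.values() for w in c}
    return counts, totals, labels, vocab

def classify(text, model):
    counts, totals, labels, vocab = model
    n = sum(labels.values())
    best, best_lp = None, float("-inf")
    for label in labels:
        lp = math.log(labels[label] / n)  # class prior
        for w in text.lower().split():
            # Laplace-smoothed per-class word likelihood
            lp += math.log((counts[label][w] + 1) / (totals[label] + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

model = train([
    ("official government mapping agency portal", "authoritative"),
    ("national survey data portal", "authoritative"),
    ("community wiki map edited by volunteers", "crowd-sourced"),
    ("volunteers upload photos to community map", "crowd-sourced"),
])
print(classify("volunteers edited the community map", model))  # crowd-sourced
```

In practice a library classifier (e.g. scikit-learn's `MultinomialNB`) and far larger labelled corpora would replace this toy setup.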
These types of activity are referred to in various ways in the literature, and it is not our intention here to be exhaustive (for a discussion, see See et al., 2016). In giving this summary of the types of collaborative activity required for the production of crowdsourced geographic information, we would like to set up a context for discussing issues related to the quality of the information. The summary shows that the different types of collaborative activity result in geographic data with varying degrees of accuracy, structure, and format standardization. For instance, while geotagged social media messages may have heterogeneous formats (e.g., text-, image-, and map-based data) and varied structure, some projects employ a standardized process for gathering CGI (e.g., Salk et al., 2016). However, even with more standardized data formats and procedures, the extent to which volunteers will adhere to them is uncertain. As a result, CGI is often suspected of having heterogeneous quality and uncertain credibility (Flanagin & Metzger, 2008), which might affect the usability of the crowdsourced information (Bishr & Kuhn, 2013). The quality of CGI also depends on how the information is used, since the quality of information is determined by its “fitness for use” within the context in which it is applied (Bordogna et al., 2016).
The main point in OSM is that nothing is copied from maps under copyright; everything is mapped by the users themselves. Some companies have agreed to let OSM use their data in the project; for example, aerial imagery from Yahoo! is available. Other data sources from which contributors can trace roads and other objects include out-of-copyright maps (Haklay & Weber 2008: 14). Some countries have also given free geographic information to OSM, for instance the Topologically Integrated Geographic Encoding and Referencing (TIGER) data for US streets, produced by the US Census Bureau (OSM 2010b). Other public domain data sources include land cover data for France and Estonia, and the Netherlands has provided the complete road network with addresses for the whole country (Geofabrik 2010a). A remarkable step came at the end of 2010, when Microsoft declared that OSM contributors are allowed to access its global orthorectified Bing Maps aerial imagery and create maps by tracing it (Microsoft 2010a).
Historically, both data creation and mapping in the face of disaster required the skills of trained professionals. After the Haitian earthquake, relatively untrained volunteers, NGOs, and citizens were all able to create data critical to the recovery, and maps that contextualized these data. VGI played a critical role in the emergency response in Haiti: OpenStreetMap data was heavily used by multiple agencies and NGOs on the ground. Mobile communication devices provide ubiquitous user interfaces for users, from healthcare professionals to citizens. The possibilities this technology offers for healthcare delivery are vast, and the realization of this potential has only just begun. Using the VGI paradigm as a means of collecting, managing, and disseminating healthcare information is a promising research area worthy of further investigation. The ability to integrate, fuse, or “mash up” these spatial data with other free and openly available sources of data (such as population statistics, population density, road networks, social deprivation indices, etc.) could yield new understandings of issues relating to the provision of healthcare, the dissemination of information, and more. Citizens are not the only beneficiaries: the research community will benefit from these data resources being freely and openly available, providing a new research tool that could help answer questions not previously addressable because of obstacles preventing access to up-to-date and accurate data. Hospitals, physicians, policy makers, and researchers could use these data for benchmarking and to identify targets for improvement efforts. Interoperability with existing health data and software systems will be crucial if the use of web-GIS in public health is to gain acceptance among public health practitioners and the general public. The issue of data quality must also be addressed by the VGI community.
The perceived lack of cartographical, surveying, and GIS skills among contributors has made spatial quality in OSM a major issue. Mooney et al. remark that there are no accepted metrics for measuring the quality of OSM or, more broadly, the quality of VGI. Given
With the growth and development of the Internet and Web 2.0, a significant proportion of information on the Internet is now created by users' contributions in blogs, wikis, and social networks. Volunteered geographic information, or VGI, is a type of crowdsourced spatial information voluntarily created by everyday users who insert their experiential or local knowledge of a place into the digital map. Indeed, VGI is part of a broader phenomenon known as 'user-generated content' in the Internet world [1,11]. The adoption of technologies such as GPS, Web 2.0, and location-based information sharing, together with the emergence of VGI, has shifted the spatio-temporal scales of community involvement. The fast-paced production and dissemination of VGI-generated spatial knowledge is particularly useful when managing events demanding rapid response. In times of natural disasters or other emergencies, citizens use Web 2.0 and VGI to monitor and report rapidly changing ground conditions so that appropriate actions can be taken to save lives or mitigate threats. VGI is equally useful in non-emergency situations, where persistent spatial data gaps can be bridged through the incorporation of local knowledge.
The participation and engagement of grassroots-level community groups and citizens in natural resource management has a long history in Australia. Since 1990, there have been many attempts by State governments to involve communities in environmental projects such as Salt-Watch, Water-Watch and Landcare, with communities providing volunteer support to state government organisations (Carr, 2002). These community volunteer activities have been successful in achieving better environmental outcomes and acknowledgement by government agencies. The local environmental knowledge of these groups can also be used for spatial information collection and management. Traditionally, spatial information for NRM activities was managed and controlled by government agencies. Recent developments in ICT tools and spatial technology have provided community groups with a new opportunity to manage natural resource data. A significant amount of spatial information is collected or generated by Landcare groups, landholders and other community groups at the grassroots level through these volunteer initiatives. Government organisations are also incorporating volunteered input into their mapping programs. This has opened a new avenue for managing and utilising spatial data for natural resource management.
In addition to contributing to the crisis flood map reports, social networking also played a major role in keeping people informed during the January 2011 flood. The social networking service Twitter <www.twitter.com> allowed people to post and receive short text-based updates about the flood in real time; photos and videos could also be attached to these updates. Similarly, the website Facebook <www.facebook.com> allowed groups such as the Queensland Police Service to provide flood information updates to users who browsed their Facebook page. Finally, YouTube <www.youtube.com> provided a forum for people to connect and inform through user-generated and contributed videos. Photography and imagery of the floods across different regions were posted on sites such as Flickr and linked to a location through the map. Individuals had the opportunity to add comments and additional information regarding the context of these images, and each posting was time-stamped by the system. These images provide an excellent historic and current record of the flood events, and features in the imagery can easily be used to reference flood heights at a particular time.
During the floods the Ushahidi crowd map was successfully deployed. Ushahidi is a non-profit technology company that specialises in developing free and open source software for information collection, visualisation and interactive mapping (Ushahidi, 2011). Crowdmap is an online interactive mapping service based on the Ushahidi platform (Crowdmap, 2011). It offered the ability to collect information from cell phones, email and the web, aggregate that information into a single platform, and visualise it on a map and timeline. Photos and videos could also be attached to these updates. This volunteered mapping information supplemented the information already uploaded through Facebook, YouTube and Twitter and enabled people to connect and source updates and news on the flooding.
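The aggregation idea behind a Crowdmap-style deployment can be sketched as follows. The report fields here are assumptions about a simplified report record, not the actual Ushahidi schema.

```python
# Hypothetical sketch: reports arriving from several channels (SMS,
# email, web) are normalised into one structure and ordered on a
# single timeline, as in a Crowdmap-style deployment.
from datetime import datetime

def aggregate(reports):
    """Normalise channel-specific reports and sort them by timestamp."""
    timeline = [
        {
            "channel": r["channel"],
            "text": r["text"],
            "when": datetime.fromisoformat(r["when"]),
            "location": r.get("location"),  # (lat, lon) if supplied
        }
        for r in reports
    ]
    return sorted(timeline, key=lambda r: r["when"])

timeline = aggregate([
    {"channel": "web", "text": "Bridge closed", "when": "2011-01-12T09:30:00"},
    {"channel": "sms", "text": "Water rising", "when": "2011-01-12T08:15:00",
     "location": (-27.47, 153.03)},
])
print([r["channel"] for r in timeline])  # ['sms', 'web']
```

The sorted, geolocated records would then feed the map and timeline views.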
The test dataset was derived from the Map Kibera Project (http://mapkibera.org), a resident-led effort to map the neighborhoods and community amenities of the Kibera region of Nairobi, Kenya. The dataset consists of a shapefile of VGI: GPS points, each with an assigned toponym for one of fifteen neighborhoods. Additionally, the dataset contains a set of polygons for the neighborhood boundaries. The line dataset used to represent major physical barriers (and thus potential neighborhood boundaries) was derived from road, river, and railway data courtesy of OpenStreetMap, an open source repository of transportation data. The Map Kibera Project is a community and NGO effort to collect significant geographic features of Kibera. The group splits into teams to collect points of interest (e.g., schools, clinics) throughout the community using GPS receivers. Each point has the neighborhood it resides in as part of its metadata. The neighborhood assignment to individual points was determined through consensus with the residents involved in the project. We use the point locations and the metadata as the input to test our algorithm. The neighborhood boundaries were hand-drawn over a satellite image of Kibera by a group of residents and volunteers. We use the resident-drawn boundaries as the basis of comparison for our algorithm. The Kibera dataset was selected because the neighborhood boundaries were community derived rather than defined administratively or by a cartographic expert. Moreover, the VGI point data are well distributed, and each labelled point falls into its respective neighborhood. Figure 1 shows the neighborhood boundaries overlaid with their respective neighborhood points.
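To illustrate the input/output of such a boundary-derivation task, a crude stand-in is to group the labelled GPS points by toponym and take each group's convex hull as a boundary polygon. This is not the paper's algorithm, only a simplified sketch of the data flow; the neighborhood name and coordinates are invented.

```python
# Simplified stand-in: derive a crude boundary per neighborhood as the
# convex hull of its labelled points. Real boundary-derivation methods
# are more sophisticated; this only illustrates the input/output shape.

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in CCW order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def boundaries(labelled_points):
    """labelled_points: list of ((x, y), toponym) pairs, as in the
    Kibera-style input where each GPS point carries its neighborhood."""
    groups = {}
    for pt, name in labelled_points:
        groups.setdefault(name, []).append(pt)
    return {name: convex_hull(pts) for name, pts in groups.items()}

hulls = boundaries([((0, 0), "Soweto"), ((2, 0), "Soweto"),
                    ((1, 2), "Soweto"), ((1, 1), "Soweto")])
print(hulls["Soweto"])  # [(0, 0), (2, 0), (1, 2)] -- interior point dropped
```

Convex hulls cannot produce adjacent, non-overlapping neighborhoods, which is one reason boundary algorithms for this problem go further.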
& Palen 2009). When originally published, this paper aimed to provide a stimulus for further exploration of the role of LBSN sources such as Twitter for crisis management and to consider the valuable geospatial component they can contain, particularly for time-critical events. It seems to have been effective, since it contributed to a wide corpus of literature published since then (see Introduction for details). Twitter is notable in its design in relation to both time and space. Tweets are organised in timelines (i.e., series of tweets sorted and displayed in reverse chronological order), and the time each tweet was published is available with an accuracy of 1 second. The spatial dimension of Twitter is more complex, as georeferencing takes several basic forms. First, details can be provided in relation to tweets indirectly or directly. In the indirect form, a user's location is provided on their profile page, but this location is expected to be the place where they live, not their location when a tweet is made. Notably, applications running on GPS-enabled smartphones allow users to automatically update this location field each time a tweet is posted, thus converting Twitter into a genuine LBSN. For example, a user living in San Francisco can tweet from his GPS-enabled smartphone and allow his Twitter app to disclose his precise location as metadata of his tweets (‘direct location’, referred to as ‘geotweeting’) (Stone 2009). Conversely, he might tweet from a desktop computer on which the browser is configured not to disclose any location information (although the Twitter server could guess it from the IP of the client computer, this information is not disclosed to third parties).
In this case, no location will be available for this particular tweet, and only the reference to San Francisco in the user profile can be used to guess (‘indirect location’) from where it was published (although the user could be travelling anywhere in the world at that particular time). It has to be expected that Twitter's features are used in a heterogeneous manner by users, depending on their smartphone settings, their privacy concerns and their technological literacy.
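The three georeferencing cases described above (precise geotag, profile location only, or neither) amount to a simple classification. The field names below are assumptions about a simplified tweet record, not the actual Twitter API schema.

```python
# Illustrative sketch of the direct/indirect/none distinction for a
# tweet's location evidence. Field names are hypothetical, not the
# real Twitter API payload.

def georeference_type(tweet):
    """Classify a tweet's location evidence as direct (GPS metadata on
    the tweet), indirect (free-text profile location only), or none."""
    if tweet.get("coordinates"):       # precise geotag on the tweet itself
        return "direct"
    if tweet.get("user_location"):     # profile field: where the user lives
        return "indirect"
    return "none"

print(georeference_type({"text": "hi", "coordinates": (37.77, -122.42)}))   # direct
print(georeference_type({"text": "hi", "user_location": "San Francisco"}))  # indirect
print(georeference_type({"text": "hi"}))                                    # none
```

Any analysis of tweet geodata has to handle all three cases, since most tweets fall into the last two.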
Volunteered Geographic Information and Remote Sensing. The availability of spatial data on the web provided by crowdsourcing communities, as well as its scientific acceptance, has increased significantly in recent years. The integration of VGI in data analysis provides promising opportunities due to the amount of information made available [Goodchild 2007a, Kinley 2013, Sester et al. 2014]. Yet it also poses new challenges in terms of data heterogeneity and quality assurance [Flanagin and Metzger 2008, Haklay 2010]. Therefore, the focus in recent research lies in discerning the quality of VGI and hence identifying its applicability to further projects by complementing or substituting commercial data [Arsanjani et al. 2015, Ather 2009, Ciepłuch et al. 2011, Fan et al. 2014, Haklay 2010, Hecht et al. 2013, Helbich et al. 2012, Kounadi 2009, Mooney et al. 2010, Neis and Zipf 2012, Zielstra and Zipf 2010]. However, the integration of non-authoritative data with traditional, well-established data and methods is a novel approach [Kinley 2013, Schnebele and Cervone 2013]. Very little research has been conducted on combining VGI with remote sensing techniques; hence VGI has rarely been incorporated within classification algorithms used to generate thematic maps [Kinley 2013]. However, some recent studies take verification a step further and use VGI as an additional data source to refine or update existing information.
Another important issue related to the reliability of volunteered land cover reference data is the mismatch between what people see in remote sensing imagery and what they see on the ground. Nagendra et al. [33] analyzed the difficulties that arise from this kind of misunderstanding and provided recommendations for tackling this problem. The scale of ground features also needs to be matched to the spatial resolution of the remotely sensed images. The relationship between remote sensing data of different spatial resolutions and their potential use in mapping thematic ground features has also been summarized in Nagendra et al. [33], Nagendra and Rocchini [34], and Pettorelli et al. [35]. Therefore, we should provide more standard practice and training using examples from remote sensing. A good example would be to use the Land Use Cover Area Frame Survey (LUCAS) data set [36], which provides a set of ground-based photographs (taken in the four cardinal directions and at the location) with corresponding detailed land cover and land use classes. Satellite imagery at different resolutions could then be provided with these photographs to help train the volunteers to recognize different features at different spatial resolutions. Where possible, we should present the volunteers with remote sensing imagery at the most appropriate spatial resolution to improve their validation performance.