Spatial Data Quality

Spatial data quality

Imprecision in spatial data arises from the granularity or resolution at which observations of phenomena are made, and from the limitations imposed by computational representations, processing, and presentation media. Precision is an important component of spatial data quality and a key to the appropriate integration of collections of data sets. Previous work of the author provides a theoretical foundation for imprecision of spatial data resulting from finite granularities, and gives the beginnings of an approach to reasoning with such data using methods similar to rough set theory. This paper further develops the theory and extends the work to a model that includes both spatial and semantic components. Notions such as observation, schema, the frame of discernment, and vagueness are examined and formalized.
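
The rough-set-style reasoning the abstract alludes to can be sketched for a region observed at a coarse grid granularity: blocks wholly inside the region form the lower (certain) approximation, and blocks merely overlapping it form the upper (possible) approximation. The function and data layout below are illustrative, not the paper's formalism.

```python
def approximations(region_cells, grid_blocks):
    """Rough-set style approximations of a region at a coarse granularity.

    region_cells: set of fine-resolution cell ids inside the region.
    grid_blocks:  list of sets of fine cells, one set per coarse block.
    Returns (lower, upper): indices of blocks certainly / possibly in the region.
    """
    lower, upper = set(), set()
    for i, block in enumerate(grid_blocks):
        if block <= region_cells:     # block wholly inside: certainly in region
            lower.add(i)
        if block & region_cells:      # block overlaps: possibly in region
            upper.add(i)
    return lower, upper
```

The gap between the two sets is the boundary region, i.e. the imprecision introduced by the finite granularity.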

A Planning based Evaluation of Spatial Data Quality of OpenStreetMap Building Footprints in Canada

and its importance for planners were discussed. The discussion chapter of this thesis further investigated the usefulness of OSM for planning, including the use of building footprints for planning analysis and the level of spatial data quality required for such analysis. Furthermore, the vast opportunities for planning using OSM data were presented. These include filling in spatial data gaps, using OSM data for GIS analysis, having the most up-to-date data, having free data for use in GIS freeware in areas without financial resources, and having planners contribute to OSM. Through this analysis, the second objective was met. The third objective was stated as: to determine the level of quality of building footprints in OpenStreetMap for various cities in Canada and investigate reasons for variations. By running the models, the overall level of spatial data quality was determined for ten Canadian cities. The spatial data quality measures include completeness, zonal completeness, commissions, positional accuracy, topological consistency, and attribute accuracy. These measures were reported for the ten study municipalities and then compared. Furthermore, an analysis of completeness over time was conducted between January 2010 and July 2018 for all study municipalities. The number of building footprints over time and the locations in which they were created were noted, giving insight into the spatial data quality at different time periods. Also noted was the number of contributors in each city in July 2018. The reasons behind varying levels of quality across the study municipalities were presented in the discussion chapter; these include local knowledge and interest, the type of buildings, and the perceived importance of cities and regions. By presenting the spatial data quality results for all study municipalities and investigating reasons for variations, the third objective was completed.
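
The completeness measure mentioned here is commonly computed as the ratio of mapped OSM building area to reference building area, evaluated per zone for zonal completeness. A minimal sketch of that calculation, assuming pre-computed footprint areas per zone (the input layout is hypothetical, not the thesis's model):

```python
def zonal_completeness(osm_areas, ref_areas):
    """Completeness per zone: OSM building area / reference building area.

    osm_areas, ref_areas: dicts mapping zone id -> total footprint area (m^2).
    Zones absent from the OSM data count as 0.0 (nothing mapped).
    """
    out = {}
    for zone, ref in ref_areas.items():
        osm = osm_areas.get(zone, 0.0)
        out[zone] = osm / ref if ref > 0 else None  # None: no reference data
    return out
```

A value of 1.0 means the zone is fully mapped; values above 1.0 would point to commissions (extra or over-digitized features).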

Spatial data quality in multi-criterial analysis for decision making process

Abstract: Multi-criterial analysis is becoming one of the main methods for evaluating the influence of the geographic environment on human activity, and vice versa. Analysis results are often used in command and control systems, especially in armed forces and rescue units. The analyses use digital geographic data, whose quality significantly influences the results. Results of analyses are usually visualized in command and control systems as thematic layers over raster images of topographic maps, so this visualization must follow the cartographic principles used for the creation of thematic maps. The article presents problems that an analyst encounters in evaluating the quality of the data used, performing the analysis itself, and preparing data files for transfer and publishing in command and control systems. Keywords: spatial data quality, multi-criterial analysis, thematic maps, command and control system
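
A multi-criterial (weighted overlay) evaluation of this kind typically normalizes each criterion raster to a common scale and combines them cell by cell with analyst-chosen weights. A minimal sketch, with illustrative layer names and weights that are not taken from the article:

```python
def weighted_overlay(layers, weights):
    """Cell-by-cell weighted sum of normalized criterion rasters.

    layers:  dict name -> flat list of cell scores, each already in [0, 1].
    weights: dict name -> weight; weights are expected to sum to 1.
    Returns the combined suitability score per cell.
    """
    names = list(layers)
    n = len(layers[names[0]])
    return [sum(weights[k] * layers[k][i] for k in names) for i in range(n)]
```

The quality of the input layers propagates directly into the result, which is why the article stresses evaluating data quality before running the analysis.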

Quantifying the effects of variable selection, spatial scale and spatial data quality in marine benthic habitat mapping

redundancy, also reduces ambiguity: relative difference to mean value (Group 1), local standard deviation (Group 2), easterness and northerness (Groups 3A-3B), local mean (Group 4), and a measure of slope, preferably computed with Horn's method (Group 5A, Table B.2 in Appendix B). Relative difference to mean value is a measure of relative position, local standard deviation is a measure of rugosity, and easterness and northerness are measures of orientation derived from aspect. Local mean may not be required if users include elevation or depth in the analysis, as the local mean will be highly correlated with the input DTM. However, local mean could potentially be more reliable than the initial DTM if the surface is noisy, as taking the mean may filter out some of the noise. This recommendation of six attributes increases replicability and generality, as it selects terrain attributes that are easily computed in any software over terrain attributes that are only available in some. For instance, we recommend using local standard deviation in Group 2 over terrain ruggedness index or roughness because there is no ambiguity about how to compute a standard deviation, and all software packages have focal statistics tools that can compute it. To help potential users of this method, we provide with this paper a toolbox developed for ArcGIS, named TASSE (Terrain Attribute Selection for Spatial Ecology), that automatically generates the six proposed terrain attributes (Lecours, 2015).
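
Two of the recommended attributes are easy to sketch: easterness and northerness are the sine and cosine of aspect (turning one circular variable into two linear ones), and local standard deviation is a focal statistic over a moving window. The grid layout and the 3x3 window below are assumptions for illustration, not TASSE's implementation:

```python
import math
import statistics

def easterness_northerness(aspect_deg):
    """Decompose circular aspect (degrees clockwise from north) into
    two linear orientation measures."""
    a = math.radians(aspect_deg)
    return math.sin(a), math.cos(a)   # easterness, northerness

def focal_std(grid, r, c):
    """Local standard deviation in a 3x3 window of an elevation/depth grid,
    clipped at the grid edges."""
    vals = [grid[i][j]
            for i in range(r - 1, r + 2)
            for j in range(c - 1, c + 2)
            if 0 <= i < len(grid) and 0 <= j < len(grid[0])]
    return statistics.pstdev(vals)
```

Because both rely only on trigonometry and a focal window, they can be reproduced in any GIS package, which is exactly the replicability argument the excerpt makes.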

Temporal-spatial modeling for fMRI data

are activated by the kth experiment condition. In the literature, commonly used forms of the HRF include discretized Poisson, Gamma, and Gaussian density functions (Figure 5.1). However, these HRFs assume predefined parametric forms, which are rather restrictive. Some work has proposed modeling the HRF using a small set of temporal basis functions to improve the flexibility of its form (Friston et al., 1995b; Josephs et al., 1997). One drawback of the basis approach is that it reduces the sensitivity of the estimation, and the results are more difficult to interpret (Kherif et al., 2002). More recently, attempts have been made to combine data-driven and model-driven methods in a complementary way, for example by Hu et al. (2005) and Rayens and Andersen (2006), among others. Instead of assuming a certain form for the HRF, these authors model the HRF or the temporal component directly, making use of information extracted by data-driven methods such as independent component analysis (ICA) or principal component analysis (PCA).
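
A parametric Gamma-density HRF of the kind described, together with the convolution that turns a stimulus train into a predicted BOLD response, can be sketched as follows. The shape and scale values are illustrative defaults, not parameters from the chapter:

```python
import math

def gamma_hrf(t, shape=6.0, scale=1.0):
    """Gamma-density HRF, one common parametric choice (values illustrative)."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)
            / (math.gamma(shape) * scale ** shape))

def predicted_bold(stimulus, hrf, dt=1.0):
    """Discrete convolution of a 0/1 stimulus train with the HRF: the
    model-driven prediction a parametric HRF imposes on the time series."""
    n = len(stimulus)
    h = [hrf(i * dt) for i in range(n)]
    return [sum(stimulus[i - k] * h[k] for k in range(i + 1)) * dt
            for i in range(n)]
```

The restrictiveness criticized in the excerpt is visible here: every voxel's predicted response is forced to share this one fixed shape unless basis functions or data-driven components are introduced.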

An Infrastructure for Supporting Spatial Data Integration *

A spatial mediator has the task of locating data by either a symbolic term or a location. The block structure of our view of the spatial mediator is shown in Figure 4.1. For symbolic terms (like city names), we use an ontology-like approach based on our earlier work [7,8] to locate data sources that contain the term. For either point or bounding-box location data, the R-tree is used to locate potential data sources containing the required location. Rule Set I is used with the data source metadata to determine the best data source(s) to use in the integration script sent to the computation server. The rule set ranks the matching data sources by how well they address the requirements of the request. Rule Set IIA is used in combination with all of the relevant metadata available (note, this could include metadata on the user, the tools available, and the data sources chosen) to create the integration script. The integration script is of the form O_α(s1, s2), where O_α is a tool name (e.g., vector_raster_fusion) for a tool that is
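
The mediator's location step can be illustrated with a bounding-box intersection test followed by a Rule-Set-I-style ranking of the matching sources. The metadata fields and the scoring rule below are hypothetical stand-ins for the actual rule sets:

```python
def bbox_intersects(a, b):
    """Axis-aligned bounding boxes given as (xmin, ymin, xmax, ymax)."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def rank_sources(request_bbox, sources, score):
    """Keep sources whose extent covers the request, ranked best-first by a
    rule-based score (a stand-in for Rule Set I over source metadata)."""
    hits = [s for s in sources if bbox_intersects(request_bbox, s["bbox"])]
    return sorted(hits, key=score, reverse=True)
```

In a real deployment the intersection test would be answered by the R-tree rather than a linear scan; the sketch keeps only the filter-then-rank logic.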

Spatial Data Preparation for Knowledge Discovery

Malerba et al. (2002) proposed an object-oriented data mining query language for classification and association rules, which is implemented by the INGENS software prototype (Malerba et al., 2003). The first problem with this approach is that it works with object-oriented databases, while most SDBMS are relational or object-relational. The second problem is that only the INGENS software prototype implements the proposed language, and only for the ATRE algorithm. Appice et al. (2003) defined a spatial feature extractor named FEATEX to select features from a spatial database and create output for the SPADA algorithm. The drawbacks of this approach are that most SDBMS do not implement FEATEX's functions and that its output is a format for one specific algorithm.
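
The kind of feature extraction FEATEX performs can be approximated portably on top of plain geometry predicates, producing a flat feature table that any mining algorithm can consume. The input layout and the single neighborhood-count feature below are illustrative, not FEATEX's actual operators:

```python
import math

def distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def extract_features(targets, pois, radius):
    """Build a flat feature table for a mining algorithm: for each target
    object, count points of interest within `radius` of its location.

    targets: dict id -> (x, y); pois: list of (x, y) points.
    """
    rows = []
    for tid, loc in targets.items():
        count = sum(1 for p in pois if distance(loc, p) <= radius)
        rows.append({"id": tid, "poi_count": count})
    return rows
```

Because the output is a plain table rather than an algorithm-specific format, it sidesteps the portability drawback the excerpt identifies.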

Through Spatial Data Infrastructure in Turkey

Currently, there exist more than 2000 GPS receivers in the entire country. These GPS users, benefiting from static or RTK (real-time kinematic) techniques, set up their own base stations and then compute coordinates with the use of rover receivers. In static measurements, depending on the baseline length and the applied method, rovers are required to collect data for periods extending from 15 minutes to multiple hours. When using RTK, on the other hand, a solution can be acquired up to 5-10 km from the base station. This project will provide the


Spatial data fusion with visual feedback

The system introduced in this paper is particularly relevant in the current state-of-the-art development of location aware systems, ubiquitous computing and interactive visualisation. It builds upon our previous work on pattern extraction from spatial data [18][19]. Spatial data fusion depends on the availability of a sensor infrastructure that would enable access to live data. Alternatively, simulated data can be used but the authenticity will be diminished. The interactive visualisation is the component that will be developed through extensive participatory design, since user involvement in the design process will prove highly beneficial.

Classification methods for spatial data representation

Automatic classification methods are being incorporated into existing geographic information systems. However, the characteristics of the original data might be overlooked, or there might be a risk of mistaken judgment, if we do not have enough knowledge about the classification method as well as the distribution characteristics of the original data. Even if we use the same number of classes and the same spatial data, we might obtain quite different maps. A typical example is shown in figure 6. Hence, the following viewpoint is also important for a classification problem.
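
The "same data, different maps" effect is easy to reproduce with two standard class-break schemes: equal-interval and quantile breaks diverge sharply on skewed data, so the resulting choropleth maps look quite different. A minimal sketch (the sample values in the test are invented):

```python
def equal_interval_breaks(values, k):
    """Class boundaries that split the data range into k equal-width bins."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + step * i for i in range(1, k)]

def quantile_breaks(values, k):
    """Class boundaries that put roughly equal counts of values in each bin."""
    s = sorted(values)
    return [s[int(len(s) * i / k)] for i in range(1, k)]
```

On a skewed distribution, equal-interval breaks leave most observations in one class while quantile breaks spread them evenly; neither is wrong, which is exactly why knowledge of the method and the data distribution matters.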

Spatial Data Mining Analysis Methods

Surprisingly, clustering techniques for spatial databases do not appear to be very advanced compared with those applied to relational databases (automatic classification). Clustering is performed using a similarity function, which was earlier described as a semantic distance. Consequently, in spatial databases it seems natural to use the Euclidean distance in order to cluster neighboring objects. Research studies have concentrated on the development of algorithms. Geometric clustering produces new classes, for example grouping the locations of houses into neighborhoods. This stage is often performed before other data mining tasks, such as the discovery of associations between groups or other geographic features, or the characterization of a group.
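
The Euclidean-distance clustering of neighboring objects described here can be sketched as the assignment step of a k-means-style procedure: each object joins the nearest cluster center. This is a generic illustration, not a specific algorithm from the survey:

```python
import math

def euclid(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def assign_to_clusters(points, centroids):
    """One assignment step of a k-means-style geometric clustering:
    each point gets the label of its nearest centroid."""
    labels = []
    for p in points:
        dists = [euclid(p, c) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels
```

A full algorithm would alternate this step with recomputing the centroids until the labels stop changing.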

Spatial Data With Range Search With Encryption

security index. The scheme combines cryptography, image processing, and information retrieval using hash functions and inverted visual words; it yields slow performance on inverted-index lookups, and the cryptographic model incurs higher storage costs. The author in [8] studied the issue of CSP-side search operations; an encryption model was also suggested for controlling data access. Their technique works purely as public-key encryption, which raises the computational cost of decrypting the data. The authors in [9] suggested a scheme defining two algorithms that address searching efficiency and privacy; it was executed on personal health records, but it fails to support synonyms. An alternate scheme was suggested to handle irrelevant data and traffic issues; the method drastically lessened the processing cost and network traffic and eliminates search barriers in information retrieval systems. They also proposed other methods, named 'Ranked Searchable Order-Preserving Symmetric Encryption' and 'One-to-many order-preserving mapping'. The author in [11] framed a search method based on Identity-Based Encryption; their method supported single as well as multi-keyword queries, and any cloud user can access the data, but only authorized users can decrypt it. The author in [12] proposed an encryption technique that supports hierarchically functioning systems. Confidentiality is one of the security parameters used; the scheme makes use of a ranking system: depending on encrypted queries, it ranks the documents and returns the document having the most relevance.

Computing environments for spatial data analysis

mechanisms to overcome the current lack of speed of the internet. Important questions remain about the division of labor between the server and the client in terms of the analytical capability provided. Many technical issues must be resolved before web delivery of analysis becomes standard, but it is clearly an essential aspect of the analytical software tools of the future. Fourthly, the potential added functionality that could result from fostering a large community of developers in an open-source context should not be underestimated. While it is unlikely that spatial data analysis will attract the same degree of attention as the maintenance and refinement of an operating system such as Linux, the leverage of the input and commitment of many rather than a few could be significant. However, such a community can only exist if sufficient awareness and knowledge of the methods themselves has been generated, which is still far from being accomplished. Finally, there is likely to be an increasingly strong mutual reinforcement between spatial statistical and econometric methods and the computational tools that implement them in practice. For example, superior software tools for simulation have revolutionized the estimation of complex hierarchical models. Similarly, one can expect that significant advances in software tools for spatial data analysis will open up new opportunities for methodological and theoretical advances.

Spatial clustering method for geographic data

Margules et al. (1985) tested four agglomerative hierarchical fusion strategies with the adjacency constraint. The choice of classification strategy, which should depend on the type and amount of data and the objective of the classification, is an important decision that applies equally to constrained and unconstrained classification. In this research, the Quadtree data structure is used for the process of finding the optimal space-cluster. The applications discussed here are limited to the Quadtree data structure; however, the following method can be applied to other fusion strategies or data structures. Figure 1 shows an example in which an appropriate space-cluster is expressed using the Quadtree structure. Assuming the top level is the entire study area, the lower levels can be considered sub-areas, and each leaf can be considered the smallest sub-area, i.e., a space-cluster. That is, an adequate space-cluster can be obtained by traversing the tree using an evaluation function.
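
The traversal described, descending the Quadtree and accepting the shallowest nodes that satisfy an evaluation function as space-clusters, can be sketched as follows. The node class and the evaluation predicate are illustrative, not the paper's implementation:

```python
class QuadNode:
    """A Quadtree node covering a rectangular sub-area of the study area."""

    def __init__(self, bbox, children=None):
        self.bbox = bbox                  # (xmin, ymin, xmax, ymax)
        self.children = children or []    # empty list => leaf

def find_clusters(node, good_enough):
    """Traverse the Quadtree top-down and return the shallowest nodes
    accepted by the evaluation function as space-clusters; leaves are
    the smallest possible clusters."""
    if good_enough(node) or not node.children:
        return [node]
    out = []
    for child in node.children:
        out.extend(find_clusters(child, good_enough))
    return out
```

Swapping in a different `good_enough` predicate changes the granularity of the resulting space-clusters without touching the traversal.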

GeoDa: An Introduction to Spatial Data Analysis

methods for dynamic graphics outlined in Cleveland and McGill (1988). In geographical analysis, the concept of ‘‘geographic brushing’’ was introduced by Monmonier (1989) and made operational in the Spider/Regard toolboxes of Haslett, Unwin, and associates (Haslett, Wills, and Unwin 1990; Unwin 1994). Several modern toolkits for exploratory spatial data analysis (ESDA) also incorporate dynamic linking, and, to a lesser extent, brushing. Some of these rely on interaction with a GIS for the map component, such as the linked frameworks combining XGobi or XploRe with ArcView (Cook et al. 1996, 1997; Symanzik et al. 2000); the SAGE toolbox, which uses ArcInfo (Wise, Haining, and Ma 2001); and the DynESDA extension for ArcView (Anselin 2000), GeoDa’s immediate predecessor. Linking in these implementations is constrained by the architecture of the GIS, which limits the linking process to a single map (in GeoDa, there is no limit on the number of linked maps). In this respect, GeoDa is similar to other freestanding modern implementations of ESDA, such as the cartographic data visualizer, or cdv (Dykes 1997), GeoVISTA Studio (Takatsuka and Gahegan 2002), and STARS (Rey and Janikas 2006). These all include functionality for dynamic linking and, to a lesser extent, brushing. They are built in open-source programming environments, such as Tcl/Tk (cdv), Java (GeoVISTA Studio), or Python (STARS), and are thus easily extensible and customizable. In contrast, GeoDa is (still) a closed box, but of these packages it provides the most extensive and flexible form of dynamic linking and brushing for both graphs and maps.

Spatial and Spatiotemporal Modeling of Epidemiological Data

The availability of spatial and spatiotemporal data has increased substantially in the last few decades due to advancements in computational tools, which enable us to collect real-time data coming from GPS, satellites, etc. (Cressie, 2015; Plant, 2012; Ripley, 2005). Researchers in a wide variety of fields, from epidemiology and forestry to sociology and hydrology, now have to deal with spatial and spatiotemporal data. Spatial data comprises information about both an attribute of interest and its location. The location may be a set of coordinates, such as longitude and latitude, or a small area, such as a census tract or county.

Finding Spatial Patterns in Network Data

Finding interesting patterns in network data is an important task for network analysts and managers to recognize and respond to changing conditions quickly, within minutes when possible. This situation creates new challenges in coping with scale. Firstly, analyzing the huge amounts (usually terabytes) of ever-growing network data in detail and extracting interesting knowledge or general characteristics about the network behavior is a very difficult task. Secondly, in practice, network data with geographic attributes are involved, and it is often important to find network patterns involving geospatial locations. In this paper we address the problem of finding interesting spatial patterns in network data. Sharing ideas and techniques from the pattern visualization and geospatial visualization areas can help to solve this problem. We provide some examples of effective visualizations of network data in an important area of application: the analysis of e-mail traffic.

Statistical Inference for Large Spatial Data

As outlined in [6], for a given estimation problem, the choice of a suitable CL function should be driven by statistical and computational considerations. However, it is noticeable that many existing CL methods are constructed with equally weighted pairs because of their simplicity. To improve statistical efficiency, or to further reduce the computational burden associated with large data sets that have an enormous number of pairs, several investigations have considered the choice of weights when constructing a CL in the context of a spatial process. One popular strategy is to use binary weights to exclude those pairs whose distances are beyond a certain taper range [4, 7]. [8] and [9] consider selecting a taper range by maximizing certain criteria derived from the Godambe information matrix of a CL estimator. The CL estimators based on binary weights have improved statistical efficiency over equally weighted CL methods. However, these methods ignore dependence among selected pairs and hence can still lead to considerable loss of statistical efficiency. Thus far, very limited work has been done on designing non-binary weights. [8] investigated a weighted composite score for a scalar parameter and constructed a weight by minimizing an upper bound of the asymptotic variance of the estimates. They find that the proposed weighted CL method performs better than both the binary weighted method and the equally weighted CL method for Gaussian random fields. [10] proposed a joint composite estimating function (JCEF) approach, using a weight matrix, for spatio-temporally clustered data.
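
The binary taper weighting described here simply retains pairs of locations within the taper range and discards the rest when forming the pairwise composite likelihood. A minimal sketch of constructing such weights (the data layout is illustrative):

```python
import math

def pair_weights(locations, taper_range):
    """Binary taper weights for a pairwise composite likelihood: keep only
    pairs of sites closer than `taper_range`; all kept pairs are weighted
    equally, which is the simple scheme criticized in the text."""
    n = len(locations)
    w = {}
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(locations[i], locations[j])
            w[(i, j)] = 1.0 if d <= taper_range else 0.0
    return w
```

The non-binary schemes discussed in the excerpt replace the 1.0 with a value chosen to reduce the asymptotic variance of the resulting estimator.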

Modelling of spatial effects in count data

Spatial errors following the CAR scheme are included in count data models, which are typically estimated using Bayesian Markov chain Monte Carlo (MCMC) and applied to a wide range of data, e.g. traffic crash data (Aguero-Valverde and Jovanis, 2006; Buddhavarapu et al., 2016; Li et al., 2007; Miaou et al., 2003; Quddus, 2008; Truong et al., 2016), pedestrian casualty counts (Graham et al., 2013; Wang and Kockelman, 2013), crime counts (Jones-Webb et al., 2008; Haining et al., 2009), emergency department visits (Neelon et al., 2013), commuting patterns (Chakraborty et al., 2013), insurance claim numbers (Czado et al., 2014; Dimakos and Rattalma, 2002; Gschlößl and Czado, 2007, 2008), and firm births (Liviano and Arauzo-Carod, 2014). The CAR approach for modelling spatial heterogeneity is also very popular in biometrics, e.g. for cancer counts (Bernardinelli and Montomoli, 1992; Torabi, 2016; Waller et al., 1997; Xia et al., 1997; Xia and Carlin, 1998; Wakefield, 2007), diabetes mellitus cases (Bernardinelli and Clayton, 1995; Bernardinelli et al., 1997), or malaria counts (Briet, 2009; Villalta et al., 2012). Various other specifications of spatial error models for count data are applied in the literature as well: LeSage et al. (2007) use a simultaneous autoregressive scheme to model European patent data, Jiang et al. (2013) multiply two different spatial random effects in their Poisson temporal-spatial random effect model for traffic crashes in Florida, and Basile et al. (2013) employ a geoadditive negative binomial model, which includes a bivariate smooth term of latitude and longitude, for greenfield investments in the European Union, to name a few.
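
A proper CAR prior of the kind used in these models is usually specified through its precision matrix Q = τ(D − ρW), where W is the binary spatial adjacency matrix, D is the diagonal matrix of neighbor counts, and ρ controls the strength of spatial dependence. A minimal sketch of building Q (the parameter values are illustrative):

```python
def car_precision(W, tau=1.0, rho=0.9):
    """Precision matrix Q = tau * (D - rho * W) of a proper CAR prior.

    W: binary adjacency matrix as a list of lists (W[i][j] = 1 if areas i
    and j are neighbors); D is the implied diagonal of neighbor counts.
    """
    n = len(W)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Q[i][i] = tau * sum(W[i])            # D entry: number of neighbors
        for j in range(n):
            if i != j:
                Q[i][j] = -tau * rho * W[i][j]
    return Q
```

In an MCMC implementation this Q parameterizes the Gaussian prior on the vector of spatial random effects that enters the count model's linear predictor.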

Data Quality Management: Trade-offs in Data Characteristics to Maintain Data Quality

utes every half an hour, as a leader I should get a report out of that. Which store of mine, what product of mine, what promotion of mine, which are those guests, what are their segmentations, who is buying what. Do I ever cater to all of their needs, all of their wants, in a proper manner? If there are a lot of returns happening, I need that information very promptly. So, timeliness is a big factor now. I want to take decisions right away: if the promotion is working, let's increase it; if the promotion is not working, let's kill it. If inventory is getting out of stock, somehow get the inventory from somewhere, because guests are buying a lot of that, so I want it to be provided. So there, timeliness is very important. During the holiday season, out of stock is a very important metric for me. I can be very interested to understand, in all of the stores, what is the current situation of the stock for a particular item if it is selling like hot cakes. There might be a situation where a few of the stores have gone completely out of stock. But that was not accurate information; there was still some stock left, but it gives me a hint that I need a constant supply of that item if I want to catch hold of buyers. I can go a little low on accuracy, not completely, but be very prompt on the timeliness. But let's see a situation where I want a weekly report of a particular promotion, on how it has worked for a particular area. Since it is a weekly report, I am not very much concerned about time. You give me Saturday morning, Friday evening or Monday morning, I am okay. I will take a decision based on the complete data from last week that you have provided me. But if it is not accurate, if it is not very precise, and if it is not giving me correct information, that's going to be a big pain. Here I cannot trade off accuracy; I can trade off a little on the timeliness. You give it to me later, but you give it to me accurate.
