Toward an interactive system for checking spatio-temporal data quality

Academic year: 2021

(1)

Toward an interactive system for checking spatio-temporal data quality

Dounia Azzi, Christine Plumejeaud, Marlène Villanova-Oliver, Jérôme Gensel

Laboratory of Informatics of Grenoble (LIG), Grenoble, France

(2)

Overview

Context

Data quality

Examples

Qualestim

Conception

Validation

(3)

Three dimensions

Time: 1950 - 2050

Thematic: social, economic, environmental, demographic

Space: from local to world scales

UNEP

A spatio-temporal information cube

Objective: checking data quality

Geographical information HETEROGENEITY through space, time, and thematic

(4)

Metadata describe data: the identification, the provider, the lineage, the quality, …

ISO 19115 defines a standard for describing geographic information that can be adapted to statistical data (profile)

Poor quality report

(5)

Produce reports on data quality: HOW?

Outlier detection: find the values that do not resemble their neighborhood

Complete metadata

Allow visualization of data and metadata in interactive tabs

Average evolution (high range)
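The idea of a value that does not resemble its neighborhood can be sketched as follows. This is a minimal illustration, not the method used in Qualestim: the function name, the adjacency dictionary, and the threshold of 3 are all assumptions made for the example. Each unit is compared with the median of its neighbors, and the gap is judged against the typical gap over the whole map.

```python
from statistics import median


def neighborhood_outliers(values, neighbors, threshold=3.0):
    """Flag units whose value is far from the median of their neighbors.

    values:    dict mapping a unit id to its numeric value
    neighbors: dict mapping a unit id to the ids of adjacent units
               (hypothetical adjacency, supplied by the caller)
    threshold: how many times the typical gap counts as abnormal (assumed)
    """
    # Gap between each unit and the median value of its neighbors.
    gaps = {}
    for unit, adjacent in neighbors.items():
        local = [values[n] for n in adjacent if n in values]
        if unit in values and local:
            gaps[unit] = abs(values[unit] - median(local))
    if not gaps:
        return []
    # Median gap over all units, used as a robust yardstick.
    scale = median(gaps.values())
    if scale == 0:
        return sorted(u for u, g in gaps.items() if g > 0)
    return sorted(u for u, g in gaps.items() if g > threshold * scale)


# Toy region in which every unit borders every other one (assumption).
values = {"A": 10, "B": 11, "C": 12, "D": 95, "E": 11, "F": 10}
neighbors = {u: [v for v in values if v != u] for u in values}
print(neighborhood_outliers(values, neighbors))  # ['D']
```

With sparse neighborhoods, the units next to an abnormal value also show large gaps, so the threshold and the neighborhood definition both need tuning on real data.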

(6)

Tools for visualization and geo-computing

Columns: Domain analysis | Data Visualisation | Statistical Analysis | DB Connection | Metadata Visualisation

Sada [1997]: x x
GeoDa [1998]: x x
CrimStat [2004]: x x
QuantumGis [2002]: x
Grass [2010] [R programming]: x

None of those allow visualization of data and metadata!

(7)

Outlier: An outlying observation, or outlier, is one that appears to deviate markedly from other members of the sample in which it occurs. [Grubbs, 1969]

Geostatistical methods

Check abnormal values

Spatial, temporal and thematic dimensions

Uni/bi/multi-variate
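For the univariate case, a common way to check abnormal values is the standard boxplot rule. The sketch below assumes the usual Tukey fences at 1.5 times the interquartile range; the function name and the sample values are made up for illustration.

```python
from statistics import quantiles


def boxplot_outliers(values, k=1.5):
    """Return the values lying outside the boxplot fences.

    k is the fence multiplier; 1.5 is the conventional choice (assumption).
    """
    # Quartiles of the sample; "inclusive" treats the data as the full population.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


sample = [10, 11, 12, 11, 10, 95, 12, 11]
print(boxplot_outliers(sample))  # [95]
```

The rule looks only at the distribution of one indicator, so it ignores where the units are located and when the values were measured.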

Outlier detection

R [http://www.r-project.org/] is a well-known tool for statistics, with spatial packages

Methods | Univariate: Thematic, Spatial | Bivariate*/Multivariate: Thematic, Spatial

Standard Boxplot: x
Adjusted Boxplot: x
Bagplot: x
* Mahalanobis distance: x
Principal Components Analysis (PCA): x
Local Regression: x x
Multiple Linear Regression: x
Geographically Weighted Regression: x x
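Among the bivariate methods listed, the Mahalanobis distance is easy to write out by hand. The sketch below is a plain-Python illustration for two indicators only, using the sample covariance matrix; the function name and the data are hypothetical, and no cutoff is applied because the choice of cutoff is left to the analyst.

```python
from math import sqrt
from statistics import mean


def mahalanobis_2d(points):
    """Mahalanobis distance of each (x, y) pair to the sample mean."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    # Sample variances and covariance (denominator n - 1).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the inverse of the 2x2 covariance matrix.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(sqrt(d2))
    return distances


# Five points close to the line y = x and one point off the trend.
points = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.1), (3, 6.0)]
for point, d in zip(points, mahalanobis_2d(points)):
    print(point, round(d, 2))
```

The off-trend point gets the largest distance even though neither of its coordinates is extreme on its own, which is what a univariate check would miss.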

(8)
(9)
(10)
(11)
(12)

Example : Hawkins’s test

Exponential: $f(d) = \frac{\alpha^{2}}{2\pi}\, e^{-\alpha d}$

Tobler’s first law of geography: Everything is related to everything else, but near things are more related than distant things. [Tobler, 1970]
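Tobler's law is commonly turned into weights that decay with distance. The sketch below assumes a negative-exponential kernel with rate alpha; the normalising constant alpha²/(2π) is an assumption and cancels out in the weighted mean. It is not an implementation of Hawkins's test, only an illustration of how nearby units can count more than distant ones when estimating what a value "should" be. All function names are hypothetical.

```python
from math import exp, hypot, pi


def exponential_kernel(d, alpha):
    """Weight given to a unit at distance d (assumed exponential decay)."""
    return alpha ** 2 / (2 * pi) * exp(-alpha * d)


def weighted_neighborhood_mean(target, others, alpha):
    """Distance-weighted mean of the values observed around a target point.

    target: (x, y) coordinates of the unit being checked
    others: list of (x, y, value) tuples for the surrounding units
    """
    weights = [
        exponential_kernel(hypot(x - target[0], y - target[1]), alpha)
        for x, y, _ in others
    ]
    total = sum(weights)
    return sum(w * v for w, (_, _, v) in zip(weights, others)) / total


# A close unit with value 10 and a distant unit with value 100.
around = [(1.0, 0.0, 10.0), (10.0, 0.0, 100.0)]
print(weighted_neighborhood_mean((0.0, 0.0), around, alpha=1.0))
```

A larger alpha makes the weights fall off faster, so the estimate depends almost entirely on the closest units.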

(13)

Overview

Context

Data quality

Examples

Qualestim

Conception

Validation

(14)

Qualestim architecture

SPATIO-TEMPORAL DATABASE

R-stat embedded in Java (JRI): spatial analysis, thematic analysis, time series

Outliers map

Export metadata on quality with complete reports

Expert can visually explore the data

(15)

(16)

ISO 19115 profile for metadata [Plumejeaud, 2010]

Metadata for: Dataset, Stock, Values

(17)

(18)

(19)

Java-R Interface

Creates an R virtual machine inside the JVM

Lets us load the data into R objects, such as a SpatialPolygonsDataFrame object

(20)
(21)

(22)

(method + parameters)

Model of a quality report
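A quality report that records the method and its parameters could be modelled along these lines. This is only a sketch: the class name, the field names, and the JSON export are hypothetical and do not reproduce the ISO 19115 profile or the Qualestim data model.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class QualityReport:
    """Hypothetical record of one outlier-detection run."""
    dataset: str                                        # dataset that was checked
    indicator: str                                      # indicator (variable) examined
    method: str                                         # detection method applied
    parameters: dict = field(default_factory=dict)      # settings of the method
    flagged_units: list = field(default_factory=list)   # units reported as outliers
    comment: str = ""                                   # free-text note by the expert

    def to_json(self):
        """Serialise the report so it can be stored or exported."""
        return json.dumps(asdict(self), sort_keys=True)


report = QualityReport(
    dataset="example dataset",
    indicator="example indicator",
    method="standard boxplot",
    parameters={"k": 1.5},
    flagged_units=["D"],
)
print(report.to_json())
```

Keeping the method and its parameters next to the flagged units lets an expert rerun or revise the check later.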

(23)

(24)

Qualestim

Data and metadata visualization

Outlier detection

(25)

Qualestim – Soon to come

(26)

Qualestim – Soon to come

Editing of quality reports

Export

Spatio-temporal database

+

(27)

Overview

Context

Data quality

Examples

Qualestim

Conception

Validation

(28)

(29)

(30)

Outcome

Interactive viewer showing data together with its metadata

Quality reports are created using outlier detection methods, and can be viewed and modified

Future work

Short term

Test the system with users

Integrate pre-computed parameters and parameter suggestions (connection with a knowledge database)

Long term

(31)

END
