A Content Management System for Spatio-Temporal Data: Tadaima


UNIVERSITÀ DEGLI STUDI DI NAPOLI “FEDERICO II”

PhD Programme in Computational Sciences and Informatics, XXV Cycle (Corso di Dottorato in Scienze Computazionali e Informatiche, XXV Ciclo)

A Content Management System for Spatio-Temporal Data: Tadaima

A Case of Study in Cultural Heritage field

PhD thesis by Vincenza Anna Leano

31/03/2013

PhD Programme Coordinator: Prof. Gioconda Moscardiello

Scientific Advisor: Prof. Francesco Cutugno

Academic Tutor: Prof. Adriano Peron


To my father Pasquale

“e figl se vasn n’suonn…”


Abstract

This thesis focuses on spatio-temporal data modeling and visualization, with an application case study in the Cultural Heritage field.

Spatio-temporal data visualization plays an important role in presenting data to users. Offering a synchronized view of the three dimensions of data (descriptive, temporal and spatial) supports users in their knowledge discovery process. In the Cultural Heritage field, time plays an important role in data exploration: the same timeline can be viewed in different thematic contexts, the temporal domain can be stratified, and time references can be qualitative and imprecise. Managing these features improves the users' ability to recognize patterns in data.

This thesis presents a framework for the manipulation of spatio-temporal data, with particular attention to the temporal specification needs of the Cultural Heritage context, and produces a prototype of a Content Management System (CMS). The proposed framework exploits RDF technology for the definition and manipulation of (meta-)data and adopts OGC standards and open source technologies (PostGIS, GeoServer and OpenLayers) for encoding, representing and retrieving spatial information. The available spatio-temporal metaphors are parametric, so users can personalize them according to the specific application context and needs. A real case study in the Cultural Heritage field, concerning spatio-temporal information contained in literary Latin and Greek texts that refer to the geographic area of Campi Flegrei (Naples, Italy), illustrates the framework functionalities.


Contents

List of Figures

Fig. 1: GML Geometry Encoding

Fig. 2: GML Geometry Class Hierarchy

Fig. 3: KML Components

Fig. 4: DescribeFeature Request XSD Schema

Fig. 5: GetFeature Request XSD Schema

Fig. 6: WFS Query Element

Fig. 7: GetFeature Response XSD Schema

Fig. 8: GetGmlObject Request XSD Schema

Fig. 9: LockFeature Request XSD Schema

Fig. 10: Transaction Request XSD Schema

Fig. 11: Transaction Response XSD Schema

Fig. 12: Filter Operator XSD Schema

Fig. 13: Filter Comparison Operators (OGC, 2010)

Fig. 14: Filter Spatial Operator (OGC, 2010)

Fig. 15: Filter Spatial Relationship

Fig. 16: OWS Web Architecture (OGC, 2005)

Fig. 17: RDF Statement Graph Representation

Fig. 18: Spatio-Temporal Classification Used in (Kisilevich, 2005)

Fig. 19: Representation of Minard’s Map in Space–Time Cube (Andrienko, 2003)

Fig. 20: (Lee et al., 2005) Web Interface Prototype

Fig. 21: (Stefanakis, 2008) Web Application

Fig. 22: Bertin Visual Variables

Fig. 23: Dot Map (Wikipedia, 2013)

Fig. 24: Proportional Symbol Map (Sandvik, 2008)

Fig. 25: Flow Map (Wikipedia, 2013)

Fig. 26: Isarithmic Map (Wikipedia, 2013)

Fig. 27: Choropleth Map (Wikipedia, 2013)

Fig. 28: Prism Map (Sandvik, 2008)

Fig. 29: Tag Cloud-Map

Fig. 30: Tree-Map (Slingsby et al., 2008)

Fig. 31: Meta-Model Schema

Fig. 32: Temporal Model Schema

Fig. 33: The Transitivity Table for the 12 Temporal Relationships (Omitting =) (Allen, 1983)

Fig. 35: Time Inconsistency Example

Fig. 36: Temporal Relation between Nodes of Temporal Interval Tree

Fig. 37: Quantitative Temporal Reference Graph

Fig. 38: Qualitative Temporal Reference Graph

Fig. 39: Spatial Model

Fig. 40: Spatial Domain DB Schema

Fig. 41: Spatial Reference DB Implementation

Fig. 42: Spatio-Temporal Classification

Fig. 43: Spatio-Temporal Properties RDF Schema

Fig. 44: Spatio-Temporal Reference DB Schema

Fig. 45: CMS Installation Steps

Fig. 46: Feature Creation: Model Designer User Interface

Fig. 47: Feature Creation Diagram

Fig. 48: Feature Editing Diagram

Fig. 49: Auto-Generated Interface for Inserting-Editing LiterarySource Feature

Fig. 50: Temporal Thematic Context and Layers Creation

Fig. 51: Insert Quantitative Period-Event

Fig. 52: Place Instance Creation Interface

Fig. 53: Visualization Personalization Diagram

Fig. 54: Front-End Architecture

Fig. 55: Exchange Protocol

Fig. 56: Front-End Interface Mock-Up

Fig. 57: Timeline Mock-Up

Fig. 58: Timeline General Panel Actions

Fig. 59: Timeline Selection Panel: Interval Selection

Fig. 60: Descriptive Component Mock-Up: a) Active Layers, b) Detail Panel

Fig. 61: TRACCIA Abstract Class Diagram

Fig. 62: Cluster Metaphor Applied to Literary Passage

Fig. 63: Path Metaphor Applied to Author

Fig. 64: Three Panel Web Interface: a) Active Layers Panel, b) Geobrowser, c) Timeline

Fig. 65: Temporal Filter

Fig. 66: Balloon and Details Panel


List of Tables

Table 1: GetCapabilities Parameters

Table 2: GetMap Request Parameters

Table 3: GetFeatureInfo Request Parameters

Table 4: RDFS Property Declaration Example

Table 5: First Order Logic Formula for RDFS Triples

Table 6: XML and RDF Representations

Table 7: XQuery and SPARQL

Table 8: RDF Meta-Model

Table 9: Meta-Model Customization

Table 10: Allen’s Properties Mapping

Table 11: Temporal Domain RDF Schema

Table 12: Temporal Interval Forest Creation Algorithm

Table 13: SPARQL Query Resolving Quantitative Period-Event Temporal Reference

Table 14: SPARQL Query for Qualitative Period-Event Reference

Table 15: Spatial Domain RDF Model

Table 16: WFS Request Retrieving Spatial Layer Elements

Table 17: getChild(PlaceInstance) WFS Request

Table 18: Spatio-Temporal Properties RDF


Introduction

Studies show that 80% of the data stored in databases has a spatial reference (Franklin, 1992), and it is reasonable to think that this proportion also holds for temporal references. The state of the art has long highlighted how these two features are “special” (Egenhofer, 1993), defining ad hoc representation models, exploration operators and visualization metaphors to better render them to the user.

From a theoretical point of view, a spatio-temporal data model has to be designed so that all (or most of) the possible aspects of temporal, spatial and spatio-temporal features can be represented, offering operators to manipulate and exploit them (Yuan, 1996). Many applications managing spatio-temporal data, based on desktop or web architectures, have been proposed, e.g. (Andrienko et al., 2007) and (MacEachren et al., 2004). These applications often focus on particular aspects of spatio-temporal data and use proprietary tools for the representation or visualization of data. The first goal of this thesis is to design a general temporal data model able to manage all the aspects of a spatio-temporal phenomenon, based on existing standards and relying on a flexible architecture independent of proprietary visualization and storage tools.

The temporal domain plays a strategic role in several fields, such as Cultural Heritage. Experts in this field organize the temporal domain in different ways according to the context under analysis (e.g. “History of Naples” or “History of Literature”). Each context may have different granularity levels (for example, “History of Naples” could have the levels “Domination”, “Empire”, “Battle” and “Important Events”). The elements of such a timeline may lack a precise quantitative dating and have only topological relations with other events in the same context. From the visualization point of view, mapping this complex temporal structure onto a common temporal metaphor, such as the one-dimensional time-bar, would cause a loss of information and could compromise the data exploration process. This work offers the possibility of modeling such a domain together with an adequate visualization metaphor to explore it.
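As a rough illustration of the structure just described, the sketch below models timeline elements that belong to a thematic context and a granularity level, and that may be ordered either by dates or only by a qualitative "comes after" link. The class `PeriodEvent`, the function `is_before`, the event names and the years are all hypothetical placeholders; this is not the data model of the thesis prototype.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PeriodEvent:
    """Timeline element (illustrative names, not the thesis schema)."""
    name: str
    context: str                  # thematic context, e.g. "History of Naples"
    level: str                    # granularity level, e.g. "Battle"
    start: Optional[int] = None   # placeholder year; None when dating is unknown
    end: Optional[int] = None
    after: list = field(default_factory=list)  # names of events known to precede this one


def is_before(a, b, events):
    """True if a precedes b, by dates when both are known, else by qualitative links."""
    if a.end is not None and b.start is not None:
        return a.end <= b.start
    # No usable dates: walk the 'after' links backwards from b, looking for a.
    stack, seen = list(b.after), set()
    while stack:
        name = stack.pop()
        if name == a.name:
            return True
        if name not in seen:
            seen.add(name)
            stack.extend(events[name].after)
    return False


# Placeholder events: only the first one has a quantitative dating.
events = {e.name: e for e in [
    PeriodEvent("Domination A", "History of Naples", "Domination", start=100, end=200),
    PeriodEvent("Battle B", "History of Naples", "Battle", after=["Domination A"]),
    PeriodEvent("Event C", "History of Naples", "Important Events", after=["Battle B"]),
]}
```

With these placeholders, "Event C" can be placed after "Domination A" even though it has no dates, which is the kind of information a single numeric time-bar cannot express.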

The requirement of organizing a domain into thematic contexts with different granularity levels applies to the spatial domain too. Furthermore, the evolution of spatial domain elements over time, and the correlation of ancient places with present-day locations, play a strategic role in the users' knowledge discovery process. Users can reference an object in space in several ways: for example, they can give absolute georeferences by providing coordinates or drawing a geometry on a map, or they can use a semantic link, referring to a place by its name. As in the temporal domain, users may lack precise information about the spatial reference and have only incomplete information relative to a well-known place, for example “10 km north of Naples” or “inside an area”. This work provides a spatial model, fully based on existing encoding and operator standards, that is able to manage these particular needs of the spatial domain.

Several different forms of spatio-temporal data exist in the real world, and the state of the art offers many ways to classify them (see e.g. (Asproth et al., 1995), (Kisilevich, 2005), (Nadi & Mahmoud, 2003)). For each kind of spatio-temporal dataset, several visualization metaphors have been proposed (Andrienko et al., 2003) in order to improve the users' knowledge discovery process. Many of the currently available spatio-temporal visualization systems focus on the spatial and descriptive representation of spatio-temporal phenomena, limiting the representation of the temporal dimension to a numeric value or to a one-dimensional time-bar (timeline). This thesis provides a web interface that allows users to explore data in their spatial, temporal and descriptive dimensions in an independent and synchronized way.

A spatio-temporal application has to handle a data model capable of managing temporal, spatial and spatio-temporal domains, and has to offer adequate visualization metaphors to represent the managed features.

Building an application that allows users to model, store and interact with these features and their spatial, temporal and spatio-temporal dimensions implies dealing with a set of technologies, such as web servers, spatial databases, geobrowsers and temporal data structures, that common users may not master. The underlying idea of this thesis is to provide a framework that allows users to model spatial, temporal and spatio-temporal features regardless of their specific application domain. Starting from the triad model (Peuquet, 1994), we consider our data as simply having a set of properties, regardless of their semantic meaning.

The RDF data model proposed by the W3C suits this purpose well. In RDF, a statement about a resource is represented by a triple <Subject, Predicate, Object>. Each triple states a relationship between the things denoted by the nodes that it links (W3C, 2004). A triple therefore represents a link (predicate) between two nodes (subject, object), and the set of all triples forms an RDF graph. An object/feature instance in the triad model can be seen as a set of triples where the subject is the feature itself, the predicates are the different properties (spatial, temporal and descriptive) and the objects are the corresponding property values.

Using RDF Schema concepts, one can model a generic feature by defining classes and properties in a way very similar to object-oriented design.
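A minimal sketch of this view, using plain Python tuples rather than an RDF library: a feature is the set of triples that share it as subject. All `ex:` identifiers and the helper `properties_of` are hypothetical names chosen for illustration.

```python
# Each triple is (subject, predicate, object); identifiers are illustrative only.
FEATURE = "ex:passage1"

triples = {
    (FEATURE, "rdf:type", "ex:LiteraryPassage"),
    (FEATURE, "ex:title", "Sample passage"),        # descriptive property
    (FEATURE, "ex:refersToPlace", "ex:place1"),     # spatial property
    (FEATURE, "ex:refersToPeriod", "ex:period1"),   # temporal property
    ("ex:place1", "ex:name", "Naples"),
}


def properties_of(subject, graph):
    """Collect predicate -> object for every triple whose subject matches."""
    return {p: o for s, p, o in graph if s == subject}


props = properties_of(FEATURE, triples)
```

Here `props` holds the four properties of the feature, while the triple about `ex:place1` stays in the same graph as a separate node.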

One of the most important advantages of the RDF data model is that the model and the data share the same data structure (a graph), so both can be queried with the same query language (SPARQL), even if the specific data model is unknown. This is essential for independence from the specific application domain, because a property can be treated as a general predicate that is instantiated and personalized according to user needs.

SPARQL does not yet support spatial and temporal operators. Some works introduce spatial and temporal query algebras, but none of them is currently a W3C recommendation. In addition, the models they propose do not support the thematic and hierarchical stratification of the spatial and temporal domains proposed in this work.
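The point that model and data live in one graph and answer to the same kind of query can be sketched with a tiny triple-pattern matcher. This is only an analogy to a SPARQL basic graph pattern, written in plain Python; the function `match` and the `ex:` identifiers are assumptions made for the example.

```python
def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None is a wildcard, like a SPARQL variable."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


graph = {
    # model (schema) triples
    ("ex:Place", "rdf:type", "rdfs:Class"),
    ("ex:name", "rdfs:domain", "ex:Place"),
    # data (instance) triples
    ("ex:place1", "rdf:type", "ex:Place"),
    ("ex:place1", "ex:name", "Naples"),
}

# The same pattern shape queries the schema...
classes = match(graph, p="rdf:type", o="rdfs:Class")
# ...and the instances, without knowing the model in advance.
instances = match(graph, p="rdf:type", o="ex:Place")
```

Because the predicate is just another value in the pattern, a caller can discover which properties exist before asking for their values.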

To overcome this limitation, this thesis provides a data model specification for spatial, temporal and spatio-temporal domains. The “semantic” entities are modeled using RDF/OWL structures, while the specific operators are implemented using ad hoc data structures and/or defined standards.

To allow users to define a generic feature, an RDF(S)/OWL meta-model has been designed that supports the definition of spatial, temporal, spatio-temporal and descriptive properties.

The CMS back-end provides user-friendly methods to create features and model their properties. The creation process also allows importing standard RDF/OWL ontologies and personalizing some predefined Cultural Heritage feature classes. Users can choose between different types of views and visual metaphors for visualizing their data in the front-end.

The CMS front-end incorporates and synchronizes spatial, temporal and descriptive views in an integrated and extensible way. It provides a new interaction metaphor that exploits the hierarchical and stratified temporal domain defined and presented in our work (Cerasuolo et al., 2012). The front-end consists of a configurable interface that allows users to interact independently with the three dimensions of data (spatial, temporal and descriptive), and it offers spatio-temporal visualization with a high level of personalization. Users can choose between different types of views and visual metaphors to see, filter and compare objects in an area handled by a geobrowser, and activate or disable the information layers of interest. By interacting with a map, users can perform spatial queries to obtain information about the referenced objects and their content. Users can visualize the complex temporal structure in an intuitive way and can perform complex temporal queries by clicking on the desired temporal element.

The CMS front-end extends our work presented in (Cutugno et al., 2012), i.e. a framework that embeds a spatio-temporal data model in a web architecture that is compliant with existing standards and independent of data storage and visualization tools. The framework defines a flexible three-tier architecture for web applications that shows low coupling among tiers and uses standard exchange protocols and data formats such as WFS, KML and GML to guarantee independence from storage and visualization tools.

As an example, we propose a case study adopting the approach outlined above, aimed at promoting a rich archaeological site in the area of Campi Flegrei, in the neighbourhood of Naples. In particular, the CMS prototype currently manages about 200 literary excerpts from Greek and Latin authors, annotated with their spatial references (places in the Campi Flegrei area) and temporal references. The temporal domain is composed of about 400 temporal period-events structured in three thematic contexts (“History of Campi Flegrei”, “History Events”, “Authors’ Life”). Each thematic context is arranged in five granularity layers (“Epoch”, “Ages”, “Empire/Domination”, “Political and Historical Events”, “Important Events”).

Outline

The core of this thesis is divided into two parts. Part I introduces some relevant background concepts on standard spatial models and operators and on RDF technology, together with an overview of the most relevant related work. In particular, Part I is composed of:

• CHAPTER 1: this chapter introduces the Open Geospatial Consortium (OGC) standards to encode spatial data (GML, KML) and the spatial web services that implement the standard spatial operators (WMS, WFS, WCS). It also describes the suggested architecture for web applications and gives an overview of the most used GIS tools.

• CHAPTER 2: this chapter presents some background concepts on Resource Description Framework (RDF) data modelling, used for the CMS data model, and its query language SPARQL. An overview of the Semantic Web and OWL is also provided.

• CHAPTER 3: this chapter covers related work on spatio-temporal data modelling, spatio-temporal visualization, and spatial, temporal and spatio-temporal RDF/OWL ontologies.

Part II describes the proposed work in terms of data model, prototype and case study:

• CHAPTER 4: this chapter presents the data model the CMS relies on. In particular, it covers: the meta-model, which allows users to describe features having spatial, temporal, spatio-temporal and descriptive properties; the temporal domain, which uses an original model to handle user-defined contexts, granularities and qualitative temporal references; the spatial domain, which uses a model inspired by the OGC standards and able to manage semantic, absolute and uncertain spatial references; and the integration of space and time properties, which manages the various types of spatio-temporal phenomena.

• CHAPTER 5: this chapter illustrates the implemented CMS prototype by showing the back-end functions for creating a model and personalizing the visualization, and the front-end interface, which allows users to interact independently with the three dimensions of data in a synchronized way and offers a new visual metaphor to explore the hierarchical and stratified temporal domain.

• CHAPTER 6: this chapter describes the application of the CMS prototype to a real case study in Cultural Heritage. It presents a deliverable of the project named TRACCIA, supported by “FARO – Università degli Studi di Napoli Federico II – Polo delle Scienze Umane e Sociali”, and documents joint work with Latin philologists of the department "Filologia Classica F. Arnaldi". The aim of the project was to find, document and give public access to the literary and historical evidence of typical agricultural products in the area of Campi Flegrei.

The thesis closes with some conclusions and remarks on future work.


Part I


Chapter 1

OGC and Spatial Standard

Tools that deal with spatial data and perform spatial operations are increasingly available. With the advent of internet technologies, GIS applications have also moved to the web. When geographic data are shared between organizations that use different GIS platforms, and therefore produce different digital data formats, a heterogeneity problem may arise.

The Open Geospatial Consortium (OGC) is an international voluntary standards organization that defines open standards for geospatial content and services and suggests best practices for Web-GIS architectures. The rest of this chapter presents the spatial data encoding formats, the OGC Web services, the Web GIS architecture and some geospatial tools.

1.1 Encoding Spatial Data

Today, the internet is the main platform for data sharing. Sharing and integrating geographic data in the internet environment therefore requires a standard data format that is interoperable, extensible and suitable for internet technology.

OGC, whose mission is to address the lack of interoperability between systems that process geospatial data, has developed encoding and interface standards to provide syntactic interoperability among geospatial web services. The main encoding standards are the Geography Markup Language (GML) (Cox et al., 2002) and the Keyhole Markup Language (KML) (Wilson, 2008), described in the following subsections.

1.1.1 GML

The Geography Markup Language (GML) is an XML encoding, compliant with ISO 19118, for the transport and storage of geographic information modeled according to the conceptual modeling framework used in the ISO 19100 series; it covers both the spatial and non-spatial properties of geographic features (Cox et al., 2002).


they are easier to understand and maintain than proprietary binary formats. GML separates content of geographic data from its presentation. GML mainly describes the structure of geographic data without regard to how the data can be presented to a human reader.

OGC initially developed GML 1.0, which was based on a combination of XML DTDs and the Resource Description Framework (RDF). GML 2.0, which replaced GML 1.0, was adopted by OGC in March 2001 and is entirely based on XML Schema. The adoption of XML Schema brings support for type inheritance, distributed schema integration and namespaces. GML 2.0 is based on linear geometry: it allows coordinates to be specified in three dimensions, but it does not provide direct support for three-dimensional geometric constructs. GML 3.0 extends the language to represent geospatial phenomena beyond simple 2D linear features, including features with complex, non-linear, 3D geometry, features with 2D topology, features with temporal properties, dynamic features, coverages and observations. It also provides more explicit support for properties of features and other objects that have complex values.

1.1.1.1 GML Feature

GML is based on the geographic model developed by the OGC, which describes the world in terms of geographic entities called features. This geographic model is based on the OGC Abstract Specification (Open Geospatial Consortium, 2003), which defines a geographic feature as “an abstraction of a real world phenomenon, it is a geographic feature if it is associated with a location relative to the Earth”.

Thus, real world phenomena are represented digitally as a set of features. The state of a feature is defined by a set of properties, where each property has a name, a value and a type description. Geographic features are those features whose properties may be geometry-valued.

In GML a feature is represented as an XML element. The name of the feature element indicates the feature type. The content of a feature element is a set of elements that describes the feature in terms of a set of properties. Each child element of the feature element is a property, and the name of the property element indicates the property type.

The value of a property is given in-line, by the content of the property element, or by reference, as the value of a resource identified by a link carried as an XML attribute of the property element. If the in-line form is used, the content may be a literal (a number, text, etc.) or may be structured using XML elements, but no assumptions can be made about the structure of the value of a property. In some cases, the value of a property of a feature may be another feature.

Properties of a feature may be simple or geometric. Properties with simple types (e.g., integer, string, float, boolean) are collectively known as simple properties, and properties that are geometry-valued are known as geometric properties. A feature can have multiple simple properties as well as multiple geometric properties. A feature can also be composed of other features; such a feature is termed a feature collection. A feature collection has a feature type and thus may have its own distinct properties, in addition to the features it contains.
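The element structure described above can be sketched with Python's standard `xml.etree.ElementTree`. The application namespace `http://example.org/app`, the feature type `Place`, its property names and the short `srsName` value are assumptions made for the example, and the coordinates of Naples are approximate; only the GML namespace URI and the `gml:Point` / `gml:coordinates` element names come from GML itself (GML 2 style).

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"
EX = "http://example.org/app"  # hypothetical application schema namespace

ET.register_namespace("gml", GML)
ET.register_namespace("ex", EX)

# The element name carries the feature type.
feature = ET.Element(f"{{{EX}}}Place")

# Simple property: the child element name is the property, its text the in-line value.
ET.SubElement(feature, f"{{{EX}}}name").text = "Naples"

# Geometric property: the value is a GML geometry element.
location = ET.SubElement(feature, f"{{{EX}}}location")
point = ET.SubElement(location, f"{{{GML}}}Point", srsName="EPSG:4326")
ET.SubElement(point, f"{{{GML}}}coordinates").text = "14.27,40.85"  # approximate lon,lat

xml_text = ET.tostring(feature, encoding="unicode")
```

A feature collection would follow the same pattern, with child elements whose values are themselves feature elements.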

1.1.1.2 GML Geometry Elements

In accordance with the OGC simple feature model (OGC, 2006), GML provides the encoding of geometric elements that represent a feature in the spatial domain, so that features with a spatial property can be expressed. The GML encoding of a generic geometry is shown in Fig. 1.

Fig. 1: GML Geometry Encoding

A geometry represents the way a feature is linked to the spatial domain. This link can be represented by one of the classes shown in Fig. 2 and described as follows:

• Point: defined by a single coordinate tuple.

• LineString: a special curve that consists of a single segment with linear interpolation. It is defined by two or more coordinate tuples, with linear interpolation between them.

• LinearRing: a closed, piece-wise linear path. It is defined by four or more coordinate tuples, with linear interpolation between them; the first and last coordinates shall be coincident so that they form a ring.

• Polygon: a connected surface whose boundary is a set of LinearRings. The boundaries are characterized as interior and exterior boundaries. A Polygon has at most one exterior boundary and zero or more interior boundaries.

• MultiPoint: defined by one or more Points, referenced through pointMember elements.

• MultiLineString: defined by one or more LineStrings, referenced through lineStringMember elements.

• MultiPolygon: defined by one or more Polygons, referenced through polygonMember elements.

• MultiGeometry: a geometry collection that includes one or more geometries, referenced through geometryMember elements.


Fig. 2: GML Geometry class hierarchy
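The structural constraints listed for these geometry classes can be checked with a few lines of Python. This is a sketch under simplifying assumptions: coordinates are plain tuples, the function names are invented for the example, and a polygon is required to have exactly one exterior ring.

```python
def is_line_string(coords):
    """A LineString needs two or more coordinate tuples."""
    return len(coords) >= 2


def is_linear_ring(coords):
    """A LinearRing needs four or more tuples, with first and last coincident."""
    return len(coords) >= 4 and coords[0] == coords[-1]


def is_polygon(exterior, interiors=()):
    """One exterior ring plus zero or more interior rings, all of them valid rings."""
    return is_linear_ring(exterior) and all(is_linear_ring(r) for r in interiors)


square = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
hole = [(1, 1), (2, 1), (2, 2), (1, 1)]
open_path = [(0, 0), (4, 0), (4, 4), (0, 4)]
```

With these definitions `is_polygon(square, [hole])` holds, while `open_path` is a valid LineString but not a LinearRing because it does not close on itself.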

1.1.2 KML

Keyhole Markup Language (KML) is an XML-based language for expressing geographic annotation and visualization in map applications and 3D “geobrowsers” (Wilson, 2008). It was first developed by Keyhole Inc., which was acquired by Google in 2004. In 2007, Google submitted KML to the Open Geospatial Consortium (OGC). KML was adopted as an OpenGIS standard in 2008 (Wilson, 2008), and the OGC is now responsible for maintaining and extending the standard.

KML is focused on the visualization of geographic features on a map or a globe. The language also includes controls for the user’s navigation, in the sense of where to go and where to look (Wilson, 2008).

The relationship between GML and KML mirrors that between XML and HTML: GML is used to model and exchange geographic data, while KML is used to visualize them in a geobrowser.

KML specifies features (e.g. images, geometries, text), their location in three dimensions, and optionally a preferred location from which to look at them. KML shares common geometry representations and features with GML. The KML schema elements are shown in Fig. 3.


Fig. 3: KML Components

A KML file does not specify a coordinate reference system (CRS). Longitude and latitude coordinates are assumed to be defined in WGS84, and altitude in meters above sea level, measured from the WGS84 EGM96 geoid vertical datum.
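A small sketch of what this means in practice: a KML placemark carries only a `longitude,latitude,altitude` tuple and no CRS declaration. The function name `placemark_kml` and the sample values (approximate coordinates of Naples) are assumptions for the example; the element names and the KML 2.2 namespace are those of the OGC KML standard.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)  # serialize KML elements without a prefix


def placemark_kml(name, lon, lat, alt=0.0):
    """Build a one-placemark KML document; coordinates are lon,lat,alt (WGS84 assumed)."""
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},{alt}"
    return ET.tostring(kml, encoding="unicode")


doc = placemark_kml("Naples", 14.27, 40.85)
```

Note the order: longitude comes first, which is easy to get wrong when the source data is stored as latitude/longitude pairs.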

KML also defines style rules (the Style element) that allow customizing how the described elements are shown in the geobrowser.

KML documents and their related images and 3D objects (if any) may be compressed using ZIP encoding into KMZ files (Wilson, 2008). This greatly reduces the file size and makes data transfer more efficient, overcoming one of the major criticisms of XML-based structures.
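Since a KMZ file is an ordinary ZIP archive, it can be produced with Python's standard `zipfile` module. The sketch below assumes the common convention of naming the main document `doc.kml`; the helper names `make_kmz` and `read_kmz` are invented for the example.

```python
import io
import zipfile


def make_kmz(kml_text, name="doc.kml"):
    """Pack a KML string into an in-memory KMZ (ZIP) archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(name, kml_text)
    return buffer.getvalue()


def read_kmz(data, name="doc.kml"):
    """Extract the KML text back from KMZ bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.read(name).decode("utf-8")


# A deliberately repetitive document, to make the size reduction visible.
sample_kml = "<kml>" + "<Placemark/>" * 1000 + "</kml>"
kmz_bytes = make_kmz(sample_kml)
```

Related images or 3D models would be added to the same archive with further `writestr` or `write` calls.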


1.2 OGC Web Services

OGC Web Services (OWS) are defined using open, non-proprietary Internet standards. These services are responsible for handling the different kinds of operations on geospatial data.

1.2.1 Web Map Service (WMS)

The Web Map Service (WMS) protocol (OGC, 2006) is responsible for dynamically producing maps of georeferenced data from one or more distributed geospatial databases. A WMS request defines the geographic layers and the area of interest to be processed; the response is one or more digital images (JPEG, PNG, GIF) representing a map to be displayed on a web client.

The WMS standard defines three operations: one returns service-level metadata, another returns a map, and an optional third operation returns information about particular features shown on a map. These operations are implemented as three HTTP requests, respectively getCapabilities, getMap and getFeatureInfo. Together, these requests can be used to create a basic map on which the user can identify a feature, get some basic information about it and perform some basic queries. They are described below.

1.2.1.1 Get Capabilities

getCapabilities is a mandatory WMS operation that returns the service metadata: a machine-readable (and human-readable) XML description of the server’s information content and acceptable request parameter values. Table 1 shows how a getCapabilities request is formed.

Table 1: getCapabilities Parameters

| Request parameter | Mandatory/optional | Description |
|---|---|---|
| VERSION=version | O | Request version |
| SERVICE=WMS | M | Service type |
| REQUEST=GetCapabilities | M | Request name |
| FORMAT=MIME_type | O | Output format of service metadata |
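As a minimal sketch, the parameters of Table 1 can be assembled into a key-value request URL with the standard library. The function name and the endpoint `http://example.org/wms` are placeholders, not a real service.

```python
from urllib.parse import urlencode


def get_capabilities_url(base_url, version=None, fmt=None):
    """Build a WMS GetCapabilities URL; only SERVICE and REQUEST are mandatory."""
    params = {"SERVICE": "WMS", "REQUEST": "GetCapabilities"}
    if version:
        params["VERSION"] = version   # optional
    if fmt:
        params["FORMAT"] = fmt        # optional MIME type of the metadata
    return f"{base_url}?{urlencode(params)}"


url = get_capabilities_url("http://example.org/wms")
url_versioned = get_capabilities_url("http://example.org/wms", version="1.3.0")
```

The XML document returned by such a request is what a client would inspect to learn which layers, styles and formats the server offers.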


1.2.1.2 Get Map

getMap is a mandatory WMS operation that returns a map as a georeferenced image of the specified area. It is also sent as an HTTP request; Table 2 shows how a getMap request is formed.

Table 2: GetMap Request Parameters

| Request parameter | Mandatory/optional | Description |
|---|---|---|
| VERSION=1.3.0 | M | Request version |
| SERVICE=WMS | M | Service type |
| REQUEST=GetMap | M | Request name |
| LAYERS=layer_list | M | Comma-separated list of one or more map layers |
| STYLES=style_list | M | Comma-separated list of one rendering style per requested layer |
| CRS=namespace:identifier | M | Coordinate reference system |
| BBOX=minx,miny,maxx,maxy | M | Bounding box corners (lower left, upper right) in CRS units |
| WIDTH=output_width | M | Width in pixels of map picture |
| HEIGHT=output_height | M | Height in pixels of map picture |
| FORMAT=output_format | M | Output format of map |
| TRANSPARENT=TRUE\|FALSE | O | Background transparency of map (default=FALSE) |
| BGCOLOR=color_value | O | Hexadecimal red-green-blue colour value for the background color (default=0xFFFFFF) |
| EXCEPTIONS=exception_format | O | The format in which exceptions are to be reported by the WMS (default=XML) |
| TIME=time | O | Time value of layer desired |
| ELEVATION=elevation | O | Elevation of layer desired |
| Other sample dimension(s) | O | Value of other dimensions as appropriate |

The response to a valid GetMap request shall be a map of the spatially referenced information layer requested, in the desired style, and having the specified coordinate reference system, bounding box, size, format and transparency.
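The mandatory GetMap parameters can be put together as in the following sketch. The endpoint, the layer name `places` and the bounding box are placeholders; `CRS:84` (longitude/latitude order) and `image/png` are defaults chosen for the example, and an empty style name is used to request each layer's default style.

```python
from urllib.parse import urlencode


def get_map_url(base_url, layers, bbox, width, height,
                crs="CRS:84", fmt="image/png", styles=None):
    """Build a WMS 1.3.0 GetMap URL from the mandatory parameters."""
    styles = styles if styles is not None else [""] * len(layers)
    if len(styles) != len(layers):
        raise ValueError("STYLES needs one entry per requested layer")
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": ",".join(styles),
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy in CRS units
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": fmt,
    }
    return f"{base_url}?{urlencode(params)}"


map_url = get_map_url("http://example.org/wms", ["places"],
                      (14.0, 40.7, 14.4, 40.9), width=800, height=400)
```

Optional parameters such as TRANSPARENT or TIME could be appended to the same dictionary when a layer supports them.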


1.2.1.3 Get Feature Info

getFeatureInfo is an optional request that allows the user to retrieve information about an object. It returns the attribute values at a given location in the image retrieved by the getMap request. Table 3 shows how the getFeatureInfo request is formed.

Table 3: GetFeatureInfo Request Parameters

| Request parameter | Mandatory/optional | Description |
|---|---|---|
| VERSION=1.3.0 | M | Request version |
| REQUEST=GetFeatureInfo | M | Request name |
| map request part | M | Partial copy of the Map request parameters that generated the map for which information is desired |
| QUERY_LAYERS=layer_list | M | Comma-separated list of one or more layers to be queried |
| INFO_FORMAT=output_format | M | Return format of feature information (MIME type) |
| FEATURE_COUNT=number | O | Number of features about which to return information (default=1) |
| I=pixel_column | M | i coordinate in pixels of feature in Map CS |
| J=pixel_row | M | j coordinate in pixels of feature in Map CS |
| EXCEPTIONS=exception_format | O | The format in which exceptions are to be reported by the WMS (default=XML) |

The nature of the getFeatureInfo response is at the discretion of the service provider, but it shall refer to the feature(s) nearest to the location selected.
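A GetFeatureInfo request reuses the parameters of the map it refers to and adds the queried layers and the clicked pixel. The sketch below shows this derivation; the function name, the sample map parameters and the `text/plain` default are assumptions for the example.

```python
def get_feature_info_params(map_params, query_layers, i, j,
                            info_format="text/plain", feature_count=1):
    """Derive GetFeatureInfo parameters from the GetMap parameters of the displayed map."""
    width, height = int(map_params["WIDTH"]), int(map_params["HEIGHT"])
    if not (0 <= i < width and 0 <= j < height):
        raise ValueError("pixel lies outside the requested map image")
    params = dict(map_params)              # the "map request part"
    params.update({
        "REQUEST": "GetFeatureInfo",
        "QUERY_LAYERS": ",".join(query_layers),
        "INFO_FORMAT": info_format,
        "FEATURE_COUNT": str(feature_count),
        "I": str(i),                       # pixel column
        "J": str(j),                       # pixel row
    })
    return params


# Placeholder GetMap parameters for a hypothetical 800x400 map.
displayed_map = {
    "SERVICE": "WMS", "VERSION": "1.3.0", "REQUEST": "GetMap",
    "LAYERS": "places", "STYLES": "", "CRS": "CRS:84",
    "BBOX": "14.0,40.7,14.4,40.9", "WIDTH": "800", "HEIGHT": "400",
    "FORMAT": "image/png",
}
info_params = get_feature_info_params(displayed_map, ["places"], i=120, j=85)
```

Copying the map parameters unchanged matters because the server needs the same bounding box and image size to translate the pixel back into a geographic location.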

Web Feature Service (WFS) 1.2.2

When the user needs access to the actual data represented on a map instead of an image of it, e.g. in order to modify, create or delete feature, a WFS request has to be performed.

The WFS protocol (OGC, 2010) allows a web client to request and update spatial data. An XML-based grammar named GML (Geography Markup Language) is used to encode data, but other common GIS formats (e.g. SVG, Shapefile) are also supported.

A WFS request returns the data over a specified geographic area. This differs from a WMS request (see section 1.2.1), which only returns an image of a geographic area. The data retrieved by a WFS request is a file in the eXtensible Markup Language (XML) format. A filter can be applied to the WFS request to reduce the amount of data returned for the requested area (see section 1.2.2.7). This filter is transmitted in XML format and it supports the CQL standard, which is also a standard of the Open Geospatial Consortium.

In order to retrieve and handle data over a network, WFS supports six operations: GetCapabilities, DescribeFeatureType, GetFeature, GetGmlObject, Transaction and LockFeature. Depending on the supported operations, three classes of Web Feature Service can be defined:

1. Basic WFS: it would implement the GetCapabilities, DescribeFeatureType and GetFeature operations.

2. XLink WFS: it would support all the operations of a Basic WFS and, in addition, it would implement the GetGmlObject operation.

3. Transactional WFS: it would support all the operations of a Basic WFS and, in addition, it would implement the Transaction operation. Optionally, a Transactional WFS could implement the GetGmlObject and/or LockFeature operations.
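The classification above can be sketched as a small Python function that maps a set of supported operations to a WFS class. The function name and the return values are hypothetical, chosen only to mirror the three classes in the list.

```python
# Operations every WFS class must support.
BASIC_OPERATIONS = {"GetCapabilities", "DescribeFeatureType", "GetFeature"}


def wfs_class(supported_operations):
    """Return the WFS class implied by a set of supported operations."""
    ops = set(supported_operations)
    if not BASIC_OPERATIONS <= ops:
        return None  # not even a Basic WFS
    if "Transaction" in ops:
        # GetGmlObject and LockFeature are optional for this class.
        return "Transactional WFS"
    if "GetGmlObject" in ops:
        return "XLink WFS"
    return "Basic WFS"


print(wfs_class(BASIC_OPERATIONS))
print(wfs_class(BASIC_OPERATIONS | {"GetGmlObject"}))
print(wfs_class(BASIC_OPERATIONS | {"Transaction", "LockFeature"}))
```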

In the following subsections, the six WFS operations and the applicable filters are described.

1.2.2.1 Get Capabilities

The GetCapabilities operation returns an XML document that describes the service and its data set. It lists, for example, the feature types, the coordinate systems and the names of the layers that can be accessed through WFS requests, as well as the operations that the service supports.


1.2.2.2 Describe Feature Type

The DescribeFeatureType operation generates a schema description of the feature types available in a WFS implementation. It differs from the GetCapabilities operation, whose response contains much more information, e.g. the operations supported on the data set. The schema descriptions define how a WFS implementation expects feature instances to be encoded on input (via Insert and Update requests) and how feature instances will be generated on output (in response to GetFeature and GetGmlObject requests). The schema of the DescribeFeatureType request is shown in Fig. 4.

The DescribeFeatureType operation returns an XML Schema document that is a valid GML application schema describing the requested feature types.

Fig. 4: DescribeFeature request XSD Schema

1.2.2.3 Get Feature

The GetFeature operation sends a query to the original data set and returns all the data that fulfil the requirements set by the query. The returned data contain features and are delivered in GML format. The schema of the GetFeature request is shown in Fig. 5.

The <Query> element (Fig. 6) defines which feature type to query, what properties to retrieve and what constraints (spatial and non-spatial) to apply to the feature properties in order to select the valid feature set.

The mandatory typeName attribute is used to indicate the name of one or more feature type instances or class instances to be queried.


Fig. 5: GetFeature request XSD Schema

The <Filter> element can be used to define constraints on a query. It can express spatial and/or non-spatial constraints, as defined in (Vretanos, 2010) and described in section 1.2.2.7.

Fig. 6: WFS Query Element

The response to a GetFeature request must be valid according to the structure described by the XML Schema description of the feature type. Thus the WFS must report all the mandatory properties of each feature, as well as any properties requested through the <PropertyName> element. The schema of the GetFeature response uses GML and is shown in Fig. 7.
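To make the structure of a GetFeature request more concrete, the sketch below builds a request with one <Query> element and a simple <Filter> using Python's standard XML library. It assumes the WFS 1.1.0 and Filter Encoding 1.1 namespaces; the feature type and property names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed here: WFS 1.1.0 and Filter Encoding 1.1.
WFS = "http://www.opengis.net/wfs"
OGC = "http://www.opengis.net/ogc"
ET.register_namespace("wfs", WFS)
ET.register_namespace("ogc", OGC)


def get_feature_request(type_name, prop, value):
    """Build a GetFeature request selecting features where prop == value."""
    root = ET.Element(f"{{{WFS}}}GetFeature",
                      {"service": "WFS", "version": "1.1.0"})
    # The mandatory typeName attribute names the feature type to query.
    query = ET.SubElement(root, f"{{{WFS}}}Query", {"typeName": type_name})
    # A non-spatial constraint expressed as a comparison operator.
    flt = ET.SubElement(query, f"{{{OGC}}}Filter")
    eq = ET.SubElement(flt, f"{{{OGC}}}PropertyIsEqualTo")
    ET.SubElement(eq, f"{{{OGC}}}PropertyName").text = prop
    ET.SubElement(eq, f"{{{OGC}}}Literal").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical feature type and property.
request_xml = get_feature_request("tadaima:sites", "name", "Pozzuoli")
print(request_xml)
```

Such a document would typically be sent to the service with an HTTP POST; the transport step is left out of the sketch.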


Fig. 7: GetFeature Response xsd schema

1.2.2.4 Get GML Object

The GetGmlObject operation allows the user to retrieve element instances by their ID. The schema of a GetGmlObject request is shown in Fig. 8.

Fig. 8: GetGMLObject Request XSD Schema

The response to a GetGmlObject request is the referenced GML element returned as an XML document fragment. This differs from the response to a GetFeature request, which returns a complete document containing a wfs:FeatureCollection.


1.2.2.5 Lock Feature

LockFeature is an optional operation that locks one or more features in order to ensure consistency. The set of features to lock can be selected using a filter element. A feature locked with this operation can be modified by the operations allowed by the Transaction request (see next section). The LockFeature request schema is shown in Fig. 9.

Fig. 9: LockFeature request xsd Schema

The response to a LockFeature request is an XML document that will contain a lock identifier that a client application can use in subsequent WFS operations to operate upon the set of locked feature instances.

1.2.2.6 Transaction

The Transaction operation supports the creation, update and deletion of geographic features, allowing the user to modify a geographic data set remotely. The create (insert) action adds new feature instances to the data set. The delete action removes existing feature instances. The update action changes the property values of existing feature instances. All the changes are applied to the original data set. The Transaction request schema is shown in Fig. 10.


Fig. 10: Transaction Request XSD Schema

The response to a Transaction request is an XML document (see Fig. 11) indicating the termination status of the transaction.
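As a rough sketch of how a Transaction request can be assembled, the following Python code builds a request containing update and delete actions. It assumes the WFS 1.1.0 and Filter Encoding 1.1 element names and selects features by identifier; the feature type, property and identifiers are hypothetical, and the insert action is omitted because it requires a GML-encoded feature.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed here: WFS 1.1.0 and Filter Encoding 1.1.
WFS = "http://www.opengis.net/wfs"
OGC = "http://www.opengis.net/ogc"
ET.register_namespace("wfs", WFS)
ET.register_namespace("ogc", OGC)


def add_fid_filter(parent, fid):
    """Attach a filter selecting a single feature by its identifier."""
    flt = ET.SubElement(parent, f"{{{OGC}}}Filter")
    ET.SubElement(flt, f"{{{OGC}}}FeatureId", {"fid": fid})


def transaction(type_name, updates=(), deletes=()):
    """Build a Transaction with update and delete actions.

    updates: iterable of (fid, property name, new value)
    deletes: iterable of fids
    """
    root = ET.Element(f"{{{WFS}}}Transaction",
                      {"service": "WFS", "version": "1.1.0"})
    for fid, name, value in updates:
        upd = ET.SubElement(root, f"{{{WFS}}}Update", {"typeName": type_name})
        prop = ET.SubElement(upd, f"{{{WFS}}}Property")
        ET.SubElement(prop, f"{{{WFS}}}Name").text = name
        ET.SubElement(prop, f"{{{WFS}}}Value").text = value
        add_fid_filter(upd, fid)
    for fid in deletes:
        dele = ET.SubElement(root, f"{{{WFS}}}Delete", {"typeName": type_name})
        add_fid_filter(dele, fid)
    return ET.tostring(root, encoding="unicode")


# Hypothetical feature type, property and feature identifiers.
transaction_xml = transaction(
    "tadaima:sites",
    updates=[("sites.1", "name", "Puteoli")],
    deletes=["sites.2"],
)
print(transaction_xml)
```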

Fig. 11. Transaction Response XSD Schema

1.2.2.7 Filter

A filter is used to identify a subset of resources from a collection whose property values satisfy a set of logically connected predicates. If the property values of a resource satisfy all the predicates in a filter, then that resource is considered to be part of the resulting subset (Vretanos, 2010). Fig. 12 shows the Filter XML schema.


Fig. 12: Filter Operator XSD Schema

The following types of filters are defined:

1. Comparison operators: they are used to form expressions that evaluate the mathematical comparison between two arguments. If the arguments satisfy the comparison then the expression evaluates to true; otherwise it evaluates to false. As shown in the figure, the OGC defines the following comparison operators: PropertyIsLike, PropertyIsNull, PropertyIsNil, PropertyIsBetween.


2. Spatial Operators: A spatial operator (see Fig. 14) shall determine whether its geometric arguments satisfy the stated spatial relationship. The operator shall evaluate to true if the spatial relationship is satisfied. Otherwise, the operator shall evaluate to false. The meaning of the defined spatial relationships is explained in Fig. 15.

Fig. 14: Filter Spatial Operator (OGC, 2010)

Fig. 15: Filter Spatial Relationship

3. Temporal Operators: A temporal operator determines whether its time arguments satisfy the stated temporal relationship. The operator evaluates to true if the temporal relationship is satisfied. Otherwise, the operator evaluates to false.


4. Logical Operators: A logical operator (i.e. AND, OR, NOT) can be used to combine one or more conditional expressions. The AND operator evaluates to true if all the combined expressions evaluate to true. The OR operator evaluates to true if any of the combined expressions evaluates to true. The NOT operator reverses the logical value of an expression.
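The way comparison and logical operators combine can be illustrated with a small in-memory evaluator. This is only a sketch under simplifying assumptions: filters are modeled as nested tuples instead of XML, features as dictionaries, and PropertyIsLike is approximated with shell-style wildcards ("*" and "?"). The feature values are illustrative and are not data from the thesis.

```python
from fnmatch import fnmatchcase


def evaluate(flt, feature):
    """Evaluate a filter, given as nested tuples, against one feature."""
    op, *args = flt
    # Logical operators combine other filter expressions.
    if op == "And":
        return all(evaluate(f, feature) for f in args)
    if op == "Or":
        return any(evaluate(f, feature) for f in args)
    if op == "Not":
        return not evaluate(args[0], feature)
    # Comparison operators inspect one property of the feature.
    value = feature.get(args[0])
    if op == "PropertyIsNull":
        return value is None
    if value is None:
        return False
    if op == "PropertyIsLike":
        return fnmatchcase(str(value), args[1])
    if op == "PropertyIsBetween":
        return args[1] <= value <= args[2]
    raise ValueError(f"unsupported operator: {op}")


features = [
    {"name": "Pozzuoli", "year": -194},
    {"name": "Cuma", "year": -730},
    {"name": "Pompei", "year": None},
]
flt = ("And",
       ("PropertyIsLike", "name", "P*"),
       ("PropertyIsBetween", "year", -300, 0))
selected = [f["name"] for f in features if evaluate(flt, f)]
print(selected)
```

Spatial and temporal operators would fit the same scheme as further branches, each comparing geometries or time values instead of plain numbers.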

1.2.3 Web Coverage Service (WCS)

The Web Coverage Service (WCS) (OGC, 2010) protocol provides geospatial data as coverages in digital form. The data served by a WCS are grid data, e.g. satellite images, usually encoded in a binary image format, so they cannot be easily displayed by a web client.

1.3 OGC OWS Architecture

OGC suggests an architecture with four loosely coupled tiers (OGC, 2005). This OWS architecture is designed for applications in which data are voluminous, but it can be adapted to any kind of application by bypassing unneeded tiers, as indicated by some arrows in Fig. 16. Communication within and between tiers takes place only through open, non-proprietary internet standards such as HTTP POST, HTTP GET and SOAP. A brief description of each tier follows:

• Clients: this tier is responsible for handling the interaction with users and for displaying the requested information, possibly on a map (i.e. on a Geobrowser).

• Application Services Tier: this tier contains services designed to support thin clients such as web browsers. Its design has the goal of relieving each client from directly performing often-needed support functions.

• Processing Services Tier: this tier contains services designed to process both feature and image (coverage) data to be rendered on the Geobrowser in the client.

• Information Management Services Tier: this tier contains services designed to store and provide access to data and metadata. The other tiers use it by invoking web services such as WFS, WMS, WCS, etc.


Fig. 16: OWS Web Architecture (OGC, 2005)

1.4 Web GIS Tools

Many tools have been designed with the purpose of offering web GIS services. They can be classified into four classes, although an implemented tool often belongs to two classes. From the client to the data storage we can distinguish:

• GIS Client: these tools are responsible for rendering information on a navigable 2D or 3D map and for allowing users to interact with the displayed data. Tools that fall in this category are Google Maps (Google, n.d.), OpenLayers (OpenLayers, 2011), etc.

• Map Services: these tools are able to render raster and vector map images in a GeoBrowser, implementing the WMS protocol (e.g. GeoServer (OSGeo, 2011), MapServer (OSGeo, 2008)).

• GIS Web Server: these tools offer services to retrieve spatial data, implementing the WFS and WCS protocols (again GeoServer, MapServer).

• Spatial Data Storage: these tools are responsible for storing spatial data and making them persistent, offering spatial types inspired by the OGC Geometry (e.g. Oracle Spatial (Oracle, 2011), PostGIS (PostGis, 2011)).


Chapter 2

RDF and the Semantic Web

Nowadays the diffusion of information through the World Wide Web (WWW) keeps increasing. The way data are exchanged between applications and delivered to users has changed with the spread and availability of internet connections. We moved from the static, hand-written web pages of the early WWW to data stored in relational or semi-structured (XML) databases and dynamically presented to users on request. This step, which separates the data, stored in an abstract data structure, from their visualization, obtained by dynamically generating web pages on request, significantly increased the amount of information available on the WWW. However, the vast majority of the information displayed on the web is built only for visual consumption by human users. The web will continue to grow; more people will participate in it, and more everyday procedures will be performed as web applications. This growing amount of available data needs to be processed by machines in order to retrieve relevant information. This is the goal of the new evolution of the WWW: the Semantic Web. The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and humans to work in cooperation (Berners-Lee et al., 2001).

This means that resources as well as relations between resources are characterized in a formal way.

From the Semantic Web point of view, those who provide data on the web have to migrate from a human-only presentation of content to forms that are also accessible by machine agents, by providing metadata, thus enabling semantic-aware applications to be built on top. An accepted standard is needed to express those metadata, in the same way that HTML and XML are accepted standards for document visualization and structure.

The Resource Description Framework (RDF) provides the infrastructure for the expression, the exchange and the extension of metadata. The RDF model specifies the expression of assertions as an oriented and labeled graph.


understanding of the concepts the RDF statements are made up of. A machine-understandable mechanism is needed for the definition of vocabularies, which helps to associate a meaning with an RDF statement. RDF Schema (RDFS) allows structure and meaning to be added to RDF statements, providing an elementary structuring of a vocabulary that is also understandable by machine agents. RDFS has limited expressivity, but support for more complex vocabulary definitions, such as ontologies, is provided by the Web Ontology Language (OWL).

In the rest of the chapter, details on the structure of RDF, RDFS and OWL, and on their query language SPARQL, are provided.

2.1 Resource Description Framework (RDF)

The Resource Description Framework (RDF) is a W3C recommendation for representing information on the Web (W3C, 2004).

Uniform Resource Identifiers (URIs) identify resources in RDF. URIs provide globally unique and resolvable identifiers for entities on the Web. Everything that is identifiable by a URI can be described in RDF (e.g. persons, animals, things etc.).

The basic elements of RDF are statements, which are triples <subject, predicate, object> consisting of the resource being described (the subject), a property (the predicate), and a property value (the object). In a statement, the subject and the predicate must be resources, while the object can be a resource or a literal.

A literal is a string of a certain datatype and may only occur as the object of a statement. In some cases, resources need to be described using data structures more complex than a literal string or a URI pointer. Anonymous resources, also called blank nodes, are used for this purpose; a blank node identifier represents such a resource. The formal definition of an RDF statement follows (Definition 2.1.1).

Definition 2.1.1 RDF Statement (Triple)

Let U be the set of resources identified by a URI, B the set of blank node identifiers and L the set of possible literal values of whatever datatype. Then T = (s, p, o) ∈ (U ∪ B) × U × (U ∪ B ∪ L) is an RDF statement (triple).


RDF triples can be visualized as a directed labeled graph, in which subjects and objects are represented as nodes, and predicates as arcs (see Fig. 17).

Fig. 17: RDF statement graph representation

Definition 2.1.2 RDF Graph

A set of RDF statements is an RDF Graph:

G= {T | T is an RDF Triple}
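Definitions 2.1.1 and 2.1.2 can be mirrored directly in code. The minimal Python sketch below models the three kinds of terms and checks the membership condition of an RDF triple; the class names and the example URIs are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    """A resource identified by a URI (an element of U)."""
    value: str


@dataclass(frozen=True)
class BlankNode:
    """An anonymous resource (an element of B)."""
    label: str


@dataclass(frozen=True)
class Literal:
    """A literal value with a datatype (an element of L)."""
    value: str
    datatype: str = "http://www.w3.org/2001/XMLSchema#string"


def is_rdf_triple(s, p, o):
    """Check that (s, p, o) belongs to (U ∪ B) × U × (U ∪ B ∪ L)."""
    return (isinstance(s, (URI, BlankNode))
            and isinstance(p, URI)
            and isinstance(o, (URI, BlankNode, Literal)))


# An RDF graph is simply a set of triples (hypothetical example URIs).
tolkien = URI("http://example.org/authors/Tolkien")
has_written = URI("http://example.org/hasWritten")
graph = {(tolkien, has_written, Literal("The Lord of the Rings"))}

print(all(is_rdf_triple(*t) for t in graph))
```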

RDF provides a formalism, called reification, for making statements about other statements. It is useful for recording information about when statements were made, who made them, or other similar information (sometimes referred to as "provenance" information). For example, for trust and authoring issues, knowing who stated a concept, represented by a reified triple, helps to decide whether or not to trust the information contained in the triple.

In RDF reification a blank node stands for the statement to be described, while four other statements link the blank node to that statement: one assigns it the type rdf:Statement, and the other three use the properties rdf:subject, rdf:predicate and rdf:object.
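A minimal sketch of reification follows, with triples modeled as plain Python tuples of strings. The example URIs, the blank node label and the use of the Dublin Core creator property for provenance are assumptions made for illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(triple, node):
    """Return the four statements that describe `triple` through `node`."""
    s, p, o = triple
    return {
        (node, RDF + "type", RDF + "Statement"),
        (node, RDF + "subject", s),
        (node, RDF + "predicate", p),
        (node, RDF + "object", o),
    }


# Hypothetical statement and blank node identifier.
statement = ("http://example.org/authors/Tolkien",
             "http://example.org/hasWritten",
             "The Lord of the Rings")
node = "_:stmt1"

graph = reify(statement, node)
# Provenance: a statement about the reified statement itself.
graph.add((node, "http://purl.org/dc/elements/1.1/creator",
           "http://example.org/people/Editor"))

for t in sorted(graph):
    print(t)
```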

The set of all URI resources and literals in an RDF graph is called its vocabulary. In a broader sense, a vocabulary is a set of concepts with a well-understood meaning used to make assertions in a certain domain, for example an ontology.

There are many ways to represent an RDF graph. The graphical representation seen above is very readable, but it is difficult for a machine to serialize and parse. Other formats used to represent RDF graphs are N3 and its derivatives Turtle and N-Triples; they are textual, line-based representations where each line represents a triple and resources are enclosed in angle brackets.
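The line-based nature of N-Triples can be shown with a short serializer. In this sketch each term is a (kind, value) pair, a modeling choice made only for the example; literal escaping is limited to backslashes, double quotes and newlines, and the URIs are hypothetical.

```python
def format_term(term):
    """Serialize one term given as a (kind, value) pair."""
    kind, value = term
    if kind == "uri":
        return f"<{value}>"
    if kind == "bnode":
        return f"_:{value}"
    # Plain literal: escape the characters that would break the line.
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(
        f"{format_term(s)} {format_term(p)} {format_term(o)} ."
        for s, p, o in triples
    )


triples = [
    (("uri", "http://example.org/authors/Tolkien"),
     ("uri", "http://example.org/hasWritten"),
     ("literal", "The Lord of the Rings")),
]
ntriples_text = to_ntriples(triples)
print(ntriples_text)
```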


an XML representation are manifold: XML is the standard for exchanging data and provides languages to query (XQuery) and transform (XSL) data in order to present them.

2.2 RDF Schema (RDFS)

RDF Schema (RDFS) (Brickley & Guha, 2004) provides a standard vocabulary for describing the classes and relationships used in RDF graphs.

Classes represent logical groups of resources, and a member of a class is said to be an instance of the class. The rdf:type property is used to define class and property types (e.g., the triple <C, rdf:type, rdfs:Class> asserts that C is a class). rdf:type is also used to denote instances of classes (e.g., <i, rdf:type, C> asserts that i is an instance of C). It is possible to define a class hierarchy using the predicate rdfs:subClassOf.

The RDFS vocabulary also offers a way to model properties by defining their domain and range, using the elements rdf:Property, rdfs:domain and rdfs:range. For example, the set of triples in Table 4 asserts that p is a property of the class C (domain) and has strings as range.

Table 4: RDFS property declaration example

p rdf:type rdf:Property

p rdfs:domain C

p rdfs:range xsd:string

It is also possible to define a hierarchy of properties using the predicate rdfs:subPropertyOf.

Every RDFS statement can be seen as a First-Order Logic formula, as depicted in Table 5.

Table 5: First-Order Logic formulas for RDFS triples

RDF Triple    FOL formula

<C rdfs:subClassOf D>    ∀X (C(X) ⇒ D(X))

<P rdfs:subPropertyOf R>    ∀X ∀Y (P(X,Y) ⇒ R(X,Y))

<P rdfs:domain C>    ∀X ∀Y (P(X,Y) ⇒ C(X))

<P rdfs:range D>    ∀X ∀Y (P(X,Y) ⇒ D(Y))
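Read as inference rules, the formulas of Table 5 can be applied repeatedly until no new triple is derived. The sketch below does this over triples written as tuples of prefixed names; it covers only the four rules of the table, not the full RDFS entailment regime, and all ex: names are hypothetical.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def rdfs_closure(graph):
    """Apply the four rules of Table 5 until a fixed point is reached."""
    g = set(graph)
    while True:
        new = set()
        for s, p, o in g:
            for s2, p2, o2 in g:
                if p2 == SUBCLASS and p == TYPE and o == s2:
                    new.add((s, TYPE, o2))   # C(X) => D(X)
                elif p2 == SUBPROP and p == s2:
                    new.add((s, o2, o))      # P(X,Y) => R(X,Y)
                elif p2 == DOMAIN and p == s2:
                    new.add((s, TYPE, o2))   # P(X,Y) => C(X)
                elif p2 == RANGE and p == s2:
                    new.add((o, TYPE, o2))   # P(X,Y) => D(Y)
        if new <= g:
            return g
        g |= new


graph = {
    ("ex:Tolkien", "ex:hasWritten", "ex:LordOfTheRings"),
    ("ex:hasWritten", SUBPROP, "ex:hasCreated"),
    ("ex:hasCreated", DOMAIN, "ex:Author"),
    ("ex:hasCreated", RANGE, "ex:Work"),
    ("ex:Author", SUBCLASS, "ex:Person"),
}
closure = rdfs_closure(graph)
for t in sorted(closure - graph):
    print(t)
```

The nested loop is quadratic in the size of the graph, which is acceptable for an illustration but not for large data sets.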

RDFS provides the capability to define ontologies. Ontologies serve to formally specify the semantics of RDF data so that a common interpretation of the data can be shared across multiple applications.

2.3 SPARQL

SPARQL Protocol And RDF Query Language (SPARQL) was defined by the W3C Data Access Working Group in 2004. It defines a query language for RDF Graphs.

The most prominent concept in the SPARQL query language is the triple pattern. A triple pattern is a triple <subject, predicate, object> where each of the three elements can be a variable and the object can also be a literal. A collection of triple patterns is called a graph pattern.

The result of a SPARQL query on the RDF graph G is the subgraph of G matching the graph pattern(s) given as input. There are several types of graph patterns:

• Group graph pattern: the triple patterns are combined in logical AND; they are enclosed in braces (“{}”) and separated by a period (“.”). The result satisfies all the triple patterns in the group.

• Union graph pattern: the graph patterns are considered as alternatives and are joined by the reserved word UNION. The result satisfies at least one of the alternatives.

• Optional graph pattern: the triple patterns are evaluated optionally and are introduced by the reserved word OPTIONAL. If the optional pattern does not match, the solution is not discarded: the bindings of the rest of the pattern are still returned.

• Filter graph pattern: this pattern is used to impose logical, mathematical and other constraints on the retrieved triples. The result satisfies the filter constraints.
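The behaviour of group, union and optional patterns can be sketched with a small matcher over an in-memory list of triples. This is a didactic approximation, not a SPARQL engine: variables are strings starting with "?", solutions are dictionaries of bindings, and the ex: data are hypothetical.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match_triple(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if extended.setdefault(p, t) != t:
                return None  # variable already bound to another value
        elif p != t:
            return None
    return extended


def match_group(patterns, graph, binding=None):
    """Group pattern: every triple pattern must match (logical AND)."""
    solutions = [binding or {}]
    for pattern in patterns:
        solutions = [b2 for b in solutions for t in graph
                     if (b2 := match_triple(pattern, t, b)) is not None]
    return solutions


def match_union(left, right, graph):
    """Union pattern: solutions of either alternative."""
    return match_group(left, graph) + match_group(right, graph)


def match_optional(required, optional, graph):
    """Optional pattern: keep a solution even if the optional part fails."""
    result = []
    for b in match_group(required, graph):
        result.extend(match_group(optional, graph, b) or [b])
    return result


graph = [
    ("ex:Tolkien", "ex:hasWritten", "The Lord of the Rings"),
    ("ex:Tolkien", "ex:hasWritten", "The Hobbit"),
    ("ex:Tolkien", "ex:bornIn", "1892"),
    ("ex:Eco", "ex:hasWritten", "The Name of the Rose"),
]
written = [("?a", "ex:hasWritten", "?t")]
born = [("?a", "ex:bornIn", "?y")]

both = match_group(written + born, graph)
with_optional = match_optional(written, born, graph)
# A filter is a predicate applied to each solution.
filtered = [b for b in with_optional if b["?t"].startswith("The H")]
print(len(both), len(with_optional), filtered)
```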


SPARQL provides several query types. The most common is the SELECT query that returns all, or a subset of, the variables bound in a query pattern match. Its syntax is similar to the SQL select and is described in Definition 2.3.3.

Definition 2.3.3 SPARQL SELECT syntax

SELECT V FROM U WHERE P

where U is the URL of an RDF graph G, P is a SPARQL graph pattern and V is a tuple of variables appearing in P.

The CONSTRUCT SPARQL query returns an RDF graph constructed by substituting variables in a set of triple templates. Its syntax is described in Definition 2.3.4.

Definition 2.3.4 SPARQL CONSTRUCT syntax

CONSTRUCT V WHERE P

where P is a SPARQL graph pattern and V is a set of triple templates whose variables appear in P.

The ASK query returns a boolean indicating whether a query pattern matches or not. Its syntax is described in Definition 2.3.5.

Definition 2.3.5 SPARQL ASK syntax

ASK P

where P is a SPARQL graph pattern.

The DESCRIBE query returns an RDF graph that describes the resources found. Its syntax is described in Definition 2.3.6.

Definition 2.3.6 SPARQL Describe syntax

DESCRIBE V WHERE P

where P is a SPARQL graph pattern and V is a tuple of variables appearing in P.

SPARQL provides post-filtering clauses which make it possible, for example, to order (ORDER BY clause) or to limit (LIMIT and/or OFFSET clauses) the answers of a query.
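The effect of the SELECT projection and of the ORDER BY, LIMIT and OFFSET clauses can be sketched on a list of solutions, each being a dictionary of variable bindings. The solutions below are written by hand for the example; in practice they would come from matching a graph pattern.

```python
def select(solutions, variables, order_by=None, offset=0, limit=None):
    """Project solutions on `variables`, then order and slice them."""
    rows = list(solutions)
    if order_by is not None:
        rows.sort(key=lambda b: b[order_by])        # ORDER BY
    end = None if limit is None else offset + limit
    rows = rows[offset:end]                         # OFFSET / LIMIT
    return [tuple(b.get(v) for v in variables) for b in rows]


def ask(solutions):
    """ASK: true if the pattern has at least one solution."""
    return len(solutions) > 0


# Hypothetical solutions of the pattern { ?a ex:hasWritten ?t }.
solutions = [
    {"?a": "ex:Tolkien", "?t": "The Lord of the Rings"},
    {"?a": "ex:Eco", "?t": "The Name of the Rose"},
    {"?a": "ex:Tolkien", "?t": "The Hobbit"},
]

first_two = select(solutions, ["?t"], order_by="?t", limit=2)
print(first_two)
print(ask(solutions))
```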


2.4 OWL

The OWL Web Ontology Language is designed for use by applications that need to process the content of information instead of just presenting information to humans. OWL facilitates greater machine interpretability of Web content by providing additional vocabulary along with a formal semantics (McGuinness & Van Harmelen, 2004).

OWL enriches the RDF(S) model by providing vocabulary terms to express further concepts and relationships. For example, OWL introduces two kinds of properties (both subclasses of rdf:Property):

• owl:DatatypeProperty, for properties whose range consists of RDF literals and XML Schema datatypes;

• owl:ObjectProperty, for properties whose range consists of instances of other classes.

OWL also allows restrictions on the cardinality of a property to be introduced, using the elements owl:minCardinality and owl:maxCardinality.

More complex OWL elements make it possible to declare a property transitive, symmetric or functional, or the inverse of another property (respectively owl:TransitiveProperty, owl:SymmetricProperty, owl:FunctionalProperty, owl:inverseOf).
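What a reasoner can derive from transitive, symmetric and inverse property declarations is sketched below as a fixed-point computation over tuples. Functional properties and cardinality restrictions are not covered, and the ex: property names and facts are hypothetical.

```python
def property_closure(graph, transitive=(), symmetric=(), inverse_of=None):
    """Derive the triples implied by simple OWL property characteristics."""
    # Inverse declarations hold in both directions.
    inverses = dict(inverse_of or {})
    inverses.update({v: k for k, v in list(inverses.items())})
    g = set(graph)
    while True:
        new = set()
        for s, p, o in g:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverses:
                new.add((o, inverses[p], s))
            if p in transitive:
                new.update((s, p, o2) for s2, p2, o2 in g
                           if p2 == p and s2 == o)
        if new <= g:
            return g
        g |= new


graph = {
    ("ex:Baia", "ex:partOf", "ex:CampiFlegrei"),
    ("ex:CampiFlegrei", "ex:partOf", "ex:Campania"),
    ("ex:Baia", "ex:near", "ex:Bacoli"),
    ("ex:Tolkien", "ex:hasWritten", "ex:LordOfTheRings"),
}
closure = property_closure(
    graph,
    transitive={"ex:partOf"},
    symmetric={"ex:near"},
    inverse_of={"ex:hasWritten": "ex:writtenBy"},
)
for t in sorted(closure - graph):
    print(t)
```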

Depending on which OWL elements are used and how they are instantiated, it is possible to classify OWL into three increasingly expressive sublanguages:

• OWL Lite: it supports property and class hierarchies and simple constraints. For example, while it supports cardinality constraints, it only permits cardinality values of 0 or 1.

• OWL DL: its name is due to its correspondence with Description Logics. OWL DL includes all OWL language constructs, but they can be used only under certain restrictions (for example, while a class may be a subclass of many classes, a class cannot be an instance of another class). It guarantees the maximum expressiveness while retaining computational completeness and decidability.

• OWL Full: it offers maximum expressiveness and the syntactic freedom of RDF with no computational guarantees. It supports all OWL vocabulary elements without restrictions.

OWL Full can be viewed as an extension of RDF, while OWL Lite and OWL DL can be viewed as extensions of a restricted view of RDF. Every OWL (Lite, DL, Full) document is an RDF document, and every RDF document is an OWL Full document, but only some RDF documents will be a legal OWL Lite or OWL DL document (McGuinness & Van Harmelen, 2004).

2.5 RDF(S)/OWL Modeling Vs. Standard Modeling

The main contribution of this thesis is the definition of a meta-model for representing data having Spatio-Temporal features. RDF, RDFS and OWL appear to be well-suited candidates for expressing and modeling metadata and their instances.

RDF modeling is easy to extend and to adapt dynamically to user needs. This would be difficult to achieve in the relational database model, where the data have a predefined structure with simple record types, and the schema is fixed and difficult to extend.

The graph structure of RDF statements allows both metadata and application data to be modeled with the same formalism, and information to be retrieved with the same query language (SPARQL). Moreover, the abstract graph structure of RDF allows concepts to be modeled and queried in a way that is unambiguous and independent of the representation, which is not true for the tree-based XML structure. As an example, suppose that a user wants to model the concept “Tolkien wrote The Lord of the Rings”. There are several XML representations of this concept, shown in Table 6 (a, b, c). There are also many formats to represent an RDF triple, as shown in Table 6d (N-Triples) and Table 6e (RDF/XML).

Table 6: XML and RDF Representations

XML Representation

a   <Book>
      <Author> J.R.R. Tolkien </Author>
      <Title> The Lord of the Rings </Title>
    </Book>

b   <Author>
      <Name> J.R.R. Tolkien </Name>
      <Books>
        <Book>
          <title> The Lord of the Rings </title>
        </Book>
      </Books>
    </Author>

c   <BOOK title="The Lord of the Rings">
      <AUTHOR> J.R.R. Tolkien </AUTHOR>
    </BOOK>

RDF Triple

d   <authors:Tolkien> <hasWritten> "The Lord of the Rings"

e   <rdf:Description rdf:about="#authors:Tolkien">
      <hasWritten> The Lord of the Rings </hasWritten>
    </rdf:Description>

Now, suppose that another user wants to know who wrote the book “The Lord of the Rings”. To retrieve this information from an XML data source, one has to know the representation schema and write the appropriate XQuery/XPath expression (see Table 7 (a, b, c)). By contrast, the SPARQL query (see Table 7d) remains independent of the RDF encoding, because both the model and the language are based on the abstract graph data structure.
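The dependence of XML queries on the chosen representation can be reproduced with a few lines of Python. In this sketch the three XML encodings of the running example each need their own extraction function, while the triple-based lookup is written once; the function names are hypothetical, and the triple lookup is a plain Python stand-in for a SPARQL query, not SPARQL itself.

```python
import xml.etree.ElementTree as ET

TITLE = "The Lord of the Rings"

doc_a = ("<Book><Author>J.R.R. Tolkien</Author>"
         "<Title>The Lord of the Rings</Title></Book>")
doc_b = ("<Author><Name>J.R.R. Tolkien</Name><Books><Book>"
         "<title>The Lord of the Rings</title></Book></Books></Author>")
doc_c = ('<BOOK title="The Lord of the Rings">'
         "<AUTHOR>J.R.R. Tolkien</AUTHOR></BOOK>")


def author_a(xml):
    """Encoding (a): author and title are sibling elements of Book."""
    book = ET.fromstring(xml)
    return book.findtext("Author") if book.findtext("Title") == TITLE else None


def author_b(xml):
    """Encoding (b): books are nested inside the Author element."""
    author = ET.fromstring(xml)
    titles = [t.text for t in author.findall("Books/Book/title")]
    return author.findtext("Name") if TITLE in titles else None


def author_c(xml):
    """Encoding (c): the title is an attribute of BOOK."""
    book = ET.fromstring(xml)
    return book.findtext("AUTHOR") if book.get("title") == TITLE else None


# The same fact as a triple: one lookup, whatever the serialization was.
triples = {("authors:Tolkien", "hasWritten", TITLE)}


def who_wrote(graph, title):
    return [s for s, p, o in graph if p == "hasWritten" and o == title]


print(author_a(doc_a), author_b(doc_b), author_c(doc_c))
print(who_wrote(triples, TITLE))
```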

Table 7: XQuery and SPARQL

XQUERY

a /Book[Title/text()="The Lord of the Rings"]/Author

b /Author[Books/Book/title/text()="The Lord of the Rings"]/Name

c /BOOK[@title="The Lord of the Rings"]/AUTHOR

SPARQL d Select ?x
