What is RDF?
Don Tonkin,
Semantic Software Asia Pacific
Session Code: 4013
Friday, Sep 18, 2015 (08:30 AM - 09:30 AM) Platform: Analytics
Agenda
RDF is an emerging technology for storing data, and SPARQL is the query language introduced alongside it. Both are supported by DB2 10.0+. This is a beginner’s introduction to RDF and its related technologies.
• Objectives
• History of Data Science
• Definition of RDF and an RDF Store
• The RDF Schema and Ontologies
• Query an RDF store using SPARQL
Objectives
By the end of this session you will
• Know that RDF, RDFS, SPARQL and Ontologies exist
• Have a vague idea of how they are structured
• Have some concepts of how they are used
• Have a burning desire to know more!
The History of Data Science
A Bowdlerised Version
The highlights in the evolution of Data Science are:
• 1822 Babbage Difference engine – Field manipulation
• 1890 Hollerith punched cards – Record manipulation
• 1957 Sputnik
• 1960 IBM IMS – Hierarchical Database
• 1960-1980 – various empirically designed databases
• 1970 Codd Relational Theory – Relational Database
• 1996 XML – Data with structure and metadata
• 2002 RDF – AAA: Anyone can say anything about anything
The History of Data Science
Web Evolution
The History of Data Science
Hype Cycle
Source: http://www.gartner.com/newsroom/id/2819918?_ga=1.51071721.1904172021.1401730474
Definition of RDF and an RDF Store
The Data Problem
How to manage:
• Structured Data
  • Can be normalised
• Unstructured Data
  • No data model etc.
• Semi-structured Data
  • Meta-data tags
Definition of RDF and an RDF Store
New Terminology
URI – uniform resource identifier
IRI – internationalized resource identifier
RDF – resource description framework
Linked data – shared connected data
SPARQL – RDF query language
Data Silo – high volume, non-interoperable
Triplestore – triples database
Definition of RDF and an RDF Store
Basic Ideas of RDF
• RDF is a universal language that lets users describe resources in their own vocabularies
• RDF neither assumes nor defines the semantics of any particular application domain
• The user can define those semantics in RDF Schema using:
  • Classes and Properties
  • Class Hierarchies and Inheritance
  • Property Hierarchies
• Each item is qualified by a URI
Definition of RDF and an RDF Store
RDF Statement Rules
              Subject   Predicate   Object
Statement 1   p1        type        Person
Statement 2   p1        name        John Smith
Statement 3   p1        email       [email protected]
• expressed as a triple
• subjects, predicates, and objects are named entities
• objects can be literals
An RDF graph is a set of RDF triples
• Subject/Predicate/Object
• Triples are statements (i.e. they are true or false)
• The smallest graph is a single triple
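The statement rules above can be sketched in a few lines of Python. This is only an illustration, not how a triplestore is implemented: each triple is a plain tuple, the graph is a set, and the prefixed names (`ex:p1`, `ex:name`, and so on) are shortened placeholders assumed for the example rather than full URIs.

```python
# A triple is (subject, predicate, object); a graph is a set of triples.
# The prefixed names are illustrative placeholders, not real vocabulary terms.
graph = {
    ("ex:p1", "rdf:type", "ex:Person"),
    ("ex:p1", "ex:name", "John Smith"),   # the object here is a literal
}


def objects(graph, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


# Adding the same statement twice changes nothing, because a graph is a set.
graph.add(("ex:p1", "rdf:type", "ex:Person"))

print(len(graph))                          # number of distinct statements
print(objects(graph, "ex:p1", "ex:name"))
```

Modelling the graph as a set mirrors the definition on the slide: a graph is nothing more than its statements, so a single tuple is already a valid graph.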
Definition of RDF and an RDF Store
RDF Graph
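Subject                 Predicate       Object
(Resource)              (Property)      (Literal Value or Resource)
The Lord of the Rings   Is Written By   J.R.R. Tolkien
The Lord of the Rings   Is a            Book
[Figure: RDF graph with the nodes "The Lord of the Rings", "J.R.R. Tolkien" and "Book"; the edge to "J.R.R. Tolkien" is labelled "Is Written By"]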
Definition of RDF and an RDF Store
RDF Example
• Subject: "Lord of the Rings"
  <http://dbpedia.org/page/The_Lord_of_the_Rings>
• Predicate: "Is written by"
  <http://dbpedia.org/ontology/author>
• Object: "J.R.R. Tolkien"
  <http://dbpedia.org/page/J._R._R._Tolkien>
Data structure but not the semantics
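Once subject, predicate and object are all URIs, the statement can be written out as one line of text. The sketch below formats it in N-Triples style, a W3C line-based RDF syntax that these slides do not cover. The helper name `to_ntriples` is made up for the example, and literal escaping is deliberately simplified.

```python
def to_ntriples(subject, predicate, obj, obj_is_uri=True):
    """Format one statement in N-Triples style: <s> <p> <o> ."""
    if obj_is_uri:
        obj_text = f"<{obj}>"
    else:
        # Simplified escaping: only backslash and double quote are handled.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_text = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {obj_text} ."


line = to_ntriples(
    "http://dbpedia.org/page/The_Lord_of_the_Rings",
    "http://dbpedia.org/ontology/author",
    "http://dbpedia.org/page/J._R._R._Tolkien",
)
print(line)
```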
Definition of RDF and an RDF Store
RDF/XML Language (One of Many Variants)
<?xml version="1.0"?>
<!-- The line above is the XML declaration. rdf:RDF is the RDF root element; it declares two namespaces. -->
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cd="http://www.recshop.fake/cd#">
  <!-- Description of the resource named by rdf:about -->
  <rdf:Description
      rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <!-- Properties of the resource -->
    <cd:artist>Bob Dylan</cd:artist>
    <cd:country>USA</cd:country>
    <cd:company>Columbia</cd:company>
    <cd:price>10.90</cd:price>
    <cd:year>1985</cd:year>
  <!-- End of the description -->
  </rdf:Description>
<!-- End of the RDF statement -->
</rdf:RDF>
The RDF Schema and Ontologies
RDF Schema
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xml:base="http://www.animals.fake/animals#">
  <rdfs:Class rdf:ID="animal" />
  <rdfs:Class rdf:ID="horse">
    <rdfs:subClassOf rdf:resource="#animal"/>
  </rdfs:Class>
</rdf:RDF>
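What a class hierarchy buys you is inheritance: anything typed as a horse is also an animal. The sketch below shows that inference over plain tuples. It is a simplified illustration rather than a full RDFS reasoner, and the instance `ex:dobbin` is a hypothetical addition that is not in the schema above.

```python
# Statements from the schema, plus one hypothetical instance ("ex:dobbin").
triples = {
    ("ex:horse", "rdfs:subClassOf", "ex:animal"),
    ("ex:dobbin", "rdf:type", "ex:horse"),
}


def superclasses(triples, cls):
    """Return all classes reachable from cls by following rdfs:subClassOf."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def inferred_types(triples, resource):
    """Return the stated types of resource plus every inherited superclass."""
    stated = {o for s, p, o in triples if s == resource and p == "rdf:type"}
    inherited = set()
    for cls in stated:
        inherited |= superclasses(triples, cls)
    return stated | inherited


print(inferred_types(triples, "ex:dobbin"))
```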
The RDF Schema and Ontologies
RDF Schema Components
RDF – Triple and URI; single fact (e.g. John hasA dog)
RDFS – Additional structure; defines classes and relationships (e.g. subclasses, types)
XSD – Describes property types (e.g. Boolean, integer)
OWL – Adds meaning, allows inferencing (e.g. reasoners, synonyms)
XML/N3 – File format (e.g. Turtle)
The RDF Schema and Ontologies
Semantic Web Layers
The RDF Schema and Ontologies
Provenance Data
[Figure labels: who, how, where, why, what, when, access]
Source: Provenance Information in the Web of Data, Olaf Hartig, 2009
The RDF Schema and Ontologies
Web Ontology Language - OWL
• Describes relationships
• Written in XML
• Built on RDF
• For processing information on the web
• Designed to be understood by computers, not read by people
• A W3C standard
The RDF Schema and Ontologies
OWL Components
Class and individual elements
• owl:Class
• owl:Thing
• owl:Nothing
• owl:NamedIndividual
RDFS elements used in OWL
• rdfs:subClassOf
• rdf:Property
• rdfs:subPropertyOf
• rdfs:domain
• rdfs:range
Datatype specification
• xsd:datatypes
Property characteristics
• owl:ObjectProperty
• owl:DatatypeProperty
• owl:inverseOf
• owl:TransitiveProperty
• owl:SymmetricProperty
• owl:FunctionalProperty
• owl:InverseFunctionalProperty
Cardinality restrictions
• owl:minCardinality
• owl:maxCardinality
• owl:cardinality
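Property characteristics are what allow a reasoner to derive statements that were never entered. The sketch below applies three of them (transitive, symmetric and inverse) to plain tuples until no new triples appear. It is a toy illustration under stated assumptions, not an OWL reasoner. The function name and all `ex:` terms are hypothetical.

```python
def apply_characteristics(triples, transitive=(), symmetric=(), inverse_of=None):
    """Add triples implied by property characteristics until nothing new appears."""
    inverse_of = inverse_of or {}
    # Treat inverseOf as holding in both directions.
    inverses = dict(inverse_of)
    inverses.update({q: p for p, q in inverse_of.items()})
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverses:
                new.add((o, inverses[p], s))
            if p in transitive:
                for s2, p2, o2 in closed:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= closed:          # fixed point reached
            return closed
        closed |= new


facts = {
    ("ex:Sydney", "ex:locatedIn", "ex:NSW"),
    ("ex:NSW", "ex:locatedIn", "ex:Australia"),
    ("ex:Alice", "ex:marriedTo", "ex:Bob"),
    ("ex:Alice", "ex:hasChild", "ex:Carol"),
}
result = apply_characteristics(
    facts,
    transitive={"ex:locatedIn"},                  # like owl:TransitiveProperty
    symmetric={"ex:marriedTo"},                   # like owl:SymmetricProperty
    inverse_of={"ex:hasChild": "ex:hasParent"},   # like owl:inverseOf
)
print(sorted(result - facts))
```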
The RDF Schema and Ontologies
OWL Components - 2
Equality/inequality
• owl:equivalentClass
• owl:equivalentProperty
• owl:sameAs
• owl:differentFrom
• owl:AllDifferent
• owl:distinctMembers
Property restrictions
• owl:Restriction
• owl:onProperty
• owl:allValuesFrom
• owl:someValuesFrom
Class intersection
• owl:intersectionOf
OWL DL & OWL Full:
Class axioms
• owl:oneOf, dataRange
• owl:disjointWith
• owl:equivalentClass
(applied to class expressions)
• rdfs:subClassOf
(applied to class expressions)
Boolean combinations of class expressions
• owl:unionOf
• owl:complementOf
• owl:intersectionOf
Property information
• owl:hasValue
The RDF Schema and Ontologies
Ontology v Database Schema
ONTOLOGY: Focus on Meaning
Data Instances
• optional
Axioms
• meaning and context
Comments
• contained within
Change
• flexible
Ambiguity
• frequent
DATABASE SCHEMA: Focus on Data
Data Instances
• fundamental
Database Constraints
• data integrity
Metadata Repository
• separate entity
Change
• rigid and slow
Ambiguity
• rare
The RDF Schema and Ontologies
Semantic Data v Relational Data
SEMANTIC DB
• Meaning of Information is Stored
• Relationship Between Classes
• No Restriction on Data Type/Size
• Any Query Can be Run Ad Hoc
• Any Relation Can be Viewed
• No Keys are Required
RDBMS
• Meaning of Information is not captured
• Relationships not Supported
• Restricted Data Types and Sizes
• Many Queries Have to be Predicted
• “Joins” are Frequently Needed
• Keys are Required and are Static
The RDF Schema and Ontologies
Ontologies
• There are public ontologies for thousands of subject areas
• Finance
• Medicine
• Real Estate
• Geospatial
• There are tools for managing ontologies
• Creating and editing
• Pruning
• Aligning
• Visualising
The RDF Schema and Ontologies
Benefits of RDF
• Graph modelling is flexible
• Adding/removing a new edge or vertex is simple
• Adding an edge is like adding a new column to a table but easier
• Standards-based graph representation
• RDF was defined by W3C, which allows interoperability
• Computers can understand the semantics of RDF graphs
• Discover hidden relationships via inferencing
• Same URI means same resource
• Data integrated over multiple disciplines
The RDF Schema and Ontologies
Data Integration Example
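The idea that the same URI means the same resource can be shown in a few lines. In this sketch two hypothetical sources describe London under one shared URI. The predicate names `ex:country` and `ex:hostsExchange` are invented for illustration, and graphs are again modelled as plain sets of tuples.

```python
# Two hypothetical sources describing the same resource under the same URI.
LONDON = "http://dbpedia.org/page/London"

geography = {
    (LONDON, "ex:country", "United Kingdom"),
}
finance = {
    (LONDON, "ex:hostsExchange", "London Stock Exchange"),
}

# Merging the two graphs is a set union: no schema change or join key is needed.
merged = geography | finance


def describe(graph, resource):
    """Collect every predicate/object pair stated about a resource."""
    return {(p, o) for s, p, o in graph if s == resource}


print(describe(merged, LONDON))
```

Because both sources used the same identifier, the merged graph answers questions that neither source could answer alone.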
Query an RDF store using SPARQL
SPARQL Protocol and RDF Query Language
SPARQL (SPARQL Protocol and RDF Query Language):
• Is a W3C standard for querying and manipulating RDF content
• Queries may consist of triple patterns, conjunctions, disjunctions, and optional patterns
• Queries/updates and corresponding results are communicated via HTTP with a SPARQL endpoint
• A SPARQL endpoint implements the SPARQL protocol and serves RDF data from an RDF triplestore or RDF view
Query an RDF store using SPARQL
SPARQL Components
Four query variations
1. SELECT query. Extracts raw values from a SPARQL endpoint and returns the results in table format
2. CONSTRUCT query. Extracts information from the SPARQL endpoint and transforms the results into valid RDF
3. ASK query. Provides a simple True/False result for a query on a SPARQL endpoint
4. DESCRIBE query. Extracts an RDF graph from the SPARQL endpoint
Query an RDF store using SPARQL
SPARQL Examples
SPARQL query
"What are all the country capitals in Africa?":
PREFIX ex: <http://example.com/exampleOntology#>
SELECT ?capital ?country
WHERE {
  ?x ex:cityname ?capital ;
     ex:isCapitalOf ?y .
  ?y ex:countryname ?country ;
     ex:isInContinent ex:Africa .
}
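To see how the four triple patterns in this query combine, here is a minimal Python sketch of conjunctive pattern matching. It is not a SPARQL engine: variables are simply strings starting with `?`, the sample data is invented for the example, and duplicate rows are dropped in the way SELECT DISTINCT would drop them.

```python
def match(pattern, triple, binding):
    """Try to extend binding so that pattern equals triple; return None on failure."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):                       # variable
            if result.setdefault(term, value) != value:
                return None                            # clashes with earlier binding
        elif term != value:                            # constant must match exactly
            return None
    return result


def select(graph, patterns, variables):
    """Evaluate triple patterns as a conjunction and project the given variables."""
    bindings = [{}]
    for pattern in patterns:
        extended_bindings = []
        for binding in bindings:
            for triple in graph:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    extended_bindings.append(extended)
        bindings = extended_bindings
    # Duplicates are removed here, as SELECT DISTINCT would do.
    return sorted({tuple(b[v] for v in variables) for b in bindings})


# Invented sample data using the property names from the query.
graph = {
    ("ex:cairo", "ex:cityname", "Cairo"),
    ("ex:cairo", "ex:isCapitalOf", "ex:egypt"),
    ("ex:egypt", "ex:countryname", "Egypt"),
    ("ex:egypt", "ex:isInContinent", "ex:Africa"),
    ("ex:nairobi", "ex:cityname", "Nairobi"),
    ("ex:nairobi", "ex:isCapitalOf", "ex:kenya"),
    ("ex:kenya", "ex:countryname", "Kenya"),
    ("ex:kenya", "ex:isInContinent", "ex:Africa"),
    ("ex:paris", "ex:cityname", "Paris"),
    ("ex:paris", "ex:isCapitalOf", "ex:france"),
    ("ex:france", "ex:countryname", "France"),
    ("ex:france", "ex:isInContinent", "ex:Europe"),
}
patterns = [
    ("?x", "ex:cityname", "?capital"),
    ("?x", "ex:isCapitalOf", "?y"),
    ("?y", "ex:countryname", "?country"),
    ("?y", "ex:isInContinent", "ex:Africa"),
]
rows = select(graph, patterns, ["?capital", "?country"])
print(rows)
```

Each pattern narrows the set of candidate bindings, which is why the shared variables `?x` and `?y` act as the join between city and country.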
Query an RDF store – Find all persons
Query an RDF store – Find Alice
Query an RDF store – Recurse who Alice knows
Query an RDF store – Find who Alice doesn’t know
Query an RDF store using SPARQL
SPARQL 1.1 Update: INSERT DATA
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
INSERT DATA
{ my:person1 foaf:name "Don Tonkin" .
  my:person1 foaf:gender "male" . }
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
INSERT { ?p foaf:worksFor "IBM Australia" }
WHERE { ?p foaf:worksFor "IBMA" }
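The two update forms behave differently, which a small Python model makes visible. This is a sketch of the semantics over a set of tuples, not a call to any SPARQL endpoint. The `worksFor "IBMA"` triple is a hypothetical starting value, added so that the second update has something to match.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
MY = "http://www.mydomain.com/"

store = set()

# INSERT DATA: add ground triples directly.
store |= {
    (MY + "person1", FOAF + "name", "Don Tonkin"),
    (MY + "person1", FOAF + "gender", "male"),
    (MY + "person1", FOAF + "worksFor", "IBMA"),   # hypothetical existing triple
}

# INSERT ... WHERE: find the matches first, then add one new triple per match.
matches = [s for s, p, o in store if p == FOAF + "worksFor" and o == "IBMA"]
store |= {(s, FOAF + "worksFor", "IBM Australia") for s in matches}

employers = sorted(o for s, p, o in store if p == FOAF + "worksFor")
print(employers)
```

Note that INSERT only adds: the old "IBMA" value is still present afterwards. Replacing a value needs the combined DELETE/INSERT form of SPARQL 1.1 Update.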
Query an RDF store using SPARQL
RDF Data
:res1 rdf:type :House .
:res1 :baths "2.5"^^xsd:decimal .
:res1 :bedrooms "3"^^xsd:decimal .
:res2 rdf:type :Unit .
:res2 :baths "2"^^xsd:decimal .
:res2 :bedrooms "2"^^xsd:decimal .
:res3 rdf:type :House .
:res3 :baths "1.5"^^xsd:decimal .
:res3 :bedrooms "3"^^xsd:decimal .
Query
SELECT ?r ?ba ?br
WHERE { ?r rdf:type :House .
        ?r :baths ?ba .
        ?r :bedrooms ?br
        FILTER (?ba > 2) }
Result
?r | ?ba | ?br
===================
:res1 | "2.5" | "3"
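The same selection can be traced by hand in Python to see why only one row survives. This sketch holds the xsd:decimal literals as `Decimal` values and applies the type test and the filter as ordinary comparisons. The helper `value` is a name made up for the example.

```python
from decimal import Decimal

# The slide's data, with xsd:decimal literals held as Decimal values.
data = {
    (":res1", "rdf:type", ":House"),
    (":res1", ":baths", Decimal("2.5")),
    (":res1", ":bedrooms", Decimal("3")),
    (":res2", "rdf:type", ":Unit"),
    (":res2", ":baths", Decimal("2")),
    (":res2", ":bedrooms", Decimal("2")),
    (":res3", "rdf:type", ":House"),
    (":res3", ":baths", Decimal("1.5")),
    (":res3", ":bedrooms", Decimal("3")),
}


def value(data, subject, predicate):
    """Return one object for subject/predicate, or None if there is none."""
    return next((o for s, p, o in data if s == subject and p == predicate), None)


# ?r rdf:type :House
houses = sorted(s for s, p, o in data if p == "rdf:type" and o == ":House")

# ?r :baths ?ba . ?r :bedrooms ?br FILTER (?ba > 2)
rows = []
for r in houses:
    ba, br = value(data, r, ":baths"), value(data, r, ":bedrooms")
    if ba is not None and br is not None and ba > 2:
        rows.append((r, ba, br))

print(rows)
```

:res2 is excluded by its type and :res3 by the filter, leaving :res1.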
Query an RDF store using SPARQL
RDF Data
:res1 rdf:type :House .
:res1 :baths "2.5"^^xsd:decimal .
:res1 :bedrooms "3"^^xsd:decimal .
:res1 ogc:hasGeometry :geom1 .
:geom1 ogc:asWKT "POINT(151.21 -33.85)"^^ogc:wktLiteral .
:res3 rdf:type :House .
:res3 :baths "1.5"^^xsd:decimal .
:res3 :bedrooms "3"^^xsd:decimal .
:res3 ogc:hasGeometry :geom3 .
:geom3 ogc:asWKT "POINT(149.10 -35.35)"^^ogc:wktLiteral .
Query
SELECT ?r ?ba ?br
WHERE { ?r rdf:type :House .
        ?r :baths ?ba .
        ?r :bedrooms ?br
        FILTER (?ba > 2) }
Add spatial terms such as
• Distance
• Within area
• Touches
Result
?r | ?ba | ?br
===================
:res1 | "2.5" | "3"
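A distance term needs the coordinates behind each geometry. The sketch below parses two WKT points and computes a great-circle distance between them. It assumes points written as `POINT(longitude latitude)` with a space separator, and it uses the haversine formula with a mean Earth radius of 6371 km. A spatially enabled store would offer its own functions for this; the two helper names here are made up for the example.

```python
import math
import re


def parse_wkt_point(text):
    """Parse 'POINT(x y)' into (longitude, latitude); WKT puts x first."""
    number = r"(-?\d+(?:\.\d+)?)"
    m = re.fullmatch(rf"\s*POINT\s*\(\s*{number}\s+{number}\s*\)\s*", text)
    if m is None:
        raise ValueError(f"not a WKT point: {text!r}")
    return float(m.group(1)), float(m.group(2))


def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance in km between two (longitude, latitude) pairs."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


geom1 = parse_wkt_point("POINT(151.21 -33.85)")
geom3 = parse_wkt_point("POINT(149.10 -35.35)")
distance_km = haversine_km(geom1, geom3)
print(round(distance_km))
```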
Query an RDF store using SPARQL
• Extremely complex queries are possible
• SPARQL queries can be embedded in SQL statements
• Vendor-specific functions at this stage
• Integration with analytics tools
Conclusions
RDF:
• allows integration of structured and unstructured data
• provides a mechanism for ‘Big Data’ management
• Ontologies and data models differ
• Consistency but not completeness
• Integration challenges
• Data ambiguity is a challenge
• Semantic processing is increasing
• NLP is a key component
Query an RDF store using SPARQL
Some examples
• An example of a Triple Store
http://dbpedia.org/page/London
• Some Ontologies
http://protegewiki.stanford.edu/wiki/Protege_Ontology_Library
• Relationship Finder
http://www.visualdataweb.org/relfinder/relfinder.php