Iván Ruiz Rube
Departamento de Lenguajes y Sistemas Informáticos
Universidad de Cádiz
Publishing Linked Data
from relational databases
Roadmap
The evolution of the Web
Linked Open Data
Exposing databases with D2R Server
Case study: The VOA3R Project
Conclusions
THE EVOLUTION OF
THE WEB
World Wide Web
Most important infrastructure for the
distribution of information.
Rich and broad information: text, images,
videos, slides, etc.
Web browsers support HTML, JavaScript, CSS
and other formats.
Navigation based on hyperlinks.
Web Evolution
09/11/2011 6
Web 1.0
Web 2.0
Web 3.0
Web 1.0
Beginnings of the
Web
Static pages
Limited use of
standards
Lack of interaction
Web 2.0
Higher bandwidth
Standards
Rich User Interface
Accessibility
Usability
Social networks
Web 3.0
3D virtual
environments
The Internet of
Things
Domotics
Cloud Computing
Semantic Web
Semantic Web
“I have a dream for the Web (in which computers) become capable of analyzing all the data on the Web…”
“…the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines.”
Tim Berners-Lee
LINKED OPEN DATA
Information Age
Huge amount of information
A large number of information systems
Big challenges:
◦ Data integration
◦ Data analysis
Need for open data
Improvement of organizational transparency
Public data
Foster research
Promote the development of third-party systems
Linked Open Data
“A method of publishing structured data so that it can be
interlinked and become more useful.
…it extends web pages to share information in a way that can
be read automatically by computers.”
09/11/2011 Jornadas de Software Libre y Web 2.0 14
Resource Description Framework
[Figure: an RDF graph built up step by step over four slides. The resource http://publisher.org/Papers/Paper12345 has title "Linked Data - The Story So Far", year 2008, author http://w3.org/People/Berners-Lee and publishedIn http://publisher.org/Journals/JournalSWIS. The final slide adds the node http://w3c.org with an edge labelled director, and the node http://xmlns.com/foaf/Person.]
RDF (syntax)
<http://publisher.org/Papers/Paper12345>
    title "Linked Data - The Story So Far" ;
    year "2008-01-01" ;
    author <http://w3.org/People/Berners-Lee> ;
    publishedIn <http://publisher.org/Journal/JournalSWIS> .
<rdf:Description rdf:about="http://publisher.org/Papers/Paper12345">
    <title>Linked Data - The Story So Far</title>
    <year>2008-01-01</year>
    <author rdf:resource="http://w3.org/People/Berners-Lee" />
    <publishedIn rdf:resource="http://publisher.org/Journal/JournalSWIS" />
</rdf:Description>
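The two serializations above state the same four facts about one paper. As a minimal sketch, using only the Python standard library, the statements can be held as (subject, predicate, object) tuples and printed one per line in an N-Triples-like form. The slides leave the property names unqualified, so the `http://example.org/terms/` namespace below is an assumption made only to obtain full URIs.

```python
# Assumption: the slides use bare property names (title, year, ...);
# this namespace is hypothetical and only serves to build full URIs.
TERMS = "http://example.org/terms/"

PAPER = "http://publisher.org/Papers/Paper12345"

# Each statement is a (subject, predicate, object) triple.
# Objects are tagged as ("uri", value) or ("literal", value).
triples = [
    (PAPER, TERMS + "title", ("literal", "Linked Data - The Story So Far")),
    (PAPER, TERMS + "year", ("literal", "2008-01-01")),
    (PAPER, TERMS + "author", ("uri", "http://w3.org/People/Berners-Lee")),
    (PAPER, TERMS + "publishedIn", ("uri", "http://publisher.org/Journal/JournalSWIS")),
]


def to_ntriples_line(subject, predicate, obj):
    """Render one triple as '<s> <p> o .' with URIs in angle brackets."""
    kind, value = obj
    if kind == "uri":
        rendered = f"<{value}>"
    else:
        # Escape backslashes and double quotes inside literals.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        rendered = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {rendered} ."


lines = [to_ntriples_line(*t) for t in triples]
print("\n".join(lines))
```

Every line has the same subject, which is why the compact syntax on the slide can write the subject once and separate the remaining statements with semicolons.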
Ontologies (vocabularies)
“An ontology is an explicit and formal specification of a shared conceptualization.”
Tom Gruber
EXPOSING DATABASES
WITH D2R SERVER
How to publish Linked Data?
Annotation
◦ Manual
◦ Collaborative
◦ (Semi-)automatic
Exposure
◦ RDF Triple Store
◦ HTML+RDF (RDFa)
◦ RDF Wrappers
◦ SQL2RDF
Web Application Architecture
[Figure: a relational database and an application server.]
Web Application Architecture using D2R Server
[Figure: the relational database, the application server and D2R Server; D2R Server publishes database content as RDF such as:]
<http://cris.org:/resource/projects/Organic> a cerif:Project ;
    rdfs:label "Multilingual Federation of Learning Repositories"@en-uk ;
    cerif:acronym "Organic.Edunet" ;
    cerif:endDate "2010-09-30"^^xsd:date ;
    cerif:internalIdentifier "ff808181300cf99e01300d1a355f0003" ;
    cerif:isLinkedByOrganisationUnit …
Exposing and Consuming Linked Data
[Figure: a web browser showing http://mashup.org, which consumes the data that D2R Server exposes from the relational database.]
Installing D2R
Using D2R
~/d2rserver$> generate-mapping -o MAPPING.n3 -d com.mysql.jdbc.Driver -u USER -p PASSWORD jdbc:mysql://localhost:3306/DATABASE
Database model example
Vocabularies
# Built-in vocabularies
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
# Specific vocabularies
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix bibo: <http://purl.org/ontology/bibo/> .
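Each prefix declared above is shorthand for a namespace URI, so a name such as foaf:Person stands for a full URI. A minimal Python sketch of that expansion follows; the `expand` helper is a hypothetical name chosen for illustration and is not part of D2R Server.

```python
# Prefix table copied from the declarations on the slide.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "bibo": "http://purl.org/ontology/bibo/",
}


def expand(prefixed_name):
    """Hypothetical helper: turn 'prefix:local' into a full URI."""
    prefix, separator, local = prefixed_name.partition(":")
    if not separator or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local


print(expand("foaf:Person"))
print(expand("dcterms:title"))
```

The mapping file relies on the same mechanism: every d2rq:class and d2rq:property value written with a prefix resolves to a URI in one of these vocabularies.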
Database Connection
map:database a d2rq:Database;
    # Main settings
    d2rq:jdbcDriver "com.mysql.jdbc.Driver";
    d2rq:jdbcDSN "jdbc:mysql://localhost:3306/DATABASE";
    d2rq:username "USER";
    d2rq:password "PASSWORD";
    # Other settings
    jdbc:autoReconnect "true";
    jdbc:zeroDateTimeBehavior "convertToNull";
    d2rq:allowDistinct "true";
    jdbc:keepAlive "3600"; # value in seconds
    jdbc:keepAliveQuery "SELECT 1" .
Exposing RDF Resources
map:OrganisationUnits a d2rq:ClassMap;
    d2rq:dataStorage map:database;
    d2rq:class cerif:Organization;
    d2rq:uriPattern "organizations/@@ORGANISATIONS.ACRONYM@@";
    d2rq:condition "ORGANISATIONS.ACRONYM <> ''" .
[Figure: the resulting resource http://dataset.org/organizations/UCA with an rdf:type edge to a class under http://eurocris.org/cerif/]
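The class map above turns each qualifying row of ORGANISATIONS into one resource URI. The following Python sketch imitates that idea with an in-memory SQLite database. It illustrates the concept only and is not how D2R Server is implemented; the table rows are made up, and `resource_uris` is a hypothetical helper.

```python
import re
import sqlite3
from urllib.parse import quote

BASE_URI = "http://dataset.org/"  # base URI shown on the slide
URI_PATTERN = "organizations/@@ORGANISATIONS.ACRONYM@@"
CONDITION = "ORGANISATIONS.ACRONYM <> ''"

# Assumption: a toy ORGANISATIONS table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ORGANISATIONS (ID INTEGER PRIMARY KEY, ACRONYM TEXT)")
conn.executemany(
    "INSERT INTO ORGANISATIONS (ID, ACRONYM) VALUES (?, ?)",
    [(1, "UCA"), (2, ""), (3, "UAH")],
)


def resource_uris(conn, base_uri, uri_pattern, condition):
    """Hypothetical helper: one URI per row that satisfies the condition."""
    # Collect the TABLE.COLUMN placeholders written between @@ markers.
    columns = re.findall(r"@@([A-Za-z_]+\.[A-Za-z_]+)@@", uri_pattern)
    table = columns[0].split(".")[0]
    # Building SQL from strings is acceptable here only because the
    # pattern and condition are fixed constants of this sketch.
    sql = (
        f"SELECT {', '.join(columns)} FROM {table} "
        f"WHERE {condition} ORDER BY {columns[0]}"
    )
    uris = []
    for row in conn.execute(sql):
        path = uri_pattern
        for column, value in zip(columns, row):
            # Percent-encode the value so it is safe inside a URI.
            path = path.replace(f"@@{column}@@", quote(str(value), safe=""))
        uris.append(base_uri + path)
    return uris


uris = resource_uris(conn, BASE_URI, URI_PATTERN, CONDITION)
print(uris)
```

The row with the empty acronym produces no resource, which is the purpose of the d2rq:condition line: without it, that row would yield a URI ending in a bare slash.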
Exposing literal properties
map:OrganisationUnits_Headcount a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:OrganisationUnits;
    d2rq:property cerif:headcount;
    d2rq:column "ORGANISATIONS.HEADCOUNT" .
[Figure: http://dataset.org/organizations/UCA with a cerif:headcount edge to a literal value]
Exposing 1:N relations
map:OrganisationUnits_Name a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:OrganisationUnits;
    d2rq:property cerif:name;
    d2rq:join "ORG_NAME.ORGID = ORGANISATIONS.ID";
    d2rq:column "ORG_NAME.NAME" .
[Figure: http://dataset.org/organizations/UCA with two cerif:name edges, one to "University of …" and one to "Universidad de Cádiz"@es]
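The join in this property bridge means that one organisation can receive several cerif:name values, one per matching row in ORG_NAME. A minimal sketch with SQLite, assuming made-up rows and keeping cerif:name as a prefixed name because the slides do not show the full cerif namespace:

```python
import sqlite3

BASE_URI = "http://dataset.org/"  # base URI shown on the slide

# Assumption: toy tables and rows, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ORGANISATIONS (ID INTEGER PRIMARY KEY, ACRONYM TEXT);
    CREATE TABLE ORG_NAME (ORGID INTEGER, NAME TEXT);
    INSERT INTO ORGANISATIONS VALUES (1, 'UCA');
    INSERT INTO ORG_NAME VALUES (1, 'Universidad de Cádiz');
    INSERT INTO ORG_NAME VALUES (1, 'UCA');
""")

rows = conn.execute("""
    SELECT ORGANISATIONS.ACRONYM, ORG_NAME.NAME
    FROM ORGANISATIONS, ORG_NAME
    WHERE ORG_NAME.ORGID = ORGANISATIONS.ID   -- the d2rq:join expression
      AND ORGANISATIONS.ACRONYM <> ''         -- the class map condition
    ORDER BY ORG_NAME.NAME
""").fetchall()

# One triple per joined row: (organisation URI, property, literal name).
name_triples = [
    (BASE_URI + "organizations/" + acronym, "cerif:name", name)
    for acronym, name in rows
]

for triple in name_triples:
    print(triple)
```

Both triples share the subject http://dataset.org/organizations/UCA, which corresponds to the two cerif:name edges leaving the same node in the figure.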
Exposing N:M relations
map:OrganisationUnits_Person a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:OrganisationUnits;
    d2rq:property cerif:members;
    d2rq:join "ORG_PERS.ORGID = ORGANISATIONS.ID";
    d2rq:join "ORG_PERS.PERSID = PERSON.ID";
    d2rq:refersToClassMap map:Person .
[Figure: http://dataset.org/organizations/UCA linked to http://dataset.org/people/InvestigadorXYZ]
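Here the two joins walk through the link table ORG_PERS, and the object of each triple is another resource rather than a literal. The sketch below shows the idea with SQLite. The rows are made up, and because the slides do not show the map:Person class map, the PERSON.NICKNAME column and the people/ URI layout are assumptions inferred from the example URI in the figure.

```python
import sqlite3

BASE_URI = "http://dataset.org/"  # base URI shown on the slide

# Assumption: toy tables and rows; PERSON.NICKNAME is a hypothetical column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ORGANISATIONS (ID INTEGER PRIMARY KEY, ACRONYM TEXT);
    CREATE TABLE PERSON (ID INTEGER PRIMARY KEY, NICKNAME TEXT);
    CREATE TABLE ORG_PERS (ORGID INTEGER, PERSID INTEGER);
    INSERT INTO ORGANISATIONS VALUES (1, 'UCA');
    INSERT INTO ORGANISATIONS VALUES (2, 'UAH');
    INSERT INTO PERSON VALUES (10, 'InvestigadorXYZ');
    INSERT INTO PERSON VALUES (11, 'InvestigadorABC');
    INSERT INTO ORG_PERS VALUES (1, 10);
    INSERT INTO ORG_PERS VALUES (1, 11);
    INSERT INTO ORG_PERS VALUES (2, 11);
""")

rows = conn.execute("""
    SELECT ORGANISATIONS.ACRONYM, PERSON.NICKNAME
    FROM ORGANISATIONS, ORG_PERS, PERSON
    WHERE ORG_PERS.ORGID = ORGANISATIONS.ID   -- first d2rq:join
      AND ORG_PERS.PERSID = PERSON.ID         -- second d2rq:join
    ORDER BY ORGANISATIONS.ACRONYM, PERSON.NICKNAME
""").fetchall()

# Subject and object are both URIs: organisation -> cerif:members -> person.
member_links = [
    (
        BASE_URI + "organizations/" + acronym,
        "cerif:members",
        BASE_URI + "people/" + nickname,
    )
    for acronym, nickname in rows
]

for link in member_links:
    print(link)
```

A person who belongs to two organisations appears as the object of two triples, which is how the N:M relation carries over into the graph.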
CASE STUDY: THE
VOA3R PROJECT
Platform based on semantic technologies to integrate open content for researchers.
Manages scientific context:
◦ Organizations
◦ Research Projects
◦ Researcher Profiles
◦ etc.
Publishes its data using D2R Server.
Organization’s Data in VOA3R
Organization’s Data in RDF
<http://voa3r.cc.uah.es/dataset/resource/organisationUnits/UAH>
    rdf:type cerif:OrganisationUnit ;
    rdfs:label "University of Alcala" ;
    cerif:acronym "UAH" ;
    foaf:homepage <http://www.uah.es> ;
    cerif:researchActivities "Ontologies, Linked Data" ;
Organization’s Data in RDF (II)
…
cerif:researchProjects
<http://voa3r.cc.uah.es/dataset/resource/projects/Organic.Edunet> ,
<http://voa3r.cc.uah.es/dataset/resource/projects/Organic.Lingua> ,
<http://voa3r.cc.uah.es/dataset/resource/projects/VOA3R> ;
cerif:innerGroups
<http://voa3r.cc.uah.es/dataset/resource/organisationUnits/IERU> ;
cerif:members
<http://voa3r.cc.uah.es/dataset/resource/person/Salvador_Sanchez> ,
<http://voa3r.cc.uah.es/dataset/resource/person/Miguel_Refusta> ,
<http://voa3r.cc.uah.es/dataset/resource/person/Luis_Torrico> .
SPARQL Client
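A SPARQL client retrieves published data by sending a query to an endpoint. The sketch below only builds the query text and the GET request URL and does not contact any server. The endpoint address is a hypothetical placeholder, since the slides do not state where the VOA3R endpoint lives; the resource URI is the one shown in the RDF listing for the University of Alcala.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: hypothetical endpoint address, not given in the slides.
ENDPOINT = "http://example.org/dataset/sparql"

# Resource URI taken from the RDF listing on the earlier slide.
ORG_URI = "http://voa3r.cc.uah.es/dataset/resource/organisationUnits/UAH"


def describe_query(resource_uri, limit=50):
    """Build a query listing every property and value of one resource."""
    return (
        "SELECT ?property ?value\n"
        f"WHERE {{ <{resource_uri}> ?property ?value }}\n"
        f"LIMIT {limit}"
    )


def request_url(endpoint, query):
    """A GET request carries the query text in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


query = describe_query(ORG_URI)
url = request_url(ENDPOINT, query)
print(query)
print(url)
```

Against a live endpoint, this URL could be fetched with any HTTP client, and the result would list properties such as rdfs:label and cerif:acronym for the organisation.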
CONCLUSIONS
Conclusions
From a Web based on documents to a Web based on data.
Linked Data as a way of interchanging data between different datasets on the Web.
RDF as a standard format to describe data.
D2R allows publishing RDF metadata from databases (a non-intrusive technique).
Main aim: create new third-party applications using linked open data from LD systems.
References
Linked Data: Evolving the Web into a Global Data Space
◦ http://linkeddatabook.com/
W3C Linking Open Data Project
◦ http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
D2R Server
◦ http://www4.wiwiss.fu-berlin.de/bizer/d2r-server
Iván Ruiz Rube
Publishing Linked Data from
relational databases
thanks