Timely. Practical. Reliable.
http://applied-semantic-web.org
An Introduction to the Semantic Web
for Life Science Practitioners
Emanuele Della Valle
http://emanueledellavalle.org
Share, Remix, Reuse — Legally
§
This work is licensed under the Creative Commons
Attribution 3.0 Unported License.
§
You are free:
•
to Share
— to copy, distribute and transmit the work
•
to Remix
— to adapt the work
§
Under the following conditions
•
Attribution
— You must attribute the work by inserting
–
© applied-semantic-web.org at the end of each reused slide
–
a credits slide stating These slides are partially based on
An Introduction to the Semantic Web for Life Science Practitioners
by Emanuele Della Valle
http://applied-semantic-web.org/slides/2011/01/1a-intro-4bio.ppt
§
To view a copy of this license, visit
Emanuele Della Valle - http://applied-semantic-web.org
§
Motivation
§
The Semantic Web
Introduction
Web-based Knowledge Discovery Today
Large number of integrations
-
ad hoc
-
pair-wise
Possible, but a very
painful process
-- Carole Goble
Search &
Mash-up
Engine
Millions of Applications
Computers don't understand much …
Each site is understandable for us
Computers don't understand much
?
The Problem: Semantic Gap
Sensor Data
Semantic Gap
Understanding Means Bridging the Gap
Sensor Data
understanding
Do We Really Know What Understanding Means?
Two ways for computers to “understand”
§
Smart Machine
Smart Machines
§
Working examples found on the Web
•
Image Processing
–
retrievr: find by sketching
http://labs.systemone.at/retrievr/
•
Audio Processing
–
midomi: find by singing
http://www.midomi.com/
•
[…]
•
Natural Language Processing
–
semantic proxy:
http://semanticproxy.opencalais.com/about.html
[Figure: Sensor Data, Symbolic Description; Image Processing, Audio Processing, Natural Language Processing, […]]
Smart Machines alone cannot bridge the gap
…
Natural Language Processing (NLP) meets Image Processing (IP)
NLP: What does your eye see?
IP: I see a sea
NLP: You see a c?
IP: Yes, what else could it be?
[Source: NLP Related Entertainment, http://www.cl.cam.ac.uk/Research/NL/amusement.html]
[Figure: Sensor Data, Symbolic Description; Image Processing: sea; Natural Language Processing: c; Semantic Gap]
… smart data are needed
[Figure: Sensor Data, Symbolic Description; Image Processing: sea; Natural Language Processing: c; smart data]
Natural Language Processing (NLP) meets Image Processing (IP)
NLP: What does your eye see?
IP: I see a wordnet:word-sea
NLP: mmm, I see a wordnet:word-c
IP: I believe we have different understanding of the world …
NLP: So do I
The Semantic Web offers a set of standards that lowers the barriers to employ smart data
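The difference between the two dialogues can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `wordnet:` prefix comes from the dialogue, while the namespace URI `http://example.org/wordnet/` and the helpers `expand` and `same_concept` are hypothetical names introduced here.

```python
# Hypothetical namespace URI; only the "wordnet:" prefix appears in the slides.
PREFIXES = {"wordnet": "http://example.org/wordnet/"}


def expand(prefixed_name):
    """Expand a prefixed name such as 'wordnet:word-sea' into a full URI."""
    prefix, local = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local


def same_concept(a, b):
    """Two agents mean the same thing only if their identifiers expand to the same URI."""
    return expand(a) == expand(b)


# Each agent names the concept it means with an unambiguous identifier
# instead of a spoken word ("sea" and "c" sound identical).
ip_concept = "wordnet:word-sea"
nlp_concept = "wordnet:word-c"

print(same_concept(ip_concept, ip_concept))   # identical identifiers
print(same_concept(ip_concept, nlp_concept))  # the mismatch is now detectable
```

With plain words the two agents cannot notice that they disagree; with shared identifiers the disagreement becomes a simple comparison.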
What a machine understands of the Web
§
What we say to Web agents
§
"For more information visit <a href="http://www.ex.org">my company</a> Web site…"
§
What they hear
§
"blah blah blah blah blah <a href="http://www.ex.org">blah blah blah</a> blah blah…"
§
Yet this is enough to train them
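What a Web agent "hears" can be approximated with Python's standard `html.parser` module. This is a minimal sketch, not how any particular crawler works: the class `LinkOnlyParser` and the function `links_heard` are hypothetical names, and the sample page is the sentence from the slide.

```python
from html.parser import HTMLParser


class LinkOnlyParser(HTMLParser):
    """Keeps hyperlink targets and ignores all surrounding text."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_heard(html):
    """Return the link targets found in an HTML fragment, in document order."""
    parser = LinkOnlyParser()
    parser.feed(html)
    parser.close()
    return parser.links


page = 'For more information visit <a href="http://www.ex.org">my company</a> Web site.'
print(links_heard(page))
```

The text "my company" never reaches the agent; only the link target does.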
What Google understands
§
Understanding that
•
[page1] links [page2] → page2 is interesting
§
Google is able to rank results!
•
The heart of our software is PageRank™, a system for ranking web pages […] (that) relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value.
http://www.google.com/technology/
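The idea that "[page1] links [page2]" makes page2 interesting can be sketched as a simplified PageRank computed by power iteration. This is a textbook-style approximation, not Google's implementation: the damping factor 0.85, the iteration count, and the three-page example graph are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy Web: page1 and page3 both link to page2; page2 links back to page1.
web = {"page1": ["page2"], "page3": ["page2"], "page2": ["page1"]}
ranks = pagerank(web)
print(max(ranks, key=ranks.get))
```

In this toy graph page2 collects links from two pages and ends up ranked highest, while page3, which nobody links to, ends up lowest.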
The Semantic Web 1/4
§
The Semantic Web is not a separate Web, but an
extension of the current one, in which information is given
well-defined meaning, better enabling computers and
people to work in cooperation.
The Semantic Web, Scientific American Magazine, May 2001
http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21
§
Key concepts
•
an extension of the current Web
•
in which information is given well-defined meaning
•
better enabling computers and people to work in cooperation.
The Semantic Web 2/4
§
The Semantic Web is not a separate Web, but an extension of the current one […]
The Semantic Web 3/4
§
The Semantic Web […], in which information is given well-defined meaning […]
Human understandable but “only” machine-readable
Human and machine “understandable”
?
The Semantic Web 4/4
Semantic Web
Fewer integrations
-
standard
-
multi-lateral
[…] better enabling computers and people to work in cooperation.
Even More Applications
Easier to understand for people
More “understandable” for computers
Semantic
Mash-ups
&
Bio2RDF REST services
§
Describe a resource by a dereferenceable URI
•
http://bio2rdf.org/ns:id
§
Global services over federated endpoints
•
http://bio2rdf.org/links/ns:id
•
http://bio2rdf.org/search/searchedTerm
§
Targeted services to a specific endpoint
•
http://bio2rdf.org/linksns/ns2/ns1:id
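The URL patterns above can be captured as small helper functions. This is a sketch that only builds strings following the patterns on the slide; the function names are hypothetical, no request is sent, and whether the service still answers at these addresses is not verified here.

```python
from urllib.parse import quote

BASE = "http://bio2rdf.org"


def describe_url(ns, identifier):
    """URI describing one resource: http://bio2rdf.org/ns:id"""
    return f"{BASE}/{ns}:{identifier}"


def links_url(ns, identifier):
    """Global 'links' service: http://bio2rdf.org/links/ns:id"""
    return f"{BASE}/links/{ns}:{identifier}"


def search_url(term):
    """Global 'search' service: http://bio2rdf.org/search/searchedTerm"""
    # Percent-encode the term so that spaces or special characters stay valid in a URL.
    return f"{BASE}/search/{quote(term)}"


def linksns_url(ns2, ns1, identifier):
    """Targeted 'linksns' service: http://bio2rdf.org/linksns/ns2/ns1:id"""
    return f"{BASE}/linksns/{ns2}/{ns1}:{identifier}"


print(describe_url("geneid", "672"))
print(links_url("geneid", "672"))
print(search_url("BRCA1"))
```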
Examples of questions Bio2RDF can answer
§
What is known about human BRCA genes?
•
http://bio2rdf.org/search/BRCA1
§
What is known about human BRCA genes in the Entrez Gene databank (i.e., the Bio2RDF data source whose namespace is geneid)?
•
http://bio2rdf.org/searchns/geneid/BRCA1
§
Which facts are known about the human tumor suppressor gene BRCA1 (Gene ID: 672)?
•
http://bio2rdf.org/geneid:672
§
What information is linked to geneid:672?
•
http://bio2rdf.org/links/geneid:672
§
What is the FASTA sequence of the human 5-hydroxytryptamine receptor 2A (whose accession number is AB037513) in the NCBI GenBank databank (i.e., the Bio2RDF data source whose namespace is genbank)?
•
http://bio2rdf.org/fasta/genbank:AB03751
§
And the image?
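Dereferencing one of these URIs from a program could look like the sketch below. It rests on assumptions not stated in the slides: that the server honors an HTTP `Accept` header asking for RDF/XML, and that the address is still reachable. The helper `rdf_request` is a hypothetical name; it builds the request without sending it.

```python
from urllib.request import Request


def rdf_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) an HTTP GET request asking for an RDF representation."""
    # Assumption: the server performs content negotiation on the Accept header.
    return Request(uri, headers={"Accept": media_type})


req = rdf_request("http://bio2rdf.org/geneid:672")
print(req.get_method(), req.full_url, req.get_header("Accept"))

# To actually fetch the description (requires network access):
#   from urllib.request import urlopen
#   with urlopen(req) as response:
#       data = response.read()
```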
Complex Example: Linking Open Data Project
§
Goal: extend the Web with a data commons by publishing open data sets using Semantic Web technologies
Visit http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData!
Project Chartres
•
RDFizers and ConverterToRdf
•
Publishing Tools
•
Semantic Web Browsers and Client Libraries
•
Semantic Web Search Engines
•
Applications
•
[…]
Bio2RDF
Browsing the LOD with http://sig.ma/
Try it! http://sig.ma/search?q=Propranolol
Semantic Web layer cake
Standardized
Under Investigation
Already Possible
[Source: http://www.w3.org/2007/03/layerCake.png]
§
Introduction and RDF slides are partially based on
Fundamentals of the Semantic Web by David Booth