
The Semantic Web: Web of (integrated) Data

Frank van Harmelen, Vrije Universiteit Amsterdam

Take home message

• Semantic Web = Web of Data (no longer only web of text, web of pictures)
• Set of open, stable W3C standards
• Rapidly emerging tools & vendors
• Use cases:
  - data integration
  - web services
  - knowledge management
  - search (intranets)

Outline

• The vision
• What is required
• Machine representation
  - XML, RDF, OWL
• Where are we now?
• Examples

Things we would

“Intelligent” things we can’t do today

• Search engines
  - concepts, not keywords
  - semantic narrowing/widening of queries
• Shopbots
  - semantic interchange, not screenscraping
• E-commerce
  - negotiation, catalogue mapping, personalisation
• Web Services
  - need semantic characterisations to find them and to combine them
• Navigation
  - by semantic proximity, not hardwired links
• ...

Other use cases are

• personalisation
• semantic linking
• data integration
• web services
• ...

Sounds good, so how is this tackled?

Outline

• The vision
• What is required
• Machine representation
  - XML, RDF, OWL
• Where are we now?
• Examples

machine accessible meaning

(What it’s like to be a machine)

[Figure labels: disease, name, symptoms, drug, administration]

Meta-data!

What is meta-data?

• it's just data
• it's data describing other data
• it's meant for machine consumption

[Figure labels: disease, name, symptoms, drug, administration]

meta-data + ontologies

[Figure: tags <name>, <symptoms>, <drug>, <drug administration>, <disease>, <treatment>; relation labels IS-A, reduces]

What’s inside an ontology?

• terms + specialisation hierarchy
• classes + class-hierarchy
• instances
• slots/values
• inheritance (multiple? defaults?)
• restrictions on slots (type, cardinality)
• properties of slots (symmetric, transitive, …)
• relations between classes (disjoint, covers)
• reasoning tasks: classification, subsumption

Increasing semantic “weight”

In short (for the duration of this tutorial)

• Ontologies are not definitive descriptions of what exists in the world (= philosophy)
• Ontologies are shared models of the world, constructed to facilitate communication
• Yes, ontologies exist (because we build them)

Real life examples

• handcrafted (often by communities)
  - music: CDnow (2410/5), MusicMoz (1073/7)
  - biomedical: SNOMED (200k), GO (15k), Emtree (45k+190k)
• ranging from lightweight (Yahoo, UNSPSC) to heavyweight (Cyc)
• ranging from small (METAR) to large (UNSPSC)

All right, but how to represent all this in a computer?

Outline

• The vision
• What is required
• Machine representation
  - XML, RDF, OWL
• Where are we now?
• Examples

What was XML again?

[Figure: tree with root country, which has name “Netherlands” and a capital; the capital has name “Amsterdam” and areacode “020”]

<country name="Netherlands">
  <capital name="Amsterdam">
    <areacode>020</areacode>
  </capital>
</country>

So why not just use XML?

• No agreement on:
  - structure
    • is country an object? a class? an attribute? a relation? something else?
    • what does nesting mean?
  - vocabulary
    • is country the same as nation?

<country name="Netherlands">
  <capital name="Amsterdam">
    <areacode>020</areacode>
  </capital>
</country>

<nation>
  <name>Netherlands</name>
  <capital>Amsterdam</capital>
  <capital_areacode>020</capital_areacode>
</nation>

• Are the above XML documents the same?
• Do they convey the same information?
• Is the answer machine-derivable?
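The three questions can be made concrete with a minimal Python sketch that uses only the standard library. It parses the two XML documents shown above and compares them in two ways; the helper names `tag_paths` and `text_values` are illustrative assumptions, not part of any standard.

```python
import xml.etree.ElementTree as ET

DOC_COUNTRY = """
<country name="Netherlands">
  <capital name="Amsterdam">
    <areacode>020</areacode>
  </capital>
</country>
"""

DOC_NATION = """
<nation>
  <name>Netherlands</name>
  <capital>Amsterdam</capital>
  <capital_areacode>020</capital_areacode>
</nation>
"""


def tag_paths(xml_text):
    """Return the set of tag paths (e.g. 'country/capital') in a document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(element, prefix):
        path = f"{prefix}/{element.tag}" if prefix else element.tag
        paths.add(path)
        for child in element:
            walk(child, path)

    walk(root, "")
    return paths


def text_values(xml_text):
    """Return every attribute value and non-empty text node as a set."""
    root = ET.fromstring(xml_text)
    values = set()
    for element in root.iter():
        values.update(element.attrib.values())
        if element.text and element.text.strip():
            values.add(element.text.strip())
    return values


# Structurally, the two documents have no tag path in common.
shared_paths = tag_paths(DOC_COUNTRY) & tag_paths(DOC_NATION)

# Yet they carry exactly the same literal values.
same_values = text_values(DOC_COUNTRY) == text_values(DOC_NATION)

print(shared_paths, same_values)
```

A parser can report that the literal values coincide, but nothing in the XML itself tells it that `country` and `nation`, or the attribute `name` and the element `name`, mean the same thing. That agreement has to come from outside the syntax.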

So: XML ≠ machine accessible meaning

[Figure: a CV marked up with the tags <CV>, <name>, <education>, <work>, <private>; to a machine these tags read as <Χς>, <ναμε>, <εδυχατιον>, <ωορκ>, <πριϖατε>]

W3C Stack

• XML: surface syntax, no semantics
• XML Schema: describes structure of XML documents
• RDF: datamodel for “relations” between “things”
• RDF Schema: RDF Vocabulary Definition Language
• OWL: a more expressive Vocabulary Definition Language

RDF & RDF Schema

• RDF =
  - relations between things
  - all objects are URLs (both things and relations)
• RDF Schema =
  - hierarchical organisation of an RDF vocabulary
  - all things are URLs (classes of things, subclass relations)
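As a rough illustration of both points, the sketch below stores RDF-style statements as plain Python tuples and derives class membership through the subclass hierarchy. The `rdf:type` and `rdfs:subClassOf` URLs are the standard W3C ones; the `http://example.org/` namespace and every resource under it are hypothetical, and a real system would use an RDF library rather than this hand-rolled set of triples.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/"  # hypothetical namespace for this sketch

# Each statement is a (subject, relation, object) triple; note that the
# relations are identified by URLs just like the things they connect.
triples = {
    (EX + "Amsterdam", RDF_TYPE, EX + "CapitalCity"),
    (EX + "Amsterdam", EX + "capitalOf", EX + "Netherlands"),
    (EX + "CapitalCity", RDFS_SUBCLASS_OF, EX + "City"),
    (EX + "City", RDFS_SUBCLASS_OF, EX + "Place"),
}


def superclasses(cls, graph):
    """All classes reachable from cls by following subClassOf links."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if s == current and p == RDFS_SUBCLASS_OF and o not in found:
                found.add(o)
                todo.append(o)
    return found


def types_of(resource, graph):
    """Directly asserted types plus those implied by the class hierarchy."""
    direct = {o for s, p, o in graph if s == resource and p == RDF_TYPE}
    inferred = set()
    for cls in direct:
        inferred |= superclasses(cls, graph)
    return direct | inferred


amsterdam_types = types_of(EX + "Amsterdam", triples)
print(sorted(amsterdam_types))
```

Only one type is asserted for the Amsterdam resource; the other two follow from the RDF Schema hierarchy, which is the kind of conclusion a machine can draw without knowing what a city is.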

The semantic pyramid again

OWL: things RDF Schema can’t do

• equality
• enumeration
• number restrictions
  - single-valued/multi-valued
  - optional/required values
• inverse, symmetric, transitive
• boolean algebra
  - union, complement
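To give a feel for what the inverse, symmetric and transitive property declarations buy, here is a small forward-chaining sketch in plain Python. It is a toy under stated assumptions, not an OWL reasoner: the function `infer`, the property names and the facts are all hypothetical, and an inverse declaration is applied in one direction only.

```python
def infer(facts, symmetric=(), transitive=(), inverse_of=None):
    """Add implied (subject, property, object) facts until nothing new appears."""
    inverse_of = inverse_of or {}
    facts = set(facts)
    while True:
        new = set()
        for s, p, o in facts:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverse_of:
                new.add((o, inverse_of[p], s))
            if p in transitive:
                for s2, p2, o2 in facts:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= facts:  # fixpoint reached
            return facts
        facts |= new


facts = {
    ("Amsterdam", "locatedIn", "Netherlands"),
    ("Netherlands", "locatedIn", "Europe"),
    ("Netherlands", "borders", "Belgium"),
    ("Netherlands", "hasCapital", "Amsterdam"),
}

closure = infer(
    facts,
    symmetric={"borders"},
    transitive={"locatedIn"},
    inverse_of={"hasCapital": "capitalOf"},
)

for fact in sorted(closure - facts):
    print(fact)
```

Each declaration yields one new fact here: Amsterdam is located in Europe, Belgium borders the Netherlands, and Amsterdam is the capital of the Netherlands. None of these can be stated as a rule in RDF Schema alone.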


Sounds good in theory. How far are you with this in practice?

Where are we now: tools

• Languages are stable (W3C)
• Tooling is rapidly emerging
  - HP, IBM, Oracle, Adobe, …
  - parsers
  - editors
  - visualisers
  - large scale storage and querying
  - portal generation

Aduna

Three example use-cases

• Closed-world data integration: DOPE browser @ Elsevier
• Open-world data integration: streaming media @ Philips
• Semantic Web services
• Conclusions

This section joint with Aduna and Anita de Waard @ Elsevier

Closed-world data integration:

Background

• Vertical Information Provision
  - Buy a topic instead of a journal!
  - Web provides new opportunities
• Business driver: drug development
  - Rich, information-hungry market
  - Good thesaurus (EMTREE)

The Data

• Document repositories:
  - ScienceDirect: approx. 500,000 fulltext articles
  - MEDLINE: approx. 10,000,000 abstracts
• Extracted metadata
  - The Collexis Metadata Server: concept-extraction ("semantic fingerprinting")
• Thesauri and ontologies
  - EMTREE:

Architecture:

[Figure: architecture diagram with the components Query interface, RDF Schema, EMTREE, RDF Datasource 1 … RDF Datasource n]

This section material from Zharko Aleksovski @ VU & Philips

Web-based data integration scenario: heterogeneous

Motivating scenario

[Figure labels: consumer.philips.com; User devices; Semantic Web; Music Ontology; Mediator; Providers: iTunes, Wal*Mart, Buy.com, Napster, eMusic, Musicmatch, Rhapsody, MusicNet, MusicNow, LaunchCast]

Example

Evergreens and Golden hits are related: Golden hits is mostly a subclass of Evergreens

Domain characteristics

• Many music providers
• Wide variety of music offered
• Constantly increasing in size and evolving
• Cumbersome to browse and retrieve music
• There is no agreement
  - Different terms are used
  - The same terms contain different sets of artists

Data sources

• CDNow (Amazon.com)
• All Music Guide
• MusicMoz
• ArtistGigs
• Artist Direct Network
• CD baby
• Yahoo

Sizes and depths shown in the figure:

• Size: 96 classes, depth: 2 levels
• Size: 2410 classes, depth: 5 levels
• Size: 382 classes, depth: 4 levels
• Size: 222 classes, depth: 2 levels
• Size: 1073 classes, depth: 7 levels
• Size: 465 classes, depth: 2 levels
• Size: 403 classes, depth: 3 levels

Why approximate matching

• Genre is not precisely defined
• Pop and Rock have no common definition on the big portals AllMusic.com, Amazon.com and MP3.com
• Exact reasoning will not be useful

[Figure labels: A, X, 1%, 99%]
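One way to picture approximate matching is to compare the sets of artists filed under two genre terms and accept a subclass relation when most, rather than all, of one set falls inside the other. The sketch below is an assumption-laden illustration, not the measure used in the work presented here: the overlap ratio, the 0.9 threshold, the function names and the placeholder artist identifiers are all hypothetical.

```python
def subclass_degree(sub_artists, super_artists):
    """Fraction of the artists in sub_artists that also occur in super_artists."""
    if not sub_artists:
        return 0.0
    return len(sub_artists & super_artists) / len(sub_artists)


def approx_relation(a, b, threshold=0.9):
    """Classify the relation between artist sets a and b at a given threshold."""
    a_in_b = subclass_degree(a, b) >= threshold
    b_in_a = subclass_degree(b, a) >= threshold
    if a_in_b and b_in_a:
        return "equivalent"
    if a_in_b:
        return "A subClassOf B"
    if b_in_a:
        return "B subClassOf A"
    return "no relation"


# Placeholder data: 10 artists under one term, 29 under the other, 9 shared.
golden_hits = {f"artist_{i}" for i in range(1, 11)}
evergreens = {f"artist_{i}" for i in range(2, 31)}

degree = subclass_degree(golden_hits, evergreens)
relation = approx_relation(golden_hits, evergreens, threshold=0.9)
exact_subclass = golden_hits <= evergreens

print(degree, relation, exact_subclass)
```

With this placeholder data a single stray artist is enough to make the exact subset test fail, while the approximate test still reports that the first term is (mostly) a subclass of the second.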

Results

[Figure: A = AllMusicGuide, B = ArtistDirectNetwork; axis ticks from 0 to 600000 and from 0.0 to 1.0; series: “B subClass of A”, “A subClass of B”, “equivalences”]

This section material from Marta Sabou @ VU

Semantic Web Services

What are web services?

• A web service is a software system designed to support interoperable machine-to-machine interaction over a network.
• It has an interface described in a machine-processable format (specifically WSDL).
• Other systems interact with a web service in a manner specified by its description, using SOAP messages.

Web Service Tasks

• Web Service Discovery & Selection
  - Find an airline that can fly me to Marina del Rey
• Web Service Invocation
  - Book flight tickets from NWA to arrive 12th Oct.
• Web Service Composition & Interoperation
  - Arrange taxis, flights and hotel for travel from Southampton to Portland, OR, via Marina del Rey, CA.
• Web Service Execution Monitoring
  - Has the taxi to Gatwick Airport been reserved yet?

Limitations of WS Technology

• Manual Discovery
• Manual Invocation
• Manual (ad hoc) Mediation

Use of Semantics: Example

<do:HotelBooking rdf:ID="WS1">
  <owls:hasInput rdf:resource="do:Hotel"/>
</do:HotelBooking>

<do:HostelBooking rdf:ID="WS2">
  <owls:hasInput rdf:resource="do:Hostel"/>
</do:HostelBooking>

R:(BookingService,Hotel) =>
  * exact match with WS1
  * plug-in match for WS2

Degrees of WS Matching

Match Advertisement with Request:

• Exact: Adv equals Req
• Plug-In: Adv is more general than Req
• Subsume: Adv is less general than Req
• Intersection: Adv and Req overlap (a bit)
• Disjoint: Adv and Req don’t overlap

Matchmaking algorithms (primarily) employ subsumption reasoning over the knowledge provided by the domain ontologies.
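The degrees of matching listed above can be sketched with a very small subsumption check over a class hierarchy. Everything concrete here is an assumption made for illustration: the `PARENT` hierarchy, the concept names and the function names are hypothetical, each class has a single parent, and the Intersection degree is left out because a single-parent tree cannot express partial overlap. Concepts that are unrelated in the tree are simply treated as disjoint.

```python
# Hypothetical domain ontology, written as child -> parent.
PARENT = {
    "Accommodation": "Thing",
    "Flight": "Thing",
    "Hotel": "Accommodation",
    "Campsite": "Accommodation",
    "BoutiqueHotel": "Hotel",
}


def ancestors(concept):
    """All concepts that subsume the given one, nearest first."""
    result = []
    while concept in PARENT:
        concept = PARENT[concept]
        result.append(concept)
    return result


def degree_of_match(adv, req):
    """Compare an advertised concept with a requested one."""
    if adv == req:
        return "exact"
    if adv in ancestors(req):
        return "plug-in"   # Adv is more general than Req
    if req in ancestors(adv):
        return "subsume"   # Adv is less general than Req
    return "disjoint"      # assumption: unrelated in the tree means no overlap


request = "Hotel"
advertisements = ["Hotel", "Accommodation", "BoutiqueHotel", "Flight"]
matches = {adv: degree_of_match(adv, request) for adv in advertisements}

for adv, degree in matches.items():
    print(f"{adv}: {degree}")
```

A matchmaker would typically rank advertisements by these degrees and return the best ones; how the degrees are ordered, and whether inputs and outputs are compared in the same direction, differs between matchmaking proposals and is not fixed by this sketch.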

Take home message again:

• Semantic Web = Web of Data (no longer web of text, web of pictures)
• Set of open, stable W3C standards
• Rapidly emerging tools & vendors
• Use cases:
  - data integration
  - web services
  - knowledge management
  - search (intranets)
