
Information for the Semantic Web. Procedures for Data Integration through CIDOC CRM Mapping




Session II: Chair Luc Van Eycken

Encoding Cultural Heritage Information for the Semantic Web

Procedures for Data Integration through CIDOC CRM Mapping

Ø. Eide*, A. Felicetti**, C. E. Ore*, A. D’Andrea*** and J. Holmen*

(2)
(3)

The BABEL tower….

In CH circles the BABEL tower is a metaphor for the lack of communication, or incommunicability, caused by too many languages and different cultures

Languages, Terminology, Thesauri, Standards

(4)

The BABEL tower….

As regards archaeological documentation (forms, reports, etc.) we have different streams:

• National Rules
• Archaeological Backgrounds
• Best Practices
• Standards
• Local Experiences
• Legal Obligations

(5)

The BABEL tower….

A paradoxical situation could be:

"Io sono un preistorico e scavo seguendo le superfici…" (I am a prehistorian and I dig following the surfaces…)

"I'm a classical archaeologist. I dig following inscriptions and texts…"

"Je ne fais pas de fouilles archéologiques... mais je rassemble la documentation suivant le Standard ICOMOS…" (I do not carry out archaeological excavations... but I compile the documentation following the ICOMOS standard…)

(6)

The BABEL tower….

Could technology help us overcome these issues? SURE!!!!

(7)

The BABEL tower….

Human approach: Languages, Terminologies, Thesauri, Standards

+

Legal approach: National Rules, Legal obligations, Standards, Guidelines

+

ICT: Platforms, Program Languages, Formats, Software

(8)

A new challenge

A new challenge for the reconciliation of these disparate sources, stored separately from each other

(we also wanted to try to reconcile different disciplinary groups, separated by cultures as much as by content: methods, theories…)

A SEMANTIC APPROACH…

…the integration of knowledge through ontologies

The use of CIDOC-CRM as a sort of inter-lingua that guarantees the integration of data and schemas among cultural heritage information, archives and repositories

(9)

What is the CIDOC CRM?

• It is not a metadata standard
• It is a Conceptual Reference Model for the analysis and design of cultural information systems
• It does not define the terminology used to document these data structures
• It is intended to explain the logic of what they “do” document

(10)

WP 3.3 Common Infrastructure

As part of EPOCH, a complete framework for the mapping and management of cultural heritage information in a semantic web context has been developed by

• PIN (University of Florence, Italy)
• CISA (University of Naples “L’Orientale”, Italy)
• EDD (University of Oslo, Norway)

AMA (Archive Mapper for Archaeology) is one of the NEWTON projects (NEW TOols Needed)

(11)

Background

BUT!!!

mapping requires skills and knowledge which are uncommon among the cultural heritage

(12)

AMA Project

The aim of the AMA project was to develop tools for semi-automated mapping of cultural heritage data to CIDOC-CRM (ISO 21127:2006).

The reason for this investment was that such tools can enhance interoperability among the different archives and datasets produced in the field of Cultural Heritage.

We created a tool able to extract and encode legacy information coming from diverse sources, to store and manage this information using a semantic-enabled container, and to make it available for query and reuse.

(13)

AMA project: Two key-words

The tool relied upon two concepts:

Mapping = the rules for writing and reconciling different schemas

Template = the instantiation of the abstract mapping between the source data structure and the mapping target (CIDOC-CRM compliant). Templates capture the semantic structure of the sources and their transformations, ensuring the modularity and future maintenance of the system

(14)

A schema for a mapping-tool

[Diagram: a Template links the schema to be mapped to CIDOC-CRM classes and properties.]

Schema to be mapped: US, Site, CompilationEvent, Compiler, YearForm

CIDOC-CRM classes: E31_InformationCarrier, E65_Event, E39_actor, E21_person, E50_date

CIDOC-CRM properties: P94_was created by, P14_carried out by, P70_is documented in

Relation used in the diagram: is a
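The idea of a template as an instantiated mapping can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the AMA implementation: the class and property labels are copied as printed on the slide, while the pairing of each source field with a class, the two links between fields, and the sample record are all hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical template: source field -> CIDOC-CRM class label.
# Labels are copied from the slide; the pairings are assumptions.
TEMPLATE: Dict[str, str] = {
    "US": "E31_InformationCarrier",
    "CompilationEvent": "E65_Event",
    "Compiler": "E39_actor",
    "YearForm": "E50_date",
}

# Hypothetical links between mapped fields, using property labels from the slide.
LINKS: List[Triple] = [
    ("US", "P94_was_created_by", "CompilationEvent"),
    ("CompilationEvent", "P14_carried_out_by", "Compiler"),
]


def apply_template(record: Dict[str, str]) -> List[Triple]:
    """Turn one source record into (subject, predicate, object) triples."""
    triples: List[Triple] = []
    # Type every field value that the template knows about.
    for field, crm_class in TEMPLATE.items():
        if field in record:
            triples.append((record[field], "is_a", crm_class))
    # Connect field values wherever both ends of a link are present.
    for subj, prop, obj in LINKS:
        if subj in record and obj in record:
            triples.append((record[subj], prop, record[obj]))
    return triples


# Made-up record for illustration only.
record = {
    "US": "US_1024",
    "CompilationEvent": "compilation_of_US_1024",
    "Compiler": "compiler_01",
    "YearForm": "2006",
}
for triple in apply_template(record):
    print(triple)
```

Keeping the template as plain data, separate from the function that applies it, mirrors the modularity argument made for templates: the same code can serve a different source schema by swapping the two tables.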

(15)

AMATool

The tool set developed in the AMA project includes:

• A powerful mapping application for the creation of mappings from existing datasets
• A tool for mapping cultural heritage information contained in free text into a CIDOC-CRM compliant data model
• Templates describing relations between the structure of existing archives and CIDOC-CRM
• A semantic framework to store, manage and

(16)

AMATool: partners

• CISA (Italy)
• PIN (Italy)
• UNIREL (Italy)
• CIMEC (Romania)
• IAA (Israel)
• Oxford ArchDigital (UK)
• University of Kent (UK)
• Paveprime Ltd (UK)
• University of Oslo (Norway)
• ROB (Netherlands)
• VARTEC (Belgium)

(17)

Tools developed

AMA

• Current release
• New online application
• Future development
• Software release after EPOCH

MAD

• Review release
• Development after review
• Future development

(18)

AMA: Archive Mapper for Archaeology

Creation of an Open Source tool for mapping existing archaeological datasets to CIDOC-CRM compliant structures

Enhance data interchange and interoperability between existing and future

(19)

AMA: the new online application

A new and powerful online version of the application is under development (PHP)

(20)

AMA: new version features

Possibility to upload both starting schema and target ontology for any kind of mapping (not only towards CIDOC-CRM)

Ontology to ontology mapping capabilities

Advanced mapping features for complex mapping definitions (i.e. creation of new entities and properties to represent implicit elements and relations)

Creation of shortcuts to simplify the mapping process

Generation of mapping templates to be applied to information stored in databases in order to get perfectly converted semantic archives

Graphic visualization of simple and complex relations

(21)

AMA: forthcoming developments

Implementation of a high level mapping language to describe the mappings (under development by Martin Doerr's team in Crete)

Implementation and archiving of mapping templates to be used as mapping starting points

Integration with MAD and other tools for automatic data extraction and storing in a semantic container

(22)

AMA releases after EPOCH

2 versions: Online Application and a stand-alone application

GPL licenses

Supported formats: all XML compliant documents, including semantic grammars (RDF and OWL), and schema formats (XSD/RDFS)

(23)

CIDOC-CRM

(24)

TEI guidelines

A set of guidelines for XML encoding of cultural heritage texts. There are three primary functions of the TEI guidelines:

• guidance for individual or local practice in text creation and data capture;
• support of data interchange;
• support of application-independent local processing.

(25)

From text to CIDOC-CRM

• Free text documents contain useful information
• This information is more useful in a computerised environment if it can be extracted into a well defined format
• This is impossible to do correctly by automatic tools
• This is very time consuming to do manually
• Therefore: A semi-automatic approach

(26)

Amatexttool

• Developed by EDD, University of Oslo
• Based on 14 years of experience with XML content markup of texts
• Tool for semi-automatic semantic enrichment of XML markup
• Creates CIDOC-CRM compliant markup
• Information stored in XML

(27)

What to encode in a text

• Reference to entities (e.g. person names)
• Co-reference in the text (e.g. "he" refers to the same person as the name on the previous line)
• Relations between entities (e.g. family

(28)

Search—choose—tag

• In the tool, a document can be searched for patterns
• The result list is a KWIC concordance
• All or some of the results can be chosen for tagging
• Iterative process: semantic bootstrapping
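The search, choose, tag cycle can be illustrated with a small Python sketch. It is not the Amatexttool code: the function names, the `name` element used for tagging, and the sample sentence are assumptions made for the example. The sketch searches a text with a regular expression, lists the hits as a keyword-in-context (KWIC) concordance, and wraps only the chosen hits in an XML element.

```python
import re
from typing import List, Tuple

# One concordance line: (start offset, left context, match, right context)
Hit = Tuple[int, str, str, str]


def kwic(text: str, pattern: str, width: int = 20) -> List[Hit]:
    """Search the text and return every match with its surrounding context."""
    hits: List[Hit] = []
    for m in re.finditer(pattern, text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((m.start(), left, m.group(0), right))
    return hits


def tag(text: str, hits: List[Hit], chosen: List[int], element: str = "name") -> str:
    """Wrap the chosen hits (given by their index in the hit list) in an element."""
    out = text
    # Work from the end of the text backwards so earlier offsets stay valid.
    for i in sorted(chosen, reverse=True):
        start, _, match, _ = hits[i]
        end = start + len(match)
        out = out[:start] + f"<{element}>{match}</{element}>" + out[end:]
    return out


# Made-up sentence; the pattern looks for a title followed by a capitalised word.
text = "Hr. Nicolaysen visited the mound. Later Hr. Rygh described it."
hits = kwic(text, r"Hr\. [A-Z]\w+")
for start, left, match, right in hits:
    print(f"{left:>20} [{match}] {right}")

# The user accepts only the first hit for tagging.
print(tag(text, hits, chosen=[0]))
```

Running the search again on the tagged output with a refined pattern gives the iterative behaviour the slide calls semantic bootstrapping.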

(29)
(30)

MAD: Managing archaeological data

Application designed to manage structured and unstructured archaeological excavation datasets encoded in XML

Developed using Open Source technologies and entirely based on XML and W3C standards

The core: eXist Native XML Database (http://exist-db.org)

Fully XPath/XQuery and SPARQL aware

Features dynamic XSLT transformation and presentation of documents and query results
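To give a feel for what an XPath query over an XML-encoded excavation dataset looks like, here is a minimal Python sketch. It does not talk to eXist or MAD; it uses the limited XPath subset of the standard library's `xml.etree.ElementTree` as a stand-in, and the element names, attributes and sample records are invented for the example.

```python
import xml.etree.ElementTree as ET
from typing import List

# Invented sample dataset; the structure is an assumption, not MAD's schema.
SAMPLE = """<excavation site="ExampleSite">
  <unit id="US1" type="layer"><description>Dark soil</description></unit>
  <unit id="US2" type="cut"><description>Pit</description></unit>
  <unit id="US3" type="layer"><description>Ash deposit</description></unit>
</excavation>"""


def units_of_type(xml_text: str, unit_type: str) -> List[str]:
    """Return the ids of all stratigraphic units with the given type attribute."""
    root = ET.fromstring(xml_text)
    # ElementTree supports attribute predicates such as [@type='layer'].
    return [u.get("id") for u in root.findall(f".//unit[@type='{unit_type}']")]


def descriptions(xml_text: str) -> List[str]:
    """Return the text of every description element, in document order."""
    root = ET.fromstring(xml_text)
    return [d.text for d in root.findall(".//unit/description")]


print(units_of_type(SAMPLE, "layer"))
print(descriptions(SAMPLE))
```

A native XML database would evaluate full XPath and XQuery on the server side over indexed documents; the sketch only shows the shape of the query.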

(31)

Data structure of MAD

MAD

XML documents indexed and stored in a

file-system-like structure of folders and

(32)

MAD: Testing the application

First release of MAD tested for the EPOCH Partners’ XML database (http://partners.epoch-net.org:8080/exist/partners/index.xml)

... and for an XML database of relevant European and Italian projects (http://www.epoch-net.org:8080/exist/projects/index.xml)

(33)

MAD: The SAD extension

Implementation of the RDF language

Full SPARQL and RQL query languages support

Semantic Browser: intuitive set of interfaces to navigate semantic models

Full CIDOC-CRM compliancy

Presented at the last EPOCH review in July 2007
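Querying an RDF store comes down to matching triple patterns, which a short Python sketch can show. This is a plain-Python stand-in and not SPARQL, RQL or the SAD code: the `match` function, the triples and the identifiers in them are assumptions made for illustration. Leaving an argument as `None` plays the role of a query variable.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Invented triples; property labels follow the style used earlier in the slides.
TRIPLES: List[Triple] = [
    ("US_1024", "P94_was_created_by", "compilation_of_US_1024"),
    ("compilation_of_US_1024", "P14_carried_out_by", "compiler_01"),
    ("compilation_of_US_2048", "P14_carried_out_by", "compiler_02"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every position that is not None."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "Who carried out which event?": subject and object are left open.
for subject, _, obj in match(TRIPLES, p="P14_carried_out_by"):
    print(subject, "->", obj)
```

A real query language adds joins across several patterns, filters and namespaces on top of this basic operation.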

(34)

MAD: Real Life applications

Archaeological excavation dataset of Cuma containing information on stratigraphical units and other related resources, converted from Syslat (HyperCard) and managed using MAD

The MAD framework used for the creation of an online application for the complete management of coin collections for the COINS Project (http://www.coins-project.eu)

Some features of MAD will be used for the creation of a wide XML repository of ancient Iberian pottery (University of Jaen - Spain)

(35)

MAD: The future

Full integration of GML geospatial features within the CIDOC-CRM framework

Native XML support of MAD can be used in the field of digital preservation for setting up annotations repositories and creating co-reference resolution services

Possible COLLADA/X3D support for 3D objects represented in XML-based formats

AJAX integration and XForms support for easy web applications development and complex XML information deployment and integration

(36)

MAD releases

2 versions: Online Application and Downloadable Version to be used as a stand-alone application or integrated in a client-server context (i.e. using distributed archives)

GPL License

Supported formats: all XML compliant documents, including semantic grammars (RDF and OWL), graphic formats (SVG), geographic formats (GML) and 3D formats (COLLADA/X3D)
