
Module 2:

XML Schema Definition

a.Univ.-Prof. Dr. Werner Retschitzegger

Lecture "IFS in Bioinformatics", summer semester 2011

Johannes Kepler University Linz, www.jku.ac.at

Institute of Bioinformatics, www.bioinf.jku.at

Information Systems Group (IFS), www.ifs.uni-linz.ac.at

© 2011 JKU Linz, Institut für Bioinformatik, Arbeitsgruppe Informationssysteme (IFS)

Outline

• Introduction
  - Motivation for XML
  - Document Markup Languages
  - Application Areas for XML
• XML 1.0
• Namespaces
• XML Schema

The following slides are based (among others) on:

• Elliotte Rusty Harold, W. Scott Means, XML in a Nutshell: A Desktop Quick Reference, 3rd Edition, O'Reilly & Associates, 2005

Motivation for XML

1/5

From HTML to XML

"If I invent another programming language, its name will contain the letter X."
(N. Wirth, Software Pioniere Konferenz, Bonn 2001)

Google Indicator (hit counts as of Sep/16/08):

Search term               Hits
SQL                       223 million
ABC                       252 million
"Werner Retschitzegger"   20.6 thousand
Soccer                    237 million
XML                       603 million
Love                      2.2 billion

Motivation for XML

2/5

From HTML to XML

Brian Kernighan: "The problem with HTML-WYSIWYG is that what you see is all you've got"

• HTML (HyperText Markup Language) is the "lingua franca" for representing hypertext documents on the Web
• Originated in 1989 at CERN (Tim Berners-Lee); later standardized by the W3C (World Wide Web Consortium)
• Basic concept: "markup" in terms of "tags"
• Drawbacks
  - Restricted number of pre-defined tags
    - leads to continual extensions with proprietary tags
  - Tags primarily describe layout aspects

Motivation for XML

3/5

From HTML to XML

HTML describes the layout of content:

    <h1>PDACatalog</h1>
    <h2>Nokia 8210</h2>
    <table border="1">
      <tr> <td>Battery</td><td>900mAh</td> </tr>
      <tr> <td>Weight</td><td>141g</td> </tr>
      …
    </table>

XML describes the structure and semantics of content:

    <PDACatalog>
      <Producer name="Nokia">
        <PDA name="8210">
          <Battery>900mAh</Battery>
          <Weight>141g</Weight>
          …
        </PDA>
      </Producer>
    </PDACatalog>

Tim Bray, Co-Editor of XML 1.0: "XML will become the ASCII of the 21st century - basic, essential, unexciting"
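Because the XML variant names the structure of the content, a program can address values by element and attribute name instead of by table position. The following is a minimal sketch with Python's standard xml.etree.ElementTree; the function name pda_weights is an illustrative assumption, and the parts elided with "…" on the slide are simply left out.

```python
import xml.etree.ElementTree as ET

# The XML variant from the slide, without the elided "…" parts.
CATALOG_XML = """
<PDACatalog>
  <Producer name="Nokia">
    <PDA name="8210">
      <Battery>900mAh</Battery>
      <Weight>141g</Weight>
    </PDA>
  </Producer>
</PDACatalog>
"""

def pda_weights(xml_text):
    """Map 'producer model' to its weight by following the element structure."""
    root = ET.fromstring(xml_text)
    weights = {}
    for producer in root.findall("Producer"):
        for pda in producer.findall("PDA"):
            key = f"{producer.get('name')} {pda.get('name')}"
            weights[key] = pda.findtext("Weight")
    return weights

print(pda_weights(CATALOG_XML))
```

With the HTML variant, the same lookup would have to rely on the position of table cells, which breaks as soon as the layout changes.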

Motivation for XML

4/5

Features of XML

• Layout independence
  - Separation of structure and semantics of the content from its layout
• Platform and vendor independence
  - Endorsed by the W3C
• Internationality
  - Based on the Unicode standard
• Extensibility
  - Tags can be defined and named arbitrarily (meta language)
• Structurability
  - Tags can be nested arbitrarily
• Semi-structured
  - Content can contain fully structured parts and fully unstructured parts
• Self-describing
  - Tags describing structure and semantics of the content are
    - for humans: relatively easy to read and edit
    - for machines: easy to generate and parse
• X-Technology infrastructure
  - W3C provides a set of XML-based standards, the "XML Standards Family"
• Correctness proof

Motivation for XML

5/5

Properties of XML Documents and XML Processors

• Well-formedness
  - Syntactical properties, e.g.:
    - At least 1 tag per document
    - Exactly 1 root tag
    - Tags must not overlap
    - Each tag has to have an end tag
    - ...
• Validity
  - XML document is well-formed and corresponds to a schema
  - Schema defines vocabulary and grammar
  - Alternatives: DTD or XML Schema standard
• XML processors parse XML documents and check
  - either solely well-formedness (non-validating processors)
  - or also validity (validating processors)
• Can be called from within an application (e.g., browser)
• Decompose an XML document into its parts forming a tree, which allows the application to access these parts

[Figure: the XML processor (parser and entity manager) reads the XML document PDACatalog1.XML, its entities, and Catalog.DTD, and passes document parts and errors to the application]
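The well-formedness rules can be tried out with a non-validating processor. The sketch below uses Python's standard xml.etree.ElementTree, which checks well-formedness only and does not validate against a DTD or schema; the helper name is_well_formed and the sample strings are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if a non-validating parse of xml_text succeeds."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# One sample per rule listed on the slide.
samples = {
    "one root, properly nested": "<a><b/></a>",
    "overlapping tags": "<a><b></a></b>",
    "missing end tag": "<a><b></a>",
    "two root elements": "<a/><b/>",
}
for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Checking validity in addition would need a validating processor, which the standard library module used here does not provide.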

Document Markup Languages

1/4

History

• Vannevar Bush, 1945: Memex
• Douglas Engelbart, 1962: Augment
• Ted Nelson, 1965: Xanadu
• William Tunnicliffe (GCA), 1967: GenCode
• Goldfarb, Mosher, Lorie (IBM), 1969: GML (Generalized Markup Language)
• ANSI (Charles Goldfarb), 1978: standardization (GenCode & GML)
• ISO, 1986: SGML (Standard Generalized Markup Language, ISO 8879)
• Tim Berners-Lee (CERN), 1989: HTML (Hypertext Markup Language)
• Marc Andreessen (NCSA), 1993: HTML forms (XMosaic)
• Netscape, Microsoft, 1994: HTML derivations
• Jon Bosak, Tim Bray, James Clark et al. (W3C), 1996: XML Working Group
• 10 Feb. 1998: XML 1.0

Document Markup Languages

2/4

Memex

http://www.ps.uni-sb.de/~duchier/pub/vbush/vbush-all.shtml

Document Markup Languages

3/4

[Figure: levels of markup languages, following the metamodel levels of www.omg.org]

• M2, meta level: SGML, XML
• M1, language level (e.g. DTDs): HTML, XHTML, MathML, WML
• M0, instance level (documents): e.g. MathML documents for the formulas $e^{i\pi} + 1 = 0$ and $f(n) = \sum_{k=1}^{n} k$

Document Markup Languages

4/4

XML versus ...

... SGML
  - XML vs. SGML (60 pages vs. 600 pages)
  - XML has 20% of SGML's complexity, but 80% of its functionality
  - XML documents conform to an ISO revision of SGML: WebSGML (annex to the SGML standard ISO 8879)

... HTML
  - XML is complementary to HTML (semantics and structure vs. layout)
  - XML is not backward compatible with HTML
  - Simple conversion from HTML documents to XML

... XHTML
  - = Extensible HTML
  - W3C Recommendation Aug. 2002 (2nd edition)
  - HTML 4.01 as an "XML application", i.e. HTML was described by means of an XML DTD

Application Areas of XML

1/4

Three Main Application Areas

• Data Exchange ("Portable Data")
  - Using XML solely as an exchange format or
  - Using also a common schema
• Multi-Delivery
  - One and the same content can be delivered to different end user devices
• Intelligent Retrieval
  - Instead of a simple keyword search on the basis of HTML documents, structure-based search on the basis of XML documents
  - Example: "Mozart": composer or chocolate ball?

Application Areas of XML

2/4

Industrial Sectors: "Verticalisation of XML"

XML-DTDs for ... [http://www.oasis-open.org/cover/xml.html#applications]

• Literature: "Gutenberg"
• Travel: "openTravel"
• News: "NewsML"
• Marketing: "adXML"
• Weather: "OMF"
• Human Resources: "XML-HR"
• Voice Applications: "VoxML"
• Vector Graphics: "SVG"
• Mobile Applications: "WML"
• Geo Applications: "ANZMETA"
• Health Care: "HL7"
• Mathematics: "MathML"
• Banking: "MBA"
• eGovernment: "eGovML"
• Electronic Commerce
  - CBL: Common Business Library (Commerce One)
  - BizTalk: Microsoft
  - cXML: Commerce XML
  - RosettaNet: format for online orders
  - ebXML: OASIS + XML/EDI
  - FnXML: Financial Products Markup Language
• ...

Application Areas of XML

3/4

Sources of XML Data

• Inter-application and mobile devices communication data
  - e.g., Web Services
• Logs and Blogs
  - e.g., RSS
• Metadata
  - e.g., Schema, WSDL, XMP
• Presentation data
  - e.g., XHTML
• Documents
  - e.g., Word
• Views of other sources of data
  - e.g., Relational, LDAP, CSV, Excel, etc.

Application Areas of XML

4/4

XML Standardization Family (excerpt)

• XML
  - XML language concepts incl. DTD
• XML Namespaces
  - Support of a global identification scheme for element names and attribute names
• XPath (XML Path Language)
  - Path expressions for navigation in XML documents
• XML Schema
  - XML-based language for the definition of XML schemata
• XLink, XPointer
  - XML-based languages for linking (parts of) XML documents
• XSL (Extensible Stylesheet Language)
  - XSLT: transformation of XML documents (declarative)
  - XSL-FO: rendering of XML documents (declarative)
• DOM (Document Object Model)
  - API for accessing XML documents in a procedural manner

W3C standardization levels: (1) Note, (2) Working Draft (WD), (3) Candidate Recommendation (CR), (4) Proposed Recommendation (PR), (5) Recommendation (REC)

"It takes ten minutes to understand (base) XML, but then ten months to understand the new technologies hung around it." (Peter Chen)

Outline

• Introduction
• XML 1.0
  - XML Document
  - DTD
  - Entities
• Namespaces
• XML Schema

XML Document

1/3

Running Example: PDACatalog

PDACatalog1.XML:

    <?xml version="1.0" encoding="UTF-8"?>
    <PDACatalog>
      <!-- NOKIA -->
      <Producer name="NOKIA">
        <ProducerNo no="h1234"/>
        <PDA name="7110">
          <Weight>141g</Weight>
          <Price contract="yes">999</Price>
          <Price contract="no">4999</Price>
        </PDA>
        <PDA name="8210">
          ...
        </PDA>
      </Producer>
    </PDACatalog>

Parts labeled in the figure:

• Prologue (optional) with the "xml declaration": <?xml version="1.0" encoding="UTF-8"?>
• "Root element" or "document element": <PDACatalog>
• Comment: <!-- NOKIA -->
• Start tag and end tag: <Producer name="NOKIA"> ... </Producer>
• Element name, attribute, attribute value: e.g. Producer, name, "NOKIA"
• "Empty element": <ProducerNo no="h1234"/>
• Subelement: e.g. <PDA> within <Producer>
• Text ("character data"): e.g. 141g
• "Element content" of <Producer>
• "Mixed content"
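To see how a processor decomposes the running example into a tree, the following sketch walks the document with Python's standard xml.etree.ElementTree. The helper name describe is an assumption for illustration, and the second PDA (elided with "..." on the slide) is left out. Note that the default ElementTree parser drops comments, so the NOKIA comment does not appear in the tree.

```python
import xml.etree.ElementTree as ET

PDA_CATALOG = """<?xml version="1.0" encoding="UTF-8"?>
<PDACatalog>
  <!-- NOKIA -->
  <Producer name="NOKIA">
    <ProducerNo no="h1234"/>
    <PDA name="7110">
      <Weight>141g</Weight>
      <Price contract="yes">999</Price>
      <Price contract="no">4999</Price>
    </PDA>
  </Producer>
</PDACatalog>
"""

def describe(element, depth=0):
    """Return one line per element: name, attributes, and character data."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag
    if element.attrib:
        line += " " + str(dict(element.attrib))
    if text:
        line += " text=" + repr(text)
    lines = [line]
    for child in element:
        lines.extend(describe(child, depth + 1))
    return lines

# Bytes are passed so that the encoding declaration is honored by the parser.
root = ET.fromstring(PDA_CATALOG.encode("utf-8"))
print("\n".join(describe(root)))
```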

XML Document

2/3

Elements and Attributes

• Element and attribute names have to be valid "XML names"
  - [ letter | _ | : ] [ letter | '0..9' | '.' | '-' | '_' | ':' ]*
  - "letter": A-Z, a-z, and others like ä, ê, ς
  - ':' reserved for namespaces
  - No length restriction
  - Case-sensitive
• Empty elements can be represented in long form or short form
  - <ProducerNo no="h1234"></ProducerNo> or
  - <ProducerNo no="h1234"/>
• Attribute values must be enclosed in quotation marks
  - <PDA name='8210'> or
  - <PDA name="8210">
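The name grammar above can be approximated by a regular expression. This is a simplified sketch, not the full Name production of the XML 1.0 specification: "letter" is approximated by Python's Unicode-aware letter class, and the names XML_NAME and is_xml_name are illustrative assumptions.

```python
import re

# Simplified pattern following the slide's grammar:
#   [ letter | _ | : ] [ letter | digit | . | - | _ | : ]*
# "letter" is approximated by [^\W\d_] (word characters minus digits and "_").
XML_NAME = re.compile(r"(?:[^\W\d_]|[_:])(?:[^\W\d_]|[0-9._:\-])*")

def is_xml_name(candidate):
    """Return True if candidate matches the simplified XML name grammar."""
    return XML_NAME.fullmatch(candidate) is not None

for name in ["ProducerNo", "_id", "edi:price", "8210", "-x", "PDA name"]:
    print(repr(name), "->", is_xml_name(name))
```

Names are compared case-sensitively, so ProducerNo and producerno would be two different element names.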

XML Document

3/3

Comments

• Can stretch across multiple lines
  - Between start tag and end tag of an element
  - Before or after the root element
• Restrictions
  - Comment within a tag not allowed
  - Nesting of comments not allowed
  - "--" within a comment not allowed

    <!-- A comment may also comprise <tagNames> or &entities;
    --> ...

DTD

1/8

Purpose and Characteristics

• A DTD defines vocabulary and grammar for a set of XML documents
• An XML document is allowed to reference a single DTD only ("document type declaration", DOCTYPE)
• A DTD has to be referenced
  - AFTER the prologue (XML declaration)
  - but BEFORE the root element
• A DTD does NOT DEFINE the root element of an XML document
  - The root element is rather defined within the XML document itself using the DOCTYPE declaration
  - Can be an arbitrary element of the DTD

PDACatalog1.XML (usage of the root element; its definition is in Catalog.DTD):

    <?xml version="1.0"?>
    <!DOCTYPE PDACatalog ...
    <PDACatalog>
    ...

DTD

2/8

Incorporating DTDs into XML Documents: 3 Alternatives

1. External DTD, i.e., a dedicated file (*.dtd) identified by means of a URI ("external subset")

    <!DOCTYPE PDACatalog SYSTEM "Catalog.dtd">

2. Internal DTD, i.e., defined within the XML document ("internal subset")

    <!DOCTYPE PDACatalog […]>

3. External & internal DTD, i.e., internal complements external

• Excursus: URL vs. URI
  - A URL (Uniform Resource Locator) identifies Internet resources on the basis of their location using the Domain Name Service (DNS)
  - A URI (Uniform Resource Identifier) identifies arbitrary resources on the basis of their names (e.g. ISBN) or other properties of the resource
  - Each URL is a valid URI
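A processor exposes the document type declaration to the application. The sketch below reads the root element name and the system identifier of an external DTD reference with Python's standard xml.dom.minidom; it assumes, as is the default behavior of this parser, that the external file Catalog.dtd is not fetched, so only the declaration itself is inspected.

```python
from xml.dom import minidom

DOC = """<?xml version="1.0"?>
<!DOCTYPE PDACatalog SYSTEM "Catalog.dtd">
<PDACatalog/>
"""

doc = minidom.parseString(DOC)
doctype = doc.doctype

# The DOCTYPE names the root element and points to the external subset.
print("declared root element:", doctype.name)
print("external subset (system identifier):", doctype.systemId)
print("actual root element:", doc.documentElement.tagName)
```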

DTD

3/8

Example: Catalog.dtd

    <!-- Catalog DTD Version 1.0 -->
    <!ELEMENT PDACatalog (Producer*)>
    <!ELEMENT Producer (ProducerNo, PDA+)>
    <!ATTLIST Producer name CDATA #REQUIRED>
    <!ELEMENT ProducerNo EMPTY>
    <!ATTLIST ProducerNo no ID #REQUIRED>
    <!ELEMENT PDA (Weight, Price+)>
    <!ATTLIST PDA name CDATA #REQUIRED>
    <!ELEMENT Weight (#PCDATA)>
    <!ELEMENT Price (#PCDATA)>
    <!ATTLIST Price contract (yes|no) "no">

[Figure: the XML DTD shown as a UML class diagram. PDACatalog contains * Producer (attribute name); Producer contains 1 ProducerNo (attribute no) and 1..* PDA (attribute name); PDA contains 1 Weight and 1..* Price (attribute contract).]

Legend (classes are XML elements, class attributes are XML attributes, associations mean part-of):

  1    : exactly once
  1..* : once or several times
  *    : 0 or several times

DTD

4/8

Element Declaration

    <!ELEMENT elementName (content model)>

• Sequence
    <!ELEMENT Producer (ProducerNo, PDA+)>
• Alternative
    <!ELEMENT Battery (LiIo | NiMh | NiCd)>
• Cardinality
  - Optional (0 or once)
      <!ELEMENT PDA (Comment?)>
  - Zero or several times
      <!ELEMENT PDACatalog (Producer*)>
  - Once or several times
      <!ELEMENT Producer (PDA+)>
• Content model can be nested by means of parentheses
    <!ELEMENT div1 (head, (p | list | note)*, div2*)>
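Content models are regular expressions over child element names: "," is concatenation, "|" is alternation, and "?", "*", "+" are the usual cardinality operators. The sketch below makes this concrete by hand-translating three content models from Catalog.dtd into Python regular expressions. It is an illustration of the idea, not a DTD validator; the names CONTENT_MODELS and children_match are assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models; each child name is followed by a blank
# so that adjacent names cannot run into each other.
CONTENT_MODELS = {
    "PDACatalog": r"(Producer )*",      # (Producer*)
    "Producer": r"ProducerNo (PDA )+",  # (ProducerNo, PDA+)
    "PDA": r"Weight (Price )+",         # (Weight, Price+)
}

def children_match(element):
    """Check the sequence of child element names against the content model."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return True  # no model recorded for this element in the sketch
    names = "".join(child.tag + " " for child in element)
    return re.fullmatch(model, names) is not None

good = ET.fromstring('<Producer name="NOKIA"><ProducerNo no="h1234"/><PDA name="7110"/></Producer>')
wrong_order = ET.fromstring("<Producer><PDA/><ProducerNo/></Producer>")
print(children_match(good), children_match(wrong_order))
```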

DTD

5/8

Element Declaration

• Empty Element
  - Element may contain attributes, but neither text nor subelements
      <!ELEMENT ProducerNo EMPTY>
• Element Content
  - Element contains subelements and optional attributes but no text
      <!ELEMENT PDACatalog (Producer*)>
• Mixed Content
  - Element contains text and optional subelements or attributes
      <!ELEMENT Price (#PCDATA)>
      <!ELEMENT Price (#PCDATA | Category | Discount)*>
• Element with arbitrary content (content model ANY)
  - Content not exactly specified in DTD
  - Used elements have to be declared anyway

DTD

6/8

Attribute Declaration

    <!ATTLIST elementName
        attributeName1 type default
        attributeName2 type default
        ...
    >

• Attribute names must be unique within an element
• Default specifications
  - NOT NULL:       #REQUIRED
  - Optional value: #IMPLIED
  - Default value:  [#FIXED] "value"

DTD

7/8

Attribute Declaration: 10 Types

• CDATA
  - String
  - <!ATTLIST Producer name CDATA #REQUIRED>
• ID, IDREF(S)
  - ID ensures uniqueness of attribute values within a document
  - Only 1 attribute of type ID is allowed per element
  - IDREF is a reference to an attribute of type ID
  - "Referential integrity" (untyped!) is checked by the XML processor
  - Values of ID and IDREF(S) attributes must be valid XML names, i.e., leading digits are not allowed

    <!ATTLIST Example
        identity ID #IMPLIED
        ...
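The checks a validating processor performs for ID and IDREF attributes can be sketched by hand. Python's standard ElementTree does not read attribute types from a DTD, so the sketch below is told explicitly which attribute names play the ID and IDREF roles. The function check_id_refs, the Order element, and its producer attribute are hypothetical and not part of the lecture's example.

```python
import xml.etree.ElementTree as ET

def check_id_refs(root, id_attrs, idref_attrs):
    """Return (duplicate IDs, dangling references) for the given attribute names."""
    seen, duplicates = set(), []
    for element in root.iter():
        for attr in id_attrs:
            value = element.get(attr)
            if value is None:
                continue
            if value in seen:
                duplicates.append(value)  # violates uniqueness of ID values
            seen.add(value)
    dangling = [
        element.get(attr)
        for element in root.iter()
        for attr in idref_attrs
        if element.get(attr) is not None and element.get(attr) not in seen
    ]
    return duplicates, dangling

# Hypothetical document: "no" acts as ID, "producer" as IDREF.
DOC = """
<PDACatalog>
  <Producer name="NOKIA"><ProducerNo no="h1234"/></Producer>
  <Producer name="ACME"><ProducerNo no="h1234"/></Producer>
  <Order producer="h9999"/>
</PDACatalog>
"""
print(check_id_refs(ET.fromstring(DOC), id_attrs=["no"], idref_attrs=["producer"]))
```

The check is untyped in the same sense as on the slide: an IDREF may point to any ID in the document, regardless of the element that carries it.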

DTD

8/8

Attribute Declaration – 10 Types

• Enumeration type
  - A pre-defined set of values consisting of XML name tokens
  - <!ATTLIST Price contract (yes|no) "no">
• ENTITY, ENTITIES
  - Attribute value is the name of a declared non-parsed entity
  - <!ATTLIST Image filename ENTITY #REQUIRED>
• NMTOKEN(S)
  - "XML name tokens" are an extended form of XML names
  - In addition, they can start with "0..9", "." and "-"
  - <!ATTLIST journal year NMTOKEN #REQUIRED>
• NOTATION
  - Attribute value is the name of a declared notation (seldom used)
  - <!ATTLIST image type NOTATION (gif | tiff) #REQUIRED>

Entities

1/9

Overview

[Figure: classification of entities]

• General entities: usage in XML documents
  - Pre-defined: replacement of XML-specific characters
  - Unicode: replacement of non-ASCII characters
  - User-defined: replacement of document parts
    - Internal (embedded)
    - External (file): parsed or non-parsed
• Parameter entities: usage in DTDs
  - Internal
  - External

• Referenceable, named parts of
  - XML documents (plain text, markup or other arbitrary formats)
  - or a DTD
• Purpose: character replacement, macros, modularisation

Entities

2/9

Pre-defined Entities

• Purpose: representation of XML-specific characters
  - e.g. < > ("escaping")
• 5 pre-defined entities
  - &amp;   &  (ampersand)
  - &lt;    <  (less than)
  - &gt;    >  (greater than)
  - &apos;  '  (apostrophe)
  - &quot;  "  (quotation mark)
• Example
  - <formular>x &lt; y</formular>
• Usage
  - As element value or attribute value
• Alternative: CDATA section
  - Example: <formular>x <![CDATA[<]]> y</formular>
  - Content is interpreted as plain text, NOT as markup
  - Within CDATA only its end is recognized (']]>')
  - CDATA sections cannot be nested
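The two escaping alternatives yield the same character data after parsing. The following sketch uses Python's standard library to show this: xml.sax.saxutils.escape produces the pre-defined entities when writing, and xml.etree.ElementTree resolves both the entity and the CDATA section when reading. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Writing: escape() replaces &, < and > by default; further characters
# can be mapped through the optional dictionary.
escaped = escape("x < y & y > z")
escaped_quotes = escape("'single' \"double\"", {"'": "&apos;", '"': "&quot;"})
print(escaped)
print(escaped_quotes)

# Reading: the entity reference and the CDATA section give the same text.
with_entity = ET.fromstring("<formular>x &lt; y</formular>")
with_cdata = ET.fromstring("<formular>x <![CDATA[<]]> y</formular>")
print(with_entity.text == with_cdata.text)
```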

Entities

3/9

Unicode ("Character Encoding") Entities

• Purpose
  - Representation of characters not available on the keyboard
  - http://www.unicode.org/
• Unicode classifies characters into letters, numbers, punctuation, symbols (general, technical, mathematical), etc.
  - Unique assignment of characters to numbers
  - Supports 25 living languages (Cyrillic, Hebrew, Hiragana, ...)
  - All in all approx. 50,000 different characters
• Usage
  - As element value or attribute value
  - Arbitrary Unicode characters are referenced via their numbers (decimal or hexadecimal)

&#251; &#xFB; and û all represent the same character
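That a decimal and a hexadecimal character reference denote the same character can be verified with a short sketch using Python's standard xml.etree.ElementTree; the element name c is an arbitrary choice for illustration.

```python
import xml.etree.ElementTree as ET

# 251 (decimal) and FB (hexadecimal) are the same Unicode code point.
element = ET.fromstring("<c>&#251;&#xFB;</c>")
print(element.text, [hex(ord(ch)) for ch in element.text])
```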

Entities

4/9

User-Defined Internal Entities

• Text or well-formed markup is associated with a name
• Declaration within the DTD:
    <!ENTITY entityName "replacementText or Markup">
• Usage:
    &entityName;
  - As element value or attribute value of the XML document
  - In entities themselves, but cyclic references are forbidden
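The expansion of internal entities can be observed with Python's standard xml.etree.ElementTree, whose parser expands entities declared in the internal subset. The sketch below declares one text entity and one markup entity; the entity names are taken from the lecture's later example, while the choice of PDA as document element is an assumption to keep the document self-contained. Entity expansion should only be used on trusted input.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE PDA [
  <!ENTITY linkNokia "http://www.nokia.de/8210">
  <!ENTITY address "<town>Linz</town>">
]>
<PDA name="8210">
  <ProducerInfo>&linkNokia;</ProducerInfo>
  &address;
</PDA>
"""

root = ET.fromstring(DOC)
# The text entity becomes character data, the markup entity a subelement.
print(root.findtext("ProducerInfo"))
print(root.findtext("town"))
```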

Entities

5/9

User-Defined External Parsed Entities

• Purpose
  - Decomposition of the XML document (similar to the SSI, Server Side Include, mechanism)
  - Because of the document's size or for reuse
• Declaration within the DTD
• Characteristics
  - In principle well-formed, but may contain multiple root elements
  - Reference to a DTD not allowed
• Usage
  - Syntax analogous to internal entities
  - As element values of the XML document and within entities themselves
  - Cyclic references forbidden
  - NOT within attribute values

Entities

6/9

User-Defined External Non-Parsed Entities

    <!ENTITY entityName SYSTEM "URI" NDATA formatName>
    <!NOTATION formatName SYSTEM "URI">

• Purpose
  - References to files with arbitrary formats, e.g. ASCII, not well-formed XML, GIF, JPEG, QuickTime movies
  - NDATA defines a "non-parsed" entity and specifies an arbitrary file format
  - A NOTATION declaration is necessary to identify a corresponding application (via a URI) which is able to process files of this format
• Usage
  - Only as attribute value of type ENTITY
  - Syntax: entity name within quotation marks (Note: NO &...;)
  - The processor only informs the application that a non-parsed entity exists at a certain location; no expansion!
• (More expressive) alternative: W3C's XLink standard

Entities

7/9

User-Defined Entities: Example

    <?xml version="1.0"?>
    <!DOCTYPE PDACatalog SYSTEM "Catalog.dtd" [
      <!ENTITY linkNokia "http://www.nokia.de/8210">
      <!ENTITY address "<town>Linz</town>">
      <!ENTITY features SYSTEM "feat8210.XML">
      <!ENTITY bildNokia SYSTEM "/pictures/8210.jpg" NDATA jpeg>
      <!NOTATION jpeg SYSTEM "image/jpeg">
      <!ATTLIST Image filename ENTITY #REQUIRED>
    ]>
    <PDA name="8210">
      <Picture><Image filename="bildNokia"/></Picture>
      <ProducerInfo>&linkNokia;</ProducerInfo>
      &features;
      &address;
    </PDA>

Declaration:
• internal: linkNokia, address
• external, parsed: features
• external, non-parsed: bildNokia

Usage:
• as element value: &linkNokia;, &features;, &address;
• as attribute value: filename="bildNokia"

Entities

8/9

Parameter Entities

Internal:

    <!ENTITY % Battery "(type, capacity)" >
    <!ELEMENT PDABatt %Battery;>
    <!ELEMENT camcorderBatt %Battery;>

External:

    <!ENTITY % linkNokia SYSTEM "http://nokia.de" >
    %linkNokia;

• Purpose
  - Modularization of DTDs
• Syntactical difference to general entities
  - "% " (percent sign followed by a blank) in the declaration
  - "%" (no blank) in usage
• Definition of ...
  - Name and content model of elements
  - Attribute declarations

Entities

9/9

Parameter Entities: Overriding

External DTD:

    <!ENTITY % residental_content "address,rooms">

Internal DTD of an XML document:

    <!ENTITY % residental_content "address,rooms,baths">

• A parameter entity defined within an external DTD can be arbitrarily overridden within the internal DTD of an XML document
• This makes it possible to adapt the external DTD to the requirements of single XML documents without having to change the external DTD
• Thus, the parameter entity is used as a kind of ...

Outline

• Introduction
• XML 1.0
• Namespaces
• XML Schema

Namespaces

1/5

• An XML namespace (NS) allows a unique global identification of elements and attributes
  - W3C REC "Namespaces in XML", 14 Jan. 1999 (13 pages)
• For this, elements and attributes of a domain (e.g. MathML) are assigned to one or more NS
  - XSL, e.g., uses different namespaces for XSLT and XSL-FO
• An NS is represented by a URI
  - The URI need not directly refer to the corresponding vocabulary
  - Thus, it provides a level of indirection that decouples the location of the vocabulary from the unique identifier, the URI
• The associated elements and attributes have to be qualified by means of this URI when used, which makes them globally unique
  - This allows the reuse and especially the combination ...

Namespaces

2/5

NS with Prefix vs. Default NS

• BUT: URIs cannot be used for direct qualification
  - URIs normally contain characters which are not allowed as part of valid XML names (e.g., "/", "&")
  - Instead, user-defined prefixes have to be used
• One or more NS are declared by means of the pre-defined attribute xmlns
  - This attribute can be defined in the context of any element of the DTD
  - The name of the element where the NS is declared, as well as its direct and indirect subelements and attributes, can be qualified with the NS ("NS inheritance")
• Default NS
  - Also declared via the pre-defined attribute xmlns, BUT only 1 per element, and without declaring any prefix
  - Unqualified subelements are automatically associated with the default NS, attributes are NOT
  - Can be overridden within subelements

Namespaces

3/5

Declaration and Usage

    ...
    <edi:HC
        xmlns:edi='http://ecommerce.org/schema'
        xmlns='http://www.mobildev.com/schema'>
      <model name="8210">
        <edi:price edi:units='Euro'>32.18</edi:price>
        <price währung='USD'>25.16</price>
        ...
      </model>
      ...
    </edi:HC>

Annotations: xmlns is the pre-defined attribute for NS declaration; edi is the (optional) NS prefix; the attribute value is the URI of the NS; xmlns without a prefix declares the default NS.

• The NS of the element edi:price is http://ecommerce.org/schema
• The NS of the elements model and price is the default NS http://www.mobildev.com/schema
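How a namespace-aware processor resolves prefixes and the default NS can be inspected with Python's standard xml.etree.ElementTree, which reports each name as "{URI}localname". The sketch below reuses the example document; the attribute name currency is an ASCII stand-in chosen here for the slide's währung attribute.

```python
import xml.etree.ElementTree as ET

EDI = "http://ecommerce.org/schema"
DEFAULT = "http://www.mobildev.com/schema"

DOC = """
<edi:HC xmlns:edi='http://ecommerce.org/schema'
        xmlns='http://www.mobildev.com/schema'>
  <model name="8210">
    <edi:price edi:units='Euro'>32.18</edi:price>
    <price currency='USD'>25.16</price>
  </model>
</edi:HC>
"""

root = ET.fromstring(DOC)
model = root[0]

# Element and attribute names are reported as "{namespace URI}localname";
# unqualified attributes carry no namespace at all.
for element in root.iter():
    print(element.tag, dict(element.attrib))
```

The output shows that the unqualified attributes name and currency are not placed in the default NS, in line with the rule that the default NS applies to elements only.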

Namespaces

4/5

... and DTDs

• NS are in principle independent of DTDs
  - Can be used in documents with or without DTDs
• BUT:
  - All elements and attributes which are qualified in the XML document must also be declared appropriately within the DTD
  - Huge overhead, since DTDs are not aware of NS
  - <edi:HC> ...     requires  <!ELEMENT edi:HC (....)>
  - <edi:price> ...  requires  <!ELEMENT edi:price (#PCDATA)>
• What can be done is to specify a default NS within the DTD
  - <!ATTLIST edi:HC xmlns CDATA #FIXED 'http://www.mobildev.com/schema'>

Namespaces

5/5

Exemplary NS-URIs

• RDF:    http://www.w3.org/1999/02/22-rdf-syntax-ns#
          http://www.w3.org/2000/01/rdf-schema#
• MathML: http://www.w3.org/1998/Math/MathML
• XHTML:  http://www.w3.org/1999/xhtml
• SMIL:   http://www.w3.org/TR/REC-smil
• XSL:    http://www.w3.org/1999/XSL/Transform
          http://www.w3.org/1999/XSL/Format

Outline

• Introduction
• XML 1.0
• Namespaces
• XML Schema
  - Introduction
  - Elements and Attributes
  - Pre-defined Datatypes
  - User-defined Datatypes
  - Keys
  - Schema Composition
  - Schema Modeling Styles
  - Comparison: DTD vs. XML Schema

Introduction

DTD versus XML Schema 1/2

• Drawbacks of DTDs
  - Proprietary syntax
  - Few datatypes, in fact only one: string
  - Global definition of elements
  - Parameter entities for modularization & overriding
  - ID, IDREF(S): severe restrictions
• Advantages of XML Schema
  - XML as syntax
  - Numerous pre-defined datatypes
  - User-defined simple and complex datatypes
  - Inheritance
  - Keys, references: flexible concept
• XML Schema
  - Definition of the structure of XML documents
  - W3C REC May 2001, approx. 420 pages
  - W3C REC 2nd edition October 2004

Introduction

DTD versus XML Schema 2/2

Catalog.xsd:

    <?xml version="1.0"?>
    <schema ...>
      <simpleType name="producerNoType"> ...
      <element name="PDACatalog">
        <complexType>
          <sequence>
            <element name="Producer" minOccurs="0" maxOccurs="unbounded">
              <complexType>
                <sequence>
                  <element name="ProducerNo" type="hc:producerNoType"
                           minOccurs="1" maxOccurs="1"/>
                  <element name="PDA" minOccurs="1" maxOccurs="unbounded">
                    <complexType>
                      <sequence>
                        <element name="Weight" type="string" minOccurs="1" maxOccurs="1"/>
                        <element name="Battery" type="string" minOccurs="1" maxOccurs="1"/>
                      </sequence>
      ...
    </schema>

Catalog.dtd:

    ...
    <!ELEMENT PDACatalog (Producer*) >
    <!ELEMENT Producer (ProducerNo, PDA+)>
    <!ELEMENT PDA (Weight, Battery)>
    <!ELEMENT Weight (#PCDATA)>
    <!ELEMENT Battery (#PCDATA)>
    ...

Introduction

• Namespace for own vocabulary
  - The namespace (NS) of the vocabulary to be defined can be declared by means of the attribute targetNamespace (optional!)
• NS of the XML Schema standard vocabulary
  - Declaration is obligatory!
  - Additional NS (i.e., vocabularies) can be incorporated
• A single NS can be defined as default NS
  - Either own NS, XML Schema NS or other NS
  - For all other NS used, a prefix is obligatory

    <?xml version="1.0"?>
    <schema targetNamespace="http://www.ifs.uni-linz.ac.at/hc"
            xmlns:hc="http://www.ifs.uni-linz.ac.at/hc"
            xmlns="http://www.w3.org/2001/XMLSchema"
            attributeFormDefault="qualified"
            elementFormDefault="qualified">
    ...

Introduction

Usage of NS in the XML Document

• The schema of an XML document is specified within the root element via the attribute schemaLocation
  - 1st part: targetNamespace of the schema
  - 2nd part: location of the schema document
• For a schema without a target namespace: xsi:noNamespaceSchemaLocation="directPathToXSD_File"

Catalog.xsd:

    <?xml version="1.0"?>
    <schema targetNamespace="http://www.ifs.uni-linz.ac.at/hc"
            xmlns:hc="http://www.ifs.uni-linz.ac.at/hc"
            xmlns="http://www.w3.org/2001/XMLSchema"
            attributeFormDefault="qualified"
            elementFormDefault="qualified">
    ...

Catalog1.xml:

    <?xml version="1.0"?>
    <PDACatalog xsi:schemaLocation="http://www.ifs.uni-linz.ac.at/hc Catalog.xsd"
                xmlns="http://www.ifs.uni-linz.ac.at/hc"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    ...
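The two-part structure of schemaLocation (namespace, then location, separated by whitespace) can be read out of an instance document as follows. This is a minimal sketch with Python's standard xml.etree.ElementTree; the function name schema_locations is an assumption, and the sketch only extracts the hint, it does not load or apply the schema.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

DOC = """<PDACatalog
    xsi:schemaLocation="http://www.ifs.uni-linz.ac.at/hc Catalog.xsd"
    xmlns="http://www.ifs.uni-linz.ac.at/hc"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>
"""

def schema_locations(root):
    """Return {namespace: location} from the root's xsi:schemaLocation."""
    tokens = root.get("{" + XSI + "}schemaLocation", "").split()
    # Tokens alternate: namespace URI, schema document location.
    return dict(zip(tokens[0::2], tokens[1::2]))

print(schema_locations(ET.fromstring(DOC)))
```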

Elements and Attributes 1/3
Global / Local Definition

• Element
  <element name="name" type="type" minOccurs="int" maxOccurs="int|unbounded" ... />
  - type: simple or complex type
  - minOccurs / maxOccurs: cardinality, lower/upper bounds (only in "local" elements)
• Attribute
  <attribute name="name" type="type" use="how-its-used" default/fixed="value" ... />
  - type: simple type
  - use: values required, optional, prohibited (only in "local" attributes)
  - default/fixed: only relevant if "use" is not defined
• Global Definition
  - Direct subelement of schema
  - NOTE: the root element of the XML document is required to be defined as a global element!
• Local Definition
  - Definition on an arbitrary nesting level
• Analogously for datatypes!


Elements and Attributes 2/3
Global / Local Datatypes and References

• Global or Local Datatypes

<element name="name" minOccurs="int" maxOccurs="int|unbounded" ...>
  <complexType>…</complexType>
</element>

<attribute name="name" use="how-its-used" default/fixed="value" ...>
  <simpleType>...</simpleType>
</attribute>

• Reference to an existing Element or Attribute

<element ref="name" minOccurs="int" maxOccurs="int|unbounded" .../>

<attribute ref="name" use="how-its-used" default/fixed="value" .../>


Elements and Attributes 3/3
Summarizing Example: Global/Local

<schema ...>
  <!-- Global element, local datatype -->
  <element name="Producer">
    <complexType>
      <sequence>
        <!-- Local element, global datatype -->
        <element name="ProducerNo" type="hc:producerNoType" minOccurs="1" maxOccurs="1"/>
        <!-- Reference to a global element -->
        <element ref="hc:PDA" maxOccurs="unbounded"/>
      </sequence>
      <!-- Local attribute, pre-defined datatype -->
      <attribute name="name" type="string" use="required"/>
    </complexType>
  </element>
  <!-- Global element, local datatype -->
  <element name="PDA">
    <complexType>
      <sequence>
        <!-- Local elements, pre-defined datatype -->
        <element name="Weight" type="string"/>
        <element name="Battery" type="string"/>
      </sequence>
    </complexType>
  </element>
  <!-- Global datatype; the name must match the type reference of ProducerNo -->
  <simpleType name="producerNoType"> …


Pre-Defined Datatypes 1/4

[Figure: hierarchy of the pre-defined datatypes (W3C REC, 28th Oct. 2004). The root anyType branches into all complex types and anySimpleType.]

• Primitive (atomic): string, boolean, float, double, decimal, duration, dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, gMonth, hexBinary, base64Binary, anyURI, QName, NOTATION
• Derived
  - from string: normalizedString, token, language, NMTOKEN, NMTOKENS, Name, NCName, ID, IDREF, IDREFS, ENTITY, ENTITIES
  - from decimal: integer, nonPositiveInteger, negativeInteger, long, int, short, byte, nonNegativeInteger, unsignedLong, unsignedInt, unsignedShort, unsignedByte, positiveInteger


Pre-Defined Datatypes 2/4
String Datatypes

[Figure: derivation hierarchy of the string datatypes below anySimpleType]

• Pre-defined primitive types
  - string: string datatype without whitespace replacement
  - hexBinary, base64Binary: binary string-encoded datatypes
  - anyURI
  - QName: qualified name, supports the usage of names with NS prefix
  - NOTATION: backward compatibility to DTDs
• Pre-defined derived types
  - normalizedString: string with whitespace replacement; each tab, linefeed and CR is replaced by a blank
  - token: "tokenized" string; all whitespace characters are replaced by blanks, all leading and trailing blanks are deleted and multiple consecutive blanks are replaced by a single one
  - language: standardized language codes (e.g. en, en-US, de, de-DE)
  - NMTOKEN: name token, string without blanks (e.g. "CMS", "234234")
  - Name: XML name, must start with a letter, ":" or "_" (e.g. "CMS", "_1")
  - NCName: name without prefix
  - NMTOKENS, ID, IDREF, IDREFS, ENTITY, ENTITIES: for backward-compatibility reasons (DTDs), usable only as types for attributes
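The difference between the three whitespace behaviours can be sketched with one schema that declares the same kind of content three times. The element names Notes, Raw, Line and Label are hypothetical and chosen only for this illustration.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Assumption: element names are invented for this sketch. -->
  <xsd:element name="Notes">
    <xsd:complexType>
      <xsd:sequence>
        <!-- string: whitespace is kept exactly as written -->
        <xsd:element name="Raw" type="xsd:string"/>
        <!-- normalizedString: each tab, linefeed and CR becomes a blank -->
        <xsd:element name="Line" type="xsd:normalizedString"/>
        <!-- token: as above, plus trimming and collapsing of blanks -->
        <xsd:element name="Label" type="xsd:token"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Given the content "  Nokia", a tab, and "7110  " in each element, a validator would see the value unchanged for Raw, "  Nokia 7110  " for Line, and "Nokia 7110" for Label.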


Pre-defined Datatypes 3/4
Numerical Datatypes

[Figure: derivation hierarchy of the numerical datatypes below anySimpleType]

• Pre-defined primitive types
  - decimal: decimal numbers; decimal separator ".", leading "+" or "-" possible
  - float, double: floating point numbers with single (32 bits) and double (64 bits) precision
  - boolean
• Pre-defined derived types
  - integer, nonPositiveInteger, negativeInteger, nonNegativeInteger, positiveInteger
  - long, int, short, byte: 64, 32, 16 or 8 bits
  - unsignedLong, unsignedInt, unsignedShort, unsignedByte: 64, 32, 16 or 8 bits


Pre-defined Datatypes 4/4
Date and Time Datatypes

All date and time datatypes are primitive types below anySimpleType:

  - dateTime: "CCYY-MM-DDThh:mm:ss"
  - date: "CCYY-MM-DD"
  - gYearMonth: "CCYY-MM"
  - gYear: "CCYY"
  - gMonthDay: "--MM-DD" (day of the year)
  - gDay: "---DD" (day of the month)
  - gMonth: "--MM"
  - time: "hh:mm:ss"
  - duration: "PnYnMnDTnHnMnS"
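To make the lexical formats concrete, the following instance fragment gives one valid value per datatype. The element names are hypothetical; the comment behind each element names the datatype the value would be valid for.

```xml
<?xml version="1.0"?>
<!-- Assumption: all element names are invented for this sketch. -->
<Dates>
  <Released>2011-03-01T09:30:00</Released>  <!-- dateTime -->
  <Day>2011-03-01</Day>                      <!-- date -->
  <Month>2011-03</Month>                     <!-- gYearMonth -->
  <Year>2011</Year>                          <!-- gYear -->
  <Birthday>--03-01</Birthday>               <!-- gMonthDay: 1st of March, every year -->
  <PayDay>---15</PayDay>                     <!-- gDay: 15th of every month -->
  <FiscalStart>--10</FiscalStart>            <!-- gMonth: every October -->
  <Opens>09:30:00</Opens>                    <!-- time -->
  <Warranty>P2Y6M</Warranty>                 <!-- duration: 2 years, 6 months -->
  <Timeout>PT1H30M</Timeout>                 <!-- duration: 1 hour, 30 minutes -->
</Dates>
```

In a duration, the "T" separates the date part from the time part, which is why "P6M" means six months and "PT6M" means six minutes.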


User-defined Datatypes
Alternatives

[Figure: decision tree for choosing a user-defined datatype]

• Should the type contain elements or attributes?
  - no: unstructured content, <simpleType>
      Derivation: <restriction>, <union> or <list>
  - yes: structured content, <complexType>
      Derivation: <restriction>, <extension>
      Nesting: <sequence>, <all>, <choice>
      Empty / Mixed
• For a <complexType>: should the type contain elements?
  - yes: attributes & elements, <complexContent>
  - no: attributes only, <simpleContent>
• Both kinds of types can be named or anonymous
• Note: <complexContent> is only necessary in case of derivation from a user-defined type


User-defined Datatypes
Alternatives: Examples

• User-defined, complex, named, with derivation

<xsd:complexType name="BookTypeWithID">
  <xsd:complexContent>
    <xsd:extension base="BookType">
      <xsd:attribute name="ID" type="xsd:token"/>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>

• User-defined, complex, anonymous, no derivation

<xsd:complexType>
  <xsd:sequence> .... </xsd:sequence>
</xsd:complexType>

• User-defined, simple, named, with derivation

<xsd:simpleType name="longitudeType">
  <xsd:restriction base="xsd:integer">
    <xsd:minInclusive value="-180"/>
    <xsd:maxInclusive value="180"/>
  </xsd:restriction>
</xsd:simpleType>

• Pre-defined, simple, no derivation: xsd:integer


User-defined Datatypes
Derived Simple Datatypes: <simpleType>

• Restriction of a pre-defined datatype
  - <restriction>
• Union of pre-defined datatypes (extension)
  - <union>
  - Values must correspond to at least one of the combined datatypes
• List of values of one datatype (atomic or union; the item type must not itself be a list datatype)
  - <list>


User-defined Datatypes

• Alternative definition possibilities
  - Referencing an existing datatype via the attribute base
  - Local definition from scratch by using simpleType as subelement of the restriction element
• 12 possible restrictions, depending on the base datatype
  - length, minLength, maxLength, pattern, enumeration, minInclusive, maxInclusive, minExclusive, maxExclusive, whiteSpace, totalDigits, fractionDigits

<simpleType name="batteryType">
  <restriction base="string">
    <enumeration value="NiMh"/>
    <enumeration value="NiCd"/>
    <enumeration value="LiIo"/>
  </restriction>
</simpleType>

<element name="Battery" type="hc:batteryType"/>

XML document:

<Battery>NiCd</Battery>
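The second alternative, a base type defined locally inside the restriction element, might look as follows. The type capacityType, the element Capacity and the chosen bounds are hypothetical and serve only as a sketch.

```xml
<?xml version="1.0"?>
<schema targetNamespace="http://www.ifs.uni-linz.ac.at/hc"
        xmlns:hc="http://www.ifs.uni-linz.ac.at/hc"
        xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">
  <!-- Assumption: a capacity in mAh between 100 and 5000 -->
  <simpleType name="capacityType">
    <restriction>
      <!-- Base type given as nested simpleType instead of the base attribute -->
      <simpleType>
        <restriction base="positiveInteger">
          <maxInclusive value="5000"/>
        </restriction>
      </simpleType>
      <!-- Facet applied on top of the locally defined base type -->
      <minInclusive value="100"/>
    </restriction>
  </simpleType>
  <element name="Capacity" type="hc:capacityType"/>
</schema>
```

With this definition a validator would accept a Capacity of 900 and reject 50 or 6000. The base attribute and a nested simpleType are mutually exclusive.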


User-defined Datatypes
Derived Simple Datatypes: <simpleType> restriction

• Restrictions using a "pattern" element
• Restrictions of the lexical values
• Simple regular expressions
  - Normal characters: "C&amp;A"
  - Categories of characters: "\p{IsBasicLatin}"
  - Sets of characters: "[\p{IsBasicLatin}-[\d]]"
  - Quantifiers: "[a-zA-Z]{1,8}"
  - Parentheses: "(XML(\s+|-))?Schema"
  - Combinations of these expressions
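How such an expression is attached to a datatype can be sketched with a pattern facet. The type serialNoType, the element SerialNo and the format "P-" plus five digits are assumptions invented for this example.

```xml
<?xml version="1.0"?>
<schema targetNamespace="http://www.ifs.uni-linz.ac.at/hc"
        xmlns:hc="http://www.ifs.uni-linz.ac.at/hc"
        xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">
  <!-- Assumption: serial numbers such as "P-00042" -->
  <simpleType name="serialNoType">
    <restriction base="string">
      <!-- Normal characters "P-" followed by a quantified character class -->
      <pattern value="P-\d{5}"/>
    </restriction>
  </simpleType>
  <element name="SerialNo" type="hc:serialNoType"/>
</schema>
```

A pattern always has to match the complete value, so "P-00042" would be accepted while "XP-00042" and "P-42" would be rejected; anchors such as ^ and $ are not part of the XML Schema regular expression syntax.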


User-defined Datatypes
Derived Simple Datatypes: <simpleType> union/list

• Alternative definition possibilities
  - Referencing an existing datatype via attributes (memberTypes or itemType)
  - Local definition from scratch by using simpleType as subelement of the union or list elements

<simpleType name="PDAFeatureType">
  <union memberTypes="hc:PDAColor hc:PDARobustness"/>
</simpleType>

<simpleType name="PDAFeatureListType">
  <list itemType="hc:PDAFeatureType"/>
</simpleType>

<element name="PDAFeatureList" type="hc:PDAFeatureListType"/>

XML document:

<PDAFeatureList>blue waterproof shockproof</PDAFeatureList>
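A complete variant of this example, using the local definition alternative, could look like the sketch below. The slides do not show the member types PDAColor and PDARobustness, so their enumeration values here are assumptions.

```xml
<?xml version="1.0"?>
<schema targetNamespace="http://www.ifs.uni-linz.ac.at/hc"
        xmlns:hc="http://www.ifs.uni-linz.ac.at/hc"
        xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">
  <!-- Assumed member types; the enumeration values are invented -->
  <simpleType name="PDAColor">
    <restriction base="string">
      <enumeration value="blue"/>
      <enumeration value="black"/>
    </restriction>
  </simpleType>
  <simpleType name="PDARobustness">
    <restriction base="string">
      <enumeration value="waterproof"/>
      <enumeration value="shockproof"/>
    </restriction>
  </simpleType>
  <!-- List whose item type is a locally defined (anonymous) union -->
  <simpleType name="PDAFeatureListType">
    <list>
      <simpleType>
        <union memberTypes="hc:PDAColor hc:PDARobustness"/>
      </simpleType>
    </list>
  </simpleType>
  <element name="PDAFeatureList" type="hc:PDAFeatureListType"/>
</schema>
```

Under these assumptions the value "blue waterproof shockproof" is valid, because every whitespace-separated item belongs to at least one member type, whereas "red waterproof" would be rejected.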


User-defined Datatypes

• Nested Elements
  - Possible within a complex datatype only
• Attributes
  - Possible within a complex datatype only
  - Independent of the existence of nested elements
• Empty Content
  - Possible within a complex datatype only
  - Does not have nested elements
• Mixed Content
  - Datatype may contain nested elements and text
  - In contrast to DTDs, the ordering and cardinality of the nested elements can be arbitrarily specified
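Two of these content kinds are easy to confuse, so a short sketch may help: a type with empty content and a type with text content plus an attribute. All type, element and attribute names below are hypothetical.

```xml
<?xml version="1.0"?>
<schema targetNamespace="http://www.ifs.uni-linz.ac.at/hc"
        xmlns:hc="http://www.ifs.uni-linz.ac.at/hc"
        xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">
  <!-- Empty content: attributes only, neither nested elements nor text -->
  <complexType name="CardSlotType">
    <attribute name="format" type="string" use="required"/>
  </complexType>
  <!-- Text content plus attribute: simpleContent extending a simple type -->
  <complexType name="WeightType">
    <simpleContent>
      <extension base="decimal">
        <attribute name="unit" type="string" default="g"/>
      </extension>
    </simpleContent>
  </complexType>
  <element name="CardSlot" type="hc:CardSlotType"/>
  <element name="Weight" type="hc:WeightType"/>
</schema>
```

Matching instance elements would be <CardSlot format="SD"/> and <Weight unit="g">141</Weight>. Since attributeFormDefault is left at its default here, the attributes appear without prefix.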


User-defined Datatypes
<complexType>: Nested Elements / Attributes

• Sequence: <sequence>
• Choice: <choice>
• Arbitrary Ordering: <all>
  - Nested elements can be used in arbitrary order
• Cardinality
  - Expressed by means of minOccurs and maxOccurs

<complexType name="PDAType">
  <sequence minOccurs="1" maxOccurs="1">
    <element name="Weight" type="string" minOccurs="1" maxOccurs="1"/>
    <element name="Battery" type="string" minOccurs="1" maxOccurs="1"/>
  </sequence>
  <attribute name="no" type="nonNegativeInteger" use="required"/>
</complexType>
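The slide's example covers sequence only; the other two compositors can be sketched in the same way. The types PowerSupplyType and DimensionsType and their elements are assumptions made for this illustration.

```xml
<?xml version="1.0"?>
<schema targetNamespace="http://www.ifs.uni-linz.ac.at/hc"
        xmlns:hc="http://www.ifs.uni-linz.ac.at/hc"
        xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">
  <!-- choice: exactly one of the alternatives may appear -->
  <complexType name="PowerSupplyType">
    <choice>
      <element name="Battery" type="string"/>
      <element name="MainsAdapter" type="string"/>
    </choice>
  </complexType>
  <!-- all: every child at most once, in any order -->
  <complexType name="DimensionsType">
    <all>
      <element name="Width" type="decimal"/>
      <element name="Height" type="decimal"/>
      <element name="Depth" type="decimal" minOccurs="0"/>
    </all>
  </complexType>
</schema>
```

In XML Schema 1.0 the children of all may only have maxOccurs of 0 or 1, which is the price for the free ordering.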


User-defined Datatypes

<complexType name="PDAType" mixed="true">
  <sequence>
    <element name="Weight" type="string" minOccurs="1" maxOccurs="1"/>
    <element name="Battery" type="string" minOccurs="1" maxOccurs="1"/>
  </sequence>
</complexType>

<element name="PDA" type="hc:PDAType"/>

XML document:

<PDA>Type Nokia 7110 has <Weight>141g</Weight> and <Battery>900mAh</Battery>
</PDA>


User-defined Datatypes
<complexType>: Derivation of Complex Types

• Extension
  - <extension>
  - Additional nested elements and/or attributes
• Restriction
  - <restriction>
  - Domain
  - Cardinality
• Abstract Datatypes
  - <complexType> with attribute abstract="true"
• Prohibition of Derivation
  - <complexType> with attribute final
  - Potential values: #all, restriction, extension
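The two attributes abstract and final can be sketched together. The types DeviceType and PhoneType and the element Device are hypothetical names used only for this illustration.

```xml
<?xml version="1.0"?>
<schema targetNamespace="http://www.ifs.uni-linz.ac.at/hc"
        xmlns:hc="http://www.ifs.uni-linz.ac.at/hc"
        xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">
  <!-- abstract: the type itself cannot be used for an element in the instance -->
  <complexType name="DeviceType" abstract="true">
    <sequence>
      <element name="Weight" type="string"/>
    </sequence>
  </complexType>
  <!-- final="#all": neither extension nor restriction of PhoneType is allowed -->
  <complexType name="PhoneType" final="#all">
    <complexContent>
      <extension base="hc:DeviceType">
        <sequence>
          <element name="Band" type="string"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>
  <element name="Device" type="hc:DeviceType"/>
</schema>
```

Because DeviceType is abstract, a Device element in an instance document has to name a concrete derived type via xsi:type, for example PhoneType.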


User-defined Datatypes

• Elements are attached at the end
• Extension must be specified within a <complexContent> tag

<complexType name="extendedPDAType">
  <complexContent>
    <extension base="hc:PDAType">
      <sequence>
        <element name="Band" type="string" minOccurs="1" maxOccurs="1"/>
        <element name="Feature" type="string" minOccurs="1" maxOccurs="10"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>

[Figure: extendedPDAType is derived from PDAType]


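User-defined Datatypes
<complexType>: Derivation via Restriction

• The declarations of the base datatype that should be retained must be repeated (mandatory declarations always, only optional ones can be left out)
• Restriction must be specified within a <complexContent> tag

<complexType name="restrictedPDAType">
  <complexContent>
    <restriction base="hc:extendedPDAType">
      <sequence>
        <element name="Weight" type="string" minOccurs="1" maxOccurs="1"/>
        <!-- Battery is mandatory in PDAType and therefore has to be repeated -->
        <element name="Battery" type="string" minOccurs="1" maxOccurs="1"/>
        <element name="Band" type="string" minOccurs="1" maxOccurs="1"/>
        <!-- Restricted cardinality: at most 5 instead of 10 features -->
        <element name="Feature" type="string" minOccurs="1" maxOccurs="5"/>
      </sequence>
    </restriction>
  </complexContent>
</complexType>

[Figure: restrictedPDAType is derived from extendedPDAType, which is derived from PDAType]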


User-defined Datatypes

• Static
  - Element PDA has datatype PDAType

<PDA>
  <Weight>141g</Weight>
  <Battery>900mAh</Battery>
</PDA>

• Dynamic
  - Definition of the derived datatype within the XML document via the attribute type of the XML Schema Instance (xsi) NS
  - Datatype extendedPDAType is derived from PDAType: extension with Band & Feature

<PDA xsi:type="extendedPDAType">
  <Weight>115g</Weight>
  <Battery>550mAh</Battery>
  <Band>Dualband</Band>
  <Feature>Waterproof</Feature>
</PDA>


Keys 1/2

• Characteristics of a key (key)
  - Value (combination) must be unique
  - Value must exist
  - Key must be defined as subelement of another element, following the type definition
• Candidates for keys (field)
  - Elements with simple datatypes only!
  - Attributes
  - Combinations of elements and attributes
• Scope can be defined (selector)
• Reference to key can be defined (keyref)
• Elements, attributes and combinations thereof can be defined to be unique (unique)
  - Value (combination) must be unique
  - Value need NOT exist


Keys 2/2

<element name="PDACatalog">
  <complexType> ...</complexType>
  <key name="typeKey">
    <selector xpath="hc:Producer/hc:PDA"/>
    <field xpath="@name"/>
    <field xpath="@version"/>
  </key>
  <keyref name="refToTypeKey" refer="hc:typeKey">
    <selector xpath="hc:Stock/hc:PDAQuantity"/>
    <field xpath="@name"/>
    <field xpath="@version"/>
  </keyref>
</element>

[Figure: PDAQuantity (Name, Version, Quantity) references PDA (Name, Version, Weight, ...)]

<element name="PDACatalog">
  <complexType> ...</complexType>
  <unique name="uniqueProducerNo">
    <selector xpath="hc:Producer"/>
    <field xpath="@producerNo"/>
  </unique>
</element>
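What the key and keyref above demand from a document can be sketched with a reduced instance. The layout shows only the parts touched by the constraints, all attribute values are invented, and the attributes are assumed to be unqualified so that the field paths @name and @version match them.

```xml
<?xml version="1.0"?>
<!-- Assumption: reduced instance, only the elements named in the selectors -->
<PDACatalog xmlns="http://www.ifs.uni-linz.ac.at/hc">
  <Producer>
    <PDA name="7110" version="1"/>
    <!-- Same name, different version: the value combination is still unique -->
    <PDA name="7110" version="2"/>
  </Producer>
  <Stock>
    <!-- The keyref is satisfied because (7110, 2) exists as a key value -->
    <PDAQuantity name="7110" version="2">15</PDAQuantity>
  </Stock>
</PDACatalog>
```

A second PDA with name 7110 and version 2, a PDA without version attribute, or a PDAQuantity for version 3 would each violate one of the constraints.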


Schema Composition
Within a Schema 1/2

• Group of Elements

<group name="mainData">
  <sequence>
    <element name="Weight" type="string" minOccurs="1" maxOccurs="1"/>
    <element name="Battery" type="string" minOccurs="1" maxOccurs="1"/>
  </sequence>
</group>

<complexType name="PDAType">
  <sequence>
    <group ref="hc:mainData"/>
    <element name="Feature" type="string" minOccurs="1" maxOccurs="10"/>
  </sequence>
</complexType>


Schema Composition
Within a Schema 2/2

• Group of Attributes

<attributeGroup name="BatteryAttributeGroup">
  <attribute name="type" type="string" default="NiMh"/>
  <attribute name="capacity" type="string" use="required"/>
</attributeGroup>

<complexType name="BatteryType">
  <sequence>...</sequence>
  <attributeGroup ref="hc:BatteryAttributeGroup"/>
</complexType>


Schema Composition
Different Schemata 1/2

• Incorporation of other schemata via include, redefine and import
  - include, redefine and import elements must be subelements of schema, prior to any other declaration
• Include of a schema: include
  - NS of the included schema must be equal to the NS of the including schema, or the included schema has no NS at all
  - The components of the included schema can be used as if they were declared directly within the including schema

<schema ...>
  <include schemaLocation="PDA.xsd"/>
  ...


Schema Composition
Different Schemata 2/2

• Including and redefining a schema: redefine
  - Same functionality as include
  - In addition, included components (simpleType, complexType, group, attributeGroup) can be newly defined
  - New definitions replace the original ones

<redefine schemaLocation="PDA.xsd">
  <complexType name="PDAType">....</complexType>
  ...
</redefine>
...

• Import of a schema: import
  - Imported schema can have an arbitrary NS (could be unequal to the current one) or none

<import namespace="http://www.somewhere.else.com" schemaLocation="Producer.xsd"/>
...
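Using an imported component requires a prefix bound to the foreign namespace, which the following sketch makes explicit. The prefix pr and the assumption that Producer.xsd declares a global element named Producer are not taken from the slides.

```xml
<?xml version="1.0"?>
<schema targetNamespace="http://www.ifs.uni-linz.ac.at/hc"
        xmlns:hc="http://www.ifs.uni-linz.ac.at/hc"
        xmlns:pr="http://www.somewhere.else.com"
        xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">
  <!-- import precedes all other declarations of the schema -->
  <import namespace="http://www.somewhere.else.com"
          schemaLocation="Producer.xsd"/>
  <element name="PDACatalog">
    <complexType>
      <sequence>
        <!-- Assumption: Producer.xsd declares a global element Producer -->
        <element ref="pr:Producer" maxOccurs="unbounded"/>
      </sequence>
    </complexType>
  </element>
</schema>
```

With include, by contrast, the components end up in the including schema's own namespace and would be referenced with the hc prefix.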


Schema Modeling Styles
Non-Normative Datamodel of XML Schema Concepts

[Figure: non-normative data model of the XML Schema concepts with legend; source: http://www.w3.org/TR/xmlschema-1/]


Schema Modeling Styles
XML Schema Concepts in Practice

• Analysis of 1400 schemata of diverse standard vocabularies
  - Open Travel Alliance (OTA)
  - Human Resource XML (HR-XML)
  - W3C
  - Global Justice XML
  - etc.

Source: P. Kiel, Profiling XML Schema


Schema Modeling Styles
Relationships / Global vs. Local / Element vs. Type

• Relationships
  - Realisation by means of nesting or via references
• Global element/attribute declarations
  - Prerequisite for reuse in the same or another schema
  - Root element must be global
• Local element/attribute declarations
  - Appropriate if a declaration makes sense only in combination with the declared type
• Local element declarations
  - Can occur with different structure but the same name in different types
• Local attribute declarations
  - Make sense since attributes are most often tightly coupled to elements
• Three stereotypical design forms
  - Russian Doll Design
  - Salami Slice Design
  - Venetian Blinds Design
• Literature
  - XML Schema Best Practices (Roger Costello): www.xfront.com
  - P. Kiel, Profiling XML Schema, http://www.xml.com/pub/a/2006/09/20/profiling-xml-schema.html


Schema Modeling Styles
Russian Doll Design

• Nested element declarations
  - Local declarations only
  - Prevents global types
• Advantages
  - Structure obvious (corresponds to the XML document's structure)
  - Prevents side-effects
• Disadvantages
  - Danger of deep nesting levels
  - No reuse of declarations, hence redundancies
  - No extensibility in terms of derivation
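A minimal sketch of this style, assuming a reduced Producer/PDA structure in the spirit of the earlier examples: one global root element, everything else nested as local declarations with anonymous types.

```xml
<?xml version="1.0"?>
<schema targetNamespace="http://www.ifs.uni-linz.ac.at/hc"
        xmlns:hc="http://www.ifs.uni-linz.ac.at/hc"
        xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">
  <!-- Only the root element is global -->
  <element name="Producer">
    <complexType>
      <sequence>
        <!-- Local element with local (anonymous) type -->
        <element name="PDA" maxOccurs="unbounded">
          <complexType>
            <sequence>
              <element name="Weight" type="string"/>
              <element name="Battery" type="string"/>
            </sequence>
          </complexType>
        </element>
      </sequence>
    </complexType>
  </element>
</schema>
```

The schema mirrors the nesting of the instance document one to one, but neither PDA nor its type can be referenced from anywhere else.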


Schema Modeling Styles
Salami Slice Design

• Global element declarations
  - Usage of global elements per reference (ref attribute)
  - Each global element can be a root element
• Advantages
  - Reuse of elements
• Disadvantages
  - Large number of global elements
    * Confusing
    * Danger of side-effects in case of changes to global elements
  - No extensibility in terms of derivation
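A minimal sketch of this style, assuming a reduced Producer/PDA structure in the spirit of the earlier examples: every element is declared globally and assembled via ref.

```xml
<?xml version="1.0"?>
<schema targetNamespace="http://www.ifs.uni-linz.ac.at/hc"
        xmlns:hc="http://www.ifs.uni-linz.ac.at/hc"
        xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">
  <!-- Every element is a global "slice" -->
  <element name="Weight" type="string"/>
  <element name="Battery" type="string"/>
  <element name="PDA">
    <complexType>
      <sequence>
        <element ref="hc:Weight"/>
        <element ref="hc:Battery"/>
      </sequence>
    </complexType>
  </element>
  <element name="Producer">
    <complexType>
      <sequence>
        <element ref="hc:PDA" maxOccurs="unbounded"/>
      </sequence>
    </complexType>
  </element>
</schema>
```

Since all four elements are global, a document with Weight as its root element would validate just as well as one rooted in Producer.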


Schema Modeling Styles
Venetian Blinds Design

• Global type declarations
  - Elements, except the root element, are declared locally
• Advantages
  - Reuse of types
    * A named type is available for each element and attribute
    * Types can be imported from other schemata
  - Extensibility by derivation and <redefine>
• Disadvantages
  - Large number of global types
    * Confusing
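A minimal sketch of this style, assuming a reduced Producer/PDA structure in the spirit of the earlier examples: named global types, local elements, and a single global root element. The type name ProducerType is hypothetical.

```xml
<?xml version="1.0"?>
<schema targetNamespace="http://www.ifs.uni-linz.ac.at/hc"
        xmlns:hc="http://www.ifs.uni-linz.ac.at/hc"
        xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">
  <!-- Global, named types carry the structure -->
  <complexType name="PDAType">
    <sequence>
      <element name="Weight" type="string"/>
      <element name="Battery" type="string"/>
    </sequence>
  </complexType>
  <complexType name="ProducerType">
    <sequence>
      <!-- Local element, global datatype -->
      <element name="PDA" type="hc:PDAType" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
  <!-- The root element is the only global element -->
  <element name="Producer" type="hc:ProducerType"/>
</schema>
```

Because PDAType is named, it can serve as base for derived types such as an extended PDA type, which is not possible with the anonymous types of the other two styles.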
