XML and Data Management


Academic year: 2021


(1)

XML and Data Management

• XML standards

• XML

• DTD, XML Schema

• DOM, SAX, XPath

• XSL

(2)

Overview of internet technologies for document management and archiving

server technologies

database coupling

pure HTML

XML+XSL

(3)

Data centric XML - XML data storage

<doc>

<Auftrag>

<Kunde> Arm </Kunde>

<PC> pc400 </PC>

</Auftrag>

<Auftrag>

<Kunde> Meier </Kunde>

<PC> pc500 </PC>

</Auftrag>

<Auftrag>

<Kunde> Reich </Kunde>

<PC> pc500 </PC>

</Auftrag>

</doc>

% Kunde   PC
auftrag( Arm,   pc400 ).
auftrag( Meier, pc500 ).
auftrag( Reich, pc600 ).

[Figure labels: doc, opening tag, closing tag]
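The correspondence between a data-centric XML document and a relation can be sketched in a few lines of Python. This is only an illustration using the standard-library `xml.etree.ElementTree` parser; the helper name `orders` is an assumption, not part of the slides. It reads the XML text exactly as given above and returns one (Kunde, PC) tuple per Auftrag element.

```python
import xml.etree.ElementTree as ET

# The XML document from the slide, copied as given.
XML = """
<doc>
  <Auftrag><Kunde> Arm </Kunde><PC> pc400 </PC></Auftrag>
  <Auftrag><Kunde> Meier </Kunde><PC> pc500 </PC></Auftrag>
  <Auftrag><Kunde> Reich </Kunde><PC> pc500 </PC></Auftrag>
</doc>
"""

def orders(xml_text):
    """Return one (Kunde, PC) tuple per <Auftrag> element (hypothetical helper)."""
    doc = ET.fromstring(xml_text)
    return [
        (auftrag.findtext("Kunde").strip(), auftrag.findtext("PC").strip())
        for auftrag in doc.findall("Auftrag")
    ]

rows = orders(XML)
for kunde, pc in rows:
    print(f"auftrag({kunde}, {pc}).")
```

Each tuple plays the role of one row of the relation, so the XML file acts as a small data store.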

(4)

eXtensible Markup Language (XML)

XML - a family of standards:

XML (eXtensible Markup Language)

data format exchangeable across different operating systems, applications, and enterprises

often used for content

XPath

path expressions used for navigation in XML trees

used within other XML standards (e.g. XSL)

XSL (eXtensible Stylesheet Language)

used to describe layout of content / to convert data

many more standards: XQuery (queries), ...

(5)

Unique Standard for Content

DTD or XML Schema:

defines structure of all XML trees exchanged

=> unique data format for all participants

data formats exchangeable across company borders

New data exchange formats and languages based on XML

example:

ebXML (E-Business XML) as a basis for OTA (OpenTravel Alliance):

data exchange between travel agencies, airlines, etc.

Consequence of these standards:

(6)

Separation of content and layout

content (product1.xml)

content (product2.xml)

layout (customer1.xsl)

layout (technican2.xsl)

HTML file

(7)

Separation of content and layout (2)

Consequences:

• 1 (content) data source for different layouts (technician, seller, customer, re-seller, ...)

• layout may change without changing content (different logo, different seller or customer, different employee or job, new view of data)

• reuse 1 layout for different content (frame with company logo, ...)

• content may change without changing layout
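The consequences listed above can be illustrated with a small sketch. It deliberately uses plain Python functions as a stand-in for XSL stylesheets (the Python standard library has no XSLT processor), and the product fields `name`, `price`, and `cpu` as well as all function names are assumptions made for this example only. One content source is rendered with two different layouts.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content; the slides do not show the contents of product1.xml.
PRODUCT = "<product><name>pc500</name><price>999</price><cpu>x86</cpu></product>"

def customer_layout(product):
    """Layout for customers: name and price only."""
    name = escape(product.findtext("name"))
    price = escape(product.findtext("price"))
    return f"<p>{name}: {price} EUR</p>"

def technician_layout(product):
    """Layout for technicians: name and technical detail."""
    name = escape(product.findtext("name"))
    cpu = escape(product.findtext("cpu"))
    return f"<p>{name} (CPU: {cpu})</p>"

def render(xml_text, layout):
    """Combine one content source with one layout into an HTML fragment."""
    return layout(ET.fromstring(xml_text))

customer_html = render(PRODUCT, customer_layout)
technician_html = render(PRODUCT, technician_layout)
print(customer_html)
print(technician_html)
```

Changing a layout function does not touch the content, and the same layout function can be reused for any other product document with the same structure.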

(8)

XML on Java servers

• XML + XSL separate layout and content

• layout (.xsl file)

• content data (.xml file)

• combine them in the web server

[Figure: on the client, the browser calls a servlet on the server. The servlet takes an XML file and an XSL file as input, transforms XML+XSL → HTML, and returns the generated HTML page to the browser.]

(9)

XML document as a data storage

<doc>

<Auftrag>

<Kunde> Arm </Kunde>

<PC> pc400 </PC>

</Auftrag>

<Auftrag>

<Kunde> Meier </Kunde>

<PC> pc500 </PC>

</Auftrag>

<Auftrag>

<Kunde> Reich </Kunde>

<PC> pc500 </PC>

</Auftrag>

</doc>

% Kunde   PC
auftrag( Arm,   pc400 ).
auftrag( Meier, pc500 ).
auftrag( Reich, pc600 ).

[Figure labels: doc, opening tag, closing tag]

(10)

XML syntax

XML - Prolog:

<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>

<?xml-stylesheet type="text/xsl" href="xmlbsp1.xsl"?>

XML - main part:

<Auftrag>

<Kunde> meier </Kunde>

<PC> pc500 </PC>

</Auftrag>

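Annotations:

version="1.0": version

encoding="iso-8859-1": character set

standalone="yes": without DTD

xml-stylesheet: stylesheet used (only inside IE5)

Main part: start tag (e.g. <Kunde>), text node (meier), end tag (e.g. </Kunde>); start tag, content, and end tag together form an element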

(11)

XML syntax (2)

In the XML main part:

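<Angebote>
  <Liefert wer="vobis" teil="pc500">
  </Liefert>
  <Liefert wer="IBM" teil="pc600"/>
</Angebote>

Annotations: wer and teil are attributes, "vobis" and "pc500" are attribute values; /> ends the tag (no text); the <Liefert ...> </Liefert> pair here (arbitrarily) contains no text node.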

(12)

XML syntax (3)

all tags must be closed

(<tag> ... </tag> or <singleTag />)

incorrectly nested tags not allowed

( <tag1> <tag2> ... </tag1> </tag2> )

case-sensitive ( <tag> different from <Tag> )

attribute values must be quoted

(e.g. <p align="center">)

text must be enclosed in elements
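These well-formedness rules can be checked with any XML parser. The following sketch assumes Python's standard-library `xml.etree.ElementTree`; the helper `is_well_formed` is a name chosen for this example. A parser rejects any input that violates one of the rules above.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

checks = {
    "all tags closed":       is_well_formed("<tag><singleTag/></tag>"),
    "unclosed tag":          is_well_formed("<tag><p></tag>"),
    "incorrectly nested":    is_well_formed("<tag1><tag2></tag1></tag2>"),
    "case mismatch":         is_well_formed("<tag></Tag>"),
    "unquoted attribute":    is_well_formed("<p align=center/>"),
    "quoted attribute":      is_well_formed('<p align="center"/>'),
    "text outside elements": is_well_formed("some text <tag/>"),
}

for rule, ok in checks.items():
    print(f"{rule}: {'well-formed' if ok else 'rejected'}")
```

Only the first and the sixth input are accepted; every other input breaks exactly one of the rules.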

(13)

XML document as a tree

<doc>
  <Kunde name="meier">
    <Auftrag>
      ...
    </Auftrag>
    <Adresse>
    </Adresse>
  </Kunde>
  <Kunde>
    <Auftrag/>
    <Adresse/>
  </Kunde>
</doc>

Tree:

doc
    Kunde (name = "meier")
        Auftrag
        Adresse
    Kunde
        Auftrag
        Adresse

(14)

XML node types

7 kinds of nodes:

root - has no parent node

element

text - leaf node (has no child node)

attribute - leaf node (has no child node)

comment - leaf node (has no child node)

namespace - leaf node (has no child node)

processing-instruction - leaf node (has no child node)
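Most of these node kinds can be made visible with a DOM parser. The sketch below assumes Python's standard-library `xml.dom.minidom`; the sample document and the function `kinds` are made up for this illustration. Note that the DOM models the root as a document node and does not expose namespace nodes as separate nodes, so only six of the seven kinds appear.

```python
from xml.dom import minidom, Node

dom = minidom.parseString(
    '<?xml-stylesheet type="text/xsl" href="xmlbsp1.xsl"?>'
    '<doc><!-- orders --><Kunde name="meier">Arm</Kunde></doc>'
)

# Map DOM node type constants to the node kinds named on the slide.
KIND = {
    Node.DOCUMENT_NODE: "root",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
}

def kinds(node, out):
    """Collect (kind, name) pairs in document order, attributes included."""
    out.append((KIND[node.nodeType], node.nodeName))
    if node.nodeType == Node.ELEMENT_NODE:
        for i in range(node.attributes.length):
            out.append(("attribute", node.attributes.item(i).name))
    for child in node.childNodes:
        kinds(child, out)
    return out

result = kinds(dom, [])
for kind, name in result:
    print(kind, name)
```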

(15)

DTD and XML Schema

DTD (the older standard):

+ defines the structure (nesting of tags) of the documents

<kunde>

<auftrag>

<teil> …

+ defines structural dependencies,

e.g. every auftrag contains at least one teil element

XML Schema (the newer standard) additionally:

+ binds XML elements to types defined in the XML Schema

+ defines domains

(16)

Document-Type-Definition (DTD)

<!-- DTD xmlbsp2d.dtd for example xmlbsp2d.xml -->

<!ELEMENT Auftraege (Auftrag)* >

<!ELEMENT Auftrag ( Kunde , PC ) >

<!ELEMENT Kunde (#PCDATA) >

<!ELEMENT PC (#PCDATA) >

<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>

<!DOCTYPE Auftraege SYSTEM "xmlbsp2d.dtd">

<?xml-stylesheet type="text/xsl" href="xmlbsp2.xsl"?>

<Auftraege>

<Auftrag>

<Kunde>Meier</Kunde>

<PC>pc500</PC>

</Auftrag>

<Auftrag> ... </Auftrag>

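Annotations: #PCDATA is parsed character data; ( Kunde , PC ) is a sequence in which both elements are required; (Auftrag)* allows arbitrarily many Auftrag elements.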

(17)

Element declarations in DTDs

<!ELEMENT PC (#PCDATA) >           text (no elements)
<!ELEMENT Liefert EMPTY >          empty
<!ELEMENT Angebot (Liefert) >      1 sub-element
<!ELEMENT Angebote (Liefert)* >    arbitrarily many sub-elements
<!ELEMENT Auftrag (Kunde,PC) >     sequence
<!ELEMENT Zahlung (Bar|Karte) >    choice
<!ELEMENT E ((A|B)*,C,(D)?)+ >     parentheses

Occurrence of a sub-element:

? 0 or 1

* arbitrarily many

+ at least 1
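Because a content model built from sequence, choice, and the occurrence indicators is a regular expression over element names, it can be checked with an ordinary regex engine. The following sketch makes that idea concrete. It is not a DTD validator: it ignores #PCDATA, EMPTY, and attributes, and the function names `model_to_regex` and `children_match` are assumptions for this example.

```python
import re
import xml.etree.ElementTree as ET

def model_to_regex(model):
    """Translate a DTD content model over element names into a regex.

    Each element name becomes the token "name;"; the comma (sequence)
    becomes plain concatenation; |, ?, * and + keep their meaning.
    """
    pattern = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: "(?:" + re.escape(m.group()) + ";)",
        model,
    )
    return re.compile(pattern.replace(",", "").replace(" ", ""))

def children_match(element, model):
    """Check the sequence of child element names against the content model."""
    names = "".join(child.tag + ";" for child in element)
    return model_to_regex(model).fullmatch(names) is not None

auftrag_ok = children_match(
    ET.fromstring("<Auftrag><Kunde/><PC/></Auftrag>"), "(Kunde,PC)")
auftrag_bad = children_match(
    ET.fromstring("<Auftrag><PC/><Kunde/></Auftrag>"), "(Kunde,PC)")
e_ok = children_match(
    ET.fromstring("<E><A/><B/><C/><D/><C/></E>"), "((A|B)*,C,(D)?)+")
e_bad = children_match(
    ET.fromstring("<E><A/><B/></E>"), "((A|B)*,C,(D)?)+")
zahlung_bad = children_match(
    ET.fromstring("<Zahlung><Bar/><Karte/></Zahlung>"), "(Bar|Karte)")

print(auftrag_ok, auftrag_bad, e_ok, e_bad, zahlung_bad)
```

The semicolon is used as a separator because it cannot occur in an XML element name, so the tokens cannot run into each other.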

(18)

Attribute declarations in DTDs

<!-- DTD xmlbsp2d.dtd for the example xmlbsp2d.xml -->

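<!ELEMENT Angebote (Liefert)* >
<!ELEMENT Liefert EMPTY >
<!ATTLIST Liefert wer  CDATA #REQUIRED
                  teil CDATA #REQUIRED >

<Angebote>
  <Liefert wer="vobis" teil="pc500">
  </Liefert>
  <Liefert wer="IBM" teil="pc600"/>
</Angebote>

Annotations: wer and teil are attributes; CDATA is their type (char data); #REQUIRED means the attribute must occur; (Liefert)* allows arbitrarily many Liefert elements; Angebote is the root element.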

(19)

Axes in XML document trees

XML document

Axes:

child-axis

/child::doc/child::Kunde/child::Auftrag

abbreviated: /doc/Kunde/Auftrag

attribute-axis

/child::doc/child::Kunde/attribute::name

abbreviated: /doc/Kunde/@name

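[Figure: document tree with root element doc and two Kunde children, each with an Auftrag and an Adresse child; the first Kunde carries the attribute name = "meier".]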
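The two path expressions can be tried out with a small sketch. It assumes Python's standard-library `xml.etree.ElementTree`, which supports only a subset of XPath: paths are evaluated relative to an element (here the doc element), and attributes are read with `.get()` rather than with an `@name` step. The sample document mirrors the tree shown on the slide.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<doc>'
    '<Kunde name="meier"><Auftrag/><Adresse/></Kunde>'
    '<Kunde><Auftrag/><Adresse/></Kunde>'
    '</doc>'
)

# Child axis: /doc/Kunde/Auftrag, evaluated relative to the doc element.
auftraege = doc.findall("Kunde/Auftrag")

# Attribute axis: /doc/Kunde/@name. Kunde[@name] keeps only the Kunde
# elements that have the attribute; .get() then reads its value.
names = [kunde.get("name") for kunde in doc.findall("Kunde[@name]")]

print(len(auftraege), names)
```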

(20)

Axes in XML document trees (2)

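[Figure: document tree with the nodes doc, Kunde, Adresse, Auftrag, PC, Handbuch and the attribute @nr, annotated with the axes parent, ancestor, following, following-sibling, attribute, descendant, child, self, ancestor-or-self, and descendant-or-self.]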

(21)

Axes in XML document trees (3)

<doc>
  <Kunde> … </Kunde>
  <Kunde>
    <name> … </name>
    <Auftrag>
      ...
    </Auftrag>
    <Adresse> …
    </Adresse>
  </Kunde>
  <Kunde> … </Kunde>
</doc>

[Figure: the corresponding tree (doc; Kunde, Kunde, Kunde; name, Auftrag, Adresse), with regions labelled ancestor::, descendant::, following::, preceding::, and self::.]

(22)

Axes in XML document trees (4)

The following axes select, for a given context node:

• child:: its child nodes

• descendant:: its descendants (= children and their descendants)

• parent:: the parent node (only the root does not have a parent)

• ancestor:: the nodes on the path to the root (= parent and its ancestors)

• following-sibling:: the siblings (nodes with the same parent) that follow in document order (empty for attribute and namespace nodes)

• preceding-sibling:: inverse of following-sibling (empty for attribute and namespace nodes)

• following:: all nodes that follow the context node in document order (excluding descendant, attribute, and namespace nodes)

• preceding:: all nodes that precede the context node in document order (excluding ancestor, attribute, and namespace nodes)

• attribute:: its attributes (empty for each non-element node)

• namespace:: its namespace nodes

(23)

Axes in XML document trees (5)

The following axes select, for a given context node:

• self:: the context node itself

• descendant-or-self:: the context node and its descendants

• ancestor-or-self:: the context node and its ancestors

When ignoring attribute nodes and namespace nodes, the following holds for every document node: the axes ancestor::, descendant::, following::, preceding::, and self:: partition the document fully, i.e., the selected node sets do not overlap, and the union of all partitions contains all nodes of the document.
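The partition property can be verified on a small tree. The sketch below assumes Python's standard-library `xml.etree.ElementTree`, which represents element nodes only, so attribute and namespace nodes are ignored automatically. The axis functions are written by hand for this illustration (ElementTree has no built-in support for these axes), and the sample document is a simplified version of the one on slide 21.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<doc><Kunde/><Kunde><name/><Auftrag><PC/></Auftrag><Adresse/></Kunde><Kunde/></doc>"
)

order = list(doc.iter())                          # all elements in document order
parent = {child: p for p in order for child in p}  # child -> parent map

def ancestor(n):
    result = []
    while n in parent:
        n = parent[n]
        result.append(n)
    return result

def descendant(n):
    return list(n.iter())[1:]          # iter() yields n itself first

def following(n):
    skip = set(descendant(n))
    return [m for m in order[order.index(n) + 1:] if m not in skip]

def preceding(n):
    skip = set(ancestor(n))
    return [m for m in order[:order.index(n)] if m not in skip]

def is_partition(n):
    """True if the five axes are disjoint and together cover every element."""
    parts = [ancestor(n), descendant(n), following(n), preceding(n), [n]]
    total = sum(len(p) for p in parts)
    union = set().union(*map(set, parts))
    return total == len(order) and union == set(order)

all_ok = all(is_partition(n) for n in order)
auftrag = doc.find("Kunde/Auftrag")
print(all_ok, [m.tag for m in following(auftrag)], [m.tag for m in preceding(auftrag)])
```

If the sizes of the five sets add up to the number of elements and their union is the whole document, no element can be in two sets at once, which is exactly the partition claim.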

(24)

XML Schema example (1)

<xsd:element name="address" …>
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="fullname" maxOccurs="1">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="firstname"/>
            <xsd:element name="lastname"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:choice>
        <xsd:element name="street"/>
        <xsd:element name="POB"/>
      </xsd:choice>
    </xsd:sequence>
  </xsd:complexType>

(25)

XML Schema example (2)

<xsd:element name="shipTo" type="CoAddress"/>

<xsd:complexType name="Address">
  <xsd:sequence>
    <xsd:element name="fullname"/>
    <xsd:element name="street"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="CoAddress">
  <xsd:complexContent>
    <xsd:extension base="Address">
      <xsd:sequence>
        <xsd:element name="countrycode"/>
      </xsd:sequence>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>

</xsd:element>

(26)

XML summary

• XML: tree structure for content

• DTD: structure definition

• XML Schema additionally: type checking and logic consistency checking

well documented standards
