
XML_1.ppt


Academic year: 2020


(1)

Service-Oriented Computing

(2)

What is Markup (encoding)?

Markup is text that is added to the data of a document in order to convey information about it.

The markup instructions are often called "tags."

Example:

<centre on> This is a <italics on> very serious <italics off> matter.<centre off>

This is a very serious matter.

Historically, the word markup has been used to describe annotation within a text intended to instruct a typist how a particular passage should be printed or laid out.

(3)

What is XML?

XML stands for eXtensible Markup Language

XML is a markup language for documents containing structured information

XML is used for describing other languages (e.g., WSDL, RDF, WML)

XML is used for data interchange

XML DTD and XML Schema define rules to describe data

An open W3C standard

A subset of SGML

(4)

What is XML?

XML is a “use everywhere” data specification

[Diagram: XML as the common format exchanged between Documents, Configuration, Database, Application X, and Repository]

(5)

Documents vs. Data

XML is used to represent two main types of things:

– Documents

Lots of text with tags to identify and annotate portions of the document

– Data

(6)

XML and Structured Data

Pre-XML representation of data:

"PO-1234","CUST001","X9876","5","14.98"

XML representation of the same data:

<PURCHASE_ORDER>
<PO_NUM> PO-1234 </PO_NUM>
<CUST_ID> CUST001 </CUST_ID>
<ITEM_NUM> X9876 </ITEM_NUM>
<QUANTITY> 5 </QUANTITY>
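The step from the comma-separated record to the tagged record can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the first four element names are taken from the slide, while PRICE is a hypothetical name for the fifth column, which the slide does not show.

```python
import csv
import io
import xml.etree.ElementTree as ET

# The first four names appear on the slide; "PRICE" is an assumed name for
# the fifth column, which the slide does not show.
FIELDS = ["PO_NUM", "CUST_ID", "ITEM_NUM", "QUANTITY", "PRICE"]

def csv_row_to_xml(line: str) -> ET.Element:
    """Turn one comma-separated record into a PURCHASE_ORDER element."""
    row = next(csv.reader(io.StringIO(line)))
    order = ET.Element("PURCHASE_ORDER")
    for name, value in zip(FIELDS, row):
        ET.SubElement(order, name).text = value
    return order

order = csv_row_to_xml('"PO-1234","CUST001","X9876","5","14.98"')
print(ET.tostring(order, encoding="unicode"))
```

Unlike the positional record, each value in the result carries its own label, so a reader does not need to know the column order.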

(7)

Benefits of XML

Open W3C standard

Representation of data across heterogeneous environments

– Cross platform

– Allows for high degree of interoperability

Strict rules

– Syntax

– Structure

(8)
(9)


HTML and XML

XML is not a replacement for HTML.

HTML XML

HTML was designed to display data, with focus on how data looks

XML was designed to transport and store data, with focus on what data is

HTML is used to mark up text so it can be displayed to users

XML is used to mark up data so it can be processed by computers

HTML describes both structure (e.g. <p>, <h2>, <em>) and appearance (e.g. <br>, <font>, <i>)

XML describes only content, or “meaning”

HTML uses a fixed, unchangeable set of tags

(10)


HTML and XML

HTML and XML look similar, because they are both SGML languages (SGML = Standard Generalized Markup Language)

– Both HTML and XML use elements enclosed in tags (e.g. <body>This is an element</body>)

– Both use tag attributes (e.g., <font face="Verdana" size="+1" color="red">)

(11)


HTML and XML

HTML is for humans

– HTML describes web pages

– You don’t want to see error messages about the web pages you visit

– Browsers ignore and/or correct as many HTML errors as they can, so HTML is often sloppy

XML is for computers

– XML describes data

– The rules are strict and errors are not allowed

– In this way, XML is like a programming language

– Current versions of most browsers can display XML

– However, browser support of XML is spotty at best

(12)


HTML vs. XML

<h1> Bibliography </h1>
<p> <i> Foundations of DBs </i>, Abiteboul, Hull, Vianu
<br> Addison-Wesley, 1995
<p> <i> Logics for DBs and ISs </i>, Chomicki, Saake, eds.
<br> Kluwer, 1998

<bibliography>
<book> <title> Foundations of DBs </title>
<author> Abiteboul </author> <author> Hull </author> <author> Vianu </author>
<publisher> Addison-Wesley </publisher> ...
</book>
<book> ... <editor> Chomicki </editor> ... </book>
...
</bibliography>

HTML tags: presentation, generic document structure

XML tags:

(13)


XML-related technologies

DTD (Document Type Definition) and XML Schema are used to define legal XML tags and their attributes for particular purposes

CSS (Cascading Style Sheets) describe how to display HTML or XML in a browser

XSLT (eXtensible Stylesheet Language Transformations) and XPath are used to translate from one form of XML to another

(14)

Layout of a typical XML document


(15)

Components of an XML Document

Processing instructions (prolog)

– Encoding specification (Unicode by default)

– Namespace declaration

– Schema declaration

Elements

– Each element has a beginning and ending tag

<TAG_NAME>...</TAG_NAME>

– Elements can be empty (<TAG_NAME />)

Attributes

(16)

The Prolog (processing instructions)/XML declaration

It tells the browser or parser that this document is marked up in XML. A prolog is actually a part of HTML as well, but most HTML authors leave it out. In HTML the prolog might look like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

This tells the browser that this document will be using HTML 4.0 Transitional.

But the prolog for an XML document can also contain:

– the DTD or schema being used

– declarations of special pieces of text

– text encoding

(17)


The Prolog (processing instructions)/XML declaration

The XML declaration statement is used to indicate that the specified document is an XML document.

The XML declaration looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

The XML declaration is optional in XML 1.0, but including it is recommended (so include it!)

The XML declaration starts with <?xml (no space after <?) and ends with ?>

version is required whenever the declaration is present; "1.0" is by far the most widely used value (XML 1.1 exists but is rarely used)

encoding can be "UTF-8" or "UTF-16" (both are encodings of Unicode; UTF-8 is a superset of ASCII), or something else, or it can be omitted, in which case the parser assumes UTF-8 or UTF-16
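The effect of the encoding attribute can be seen with a short Python sketch using the standard library parser. The greeting element and its text are made up for illustration; the point is that the declaration tells the parser how to decode the bytes, and that UTF-8 is assumed when there is no declaration.

```python
import xml.etree.ElementTree as ET

# Bytes encoded as ISO-8859-1; the declaration tells the parser how to decode them.
latin1_doc = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><greeting>café</greeting>'
).encode("iso-8859-1")

# No declaration at all: the parser falls back to UTF-8.
utf8_doc = "<greeting>café</greeting>".encode("utf-8")

latin1_text = ET.fromstring(latin1_doc).text
utf8_text = ET.fromstring(utf8_doc).text
print(latin1_text == utf8_text)
```

The two byte strings differ, yet both parse to the same text because each is decoded according to its declared (or default) encoding.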

(18)


Elements and attributes

An XML element is the basic syntactic construct of an XML document

Attributes and elements are somewhat interchangeable

Example using just elements:

<name>
<first>David</first>
<last>Matuszek</last>
</name>

Example using attributes:

<name first="David" last="Matuszek"></name>
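As a rough illustration of how the two styles differ for a program reading them, the following Python sketch (standard library only) parses both forms of the name example and extracts the same two values from each.

```python
import xml.etree.ElementTree as ET

as_elements = ET.fromstring(
    "<name><first>David</first><last>Matuszek</last></name>"
)
as_attributes = ET.fromstring('<name first="David" last="Matuszek"></name>')

# Child elements are reached with find(); attributes with get().
from_elements = (as_elements.find("first").text, as_elements.find("last").text)
from_attributes = (as_attributes.get("first"), as_attributes.get("last"))

print(from_elements == from_attributes)
```

Both forms carry the same information here; child elements can additionally repeat and nest, which attributes cannot.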

(19)


Entities

Five special characters must be written as entities:

&amp; for & (almost always necessary)

&lt; for < (almost always necessary)

&gt; for > (not usually necessary)

&quot; for " (necessary inside double quotes)

&apos; for ' (necessary inside single quotes)

These entities can be used even in places where they are not absolutely required
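In practice the escaping is usually left to a library. A minimal Python sketch, assuming the standard xml.sax.saxutils helpers are acceptable for the job; the note element and the sample strings are invented for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

raw = "AT&T says 1 < 2"
escaped = escape(raw)                      # replaces &, < and >
element_xml = f"<note>{escaped}</note>"

title = 'He said "hi"'
# quoteattr() escapes the value and adds the surrounding quotes itself.
attr_xml = f"<note title={quoteattr(title)}/>"

# Parsing turns the entities back into the original characters.
round_trip_text = ET.fromstring(element_xml).text
round_trip_title = ET.fromstring(attr_xml).get("title")
print(escaped)
```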

(20)


CDATA

By default, all text inside an XML document is parsed.

The term CDATA is used for text data that should not be parsed by the XML parser.

You can force text to be treated as unparsed character data by enclosing it in <![CDATA[ ... ]]>

Any characters, even & and <, can occur inside a CDATA section; the only sequence that cannot is ]]>, which ends the section

Whitespace inside a CDATA section is (usually) preserved

CDATA is useful when your text has a lot of illegal characters (for example, if your XML document contains some HTML text)

(21)

CDATA Example

Some text, like JavaScript code, contains a lot of "<" or "&" characters. To avoid errors, script code can be defined as CDATA.

A CDATA section starts with "<![CDATA[" and ends with "]]>":

<script>
<![CDATA[
function matchwo(a,b) {
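A small Python sketch shows the difference a CDATA section makes to a parser. The script body below is a made-up one-liner, not the function from the slide.

```python
import xml.etree.ElementTree as ET

code = "if (a < b && b > 0) { return 1; }"

# Wrapped in CDATA, the < and & characters are taken literally.
wrapped = ET.fromstring(f"<script><![CDATA[{code}]]></script>")
print(wrapped.text)

# Without CDATA, the same characters make the document ill-formed.
try:
    ET.fromstring(f"<script>{code}</script>")
    unwrapped_ok = True
except ET.ParseError:
    unwrapped_ok = False
print(unwrapped_ok)
```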

(22)


Comments

<!-- This is a comment in both HTML and XML -->

Comments are useful for:

– Explaining the structure of an XML document

– Commenting out parts of the XML during development and testing

Comments are not elements and do not have an end tag

The blanks after <!-- and before --> are optional

The character sequence -- cannot occur in the comment

The closing bracket must be -->

(23)

Well-formed vs. Valid

XML must be well-formed (An XML document that obeys the syntax rules is said to be well-formed)

– correct syntax

– tags match, tags nest, all characters legal

– parser must reject if not well-formed

XML may be valid with respect to a Schema (A well-formed XML document that conforms to its schema is said to be valid.)
(24)


Rules for Well-formed XML (XML Syntax)

Every XML document must have one and only one root element

Every element must have both a start tag and an end tag, e.g. <name> ... </name>

– But empty elements can be abbreviated: <break />.

– XML tags may not begin with the letters xml, in any combination of cases

Elements must be properly nested, e.g. not <b><i>bold and italic</b></i>

Attribute values must be enclosed in double quotes ("") or single quotes ('')

Processing instructions are optional

XML is case-sensitive
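Several of these rules can be checked by simply handing a document to a parser, since a conforming parser must reject anything that is not well-formed. A minimal Python sketch with the standard library; is_well_formed is a helper name introduced here for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as a complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

results = {
    "one root, matching tags": is_well_formed("<name>David</name>"),
    "abbreviated empty element": is_well_formed("<break />"),
    "two root elements": is_well_formed("<a/><b/>"),
    "bad nesting": is_well_formed("<b><i>bold and italic</b></i>"),
    "case mismatch": is_well_formed("<Name>David</name>"),
    "unquoted attribute": is_well_formed("<name first=David />"),
}

for rule, ok in results.items():
    print(f"{rule}: {ok}")
```

This checks well-formedness only; it says nothing about whether a document is valid against a DTD or schema.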

(25)

Namespaces: Overview

An XML namespace is a collection of names, identified by a URI reference, which are used in XML documents as element types and attribute names.

Allow authors to differentiate between tags of the same name (using a prefix)

– Frees author to focus on the data and decide how to best describe it

– Allows multiple XML documents from multiple authors to be merged

Identified by a URI (Uniform Resource Identifier)

– When a URL is used, it does NOT have to represent a live server

– To guarantee uniqueness, typically a URI (Uniform Resource

(26)

Namespaces: Declaration

Namespace declaration examples:

xmlns:bk="urn:mybookstuff.org:bookinfo"

xmlns:bk="http://www.example.com/bookinfo/"

Each declaration has three parts: the namespace declaration keyword (xmlns), the prefix (bk), and the URI (URL)

There are two ways to use namespaces:

– Declare a default namespace

(27)

Namespaces: Examples

<BOOK xmlns:bk="http://www.bookstuff.org/bookinfo">
<bk:TITLE>All About XML</bk:TITLE>
<bk:AUTHOR>Joe Developer</bk:AUTHOR>
<bk:PRICE currency='US Dollar'>19.99</bk:PRICE>

<bk:BOOK xmlns:bk="http://www.bookstuff.org/bookinfo" xmlns:money="urn:finance:money">
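To a parser, a prefix is only shorthand for its URI. The Python sketch below parses a completed version of the second example; the slide is cut off after the opening tag, so the child elements, the money:currency attribute, and the closing tag are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

BK = "http://www.bookstuff.org/bookinfo"
MONEY = "urn:finance:money"

# Children, the money:currency attribute and the end tag are assumed;
# the slide only shows the opening <bk:BOOK ...> tag.
doc = f"""<bk:BOOK xmlns:bk="{BK}" xmlns:money="{MONEY}">
  <bk:TITLE>All About XML</bk:TITLE>
  <bk:PRICE money:currency="US Dollar">19.99</bk:PRICE>
</bk:BOOK>"""

root = ET.fromstring(doc)

# ElementTree reports names as {namespace-URI}local-name, not as prefix:name.
print(root.tag)

# Any prefix can be used for lookups, as long as it maps to the same URI.
title = root.find("b:TITLE", {"b": BK}).text
price = root.find("b:PRICE", {"b": BK})
currency = price.get(f"{{{MONEY}}}currency")
print(title, currency)
```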

(28)

Namespaces: Default Namespace

An XML namespace declared without a prefix becomes the default namespace for all sub-elements

All elements without a prefix will belong to the default namespace:

<BOOK xmlns="http://www.bookstuff.org/bookinfo">
<TITLE>All About XML</TITLE>

(29)

Namespaces: Scope

Unqualified elements belong to the inner-most default namespace.

– BOOK, TITLE, and AUTHOR belong to the default book namespace

– PUBLISHER and NAME belong to the default publisher namespace

<BOOK xmlns="www.bookstuff.org/bookinfo">
<TITLE>All About XML</TITLE>
<AUTHOR>Joe Developer</AUTHOR>
<PUBLISHER xmlns="urn:publishers:publinfo">
<NAME>Microsoft Press</NAME>
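The scoping rule can be confirmed by parsing the example and listing the namespace each element ends up in. In this Python sketch the closing tags are assumed, since the slide is cut off, and split_tag is a helper introduced here.

```python
import xml.etree.ElementTree as ET

# Closing tags are assumed; the slide shows only the opening part.
doc = """<BOOK xmlns="www.bookstuff.org/bookinfo">
  <TITLE>All About XML</TITLE>
  <AUTHOR>Joe Developer</AUTHOR>
  <PUBLISHER xmlns="urn:publishers:publinfo">
    <NAME>Microsoft Press</NAME>
  </PUBLISHER>
</BOOK>"""

def split_tag(tag: str) -> tuple[str, str]:
    """Split ElementTree's '{uri}local' form into (local, uri)."""
    uri, _, local = tag[1:].partition("}")
    return local, uri

namespace_of = dict(split_tag(el.tag) for el in ET.fromstring(doc).iter())

for local, uri in namespace_of.items():
    print(f"{local}: {uri}")
```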

(30)

Namespaces: Attributes

Unqualified attributes do NOT belong to any namespace

– Even if there is a default namespace
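A short Python sketch makes the asymmetry visible: under a default namespace the element name is qualified, but the unprefixed attribute name is not. The BOOK and PRICE elements are borrowed from the earlier examples and combined here for illustration.

```python
import xml.etree.ElementTree as ET

doc = (
    '<BOOK xmlns="http://www.bookstuff.org/bookinfo">'
    '<PRICE currency="US Dollar">19.99</PRICE>'
    "</BOOK>"
)

price = ET.fromstring(doc)[0]

# The element picks up the default namespace ...
print(price.tag)
# ... but the unprefixed attribute stays outside any namespace.
print(price.attrib)
```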
(31)

Valid XML

XML is valid if it declares a DTD/XSD Schema and conforms to that schema

Schemas: Overview

– DTD (Document Type Definitions)

Not written in XML

No support for data types or namespaces

– XSD (XML Schema Definition)

Written in XML

Supports data types

(32)

Schemas: Purpose

Define the “rules” (grammar) of the document

– Data types

– Value bounds

An XML document that conforms to a schema is said to be valid

– More restrictive than well-formed XML

Define which elements are present and in what order

Define the structural relationships of elements

(33)

What is a DTD?

A DTD (Document Type Definition) defines the structure of a “valid” XML document

An XML document may have an optional DTD.

Only the elements defined in a DTD can be used in an XML document that is valid against it

A DTD can be

internal: The DTD is part of the document file

external: The DTD and the document are in separate files. An external DTD may reside

(34)

Connecting a Document with its DTD

An internal DTD:

<?xml version="1.0"?>
<!DOCTYPE db [<!ELEMENT ...> … ]>
<db> ... </db>

A DTD from the local file system:

<!DOCTYPE db SYSTEM "schema.dtd">

A DTD from a remote file system:

(35)

DTD

An internal DTD

<!DOCTYPE invoice [
<!ELEMENT invoice (sku, qty, desc, price) >
<!ELEMENT sku (#PCDATA) >
<!ELEMENT qty (#PCDATA) >
<!ELEMENT desc (#PCDATA) >
<!ELEMENT price (#PCDATA) >
]>

<invoice>
<sku>12345</sku>
<qty>55</qty>
<desc>Left handed monkey wrench</desc>
<price>14.95</price>
</invoice>

(36)

DTD

A referenced external DTD

<?xml version="1.0"?>
<!DOCTYPE invoice SYSTEM "invoice.dtd">

<invoice>
<sku>12345</sku>
<qty>55</qty>
<desc>Left handed monkey wrench</desc>

(37)

DTD

An external DTD (invoice.dtd)

<?xml version="1.0"?>
<!ELEMENT invoice (sku, qty, desc, price) >
<!ELEMENT sku (#PCDATA) >
<!ELEMENT qty (#PCDATA) >
<!ELEMENT desc (#PCDATA) >
<!ELEMENT price (#PCDATA) >

(38)

DTD

Data Types

Parsed Character Data

– #PCDATA

<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>

Unparsed Character Data

– CDATA

<firstname><![CDATA[<b>Jim</b>]]></firstname>

(39)

DTD

XML Document

<db>
<person>
<name>Alan</name>
<age>42</age>
<email>[email protected] </email>
</person>
<person>………</person>
……….
</db>

(40)

DTD

<!DOCTYPE db [
<!ELEMENT db (person*)>
<!ELEMENT person (name, age, email)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
<!ELEMENT email (#PCDATA)>
]>

(41)

DTD

Occurrence Indicator:

Indicator        Occurrence
(no indicator)   Required                One and only one
?                Optional                None or one
*                Optional, repeatable    None, one, or more
+                Required, repeatable    One or more
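The occurrence indicators behave like the quantifiers of the same name in regular expressions. The Python sketch below uses that similarity to check an element's children against a flat content model such as (person*) or (name, age, email). It is a hand-written stand-in for illustration, not a DTD validator: choices, nested groups, and #PCDATA are not handled, and model_to_regex and children_match are names invented here.

```python
import re
import xml.etree.ElementTree as ET

def model_to_regex(model: str) -> re.Pattern:
    """Translate a flat model like "(name, age?, email+)" into a regex.

    Only comma-separated element names with an optional ?, * or + are
    supported; anything else is outside this sketch.
    """
    parts = []
    for item in model.strip("() ").split(","):
        item = item.strip()
        indicator = item[-1] if item[-1] in "?*+" else ""
        name = item.rstrip("?*+")
        parts.append(f"(?:{re.escape(name)} ){indicator}")
    return re.compile("".join(parts))

def children_match(element: ET.Element, model: str) -> bool:
    """Check the sequence of child element names against the model."""
    names = "".join(child.tag + " " for child in element)
    return model_to_regex(model).fullmatch(names) is not None

# The e-mail address is a placeholder value.
db = ET.fromstring(
    "<db><person><name>Alan</name><age>42</age>"
    "<email>alan@example.org</email></person></db>"
)
person = db[0]
empty_db = ET.fromstring("<db></db>")

print(children_match(db, "(person*)"))
print(children_match(person, "(name, age, email)"))
print(children_match(empty_db, "(person+)"))
```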

(42)

Why You Should Use XSD

Newest W3C Standard

Broad support for data types

Reusable “components”

– Simple data types

– Complex data types

Extensible

Inheritance support

Namespace support

(43)

XML Schema – Better than DTDs

The purpose of a Schema is to define the legal building

blocks of an XML document, just like a DTD.

XML Schemas

– are easier to learn than DTDs

– are extensible to future additions

– are richer and more useful than DTDs

– are written in XML

(44)

A Simple XML Document

Look at this simple XML document called "note.xml":

<?xml version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>

(45)

A DTD File

The following example is a DTD file called "note.dtd" that defines the elements of the XML document ("note.xml"):

<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>

(46)

An XSD

The following example is an XML Schema file called "note.xsd" that defines the elements of the XML document ("note.xml"):

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

(47)

A Reference to a DTD

This XML document has a reference to a DTD:

<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "http://www.w3schools.com/dtd/note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>

(48)

A Reference to an XSD

This XML document has a reference to an XML Schema:

<?xml version="1.0"?>
<note
xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com note.xsd">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
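The xsi:schemaLocation value is a whitespace-separated list of pairs: a namespace URI followed by the location of the schema for that namespace. A minimal Python sketch that reads those pairs from a shortened copy of the document; the closing tag is assumed because the slide is cut off. The sketch only reads the hint and does not validate against the schema.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Shortened copy of the slide's document; the end tag is assumed.
doc = """<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
</note>"""

root = ET.fromstring(doc)

# Tokens alternate: namespace URI, schema location, namespace URI, ...
tokens = root.get(f"{{{XSI}}}schemaLocation").split()
schema_for = dict(zip(tokens[0::2], tokens[1::2]))
print(schema_for)
```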

(49)


Another well-structured example

<novel>

<foreword>

<paragraph> This is the great American novel. </paragraph>

</foreword>

<chapter number="1">

<paragraph>It was a dark and stormy night. </paragraph>

<paragraph>Suddenly, a shot rang out! </paragraph>

(50)


XML as a tree

An XML document represents a hierarchy; a hierarchy is a tree

novel
  foreword
    paragraph: This is the great American novel.
  chapter (number="1")
    paragraph: It was a dark and stormy night.
    paragraph
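The tree view can be reproduced by walking a parsed document recursively. This Python sketch uses the novel example from the previous slide; the closing tags are assumed because that slide is cut off, and outline is a helper introduced here.

```python
import xml.etree.ElementTree as ET

# Closing tags are assumed; the source example ends after the third paragraph.
doc = """<novel>
  <foreword>
    <paragraph>This is the great American novel.</paragraph>
  </foreword>
  <chapter number="1">
    <paragraph>It was a dark and stormy night.</paragraph>
    <paragraph>Suddenly, a shot rang out!</paragraph>
  </chapter>
</novel>"""

def outline(element: ET.Element, depth: int = 0) -> list[str]:
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(ET.fromstring(doc))
print("\n".join(tree_lines))
```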

(51)

Displaying XML

XML documents do not carry information about how to display the data

We can add display information to XML with

– CSS (Cascading Style Sheets)

– XSL (eXtensible Stylesheet Language) (preferred)

[Diagram: XSLT takes XML as input and produces XML, HTML, ...]

(52)

XML Applications

Computer-computer communications.

Enterprise Application Integration.

Content Management Systems.

Wireless Communication Systems.

PDAs and Handheld Devices.

eLearning and Educational Services.

Web Services.
