CSE 3241: XML Extensible Markup Language (Ch. 12)


Academic year: 2021


CSE 3241: XML

Extensible Markup Language

(Ch. 12)


(2)

Topics

 Structured, Semistructured, and Unstructured Data

 XML Hierarchical (Tree) Data Model

 XML Documents

 DTD (Document Type Definition)

 XML Schema

 Storing and Extracting XML Documents from Databases

 XML Languages

(3)

Structured, Semistructured, and Unstructured Data

Structured data

◦ Represented in a strict format

◦ Example: information stored in databases

Semistructured data

◦ Has a certain structure

◦ Not all information collected will

have identical structure

(4)

Structured, Semistructured,

and Unstructured Data (cont’d.)

Self-describing data

 Schema information mixed in with data values

 May be displayed as a directed graph

Labels or tags on directed edges represent:

 Schema names

 Names of attributes

 Object types (or entity types or classes)

 Relationships

(5)

Unstructured Data

 Limited indication of the type of data

◦ Example: a text document that contains information embedded within it

HTML documents

◦ Do not include schema information about type of data

Static HTML page

◦ All information to be displayed

explicitly spelled out as fixed text in

HTML file

(6)

Unstructured Data

 HTML uses a large number of predefined tags

Tag

 Text that appears between angled brackets:

<...>

End tag

 Tag with a slash: </...>

(7)

[Figure: Projects, containing Proj X and Proj Y, each with its Workers]

(8)

Semistructured Data

(9)

SemiStructured Data: XML

Data sources

◦ Database storing data for Internet applications

Hypertext documents

◦ Common method of specifying

contents and formatting of Web

pages

(10)

What is XML?

 XML – The eXtensible Markup Language

 What’s a Markup Language?

◦ Language used to annotate a document for some purpose

Uses tags that are distinguished from the content of the document to provide that annotation

◦ HTML (HyperText Markup Language) and LaTeX

Both are examples of document publishing languages

Tags used to indicate formatting

 Tags follow a defined structure to keep them separate from the content of the document

(11)

What is XML?

 XML provides a framework to define a structure for data

An XML document is a collection of related data items

Document is “marked up” with tags known as elements

 Elements are used to provide structure to the data


(12)

XML Hierarchical (Tree) Data Model

Elements and attributes

◦ Main structuring concepts used to construct an XML document

Complex elements

◦ Constructed from other elements hierarchically

Simple elements

◦ Contain data values

 XML tag names

◦ Describe the meaning of the data

elements in the document

(13)

XML Hierarchical (Tree) Data Model (cont’d.)

 XML attributes

◦ Describe properties and

characteristics of the elements (tags) within which they appear

May reference another element in another part of the XML

document

◦ Common to use attribute values in

one element as the references

(14)

The XML Data Model

 Attributes vs. elements

◦ Data can be stored as the contents of an element OR as an attribute of an element

<?xml version="1.0" standalone="yes"?>
<Projects>
  <Project number="1">
    <Name>Product X</Name>
    <Location>Bellaire</Location>
    <Dept_no>5</Dept_no>
  </Project>
</Projects>

Why pick one over the other?

Best practice:

◦ Attributes describe/modify the element

◦ Elements hold the actual data values

Much like in HTML:

◦ Element (tag) contents are the data to be displayed

◦ Attributes (generally) modify/describe how it is to be displayed
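The difference between the two storage choices also shows up when reading the document. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the variable names are illustrative and not part of the slides.

```python
import xml.etree.ElementTree as ET

doc = """<Projects>
  <Project number="1">
    <Name>Product X</Name>
    <Location>Bellaire</Location>
    <Dept_no>5</Dept_no>
  </Project>
</Projects>"""

root = ET.fromstring(doc)
project = root.find("Project")

# Attribute: describes the element and is read from its start tag.
project_number = project.get("number")

# Element content: the data value itself, read from the child element's text.
project_name = project.findtext("Name")

print(project_number, project_name)
```

Both values come back as strings; the parser attaches no other type to them.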

(15)
(16)

What does XML have to do with databases?

 Recall: What is a database?

A logically coherent collection of data with some specific meaning that has been

designed for a specific purpose.

 Structured and semi-structured data files vs.

database?

More practically, XML is used as a data exchange framework

 Moving data from one application to another, from one database to another

 Taking data from a database and turning it into a website, a report, or other human readable

document

◦ Even some implementations of “XML native” DBs

 XML as the “back end” storage instead of relations

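As a rough illustration of XML as a data exchange format, the sketch below turns rows into a Projects document with Python's standard `xml.etree.ElementTree` module. The `rows` list, the second project's values, and the `rows_to_xml` helper are hypothetical and stand in for the result of a real database query.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows, standing in for the result of a relational query.
rows = [
    {"number": "1", "name": "Product X", "location": "Bellaire", "dept_no": "5"},
    {"number": "2", "name": "Product Y", "location": "Sugarland", "dept_no": "5"},
]

def rows_to_xml(rows):
    """Build a <Projects> document with one <Project> element per row."""
    root = ET.Element("Projects")
    for row in rows:
        # The key-like value becomes an attribute; the data values become child elements.
        project = ET.SubElement(root, "Project", number=row["number"])
        ET.SubElement(project, "Name").text = row["name"]
        ET.SubElement(project, "Location").text = row["location"]
        ET.SubElement(project, "Dept_no").text = row["dept_no"]
    return ET.tostring(root, encoding="unicode")

xml_text = rows_to_xml(rows)
print(xml_text)
```

The receiving application only needs an XML parser and knowledge of the element names to recover the same rows.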

(17)

The XML Data Model

XML uses a hierarchical model

◦ Also known as a tree model

◦ Documents can be represented as trees

Each simple element contains one data value

◦ Leaves of the tree

Complex elements can contain multiple child elements

◦ Internal nodes of the tree

Each complex element can belong to one complex parent element

◦ Parent node of the tree

One root element contains everything else

◦ Root of the tree
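The mapping from elements to tree nodes can be sketched with a short recursive walk. This is an illustrative example using Python's standard `xml.etree.ElementTree` module; the `describe` function is a hypothetical helper, not something defined in the course material.

```python
import xml.etree.ElementTree as ET

doc = """<Projects>
  <Project number="1">
    <Name>Product X</Name>
    <Location>Bellaire</Location>
    <Dept_no>5</Dept_no>
    <Workers>
      <Worker>
        <Ssn>123456789</Ssn>
        <Last_name>Smith</Last_name>
        <Hours>32.5</Hours>
      </Worker>
    </Workers>
  </Project>
</Projects>"""

def describe(element, depth=0):
    """Return one line per element, indented by depth and labelled by its tree role."""
    children = list(element)
    if depth == 0:
        role = "root"
    elif children:
        role = "complex (internal node)"
    else:
        role = "simple (leaf)"
    lines = [f"{'  ' * depth}{element.tag}: {role}"]
    for child in children:
        lines.extend(describe(child, depth + 1))
    return lines

lines = describe(ET.fromstring(doc))
print("\n".join(lines))
```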

(18)

A sample XML tree

Internal nodes are complex elements

Leaf nodes are simple elements

The root node is the root element

◦ The root element contains all other elements within it

[Figure: tree of the sample document]

Projects
  Project (number="1")
    Name: "Product X"
    Location: "Bellaire"
    Dept_no: "5"
    Workers
      Worker
        Ssn: "123456789"
        Last_name: "Smith"
        Hours: "32.5"
      Worker
        Ssn: "453453453"
        Hours: "15.5"
  Project
  Project

(19)

A sample XML tree

<?xml version="1.0" standalone="yes"?>
<Projects>
  <Project number="1">
    <Name>Product X</Name>
    <Location>Bellaire</Location>
    <Dept_no>5</Dept_no>
    <Workers>
      <Worker>
        <Ssn>123456789</Ssn>
        <Last_name>Smith</Last_name>
        <Hours>32.5</Hours>
      </Worker>
      <Worker>
        <Ssn>453453453</Ssn>
        <Hours>15.5</Hours>
      </Worker>
    </Workers>
  </Project>
</Projects>

(20)

A sample of XML

<?xml version="1.0" standalone="yes"?>
<Projects>
  <Project number="1">
    <Name>Product X</Name>
    <Location>Bellaire</Location>
    <Dept_no>5</Dept_no>
    <Workers>
      <Worker>
        <Ssn>123456789</Ssn>
        <Last_name>Smith</Last_name>
        <Hours>32.5</Hours>
      </Worker>
      <Worker>
        <Ssn>453453453</Ssn>
        <Hours>15.5</Hours>
      </Worker>
    </Workers>
  </Project>
  ….
</Projects>

(21)

A sample of XML

XML declaration: <?xml version="1.0" standalone="yes"?>

(22)

A sample of XML

Root element: <Projects>

(23)

A sample of XML

Beginning of root element: <Projects>

End of root element: </Projects>

(24)

A sample of XML

First child element of root: <Project number="1"> through </Project>

(Other child elements are possible in here; they do not even need to be "Project" elements, necessarily)

(25)

A sample of XML

The first Project element has an attribute named number with a value of "1": <Project number="1">

(26)

A sample of XML

First child element of the Project element where number="1"

Simple element with a name of "Name" and a value of "Product X": <Name>Product X</Name>

(27)

A sample of XML

Second child element of the Project element where number="1"

Simple element with a name of "Location" and a value of "Bellaire": <Location>Bellaire</Location>

(28)

A sample of XML

Third child element of the Project element where number="1"

Simple element with a name of "Dept_no" and a value of "5": <Dept_no>5</Dept_no>

(29)

A sample of XML

Fourth child element of the Project element where number="1"

Complex element with a name of "Workers": <Workers> through </Workers>

(30)

A sample of XML

First child element of Projects/Project[@number="1"]/Workers

Complex element with a name of "Worker": <Worker> through </Worker>

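The path notation used on the last slide can be tried out directly. The sketch below assumes Python's standard `xml.etree.ElementTree` module, which supports a limited subset of XPath; because the search starts at the root element, the leading "Projects/" step is dropped from the path.

```python
import xml.etree.ElementTree as ET

doc = """<Projects>
  <Project number="1">
    <Name>Product X</Name>
    <Workers>
      <Worker>
        <Ssn>123456789</Ssn>
        <Last_name>Smith</Last_name>
        <Hours>32.5</Hours>
      </Worker>
      <Worker>
        <Ssn>453453453</Ssn>
        <Hours>15.5</Hours>
      </Worker>
    </Workers>
  </Project>
</Projects>"""

root = ET.fromstring(doc)  # root is the <Projects> element

# Select the Workers element of the Project whose number attribute is "1".
workers = root.find("Project[@number='1']/Workers")

# Child elements keep document order, so index 0 is the first Worker.
first_worker = workers[0]
ssns = [worker.findtext("Ssn") for worker in workers.findall("Worker")]

print(first_worker.findtext("Last_name"), ssns)
```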
(31)

XML Hierarchical (Tree) Data Model (cont’d.)

Tree model or hierarchical model

 Main types of XML documents

Data-centric XML documents

Document-centric XML documents

Hybrid XML documents

Schemaless XML documents

◦ Do not follow a predefined schema of element names and corresponding

tree structure

(32)

XML Document Types – Data Centric XML

 Data-centric XML

◦ Highly structured

◦ Many small data items

Often used for data exchange purposes

 Transfer data from one system to another

◦ Also used to create web pages dynamically from databases

Generally follow a schema document that determines their structure


(33)

XML Document Types – Document-Centric XML

 Few structural elements

 Large amounts of text

◦ Articles, blog entries, books

 May have a schema document, but not required

◦ Schema may be very limited in semantics

 What’s a title?

 What’s a chapter?

 What’s a paragraph?


(34)

More XML Document Types

 Hybrid XML

◦ Some parts are highly structured

◦ Some parts mostly blocks of text and/or unstructured

◦ May or may not have a predefined schema

Schemaless XML documents

◦ Semi-structured documents without a predefined schema

◦ Denoted by the attribute

‘standalone=“yes”’ in the XML declaration on the top line


(35)

Valid XML

 An XML document is considered valid if:

◦ It is well-formed

◦ And…


To be continued after this definition…

(36)

Well-formed XML

An XML document is well-formed when it follows certain conditions:

◦ It must start with an XML declaration line:

<?xml version=“1.0” standalone=“yes”?>

◦ It must form a tree:

 Must start with a single root element

 Every child element must have start and end tags that are contained completely within a parent

element:

Good              Bad

<parent>          <parent>
  <child>           <child>
  </child>        </parent>
</parent>           </child>
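The nesting rule is what a parser checks first. As a small sketch, Python's standard `xml.etree.ElementTree` module raises `ParseError` for a document that is not well-formed; the `is_well_formed` helper is an illustrative name, not part of any library.

```python
import xml.etree.ElementTree as ET

good = "<parent><child></child></parent>"
bad = "<parent><child></parent></child>"  # child is closed outside its parent

def is_well_formed(text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(good), is_well_formed(bad))
```

The same check rejects a document with two top-level elements, since it would not form a single tree.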

(37)

Valid XML

 An XML document is considered valid if:

◦ It is well-formed, and …

It follows a particular schema in a standard definition language

 A DTD document (Document Type Definition)

 An XML schema document

◦ DTDs are the original, older technology

◦ XML schema documents are the “new”

hotness

 First published in 2001


(38)

DTD – Document Type Definition

 Original method of specifying a schema definition

◦ Still in widespread use

 A very simple schema definition language

◦ Each possible element in the document is defined

 What children must it have?

 What children can it (optionally) have?

 What kinds of attributes can/must it have?

 If it is a leaf element, what kinds of values

can it have?


(39)

XML Documents, DTD, and XML Schema (cont’d.)

 Notation for specifying elements

 XML DTD

◦ Data types in DTD are not very general

◦ Special syntax

 Requires specialized processors

◦ All DTD elements always forced to follow the specified ordering of the document

 Unordered elements not permitted

(40)

A sample XML document and DTD

<?xml version="1.0" standalone="no"?>
<!DOCTYPE Projects SYSTEM "proj.dtd">
<Projects>
  <Project number="1">
    <Name>Product X</Name>
    <Location>Bellaire</Location>
    <Dept_no>5</Dept_no>
    <Workers>
      <Worker>
        <Ssn>123456789</Ssn>
        <Last_name>Smith</Last_name>
        <Hours>32.5</Hours>
      </Worker>
      <Worker>
        <Ssn>453453453</Ssn>
        <Hours>15.5</Hours>
      </Worker>
    </Workers>
  </Project>
</Projects>

We declare that we want to use a DTD by putting the DOCTYPE declaration at the top of our XML file:

◦ !DOCTYPE: the keyword

◦ Projects: the name of our DTD's root node

◦ SYSTEM: indicates that this is an external DTD

◦ "proj.dtd": the filename (or URL)

(41)

A sample XML document and DTD

The same XML document as on the previous slide, with its DTD (proj.dtd):

<!ELEMENT Projects (Project+)>
<!ELEMENT Project (Name, Location, Dept_no?, Workers)>
<!ATTLIST Project number ID #REQUIRED>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Location (#PCDATA)>
<!ELEMENT Dept_no (#PCDATA)>
<!ELEMENT Workers (Worker*)>
<!ELEMENT Worker (Ssn, Last_name?, First_name?, Hours)>
<!ELEMENT Ssn (#PCDATA)>
<!ELEMENT Last_name (#PCDATA)>
<!ELEMENT First_name (#PCDATA)>
<!ELEMENT Hours (#PCDATA)>

(42)

A sample DTD

Root element comes first: <!ELEMENT Projects (Project+)>

(43)

A sample DTD

<!ELEMENT Projects (Project+)>

Name of element: Projects

(44)

A sample DTD

<!ELEMENT Projects (Project+)>

List of children

Regular expression-like syntax:

◦ + indicates 1 or more of this child

◦ * indicates 0 or more of this child

◦ ? indicates 0 or 1 of this child

◦ No symbol indicates exactly one child

So (Project+) indicates that 1 or more Project children are required

(45)

A sample DTD

<!ELEMENT Project (Name, Location, Dept_no?, Workers)>

List of children

Dept_no? indicates that Dept_no is an optional field (0 or 1 of this child), but there can be only one of them
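Because the child lists read like regular expressions, they can be checked with one. The sketch below is only an illustration in Python: `content_model_to_regex` and `children_match` are hypothetical helpers that handle the flat, comma-separated form used in this DTD (names with an optional +, * or ? suffix), not the full DTD content-model grammar.

```python
import re

def content_model_to_regex(model):
    """Translate a flat DTD child list such as "(Name, Dept_no?)" into a regex.

    Each child name is followed by one space so that names cannot run together.
    """
    parts = []
    for item in model.strip("() ").split(","):
        item = item.strip()
        suffix = item[-1] if item[-1] in "+*?" else ""
        name = item.rstrip("+*?")
        parts.append(f"(?:{re.escape(name)} ){suffix}")
    return re.compile("".join(parts))

def children_match(model, child_names):
    """Check an ordered list of child element names against the child list."""
    text = "".join(name + " " for name in child_names)
    return content_model_to_regex(model).fullmatch(text) is not None

project_model = "(Name, Location, Dept_no?, Workers)"
print(children_match(project_model, ["Name", "Location", "Workers"]))
print(children_match(project_model, ["Location", "Name", "Workers"]))
```

The second call fails because a DTD child list also fixes the order of the children.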

(48)

A sample DTD

<!ATTLIST Project number ID #REQUIRED>

Project has an attribute named "number"

(49)

A sample DTD

<!ATTLIST Project number ID #REQUIRED>

Project has an attribute named "number"

Its type is ID: the value must be unique within the document

This can be used by other elements to refer to this element, like a primary key

Strictly, an ID value must be a valid XML name, so a validating parser would reject a value that begins with a digit, such as "1"

(50)

A sample DTD

<!ATTLIST Project number ID #REQUIRED>

Project has an attribute named "number"

Its type is ID: the value must be unique within the document

This can be used by other elements to refer to this element, like a primary key

#REQUIRED: this attribute must exist on all Project elements
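The two constraints expressed by ID and #REQUIRED can be spelled out as a manual check. This is a sketch under the assumption that no validating parser is available: Python's standard `xml.etree.ElementTree` module does not validate against a DTD, and `check_project_numbers` is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET

def check_project_numbers(root):
    """Return a list of problems: missing number attributes or repeated values."""
    problems = []
    seen = set()
    for index, project in enumerate(root.findall("Project"), start=1):
        number = project.get("number")
        if number is None:
            # Mirrors #REQUIRED: every Project must carry the attribute.
            problems.append(f"Project #{index} has no number attribute")
        elif number in seen:
            # Mirrors ID: a value may appear only once in the document.
            problems.append(f"number {number!r} is used more than once")
        else:
            seen.add(number)
    return problems

good = ET.fromstring('<Projects><Project number="1"/><Project number="2"/></Projects>')
bad = ET.fromstring('<Projects><Project number="1"/><Project/><Project number="1"/></Projects>')

print(check_project_numbers(good))
print(check_project_numbers(bad))
```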

(51)

A sample DTD

<!ELEMENT Name (#PCDATA)>

Name is a "leaf node"

#PCDATA means that it holds "parsed character data"

It will contain a value of some kind between its start and end tag (even an empty value counts as a value for the DTD)

(53)

DTD Limitations

 Data types in DTD are not general

◦ Child nodes hold PCDATA values – strings

◦ DTD has its own syntax

 Need to write a special parser for it

 Can’t leverage existing XML parsers to do DTD parsing

◦ All elements must follow the ordering laid out

 Unordered elements not allowed

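The first limitation is easy to see in practice: since leaf values are plain strings, any typing has to be done by the application. A minimal sketch with Python's standard `xml.etree.ElementTree` module, using the Worker data from the sample document:

```python
import xml.etree.ElementTree as ET

doc = """<Workers>
  <Worker><Ssn>123456789</Ssn><Last_name>Smith</Last_name><Hours>32.5</Hours></Worker>
  <Worker><Ssn>453453453</Ssn><Hours>15.5</Hours></Worker>
</Workers>"""

workers = ET.fromstring(doc)

# Every leaf value arrives as a string; the DTD cannot say that Hours is a number.
raw_hours = [worker.findtext("Hours") for worker in workers.findall("Worker")]

# The conversion to a numeric type is left to the application.
total_hours = sum(float(hours) for hours in raw_hours)

print(raw_hours, total_hours)
```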

(54)

Summary

 Three main types of data:

structured, semi-structured, and unstructured

 XML standard

◦ Tree-structured (hierarchical) data model

◦ XML and DTD notation/language

 Next class…

◦ XML Schema

◦ Storing and Extracting XML Documents

◦ XML Languages
