
Written by Dept. of Computer Science @ Dr. BVRICE Bhimavaram W.G. Dist.,

9. XML: Defining Data for Web Applications

1. Define Markup Language.

“A markup language is a set of instructions, also known as tags, which can be added to text files.”

Ex: Microsoft Rich Text Format (RTF), Adobe Portable Document Format (PDF), HTML, etc.

2. What is XML? Explain.

XML stands for eXtensible Markup Language

• XML is a markup language much like HTML

• XML was designed to carry data, not to display data

• XML tags are not predefined. We must define our own tags

• XML is designed to be self-descriptive

• XML is a W3C Recommendation

Background for XML:

• An Extensible Markup Language (XML) document describes the structure of data

• XML and HTML have a similar syntax. Both are derived from SGML (Standard Generalized Markup Language)

• XML has no mechanism to specify the format for presenting data to the user

• An XML document resides in its own file with an ‘.xml’ extension

The Basic Rules (XML Syntax Rules):

• XML is case sensitive

• All start tags must have end tags

• Elements must be properly nested

• XML declaration is the first statement

• Every document must contain a root element

• Attribute values must have quotation marks

• Certain characters are reserved for parsing

Some characters have a special meaning in XML. There are 5 predefined entity references in XML:

&lt;      <    less than
&gt;      >    greater than
&amp;     &    ampersand
&apos;    '    apostrophe
&quot;    "    quotation mark

If we place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the start of a new element. This will generate an XML error:

<message>if salary < 1000 then</message>

To avoid this error, replace the "<" character with an entity reference:

<message>if salary &lt; 1000 then</message>
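The effect of the reserved "<" character can be checked with a short script. This is a minimal sketch using Python's standard library (xml.etree.ElementTree and xml.sax.saxutils.escape); the variable names are illustrative only and are not part of the notes.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if salary < 1000 then"

# An unescaped "<" inside element content is not well-formed XML,
# so the parser is expected to reject it.
try:
    ET.fromstring("<message>" + raw + "</message>")
    parsed_unescaped = True
except ET.ParseError:
    parsed_unescaped = False

# escape() replaces &, < and > with their entity references.
safe = escape(raw)
message = ET.fromstring("<message>" + safe + "</message>")

print(parsed_unescaped)  # False
print(safe)              # if salary &lt; 1000 then
print(message.text)      # if salary < 1000 then
```

The parser turns the entity reference back into the plain character, so the application still sees the original text.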

The Difference Between XML and HTML:

XML is not a replacement for HTML. HTML is about displaying information, while

XML is about carrying information. XML and HTML were designed with different goals:

• XML was designed to transport and store data, with focus on what data is

• HTML was designed to display data, with focus on how data looks

The general format (Syntax) of XML document is

<root>

<child>

<subchild> . . . </subchild>

</child>

</root>

Ex:1 XML documents use a self-describing and simple syntax:

<?xml version="1.0"?>

<note>

<to>Tove</to>

<from>Jani</from>

<heading>Reminder</heading>

<body>Don't forget me this weekend!</body>

</note>

The first line is the XML declaration. It defines the XML version (1.0).

The next line describes the root element of the document (like saying: "this document

is a note").

The next 4 lines describe 4 child elements of the root (to, from, heading, and body)

And finally the last line defines the end of the root element
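The root and child structure described above can be inspected programmatically. The following is a small sketch, assuming Python's standard xml.etree.ElementTree module; the constant NOTE simply repeats the note document from the example.

```python
import xml.etree.ElementTree as ET

NOTE = """<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

root = ET.fromstring(NOTE)

# Iterating over an element yields its direct children in document order.
children = [(child.tag, child.text) for child in root]

print(root.tag)  # note
for tag, text in children:
    print(tag, "=", text)
```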

Control Information:

There are three control structures. They are

Comments

Processing Instructions

Document type declarations

Comments in XML:

The syntax for writing comments in XML is similar to that of HTML.

<!-- This is a comment -->

Processing Instructions:

Processing Instructions (PI) are used to control applications. For example,

<?xml version="1.0"?>

The above instruction tells the parser that the data in the file follows the rules of XML version 1.0.

Document Types Declarations:

An XML document can have an associated Document Type Definition (DTD). The DTD is usually held in a separate file and can be used with many documents.

Ex:

<!DOCTYPE Recipes SYSTEM "recipe.dtd">

This declaration tells the parser that the XML file is of type Recipes and that it uses a DTD held in the file recipe.dtd. Any DTD which we develop ourselves or have developed for us is denoted by the keyword SYSTEM.

White-space is Preserved in XML:

HTML collapses multiple white-space characters into one single white-space:

HTML: Hello           Tove

Output: Hello Tove

With XML, the white-space in a document is not truncated.

XML Stores New Line as LF:

In Windows applications, a new line is normally stored as a pair of characters: carriage return (CR) and line feed (LF). In UNIX applications, a new line is normally stored as an LF character. Macintosh applications also use an LF to store a new line. XML stores a new line as LF: an XML parser passes every line break to the application as a single LF, whichever convention the file uses.
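Both behaviours, preserved white-space and LF line endings, can be observed with a parser. This is a minimal sketch assuming Python's standard xml.etree.ElementTree; the element names greeting and text are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Several spaces between the two words are kept exactly as written.
spaced = ET.fromstring("<greeting>Hello      Tove</greeting>")

# The input uses a Windows-style CR+LF line break.
windows = ET.fromstring(b"<text>line one\r\nline two</text>")

print(repr(spaced.text))   # 'Hello      Tove'
print(repr(windows.text))  # 'line one\nline two'
```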

XML Elements:

An XML document contains XML Elements. An XML element is everything from

(including) the element's start tag to (including) the element's end tag. An element can

contain:

other elements

text

attributes

or a mix of all of the above...

3. What is DTD? Explain.

“The Document Type Definition (DTD) describes a model of the structure of the

content of an XML document.”

DTD Elements:

In a DTD, elements are declared with an ELEMENT declaration. The syntax is

<!ELEMENT element-name category>

(or)

<!ELEMENT element-name (element-content)>

The purpose of a DTD is to define the legal building blocks of an XML document. A DTD can consist of two parts:

An external DTD subset

An internal DTD subset

An external DTD subset is a DTD that exists outside the content of the document. An internal DTD subset is a DTD that is included within the XML document. A document can contain either one or both types of subsets. If a document contains both types of subsets, the internal subset is processed first and then the external subset is processed.

An Internal DTD subset example:

Open a new file in Notepad and type the following code:

<?xml version="1.0" ?>

<!DOCTYPE note [

<!ELEMENT note (to,from,heading,body)>

<!ELEMENT to (#PCDATA)>

<!ELEMENT from (#PCDATA)>

<!ELEMENT heading (#PCDATA)>

<!ELEMENT body (#PCDATA)>

]>

<note>

<to>Tove</to>

<from>Jani</from>

<heading>Reminder</heading>

<body>Don't forget me this weekend!</body>

</note>

Save the above file with .xml extension (For example, 4.xml) & Open it in a browser.


An external DTD subset example:

Open a new file in Notepad and type the following code:

<!ELEMENT university (college*)>

<!ELEMENT college (name,dept*)>

<!ELEMENT name (#PCDATA)>

<!ELEMENT dept (mpc,mpcs,mecs,mscs,bcom)>

<!ELEMENT mpc (#PCDATA)>

<!ELEMENT mpcs (#PCDATA)>

<!ELEMENT mecs (#PCDATA)>

<!ELEMENT mscs (#PCDATA)>

<!ELEMENT bcom (#PCDATA)>

Save the above file as one.dtd.

Open a new file in Notepad and type the following code:

<?xml version="1.0" ?>

<!DOCTYPE university SYSTEM "one.dtd">

<university>

<college>

<name>bvrice</name>

<dept>

<mpc>Phy_Che</mpc>

<mpcs>Phy_CSC</mpcs>

<mecs>Ele_CSC</mecs>

<mscs>Stat_CSC</mscs>

<bcom>Com_CSC</bcom>

</dept>

</college>

</university>

Save the above file with .xml extension (For example, index.xml).


Explanation:

The DOCTYPE statement is the document type declaration.

The square brackets [ ] enclose the internal DTD subset, which defines the rules of the document.

<!ELEMENT university (college*)> The symbol * indicates that the university element can contain zero or more college elements.

An element suffixed with the symbol ? is optional.

#PCDATA specifies Parsed Character Data. The reserved character # indicates that #PCDATA is a reserved word.

Structure Symbols:

A DTD uses a set of symbols for specifying the structure of an element declaration. These symbols are also known as control characters. They are explained in the following table:

Symbol        Example                  Meaning
Asterisk      item*                    The item appears zero or more times.
Comma         (item1, item2, item3)    Separates items in a sequence in the order in which they appear.
None          item                     Item appears exactly once.
Parenthesis   (item1, item2)           Encloses a group of items.
Pipe          (item1 | item2)          Separates a set of alternatives. Only one may appear.
Plus          item+                    Item appears at least once.
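The structure symbols behave much like the operators of a regular expression. The sketch below is only an illustration of that analogy, not a DTD validator: each content model is translated by hand into a Python regular expression over the sequence of child names. The helper functions and the contact model are hypothetical and do not come from the notes.

```python
import re
import xml.etree.ElementTree as ET

def child_names(element):
    """Return the child tag names in document order, each followed by a comma."""
    return "".join(child.tag + "," for child in element)

def matches(element, pattern):
    """True if the sequence of child names fits the hand-written pattern."""
    return re.fullmatch(pattern, child_names(element)) is not None

# Hand translation of <!ELEMENT college (name,dept*)>
COLLEGE = r"name,(dept,)*"
# Hypothetical model (title, (phone | email)+, note?) to show |, + and ?
CONTACT = r"title,(phone,|email,)+(note,)?"

good = ET.fromstring("<college><name>bvrice</name><dept/><dept/></college>")
bad = ET.fromstring("<college><dept/><name>bvrice</name></college>")
contact = ET.fromstring("<contact><title/><email/><phone/></contact>")
empty = ET.fromstring("<contact><title/></contact>")

print(matches(good, COLLEGE))     # True: name, then zero or more dept
print(matches(bad, COLLEGE))      # False: the comma fixes the order
print(matches(contact, CONTACT))  # True: one or more of phone or email
print(matches(empty, CONTACT))    # False: + needs at least one item
```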


Attributes:

In a DTD, attributes are declared with an ATTLIST declaration. The syntax is

<!ATTLIST element-name attribute-name attribute-type attribute-value>

Ex:

DTD:        <!ATTLIST payment type CDATA "check">
Valid XML:  <payment type="check" />

The attribute-type can be one of the following:

Type           Description
CDATA          The value is character data
(en1|en2|..)   The value must be one from an enumerated list
ID             The value is a unique id
IDREF          The value is the id of another element
IDREFS         The value is a list of other ids
NMTOKEN        The value is a valid XML name
NMTOKENS       The value is a list of valid XML names
ENTITY         The value is an entity
ENTITIES       The value is a list of entities
NOTATION       The value is a name of a notation
xml:           The value is a predefined xml value

The attribute-value can be one of the following:

Value          Explanation
value          The default value of the attribute
#REQUIRED      The attribute is required
#IMPLIED       The attribute is not required
#FIXED value   The attribute value is fixed

Entities:

Entities are variables used to define shortcuts to standard text or special characters.

Entity references are references to entities. Entities can be declared internal or external.

Internal Entities:

These are used to create small pieces of data which we want to use repeatedly

throughout our schema. The syntax is

<!ENTITY entity-name "entity-value">

In XML, an entity reference has three parts: an ampersand (&), an entity name, and a semicolon (;).

Ex:

DTD:
<!ENTITY writer "KBR.">
<!ENTITY copyright "Copyright WT_ Notes.">

Valid XML:
<author>&writer;&copyright;</author>
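Internal entities are expanded by the parser before the application sees the text. The following sketch assumes Python's standard xml.etree.ElementTree and places the two entity declarations from the example in an internal DTD subset; the wrapping document is an assumption made so that the snippet is complete.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE author [
  <!ENTITY writer "KBR.">
  <!ENTITY copyright "Copyright WT_ Notes.">
]>
<author>&writer;&copyright;</author>"""

author = ET.fromstring(DOC)

# Both references have been replaced by their declared values.
print(author.text)  # KBR.Copyright WT_ Notes.
```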

External Entities:

Almost anything which is data can be included in our XML as an external entity. The syntax is

<!ENTITY entity-name SYSTEM "URI/URL">

Ex:

DTD:
<!ENTITY writer SYSTEM "http://www.w3schools.com/entities.dtd">
<!ENTITY copyright SYSTEM "http://www.w3schools.com/entities.dtd">

Valid XML:
<author>&writer;&copyright;</author>

Namespaces:

A namespace is a way of keeping the names used by applications separate from each

other. Within a particular namespace no duplication of names can exist. The purpose of XML

Namespaces is to distinguish between duplicate elements and attribute names.

In the following example there is no conflict, because the two <table> elements carry different prefixes and therefore have different names:

<h:table>
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>

<f:table>
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>

XML developers can specify their own namespaces which can be used in many applications. A namespace is declared with an xmlns:prefix attribute whose value is a URI, usually on the root element.

Ex:

<?xml version="1.0" ?>
<!DOCTYPE recipes SYSTEM "recipes.dtd">
<recipes xmlns:bread="http://URL/namespaces/breads"
         xmlns:lamb="http://URL/namespaces/meats">
  <category>
    <bread:name>Basic Loaf</bread:name>
  </category>
  <category>
    <lamb:name>Roast Lamb</lamb:name>
  </category>
</recipes>

In the above example, each category of recipe has a name element and there is no confusion because the namespaces have been declared.
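A namespace-aware parser keeps the two kinds of table apart by combining each name with its namespace URI. The sketch below assumes Python's standard xml.etree.ElementTree and standard xmlns:prefix declarations; the wrapping root element and the example.com URIs are placeholders invented for the illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<root xmlns:h="http://example.com/html"
      xmlns:f="http://example.com/furniture">
  <h:table><h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr></h:table>
  <f:table><f:name>African Coffee Table</f:name><f:width>80</f:width></f:table>
</root>"""

# Prefix-to-URI map used only for the searches below.
NS = {"h": "http://example.com/html", "f": "http://example.com/furniture"}

root = ET.fromstring(DOC)

# ElementTree reports each tag as {namespace-URI}local-name.
tables = [child.tag for child in root]
cells = [td.text for td in root.findall("h:table/h:tr/h:td", NS)]
name = root.find("f:table/f:name", NS).text

print(tables)
print(cells)  # ['Apples', 'Bananas']
print(name)   # African Coffee Table
```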

4. Write about XML Schema.

An XML Schema describes the structure of an XML document. XML Schema is

an XML-based alternative to DTD. The XML Schema language is also referred to as XML

Schema Definition (XSD).

The purpose of an XML Schema is to define the legal building blocks of an XML

document, just like a DTD.

An XML Schema:

• defines elements that can appear in a document

• defines attributes that can appear in a document

• defines which elements are child elements

• defines the order of child elements

• defines the number of child elements

• defines whether an element is empty or can include text

• defines data types for elements and attributes

• defines default and fixed values for elements and attributes

XML Schema Data Types:

XML Schema data types can be generally categorized as "simple type" (including embedded simple type) and "complex type."

Simple Type

A simple type is a type that only contains text data when expressed according to XML 1.0. Users can also define their own simple types; this is done by placing a restriction on an embedded simple type to create a new type.

Ex: <xsd:element name="Department" type="xsd:string" />

Here, "xsd:string" is an embedded simple type, and the declaration defines that the data type for the element called "Department" is a text string.

Complex Type

A complex type is a type that has a child element or attribute structure when expressed according to XML 1.0. Users can define their own complex types. This type is used when an element has child elements or attributes.

Ex: <xsd:complexType name="EmployeeType">

<xsd:sequence maxOccurs="unbounded">

<xsd:element ref="Name" />

<xsd:element ref="Department" />

</xsd:sequence>

</xsd:complexType>

<xsd:element name="Name" type="xsd:string" />

<xsd:element name="Department" type="xsd:string" />

In this case the type name "EmployeeType" is designated by the

name attribute of the complexType element.

Ex: An XML Schema Document (XSD file)

Open a new file in Notepad and type the following code:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

<xsd:element name="product" type="productType"/>

<xsd:complexType name="productType">

<xsd:sequence>

<xsd:element name="number" type="xsd:integer"/>

<xsd:element name="date" type="xsd:date"/>

</xsd:sequence>

</xsd:complexType>

</xsd:schema>

Save the above file as “productType.xsd”.

An XML schema instance (XML file)

Open a new file in Notepad and type the following code:

<?xml version="1.0"?>
<product xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="productType.xsd">
  <number>101</number>
  <date>2004-05-25</date>
</product>

Save the above file as “product.xml” in the same folder where we saved “productType.xsd”.

Open “product.xml” in the browser. If there are no errors, the browser displays the document.
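The constraints that productType places on a document can be mirrored by hand to see what a schema check involves. The sketch below is not an XSD validator: it is a hypothetical function, check_product, written with Python's standard library, that tests only the element order and the two data types (a plain integer and a YYYY-MM-DD date without a time zone), and it assumes the elements are not in a namespace.

```python
import re
from datetime import date
import xml.etree.ElementTree as ET

def check_product(xml_text):
    """Return a list of problems found; an empty list means the checks passed."""
    product = ET.fromstring(xml_text)
    problems = []
    if product.tag != "product":
        problems.append("root element must be <product>")
    # xsd:sequence: <number> must come first, then <date>.
    if [child.tag for child in product] != ["number", "date"]:
        problems.append("children must be <number> then <date>")
        return problems
    number_text = (product[0].text or "").strip()
    date_text = (product[1].text or "").strip()
    # xsd:integer: an optional sign followed by digits.
    if not re.fullmatch(r"[+-]?[0-9]+", number_text):
        problems.append("<number> is not an integer: " + repr(number_text))
    # xsd:date, simplified here to a calendar date.
    try:
        date.fromisoformat(date_text)
    except ValueError:
        problems.append("<date> is not a valid date: " + repr(date_text))
    return problems

valid = "<product><number>101</number><date>2004-05-25</date></product>"
invalid = "<product><number>rama</number><date>2004-05-25</date></product>"

print(check_product(valid))    # []
print(check_product(invalid))  # one problem, reported for <number>
```

A value such as rama is well-formed XML but does not satisfy xsd:integer, which is the kind of error a schema-aware validator reports.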

4. Write about DOM and SAX.

Parsers:

A parser is a program that analyses the grammatical structure of an input with respect to a given formal grammar. (OR) An XML parser is a software component that can read and validate any XML document.

The parser determines how a sentence can be constructed from the grammar of the language by describing the atomic elements of the input and the relationships among them.

SAX (Simple API for XML):

a. The SAX API provides a serial mechanism for accessing XML documents.

b. The SAX model allows for simple parsers by allowing parsers to read

through a document in a linear way, and then to call an event handler

every time a markup event occurs.

c. When a parsing event happens, the parser invokes the corresponding

method of the corresponding handler.

d. The handlers are programmer’s implementation of standard Java API (i.e.,

interfaces and classes).

e. Similar to an I/O-Stream, goes in one direction.
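The event-handler model described above can be tried out in a few lines. The notes refer to the Java API; this sketch instead uses Python's standard xml.sax module, which follows the same idea of a parser calling handler methods as it reads. The class name EventLogger is an illustrative choice.

```python
import xml.sax

class EventLogger(xml.sax.ContentHandler):
    """Records each markup event in the order the parser reports it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append("start " + name)

    def endElement(self, name):
        self.events.append("end " + name)

    def characters(self, content):
        # Ignore chunks that contain only white-space.
        if content.strip():
            self.events.append("text " + content.strip())

handler = EventLogger()
xml.sax.parseString(b"<note><to>Tove</to><from>Jani</from></note>", handler)

for event in handler.events:
    print(event)
```

The handler only ever sees the current event, so nothing is kept in memory unless the program stores it itself, as the events list does here.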

Structure of SAX Parser:

DOM (Document Object Model):

a. In the Sun's implementation of DOM model, the parser will read in an entire

XML data source and construct a treelike representation of it in memory.

b. Under DOM, a pointer to the entire document is returned to the calling

application.

c. The application can then manipulate the document, rearranging nodes,

adding and deleting content as needed by using DOM API.

d. While DOM is generally easier to implement, it is far slower and more

resource intensive.

e. DOM can be used effectively with smaller XML data structures in


Using the DOM API:

The DOM views an XML document as a tree, but this is very much a logical view of the document. There is no requirement that parsers include a tree as a data structure. Each node of the tree, which represents an XML element, is modeled as an object.
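Working with the tree through the DOM API can be sketched as follows. The example assumes Python's standard xml.dom.minidom module rather than the Java implementation mentioned in the notes; the method names used (createElement, appendChild, removeChild, getElementsByTagName) are the standard DOM ones.

```python
from xml.dom import minidom

doc = minidom.parseString("<note><to>Tove</to><from>Jani</from></note>")
root = doc.documentElement

# Add a new <heading> element as the last child of <note>.
heading = doc.createElement("heading")
heading.appendChild(doc.createTextNode("Reminder"))
root.appendChild(heading)

# Delete the <from> element.
sender = root.getElementsByTagName("from")[0]
root.removeChild(sender)

names = [node.tagName for node in root.childNodes]
result = root.toxml()

print(names)   # ['to', 'heading']
print(result)  # <note><to>Tove</to><heading>Reminder</heading></note>
```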

Differences between DOM and SAX parser:

Both SAX and DOM are used to parse the XML document. Both have advantages and disadvantages and can be used in our programming depending on the situation.

SAX                                        DOM
Parses node by node                        Stores the entire XML document in memory before processing
Doesn’t store the XML in memory            Occupies more memory
We can’t insert or delete a node           We can insert or delete nodes
Top to bottom traversing                   Traverses in any direction
SAX is a Simple API for XML                Document Object Model (DOM) API
import javax.xml.parsers.*;                import javax.xml.parsers.*;
import org.xml.sax.*;                      import org.w3c.dom.*;
import org.xml.sax.helpers.*;
Doesn’t preserve comments                  Preserves comments
Generally runs a little faster than DOM    Generally runs a little slower than SAX

If we only need to find a node and don’t need to insert or delete anything, we can go with SAX; otherwise we use DOM, provided we have enough memory.

5. How to work with XML Stylesheets (Presenting XML)?

Presenting XML:

The following program explains the presentation of XML.

Open a new file in Notepad and type the following code:

<?xml version="1.0"?>

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">

<html>

<body>

<h2>My CD Collection</h2>

<table border="1">

<tr bgcolor="#9acd32">

<th>Title</th>

<th>Artist</th>

</tr>

<xsl:for-each select="catalog/cd">

<tr>

<td><xsl:value-of select="title"/></td>

<td><xsl:value-of select="artist"/></td>

</tr>

</xsl:for-each>

</table>

</body>

</html>

</xsl:template>

</xsl:stylesheet>

Save the above program as “pstyle.xsl”.

Open a new file in Notepad and type the following code:

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="pstyle.xsl"?>

<catalog>

<cd>

<title>Windows 95</title>

<artist>MS Team</artist>

<country>USA</country>

<company>Microsoft</company>

<price>5000.89</price>

<year>1995</year>

</cd>

<cd>

<title>MS-Office</title>

<artist>MS Team</artist>

<country>US</country>

<company>Microsoft</company>

<price>300.50</price>

<year>2007</year>

</cd>

<cd>

<title>Ilaya Raja Hits</title>

<artist>Veturi</artist>

<country>India</country>

<company>Aditya Music</company>

<price>15.75</price>

<year>2009</year>

</cd>

</catalog>

Save the above file as “pxml.xml” in the same folder where we saved the “pstyle.xsl” file.

Open the file “pxml.xml” in the browser; the output is displayed as an HTML table with the columns Title and Artist.

Note:

If there is any error in the XSL file, the output will not be displayed.

Explanation:

XSL stands for eXtensible Stylesheet Language, and is a style sheet language for XML documents.

XSLT is a language for transforming XML documents into XHTML documents or to

other XML documents.

What is XSLT?

• XSLT stands for XSL Transformations

• XSLT is the most important part of XSL

• XSLT transforms an XML document into another XML document

• XSLT uses XPath to navigate in XML documents

• XSLT is a W3C Recommendation

<xsl:stylesheet> and <xsl:transform> are completely synonymous and either can be

used. The correct way to declare an XSL style sheet according to the W3C XSLT

Recommendation is:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

(OR)

<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

To get access to the XSLT elements, attributes and features we must declare the

XSLT namespace at the top of the document.

The <xsl:template> element is used to define a template. With match="/", it defines a template for the entire XML document. The value of the match attribute is an XPath expression (i.e. match="/" defines the whole document).

The <xsl:value-of> element can be used to extract the value of an XML element and

add it to the output stream of the transformation.

The XSL <xsl:for-each> element can be used to select every XML element of a

specified node-set.

The value of the select attribute is an XPath expression. An XPath expression works

like navigating a file system; where a forward slash (/) selects subdirectories.
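The selection logic of <xsl:for-each> and <xsl:value-of> can be imitated in ordinary code to see what the template does. The sketch below is not an XSLT processor: it assumes Python's standard xml.etree.ElementTree, whose findall and findtext methods accept simple path expressions, and it uses a shortened copy of the catalog with only the title and artist elements.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd><title>Windows 95</title><artist>MS Team</artist></cd>
  <cd><title>MS-Office</title><artist>MS Team</artist></cd>
  <cd><title>Ilaya Raja Hits</title><artist>Veturi</artist></cd>
</catalog>"""

catalog = ET.fromstring(CATALOG)
rows = []

# Like <xsl:for-each select="catalog/cd">: visit every cd under the root.
for cd in catalog.findall("cd"):
    # Like <xsl:value-of select="title"/>: take the text of one child.
    rows.append((cd.findtext("title"), cd.findtext("artist")))

for title, artist in rows:
    print(title, "|", artist)
```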

To link an XSL file to an XML document, the following syntax is required:

<?xml-stylesheet type="text/xsl" href="pstyle.xsl"?>

Where “pstyle.xsl” is an XML Stylesheet.

6. Explain XSL elements.

The general format of an XSL element is

xsl:element select=value

The following table describes the list of XSL elements:

Element                   Description
apply-imports             Applies a template rule from an imported style sheet
apply-templates           Applies a template rule to the current element or to the current element's child nodes
attribute                 Adds an attribute
attribute-set             Defines a named set of attributes
call-template             Calls a named template
choose                    Used in conjunction with <when> and <otherwise> to express multiple conditional tests
comment                   Creates a comment node in the result tree
copy                      Creates a copy of the current node (without child nodes and attributes)
copy-of                   Creates a copy of the current node (with child nodes and attributes)
decimal-format            Defines the characters and symbols to be used when converting numbers into strings, with the format-number() function
element                   Creates an element node in the output document
fallback                  Specifies an alternate code to run if the processor does not support an XSLT element
for-each                  Loops through each node in a specified node set
if                        Contains a template that will be applied only if a specified condition is true
import                    Imports the contents of one style sheet into another. Note: An imported style sheet has lower precedence than the importing style sheet
include                   Includes the contents of one style sheet into another. Note: An included style sheet has the same precedence as the including style sheet
key                       Declares a named key that can be used in the style sheet with the key() function
message                   Writes a message to the output (used to report errors)
namespace-alias           Replaces a namespace in the style sheet with a different namespace in the output
number                    Determines the integer position of the current node and formats a number
otherwise                 Specifies a default action for the <choose> element
output                    Defines the format of the output document
param                     Declares a local or global parameter
preserve-space            Defines the elements for which white space should be preserved
processing-instruction    Writes a processing instruction to the output
sort                      Sorts the output
strip-space               Defines the elements for which white space should be removed
stylesheet                Defines the root element of a style sheet
template                  Rules to apply when a specified node is matched
text                      Writes literal text to the output
transform                 Defines the root element of a style sheet
value-of                  Extracts the value of a selected node
variable                  Declares a local or global variable
when                      Specifies an action for the <choose> element
with-param                Defines the value of a parameter to be passed into a template
