XML and Java
XML makes data portable
Underpinning for Web-related computing
fits well with Java, which makes code portable
a standard for data interchange
Java APIs for XML processing:
JAXP, SAX, DOM, XSLT, JAXB, JAX-RPC, SAAJ, JAXR
An Introduction to XML and Web Technologies
XML Documents
Anders Møller & Michael I. Schwartzbach 2006 Addison-Wesley
Objectives
! What is XML, in particular in relation to HTML
! The XML data model and its textual representation
! The XML Namespace mechanism
What is XML?
! XML: Extensible Markup Language
! A framework for defining markup languages
! Each language is targeted at its own application domain with its own markup tags
! There is a common set of generic tools for processing XML documents
! XHTML: an XML variant of HTML
! Inherently internationalized and platform independent (Unicode)
! Developed by W3C, standardized in 1998
Recipes in XML
! Define our own “Recipe Markup Language”
! Choose markup tags that correspond to concepts in this application domain
• recipe, ingredient, amount, ...
! No canonical choices
• granularity of markup?
• structuring?
• elements or attributes?
• ...
XML languages
an XML language is an application of the XML standard
that defines a vocabulary with symbols, semantics, and rules
Some XML Languages (1)
XHTML
XML Hypertext Markup Language
an HTML version that adheres to the XML standard
XLINK
XML-based standard for hypertext links
XSLT
XML Stylesheet Language for Transformation
SOAP and WSDL are used for Web Services
Some XML Languages (2)
XML Schema: a description of the type of an XML document (syntax and semantics)
XPath: a query language for selecting nodes from an XML document
RDF: Resource Description Framework, a language to define metadata models (entity-relationship, class diagrams)
XMI: a standard to exchange metadata information via XML
MOF: a language to define meta-models
XML Structure
Tag-oriented representation of data
like HTML, but XML tags identify the data
whereas HTML tags tell how to display the data
Example: Coffee price list
<PRICELIST>
<COFFEE>
<NAME>MOCHA JAVA</NAME>
<PRICE>11.95</PRICE>
</COFFEE>
<COFFEE>
<NAME>SUMATRA</NAME>
<PRICE>12.50</PRICE>
</COFFEE>
</PRICELIST>
Example (1/2)
<collection>
  <description>Recipes suggested by Jane Dow</description>
  <recipe id="r117">
    <title>Rhubarb Cobbler</title>
    <date>Wed, 14 Jun 95</date>
    <ingredient name="diced rhubarb" amount="2.5" unit="cup"/>
    <ingredient name="sugar" amount="2" unit="tablespoon"/>
    <ingredient name="fairly ripe banana" amount="2"/>
    <ingredient name="cinnamon" amount="0.25" unit="teaspoon"/>
    <ingredient name="nutmeg" amount="1" unit="dash"/>
    <preparation>
      <step>
        Combine all and use as cobbler, pie, or crisp.
      </step>
    </preparation>
Example (2/2)
    <comment>
      Rhubarb Cobbler made with bananas as the main sweetener.
      It was delicious.
    </comment>
    <nutrition calories="170" fat="28%"
               carbohydrates="58%" protein="14%"/>
    <related ref="42">Garden Quiche is also yummy</related>
  </recipe>
</collection>
Building on the XML Notation
! Defining the syntax of our recipe language
• DTD, XML Schema, ...
! Showing recipe documents in browsers
• XPath, XSLT
! Recipe collections as databases
• XQuery
! Building a Web-based recipe editor
• HTTP, Servlets, JSP, ...
! ...
– the topics of the following weeks...
XML Trees
! Conceptually, an XML document is a tree structure
• node, edge
• root, leaf
• child, parent
• sibling (ordered), ancestor, descendant
An Analogy: File Systems (figure)
Tree View of the XML Recipes (figure)
Nodes in XML Trees
! Text nodes: carry the actual contents, leaf nodes
! Element nodes: define hierarchical logical groupings of contents, each has a name
! Attribute nodes: unordered, each associated with an element node, has a name and a value
! Comment nodes: ignorable meta-information
! Processing instructions: instructions to specific processors, each has a target and a value
! Root nodes: every XML tree has one root node that represents the entire tree
Textual Representation
! Text nodes: written as the text they carry
! Element nodes: start-end tags
• <bla ...> ... </bla>
• short-hand notation for empty elements: <bla/>
! Attribute nodes: name="value" in start tags
! Comment nodes: <!-- bla -->
! Processing instructions: <?target value?>
! Root nodes: implicit
Tags
written in angle brackets < ... >
any start tag <x> must have a matching end tag </x>
(or be written as an empty-element tag <x/>)
the start tag, the end tag, and everything between them form an element of XML data
tags can contain tags (are nested)
Note: XML element names are case-sensitive!
Browsing XML (without XSLT) (figure)
More Constructs
! XML declaration
! Character references
! CDATA sections
! Document type declarations and entity references
explained later...
! Whitespace?
Example
<?xml version="1.1" encoding="ISO-8859-1"?>
<!DOCTYPE features SYSTEM "example.dtd">
<features a="b">
  <?mytool here is some information specific to my tool?>
  El señor está bien, garçon!
  Copyright &#169; 2005
  <![CDATA[ <this is not a tag> ]]>
  <!-- always remember to specify the right character encoding -->
</features>
Well-formedness
! Every XML document must be well-formed
• start and end tags must match and nest properly
• well-formed: <x><y></y></x>
• not well-formed: </z><x><y></x></y>
• exactly one root element
• ...
! in other words, it defines a proper tree structure
! XML parser: given the textual XML document,
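The two tag sequences above can be checked mechanically. The following is a minimal sketch, assuming the JAXP parser that ships with the JDK; the class name WellFormedCheck and the method isWellFormed are made up for this illustration and are not part of the slides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: asks a non-validating parser whether a text is well-formed.
public class WellFormedCheck {

    /** Returns true if the parser accepts the text as well-formed XML. */
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown for any violation of the well-formedness rules
            return false;
        } catch (Exception e) {
            // Configuration or I/O problems are not a verdict on the input
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<x><y></y></x>"));     // tags match and nest
        System.out.println(isWellFormed("</z><x><y></x></y>")); // tags do not nest
        System.out.println(isWellFormed("<a/><b/>"));           // two root elements
    }
}
```

A non-validating parser is enough here, because well-formedness is independent of any DTD or schema.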
Tag Attributes
Tags can contain attributes:
additional information included as part of the tag itself:
<message to="[email protected]" from="[email protected]"
         subject="XML Is Really Cool">
  <text>
    How many ways is XML cool? Let me count the ways...
  </text>
</message>
Each attribute name is followed by an = sign and a quoted value; attributes are separated by spaces
Comments
XML comments look just like HTML comments:
<message to="[email protected]" from="[email protected]"
         subject="XML Is Really Cool">
  <!-- This is a comment -->
  <text>
    How many ways is XML cool? Let me count the ways...
  </text>
</message>
Friday, 31 May 2013
The XML Prolog
XML files always start with a prolog
minimum:
<?xml version="1.0"?>
... or with additional information:
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
version
Identifies the version of the XML markup language used in the data. This
attribute is not optional.
encoding
Identifies the character set used to encode the data. "ISO-8859-1" is "Latin-1", the
Western European and English language character set. (The default is
compressed Unicode: UTF-8.)
standalone
Tells whether or not this document references an external entity or an
external data type specification (see later). If there are no external
Character Encoding Schemes
US-ASCII
is a 7-bit encoding scheme that covers the English-language alphabet. It is not
large enough to cover the characters used in other languages, however, so it is not very
useful for internationalization.
ISO-8859-1
is the character set for Western European languages. It's an 8-bit encoding
scheme in which every encoded character takes exactly 8 bits. (With the remaining
character sets, on the other hand, some codes are reserved to signal the start of a
multi-byte character.)
UTF-8
is an 8-bit encoding scheme. Characters from the English-language alphabet are all
encoded using a single 8-bit byte. Characters for other languages are encoded using 2, 3 or
even 4 bytes. UTF-8 therefore produces compact documents for the English language, but
for other languages, documents tend to be half again as large as they would be if they
used UTF-16. If the majority of a document's text is in a Western European language, then
UTF-8 is generally a good choice because it allows for internationalization while still
minimizing the space required for encoding.
UTF-16
is a 16-bit encoding scheme. It is large enough to encode all the characters from
all the alphabets in the world. It uses 16 bits for most characters, and a pair of
16-bit units (32 bits) for the remaining ones, such as rare ideograms. A Western
European-language document that uses UTF-16 will be roughly twice as large as the same
document encoded using UTF-8. But documents written in Far Eastern languages will be
noticeably smaller using UTF-16 (typically 2 bytes per character instead of 3).
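The size trade-off between the encodings can be measured directly. This is a small sketch using only java.nio.charset from the JDK; the class name EncodingSizes and the sample strings are assumptions chosen for this illustration. UTF-16BE is used instead of UTF-16 so that no byte order mark is added to the count.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: counts the bytes a text needs in a given encoding.
public class EncodingSizes {

    /** Number of bytes needed to encode the text in the given charset. */
    public static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "coffee",        // English letters only
            "se\u00f1or",    // one Western European letter outside ASCII
            "\u4e2d\u6587"   // two Chinese characters
        };
        for (String s : samples) {
            System.out.println(s
                + ": UTF-8 = " + encodedLength(s, StandardCharsets.UTF_8)
                + " bytes, UTF-16 = " + encodedLength(s, StandardCharsets.UTF_16BE)
                + " bytes");
        }
    }
}
```

For the English sample UTF-8 needs half the bytes of UTF-16, while for the Chinese sample the relation is reversed (3 bytes per character against 2).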
XML Benefits
plain text, can easily be debugged
identifies data by readable tags
is stylable:
When display is important, the stylesheet standard, XSL, lets you dictate how to
portray the data. For example, the stylesheet for:
<to>[email protected]</to>
can say:
1. Start a new line.
2. Display "To:" in bold, followed by a space.
3. Display the destination data.
Which produces:
To: you@yourAddress
formatted XML display in a modern Web browser
XML tags must make sense to applications
multiple applications that use the same XML must agree upon tag names
example: XML file for a messaging application (e.g. email)
<message>
  <to>[email protected]</to>
  <from>[email protected]</from>
  <subject>XML Is Really Cool</subject>
  <text>
    How many ways is XML cool? Let me count the ways...
  </text>
</message>
XHTML: an XML variant of HTML that makes sense to modern Web browsers
Simpler Alternatives?
S-expressions, 1958:
(collection
  (recipe
    (title "Rhubarb Cobbler")
    (date "Wed, 14 Jun 95")
    ...
  ))
! XML is defined as a simplified subset of SGML
! XML could have been designed simpler...
! ... but it wasn’t [end of discussion]
Applications
Rough classification:
! Data-oriented languages
! Document-oriented languages
! Protocols and programming languages
! Hybrids
Example: XHTML
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Hello world!</title></head>
  <body>
    <h1>This is a heading</h1>
    This is some text.
  </body>
</html>
Example: CML
<molecule id="METHANOL">
  <atomArray>
    <stringArray builtin="id">a1 a2 a3 a4 a5 a6</stringArray>
    <stringArray builtin="elementType">C O H H H H</stringArray>
    <floatArray builtin="x3" units="pm">-0.748 0.558 ...</floatArray>
    <floatArray builtin="y3" units="pm">-0.015 0.420 ...</floatArray>
    <floatArray builtin="z3" units="pm">0.024 -0.278 ...</floatArray>
  </atomArray>
</molecule>
XML is extensible
in contrast to HTML, XML allows you to write your own tags
allowed tags in a document can be described in a schema
a schema describes the hierarchical structure and order of tags
Schema languages:
• Document Type Definition (DTD)
• newer is XML
Example of a DTD
<!ELEMENT priceList (coffee)+>
<!ELEMENT coffee (name, price) >
<!ELEMENT name (#PCDATA) >
<!ELEMENT price (#PCDATA) >
The first line in the example gives the highest level element, priceList, which
means that all the other tags in the document will come between the <priceList>
and </priceList> tags. The first line also says that the priceList element must
contain one or more coffee elements (indicated by the plus sign). The second line
specifies that each coffee element must contain both a name element and a price
element, in that order. The third and fourth lines specify that the data between
the tags <name> and </name> and between <price> and </price> is character data
that should be parsed. The name and price of each coffee are the actual text
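The rules of this DTD can be tried out with a validating parser. The sketch below assumes the JAXP SAX parser of the JDK and embeds the DTD as an internal subset so that it runs without extra files; the class name PriceListValidation and the method validate are hypothetical names for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: validates a price list against the DTD shown above.
public class PriceListValidation {

    // The four declarations from the example, written as an internal DTD subset
    static final String DOCTYPE =
          "<!DOCTYPE priceList [\n"
        + "  <!ELEMENT priceList (coffee)+>\n"
        + "  <!ELEMENT coffee (name, price)>\n"
        + "  <!ELEMENT name (#PCDATA)>\n"
        + "  <!ELEMENT price (#PCDATA)>\n"
        + "]>\n";

    /** Parses the document body with validation on and returns all validity errors. */
    public static List<String> validate(String body) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        SAXParser parser = factory.newSAXParser();

        List<String> errors = new ArrayList<>();
        DefaultHandler collector = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                // Validity violations are reported here, not thrown
                errors.add(e.getMessage());
            }
        };
        parser.parse(new InputSource(new StringReader(DOCTYPE + body)), collector);
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String valid = "<priceList><coffee><name>Mocha Java</name>"
                     + "<price>11.95</price></coffee></priceList>";
        String noPrice = "<priceList><coffee><name>Sumatra</name></coffee></priceList>";
        System.out.println("valid document:   " + validate(valid));
        System.out.println("price is missing: " + validate(noPrice));
    }
}
```

Validity errors do not stop the parse by default, which is why the sketch collects them in the error callback instead of catching an exception.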
Schemas ensure Portability
parsers can validate XML documents using a schema
humans can read XML documents, but tag names by themselves have no meaning to applications
applications must agree on the meaning of the tags
XML is Everywhere
examples of use on this computer:
this presentation, an Apple Keynote document
preferences of Mac OS X applications
see the preferences of the Dock in /Users/korte/Library/Preferences/com.apple.dock.plist
XML and Web Services
common language for
information interchange
between services and
clients
defining Web services operations
i.e. Web Services require parsing and processing
XML documents
Java provides vendor-neutral APIs
Java APIs for XML
document-oriented
◦ Java API for XML Processing (JAXP) -- processes XML documents using various parsers
◦ Java Architecture for XML Binding (JAXB) -- processes XML documents using schema-derived JavaBeans component classes
◦ SOAP with Attachments API for Java (SAAJ) -- sends SOAP messages over the Internet in a standard way
procedure-oriented
◦ Java API for XML-based RPC (JAX-RPC) -- sends SOAP method calls to remote parties over the Internet and receives the results
- now replaced by JAX-WS
◦ Java API for XML Registries (JAXR) -- provides a standard
JAXP - Java API for XML Processing
contains:
SAX API - Simple API for XML
the parser reads the XML document sequentially and notifies the application by calling methods of the ContentHandler interface
J2EE: The Integrated Platform for Web Services
Figure 2.5 SAX- and DOM-Based XML Parser APIs
Figure 2.5 shows how the SAX and DOM parsers function. SAX processes
documents serially, converting the elements of an XML document into a series of
events. Each particular element generates one event, with unique events
representing various parts of the document. User-supplied event handlers handle the events
and take appropriate actions. SAX processing is fast because of its serial access
and small memory storage requirements. Code Example 2.7 shows how to use the
JAXP APIs and SAX to process an XML document.
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class AnAppThatUsesSAXForXMLProcessing extends DefaultHandler {
    public void someMethodWhichReadsXMLDocument() throws Exception {
        // Get a SAX parser factory and set validation to true
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setValidating(true);
        // Create a JAXP SAXParser
        SAXParser saxParser = spf.newSAXParser();
        // Get the encapsulated SAX XMLReader
        XMLReader xmlReader = saxParser.getXMLReader();
        // (the excerpt ends here)
    }
}
[Figure 2.5: an XML document is the input of the SAX parser, which creates events for the event handlers; an XML document is the input of the DOM parser, which creates a tree]
SAX: easy, sequential processing with a small memory footprint
SAX: encountering an XML element while parsing is an event that triggers the invocation of an event handler
Summary of SAX APIs (1)
SAXParserFactory
A SAXParserFactory object creates an instance of the parser determined by the
system property javax.xml.parsers.SAXParserFactory.
SAXParser
The SAXParser class defines several kinds of parse() methods. In general, you pass
an XML data source and a DefaultHandler object to the parser, which processes the
XML and invokes the appropriate methods in the handler object.
SAXReader
The SAXParser wraps a SAXReader (in the API, an object implementing
org.xml.sax.XMLReader). It is the SAXReader which carries on the conversation with
the SAX event handlers you define.
DefaultHandler
Not shown in the diagram, a DefaultHandler implements the ContentHandler,
ErrorHandler, DTDHandler, and EntityResolver interfaces (with null methods), so you
can override only the ones you're interested in.
Figure 4–1 SAX APIs
The parser wraps a SAXReader object. When the parser’s parse() method is
invoked, the reader invokes one of several callback methods implemented in the
application. Those methods are defined by the interfaces ContentHandler,
ErrorHandler, DTDHandler, and EntityResolver.
Typically, you don’t need to care about the wrapped SAXReader, but every once in a
while you need to get hold of it using SAXParser’s getXMLReader() method.
Note: well-known software design patterns are used here
Java API documentation:
look into package
javax.xml.parsers
Summary of SAX APIs (2)
ContentHandler
Methods like startDocument, endDocument, startElement, and endElement are invoked
when the start or end of the document or of an element is recognized. This
interface also defines methods characters and processingInstruction, which are
invoked when the parser encounters the text in an XML element or an inline
processing instruction, respectively.
ErrorHandler
Methods error, fatalError, and warning are invoked in response to various parsing errors.
DTDHandler
Defines methods you will generally never be called upon to use. Used
when processing a DTD to recognize and act on declarations for an
unparsed entity.
EntityResolver
The resolveEntity method is invoked when the parser must
identify data identified by a URI. In most cases, a URI is simply a URL,
which specifies the location of a document, but in some cases the
document may be identified by a URN - a public identifier, or name,
that is unique in the Web space. The public identifier may be specified
in addition to the URL. The EntityResolver can then use the public
identifier instead of the URL to find the document, for example to
access a local copy of the document if one exists.
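Serving a local copy through resolveEntity could look like the following sketch. It assumes the JDK's default SAX parser, which asks the resolver before it loads an external DTD; the class name LocalDtdResolver, the URL under example.invalid and the one-line DTD are all invented for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: answers requests for one known DTD from a local copy.
public class LocalDtdResolver extends DefaultHandler {

    // Made-up system identifier and made-up local DTD text
    public static final String REMOTE_DTD = "http://example.invalid/priceList.dtd";
    static final String LOCAL_DTD = "<!ELEMENT priceList (#PCDATA)>";

    /** The last system identifier that was served from the local copy. */
    public String resolvedSystemId;

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (REMOTE_DTD.equals(systemId)) {
            resolvedSystemId = systemId;
            InputSource local = new InputSource(new StringReader(LOCAL_DTD));
            local.setSystemId(systemId);
            return local; // the parser reads this instead of opening the URL
        }
        return null; // null means: use the parser's default resolution
    }

    /** Parses the text with a fresh resolver and returns that resolver. */
    public static LocalDtdResolver parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        LocalDtdResolver handler = new LocalDtdResolver();
        // parse() registers the DefaultHandler as EntityResolver as well
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE priceList SYSTEM \"" + REMOTE_DTD + "\">"
                   + "<priceList>Mocha Java</priceList>";
        System.out.println("served locally: " + parse(xml).resolvedSystemId);
    }
}
```

The same method receives the public identifier as its first argument, so a resolver can also key its local copies on that name instead of the URL.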
A Uniform Resource Name (URN) is a Uniform Resource Identifier (URI) that uses
the urn scheme, and does not imply availability of the identified resource. Both
URNs (names) and URLs (locators) are URIs, and a URI may be a name and a locator
at the same time.
About Design Patterns...
provide solutions for common design problems
decorator
singleton
factory method
observer
Here are some
basics…
Decorator
augments the functionality of an object
decorator object wraps another object
decorator has a similar interface
calls are relayed to the wrapped object...
… but the decorator can provide additional functionality
example: java.io.BufferedReader wraps and augments an unbuffered Reader object
wraps an object to extend its behavior
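The BufferedReader case can be sketched in a few lines. Only JDK classes are used; the class name DecoratorDemo and the sample text are assumptions for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class: BufferedReader as a decorator around a plain Reader.
public class DecoratorDemo {

    /** Reads the first line of any Reader by decorating it with a BufferedReader. */
    public static String firstLine(Reader plain) throws IOException {
        // The decorator is itself a Reader and relays reads to the wrapped object,
        // adding buffering and the readLine() convenience method on top.
        BufferedReader buffered = new BufferedReader(plain);
        return buffered.readLine();
    }

    public static void main(String[] args) throws IOException {
        Reader plain = new StringReader("Mocha Java\nSumatra\n");
        System.out.println(firstLine(plain));
    }
}
```

Because the decorator keeps the Reader type, it can be handed to any code that expects the undecorated object.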
Singleton
ensures only a single instance of a class exists
all clients use the same object
constructor is private to prevent external instantiation
single instance obtained via a static getInstance method
Give an example!
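One possible example, as a minimal sketch: the class PriceRegistry and its methods are invented for this illustration and reuse the coffee prices from the earlier slides.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical singleton: one shared registry of coffee prices.
public class PriceRegistry {

    // The single instance, created when the class is loaded
    private static final PriceRegistry INSTANCE = new PriceRegistry();

    private final Map<String, Double> prices = new HashMap<>();

    // Private constructor prevents instantiation from outside the class
    private PriceRegistry() {
    }

    /** All clients obtain the same object through this static method. */
    public static PriceRegistry getInstance() {
        return INSTANCE;
    }

    public void setPrice(String coffee, double price) {
        prices.put(coffee, price);
    }

    public Double getPrice(String coffee) {
        return prices.get(coffee);
    }

    public static void main(String[] args) {
        PriceRegistry.getInstance().setPrice("Sumatra", 12.50);
        // A second lookup returns the very same object, so the price is visible
        System.out.println(PriceRegistry.getInstance().getPrice("Sumatra"));
        System.out.println(PriceRegistry.getInstance() == PriceRegistry.getInstance());
    }
}
```

Creating the instance in a static final field keeps the sketch thread-safe without explicit locking.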
Factory Method
a pattern for object creation
clients require an object of a particular interface or superclass type
a factory method is free to return an implementing-class object or subclass object
the exact type depends on context
example: the SAXParserFactory.newInstance() method of the following example…
adds flexibility by not calling a constructor, which would return a fixed implementation
see: Sun - Designing Web Services with J2EE, chapter 2: plattechs.pdf p. 41
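The effect of the factory method can be made visible by printing the classes that are actually returned. The sketch uses only JAXP calls from the JDK; the class name FactoryMethodDemo and the method createParser are assumptions for this illustration, and the printed class names depend on the installed parser implementation.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical demo class: the client names only abstract types.
public class FactoryMethodDemo {

    /** Obtains a parser without naming any concrete implementation class. */
    public static SAXParser createParser() throws Exception {
        // newInstance() chooses the factory implementation at run time
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // newSAXParser() is again a factory method, returning some subclass
        return factory.newSAXParser();
    }

    public static void main(String[] args) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = createParser();
        // The concrete classes differ between parser implementations
        System.out.println("factory class: " + factory.getClass().getName());
        System.out.println("parser class:  " + parser.getClass().getName());
    }
}
```

Client code compiles against SAXParserFactory and SAXParser only, so a different parser implementation can be plugged in without changing it.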
Observer
supports separation of an internal model from a view of that model
observer defines a one-to-many relationship between objects
the observed object notifies all observers of any state change
example: the saxParser.parse(new File(argv[0]), handler) call of the following example…
[Figure: an Observer registers with an Observable; the Observable notifies the Observer]
the SAX parser (observable) notifies the ContentHandler (observer) whenever it encounters an XML tag
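The register and notify steps can be sketched independently of SAX. All names in the following example (ObserverDemo, PriceObserver, PriceList) are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: a price list that notifies registered observers.
public class ObserverDemo {

    /** The observer side: called back on every state change. */
    public interface PriceObserver {
        void priceChanged(String coffee, double newPrice);
    }

    /** The observable side: keeps a list of observers and notifies all of them. */
    public static class PriceList {
        private final List<PriceObserver> observers = new ArrayList<>();

        public void register(PriceObserver observer) {
            observers.add(observer);
        }

        public void setPrice(String coffee, double price) {
            // One state change, many notifications
            for (PriceObserver observer : observers) {
                observer.priceChanged(coffee, price);
            }
        }
    }

    public static void main(String[] args) {
        PriceList list = new PriceList();
        list.register((coffee, price) ->
            System.out.println("view 1: " + coffee + " now costs " + price));
        list.register((coffee, price) ->
            System.out.println("view 2: " + coffee + " changed"));
        list.setPrice("Sumatra", 12.5);
    }
}
```

In SAX the roles map the same way: passing the handler to parse() is the registration, and the callbacks such as startElement are the notifications.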
Echo01.java
This example is described in the J2EE 1.4 Tutorial page 123ff: Echoing an XML
File with the SAX Parser.

import java.io.*;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;

public class Echo01 extends DefaultHandler
{
    StringBuffer textBuffer;

    public static void main(String argv[])
    {
        if (argv.length != 1) {
            System.err.println("Usage: cmd filename");
            System.exit(1);
        }
        // Use an instance of ourselves as the SAX event handler
        DefaultHandler handler = new Echo01();
        // Use the default (non-validating) parser
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            // Set up output stream
            out = new OutputStreamWriter(System.out, "UTF8");
            // Parse the input
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse(new File(argv[0]), handler);
        } catch (Throwable t) {
            t.printStackTrace();
        }
        System.exit(0);
    }

    static private Writer out;

    //===========================================================
    // SAX ContentHandler methods
    //===========================================================

    public void startDocument() throws SAXException
    {
        emit("<?xml version='1.0' encoding='UTF-8'?>");
        nl();
    }

    public void endDocument() throws SAXException
    {
        try {
            nl();
            out.flush();
        } catch (IOException e) {
            throw new SAXException("I/O error", e);
        }
    }

    public void startElement(String namespaceURI,
                             String sName, // simple name
                             String qName, // qualified name
                             Attributes attrs)
        throws SAXException
    {
        echoText();
        String eName = sName; // element name
        if ("".equals(eName)) eName = qName; // not namespaceAware
        emit("<" + eName);
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                String aName = attrs.getLocalName(i); // Attr name
                if ("".equals(aName)) aName = attrs.getQName(i);
                emit(" ");
                emit(aName + "=\"" + attrs.getValue(i) + "\"");
            }
        }
        emit(">");
    }

    public void endElement(String namespaceURI,
                           String sName, // simple name
                           String qName  // qualified name
                           )
        throws SAXException
    {
        echoText();
        String eName = sName; // element name
        if ("".equals(eName)) eName = qName; // not namespaceAware
        emit("</" + eName + ">");
    }

    public void characters(char buf[], int offset, int len)
        throws SAXException
    {
        // Text may arrive in several chunks, so accumulate it
        String s = new String(buf, offset, len);
        if (textBuffer == null) {
            textBuffer = new StringBuffer(s);
        } else {
            textBuffer.append(s);
        }
    }

    //===========================================================
    // Utility Methods ...
    //===========================================================

    // Display text accumulated in the character buffer
    private void echoText() throws SAXException
    {
        if (textBuffer == null) return;
        String s = "" + textBuffer;
        emit(s);
        textBuffer = null;
    }

    // Wrap I/O exceptions in SAX exceptions, to
    // suit handler signature requirements
    private void emit(String s) throws SAXException
    {
        try {
            out.write(s);
            out.flush();
        } catch (IOException e) {
            throw new SAXException("I/O error", e);
        }
    }

    // Start a new line
    private void nl() throws SAXException
    {
        String lineEnd = System.getProperty("line.separator");
        try {
            out.write(lineEnd);
        } catch (IOException e) {
            throw new SAXException("I/O error", e);
        }
    }
}
SAX Example
<priceList>                    [parser calls startElement]
  <coffee>                     [parser calls startElement]
    <name>Mocha Java</name>    [parser calls startElement, characters, and endElement]
    <price>11.95</price>       [parser calls startElement, characters, and endElement]
  </coffee>                    [parser calls endElement]
Java source code:
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
saxParser.parse("priceList.xml", handler);
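The handler passed to parse() is not shown on the slide. A possible handler for the price list is sketched below; the class name PriceListHandler and the helper method read are assumptions for this illustration, and the document is read from a string instead of the file priceList.xml so that the sketch is self-contained.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects coffee names and prices from the SAX events.
public class PriceListHandler extends DefaultHandler {

    /** Coffee name mapped to its price text, in document order. */
    public final Map<String, String> prices = new LinkedHashMap<>();

    private final StringBuilder text = new StringBuilder();
    private String currentName;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attrs) {
        text.setLength(0); // start collecting the text of this element
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        text.append(buf, offset, len); // text may arrive in several chunks
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // The default parser is not namespace-aware, so qName holds the tag name
        if ("name".equals(qName)) {
            currentName = text.toString().trim();
        } else if ("price".equals(qName)) {
            prices.put(currentName, text.toString().trim());
        }
    }

    /** Parses the XML text and returns the collected name-to-price map. */
    public static Map<String, String> read(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser saxParser = factory.newSAXParser();
        PriceListHandler handler = new PriceListHandler();
        saxParser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.prices;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<priceList>"
                   + "<coffee><name>Mocha Java</name><price>11.95</price></coffee>"
                   + "<coffee><name>Sumatra</name><price>12.50</price></coffee>"
                   + "</priceList>";
        System.out.println(read(xml));
    }
}
```

The handler keeps only the current name and a text buffer, which reflects the small memory footprint of SAX: the document itself is never held in memory.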
The DOM API
Document Object Model defined by
the W3C
builds an object representation in the
form of a tree of a parsed XML
document in memory
allows for the
manipulation of an
XML document
insert and remove elements
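Inserting and removing elements with the DOM API might look like the following sketch. It uses the JAXP DocumentBuilder and the org.w3c.dom interfaces from the JDK; the class name DomEditDemo and its helper methods are assumptions for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: edits the in-memory tree of a price list.
public class DomEditDemo {

    /** Builds the complete tree of the document in memory. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Inserts a new coffee element as the last child of the root element. */
    public static void addCoffee(Document doc, String name, String price) {
        Element coffee = doc.createElement("coffee");
        Element nameElement = doc.createElement("name");
        nameElement.setTextContent(name);
        Element priceElement = doc.createElement("price");
        priceElement.setTextContent(price);
        coffee.appendChild(nameElement);
        coffee.appendChild(priceElement);
        doc.getDocumentElement().appendChild(coffee);
    }

    /** Removes the first coffee element, if there is one. */
    public static void removeFirstCoffee(Document doc) {
        Node first = doc.getElementsByTagName("coffee").item(0);
        if (first != null) {
            first.getParentNode().removeChild(first);
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<priceList><coffee><name>Mocha Java</name>"
                           + "<price>11.95</price></coffee></priceList>");
        addCoffee(doc, "Sumatra", "12.50");
        removeFirstCoffee(doc);
        System.out.println(doc.getElementsByTagName("coffee").getLength()
            + " coffee element(s) left, first name: "
            + doc.getElementsByTagName("name").item(0).getTextContent());
    }
}
```

In contrast to SAX, the whole document stays in memory, which is what makes this kind of random access and modification possible.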