7 File Representation of Models
7.1 From XML to Java and back
At the time of this writing, three mechanisms exist to handle XML documents in an application: SAX, DOM and data binding.
7.1.1 SAX
The Simple API for XML (SAX) processes XML data like a text stream and alerts the application whenever something interesting comes along, such as a tag like <activity> or the text inside a tag. SAX is suitable for high-speed processing and needs little memory because it does not create an in-memory copy of the data. This lack of an in-memory copy is the main reason why SAX is not suitable for an editor application. Another drawback is that SAX provides no mechanism for writing out XML: the memory representation has to be constructed more or less "by hand", and each element has to provide methods that write the proper tags and content to the text stream.
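The event-driven style described above can be illustrated with a minimal sketch that uses the SAX classes shipped with the Java platform. The class names SaxSketch and EventCollector and the <process> wrapper element are assumptions made for this illustration only. The handler merely records the events it is told about; the parser never hands over a document tree.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxSketch {

    /** Records the events reported by the parser; no document tree is built. */
    static class EventCollector extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            events.add("start:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver text in several chunks; whitespace-only chunks are skipped.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                events.add("text:" + text);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }
    }

    /** Streams the XML text through a SAX parser and returns the recorded events. */
    static List<String> parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        EventCollector collector = new EventCollector();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), collector);
        return collector.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : parse("<process><activity>Review</activity></process>")) {
            System.out.println(event);
        }
    }
}
```

Everything an editor would need beyond these callbacks, such as a modifiable structure or a way to write the data back, would have to be built on top of the handler by the application itself.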
7.1.2 DOM
For an application that needs an in-memory data structure, the Document Object Model represents a useful alternative. The application takes advantage of a document builder, which reads the XML data and then constructs a DOM as shown in Figure 71.
Figure 71: Transformation of an XML document to the corresponding DOM
Once the DOM has been constructed, the application can manipulate it in a variety of ways. Elements can be added, removed, shifted around, or their content can be changed. The main advantage of DOM is that it supports writing out the XML data.
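A minimal sketch of this read, manipulate and write cycle, using the DOM classes of the Java platform, could look as follows. The class name DomSketch, the <process> element and the name attribute are assumptions made for this illustration. The write step uses the identity transformation of the JAXP Transformer API, which is one possible way to serialize a DOM tree and not the only one.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DomSketch {

    /** Lets a document builder parse the XML text into an in-memory DOM tree. */
    static Document read(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Appends a new <activity> element with a name attribute to the root element. */
    static void addActivity(Document document, String name) {
        Element activity = document.createElement("activity");
        activity.setAttribute("name", name);
        document.getDocumentElement().appendChild(activity);
    }

    /** Serializes the tree back to XML text via an identity transformation. */
    static String write(Document document) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document document = read("<process><activity name=\"draft\"/></process>");
        addActivity(document, "review");
        System.out.println(write(document));
    }
}
```

Note that every node is created through the Document factory methods and handled through generic node interfaces, which is one of the points the JDOM API discussed later approaches differently.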
7.1.3 Data binding
The data binding specification for Java and XML is currently being developed by an expert group and should be ready in the middle of 2001. This group consists of major industry parties, e.g. Oracle, Software AG, IBM and Sun Microsystems, and the Java development community [16]. At the moment, some applications and APIs are available that support the direct mapping of an XML document to the corresponding classes; examples are the Zeus [17] and the Castor [18] projects. The main idea of the data binding facility is to define a document as a DTD or a schema and to compile this document definition into the respective classes. As a result, the generated classes represent (to a certain degree) the defined document, together with accessor methods and correct data types. Each class supports two additional methods: the unmarshalling method, which reads in an XML document instance, and the marshalling method, which writes out the document in XML format. Because no SAX or DOM processing is needed, XML documents can be processed very fast, similar to SAX, while the data structure is preserved as in the DOM. When the document definition changes, the classes are simply recompiled to reflect the changes. Although this sounds very good in theory, the currently available APIs suffer from certain restrictions: they support neither polymorphic data types as defined in the W3C schema definition [19] nor the minimum/maximum occurrence feature. If additional code has to be written inside the generated classes, it has to be redone as soon as the schema is recompiled. As long as there is no specification and the missing features are not available, this approach to XML processing fits certain classes of applications, such as message systems, very well, but not a complex and long-term project like the ProSpec editor, which makes heavy use of object-oriented programming features such as abstract data types, polymorphism and inheritance.
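To make the idea of generated classes more concrete, the following hand-written sketch imitates what a binding compiler might produce for a definition of an element such as <query rooms="2">Our remarks</query>. It is not the output of Zeus or Castor; the class name RoomQuery, its accessor names and the method names marshal and unmarshal are assumptions made for this illustration. The streaming classes from javax.xml.stream are used here only to keep the sketch self-contained.

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical stand-in for a class generated from the definition of <query>. */
public class RoomQuery {
    private int rooms;          // typed field instead of a raw attribute string
    private String remark = ""; // text content of the element

    // Accessor methods with proper data types
    public int getRooms() { return rooms; }
    public void setRooms(int rooms) { this.rooms = rooms; }
    public String getRemark() { return remark; }
    public void setRemark(String remark) { this.remark = remark; }

    /** Unmarshalling: reads an XML document instance into an object. */
    public static RoomQuery unmarshal(Reader in) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        RoomQuery query = new RoomQuery();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "query".equals(reader.getLocalName())) {
                query.setRooms(Integer.parseInt(reader.getAttributeValue(null, "rooms")));
                query.setRemark(reader.getElementText().trim());
            }
        }
        reader.close();
        return query;
    }

    /** Marshalling: writes the object out in XML format. */
    public void marshal(Writer out) throws Exception {
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("query");
        writer.writeAttribute("rooms", Integer.toString(rooms));
        writer.writeCharacters(remark);
        writer.writeEndElement();
        writer.flush();
        writer.close();
    }

    public static void main(String[] args) throws Exception {
        RoomQuery query = unmarshal(new StringReader("<query rooms=\"2\">Our remarks</query>"));
        query.setRooms(query.getRooms() + 1); // work with typed data, not with strings
        StringWriter out = new StringWriter();
        query.marshal(out);
        System.out.println(out);
    }
}
```

The sketch also hints at the maintenance problem mentioned above: any method added by hand to such a class would be lost when the class is generated again from a changed document definition.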
7.1.4 The JDOM API
JDOM is an open source project that is going to be integrated into the regular Java JAXP distribution package from Sun Microsystems. This package consists of three parts: the JAXP part, which includes the SAX, DOM and finally the JDOM APIs; the JAXM part, which defines messaging with XML; and the aforementioned data binding facility, which is not specified yet. JDOM provides a higher degree of abstraction than the DOM and fits very well into the Java environment. At the time of this writing, JDOM is only available as a beta version and lacks some functionality, e.g. the in-memory validation of XML documents. The particular reasons why the JDOM API has been chosen are:
JDOM is Java platform specific: The API uses the Java language's built-in String support, so that text values are always available as Strings. It also makes use of the Java 2 platform collection classes, like List and Iterator, thus providing an environment suitable for programmers familiar with the Java language.
No hierarchies: In JDOM, an XML element is an instance of Element, an XML attribute is an instance of Attribute, and an XML document is an instance of Document. Because all of these represent different concepts in XML, they are always referenced as their own types, not as an amorphous "node".
Class driven: Because JDOM objects are direct instances of classes like Document, Element, and Attribute, creating one is as simple as using the new operator.
[16] For more information about the data binding specification visit java.sun.com.
[17] See http://www.enhydra.org.
The following examples show the usage of JDOM. The starting point is a simple document as shown in Listing 5.
Listing 5: Sample XML document to demonstrate the JDOM API
<?xml version="1.0" encoding="UTF-8"?>
<message type="roomQuery" reference="12345">
  <sender name="traveller.move.at"/>
  <receiver name="acme"/>
  <query rooms="2" fromDate="14-12-2001" toDate="24-12-2001">
    Our remarks
  </query>
</message>
The following piece of Java code in Listing 6 constructs this simple document with the help of methods provided by the JDOM API.
Listing 6: Java code to construct the sample XML document
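import org.jdom.Attribute;
import org.jdom.Document;
import org.jdom.Element;

// Root element and the document that owns it
Element message = new Element("message");
Document document = new Document(message);

// Attributes of <message>
message.addAttribute(new Attribute("type", "roomQuery"));
message.addAttribute(new Attribute("reference", "12345"));

// Child elements <sender> and <receiver>, each with a name attribute
message.addContent(new Element("sender").addAttribute("name", "traveller.move.at"));
message.addContent(new Element("receiver").addAttribute("name", "acme"));

// <query> element with its text content and three attributes
Element query = new Element("query").addContent("Our remarks");
query.addAttribute("rooms", "2");
query.addAttribute("fromDate", "14-12-2001");
query.addAttribute("toDate", "24-12-2001");
message.addContent(query);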
The next code examples in Listing 7 and Listing 8 show how the document can be constructed with the help of a class RoomRequest that extends org.jdom.Element.
Listing 7: Class that represents the sample XML document
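import java.text.SimpleDateFormat;
import java.util.Date;
import org.jdom.Element;

class RoomRequest extends Element {
    public RoomRequest(String sender, String receiver, String reference, String remark,
                       int rooms, Date fromDate, Date toDate) {
        super("message"); // this object is the <message> root element itself
        addAttribute("type", "roomQuery");
        addContent(new Element("sender").addAttribute("name", sender));
        ...
        // Format the date as day-month-year, as in the sample document
        query.addAttribute("toDate", new SimpleDateFormat("dd-MM-yyyy").format(toDate));
        addContent(query);
    }
}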
Listing 8: Class usage to generate the sample XML document
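// Parse the dates explicitly; the day-month-year pattern matches the sample document
SimpleDateFormat format = new SimpleDateFormat("dd-MM-yyyy");
Document document = new Document(
    new RoomRequest("traveller.move.at", "acme", "12345", "Our remarks", 2,
        format.parse("14-12-2001"), format.parse("24-12-2001")));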
The following code example in Listing 9 writes out the sample document as a formatted XML representation to a stream.
Listing 9: Output of the sample document to a stream.
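import java.io.FileOutputStream;
import org.jdom.output.XMLOutputter;

XMLOutputter outputter = new XMLOutputter();
outputter.setIndent(true);   // indent nested elements
outputter.setNewlines(true); // start each element on a new line
FileOutputStream out = new FileOutputStream("filename.xml");
outputter.output(document, out);
out.close();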