Structured Data and Visualization
Programming Language Support
Schemas become Types
Xml docs become Values
Parsers and validators
Structured Data
A language to describe the structure of documents
<element name="course" type="courseT"/>
<complexType name="courseT">
  <sequence>
    <element name="CD" .../>
    ...
  </sequence>
  <attribute name="name" type="string"/>
</complexType>
A language to describe data that has this structure
<course name="SDV">
  <CD> </CD>
  <S> </S>
  <T> </T>
</course>
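To make the link between the two languages concrete, here is a minimal sketch that checks a course document against a schema using the JDK's javax.xml.validation package. The slide elides the child element declarations, so the schema below is a hypothetical completion: it assumes the XML Schema namespace prefix xs and that CD, S and T are plain strings. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class CourseSchemaCheck {
    // Hypothetical, fully spelled-out schema in the spirit of the slide.
    // Assumption: the child elements CD, S and T are simple strings.
    static final String SCHEMA =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='course' type='courseT'/>"
        + "<xs:complexType name='courseT'>"
        + "<xs:sequence>"
        + "<xs:element name='CD' type='xs:string'/>"
        + "<xs:element name='S' type='xs:string'/>"
        + "<xs:element name='T' type='xs:string'/>"
        + "</xs:sequence>"
        + "<xs:attribute name='name' type='xs:string'/>"
        + "</xs:complexType>"
        + "</xs:schema>";

    // Returns true if the document is well-formed and valid with respect to SCHEMA.
    static boolean isValid(String xml) {
        Schema schema;
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
        } catch (SAXException e) {
            // The schema itself is broken: a programming error in this sketch.
            throw new IllegalStateException(e);
        }
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Malformed or invalid documents are reported as SAX exceptions.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<course name='SDV'><CD/><S/><T/></course>"));
        System.out.println(isValid("<course name='SDV'><T/><CD/><S/></course>"));
    }
}
```

The second document has the same elements in the wrong order, so it is well-formed XML but does not match the sequence in the schema.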
Programming Language Support
To write programs that deal with course documents: represent course documents in a programming language!
interface Course {
    public String getName();
    public void setName(String value);
    public CDT getCD();
    public void setCD(CDT value);
    ...
}
Could be generated from the schema for course documents!
Programming Language Support
A course document for a given course
A course composer: generates XML from strings input by the user
An e-learning platform, a web site generator, a result generator: each reads an XML document
Programming Language Support
Generate types (interfaces and classes in Java) from a schema.
Support to generate XML from a value (objects in Java); the XML document will be valid with respect to the schema used!
Support to generate a value from a valid XML document: parsing, including validation!
In Java
JAXB: a library for binding XML documents to Java objects.
A number of packages:
    javax.xml.bind
    javax.xml.parsers
    javax.xml.bind.util
A compiler to generate interfaces and classes from a schema:
    xjc.sh (on unix)
    xjc.bat (on windows)
XJC
prompt> YOURJAXBPATH/xjc.sh courseDoc.xsd
Creates appropriate directories and generates Java code: interfaces and classes corresponding to elements and types in the schema.
XML namespaces become Java packages: org.coursedoc
Plus a package for implementations: org.coursedoc.impl
XJC
Declared complex types:
<complexType name="courseT">
  <sequence>
    ...
  </sequence>
  <attribute name="name" type="string"/>
</complexType>
become interfaces in the proper package:
package org.coursedoc;
public interface CourseT {
    ...
    String getName();
    void setName(String value);
}
XJC
And classes implementing these:
package org.coursedoc.impl;
public class CourseTImpl implements org.coursedoc.CourseT {
    ...
}
Experiment yourself to see what happens when you use an anonymous type! What happens when you declare a simple type?
XJC
Declared elements
<element name="course" type="courseT"/>
become interfaces in the proper package:
package org.coursedoc;
public interface Course extends CourseT {}
and implementing classes:
package org.coursedoc.impl;
public class CourseImpl implements org.coursedoc.Course {
    ...
}
XJC
One thing to be aware of is the treatment of minOccurs and maxOccurs when they are not the default 1.
<complexType name="courseT">
  <sequence>
    ...
    <element name="teacher" type="teacherT" minOccurs="1" maxOccurs="unbounded"/>
  </sequence>
  ...
</complexType>
public interface CourseT {
    java.util.List getTeacher();
}
There is no setTeacher! Use add on the list!
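The "getter only, add on the list" convention can be sketched with hand-written stand-ins for the generated code. This is not actual XJC output: the slide does not show teacherT, so teachers are represented as plain strings here, and the helper courseWithTeachers and the teacher names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class TeacherListSketch {
    // Stand-in for the generated interface: a getter, but no setter.
    // Assumption: String replaces the elided teacherT type.
    interface CourseT {
        List<String> getTeacher();
    }

    // Stand-in for the generated implementation class.
    static class CourseTImpl implements CourseT {
        private final List<String> teacher = new ArrayList<>();

        @Override
        public List<String> getTeacher() {
            // The live list is returned, so callers modify the course through it.
            return teacher;
        }
    }

    // Hypothetical helper: builds a course by adding to the list from the getter.
    static CourseT courseWithTeachers(String... names) {
        CourseT course = new CourseTImpl();
        for (String name : names) {
            course.getTeacher().add(name);
        }
        return course;
    }

    public static void main(String[] args) {
        CourseT course = courseWithTeachers("Ann", "Bo");
        course.getTeacher().add("Cy");
        System.out.println(course.getTeacher());
    }
}
```

Because the getter hands out the list itself rather than a copy, every add is visible the next time getTeacher is called, which is why no setter is needed.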
XJC
Also produced:
A class implementing the abstract class javax.xml.bind.JAXBContext (see under org/coursedoc/impl/runtime)!
It has control over a grammar for the language you defined with the XML schema.
It has a method returning a parser for this grammar that generates internal representations from a valid XML document.
It has a method returning a "serializer" that produces valid XML from an internal representation.
javax.xml.bind
Unmarshal
Taking an XML document into a Java program:
    javax.xml.bind.JAXBContext jc = javax.xml.bind.JAXBContext.newInstance("org.coursedoc");
    javax.xml.bind.Unmarshaller u = jc.createUnmarshaller();
    u.setValidating(true);
    Course c = (Course) u.unmarshal(xmlFile);
Parse the file according to the rules in the schema!
Generate an instance of CourseImpl (with all subcomponents)!
javax.xml.bind
Marshal
Producing XML from a Java program:
    javax.xml.bind.JAXBContext jc = javax.xml.bind.JAXBContext.newInstance("org.coursedoc");
    javax.xml.bind.Marshaller m = jc.createMarshaller();
    org.coursedoc.Course sdv = new org.coursedoc.impl.CourseImpl();
    ...
    java.io.OutputStream os = new java.io.FileOutputStream(fileName + ".xml");
    m.marshal(sdv, os);
javax.xml.bind
Things you should explore
What if some content in an XML file has to match a regular expression? Is this checked during marshalling?
What if some content is to be of a union type?
How can the programmer deal with validation issues? Exceptions are thrown, but events are also generated! You just register a ValidationEventHandler with the unmarshaller! See how to use this!
More?
Some programming languages may offer simpler support for programming with XML documents:
file.xml
There is no schema that formalizes its structure!
An internal representation of file.xml can still be handy to have!
Xml parsers
<recipe name="lemon pie">
  <ingredient name="sugar" amount="3spoons"/>
  <instructions>
    Start by turning on the oven ...
  </instructions>
</recipe>
Element: recipe
    Attribute: name = lemon pie
    Element: ingredient
        Attribute: name = sugar
        Attribute: amount = 3spoons
    Element: instructions
        Text: Start by ...
Xml parsers in java
DOM
The package javax.xml.parsers ships with the standard distribution of Java.
class DocumentBuilder {
    Document parse(InputStream in)
    Document newDocument()
    ...
}
interface Document extends Node {
    Node getFirstChild()
    Element createElement(String tagName)
    NodeList getElementsByTagName(String tagName)
    ...
}
Xml parsers in java
DOM
DocumentBuilderFactory.newInstance().newDocumentBuilder() returns a DocumentBuilder! When you use it to parse, it builds an internal tree representation of the whole document!
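As a minimal sketch of DOM parsing, the following reads the recipe document from the earlier slide into a tree and then picks values out of it. The XML is supplied as a string rather than a file to keep the example self-contained, and the instructions text is cut where the slide cuts it. The class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class RecipeDomSketch {
    static final String RECIPE =
          "<recipe name=\"lemon pie\">"
        + "<ingredient name=\"sugar\" amount=\"3spoons\"/>"
        + "<instructions>Start by turning on the oven</instructions>"
        + "</recipe>";

    // Parses the whole document into an in-memory tree.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    // Walks the tree: root element, its attribute, and two child elements.
    static String describe(Document doc) {
        Element recipe = doc.getDocumentElement();
        Element ingredient = (Element) recipe.getElementsByTagName("ingredient").item(0);
        Element instructions = (Element) recipe.getElementsByTagName("instructions").item(0);
        return recipe.getAttribute("name") + ": "
            + ingredient.getAttribute("amount") + " of " + ingredient.getAttribute("name") + "; "
            + instructions.getTextContent();
    }

    public static void main(String[] args) {
        System.out.println(describe(parse(RECIPE)));
    }
}
```

No schema is involved here: the tree is built from whatever well-formed XML the parser is given, so the program itself has to know which elements and attributes to look for.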
Xml parsers in java
SAX
In the same package javax.xml.parsers we find
class SAXParser {
    void parse(InputStream in, DefaultHandler dh)
    ...
}
class DefaultHandler implements ContentHandler {    (from org.xml.sax.helpers)
    void startDocument()
    void endDocument()
    ...
}
All methods are implemented as DO NOTHING; you should redefine the interesting ones!
Xml parsers in java
SAX
You get an instance using
SAXParserFactory.newInstance().newSAXParser()
When you use parse to read a document, it doesn't build an internal representation; instead it generates events on finding elements, attributes, content and more. You can use the event handler to build an internal representation yourself!
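The event-driven style can be sketched with a DefaultHandler subclass that redefines three callbacks and records what it sees. The recipe document from the earlier slide is used as input, supplied as a string to keep the example self-contained; the class names and the textual event format are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class RecipeSaxSketch {
    // Redefines only the interesting callbacks; the rest stay as do-nothing defaults.
    static class EventLog extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            events.add("start " + qName);
            for (int i = 0; i < attrs.getLength(); i++) {
                events.add("attr " + attrs.getQName(i) + "=" + attrs.getValue(i));
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end " + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // A parser may deliver one text run in several chunks; each chunk is logged.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                events.add("text " + text);
            }
        }
    }

    // Parses the document and returns the recorded events in order.
    static List<String> events(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            EventLog log = new EventLog();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), log);
            return log.events;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String recipe = "<recipe name=\"lemon pie\">"
            + "<ingredient name=\"sugar\" amount=\"3spoons\"/>"
            + "<instructions>Start by turning on the oven</instructions>"
            + "</recipe>";
        for (String event : events(recipe)) {
            System.out.println(event);
        }
    }
}
```

The list of events is itself a tiny internal representation. A real handler would typically keep a stack of open elements in startElement and endElement and build its own objects from them.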