Part I
Outline of this part
A Word About Myself
Torsten Grust
Originally from Hannover
1989–1994 Student of Computer Science @ TU Clausthal
1994–2004 Database Research @ U Konstanz
1999 Doctorate (Promotion)
2000 Visiting Scientist @ IBM, DB2 Everyplace
2004 Habilitation
2004–2005 Professor @ TU Clausthal
2005–2008 Professor @ TU München
since 9/2008 Professor @ U Tübingen
Welcome to this Course . . .
We will use relational database technology to develop a highly efficient, scalable processor for XML languages like XPath, XQuery, and XML Schema.
This means that
1 you will get to know these XML technologies quite well, and
2 you can apply and deepen your (rusty?) knowledge of RDBMSs in a
Relational XML Processing
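XML Processors ≡ Tree Processors
⇒ This is a course on Relational Tree Processors.
Relational Tree Encoding E
[Figure: a tree is mapped by the encoding E to a relational representation.]
Map tree queries into relational queries over tree encodings:
[Diagram: the tree query maps Tree to Tree. E maps the input Tree to Rel, the relational query maps Rel to Rel, and E⁻¹ maps the resulting Rel back to the output Tree.]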
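The idea of answering a tree query with a relational query over an encoding can be sketched in a few lines. The slides do not say which encoding E is used at this point, so the sketch below assumes a simple preorder/postorder rank encoding as one possible choice. The table name `node`, the helper names `encode` and `descendants`, and the sample document are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def encode(root):
    """Assumed encoding E: one row (pre, post, tag) per element."""
    rows = []
    counters = {"pre": 0, "post": 0}

    def visit(node):
        pre = counters["pre"]          # preorder rank: assigned on entry
        counters["pre"] += 1
        for child in node:
            visit(child)
        rows.append((pre, counters["post"], node.tag))
        counters["post"] += 1          # postorder rank: assigned on exit

    visit(root)
    return sorted(rows)


def descendants(db, context_pre):
    """Relational counterpart of the tree query 'descendants of a node'."""
    return db.execute(
        """SELECT d.pre, d.tag
           FROM node AS c, node AS d
           WHERE c.pre = ? AND d.pre > c.pre AND d.post < c.post
           ORDER BY d.pre""",
        (context_pre,),
    ).fetchall()


# Hypothetical sample tree: a(b(c, d), e)
doc = ET.fromstring("<a><b><c/><d/></b><e/></a>")
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE node (pre INTEGER PRIMARY KEY, post INTEGER, tag TEXT)")
db.executemany("INSERT INTO node VALUES (?, ?, ?)", encode(doc))

result = descendants(db, 1)  # context node <b> has preorder rank 1
```

Under this assumed encoding, a node d is a descendant of c exactly when d comes after c in preorder and before c in postorder, so the tree navigation step turns into a plain join with two range predicates.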
Compiling XQuery to Relational Algebra (1)
Input: XQuery Expression
Query against an Internet auction database (think eBay): How many auction items are listed in each of the site’s [geographical] regions?
for $r in doc("auction.xml")/site/regions/*
return count($r//item)
Tree query: Note how this query uses tree navigation operators / (read: child) and // (descendant) to explore the input XML document auction.xml.
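To make the meaning of the child and descendant steps concrete, here is a minimal sketch that evaluates the same query directly on a tree in memory, using Python's standard `xml.etree.ElementTree`. The tiny document in `AUCTION_XML` is made up for illustration and is not the real auction.xml, and the function name `items_per_region` is hypothetical. Unlike the XQuery expression, the sketch also returns each region's tag name for readability.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for auction.xml (illustration only).
AUCTION_XML = """
<site>
  <regions>
    <africa><item id="i1"/></africa>
    <asia><item id="i2"/><item id="i3"/></asia>
    <europe/>
  </regions>
</site>
"""


def items_per_region(xml_text):
    """Mimic: for $r in /site/regions/* return count($r//item)."""
    site = ET.fromstring(xml_text)           # root element <site>
    regions = site.findall("./regions/*")    # child steps: regions, then *
    # './/item' selects all item elements below the region (descendant step)
    return [(r.tag, len(r.findall(".//item"))) for r in regions]


counts = items_per_region(AUCTION_XML)
```

This walks the tree node by node, which is exactly the kind of evaluation a relational tree processor replaces with queries over an encoded table.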
Compiling XQuery to Relational Algebra (3)
Output: Relational Algebra (MonetDB’s Dialect)
...
a0000 := a0004.reverse().sort().reverse();
a0000 := a0000.CTrefine(a0003);
a0000 := a0000.CTrefine(a0002);
a0000 := a0000.mark(0@0).reverse();
a0001 := a0000.leftjoin(a0002);
a0005 := a0000.leftjoin(a0004);
a0006 := a0000.leftjoin(a0003);
...
a0003 := count(a0004.reverse());
a0007 := a0003.reverse().mark(0@0).reverse();
a0008 := a0003.mark(0@0).reverse();
...
Pathfinder
For about 6½ years now, work has been underway to design and build the purely relational XQuery processor Pathfinder.
Joint work with a couple of brilliant guys from
Pathfinder generates an internal algebraic representation of XQuery expressions and then emits
1 MIL code for consumption by MonetDB/XQuery, or
2 SQL:1999 code to be executed by an off-the-shelf RDBMS, e.g.,
Pathfinder & IBM DB2 vs. 110+ MB of XML
Hands On!
In a sense, this course is an in-depth tour of the techniques and concepts behind Pathfinder.
Because Pathfinder has been under development since 2002, the system is already usable and provides an ideal playground for us.
Available under the Mozilla OSS License
www.pathfinder-xquery.org www.monetdb-xquery.org
Source code and installers for Unix (Linux, Mac OS X), Windows.
Further Reading Material . . .
. . . the XML standard family: http://www.w3.org/XML/ (links marked with a symbol are frequently found on the slides)
Warning: rather impenetrable at first sight!
. . . on XPath and XQuery:
XQuery from the Experts, Jonathan Robie et al., ISBN 0-321-18060-7, Addison-Wesley, 2003
XQuery: The XML Query Language, Michael Brundage, ISBN 0-321-16581-0, Addison-Wesley, 2004
. . . various research papers on how database technology can embrace XML, XPath, and XQuery (this is an active research area).
Further Reading Material
Easily digestible introductions to XML, XPath, and XQuery:
The Annotated XML Specification
http://www.xml.com/axml/testaxml.htm
Chapter ’XPath’ of ’XML in a Nutshell’ (O’Reilly)
http://www.oreilly.com/catalog/xmlnut2/chapter/
XQuery: A Guided Tour
http://www.datadirect.com/developer/xml/xquery/docs/katz_c01.pdf
Organizational Matters
Schedule
            Time                Location
Lecture     Thu, 13:15–14:45    Sand 6/7, small lecture hall (kleiner Hörsaal)
Tutorial    Tue, 13:15–14:45    Sand 6/7, small lecture hall (kleiner Hörsaal) (Jan Rittinger)
Homepage + Course Material
www-db.informatik.uni-tuebingen.de/teaching/ws0809/dbxml
Slides [PDF] available for download (about one day before each session).
How to Get the Most out of This Course?
Exercise problems and exam problems will be very similar.
Participate actively!
Tutorials start next Tuesday (October 28).
Work through the examples and run your own experiments:
Michael Kay's Saxon (www.saxonica.com)
Pathfinder
Pass the written exam or oral examination at the end of the semester.
Make use of "office hours":
Almost any time the doors to our offices (Sand 13, B312 and B318) are open. In effect, that is 90 % of the time we are in.
Questions?
Questions . . . ? Comments . . . ? Suggestions . . . ?