(1)

Database-Supported XML Processors

Prof. Dr. Torsten Grust

[email protected]

(2)

Part I

(3)

Outline of this part

(4)

A Word About Myself

Torsten Grust

Originally from Hannover

1989–1994 Student of Computer Science @ TU Clausthal
1994–2004 Database Research @ U Konstanz
1999 Doctorate (Promotion)
2000 Visiting Scientist @ IBM, DB2 Everyplace
2004 Habilitation
2004–2005 Professor @ TU Clausthal
2005–2008 Professor @ TU München
since 9/2008 Professor @ U Tübingen

(5)

Welcome to this Course . . .

We will use relational database technology to develop a highly efficient, scalable processor for XML languages like XPath, XQuery, and XML Schema.

This means that

1 you will get to know these XML technologies quite well, and

2 you can apply and deepen your (rusty?) knowledge of RDBMSs in a

(6)

Relational XML Processing

XML Processors ≡ Tree Processors

⇒ This is a course on Relational Tree Processors.

Relational Tree Encoding E

[Figure: the encoding E maps a tree to its relational representation]

Map tree queries into relational queries over tree encodings:

    Tree ----- tree query -----> Tree
      |                            ^
      | E                          | E⁻¹
      v                            |
    Rel --- relational query ---> Rel
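
To make the roles of E and E⁻¹ more concrete, here is a minimal Python sketch. It assumes a pre/size/level encoding, in which every node becomes one row holding its preorder rank, the size of its subtree, and its depth. This slide does not fix a particular encoding, so the scheme and all names (Node, encode, decode, descendants_rel) are assumptions made for this illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    children: list["Node"] = field(default_factory=list)


def encode(root):
    """E: flatten a tree into rows (pre, size, level, tag) in document order."""
    rows = []

    def visit(node, level):
        pre = len(rows)
        rows.append(None)  # reserve the slot; size is known only after the subtree
        for child in node.children:
            visit(child, level + 1)
        size = len(rows) - pre - 1  # number of nodes below this one
        rows[pre] = (pre, size, level, node.tag)

    visit(root, 0)
    return rows


def decode(rows):
    """E^-1: rebuild the tree from rows given in document (pre) order."""
    stack = []  # stack[l] is the most recently seen node at level l
    root = None
    for _pre, _size, level, tag in rows:
        node = Node(tag)
        if level == 0:
            root = node
        else:
            stack[level - 1].children.append(node)
        del stack[level:]  # forget nodes at this level and deeper
        stack.append(node)
    return root


def descendants_rel(rows, pre):
    """Descendant step as a relational range selection on the pre column."""
    size = rows[pre][1]
    return [row for row in rows if pre < row[0] <= pre + size]
```

In this sketch the tree query "all descendants of node v" turns into a plain range predicate over the table, and decode(encode(t)) gives back a tree equal to t, which is the round trip the diagram describes.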

(7)

Compiling XQuery to Relational Algebra (1)

Input: XQuery Expression

Query against an Internet auction database (think eBay): How many auction items are listed in each of the site’s [geographical] regions?

for $r in doc("auction.xml")/site/regions/*
return count($r//item)

Tree query: Note how this query uses the tree navigation operators / (read: child) and // (descendant) to explore the input XML document auction.xml.
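
The following Python sketch suggests how the child and descendant steps of this query might be phrased as an ordinary SQL query over an encoded document. It is not Pathfinder's actual output. It assumes a hypothetical table doc(pre, size, level, tag) that stores each node's preorder rank, subtree size, and depth, and it uses a tiny made-up stand-in for auction.xml. The region tag is returned next to each count for readability, whereas the XQuery expression itself yields only the counts.

```python
import sqlite3

# Toy stand-in for auction.xml (hypothetical data), already encoded as
# (pre, size, level, tag) rows in document order.
ROWS = [
    (0, 8, 0, "site"),
    (1, 6, 1, "regions"),
    (2, 2, 2, "africa"),
    (3, 0, 3, "item"),
    (4, 0, 3, "item"),
    (5, 1, 2, "asia"),
    (6, 0, 3, "item"),
    (7, 0, 2, "europe"),
    (8, 0, 1, "people"),
]

QUERY = """
SELECT r.tag, COUNT(i.pre)
FROM doc AS s
-- child step: /site/regions
JOIN doc AS g ON g.pre > s.pre AND g.pre <= s.pre + s.size
             AND g.level = s.level + 1
-- child step: regions/*
JOIN doc AS r ON r.pre > g.pre AND r.pre <= g.pre + g.size
             AND r.level = g.level + 1
-- descendant step: $r//item (LEFT JOIN keeps regions without items)
LEFT JOIN doc AS i ON i.pre > r.pre AND i.pre <= r.pre + r.size
                  AND i.tag = 'item'
WHERE s.level = 0 AND s.tag = 'site' AND g.tag = 'regions'
GROUP BY r.pre, r.tag
ORDER BY r.pre
"""


def items_per_region():
    """Load the toy document into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE doc (pre INTEGER PRIMARY KEY, size INTEGER, "
            "level INTEGER, tag TEXT)"
        )
        conn.executemany("INSERT INTO doc VALUES (?, ?, ?, ?)", ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()
```

On the toy data, items_per_region() returns [('africa', 2), ('asia', 1), ('europe', 0)]. Both navigation operators become range predicates on pre; the child step additionally checks that the level differs by exactly one.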

(8)
(9)

Compiling XQuery to Relational Algebra (3)

Output: Relational Algebra (MonetDB’s Dialect)

. . .
a0000 := a0004.reverse ().sort ().reverse ();
a0000 := a0000.CTrefine (a0003);
a0000 := a0000.CTrefine (a0002);
a0000 := a0000.mark (0@0).reverse ();
a0001 := a0000.leftjoin (a0002);
a0005 := a0000.leftjoin (a0004);
a0006 := a0000.leftjoin (a0003);
. . .
a0003 := count(a0004.reverse ());
a0007 := a0003.reverse ().mark (0@0).reverse ();
a0008 := a0003.mark (0@0).reverse ();
. . .

(10)

Pathfinder

For about 6½ years now, work has been underway to design and build the purely relational XQuery processor Pathfinder. Joint work with a couple of brilliant guys from

Pathfinder generates an internal algebraic representation of XQuery expressions and then emits

1 MIL code for consumption by MonetDB/XQuery, or

2 SQL:1999 code to be executed by off-the-shelf RDBMSs, e.g.,

(11)

Pathfinder & IBM DB2 vs. 110+ MB of XML

(12)

Hands On!

In a sense, this course is an in-depth tour of the techniques and concepts behind Pathfinder.

Because Pathfinder has been under development since 2002, the system is already usable and provides an ideal playground for us.

Available under the Mozilla OSS License

www.pathfinder-xquery.org
www.monetdb-xquery.org

Source code and installers for Unix (Linux, Mac OS X), Windows.

(13)

Further Reading Material . . .

. . . the XML standard family: http://www.w3.org/XML/ (links marked with are frequently found on the slides)

Warning: rather impenetrable at first sight!



. . . on XPath and XQuery:

XQuery from the Experts, Jonathan Robie et al., ISBN 0-321-18060-7, Addison-Wesley, 2003

XQuery: The XML Query Language, Michael Brundage, ISBN 0-321-16581-0, Addison-Wesley, 2004

. . . various research papers on how database technology can embrace XML, XPath, and XQuery (this is a lively research area);

(14)

Further Reading Material

Easily digestible introductions to XML, XPath, and XQuery:

The Annotated XML Specification
http://www.xml.com/axml/testaxml.htm

Chapter "XPath" of "XML in a Nutshell" (O'Reilly)
http://www.oreilly.com/catalog/xmlnut2/chapter/

XQuery: A Guided Tour
http://www.datadirect.com/developer/xml/xquery/docs/katz_c01.pdf

(15)

Organizational Matters

Schedule

            Time                Location
Lecture     Thu, 13:15–14:45    Sand 6/7, small lecture hall
Tutorial    Tue, 13:15–14:45    Sand 6/7, small lecture hall (Jan Rittinger)

Homepage + Course Material

www-db.informatik.uni-tuebingen.de/teaching/ws0809/dbxml
Slides [PDF] are available for download (about one day before each session).

(16)

How Do You Benefit from This Course?

Take an active part!

Tutorial exercises and exam questions will be very similar.

Tutorials start next Tuesday (October 28).

Work through the examples and start your own experiments:

Michael Kay's Saxon (www.saxonica.com)

Pathfinder

Pass the written exam or oral colloquium at the end of the semester.

Make use of our "office hours": almost any time the doors to our offices (Sand 13, B312 and B318) are open. In effect, that is 90 % of the time we are in.

(17)

Questions?

Questions . . . ? Comments . . . ? Suggestions . . . ?
