
Other Approaches to XQuery Processing

ADT 2007

Lecture 6

Stefan Manegold

http://www.cwi.nl/~manegold/

Schedule

• 17.09.2007:
  • RDBMS back-end support for XML/XQuery (1/2):
    • Document Representation (XPath Accelerator, Pre/Post plane)
    • XPath navigation (Staircase Join)
• 01.10.2007:
  • XQuery to Relational Algebra Compiler:
    • Item- & Sequence-Representation
    • Efficient FLWoR Evaluation (Loop-Lifting)
    • Optimization
• 08.10.2007:
  • RDBMS back-end support for XML/XQuery (2/2):
    • Updateable Document Representation

Topics

Other approaches & techniques (selection, far from complete!)

• Document storage / tree encoding:
  • ORDPATH
  • DataGuides
• XPath processing:

DataGuides

XPath Accelerator, ORDPATH & similar encoding schemes
• encode the document's tree structure in the node ranks/labels they assign

DataGuides
• Developed in the context of the Lore project (DBMS for semi-structured data), Stanford University; Goldman & Widom, VLDB 1997
• encode the document's tree structure in relation names

Observation:
• Each node is uniquely identified by its path from the root
• Paths of siblings with equal tag names can be unified, provided we keep their relative order (rank) explicitly
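The observation above can be illustrated with a minimal sketch that shreds a small XML document into one relation per root-to-node path and stores each node's rank among its siblings. The function name `shred_by_path`, the tuple layout (node id, parent id, rank), and the sample document are assumptions made for illustration; they are not taken from the Lore system.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def shred_by_path(xml_text):
    """Map each root-to-node path to a list of (node_id, parent_id, rank)."""
    root = ET.fromstring(xml_text)
    relations = defaultdict(list)
    next_id = 0

    def visit(elem, path, parent_id, rank):
        nonlocal next_id
        node_id = next_id          # ids are assigned in document order
        next_id += 1
        relations[path].append((node_id, parent_id, rank))
        # rank = 1-based position among the children of the same parent
        for child_rank, child in enumerate(elem, start=1):
            visit(child, path + "/" + child.tag, node_id, child_rank)

    visit(root, "/" + root.tag, None, 1)
    return dict(relations)


# Hypothetical sample document
SAMPLE = "<bib><book><title/><author/><author/></book><book><title/></book></bib>"
RELS = shred_by_path(SAMPLE)
for name in sorted(RELS):
    print(name, RELS[name])
```

Both author elements land in the same relation /bib/book/author; only their rank keeps them apart, which is why the rank has to be stored explicitly.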

Definition: given a semistructured data instance DB, a DataGuide for DB is a graph G such that:
• every path in DB also occurs in G
• every path in G occurs in DB
• every path in G is unique

Example: [figure not recovered]

■ Multiple DataGuides for the same data: [figure not recovered]

Definition: Let p, p' be two path expressions and G a graph; we define p ≡_G p' if p(G) = p'(G), i.e., p and p' are indistinguishable on G.

Definition: G is a strong DataGuide for a database DB if ≡_G is the same as ≡_DB.

Example:
• G1 is a strong DataGuide
• G2 is not strong:
  person.project ≢_DB dept.project
  person.project ≢_G1 dept.project
  person.project ≡_G2 dept.project


■ Constructing the strong DataGuide G:

    Nodes(G) = {{root}}
    Edges(G) = ∅
    while changes do
        choose s in Nodes(G), a in Labels
        s' = {y | x in s, (x -a-> y) in Edges(DB)}
        if s' ≠ ∅:
            add s' to Nodes(G)
            add (s -a-> s') to Edges(G)

• Use a hash table for Nodes(G)
• This is precisely the powerset automaton construction.
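As a rough sketch of the construction above, the following Python version uses a worklist instead of "while changes" and a dict/set (hash tables) for Nodes(G) and Edges(G). The edge-triple input format and the small person/dept/project database are assumptions made for illustration only.

```python
from collections import defaultdict


def strong_dataguide(db_edges, root):
    """db_edges: iterable of (source, label, target) triples of DB.

    Returns (nodes, edges): nodes is a set of frozensets of DB nodes,
    edges maps (guide_node, label) to the target guide node.
    """
    out = defaultdict(lambda: defaultdict(set))   # x -> label -> {y}
    for x, a, y in db_edges:
        out[x][a].add(y)

    start = frozenset([root])
    nodes = {start}            # hash table for Nodes(G)
    edges = {}
    todo = [start]             # guide nodes whose outgoing labels are unexplored
    while todo:
        s = todo.pop()
        targets = defaultdict(set)
        for x in s:
            for a, ys in out.get(x, {}).items():
                targets[a] |= ys          # s' = {y | x in s, x -a-> y}
        for a, ys in targets.items():
            s2 = frozenset(ys)
            edges[(s, a)] = s2
            if s2 not in nodes:
                nodes.add(s2)
                todo.append(s2)
    return nodes, edges


# Hypothetical database: two persons and one department sharing a project
DB_EDGES = [
    ("root", "person", "p1"), ("root", "person", "p2"), ("root", "dept", "d1"),
    ("p1", "project", "j1"), ("p2", "project", "j2"), ("d1", "project", "j1"),
]
GUIDE_NODES, GUIDE_EDGES = strong_dataguide(DB_EDGES, "root")
```

In this sample, person.project and dept.project reach different target sets ({j1, j2} versus {j1}), so the strong DataGuide keeps two separate project nodes.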


Monet XML approach

• Early attempt to store and query XML data in MonetDB, by Albrecht Schmidt
• Not related to Pathfinder & MonetDB/XQuery
• No XQuery compiler
• XMark queries are hand-crafted and hand-optimized in MIL
• Child, Descendant, Parent & Ancestor steps become regular expressions on the relation names (i.e., catalog)
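To make the last point concrete, here is a minimal sketch of how forward steps could be turned into a regular expression over path-shaped relation names. The catalog contents, the "/"-separated naming scheme, and the helper names are assumptions for illustration; this is not the actual Monet XML catalog layout, and Parent/Ancestor steps are left out.

```python
import re


def steps_to_regex(steps):
    """steps: list of (axis, tag) pairs, axis in {'child', 'descendant'}."""
    parts = []
    for axis, tag in steps:
        if axis == "child":
            parts.append("/" + re.escape(tag))
        elif axis == "descendant":
            # any number of intermediate path components, then the tag
            parts.append("(?:/[^/]+)*/" + re.escape(tag))
        else:
            raise ValueError("unsupported axis: " + axis)
    return re.compile("".join(parts))


def matching_relations(catalog, steps):
    """Return the relation names whose whole name matches the steps."""
    pattern = steps_to_regex(steps)
    return [name for name in catalog if pattern.fullmatch(name)]


# Hypothetical catalog of path-named relations
CATALOG = [
    "/site",
    "/site/people",
    "/site/people/person",
    "/site/people/person/name",
    "/site/regions/europe/item/name",
]
HITS = matching_relations(CATALOG, [("child", "site"), ("descendant", "name")])
```

The query never touches the data relations while resolving the steps; it only filters the catalog and then reads the relations that survive.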

Twig Join Algorithms

So far: interpreted XPath expressions in an imperative manner
• Evaluated XPath expressions step-by-step, as stated in the query
• Given /α1::ν1/α2::ν2/.../αn::νn (axis αi, node test νi), we first evaluated /, then XPath step α1::ν1, then step α2::ν2, ...

This may not always be the best choice:
• Intermediate results can get very large, even if the final result is small

Database context => think in a declarative manner
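The effect of step-by-step evaluation on intermediate result sizes can be seen in a small sketch. It assumes Python's xml.etree.ElementTree, starts from the document element rather than a separate document node, and uses a synthetic document; none of this comes from the lecture itself.

```python
import xml.etree.ElementTree as ET


def evaluate_stepwise(context_node, steps):
    """Evaluate (axis, tag) steps one after another.

    Returns the final node list and the size of each intermediate result.
    """
    context = [context_node]
    sizes = []
    for axis, tag in steps:
        result, seen = [], set()
        for node in context:
            if axis == "child":
                candidates = list(node)
            elif axis == "descendant":
                candidates = list(node.iter())[1:]   # iter() yields node itself first
            else:
                raise ValueError("unsupported axis: " + axis)
            for cand in candidates:
                if cand.tag == tag and id(cand) not in seen:
                    seen.add(id(cand))               # duplicate elimination
                    result.append(cand)
        context = result
        sizes.append(len(context))
    return context, sizes


# Synthetic document: 1000 a/b pairs, only one b has a c child
DOC = ET.Element("doc")
for i in range(1000):
    b = ET.SubElement(ET.SubElement(DOC, "a"), "b")
    if i == 0:
        ET.SubElement(b, "c")

RESULT, SIZES = evaluate_stepwise(
    DOC, [("descendant", "a"), ("child", "b"), ("child", "c")])
print(SIZES)   # [1000, 1000, 1]
```

Two intermediate results of 1000 nodes each are materialized to produce a single result node, which is the cost a holistic (declarative) evaluation tries to avoid.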

Tree Patterns

In fact, XPath is a declarative language.

/descendant::timeline/child::event

“Find all nodes v1, v2, and v3, such that v1 is a document root, v2 is a descendant element of v1 and is named timeline, and v3 is a child element of v2 and named event. All nodes of type v3 form the query result.”

Observe the combination of
(a) predicates on single nodes, and
(b) structural conditions between nodes.
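The declarative reading can be written down almost literally as a query over an encoded node table. The sketch below assumes a pre/post encoding with an explicit parent column and a virtual "#document" row for the document root; the row layout, function names, and sample document are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple

Node = namedtuple("Node", "pre post parent tag")


def encode(root_elem):
    """Return one Node row per element plus a virtual document-root row."""
    rows = []
    pre = post = 0

    def visit(tag, children, parent_pre):
        nonlocal pre, post
        my_pre = pre
        pre += 1
        slot = len(rows)
        rows.append(None)                     # keep rows in pre order
        for child in children:
            visit(child.tag, list(child), my_pre)
        rows[slot] = Node(my_pre, post, parent_pre, tag)
        post += 1

    visit("#document", [root_elem], None)
    return rows


def timeline_events(rows):
    """/descendant::timeline/child::event as a conjunctive query."""
    return [
        v3
        for v1 in rows if v1.parent is None                  # v1 is the document root
        for v2 in rows if v2.tag == "timeline"               # node predicate on v2
        and v1.pre < v2.pre and v2.post < v1.post            # v2 descendant of v1
        for v3 in rows if v3.tag == "event"                  # node predicate on v3
        and v3.parent == v2.pre                              # v3 child of v2
    ]


# Hypothetical sample document
ROWS = encode(ET.fromstring(
    "<history><timeline><event/><era><event/></era></timeline><event/></history>"))
EVENTS = timeline_events(ROWS)
```

The query states only node predicates and structural conditions; the order in which v1, v2, and v3 are bound is left to whatever evaluates it.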

Intuitively expressed as tree patterns:
• Nodes labeled with node predicates
  • Arbitrary predicates are allowed, but predicates on tag names are typical: nodes are labeled with the requested tag name
  • Document root: label /
  • If no /-node is specified: search for the pattern anywhere in the document
• Structural conditions:
  • Double line: ancestor/descendant relationships
  • Single line: parent/child relationships

[Figure: tree pattern with nodes labeled timeline and event]

• Not limited to path patterns; may also be twig patterns
• Mapping between tree patterns and XPath is in general not trivial

Examples: [figure: tree patterns over nodes labeled a to j]

PathStack Algorithm

PathStack Algorithm: Path Patterns

[Figures: step-by-step run of PathStack on a path pattern; surviving labels: a, b, c, d, e; timeline, event; "first timeline node visited"]
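The figures for these slides did not survive, so the following is only a rough sketch of the stack-based idea, not the lecture's own pseudocode. It assumes that every node carries a (start, end) region, that each query node has a stream of matching regions sorted by start, that all pattern edges are ancestor/descendant, and that no document node appears in two streams. Function and variable names are hypothetical.

```python
def path_stack(streams):
    """Match the path pattern q0//q1//...//q(n-1).

    streams[i]: (start, end) regions of the nodes matching query node qi,
    sorted by start. Returns tuples of regions, pattern root first.
    """
    n = len(streams)
    pos = [0] * n                       # read cursor per stream
    stacks = [[] for _ in range(n)]     # entries: (start, end, parent_top)
    results = []

    def expand(level, top, suffix):
        # Entries 0..top of stacks[level] are all ancestors of suffix[0].
        for i in range(top + 1):
            start, end, parent_top = stacks[level][i]
            match = ((start, end),) + suffix
            if level == 0:
                results.append(match)
            else:
                expand(level - 1, parent_top, match)

    while pos[n - 1] < len(streams[n - 1]):     # no leaf left: no more matches
        live = [i for i in range(n) if pos[i] < len(streams[i])]
        q = min(live, key=lambda i: streams[i][pos[i]][0])   # next in document order
        start, end = streams[q][pos[q]]
        pos[q] += 1
        for stack in stacks:                    # drop nodes that ended before start
            while stack and stack[-1][1] < start:
                stack.pop()
        if q > 0 and not stacks[q - 1]:
            continue                            # no ancestor candidate: skip node
        parent_top = len(stacks[q - 1]) - 1 if q > 0 else -1
        if q == n - 1:                          # leaf: output all encoded matches
            leaf = ((start, end),)
            if q == 0:
                results.append(leaf)
            else:
                expand(q - 1, parent_top, leaf)
        else:
            stacks[q].append((start, end, parent_top))
    return results


# Hypothetical regions: two nested timeline nodes, three event nodes
TIMELINES = [(1, 10), (4, 9)]
EVENTS = [(2, 3), (5, 6), (11, 12)]
MATCHES = path_stack([TIMELINES, EVENTS])
```

Each stack holds a chain of nested nodes, and the stored parent_top index compactly encodes which ancestors each entry may combine with; no intermediate result per step is materialized. Parent/child edges would need an additional level check, which this sketch omits.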

PathStack Algorithm: Twig Patterns

So far we only considered path patterns.

Can we extend our ideas for efficient twig pattern evaluation?

Idea:
• Decompose twig patterns into multiple path patterns.
• All path patterns start from the same root.
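The decomposition step can be sketched in a few lines. The nested-tuple representation of a twig pattern and the sample pattern below are assumptions for illustration; edge types (parent/child versus ancestor/descendant) are ignored here.

```python
def root_to_leaf_paths(twig):
    """twig: (label, [child twigs]). Returns one label list per leaf."""
    label, children = twig
    if not children:
        return [[label]]
    return [[label] + path
            for child in children
            for path in root_to_leaf_paths(child)]


# Hypothetical twig: a has children b and e; b has children c and d
TWIG = ("a", [("b", [("c", []), ("d", [])]), ("e", [])])
PATHS = root_to_leaf_paths(TWIG)
print(PATHS)   # [['a', 'b', 'c'], ['a', 'b', 'd'], ['a', 'e']]
```

Every path pattern starts at the twig root, so the matches of the individual paths can later be merged on the nodes they share.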

Example: Decompose twig pattern into path patterns

[Figures: a twig pattern over nodes a, b, c, d, e and the path patterns it decomposes into]

Summary (1/5)

XML
• Document markup
• Data exchange
• Semi-structured
• Tree model
• DTDs
• XML Schema

XPath
• Navigation, location steps, axes, node tests, predicates, functions

XQuery

Summary (2/5)

XML Data Management
• XML file processors
• XML databases
• XML integration platforms
• RDBMS with XML functionality, SQL/XML

Summary (3/5)

Purely Relational XML/XQuery processing: MonetDB/XQuery
• Document encoding: XPath Accelerator (pre/post plane)
• XPath navigation: Staircase Join
• XQuery to Relational Algebra translation
• Item- & Sequence-representation
• Iterations: Loop-lifting
• Loop-lifted staircase join
• Peephole Optimization
• Order-awareness, sort avoidance
• XML/XQuery Update Support

Summary (4/5)

Other approaches & techniques
• Document storage/encoding:
  • ORDPATH
  • DataGuides
• XPath processing:

Summary (5/5)

Literature
• Slides
• Literature references in slides
• Literature references on website: http://homepages.cwi.nl/~mk/onderwijs/adt2007/html/xquery.html

Tentamen / Exam:
• Tuesday October 23, 09:30 – 11:30

Projects: Join the MonetDB Team!

• Own ideas, suggestions, initiative welcome!
• Master Student Projects (6 Months)
  • Various projects, each consisting of both research & implementation
  • See monetdb.cwi.nl/Development/Research/Projects/ for a sample list
  • Feel free to come with your own idea(s)!
• Implementation Projects
  • Both short-term & long-term
  • E.g. open feature requests: sf.net/tracker/?group_id=56967
  • Become owner/maintainer of some (new) part of MonetDB

We Offer...

• 24x7x365 support & advice
• Membership in a kind & friendly Family-Team of Experts
• Chance to participate in & contribute to a large & successful open-source research project
• Lots of experience, exciting research & fun
• Desk & workstation at CWI
  • Fridge, microwave, free coffee, free soup, free cake (occasionally)
  • Master Students only (possibly part-time)
  • Limited availability => FCFS!
• Some pocket money (stage vergoeding, i.e. an internship allowance)
  • Master Students only
