Indexing XML Data in RDBMS using ORDPATH


Microsoft® SQL Server 2005™

Concepts developed by:

Patrick O'Neil, Elizabeth O'Neil
(University of Massachusetts Boston)

Shankar Pal, Istvan Cseri, Oliver Seeliger, Gideon Schaller, Leo Giakoumakis, Vasili Zolotov, Nigel Westbury
(Microsoft Corporation)

Stephan Müller, 5 July 2006

XML Data Model

    <BOOK ISBN="1-55860-438-3">
      <SECTION>
        <TITLE>Bad Bugs</TITLE>
        Nobody loves bad bugs.
        <FIGURE CAPTION="Sample bug"/>
      </SECTION>
      <SECTION>
        <TITLE>Tree Frogs</TITLE>
        All right-thinking people <BOLD>love</BOLD> tree frogs.
      </SECTION>
    </BOOK>

XML Document / Fragment - Properties:

- Hierarchy
- Document Order: 1 < 2 < 3 < 4 < 5 < … < 11 < 12

[Figure: tree of the sample document with the nodes Book, ISBN, Section, Title, "Nobody…", Figure, Caption, Section, Title, "All right…", Bold, "…Frogs", numbered 1 to 12 in document order]

SQL Command:

    CREATE TABLE docs (
        id   INT PRIMARY KEY,
        xdoc XML
    );

Created docs Table:

    ID   XDOC
    1    XML Fragment as BLOB
    2    XML Document as BLOB
    …    …
    7    XML Fragment as BLOB

SQL with embedded XQuery and XPath:

    SELECT id, xdoc.query('
        for $s in /BOOK[@ISBN="1-55860-438-3"]//SECTION
        return <topic> { data($s/TITLE) } </topic>')
    FROM docs;


ORDPATH

What we expect from a labeling scheme:

- Support for structural fidelity (Hierarchy + Document Order)
- Support for efficient structural modifications to the XML tree
  -- insert sub-tree
  -- delete sub-tree
  -- move sub-tree
- Support for high-performance query plans for native XML queries using relational primitives
- Independence of XML schemas typing XML instances

Without relabeling!

Example of an Initial Load

[Figure: the sample document tree labeled with ORDPATH values: Book 1, ISBN 1.1, Section 1.3, Title 1.3.1, "Nobody…" 1.3.3, Figure 1.3.5, Caption 1.3.5.1, Section 1.5, Title 1.5.1, "All right…" 1.5.3, Bold 1.5.5, "…Frogs" 1.5.7]

    ORDPATH   TAG           NODE_TYPE       VALUE
    1         1 (BOOK)      1 (Element)     Null
    1.1       2 (ISBN)      2 (Attribute)   '1-55860-438-3'
    1.3       3 (SECTION)   1 (Element)     Null
    1.3.1     4 (TITLE)     1 (Element)     'Bad Bugs'
    1.3.3     -             4 (Value)       'Nobody loves bad bugs'
    1.3.5     5 (FIGURE)    1 (Element)     Null
    1.3.5.1   6 (CAPTION)   2 (Attribute)   'Sample bug'
    1.5       3 (SECTION)   1 (Element)     Null
    1.5.1     4 (TITLE)     1 (Element)     'Tree frogs'
    1.5.3     -             4 (Value)       'All right-thinking people'
    1.5.5     7 (BOLD)      1 (Element)     'love'
    1.5.7     -             4 (Value)       'tree frogs'

- Hierarchy
- Document Order: 1 < 1.1 < 1.3 < 1.3.1 < … < 1.5.7
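The initial load can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SQL Server's implementation: siblings receive the odd ordinals 1, 3, 5, … in document order, attributes are numbered before other children, and a leaf element keeps its text as its own value (as TITLE and BOLD do in the table). The function name `initial_load` and the row layout are hypothetical.

```python
import xml.etree.ElementTree as ET

XML = (
    '<BOOK ISBN="1-55860-438-3">'
    "<SECTION><TITLE>Bad Bugs</TITLE>Nobody loves bad bugs."
    '<FIGURE CAPTION="Sample bug"/></SECTION>'
    "<SECTION><TITLE>Tree Frogs</TITLE>All right-thinking people "
    "<BOLD>love</BOLD> tree frogs.</SECTION>"
    "</BOOK>"
)


def fmt(label):
    """Render a tuple such as (1, 3, 5) as the dotted string '1.3.5'."""
    return ".".join(map(str, label))


def initial_load(elem, label=(1,)):
    """Yield (ordpath, node_type, tag, value) rows in document order."""
    children = list(elem)
    value = None
    if not children:
        # Assumption: a leaf element stores its text in its own row.
        value = (elem.text or "").strip() or None
    yield fmt(label), "Element", elem.tag, value

    ordinal = 1  # only odd ordinals are handed out during the initial load
    for name, attr_value in elem.attrib.items():
        yield fmt(label + (ordinal,)), "Attribute", name, attr_value
        ordinal += 2

    if not children:
        return
    # Mixed content: leading text, then each child followed by its tail text.
    items = [elem.text]
    for child in children:
        items += [child, child.tail]
    for item in items:
        if isinstance(item, str):
            text = item.strip()
            if not text:
                continue  # whitespace-only text gets no label
            yield fmt(label + (ordinal,)), "Value", None, text
        elif item is not None:
            yield from initial_load(item, label + (ordinal,))
        else:
            continue
        ordinal += 2


rows = list(initial_load(ET.fromstring(XML)))
for row in rows:
    print(row)
```

Skipping the even ordinals at load time is what leaves room for later insertions between siblings.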


ORDPATH Example Value:

    1.5.3.-9.11

Li/Oi Pair Design:

    L0 O0 | L1 O1 | … | LK OK

ORDPATH bit pattern:

    0100101101010110001111111000011

We need a prefix-free Li encoding…


Li/Oi Pair Design, using Li values from Figure 3.2a

ORDPATH example value: 1.5.3.-9.11

    i   Li prefix   Li   Oi bits   Oi
    0   01          3    001       1
    1   01          3    101       5
    2   01          3    011       3
    3   00011       4    1111      -9
    4   100         4    0011      11

ORDPATH bit pattern:

    01001 01101 01011 000111111 1000011
    = 0100101101010110001111111000011
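The worked example can be reproduced with a small encoder. The sketch below is an assumption-laden illustration: its prefix table contains only the three rows that can be read off this slide (prefix `01` with 3 payload bits starting at ordinal 0, `00011` with 4 bits starting at -24, `100` with 4 bits starting at 8). The full Figure 3.2a table has more rows, which are not reproduced here, and padding to a byte boundary is left out. The names `PREFIXES`, `encode`, and `decode` are hypothetical.

```python
# (Li prefix bits, Li = number of Oi payload bits, lowest ordinal in the range)
# Assumption: only the rows derivable from the slide's example are listed.
PREFIXES = [
    ("00011", 4, -24),  # ordinals -24 .. -9
    ("01", 3, 0),       # ordinals   0 ..  7
    ("100", 4, 8),      # ordinals   8 .. 23
]


def encode(ordpath):
    """Turn a dotted ORDPATH such as '1.5.3.-9.11' into a bit string."""
    bits = []
    for component in ordpath.split("."):
        ordinal = int(component)
        for prefix, length, low in PREFIXES:
            if low <= ordinal < low + (1 << length):
                bits.append(prefix + format(ordinal - low, f"0{length}b"))
                break
        else:
            raise ValueError(f"ordinal {ordinal} is outside this partial table")
    return "".join(bits)


def decode(bits):
    """Inverse of encode; works because no prefix is a prefix of another."""
    ordinals = []
    pos = 0
    while pos < len(bits):
        for prefix, length, low in PREFIXES:
            if bits.startswith(prefix, pos):
                start = pos + len(prefix)
                ordinals.append(low + int(bits[start:start + length], 2))
                pos = start + length
                break
        else:
            raise ValueError(f"no known prefix at bit {pos}")
    return ".".join(map(str, ordinals))


print(encode("1.5.3.-9.11"))  # 0100101101010110001111111000011
print(decode("0100101101010110001111111000011"))  # 1.5.3.-9.11
```

Because the prefixes are prefix-free, the decoder never needs a separator between components: the prefix alone says how many payload bits follow.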

Advantages of comparing ORDPATH Values:

- Determining the ancestor-descendant relationship for any two ORDPATHs is very easy.
- The distance between two ORDPATHs is easy to determine.
- Simple bitstring (or byte-by-byte) comparison yields document order.
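A minimal sketch of these comparisons, assuming labels are handled as tuples of integers rather than as the stored bit strings (Python's tuple comparison stands in for the bitstring comparison here). The helper names `parse`, `is_ancestor`, and `document_order` are hypothetical.

```python
def parse(label):
    """'1.3.5.1' -> (1, 3, 5, 1)"""
    return tuple(int(part) for part in label.split("."))


def is_ancestor(ancestor, node):
    """True if `ancestor` is a proper prefix of `node`."""
    a, n = parse(ancestor), parse(node)
    return len(a) < len(n) and n[:len(a)] == a


def document_order(labels):
    """Sort labels component by component, which yields document order."""
    return sorted(labels, key=parse)


labels = ["1.5.7", "1.3.5.1", "1", "1.5", "1.3", "1.1", "1.3.3"]
print(document_order(labels))
# ['1', '1.1', '1.3', '1.3.3', '1.3.5.1', '1.5', '1.5.7']
print(is_ancestor("1.3", "1.3.5.1"))  # True
print(is_ancestor("1.3", "1.5.1"))    # False
```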

Descendants of a given Context Node

[Figure: the ORDPATH-labeled document tree with the context node cn = 1.3 marked]

The infoset table is the one from the initial load, with the columns ORDPATH, TAG, NODE_TYPE, VALUE.

SQL Query:

    SELECT Ordpath
    FROM infoset
    WHERE 1.3 < Ordpath    -- 1.3 is the context node (cn)
      AND 1.4 > Ordpath    -- 1.4 is the upper bound (cn+1)
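The range condition of the query can be imitated outside the database. The following sketch assumes a list of labels kept in document order, which plays the role of the ORDPATH index, and uses binary search for the two bounds. The function name `descendants` and the tuple representation are hypothetical.

```python
import bisect


def parse(label):
    """'1.3.5.1' -> (1, 3, 5, 1)"""
    return tuple(int(part) for part in label.split("."))


def descendants(sorted_labels, cn):
    """Return all labels strictly between cn and cn+1 (last ordinal plus one)."""
    low = parse(cn)
    high = low[:-1] + (low[-1] + 1,)  # 1.3 -> 1.4
    keys = [parse(label) for label in sorted_labels]
    start = bisect.bisect_right(keys, low)  # first key greater than cn
    stop = bisect.bisect_left(keys, high)   # first key not below cn+1
    return sorted_labels[start:stop]


infoset = ["1", "1.1", "1.3", "1.3.1", "1.3.3", "1.3.5", "1.3.5.1",
           "1.5", "1.5.1", "1.5.3", "1.5.5", "1.5.7"]
print(descendants(infoset, "1.3"))
# ['1.3.1', '1.3.3', '1.3.5', '1.3.5.1']
```

All descendants form one contiguous run in document order, which is why a single index range scan is enough.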

Arbitrary Insertions

Rightmost / Leftmost Insertion:

    Parent    3.5
      Child4  3.5.-1   (leftmost insertion)
      Child1  3.5.1
      Child2  3.5.3
      Child3  3.5.5    (rightmost insertion)
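Rightmost and leftmost insertion amount to stepping two ordinals past the current last or first sibling. A minimal sketch, assuming labels are dotted strings and that a first child of an empty parent gets ordinal 1; the function names are hypothetical.

```python
def parse(label):
    return tuple(int(part) for part in label.split("."))


def fmt(label):
    return ".".join(map(str, label))


def insert_rightmost(parent, children):
    """Label for a new last child: last sibling's final ordinal plus 2."""
    if not children:
        return parent + ".1"
    last = max(map(parse, children))
    return fmt(last[:-1] + (last[-1] + 2,))


def insert_leftmost(parent, children):
    """Label for a new first child: first sibling's final ordinal minus 2."""
    if not children:
        return parent + ".1"
    first = min(map(parse, children))
    return fmt(first[:-1] + (first[-1] - 2,))


children = ["3.5.1", "3.5.3"]
print(insert_rightmost("3.5", children))  # 3.5.5
print(insert_leftmost("3.5", children))   # 3.5.-1
```

Negative ordinals are what make leftmost insertion possible without touching the existing labels.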

Careting in nodes between two existing nodes…

    Parent    3.5
      Child1  3.5.1
      Child3  3.5.2.1
      Child6  3.5.2.2.-1
      Child5  3.5.2.2.1
      Child4  3.5.2.3
      Child2  3.5.3

The even components (3.5.2 and 3.5.2.2) are carets: they label no node themselves and only make room between existing siblings.
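Careting can be sketched as follows. The function covers only the simple case shown on the slide: two siblings whose labels have the same length, share all but the last component, and end in odd ordinals. It is an illustration under those assumptions, not the general algorithm, and `between` and `depth` are hypothetical names.

```python
def parse(label):
    return tuple(int(part) for part in label.split("."))


def fmt(label):
    return ".".join(map(str, label))


def between(left, right):
    """Label for a new sibling inserted between `left` and `right`."""
    a, b = parse(left), parse(right)
    if (len(a) != len(b) or a[:-1] != b[:-1] or not a[-1] < b[-1]
            or a[-1] % 2 == 0 or b[-1] % 2 == 0):
        raise ValueError("sketch handles ordered same-length siblings only")
    if a[-1] + 2 < b[-1]:
        # A free odd ordinal exists between the two, no caret needed.
        return fmt(a[:-1] + (a[-1] + 2,))
    # Otherwise caret in: use the even ordinal in between, then start at 1.
    return fmt(a[:-1] + (a[-1] + 1, 1))


def depth(label):
    """Tree depth: even (caret) components do not count as levels."""
    return sum(1 for part in parse(label) if part % 2 != 0)


print(between("3.5.1", "3.5.3"))      # 3.5.2.1
print(between("3.5.2.1", "3.5.2.3"))  # 3.5.2.2.1
print(depth("3.5.1"), depth("3.5.2.2.1"))  # 3 3
```

The new labels sort between their neighbours under plain component-wise comparison, so no existing label has to change.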

Note: Multiple levels of carets are extremely rare in practice.

Advantage: Insertions require no relabeling of old nodes. We avoid updates to primary key values, which would involve the primary index and all secondary indexes.

ORDPATH

- is a hierarchical, prefix-based labeling scheme.
- provides efficient access to subtrees.
- supports all kinds of structural modifications.
