• No results found

Datenbank- Metaprogrammierung

N/A
N/A
Protected

Academic year: 2021

Share "Datenbank- Metaprogrammierung"

Copied!
7
0
0

Loading.... (view fulltext now)

Full text

(1)

Datenbank-Metaprogrammierung

Gottfried Vossen Universität Münster

2 Gottfried Vossen, Metaprogramming in Databases

Motivation

Stonebraker: QUEL as a data type (1984) Gray, Turing award speech (1993) Asilomar Report (1998)

Treat procedural data as data (not as text); give up the separation of data and programs!

3 Gottfried Vossen, Metaprogramming in Databases

Applications

Data dictionaries, system catalogs:

Querying meta-data Maintenance of view definitions

ECA rules in active databases, trigger defs Methods in OO or OR databases

Web interfaces

Usage logs Intelligent search

Query mining (Ærunning example)

Web service composition

4 Gottfried Vossen, Metaprogramming in Databases

What I Mean By Meta-Querying

Which queries in the log do the most joins?

Syntactical; queries stored query expressions

Which queries in the log return an empty answer on the current state of the database?

Semantical; answer depends on result of dynamic execution

In each query in the log, replace each view name by its definition from the catalog.

Syntactical; performs transformations

Given new view definitions, which stored queries now give new answers?

Syntactical and semantical

Overview: 3 Approaches

The sky is the limit: the Reflective Algebra; compute “more” queries (many more, and enable novel search facilities on the web)

Getting real: the Meta Algebra; what it takes to handle procedural data in a typed way

Getting even more real, i.e., practical:

Meta-SQL; combining meta querying and XML “nicely”

Joint work with Ute Masermann and Jan Van den Bussche, Dirk van Gucht, Frank Neven, Stijn Vansummeren

RA, the Reflective Algebra

Idea: exploit (linguistic) reflection, i.e., allow a program to generate code that is integrated in its own execution

Store relational algebra expressions in a program relation

(separate from data relations) A program relation can be

manipulated by standard operators from relational algebra created dynamically

evaluated dynamically through eval

Achievement: considerable increase in expressive power

(2)

7 Gottfried Vossen, Metaprogramming in Databases

Example

Parent

Parent--child relation R over schema {P , C };child relation R over schema {P , C }; Compute the children and grandchildren of Fred:

Compute the children and grandchildren of Fred:

8 Gottfried Vossen, Metaprogramming in Databases

Program Relation

can be generated dynamically depending on database content

9 Gottfried Vossen, Metaprogramming in Databases

Pros and Cons of RA

Pros:

Allows to formalize QUEL+ et al.

Can express the PTIME queries (over ordered databases)

Catches:

Untyped setting Language difficult to use

(RA computations resemble “assembly-language” programming)

10 Gottfried Vossen, Metaprogramming in Databases

Taking RA To Higher Levels

RSQL: (ÆADBIS ´00)

Make SQLreflective by adding

Program relations Operators reifyand eval Allows parameterized queries, needed to answer user queries such as “what is known about John?” Downside: everything done manually

SISQL: (ÆIDEAS ´00)

Query a database whose schema is unknown (or not available)

Approach: use parameters instead of relation (or attribute) names and generate dictionary queries dynamically, thereby automating translation to RSQL

SISQL Examples

SELECT [A] FROM [R]

SELECT [R].name FROM [R]

WHERE [R].[A] like ‘J%’

SELECT [A] FROM [R] WHERE [B] = ‘John’

Recent Exploitations Structured search as in TransparenSee’s Discovery Search Engine:

Advanced keyword search Most similar search Controlled result display based on higher-level SQL

Rearranging search results based on user preference as in Yahoo! Mindset

(3)

13 Gottfried Vossen, Metaprogramming in Databases

Yahoo! Mindset Results for „Tisch“

14 Gottfried Vossen, Metaprogramming in Databases

Yahoo! MindsetÆShopping

15 Gottfried Vossen, Metaprogramming in Databases

Yahoo! MindsetÆResearch

16 Gottfried Vossen, Metaprogramming in Databases

MA, the Meta Algebra

Goal: a type-safe language for (roughly) the same purpose, but closer to classical algebra

Expressions comprise

Names of object and meta relations (the latter can hold queries and data)

Standard relational operators: π, σ, x, U,– New operators to work on query columns

Typing: width of an expression (or result)

(ÆPODS ’98 & Information Systems ’99)

Example

Extract

(4)

19 Gottfried Vossen, Metaprogramming in Databases

Local Rewrite

Similarly: global rewrite (“all”)

20 Gottfried Vossen, Metaprogramming in Databases

Eval

21 Gottfried Vossen, Metaprogramming in Databases

Sample Application

Bookstore database, queried over the Web Queries are (stored as) algebra expressions

Goal: maintain meta relation Logcontaining

pairs (u, q), where u is a user name and q is a query u has posed

Log is of type [0, <4>], i.e., a 4-ary relation is associated with each user name (e.g., holding sets of book records)

22 Gottfried Vossen, Metaprogramming in Databases

Sample Queries

Show the results of all queries posed by every user:

π1,3 eval2(Log)

Determine all queries that gave no result:

π2(Log)

π2 eval2(Log)

Show the union of the results of the queries posed by Jones:

π3,4,5,6 eval2 σ1= ‘Jones‘(Log)

Results

Extract, Rewrite, and Eval are primitive operators, i.e., MA is non-redundant

MA is a conservative extension of the relational algebra, i.e., the two coincide on ordinary databases

There is a Meta Calculus whose safe fragment is equivalent to MA (i.e., a “Codd Theorem”) Many additional operators can be defined (as short-hands) based on the ones in MA

Meta-SQL:

Making Meta-Querying Practical

Idea: stay declarative, i.e., with SQL, but exploit modern DBMS functionality:

External functions XML as a data type

Follow MA philosophy:

Queries are stored in “query columns”, but as XML syntax trees

Rewrite, extract, etc. done through XSLT functions Eval implemented on top

(5)

25 Gottfried Vossen, Metaprogramming in Databases

Encoding SQL Queries in XML

select director, avg(rating) as avgrat from Movies group by director

<query> <select> <sel-item> <column>director</column> </sel-item> <sel-item> <aggregate><avg/> <column-ref><column>rating</column></column-ref> </aggregate> <alias>avgrat</alias> </sel-item> </select> <from>

<table-ref> <table>Movies</table> </table-ref> </from>

<group-by>

<column-ref> <column>director</column> </column-ref> </group-by>

</query>

26 Gottfried Vossen, Metaprogramming in Databases

Corresponding DOM Tree

query select sel-item column sel-item aggregate alias avg column-ref column from table-ref table group-by column-ref column director rating

avgrat Movies Director

Foundation: Date

Foundation: Date‘‘s BNF grammar for SQL, turned into a DTDs BNF grammar for SQL, turned into a DTD

27 Gottfried Vossen, Metaprogramming in Databases

XSLT Fundamentals

Tree transformations

XSLT program= sequence of template rules

Template rule

= matching pattern [+ mode] + template

Processing:

template is instantiated for some source node that matches the pattern;

each instruction is then executed and replaced by the result it creates;

evaluation continues recursively.

28 Gottfried Vossen, Metaprogramming in Databases

Use of XSLT

“Which queries (stored in ‘Views’) do the most joins?“

assumptions: Views (name: string, def: xml) #joins = #table names

function count_tables returns number begin

<xsl:template match="/">

<xsl:value-of select="count(//table)“/> </xsl:template>

end

select name from Views where count_tables(def) =

(select max(count_tables(def)) from Views)

Use of XSLT (cont‘d)

“Which views become invalid after removal of the tables listed in `Removed´?”

function mentions_table param tname string returns string begin

<xsl:param name="tname"/> <xsl:template match="/">

<xsl:if test="//table[string(.)=$tname]"> true </xsl:if> </xsl:template>

end

select name from Views, Removed

where mentions_table(def, Removed.name) = 'true'

Tools for (Syntactic) Queries

XML variables

Range over the sub-elements of an XML tree Range can be narrowed by an XPath expression Allow to go a from a single to a set of XML documents

XML aggregation

For combining a set of XML documents into a single one

(6)

32 Gottfried Vossen, Metaprogramming in Databases

Example: Extract Subelements

“Find all pairs (v,t) s.t. v is is a view name and t is a

table name occurring in the definition of v“

function string_value returns string begin

end

select v.name, string_value(x) from Views v,x in v.def[//table]

returns the string value of a tablesubelement

33 Gottfried Vossen, Metaprogramming in Databases

Log Example

Stored queries in column Q of table Log;

goal is to find “hot spots,” i.e., subqueries occurring in at least 10 different queries:

select s

from Log l, s in l.Q[//query] group by s

having count(l.Q) >= 10

34 Gottfried Vossen, Metaprogramming in Databases

XML Aggregation

Function CMB combines a list of XML documents into a single document Example:

Given `Views3,´ where a view name may have multiple definitions: “For each name, return the Cartesian product of its definitions.”

1. XSLT function cartprod

2. select name, cartprod(CMB(def)) from Views3 group by name

35 Gottfried Vossen, Metaprogramming in Databases

Function cartprod

function cartprod returns xml

begin

<xsl:template match="/"> <query>

<select> <wildcard/> </select> <from> <xsl:apply-templates select="cmb/*"/> </from> </query> </xsl:template> <xsl:template match="/cmb/*"> <table-ref> <xsl:copy-of select="."/> </table-ref> </xsl:template> end

Sample Semantic Query

“On table Customer (custid: string, query: xml),(where

each query returns a table with columns item, price, …) find the max price of items requested by each

customer:”

select custid, max(t.price) from Customer c, EVAL(c.query) t group by custid

UEVAL available for cases where output schema unknown t is a standard SQL range var EVAL returns a

table

Another Semantic Query

“Which queries in the log return an empty answer?”

select Q from Log l where not exists

(select x from x in UEVAL(l.Q))

resulting table is presented as a set of XML documents x is now an XML var

(7)

38 Gottfried Vossen, Metaprogramming in Databases

Implementation Considerations

Meta-SQL program Meta-SQL compiler function f returns xml begin <xsl:template> ... </xsl:template> end ... select f(x) from L, x in L.A[//query] import com.icl.saxon. *; public class XSLT_f { ... } Java Wrapper Java compiler Extensible DBMS Meta-SQL

compiler select UDF_f(x.result) from L,

table(EXTRACT(L.A, ’//query’)) x SQL:1999 statement

Saxon XSLT Library Standard Meta-S QL functions

(EXTRACT, EVAL, UEVAL) Code base XSLT_f.class

Meta-SQL program result

create function UDF_f(XML) returns XML source ’XSLT_f.f’ language JAVA

External Function registration statment

Executed by Meta-SQL compiler

compilation execution Legend

39 Gottfried Vossen, Metaprogramming in Databases

Results & Exploitations

Experiments show that the approach, implemented on top of a db engine, induces constant, predictable overhead

Performance barriers remain w.r.t. XSLT processors

Ideal: XSLT processing directly incorporated

into query processor (ÆNatix approach)

XSLT is being replaced by XQuery Adaptive Web Services as expoitation

40 Gottfried Vossen, Metaprogramming in Databases

Web Services lookup publish utilize XML doc SQL query XML doc MetaSQL query + eval Meta SQL

41 Gottfried Vossen, Metaprogramming in Databases

Adaptive Web Services

Web Services Standard Web Services Adaptive Web Services “myService” customized services new, globally available services

An AWS Provider

Conclusions

Meta querying is easier than you thought (and probably has more applications)!

Meta querying now has a formal foundation and both a link to practice and to current research. Meta queries expressed in RA or in MA can also be expressed in Meta-SQL.

Meta-SQL (without eval) can be used as a query language for databases containing relational data and XML documents

Ideal platform: a DBMS that does XML processing inside its query processor

References

Related documents

Both the Cyber Security Research and Development Act of 2002 (PL 107-305) and the Homeland Security Act established new programs and authorized new funds for cybersecurity

where n component databases (each abiding internally to their own 3-level architecture) are integrated (via CF and the Mediator), resulting in the database schema of DBINT

На музейному матеріалі (постійних макропрепаратах, тимчасових і постійних мікропрепаратах) вивчити морфологічні особливості іксодови аргасових, дерманісусних х, кліщів і

The table below provides a comparison of the cost per square foot for Menlo Park Fire District Station 2, located in East Palo Alto, bid and rebuilt starting in 2013, and Station

Since online consumers are usually incidentally exposed to banner advertisements while their main goal is to read the content of webpages, it is crucial to study whether

Additional WM tracts involved in language processing are the middle longitudinal fascicle (MdLF), which connects anterior and posterior temporal regions, and the

Recruitment Counselors, the outgoing VP Recruitment, the outgoing President, the outgoing assistant VPR, the Rho Chi Coordinator, and the Panhellenic Executive Board-elect: President,

Other questions of the study were “Text messaging during work effect your speed of work and concentration” (Mean=4.52), Productive use of cell phone by you is in the benefit