Datenbank-Metaprogrammierung
Gottfried Vossen Universität Münster
2 Gottfried Vossen, Metaprogramming in Databases
Motivation
Stonebraker: QUEL as a data type (1984) Gray, Turing award speech (1993) Asilomar Report (1998)
Treat procedural data as data (not as text); give up the separation of data and programs!
3 Gottfried Vossen, Metaprogramming in Databases
Applications
Data dictionaries, system catalogs:Querying meta-data Maintenance of view definitions
ECA rules in active databases, trigger defs Methods in OO or OR databases
Web interfaces
Usage logs Intelligent search
Query mining (Ærunning example)
Web service composition
4 Gottfried Vossen, Metaprogramming in Databases
What I Mean By Meta-Querying
Which queries in the log do the most joins?
Syntactical; queries stored query expressions
Which queries in the log return an empty answer on the current state of the database?
Semantical; answer depends on result of dynamic execution
In each query in the log, replace each view name by its definition from the catalog.
Syntactical; performs transformations
Given new view definitions, which stored queries now give new answers?
Syntactical and semantical
Overview: 3 Approaches
The sky is the limit: the Reflective Algebra; compute “more” queries (many more, and enable novel search facilities on the web)
Getting real: the Meta Algebra; what it takes to handle procedural data in a typed way
Getting even more real, i.e., practical:
Meta-SQL; combining meta querying and XML “nicely”
Joint work with Ute Masermann and Jan Van den Bussche, Dirk van Gucht, Frank Neven, Stijn Vansummeren
RA, the Reflective Algebra
Idea: exploit (linguistic) reflection, i.e., allow a program to generate code that is integrated in its own execution
Store relational algebra expressions in a program relation
(separate from data relations) A program relation can be
manipulated by standard operators from relational algebra created dynamically
evaluated dynamically through eval
Achievement: considerable increase in expressive power
7 Gottfried Vossen, Metaprogramming in Databases
Example
ParentParent--child relation R over schema {P , C };child relation R over schema {P , C }; Compute the children and grandchildren of Fred:
Compute the children and grandchildren of Fred:
8 Gottfried Vossen, Metaprogramming in Databases
Program Relation
can be generated dynamically depending on database content
9 Gottfried Vossen, Metaprogramming in Databases
Pros and Cons of RA
Pros:
Allows to formalize QUEL+ et al.
Can express the PTIME queries (over ordered databases)
Catches:
Untyped setting Language difficult to use
(RA computations resemble “assembly-language” programming)
10 Gottfried Vossen, Metaprogramming in Databases
Taking RA To Higher Levels
RSQL: (ÆADBIS ´00)
Make SQLreflective by adding
Program relations Operators reifyand eval Allows parameterized queries, needed to answer user queries such as “what is known about John?” Downside: everything done manually
SISQL: (ÆIDEAS ´00)
Query a database whose schema is unknown (or not available)
Approach: use parameters instead of relation (or attribute) names and generate dictionary queries dynamically, thereby automating translation to RSQL
SISQL Examples
SELECT [A] FROM [R]
SELECT [R].name FROM [R]
WHERE [R].[A] like ‘J%’
SELECT [A] FROM [R] WHERE [B] = ‘John’
Recent Exploitations Structured search as in TransparenSee’s Discovery Search Engine:
Advanced keyword search Most similar search Controlled result display based on higher-level SQL
Rearranging search results based on user preference as in Yahoo! Mindset
13 Gottfried Vossen, Metaprogramming in Databases
Yahoo! Mindset Results for „Tisch“
14 Gottfried Vossen, Metaprogramming in Databases
Yahoo! MindsetÆShopping
15 Gottfried Vossen, Metaprogramming in Databases
Yahoo! MindsetÆResearch
16 Gottfried Vossen, Metaprogramming in Databases
MA, the Meta Algebra
Goal: a type-safe language for (roughly) the same purpose, but closer to classical algebraExpressions comprise
Names of object and meta relations (the latter can hold queries and data)
Standard relational operators: π, σ, x, U,– New operators to work on query columns
Typing: width of an expression (or result)
(ÆPODS ’98 & Information Systems ’99)
Example
Extract
19 Gottfried Vossen, Metaprogramming in Databases
Local Rewrite
Similarly: global rewrite (“all”)
20 Gottfried Vossen, Metaprogramming in Databases
Eval
21 Gottfried Vossen, Metaprogramming in Databases
Sample Application
Bookstore database, queried over the Web Queries are (stored as) algebra expressions
Goal: maintain meta relation Logcontaining
pairs (u, q), where u is a user name and q is a query u has posed
Log is of type [0, <4>], i.e., a 4-ary relation is associated with each user name (e.g., holding sets of book records)
22 Gottfried Vossen, Metaprogramming in Databases
Sample Queries
Show the results of all queries posed by every user:
π1,3 eval2(Log)
Determine all queries that gave no result:
π2(Log)
–
π2 eval2(Log)Show the union of the results of the queries posed by Jones:
π3,4,5,6 eval2 σ1= ‘Jones‘(Log)
Results
Extract, Rewrite, and Eval are primitive operators, i.e., MA is non-redundant
MA is a conservative extension of the relational algebra, i.e., the two coincide on ordinary databases
There is a Meta Calculus whose safe fragment is equivalent to MA (i.e., a “Codd Theorem”) Many additional operators can be defined (as short-hands) based on the ones in MA
Meta-SQL:
Making Meta-Querying Practical
Idea: stay declarative, i.e., with SQL, but exploit modern DBMS functionality:External functions XML as a data type
Follow MA philosophy:
Queries are stored in “query columns”, but as XML syntax trees
Rewrite, extract, etc. done through XSLT functions Eval implemented on top
25 Gottfried Vossen, Metaprogramming in Databases
Encoding SQL Queries in XML
select director, avg(rating) as avgrat from Movies group by director<query> <select> <sel-item> <column>director</column> </sel-item> <sel-item> <aggregate><avg/> <column-ref><column>rating</column></column-ref> </aggregate> <alias>avgrat</alias> </sel-item> </select> <from>
<table-ref> <table>Movies</table> </table-ref> </from>
<group-by>
<column-ref> <column>director</column> </column-ref> </group-by>
</query>
26 Gottfried Vossen, Metaprogramming in Databases
Corresponding DOM Tree
query select sel-item column sel-item aggregate alias avg column-ref column from table-ref table group-by column-ref column director rating
avgrat Movies Director
Foundation: Date
Foundation: Date‘‘s BNF grammar for SQL, turned into a DTDs BNF grammar for SQL, turned into a DTD
27 Gottfried Vossen, Metaprogramming in Databases
XSLT Fundamentals
Tree transformations
XSLT program= sequence of template rules
Template rule
= matching pattern [+ mode] + template
Processing:
template is instantiated for some source node that matches the pattern;
each instruction is then executed and replaced by the result it creates;
evaluation continues recursively.
28 Gottfried Vossen, Metaprogramming in Databases
Use of XSLT
“Which queries (stored in ‘Views’) do the most joins?“
assumptions: Views (name: string, def: xml) #joins = #table names
function count_tables returns number begin
<xsl:template match="/">
<xsl:value-of select="count(//table)“/> </xsl:template>
end
select name from Views where count_tables(def) =
(select max(count_tables(def)) from Views)
Use of XSLT (cont‘d)
“Which views become invalid after removal of the tables listed in `Removed´?”
function mentions_table param tname string returns string begin
<xsl:param name="tname"/> <xsl:template match="/">
<xsl:if test="//table[string(.)=$tname]"> true </xsl:if> </xsl:template>
end
select name from Views, Removed
where mentions_table(def, Removed.name) = 'true'
Tools for (Syntactic) Queries
XML variables
Range over the sub-elements of an XML tree Range can be narrowed by an XPath expression Allow to go a from a single to a set of XML documents
XML aggregation
For combining a set of XML documents into a single one
32 Gottfried Vossen, Metaprogramming in Databases
Example: Extract Subelements
“Find all pairs (v,t) s.t. v is is a view name and t is a
table name occurring in the definition of v“
function string_value returns string begin
end
select v.name, string_value(x) from Views v,x in v.def[//table]
returns the string value of a tablesubelement
33 Gottfried Vossen, Metaprogramming in Databases
Log Example
Stored queries in column Q of table Log;goal is to find “hot spots,” i.e., subqueries occurring in at least 10 different queries:
select s
from Log l, s in l.Q[//query] group by s
having count(l.Q) >= 10
34 Gottfried Vossen, Metaprogramming in Databases
XML Aggregation
Function CMB combines a list of XML documents into a single document Example:
Given `Views3,´ where a view name may have multiple definitions: “For each name, return the Cartesian product of its definitions.”
1. XSLT function cartprod
2. select name, cartprod(CMB(def)) from Views3 group by name
35 Gottfried Vossen, Metaprogramming in Databases
Function cartprod
function cartprod returns xmlbegin
<xsl:template match="/"> <query>
<select> <wildcard/> </select> <from> <xsl:apply-templates select="cmb/*"/> </from> </query> </xsl:template> <xsl:template match="/cmb/*"> <table-ref> <xsl:copy-of select="."/> </table-ref> </xsl:template> end
Sample Semantic Query
“On table Customer (custid: string, query: xml),(where
each query returns a table with columns item, price, …) find the max price of items requested by each
customer:”
select custid, max(t.price) from Customer c, EVAL(c.query) t group by custid
UEVAL available for cases where output schema unknown t is a standard SQL range var EVAL returns a
table
Another Semantic Query
“Which queries in the log return an empty answer?”
select Q from Log l where not exists
(select x from x in UEVAL(l.Q))
resulting table is presented as a set of XML documents x is now an XML var
38 Gottfried Vossen, Metaprogramming in Databases
Implementation Considerations
Meta-SQL program Meta-SQL compiler function f returns xml begin <xsl:template> ... </xsl:template> end ... select f(x) from L, x in L.A[//query] import com.icl.saxon. *; public class XSLT_f { ... } Java Wrapper Java compiler Extensible DBMS Meta-SQLcompiler select UDF_f(x.result) from L,
table(EXTRACT(L.A, ’//query’)) x SQL:1999 statement
Saxon XSLT Library Standard Meta-S QL functions
(EXTRACT, EVAL, UEVAL) Code base XSLT_f.class
Meta-SQL program result
create function UDF_f(XML) returns XML source ’XSLT_f.f’ language JAVA
External Function registration statment
Executed by Meta-SQL compiler
compilation execution Legend
39 Gottfried Vossen, Metaprogramming in Databases
Results & Exploitations
Experiments show that the approach, implemented on top of a db engine, induces constant, predictable overhead
Performance barriers remain w.r.t. XSLT processors
Ideal: XSLT processing directly incorporated
into query processor (ÆNatix approach)
XSLT is being replaced by XQuery Adaptive Web Services as expoitation
40 Gottfried Vossen, Metaprogramming in Databases
Web Services lookup publish utilize XML doc SQL query XML doc MetaSQL query + eval Meta SQL
41 Gottfried Vossen, Metaprogramming in Databases
Adaptive Web Services
Web Services Standard Web Services Adaptive Web Services “myService” customized services new, globally available services
An AWS Provider
Conclusions
Meta querying is easier than you thought (and probably has more applications)!
Meta querying now has a formal foundation and both a link to practice and to current research. Meta queries expressed in RA or in MA can also be expressed in Meta-SQL.
Meta-SQL (without eval) can be used as a query language for databases containing relational data and XML documents
Ideal platform: a DBMS that does XML processing inside its query processor