• No results found

Language Integrated Query in Java for XML and Relational Database

N/A
N/A
Protected

Academic year: 2019

Share "Language Integrated Query in Java for XML and Relational Database"

Copied!
57
0
0

Loading.... (view fulltext now)

Full text

(1)

RIT Scholar Works

Theses

Thesis/Dissertation Collections

2007

Language Integrated Query in Java for XML and

Relational Database

Anurag Naidu

Follow this and additional works at:

http://scholarworks.rit.edu/theses

This Thesis is brought to you for free and open access by the Thesis/Dissertation Collections at RIT Scholar Works. It has been accepted for inclusion in Theses by an authorized administrator of RIT Scholar Works. For more information, please [email protected].

Recommended Citation

(2)

for XML

and

Relational Database

Master's Thesis

Anurag

Naidu

October 26, 2007

Committee

Dr. Axel T Schreiner, Chair

Dr. Rajendra K Raj, Reader

(3)

Title of thesis or dissertation: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ LANG.VAG.£ \ t-\l~6 RA.,. \::"1> &VE'R'X )~ :SAvA£Oe

Name of author: AN\JR.A.c. NA II> L)

Degree: \'v\ S

c..p

M.PU 1~ ><::"1 t:=N c:...f

Program: CJ) 1v'I..C>\J 1 \;Y$.

zc:.

1 \=-,... c....c:=

College: 6c..c..l S

I understand that I must submit a print copy of my thesis or dissertation to the RIT Archives, per current RIT guidelines for the completion of my degree. I hereby grant to the Rochester Institute of Technology and its agents the non-exclusive license to archive and make accessible my thesis or dissertation in whole or in part in all forms of media in perpetuity. I retain all other ownership rights to the copyright of the thesis or dissertation. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.

Print Reproduction Permission Granted:

I,

Anurag Naidu

,hereby grant permission to the Rochester Institute Technology to reproduce my print thesis or dissertation in whole or in part. Any reproduction will not be for commercial use or profit.

Signature of Author: _ _ _ _ A_n_u_r_a~g'_N_a_id_u

________

Date: _ _ _ _ _

Print Reproduction Permission Denied:

I, , hereby deny permission to the RIT Library of the Rochester Institute of Technology to reproduce my print thesis or dissertation in whole or in part.

Signature of Author: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Date: _ _ _ _ _ _

Inclusion in the RIT Digital Media Library Electronic Thesis

&

Dissertation (ETD) Archive

I,·

Anurag Naidu

,additionally grant to the Rochester Institute of Technology Digital Media Library (RIT DML) the non-exclusive license to archive and provide electronic access to my thesis or dissertation in whole or in part in all forms of media in perpetuity.

I understand that my work, in addition to its bibliographic record and abstract, will be available to the world-wide community of scholars and researchers through the RIT DML. I retain all other ownership rights to the copyright of the thesis or dissertation. I also retain the right to use in future works (such as

articles or books) all or part of this thesis or dissertation. I am aware that the Rochester Institute of Technology does not require registration of copyright for ETDs.

I hereby certify that, if appropriate, I have obtained and attached written permission statements from the owners of each third party copyrighted matter to be included in my thesis or dissertation. I certify that the version I submitted is the same as that approved by my committee.

(4)

The

Undersigned Computer Faculty Approves the Thesis

Language Integrated

Query

in

Java for XML and Relational

Database

Axel T. Schreiner

Dr. Axel T Schreiner,

Chair

Rajendra

K.

Raj

Dr. Raj endra K Raj,

Reader

James M. Kwon

Dr. James M Kwon,

Observer

Thursday, 25 October 2007

(5)

XMLand relationaldatabasesarethemostcommonlyuseddata-sources fornumer ous Java-basedenterprise applications. Theever-growing dependenceofJava-based

applications on these technologies calls for developing a uniform means ofquerying

these data-sourcesin the Java layer such that queries written for anyofthesetech

nologieslooksimilar and areinfactgovernedbya single grammar. This thesisimple

ments a newbackendsystemencapsulatingqueryingcapabilities overXMLand JDBC basedrelationaldatabaseswhich would complementthe translationof queries written

usingQuEL [1],alanguageextensiontotheJava programminglanguage. The QuEL

language extensionto Javais part ofanother thesis developedpreviously and acts as

the querylanguagefortheback-end implemented in thisthesisforXMLand relational

(6)

Contents

1 Introduction 5

2 Core concepts 7

2.1 Hierarchical Data Model

/

ExtensibleMarkup Language

(XML)

7

2.2 Relational Data Model

/

Relational Database

(SQL)

15

3 Language Integrated Query

(LINQ)

19

3.1 XLinqand DLinq 21

3.1.1 XLinq 21

3.1.2 DLinq 21

4 QuEL 22

4.1 QuEL front-end 22

4.2 Object Collections 23

5 QuEL for XML and Relational Database 24

5.1 XML in QuEL 27

5.1.1 Background logic 27

5.1.2 Methodtranslation 28

5.1.3 XML backend design 29

5.2 Database query in QuEL 34

5.2.1 Background logic 34

5.2.2 Method translation 35

5.2.3 SQL backend design 37

(7)

6.1 QuEL querytranslation forXML 43

6.2 XQueryfor Java 45

6.3 QuEL query translation for SQL 45

6.3.1 GroupJoin issue 47

7 Conclusion 48

7.1 Timing 49

8 Limitations 50

8.1 Into 50

8.2 Timing 51

(8)

1

Introduction

Querylanguages areprogramming languages that are used typicallyto retrievedata from

databasesor otherinformationstorage mediums. The growingdependenceof major appli

cationson avarietyofdatabasesystems and othermediums ofstorage, likeXML [2], now

makes it important thatwe are ableto querysuch data-sources in anefficient andconsis

tent manner. These various data storage technologies haveevolved over along period of

time and originated from separate companies or special interest groups. What this means

isthateachofthesedifferenttechnologies hasitsown distinct languageswhich areusedto

writequeries for retrieving data.

In this thesis we aim to address the problem ofheterogeneity in querying different

data-sources with an aimtointroduceasyntacticallyconsistent querymechanismthat isalmost

independent of the nature ofthe data storage medium that is queried. We demonstrate

the various aspects ofquery preprocessingthat need tobe changedtomake the consistent

query experienceintegrate into Java.

The .NET

community at Microsoft Corporation has made considerable progress in devel

oping Language Integrated Query

(LINQ) [3]

for the C# and VB languagespecification

in .NET 3.0. This thesis work in conjunction with the thesis "Integrating a Universal

Querying Mechanism in Java"

[1]

is targeted towards developing a back-end for such a

querylanguage thatwould provide similarbenefits in Javafor retrievingdataout ofXML

documents and relational databases. One could now write queries using QuEL [1], as a

language extensionofJava,just as one would write SQL

[4]

queries on adatabaseconsole

orXPath

[5]

queriesto operate on aXML document. Apossiblefuture extension ofboth
(9)

for a powerful feature in Java that could be used to query XML and relational databases

(10)

2

Core

concepts

This chapter takes a closer look at the design and architecture of the two data-sources

that will be the target for QuEL queries for the purpose ofthis thesis. Simple examples

willbe usedtodemonstrate howwriting queriesin SQL

[4]

and XPath

[5]

worksbyusing

the existingAPIs provided in Java and what are the concerns with writing such queries.

This would helpin understanding both the advantages ofwriting queriesusing QuEL

[1]

as well as the backend designed inthis thesis for XML and relational database.

2.1 Hierarchical Data Model

/

Extensible

Markup

Language

(XML)

XML is a specification for definingmarkup languages designed to separate data from its

representation. This makes it avery usefulformat for data representation anddata inter

change between different systems. Though relativelynew in the computer industry, XML

has become very popular and is widely used as an information storage medium and con

figuration data-source insoftware applications.

Consider beginning with averysimple XML document that has information for the staff

in an organization, to see how adocument like thislooks and understand the structure of

such adocument. The document hasfirst name, last name and phone numberinformation

of each staff member ofthe department and maintainsthem as anXML document as de

(11)

<staff>

<person>

<first>Sandy</first>

<last>Ferrara</last>

<phone>55178</phone>

</person>

<person>

<first>Tina</first>

<last>Sturgis</last>

<phone>52529</phone>

</person>

<person male=,true'>

<first>Jason</first>

<last>Harrison</last>

<phone>52529</phone>

</person>

</staff>

Figure 1: Staff XMLdocument as atree

A quick look at the document shows a striking resemblance to a tree structure which is

rooted at the staff node.

Given this tree-like visualization of the document one could imagine querying this doc

ument being analogous totraversing the tree andpruning out the leavesthat do not pass

the filtering criteria. One would have to write many lines of code in Java using the W3C

DOM

[6]

toreadtheXML document into memoryandthenwrite a visitortocollect nodes

thatpassthefilteringcriteria oftheuser queryand constructanewdocumentwith reduced

[image:11.528.135.289.86.332.2]
(12)

first last phone

Sandy Ferrers 55

Person mete

'true'

first last phone

Jason Harrison 52!

first last phone

[image:12.528.172.344.55.227.2]

rina Sti#g* 52!

Figure 2: Staff XMLdocument as atree

scenariointhe programming domainto sievethroughXMLdocuments and collect dataof

interest at particular points.

XPath

[5]

is one suchquerylanguage tailored to operate efficientlyover XMLdocuments

to filter out data. XPath queries are crisp, non-verbose and extremely powerful tools for

working with XML. Since XML has been constantly gaining importance in the software

industry, a marriageofsome sortbetween XPathand Javawas alwaysalogicalnext step.

The growing adoption ofthis technology in the industrywas cemented further when Sun

introduced XPath APIs in Java 5 (JDK 1.5) for querying XML documents, which made

available thepower of efficient XPath querying from within user-levelJavacode. XQuery

[7]

is also one such mechanism that can be used to queryanXML document.

UsingfollowingtwosimpleXMLdocuments, RoomsandStaff, howaquerycanbewritten

(13)

<rooms>

<room room='3021">

<phone>55178<7phone>

</room><!--Sandy-->

<room room='3012'

>

<phone>57905</phone> </room><!-rina-->

<roomroom='3671'>

< phone >57905</phone> </room>

<room room='3008'>

<phono56084<7phone>

</room><!--Joanne-->

</rooms>

<staff> <person>

<first>Sandy</first> <last>Ferrara</last> <phone>55178</phone> </person> <person> <nrst>Tina</hrst> <last>Sturgis</last> <phone>57905</phono </person>

< person >

<first>Joanne</first> <last>Catan</last> <phone >56084</phone> </person>

[image:13.528.96.357.53.183.2]

</staff>

Figure 3: XML document forRooms and Staff

Staff is the same file that we saw above and Rooms is afile that maps the room number

with the phone number associated with that room. A querying scenario from these two

documentscouldbetosearch across thefiles and match records on phone numbers across

the documents and return staff names with the corresponding room they occupy. Fairly

common asthis scenario maybe, it isnot exactlyaprogramming cakewalkto achieve the

desired result.

Assuming that the developer has good knowledge ofXPath, XSLT

[8]

and HTML, the

codeforobtainingthe results ofthe queryacrossthe twoXML fileswould look something

like this in a XSLT file

<?xml version="1.0"

encoding="UTF-8"?>

<xsl:stylesheet version="1.0"

xmlns :xsl="http://www.w3.org/1999/XSL/Transf orm"

>

<xsl:output method="html" indent="yes" />

<! Read in all the room nodes from the Rooms xml file into a temporary variable >

<xsl:variable name="data" select="

document('rooms.xml')/rooms" />

(14)

<! Construct the html for the page headers etc. > <html>

<head><title>Occupancy page</title></head>

<body>

<table border="l"

align="center"> <tr><td colspan="3"

align="center">

<b>Name and Room occupancy</bx/td></tr>

<xsl:apply-templates select="/staff" />

</table>

</body>

</html>

</xsl:template>

<! Declare the template for matching the staff and room >

<xsl:template match="

staff">

<! Apply the template over all the staff nodes >

<xsl:for-each select="

./child: :*"> <xsl:variable name="p" select="

./phone/text

()" />

<xsl:variable name="f" select="

./first/text

()" />

<! Get all the rooms to test and iterate >

<xsl:for-each select="$data/child: :*">

<xsl:if test="./phone/text() = $p"> <tr>

<td><xsl:value-of select="$f"/></td>

<td><xsl: value-of select="

. /@room"/x/td> </tr>

</xsl:if>

</xsl:for-each>

</xsl:for-each>

</xsl:template>

</xsl:stylesheet>

Figure 4: XSLTto merge Staff and Rooms

The result of the above XSLT on the two XML documents yields the following HTML

output as can be seenin Figure 5.

Noticethe nested loopsxsl:for-each select with XPath expressionsthat do the trick

along with the presentation HTML code that displays the result. One could debatethe

(15)

cumber-NameandRoomoccupancy Sandy 3021 [Tina |3012 |7ma [3671 [Joanne 300S Eileen 3005

,Jason 3005

Christina [3022

Liant [3022

James ;3599

Sam 3596

Figure 5: Result ofXSLT solution

some to write, given howcommon thisscenario is whenworkingwith XML documents.

What we have looked at is a non-Javaimplementation ofthe query. Reconsider how the

same canbe achieved using Javaandthe Java XPath APIs.

public class XmlSample

{

/**

* Oparam args

*/

public static void main(String[] args){

DocumentBuilder documentBuilder = null;

Element staff = null;

Element rooms = null;

XPath xpath = null;

String staffXPathExpr = "/staff/person"

;

String roomXPathExpr =

"/rooms/room";

NodeList outer = null;

NodeList inner = null;

try

{

documentBuilder =DocumentBuilderFactory.newInstance()

.newDocumentBuilderO ;

[image:15.528.196.323.53.207.2]
(16)

}

try

{

staff =

documentBuilder.parse(new File("./linq/staff.xml"))

.getDocumentElementO ;

rooms = documentBuilder.parse(new File("

./linq/rooms.xml"))

.getDocumentElement;

}

catch (SAXException e)

{

e.printStackTraceO;

}

catch (IOException e)

{

e.printStackTraceO ;

}

xpath = XPathFactory.newInstanceO

.newXPathO ;

try

{

outer =

(NodeList) xpath.

evaluate (staffXPathExpr, staff, XPathConstants.NODESET);

inner =

(NodeList) xpath.

evaluate(roomXPathExpr, rooms,XPathConstants.NODESET);

}

catch (XPathExpressionException e)

{

e.printStackTrace;

}

// Match the results in nested loop // THIS IS THE ACTUAL QUERY LOGIC

int staffCount = outer. getLength()

;

int roomCount = inner.getLength()

;

String roomNum = " "

;

Node roomltem =

null;

for (int i =

0; i < staffCount; i++)

{

Node staffItem = outer.item(i)

;

try

{

roomNum = xpath.evaluate("phone/text()"

, staffItem);

for (int j =

0; j < roomCount; j++)

{

roomltem = inner.

item(j);

if(roomNum. equals(xpath.evaluate("phone/text()

"

, roomltem)) )

System.out.println(xpath.evaluate(

"

first/text( )"

, staffItem) + " "+ xpath. evaluate("Oroom", roomltem));

}

(17)

// TODO Auto-generated catch block

e.printStackTrace( );

}

}

}

}

Figure 6: Javacode forXML merging

The first thing that catches the eye is the wordiness of the solution and though it solves

theproblem, maintainabilityof repeatedboilerplatecode,mixed withtheactualfunctional

logic will always remain an issue over time. This problem is especially of concernas such

queryies for XML documents arefairlycommonplace.

Let us elucidate the problems seen so far in writing queries, given the existing infras

tructures and technologies, which would help explain why a change in this area is long

overdue.

Complex codelogic needed for fairlysimple querypatterns.

Proficiencyin many technologies like XPath, XSLT, JDK XPathAPIs etchas tobe

combined.

Repetitive boilerplatecodeto supplement the actual querycode.

The query itselfremainstype unchecked and so potentially unsafe.

Lackof consistent query language for alldata-sources.

Wewillcomeback to theseobservations when wediscuss thedesign for QuEL andexplain

(18)

2.2 Relational Data Model

/

Relational Database

(SQL)

A relational database is adata-store that is based on a relational modelbetween entities,

a modelthatworkson relational algebra and set theory Such a relational model was first

proposedby Edgar Codd

[9]

wherethe word relation refersto atable in the database.

The smallest building block of a relational database is called an attribute, which is a

name/value pair. An unordered set of attributes forms a tuple and an unordered set of

such tuples makes up a relation, which lends its name to a relational database. How

ever when dealing with relational databases we use the words table, row, column and

constraints which are equivalent terms that mapthe mathematicalconcepts on which re

lational databases weredesigned to an actual database implementation.

Someolder databases were also designed on ahierarchical data model like that ofXML.

Mostmoderndatabases havemovedinto a relational model becauseofthegreaterflexibil

ityand advantages thatit provides. Oneofthespecificadvantages with a relational model isamany-to-many relationsbetweenentities. Oneofthe severelylimitingconditionsfor a

hierarchical modelis that therecan at best be aone-to-manyrelational mapping between

entities; however, the model cannot accommodatemany-to-many relations.

A simple example of this could be a super-store where products are classified based on

the various sections like apparel, grocery, sports goods etc. The products in the display

lineare classified under onlyonesection, however if thereareproducts that reallyfallinto

morethanonesection, then thedatabasedoesnot provideanyoptiontolistthesame prod

(19)

orderto gather accountinginformationfrom sectionsforsales statistics. This discussionis

merely toprovide aninsightinto whythedatabaseindustryhasmovedtowardsarelational

model for its databaseneeds.

Seclionld Name Aisle 101 Grocery 21 102 Apparel 34 103 Spon 45 101 Foolv/ear 22

Sectionld Produced 101 10 102 12 103IM 14

Productld Name 10 meat 12 flour 14 Nikear

[image:19.528.126.411.162.311.2]

SectionToProduct

Figure 7: Hierarchicalmodel anomaly

We discussedtheobvious advantage of relationaldatabasesandhowthe databaseindustry

has moved towards a relational model, focus now on the technologies that one needs to

familiarize oneselfwith in order to be ableto work with relational models. The primary

means ofaccessing relational databases is via the Structured Query Language

(SQL)

[7]

invented byIBM in the 1970s.

In order to see how SQL is used consider the following two relational database tables

Staff and Rooms that contain the same data as shown earlier but this time in relational

data model. In orderto perform the samequery as we didfor theXML sample we would

now writea SQLquerythatjoinsthe two Staff andRoomstables onthe matchingphone

(20)

Room F-:-'

302 1 5517* ! 3D'

2 47905

3671 57906 aooe :=

:-Firs Lasl PTvy* Sa-S, =^ 55179

~

= 2-~'Q 5 E>T':*jf toenrae Ga-a~ '*>'**

- 35- 5-*

Room Staff

Figure8: Room andStaff tables

ThefollowingSQLquery returns theresult as listed below. The SQL query itself is very

intuitive and simplein structure and does thejob, however database are always coupled

with some applicationforwhichtheyserve asthedata-storeand thus themeasure ofthe

easeof use of adatabaseinthemannerinwhichit may be accessedbytheapplication it

serves.

SELECT S. FIRST. R ROOM FROMSTAFF AS S JOIH ROOM AS R OH S PHOME R PHONE

First Room

Sandy 3021

Tfa 3012

[image:20.528.140.387.97.167.2]

Joarrae 3008

Figure 9: SQLqueryand result

Javahasthe necessary APIs for writing SQL queries inuser level Java application code.

These APIswereintroduced intothelanguagealongtimeago andhaveevolved sincethen

[image:20.528.177.350.369.500.2]
(21)

Javaprogram again with a large amount ofcode wrapped around the actual query. This

addition setup code, though completely functional and correct, has similar drawbacks as

outlined for theJava XPath APIs.

public static void main(String[] args)

{

// Connection setup Connection con =

null;

try

{

Class.forName(

"

org.hsql.jdbcDriver

"

);

con = DriverManager.getConnectionC'localdb"

, "admin", "password");

}

catch (ClassNotFoundException e)

{

e.printStackTraceO ;

}

catch (SQLException e)

{

e.printStackTrace();

}

try

{

PreparedStatement ps = con

.prepareStatement("SELECT S.FIRST, R.ROOM FROM STAFF AS S JOIN ROOM AS R ON S. PHONE =

R.PHONE;"); ResultSet rs =

ps.executeQueryO ;

while (rs.nextO)

System.out. printIn("NAME: " +

rs.getString(O) + " , ROOM:

"

+ rs.getlnt(l));

}

catch (SQLException e)

{

e.printStackTrace(); [image:21.528.71.476.148.524.2]

}

Figure 10: Javacode using JDBCto accessdatabase

Thepower and speed ofSQL is immense butwith notype-safety andthe requirementfor

(22)

3

Language Integrated

Query

(LINQ)

The previous chapter introducedtwo prominent data-storesthat arecommonly used with

enterprise level applications and the existing technologies available to query them. The

areas ofconcern highlighted while using these technologies points a direction towards an

approach aimed at alleviating those concerns and making the entire query experience as

homogeneous as possible with additional safety checks that make it consistent with the

highlevel programming paradigm.

LanguageIntegrated Query

[10]

is a project at Microsoft that is picking up momentum

inthe programminglanguage communityfor its attempt at bringingin a queryingsyntax

into mainstream programming languages. The LINQ mechanism provides the program

mer with away toprogrammaticallyorganize and extract specificinformationfrom adata

source. It provides a consistent syntax for writing queries over various data-sources like

object collections, XML, relationaldatabases andeventhirdparty data-sourcesas longas

some ofthe contract requirements are implemented. CurrentlyinLINQ adatasource can

be a hierarchical document, like XML, a database, like mySQL, theobject collection, or

any collection that implementsthe Enumerable interface. Given the type of one ofthese

data sources, the user has the ability to extract information from this source using the

same words in the host language. From this vantage point the programmer is extracting

informationin the same mannerfrom three entirelydifferent sources.

The .NET platform and C# 3.0 language in particular has incorporated useful language

featuresthatmakesseamlesslyincorporatingLINQinto the.NETplatformfeasible. These

include Anonymous Classes, Extension Methods

[11],

Lambdaexpressions etc. Allthese
(23)

re-served words in the language and maps these words into a series ofmethod calls on the

data-source. These method calls are mapped to certain logical operations. For instance,

the logic ofajoin operation is mapped to a method named join(). This direct mapping

allows for many by products that do not exist in traditional queries performed on data

sources such as databases and hierarchical documents.

Oneofthe biggest resultsofthis technologyistypesafe

queries, ensuringat compiletime

and not runtime, that these queries are valid and will not fail. Ensuring proper typing

reduces the number of potentialbugs in a programin the samewaya strongly typed lan

guagelikeJavaremoveserrorsthatplague a not sostronglytypedlanguage likeCor C++.

Consider the query below in LINQ over an in-memory collection of Employee objects

to find the names ofemployees that areofthe sameage.

var result = from el in Employee from e2 in Employee

where el.ageO ==

e2.age() kk el.nameO != e2.name()

select new {personl =

el.nameO ,person2 =

e2.name()};

Noticehowthequery issimplein itsstructureand no nestedfor loopsare needed as would

berequired in a traditional program. The fromkeywords brings in newdata sources into

the queryenvironment, thewhere keyworddoes the comparison (like in an if statement)

andthe select keywordconstructsthereturntypeoftheresult. Thenewobject withthe

names of employeesis notpreviously defined but it isconstructed at runtime and assigned

to adynamicallytyped result var.

(24)

C#, like Anonymous Classes, Extension Methods [11], Lambdaexpressions that are not partofJava, inorder to implement QuEL for XML andrelational databases.

3.1

XLinq

and

DLinq

TheruntimeforLINQondifferent data-sources differs inwaysthatarespecifictohandling

the respective data-sources though from a user's perspective LINQ offers the idea of a

consistent queryexperienceacross any data-source backend.

3.1.1 XLinq

Theruntime thatprovidesLinqcapabilitiesfor XML queryingis just an extension ofLinq for dataobjects, with some special extension methods that are relevant in the context of

reading XML documents, like theelements0 methodto readthetext value of nodes and

attribO method used to read the text value of an attribute of a node. The rest ofthe

Linqgrammar staysthe same, AnXLinqquery is thusstructuredexactlythe sameway as a Linqqueryon collections ofdata objects.

3.1.2 DLinq

The Linq grammar by design resembles the syntax of SQL closely. Thus no additional grammar tokens are needed for extending the concept of Linq to databases. The imple mentation for SQL remains the same. Extension methods in the host language backend

corresponding tokeywords inthe querylanguage performthe actions requiredto fulfillthe

(25)

4

QuEL

We have lookedat theproblemsidentifiedwith query languagesand their integration into

a host high level programming language. We also discussed how the LINQ project at

Microsoft is bringing together query syntax inside mainstream application programming

languages like VB.NET and C#. In this chapter we will look at the recently offered po

tential solution for querylanguage support withinthe Javaprogramming language, called

QuEL

[1]

. Theofferedsolutionis afront-endthatconstitutesthenewQuELgrammar and

the translation mechanism that is so far implemented for performing queries over object

collections.

The QuEL (QueryEnhancedLanguage)mechanismisafront-endpreprocessor which when

given a Javafilewith an embedded query, written using the QuELgrammar, will convert

that language into syntactically correct Javacode, thereby making queries on Object col

lections less cumbersome and allowing the programmer to ignore the boiler plate coding

normallyneeded whileperformingthesequeries with thesupported JavaAPIssofar avail

able.

4.1 QuEL front-end

The front-end forQuELasdeveloped inthethesis "IntegratingaUniversalQueryingMech

anisminJava"

[1]

consists ofthe completeQuELgrammar andthe translationmechanism

that parses aquery written in QuELandperforms various preprocessingstepsneeded for

it to mapthequeryto the appropriateback-end. The steps inthispreprocessing involves

(26)

methodtreerepresentation isanabstractmappingofthequerytoany back-end implemen

tation. The next step is performed by aninterpreter that convertsthis abstract mapping

to a custom mapping in the formof pureJava code.

4.2 Object Collections

So far the interpreterofthe QuEL front-end is configuredto usethe back-endwrittenby

Dr. Axel Schreiner for thepurpose ofqueryingon object collectios

[12]

in Java. The Java

codeemittedbytheinterpretermakes method callsdefined inthisback-endas a successive

method call chain. Thesemethods arethereal operators ontheobject collections on which

theoriginalQuEL query iswritten, therebyreturning theresult as dynamicallycomputed

typedobject collection whichis again usableinuser writtenJavacode. Onecouldimagine

the back-end as the proxy for executingthe actual query andthe query language asjust

syntacticsugar for workdone bythe back-end.

The same LINQ query from the previous chapter is now slightly modified and is now

the QuEL query imbedded in Java code within the @@ markers. This query does the ex

act same function as described earlier but now in the Java enviornment using the QuEL

front-endand object collection back-end that was discussed.

result = from edu.rit

.cs.quel.Person el in Employee

from edu. rit.cs. quel.Person e2 in Employee

where el.ageO ==

e2.age() kk el.nameO != e2.name() select new {personl =

el.nameO ,person2

=

(27)

5

QuEL

for XML and Relational Database

Wediscussed in the previouschapter that the QuEL

[1]

mechanism is afront-endprepro

cessor which takes in a Java filewith a QuEL query and translates to pure Java codefor

object collections. The promise ofthis paradigm is however not just to provide a query

mechanism for data collections but a consistent querying experience across varied data

sources. Theworkdonein this thesisis aimed at providing just that experience, forwhich

wehavechosenXML andrelationaldatabaseastheotherdatasourcesforwhichthequery

mechanismcan be extended.

In an earlier chapter on core concepts, it was discussed that writing query over XML

or a relational database inside of Javacode is cumbersome and involves writing a lot of

code that a programmer, for the most part, does not like to be bothered about. It can

be debated that the programmer should not be expected to care about all the wrapper

code and focus around the actual logic of query. This is one ofthe basic reasons behind

developing QuEL and extending it to all data-sources like XML and relational database

so that thefocus ofquerying inside ofJavacomes backto thequery logic.

Asdiscussed inthepreviouschapter, thegrammarforthenewlanguageQuELhas beende

veloped under the thesis "Integrating aUniversal Querying Mechanism in Java"

[1]

where

QuEL will be used to write queries on collections inside a Java program that would be

preprocessed by the QuEL compiler to transform the QuEL part of the code into pure

Javasyntax which could then be compiledby javac.

The current thesis work utilizes the QuEL grammar for writing queries over DOM and

(28)

however some constructs specific to the DOM implementation in Java is not part ofthe

grammar. Thus for the purpose ofthis thesis the back-end for XML assumes the

recog-nistion of these constructs in the grammar so that pure Java code could be build from

QuEL queries for XML data-source. Like in the QuEL paper

[1]

the approach taken to

integrate these queries is a preprocessor based approach, so that the extensibility of the

Javaprogramminglanguagecanbe easilyproven. This approach alsogivesusthesoftware

engineering advantageofmaintainabilityandindependencefrom subsequent Sun JDK re

leases. Theprocesses ofincorporatingthenew grammarinto Javawouldthen beto merge

the QuELgrammar and theelements of our new compiler andthequeryback-endsupport

intothe Java compiler.

Figure 11 provides a high level understanding of the process that converts the special

Java file with QuEL query through a pre-processing stage into a complete compile able

Javafilewith the right mappings generatedto serve the queryfunctionality.

Theprocess of compilation ofQuELinjected Java code is atwo stageprocess where Java

file is first pre-processed by the QuEL processor that parses only the QuEL codewithin

"@@"

markers. The lexer for the preprocessor convertsthe query intoaparsetree that is

then mappedto a methodtree. Themethods inthis tree arethemethodsthat correspond

to each keywordofthe querygrammarlike from, where, select, join etc. The corre

sponding methods are members ofDocumentFragment or Queryable interface depending

on the datasource. The methods form a closed set of operations overthe resulting type.

Since the operations are closed on the result it makes it very simple to apply just about

(29)

Embedded Java Code Preproc&ssedJavaCode DocumentBuilder

docurcentB'jilder=

null; Element staff=

null.-Object q=88fraitr n ]^-ierifg (i -"'nt'< wheie i

-3005 select new Ir.rc

r.phone! <?8

gruery, elements(

rorir.s.).where<ne*r F.u-: rv,nlen, Elements (1

public Booleanf(E.lern&nr.r) I returnt.getAttribute "room"}. equalfl("3i15")) 1 Preprocessing Puce Java Code / . 1 <-+~\-ylow } Processor ,

\

i 1

A

J

r

L

-. r4n

? i ?

XML/SOLQuery

[image:29.528.104.419.77.295.2]

Backend Method Tree

Figure 11: QuELpreprocessingprocess

Lazy evaluation

The designof anybackend implementing the requiredinterface actions has to be made a closed set in order to achieve a cascaded evaluationflow whichis also a "lazy" evaluation

mechanism reminiscent in the functional programming language Haskell. The evaluation chainis triggeredonlywhenresultofthe query is firstused andnot while each partofthe

queryoperation is constructed. This givesthe flexibilityof "adding on"

anyofthe closed

(30)

5.1 XML in QuEL

Now cc jr how we canextend the newquery frameworkto beable towrite queries over

XMLdocuments. Alsokeepinmindthat theQuELgrammarisgenericforauniformquery

syntaxhowever becauseinternallytheback-end usestheorg.w3c.domAPIs inJava, parts

ofthe queryforXML will haveconstructs that are governed bythese APIs.

5.1.1 Background logic

TheJavaprogramin Figure 6 isthebaselinethatwe will usetocomparethe improvements

that we claimto achieve with animplementation ofQuEL for XML. Here is allthat now

needs to be written with QuEL in order to read in the Staffand Rooms document and

mergethem togenerate the occupancylist.

final Element staff = documentBuilder.par

se(

Tester. class.getClassLoader0.getResourceAsStream(

"quel/staff.xml")).getDocumentElementO;

final Element rooms = documentBuilder.parse(

Tester. class.getClassLoader( ).getResourceAsStream(

"quel/rooms.xml")) .getDocumentElementO;

result = from Element s in elements(staff)

join Element r in elements(rooms) on

s.getElementsByTagName

("phone"

) matches

r.getElementsByTagName("phone") into sr select new sr.first, sr.room

Figure 12: QuELqueryfor XML access

(31)

before you start the query. The query then is a very compact statement that uses the

fromkeyword tointroduce the Staff source andthe joinkeyword isusedtointroducethe

Rooms source and match on the phone node on both the documents that havethe same

value and merge the results ofthe join filter into a temporary structure referred by the

variablesr. Now theselectkeywordisusedchoosethenodes/attributesfromthemerged

document into afinal result whichis aDOM DocumentFragment structure.

5.1.2 Method translation

Inorder to understandtheworkingsbehindthe scenethat makes thewhole querylookas

simple, less wordy and yet achievethe same result, we will start by looking at the archi

tectural abstracted view ofthe processing that is part ofthe pre-processing stage of the

compilation.

Puxe Java cote

Tree it I

Processor

XML backend

Preprocessing to XML

rk

Method sub-trees

[image:31.528.90.437.383.568.2]

r-r-.-.

(32)

From Figure 13 it can be seen that a Java file with QuEL code is parsed by the lexer

and a parsetreeofwith thenodesofthe query is internallygenerated. Note that thepure

Java code part of the file is not scanned in this process and passed though this pipeline

without any modification. After the parsing is done a mapper(not shown in

figure)

walks

the parse tree and converts the tree into something called a Method tree. This tree is

closer representation of the final Java output that will get generated and has all typing

related information alongwith theoriginalquerylambdas andpredicatesthatcan nowbe

worked on by the interpreterthat converts it to thefinal Java code that can be compiled

with javac.

5.1.3 XML backend design

Inthis section we willdescribetheworkings ofthebackend designed for XML andhowthe

Proxy class in Java is used to add on extension methods to the DocumentFragment class

suchthat the allthe keywords in QuEL correspond tomethods addedto this classduring

runtime.

The extension methods are added to the DocumentFragment by defining an interface

QuELDocumentFragment with some ofthe following methods:

public interface QuELDocumentFragment <T extends Node> extends

DocumentFragment

{

public <K> QuELDocumentFragment select(Func<K, T> selector);

public <K> QuELDocumentFragment groupBy (Func<K, T> group);

public <K> QuELDocumentFragment orderBy(Func<K, T> group);

(33)

Func<Boolean, Node> predicatel, Func<Boolean, Node> predicate2,

Func2<Node, Node, Node>selector)

}

Thefirststep in the queryinterpretationprocessistocreate aninstanceoftheQueryclass,

which isthe entrypoint forcreatingaproxyoftheDocumentFragment class. This is done

bymakinga callto theelementsQ methoddefined inthisclass withthefollowingsignature:

public <T extends Node> QuELDocumentFragment elements(Node container)

This methodtakes in anXML document read in as aNodeobject and wraps a

QuELDocumentFragment aroundit. Inadditionitalsoperformsthecheckp.getNodeType0

== Node

.ELEMENTJJODE on each node of the input document before it is returned to the

next methodinthe querycomposition chain. This is important becausewhenaXMLdoc

ument isread into the DOM the nodes constitutesthe followingsub interfaces likeAttr,

Comment, Document, Entity, Element, Text etc and for the purpose ofthe query we

will for all practical purposes be concerned only with the Element node. This way the

elements() isnotonlyusedtointroducethe proxywrapperbut alsotoinitializethebasic

filtering for the querybeing constructed.

Oncethe proxy is introducedallthe QuELkeywordcall methoddefinitions are now visible

and can be invoked at any point on the resulting object of each method since the return

typeforeach ofthesemethodsisthe sametypeon which themethod wasinvoked, namely

in the XML case a QuELDocumentFragment class. Here is how aproxyobject ofthe same

(34)

return (QuELDocumentFragment) Proxy.newProxylnstance(

QuELDocumentFragment.class.getClassLoader(), newClass[]

{

QuELDocumentFragment.class, DocumentFragment.class },

new QuELDocumentFragmentHandler(this.db, values,

new Evaluate()

{

public DocumentFragment evalO

{ }

);

It's thebodyofthe Evaluate interfacethathastheeval0 methodthatisdistinct foreach

ofthe extension methodsthat have logicspecific tothe action thatneedstobeperformed

bythat method.

Consider aquery inQuEL and seethe convertedJavacodeforthe sameusingthebackend

design for XML.

result = from Element s in elements(staff) join Element r in elements(rooms) on

s.getElementsByTagName("phone

"

) matches

r. getElementsByTagName("phone") into sr

where s. getElementsByTagName("last").item(O)

.getTextContent().equals(

" Catan"

)

[image:34.528.77.403.72.236.2]

select new sr.first, sr.room

Figure 14: Another QuEL query for XML access

(35)

object(s) thatwould havespecial DOM likeextension methods that would beused to ex

tract node elements and attributes from theDOM. The query above in QuEL wouldlook

likethefollowingwhen mapped to the Java backend designed in this thesis.

QuELDocumentFragment 1 = query. elements(staff).join(

query.elements(rooms), new Func<String, Element>()

{

public String f(Element s)

{

return s. getElementsByTagName("phone").item(O)

.getTextContentO;

}

, new FuncjString, Element^)

public String f(Element r)

{

return r. getElementsByTagName("phone").item(O)

.getTextContentO;

}

, newFunc2jDocumentFragment, Node, Node^()

public DocumentFragment f(Node s, Node sr)

{

return query. combine("s", s, "sr", sr) ;

}

).where(new FuncjBoolean, Element^/)

public Boolean f(Element s)

{

return s. getElementsByTagName("last").item(O)

.getTextContentO .equals("Catan");

[image:35.528.65.387.134.430.2]

}

});

Figure 15: Javatranslation for QuEL queryfor XML

The code generatedusing the XML backend Queryclass can nowbecompiled using javac

would generatethesame result asthe Javacodethatwe wrotein Core Concepts chapter2.

Although the generated Java code is type safe and compiles, the query itself can still

failtogeneratethe expected result. The reasonforthatis, XML isnot schema enforceable

and thus query itselfcannot be type checked for correctness against the XML document

(36)

Kayandaims atproviding type checkingon queries writtenusing XPath. This is howwever

an optional facility andJavaat this time doesnot support it, so we ignore this aspect for

the scope ofthis work.

Thus a possible future extension to the QuEL for XML could be to makes the trans

lated queries completely type safe for documents that adhere to some schema definition.

However for all other XML documents it would be a very challenging task to do achieve

(37)

5.2 Database query in QuEL

Now considering how same query translation mechanism works for a new data source, in

this case we will lookat themost commonlyuses database medium, Relational databases.

Thebigchallengeand differencewhiletacklingthisproblemis, thatunlike inthe previous

cases for object collections or XML, the data sourceis on disk and should always remain

on disk during the entire query process. This might sound trivial at the moment but it

is animportant situation that needs tobe maintained and whichinfluences the designon

the backend for SQL, i.e. since allthe information cannot be cached we haveto translate

the query into SQL. We shall discuss that later in this chapter after we have built the

foundation for querytranslation.

The other important aspect of query mapping for relational databases is that of type

checking. The database tables over which the QuEL query executes exists on disk and

because the Java compiler has no way prior knowledge of their types, some way needs

to be devised to acquire typing information about such entities so that we can provide a

type-safequeryguaranteed to executesuccessfully.

5.2.1 Background logic

Once again reconsider the Java program inFigure 10 as the baseline to compare with for

theimprovementsthatwe claimto achieve with animplementationofQuELforrelational

database. Here is all that now needs to be written with QuEL in order to read in the

Staffand Rooms tables andmerge thembased onthe phone number fieldto generate the

(38)

final edu.rit.cs. quel.StaffBean staff = new

edu.rit.cs.quel.StaffBean(Staff); final edu.rit.cs. quel.RoomBean rooms = new

edu. rit.cs. quel.RoomBean(Rooms); @@

result = from edu.

rit.cs.quel.Staff Bean s in staff

join edu. rit.cs. quel.RoomBean r in rooms on

s.phone matches r.phone

where r.size > 2000

select new s.first, r.size

Figure 16: QuEL queryfor database access

Noticethat thequery isaverycompact statement thatuses thefrom keywordtointroduce

the Staff source and the join keyword is used to introduce the Rooms source and match

on the phone column on both the tables that have the value and on the result apply the

filter condition where the size of the room is greater than 2000. The select keyword is

used to project the columns from the filtered temporary result into a final result which is

a Queryable<T> type, where T is the return typeofthe query which can be an internally

generated typeor anyofthe standardbuilt-intypes.

5.2.2 Method translation

Figure 17 provides an understandingofthe workings of the translation system for QuEL

for SQL and the corresponding SQL backend.

The process of converting QuEL for relational database query is slightly different from

the process discussed earlier for XML. The hook for conversion of the QuEL processor

(39)

Preprocessing to SQL

Schemelookup

0-Hb

~

Processor

Collections backend

Si

Method sub-trees

S2*!'

Querymethod composition

"fe-Figure 17: SQLprocessing

whole translation scheme. The reasonfor thisdifference is for thefollowing factors:

QuEL queries for relational databases are converted to their SQL equivalent query

before they get executed on the database. This approach was taken in order to

leverage the performance that SQL provides for querying databases, especially in

cases when the scalability of the query translation mechanism is considered then

executing the appropriate SQL query to extract data becomes a primary concern

for designing thebackend. Another approach that was considered initially, involved

frontloading the data in all the tables into data objects called Beans. Once the all

the data in loaded the query could then be run in-memory on the beans, like in

the object collectionbackend [12]. Thisapproachworks wellforobjects however the

scale ofdatabase tables is far larger and it cannot be assumed that the system has

enough memory resources to store all the datain-memory. Also it would not be a

good design approach that stretches the limitsofthe systemresources.

[image:39.528.85.432.54.259.2]
(40)

that willhave to beexecuted to retrieve data. The query itself is astring and thus

cannotbecheckedfor syntacticalandtypecorrectness. Thusthe translation process

should not only generate the backend code for writing the query as SQL string, it

should also generate codethat would ensure typesafety ofthe generate query.

5.2.3 SQL backend design

We identified the need for type checking of the QuEL queries when converting to SQL,

which introduces the open-source object relational mapping tool SQL2JAVA [14]. The

tool is used to generate classes that will ensure type checking for the queries, and the

query translation mechanism generates code which when complied and run will generate

SQLquery and executethe same onthe concerned databaseand wrapsthe results appro

priately in so that it could be iterated upon conveniently in the user code.

SQL2JAVA

Thetoolisfreelyavailablefor downloadand comes withtheexecutablefortheopen source

HSQLDB

[15]

database. The tool can be configuredto run ant

[16]

scripts that generate

Javaclassesthatmapto tablesintheunderlyingrelationaldatabaseschema. The database

schema file should be provided to the tool for it to generate these classes, called Beans.

The beans themselves provide additional functionality like persistence and transactional

capabilities, the capability to execute simple queries such that from a programmer's per

spective the beanscan be treatedas in-memoryrepresentation oftherealdatabasetables.

However for the purpose of this thesis we shall not use the latter mentioned capabilities

for limitations explainedin the 5.2.3 section.

(41)

the beans for mappingthe tables into Javaclasses aregenerated before the preprocessing

step and made available to the translation engine. Only the attributes of the beans will

be used asthey map tothe columns ofthe tableand in that regardany other

relational-mappingtool could also be used topregenerate thesebeans. Another alternative to using

atool could be tohand-writethese beansbyexamining thedatabase schema.

The design for the backend involves the following Queryable<X> and Evaluate<X> in

terface that serve as the contractfor translation to SQL.

The Evaluate<X> interface is used to hide the bodyof methodthat chainsthe evaluation

process once itstrigerred.

public interface Evaluate<X>

{

/**

* The method to that computes the resulting intermediate query

* and determines if the query is complete and executes the same * on the database.

*

* return An Iterable result of the query */

public Iterable<X> evalO throws SQLException;

}

The Queryable<X>interface declaresthepubliclyavailable extension methods ofthe

back-end. The translationmechanism calls on thesemethods to build the Java code equivalent

ofa QuEL query.

/**

(42)

* type

* and maintains the logic to convert this Expression into

* the SQL string appropriate time comes, i.e. when the call to * evaluate the result occurs.

* author Anurag Naidu

*

* param <X>

*/

public interface Queryable<X> extends Iterable<X>

{

/**

* The select contains the columns to project and triggers the * evalution process and this is seen as logical termination of * complete QuEL query.

*

* param select

* return An object of itself so that its a closed operation

*/

public Queryable<X> select(Expression<SqlQuery> select); /**

* The where contains the labmda condition that needs to * be checked as a query result filter.

*

* param where * return

*/

public Queryable<X> where(Expression<SqlQuery> where); /**

* The where contains the sources that are introduced into a

* QuEL query, namely the database table names. *

* param from * return

*/

public Queryable<X> from(Expression<SqlQuery> from);

}

The hook intheQuellnterpreterisa subclassthattranslates theQuELquerytowards the

(43)

public interface Expression<T>

{

/**

* Function to generate the

query string. The string generated * so far is passed as input and the result is the same type

* the parts of the query computer from the extension method

* that defines this Expression class

*

* param Parameter to the function (same as return type)

* return

*/

T f (T param) ;

}

We can now look at the earlier QuEL query on relationaldatabase and compare with the

translated code. The simple query that brings together the Staff and Rooms tables and

returns all staff member who occupyrooms with size greater than 200 sq.ft.

final edu. rit.cs. quel.Staff Bean staff = new

edu.rit.cs.quel.Staff Bean(Staff);

final edu. rit.cs.quel.RoomBean rooms = new

edu. rit.cs.quel.RoomBean(Rooms);

@

result = from edu. rit

.cs.quel.Staff Bean s in staff

join edu. rit.cs. quel.RoomBean r in rooms on

s. phone matches r. phone

where r.size > 2000

select new s.first, r.size

Figure 18: QuELqueryfor database access

The queryaftertranslationismappedto theedu.rit. cs. sql.backend.Querybackend is

(44)

Queryable<AN0N1548173586> resultl = new

edu.rit.cs. sql.backend.Query<AN0N1548173586>(

AN0N1548173586.class, this, null).from(

new Expression<SqlQuery>()

{

public SqlQuery f(final SqlQuery current)

{

current.fromClause.append("staff s, ");

return current;

}

}).from(new Expression<SqlQuery>()

{

public SqlQuery f(final SqlQuery current)

{

current.fromClause.append("rooms r, "); return current;

}

}).where(new Expression<SqlQuery>()

{

public SqlQuery f(final SqlQuery current)

{

current.partialQueryClause.append("where ");

current.partialQueryClause.appendC'r.size");

current.partialQueryClause.append(">");

current.partialQueryClause.append("200");

return current;

}

}).select(new Expression<SqlQuery>()

{

public SqlQuery f(final SqlQuery current)

{

current.selectClause.append("select ");

current. selectClause. append("s.name, r.size,") ; return current;

}

[image:44.528.76.385.58.480.2]

});

Figure 19: Javatranslation for QuEL query for database

Notice in the convertedquery how themethod definitionfor the Expression<T> interface

has thecode to writeout itspart ofthe SQL querystring. When invoked theycontribute

their part ofthe query to the SqlQueryobject.

(45)

joinetc ofEvaluate<X> interfacetriggersthe T f (T param) methodpassedtoitasthe

first stepwhich cases the chain to first construct the correct SQL string equivalent ofthe

QuEL query. The methods select or groupBy carry special meaning as they execute

the query on the database using the internal connection object, in Figure 19 above the

code generatedfortype-safetychecksisomittedfrom output nowforbrevity, howeverthe

Java code generated also has the type-safety check code that does not ever get executed

but is merely there as a proof, that ifthe generated Java file compiles then the query is

(46)

6

Comparison

and

Results

This chapter considers a few QuEL queries over XML and relational databases and com

pare the results from the respective backend against the results from a native language

translation ofthesame query. As discussed in the backend design for relational database,

the QuEL query is converted to the SQL equivalent and one ofthe measureof success for

the backend is the comparison between the backend generated query and original QuEL

queryhand-written as SQL.

6.1 QuEL query translation for XML

Assume that we have the same Staff and Rooms XML documents from Figure 3 and we

wanttomerge thetwo documentsonthematchingphonenumbers andcreate a new

docu-ument using the name and room values after groupingthem based on room numbers and

alphabetically reverse sorting the names within each group. The QuEL query to do the

same would likethe following:

result = from s in elements(staff) join r in elements(rooms) on

s.getElementsByTagName(phone) .item(O).getTextContent() matches

r.getElementsByTagName(phone) .item(O).getTextContent0 into sr

select new

{

s.getElementsByTagName(f irst).item(O),

room = r.getAttribute(foom)

group by r.getAttribute(foom)

(47)

Figure 20: Quel queryforXML

Thetranslated queryreturns the

following

resultwhichmatcheswiththesamequerywrit

tenusingJava APIs for XML processing

involvingmanymorelinesofcodethan the QuEL

query.

Group groupedBy='3012'

first room='3012'

'Tina'

Group groupedBy='3599'

first room=3599

'James'

Group groupedBy= '3005'

first room='3005'

'Eileen'

first room='3005'

'Jason'

Group groupedBy='3022'

first room='3022'

'Christina'

first room='3022' 'Liane'

Figure 21: XML queryresult

Wewill skip the translated Java codeusing XML backend fromthistext dueto the shere

size ofthe code, however it isinterestingto notehow thisconciseQuELquerytogenerate

thisresults expands almost ten times itssize when convertedto pureJava using theXML

backend and when compared with code written using only the available Java APIs for

XML is roughlytwentytimesthelengthoftheactual QuELquery. Thisonlygoestoshow

(48)

interactingwith XML documents.

6.2

XQuery

for Java

There are a some XQuery

[7]

implementations that are available for use in Java as API

calls. Oneofthem isOJXQI

[17]

proposedbyOracle

[4]

designedonthelinesofJDBC

[18]

with similar APIs. OJXQI could be used to query XML documents by preparing a

PreparedXQueryobject and constructingan XQueryqueries with it.

OJXQI being similar to JDBC in Java, has some advantages and similar disadvantages

discussed in chapter 2 on XML which this thesis attempts to alleviate. However OJXQI

also allows support for SQL queries on documents through a sqlquery function. These

features bring it close the syntax of QuEL, however using OJXQI still involves writing

boilerplate setup codejust like in JDBC.

6.3 QuEL query translation for SQL

ConsideraQuELquerytoretrievedata froma relationaldatabase,inthiscaseHSQLDB [15],

that has the tables that we will query. Assume that we have the Staff and Rooms tables

fromFigure 8in our database and wejoinon thephone column andthen filterout rooms

on the condition that they are greater than 2020 sq.ft. and numbered above 2000 with

theresults groupedbyromms with same size. A SQL querytodothesame wouldlooklike,

select s.*, r.size from rooms r

inner join staff s on s.phone = r.phone

(49)

and r.size > 2020

order by r.size

The same querywritten in QuEL would then look like,

result = from

sql2java.linq.database. RoomsBean r in rooms

join sql2Java.linq. database.Staff Bean s in staff on r.phone matches s. phone

where r.size > 2020

where r.number > 2000

orderby sr.size

select new sr.name, sr.number, r.size

@@ . . .

The SQL backend converts this query into the following SQL,

select s.name, s.phone, r.size from rooms r

join staff s on s. phone = r. phone

where r. number > 2000 and r.size > 2020

order by r.size

The result ofexecutingthe QuEL query is the same as executingour first SQL query,

name=JasonMorrison phone=4010 size=2030

(50)

6.3.1 GroupJoin issue

OneofthelimitationsoftheQuELtranslationmechanism for SQL iswiththe into clause

withthe joinandgroupbyclauses. ThefollowingQuELqueryconvertsto theappropriate

SQL howeverwrapping the ResultSet into the correct return typeclass fails.

result = from

sql2Java.linq.database. RoomsBean r in rooms

join sql2java. linq.database. Staff Bean s in staff

on r.phone matches s. phone into sr

select new sr, r.number

The translated SQL appears likethe

following

select sr.*, r. number

from (

select s.name, r.*

from rooms r

inner join staff s on s. phone =

r. phone

) sr,

rooms r

This query executes on the database however the because the select clause specifies the

variable sr as one ofthe return types the QuEL mapper

[1]

creates the return type class

for the whole query with an instance of Staf fBean wrapped in a Iterablelmpl. This is

however insufficientforthe purpose ofSQLsince use of srin theselect clauseisequivalent

to selecting all the columns of the table Staff and Rooms. This makes the return type

(51)

7

Conclusion

Reconsidering

the problem areasthat we identifiedearlier with writingaqueryover XML

using XPathandDOMAPIs orover a relational databaseusing SQL in Javaandcompar

ing how the QuEL wayofquerying onXML andSQL backend solves theseproblems.

Complex codelogic neededforfairlysimple querypatterns

A queryforperforming the same operation is simpler, cleaner and more intuitive.

Proficiency in many technologies likeXPath, XSLT, JDK XPath APIsis needed. Theprogrammer onlyneeds to learn QuELforthe most partfor writing queries over

any of the supported data sources.

Repetitiveboilerplate codeto supplement the actualquerycode.

No boilerplate code is written by the programmer. The backendfor the respective

data source technologywill auto generate all boilerplate code andkeep the query logic clean.

The query itselfremainstype unchecked and so potentially unsafe.

The query is not checkedfor type safety by the backend. Since the backendgenerates pure Java code that is compiles, thus the queries are themselves are assured ofbeing

type checked otherwise the code generated by the backendwould not have compiled.

Lack of consistent query language for all data-sources.

QuEL grammardoes not change significantlywith the data-source itoperates on and

(52)

7.1

Timing

T

Figure

Figure 1: Staff XML document as a tree
Figure 2: Staff XML document as a tree
Figure 3: XML document for Rooms and Staff
Figure 5: Result of XSLT solution
+7

References

Related documents

xml coling dvi XMLTrans a Java based XML Transformation Language for Structured Data Derek Walker and Dominique Petitpierre and Susan Armstrong fDerek Walker,Dominique

In this paper an intranet application is designed and implemented using Object-Oriented Programming Language Java and Object-Relational Database Management System

When the source function in the programming language, either in procedural language like C or in object oriented language like Java, is continuous there is

SQL3 grammar was not entirely implemented; the SQL instructions recognized by the parser are the queries DDL (Data Definition Language) for the Description of the structure of

You already know a little about object-oriented programming because after working the example programs in Java Programming Language Basics, Part 1 and Part 2, you are somewhat

By using semantic grammar techniques which interleaves semantic and syntactic processing in distributed databases, LADDER system is used to parse natural language

In the Java programming language is performed Within the agents are implemented all standard Simple Network Management Protocol SNMP functionality.. Lab How to write Java code

They show that target language data types can be defined in Haskell so that if the compiler itself type checks, then so will the code it generates6. However, language embedding