In document Semantic Caching for XML Queries (Page 176-182)

7.2 The ACE-XQ System Overview

7.2.3 The Main Internal Modules

We now look more closely at the main modules implemented or incorporated in the ACE-XQ system, describing how they interact with each other and which internal data structures they use.

XQuery Engine. The ACE-XQ system is integrated with the IPSI-XQ query engine version 1.0.1, which faithfully implements the W3C Working Drafts on XQuery 1.0 as of 2002-01-08 (i.e., the working drafts on the language syntax, data model, functions and operators, and formal semantics of XQuery 1.0).

The ACE-XQ system architecture uses three components of the IPSI-XQ engine: the query parser, the static type inference component, and the query executor. Below we explain how each of these three components is interfaced with the other modules in the ACE-XQ system.

• When a query request is submitted, the input query is first parsed by the query parser provided by the IPSI-XQ engine. The parsed query tree structure is then intercepted by our query decomposer, which applies the proposed query minimization, normalization and decomposition techniques.

• The type inference and subtyping functionalities are provided by the static type inference component of the IPSI-XQ engine. The query containment mapper module uses these functionalities to realize the proposed containment checking technique.

• If a query is determined to be contained in or overlapping with a cached query, the rewritten probe query is executed by the query executor of the IPSI-XQ engine installed at the local cache site to retrieve the answer available in the cache, while the remainder query, if any, is sent to the remote data server, where another IPSI-XQ engine is installed for query execution. When no cached query can be used to answer the new query, the whole query is transmitted to the remote data server for execution. The ACE-XQ system treats the query executor as a black box that processes a query and returns its results to the ACE-XQ system as an XML document.
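The dispatch behaviour described in the last item can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `route_query`, the case labels, and the site labels "cache" and "server" are hypothetical and do not come from the ACE-XQ implementation.

```python
def route_query(case, probe_query=None, remainder_query=None, full_query=None):
    """Return (site, query) pairs saying where each query would be executed.

    `case` is a hypothetical label for the outcome of containment checking:
    "contained", "overlapping", or anything else for "no usable cached query".
    """
    if case == "contained":
        # The whole answer is available locally: run only the probe query.
        return [("cache", probe_query)]
    if case == "overlapping":
        # Part of the answer is local; the remainder goes to the data server.
        return [("cache", probe_query), ("server", remainder_query)]
    # No cached query helps: ship the original query to the data server.
    return [("server", full_query)]


plan = route_query("overlapping", probe_query="probe", remainder_query="remainder")
```

In both sites the executor is treated as a black box, so the sketch only decides where a query runs, not how it is evaluated.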

Query Decomposer. Our query decomposer is interfaced with the query parser module of the IPSI-XQ engine from which it intercepts the parsed query tree structure and performs three passes on it.

The purpose of the first pass is to apply the variable minimization procedure described in Figure 3.8. After this pass, essential variables are distinguished from non-essential ones, and the former are collected into a set named EVS. Based on the identified variable essentiality, the appropriate normalization rules are then applied, which results in a uniformly nested query form containing only the essential variables.

In the second pass, the variable dependencies are extracted from the parsed query tree and the VarTree structure is constructed to reflect the pattern matching semantics of the input query.

The data structure implemented for VarTree is a recursive tree structure composed of VarNodes. Each VarNode holds the information about one essential variable, such as the variable name, the bound path expression, and a list of children VarNodes. A Region data structure is also implemented; it has a field containing a list of VarNode references, which records the associations between variables and their home FLWR blocks. The control-flow-induced variable dependencies can be inferred from these associations. The Region data structure also contains a field for a list of where-conditions, namely those that are specified in the current FLWR block but are imposed on variables defined in the enclosing FLWR blocks.
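As a rough illustration of the two structures just described, the following Python sketch models VarNode and Region as plain data classes. The field names (`name`, `path`, `children`, `variables`, `outer_conditions`), the helper `variable_names`, and the example query fragments are assumptions made for this sketch, not the actual ACE-XQ definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VarNode:
    name: str                       # variable name, e.g. "$b"
    path: str                       # bound path expression
    children: List["VarNode"] = field(default_factory=list)

    def add_child(self, child: "VarNode") -> "VarNode":
        self.children.append(child)
        return child


@dataclass
class Region:
    # References to the variables whose home FLWR block is this region.
    variables: List[VarNode] = field(default_factory=list)
    # Where-conditions written in this block but imposed on outer variables.
    outer_conditions: List[str] = field(default_factory=list)


def variable_names(root: VarNode) -> List[str]:
    """List the variable names of a VarTree in pre-order."""
    names = [root.name]
    for child in root.children:
        names.extend(variable_names(child))
    return names


# A hypothetical two-level query: an outer block binding $b, an inner block
# binding $t and $a and constraining the outer variable $b.
book = VarNode("$b", 'doc("bib.xml")/bib/book')
title = book.add_child(VarNode("$t", "$b/title"))
author = book.add_child(VarNode("$a", "$b/author"))
outer = Region(variables=[book])
inner = Region(variables=[title, author], outer_conditions=["$b/year > 1995"])
```

Because a Region stores references rather than copies, the same VarNode object is reachable both from the VarTree and from its home region.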

The third pass focuses on the hierarchy of the return constructs in the parsed query tree, and the TagTree structure is built to reflect the result construction semantics of the input query.

Like VarTree, TagTree is implemented as a recursive tree structure, but it is composed of RetNodes. A RetNode is the data structure for a result constructor and contains the tag name, a list of locally defined variables, and a list of children RetNodes. The query rewriter module of the ACE-XQ system uses the TagTree structure of the containing cached query when rewriting a new query.
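A TagTree in the sense described above could be modelled along the following lines. This is a minimal sketch: the field names `tag`, `local_vars` and `children`, the helper `tag_paths`, and the example tags are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RetNode:
    tag: str                                     # tag name of the constructor
    local_vars: List[str] = field(default_factory=list)
    children: List["RetNode"] = field(default_factory=list)


def tag_paths(node: RetNode, prefix: str = "") -> List[str]:
    """Enumerate the element paths of the result tree in pre-order."""
    path = f"{prefix}/{node.tag}"
    paths = [path]
    for child in node.children:
        paths.extend(tag_paths(child, path))
    return paths


# Hypothetical return structure: <result><entry><title/></entry></result>
result = RetNode("result", children=[
    RetNode("entry", ["$b"], [RetNode("title", ["$t"])]),
])
paths = tag_paths(result)
```

Enumerating element paths like this is one way a rewriter could locate where a cached variable's data sits in the cached result document.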

Finally, the constructed VarTree and TagTree structures are registered as the descriptor of the corresponding query region assigned in the cache system. These two tree structures make the query semantics explicit, and they provide the basis on which the proposed containment checking and query rewriting techniques are founded. The query decomposer is therefore necessary, since query containment is otherwise hard to reason about from the input query strings alone.

Query Containment Mapper. After the query decomposer module finishes the query pre-processing step, the query containment mapper performs containment checking between the new query and each cached query based on their corresponding VarTree structures. As explained in more detail in Chapter 3, our proposed containment checking technique is primarily a tree embedding process that “maps” each VarTree node of the new query to a corresponding node of a cached query.

We implement this tree embedding algorithm on top of the data structure designed for VarTree. At the macroscopic level, the tree embedding algorithm guides the top-down mapping between the VarNodes of two VarTrees; at the microscopic level, a mapping is established between a pair of VarNodes only if a containment relationship exists between their respective bound path expressions. In the implementation, the type inference function provided by the IPSI-XQ engine is called during the traversal of each VarTree to infer the type of each variable represented by a VarNode, based on the type inferred for the parent variable and the relative XPath expression.
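The two levels can be illustrated with a deliberately simplified Python sketch. It does not reproduce the type-inference-based check: as a stand-in, a path is a tuple of step names, and a cached path is assumed to contain a new path if both have the same length and every cached step is either `*` or equal to the new step. The child matching is greedy rather than a true maximum matching, and all names (`VarNode`, `path_contains`, `embed`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class VarNode:
    name: str
    path: Tuple[str, ...]           # simplified bound path: a tuple of steps
    children: List["VarNode"] = field(default_factory=list)


def path_contains(cached_path, new_path) -> bool:
    """Microscopic level (stand-in check): step-wise match with '*' wildcard."""
    if len(cached_path) != len(new_path):
        return False
    return all(c == "*" or c == n for c, n in zip(cached_path, new_path))


def embed(new_node: VarNode, cached_node: VarNode, mapping: Dict[str, str]):
    """Macroscopic level: map new-query nodes onto cached-query nodes top-down."""
    if not path_contains(cached_node.path, new_node.path):
        return mapping              # no mapping for this node or its subtree
    mapping[new_node.name] = cached_node.name
    used = set()                    # each cached child is used at most once
    for new_child in new_node.children:
        for i, cached_child in enumerate(cached_node.children):
            if i not in used and path_contains(cached_child.path, new_child.path):
                used.add(i)
                embed(new_child, cached_child, mapping)
                break
    return mapping


cached = VarNode("$x", ("bib", "book"),
                 [VarNode("$y", ("title",)), VarNode("$z", ("*",))])
new = VarNode("$b", ("bib", "book"),
              [VarNode("$t", ("title",)), VarNode("$p", ("price",))])
h = embed(new, cached, {})
```

Here `h` maps `$b` to `$x`, `$t` to `$y`, and `$p` to `$z`. Descending into children only after the parent pair has been mapped is what makes the procedure top-down.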

In addition, the containment checking is also conducted between the conditions contained in the respective Region nodes that the candidate variable pairs are associated with. All the other necessary checking steps indicated in the extended MAC mapping algorithm as shown in Figure 3.16 are also conducted.

Given a new query Q2 and a cached query Q1, our containment mapping procedure finally produces the maximum mapping h from the variables of Q2 to those of Q1. Depending on the mapping result obtained, we classify the containment cases into the following categories:

• Totally Contained, if the mapping h is a total injective mapping, namely, each variable in Q2 has a mapping variable in Q1. This means that the answers to Q2 are a subset of the answers to Q1. By rewriting Q2 with respect to the view structure of Q1, the answer to Q2 can be obtained directly from the local cache.

• Disjoint, if there is no mapping h that maps any variable in Q2 to a variable in Q1, for example, when the two queries involve different XML documents. This indicates that Q1 and Q2 share no answers. In this case, the cached answers are not useful for answering Q2, which is hence sent to the remote data server to be executed.

• Overlapping, if the mapping h maps some, but not all, of the variables in Q2 to variables in Q1. In this case, only partial answers to Q2 can be retrieved from the cache, and the rest has to be obtained by sending a remainder query to the remote data server.
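The three cases can be read off the mapping h mechanically, as the following Python sketch suggests. It assumes that h is given as a dictionary from Q2 variable names to Q1 variable names and that the mapper already guarantees injectivity; the function name `classify` and the example variable names are hypothetical.

```python
def classify(q2_vars, h):
    """Classify the containment case from the variables of Q2 and mapping h."""
    mapped = [v for v in q2_vars if v in h]
    if not mapped:
        return "disjoint"            # no variable of Q2 is mapped
    if len(mapped) == len(q2_vars):
        return "totally contained"   # h is total on the variables of Q2
    return "overlapping"             # only some variables are mapped


q2_vars = ["$b", "$t", "$p"]
total = classify(q2_vars, {"$b": "$x", "$t": "$y", "$p": "$z"})
partial = classify(q2_vars, {"$b": "$x", "$t": "$y"})
none = classify(q2_vars, {})
```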

Query Rewriter. If Q2 is determined to be totally contained within Q1 or overlapping with it, the query rewriter is called to rewrite Q2 based on the view structure of Q1. Unlike relational queries, which produce a flat output schema, an XQuery produces an output tree. Therefore, we use TagTree to represent the view tree structure of a query. To rewrite Q2 with respect to Q1, the query rewriter rewrites the variable binding expressions and return expressions by using the TagTree structure registered for Q1 and the containment mapping pairs produced by the query containment mapper.

In a nutshell, our contribution is a complete semantic caching framework for XQuery called ACE-XQ. The core of the ACE-XQ system consists of three tightly related procedures: XQuery decomposition (including the pre-processing steps), containment mapping, and query rewriting. These are developed based on the most current theoretical results from the literature but have unique features that distinguish them from the alternative techniques.

Chapter 8

Experimental Studies
