Table 3.2: Derived Function Table
new value from the maintenance results.
Finally, as a general rule for function design, we should design a small set of core workarea functions that can support a large class of other functions. The reason is simply that a derived function is much easier to implement, as only one scalar function is required. Such a design is also beneficial for view matching, as a small set of workarea functions in the view can be used to answer a variety of derived functions in the query. This workarea function model serves as the basis for aggregate view management, as we elaborate in the next few sections.
3.3 View Maintenance using Workarea Functions
We first explore how to incrementally maintain the view based on the workarea function model in Section 3.2.
3.3.1 View Creation
Given an aggregate function F in the view, we first need to determine how to maintain this function and whether any additional functions are necessary. Note that these additional functions need to be maintained as well.
All such information can be obtained from our workarea function model in Section 3.2. That is, if F is a workarea function, then we maintain it directly using $f_F^{+}$ and $f_F^{-}$ in Table 3.1. If F is a derived function, then we maintain it indirectly by first maintaining its workarea functions and then computing the new value of F from the maintained workarea functions.
Note that for both methods, some auxiliary workarea functions may need to be added to the view in order to incrementally maintain F. Furthermore, these auxiliary workarea functions have to be incrementally maintained as well, which may in turn require adding more functions.
In fact, such information can easily be pre-derived and stored by adding all necessary functions to the Auxiliary functions column of Table 3.1 and to the Workarea functions column of Table 3.2. For example, in Table 3.2, although the definition of Regr_Slope only needs the two workarea functions $NCov$ and $NVar_x$, other workarea functions, such as $Sum_x$ and $Sum_y$, are also included in order to incrementally compute $NCov$ and $NVar_x$.
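To make the Regr_Slope example concrete, the following is a minimal sketch of a workarea that can be maintained under inserts and deletes. The exact entries of Tables 3.1 and 3.2 are not reproduced in this section, so the choice of running sums below, the class name `SlopeWorkarea`, and the formulas $NCov = n \sum xy - \sum x \sum y$ and $NVar_x = n \sum x^2 - (\sum x)^2$ are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SlopeWorkarea:
    """Hypothetical workarea for regr_slope(y, x), kept as running sums."""
    n: int = 0
    sum_x: float = 0.0
    sum_y: float = 0.0
    sum_xy: float = 0.0
    sum_xx: float = 0.0

    def insert(self, y, x):
        # Plays the role of f+ : fold one inserted tuple into the workarea.
        self.n += 1
        self.sum_x += x
        self.sum_y += y
        self.sum_xy += x * y
        self.sum_xx += x * x

    def delete(self, y, x):
        # Plays the role of f- : remove one deleted tuple from the workarea.
        self.n -= 1
        self.sum_x -= x
        self.sum_y -= y
        self.sum_xy -= x * y
        self.sum_xx -= x * x

    def n_cov(self):
        # Assumed definition: n * Sum_xy - Sum_x * Sum_y
        return self.n * self.sum_xy - self.sum_x * self.sum_y

    def n_var_x(self):
        # Assumed definition: n * Sum_xx - Sum_x^2
        return self.n * self.sum_xx - self.sum_x ** 2

    def slope(self):
        # Derived function: a single scalar computation over the workarea.
        denom = self.n_var_x()
        return None if denom == 0 else self.n_cov() / denom
```

Under these assumptions, the slope never needs the base tuples again: every insert or delete only touches the running sums, and the derived value is recomputed from them on demand.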
Based on the above discussion, we need to add the following workarea functions to the view definition when creating an aggregate materialized view. For any workarea function F, we add its auxiliary functions (Table 3.1) to the view. For any derived function F, we add its workarea functions (Table 3.2) to the view. Note that this step can be done automatically using the workarea function model, without any user action. For example, when a user creates view (1.1) in Section 1.3.2, the view definition is automatically rewritten by adding Wslope, i.e., the workarea function for Regr_Slope, which consists of the six functions shown in Table 3.2.
CREATE VIEW SalesAnalysis′ AS
SELECT o_custkey, regr_slope(l_extendedprice, l_quantity) as qtyonprice, Wslope(l_extendedprice, l_quantity) as wa, count(*) as cnt
FROM lineitem, orders
WHERE l_orderkey = o_orderkey
GROUP BY o_custkey
(3.1)
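The automatic rewriting step can be sketched as a dependency lookup over the two tables. In the sketch below, the dictionaries `AUXILIARY` and `WORKAREA` are hypothetical stand-ins for Tables 3.1 and 3.2, and their entries are invented for illustration. The function computes the transitive closure explicitly, which is the work that pre-deriving the table entries would save at view creation time.

```python
# Hypothetical stand-in for Table 3.1: auxiliary functions needed to
# maintain each workarea function. The entries are assumptions.
AUXILIARY = {
    "ncov": ["sum_x", "sum_y", "sum_xy"],
    "nvar_x": ["sum_x", "sum_xx"],
}

# Hypothetical stand-in for Table 3.2: workarea functions from which each
# derived function is computed. The entries are assumptions.
WORKAREA = {
    "regr_slope": ["ncov", "nvar_x"],
}


def functions_to_add(view_functions):
    """Return the functions that must be added to the view, in discovery order."""
    seen = set(view_functions)
    pending = list(view_functions)
    required = []
    while pending:
        func = pending.pop(0)
        # A derived function needs its workarea functions; a workarea
        # function needs its auxiliary functions.
        for dep in WORKAREA.get(func, []) + AUXILIARY.get(func, []):
            if dep not in seen:
                seen.add(dep)
                required.append(dep)
                pending.append(dep)
    return required
```

With these invented entries, `functions_to_add(["regr_slope"])` yields the two functions named in the definition plus the sums needed to maintain them; the user-visible view definition is unchanged apart from the added columns.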
3.3.2 Incremental View Maintenance
In this section, we first describe the existing view maintenance framework and then show how to extend this framework to support the maintenance of complex aggregate functions. In [CGL+96, MQM97], the authors proposed to maintain views in two steps, namely the Propagate phase and the Apply phase, as shown in Figure 3.1. The Propagate phase computes the final delta from the base changes; the final delta represents the net effect of the changes on the view. The Apply phase integrates the final delta into the view.
We now describe how to extend this basic framework to support complex aggregate functions using our workarea function model.
Figure 3.1: Incremental View Maintenance Framework
Propagate Phase
The Propagate phase computes the final delta [MQM97] from the base changes; the final delta represents the net effect of the changes on the view. A number of algorithms [GL95, GMS93] discuss the propagation of deltas through each operator, such as select, join, and group-by.
Take view (3.1) as an example, and assume there are some inserts △L on the Lineitem table. The final delta is computed as in Query (3.2), which also includes the workarea functions as a result of the rewriting of view (3.1).
CREATE VIEW FinalDelta AS
SELECT o_custkey,
regr_slope(l_extendedprice, l_quantity), Wslope(l_extendedprice, l_quantity), count(*)
FROM △L, orders
WHERE l_orderkey = o_orderkey
GROUP BY o_custkey
(3.2)
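As a rough illustration of what Query (3.2) computes, the sketch below joins a batch of inserted lineitem rows with orders and groups the result by customer. Rows are modeled as plain dictionaries, and the workarea is assumed to consist of a count and four running sums (with x = l_quantity and y = l_extendedprice); both the function name `propagate_inserts` and this workarea layout are assumptions, not the contents of Table 3.2.

```python
from collections import defaultdict


def propagate_inserts(delta_lineitem, orders):
    """Compute a final delta: one workarea row per o_custkey touched by the inserts."""
    # Index orders by key so the equi-join on l_orderkey = o_orderkey is a lookup.
    custkey_of = {o["o_orderkey"]: o["o_custkey"] for o in orders}

    final_delta = defaultdict(
        lambda: {"cnt": 0, "sum_x": 0, "sum_y": 0, "sum_xy": 0, "sum_xx": 0}
    )
    for row in delta_lineitem:
        custkey = custkey_of.get(row["l_orderkey"])
        if custkey is None:
            continue  # inserted row has no join partner in orders
        x, y = row["l_quantity"], row["l_extendedprice"]
        group = final_delta[custkey]
        group["cnt"] += 1
        group["sum_x"] += x
        group["sum_y"] += y
        group["sum_xy"] += x * y
        group["sum_xx"] += x * x
    return dict(final_delta)
```

Only groups that actually receive inserted rows appear in the result, which is what keeps the later Apply step proportional to the size of the change rather than the size of the view.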
Apply Phase
The Apply phase evaluates an equi-join between the view and the final delta on the group-by columns. Note that a left outer join may be required, as an insert final delta may add new groups to the view. If all the tuples of a group are deleted, then that group should be deleted from the view, which is why COUNT(*) is required. If both the view and the final delta contain a tuple for the same group, these two tuples are combined to update the corresponding tuple in the view. For any workarea function F in Table 3.1, we can fetch the corresponding $f_F^{+}$ or $f_F^{-}$ to incrementally compute F. For any derived function F in Table 3.2, we can compute its new value from the maintained workarea functions.
A key feature of our maintenance framework is that, while the user still needs to implement all workarea-related functions, there is no need to hard-code any specific function into the maintenance algorithm. To add support for a new function, we just need to insert the corresponding entries into Tables 3.1 and 3.2. Clearly, such extensibility is a very useful feature for user-defined aggregate functions [WZ00].
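The Apply phase described above can be sketched over dictionaries keyed by the group-by column. The sketch assumes that every workarea column is a running sum, so that combining a view tuple with a delta tuple is plain addition or subtraction. The names `apply_delta` and `regr_slope`, and the column layout, are hypothetical.

```python
def apply_delta(view, final_delta, sign=+1):
    """Merge final_delta into view in place; sign=+1 for inserts, -1 for deletes."""
    for key, delta_row in final_delta.items():
        view_row = view.get(key)
        if view_row is None:
            # Outer-join case: an insert delta introduces a new group.
            if sign > 0:
                view[key] = dict(delta_row)
            continue
        # Matching group: combine the two tuples column by column.
        for name, value in delta_row.items():
            view_row[name] += sign * value
        # The count column tells us when a group has become empty.
        if view_row["cnt"] == 0:
            del view[key]
    return view


def regr_slope(row):
    """Derived value recomputed from the maintained workarea (assumed formulas)."""
    n_cov = row["cnt"] * row["sum_xy"] - row["sum_x"] * row["sum_y"]
    n_var_x = row["cnt"] * row["sum_xx"] - row["sum_x"] ** 2
    return None if n_var_x == 0 else n_cov / n_var_x
```

Functions whose workarea is not a simple sum would need their own combine step in place of the addition; the control flow for new groups and empty groups would stay the same.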
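The extensibility argument can be illustrated with a table-driven sketch in which the maintenance routines never mention a specific aggregate. The dictionaries `WORKAREA_TABLE` and `DERIVED_TABLE` are hypothetical stand-ins for Tables 3.1 and 3.2, and the registered functions (sum, count, avg, and a population variance) are chosen only for illustration.

```python
# Stand-in for Table 3.1: name -> (f_plus, f_minus), each taking the current
# state of that workarea function and one tuple value.
WORKAREA_TABLE = {
    "count": (lambda s, v: s + 1, lambda s, v: s - 1),
    "sum": (lambda s, v: s + v, lambda s, v: s - v),
}

# Stand-in for Table 3.2: name -> scalar function over the workarea state.
DERIVED_TABLE = {
    "avg": lambda w: w["sum"] / w["count"] if w["count"] else None,
}


def maintain(state, value, op):
    """Generic maintenance: look up f+ or f- for every workarea function in state."""
    index = 0 if op == "insert" else 1
    return {name: WORKAREA_TABLE[name][index](s, value) for name, s in state.items()}


def derive(name, state):
    """Generic derivation: apply the registered scalar function."""
    return DERIVED_TABLE[name](state)


# Supporting a new aggregate only requires new table entries;
# maintain() and derive() are left untouched.
WORKAREA_TABLE["sum_sq"] = (lambda s, v: s + v * v, lambda s, v: s - v * v)
DERIVED_TABLE["var_pop"] = lambda w: (
    w["sum_sq"] / w["count"] - (w["sum"] / w["count"]) ** 2 if w["count"] else None
)
```

For example, starting from `{"sum": 0, "count": 0, "sum_sq": 0}` and calling `maintain` once per inserted or deleted value keeps all three workarea functions current, after which `derive("avg", state)` and `derive("var_pop", state)` read off the derived values.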