2.5 Swapping Rules for GPIVOT and GUNPIVOT
2.5.1 Swapping Rules for GPIVOT/SELECT
Both the rules for moving GPIVOT up SELECT or the rules for moving GPIVOT down SELECT will be studied. To show completeness, we will first show all possible scenarios. Then we will describe if there is any ap-plicable swapping rule under that scenario. Note that Sections 2.5.1 to 2.5.9 all follow the same structure.
Pullup GPIVOT through SELECT
There are two possible scenarios, namely, if the select predicate is on the pivoted output columns or not.
1) If the select condition is defined on non-pivoted output columns, then
we can push it down without any changes such as the condition σCountry=‘U SA′
in Figure 2.7.
2) If the select condition involves pivoted output columns and is null-intolerant (i.e., is false when NULL), then pushing down the selection re-sults in multiple self-joins.
For instance, in Figure 2.7, in order to push down the condition σSony∗∗T V ∗∗
P rice>200, we first need to find its corresponding tuple in the original table by σM anu=Sony ∧ T ype=T V ∧ P rice>200(V ). Note that the results of this selec-tion however only contain the Sony TV price informaselec-tion. In order to get the pivot results, we need to perform a self-join with the original table to find other information about these countries. In combination, the GPIVOT pullup is rewritten as:
GP IV OT (πCountry(σM anu=Sony ∧ T ype=T V ∧ P rice>200(V )) ⊲⊳ V ).
More self-joins are required if more pivoted output columns are in-volved. Intuitively, each pivoted output column (and its selection) corre-sponds to one tuple (and its selection) in the original table. Hence a se-lect condition on multiple pivoted output columns corresponds to mul-tiple selections on different tuples. Furthermore, in order to make sure all the selections are satisfied, we need to perform more self-joins. For-mally, assume a selection predicate over two pivoted output columns as:
σ“ai1
1 ∗∗...ai1m∗∗Bl1” cp “ai21 ∗∗...ai2m∗∗Bl2”(GP IV OT[A[{(ai1,...,aim)}]
1,...,Am] on [B1,...,Bn](V )). Here
‘cp’ is a comparison operation, such as = or ≤. This select predicate can be pushed down based on the rule in Equation (7) (proof in Appendix A).
Here K1 and Bl11 mean the columns K and Bl1 in the left operand, while K2and Bl22 mean the columns K and Bl22 in the right operand.
σ“ai11 ∗∗...ai1m∗∗Bl1” cp “ai21 ∗∗...ai2m∗∗Bl2”(GP IV OT[A[{(ai1,...,aim)}]
Figure 2.7: Pullup GPIVOT through SELECT
The above rules can be easily extended to handle predicates with even more pivoted output columns and complex conjunctive or disjunctive con-ditions. To handle more pivoted output columns, we need to perform more self-joins as mentioned before. Each join is to find one pivoted output col-umn. The final join result provides the key values that satisfy the condition.
Conjunctive and disjunctive conditions can be achieved by unioning or in-tersecting these key values.
However, the benefit of pulling GPIVOT up is likely offset by such
mul-tiple self-joins since propagating changes through mulmul-tiple self-join expres-sions is non-trivial. That is, such propagation would generate multiple join terms [GMS93], which is rather expensive. One alternative to address this potential performance problem is that for those conditions that result in multiple self-joins if pushed down, we pull both SELECT and GPIVOT up the query tree and design special update propagation rules. We will de-scribe this technique in Section 2.6.3.
Push GPIVOT Down SELECT
There are three possible scenarios. Assume the input table is V (K, A1, ..., Am, B1, .., Bn) as before and GPIVOT pivots [A1, ..., Am] on [B1, ..., Bn]. The se-lect condition can be either on the columns of K, or on the pivoting columns in {Ai}, or on the value columns in {Bj}.
1) If the select condition is on K, such as ‘σcountry=U SA’ in Figure 2.8, then we can push GPIVOT down the SELECT operator without change.
VCR}]
Figure 2.8: Pushdown GPIVOT through SELECT
2) If the select condition is only on the pivoting columns in {Ai}, such as
‘σT ype=T V’, then the pushdown results in a PROJECT, which turns all
non-‘TV’ columns, in this case, ‘VCR’ related columns, into ⊥. The reason is that all other non-‘TV’ information will be filtered by the select. After that, we will need a SELECT to remove the rows that contain only ‘⊥’ columns, since by GPIVOT definition in Equation (3), GPIVOT does not output rows with all ⊥ values. More precisely, it becomes: ‘σnot all⊥(πcountry,Sony∗∗T V ∗∗P rice, Sony∗∗T V ∗∗Quantity, P anasonic∗∗T V ∗∗P rice, P anasonic∗∗T V ∗∗Quantity,⊥,⊥,⊥,⊥)’.
3) If the select condition is on the value columns in {Bj}, such as ‘σP rice=250’, then the pushdown results in a PROJECT, which sets the‘∗ ∗ P rice’column and the‘∗ ∗ Quantity’column with the same prefix to ⊥ if the‘∗ ∗ P rice’ col-umn does not equal 250. This is followed by a SELECT, which also re-moves the rows that contain only ⊥ columns. More precisely, it becomes
‘σnot all⊥(πcountry, case(Sony∗∗T V ∗∗P rice, Sony∗∗T V ∗∗Quantity),case(Sony∗∗V CR∗∗P rice, Sony∗∗V CR∗∗Quantity), case(P anasonic∗∗T V ∗∗P rice, P anasonic∗∗T V ∗∗Quantity), case(Sony∗∗
V CR∗∗P rice,Sony∗∗V CR∗∗Quantity))’. Here case(column1, column2) is a case ex-pression that if column1 does not equal to 250, then it outputs (⊥, ⊥), oth-erwise it outputs (column1,column2).
In combination of scenario 2), we now describe the pushdown rule in Equation (8).
GP IV OT[A[{(ai1,...,aim)}]
1,...,Am] on [B1,...,Bn](σAu=x∧Bv=y(V )) = σnot all⊥(πK,{case(“ai1
1 ∗∗...ai1u...∗∗ai1m∗∗B1”,...,“ai11 ∗∗...ai1u...∗∗ai1m∗∗Bn”)})(
GP IV OT[A[{(ai1,...,aim)}]
1,...,Am] on [B1,...,Bn](V )) (8)
Here the case expression,case(“ai11∗ ∗...aiu1...∗ ∗aim1 ∗ ∗B1”, ..., “ai11∗ ∗...aiu1...∗
∗aim1∗ ∗Bn”), outputs(“ai11∗ ∗...aiu1...∗ ∗aim1∗ ∗B1”, ..., “ai11∗ ∗...aiu1...∗ ∗aim1∗ ∗Bn”)only
whenaiu1= x ∧ “ai11∗ ∗...aiu1...∗ ∗aim1∗ ∗Bv” = y. Otherwise, it outputs (⊥, ..., ⊥).
Note that this rule can also easily be extended to handle more complex conditions, such as disjunctive conditions. For example, if the condition on the left side of Equation (8) is σAu=x∨Bv=y(V ), then the condition in the case expression on the right side becomesaiu1= x ∨ “ai11∗ ∗...aiu1...∗ ∗aim1∗ ∗Bv” = y.