Conclusion - From Massive Parallelization to Quantum Computing: Seven Novel Approaches to Query

Postgres optimizer selects locally optimal plans for the subqueries within a query. We left this mechanism in place as justiﬁed in Section 2.4, even if it weakens the formal approximation guarantees for queries that contain subqueries (TPC-H queries 2, 4, 7, 8, 9, 11, 13, 15, 16, 17, 18, 20, 21, 22). In practice, the approximation guarantees were only violated in one case (TPC-H query 7) and only for speciﬁc choices ofα (α = 1.15).

Figure 2.10 shows the results for bounded MOQO.EXAis compared againstIRA(instead ofRTA) since onlyIRAguarantees to generate query plans that respect all hard bounds if such plans exist. Optimization always considers all nine objectives while the number of bounds varies between three and nine. Figure 2.10 reports the number of iterations (instead of the number of Pareto plans), the reported numbers for memory consumption refer to the memory reserved in the last iteration (memory that was allocated before can be reused). The performance ofEXAis insensitive to the number of bounds. The performance ofIRAvaries with the number of bounds: Time and memory consumption tend to be higher when hard bounds are specified. This can be seen by comparing Figure 2.10 with Figure 2.9, asIRA behaves exactly likeRTAif no bounds are specified. The reason is thatIRAmay have to choose a much smaller internal approximation factor thanRTA, in order to verify if the best generated query plan is near-optimal among all plans respecting the bounds. The performance gap between approximate and exact MOQO is still significant: Summing over all test cases for bounded MOQO,EXAhad 464 timeouts while eachIRAinstance had at most 4 timeouts. The total optimization time was more than 1200 hours forEXAand less than 15 hours forIRA

withα = 1.15. The number of iterations ofIRAincreases sometimes with the user-deﬁned

approximation factor. If hard bounds are set then the internal approximation precision required to guarantee a near-optimal plan does not necessarily correlate with the user-defined precision. However, even if the number of iterations increases, the total optimization time is not influenced significantly (except for queries with very low total optimization time where overhead by repeated query preprocessing is non-negligible). This was the goal of our precision refinement policy.

2.9 Conclusion

Our MOQO approximation schemes find guaranteed near-optimal plans within seconds where exhaustive optimization takes hours. We analyzed the cost formulas of typical cost metrics in MOQO and identified common properties. We believe that our findings can be exploited for design and analysis of future MOQO algorithms.

50 100 50 100 T -outs (%) 50 100 102 104 106 102 104 106 O ptimization T ime (ms) 102 104 106 102 104 106 102 104 106 M emor y (KB) 102 104 106 0 50 100 0 50 100 # Iter ations 0 50 100 150 150 200 W -C ost (%) 1 4 6 22 12 13 14 15 16 17 19 20 3 11 18 10 21 2 5 7 9 8 150 200 # Bounds: 3 6 9 3 6 9 3 6 9 3 6 9 3 6 9

EXA With Timeouts IRA(α = 1.15) IRA(α = 1.5) IRA(α = 2) EXA Without Timeouts Time Limit

1 1 1 2 2 2 2 2 2 2 3 3 4 6 6

1 2 _{# Joined Tables}3 4 5 6 8

TPC-H Query Number

Figure 2.10 – Optimizer performance comparison for bounded MOQO using timeout of two hours

2.9. Conclusion

1: // Find best plan for query Q, weights W, bounds B 2: function EXACTMOQO(Q, W, B)

3: // Find Pareto plan set for Q 4: P ← FindParetoPlans(Q)

5: // Return best plan out of Pareto plans 6: return SelectBest(P ,W,B)

7: end function

8: // Find Pareto plan set for query Q 9: function FINDPARETOPLANS(Q) 10: // Calculate plans for singleton sets 11: for all q∈ Q do

12: Pq← 

13: for all pNaccess path for q do

14: Prune(Pq_{, p} N)

15: end for

16: end for

17: // Consider table sets of increasing cardinality 18: for all k∈ 2..|Q| do

19: for all q⊆ Q : |q| = k do

20: Pq← 

21: // For all possible splits of set q 22: for all q1, q2⊂ q : q1∪q˙ 2= q do 23: // For all sub-plans and operators 24: for all p1∈ Pq1, p2∈ Pq2, j∈ J do 25: // Construct new plan out of sub-plans 26: pN← Combine(j,p1, p2)

27: // Prune with new plan 28: Prune(Pq, pN) 29: end for 30: end for 31: end for 32: end for 33: returnPQ 34: end function

35: // Prune plan setP with new plan pN

36: procedure PRUNE(P ,pN)

37: // Check whether new plan useful 38: if¬∃p ∈ P : c(p) c(pN) then

39: // Delete dominated plans 40: P ← {p ∈ P | ¬(c(pN) c(p))}

41: // Insert new plan 42: P ← P ∪ {pN}

43: end if

44: end procedure

45: // Select best plan inP for weights W and bounds B 46: function SELECTBEST(P ,W,B)

47: PB← {p ∈ P | c(p) B} 48: if PB= then

49: return arg min[p∈ PB]CW(c(p)) 50: else

51: return arg min[p∈ P]CW(c(p)) 52: end if

53: end function

1: // FindαU-approximate plan for query Q, weights W 2: function RTA(Q, W,αU)

3: // FindαU-approximate Pareto plan set

4: P ← FindParetoPlans(Q,αU)

5: // Return best plan inP for inﬁnite bounds

6: return SelectBest(P ,W,∞)

7: end function

8: // FindαU-approximate Pareto plan set 9: function FINDPARETOPLANS(Q,αU)

// Derive internal precision fromαU

αi← |Q|αU

...

10: [13] // Prune access paths for single tables Prune(Pq, pN,αi)

...

11: [25] // Prune plans for non-singleton table sets Prune(Pq, pN,αi)

...

12: end function

13: // Prune setP with plan pNusing precisionαi 14: procedure PRUNE(P ,pN,αi)

15: // Check whether new plan useful

16: if¬∃p ∈ P : c(p) _αic(pN) then

...

17: end if

18: end procedure

Algorithm 2 – The Representative-Tradeoffs Algorithm: An approximation scheme for Weighted MOQO. The code shows only the differences to Algorithm 1.

2.9. Conclusion

1: // FindαU-approximate plan for query Q, 2: // weights W, bounds B

3: function IRA(Q, W, B,αU)

4: i← 0 // Initialize iteration counter

5: repeat

6: i← i + 1

7: // Choose precision for this iteration

8: α ← α2_U−i/(3l − 3)

9: // Findα-approximate Pareto plan set

10: P ← FindParetoPlans(Q,α)

11: // Select best plan inP

12: popt←SelectBest(P ,W,B) 13: untilp∈ P : c(p) αB ∧CW(c(p)) α < CW(c(popt)) αU 14: return popt 15: end function

Algorithm 3 – The Iterative-Reﬁnement Algorithm: An Approximation Scheme for Bounded- Weighted MOQO. The Code Uses Sub-Functions From Algorithm 2.

3 Incrementalization

The approximation schemes presented in the last chapter make multi-objective query optimization feasible for medium-sized queries. They require however users to specify their preferences before optimization starts. For users, it is often more convenient to select preferred execution cost tradeoffs in an interactive process. The algorithms presented in the last chapter are not suitable to support interactive query optimization. In this chapter, we will see an incremental anytime algorithm that divides optimization into many small incremental steps. This algorithm enables responsive user interfaces where users may integrate their preferences after each incremental step, thereby leading optimization quickly towards interesting parts of the cost space.

In document From Massive Parallelization to Quantum Computing: Seven Novel Approaches to Query Optimization (Page 55-61)