SAH-based Construction - The Surface Area Metric and Surface Area Heuristic

2.5 The Surface Area Metric and Surface Area Heuristic

2.5.3 SAH-based Construction

Construction of high-quality kd-trees and BVHs aims at reducing the SAM cost of the resulting tree. Besides introducing SAM, MacDonald and Booth [1989, 1990] also developed a greedy top-down construction algorithm for kd-trees which reduces SAM in a more direct manner than Goldsmith and Salmon’s algorithm. In contrast to Goldsmith and Salmon [1987], the output of their algorithm does not depend on the order of the input primitives. While MacDonald and Booth [1989, 1990] note that their approach could be directly applied to BVHs, Müller and Fellner [1999] were the first to present and evaluate a similar approach for BVHs. Müller and Fellner [1999] were not aware of the prior work of MacDonald and Booth [1989, 1990] and proposed a slightly simpler cost model applied to BVHs. Wald et al. [2007] presented a direct adaptation of MacDonald and Booth’s approach which, in contrast to Müller and Fellner’s approach, made the traversal performance of BVHs competitive with kd-trees.

On a high level the construction process is the same for kd-trees and BVHs. The problem-solving heuristic for the greedy algorithm is intuitively derived from Equation 2.35. It assumes that the set of input primitives $P$ embedded in a leaf node $n$ is split into a left leaf $l$ with primitives $P_l$ and a right leaf $r$ with primitives $P_r$ which share a common parent. The cost for this split is

\[ c_{split} = c_t + p_l |l| c_i + p_r |r| c_i, \tag{2.37} \]

where $p_x$ is the conditional probability from Equation 2.34 w.r.t. the original node. Construction aims at finding a partition of $P$ which gives the lowest $c_{split}$. The cost for not splitting $n$ is the cost for processing the node as a leaf:

\[ c_{leaf} = |n| c_i. \tag{2.38} \]

If the lowest $c_{split}$ is lower than $c_{leaf}$, the partition is executed and construction recursively continues on $P_l$ and $P_r$. The recursion terminates as soon as the best $c_{split}$ is larger than or equal to $c_{leaf}$. Considering that $p_l$ and $p_r$ are both w.r.t. the same parent node area $A_n$ for all split candidates, a common variant of Equation 2.37 avoids the involved division by multiplying with $A_n$:

\[ c_{split} = A_n c_t + A_l |l| c_i + A_r |r| c_i. \tag{2.39} \]

The adapted variant of Equation 2.38 for recursion termination is

\[ c_{leaf} = A_n |n| c_i. \tag{2.40} \]
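The two cost variants and the termination test can be illustrated with a minimal Python sketch. The function names, the box representation as a pair of corner tuples, and the example numbers are assumptions made for illustration only; they are not taken from any of the cited implementations.

```python
def surface_area(lo, hi):
    """Surface area of an axis-aligned box given by its min and max corners."""
    dx, dy, dz = (h - l for l, h in zip(lo, hi))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def split_cost(area_n, area_l, n_l, area_r, n_r, c_t, c_i):
    """Split cost in the form of Equation 2.37, with p_x = A_x / A_n."""
    return c_t + (area_l / area_n) * n_l * c_i + (area_r / area_n) * n_r * c_i


def leaf_cost(n, c_i):
    """Leaf cost as in Equation 2.38."""
    return n * c_i


def split_cost_scaled(area_n, area_l, n_l, area_r, n_r, c_t, c_i):
    """Division-free variant as in Equation 2.39 (Equation 2.37 times A_n)."""
    return area_n * c_t + area_l * n_l * c_i + area_r * n_r * c_i


def leaf_cost_scaled(area_n, n, c_i):
    """Matching leaf cost as in Equation 2.40."""
    return area_n * n * c_i


# Hypothetical example: a node box of size (6, 3, 1) holding 4 primitives,
# split at x = 3 into two children with 2 primitives each.
a_n = surface_area((0, 0, 0), (6, 3, 1))
a_l = surface_area((0, 0, 0), (3, 3, 1))
a_r = surface_area((3, 0, 0), (6, 3, 1))
do_split = split_cost(a_n, a_l, 2, a_r, 2, 0.5, 1.0) < leaf_cost(4, 1.0)
do_split_scaled = (split_cost_scaled(a_n, a_l, 2, a_r, 2, 0.5, 1.0)
                   < leaf_cost_scaled(a_n, 4, 1.0))
```

Since both sides of the comparison are scaled by the same positive factor $A_n$, the scaled and unscaled variants always reach the same split decision.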

The kd-tree and BVH construction processes differ in how the partition candidates are created. We first discuss partition creation for kd-trees and then proceed with BVHs.

kd-tree Partition Candidates. Finding the best partition for a kd-tree boils down to finding the split plane which results in the lowest cost. Figure 2.20 shows the behavior of the cost when sweeping a split plane along the coordinate axes for an example scene. We see that the cost $c(s)$ is a piecewise linear function of the split position $s$. Whenever the plane starts or stops intersecting a primitive there is a discontinuity. Primitive starts cause a discontinuous increase of $c(s)$ as primitives are split, while primitive ends cause a discontinuous decrease as they are not split anymore. Thus, for each dimension we only have to look for the minimum of $c_{split}$ at the $2|P|$ discontinuities caused by the primitives in $P$.

Wald and Havran [2006] presented an efficient implementation for finding the best partition candidate. For a fixed sweeping dimension a list $E$ of events is created which contains the discontinuities of $c(s)$. An event is a pair $(x_e, e)$, where $x_e \in \mathbb{R}$ is the position of the discontinuity and $e \in \{START, END\}$ is an event label which distinguishes between a primitive start and a primitive end. For each primitive a start event $(x_{START}, START)$ and an end event $(x_{END}, END)$ are created and added to $E$. Then the events in $E$ are sorted w.r.t. $x_e$ in ascending order; if two events have the same position, a start event goes before an end event. The next step is to extract split candidates from the sorted list. For this we track the numbers of primitives $L$ and $R$ on the left and right side of a partition by updating them in the order in which the events are encountered in the event list. Initially, $L$ is zero and $R$ is $|P|$. Now we iteratively inspect the sorted events. On a start event a primitive enters the left side, while on an end event a primitive leaves the right side. This gives the following simple update rules for $L$ and $R$ based on the event label $e$:

\[ \begin{aligned} e = START &: \; L \leftarrow L + 1 \\ e = END &: \; R \leftarrow R - 1. \end{aligned} \tag{2.41} \]

Figure 2.20: Example for the candidate cost function when sweeping the candidate plane along the x- and y-axis. The z-dimension is not depicted. The scene consists of three triangles where the scene bounding box has width, height, and depth (6, 3, 1). The pair of implementation-dependent constants $(c_t, c_i)$ is set to $(1/2, 1)$. From the given box geometry the cost function for split positions on the x-axis is $c_x(x) = \frac{1}{2} + |l| \frac{8x+6}{54} + |r| \left(1 - \frac{8x+6}{54}\right)$. For split positions on the y-axis the cost function is $c_y(y) = \frac{1}{2} + |l| \frac{14y+12}{54} + |r| \left(1 - \frac{14y+12}{54}\right)$. The dashed lines mark discontinuities at the start and end of primitives. Primitive starts cause a discontinuous increase of $c_x$ and $c_y$ as the primitive is split in two parts. At primitive ends $c_x$ and $c_y$ show a discontinuous decrease as the primitive is not split anymore. $c_x(x)$ has its minimum at $x = 3$ with $c_x(3) \approx 1.94$ and the minimum of $c_y(y)$ is at $y = 2$ with $c_y(2) \approx 2.76$. Thus, the best split candidate is the plane $x = 3$. With $c_i = 1$ the leaf cost $c_{leaf} = 3$ is simply the number of triangles. As the best split has a lower cost than the leaf, it would be chosen for splitting the node.

For every encountered event we compute $c_{split}$ with the current counts $L$ and $R$ and the bounds associated with this split. To compute the bounds we simply have to set the maximum of the left bounds and the minimum of the right bounds in the sweeping dimension to the plane position of the current event. The remaining bound limits are identical to the parent bounds. This process is repeated for the other two sweeping dimensions to find the overall best candidate. The described procedure finds the best candidate in the sorted list in $O(n)$. As the event list has to be sorted first, the overall complexity of this step is $O(n \log n)$. Wald and Havran [2006] also use a third event type for the case that a primitive lies completely in the splitting plane. We left this event out for simplicity. Recursively repeating this procedure on both sides of a partition to construct the whole kd-tree has complexity $O(n \log^2 n)$.
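The event sweep for a single dimension can be sketched in Python as follows. This is a simplified illustration of the procedure as described in the text, not the implementation of Wald and Havran [2006]: it omits the planar event, assumes that all primitive bounds lie inside the node bounds, and uses hypothetical names (`best_kd_split`, `box_area`) and an arbitrary choice of cost constants.

```python
START, END = 0, 1  # START < END, so start events sort first at equal positions


def box_area(lo, hi):
    """Surface area of an axis-aligned box given by its min and max corners."""
    dx, dy, dz = (h - l for l, h in zip(lo, hi))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def best_kd_split(prim_bounds, node_lo, node_hi, axis, c_t, c_i):
    """Return (cost, position) of the cheapest split plane along `axis`.

    prim_bounds is a list of (lo, hi) corner pairs, one per primitive.
    """
    # One start and one end event per primitive.
    events = []
    for lo, hi in prim_bounds:
        events.append((lo[axis], START))
        events.append((hi[axis], END))
    events.sort()  # ascending position, START before END on ties

    area_n = box_area(node_lo, node_hi)
    n_left, n_right = 0, len(prim_bounds)
    best_cost, best_pos = float("inf"), None
    for pos, label in events:
        # Update rules of Equation 2.41.
        if label == START:
            n_left += 1
        else:
            n_right -= 1
        # Child bounds equal the parent bounds except in the sweep dimension.
        left_hi = list(node_hi)
        left_hi[axis] = pos
        right_lo = list(node_lo)
        right_lo[axis] = pos
        cost = c_t + c_i * (box_area(node_lo, left_hi) * n_left
                            + box_area(right_lo, node_hi) * n_right) / area_n
        if cost < best_cost:
            best_cost, best_pos = cost, pos
    return best_cost, best_pos
```

Calling the function once per axis and keeping the cheapest result gives the overall best candidate, which is then compared against the leaf cost to decide whether the node is split at all.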

Figure 2.21: Example of generating candidate partitions for greedy SAH-based top-down BVH construction. First, the primitives are sorted w.r.t. a fixed dimension coordinate of the centroid of their AABBs. The initial partition contains the leftmost sorted primitive on its left side (blue) while all other primitives are put on the right side (red). Additional partitions are generated by iteratively removing a primitive from the right side and adding it to the left side in the order they appear in the sorted primitive list. This process is repeated for all coordinate dimensions. The candidate partition which gives the lowest SAH cost is chosen.

Wald and Havran [2006] proposed a modification which improves on this complexity at the cost of additional memory. For this approach the event lists for each sweeping dimension are created and sorted once for all input primitives before construction. Then, at construction time, the best candidates can be found in $O(n)$. When partitioning the primitives, the sorted event lists are partitioned too and merely have to be maintained such that the event order is preserved for each partition. This maintenance step only has $O(n)$ complexity, in contrast to the $O(n \log n)$ cost for sorting after each partitioning step. Constructing the whole kd-tree has $O(n \log n)$ complexity with this approach. As the sorting pre-process also has $O(n \log n)$ complexity, the overall complexity is $O(n \log n)$.
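The reason why maintaining the pre-sorted event lists is cheaper than re-sorting can be illustrated with a stable, single-pass split of a sorted list. The following Python sketch is a deliberately simplified assumption: events carry a primitive index as a third component, and primitives straddling the split plane are simply copied into both child lists. The clipping of straddling primitives and the merging of their newly generated events, as done by Wald and Havran [2006], are omitted. All names are hypothetical.

```python
LEFT, RIGHT, BOTH = 0, 1, 2


def classify(prim_intervals, split_pos):
    """Classify each primitive by its (lo, hi) extent on the split axis."""
    sides = []
    for lo, hi in prim_intervals:
        if hi <= split_pos:
            sides.append(LEFT)
        elif lo >= split_pos:
            sides.append(RIGHT)
        else:
            sides.append(BOTH)  # straddles the plane
    return sides


def split_events(sorted_events, sides):
    """Distribute sorted events (pos, label, prim_id) to the two children.

    A single pass in list order keeps both outputs sorted, so no
    re-sorting is needed: O(n) instead of O(n log n).
    """
    left, right = [], []
    for event in sorted_events:
        side = sides[event[2]]
        if side != RIGHT:
            left.append(event)
        if side != LEFT:
            right.append(event)
    return left, right
```

The same pass is applied to the event list of each sweeping dimension, using the classification obtained from the chosen split plane.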

BVH Partition Candidates. SAH-based BVH construction has to find a partition $(L, R)$ of $P$ which has the lowest cost $c_{split}$. With respect to the power set $\mathcal{P}(P)$ of $P$, the set of partition candidates of $P$ is

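\[ C_P = \{ (L, R) \in \mathcal{P}(P) \times \mathcal{P}(P) \setminus \{ (\emptyset, P), (P, \emptyset) \} \mid L \cup R = P \wedge L \cap R = \emptyset \}. \tag{2.42} \]

As for every $L \in \mathcal{P}(P) \setminus \{\emptyset, P\}$ there is exactly one $R \in \mathcal{P}(P)$ with $(L, R) \in C_P$, and as $(L, R)$ and $(R, L)$ describe the same partition, the number of candidate partitions is $|C_P| = |\mathcal{P}(P)|/2 - 1 = 2^{|P|-1} - 1$. Thus, in contrast to kd-tree construction, the number of candidates is exponential in the number of primitives, which is impractical. Most of these possible partitions are unreasonable as they do not provide a good spatial separation of the primitives.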
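The exponential growth of the candidate count can be checked by brute force for small inputs. The following Python sketch enumerates all partitions of a non-empty primitive set into two non-empty sides, counting $(L, R)$ and $(R, L)$ as the same partition; the function name is hypothetical and the enumeration is only meant to illustrate the count, not to be used for construction.

```python
from itertools import combinations


def candidate_partitions(prims):
    """Enumerate all partitions of a non-empty set into two non-empty sides."""
    prims = list(prims)
    first, rest = prims[0], prims[1:]
    partitions = []
    # Pinning the first primitive to the left side generates every
    # unordered partition exactly once. Choosing fewer than len(rest)
    # extra elements keeps the right side non-empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = {first, *extra}
            right = set(prims) - left
            partitions.append((left, right))
    return partitions


# Number of candidates for |P| = 1, ..., 10.
counts = [len(candidate_partitions(range(n))) for n in range(1, 11)]
```

Already for a few dozen primitives the number of candidates exceeds what can be evaluated exhaustively, which motivates restricting the search to a small, spatially meaningful subset.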

Instead of testing all partitions, which is infeasible, Wald et al. [2007] propose to test only a reasonable subset which should result in acceptable partitions. The idea is to sweep for candidates along each coordinate axis, similar to kd-trees in the previous section. First, all primitives are sorted w.r.t. the x-coordinate of the centroid of their AABBs. To reduce memory traffic the algorithm works on primitive references. A primitive reference is a pair $(i, B_i)$ where $i$ and $B_i$ are the index and the bounds of the referenced primitive. For the first partition the leftmost sorted primitive is put on the left side while all other primitives are put on the right side. The remaining partitions are generated by iteratively removing a primitive from the right side and adding it to the left side in the order they appear in the sorted primitive list. This results in a total of $|P| - 1$ partitions.

Every iteration also has to keep track of the partition bounds to compute the SAH cost. Let $B_1, \ldots, B_{|P|}$ be the array of sorted reference bounds and let $B_a \oplus B_b$ denote the grow operation which computes tight bounds for the given bounds $B_a$ and $B_b$. Then the left bounds $B_l^k$ of the $k$-th partition are $B_l^k = \bigoplus_{j=1}^{k} B_j$. The respective right bounds are $B_r^k = \bigoplus_{j=k+1}^{|P|} B_j$. For the left side the partition bounds $B_l^k$ can be incrementally updated when adding primitives by simply computing $B_l^k = B_l^{k-1} \oplus B_k$, where $B_l^1 = B_1$. For the bounds $B_r^k$ of the right side it is not possible to simply remove the bounds of the removed primitive from $B_r^{k-1}$. An efficient way to compute all right partition bounds is to perform a scan from the right on the sorted bounds $B_1, \ldots, B_{|P|}$ using the grow operator $\oplus$. That is, we compute $B_r^k = B_r^{k+1} \oplus B_{k+1}$, where $B_r^{|P|-1} = B_{|P|}$. The resulting array of right partition bounds has to be stored in additional memory.

Figure 2.22: Example for a scene which has been subdivided into B = 4 bins with three equidistant split planes (red) along the x-axis, between $x_{min}$ and $x_{max}$.

The partition index $k$ is also the number of primitives on the left side of the partition. As BVH construction does not split primitives, the number of primitives on the right side of the $k$-th partition is simply $|P| - k$. Now we have all required information to compute the SAH cost of all partitions. The candidate partition which gives the lowest SAH cost is chosen. This process is repeated for the y- and z-axis and the overall best candidate is selected. From Figure 2.21 it can already be seen that this procedure can generate partitions with reasonable spatial separation. As with kd-tree construction, this sorting-based candidate determination has $O(n \log n)$ complexity, resulting in an overall complexity of $O(n \log^2 n)$. It is possible to adapt the concepts of the $O(n \log n)$ approach for kd-tree construction to BVHs to also achieve $O(n \log n)$ SAH-based BVH construction.
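The sweep for one axis, including the right-to-left scan for the right partition bounds, can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation of Wald et al. [2007]: boxes are pairs of corner tuples, the node bounds are assumed to have non-zero surface area, and the names `grow`, `area`, and `best_bvh_split` as well as the cost constants are hypothetical.

```python
def grow(a, b):
    """Tight bounds enclosing the boxes a and b; a box is a (lo, hi) pair."""
    return (tuple(map(min, a[0], b[0])), tuple(map(max, a[1], b[1])))


def area(box):
    """Surface area of an axis-aligned box."""
    dx, dy, dz = (h - l for l, h in zip(box[0], box[1]))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def best_bvh_split(boxes, axis, c_t, c_i):
    """Return (cost, left_ids, right_ids) of the cheapest sweep partition."""
    n = len(boxes)
    if n < 2:
        raise ValueError("need at least two primitives to partition")
    # Sort primitive references by the AABB centroid along the sweep axis.
    order = sorted(range(n), key=lambda i: boxes[i][0][axis] + boxes[i][1][axis])
    b = [boxes[i] for i in order]

    # Scan from the right: right[k] bounds b[k:], i.e. the right side of
    # the partition with k primitives on the left.
    right = [None] * n
    right[n - 1] = b[n - 1]
    for k in range(n - 2, 0, -1):
        right[k] = grow(right[k + 1], b[k])

    area_n = area(grow(b[0], right[1]))
    best_cost, best_k = float("inf"), None
    left = b[0]
    for k in range(1, n):  # k primitives on the left, n - k on the right
        if k > 1:
            left = grow(left, b[k - 1])  # incremental update of the left bounds
        cost = c_t + c_i * (area(left) * k + area(right[k]) * (n - k)) / area_n
        if cost < best_cost:
            best_cost, best_k = cost, k
    return best_cost, order[:best_k], order[best_k:]
```

Sorting by the sum of the minimum and maximum coordinate is equivalent to sorting by the centroid, since the centroid is half that sum. Running the function for all three axes and keeping the cheapest result yields the overall best candidate.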

In document Higher Performance Traversal and Construction of Tree-Based Raytracing Acceleration Structures (Page 49-53)