2.4.3 Reducing the Complexity
Because of the large number of shadow polygons that are processed, there has been much research to significantly reduce such complexity. The optimizations can basically be categorized into three approaches that can be used in combination: a per-mesh silhouette approach, a global scene analysis optimization, and a hybrid shadow depth map approach.
These discussions are useful for both CPU and GPU versions of the shadow volume algorithm. On the GPU, this allows fill-rate reduction of the stencil buffer, which is crucial for performance.
Per-mesh Silhouette Approach
Many efficient implementations [46, 37, 48, 165] eliminate redundant shadow polygons within a single polygonal mesh by exploiting some silhouette identification (see Figure 2.26, where shadow polygons are generated only at the silhouette of the object). These silhouettes mainly correspond to the polygon edges whose adjacent polygons have $\hat{N} \cdot \hat{L}$ values of opposite sign (the view-dependent boundary case) and also to edges that are not shared with any other polygon (the view-independent boundary case). Such polygon edges are the only edges that require the generation of shadow polygons, i.e., internal polygon edges do not need shadow-polygon generation. Such an approach can potentially reduce the number of shadow polygons quite significantly.
Figure 2.26. Reduced number of shadow polygons generated as a result of the object silhouette. Image courtesy of Jerome Guinot/Geeks3D.com.
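The silhouette test described above can be sketched in a few lines. This is only an illustrative sketch, not the method of any of the cited implementations: the function names, the indexed-triangle mesh layout, and the use of a point light position are all assumptions made for the example. An edge is reported when its two adjacent triangles disagree on the sign of $\hat{N} \cdot \hat{L}$, or when it belongs to a single triangle (an open edge).

```python
from collections import defaultdict

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def faces_light(vertices, tri, light_pos):
    """True if N.L > 0, with N the (unnormalized) triangle normal and
    L the vector from the triangle toward the light (hypothetical helper)."""
    p0, p1, p2 = (vertices[i] for i in tri)
    normal = cross(sub(p1, p0), sub(p2, p0))
    return dot(normal, sub(light_pos, p0)) > 0.0

def silhouette_edges(vertices, triangles, light_pos):
    """Return the sorted list of edges (as vertex-index pairs) that need
    shadow polygons: sign changes of N.L, plus open edges."""
    edge_faces = defaultdict(list)
    for tri in triangles:
        lit = faces_light(vertices, tri, light_pos)
        for k in range(3):
            a, b = tri[k], tri[(k + 1) % 3]
            edge_faces[(min(a, b), max(a, b))].append(lit)
    result = []
    for edge, lit_flags in edge_faces.items():
        if len(lit_flags) == 1:
            result.append(edge)          # open edge: always a boundary
        elif len(lit_flags) == 2 and lit_flags[0] != lit_flags[1]:
            result.append(edge)          # N.L changes sign across the edge
    return sorted(result)

# A tetrahedron lit from below its base: only the three base edges remain.
verts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
tris = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
print(silhouette_edges(verts, tris, (0.2, 0.2, -5.0)))
```

In this small example, three of the six edges are internal to the unlit side of the mesh and generate no shadow polygons. Edges shared by more than two triangles are ignored by this sketch, which is one reason it applies only to 2-manifold (or open) meshes.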
As a side note, two observations about silhouettes are interesting to mention. First, the degree of a silhouette vertex is even [6], where the degree n means the vertex on such a silhouette can be connected by n silhouette edges. Second, another observation from McGuire [399] is that in many triangle meshes consisting of f triangles, the number of silhouette edges is approximately $f^{0.8}$.
Note that the above silhouette optimizations are only accurate for 2-manifold shadow casters, and there have been attempts at improving the generality of the meshes [46, 8, 400, 306]. Bergeron [46] indicates that silhouette edges with two adjacent triangles should increment/decrement the shadow count by 2, whereas open edges should increment/decrement it by 1. To generalize this approach even more, Aldridge and Wood [8] and Kim et al. [306] compute the multiplicity of each shadow polygon, and the shadow count is incremented/decremented by this multiplicity value (although the usual case is either 2 or 1, as in the case identified by Bergeron). While supporting non-manifold shadow casters, these approaches require storage of the multiplicity value per shadow polygon, and a GPU stencil buffer implementation is more complicated because the standard stencil buffer only supports incrementing/decrementing by 1; additionally, the risk of exceeding the stencil buffer’s 8-bit limit (shadow counts above 255) becomes higher. To mitigate the above issues, for the case of increments/decrements of 2, creating double-quads is a possible brute-force approach. Another possibility is proposed by McGuire [400], who experiments with additive blending to a color buffer.
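The effect of multiplicity on the shadow count can be illustrated with a small CPU-side sketch. It assumes that the multiplicities are already known, and that each shadow polygon crossed between the viewpoint and the point being shaded is given as a hypothetical (front_facing, multiplicity) pair; how the cited papers actually derive the multiplicity is not reproduced here. The sketch also flags when the running count would exceed what an 8-bit stencil buffer can hold.

```python
def shadow_count(crossings, max_count=255):
    """Accumulate a shadow count from (front_facing, multiplicity) pairs.

    Front-facing shadow polygons add their multiplicity, back-facing ones
    subtract it. Returns (count, overflowed), where overflowed is True if
    the running count ever exceeded max_count (255 for 8 bits).
    """
    count = 0
    overflowed = False
    for front_facing, multiplicity in crossings:
        count += multiplicity if front_facing else -multiplicity
        if count > max_count:
            overflowed = True
    return count, overflowed

# Enter two volumes (multiplicity 2 and 1), then leave the first one.
print(shadow_count([(True, 2), (True, 1), (False, 2)]))
```

With increments of 2, only 128 nested front-facing crossings are needed to pass 255, compared with 256 for unit increments, which is the overflow risk mentioned above.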
The shadow count also needs to consider some portions of the original mesh (i.e., the light cap) within the shadow volume polygons. However, those polygons can be culled when they are irrelevant to the shadow-count computations [345, 397]. For example, if the occluder is between the light and the viewpoint, and the view direction is pointing away from the occluder, these polygons need not be considered for the shadow count. Similarly, if the viewpoint is between the occluder and the light and the view direction is pointing away from the occluder, then these polygons do not need to be considered for the shadow-count computation. In fact, in the latter case, all the shadow (silhouette) polygons for this mesh can be culled from any shadow-count computation. This work can reduce the fill rate for GPU-based solutions even more.
Global Scene Analysis Optimizations
Beyond optimizing for just a single mesh, conservative occlusion culling techniques as discussed by Cohen-Or et al. [106] can be used as a global scene analysis optimization technique. Additional specific examples are elaborated below.
Slater [537] points out that a shadow volume completely enclosed within another shadow volume can be eliminated. In other words, by processing shadow volumes in a front-to-back order from the light source, simple configurations of shadow volume clipping can reduce the extent of shadow volumes closer to the light source. This can be seen in Figure 2.27, where the shadow volume of occluder A can eliminate the need for the shadow volume for occluder B as a result of this sorting.
Fill rates can be further improved by reducing the size of the shadow polygons. For example, the intensity of the light diminishes with the square of the distance from the light; thus, shadow polygons can be clipped at a distance where light intensity becomes so small that it does not affect shading [345, 397]. Similarly, the shadow polygons can be clipped beyond a certain camera depth range [398] because those points are not easily visible. Both techniques may result in shadow errors if there are many shadow-casting lights, because the per-light errors may accumulate (see Section 2.5.3 for additional implications of handling many lights).
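As a worked example of the attenuation-based clipping distance, assume a pure inverse-square falloff, so that a light of intensity $I_0$ contributes $I_0 / d^2$ at distance $d$, and assume a cutoff threshold $\varepsilon$ below which the contribution is treated as negligible. Both the falloff model and the names below are assumptions made for illustration; real attenuation models often add constant and linear terms. Solving $I_0 / d^2 = \varepsilon$ gives $d = \sqrt{I_0 / \varepsilon}$.

```python
import math

def attenuation_clip_distance(intensity, threshold):
    """Distance beyond which intensity / d**2 falls below threshold,
    assuming pure inverse-square falloff (an assumption of this sketch)."""
    if intensity <= 0.0 or threshold <= 0.0:
        raise ValueError("intensity and threshold must be positive")
    return math.sqrt(intensity / threshold)

# A light of intensity 4.0 with a cutoff of 0.25 can have its shadow
# polygons clipped at distance 4.0 from the light.
print(attenuation_clip_distance(4.0, 0.25))
```

Because the distance grows only with the square root of $I_0 / \varepsilon$, lowering the threshold by a factor of 100 extends the shadow polygons by a factor of 10.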
Another reduction of shadow polygon size can be achieved by clipping the shadow polygons to exactly fit to camera-visible objects [357, 155], a tool referred to as scissors [346]. A bounding volume hierarchy can also be traversed front-to-back to accelerate the determination [631, 565] of the pruning suggested in [537, 357, 155] because entire nodes (in the hierarchical bounding volumes) can be pruned along with all the leaf nodes that reside under the node.
In another optimization, a hierarchical shadow volume algorithm [3] is presented, where the screen is divided up into many tiles of 8 × 8 pixels. If the shadow polygons do not intersect the bounding box formed by the objects within this 8 × 8 tile, then the points within this 8 × 8 tile are either entirely in shadow or entirely in light, and the fill rate can be reduced because the stencil value for this 8 × 8 tile will be the same, and can be computed with any arbitrary ray through the 8 × 8 tile.
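The tile test can be sketched on the CPU as follows. This is a simplified illustration rather than the algorithm of [3]: it assumes each tile's depth range is already known, and it replaces the exact polygon-versus-box intersection with a test against each shadow polygon's axis-aligned bounding box in screen space and depth. That substitution is conservative, since it can mark extra tiles as boundary tiles but never misses a real intersection. All names and data layouts are hypothetical.

```python
TILE = 8  # tile size in pixels

def boxes_overlap(a, b):
    """a, b: ((xmin, ymin, zmin), (xmax, ymax, zmax)) axis-aligned boxes."""
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(3))

def classify_tiles(tile_depth_ranges, shadow_polygon_boxes):
    """Label each tile 'boundary' (needs per-pixel stencil work) or
    'uniform' (one shadow test along any ray through the tile suffices).

    tile_depth_ranges: dict mapping (tx, ty) -> (zmin, zmax) of the
        visible geometry in that tile.
    shadow_polygon_boxes: list of boxes bounding each shadow polygon.
    """
    labels = {}
    for (tx, ty), (zmin, zmax) in tile_depth_ranges.items():
        tile_box = ((tx * TILE, ty * TILE, zmin),
                    ((tx + 1) * TILE, (ty + 1) * TILE, zmax))
        hit = any(boxes_overlap(tile_box, box) for box in shadow_polygon_boxes)
        labels[(tx, ty)] = "boundary" if hit else "uniform"
    return labels

tiles = {(0, 0): (0.2, 0.4), (1, 0): (0.2, 0.4)}
polys = [((2, 2, 0.3), (5, 5, 0.35)),    # overlaps tile (0, 0)
         ((9, 1, 0.6), (12, 4, 0.9))]    # over tile (1, 0) but behind it
print(classify_tiles(tiles, polys))
```

The second shadow polygon covers pixels of tile (1, 0) but lies entirely behind the tile's depth range, so that tile remains uniform and needs only one shadow evaluation.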
Figure 2.27. Occluder A can eliminate the need for the enclosed shadow volume from occluder B. (Diagram labels: light L, shadow polygons, occluder A, occluder B.)
Fawad [169] proposes a lower level of detail in certain cases for the silhouette computation, thus reducing the time needed for the computation of the silhouette as well as the complexity of the number of shadow polygons. In addition, temporal coherence is applied to the next frame to further reduce the silhouette computation time. While this sounds theoretically interesting, self-shadowing is likely going to be a difficult problem to get around due to the different levels of detail. Zioma [675] provides a partial solution for closed surfaces, where front-facing (to the light) polygons are avoided for near capping, although self-shadowing concerns remain at the silhouette polygons.
Hybrid Shadow Depth Map
An interesting contribution in reducing the complexity of the general shadow volume approach comes from McCool [393]. A standard shadow depth map is first created, and then an adapted 2D edge detector is applied on the shadow depth map to identify the edges that form the only polygons casting shadow volumes. This scheme results in no overlapping shadow volumes (which others [37, 605] have tried to achieve through geometric clipping of volumes, but the computations are very numerically unstable); thus, only a 1-bit stencil buffer on the GPU is necessary to process the shadow volumes. When visualizing static scenes (i.e., the shadow depth map and edge detector preprocessing are not needed per frame), the actual shadow volume rendering should be very fast. However, the extra overhead in converting the shadow depth map to shadow volumes via an edge detector can be slow. The shadow depth map resolution limits how well these extracted edges capture smaller details and can lead to some aliasing artifacts in the resulting shadow volumes.
In the hybrid shadow-rendering algorithm proposed by Chan and Durand [84], the shadow depth map is initially generated. For the non-silhouette regions, it can be determined whether the region is entirely in shadow or entirely lit. For the silhouette regions, shadow volumes are used to determine exact shadowing, thus significantly reducing the fill rates because only silhouette regions are being considered. Unfortunately, this algorithm also has the same concerns as the above approach by McCool.
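To make the edge-detection step concrete, the following sketch marks depth-map pixels that sit on a depth discontinuity. It is a deliberately simple stand-in, not McCool's adapted edge detector: it only compares each pixel with its right and lower neighbors against a hypothetical threshold, and it stops before the step that turns the detected edges into shadow volume polygons.

```python
def depth_edges(depth_map, threshold):
    """Return a boolean grid marking pixels adjacent to a depth jump.

    depth_map: list of rows of depth values (a shadow depth map).
    threshold: minimum absolute depth difference treated as an edge
        (an assumed tuning parameter).
    """
    rows, cols = len(depth_map), len(depth_map[0])
    edges = [[False] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            d = depth_map[y][x]
            # Horizontal neighbor: mark both pixels straddling the jump.
            if x + 1 < cols and abs(depth_map[y][x + 1] - d) > threshold:
                edges[y][x] = edges[y][x + 1] = True
            # Vertical neighbor: same test one row down.
            if y + 1 < rows and abs(depth_map[y + 1][x] - d) > threshold:
                edges[y][x] = edges[y + 1][x] = True
    return edges

# One row with a far surface (1.0) next to a nearer occluder (0.5).
print(depth_edges([[1.0, 1.0, 0.5, 0.5]], 0.1))
```

Since edges are found per texel, details smaller than a texel cannot produce an edge at all, which is the resolution limit on captured detail noted above.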