
Chapter 3 Cross-modal perception and applications to rendering

3.4 Cognitive resources and limitations

3.6.1 Visual selective rendering

Yee et al. [YPG01] proposed a selective rendering framework that uses saliency estimates to reduce rendering times. For every frame, a spatio-temporal error tolerance map and a saliency map enriched with a motion channel are created. The former is constructed from velocity-dependent contrast sensitivity values and saliency estimates. Both maps are built from low-quality renderings of the frame, computed either by rasterisation or by evaluating only the direct illumination component (see section 5.2.2) of the rendering equation. The error tolerance map and the saliency map are then merged to yield what the authors term the Aleph map, which guides the allocation of computational resources in the main rendering step.

In another study, Haber et al. [HMYS01] proposed a perceptually guided methodology for exploring virtual scenes in real time using a corrective splatting algorithm. Before the main computation step, the algorithm approximates a solution to the rendering equation (see section 5.2.2) by tracing particles through the scene. During the main computation step, the remaining computational resources are allocated to improving the quality of non-diffuse regions in the virtual scene. These important regions are identified using a saliency map computed from the low-quality image obtained in the pre-computation step. The authors use Itti and Koch's saliency model [IKN98] in their implementation.

Cater et al. [CCL02, Cat04] demonstrated in practice how the phenomenon of Inattentional Blindness (see section 3.2) can be exploited to reduce computation times when rendering animations. In their framework, all areas related to an assigned task were rendered at high quality, whilst the non-relevant regions were rendered at significantly lower quality. The authors conducted psychophysical experiments and validated that subjects who performed the task failed to perceive the low-quality areas of the frames that composed the animation. This study demonstrated that the top-down visual attention mechanism can be successfully utilised in the development of selective rendering frameworks.

In a related study, Cater et al. [CCW03] imitated the HVS's tendency to ignore task-irrelevant objects and encoded it in what they term a Task map. Specifically, in their framework, a first-order rendering of the frame is computed, which serves as the starting point for extracting object- and motion-related information. The extracted data are combined with information from a data structure that compactly encodes the importance of each object in the scene, and the result is the task map. The authors use the task map, together with the draft rendering of the image and a contrast sensitivity model, to render every frame of the animation progressively. Their results show a significant speed-up compared to classic rendering techniques.
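
The construction of a task map from per-object importance can be illustrated with a minimal sketch. It assumes that the draft rendering yields an object identifier for each pixel and that the importance data structure is a simple dictionary; the names build_task_map, object_ids and object_importance are hypothetical, and the motion-related information used in [CCW03] is omitted.

```python
def build_task_map(object_ids, object_importance, default=0.0):
    """Map each pixel's object identifier to a task importance value.

    object_ids: 2D list holding, per pixel, the identifier of the visible
        object (assumed to come from a draft rendering of the frame).
    object_importance: dict from object identifier to importance in [0, 1]
        (a hypothetical stand-in for the per-object importance structure).
    default: importance given to objects that are not listed, that is,
        objects treated as irrelevant to the task.
    """
    return [
        [object_importance.get(obj, default) for obj in row]
        for row in object_ids
    ]


# A 2x2 frame in which only the extinguisher is relevant to the task.
ids = [["wall", "extinguisher"],
       ["floor", "extinguisher"]]
importance = {"extinguisher": 1.0}
task_map = build_task_map(ids, importance)
```

In this sketch, pixels showing task-relevant objects receive a high value and all other pixels fall back to the default, so a renderer could spend more effort where the task map is high.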

Sundstedt et al. [SDLC05] combined the task map as defined by Cater et al. [CCW03] with the saliency map from Itti et al. [IKN98] to construct what the authors call an Importance map. This map applies different weighting coefficients to the task map and the saliency map in order to combine information from both top-down and bottom-up visual attention. Specifically, the importance map is denoted as IM(ω_T, ω_S), where ω_T and ω_S are the weighting coefficients applied to the task map and the saliency map respectively. For instance, a saliency map can be written as IM(0, 1) and a task map as IM(1, 0). An importance map that contains equal contributions from the two maps can be written as IM(1/2, 1/2). This work demonstrated that importance maps can be used to render animations at Selective Quality (SQ) with no perceptual difference compared to versions of the images computed uniformly at High Quality (HQ).
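
The weighting described above can be sketched in a few lines. The sketch assumes that the two maps are combined by a per-pixel weighted sum, which the text does not state explicitly; the function name importance_map and the parameters w_t and w_s (standing for the two weighting coefficients) are hypothetical.

```python
def importance_map(task_map, saliency_map, w_t, w_s):
    """Combine a task map and a saliency map pixel by pixel.

    Assumption: the combination is a weighted sum, IM = w_t * T + w_s * S.
    Both maps are 2D lists of equal size with values in [0, 1].
    """
    if len(task_map) != len(saliency_map):
        raise ValueError("maps must have the same number of rows")
    combined = []
    for t_row, s_row in zip(task_map, saliency_map):
        if len(t_row) != len(s_row):
            raise ValueError("maps must have the same number of columns")
        combined.append([w_t * t + w_s * s for t, s in zip(t_row, s_row)])
    return combined


task = [[1.0, 0.0], [0.0, 0.0]]
saliency = [[0.2, 0.8], [0.4, 0.0]]

only_saliency = importance_map(task, saliency, 0, 1)   # IM(0, 1)
only_task = importance_map(task, saliency, 1, 0)       # IM(1, 0)
balanced = importance_map(task, saliency, 0.5, 0.5)    # IM(1/2, 1/2)
```

Under this assumption, IM(0, 1) reproduces the saliency map, IM(1, 0) reproduces the task map, and IM(1/2, 1/2) averages the two.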

Work published in 2005 [DC05, Deb06] applied path tracing to compute a global illumination solution using adaptive sampling on the components of the scene. This methodology was termed Component-based adaptive sampling. Earlier work had used adaptive sampling at the pixel level to reduce rendering times: radiance differences are calculated at pixels to estimate areas of high variance, where more rays need to be shot than in the rest of the scene. The authors instead applied the adaptive sampling technique to the reflectance properties of the objects that compose the scene, rather than at the pixel level, and showed a significant reduction in rendering times compared to previously used adaptive methods.
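
For comparison, the earlier pixel-level scheme can be sketched as follows. This is a minimal illustration of variance-driven sample allocation, not the component-based method of [DC05, Deb06]; the function name allocate_extra_samples, the proportional allocation rule and the sample values are assumptions made for the example.

```python
from statistics import pvariance


def allocate_extra_samples(pixel_samples, budget):
    """Distribute extra rays over pixels in proportion to radiance variance.

    pixel_samples: one list of initial radiance samples per pixel.
    budget: number of additional rays to distribute (rounding may make
        the total differ slightly from this value).
    """
    variances = [pvariance(samples) for samples in pixel_samples]
    total = sum(variances)
    if total == 0:
        # Every pixel has already converged, so no extra rays are needed.
        return [0] * len(pixel_samples)
    return [round(budget * v / total) for v in variances]


initial = [
    [0.5, 0.5, 0.5, 0.5],      # flat region, zero variance
    [0.0, 1.0, 0.0, 1.0],      # strong discontinuity, high variance
    [0.25, 0.75, 0.25, 0.75],  # moderate variation
]
extra = allocate_extra_samples(initial, budget=10)
```

Pixels whose initial samples agree receive no further rays, while pixels with large radiance differences receive most of the budget.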

Longhurst et al. [LDC06] developed a GPU saliency model for selective rendering applications. Their work included a series of novel contributions related to the selective rendering methodology and to the estimation of visual saliency itself. Primarily, the model was implemented on a GPU, giving results around seventy times faster than the conventional CPU implementation. The authors concentrated on identifying salient features at every pixel of the image rather than on discovering which region was attended first. In the saliency model, the orientation channel was replaced by an edge detector, and the implementation also included a motion channel as proposed by Yee et al. [YPG01]. Furthermore, the authors added a habituation channel to account for the effect of the HVS's short-term memory. This habituation module preserves the saliency of a virtual object for some time, depending on its appearance in successive frames of an animation, and then suppresses its saliency for about ten seconds to simulate the effect of temporal memory.
