Related Work - Improving the performance of video based reconstruction and validating it within

Exact Polyhedral Visual Hulls (EPVH) [39] is a state-of-the-art polyhedral reconstruction algorithm that creates a guaranteed watertight, manifold polyhedral representation of the visual hull. A polyhedral visual hull may have a texture applied to it that is derived from the camera images, and special consideration needs to be made for the mapping of textures when there are several candidate cameras; [66] and [97] describe the mapping of textures from multiple cameras onto such forms.
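
As a minimal sketch of the choice that arises when several cameras can see the same face of the polyhedron, the following Python picks the camera whose direction from the face is best aligned with the face normal. This heuristic, the function name `best_camera` and the tuple-based vectors are assumptions made purely for illustration; they are not taken from [66] or [97], and visibility (occlusion) testing is deliberately left out.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _normalise(v: Vec3) -> Vec3:
    """Scale v to unit length; a zero-length vector is rejected."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("zero-length vector")
    return (v[0] / length, v[1] / length, v[2] / length)


def best_camera(face_centre: Vec3, face_normal: Vec3,
                camera_positions: Sequence[Vec3]) -> int:
    """Return the index of the camera that views the face most head-on.

    Returns -1 when no camera lies in front of the face.
    (Hypothetical helper: occlusion by other geometry is not checked.)
    """
    normal = _normalise(face_normal)
    best_index, best_cosine = -1, 0.0
    for index, position in enumerate(camera_positions):
        # Unit vector pointing from the face towards this camera.
        to_camera = _normalise(
            tuple(p - c for p, c in zip(position, face_centre)))
        # Cosine of the angle between the face normal and that direction.
        cosine = sum(n * d for n, d in zip(normal, to_camera))
        if cosine > best_cosine:
            best_index, best_cosine = index, cosine
    return best_index
```

A practical system would also need to handle faces seen equally well by two cameras, for example by blending their images, which is part of the special consideration mentioned above.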

Image Based Visual Hulls (IBVH) [79] is a reconstruction approach based on similar principles to EPVH, but renders a specific viewpoint rather than outputting a model. It is clear from both that the computational cost of such approaches lies largely in searching for intersections with camera images. To this end, the authors of both approaches suggest optimisations that may be employed to reduce the overhead of such searches, in both cases requiring the building of lookup tables. Alternatively, spatial data searches such as the search for intersections can be accelerated by making use of a tree-like structure, such as the R-Tree [47]. The R-Tree is a structure that can be used to accelerate querying many types of multi-dimensional data, and numerous adaptations exist to meet specific requirements.
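
To illustrate how a tree of bounding rectangles prunes an intersection search, the sketch below packs a static set of 2D rectangles bottom-up and answers a window query. It is a simplified, bulk-loaded tree in the spirit of the R-Tree, not the dynamic insertion algorithm of [47] and not the structure used by the EPVH or IBVH authors; the names `build` and `query`, the fan-out of 4 and the idea that the rectangles bound silhouette contour segments are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def intersects(a: Rect, b: Rect) -> bool:
    """True when two axis-aligned rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def enclose(boxes: List[Rect]) -> Rect:
    """Smallest rectangle containing every box in the list."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


@dataclass
class Node:
    box: Rect
    children: List["Node"] = field(default_factory=list)
    item: int = -1  # index into the input list; only used by leaf entries


def build(rects: List[Rect], fanout: int = 4) -> Node:
    """Pack a static, non-empty set of rectangles into a tree, bottom-up."""
    if not rects:
        raise ValueError("at least one rectangle is required")
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    level = [Node(box=r, item=i) for i, r in enumerate(rects)]
    while len(level) > 1:
        # Sort by x centre (ties broken by y centre) so that
        # neighbouring boxes tend to end up under the same parent.
        level.sort(key=lambda n: (n.box[0] + n.box[2], n.box[1] + n.box[3]))
        level = [Node(box=enclose([c.box for c in level[i:i + fanout]]),
                      children=level[i:i + fanout])
                 for i in range(0, len(level), fanout)]
    return level[0]


def query(node: Node, window: Rect) -> List[int]:
    """Indices of stored rectangles that intersect the query window."""
    if not intersects(node.box, window):
        return []  # the whole subtree is skipped
    if not node.children:
        return [node.item]
    hits: List[int] = []
    for child in node.children:
        hits.extend(query(child, window))
    return hits
```

The saving comes from the early return in `query`: a subtree whose bounding box misses the window is never visited, so a query touches only a small part of a large dataset when the boxes overlap little.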

One such adaptation, the R+-Tree [108], can increase performance by reducing overlapping areas, and is also quick to initialise from a static dataset.

In [40], the EPVH authors determine that their method for creating the polyhedron still demonstrates silhouette consistency even when contours are modelled at camera image pixel precision, rather than at the sub-pixel precisions previously employed. This important finding significantly reduces the computational cost of the algorithm for real-time applications.

[41] describes a system in which the EPVH algorithm is processed in parallel by a number of networked computers. It is found that a real-time visual hull may be generated from a number of cameras when processed in this way. In this scheme, the work is broken down into tasks that are executed by a cluster of networked computers. A three-stage pipeline is employed along which a stream of frames progresses; each processing node can be assigned a pipeline stage, resulting in stream-level parallelism, whilst the work within each stage of the pipeline can be further partitioned into units of work
that can be executed in parallel across a single frame. Tasks that are not inherently parallel are executed sequentially in between pipeline stages.

The downside of this scheme is that distributed processing will inevitably introduce delays compared with a shared-memory system. Data must be communicated between the nodes, and this can lead to synchronisation bottlenecks where parallel threads are stalled waiting for data, which can result in poor load balancing. Furthermore, any code that must be executed sequentially, or network data that is passed sequentially, could hinder the flow of parallel execution if it takes significant time to complete [4]. Whilst this might initially seem like a trivial consideration, future massively multicore processors could spread the load over many more processing units, increasing the relative impact of sequential code segments [51]. The distributed
processing scheme for EPVH introduced in [41] is deployed in an end-to-end telepresence system, described in [1] and [93].

With the recent advent of GPGPU computing there
are numerous emerging languages and standards that can be used to target a variety of compute resources. Some languages, such as CUDA, are vendor-specific, and therefore useful only on specific hardware. OpenCL is a language that aims to provide a portable way to exploit the parallel nature of a number of fundamentally different types of multi-core processors from a variety of manufacturers. Identical code can be run on CPUs, GPUs or a combination of the two, making it highly portable. However, OpenCL does not yet provide automatic performance portability, so code may need to be adapted following device profiling [118] to achieve the best results.

For multi-core CPUs, various parallelisation libraries such as OpenMP provide a means by which programmers can easily schedule
work to be executed in parallel on a number of cores. [24] provide a comparison of OpenMP and OpenCL in terms of their relative performance, and discuss the advantages and disadvantages of each: OpenCL requires greater porting effort for existing algorithms, but results in an implementation that can be executed on a range of hardware. OpenMP requires manual configuration to get the most out of the various extensions offered by a processor, whereas OpenCL achieves this automatically.

We previously published work (Duckworth and Roberts [32]) in which
we accelerated certain parts of the EPVH algorithm using the GPU. We found that for increased camera counts and image resolutions, the GPU offered better performance than the CPU variant. However, there were a number of shortcomings to this work: testing used synthetic data rather than real camera images, only part of the algorithm was parallelised, and the study only compared GPU performance against the sequential (single-core) CPU counterpart.
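
The significance of parallelising only part of an algorithm, and of the sequential code segments discussed earlier in this section, can be made concrete with Amdahl's law, which bounds the overall speedup by the fraction of work that remains sequential. The sketch below is illustrative only: the function name `amdahl_speedup` is hypothetical and the parallel fraction of 0.95 used in the note that follows is an arbitrary value, not a measurement from [32] or [41].

```python
def amdahl_speedup(parallel_fraction: float, units: int) -> float:
    """Upper bound on speedup under Amdahl's law.

    parallel_fraction: share of the run time that can be parallelised (0..1).
    units: number of processing units sharing the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unaffected; the parallel part shrinks with units.
    return 1.0 / (serial_fraction + parallel_fraction / units)
```

With 95% of the work parallelised, `amdahl_speedup(0.95, 8)` is roughly 5.9, and no number of processing units can push the result to 20 or beyond, because the remaining 5% of sequential work comes to dominate the run time.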
