

4.2.7 Encoding, storage and transmission

As discussed above, a key problem with HDR video systems is the amount of data that is generated. Efficient data formats and compression techniques are thus essential in order to cope with the large data requirements of HDR video data on existing infrastructures. Reducing the volume of digital data benefits areas including, but not limited to: a reduction of transmission channel bandwidth; a decrease of the buffering and storage requirement; and a reduction of data-transmission time at a given rate.
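To give a sense of scale, the following minimal sketch computes the uncompressed data rate of a video stream and the time needed to send one second of it over a fixed-rate link. The resolution, frame rate, bit depths and link speed are illustrative assumptions, not figures taken from this thesis.

```python
def raw_bitrate(width, height, fps, channels=3, bits_per_channel=16):
    """Uncompressed video data rate in bits per second."""
    return width * height * channels * bits_per_channel * fps


def transfer_seconds(bits, link_bits_per_second):
    """Time to send `bits` over a link running at a constant rate."""
    return bits / link_bits_per_second


# Assumed formats: 1080p at 30 fps, RGB, 16 bits per channel (HDR)
# versus 8 bits per channel (LDR).
hdr_rate = raw_bitrate(1920, 1080, 30, bits_per_channel=16)
ldr_rate = raw_bitrate(1920, 1080, 30, bits_per_channel=8)

# Assumed link: 100 Mbit/s.
link = 100e6

print(f"HDR raw rate: {hdr_rate / 1e9:.2f} Gbit/s")  # 2.99 Gbit/s
print(f"LDR raw rate: {ldr_rate / 1e9:.2f} Gbit/s")  # 1.49 Gbit/s
print(f"One second of raw HDR takes {transfer_seconds(hdr_rate, link):.1f} s to send")
```

Under these assumptions, a single second of raw HDR video would take roughly thirty seconds to transmit, which is why compression is unavoidable for real-time delivery.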

HDR video compression may be classified as one-stream or two-stream [13]. The one-stream approach utilises a single-layer transfer function to map the HDR content to a fixed number of bits, typically 10 to 12 [77, 89, 90, 91, 92]. While it is true that many scenes do contain a dynamic range of lighting that is sufficiently low to be adequately contained within a limited number of bits, there are many others for which 10 or even 12 bits may be insufficient [13].

In a two-stream method the encoder takes one HDR video as input and produces two bit streams as output [93, 94, 95, 96]. These streams can consist of (1) a standard-compliant bit stream, for example HEVC Main 10 or H.264, and (2) another stream carrying the additional data needed to reconstruct the HDR video (which can itself be a standard bit stream) or metadata, if so desired. When combined, these streams reproduce the full HDR content with minimal (or indeed no) perceptual loss. A process flow diagram for a generalised two-stream method is shown in Figure 4.2.

[Figure 4.2: flow diagram with stages labelled Renderer, HDR Encoder, Video Encoder (two instances), Server, Client, GPU Decode, Tonemap, HDR Display and LDR Display.]

Figure 4.2: Flow chart for two-stream HDR transmission method.
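The two-stream idea can be illustrated with a minimal sketch operating on a list of luminance values. It is not the method of any of the cited works: the logarithmic tone curve, the 8-bit depth of both layers, the assumed peak luminance `PEAK` and all function names are assumptions chosen for brevity. The base layer is an 8-bit tone-mapped signal that an LDR display could show directly; the second layer stores what the base layer lost to quantisation. In a real system each layer would then be passed to a standard video encoder, which is omitted here.

```python
import math

PEAK = 4000.0  # assumed scene peak luminance in cd/m^2 (hypothetical)


def tonemap(lum):
    """Assumed global log tone curve mapping [0, PEAK] to [0, 1]."""
    lum = min(max(lum, 0.0), PEAK)
    return math.log1p(lum) / math.log1p(PEAK)


def inverse_tonemap(v):
    """Inverse of the assumed tone curve."""
    return math.expm1(v * math.log1p(PEAK))


def encode_two_stream(hdr):
    """Split HDR luminances into an 8-bit base layer and an 8-bit residual layer."""
    base, residual = [], []
    for lum in hdr:
        v = tonemap(lum)
        b = round(v * 255)            # stream 1: LDR-compatible base
        err = v - b / 255             # lies in [-0.5/255, 0.5/255]
        r = round((err * 255 + 0.5) * 255)  # stream 2: residual in 0..255
        base.append(b)
        residual.append(r)
    return base, residual


def decode_ldr(base):
    """An LDR client ignores the second stream and shows the base layer."""
    return list(base)


def decode_hdr(base, residual):
    """An HDR client combines both streams before inverting the tone curve."""
    return [
        inverse_tonemap(b / 255 + (r / 255 - 0.5) / 255)
        for b, r in zip(base, residual)
    ]


hdr_in = [0.0, 0.05, 1.0, 120.0, 4000.0]
base, residual = encode_two_stream(hdr_in)
hdr_out = decode_hdr(base, residual)
```

With this split, the HDR client recovers each value to within the precision of the combined 16 bits, while the LDR client still receives a usable 8-bit image from the first stream alone.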

One-stream HDR video compression methods take advantage of the higher bit depth encodings possible using modern encoding standards, typically 10-bit HEVC [97], to transform a single HDR video input stream into a single compressed stream using a transfer function [13]. In SMPTE ST2084 [98] this transfer function is the PQ curve [91], while ARIB STD-B67 [99] uses Hybrid Log Gamma (HLG) [92]. A Power Transfer Function (PTF) is a one-stream HDR video compression method that uses a power function, similar to the well-known Gamma function used in traditional, Low Dynamic Range (LDR) video. This has been shown recently to produce high quality HDR video compression very efficiently for a value of γ = 4 [100]. In Chapter 7 and Chapter 8 ST2084 is used as the method of HDR video compression.
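The two transfer functions can be compared with a short sketch. The PQ constants are those published in SMPTE ST 2084, which defines the curve over absolute luminance up to 10 000 cd/m². The PTF shown here is simply a power law with γ = 4; normalising it to the same 10 000 cd/m² peak is an assumption made for comparison and is not necessarily the normalisation used in [100]. The function names and the full-range 10-bit quantiser are likewise illustrative.

```python
# PQ constants from SMPTE ST 2084
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0


def pq_encode(nits):
    """Absolute luminance (cd/m^2) to a PQ signal value in [0, 1]."""
    y = min(max(nits, 0.0), PEAK_NITS) / PEAK_NITS
    p = y ** M1
    return ((C1 + C2 * p) / (1 + C3 * p)) ** M2


def pq_decode(v):
    """PQ signal value in [0, 1] back to absolute luminance (cd/m^2)."""
    p = v ** (1 / M2)
    y = (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)
    return y * PEAK_NITS


def ptf_encode(nits, gamma=4.0, peak=PEAK_NITS):
    """Power transfer function; the peak normalisation is an assumption."""
    return (min(max(nits, 0.0), peak) / peak) ** (1 / gamma)


def ptf_decode(v, gamma=4.0, peak=PEAK_NITS):
    """Inverse of ptf_encode."""
    return (v ** gamma) * peak


def quantise(v, bits=10):
    """Map a signal value in [0, 1] to a full-range integer code."""
    return round(v * (2 ** bits - 1))


for nits in (0.1, 100.0, 1000.0, 10000.0):
    print(nits, quantise(pq_encode(nits)), quantise(ptf_encode(nits)))
```

Both curves allocate far more code values to dark regions than a linear mapping would, which is what allows a wide luminance range to fit into 10 or 12 bits; the PTF does so with a single power operation, whereas PQ requires two power operations and a division.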

4.3 Summary

HDR video, which is becoming commonplace in consumer electronics, represents one of the high-fidelity imaging frontiers discussed further in Chapter 6, Chapter 7 and Chapter 8. Chapter 6 discusses the implementation of a platform for streaming HDR video, which is utilised in the subsequent chapters.

Chapter 5

Research Focus & Methodology

It would, without doubt, seem odd to a mathematician to go about to convince him the diagrams he saw upon paper were not the figures, or even the likeness of the figures… in this science the reasonings are free from those inconveniences which attend the use of arbitrary signs, the very ideas themselves being copied out and exposed to view upon paper.

— George Berkeley,An Essay Towards a New Theory of Vision

This chapter discusses and critically assesses the techniques described thus far for the capture, transmission, storage and display of imagery, particularly as they concern rendered computer graphics. A specific focus will be placed on the ability to incorporate new and emerging imaging technologies, as in Figure 5.1, which shows the different technologies applicable to each stage of the video capture-to-display pipeline. In light of this overview of the area, a research question, methodology and objectives will be presented.

5.1 Assessing existing approaches to future imaging technologies

The common thread amongst the technologies and techniques for image display discussed in the previous chapters is that the nature of the stored and transmitted image is changing across multiple dimensions. Future display technologies, incorporating traditional viewing screens as well as head-mounted displays and other virtual reality methods, are constantly broadening in approach and capabilities. HDR screens are now coming to market commercially, offering screen brightnesses of up to 1000 nits [73], while for head-mounted displays pixel density is rapidly increasing, approaching or surpassing the display resolutions of mobile phone screens or broadcast TV streams. Figure 5.2 shows a matrix of imaging technologies and their intersection with the requirements of HMDs. Other sectors such as 360° video capture and storage have gained renewed interest and research through proximity to VR. All these concepts and more must be accommodated in capture, processing, storage and display. Streaming, the focus of this thesis, amalgamates aspects of each of these stages, in that the content to be streamed must be captured and processed in real time, then stored in the form of data packets to be transmitted before display.

In the following sections there will be a brief analysis of the state of the art in these areas, each one a representative area of high-fidelity imaging, to allow reflection on where there is work yet to be covered. With this presented, a roadmap will be assembled for the chapters of this thesis to be placed in context with their underlying motivation.

In document Efficient streaming for high fidelity imaging (Page 66-69)