Backward compatible HDR-MPEG ( hdrmpeg ) - Overview of HDR video compression algorithms

Chapter 3 High Dynamic Range Video Compression

3.3 Overview of HDR video compression algorithms

The HDR video compression algorithm designed on the basis of the above-mentioned OETF and EOTF is described as follows:

Compression

Input HDR frames are first linearly normalised such that the pixel values $V \in [0,1]$. The normalised pixel values are then multiplied by an arbitrary constant of 12.0 such that $V \in (0,12]$. The OETF given in equation 3.29 is then applied to $V$, producing a non-linear output signal $L \in [0,1]$. Subsequently, the pixel values undergo colour space conversion and discretisation before being passed on to the codec and encoded at 10 bits/pixel/channel. The normalisation factor, typically the maximum value of each HDR frame, is stored in a look-up table and passed on as an auxiliary metadata stream, which is used by the decompression side of the algorithm.

Decompression

The output stream undergoes the reverse process: the decoded YCbCr frames are converted to RGB′, and the EOTF given in equation 3.30 is then applied to the signal.

The resultant RGB is then normalised (divided) by the factor of 12.0 and subsequently multiplied by the normalisation factor obtained from the look-up table, thus reproducing the decoded output HDR frames.
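The normalisation and scaling steps above can be sketched as follows. This is only an illustration under stated assumptions: equations 3.29 and 3.30 are not reproduced in this section, so `placeholder_oetf` and `placeholder_eotf` are hypothetical stand-ins (a simple square-root pair), the colour space conversion and the codec are omitted, and a single channel with a non-zero peak is assumed.

```python
import math

SCALE = 12.0          # constant multiplier from the text
LEVELS = 2 ** 10 - 1  # 10-bit code values, 0..1023


def placeholder_oetf(v):
    """Stand-in for equation 3.29 (NOT the real OETF): maps [0, 12] to [0, 1]."""
    return math.sqrt(v / SCALE)


def placeholder_eotf(l):
    """Inverse of placeholder_oetf, standing in for equation 3.30."""
    return SCALE * l * l


def compress_frame(pixels, oetf=placeholder_oetf):
    """Return (10-bit codes, normalisation factor) for one frame of linear values."""
    peak = max(pixels)  # per-frame normalisation factor kept as metadata
    codes = [round(oetf(SCALE * p / peak) * LEVELS) for p in pixels]
    return codes, peak


def decompress_frame(codes, peak, eotf=placeholder_eotf):
    """Undo compress_frame: apply the EOTF, divide by 12, rescale by the stored factor."""
    return [eotf(c / LEVELS) / SCALE * peak for c in codes]


frame = [0.5, 20.0, 350.0, 4000.0]
codes, peak = compress_frame(frame)
restored = decompress_frame(codes, peak)
```

The round trip is lossy only through the 10-bit discretisation; the stored peak is what allows the decoder to return to absolute values.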

3.3.6 Backward compatible HDR-MPEG (hdrmpeg)

Mantiuk et al. [MEMS06] proposed the first backward compatible HDR compression algorithm. This method incorporates backward compatibility, as shown in Figure 3.1b, by creating a tone mapped base stream which can be played back on an LDR screen using any available video player. The method also introduces a new colour space transformation, a reconstruction function (which can be considered a precursor to inverse tone mapping) and non-linear quantisation. The steps are described as follows:

The LDR stream:

To maintain backward compatibility with existing 8-bit video decoders, the HDR video content is tone mapped, using the photographic TMO, to produce 8-bit RGB frames. These are transformed to the YCbCr colour space and encoded using any MPEG-4 encoder. The LDR stream can be played back on any LDR display in the absence of an HDR display.
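As a rough illustration of the tone mapping step, the sketch below applies the simplest global form of the photographic operator to a list of luminance values and quantises the result to 8 bits. The key value of 0.18, the plain 2.2 gamma and the function names are assumptions made for this example; the text does not state which variant or parameters were used, and colour handling and the MPEG-4 encoder are left out.

```python
import math


def photographic_tmo(luminance, key=0.18, delta=1e-6):
    """Scale by the log-average luminance, then compress with L / (1 + L)."""
    log_avg = math.exp(sum(math.log(delta + y) for y in luminance) / len(luminance))
    scaled = [key * y / log_avg for y in luminance]
    return [s / (1.0 + s) for s in scaled]  # display luminance in [0, 1)


def to_8bit(display, gamma=2.2):
    """Gamma-encode display luminance and quantise to 8-bit code values."""
    return [round(255 * d ** (1.0 / gamma)) for d in display]


hdr_luminance = [0.01, 1.0, 100.0, 10000.0]
ldr_codes = to_8bit(photographic_tmo(hdr_luminance))
```

Six orders of magnitude of input luminance are mapped into the 0 to 255 range while preserving the ordering of the pixel values.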

[Figure 3.11: Schematic diagram of HDR-MPEG, showing the encoding part (tone mapping, MPEG encode and decode, colour space transforms, finding the reconstruction function, residual frame Res = L_hdr − RF(L_ldr), invisible noise filtering, residual quantisation, Huffman/run-length encoding) and the decoding part (MPEG decode, reconstruction function, combining LUV and residual, inverse colour transform to XYZ, XYZ to RGB), linked by the LDR, residual and auxiliary streams.]

Colour space transformation:

The method introduces a new backward compatible, perceptually uniform Lu’v’ colour space [Man06] which is able to encode the luminance values of HDR as well as LDR frames. This is done to ensure that the colour channels of both LDR and HDR pixels contain the same information. The decoded LDR frames are transformed from gamma-corrected sRGB to $L_{ldr}U_{ldr}V_{ldr}$ and the corresponding HDR frames are transformed to the $L_{hdr}U_{hdr}V_{hdr}$ colour space, with 12 bits allocated to encode the luminance and 8 bits each for the two chroma channels. To encode real-world luminance $y$ to 12-bit luma $l_{hdr}$, the following conversion formula is used:

$$
l_{hdr}(y) =
\begin{cases}
a \cdot y & \text{if } y < y_l \\
b \cdot y^{c} + d & \text{if } y_l \le y < y_h \\
e \cdot \log(y) + f & \text{if } y \ge y_h
\end{cases}
$$

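and the inverse operation, which maps $l_{hdr}$ to real-world luminance $y$, is:

$$
y(l_{hdr}) =
\begin{cases}
a' \cdot l_{hdr} & \text{if } l_{hdr} < l_l \\
b' \, (l_{hdr} + d')^{c'} & \text{if } l_l \le l_{hdr} < l_h \\
e' \cdot \exp(f' \cdot l_{hdr}) & \text{if } l_{hdr} \ge l_h
\end{cases}
$$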

The constants are given in Table 3.3.

a = 17.554      e = 209.16      a’ = 0.056968       e’ = 32.994
b = 826.81      f = -731.28     b’ = 7.3014e-30     f’ = 0.0047811
c = 0.10013     yl = 5.6046     c’ = 9.9872         ll = 98.381
d = -884.17     yh = 10469      d’ = 884.17         lh = 1204.7

Table 3.3: Constants used for the luminance and luma mapping.

Henceforth, all operations are conducted on the luma channel.
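The luminance-to-luma mapping and its inverse can be sketched directly from the piecewise definitions and the constants of Table 3.3. The function names are chosen for this example, the logarithm is assumed to be the natural logarithm, and d is taken as -884.17 (the negative of d’), the value for which the three segments join continuously at yl and yh.

```python
import math

# Constants from Table 3.3; D is taken as -d' = -884.17 so that the
# three segments meet at Y_L and Y_H (an assumption of this sketch).
A, B, C, D, E, F = 17.554, 826.81, 0.10013, -884.17, 209.16, -731.28
Y_L, Y_H = 5.6046, 10469.0
A_INV, B_INV, C_INV, D_INV = 0.056968, 7.3014e-30, 9.9872, 884.17
E_INV, F_INV = 32.994, 0.0047811
L_L, L_H = 98.381, 1204.7


def luminance_to_luma(y):
    """Map real-world luminance y to 12-bit luma l_hdr."""
    if y < Y_L:
        return A * y                    # linear segment
    if y < Y_H:
        return B * y ** C + D           # power-law segment
    return E * math.log(y) + F          # logarithmic segment (natural log assumed)


def luma_to_luminance(l):
    """Inverse mapping from luma l_hdr back to real-world luminance."""
    if l < L_L:
        return A_INV * l
    if l < L_H:
        return B_INV * (l + D_INV) ** C_INV
    return E_INV * math.exp(F_INV * l)


round_trip = [luma_to_luminance(luminance_to_luma(y)) for y in (1.0, 100.0, 1.0e5)]
```

Because the constants are rounded to about five significant digits, the round trip recovers the input only approximately, to within a fraction of a percent in each of the three segments.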

Reconstruction function:

The authors introduce a strictly monotonically increasing reconstruction function implemented as a look-up table (LUT). This is used to predict HDR pixel values from the corresponding LDR frame. The reconstruction function essentially maps each of the 256 possible LDR luma values (bins) to an HDR luma value. For a particular bin $\Omega_l$, it is defined as the arithmetic mean of the HDR luma of all the pixels in that bin, as given in equation 3.31:

$$
RF(l) = \frac{1}{\mathrm{Card}(\Omega_l)} \sum_{i \in \Omega_l} l_{hdr}(i), \quad \text{where } \Omega_l = \{\, i \in [1, N] : l_{ldr}(i) = l \,\} \tag{3.31}
$$

Here $l \in [0,255]$ is the index of a bin, $N$ is the spatial resolution (number of pixels) of a frame, and $l_{ldr}(i)$ and $l_{hdr}(i)$ are the luma values of the $i$-th LDR and HDR pixel respectively.
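Equation 3.31 amounts to averaging the HDR luma over all pixels that share the same LDR luma. A minimal sketch, assuming the two frames are given as flat, equally long lists (the function name and the toy data are illustrative only):

```python
def build_reconstruction_function(l_ldr, l_hdr, bins=256):
    """Return a LUT where lut[l] is the mean HDR luma of pixels whose LDR luma is l.

    Bins that contain no pixels are left as None; how such bins are filled
    is not specified in the text, so this sketch does not guess.
    """
    sums = [0.0] * bins
    counts = [0] * bins
    for ldr, hdr in zip(l_ldr, l_hdr):
        sums[ldr] += hdr
        counts[ldr] += 1
    return [s / n if n else None for s, n in zip(sums, counts)]


l_ldr = [0, 0, 10, 10, 10, 255]
l_hdr = [5.0, 7.0, 120.0, 130.0, 140.0, 3000.0]
rf = build_reconstruction_function(l_ldr, l_hdr)
```

In this toy frame the two pixels with LDR luma 0 average to 6.0 and the three pixels with LDR luma 10 average to 130.0.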

Residual frame computation:

The reconstruction function (look-up table) described in Section 3.3.6 is then used to predict the $L_{hdr}$ values from the $L_{ldr}$ values, resulting in a predicted HDR luma frame. Subsequently, the residual luma is calculated as:

$$
\mathit{Residual}_l = L_{hdr} - \mathit{prLuma}_{hdr}, \quad \text{where } \mathit{prLuma}_{hdr} = RF(L_{ldr}) \tag{3.32}
$$

Therefore, the accuracy of the predicted $L_{hdr}$ and of $\mathit{Residual}_l$ largely depends on the accuracy of the reconstruction function.
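Equation 3.32 is a per-pixel subtraction of the predicted luma from the actual HDR luma. A small sketch, with a hypothetical toy look-up table standing in for the reconstruction function:

```python
def residual_frame(l_ldr, l_hdr, rf):
    """Residual_l = L_hdr - RF(L_ldr), evaluated per pixel."""
    return [hdr - rf[ldr] for ldr, hdr in zip(l_ldr, l_hdr)]


rf = {0: 6.0, 10: 130.0}              # toy LUT: LDR luma -> predicted HDR luma
l_ldr = [0, 0, 10, 10, 10]
l_hdr = [5.0, 7.0, 120.0, 130.0, 140.0]
residual = residual_frame(l_ldr, l_hdr, rf)
```

Since each table entry is the mean of its bin, the residuals within a bin sum to zero, and a poor prediction shows up directly as larger residual magnitudes.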

Noise Reduction and frame quantisation:

Residual frames do not compress well, primarily because they contain a large amount of high-frequency content, including noise. To mitigate this problem, invisible noise filtering is applied to the residual frame using the CDF 9/7 discrete wavelet filter pair [XWHL94,WL05].

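The filtered frame can contain values spanning up to 12 bits (0 to 4095), which cannot be encoded using an 8-bit encoder. The authors introduce a simple yet effective quantization function to quantize and limit the residual pixel values to 8 bits, as given below:

$$
\widehat{Res}_l(i) = \left[ \frac{Res_l(i)}{q(m)} \right]_{-127 \div 127}, \quad \text{where } m = k \Leftrightarrow i \in \Omega_k \tag{3.33}
$$

where $[\cdot]_{-127 \div 127}$ denotes rounding to the nearest integer followed by clamping to the range −127 to 127. The quantization factor $q(m)$ is calculated for each bin $\Omega_m$ as:

$$
q(m) = \max\left( q_{min}, \frac{\max_{i \in \Omega_m} |Res_l(i)|}{127} \right) \tag{3.34}
$$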
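The per-bin quantization of equations 3.33 and 3.34 can be sketched as below. The value of q_min is not given in this section, so the default of 1.0 is purely an assumption, as are the function name and the toy data.

```python
def quantise_residual(residual, l_ldr, q_min=1.0, bins=256):
    """Return (quantised residuals, per-bin quantisation factors q)."""
    # Largest absolute residual observed in each LDR luma bin.
    peak = [0.0] * bins
    for r, ldr in zip(residual, l_ldr):
        peak[ldr] = max(peak[ldr], abs(r))
    # Equation 3.34: one factor per bin, never smaller than q_min.
    q = [max(q_min, p / 127.0) for p in peak]
    # Equation 3.33: round to an integer, then clamp to -127..127.
    # (Python's round() resolves exact ties to the even integer.)
    quantised = [
        max(-127, min(127, round(r / q[ldr])))
        for r, ldr in zip(residual, l_ldr)
    ]
    return quantised, q


residual = [-1.0, 1.0, -508.0, 256.0]
l_ldr = [0, 0, 10, 10]
quantised, q = quantise_residual(residual, l_ldr)
```

Bins with small residuals keep the minimum factor and lose nothing, while bin 10 in this example needs a factor of 4.0 to fit its largest residual into the signed 8-bit range.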

The quantization factors $q(m)$, where $m \in [0,255]$, are stored in the auxiliary stream alongside the reconstruction function. The entire encoding process is shown schematically in Figure 3.11.

Decoding and merging to HDR:

The decoding process is fairly straightforward. The decoded sRGB frames are transformed to the $L_{ldr}U_{ldr}V_{ldr}$ colour space. Using the reconstruction function from the auxiliary stream, the $L_{hdr}U_{hdr}V_{hdr}$ values are predicted and finally merged with the decoded residual frames $Res_l$ to re-create the HDR frame. The hybrid luma space is inverse mapped to real-world luminance (see Section 3.3.6), and Yu’v’ is transformed to XYZ, followed by an inverse transformation to 16-bit RGB frames.
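For the luma channel, the merging step can be sketched as a prediction from the reconstruction function plus the dequantised residual. The sketch assumes that dequantisation is a multiplication by the per-bin factor q(m), i.e. the inverse of equation 3.33; the look-up tables and pixel values are toy data, and the chroma channels and colour transforms are omitted.

```python
def reconstruct_hdr_luma(l_ldr, quantised_residual, rf, q):
    """l_hdr is approximated as RF(l_ldr) + q(l_ldr) * quantised residual."""
    return [rf[ldr] + q[ldr] * r for ldr, r in zip(l_ldr, quantised_residual)]


rf = {0: 6.0, 10: 1000.0}   # toy reconstruction function from the auxiliary stream
q = {0: 1.0, 10: 4.0}       # toy quantisation factors from the auxiliary stream
l_ldr = [0, 0, 10, 10]
quantised_residual = [-1, 1, -127, 64]
l_hdr = reconstruct_hdr_luma(l_ldr, quantised_residual, rf, q)
```

The resulting luma values would then be passed through the inverse luma mapping to obtain real-world luminance.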

In document Accurate light and colour reproduction in high dynamic range video compression (Page 74-77)