2.4 Statistical models for prediction and coding
2.4.4 Planar model
is used to compute the floating-point values $z_i^*$ as:
$$z_i^* = \begin{bmatrix} x_i & y_i & 1 \end{bmatrix} \theta^* = z_i - \varepsilon_i^*, \quad i = 1, 2, \ldots, n, \tag{2.10}$$
where $\varepsilon_i^*$ are the optimal modeling errors. The initial region can then be reconstructed as follows: at the pixel location $(x_i, y_i)$, the value $\hat{z}_i$ is computed by rounding $z_i^*$ as
$$\hat{z}_i = \lfloor z_i^* \rceil = z_i^* - \Delta_i^*, \quad i = 1, 2, \ldots, n, \tag{2.11}$$
where $\Delta_i^* \in [-0.5, 0.5]$ are the rounding errors. From (2.10) and (2.11) it follows that the value $z_i$ is reconstructed as $\hat{z}_i = z_i - (\varepsilon_i^* + \Delta_i^*)$.
In the lossless case, besides the plane parameters $\theta^*$, the difference
$$z_i - \hat{z}_i = \varepsilon_i^* + \Delta_i^*, \quad i = 1, 2, \ldots, n, \tag{2.12}$$
must be encoded for a perfect reconstruction.
In the lossy case, the planar model introduces a distortion in the region $\Omega_\ell$ by encoding only the optimal plane parameters $\theta^*$. The distortion can be measured by the Mean Squared Error (MSE), which is computed as
$$\mathrm{MSE}_{p(\mathrm{LS})}(\Omega_\ell) = \frac{1}{n} \sum_{i=1}^{n} (z_i - \hat{z}_i)^2 = \frac{1}{n} \sum_{i=1}^{n} (\varepsilon_i^* + \Delta_i^*)^2. \tag{2.13}$$
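The distortion introduced in the region $\Omega_\ell$ by the constant model with the parameter $d_\ell = \frac{1}{n} \sum_{i=1}^{n} z_i$ can be computed as
$$\mathrm{MSE}_c(\Omega_\ell) = \frac{1}{n} \sum_{i=1}^{n} (z_i - \lfloor d_\ell \rceil)^2. \tag{2.14}$$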
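The quantities in (2.10) to (2.14) can be illustrated with a minimal sketch. It assumes that the optimal parameters θ∗ are obtained by ordinary least squares through the normal equations, and that rounding is done to the nearest integer with halves rounded up. The function names and the 3 × 3 test region are hypothetical and serve only as an illustration.

```python
import math

def round_half_up(v):
    """Round to the nearest integer, so the rounding error stays in [-0.5, 0.5]."""
    return math.floor(v + 0.5)

def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def fit_plane(pixels):
    """Least-squares plane parameters for a list of (x, y, z) triples."""
    rows = [(x, y, 1.0) for x, y, _ in pixels]
    zs = [z for _, _, z in pixels]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(3)]
    return solve3(ata, atz)

def planar_reconstruction(pixels, theta):
    """Rounded plane values, as in (2.11)."""
    return [round_half_up(theta[0] * x + theta[1] * y + theta[2])
            for x, y, _ in pixels]

def mse(zs, z_hats):
    """Mean squared error between original and reconstructed values."""
    return sum((z - zh) ** 2 for z, zh in zip(zs, z_hats)) / len(zs)

# Hypothetical 3x3 region: a tilted plane with one perturbed pixel.
region = [(x, y, 10 + 2 * x + 3 * y) for y in range(3) for x in range(3)]
region[4] = (1, 1, 17)  # the unperturbed plane value would be 15
zs = [z for _, _, z in region]

theta = fit_plane(region)
z_hat = planar_reconstruction(region, theta)
residual = [z - zh for z, zh in zip(zs, z_hat)]   # difference in (2.12)
mse_planar = mse(zs, z_hat)                        # distortion in (2.13)
d = round_half_up(sum(zs) / len(zs))               # rounded constant model
mse_constant = mse(zs, [d] * len(zs))              # distortion in (2.14)
```

In the lossless case, adding `residual` back to `z_hat` recovers the region exactly. In the lossy case, comparing `mse_planar` with `mse_constant` shows which of the two models distorts this region less.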
Lossless Compression of Depth-Map Images
“The cure for boredom is curiosity.
There is no cure for curiosity.”
-- unknown author

The chapter starts by briefly presenting, in the first section, the state-of-the-art coders in the research field of lossless compression of depth-map images. Three methods [P1, P3, P5] were developed for this topic, and each contains two stages: contour compression and region reconstruction. Two contour compression algorithms are introduced in the second section and two region reconstruction algorithms in the third section. In the last section, we summarize the coders by selecting an algorithm from each stage.
3.1 State of the art coders
In the lossless compression field, several algorithms have been designed for depth-map image compression. In one approach, instead of (conditionally) encoding each pixel's depth value, the authors apply different transforms to the binary representation of the depth values. The integers in the depth-map image are then represented and encoded as a sequence of bit-planes. The large constant patches (with the same symbol) will also be present in the bit-planes. The transforms are intended to produce bit-planes with even larger constant patches. In the last stage of the methods, binary masks or entire bit-planes are encoded as binary images by a specialized entropy coder. For example, in one of the methods the authors first use the Gray code [26] (also known as reflected binary code) to transform the initial bit-plane representation of the image. In [39], the image is divided into blocks and the depth value of each pixel is converted to a Gray code. The algorithm checks whether the blocks of size 16 × 16 are full of symbols ‘0’ or symbols ‘1’, and encodes them using a binary switch. Since not all the blocks are filled with only one symbol, the masks of the remaining blocks are encoded using MPEG-4 Part 2 [15], known as the Visual Binary Shape coding scheme of the Moving Picture Experts Group (MPEG) [63]. The same idea of transforming the bit-planes is also used in [101]. This time the converted binary planes are encoded by the Joint Bi-level Image Experts Group (JBIG) standard for bi-level images. Moreover, the algorithm is extended to conditionally encode the left-right pairs of depth-map images from two viewpoints of the scene. The use of the JBIG standard gives this method an important advantage over the previous ones. This group of methods is easy to implement and relies on the performance of the chosen entropy coder, but it was outperformed by various methods that exploit the specific characteristics of the depth-map image.
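The effect of the Gray code on the bit-planes can be illustrated with a small sketch. It is not the implementation from [39] or [101]. It assumes 8-bit depth values and an image whose dimensions are multiples of the block size, and all function names are hypothetical. It converts each value to its reflected binary code, splits the result into bit-planes, and counts how many 16 × 16 blocks hold a single symbol.

```python
def to_gray(v):
    """Reflected binary (Gray) code of a non-negative integer."""
    return v ^ (v >> 1)

def from_gray(g):
    """Inverse of to_gray."""
    v = 0
    while g:
        v ^= g
        g >>= 1
    return v

def bit_planes(image, bits=8):
    """Split a 2-D list of integers into binary planes, most significant first."""
    return [[[(p >> b) & 1 for p in row] for row in image]
            for b in range(bits - 1, -1, -1)]

def is_constant_block(plane, top, left, size=16):
    """True if a size x size block of a bit-plane holds a single symbol."""
    first = plane[top][left]
    return all(plane[r][c] == first
               for r in range(top, top + size)
               for c in range(left, left + size))

def count_constant_blocks(image, size=16, bits=8, gray=True):
    """Count single-symbol blocks over all bit-planes of the image.

    Assumes the image height and width are multiples of `size`.
    """
    if gray:
        image = [[to_gray(p) for p in row] for row in image]
    count = 0
    for plane in bit_planes(image, bits):
        for top in range(0, len(image), size):
            for left in range(0, len(image[0]), size):
                count += is_constant_block(plane, top, left, size)
    return count

# Hypothetical 16x16 block split between two adjacent depth values.
image = [[127] * 8 + [128] * 8 for _ in range(16)]
plain_count = count_constant_blocks(image, gray=False)
gray_count = count_constant_blocks(image, gray=True)
```

The values 127 and 128 differ in all eight bits of the plain binary representation but in only one bit of the Gray code. In this example, therefore, no bit-plane block is constant before the transform, while seven of the eight are constant after it.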
The algorithm in [12] is called the Piecewise-Constant image model (PWC). The PWC algorithm was designed for the compression of palette images and was obtained by further developing the contour coding algorithm from [90]. In publication [P3], the experimental results show that PWC is a good solution for compressing depth-map images. Another palette coding method can be found in [68]. A method designed for palette image compression performs well in depth-map compression because it uses a context coding algorithm that detects and encodes very efficiently the object boundaries and smooth areas inside the depth-map image.
Recently, many published algorithms modify traditional lossless image coders to take advantage of the properties of depth-map images. H.264/AVC [9], also known as MPEG-4 Part 10, Advanced Video Coding (MPEG-4 AVC), is a commonly used video compression standard. For example, in [29] and [30], the authors modified the H.264/AVC standard to improve the results for depth-map compression. However, H.264 is mainly used in the lossy compression of video depth-map sequences.
The generic lossless image coders may also be used for compressing depth-map images. JPEG-LS is the JPEG standard for lossless and near-lossless compression of continuous-tone images, and its core algorithm is called LOssless COmpression for Images (LOCO-I) [93, 94]. The algorithm consists of two main stages, called modeling and encoding. Prediction, residual modeling and context-based residual coding are the main concepts used by the algorithm. LOCO-I achieves low complexity by assuming that the computed prediction residuals follow a two-sided geometric distribution and by using Golomb-like codes in the coding stage, which were shown to be an optimal solution for coding geometric distributions. The programs are publicly available and their executable files can be downloaded from [45]. LOCO-I offers several advantages: it is easy to use, it is available online, and it has low complexity and a small runtime. However, its compression performance can easily be surpassed by more complex methods.
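The prediction and coding stages can be sketched in a strongly simplified form. The sketch uses the median edge detector predictor of LOCO-I and a Golomb-Rice code with a fixed parameter k. The context modeling, bias correction, adaptive choice of k and run mode of the standard are all omitted. The fold of signed residuals into non-negative integers and the bit convention of the unary part are assumptions made for illustration, and the function names are hypothetical.

```python
def med_predict(a, b, c):
    """Median edge detector: a = left, b = above, c = above-left neighbor."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c

def map_residual(e):
    """Fold a signed residual into a non-negative integer (0, -1, 1, -2, ... -> 0, 1, 2, 3, ...)."""
    return 2 * e if e >= 0 else -2 * e - 1

def unmap_residual(m):
    """Inverse of map_residual."""
    return m // 2 if m % 2 == 0 else -(m + 1) // 2

def rice_encode(m, k):
    """Golomb-Rice code word: unary quotient, a '0' terminator, then k remainder bits."""
    q = m >> k
    bits = "1" * q + "0"
    if k:
        bits += format(m & ((1 << k) - 1), "0{}b".format(k))
    return bits

def rice_decode(bits, k):
    """Decode a single code word produced by rice_encode."""
    q = bits.index("0")
    r = int(bits[q + 1:q + 1 + k], 2) if k else 0
    return (q << k) | r

# Hypothetical pixel: all three neighbors are 100, the actual value is 102.
prediction = med_predict(100, 100, 100)
code_word = rice_encode(map_residual(102 - prediction), k=1)
decoded = prediction + unmap_residual(rice_decode(code_word, k=1))
```

Small residuals, which are the most probable under a two-sided geometric distribution, receive the shortest code words. This is the reason for pairing such a code with the prediction step.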
The Context-based Adaptive Lossless Image Coder (CALIC) [99] is one of the best generic lossless image coders and obtains high lossless compression for continuous-tone images. Modeling contexts are used by CALIC to condition the residuals of a non-linear predictor and to make the predictor able to adapt to different source statistics. The non-linear predictor is called the Gradient Adjusted Predictor (GAP) and uses a causal neighborhood of six neighboring pixels (see Figure 3.1) to detect three types of edges (sharp, normal and weak) in both the horizontal and vertical directions. In the adaptation process, the algorithm estimates the expectation of the prediction residuals conditioned on a large number of contexts rather than estimating a large number of conditional error distributions. CALIC achieves low time and space complexity thanks to efficient techniques for forming and quantizing modeling contexts. The CALIC executable files are available online [98]. The large number of modeling contexts and the efficient coding of the prediction residuals make it possible to detect and encode the sharp edges and the smooth areas of the depth-map image. The small runtime and the very good compression performance make CALIC one of the best options for depth-map image compression.

Figure 3.1: The causal neighborhood used by the Gradient Adjusted Predictor from CALIC. The current pixel position, marked by X, is predicted using the values of six neighboring pixels, marked by xn, xw, xnw, xne, xnn, xww. [Figure layout: top row: xnn; middle row: xnw, xn, xne; bottom row: xww, xw, X, followed by the unknown values.]
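The edge classification performed by GAP can be sketched as follows. The sketch follows the commonly cited formulation of the predictor, with the thresholds 80, 32 and 8 that are usually quoted for 8-bit images. Note that this formulation also reads a seventh pixel, north-north-east (called `nne` here), in the vertical gradient term. This goes beyond the six pixels of Figure 3.1 and should be treated as an assumption of this sketch. The function name is hypothetical, and the context modeling and error feedback of CALIC are omitted.

```python
def gap_predict(n, w, nw, ne, nn, ww, nne):
    """Gradient Adjusted Predictor sketch for one pixel.

    n, w, nw, ne, nn, ww are the neighbors of Figure 3.1; nne is an extra
    north-north-east pixel assumed by this sketch.
    """
    # Gradient estimates: large d_h suggests a vertical edge,
    # large d_v suggests a horizontal edge.
    d_h = abs(w - ww) + abs(n - nw) + abs(n - ne)
    d_v = abs(w - nw) + abs(n - nn) + abs(ne - nne)
    diff = d_v - d_h

    if diff > 80:        # sharp horizontal edge: predict from the left
        return float(w)
    if diff < -80:       # sharp vertical edge: predict from above
        return float(n)

    pred = (w + n) / 2 + (ne - nw) / 4
    if diff > 32:        # horizontal edge
        pred = (pred + w) / 2
    elif diff > 8:       # weak horizontal edge
        pred = (3 * pred + w) / 4
    elif diff < -32:     # vertical edge
        pred = (pred + n) / 2
    elif diff < -8:      # weak vertical edge
        pred = (3 * pred + n) / 4
    return pred

# Hypothetical neighborhoods: flat area, horizontal edge, vertical edge.
flat = gap_predict(50, 50, 50, 50, 50, 50, 50)
horizontal = gap_predict(n=200, w=10, nw=200, ne=200, nn=200, ww=10, nne=200)
vertical = gap_predict(n=200, w=10, nw=10, ne=200, nn=200, ww=10, nne=200)
```

In a depth-map image, most pixels fall into either the flat case or one of the sharp-edge cases, so the predictor usually copies a neighbor on the correct side of an object boundary.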
In our tests we used the PWC, LOCO-I and CALIC algorithms as state-of-the-art encoders. Although LOCO-I is the fastest, PWC and CALIC have the best compression performance, with PWC usually having a small advantage. In the following two sections we describe a set of algorithms for each of the two stages of our approach: the contour compression stage and the region reconstruction stage. In the final section we propose different encoders by selecting an algorithm from the set of each stage.