Existing Rate Control Algorithms - Efficient algorithms for scalable video coding

Rate control algorithms for video coding were widely studied prior to the release of the SVC standard, the aim being to ensure successful transmission of an encoded bitstream and to make full use of the limited bandwidth. Consequently, rate control plays an important role, as it directly influences the coding efficiency of the video encoder. Whether the rate control is efficient or not largely depends on the accuracy of the rate control model and the effectiveness of the rate control algorithm. It affects not only the stability of the bit rate, but also the picture quality of the entire video sequence.

Several rate control algorithms for scalable video coders have been proposed. These include the JVT-W043 rate control algorithm, which has been incorporated in the latest JSVM reference software, and several improved algorithms. JVT-W043 and a number of representative rate control algorithms suggested for SVC will be discussed in the following subsections.

6.1.1 Default Implementation in JSVM

JVT-W043 was proposed within the JVT and has become the default rate control scheme for the base layer of SVC. It closely follows the rate control implementation proposed in JVT-G012, known as the adaptive basic unit layer rate control algorithm. JVT-G012 is the default rate control scheme of the H.264/AVC reference software, and achieves a good balance between algorithm complexity and rate control performance. JVT-G012 introduces the concept of a ‘basic unit’ and a linear MAD prediction model to resolve the ‘chicken and egg paradox’ between rate control and rate-distortion optimisation: the Qp must be known before rate-distortion optimisation can be performed, yet the MAD needed to choose the Qp is only available afterwards. The algorithm operates at three levels: GOP level rate control, frame level rate control and basic unit level rate control. A basic unit usually comprises a number of consecutive macroblocks. When it contains only one macroblock, the scheme reduces to macroblock level rate control; when it contains all the macroblocks in a frame, it becomes frame level rate control.

1. GOP level rate control

In this step, the target number of bits for each GOP is allocated according to the target bit rate and the current capacity of the virtual buffer. When starting to encode the $i$th GOP, the target number of bits $T_r(n_{i,0})$ for this GOP is determined as

$$T_r(n_{i,0}) = \frac{u(n_{i,1})}{f} \times N_{gop} - \left( \frac{B_s}{8} - B_c(n_{i-1,N_{gop}}) \right) \tag{6.1}$$

where $u(n_{i,1})$ is the instantaneous target bit rate when the 1st frame of the $i$th GOP is being coded; $f$ denotes the predefined frame rate; $N_{gop}$ refers to the number of frames in each GOP; $B_s$ is the buffer size and $B_c(n_{i-1,N_{gop}})$ is the current capacity of the virtual buffer after coding the $(i-1)$th GOP.

As the channel bandwidth, namely the target bit rate, may vary with time, $T_r(n_{i,j})$ is updated after each frame is coded as

$$T_r(n_{i,j}) = T_r(n_{i,j-1}) + \frac{u(n_{i,j}) - u(n_{i,j-1})}{f} \left( N_{gop} - j \right) - T_p(n_{i,j-1}) \tag{6.2}$$

where $T_p(n_{i,j-1})$ refers to the actual number of bits generated by the $(j-1)$th frame of the $i$th GOP.
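As a rough illustration of Eqs. (6.1) and (6.2), the GOP level budget can be sketched in a few lines of Python. The function and parameter names are hypothetical, the numbers are made up for the example, and the bracketing of the buffer term follows the reading of Eq. (6.1) in which $B_s/8 - B_c$ is subtracted as a whole; this is a sketch under those assumptions, not the JSVM implementation.

```python
def initial_gop_budget(bit_rate, frame_rate, n_gop, buffer_size, buffer_level_prev):
    """Eq. (6.1): bits allocated to a GOP before its first frame is coded.

    bit_rate          -- u(n_{i,1}), target bit rate in bits per second
    frame_rate        -- f, frames per second
    n_gop             -- N_gop, number of frames per GOP
    buffer_size       -- B_s, virtual buffer size in bits
    buffer_level_prev -- B_c(n_{i-1,N_gop}), buffer state after the previous GOP
    """
    return bit_rate / frame_rate * n_gop - (buffer_size / 8 - buffer_level_prev)


def update_gop_budget(budget_prev, bit_rate, bit_rate_prev, frame_rate,
                      n_gop, j, bits_prev_frame):
    """Eq. (6.2): budget T_r(n_{i,j}) left once frame j-1 has been coded."""
    rate_change = (bit_rate - bit_rate_prev) / frame_rate * (n_gop - j)
    return budget_prev + rate_change - bits_prev_frame


# Illustrative numbers only: 480 kbit/s, 30 frames/s, 16-frame GOP.
budget = initial_gop_budget(480_000, 30, 16, 240_000, 20_000)
# The channel rate drops to 450 kbit/s and the previous frame used 40 000 bits.
budget_after = update_gop_budget(budget, 450_000, 480_000, 30, 16, 2, 40_000)
print(budget, budget_after)  # 246000.0 192000.0
```

The second call shows the two effects captured by Eq. (6.2): the budget shrinks by the bits already spent and is rescaled for the frames still to be coded when the channel rate changes.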

2. Frame level rate control

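In this step, a target number of bits for the $j$th frame is first determined from the target buffer level, the frame rate, the target bit rate, and the capacity of the buffer:

$$\tilde{f}(n_{i,j}) = \frac{u(n_{i,j})}{f} + \gamma \left( T_{bl}(n_{i,j}) - B_c(n_{i,j}) \right) \tag{6.3}$$

where $\gamma$ is a constant whose typical value is 0.75 when there is no B-frame and 0.25 otherwise, and $T_{bl}(n_{i,j})$ is the target buffer level.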

When the available bits remaining are also considered, the bit budget for the $j$th frame is calculated as

$$\hat{f}(n_{i,j}) = \frac{W_p(n_{i,j-1})\, T_r(n_{i,j})}{W_p(n_{i,j-1})\, N_{p,r}(j-1) + W_b(n_{i,j-1})\, N_{b,r}(j-1)} \tag{6.4}$$

where $N_{p,r}(j-1)$ and $N_{b,r}(j-1)$ refer to the remaining number of P- and B-frames in the GOP respectively; $W_p(n_{i,j-1})$ and $W_b(n_{i,j-1})$ denote the average picture complexity of the P-frames and B-frames respectively.

The final target number of bits for the $j$th frame is determined as a weighted combination of $\hat{f}(n_{i,j})$ and $\tilde{f}(n_{i,j})$:

$$f(n_{i,j}) = \beta \times \hat{f}(n_{i,j}) + (1 - \beta) \times \tilde{f}(n_{i,j}) \tag{6.5}$$

After encoding the $j$th frame, the model coefficients are updated according to the actual generated bits, the Qp value used, and the MAD of the residual component obtained.
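The frame level allocation of Eqs. (6.3) to (6.5) can be sketched as a single function. The names are hypothetical and the input values are invented for illustration; in particular, the text above gives no value for the weight β, so it is left as a required argument and the 0.5 used in the example is only an assumption.

```python
def frame_target_bits(bit_rate, frame_rate, target_buffer_level, buffer_level,
                      gop_bits_remaining, w_p, w_b, n_p_remaining, n_b_remaining,
                      beta, gamma=0.75):
    """Target bits for one frame, combining Eqs. (6.3), (6.4) and (6.5).

    gamma defaults to 0.75, the typical value quoted for sequences without
    B-frames; beta has no default because the text does not specify one.
    """
    # Eq. (6.3): estimate driven by the buffer state.
    buffer_based = bit_rate / frame_rate + gamma * (target_buffer_level - buffer_level)
    # Eq. (6.4): estimate driven by the bits left in the GOP.
    budget_based = w_p * gop_bits_remaining / (w_p * n_p_remaining + w_b * n_b_remaining)
    # Eq. (6.5): weighted combination of the two estimates.
    return beta * budget_based + (1 - beta) * buffer_based


# Illustrative numbers only: no B-frames, 10 P-frames left in the GOP.
target = frame_target_bits(
    bit_rate=480_000, frame_rate=30,
    target_buffer_level=30_000, buffer_level=26_000,
    gop_bits_remaining=210_000,
    w_p=2.0, w_b=1.0, n_p_remaining=10, n_b_remaining=0,
    beta=0.5,  # assumed weight, chosen only for this example
)
print(target)  # 20000.0
```

Here the buffer based estimate is 19 000 bits and the budget based estimate is 21 000 bits, so an equal weighting gives 20 000 bits.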

3. Basic unit level rate control

If the basic unit is not a frame, basic unit level rate control needs to be performed. As for frame level rate control, basic unit level rate control comprises the following steps: 1) Target bit allocation for each basic unit. 2) Linear prediction of MAD. 3) Calculation of Qp. 4) Update of the model coefficients.
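Steps 2) and 4) can be illustrated with a small sketch of a linear MAD predictor. The class name, the sliding window length and the use of an ordinary least-squares fit for the coefficient update are assumptions made for this example, as is the starting point of slope 1 and offset 0; the target bit allocation and Qp calculation steps are not covered.

```python
class LinearMadPredictor:
    """Predicts the MAD of a basic unit as a1 * mad_ref + a2.

    mad_ref is the actual MAD of the co-located basic unit in the previous
    frame. The coefficients are refitted after each basic unit is coded.
    """

    def __init__(self, a1=1.0, a2=0.0, window=20):
        self.a1 = a1
        self.a2 = a2
        self.window = window      # assumed number of past samples to keep
        self.history = []         # (mad_ref, mad_actual) pairs

    def predict(self, mad_ref):
        """Step 2: linear prediction of the MAD before encoding."""
        return self.a1 * mad_ref + self.a2

    def update(self, mad_ref, mad_actual):
        """Step 4: refit a1 and a2 by least squares over recent samples."""
        self.history.append((mad_ref, mad_actual))
        self.history = self.history[-self.window:]
        n = len(self.history)
        if n < 2:
            return  # not enough data to fit a line
        sx = sum(x for x, _ in self.history)
        sy = sum(y for _, y in self.history)
        sxx = sum(x * x for x, _ in self.history)
        sxy = sum(x * y for x, y in self.history)
        denom = n * sxx - sx * sx
        if abs(denom) < 1e-12:
            return  # all reference MADs equal; keep the old coefficients
        self.a1 = (n * sxy - sx * sy) / denom
        self.a2 = (sy - self.a1 * sx) / n


predictor = LinearMadPredictor()
print(predictor.predict(5.0))  # 5.0 with the initial coefficients
```

Predicting the MAD from the previous frame is what breaks the circular dependency described earlier: the Qp can be chosen before the actual MAD of the current basic unit is known.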

6.1.2 Other Improved Algorithms

So far, most rate control algorithms have been developed for single layer video coding; however, several rate control algorithms have been suggested for SVC [98–104]. They consider either precise target bit allocation or the optimisation of the RD model.

The rate control algorithm in [98] operates in each spatial layer individually; that is, the dependency between spatial layers is not exploited explicitly. If the spatial correlation and other new features introduced by the layer-based structure of SVC can be adequately exploited, improved coding performance is expected.

Later, Xu et al. proposed a rate control algorithm for spatial and CGS scalable coding in SVC [99]. This method employs the improved TMN8 model for Qp estimation based on the mode analysis of I-, P-, and B-frames. The TMN8 rate control algorithm was developed for the H.263 video coding standard and is primarily used in low bit rate and low delay environments.

In [100] and [101], Liu et al. proposed that the MAD can be predicted from either the previous frame in the same layer or the corresponding frame in the base layer, through a switching law. It is shown that the previous temporal frames and the reference frame in the base layer can provide useful information for MAD prediction in the enhancement layer. In order to make full use of the available information, an improved MAD model needs to be developed, by which information from previous temporal frames and the reference frame in the base layer is considered together.

Hu et al. [102] proposed a frame level rate control algorithm for temporal scalability in SVC by developing a set of weighting factors for bit allocation. Their work focused on bit allocation among different temporal layers and did not support other scalability features. Subsequently, Hu et al. developed a Cauchy distribution-based Rate-Quantisation (R-Q) model for each spatial layer in [103]. However, it is arguable whether the Cauchy distribution-based R-Q model outperforms the classic Laplace distribution-based model. In [104], Liu et al. proposed a bit allocation algorithm for SVC in which the inter-layer dependency is taken into consideration. Although Liu et al.’s work provides a reasonable bit allocation mechanism, inter-layer correlation is not explicitly exploited in the optimisation of the RD model.
