Multistage vector quantization - A Parametric Approach for Efficient Speech Storage, Flexible S

2.3 Quantization

2.3.2 Multistage vector quantization

Vector quantization can be considered the best possible memoryless compression tool in the sense that no other memoryless coding scheme that maps a signal vector into one of N binary words can outperform it: there always exists a vector quantizer with codebook size N that provides at least the same accuracy [Ger92]. However, in many application scenarios, the related memory consumption and the computational complexity of the codebook search can often make direct use of the basic vector quantization approach impractical. Consequently, many alternative quantizer structures and search strategies have been proposed in the literature. Examples include split vector quantization, gain-shape quantization, binary search codebooks, and lattice vector quantization (see, e.g., [Ger92], [Gra84], and [Kon04] for more information on the different alternative quantizer structures and search strategies). From the viewpoint of this thesis, multistage vector quantization (MSVQ) [Jua82] is of particular interest because it offers an excellent tradeoff between performance and resource needs in terms of computational load and memory usage. MSVQ is also an attractive choice because it can be regarded as a generalization that covers many of the other alternatives. For example, split vector quantizers and gain-shape quantizers can be realized as special cases of the multistage vector quantization approach.
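As an illustration of the last point, a split VQ can be written as a multistage VQ whose stage codevectors are zero outside their own sub-vector. The following is a minimal sketch under that assumption; the sub-codebooks `split_a` and `split_b`, the input `x`, and the helper names are hypothetical and chosen only for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest(codebook, target):
    """Index of the codevector closest to target."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], target))

# Hypothetical split VQ for 4-dimensional vectors: two 2-dimensional sub-codebooks.
split_a = [[0.0, 0.0], [1.0, 1.0]]
split_b = [[0.0, 1.0], [2.0, 0.0]]

# Equivalent multistage codebooks: each sub-codevector is zero-padded to full dimension.
stage1 = [v + [0.0, 0.0] for v in split_a]
stage2 = [[0.0, 0.0] + v for v in split_b]

x = [0.9, 1.2, 1.7, 0.1]

# Split VQ: quantize each half independently and concatenate the results.
ia = nearest(split_a, x[:2])
ib = nearest(split_b, x[2:])
split_hat = split_a[ia] + split_b[ib]

# Multistage view: the same indices select stage codevectors that are summed.
ms_hat = [p + q for p, q in zip(stage1[ia], stage2[ib])]
```

Because the zero-padded stages do not overlap, the sum of the stage codevectors equals the concatenation produced by the split quantizer.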

A multistage VQ [Jua82] quantizes the vectors in two or more additive stages. The objective is to find a vector combination, in other words a sum of the selected vectors at different stages, that minimizes the resulting distortion. The quantized vector can be defined as

$$\hat{x} = \sum_{j=1}^{K} c^{(j)}_{l_j}, \tag{2.12}$$

where $c^{(j)}_m$ denotes the $m$th reproduction vector from the $j$th stage, $K$ is the total

Figure 2.3: Block diagram of a multistage vector quantizer using sequential search. (From [Nur01a].)

Codebook search in MSVQ

The use of multistage codebooks can drastically reduce the memory consumption, but the computational complexity of a multistage vector quantizer depends on the applied search algorithm. If full search is used, i.e., the distortion measure is calculated for every possible vector combination, the computational complexity is higher than with the normal unconstrained vector quantizer, due to the extra additions needed to sum up the codevectors from the different stages. However, an efficient search algorithm can significantly reduce the complexity.
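To make the memory and complexity argument concrete, the short sketch below counts stored values and distortion evaluations for a hypothetical configuration of four 6-bit stages and 10-dimensional vectors. The configuration and the function name are assumptions made for illustration, not values taken from this thesis.

```python
def storage_and_search_cost(bits_per_stage, dim):
    """Count stored values and distortion evaluations for an MSVQ layout."""
    sizes = [2 ** b for b in bits_per_stage]          # codevectors per stage
    unconstrained_vectors = 2 ** sum(bits_per_stage)  # single codebook, same bit rate
    combinations = 1
    for n in sizes:
        combinations *= n                             # full search tests every sum
    return {
        "unconstrained_values": unconstrained_vectors * dim,
        "msvq_values": sum(sizes) * dim,
        "full_search_combinations": combinations,
        "sequential_evaluations": sum(sizes),
    }

cost = storage_and_search_cost([6, 6, 6, 6], dim=10)
```

In this example the multistage structure stores 256 codevectors instead of 2^24, while a full search would still have to evaluate all 2^24 combinations.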

The simplest search algorithm for multistage quantization is the sequential search. The process begins with a full search quantization using only the codebook of the first stage. Then, the quantization error is calculated as the difference between the original vector and the quantized vector. After that, the error vector is quantized using only the second stage codebook and the resulting error vector is computed. This continues until the quantization for every stage has been performed. Finally, the quantized vector is the sum of the codevectors selected at the different stages. This procedure is illustrated in Figure 2.3.
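The sequential search described above can be sketched as follows. This is an illustrative implementation that assumes the squared Euclidean distance as the distortion measure; the function names and the toy codebooks are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def sequential_search(x, codebooks):
    """Quantize x stage by stage, each stage coding the previous residual."""
    residual = list(x)
    x_hat = [0.0] * len(x)
    indices = []
    for cb in codebooks:
        # Full search within the current stage only.
        best = min(range(len(cb)), key=lambda i: sq_dist(cb[i], residual))
        indices.append(best)
        x_hat = [h + c for h, c in zip(x_hat, cb[best])]
        residual = [r - c for r, c in zip(residual, cb[best])]
    return indices, x_hat

# Toy two-stage example: a coarse first stage and a fine second stage.
codebooks = [
    [[0.0, 0.0], [4.0, 4.0]],
    [[-1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
]
indices, x_hat = sequential_search([4.8, 4.1], codebooks)
```

The returned indices are the channel symbols of the stages, and `x_hat` is the sum of the selected codevectors.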

The sequential search algorithm is simple, but the resulting quantization performance is rather poor. A better choice is the M-L tree search depicted in Figure 2.4, in which the M best vector combinations are retained at each stage. That is, at the first stage, the M vectors that result in the lowest distortion are selected. Then, at the second stage, each reproduction vector is combined with each of the M vectors selected at the first stage, and again the M paths that achieve the lowest overall distortion are selected. This is carried out for all stages. Finally, the full path with the lowest distortion determines the channel symbols for each stage.

It is easy to see that setting M = 1 corresponds to a sequential search. Naturally, it is beneficial to use a larger M in the search algorithm, since larger values of M usually lead to smaller overall distortion. However, it has been found that the M-L tree search achieves performance close to that of the full search with a relatively small M [LeB93].
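The M-L tree search can be sketched in a few lines. The code below is an illustration that assumes the squared Euclidean distance as the distortion measure; the function names and the toy codebooks are hypothetical and were chosen so that M = 1 (sequential search) and M = 2 give different results.

```python
def sq_dist(a, b):
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def ml_tree_search(x, codebooks, M):
    """Keep the M lowest-distortion partial paths after every stage."""
    # Each path is (distortion, stage indices so far, partial reconstruction).
    paths = [(0.0, [], [0.0] * len(x))]
    for cb in codebooks:
        candidates = []
        for _, idx, partial in paths:
            for i, c in enumerate(cb):
                recon = [p + v for p, v in zip(partial, c)]
                candidates.append((sq_dist(x, recon), idx + [i], recon))
        candidates.sort(key=lambda t: t[0])
        paths = candidates[:M]          # prune to the M best paths
    dist, indices, x_hat = paths[0]
    return indices, x_hat, dist

x = [1.9, 0.0]
codebooks = [
    [[0.0, 0.0], [4.0, 0.0]],
    [[-1.5, 0.0], [3.0, 0.0]],
]
seq_result = ml_tree_search(x, codebooks, M=1)   # behaves like sequential search
tree_result = ml_tree_search(x, codebooks, M=2)  # keeps both first-stage vectors
```

In this toy case the first-stage vector that looks worse on its own leads to the better final reconstruction, which is why keeping more than one path lowers the distortion.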

Figure 2.4: Example of M-L tree search procedure with M = 4 in a 4-stage VQ. (From [Nur01a].)

Codebook training for MSVQ

Training the codebooks is more complicated for a multistage VQ than for a conventional VQ because the final reproduction vector depends on the codebooks of all the stages. The simplest applicable training method is to train the codebooks sequentially. In this approach, the codebook for the first stage is computed in a traditional manner using, e.g., the generalized Lloyd algorithm. Next, the training data is quantized with this one-stage VQ and the quantization error vectors are calculated. Then, the codebook for the second stage is trained using these error vectors as the training data. This is repeated for all the stages, with each new codebook trained on the error between the original vector and the reconstruction vector that includes all the previous stages. The training procedure is terminated when the codebooks for all the stages have been computed.
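A minimal sketch of sequential training is given below. It assumes the squared Euclidean distance, a plain generalized Lloyd iteration with a fixed number of passes, and initialization from randomly drawn training vectors; these choices, the function names, and the synthetic Gaussian training data are illustrative assumptions rather than the setup used in this thesis.

```python
import random

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest(codebook, v):
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))

def lloyd(data, size, iters=20, seed=0):
    """Generalized Lloyd algorithm with random-sample initialization."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(data, size)]
    for _ in range(iters):
        cells = [[] for _ in codebook]
        for v in data:                       # nearest-neighbor partition
            cells[nearest(codebook, v)].append(v)
        for i, cell in enumerate(cells):     # centroid update
            if cell:                         # an empty cell keeps its old vector
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook

def train_sequential(data, sizes, iters=20):
    """Train one codebook per stage on the residuals of the previous stages."""
    codebooks = []
    residuals = [list(v) for v in data]
    for size in sizes:
        cb = lloyd(residuals, size, iters)
        codebooks.append(cb)
        residuals = [
            [r - c for r, c in zip(v, cb[nearest(cb, v)])] for v in residuals
        ]
    return codebooks, residuals

rng = random.Random(1)
data = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
codebooks, residuals = train_sequential(data, sizes=[4, 4])
```

The returned `residuals` are the final quantization errors of the training vectors, so their mean energy gives the training-set distortion of the two-stage quantizer.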

The basic sequential training method is simple, but unfortunately the resulting codebooks are only sub-optimal with respect to the overall performance [Cha92]. The algorithm fails to efficiently exploit the inter-stage dependencies in the codebook optimization. The performance of the sequential training can be improved by making two modifications to the algorithm. Firstly, the error vectors can be calculated as the error between the original vector and the multistage reproduction vector that includes all the stages except the current one (the stage whose codebook is currently being trained). Secondly, the algorithm is repeated until the relative change in the distortion is low enough or the total number of repetitions has reached a certain predetermined limit. The resulting algorithm is referred to as the joint design of the stage codebooks [Cha92].
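The two modifications can be sketched as an iterative refinement of existing stage codebooks. The code below is an illustration only and is not claimed to reproduce the algorithm of [Cha92] in detail: it assumes the squared Euclidean distance, starts from a sequential encoding, and re-selects only the index of the stage being trained before each centroid update. All names, the initial codebooks, and the synthetic training data are hypothetical.

```python
import random

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest(codebook, v):
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))

def sub(a, b):
    return [p - q for p, q in zip(a, b)]

def joint_refine(data, codebooks, max_passes=20, tol=1e-4):
    codebooks = [[list(c) for c in cb] for cb in codebooks]
    K = len(codebooks)
    indices = []                      # initial stage indices from sequential search
    for x in data:
        r, idx = list(x), []
        for cb in codebooks:
            idx.append(nearest(cb, r))
            r = sub(r, cb[idx[-1]])
        indices.append(idx)

    def target(x, idx, skip):
        """x minus the selected codevectors of every stage except `skip`."""
        t = list(x)
        for k in range(K):
            if k != skip:
                t = sub(t, codebooks[k][idx[k]])
        return t

    def distortion():
        return sum(sum(v * v for v in sub(target(x, idx, 0), codebooks[0][idx[0]]))
                   for x, idx in zip(data, indices)) / len(data)

    history = [distortion()]
    for _ in range(max_passes):
        for j in range(K):
            targets = [target(x, idx, j) for x, idx in zip(data, indices)]
            cells = [[] for _ in codebooks[j]]
            for t, idx in zip(targets, indices):
                idx[j] = nearest(codebooks[j], t)    # re-select stage j only
                cells[idx[j]].append(t)
            for i, cell in enumerate(cells):         # centroid update for stage j
                if cell:
                    codebooks[j][i] = [sum(col) / len(cell) for col in zip(*cell)]
        history.append(distortion())
        if history[-2] - history[-1] <= tol * history[-2]:
            break                                    # relative change is small enough
    return codebooks, history

rng = random.Random(2)
data = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
initial = [data[:4], [[0.5 * c for c in v] for v in data[4:8]]]
refined, history = joint_refine(data, initial)
```

Under these assumptions each stage update can only lower or preserve the training-set distortion, so `history` is non-increasing and the stopping rule matches the two termination conditions stated above.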

The joint codebook design algorithm offers a performance improvement over the traditional sequential codebook training. However, the improvement is rather modest [Cha92] and the codebook optimization is still performed for one codebook at a time. Furthermore, the convergence of the algorithm is quite slow. The simultaneous joint design algorithm proposed in [LeB93] offers yet another step towards better performance and faster convergence. The basic idea in this method is to jointly optimize the codebooks after each pass over the training sequence. The resulting multistage VQ simultaneous joint design algorithm [LeB93] will be used extensively in this thesis.

The simultaneous joint design algorithm is usually initialized with sequentially designed random codebooks. In theory, it is assumed that the quantization is performed using full search. However, it has been experimentally found that good performance can be achieved by employing the M-L tree search with a moderate value of M [LeB93]. This is partially enabled by the fact that the codebooks are reordered at each training iteration in such a manner that the energy at any given stage after subtracting the codebook mean is less than the corresponding energies at all the previous stages. See [LeB93] for more detailed information on the algorithm.
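The reordering step can be illustrated with a small sketch. It assumes that the stage energy is the unweighted average squared norm of the codevectors after removing the codebook mean; this definition and the function names are assumptions made for the example, and the exact criterion should be taken from [LeB93]. Since the reproduction vector is a sum, changing the order of the stages does not change the set of possible reproduction vectors.

```python
def mean_removed_energy(codebook):
    """Average squared norm of the codevectors after subtracting their mean."""
    dim = len(codebook[0])
    mean = [sum(c[d] for c in codebook) / len(codebook) for d in range(dim)]
    return sum(
        sum((c[d] - mean[d]) ** 2 for d in range(dim)) for c in codebook
    ) / len(codebook)

def reorder_stages(codebooks):
    """Place higher-energy stages first, so that energy decreases along the stages."""
    return sorted(codebooks, key=mean_removed_energy, reverse=True)

# Hypothetical stage codebooks given in an arbitrary order.
codebooks = [
    [[0.1, 0.0], [-0.1, 0.0]],
    [[2.0, 0.0], [-2.0, 0.0]],
    [[1.0, 1.0], [0.0, 0.0]],
]
ordered = reorder_stages(codebooks)
energies = [mean_removed_energy(cb) for cb in ordered]
```

With high-energy stages first, the early decisions of a tree search concern the largest contributions to the reconstruction.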

In document A Parametric Approach for Efficient Speech Storage, Flexible Synthesis and Voice Conversion (Page 37-40)