Separation by segmentation - Implementations to use smaller FFTs for high sensitivity

6.3 Implementations to use smaller FFTs

6.3.2 Separation by segmentation

In this section, we examine the results obtained when the input and output samples are separated into sections (segmentation). We first segment by a factor N_S, and then by a factor N_P.

Segmentation by a factor N_S using the equation with the matrix H

By segmenting the sequences by a factor N_S, each sequence is separated into N_S sub-sequences of N_P samples, and Eq. (6.4) can be expressed as



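\[
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix}
=
\left[\begin{array}{cccc|cccc|cccc}
s_0p_0 & s_0p_1 & s_0p_2 & s_0p_3 & s_1p_0 & s_1p_1 & s_1p_2 & s_1p_3 & s_2p_0 & s_2p_1 & s_2p_2 & s_2p_3 \\
s_2p_3 & s_0p_0 & s_0p_1 & s_0p_2 & s_0p_3 & s_1p_0 & s_1p_1 & s_1p_2 & s_1p_3 & s_2p_0 & s_2p_1 & s_2p_2 \\
s_2p_2 & s_2p_3 & s_0p_0 & s_0p_1 & s_0p_2 & s_0p_3 & s_1p_0 & s_1p_1 & s_1p_2 & s_1p_3 & s_2p_0 & s_2p_1 \\
s_2p_1 & s_2p_2 & s_2p_3 & s_0p_0 & s_0p_1 & s_0p_2 & s_0p_3 & s_1p_0 & s_1p_1 & s_1p_2 & s_1p_3 & s_2p_0 \\
\hline
s_2p_0 & s_2p_1 & s_2p_2 & s_2p_3 & s_0p_0 & s_0p_1 & s_0p_2 & s_0p_3 & s_1p_0 & s_1p_1 & s_1p_2 & s_1p_3 \\
s_1p_3 & s_2p_0 & s_2p_1 & s_2p_2 & s_2p_3 & s_0p_0 & s_0p_1 & s_0p_2 & s_0p_3 & s_1p_0 & s_1p_1 & s_1p_2 \\
s_1p_2 & s_1p_3 & s_2p_0 & s_2p_1 & s_2p_2 & s_2p_3 & s_0p_0 & s_0p_1 & s_0p_2 & s_0p_3 & s_1p_0 & s_1p_1 \\
s_1p_1 & s_1p_2 & s_1p_3 & s_2p_0 & s_2p_1 & s_2p_2 & s_2p_3 & s_0p_0 & s_0p_1 & s_0p_2 & s_0p_3 & s_1p_0 \\
\hline
s_1p_0 & s_1p_1 & s_1p_2 & s_1p_3 & s_2p_0 & s_2p_1 & s_2p_2 & s_2p_3 & s_0p_0 & s_0p_1 & s_0p_2 & s_0p_3 \\
s_0p_3 & s_1p_0 & s_1p_1 & s_1p_2 & s_1p_3 & s_2p_0 & s_2p_1 & s_2p_2 & s_2p_3 & s_0p_0 & s_0p_1 & s_0p_2 \\
s_0p_2 & s_0p_3 & s_1p_0 & s_1p_1 & s_1p_2 & s_1p_3 & s_2p_0 & s_2p_1 & s_2p_2 & s_2p_3 & s_0p_0 & s_0p_1 \\
s_0p_1 & s_0p_2 & s_0p_3 & s_1p_0 & s_1p_1 & s_1p_2 & s_1p_3 & s_2p_0 & s_2p_1 & s_2p_2 & s_2p_3 & s_0p_0
\end{array}\right]
\begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix}
\]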



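or in a more concise way

\[
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix}
H_0 & H_1 & H_2 \\
H_2 & H_0 & H_1 \\
H_1 & H_2 & H_0
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix}
\]

Each matrix H_i contains only two secondary code chips: one is present on and above the diagonal, while the other is present below the diagonal. This means that a matrix H_i is, up to its sign, either circulant or skew-circulant (see Appendix A.2). There are thus only two possible matrices. For example, if we consider that s_0 = 1, s_1 = 1, and s_2 = −1, and denote by C and S the circulant and skew-circulant matrices built from the primary code, Eq. (6.20) becomes

\[
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix}
S & C & -S \\
-S & S & C \\
C & -S & S
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix}
\]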

Implementing these operations using FFTs requires

• 2 FFTs of N_P points, for the sequence p_n,

• 2N_S FFTs of N_P points, for the combinations of the sequences x_{i,n},

• 2N_S IFFTs of N_P points, for the sequences y_{i,n},

• 2N_S products of N_P points, for the matrix-vector products,

• 2N_S + 1 products of N_P points, for the skew-circular correlations,

• N_S(N_S − 2) additions of N_P points, for the combinations of the sequences x_{i,n},

• N_S additions of N_P points, to obtain the sequences y_{i,n}.
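The products listed for the skew-circular correlations can be illustrated with a minimal NumPy sketch. It assumes a skew-circulant block whose row n holds the sequence p shifted right by n, with the sign flipped below the diagonal, and it uses the standard modulation by exp(jπn/N) to turn the skew-circular correlation into a circular one. The function names and the sequence `v` are illustrative and do not come from the source; one plausible reading of the 2N_S + 1 count is one pre-multiplication for p, N_S for the inputs, and N_S post-multiplications for the outputs.

```python
import numpy as np

def skew_circulant_matrix(p):
    """Row n is p shifted right by n, with the sign flipped below the diagonal."""
    N = len(p)
    n, k = np.indices((N, N))
    sign = np.where(k >= n, 1.0, -1.0)
    return sign * p[(k - n) % N]

def skew_circular_correlation_fft(p, x):
    """Same product as skew_circulant_matrix(p) @ x, using N-point FFTs."""
    N = len(p)
    # v[n] = exp(j*pi*n/N); the base exp(j*pi/N) raised to the power N equals -1.
    v = np.exp(1j * np.pi * np.arange(N) / N)
    p_mod = p * v                # pre-multiplication of the code
    x_mod = x * np.conj(v)       # pre-multiplication of the input
    # Circular correlation of the modulated sequences in the frequency domain.
    P = np.conj(np.fft.fft(np.conj(p_mod)))
    c = np.fft.ifft(P * np.fft.fft(x_mod))
    return v * c                 # post-multiplication of the output

rng = np.random.default_rng(0)
p = rng.choice([-1.0, 1.0], size=8)   # hypothetical primary code chips
x = rng.standard_normal(8)            # hypothetical input section
direct = skew_circulant_matrix(p) @ x
fast = skew_circular_correlation_fft(p, x)
print(np.allclose(direct, fast))
```

The comparison against the explicit matrix product is only a sanity check on small sizes; in practice only the FFT path would be used.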

Therefore, the number of multiplications is

\[
N_{mul} = (4N_S + 2)\left(\frac{N_P}{2}\log_2(N_P)\right) + (4N_S + 1)N_P,
\]

and the number of additions is

\[
N_{add} = (4N_S + 2)\left(N_P\log_2(N_P)\right) + N_S(N_S - 1)N_P,
\]

which means an increase of 15.3 % for the multiplications and 39.0 % for the additions, compared to the direct implementation. Compared to the downsampling by a factor N_S, the number of FFTs is higher (2 + 4N_S instead of 3N_S), but the number of multiplications for the matrix-vector products is much lower (2N_S instead of N_S²). This explains the lower increase for the multiplications and the higher increase for the additions.
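The percentages quoted above can be checked with a short sketch. It assumes that an n-point FFT costs (n/2) log₂(n) multiplications and n log₂(n) additions, and that the direct implementation uses three FFTs of N_S N_P points plus one point-wise product; these cost models are assumptions inferred from the figures in the text, not statements taken from it.

```python
from math import log2

N_P, N_S = 20460, 20

def fft_mul(n):
    """Assumed number of multiplications for one n-point FFT."""
    return n / 2 * log2(n)

def fft_add(n):
    """Assumed number of additions for one n-point FFT."""
    return n * log2(n)

# Segmentation by N_S with the matrix H, counted from the list of operations.
n_fft = 2 + 2 * N_S + 2 * N_S            # FFTs and IFFTs of N_P points
n_prod = 2 * N_S + (2 * N_S + 1)         # point-wise products of N_P points
mul_seg = n_fft * fft_mul(N_P) + n_prod * N_P
add_seg = n_fft * fft_add(N_P) + N_S * (N_S - 1) * N_P

# Direct implementation (assumed: 3 FFTs of N points and one product).
N = N_S * N_P
mul_dir = 3 * fft_mul(N) + N
add_dir = 3 * fft_add(N)

inc_mul = 100 * (mul_seg / mul_dir - 1)
inc_add = 100 * (add_seg / add_dir - 1)
print(f"multiplications: +{inc_mul:.1f} %, additions: +{inc_add:.1f} %")
```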

Here, the repetitions can be exploited, but having both circulant and skew-circulant matrices doubles the number of FFTs.
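The block structure can be checked numerically on a small case. The sketch below assumes N_S = 3 and N_P = 4, the secondary code s_0 = 1, s_1 = 1, s_2 = −1 from the example in the text, and a hypothetical primary code `p`; the labels "C" and "S" for the circulant and skew-circulant blocks are illustrative.

```python
import numpy as np

N_S, N_P = 3, 4
s = np.array([1, 1, -1])        # secondary code from the example in the text
p = np.array([1, -1, 1, 1])     # hypothetical primary code
c = np.kron(s, p)               # tiered code: c[i*N_P + j] = s[i] * p[j]

# Row n of H is the tiered code circularly shifted right by n samples.
H = np.array([np.roll(c, n) for n in range(N_S * N_P)])

def circulant(seq):
    return np.array([np.roll(seq, n) for n in range(len(seq))])

C = circulant(p)
S = np.triu(C) - np.tril(C, -1)   # sign flipped strictly below the diagonal

def classify(block):
    """Return which of the four candidate matrices the block equals."""
    for name, m in (("C", C), ("-C", -C), ("S", S), ("-S", -S)):
        if np.array_equal(block, m):
            return name
    return "?"

layout = [[classify(H[i*N_P:(i+1)*N_P, j*N_P:(j+1)*N_P]) for j in range(N_S)]
          for i in range(N_S)]
for row in layout:
    print(row)
```

Every block is recognised as plus or minus one of the two matrices, which is why both a circulant and a skew-circulant FFT path are needed.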

Segmentation by a factor N_S using the equation with the matrix X

By segmenting the sequences by a factor N_S, each sequence is separated into N_S sub-sequences of N_P samples, and Eq. (6.5) can be expressed as



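or in a more concise way

\[
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix}
s_0X_0 & s_1X_1 & s_2X_2 \\
s_0X_1 & s_1X_2 & s_2X_0 \\
s_0X_2 & s_1X_0 & s_2X_1
\end{bmatrix}
\begin{bmatrix} p \\ p \\ p \end{bmatrix},
\tag{6.27}
\]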

where the matrices Xi are Hankel (see Appendix A.2.4). Hankel matrices can be embedded into circulant matrices of double size.
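As an illustration of this embedding, the following minimal sketch computes the product of an N×N Hankel matrix with a vector using FFTs of length 2N. It assumes that the Hankel block has entries h[row + col] and that p is real; the helper names are illustrative, and the construction in Appendix A.2.4 may differ in its details.

```python
import numpy as np

def hankel_matrix(h, n):
    """n x n Hankel matrix with entries h[row + col]; h holds 2n-1 samples."""
    r, c = np.indices((n, n))
    return h[r + c]

def hankel_product_fft(h, p):
    """hankel_matrix(h, len(p)) @ p using FFTs of length 2*len(p); p assumed real."""
    n = len(p)
    m = 2 * n
    H = np.fft.fft(h, m)             # h zero-padded from 2n-1 to 2n samples
    P = np.fft.fft(p, m)             # p zero-padded from n to 2n samples
    # Correlation c[k] = sum_j p[j] * h[k + j]; no wrap-around occurs for k < n.
    c = np.fft.ifft(H * np.conj(P))
    return c[:n]

rng = np.random.default_rng(1)
n = 6
h = rng.standard_normal(2 * n - 1)     # hypothetical input samples
p = rng.choice([-1.0, 1.0], size=n)    # hypothetical primary code chips
direct = hankel_matrix(h, n) @ p
fast = hankel_product_fft(h, p)
print(np.allclose(direct, fast))
```

Only the first n outputs of the length-2n result are kept, which is the price paid for the doubled FFT length.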

Thus, implementing these operations using FFTs requires

• 1 FFT of 2N_P points, for the sequence p_n,

• N_S FFTs of 2N_P points, for the sequences x_{i,n},

• N_S IFFTs of 2N_P points, for the sequences y_{i,n},

• N_S products of 2N_P points, for the matrix-vector products,

• N_S(N_S − 1) additions of N_P points, for the combinations of the results of the matrix-vector products.

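Therefore, the number of multiplications is

\[
\begin{aligned}
N_{mul} &= (2N_S + 1)\left(\frac{2N_P}{2}\log_2(2N_P)\right) + N_SN_P \\
&= N_PN_S\left(2\log_2(N_P) + 3 + \frac{\log_2(2N_P)}{N_S}\right),
\end{aligned}
\tag{6.28}
\]

and the number of additions is

\[
\begin{aligned}
N_{add} &= (2N_S + 1)\left(2N_P\log_2(2N_P)\right) + N_S(N_S - 1)N_P \\
&= N_PN_S\left(4\log_2(N_P) + N_S + 3 + \frac{2\log_2(2N_P)}{N_S}\right).
\end{aligned}
\tag{6.29}
\]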

Considering N_P = 20 460 and N_S = 20, this gives

\[
\begin{aligned}
N_{mul} &\approx 409\,200\,(31.64) + 313\,457.81 \approx 13\,260\,970 \\
N_{add} &\approx 409\,200\,(80.28) + 626\,915.62 \approx 33\,478\,340,
\end{aligned}
\tag{6.30}
\]

which means an increase of 11.9 % for the multiplications and 46.3 % for the additions, compared to the traditional implementation.
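As a quick numerical check, the sketch below evaluates both the expanded and the factored forms of Eqs. (6.28) and (6.29) for N_P = 20 460 and N_S = 20. The variable names are illustrative; the formulas are transcribed from the equations above.

```python
from math import isclose, log2

N_P, N_S = 20460, 20

# Eq. (6.28): expanded and factored forms of the multiplication count.
mul_expanded = (2 * N_S + 1) * (2 * N_P / 2 * log2(2 * N_P)) + N_S * N_P
mul_factored = N_P * N_S * (2 * log2(N_P) + 3 + log2(2 * N_P) / N_S)

# Eq. (6.29): expanded and factored forms of the addition count.
add_expanded = (2 * N_S + 1) * (2 * N_P * log2(2 * N_P)) + N_S * (N_S - 1) * N_P
add_factored = N_P * N_S * (4 * log2(N_P) + N_S + 3 + 2 * log2(2 * N_P) / N_S)

# The two forms of each count should agree up to rounding errors.
print(isclose(mul_expanded, mul_factored), isclose(add_expanded, add_factored))
print(round(mul_expanded), round(add_expanded))
```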

Compared to the segmentation by a factor N_S using the matrix H, the number of FFTs is halved but the FFT length is doubled (which requires slightly more operations), and the additional products required by the skew-circular correlations are no longer needed. This explains the lower increase for the multiplications and the higher increase for the additions.

Here, the repetitions can be exploited, but the Hankel structure of the matrices doubles the length of the FFTs.

Segmentation by a factor N_P using the equation with the matrix H

By segmenting the sequences by a factor N_P, each sequence is separated into N_P sub-sequences of N_S samples, and Eq. (6.4) can be expressed as



or in a more concise way



where the matrices H_i are Toeplitz (see Appendix A.2.3). Toeplitz matrices can be embedded into circulant matrices of double size. Thus, implementing these operations using FFTs requires

• N_P FFTs of 2N_S points, for the sequences h_{i,n},

• N_P FFTs of 2N_S points, for the sequences x_{i,n},

• N_P IFFTs of 2N_S points, for the sequences y_{i,n},

• N_P products of 2N_S points, for the matrix-vector products,

• N_P(N_P − 1) additions of N_S points, for the combinations of the results of the matrix-vector products.
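The FFTs of 2N_S points in the list above follow from embedding each Toeplitz block into a circulant matrix of double size. A minimal sketch of this standard embedding is given below; the helper names and the way the block is described (by its first column and first row) are assumptions made for illustration, not the notation of Appendix A.2.3.

```python
import numpy as np

def toeplitz_matrix(first_col, first_row):
    """Toeplitz matrix: entry (r, c) depends only on r - c."""
    n = len(first_col)
    r, c = np.indices((n, n))
    return np.where(r >= c, first_col[r - c], first_row[c - r])

def toeplitz_product_fft(first_col, first_row, x):
    """toeplitz_matrix(first_col, first_row) @ x using FFTs of length 2*len(x)."""
    n = len(x)
    # First column of the 2n x 2n circulant embedding:
    # the Toeplitz first column, one zero, then the first row reversed (without its head).
    g = np.concatenate([first_col, [0.0], first_row[:0:-1]])
    y = np.fft.ifft(np.fft.fft(g) * np.fft.fft(x, 2 * n))
    return y[:n]                      # only the first n outputs are meaningful

rng = np.random.default_rng(2)
n = 5
col = rng.standard_normal(n)          # hypothetical first column
row = rng.standard_normal(n)          # hypothetical first row
row[0] = col[0]                       # both share the diagonal element
x = rng.standard_normal(n)
direct = toeplitz_matrix(col, row) @ x
fast = toeplitz_product_fft(col, row, x)
print(np.allclose(direct, fast))
```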

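Therefore, the number of multiplications is

\[
\begin{aligned}
N_{mul} &= 3N_P\left(N_S\log_2(2N_S)\right) + N_PN_S \\
&= N_PN_S\left(3\log_2(2N_S) + 1\right),
\end{aligned}
\tag{6.33}
\]

and the number of additions is

\[
\begin{aligned}
N_{add} &= 3N_P\left(2N_S\log_2(2N_S)\right) + N_P(N_P - 1)N_S \\
&= N_PN_S\left(6\log_2(2N_S) + N_P - 1\right).
\end{aligned}
\tag{6.34}
\]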

Considering N_P = 20 460 and N_S = 20, this gives

\[
\begin{aligned}
N_{mul} &\approx 409\,200\,(16.97) \approx 6\,942\,399 \\
N_{add} &\approx 409\,200\,(20\,490.93) \approx 8\,384\,889\,198,
\end{aligned}
\tag{6.35}
\]

which means a reduction of 41.4 % for the multiplications but an increase of 36 538.5 % for the additions, compared to the direct implementation.
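To see where the additions come from, the sketch below evaluates Eqs. (6.33) and (6.34) and splits the addition count into the part due to the FFTs and the part due to combining the matrix-vector products. The variable names are illustrative.

```python
from math import log2

N_P, N_S = 20460, 20

# Eq. (6.33): multiplications.
n_mul = N_P * N_S * (3 * log2(2 * N_S) + 1)

# Eq. (6.34): additions, split into its two terms.
add_fft = 3 * N_P * (2 * N_S * log2(2 * N_S))   # additions inside the FFTs and IFFTs
add_comb = N_P * (N_P - 1) * N_S                # additions combining the products
n_add = add_fft + add_comb

share = add_comb / n_add
print(round(n_mul), round(n_add))
print(f"share of additions due to the combinations: {100 * share:.2f} %")
```

With these values, almost all of the additions come from the N_P(N_P − 1) combinations rather than from the FFTs themselves.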

Compared to the downsampling by a factor N_P, the number of FFTs is higher and the FFT length is doubled, which explains the higher number of multiplications. The large increase in the additions is again due to the N_P(N_P − 1) additions needed to combine the results of the matrix-vector products.

Here, each matrix H_i contains all the secondary code chips, so the repetitions in the tiered code cannot be exploited. This means that the same complexity would be obtained with any signal.

Segmentation by a factor N_P using the equation with the matrix X

Performing the same separation as previously using Eq. (6.5) leads to a similar implementation with the same complexity; therefore, we do not give the details.

6.3.3 Summary

Table 6.1 provides a summary of the different implementations according to the separation (downsampling or segmentation), the factor (N_S or N_P) and the equation used (Eq. 6.4 with H as matrix or Eq. 6.5 with X as matrix). It can be seen that using a factor N_P is not efficient because of the prohibitive number of additions, even if we can exploit the repetition of the primary code.

The most efficient implementations (downsampling and segmentation by a factor N_S) have similar performance. However, for a hardware implementation, the implementation obtained by segmentation is more attractive. Indeed, for each section of the output, only N_S products must be performed between the FFTs, and the storage of intermediate values is relatively small (the result of one product), whereas the implementation obtained by downsampling requires N_S² products and much more storage, as already shown for the case N_S = 2 in Section 4.3.6.

Factor                        N_S         N_S         N_P         N_P

Matrix used                   H           X           H           X

Separation by downsampling

  Matrices obtained           Circulant   Circulant   Circulant   Circulant

Table 6.1: Summary of the implementations according to the separation and separation factor.

The complexity is given in comparison to the direct implementation (Fig. 6.1), for the number of multiplications and additions.

In document Resource-efficient parallel acquisition architectures for modernized GNSS signals (Page 163-169)