IV. Results
4.2.1 Binary ADC Results
The results of the eight-step test discussed in Section 3.3.1 are summarized in Table 4.1. The procedure began by recreating the successful results that Lievsay published in [18]. The processing time was 𝑇𝑝 = 42 minutes on the fastest computer used by Lievsay. This test, however, used a newer computer, and processing took only 𝑇𝑝 = 37.5 minutes using his data and algorithm. After optimizing Lievsay’s algorithm by eliminating superfluous variables and changing the transmit and receive signal variables from double to single precision in MATLAB®, the processing time improved to 𝑇𝑝 = 15.5 minutes. As expected, using single precision had no impact on the accuracy of the results.
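The effect of the precision change can be illustrated with a minimal sketch. It is written in Python rather than MATLAB®, and the sample values and function names are hypothetical, not taken from Lievsay's code. It shows only the two properties the step relies on: single precision halves the storage per sample, and the rounding error it introduces is on the order of one part in ten million.

```python
from array import array


def storage_bytes(samples, typecode):
    """Bytes needed to hold `samples` as a packed array of the given typecode."""
    packed = array(typecode, samples)
    return packed.itemsize * len(packed)


def max_relative_rounding_error(samples):
    """Largest relative error from a round trip through single precision."""
    as_single = array("f", samples)  # each value is rounded to a 32-bit float
    return max(
        abs(s - x) / abs(x) for s, x in zip(as_single, samples) if x != 0.0
    )


# Hypothetical sample values, for illustration only.
samples = [0.1, -0.7, 0.333, 12.5, -3.0e-3]
bytes_double = storage_bytes(samples, "d")  # 64-bit floats
bytes_single = storage_bytes(samples, "f")  # 32-bit floats
worst_error = max_relative_rounding_error(samples)
```

On common platforms `bytes_double` is twice `bytes_single`, and `worst_error` stays below about 6e-8, which is far smaller than the quantization error of the ADC itself.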
To simulate the output of a binary ADC, only the sign of the transmit and receive variables was used for the next step. As hypothesized, the results were consistent with Lievsay’s [18]. Although the memory requirements and processing time had improved by this point, the processing time was still prohibitively long, so further optimization was required.
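The sign-only step can be sketched as follows. This is a simplified, noise-free illustration in Python, not the thesis code: the waveform is a hypothetical random sequence, the delay is a circular shift, and all function names are assumptions. It shows that the correlation peak falls at the same lag whether the full-precision samples or only their signs are correlated.

```python
import random


def sign(x):
    """One-bit quantizer: +1 for non-negative samples, -1 otherwise."""
    return 1.0 if x >= 0.0 else -1.0


def circular_xcorr_peak(reference, received):
    """Return the lag that maximizes the circular cross-correlation."""
    n = len(reference)
    best_lag, best_val = 0, float("-inf")
    for lag in range(n):
        val = sum(reference[k] * received[(k + lag) % n] for k in range(n))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


rng = random.Random(0)
n, true_delay = 256, 17  # hypothetical signal length and delay in samples
tx = [rng.gauss(0.0, 1.0) for _ in range(n)]
rx = [tx[(k - true_delay) % n] for k in range(n)]  # delayed copy, no noise

peak_full = circular_xcorr_peak(tx, rx)
peak_sign = circular_xcorr_peak([sign(v) for v in tx], [sign(v) for v in rx])
```

Both `peak_full` and `peak_sign` equal `true_delay` here. A direct O(N²) correlation is used only to keep the sketch dependency-free; the processing described in this section uses FFT-based correlation.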
Table 4.1. Summary of binary ADC test results
Step  Description             Peak Memory (GB)  Time (minutes)  Consistent?
1     Lievsay’s results [18]  40                37.5            Yes
2     Minor mods              28                22.3            Yes
3     Single precision        21                15.5            Yes
4     Signs Only              21                15.5            Yes
5     Replaced interp1        15.5              9.35            Yes
6     Parallel                42                5.38            Yes
7     FFT points              35                4.7             Yes*
8     GPU feasibility         –                 –               –
* Slight decrease to velocity resolution
MATLAB®’s interp1 function was used by Lievsay [18] to generate a bank of reference signals and was the most time-consuming line in the code, necessitating an alternative. The many features built into interp1 make it versatile across applications, but they also slow its processing speed. For the purposes of this algorithm, a simple index shift to the transmitted signal based on the target’s relative radial velocity is all that is needed to create the bank of reference signals. To understand the index shifts required, consider the time signal in Figure 4.1. The simulated transmit signal is plotted as a dotted line and the simulated receive signal as a dashed line. The receive signal measurement window is shortened at the first sample index by Δ𝑡, at the second sample index by 2Δ𝑡, and so on. The received signal is compressed as a result of the target’s relative radial velocity as detailed in (2.23); however, the received signal is still sampled at the same sample time as the transmitted signal, 𝑇𝑠. The number of index shifts expected in the receive signal is equal to
$$L = \left\lceil \frac{N\,|\Delta t|}{T_s} \right\rceil, \tag{4.1}$$
where 𝐿 is the number of shifts, 𝑁 is the number of samples in the signal, and ⌈⋅⌉ represents the ceiling, or rounding of a non-integer up to the next integer. For an inbound target, the first ⌈𝑇𝑠/(2∣Δ𝑡∣)⌉ samples of the reference signal will be mapped to the same sample values as the transmitted signal. The next ⌊𝑇𝑠/∣Δ𝑡∣ − 𝑇𝑠⌋ samples of the reference signal, where ⌊⋅⌋ represents rounding down to the next lower integer, will instead be mapped to the respective transmitted signal values plus one sample. Similarly, the next ⌊𝑇𝑠/∣Δ𝑡∣ − 𝑇𝑠⌋ samples are mapped to the values corresponding to the same samples plus two samples, and so on. For a target with unknown velocity, a bank of reference signals is built using this approach, and each reference signal is compared to the received signal. The reference signal built with the reference velocity corresponding to the target’s actual velocity has the maximum correlation with the received signal.
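The index-shift idea can be sketched in a few lines. The following Python sketch is an interpretation, not the thesis implementation: it treats the rule above as nearest-sample rounding of the accumulated compression 𝑛∣Δ𝑡∣/𝑇𝑠 for an inbound target, which reproduces the initial run of ⌈𝑇𝑠/(2∣Δ𝑡∣)⌉ unshifted samples. The function names and the clamping at the end of the transmit record are assumptions made for illustration.

```python
import math


def num_index_shifts(n_samples, delta_t, t_s):
    """Equation (4.1): L = ceil(N * |delta_t| / T_s)."""
    return math.ceil(n_samples * abs(delta_t) / t_s)


def shifted_reference(tx, n_samples, delta_t, t_s):
    """Build one reference signal for an inbound target by index shifting.

    Sample n of the reference is taken from the transmit sample whose index
    is n plus the accumulated compression n*|delta_t|/T_s, rounded to the
    nearest integer (assumed interpretation). Indices past the end of `tx`
    are clamped to the last sample (also an assumption).
    """
    ratio = abs(delta_t) / t_s
    last = len(tx) - 1
    reference = []
    for n in range(n_samples):
        shift = math.floor(n * ratio + 0.5)  # round to nearest whole sample
        reference.append(tx[min(n + shift, last)])
    return reference


# Toy numbers: T_s = 1 and |delta_t| = 0.25, so a new shift every 4 samples.
tx = list(range(12))  # transmit "samples" equal to their own index
reference = shifted_reference(tx, 8, 0.25, 1.0)
total_shifts = num_index_shifts(8, 0.25, 1.0)
```

With these toy values the first two samples are unshifted, the next four are shifted by one sample, and the remaining two by two samples, giving `reference == [0, 1, 3, 4, 5, 6, 8, 9]` and `total_shifts == 2`. A bank of reference signals would repeat this for each candidate velocity, each with its own Δ𝑡.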
Figure 4.1. Representation of received signal compared to transmitted signal for an inbound target.
After replacing the interp1 function with the algorithm discussed above, the memory usage of the correlation routine was sufficiently low to use MATLAB®’s parallel processing toolbox. Running the routine in parallel across four processors decreased the processing time to 5.38 minutes while maintaining the same results. Next, in step seven, the fft function used for signal correlation was optimized to improve processing speed and memory. MATLAB®’s fft function operates most efficiently when the number of points in the FFT is set to a power of two. By default, the number of points used is equal to the length of the vector being transformed. Lievsay [18] kept the default, and the FFT was running 𝑁𝐹𝐹𝑇 = 200M points. To speed up the FFT, the number of points was reduced to the largest power of two below 𝑁𝐹𝐹𝑇 = 200M, which is 2^27 = 134217728. The result of this change to the FFT was a 7 GB reduction in RAM and an improvement of over half a minute in parallel processing time using four cores of the computer. The drawback to this approach, however, is the elongation of the target response along the velocity axis, as can be seen in Figure 4.2. The elongation that results from fewer FFT points is equivalent to diminished velocity resolution due to a shorter measurement window, 𝑇. For this target scenario, however, the diminished velocity resolution is negligible.
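The choice of FFT length can be checked with a short calculation. This Python sketch uses the rounded figure of 200M points quoted above (the exact vector length is not given here), and the function name is hypothetical.

```python
def largest_power_of_two_at_most(n):
    """Return the largest power of two that does not exceed n (n >= 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 1 << (n.bit_length() - 1)


n_default = 200_000_000  # approximate default FFT length quoted in the text
n_fft = largest_power_of_two_at_most(n_default)
fraction_kept = n_fft / n_default  # share of the original points retained
```

Here `n_fft` is 134217728, or 2^27, and `fraction_kept` is roughly 0.67. Keeping about two thirds of the points shortens the effective measurement window by the same proportion, which is consistent with the modest loss of velocity resolution noted above.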
Figure 4.2. Results for target at 10 m with inbound relative radial velocity of 5 m/s: (a) Lievsay’s results, (b) after step 4, (c) after step 6, (d) after step 7.
Not only did the range and velocity estimation performance remain the same using only the signs of the received and reference signals for correlation, but, as expected, the post-processing SNR did not change significantly. As discussed in Section 2.5.4, the pre-processing SNR has an impact on the post-processing SNR. Quantization noise affects the pre-processing SNR as seen in (2.29), limiting the maximum pre-processing SNR in a binary ADC to 7.86 dB. For a dominant target with a pre-processing SNR greater than 7.86 dB, the binary ADC limitations will have a negligible impact on the post-processing SNR defined by (2.36). In fact, the post-processing SNR actually improved slightly, as can be seen in Table 4.2. Only a subset of the eight steps is shown in this table, but the results show no negative impact on the post-processing SNR from using a binary ADC compared to an eight-bit ADC for a single target using the AFIT RNR.
A graphical view of the post-processing SNR comparison can be seen in the range-velocity plots in Figures 4.3 and 4.4. In Figure 4.3, the post-processing SNR for the full eight-bit data is plotted against (a) range perfectly matched in velocity and (b) velocity perfectly matched in range. In Figure 4.4, the post-processing SNR for the simulated binary data is plotted against the same two dimensions. A comparison of these figures shows that the binary ADC will not degrade post-processing SNR for a single target with sufficient pre-processing SNR.
Table 4.2. Post-processing SNR comparison of 2D processing results.
Step  Description        ADC type (simulated)  Change in SNR
1     Lievsay’s results  Eight-bit ADC         -
4     Signs only         Binary ADC (sim)      0.5 dB
5     Replaced interp1   Binary ADC (sim)      1.3 dB
7     FFT points         Binary ADC (sim)      1.1 dB
Figure 4.3. Full eight-bit ADC post-processing SNR plotted against (a) range perfectly matched in velocity and (b) velocity perfectly matched in range.
Figure 4.4. Simulated binary ADC post-processing SNR plotted against (a) range perfectly matched in velocity and (b) velocity perfectly matched in range.
At this point, it is clear that switching to a binary ADC alone does not allow for mass parallelization in the AFIT RNR using a GPU or FPGA. Further memory optimization must be accomplished to take advantage of the available GPU. The next section details the results of such an optimization effort aimed at reducing the memory burden in order to perform correlation processing on a GPU.