
Reconstruction and Compressed Sensing

5.2.5 Performance Guarantees

5.2.5.4 Mutual Coherence

As pointed out already, estimating and testing the RIC for large M is impractical. A tractable yet conservative bound on the RIC can be obtained through the mutual coherence of the columns of A defined as

$$M(A) = \max_{i \neq j} \left| A_i^H A_j \right|$$

Mutual coherence can be used to guarantee stable inversion through $\ell_1$ recovery [53,54], although these guarantees generally require fairly small values of $s$. Furthermore, the RIC is conservatively bounded by $M(A) \le R_s(A) \le (s - 1)M(A)$. The upper bound is very loose, as matrices can be constructed for which the RIC is nearly equal to the mutual coherence over a wide range of $s$ values [55].
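The bounds above can be checked numerically on a toy problem. The following is a minimal sketch, assuming unit-norm columns and the usual definition of the RIC as the largest deviation from 1 of the eigenvalues of any $s$-column Gramian; the function names `mutual_coherence` and `brute_force_ric` are illustrative and do not come from the text. The exhaustive search over column subsets is feasible only for very small matrices, which is precisely why the coherence bound is attractive.

```python
import itertools
import numpy as np

def mutual_coherence(A):
    """Largest off-diagonal magnitude of the Gramian A^H A (unit-norm columns assumed)."""
    G = A.conj().T @ A
    np.fill_diagonal(G, 0.0)
    return float(np.abs(G).max())

def brute_force_ric(A, s):
    """RIC of order s by exhaustive search over all s-column subsets (tiny problems only)."""
    ric = 0.0
    for cols in itertools.combinations(range(A.shape[1]), s):
        sub = A[:, list(cols)]
        eigs = np.linalg.eigvalsh(sub.conj().T @ sub)  # eigenvalues in ascending order
        ric = max(ric, abs(eigs[0] - 1.0), abs(eigs[-1] - 1.0))
    return ric

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 12)) + 1j * rng.standard_normal((8, 12))
A /= np.linalg.norm(A, axis=0)  # normalize each column to unit norm

mu = mutual_coherence(A)
for s in (2, 3, 4):
    # Each printed row should satisfy mu <= R_s <= (s - 1) * mu.
    print(s, mu, brute_force_ric(A, s), (s - 1) * mu)
```

For $s = 2$ the two bounds coincide, so the RIC equals the mutual coherence exactly; the gap between the bounds opens as $s$ grows.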

The mutual coherence is of particular importance in radar signal processing. Recall from Section 5.2.2.2 that entries of the Gramian matrix $A^H A$ are samples of the radar ambiguity function. The mutual coherence is simply the maximum magnitude of the off-diagonal entries of this matrix.

Melvin-5220033 book ISBN : 9781891121531 September 14, 2012 17:41 166

166 C H A P T E R 5 Radar Applications of Sparse Reconstruction

Thus, the mutual coherence of a radar system can be reduced by designing the ambiguity function appropriately. This view was explored for the ambiguity function, along with a deterministic approach to constructing waveforms that yield low mutual coherence for the resulting $A$, in [56]. In a nutshell, the thumbtack ambiguity functions that are known to be desirable for radar systems [57] are also beneficial for CS applications. In [12], the authors use mutual coherence as a surrogate for RIP when designing waveforms for multistatic SAR imaging. As one might expect, noise waveforms provide good results in both scenarios (see footnote 20).

The ambiguity function characterizes the response of a matched filter to the radar data. At the same time, the ambiguity function determines the mutual coherence of the forward operator $A$, which provides insights into the efficacy of SR and CS. Thus, CS does not escape the limitations imposed by the ambiguity function and the associated matched filter. Note that virtually all SR algorithms include application of the matched filter $A^H$ repeatedly in their implementations. Indeed, SR algorithms leverage knowledge of the ambiguity function to approximately deconvolve it from the reconstructed signal.

Put another way, SR can yield signal estimates that lack the sidelobe structure typical of a matched filtering result, but the extent to which this process will be successful is informed by the ambiguity function.
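The link between the Gramian and the ambiguity function can be illustrated with a deliberately simplified model. The sketch below assumes a delay-only scenario in which each column of $A$ is a circular delay of a single unit-norm waveform, so that the Gramian entries are samples of the circular autocorrelation (the zero-Doppler cut of the ambiguity function). The waveforms, sizes, and function names are hypothetical choices made for illustration.

```python
import numpy as np

def delay_dictionary(w):
    """Columns are circular delays of the unit-norm waveform w (delay-only model)."""
    return np.stack([np.roll(w, k) for k in range(len(w))], axis=1)

def circular_autocorrelation(w):
    """c[m] = sum_n conj(w[n]) w[n+m], computed through the FFT."""
    return np.fft.ifft(np.abs(np.fft.fft(w)) ** 2)

def coherence_from_gramian(A):
    """Maximum off-diagonal magnitude of A^H A."""
    G = A.conj().T @ A
    np.fill_diagonal(G, 0.0)
    return float(np.abs(G).max())

rng = np.random.default_rng(1)
N = 64
noise = rng.standard_normal(N) + 1j * rng.standard_normal(N)
noise /= np.linalg.norm(noise)
# A constant-frequency pulse: every delay is only a phase shift of the original.
tone = np.exp(2j * np.pi * 3 * np.arange(N) / N) / np.sqrt(N)

results = {}
for name, w in (("noise", noise), ("tone", tone)):
    A = delay_dictionary(w)
    sidelobe = float(np.abs(circular_autocorrelation(w)[1:]).max())
    results[name] = (coherence_from_gramian(A), sidelobe)
    print(name, results[name])
```

In this model the mutual coherence equals the peak autocorrelation sidelobe. The constant-frequency pulse has coherence 1 because its delayed copies are indistinguishable up to phase, whereas the noise waveform has a markedly lower value, consistent with the remark above that noise waveforms behave well.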

5.3 SR ALGORITHMS

In the previous section, we surveyed much of the underlying theory of CS. As we have seen, CS combines randomized measurements with SR algorithms to obtain performance guarantees for signal reconstruction. In this section, we will review several examples of SR algorithms and their associated CS performance guarantees. It is worth mentioning that these algorithms can be, and are, used in situations where the sufficient conditions associated with CS are not satisfied. Even when these conditions are not met, the resulting reconstructions are often desirable.

One issue is that the traditional CS theory provides error bounds on reconstructing $x_{\text{true}}$. In many radar problems, the signal $x_{\text{true}}$ represents fine sampling of a parameter space, such as the set of image voxels or the angle-Doppler plane. In these scenarios, producing a reconstruction whose nonzero elements are slightly shifted in the vector $\hat{x}$ may be perfectly acceptable to a practitioner, as this would correspond to a small error in estimating the relevant parameter. However, the traditional error definitions of CS would suggest that this reconstruction is extremely poor.

To give a concrete example, suppose that our signal of interest is a single tone in noise. The vector $x_{\text{true}}$ represents the discrete Fourier transform (DFT) of the signal sampled on a fine grid, and $A$ is simply a DFT matrix. If the true signal is zero except for a single entry in the first position equal to 1 and $\hat{x}_\sigma$ contains a single 1 in the second position, then we have almost perfectly reconstructed the signal. The model order is correct with a single sinusoid, and the frequency has been estimated to an accuracy equal to the sampling density of our frequency grid. Yet the CS measure of error $\|x_{\text{true}} - \hat{x}_\sigma\|_2$ would be larger than the norm of the true signal, suggesting complete failure. See [37] for an excellent discussion of this issue. This example perhaps suggests why one might consider using a given SR algorithm even when conditions like RIP are not met, provided that the signal of interest is still sparse or nearly so. Performance guarantees for these sample parameter problems remain at least a partially open research problem, although some progress has been made for the specific DFT case in [59].

Footnote 20: We note that it is not possible to improve the Kruskal rank of a matrix by left-multiplication; see [58]. This motivates attempts to change the waveform or collection geometry in radar problems to improve sparse reconstruction performance rather than simply considering linear transformations of the collected data.
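The arithmetic behind the single-tone example is easy to verify. The short sketch below builds the two vectors described in the example, with an arbitrary grid size of 16 chosen purely for illustration, and compares the reconstruction error to the norm of the true signal.

```python
import numpy as np

N = 16  # illustrative grid size; any N >= 2 gives the same result
x_true = np.zeros(N)
x_true[0] = 1.0          # true tone sits in the first frequency bin
x_hat = np.zeros(N)
x_hat[1] = 1.0           # estimate is off by exactly one bin

err = np.linalg.norm(x_true - x_hat)      # sqrt(1^2 + 1^2) = sqrt(2)
rel_err = err / np.linalg.norm(x_true)    # relative error exceeds 100 percent
print(err, rel_err)
```

The error norm is $\sqrt{2}$ while the true signal has unit norm, so the conventional metric reports a relative error above 100 percent even though the frequency estimate is off by a single grid cell.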

We shall consider several classes of SR algorithms along with examples. New algorithms are being developed and extended at a breathtaking pace in the CS literature. Indeed, it is not at all uncommon to see articles improving and extending other articles that are still available only as preprints. In some cases, multiple generations of this phenomenon are observed. Thus, while several current state-of-the-art algorithms will be referenced, online references should be consulted for new developments before actually employing these techniques. The good news is that a wide range of excellent SR algorithms for both general and fairly specific applications are available online for download and immediate use (see footnote 21). Furthermore, part of the beauty of these SR algorithms is their general simplicity. Several highly accurate algorithms can be coded in just a few dozen lines in a language like MATLAB.

Before delving into the collection of algorithms, we will make a few comments about the required inputs. The SR algorithms we survey will typically require a handful of parameters along with the data y. Most of the algorithms will require either an explicit estimate of s or a regularization parameter that is implicitly related to the sparsity. In many cases, these parameters can be selected with relative ease. In addition, many of the algorithms are amenable to warm-starting procedures, where a parameter is varied slowly. The repeated solutions are greatly accelerated by using the solution for a similar parameter setting to initialize the algorithm. One of the many algorithms leveraging this idea is Nesterov’s Algorithm (NESTA) [61].
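The warm-starting idea can be sketched with a simple continuation loop. The example below is not NESTA; it assumes the penalized objective $\tfrac{1}{2}\|Ax - y\|_2^2 + \lambda\|x\|_1$ with real data and uses a plain iterative soft-thresholding routine as a stand-in solver. The problem sizes, the sequence of $\lambda$ values, and the function names are all hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding operator for real vectors."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, x0, n_iter=50):
    """Plain iterative soft thresholding for 0.5*||Ax - y||^2 + lam*||x||_1, started at x0."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient of the data term
    x = x0.copy()
    for _ in range(n_iter):
        x = soft_threshold(x + A.T @ (y - A @ x) / L, lam / L)
    return x

def objective(A, y, lam, x):
    return 0.5 * np.linalg.norm(A @ x - y) ** 2 + lam * np.abs(x).sum()

rng = np.random.default_rng(2)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[[5, 37, 80]] = [1.0, -2.0, 1.5]
y = A @ x_true + 0.01 * rng.standard_normal(40)

lam_max = np.abs(A.T @ y).max()  # for lam >= lam_max the minimizer is x = 0
x = np.zeros(100)
for lam in lam_max * np.array([0.5, 0.2, 0.1, 0.05, 0.02]):
    # Warm start: the solution for the previous (larger) lam initializes this solve.
    x = ista(A, y, lam, x0=x)
    print(lam, objective(A, y, lam, x), np.count_nonzero(x))
```

Because neighboring values of $\lambda$ tend to have similar solutions, each solve starts close to its answer and typically needs far fewer iterations than a cold start from zero.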

Finally, an issue of particular importance is the required information about the forward operator $A$. Obviously, providing the matrix $A$ itself allows any required computations to be completed. However, in many cases, $A$ represents a transform like the DFT or discrete wavelet transform (DWT), or some other operator that can be implemented without explicit calculation and storage of the matrix $A$. Since the $A$ matrix can easily be tens or even hundreds of gigabytes in some interesting problems, the ability to perform multiplications with $A$ and $A^H$ without explicit storage is essential. While some of the SR algorithms require explicit access to $A$ itself, many first-order algorithms require only the ability to multiply a given vector with $A$ and $A^H$. We will focus primarily on these algorithms, since they are the only realistic approaches for many large-scale radar problems.
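As an illustration of an operator that never needs to be stored, the sketch below implements multiplication by $A$ and $A^H$ for a hypothetical partial DFT, that is, a unitary DFT followed by selection of a subset of rows. The class name `PartialDFT` and the problem sizes are assumptions made for the example. The final lines apply the standard inner-product test, $\langle Ax, y \rangle = \langle x, A^H y \rangle$, which is a cheap way to confirm that a forward and adjoint pair are consistent.

```python
import numpy as np

class PartialDFT:
    """A = S F: unitary DFT followed by keeping the rows in `rows`; A is never formed."""

    def __init__(self, n, rows):
        self.n = n
        self.rows = np.asarray(rows)

    def forward(self, x):
        """Compute A x with one FFT."""
        return np.fft.fft(x, norm="ortho")[self.rows]

    def adjoint(self, y):
        """Compute A^H y by zero-filling the unobserved rows and applying one inverse FFT."""
        z = np.zeros(self.n, dtype=complex)
        z[self.rows] = y
        return np.fft.ifft(z, norm="ortho")

rng = np.random.default_rng(3)
n, m = 256, 64
rows = np.sort(rng.choice(n, size=m, replace=False))
op = PartialDFT(n, rows)

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(m) + 1j * rng.standard_normal(m)

# Inner-product (adjoint) test: <A x, y> should equal <x, A^H y>.
lhs = np.vdot(op.forward(x), y)
rhs = np.vdot(x, op.adjoint(y))
print(abs(lhs - rhs))
```

Each product costs one FFT of length $n$ and needs storage only for the row indices, whereas the explicit matrix would require $mn$ complex entries.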

We will divide our discussion of SR algorithms into a series of subsections. First, we will discuss penalized least squares methods for solving variants of (5.12). We will then turn to fast iterative thresholding methods and closely related reweighting techniques. All of these approaches have close ties to (5.12). In contrast, greedy methods leverage heuristics to obtain very fast algorithms with somewhat weaker performance guarantees. Finally, our discussion will briefly address Bayesian approaches to CS, methods for incorporating signal structure beyond simple sparsity, and approaches for handling uncertainty in the forward operator A.

Footnote 21: An excellent list is maintained in [60].


5.3.1 Penalized Least Squares

The convex relaxation of the $\ell_0$ reconstruction problem given in (5.12) can be viewed as a penalized least squares problem. We have already seen that this problem arises in a Bayesian framework by assuming a Gaussian noise prior and a Laplacian signal prior. This approach has a long history (see, for example, [25]), and the use of the $\ell_1$ norm for a radar problem was specifically proposed at least as early as [62].

The good news is that (5.12) is a linear program for real data and a second-order cone program for complex data [1]. As a result, accurate and fairly efficient methods such as interior point algorithms exist to solve (5.12) [24]. Unfortunately, these solvers are not well suited to the extremely large A matrices in many problems of interest and do not capitalize on the precise structure of (5.12). As a result, a host of specialized algorithms for solving these penalized least squares problems has been created.

This section explores several of these algorithms. It should be emphasized that solvers guaranteeing a solution to (5.12) or one of the equivalent formulations discussed herein inherit the RIP-based performance guarantee given in (5.15), provided that $A$ satisfies the required RIP condition.