Regularization Methods
4.2 Truncated SVD
[Figure 4.2, panel titles from left to right: cond(A) = 4979, $\|e\|_2 = 5 \cdot 10^{-5}$; cond(A) = 3.4e9, $\|e\|_2 = 10^{-7}$; cond(A) = 2.5e16, $\|e\|_2 = 0$.]
Figure 4.2. Exact solutions (smooth lines) together with the naive solutions (jagged lines) to two test problems. Left: deriv2 with n = 64. Middle and right:
gravity with n = 32 and n = 53. Due to the large condition numbers (especially for gravity), the small perturbations lead to useless naive solutions.
(TSVD) solution $x_k$ and the Tikhonov solution $x_\lambda$, to be introduced in the next two sections, are (we hope) good approximations to $x^{\mathrm{exact}}$.
To further illustrate this aspect we consider two test problems, namely, the second derivative problem from Exercise 2.3 (implemented in deriv2) of size n = 64, and the gravity surveying test problem from Exercise 3.5 (implemented in gravity) with n = 32 and n = 53. Figure 4.2 shows the exact solutions to these problems, together with the naive solutions computed for small perturbations of the right-hand side.
For deriv2 the perturbation is scaled such that $\|e\|_2 = 5 \cdot 10^{-5}$, and with this noise level we cannot find any information about $x^{\mathrm{exact}}$ in the naive solution. For gravity with n = 32 the noise level is much smaller, $\|e\|_2 = 10^{-7}$, but due to the much larger condition number for this problem such a small perturbation is still large enough to produce a useless naive solution. If we increase the problem size to n = 53, then the matrix is so ill conditioned that even with $e = 0$ the rounding errors during the computation of the naive solution render this solution useless.
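The sensitivity described above can be reproduced with a small sketch. The function `gravity_like` below is a hypothetical stand-in, not the Regularization Tools `gravity` routine: it assumes a midpoint-rule discretization of a kernel of the form d/(d^2 + (s - t)^2)^{3/2} with depth d = 0.25 and a smooth exact solution, and the random seed is arbitrary.

```python
import numpy as np

def gravity_like(n, d=0.25):
    """Midpoint-rule discretization of a gravity-type kernel on [0, 1].

    Assumption: this mimics, but is not, the Regularization Tools `gravity` function.
    """
    t = (np.arange(n) + 0.5) / n                  # quadrature and collocation points
    S, T = np.meshgrid(t, t, indexing="ij")
    A = (1.0 / n) * d / (d**2 + (S - T) ** 2) ** 1.5
    x_exact = np.sin(np.pi * t) + 0.5 * np.sin(2 * np.pi * t)
    return A, x_exact

A, x_exact = gravity_like(32)
b_exact = A @ x_exact

rng = np.random.default_rng(0)
e = rng.standard_normal(b_exact.shape)
e *= 1e-7 / np.linalg.norm(e)                     # scale so that ||e||_2 = 1e-7

x_naive = np.linalg.solve(A, b_exact + e)         # naive solution A^{-1} b

rel_err = np.linalg.norm(x_naive - x_exact) / np.linalg.norm(x_exact)
print(f"cond(A) = {np.linalg.cond(A):.1e}, relative error = {rel_err:.1e}")
```

The point of the sketch is only the ratio between the error in the naive solution and the size of the perturbation; the exact numbers depend on the assumed kernel and on the noise realization.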
By means of the regularization methods that we develop in this chapter, as well as in Chapters 6 and 8, we are able to compute approximations to $x^{\mathrm{exact}}$ that are much less sensitive to the perturbations of the right-hand side. In particular, we return to regularized solutions to the gravity surveying problem in Sections 4.2 and 4.4 below.
4.2 Truncated SVD
From the analysis in the two previous chapters, it is clear that the extremely large errors in the naive solution come from the noisy SVD components associated with the smaller singular values. In particular, the results in (3.21) show that the naive solution is dominated by SVD coefficients of the form $u_i^T b/\sigma_i \approx u_i^T e/\sigma_i$ (where $e$ is the perturbation of $b$) corresponding to the smaller singular values; see Figure 3.7.
The good news is that our analysis also reveals that some of the SVD coefficients are rather trustworthy, namely, the coefficients of the form $u_i^T b/\sigma_i \approx u_i^T b^{\mathrm{exact}}/\sigma_i$ (in which $b^{\mathrm{exact}} = A\, x^{\mathrm{exact}}$ is the exact right-hand side) corresponding to the larger singular values; again, see Figure 3.7. This gives us hope that we can actually recover some of the information about the solution to the problem.
Downloaded 07/26/14 to 129.107.136.153. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Another piece of good news is that we can assume that the exact right-hand side satisfies the discrete Picard condition (otherwise there is no point in trying to solve the discrete ill-posed problem). As a consequence, the SVD components of the exact solution with largest magnitude are precisely those coefficients that are approximated reasonably well, because for the small indices $i$ we have $u_i^T b/\sigma_i \approx u_i^T b^{\mathrm{exact}}/\sigma_i = v_i^T x^{\mathrm{exact}}$.
These considerations immediately lead to a “brute force” method for computing regularized approximate solutions: simply chop off those SVD components that are dominated by the noise. Hence, we define the truncated SVD (TSVD) solution $x_k$ as the solution obtained by retaining the first $k$ components of the naive solution:
$$x_k \equiv \sum_{i=1}^{k} \frac{u_i^T b}{\sigma_i}\, v_i. \tag{4.2}$$
The truncation parameter $k$ should be chosen such that all the noise-dominated SVD coefficients are discarded. A suitable value of $k$ can often be found from an inspection of the Picard plot; for example, for the two noisy problems in Figure 3.7 we would choose $k = 16$ and $k = 9$, respectively.
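As an illustration of (4.2), the following sketch computes a TSVD solution with NumPy. The test problem is synthetic and is an assumption of this example (random orthogonal singular vectors, singular values $10^{-i/2}$, exact SVD coefficients $10^{-i/4}$ so that the discrete Picard condition holds); the function name `tsvd_solution` and the choice k = 6 are likewise illustrative.

```python
import numpy as np

def tsvd_solution(A, b, k):
    """TSVD solution x_k = sum_{i=1}^k (u_i^T b / sigma_i) v_i, cf. (4.2)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    coeffs = (U[:, :k].T @ b) / s[:k]          # retained SVD coefficients
    return Vt[:k].T @ coeffs                   # linear combination of v_1, ..., v_k

# Synthetic test problem (an assumption, not one of the book's examples)
rng = np.random.default_rng(0)
n = 20
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
sigma = 10.0 ** (-0.5 * np.arange(n))          # decaying singular values
A = U @ np.diag(sigma) @ V.T
x_exact = V @ 10.0 ** (-0.25 * np.arange(n))   # decaying coefficients v_i^T x_exact
b_exact = A @ x_exact

e = rng.standard_normal(n)
e *= 1e-4 / np.linalg.norm(e)                  # scale so that ||e||_2 = 1e-4
b = b_exact + e

x_naive = np.linalg.solve(A, b)                # all n SVD components
x_k = tsvd_solution(A, b, k=6)                 # only the first 6 components

err_naive = np.linalg.norm(x_naive - x_exact)
err_tsvd = np.linalg.norm(x_k - x_exact)
print(f"naive error = {err_naive:.2e}, TSVD error (k = 6) = {err_tsvd:.2e}")
```

In this construction the exact coefficients $|u_i^T b^{\mathrm{exact}}| = 10^{-0.75 i}$ drop to roughly the size of the noise coefficients around the sixth or seventh component, which is why k = 6 is a reasonable truncation here.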
There is an alternative formulation of the TSVD method which is important. While the definition of $x_k$ in (4.2) takes its basis in a specific formula for computing the regularized solution, we can also define a regularized solution as the solution to a modified and better conditioned problem. Let us introduce the TSVD matrix $A_k$, which is the rank-$k$ matrix defined as
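$$A_k = \sum_{i=1}^{k} u_i\, \sigma_i\, v_i^T. \tag{4.3}$$
The condition number of this matrix is $\mathrm{cond}(A_k) = \sigma_1/\sigma_k$, which is typically much smaller than the condition number $\mathrm{cond}(A) = \sigma_1/\sigma_n$ of the original matrix $A$.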
Hence, it seems like a good idea to replace the original and ill-conditioned problem $A x = b$ or $\min \|A x - b\|_2$ with the better conditioned least squares problem $\min \|A_k x - b\|_2$. The least squares formulation is needed, because we can no longer expect to find an $x$ such that $A_k x = b$. However, since $A_k$ is rank deficient (as long as $k < n$), there is not a unique solution to this least squares problem; it is easy to show that the general solution has the form
$$x = \sum_{i=1}^{k} \frac{u_i^T b}{\sigma_i}\, v_i + \sum_{i=k+1}^{n} \zeta_i\, v_i, \qquad \zeta_i \ \text{arbitrary}. \tag{4.4}$$
To define a unique solution, we must supplement the least squares problem with an additional constraint on the solution $x$. A natural constraint, in many applications, is that we seek the solution with minimum 2-norm:
$$\min \|x\|_2 \quad \text{subject to} \quad \|A_k x - b\|_2 = \min. \tag{4.5}$$
It is easy to show that the solution to this constrained problem is precisely the TSVD solution $x_k$ in (4.2). Hence, (4.5) is an alternative definition of the TSVD solution, in which the regularity requirement on $x$, in the form of the minimization of its norm $\|x\|_2$, is explicit.
Figure 4.3. TSVD solutions $x_k$ to the gravity surveying problem for eight different values of $k$. The exact solution is shown in the bottom right corner. A good approximation is achieved for $k = 10$, while the noise is clearly visible in $x_k$ for $k = 14$ and $k = 16$.
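The equivalence between the minimum-norm least squares formulation and the TSVD solution can be checked numerically. The sketch below is an illustration under stated assumptions: a small random matrix, k = 3, and `numpy.linalg.lstsq`, which returns the minimum-norm solution for rank-deficient problems when `rcond` is set so that the numerically zero singular values of the rank-k matrix are discarded.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 3
A = rng.standard_normal((n, n))                # arbitrary small test matrix (assumption)
b = rng.standard_normal(n)
U, s, Vt = np.linalg.svd(A)

# Rank-k TSVD matrix and the TSVD solution from (4.2)
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]
x_k = Vt[:k].T @ ((U[:, :k].T @ b) / s[:k])

# Minimum-norm solution of min ||A_k x - b||_2; rcond makes lstsq treat the
# rounding-level trailing singular values of A_k as exact zeros.
x_mn, _, rank, _ = np.linalg.lstsq(A_k, b, rcond=1e-10)

# Another least squares solution: add a component along v_{k+1}, which lies in
# the null space of A_k. The residual is unchanged but the norm grows.
x_other = x_k + 0.5 * Vt[k]

print(np.allclose(x_mn, x_k))
print(np.linalg.norm(x_k), np.linalg.norm(x_other))
```

The vector `x_other` shows why the extra constraint is needed: it attains the same residual as `x_k`, yet its norm is larger because the added null-space component is orthogonal to `x_k`.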
From the above definition of $x_k$ it follows that we can also write the TSVD solution in the form
$$x_k = A_k^\dagger b, \qquad A_k^\dagger = \sum_{i=1}^{k} v_i\, \sigma_i^{-1} u_i^T,$$
where the matrix $A_k^\dagger$ is the pseudoinverse of the TSVD matrix $A_k$ defined in (4.3). If the covariance matrix for the right-hand side is $\mathrm{Cov}(b) = \eta^2 I$, then the covariance matrix for the TSVD solution is
$$\mathrm{Cov}(x_k) = A_k^\dagger\, \mathrm{Cov}(b)\, (A_k^\dagger)^T = \eta^2 \sum_{i=1}^{k} \sigma_i^{-2}\, v_i v_i^T.$$
The norm of this matrix is
$$\|\mathrm{Cov}(x_k)\|_2 = \eta^2/\sigma_k^2,$$
and since $\sigma_k$ is always larger, and often much larger, than $\sigma_n$, we conclude that the elements of $\mathrm{Cov}(x_k)$ are generally smaller, and often much smaller, than those of the covariance matrix $\mathrm{Cov}(x)$ for the naive solution.
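The covariance result can be verified directly for a small example. The sketch below assumes white noise in the right-hand side, $\mathrm{Cov}(b) = \eta^2 I$, and uses an arbitrary random matrix with k = 4; these choices, and all variable names, are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, eta = 10, 4, 1e-3
A = rng.standard_normal((n, n))                # arbitrary test matrix (assumption)
U, s, Vt = np.linalg.svd(A)

# Pseudoinverse of the TSVD matrix: sum over i <= k of v_i sigma_i^{-1} u_i^T
A_k_pinv = Vt[:k].T @ np.diag(1.0 / s[:k]) @ U[:, :k].T

cov_b = eta**2 * np.eye(n)                     # assumed white noise in b
cov_xk = A_k_pinv @ cov_b @ A_k_pinv.T         # covariance of the TSVD solution
A_inv = np.linalg.inv(A)
cov_naive = A_inv @ cov_b @ A_inv.T            # covariance of the naive solution

norm_xk = np.linalg.norm(cov_xk, 2)            # spectral norm
norm_naive = np.linalg.norm(cov_naive, 2)
print(norm_xk, eta**2 / s[k - 1] ** 2)         # the two numbers should agree
print(norm_naive, eta**2 / s[-1] ** 2)         # likewise for the naive solution
```

Note that `s[k - 1]` is $\sigma_k$ because NumPy arrays are zero-based.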
The price we pay for this reduction of the variance in $x_k$, compared to the naive solution $x = A^{-1} b$, is bias. While $A^{-1} b$ is unbiased, i.e., the expected value is the exact solution, $E(A^{-1} b) = x^{\mathrm{exact}}$, the TSVD solution $x_k$ has a nonzero bias:
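$$E(x_k) = \sum_{i=1}^{k} (v_i^T x^{\mathrm{exact}})\, v_i = x^{\mathrm{exact}} - \sum_{i=k+1}^{n} (v_i^T x^{\mathrm{exact}})\, v_i.$$
However, due to the discrete Picard condition, the coefficients $|v_i^T x^{\mathrm{exact}}|$ in the bias term, and therefore also the norm of the bias term,
$$\Big\| \sum_{i=k+1}^{n} (v_i^T x^{\mathrm{exact}})\, v_i \Big\|_2 = \Big( \sum_{i=k+1}^{n} (v_i^T x^{\mathrm{exact}})^2 \Big)^{1/2},$$
are typically small compared to $\|x^{\mathrm{exact}}\|_2$.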
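A short numerical check of the bias of the TSVD solution, as a sketch under stated assumptions: zero-mean noise, an arbitrary random matrix, k = 4, and an exact solution whose SVD coefficients $v_i^T x^{\mathrm{exact}} = 2^{-(i-1)}$ decay, in the spirit of the discrete Picard condition. Since zero-mean noise drops out of the expectation, $E(x_k)$ is obtained by applying the TSVD formula to the noise-free right-hand side.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 10, 4
A = rng.standard_normal((n, n))                # arbitrary test matrix (assumption)
U, s, Vt = np.linalg.svd(A)

# Exact solution with decaying SVD coefficients v_i^T x_exact = 0.5**(i-1)
x_exact = Vt.T @ (0.5 ** np.arange(n))
b_exact = A @ x_exact

# Expected value of x_k: the TSVD formula applied to the noise-free data
E_xk = Vt[:k].T @ ((U[:, :k].T @ b_exact) / s[:k])

# The same vector written as the projection of x_exact onto span{v_1, ..., v_k}
proj = Vt[:k].T @ (Vt[:k] @ x_exact)

bias = x_exact - E_xk
bias_norm = np.linalg.norm(bias)
bias_norm_formula = np.sqrt(np.sum((Vt[k:] @ x_exact) ** 2))
print(bias_norm, bias_norm_formula, np.linalg.norm(x_exact))
```

With these decaying coefficients the bias norm is a small fraction of the norm of the exact solution, whereas for an exact solution with non-decaying coefficients it would not be.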
A somewhat common, but unfortunate, mistake in connection with TSVD is to base the criterion for choosing the truncation parameter $k$ on the size of the singular values $\sigma_i$. As long as the noise is restricted to the right-hand side, the truncation parameter $k$ must define the break-point between the retained and discarded (filtered) SVD coefficients, and therefore the choice of $k$ is determined by the behavior of the noisy coefficients $u_i^T b$.
The confusion seems to arise because the numerical rank of a noisy matrix is related to the singular values as well as the norm of the perturbation matrix; see, e.g., [32] for details about these issues. Also, if the coefficients $|u_i^T b|$ decay just slightly faster than $\sigma_i$, and if they are scaled such that $|u_i^T b| \approx \sigma_i$, then one may not notice the difference.
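To make the point concrete, the sketch below selects $k$ from the noisy coefficients $u_i^T b$ alone. The rule in `pick_k` is a hypothetical heuristic introduced here for illustration (count the leading coefficients that exceed a multiple of the noise level); it is not a method taken from the text, and the "noise" is a deterministic stand-in with $|u_i^T e|$ equal to the noise level for every $i$. The singular values never enter the rule, so the same matrix gives different truncation parameters for different noise levels.

```python
import numpy as np

def pick_k(coeffs, noise_level, safety=3.0):
    """Hypothetical heuristic: number of leading |u_i^T b| above safety * noise_level."""
    above = np.abs(coeffs) > safety * noise_level
    # np.argmin on a boolean array returns the index of the first False entry
    return len(coeffs) if above.all() else int(np.argmin(above))

n = 16
rng = np.random.default_rng(4)
U, _ = np.linalg.qr(rng.standard_normal((n, n)))   # left singular vectors, known by construction
beta = 10.0 ** (-0.75 * np.arange(n))              # exact coefficients u_i^T b_exact (assumed decay)
signs = (-1.0) ** np.arange(n)

def noisy_coeffs(noise_level):
    # Deterministic stand-in for noise: |u_i^T e| = noise_level for every i
    b = U @ (beta + noise_level * signs)
    return U.T @ b

k_high_noise = pick_k(noisy_coeffs(1e-4), 1e-4)
k_low_noise = pick_k(noisy_coeffs(1e-7), 1e-7)
print(k_high_noise, k_low_noise)
```

The smaller noise level allows more SVD components to be retained, even though nothing about the matrix has changed.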