Iterative Regularization
6.1 A Few Stationary Iterative Methods
We start our treatment of iterative regularization methods with a short discussion of a few classical stationary iterative methods that exhibit semiconvergence. While these methods are not suited as general-purpose iterative regularization methods because they can exhibit very slow convergence, they have certain specialized applications, e.g., in tomography, where they are competitive with other methods. A nice overview of these methods can be found in Chapter 7 of [48].
Downloaded 07/26/14 to 129.107.136.153. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
6.1.1 Landweber and Cimmino Iteration
One of the best known methods that exhibits semiconvergence is usually referred to as Landweber iteration (although many scientists have independently discovered this simple method). In its basic form the method looks as follows:
$$x^{[k+1]} = x^{[k]} + \omega\, A^T (b - A\, x^{[k]}), \qquad k = 1, 2, \ldots, \tag{6.1}$$
where $\omega$ is a real number that must satisfy $0 < \omega < 2\,\|A^T A\|_2^{-1} = 2/\sigma_1^2$. That's it; all that is needed in each iteration is to compute the residual vector $r^{[k]} = b - A\, x^{[k]}$ and then multiply with $A^T$ and $\omega$; this correction is then added to the iterate $x^{[k]}$ to obtain the next iterate.
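The update (6.1) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the matrix is stored as a list of rows, the starting vector defaults to zero, and the helper names `matvec`, `rmatvec`, and `landweber`, as well as the small test system, are hypothetical and not taken from the text. Plain lists keep the sketch dependency-free; in practice a numerical library would be used.

```python
def matvec(A, x):
    """Return A x for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Return A^T y for a matrix stored as a list of rows."""
    m, n = len(A), len(A[0])
    return [sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]


def landweber(A, b, omega, num_iters, x0=None):
    """Run num_iters steps of x <- x + omega * A^T (b - A x)."""
    x = list(x0) if x0 is not None else [0.0] * len(A[0])
    for _ in range(num_iters):
        r = [bi - ci for bi, ci in zip(b, matvec(A, x))]  # residual b - A x
        g = rmatvec(A, r)                                  # correction A^T r
        x = [xj + omega * gj for xj, gj in zip(x, g)]
    return x


# Illustrative, noise-free data (assumed): singular values 2 and 1,
# exact solution [1, 1].
A = [[2.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
b = [2.0, 1.0, 0.0]
omega = 0.4  # satisfies 0 < omega < 2 / sigma_1^2 = 0.5
x = landweber(A, b, omega, num_iters=200)
print(x)
```

With noise-free data the iterates simply approach the least squares solution; semiconvergence only shows up once `b` is contaminated by noise and the matrix is ill conditioned.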
The iterate $x^{[k]}$ has a simple expression as a filtered SVD solution, similar to the TSVD and Tikhonov solutions; cf. (5.1). Specifically, we can write the $k$th iterate as
$$x^{[k]} = V\, \Phi^{[k]}\, \Sigma^{-1} U^T b,$$
where the elements of the diagonal matrix $\Phi^{[k]} = \mathrm{diag}(\varphi_1^{[k]}, \ldots, \varphi_n^{[k]})$ are the filter factors for $x^{[k]}$, which are given by
$$\varphi_i^{[k]} = 1 - (1 - \omega\, \sigma_i^2)^k, \qquad i = 1, 2, \ldots, n. \tag{6.2}$$
Moreover, for small singular values $\sigma_i$ we have that $\varphi_i^{[k]} \approx k\, \omega\, \sigma_i^2$, i.e., they decay with the same rate as the Tikhonov filter factors $\varphi_i^{[\lambda]}$ (4.11). (The verification of these results is left as Exercise 6.1.) Figure 6.2 shows how these filters "evolve" as the number of iterations $k$ increases. We see that as $k$ increases, the filters are shifted toward smaller singular values, and more SVD components are effectively included in the iterate $x^{[k]}$.
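As a quick numerical sanity check of (6.2), one can take a diagonal matrix, for which the singular vectors are unit vectors and the Landweber iteration decouples into scalar recursions. The sketch below assumes a zero starting vector (the filter factor expression relies on this) and uses made-up values for the singular values, right-hand side, $\omega$, and $k$; the function name `landweber_filter` is hypothetical.

```python
def landweber_filter(sigma, omega, k):
    """Filter factor 1 - (1 - omega * sigma^2)^k from (6.2)."""
    return 1.0 - (1.0 - omega * sigma ** 2) ** k


# Assumed illustrative data: A = diag(sigmas), so U = V = I.
sigmas = [1.0, 0.5, 1e-3]
b = [1.0, 2.0, 3.0]
omega, k = 1.0, 20  # omega < 2 / sigma_1^2 = 2

# k Landweber steps from x = 0; each component evolves independently.
x = [0.0] * len(sigmas)
for _ in range(k):
    x = [xi + omega * s * (bi - s * xi) for xi, s, bi in zip(x, sigmas, b)]

# Filtered SVD solution: x_i = phi_i * b_i / sigma_i.
x_filtered = [landweber_filter(s, omega, k) * bi / s
              for s, bi in zip(sigmas, b)]

# For the small singular value, compare with the approximation k*omega*sigma^2.
phi_small = landweber_filter(sigmas[-1], omega, k)
approx_small = k * omega * sigmas[-1] ** 2

print(x, x_filtered, phi_small, approx_small)
```

The two solution vectors should agree to rounding error, and the last filter factor should be very close to $k\,\omega\,\sigma_i^2$.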
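If we, quite arbitrarily, define the break-point in the filter factors as the value $\sigma_{\mathrm{break}}^{[k]}$ of $\sigma_i$ for which $\varphi_i^{[k]} = 0.5$, then it is easy to show that
$$\sigma_{\mathrm{break}}^{[k]} = \sqrt{\frac{1 - \left(\tfrac{1}{2}\right)^{1/k}}{\omega}}\,.$$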
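The break-points are indicated in Figure 6.2 by the circles. To see how the break-point varies with $k$, let us consider the ratio $\sigma_{\mathrm{break}}^{[k]} / \sigma_{\mathrm{break}}^{[2k]}$ between the break-points at $k$ and $2k$ iterations:
$$\frac{\sigma_{\mathrm{break}}^{[k]}}{\sigma_{\mathrm{break}}^{[2k]}} = \sqrt{\frac{1 - \left(\tfrac{1}{2}\right)^{1/k}}{1 - \left(\tfrac{1}{2}\right)^{1/(2k)}}} = \sqrt{\frac{\left(1 + \left(\tfrac{1}{2}\right)^{1/(2k)}\right)\left(1 - \left(\tfrac{1}{2}\right)^{1/(2k)}\right)}{1 - \left(\tfrac{1}{2}\right)^{1/(2k)}}} = \sqrt{1 + \left(\tfrac{1}{2}\right)^{1/(2k)}} \;\to\; \sqrt{2} \quad \text{for } k \to \infty.$$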
112 Chapter 6. Toward Real-World Problems: Iterative Regularization
[Figure 6.2: log-log plot of the Landweber filter factors $\varphi_i^{[k]} = 1 - (1 - \omega\, \sigma_i^2)^k$ versus $\sigma_i$, with curves for $k = 10, 20, 40, 80$ at $\omega = 1$ and for $k = 10$ at $\omega = 0.2$.]
Figure 6.2. The Landweber filter factors $\varphi_i^{[k]} = 1 - (1 - \omega\, \sigma_i^2)^k$ are functions of the number of iterations $k$ and the parameter $\omega$. The circles indicate the break-points $\sigma_{\mathrm{break}}^{[k]}$ in the filter factors where they attain the value 0.5.
Hence, as $k$ increases, the break-point tends to be reduced by a factor $\sqrt{2} \approx 1.4$ each time the number of iterations $k$ is doubled. The consequence is that the semiconvergence toward the exact solution is quite slow.
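The slow movement of the break-point is easy to see numerically. The following sketch evaluates the break-point formula for a few iteration counts and forms the ratio between $k$ and $2k$ iterations; the function name `sigma_break`, the value $\omega = 1$, and the chosen values of $k$ are assumptions made for illustration.

```python
import math


def sigma_break(k, omega):
    """Singular value at which the Landweber filter factor equals 0.5."""
    return math.sqrt((1.0 - 0.5 ** (1.0 / k)) / omega)


omega = 1.0  # assumed value, matching most curves in Figure 6.2
ratios = {}
for k in (10, 20, 40, 80, 1000):
    ratios[k] = sigma_break(k, omega) / sigma_break(2 * k, omega)
    print(k, sigma_break(k, omega), ratios[k])

# Consistency check: the filter factor at the break-point should be 0.5.
k = 40
phi_at_break = 1.0 - (1.0 - omega * sigma_break(k, omega) ** 2) ** k
print(phi_at_break)
```

Each doubling of $k$ moves the break-point down by a factor that approaches $\sqrt{2}$ from below, so reaching a singular value 100 times smaller takes roughly $10^4$ times as many iterations.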
Another classical iterative method with semiconvergence, known as Cimmino iteration, takes the basic form
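$$x^{[k+1]} = x^{[k]} + \omega\, A^T D\, (b - A\, x^{[k]}), \qquad k = 1, 2, \ldots,$$
in which $D = \mathrm{diag}(d_i)$ is a diagonal matrix whose elements are defined in terms of the rows $a_i^T = A(i,:)$ of $A$ as
$$d_i = \begin{cases} \dfrac{1}{m}\, \dfrac{1}{\|a_i\|_2^2}, & a_i \neq 0, \\[1ex] 0, & a_i = 0. \end{cases}$$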
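A minimal sketch of Cimmino's iteration follows, with the weights $d_i$ computed exactly as defined above (including the zero weight for zero rows). The function names, the test system, and the choice $\omega = 1$ are assumptions for illustration only; the text does not prescribe a value of $\omega$ for this method.

```python
def cimmino_weights(A):
    """Return d_i = 1 / (m * ||a_i||_2^2) for nonzero rows, else 0."""
    m = len(A)
    weights = []
    for row in A:
        norm_sq = sum(a * a for a in row)
        weights.append(1.0 / (m * norm_sq) if norm_sq > 0.0 else 0.0)
    return weights


def cimmino(A, b, omega, num_iters):
    """Run num_iters steps of x <- x + omega * A^T D (b - A x) from x = 0."""
    m, n = len(A), len(A[0])
    d = cimmino_weights(A)
    x = [0.0] * n
    for _ in range(num_iters):
        # Weighted residual D (b - A x).
        wr = [d[i] * (b[i] - sum(A[i][j] * x[j] for j in range(n)))
              for i in range(m)]
        # Add omega * A^T (weighted residual).
        x = [x[j] + omega * sum(A[i][j] * wr[i] for i in range(m))
             for j in range(n)]
    return x


# Assumed illustrative data with one zero row; exact solution [2, 1].
A = [[1.0, 1.0], [1.0, -1.0], [0.0, 0.0]]
b = [3.0, 1.0, 0.0]
d = cimmino_weights(A)
x = cimmino(A, b, omega=1.0, num_iters=200)
print(d, x)
```

The row scaling makes every nonzero equation contribute with the same weight regardless of the size of its coefficients, which is the only difference from the Landweber sketch.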
Landweber’s and Cimmino’s methods are two special cases of a more general class of stationary Landweber-type methods, sometimes referred to as simultaneous iterative reconstruction techniques [71]. These methods take the form
$$x^{[k+1]} = x^{[k]} + \omega\, A^T M\, (b - A\, x^{[k]}), \qquad k = 1, 2, \ldots,$$
where $M$ is a symmetric and positive (semi)definite matrix. The study of this class of methods is similar to that of Landweber's method, except that in the analysis we substitute $M^{1/2} A$ and $M^{1/2} b$ for $A$ and $b$.
The Landweber-type methods can be augmented to take the form
$$x^{[k+1]} = \mathcal{P}\left( x^{[k]} + \omega\, A^T M\, (b - A\, x^{[k]}) \right), \tag{6.3}$$
where the operator $\mathcal{P}$ represents so-called hard constraints by which we can incorporate additional a priori knowledge about the regularized solution. Exercise 6.2 illustrates this for the case where $\mathcal{P}$ represents a nonnegativity constraint.
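To make the role of the projection concrete, here is a small sketch of (6.3) for a nonnegativity constraint, where the projection simply sets negative elements to zero. It assumes $M = I$ (plain Landweber), a zero starting vector, and made-up data; the names `project_nonnegative` and `projected_landweber` are hypothetical.

```python
def project_nonnegative(x):
    """Projection onto the nonnegative orthant: clip negative entries to 0."""
    return [max(xj, 0.0) for xj in x]


def projected_landweber(A, b, omega, num_iters, project=project_nonnegative):
    """Run x <- P(x + omega * A^T (b - A x)), i.e., (6.3) with M = I assumed."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(num_iters):
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        step = [x[j] + omega * sum(A[i][j] * r[i] for i in range(m))
                for j in range(n)]
        x = project(step)  # enforce the hard constraint after every update
    return x


# Assumed data: the unconstrained solution [1, -1] violates the constraint.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [1.0, -1.0]
x = projected_landweber(A, b, omega=0.5, num_iters=100)
print(x)
```

In this example the second component is held at zero by the projection while the first converges as in the unconstrained iteration.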
6.1.2 ART, a.k.a. Kaczmarz’s Method
Another classical method that is worth mentioning here is Kaczmarz’s method, also known in computerized tomography as the algebraic reconstruction technique (ART);
cf. Section 5.3.1 in [56]. In this so-called row action method, the kth iteration consists of a “sweep” through the rows of A, in which the current solution vector x[k] (for k = 0, 1, 2, . . .) is updated as follows:
$x^{[k^{(0)}]} = x^{[k]}$
for $i = 1, \ldots, m$
    $x^{[k^{(i)}]} = x^{[k^{(i-1)}]} + \dfrac{b_i - a_i^T x^{[k^{(i-1)}]}}{\|a_i\|_2^2}\, a_i$
end
$x^{[k+1]} = x^{[k^{(m)}]}$
Here $b_i$ is the $i$th component of the right-hand side $b$, and $a_i$ is the $i$th row turned into a column vector. The row norms $\|a_i^T\|_2 = \|a_i\|_2$ are, of course, precomputed once.
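The sweep above translates almost line by line into code. The sketch below is a minimal version under a few assumptions: the matrix is a list of rows, the starting vector defaults to zero, and zero rows are skipped (a case the algorithm as stated does not address). The function name `kaczmarz` and the test data are illustrative.

```python
def kaczmarz(A, b, num_sweeps, x0=None):
    """Run num_sweeps Kaczmarz (ART) sweeps through the rows of A."""
    m, n = len(A), len(A[0])
    # Squared row norms ||a_i||_2^2, precomputed once.
    row_norms_sq = [sum(a * a for a in row) for row in A]
    x = list(x0) if x0 is not None else [0.0] * n
    for _ in range(num_sweeps):
        for i in range(m):
            if row_norms_sq[i] == 0.0:
                continue  # assumption: zero rows carry no information, skip
            # Scaled residual of equation i at the current inner iterate.
            step = (b[i] - sum(aij * xj for aij, xj in zip(A[i], x)))
            step /= row_norms_sq[i]
            # Move along a_i so that equation i is satisfied exactly.
            x = [xj + step * aij for xj, aij in zip(x, A[i])]
    return x


# Assumed illustrative system with exact solution [1, 1].
A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 4.0]
x = kaczmarz(A, b, num_sweeps=50)
print(x)
```

Note that, unlike the Landweber-type methods, each row update uses the most recent inner iterate, so the rows cannot be processed independently of one another.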
All experience shows that the ART method always converges quite quickly in the first few iterations, after which the convergence may become very slow. It was probably the fast initial convergence that made it a favorite method in the early days of computerized tomography, where only a few iterations were carried out.
It can be shown that Kaczmarz's method is mathematically equivalent to applying Gauss–Seidel iterations to the problem $x = A^T y$, $A A^T y = b$. The method can be written in the form of a Landweber-type method, but with a matrix $M$ that is nonsymmetric. This severely complicates the analysis of the method, and the iterates cannot be written in the form (5.1) with a diagonal filter matrix.
Let us for a moment study Kaczmarz's method from a geometric perspective. As shown in [54], each iterate $x^{[k^{(i)}]}$ is obtained by projecting the previous iterate $x^{[k^{(i-1)}]}$ orthogonally onto the hyperplane $\mathcal{H}_i = \{ x \mid a_i^T x = b_i \}$ defined by the $i$th row $a_i^T$ and the corresponding element $b_i$ of the right-hand side. Also, it is proved in [54] that if $A$ is square and nonsingular, then
$$\|A^{-1} b - x^{[k^{(i)}]}\|_2^2 = \|A^{-1} b - x^{[k^{(i-1)}]}\|_2^2 - \frac{\left(b_i - a_i^T x^{[k^{(i-1)}]}\right)^2}{\|a_i\|_2^2}\,;$$
i.e., the norm of the error $A^{-1} b - x$ is nonincreasing.
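Both geometric statements can be checked on a single row update. The sketch below performs one orthogonal projection and then verifies that the new iterate lies on the hyperplane and that the squared error drops by the squared residual of that equation divided by $\|a_i\|_2^2$. The helper names and the $2 \times 2$ data are assumptions for illustration.

```python
def kaczmarz_step(x, a, beta):
    """Project x orthogonally onto the hyperplane {z : a^T z = beta}."""
    r = beta - sum(ai * xi for ai, xi in zip(a, x))  # residual of this row
    norm_sq = sum(ai * ai for ai in a)               # ||a||_2^2
    x_new = [xi + (r / norm_sq) * ai for xi, ai in zip(x, a)]
    return x_new, r, norm_sq


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))


# Assumed square, nonsingular system with exact solution [1, 1].
A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 4.0]
x_exact = [1.0, 1.0]

x_old = [0.0, 0.0]
x_new, r, norm_sq = kaczmarz_step(x_old, A[0], b[0])

# Distance of the new iterate from the hyperplane of the first equation.
off_hyperplane = abs(sum(a * xj for a, xj in zip(A[0], x_new)) - b[0])
# Both sides of the error identity.
lhs = sq_dist(x_exact, x_new)
rhs = sq_dist(x_exact, x_old) - r ** 2 / norm_sq
print(off_hyperplane, lhs, rhs)
```

Because the subtracted term is never negative, the error cannot grow in any row update, whatever the ordering of the rows.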
The above geometric interpretation of ART gives some insight into the regularizing properties of this method. Let $\theta_{ij}$ denote the angle between the vector $a_i$ and the $j$th right singular vector $v_j$. Then, using that $a_i^T = e_i^T A$, where $e_i$ is the $i$th unit vector, we get
$$\cos\theta_{ij} = \frac{a_i^T}{\|a_i\|_2}\, v_j = \frac{e_i^T A\, v_j}{\|a_i\|_2} = \frac{\sigma_j\, e_i^T u_j}{\|a_i\|_2} = \frac{\sigma_j\, u_{ij}}{\|a_i\|_2}\,.$$
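The relation for $\cos\theta_{ij}$ can be illustrated with a tiny matrix whose SVD is known by construction. The sketch below builds $A = U \Sigma V^T$ from two plane rotations and two assumed singular values, one large and one small, and compares the directly computed cosines with $\sigma_j u_{ij} / \|a_i\|_2$. All numerical values and names are assumptions chosen for illustration.

```python
import math


def rotation(t):
    """2-by-2 rotation matrix, used here as an orthogonal factor."""
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]


# Assumed SVD factors: U and V orthogonal, one large and one small sigma.
U = rotation(0.3)
V = rotation(1.1)
sigma = [1.0, 1e-3]

# A = U diag(sigma) V^T, stored as a list of rows.
A = [[sum(U[i][j] * sigma[j] * V[l][j] for j in range(2)) for l in range(2)]
     for i in range(2)]

results = []  # entries: (i, j, cosine computed directly, cosine from formula)
for i in range(2):
    norm_ai = math.sqrt(sum(a * a for a in A[i]))
    for j in range(2):
        v_j = [V[0][j], V[1][j]]  # j-th right singular vector
        cos_direct = sum(a * v for a, v in zip(A[i], v_j)) / norm_ai
        cos_formula = sigma[j] * U[i][j] / norm_ai
        results.append((i, j, cos_direct, cos_formula))
        print(i, j, cos_direct, cos_formula)
```

For the small singular value the cosines come out close to zero, i.e., the rows are nearly orthogonal to the corresponding right singular vector.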
[Figure 6.3: relative errors $\|x^{\mathrm{exact}} - x^{[k]}\|_2 / \|x^{\mathrm{exact}}\|_2$ versus the iteration number $k$ (0 to 40, logarithmic vertical axis) for the Landweber and Kaczmarz (ART) methods.]
Figure 6.3. Example of the convergence histories for the Landweber and Kaczmarz (ART) methods applied to the shaw test problem of size $n = 128$. Both methods converge slowly, but the initial convergence of ART (during the first, say, 6 iterations) is quite fast.
We can conclude that the vectors $a_i$ are rich in the directions $v_j$ that correspond to large singular values $\sigma_j$, while they are almost orthogonal to those singular vectors $v_j$ that correspond to the small singular values. Hence, the Kaczmarz (ART) iterates $x^{[k^{(i)}]}$ (which always lie on the hyperplanes $\mathcal{H}_i$ and thus in the direction of the vectors $a_i$) are guaranteed to have large components in the direction of the right singular vectors corresponding to the larger singular values. On the other hand, components in directions that correspond to small singular values are hardly present. In other words, the iterates tend to behave as regularized solutions.
Figure 6.3 shows an example of the error histories, i.e., the relative errors $\|x^{\mathrm{exact}} - x^{[k]}\|_2 / \|x^{\mathrm{exact}}\|_2$ as functions of $k$, for the Landweber and Kaczmarz (ART) methods. The Landweber parameter was chosen as $\omega = \|A\|_F^{-2}$ (using the Frobenius norm as an overestimate of $\sigma_1$). We see that both methods exhibit semiconvergence: they actually converge (during the first many iterations) toward the exact solution $x^{\mathrm{exact}}$. Moreover, both methods converge very slowly. During the first few iterations the convergence of ART is quite good; but the method essentially stagnates after about 10 iterations. Clearly, if better accuracy is required, then there is a need for faster general-purpose iterative methods.