A Few Stationary Iterative Methods - Iterative Regularization

Iterative Regularization

6.1 A Few Stationary Iterative Methods

We start our treatment of iterative regularization methods with a short discussion of a few classical stationary iterative methods that exhibit semiconvergence. While these methods are not suited as general-purpose iterative regularization methods because they can exhibit very slow convergence, they have certain specialized applications, e.g., in tomography, where they are competitive with other methods. A nice overview of these methods can be found in Chapter 7 of [48].

Downloaded 07/26/14 to 129.107.136.153. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php

6.1.1 Landweber and Cimmino Iteration

One of the best known methods that exhibits semiconvergence is usually referred to as Landweber iteration (although many scientists have independently discovered this simple method). In its basic form the method looks as follows:

$$x^{[k+1]} = x^{[k]} + \omega\, A^T (b - A\, x^{[k]}), \qquad k = 1, 2, \ldots, \tag{6.1}$$

where $\omega$ is a real number that must satisfy $0 < \omega < 2\,\|A^T A\|_2^{-1} = 2/\sigma_1^2$. That’s it; all that is needed in each iteration is to compute the residual vector $r^{[k]} = b - A\, x^{[k]}$ and then multiply with $A^T$ and $\omega$; this correction is then added to the iterate $x^{[k]}$ to obtain the next iterate.
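The iteration (6.1) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `landweber`, the zero starting vector, and the random test data are assumptions made here and are not part of the text.

```python
import numpy as np

def landweber(A, b, omega, num_iters, x0=None):
    """Run num_iters steps of x <- x + omega * A^T (b - A x)."""
    # Assumption: start from the zero vector unless x0 is supplied.
    x = np.zeros(A.shape[1]) if x0 is None else np.array(x0, dtype=float)
    for _ in range(num_iters):
        r = b - A @ x                # residual vector
        x = x + omega * (A.T @ r)    # add the correction to the iterate
    return x

# Hypothetical, well-conditioned test data (not from the text).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
x_true = rng.standard_normal(5)
b = A @ x_true                       # noise-free right-hand side

sigma1 = np.linalg.norm(A, 2)        # largest singular value of A
omega = 1.0 / sigma1 ** 2            # satisfies 0 < omega < 2 / sigma1^2
x = landweber(A, b, omega, num_iters=5000)
```

For this small, well-conditioned and noise-free system the iterates simply converge to the exact solution; semiconvergence only shows up for ill-conditioned problems with noisy data.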

The iterate x^[k] has a simple expression as a filtered SVD solution, similar to the TSVD and Tikhonov solutions; cf. (5.1). Specifically, we can write the kth iterate as

x^[k]= V Φ^[k]Σ⁻¹U^Tb,

where the elements of the diagonal matrix Φ^[k] = diag(ϕ^[k]₁ , . . . , ϕ^[k]_n ) are the ﬁlter factors for x^[k], which are given by

$$\varphi_i^{[k]} = 1 - (1 - \omega\, \sigma_i^2)^k, \qquad i = 1, 2, \ldots, n. \tag{6.2}$$

Moreover, for small singular values $\sigma_i$ we have that $\varphi_i^{[k]} \approx k\, \omega\, \sigma_i^2$, i.e., they decay with the same rate as the Tikhonov filter factors $\varphi_i^{[\lambda]}$ (4.11). (The verification of these results is left as Exercise 6.1.) Figure 6.2 shows how these filters “evolve” as the number of iterations k increases. We see that as k increases, the filters are shifted toward smaller singular values, and more SVD components are effectively included in the iterate $x^{[k]}$.
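A quick numerical check of the filtered SVD expression may help here. The sketch below assumes a zero starting vector and lets k count the number of Landweber steps taken; the random test matrix and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)          # hypothetical random test data
A = rng.standard_normal((8, 4))
b = rng.standard_normal(8)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
omega = 1.0 / s[0] ** 2                 # inside 0 < omega < 2 / sigma_1^2
k = 25

# k Landweber steps, assuming a zero starting vector.
x_iter = np.zeros(A.shape[1])
for _ in range(k):
    x_iter = x_iter + omega * (A.T @ (b - A @ x_iter))

# Filtered SVD solution V * Phi * Sigma^{-1} * U^T b with the factors (6.2).
phi = 1.0 - (1.0 - omega * s ** 2) ** k
x_svd = Vt.T @ (phi * (U.T @ b) / s)

# Difference between the two expressions; should be at rounding level.
filter_error = np.linalg.norm(x_iter - x_svd)

# Small-sigma behaviour: phi is close to k * omega * sigma^2.
sigma_small = 1e-3
phi_small = 1.0 - (1.0 - omega * sigma_small ** 2) ** k
approx_ratio = phi_small / (k * omega * sigma_small ** 2)
```

Under these assumptions `filter_error` is at the level of rounding errors and `approx_ratio` is close to 1.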

If we, quite arbitrarily, define the break-point in the filter factors as the value $\sigma_{\mathrm{break}}^{[k]}$ of $\sigma_i$ for which $\varphi_i^{[k]} = 0.5$, then it is easy to show that

$$\sigma_{\mathrm{break}}^{[k]} = \sqrt{\frac{1 - (\tfrac{1}{2})^{1/k}}{\omega}}\, .$$

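The break-points are indicated in Figure 6.2 by the circles. To see how the break-point varies with k, let us consider the ratio $\sigma_{\mathrm{break}}^{[k]} / \sigma_{\mathrm{break}}^{[2k]}$ between the break-points at k and 2k iterations:

$$\frac{\sigma_{\mathrm{break}}^{[k]}}{\sigma_{\mathrm{break}}^{[2k]}} = \sqrt{\frac{1 - (\tfrac{1}{2})^{1/k}}{1 - (\tfrac{1}{2})^{1/(2k)}}} = \sqrt{\frac{\bigl(1 + (\tfrac{1}{2})^{1/(2k)}\bigr)\bigl(1 - (\tfrac{1}{2})^{1/(2k)}\bigr)}{1 - (\tfrac{1}{2})^{1/(2k)}}} = \sqrt{1 + (\tfrac{1}{2})^{1/(2k)}} \;\to\; \sqrt{2} \quad \text{for } k \to \infty .$$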
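The break-point formula and the limit of the ratio are easy to check numerically. The helper name `sigma_break` and the chosen values of k are illustrative assumptions.

```python
import numpy as np

def sigma_break(k, omega=1.0):
    """Value of sigma at which the Landweber filter factor equals 0.5."""
    return np.sqrt((1.0 - 0.5 ** (1.0 / k)) / omega)

def landweber_filter(sigma, k, omega=1.0):
    """Landweber filter factor 1 - (1 - omega * sigma^2)^k."""
    return 1.0 - (1.0 - omega * sigma ** 2) ** k

# Ratio of break-points at k and 2k iterations for a few values of k.
ratios = {k: sigma_break(k) / sigma_break(2 * k) for k in (10, 20, 40, 80, 10_000)}

# Filter factor evaluated at the break-point for k = 40.
phi_at_break = landweber_filter(sigma_break(40), 40)
```

The entries of `ratios` approach $\sqrt{2}$ from below as k grows, and `phi_at_break` equals 0.5 up to rounding.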


[Figure 6.2: Landweber filter factors ϕ^[k]_i plotted against σ_i on a logarithmic axis (10⁻² to 10⁰), with curves for k = 10, 20, 40, 80 (ω = 1) and k = 10 (ω = 0.2).]

Figure 6.2. The Landweber ﬁlter factors ϕ^[k]_i = 1− (1 − ω σi²)^k are functions of the number of iterations k and the parameter ω. The circles indicate the break-points σ^[k]_break in the ﬁlter factors where they attain the value 0.5.

Hence, as k increases, the break-point tends to be reduced by a factor $\sqrt{2} \approx 1.4$ each time the number of iterations k is doubled. The consequence is that the semiconvergence toward the exact solution is quite slow.

Another classical iterative method with semiconvergence, known as Cimmino iteration, takes the basic form

x^[k+1] = x^[k]+ ω A^TD (b− A x^[k]), k = 1, 2, . . . ,

in which D = diag(di) is a diagonal matrix whose elements are deﬁned in terms of the rows a^T_i = A(i , : ) of A as

$$d_i = \begin{cases} \dfrac{1}{m}\, \dfrac{1}{\|a_i\|_2^2}, & a_i \neq 0, \\[1ex] 0, & a_i = 0. \end{cases}$$

Landweber’s and Cimmino’s methods are two special cases of a more general class of stationary Landweber-type methods, sometimes referred to as simultaneous iterative reconstruction techniques [71]. These methods take the form

x^[k+1] = x^[k]+ ω A^TM (b− A x^[k]), k = 1, 2, . . . ,

where M is a symmetric and positive (semi)deﬁnite matrix. The study of this class of methods is similar to that of Landweber’s method, except that in the analysis we substitute M^1/2A and M^1/2b for A and b.
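As an illustration of a Landweber-type method with a diagonal M, the following sketch implements Cimmino's iteration (M = D). The function names, the zero starting vector, the test data, and the choice ω = 1 are assumptions made here; ω = 1 is admissible because the weights give $\|D^{1/2}A\|_2^2 \le \|D^{1/2}A\|_F^2 \le 1$.

```python
import numpy as np

def cimmino_weights(A):
    """Diagonal of D: d_i = 1 / (m * ||a_i||_2^2) for nonzero rows, else 0."""
    m = A.shape[0]
    row_norms_sq = np.sum(A ** 2, axis=1)
    d = np.zeros(m)
    nonzero = row_norms_sq > 0
    d[nonzero] = 1.0 / (m * row_norms_sq[nonzero])
    return d

def cimmino(A, b, omega, num_iters):
    """Run x <- x + omega * A^T D (b - A x), starting from zero (assumption)."""
    d = cimmino_weights(A)
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        x = x + omega * (A.T @ (d * (b - A @ x)))
    return x

# Hypothetical consistent test problem whose last row is zero.
rng = np.random.default_rng(2)
A = np.vstack([rng.standard_normal((20, 5)), np.zeros((1, 5))])
x_true = rng.standard_normal(5)
b = A @ x_true
d = cimmino_weights(A)
x = cimmino(A, b, omega=1.0, num_iters=2000)
```

Storing D as a vector and applying it elementwise avoids forming an m-by-m diagonal matrix.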


The Landweber-type methods can be augmented to take the form

$$x^{[k+1]} = \mathcal{P}\bigl( x^{[k]} + \omega\, A^T M\, (b - A\, x^{[k]}) \bigr), \tag{6.3}$$

where the operator $\mathcal{P}$ represents so-called hard constraints by which we can incorporate additional a priori knowledge about the regularized solution. Exercise 6.2 illustrates this for the case where $\mathcal{P}$ represents a nonnegativity constraint.
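A minimal sketch of the projected form (6.3), assuming M = I and taking the hard constraint to be nonnegativity, so that the projection simply sets negative elements to zero. The function name `projected_landweber`, the zero starting vector, and the test data are hypothetical.

```python
import numpy as np

def projected_landweber(A, b, omega, num_iters):
    """Landweber step followed by projection onto the nonnegative orthant."""
    x = np.zeros(A.shape[1])                 # assumption: zero starting vector
    for _ in range(num_iters):
        x = x + omega * (A.T @ (b - A @ x))  # plain Landweber step (M = I)
        x = np.maximum(x, 0.0)               # P: set negative elements to zero
    return x

# Hypothetical consistent test problem with a nonnegative exact solution.
rng = np.random.default_rng(3)
A = rng.standard_normal((20, 5))
x_true = np.abs(rng.standard_normal(5)) + 0.1
b = A @ x_true
omega = 1.0 / np.linalg.norm(A, 2) ** 2      # inside 0 < omega < 2 / sigma_1^2
x = projected_landweber(A, b, omega, num_iters=3000)
```

Every iterate produced this way is nonnegative by construction, whatever the data.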

6.1.2 ART, a.k.a. Kaczmarz’s Method

Another classical method that is worth mentioning here is Kaczmarz’s method, also known in computerized tomography as the algebraic reconstruction technique (ART); cf. Section 5.3.1 in [56]. In this so-called row action method, the kth iteration consists of a “sweep” through the rows of A, in which the current solution vector x^[k] (for k = 0, 1, 2, . . .) is updated as follows:

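$$\begin{array}{l}
x^{[k^{(0)}]} = x^{[k]} \\[0.5ex]
\textbf{for } i = 1, \ldots, m \\[0.5ex]
\qquad x^{[k^{(i)}]} = x^{[k^{(i-1)}]} + \dfrac{b_i - a_i^T x^{[k^{(i-1)}]}}{\|a_i\|_2^2}\; a_i \\[1.5ex]
\textbf{end} \\[0.5ex]
x^{[k+1]} = x^{[k^{(m)}]}
\end{array}$$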

Here $b_i$ is the ith component of the right-hand side b, and $a_i$ is the ith row turned into a column vector. The row norms $\|a_i^T\|_2 = \|a_i\|_2$ are, of course, precomputed once.
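The sweep described above can be sketched as follows. The function name `kaczmarz`, the zero starting vector, the skipping of zero rows, and the test data are assumptions added for illustration.

```python
import numpy as np

def kaczmarz(A, b, num_sweeps, x0=None):
    """Run num_sweeps sweeps of Kaczmarz's method (ART) over the rows of A."""
    m, n = A.shape
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    row_norms_sq = np.sum(A ** 2, axis=1)    # squared row norms, computed once
    for _ in range(num_sweeps):
        for i in range(m):
            if row_norms_sq[i] == 0.0:
                continue                     # added safeguard: skip zero rows
            a_i = A[i]
            # Update with the ith row: x <- x + (b_i - a_i^T x) / ||a_i||^2 * a_i
            x = x + (b[i] - a_i @ x) / row_norms_sq[i] * a_i
    return x

# Hypothetical consistent, well-conditioned test problem.
rng = np.random.default_rng(5)
A = rng.standard_normal((30, 5))
x_true = rng.standard_normal(5)
b = A @ x_true
x = kaczmarz(A, b, num_sweeps=200)

# After a single row update, that row's equation is satisfied exactly.
x_one_row = kaczmarz(A[:1], b[:1], num_sweeps=1)
```

Each inner step touches only one row of A, which is why the method is called a row action method.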

All experience shows that the ART method always converges quite quickly in the ﬁrst few iterations, after which the convergence may become very slow. It was probably the fast initial convergence that made it a favorite method in the early days of computerized tomography, where only a few iterations were carried out.

It can be shown that Kaczmarz’s method is mathematically equivalent to applying Gauss–Seidel iterations to the problem x = A^Ty , A A^Ty = b. The method can be written in the form of a Landweber-type method, but with a matrix M that is nonsymmetric. This severely complicates the analysis of the method, and the iterates cannot be written in the form (5.1) with a diagonal ﬁlter matrix.

Let us for a moment study Kaczmarz’s method from a geometric perspective. As shown in [54], each iterate $x^{[k^{(i)}]}$ is obtained by projecting the previous iterate $x^{[k^{(i-1)}]}$ orthogonally onto the hyperplane $\mathcal{H}_i = \{\, x \mid a_i^T x = b_i \,\}$ defined by the ith row $a_i^T$ and the corresponding element $b_i$ of the right-hand side. Also, it is proved in [54] that if A is square and nonsingular, then

$$\|A^{-1}b - x^{[k^{(i)}]}\|_2^2 = \|A^{-1}b - x^{[k^{(i-1)}]}\|_2^2 - \frac{\bigl(b_i - a_i^T x^{[k^{(i-1)}]}\bigr)^2}{\|a_i\|_2^2}\, ;$$

i.e., the norm of the error $A^{-1}b - x^{[k^{(i)}]}$ is nonincreasing.
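The projection property and the decrease of the squared error can be checked numerically for a single row update. The sketch takes the decrease term to be the squared distance from the previous iterate to the hyperplane, $(b_i - a_i^T x)^2 / \|a_i\|_2^2$, which is what an orthogonal projection implies; the test matrix and the row index are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)   # hypothetical nonsingular matrix
b = rng.standard_normal(n)
x_star = np.linalg.solve(A, b)                    # A^{-1} b

x_prev = rng.standard_normal(n)                   # arbitrary previous iterate
i = 2                                             # arbitrary row index
a_i = A[i]
r_i = b[i] - a_i @ x_prev

# One Kaczmarz step: orthogonal projection onto {x | a_i^T x = b_i}.
x_next = x_prev + r_i / (a_i @ a_i) * a_i

err_next_sq = np.sum((x_star - x_next) ** 2)
err_prev_sq = np.sum((x_star - x_prev) ** 2)
predicted = err_prev_sq - r_i ** 2 / (a_i @ a_i)

# The correction is parallel to a_i, i.e., orthogonal to the hyperplane.
on_hyperplane = abs(a_i @ x_next - b[i])
```

Here `err_next_sq` agrees with `predicted` up to rounding, and `on_hyperplane` is at rounding level.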

The above geometric interpretation of ART gives some insight into the regularizing properties of this method. Let $\theta_{ij}$ denote the angle between the vector $a_i$ and the jth right singular vector $v_j$. Then, using that $a_i^T = e_i^T A$, where $e_i$ is the ith unit vector, we get

$$\cos\theta_{ij} = \frac{a_i^T v_j}{\|a_i\|_2} = \frac{e_i^T A\, v_j}{\|a_i\|_2} = \frac{\sigma_j\, e_i^T u_j}{\|a_i\|_2} = \frac{\sigma_j\, u_{ij}}{\|a_i\|_2}\, .$$


[Figure 6.3: relative errors ‖x^exact − x^[k]‖₂ / ‖x^exact‖₂ on a logarithmic axis (10⁻¹ to 10⁰) versus the iteration number k = 0, . . . , 40, with one curve for Landweber and one for Kaczmarz (ART).]

Figure 6.3. Example of the convergence histories for the Landweber and Kaczmarz (ART) methods applied to the shaw test problem of size n = 128. Both methods converge slowly, but the initial convergence of ART (during the first, say, 6 iterations) is quite fast.

We can conclude that the vectors $a_i$ are rich in the directions $v_j$ that correspond to large singular values $\sigma_j$, while they are almost orthogonal to those singular vectors $v_j$ that correspond to the small singular values. Hence, the Kaczmarz (ART) iterates $x^{[k^{(i)}]}$ (which always lie on the hyperplanes $\mathcal{H}_i$ and thus in the direction of the vectors $a_i$) are guaranteed to have large components in the direction of the right singular vectors corresponding to the larger singular values. On the other hand, components in directions that correspond to small singular values are hardly present. In other words, the iterates tend to behave as regularized solutions.
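The expression for $\cos\theta_{ij}$ in terms of the SVD can be verified numerically. The sketch below uses a hypothetical random matrix with nonzero rows; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 4))                  # hypothetical test matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)
row_norms = np.linalg.norm(A, axis=1)            # ||a_i||_2 for each row

# Entry (i, j) is a_i^T v_j / ||a_i||_2, computed directly from the rows.
cos_direct = (A @ Vt.T) / row_norms[:, None]

# Entry (i, j) is sigma_j * u_ij / ||a_i||_2, computed from the SVD.
cos_svd = (U * s) / row_norms[:, None]

# Each row a_i lies in the span of all v_j, so the squared cosines sum to 1.
row_sums = np.sum(cos_svd ** 2, axis=1)
```

The factor $\sigma_j$ in `cos_svd` is what makes the cosines small for the singular vectors that belong to small singular values.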

Figure 6.3 shows an example of the error histories, i.e., the relative errors $\|x^{\mathrm{exact}} - x^{[k]}\|_2 / \|x^{\mathrm{exact}}\|_2$ as functions of k, for the Landweber and Kaczmarz (ART) methods. The Landweber parameter was chosen as $\omega = \|A\|_F^{-2}$ (using the Frobenius norm as an overestimate of $\sigma_1$). We see that both methods exhibit semiconvergence: they actually converge (during the first many iterations) toward the exact solution $x^{\mathrm{exact}}$. Moreover, both methods converge very slowly. During the first few iterations the convergence of ART is quite good; but the method essentially stagnates after about 10 iterations. Clearly, if better accuracy is required, then there is a need for faster general-purpose iterative methods.

In document Discrete Inverse Problem - Insight and Algorithms (Page 106-110)