Convergence analysis - Implicit Fixed-point Proximity Framework

2.5 Implicit Fixed-point Proximity Framework

2.5.7 Convergence analysis

In the following, we conduct a convergence analysis on the implicit fixed-point proximity algorithm stated in Algorithm 1. In particular, we prove that the sequence generated by Algorithm 1 converges with a rate of O(_K1), where K is the number of outer iterations.

Algorithm 1 is developed from the perspective of the fixed-point proximity equation and is specifically designed for those algorithms in which the operator LM has a fully implicit

scheme. The operator LM is evaluated by performing inner iterations via contractive map-

pings. By taking into consideration the errors caused by the inner loops, the sequence {wk_}

generated by Algorithm 1 can be viewed as a sequence generated by the following inexact Krasnosel’skiˇı–Mann iteration

wk+1 = wk+ λk LM(wk) + ek− wk , (2.29)

where ek ∈ Rd represents the computational errors caused by computing LM approximately.

Theorem 2.5 for exact Krasnosel’skiˇı–Mann algorithms is not sufficient to guarantee the convergence of an implicit fixed-point proximity algorithm, and the convergence results for existing fixed-point proximity algorithms are also not applicable to implicit algorithms.

Therefore, we utilize the definitions and techniques introduced in [23, 25] to complete the proof for the convergence of Algorithm 1.

First, let us introduce the definition of quasi-Fejér monotonicity and some convergence results in [23]. The quasi-Fejér monotonicity is an extension of Fejér monotonicity, which can address the errors ek from inner iterations.

Definition 2.16 A sequence {xk_{} in R}d _{is said to be a quasi-Fej´}_{er monotone sequence}

relative to a target set S ⊆ Rd _{if there exists ε}

k ≥ 0 such that P εk < ∞ and for all x ∈ S,

kxk+1_{− xk ≤ kx}k_{− xk + ε} k.

Note that this definition of the quasi-Fejér monotonicity refers to the quasi-Fejér monotonicity of Type I in [23]. Also, if εk= 0 for all k ∈ N then quasi-Fejér monotone reduces to

Fej´er monotone.

Lemma 2.17 [23] Let C ∈ (0, 1], let {αk} be a sequence in (0, +∞), let {βk} be a sequence

in (0, +∞), and let {εk} be a summable sequence in (0, +∞) such that

αk+1 ≤ Cαk− βk+ εk.

Then {αk} converges and

k|βk| < ∞.

Theorem 2.18 Suppose {xk} is a quasi-Fej´er monotone sequence relative to a nonempty set S in Rd. Let C{xk} denote the set of cluster points of {xk_{}. Then {x}k_{} converges to a}

point in S if and only if C{xk_{} ⊂ S.}

Proof. The result follows from Proposition 3.2 and Theorem 3.8 in [23].

If the sequence {wk_{} generated by Algorithm 1 is quasi-Fej´er monotone with respect to the}

can be achieved by verifying the sufficient and necessary condition mentioned in Theorem 2.18. That is to verify if every convergent subsequence of {wk_{} converges to a solution of}

problem (P1).

Second, let us introduce the demiclosed principle for nonexpansive operators, in order to study the convergence behavior of each convergent subsequence of {wk}.

Theorem 2.19 (Demiclosed Principle) [38] _{Let T : R}d _{→ R}d _{be a nonexpansive op-}

erator with respect to H ∈ Sd

+, let {xk} be a sequence in Rd, and let x be a point in Rd.

Suppose that {xk_{} converges to x and that {x}k_{− T x}k_{} converges to 0. Then x ∈ Fix T .}

Now, with the results above, we are ready to prove the convergence of the sequence {wk_}

generated by equation (2.29). The proof is presented in three steps. First, we show that {wk} is a quasi-Fej´er monotone sequence relative to the solution set, which is Fix (prox_F,R◦ E) = Fix (LM). Second, we prove that the assumptions of the demiclosed principle are satisfied,

and that therefore the limit of {wk_{} is exactly a fixed point of L}

M. Last, by applying both

Theorem 2.18 and Theorem 2.19, the convergence of {wk_{} is eventually achieved.}

Note that the proof of the following theorem is adapted from the proof in [23, 24] so that it is more applicable to implicit fixed-point proximity algorithms. And this proving strategy can be utilized for implicit fixed-point proximity algorithms with uncommon structures.

Theorem 2.20 Assume that Fix (prox_F,R◦ E) 6= ∅. Let LM be an M -operator associated

with prox_F,R◦ E and LM is nonexpansive with respect to H ∈ Sd+. Let {λk} be a sequence in

[0, 1] and {ek} be a sequence in Rd such that P_kλk(1 − λk) = +∞ and P_kλkkekkH < ∞.

Suppose the sequence {wk_{} is generated by equation (2.29) with an initial vector w}0 _{∈ R}d_.

Then the following statements hold.

(i) {wk} is a quasi-Fej´er monotone sequence relative to Fix (prox_F,R◦ E);

(ii) {LMwk− wk} converges to 0;

(iii) {wk_{} converges to a point in Fix (prox}

Proof. Let w∗ ∈ Fix (prox_Φ,R ◦ E), then by Proposition 2.8, w∗ _{= prox}

F,R(Ew

∗_{) and w}∗ ₌

LM(w∗).

(i) It follows from the nonexpansive property of LM that

wk+1− w∗ H = (1 − λk)wk+ λkLM(wk) + λkek− w∗ H ≤ λkkLM(wk) − w∗kH + λkkekkH + (1 − λk)kwk− w∗kH = λkkLM(wk) − LM(w∗)kH + λkkekkH + (1 − λk)kwk− w∗kH ≤ kwk− w∗kH + λkkekkH.

Hence, {wk_{} is a quasi-Fej´er monotone sequence relative to Fix (prox}

Φ,R◦ E).

(ii) Since (1 − λk)Id + λkLM is λk-average, then by Lemma 2.4

(1 − λk)wk+ λkLM(wk) − w∗ 2 H ≤ kw k_{− w}∗_k2 H− 1 − λk λk k(1 − λk)wk+ λkLM(wk) − wkk2H. Let C = sup_k{2 wk_{− w}∗ H + kekkH} < +∞. Then we have wk+1− w∗ 2 H = (1 − λk)wk+ λkLM(wk) + λkek− w∗ 2 H ≤ (1 − λk)wk+ λkLM(wk) − w∗ H + λkkekkH 2 ≤ (1 − λk)wk+ λkLM(wk) − w∗ 2 H + CkekkH ≤ kwk− w∗k2_H − 1 − λk λk k(1 − λk)wk+ λkLM(wk) − wkk2H + CkekkH = kwk− w∗k2_H − λk(1 − λk)kLM(wk) − wkk2H + CkekkH.

It follows from Lemma 2.17 that P

kλk(1 − λk)kLM(w

k_{) − w}k_k2

H < +∞. Since

λk) = +∞, then lim infk→∞kLM(wk) − wkk2H = 0. We have kLM(wk+1) − wk+1kH = kLM(wk+1) − (1 − λk)wk− λkLM(wk) − λkekkH ≤ kLM(wk+1) − LM(wk)kH + (1 − λk)kLM(wk) − wkkH + λkkekkH ≤ kwk+1_{− w}k_k H + (1 − λk)kLM(wk) − wkkH + λkkekkH = kλk(LM(wk) − wk) + λkekkH + (1 − λk)kLM(wk) − wkkH + λkkekkH ≤ kLM(wk) − wkkH + 2λkkekkH. Since P

k2λkkekkH < +∞, then, by Lemma 2.17, kLM(wk) − wkkH converges to 0. Hence,

{LM(wk) − wk} converges to 0.

(iii) Suppose {wki} is a convergent subsequence of {wk}, then {L

M(wki)−wki}iconverges

to 0. By the demiclosed principle in Theorem 2.19, {wki} converges to a fixed point in Fix (prox_F,R◦ E). So C{wk_{} ⊂ Fix (prox}

F,R◦ E). Hence, it follows from Theorem 2.18 that

{wk_{} converges to a fixed point in Fix (prox}

F,R ◦ E).

If LM is firmly nonexpansive with respect to H, which is a stronger condition than the

nonexpansive assumption mentioned in Theorem 2.20, then the parameter λk can be chosen

in [0, 2] instead of [0, 1].

Corollary 2.21 Assume that Fix (prox_F,R◦ E) 6= ∅. Let LM be an M -operator associated

with prox_F,R ◦ E and LM is firmly nonexpansive with respect to H ∈ Sd+. Let {λk} be

a sequence in [0, 2] and {ek} be a sequence in Rd such that P_kλk(2 − λk) = +∞ and

kλkkekkH < ∞. Then sequence {w

k_{} generated by equation (2.29) with an initial vector}

w0 _{∈ R}d _{converges to a point in Fix (prox}

F,R◦ E).

a nonexpansive operator R with respect to H such that LM = 1₂Id +1₂R. Thus,

wk+1 = wk+λk 2 R(w

k_{) + 2ε}

k− wk .

Then the result immediately follows from Theorem 2.20.

The convergence results mentioned above guarantee that the sequence generated by Al- gorithm 1 converges if LM satisfies three properties discussed in the previous subsections,

and the errors caused by inner iterations are summable.

Next, we study the theoretical convergence rate of Algorithm 1 in the sense of the partial primal-dual gap [82] without taking inner errors into consideration. The partial primal-dual gap of the average of the sequence {w1, w2, . . . , wK}, generated by Algorithm 1, vanishes with a rate of O(_K1), where K is the number of outer iterations.

Proposition 2.22 Let LM be an M -operator associated with proxF,R◦ E. Suppose R(E −

I) ∈ Rd×d _{is a skew-symmetric matrix and R(E − M ) ∈ S}d

+. Let wk ∈ Rd, k = 0, . . . , K − 1

be a sequence generated by wk+1 _{= L}

M(wk). Then, for any vector w ∈ Rd, the partial

primal-dual gap

G( ¯wK, w) ≤ 1

2Kkw − w

0_k2

R(E−M ),

where G(w1, w2) = F (w1) − F (w2) + hw1, R(E − I)w2i, and ¯wK = _K1

k=1w k_.

Proof. Since wk+1 = prox_F,R(M wk+1+ (E − M )wk), then by equation (2.4)

By the definition of subdifferential in equation (2.1),

F (w) ≥ F (wk+1) + hR(E − M )wk+ R(M − I)wk+1, w − wk+1i

= F (wk+1) + hR(E − M )wk− R(E − M )wk+1_{+ R(E − I)w}k+1_{, w − w}k+1_i

= F (wk+1) + hR(E − M )(wk− wk+1_{), w − w}k+1_{i + hR(E − I)w}k+1_{, w − w}k+1_i.

Since R(E − I) is a skew-symmetric matrix, then we have

hR(E − I)wk+1, w − wk+1i = hR(E − I)wk+1, wi = hwk+1, R(E − I)wi

and

F (w) = F (wk+1) + hwk+1, R(E − I)wi + hw − wk+1, R(E − M )(wk− wk+1)i.

Hence, the partial primal-dual gap

G(wk+1, w) =F (wk+1) − F (w) + hwk+1, R(E − I)wi ≤hw − wk+1_{, R(E − M )(w}k+1_{− w}k_)i =1 2kw − w k_k2 R(E−M )− 1 2kw − w k+1_k2 R(E−M )− 1 2kw k+1_{− w}k_k2 R(E−M ).

Summing both sides from k = 0, 1, . . . , K − 1, we have

K−1 X k=0 G(wk+1, w) ≤ 1 2kw − w 0_k2 R(E−M )− 1 2kw − w K_k2 R(E−M )− 1 2 K−1 X k=0 kwk+1_{− w}k_k2 R(E−M ) ≤ 1 2kw − w 0_k2 R(E−M ), since kw − wKk2 R(E−M ) and kw k+1_{− w}k_k2

By the convexity of G(·, w), we have

G( ¯wK, w) ≤ 1

2Kkw − w

0_k2

Examples of Implicit Fixed-point

Proximity Algorithms

In this chapter, we propose to develop several implicit fixed-point proximity algorithms for different optimization problems. Those algorithms are established under the implicit fixed-point proximity framework introduced in Chapter 2. A detailed convergence analysis will be conducted for each proposed implicit algorithm.

Before presenting these implicit fixed-point proximity algorithms, we introduce the gen- eral procedure for constructing implicit algorithms for composite optimization problems that can be identified as problem (P1).

First, we formulate the composite optimization problem as problem (P1) and then char- acterize its solutions as the solutions of a system of fixed point equations in the form of the unified fixed point equation (2.12).

Second, to form an implicit operator LM defined in Definition 2.7, we propose a block

structure for the matrix M and carefully construct the block matrix elements. In particular, the resulting LM can be computed iteratively by contractive mappings.

Third, we obtain an implicit fixed-point proximity algorithm, which generates a sequence

{wk_{} by the following equation}

wk+1 = wk+ λk LM(wk) − wk , (2.17)

where λk > 0 and LM(wk) is the unique solution of the following implicit equation

w = prox_F,R(M w + (E − M )wk). (2.18)

Last, we study the assumptions on algorithm parameters in order to guarantee the proposed LM satisfies Property 1, Property 2 and Property 3 under the implicit fixed-point

proximity framework. Then a convergence analysis of the proposed implicit fixed-point proximity algorithm is conducted with theoretical results.

Now, we utilize this procedure to develop several implicit fixed-point proximity algorithms. Each algorithm is designed for a different optimization problem with different num- bers of convex functions and linear operators.

3.1 Example 1

The first example is to minimize a separable sum of two convex functions, of which one convex function is composited with a linear operator.

min

x∈Rnf1(x) + f2(Bx), (P2)

where B is an m × n matrix, and f1 : Rn → (−∞, +∞], and f2 : Rm → (−∞, +∞] are

proper, lower semi-continuous, and convex.

Among the optimization problems presented in Chapter 1, the following problems have the form of problem (P2): ROF model, L1-TV denoising model, framelet-based denoising models, and `1-regularized linear least squares problem.

In document Implicit Fixed-point Proximity Framework for Optimization Problems and Its Applications (Page 49-59)