Preliminaries - Operator Splitting Methods for Convex and Nonconvex Optimization

In addition to the preliminaries introduced in Sec. 1.4, we also need the following in the chapter. Deﬁne the recession function of a PCC function f as

rec f (d) = lim

α→∞

f (x + αd)− f(x)

α (5.1)

for any x∈ dom f. Loosely speaking, the recession function characterizes the asymp- totic change of f as we go in direction d. In fact,

f (x + αd) = α rec f (d) + o(α)

as α → ∞ for any x ∈ dom f. The recession function rec f : Rn → R ∪ {∞} is a

positively homogeneous PCC function. If h(x) = g(−x), then rec(h∗)(d) = rec(g∗)(−d). When f and g are PCC, either f (x)+g(x) =∞ for all x ∈ Rnor rec(f +g) = rec f +rec g. If f is PCC, then σdom f∗ = rec f .

Deﬁne the proximal operator Proxf :Rn→ Rn as

Proxf(z) = arg min x∈Rn

f (x) + (1/2)kx − zk2o.

When f is PCC, the arg min uniquely exists, and therefore Proxf is well-deﬁned. When C is closed and convex, ProxδC = ΠC. When f is PCC, Proxf+ Proxf∗ = I, where

I :Rn_{→ R}n _{is the identity operator.}

A mapping T :Rn → Rnis nonexpansive ifkT (x)−T (y)k ≤ kx−yk for all x, y ∈ Rn. Nonexpansive mappings are, by deﬁnition, Lipschitz continuous with Lipschitz constant 1. T :Rn_{→ R}n _{is ﬁrmly-nonexpansive if}

kT (x) − T (y)k2 _{≤ hx − y, T (x) − T (y)i}

5.2.1 Duality and primal subvalue

We call the optimization problem

minimize

x∈Rn f (x) + g(x), (P)

the primal problem. We call the optimization problem maximize

ν∈Rn −f

∗_(ν)_{− g}∗₍_−ν), _(D)

the dual problem. Throughout this chapter, we assume f and g are PCC.

(P) is feasible if 0 ∈ dom f − dom g, strongly infeasible if 0 /∈ dom f − dom g, and weakly infeasible otherwise. (P) falls under exactly one of the three cases. (P) is infeasible if it is not feasible. (D) is feasible if 0 ∈ dom (f∗) + dom (g∗), strongly infeasible if 0 /∈ dom (f∗) + dom (g∗), and weakly infeasible otherwise.

We call p⋆ _{= inf}{f(x)+g(x) | x ∈ Rn} the primal optimal value and d⋆ _{= sup}{−f∗_(ν)− g∗(−ν) | ν ∈ Rn_{} the dual optimal value. We let p}⋆ ₌_{∞ if (}_P_{) is infeasible and d}⋆ ₌_−∞

if (D) is infeasible. Weak duality, which always holds, states d⋆ ≤ p⋆. We say strong duality holds between (P) and (D), if d⋆ _{= p}⋆ _{∈ [−∞, ∞]. We say total duality holds}

between (P) and (D), if (P) has a solution, (D) has a solution, and strong duality holds. Deﬁne the primal subvalue of (P) as

p− = lim

ε→0+_x,yinf∈Rn{f(x) + g(y) | kx − yk ≤ ε} .

The notion of primal subvalue is standard in conic programming [118, 234, 149, 147]. Here, we generalize it to general convex programs. The following theorem is well known [195], although we have not seen it stated exactly in this form.

Theorem 5.2.1. If f and g are PCC, then d⋆ _{= p}− _{≤ p}⋆_.

Proof. Write h(ν) = f∗(ν) + g∗(−ν), and deﬁne

p(δ) = min

x∈Rn{f(x + δ) + g(x)}.

Since f and g are proper, i.e., ﬁnite somewhere, p is proper. Since p is deﬁned by partial minimization of a convex function, it is convex.

Then p∗(ν) =− min δ∈Rn{p(δ) − ν T_δ_} =− min δ∈Rn min x∈Rn{f(x + δ) + g(x)} − ν T_δ =− min x∈Rn min δ∈Rn{f(x + δ) − ν T δ} + g(x) =− min x∈Rn min δ0∈Rn{f(δ 0₎_{− ν}T_δ0_{} + ν}T_{x + g(x)} = f∗(ν)− min x∈Rn n νTx + g(x)o = f∗(ν) + g∗(−ν) = h(ν).

We can rewrite the deﬁnition of the primal subvalue as

p−= lim

ε→0_kδk≤εinf p(δ) = lim infδ→0 p(δ),

where the second equality follows from the deﬁnition of lim inf. The lower semi- continuous hull of p is p∗∗ [195, Theorem 4 and 5], i.e.,

lim inf δ→0 p(δ) = p ∗∗_(0). So p−= p∗∗(0) = h∗(0) = sup ν∈Rn{f ∗_{(ν) + g}∗₍_{−ν)} = d}∗_.

With Theorem 5.2.1, we can interpret strong duality as well-posedness of (P). The primal subvalue p−is the optimal value of (P) achieved with infinitesimal infeasibilities. When the infinitesimal infeasibilities provide a non-infinitestimal improvement to the function value, we can consider (P) ill-posed.

5.2.2 Douglas–Rachford operator

Douglas–Rachford splitting (DRS) applied to (P) is

xk+1/2= Proxγf(zk)

xk+1 = Proxγg(2xk+1/2− zk) (5.2)

zk+1 = zk+ xk+1− xk+1/2

with a starting point z0 _{∈ R}n _{and a parameter γ > 0. We also express this iteration}

more concisely as zk+1 = Tγ(zk) where Tγ =

1 2I +

2(2 Proxγg−I)(2 Proxγf−I).

Tγ :Rn→ Rn is a ﬁrmly-nonexpansive operator, and we interpret DRS as a ﬁxed-point

iteration. Write T1 for Tγ with γ = 1.

The standard analysis of DRS assumes total duality, which, again, means (P) has a solution, (D) has a solution, and d⋆ _{= p}⋆_.

Theorem 5.2.2 (Theorem 7.1 and 8.1 of [13] and Proposition 4.8 of [80]). Total duality

holds between (P) and (D) if and only if Tγ has a ﬁxed point for some γ > 0. If total duality holds between (P) and (D), then DRS converges in that zk _{→ z}⋆_{, where}

x⋆ _{= Prox}

γf(z⋆) is a solution of (P). If total duality does not hold between (P) and

(D), then DRS diverges in that kzk_{k → ∞.}

Theorem 5.2.2 is well known, although the term “total duality” is not always used. More often, total duality is assumed by instead assuming a saddle point exists for an appropriate Lagrangian.

5.2.3 Fixed-point iterations without ﬁxed points

Theorem 5.2.2 states the DRS iteration has no fixed points under pathologies. Ana- lyzing fixed-point iterations without fixed points is the first part of our pathological analysis.

Let T :Rn → Rn _{be a ﬁrmly-nonexpansive operator. Write}

ran (I− T ) = {z − T (z) | z ∈ Rn}.

Note that T has a ﬁxed point if and only if 0∈ ran (I − T ). The closure of this set,

ran (I− T ), is closed and convex [172]. We call

v = Π_{ran (I}_{−T )}(0)

the infimal displacement vector of T . (The term was coined in [22].) If T has a fixed point, then v = 0, but v = 0 is possible even when T has no fixed point.

The following classical result by Pazy and Baillon et al. elegantly characterizes the asymptotic behavior of ﬁxed-point iterations with respect to T .

Theorem 5.2.3 (Theorem 2 of [172] and Corollary 2.3 of [11]). If T is ﬁrmly-nonexpansive

and v is its inﬁmal displacement vector, the iteration zk+1 _{= T (z}k_{) satisﬁes}

zk =−kv + o(k), zk+1− zk → −v.

Theorem 5.2.3 is especially powerful when we can concretely characterize v. Re- cently, Bauschke, Hare, and Moursi published the following elegant formula.

Theorem 5.2.4 ([23]). The inﬁmal displacement vector v of T1, the DRS operator,

satisﬁes

v = arg minnkzk | z ∈ dom f − dom g ∩ dom f∗+ dom g∗o.

The original result in [23] is more general as it applies to the DRS operator of monotone operators. In Section5.3.1, we use Theorem5.2.4and the notion of improving directions to provide a further concrete characterization of v.

In document Operator Splitting Methods for Convex and Nonconvex Optimization (Page 156-160)