• No results found

A symmetric version of the generalized alternating direction method of multipliers for two block separable convex programming

N/A
N/A
Protected

Academic year: 2020

Share "A symmetric version of the generalized alternating direction method of multipliers for two block separable convex programming"

Copied!
21
0
0

Loading.... (view fulltext now)

Full text

(1)

R E S E A R C H

Open Access

A symmetric version of the generalized

alternating direction method of multipliers

for two-block separable convex programming

Jing Liu

1,2

, Yongrui Duan

1

and Min Sun

3,4*

*Correspondence:

[email protected] 3School of Mathematics and

Statistics, Zaozhuang University, Shandong, 277160, P.R. China 4School of Management, Qufu

Normal University, Shandong, 276826, P.R. China

Full list of author information is available at the end of the article

Abstract

This paper introduces a symmetric version of the generalized alternating direction method of multipliers for two-block separable convex programming with linear equality constraints, which inherits the superiorities of the classical alternating direction method of multipliers (ADMM), and which extends the feasible set of the relaxation factor

α

of the generalized ADMM to the infinite interval [1, +∞). Under the conditions that the objective function is convex and the solution set is nonempty, we establish the convergence results of the proposed method, including the global convergence, the worst-caseO(1/k) convergence rate in both the ergodic and the non-ergodic senses, wherekdenotes the iteration counter. Numerical experiments to decode a sparse signal arising in compressed sensing are included to illustrate the efficiency of the new method.

MSC: 90C25; 90C30

Keywords: alternating direction method of multipliers; convex programming; mixed variational inequalities; compressed sensing

1 Introduction

We consider the two-block separable convex programming with linear equality con-straints, where the objective function is the sum of two individual functions with decou-pled variables:

minθ(x) +θ(x)|Ax+Ax=b,x∈X,x∈X

, ()

whereθi:RniR(i= , ) are closed proper convex functions;AiRl×ni(i= , ) and

bRl, andXiRni (i= , ) are given nonempty closed convex sets. The linear

con-strained convex problem () is a unified framework of many problems arising in real world, including compressed sensing, image restoration, and statistical learning, and so forth (see, for example, [–]). An important special case of () is the following linear inverse problem:

min

xRnμx+ 

Axy

, ()

(2)

whereARm×nandyRm are given matrix and vector,μ>  is a regularization

pa-rameter andx is the-norm of a vectorxdefined asx=ni=|xi|. Then setting x:=Axy,x:=x, () can be converted into the following two-block separable convex programming:

min x

+μx 

s.t. –x+Ax=y,

x∈Rm,x∈Rn,

()

which is a special case of problem () with the following specifications:

θ(x) :=  x

, θ

(x) :=μx, A:= –Im, A:=A, b:=y.

1.1 Existing algorithms

In their seminal work, Glowinskiet al.[] and Gabayet al.[] independently developed the alternating direction method of multipliers (ADMM), which is an influential first-order method for solving problem (). ADMM can be regarded as an application of the Douglas-Rachford splitting method (DRSM) [] to the dual of (), or a special case of the proximal point algorithm (PPA) [, ] in the cyclic sense. We refer to [] for a more detailed rela-tionship. With any initial vectorsx

∈X,λ∈Rl, the iterative scheme of ADMM reads

⎧ ⎪ ⎪ ⎨ ⎪ ⎪ ⎩

xk+

 ∈argminx∈X{θ(x) –x

Aλk+

β

Ax+Axk–b}, xk+∈argminx∈X{θ(x) –x

Aλk+

βAx

k+

 +Ax–b},

λk+=λkβ(A

xk++Axk+–b),

()

where λRlis the Lagrangian multiplier andβ >  is a penalty parameter. The main

characteristics of ADMM are that it in full exploits the separable structure of problem (), and that it updates the variablesx,x,λin an alternating order by solving a series of low-dimensional sub-problems with only one unknown variable.

In the past few decades, ADMM has received a revived interest, and it has become a research focus in optimization community, especially in the (non)convex optimiza-tion. Many efficient ADMM-type methods have been developed, including the proxi-mal ADMM [, ], the generalized ADMM [], the symmetric ADMM [], the inertial ADMM [], and some proximal ADMM-type methods [–]. Specifically, the proximal ADMM attaches some proximal terms to the sub-problems of ADMM (). The general-ized ADMM updates the variablesxandλby including a relaxation factorα∈(, ), and

(3)

proper proximal matrix. In fact, most proximal ADMM-type methods need to estimate the matrix normAi Ai(i= , ), which demands lots of calculations, especially for large ni(i= , ). Quite recently, some customized Douglas-Rachford splitting algorithms [–

], and the proximal ADMM-type methods with indefinite proximal regularization are developed [, ], which dissolve the above problem to some extent. All the above men-tioned ADMM-type methods are generalizations of the classical ADMM, because they all reduce to the iterative scheme () by choosing some special parameters. For more new development of the ADMM-type methods, including the convergence rate, acceleration techniques, its generalization for solving multi-block separable convex programming and nonconvex, nonsmooth programming, we refer to [–].

1.2 Contributions and organization

We are going to further study the generalized ADMM. Note that the first sub-problem in the generalized ADMM is irrelevant to the relaxation factorα. That is, the updating formula forx does not incorporate the relaxation factorαexplicitly. Furthermore,α∈ (, ) is often advantageous for the generalized ADMM []. Therefore, in this paper, we are going to propose a new generalized ADMM, whose both sub-problems incorporate the relaxation factorαdirectly. The new method generalizes the method proposed in [] by relaxing the feasible set ofαfrom the interval [, ) to the infinite interval [, +∞), and can be viewed as a symmetric version of the generalized ADMM.

The rest of the paper is organized as follows. In Section , we summarize some necessary preliminaries and characterize problem () by a mixed variational inequality problem. In Section , we describe the new symmetric version of the generalized ADMM and establish its convergence results in detail. In Section , some compressed sensing experiments are given to illustrate the efficiency of the proposed method. Some conclusions are drawn in Section .

2 Preliminaries

In this section, some necessary preliminaries which are useful for further discussions are presented, and to make our analysis more succinct, some positive definite or positive semi-definite block matrices are defined and their properties are investigated.

For two real matricesARs×m,BRn×s, the Kronecker product ofAandBis defined asAB= (aijB). Let·p(p= , ) denote the standard definition ofp-norm; in particular, · = · . For any two vectorsx,yRn, x,yorxydenote their inner product, and for any symmetric matrixGRn×n, the symbolG (resp.,G) denotes thatGis

positive definite (resp., semi-definite). For anyxRnandG, theG-normxGof the

vectorxis defined as√xGx. The effective domain of a closed proper functionf :X→ (–∞, +∞] is defined asdom(f) :={xX|f(x) < +∞}, and the symbolri(C) denotes the set of all relative interior points of a given nonempty convex setC. Furthermore, we use the following notations:

x= (x,x), w= (x,λ).

Definition .([]) A functionf :RnRis convex if and only if

(4)

Then, for a convex functionf:RnR, we have the following basic inequality:

f(x)≥f(y) + ξ,xy, ∀x,yRn,ξ∂f(y), () where∂f(y) ={ξRn:fy)f(y) + ξ,y¯–y, for ally¯∈Rn}denotes the subdifferential of f(·) at the pointy.

Throughout the paper, we make the following standard assumptions for problem ().

Assumption . The functionsθi(·) (i= , ) are convex. Assumption . The matricesAi(i= , ) are full-column rank.

Assumption . The generalized Slater condition holds, i.e., there is a point (xˆ,xˆ)∈ ri(domθ×domθ)∩ {x= (x,x)∈X×X|Ax+Ax=b}.

2.1 The mixed variational inequality problem

Under Assumption ., it follows from Theorem . and Theorem . of [] thatx∗= (x∗,x)∈Rn+nis an optimal solution to problem () if and only if there exists a vector

λ∗∈Rlsuch that (x∗,x,λ∗) is a solution of the following KKT system:

⎧ ⎨ ⎩

∈∂θi(x∗i) –Ai λ∗+NXi(x ∗

i), i= , ,

Ax∗+Ax∗=b,

()

whereNXi(x∗i) is the normal cone of the convex setXiat the pointxi, which is defined

asNXi(x∗i) ={zRni| z,x

ixi ≤,∀xiXi}. Then, for the nonempty convex setXiand ∀xiXi, it follows from [] (Example .) thatNXi(xi) =∂δ(·|Xi)(xi), whereδ(·|Xi) is the indicator function of the setXi, and∂δ(·|Xi)(xi) is the subdifferential mappings of

δ(·|Xi) at the pointxiXi.

Lemma . For any vector xiRni, λ∗ ∈Rl, the relationship ∂θ

i(x∗i) – Ai λ∗ +

∂δ(·|Xi)(x∗i)is equivalent to xiXiand the inequality

θi(xi) –θi xi

+ xixi

–Aiλ∗≥, ∀xiXi.

Proof From ∈ ∂θi(x∗i) –Aiλ∗ +∂δ(·|Xi)(x∗i), we have xiXi and there exists ηi

∂δ(·|Xi)(x∗i) such that

Ai λ∗–ηi∂θi xi

.

From the subgradient inequality (), one has

θi(xi) –θi xi

xixi

Aiλ∗–ηi

, ∀xiRni.

Thus,

θi(xi) –θi xi

xixi

–Ai λ∗≥ xixi

(–ηi)≥, ∀xiXi,

(5)

Conversely, fromθi(xi) –θi(x∗i) + (xixi)(–Ai λ∗)≥,∀xiXi, we have

θi(xi) +xi –A

θi xi

+ xi –Ai λ∗, ∀xiXi,

which together withxiXiimplies that

xi =argmin

xiXi

θi(xi) +xi –A

.

From this and Theorem . of [], we have ∈∂θi(x∗i) –Aiλ∗+∂δ(·|Xi)(x∗i). This

com-pletes the proof.

Remark . Based on () and Lemma ., the vectorx∗= (x∗,x∗)∈Rn+nis an optimal

solution to problem () if and only if there exists a vectorλ∗∈Rlsuch that

⎪ ⎪ ⎨ ⎪ ⎪ ⎩

(x∗,x∗)∈X×X;

θi(xi) –θi(x∗i) + (xixi)(–A∗)≥, ∀xiXi,i= , ;

Ax∗+Ax∗=b.

()

Moreover, anyλ∗∈Rlsatisfying () is an optimal solution to the dual of problem ().

Ob-viously, () can be written as the following mixed variational inequality problem, denoted byVI(W,F,θ): Find a vectorw∗∈Wsuch that

θ(x) –θ x∗+ wwF w∗≥, ∀wW, ()

whereθ(x) =θ(x) +θ(x),W=X×X×Rl, and

F(w) :=

⎛ ⎜ ⎝

–Aλ

–Aλ

Ax+Ax–b

⎞ ⎟ ⎠=

⎛ ⎜ ⎝

  –A   –A AA 

⎞ ⎟ ⎠

⎛ ⎜ ⎝

xx

λ

⎞ ⎟ ⎠–

⎛ ⎜ ⎝

  b

⎞ ⎟

⎠. ()

The solution set ofVI(W,F,θ), denoted byW∗, is nonempty by Assumption . and Re-mark .. It is easy to verify that the linear functionF(·) is not only monotone but also satisfies the following desired property:

ww F wF(w)= , ∀w,wW.

2.2 Three matrices and their properties

To present our analysis in a compact way, now let us define some matrices. For anyRi

Rni×ni(i= , ), set

M=

⎛ ⎜ ⎝

In  

In 

 –βAIl ⎞ ⎟

(6)

and forα∈[, +∞), set

Q=

⎛ ⎜ ⎝

R  

R+ (α– )βAA –ααA

 –A αβIl

⎞ ⎟ ⎠,

H=

⎛ ⎜ ⎝

R  

R+α

–α+

α βAA –ααA  –ααAαβIl

⎞ ⎟ ⎠.

()

The above defined three matricesM,Q,Hsatisfy the following properties.

Lemma . IfαRand Ri (i= , ),then the matrix H defined in()is positive

semi-definite.

Proof Sett= α– α+ , which is positive for anyαR. By (), we have

H=

⎛ ⎜ ⎝

R    R 

  

⎞ ⎟ ⎠+

⎛ ⎜ ⎝

  

tαβAA –ααA  –ααAαβIl

⎞ ⎟ ⎠.

Obviously, the first part is positive semi-definite, and we only need to prove the second part is also positive semi-definite. In fact, it can written as

α

⎛ ⎜ ⎝

  

 √βA

  √

βIl

⎞ ⎟ ⎠

⎛ ⎜ ⎝

  

tIl ( –α)Il

 ( –α)Il Il ⎞ ⎟ ⎠

⎛ ⎜ ⎝

  

 √βA 

  √

βIl

⎞ ⎟ ⎠.

The middle matrix in the above expression can be further written as

⎛ ⎜ ⎝

  

t  –α

  –α

⎞ ⎟ ⎠⊗Il,

where ⊗denotes the matrix Kronecker product. The matrix Kronecker product has a nice property: for any two matricesXandY, the eigenvalue ofXY equals the product ofλ(X)λ(Y), whereλ(X) andλ(Y) are the eigenvalues ofXandY, respectively. Therefore, we only need to show the -by- matrix

t  –α

 –α

is positive semi-definite. In fact,

t– ( –α)=α≥.

(7)

Lemma . Ifα∈[, +∞)and Ri (i= , ),then the matrices M,Q,H defined,

respec-tively,in(), ()satisfy the following relationships:

HM=Q ()

and

Q+QMHMα–  α M

HM. ()

Proof From () and (), we have

HM=

⎛ ⎜ ⎝

R  

R+α

–α+

α βA

A –ααA

 –α

α A

αβIl

⎞ ⎟ ⎠ ⎛ ⎜ ⎝

In  

In 

 –βAIl ⎞ ⎟ ⎠ = ⎛ ⎜ ⎝

R  

R+ (α– )βAA –ααA

 –A αβIl

⎞ ⎟ ⎠=Q.

Then the first assertion is proved. For (), by some simple manipulations, we obtain

MHM=MQ

=

⎛ ⎜ ⎝

In  

In –βA   Il

⎞ ⎟ ⎠ ⎛ ⎜ ⎝

R  

R+ (α– )βAA –ααA

 –A αβIl

⎞ ⎟ ⎠ = ⎛ ⎜ ⎝

R  

R+ αβAA –A  –A αβIl ⎞ ⎟ ⎠.

We now break up the proof into two cases. First, ifα= , then

Q+QMHM=

⎛ ⎜ ⎝

R    R    βIl

⎞ ⎟ ⎠.

Therefore, () holds. Second, ifα∈(, +∞), then

Q+QMHM

=

⎛ ⎜ ⎝

R  

R+ (α– )βAA –ααA   –ααAαβIl

⎞ ⎟ ⎠ = ⎛ ⎜ ⎝

R    R 

  

⎞ ⎟

⎠+ (α– )

⎛ ⎜ ⎝

  

βAA –αA   –αAαβ(α–)Il

⎞ ⎟

(8)

Note that

α

βAA –αA –αAαβ(α–)Il

αβAA –A –A αβIl

=

αβAA –A –A αβα(+α–)Il

=

βA   √

βIl

αIl –Il

–Il α(αα+–)Il

βA   √

βIl

. ()

The middle matrix in the above expression can be further written as

α – – α+ α(α–)

Il.

Since

α – – αα(α+–)

, ∀α> ,

the right-hand side of () is also positive semi-definite. Thus, we have

βAA –αA –αAαβ(α–)Il

α

αβAA –A –A αβIl

. ()

Substituting () into () and by the expression ofMHM, we obtain (). The lemma is

proved.

3 Algorithm and convergence results

In this section, we first describe the symmetric version of the generalized alternating di-rection method of multipliers (SGADMM) forVI(W,F,θ) formally, and then we prove its global convergence in a contraction perspective and establish its worst-caseO(/k) con-vergence rate in both the ergodic and the non-ergodic senses step by step, wherekdenotes the iteration counter.

3.1 Algorithm

Algorithm .(SGADMM)

Step . Choose the parametersα∈[, +∞),β> ,RiRni×ni(i= , ), the tolerance

ε> and the initial iterate(x

,x,λ)∈X×X×Rl. Setk:= .

Step . Generate the new iteratewk+= (xk+

 ,xk+,λk+)by

⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩

xk+

 ∈argminx∈X{θ(x) –x

Aλk+

αβ

Ax+Ax

k

–b

+x–xkR},

xk+

 ∈argminx∈X{θ(x) –x

Aλk+( α–)β

Axk++Ax–b +x–xkR},

λk+=λkβ[αAxk+– ( –α)(Axk–b) +Axk+–b].

(9)

Step . If

maxRxk–Rxk+,Rxk–Rxk+,Axk–Axk+,λkλk+<ε, ()

then stop and return an approximate solution(xk+,xk+,λk+)ofVI(W,F,θ); else

setk:=k+ , and goto Step .

Remark . Obviously, the iterative scheme () reduces to the generalized ADMM when

α= , and further reduces to () whenRi=  (i= , ). That is to say, if the parameters

α=  andRi=  (i= , ), then the classical ADMM is recovered. Since the convergence

results of the (proximal) ADMM have been established in the literature [, , ], in the following, we only considerα∈(, +∞).

3.2 Global convergence

For further analysis, we need to define an auxiliary sequence{ ˆwk}as follows:

ˆ

wk=

⎛ ⎜ ⎝ ˆ

xk

ˆ

xk

ˆ

λk ⎞ ⎟ ⎠=

⎛ ⎜ ⎝

xk+ xk+

λkαβ(A

xk++Axk–b)

⎞ ⎟

⎠. ()

Lemma . Let{λk+}andλk}be the two sequences generated by SGADMM.Then

λk+=λˆkβ Axˆk–Axk

()

and

ˆ

λk

α –  λˆ

kλk=λk– (α– )β A

xˆk+Axk–b

. ()

Proof From the definition ofλk+, we get

λk+=λkβαAxˆk– ( –α) Axk–b

+Axˆk–b

=λkβα Axˆk+Axk–b

+ Axˆk–Axk

=λˆkβ Axˆk–Axk

.

Then () is proved. For (), we have

ˆ

λk

α –  λˆ

kλk

=λkαβ Axˆk+Axk–b

+

α– 

αβ Axˆk+Axk–b

=λk– (α– )β Axˆk+Axk–b

.

(10)

Thus, based on () and (), the two sequences{wk}and{ ˆwk}satisfies the following

relationship:

wk+=wkM wkwˆk, ()

whereMis defined in ().

The following lemma shows that the stopping criterion () of SGADMM is reasonable.

Lemma . If Rixki =Rixik+ (i= , ),Axk=Axk+andλk=λk+,then the iteratewˆk= (xˆk

,xˆk,λˆk)produced by SGADMM is a solution ofVI(W,F,θ).

Proof By invoking the optimality condition of the three sub-problems in (), we have the following mixed variational inequality problems: for anyw= (x,x,λ)∈W,

⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩

θ(x) –θ(xˆk) + (x–xˆk){–A[λkαβ(Axˆk+Axk–b)] +R(xˆk–xk)} ≥,

θ(x) –θ(xˆk) + (x–xˆk){–A[λk– (α– )β(Axˆk+Axˆk–b)]} +R(xˆk–xk)≥,

(λλˆk)[αAxˆk– ( –α)(Axkb) +Axˆk–b– (λkλk+)/β]≥. Then, adding the above three inequalities and by (), (), we get

θ(x) –θ xˆk+ wwˆk

⎧ ⎪ ⎨ ⎪ ⎩ ⎛ ⎜ ⎝

–Aλˆk

–Aλˆk

Axˆk+Axˆk–b

⎞ ⎟ ⎠

+

⎛ ⎜ ⎝

R(xˆk–xk)

(α– )βA(Axˆk–Axk) + ( –α)A(λˆkλk)/α+R(xˆk–xk) ( –α)(Ax˜k–Axk)/α+ (λk+–λk)/(αβ)

⎞ ⎟ ⎠ ⎫ ⎪ ⎬ ⎪ ⎭≥.

Then by (), we obtain

θ(x) –θ xˆk+ wwˆk

⎧ ⎪ ⎨ ⎪ ⎩F wˆ

k

+

⎛ ⎜ ⎝

R(xˆk–xk)

(α– )βA(Ax˜k–Axk) + ( –α)A(λˆkλk)/α+R(xˆk–xk) –(Axˆk–Axk) + (λˆkλk)/(αβ)

⎞ ⎟ ⎠ ⎫ ⎪ ⎬ ⎪ ⎭≥.

Then, by () (the definition ofQ), the above inequality can be rewritten as

θ(x) –θ xˆk+ wwˆkF wˆkwwˆkQ wkwˆk, () for anywW. Therefore, ifRixik=Rixki+(i= , ),Axk=Axk+ andλk=λk+, then by (), we haveλk+=λˆk. Thenλˆk=λk. Thus, we have

(11)

which together with () implies that

θ(x) –θ xˆk+ wwˆkF wˆk≥, ∀wW.

This indicates that the vectorwˆk is a solution ofVI(W,F,θ). This completes the proof.

Lemma . Let{wk}and{ ˆwk}be two sequences generated by SGADMM.Then,for any

wW,we have

wwˆkQ wkwˆk≥ 

ww

k+

Hww k

H

+α–  α w

kwk+

H. ()

Proof Applying the identity

(a–b)H(cd) =

ad

HacH

+

cb

HdbH

,

with

a=w, b=wˆk, c=wk, d=wk+,

we obtain

wwˆkH wkwk+= 

ww

k+

Hww k

H

+  w

kwˆk

Hw

k+wˆk

H

.

This together with () and () implies that

wwˆkQ wkwˆk=   ww

k+

Hww k

H

+  w

kwˆkHw

k+wˆkH

. ()

Now let us deal with the last term in (), which can be written as

wkwˆkHwk+–wˆkH

=wkwˆkHwkwˆkwkwk+H

=wkwˆkHwkwˆkM wkwˆkH (using ())

=  wkwˆkHM wkwˆkwkwˆkMHM wkwˆk

= wkwˆk Q+QMHM wkwˆk

α– 

α w

kwˆkMHM wkwˆk (using ())

=α–  α w

kwk+

H (using ()).

(12)

Theorem . Let{wk}and{ ˆwk}be two sequences generated by SGADMM.Then,for any

wW,we have

θ(x) –θ xˆk+ wwˆkF(w)

≥

ww

k+

Hww k

H

+α–  α w

kwk+

H. ()

Proof First, combining () and (), we get

θ(x) –θ xˆk+ wwˆkF wˆk

≥

ww

k+

Hww k

H

+α–  α w

kwk+

H.

From the monotonicity ofF(·), we have

wwˆk F(w) –F wˆk≥.

Adding the above two inequalities, we obtain the assertion (). The proof is completed.

With the above theorem in hand, we are ready to establish the global convergence of SGADMM for solvingVI(W,F,θ).

Theorem . Let{wk}be the sequence generated by SGADMM.Ifα> ,Ri+βAiAi

(i= , ),then the corresponding sequence{wk}converges to some w∞,which belongs toW∗.

Proof Settingw=w∗in (), we have

wkw∗Hα– 

α w

kwk+

H

≥θ xˆkθ x+ wˆkwF w+wk+w∗

H

wk+–w∗H,

where the second inequality follows fromw∗∈W∗. Thus, we have

wk+–w∗Hwkw∗Hα– 

α w

kwk+

H. ()

Summing overk= , , . . . ,∞, it yields

k=

wkwk+Hα

α– ww∗

H.

Byα>  and the positive semi-definite ofH, the above inequality implies that lim

k→∞

wkwk+H= .

Thus, by the definition ofH, we have

lim

k→∞x

k

–xk+ 

R=klim→∞v

kvk+

(13)

where

H=

R+α

–α+

α βAA –ααA –α

α A

αβIl

,

is positive definite byR+βAA. From () again, we have

wk+–w∗Hw–w∗H,

which indicates that the sequence{Hwk}is bounded. Thus,{Rxk}∞k=and{Hvk}∞k=are both bounded. Then{vk}

k=is bounded. IfR,{xk}∞k=is bounded; otherwise,AA , that is, A is full-column rank, which together with Ax = (λkλk+)/(αβ) + ( –

α)(Axk–b)/α– (Axk+–b)/α implies that {xk}∞k=is bounded. In conclusion,{wk}∞k= is bounded.

Then, from () andH, the sequence{vk}is convergent. Suppose it converges tov∞. Letw∞= (x∞ ,v∞) be a cluster point of{wk}and{wkj}be the corresponding subsequence. On the other hand, by () and (), we have

lim

k→∞Rx

k

–xˆk

= , lim

k→∞ x

k

–xˆk

= 

and

lim

k→∞λ

kλˆk= lim k→∞λ

kλk++β A

xˆk–Axk

= .

Thus,

lim

k→∞Q w

kwˆk= . ()

Then, taking the limit along the subsequence{wkj}in () and using (), for anywW, we obtain

θ(x) –θ x∞+ wwF w∞≥,

which indicates thatw∞ is a solution ofVI(W,F,θ). Then, sincew∗ in () is arbitrary, we can setw∗=w∞ and conclude that the whole generated sequence{wk}converges by

Ri+βAi Ai (i= , ). This completes the proof.

3.3 Convergence rate

Now, we are going to prove the worst-caseO(/t) convergence rate of SGADMM in both the ergodic and the non-ergodic senses.

Theorem . Let{wk}and{ ˆwk}be the sequences generated by SGADMM,and set

¯

wt=  t+ 

t

k=

ˆ

(14)

Then,for any integer t≥,we havew¯tW,and

θ(x¯t) –θ(x) + (w¯tw)F(w)

(t+ )ww 

H, ∀wW. ()

Proof From () and the convexity of the setW, we havew¯kW. From (), we have

θ(x) –θ xˆk+ wwˆkF(w) +  ww

k

H

 ww

k+

H, ∀wW.

Summing the above inequality overk= , , . . . ,t, we get

(t+ )θ(x) –

t

k=

θ xˆk+

(t+ )w–

t

k=

ˆ

wk

F(w) + ww



H≥, ∀wW.

By the definition ofw¯t and the convexity ofθ(·), the assertion () follows immediately

from the above inequality. This completes the proof.

The proof of the next two lemmas is referred to those of Lemmas . and . in []. For completeness, we give the detail proof.

Lemma . Let{wk}be the sequence generated by SGADMM.Then we have

wkwk+H wkwk+– wk+–wk+

≥α– 

α w

kwk+ wk+wk+

H. ()

Proof Settingw=wˆk+in (), we have

θ xˆk+–θ xˆk+ wˆk+–wˆkF wˆkwˆk+–wˆkQwkwˆk. Similarly settingw=wˆkin () fork:=k+ , we get

θ xˆkθ xˆk++ wˆkwˆk+F wˆk+≥ wˆkwˆk+Q wk+–wˆk+.

Then, adding the above two inequalities and using the monotonicity of the mappingF(·), we get

ˆ

wkwˆk+Q wkwˆkwk+–wˆk+≥. ()

By (), we have

wkwk+Q wkwˆkwk+–wˆk+

= wkwˆkwk+–wˆk++ wˆkwˆk+Q wkwˆkwk+–wˆk+

= wkwˆkwk+–wˆk+Q+ wˆkwˆk+Q wkwˆkwk+–wˆk+

(15)

Using (), () andQ=HMon both sides of the above inequality, we get

wkwk+H wkwk+– wk+–wk+

= wkwk+QM– wkwk+– wk+–wk+

= wkwk+Q wkwˆkwk+–wˆk+

wkwˆkwk+–wˆk+Q

= wkwˆkwk+–wˆk+Q wkwˆkwk+–wˆk+

= wkwk+– wk+–wk+M–QM– wkwk+– wk+–wk+

≥α– 

α w

kwk+ wk+wk+

×M–MHMM– wkwk+– wk+–wk+

=α–  α w

kwk+ wk+wk+

H.

Then we get the assertion (). The proof is completed.

Lemma . Let{wk}be the sequence generated by SGADMM.Then we have

wk+–wk+Hwkwk+Hα–  α w

kwk+ wk+wk+

H. ()

Proof Settinga:= (wkwk+) andb:= (wk+wk+) in the identity

aHbH= aH(ab) –abH,

we can derive

wkwk+Hwk+–wk+H

=  wkwk+H wkwk+– wk+–wk+– wkwk+– wk+–wk+H

≥α– 

α w

kwk+ wk+wk+

Hw

kwk+ wk+wk+

H

=α–  α w

kwk+ wk+wk+

H,

which completes the proof of the lemma.

Based on Lemma ., now we establish the worst-case O(/t) convergence rate of SGADMM in a non-ergodic sense.

Theorem . Let{wk}be the sequence generated by SGADMM.Then,for any wW

and integer t≥,we have

wtwt+Hα

(t+ )(α– )ww∗

(16)

Proof By (), we get

α– 

α

t

k=

wkwk+Hw–w∗H.

This and () imply that

(t+ )(α– )

α w

twt+

Hw

w∗

H.

Therefore, the assertion of this theorem comes from the above inequality immediately.

The proof is completed.

Remark . From (), we see that the largerαis, the smallerαα–, which controls the up-per bounds ofwtwt+

H. Therefore, it seems that larger values ofαare more beneficial

for speeding up the convergence of SGADMM.

4 Numerical experiments

In this section, we present some numerical experiments to verify the efficiency of SGADMM for solving compressed sensing. Those numerical experiments are performed in Matlab Ra on a ThinkPad computer equipped with Windows XP,  MHz and  GB of memory.

Compressed sensing (CS) is to recover a sparse signalx¯∈Rnfrom an undetermined

linear systemb=Ax, where¯ ARm×n(mn), can be depicted as problem ().

Obviously, Problem () is equivalent to the following two models:

(a) Model : Problem (). (b) Model :

minμx+ 

Ax–y

s.t.–x–x= ,

x∈Rn,x∈Rn.

()

4.1 The iterative schemes for (3) and (35)

Since () and () are both some concrete models of (), SGADMM are applicable to them. Below, we elaborate on how to derive the closed-form solutions for the sub-problems re-sulting by SGADMM.

For problem (), its first two sub-problems resulting by SGADMM are as follows.

•With the givenxk

andλk, thex-sub-problem in () is (hereR= )

xk+=argmin

x∈Rn

 x

+xλ+

αβ

x–Ax

k

+y

,

which has the following closed-form solution:

xk+= 

 +αβ αβ Ax

k

–y

(17)

•With the updatedxk+, thex-sub-problem in () is (hereR=τIn– (α– )βAA

withτ≥(α– )βAA)

xk+=argmin

x∈Rn

μx–xAλk+

(α– )β

 –x

k+

 +Ax–y+  x–x

k

 

R

,

and its closed-form solution is given by

xk+=shrinkμ

τ (α– )βA xk+

 +y

/τ+ τIn– (α– )βAA

xk/τ+Aλk/τ, where, for anyc> ,shrinkc(·) is defined as

shrinkc(g) :=g–min

c,|g| g

|g|, ∀gR

n,

and (g/|g|)ishould be taken  if|g|i= .

Similarly, for problem (), its first two sub-problems resulting by SGADMM are as follows.

•With the givenxk

andλk, thex-sub-problem in () is (hereR= )

xk+=argmin

x∈Rn

μx+

αβ

x–

xk+ 

αβλ

k

 ,

and its closed-form solution is given by

xk+=shrinkμ αβ

xk+ 

αβλ

k

.

•With the updatedxk+

 , thex-sub-problem in () is (hereR=τInAAwithτAA)

xk+=argmin

x∈Rn

Ax–y+x

λk+

(α– )β

x–x

k+ 

 +

x–x

k

 

R

,

and its closed-form solution is given by

xk+= 

τ+ (α– )β A

yλk+ (α– )βxk+

 +AAxk

.

Obviously, the above two iterative schemes both need to computeAAandAy, which is quite time consuming ifnis large. However, noting that these two terms are invariant during the iteration process, therefore we need only compute them once before all itera-tions.

(18)

β=mean(abs(y))/(α– ) in our algorithm. As for the parameterα, we have pointed out in Remark . that larger values ofα may be beneficial for our algorithm. Here, we use () to do a little experiment to test this. We choose different values ofαin the interval [, ]. Specifically, we chooseα∈ {., ., . . . , }. Other data about this experiment are as follows: the proximal parameterτ is set asτ = .(α– )βAA; the observed signal yis set asy=Ax+ .×randn(m, ) in Matlab; the sensing matrixAand the original signalxare generated by

¯

A=randn(m,n), [Q,R] =qrA¯, , A=Q,

and

x=zeros(n, ); p=randperm(n); x p( :k)=randn(k, ).

Then the observed signalyis further set as (R)–y. The initial points are set asx =Ay,

λ=Ax

. In addition, we set the regularization parameterμ= ., and the dimensions of the problem are set asn= ,,m= ,k= , wherekdenotes the number of the non-zeros in the original signalx. To evaluate the quality of the recovered signal, let us define the quantity ‘RelErr’ as follows:

RelErr=˜xx

x ,

wherex˜denotes the recovered signal. The stopping criterion is

fkfk–

fk–

< –,

wherefkdenotes the function value of () at the iteratexk.

4.2 Numerical results

The numerical results are graphicly shown in Figure . Clearly, the numerical results in Figure  indicate that Remark . is reasonable. Both CPU time and number of iterations are descent with respect toα. Then, in the following, we setα= ., which is a moderate choice forα.

Now, let us graphically show the recovered results of SGADMM for () and (). The proximal parameterτis set asτ= .(α– )βAAfor (), andτ= .AAfor (). The initial points are set asx=Ay,λ=Ax

for (), andx=Ay,λ=xfor (). Other parameters are set the same as above. Figure  reports the numerical results of SGADMM for () and ().

The bottom two subplots in Figure  indicate that our new method SGADMM can be used to solve () and ().

(19)
[image:19.595.116.480.80.572.2] [image:19.595.118.478.83.328.2]

Figure 1 Sensitivity test on the parameterα.

Figure 2 Numerical results of SGADMM for (3) and (35).The top: the original signal; the second: the noisy measurement; the bottom two: recovered signal.

The numerical results are listed in Table , where ‘Time’ denotes the CPU time (in sec-onds), and ‘Iter’ denotes the number of iterations required for the whole recovering pro-cess,m=floor(γn),k=floor(σm). The numerical results are the average of the nu-merical results of ten runs with different combinations ofγ andσ.

4.3 Discussion

(20)
[image:20.595.117.479.98.193.2]

Table 1 Comparison of SGADMM1, SGADMM2 and ADMM

n γ σ SGADMM1 SGADMM2 ADMM

Time Iter RelErr Time Iter RelErr Time Iter RelErr

1,000 0.3 0.2 0.6911 92.4 0.0387 0.9578 266.0 0.0394 0.9812 264.0 0.0393

0.2 0.2 0.6661 118.6 0.0825 1.3915 421.8 0.0915 1.3603 419.6 0.0915

0.2 0.1 0.5008 85.3 0.0609 0.4758 139.0 0.0579 0.5008 138.0 0.0539

2,000 0.3 0.2 2.1965 90.0 0.0437 3.6535 267.7 0.0447 3.5412 265.6 0.0447

0.2 0.2 2.2339 109.6 0.0785 5.2182 431.4 0.0874 5.1543 429.0 0.0874

0.2 0.1 1.5428 79.9 0.0534 1.7893 142.8 0.0467 1.7613 140.8 0.0513

better than the other two methods. Especially the number of iterations of SGADMM is about at most two-thirds of the other two methods. This experiment also indicate that the model () is also an effective model for compressed sensing, and sometimes it is more ef-ficient than the model (), though they are equivalent in theory. In conclusion, by choos-ing some relaxation factorα∈[, +∞), SGADMM may be more efficient than the classical ADMM.

5 Conclusions

In this paper, we have proposed a symmetric version of the generalized ADMM (SGADMM), which generalizes the feasible set of the relaxation factorα from (, ) to [, +∞). Under the same conditions, we have proved the convergence results of the new method. Some numerical results illustrate that it may perform better than the classical ADMM. In the future, we shall study SGADMM withα∈(, ) to perfect the theoretical system.

Acknowledgements

The authors gratefully acknowledge the valuable comments of the anonymous reviewers. This work is supported by the National Natural Science Foundation of China (Nos. 11601475, 71532015), the foundation of First Class Discipline of Zhejiang-A (Zhejiang University of Finance and Economics - Statistics).

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

The first author has proved the convergence results; the second author has accomplished the numerical experiment; and the third author has proposed the motivation of the manuscript. All authors read and approved the final manuscript.

Author details

1School of Economics and Management, Tongji University, Shanghai, 200092, P.R. China.2School of Data Sciences,

Zhejiang University of Finance and Economics, Zhejiang, 310018, P.R. China. 3School of Mathematics and Statistics, Zaozhuang University, Shandong, 277160, P.R. China.4School of Management, Qufu Normal University, Shandong, 276826, P.R. China.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Received: 22 January 2017 Accepted: 10 May 2017

References

1. Beck, A, Teboulle, M: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci.

2, 183-202 (2009)

2. Yang, JF, Zhang, Y, Yin, WT: A fast alternating direction method for TVL1-L2 signal reconstruction from partial Fourier data. IEEE J. Sel. Top. Signal Process.4, 288-297 (2010)

3. Boyd, S, Parikh, N, Chu, E, Peleato, B, Eckstein, J: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn.3, 1-122 (2011)

(21)

5. Gabay, D, Mercier, B: A dual algorithm for the solution of nonlinear variational problems via finite-element approximations. Comput. Math. Appl.2, 17-40 (1976)

6. Lions, PL, Mercier, B: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal.16, 964-979 (1979)

7. Gabay, D: Applications of the method of multipliers to variational inequalities. In: Fortin, M, Glowinski, R (eds.) Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems, pp. 299-331. North-Holland, Amsterdam (1988)

8. Eckstein, J, Bertsekas, DP: On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program.55, 293-318 (1992)

9. Esser, E: Applications of Lagrangian-based alternating direction methods and connections to split Bregman. TR. 09-31, CAM, UCLA (2009)

10. He, BS, Liao, LZ, Han, D, Yang, H: A new inexact alternating directions method for monotone variational inequalities. Math. Program.92, 103-118 (2002)

11. Golshtein, EG, Tretyakov, NV: Modified Lagrangian in convex programming and their generalizations. Math. Program. Stud.10, 86-97 (1979)

12. He, BS, Ma, F, Yuan, XM: Convergence study on the symmetric version of ADMM with larger step sizes. SIAM J. Imaging Sci.9(3), 1467-1501 (2016)

13. Chen, CH, Ma, SQ, Yuan, JF: A general inertial proximal point algorithm for mixed variational inequality problem. SIAM J. Optim.25(4), 2120-2142 (2015)

14. Fang, EX, He, BS, Liu, H, Yuan, XM: Generalized alternating direction method of multipliers: new theoretical insights and applications. Math. Program. Comput.7(2), 149-187 (2015)

15. Sun, M, Liu, J: A proximal Peaceman-Rachford splitting method for compressive sensing. J. Appl. Math. Comput.50, 349-363 (2016)

16. Chen, CH, Chan, RH, Ma, SQ, Yuan, JF: Inertial proximal ADMM for linearly constrained separable convex optimization. SIAM J. Imaging Sci.8(4), 2239-2267 (2015)

17. Sun, HC, Sun, M, Zhou, HC: A proximal splitting method for separable convex programming and its application to compressive sensing. J. Nonlinear Sci. Appl.9, 392-403 (2016)

18. Sun, M, Wang, YJ, Liu, J: Generalized Peaceman-Rachford splitting method for multiple-block separable convex programming with applications to robust PCA. Calcolo54(1), 77-94 (2017)

19. Han, DR, He, HJ, Yang, H, Yuan, XM: A customized Douglas-Rachford splitting algorithm for separable convex minimization with linear constraints. Numer. Math.127, 167-200 (2014)

20. He, HJ, Cai, XJ, Han, DR: A fast splitting method tailored for Dantzig selectors. Comput. Optim. Appl.62, 347-372 (2015)

21. He, HJ, Han, DR: A distributed Douglas-Rachford splitting method for multi-block convex minimization problems. Adv. Comput. Math.42, 27-53 (2016)

22. He, BS, Ma, F, Yuan, XM: Linearized alternating direction method of multipliers via positive-indefinite proximal regularization for convex programming. Manuscript (2016)

23. Sun, M, Liu, J: The convergence rate of the proximal alternating direction method of multipliers with indefinite proximal regularization. J. Inequal. Appl.2017, 19 (2017)

24. He, BS, Liu, H, Wang, ZR, Yuan, XM: A strictly contractive Peaceman-Rachford splitting method for convex programming. SIAM J. Optim.24(3), 1011-1040 (2014)

25. Hou, LS, He, HJ, Yang, JF: A partially parallel splitting method for multiple-block separable convex programming with applications to robust PCA. Comput. Optim. Appl.63(1), 273-303 (2016)

26. Wang, K, Desai, J, He, HJ: A proximal partially-parallel splitting method for separable convex programs. Optim. Methods Softw.32(1), 39-68 (2017)

27. Xu, YY: Accelerated first-order primal-dual proximal methods for linearly constrained composite convex programming. Manuscript (2016)

28. Guo, K, Han, DR, Wang, DZW, Wu, TT: Convergence of ADMM for multi-block nonconvex separable optimization models. Front. Math. China (2017). doi:10.1007/s11464-017-0631-6

29. Sun, HC, Sun, M, Zhou, HC: A new generalized alternating direction method of multiplier for separable convex programming and its applications to image deblurring with wavelets. ICIC Express Lett.10(2), 271-277 (2016) 30. Rockafellar, RT: Convex Analysis. Princeton University Press, Princeton (1970)

31. Fukushima, M: Fundamentals of Nonlinear Optimization. Asakura, Tokyo (2001)

32. Bonnans, JF, Shapiro, A: Perturbation Analysis of Optimization Problems. Springer, New York (2000)

33. He, BS, Yuan, XM: On theO(1/n) convergence rate of the Douglas-Rachford alternating direction method. SIAM J. Numer. Anal.50(2), 700-709 (2012)

34. He, BS, Yuan, XM: On non-ergodic convergence rate of Douglas-Rachford alternating direction method of multipliers. Numer. Math.130, 567-577 (2015)

Figure

Figure 1 Sensitivity test on the parameter α.
Table 1 Comparison of SGADMM1, SGADMM2 and ADMM

References

Related documents