A symmetric version of the generalized alternating direction method of multipliers for two block separable convex programming

RESEARCH   Open Access

Jing Liu^{1,2}, Yongrui Duan^{1} and Min Sun^{3,4*}

*Correspondence: [email protected]
^{3}School of Mathematics and Statistics, Zaozhuang University, Shandong, 277160, P.R. China
^{4}School of Management, Qufu Normal University, Shandong, 276826, P.R. China
Full list of author information is available at the end of the article

Abstract

This paper introduces a symmetric version of the generalized alternating direction method of multipliers (ADMM) for two-block separable convex programming with linear equality constraints. The method inherits the advantages of the classical ADMM and extends the feasible set of the relaxation factor $\alpha$ of the generalized ADMM to the infinite interval $[1, +\infty)$. Under the conditions that the objective function is convex and the solution set is nonempty, we establish the convergence results of the proposed method, including global convergence and the worst-case $O(1/k)$ convergence rate in both the ergodic and the non-ergodic senses, where $k$ denotes the iteration counter. Numerical experiments on decoding a sparse signal arising in compressed sensing are included to illustrate the efficiency of the new method.

MSC: 90C25; 90C30

Keywords: alternating direction method of multipliers; convex programming; mixed variational inequalities; compressed sensing

1 Introduction

We consider two-block separable convex programming with linear equality constraints, where the objective function is the sum of two individual functions with decoupled variables:

$$\min\bigl\{\theta_1(x_1)+\theta_2(x_2)\mid A_1x_1+A_2x_2=b,\ x_1\in\mathcal{X}_1,\ x_2\in\mathcal{X}_2\bigr\}, \tag{1}$$

where $\theta_i:\mathbb{R}^{n_i}\to\mathbb{R}$ ($i=1,2$) are closed proper convex functions, $A_i\in\mathbb{R}^{l\times n_i}$ ($i=1,2$), $b\in\mathbb{R}^l$, and $\mathcal{X}_i\subseteq\mathbb{R}^{n_i}$ ($i=1,2$) are given nonempty closed convex sets. The linearly constrained convex problem (1) is a unified framework for many problems arising in the real world, including compressed sensing, image restoration, and statistical learning (see, for example, [1-3]). An important special case of (1) is the following linear inverse problem:

$$\min_{x\in\mathbb{R}^n}\ \mu\|x\|_1+\frac{1}{2}\|Ax-y\|^2, \tag{2}$$

where $A\in\mathbb{R}^{m\times n}$ and $y\in\mathbb{R}^m$ are a given matrix and vector, $\mu>0$ is a regularization parameter, and $\|x\|_1$ is the $\ell_1$-norm of a vector $x$, defined as $\|x\|_1=\sum_{i=1}^n|x_i|$. Then, setting $x_1:=Ax-y$, $x_2:=x$, problem (2) can be converted into the following two-block separable convex program:

$$\min\ \frac{1}{2}\|x_1\|^2+\mu\|x_2\|_1\quad \text{s.t.}\ -x_1+Ax_2=y,\quad x_1\in\mathbb{R}^m,\ x_2\in\mathbb{R}^n, \tag{3}$$

which is a special case of problem (1) with the following specifications:

$$\theta_1(x_1):=\frac{1}{2}\|x_1\|^2,\quad \theta_2(x_2):=\mu\|x_2\|_1,\quad A_1:=-I_m,\quad A_2:=A,\quad b:=y.$$

Indeed, eliminating $x_1=Ax_2-y$ from the constraint of (3) recovers the objective of (2).

1.1 Existing algorithms

In their seminal work, Glowinski et al. [4] and Gabay et al. [5] independently developed the alternating direction method of multipliers (ADMM), which is an influential first-order method for solving problem (1). ADMM can be regarded as an application of the Douglas-Rachford splitting method (DRSM) [6] to the dual of (1), or as a special case of the proximal point algorithm (PPA) [7, 8] in the cyclic sense. We refer to [9] for a more detailed relationship. With any initial vectors $x_2^0\in\mathcal{X}_2$, $\lambda^0\in\mathbb{R}^l$, the iterative scheme of ADMM reads

$$\begin{cases}
x_1^{k+1}\in\arg\min_{x_1\in\mathcal{X}_1}\bigl\{\theta_1(x_1)-x_1^\top A_1^\top\lambda^k+\frac{\beta}{2}\|A_1x_1+A_2x_2^k-b\|^2\bigr\},\\[2pt]
x_2^{k+1}\in\arg\min_{x_2\in\mathcal{X}_2}\bigl\{\theta_2(x_2)-x_2^\top A_2^\top\lambda^k+\frac{\beta}{2}\|A_1x_1^{k+1}+A_2x_2-b\|^2\bigr\},\\[2pt]
\lambda^{k+1}=\lambda^k-\beta\bigl(A_1x_1^{k+1}+A_2x_2^{k+1}-b\bigr),
\end{cases} \tag{4}$$

where $\lambda\in\mathbb{R}^l$ is the Lagrangian multiplier and $\beta>0$ is a penalty parameter. The main characteristics of ADMM are that it fully exploits the separable structure of problem (1), and that it updates the variables $x_1$, $x_2$, $\lambda$ in alternating order by solving a series of low-dimensional sub-problems, each with only one unknown variable.

In the past few decades, ADMM has received renewed interest and has become a research focus in the optimization community, especially in (non)convex optimization. Many efficient ADMM-type methods have been developed, including the proximal ADMM [10, 11], the generalized ADMM [8], the symmetric ADMM [12], the inertial ADMM [13], and some proximal ADMM-type methods [–]. Specifically, the proximal ADMM attaches some proximal terms to the sub-problems of ADMM (4). The generalized ADMM updates the variables $x_2$ and $\lambda$ by including a relaxation factor $\alpha\in(0,2)$, and

proper proximal matrix. In fact, most proximal ADMM-type methods need to estimate the matrix norm $\|A_i^\top A_i\|$ ($i=1,2$), which demands a lot of computation, especially for large $n_i$ ($i=1,2$). Quite recently, some customized Douglas-Rachford splitting algorithms [–] and proximal ADMM-type methods with indefinite proximal regularization [, ] have been developed, which alleviate the above problem to some extent. All the above-mentioned ADMM-type methods are generalizations of the classical ADMM, because they all reduce to the iterative scheme (4) for some special choice of parameters. For more recent developments of ADMM-type methods, including convergence rates, acceleration techniques, and generalizations to multi-block separable convex programming and to nonconvex, nonsmooth programming, we refer to [–].

1.2 Contributions and organization

We are going to further study the generalized ADMM. Note that the first sub-problem in the generalized ADMM is independent of the relaxation factor $\alpha$. That is, the updating formula for $x_1$ does not incorporate the relaxation factor $\alpha$ explicitly. Furthermore, $\alpha\in(1,2)$ is often advantageous for the generalized ADMM []. Therefore, in this paper, we propose a new generalized ADMM, both of whose sub-problems incorporate the relaxation factor $\alpha$ directly. The new method generalizes the method proposed in [] by relaxing the feasible set of $\alpha$ from the interval $[1,2)$ to the infinite interval $[1,+\infty)$, and can be viewed as a symmetric version of the generalized ADMM.

The rest of the paper is organized as follows. In Section 2, we summarize some necessary preliminaries and characterize problem (1) by a mixed variational inequality problem. In Section 3, we describe the new symmetric version of the generalized ADMM and establish its convergence results in detail. In Section 4, some compressed sensing experiments are given to illustrate the efficiency of the proposed method. Some conclusions are drawn in Section 5.

2 Preliminaries

In this section, we present some necessary preliminaries for the subsequent discussion. To make the analysis more succinct, we also define some positive definite or positive semi-definite block matrices and investigate their properties.

For two real matrices $A\in\mathbb{R}^{s\times m}$, $B\in\mathbb{R}^{n\times s}$, the Kronecker product of $A$ and $B$ is defined as $A\otimes B=(a_{ij}B)$. Let $\|\cdot\|_p$ ($p=1,2$) denote the standard $p$-norm; in particular, $\|\cdot\|=\|\cdot\|_2$. For any two vectors $x,y\in\mathbb{R}^n$, $\langle x,y\rangle$ or $x^\top y$ denotes their inner product, and for any symmetric matrix $G\in\mathbb{R}^{n\times n}$, the symbol $G\succ 0$ (resp., $G\succeq 0$) denotes that $G$ is positive definite (resp., semi-definite). For any $x\in\mathbb{R}^n$ and $G\succeq 0$, the $G$-norm $\|x\|_G$ of the vector $x$ is defined as $\sqrt{x^\top Gx}$. The effective domain of a closed proper function $f:\mathcal{X}\to(-\infty,+\infty]$ is defined as $\mathrm{dom}(f):=\{x\in\mathcal{X}\mid f(x)<+\infty\}$, and the symbol $\mathrm{ri}(\mathcal{C})$ denotes the set of all relative interior points of a given nonempty convex set $\mathcal{C}$. Furthermore, we use the following notations:

$$x=(x_1,x_2),\quad w=(x,\lambda).$$

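Definition 2.1 ([]) A function $f:\mathbb{R}^n\to\mathbb{R}$ is convex if and only if

$$f\bigl(\nu x+(1-\nu)y\bigr)\le \nu f(x)+(1-\nu)f(y),\quad\forall x,y\in\mathbb{R}^n,\ \nu\in[0,1].$$

Then, for a convex function $f:\mathbb{R}^n\to\mathbb{R}$, we have the following basic inequality:

$$f(x)\ge f(y)+\langle\xi,x-y\rangle,\quad\forall x,y\in\mathbb{R}^n,\ \xi\in\partial f(y), \tag{5}$$

where $\partial f(y)=\{\xi\in\mathbb{R}^n: f(\bar y)\ge f(y)+\langle\xi,\bar y-y\rangle \text{ for all } \bar y\in\mathbb{R}^n\}$ denotes the subdifferential of $f(\cdot)$ at the point $y$.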

Throughout the paper, we make the following standard assumptions for problem (1).

Assumption 2.1 The functions $\theta_i(\cdot)$ ($i=1,2$) are convex.

Assumption 2.2 The matrices $A_i$ ($i=1,2$) have full column rank.

Assumption 2.3 The generalized Slater condition holds, i.e., there is a point $(\hat x_1,\hat x_2)\in\mathrm{ri}(\mathrm{dom}\,\theta_1\times\mathrm{dom}\,\theta_2)\cap\{x=(x_1,x_2)\in\mathcal{X}_1\times\mathcal{X}_2\mid A_1x_1+A_2x_2=b\}$.

2.1 The mixed variational inequality problem

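Under Assumption 2.3, it follows from Theorem . and Theorem . of [] that $x^*=(x_1^*,x_2^*)\in\mathbb{R}^{n_1+n_2}$ is an optimal solution to problem (1) if and only if there exists a vector $\lambda^*\in\mathbb{R}^l$ such that $(x_1^*,x_2^*,\lambda^*)$ is a solution of the following KKT system:

$$\begin{cases}
0\in\partial\theta_i(x_i^*)-A_i^\top\lambda^*+N_{\mathcal{X}_i}(x_i^*),\quad i=1,2,\\
A_1x_1^*+A_2x_2^*=b,
\end{cases} \tag{6}$$

where $N_{\mathcal{X}_i}(x_i^*)$ is the normal cone of the convex set $\mathcal{X}_i$ at the point $x_i^*$, defined as $N_{\mathcal{X}_i}(x_i^*)=\{z\in\mathbb{R}^{n_i}\mid\langle z,x_i-x_i^*\rangle\le 0,\ \forall x_i\in\mathcal{X}_i\}$. Then, for the nonempty convex set $\mathcal{X}_i$ and any $x_i\in\mathcal{X}_i$, it follows from [] (Example .) that $N_{\mathcal{X}_i}(x_i)=\partial\delta(\cdot|\mathcal{X}_i)(x_i)$, where $\delta(\cdot|\mathcal{X}_i)$ is the indicator function of the set $\mathcal{X}_i$, and $\partial\delta(\cdot|\mathcal{X}_i)(x_i)$ is the subdifferential of $\delta(\cdot|\mathcal{X}_i)$ at the point $x_i\in\mathcal{X}_i$.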

Lemma . For any vector x∗_i ∈ Rni_, _λ∗ ∈Rl_, _{the relationship} _∈_∂θ

i(x∗i) – Ai λ∗ +

∂δ(·|Xi)(x∗i)is equivalent to x∗i ∈Xiand the inequality

θi(xi) –θi x∗i

+ xi–x∗i

–A_iλ∗≥, ∀xi∈Xi.

Proof From ∈ ∂θi(x∗i) –Aiλ∗ +∂δ(·|Xi)(x∗i), we have x∗i ∈Xi and there exists ηi ∈

∂δ(·|Xi)(x∗i) such that

A_i λ∗–ηi∈∂θi x∗i

.

From the subgradient inequality (), one has

≥ xi–x∗i

A_iλ∗–ηi

, ∀xi∈Rni.

Thus,

– xi–x∗i

–A_i λ∗≥ xi–x∗i

(–ηi)≥, ∀xi∈Xi,

(5)

Conversely, fromθi(xi) –θi(x∗i) + (xi–x∗i)(–Ai λ∗)≥,∀xi∈Xi, we have

θi(xi) +xi –Aiλ∗

≥θi x∗i

+ x∗_i –A_i λ∗, ∀xi∈Xi,

which together withx∗_i ∈Xiimplies that

x∗_i =argmin

xi∈Xi

θi(xi) +xi –Aiλ∗

.

From this and Theorem . of [], we have ∈∂θi(x∗i) –Aiλ∗+∂δ(·|Xi)(x∗i). This

com-pletes the proof.

Remark . Based on () and Lemma ., the vectorx∗= (x∗_,x_∗)∈Rn+n _{is an optimal}

solution to problem () if and only if there exists a vectorλ∗∈Rl_{such that} ⎧

⎪ ⎪ ⎨ ⎪ ⎪ ⎩

(x∗,x∗)∈X×X;

θi(xi) –θi(x∗i) + (xi–x∗i)(–Aiλ∗)≥, ∀xi∈Xi,i= , ;

Ax∗+Ax∗=b.

()

Moreover, anyλ∗∈Rl_{satisfying () is an optimal solution to the dual of problem ().}

Ob-viously, () can be written as the following mixed variational inequality problem, denoted byVI(W,F,θ): Find a vectorw∗∈Wsuch that

θ(x) –θ x∗+ w–w∗F w∗≥, ∀w∈W, ()

whereθ(x) =θ(x) +θ(x),W=X×X×Rl, and

F(w) :=

⎛ ⎜ ⎝

–A_λ

–A_λ

Ax+Ax–b

⎞ ⎟ ⎠=

⎛ ⎜ ⎝

  –A_   –A_ A A 

⎞ ⎟ ⎠

⎛ ⎜ ⎝

x x

λ

⎞ ⎟ ⎠–

⎛ ⎜ ⎝

  b

⎞ ⎟

⎠. ()

The solution set ofVI(W,F,θ), denoted byW∗, is nonempty by Assumption . and Re-mark .. It is easy to verify that the linear functionF(·) is not only monotone but also satisﬁes the following desired property:

w–w F w–F(w)= , ∀w,w∈W.

2.2 Three matrices and their properties

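To present our analysis in a compact way, let us now define some matrices. For any $R_i\in\mathbb{R}^{n_i\times n_i}$ ($i=1,2$), set

$$M=\begin{pmatrix}I_{n_1}&0&0\\ 0&I_{n_2}&0\\ 0&-\beta A_2&I_l\end{pmatrix} \tag{10}$$

and for $\alpha\in[1,+\infty)$, set

$$Q=\begin{pmatrix}R_1&0&0\\ 0&R_2+(2\alpha-1)\beta A_2^\top A_2&\frac{1-\alpha}{\alpha}A_2^\top\\ 0&-A_2&\frac{1}{\alpha\beta}I_l\end{pmatrix},\quad
H=\begin{pmatrix}R_1&0&0\\ 0&R_2+\frac{2\alpha^2-2\alpha+1}{\alpha}\beta A_2^\top A_2&\frac{1-\alpha}{\alpha}A_2^\top\\ 0&\frac{1-\alpha}{\alpha}A_2&\frac{1}{\alpha\beta}I_l\end{pmatrix}. \tag{11}$$

The three matrices $M$, $Q$, $H$ defined above satisfy the following properties.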

Lemma . Ifα∈Rand Ri (i= , ),then the matrix H deﬁned in()is positive

semi-deﬁnite.

Proof Sett= α– α+ , which is positive for anyα∈R. By (), we have

H=

⎛ ⎜ ⎝

R    R 

  

⎞ ⎟ ⎠+

⎛ ⎜ ⎝

  

 t_αβA_A –ααA  –_ααA αβ Il

⎞ ⎟ ⎠.

Obviously, the first part is positive semi-definite, and we only need to prove the second part is also positive semi-definite. In fact, it can written as



α

⎛ ⎜ ⎝

  

 √βA_ 

  √

βIl

⎞ ⎟ ⎠

⎛ ⎜ ⎝

  

 tIl ( –α)Il

 ( –α)Il Il ⎞ ⎟ ⎠

⎛ ⎜ ⎝

  

 √βA 

  √

βIl

⎞ ⎟ ⎠.

The middle matrix in the above expression can be further written as

⎛ ⎜ ⎝

  

 t  –α

  –α 

⎞ ⎟ ⎠⊗Il,

where ⊗denotes the matrix Kronecker product. The matrix Kronecker product has a nice property: for any two matricesXandY, the eigenvalue ofX⊗Y equals the product ofλ(X)λ(Y), whereλ(X) andλ(Y) are the eigenvalues ofXandY, respectively. Therefore, we only need to show the -by- matrix

t  –α

 –α 

is positive semi-deﬁnite. In fact,

t– ( –α)=α≥.
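As a quick numerical sanity check of this last step, the following sketch evaluates the 2-by-2 matrix for a few sample values of $\alpha$ and confirms that its determinant equals $\alpha^2$ and that its smallest eigenvalue is positive. The function names and the sample values of $\alpha$ are illustrative assumptions, not part of the paper.

```python
import math


def min_eig_sym2(a, b, d):
    """Smallest eigenvalue of the symmetric 2x2 matrix [[a, b], [b, d]]."""
    return 0.5 * (a + d) - math.sqrt((0.5 * (a - d)) ** 2 + b * b)


def middle_block(alpha):
    """Entries (t, 1 - alpha, 1) of the 2x2 matrix used in the proof."""
    t = 2.0 * alpha ** 2 - 2.0 * alpha + 1.0
    return t, 1.0 - alpha, 1.0


# Sample relaxation factors, chosen arbitrarily for illustration.
for alpha in (0.5, 1.0, 1.5, 3.0, 10.0):
    t, off, one = middle_block(alpha)
    det = t * one - off * off  # equals alpha**2 up to rounding
    print(alpha, det, min_eig_sym2(t, off, one))
```

The check covers only the scalar 2-by-2 factor; positive semi-definiteness of $H$ itself then follows from the Kronecker and congruence arguments in the proof.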


Lemma . Ifα∈[, +∞)and Ri (i= , ),then the matrices M,Q,H deﬁned,

respec-tively,in(), ()satisfy the following relationships:

HM=Q ()

and

Q+Q–MHMα–  α M

_HM. _()

Proof From () and (), we have

HM=

⎛ ⎜ ⎝

R  

 R+α

_–_α_+

α βA

A –_ααA

 –α

α A

 αβIl

⎞ ⎟ ⎠ ⎛ ⎜ ⎝

In  

 In 

 –βA Il ⎞ ⎟ ⎠ = ⎛ ⎜ ⎝

R  

 R+ (α– )βAA –_ααA

 –A αβ Il

⎞ ⎟ ⎠=Q.

Then the ﬁrst assertion is proved. For (), by some simple manipulations, we obtain

MHM=MQ

=

⎛ ⎜ ⎝

In  

 In –βA   Il

⎞ ⎟ ⎠ ⎛ ⎜ ⎝

R  

 R+ (α– )βAA –ααA

 –A _αβ Il

⎞ ⎟ ⎠ = ⎛ ⎜ ⎝

R  

 R+ αβAA –A  –A _αβ Il ⎞ ⎟ ⎠.

We now break up the proof into two cases. First, ifα= , then

Q+Q–MHM=

⎛ ⎜ ⎝

R    R    _βIl

⎞ ⎟ ⎠.

Therefore, () holds. Second, ifα∈(, +∞), then

Q+Q–MHM

=

⎛ ⎜ ⎝

R  

 R+ (α– )βAA –ααA   –_ααA αβ Il

⎞ ⎟ ⎠ = ⎛ ⎜ ⎝

R    R 

  

⎞ ⎟

⎠+ (α– )

⎛ ⎜ ⎝

  

 βA_A –_αA   –__αA αβ(α–)Il

⎞ ⎟

(8)

Note that

α

βA_A –_αA –__αA αβ(α–)Il

–

αβA_A –A –A αβ Il

=

αβA_A –A –A _αβα₍+_α_–)Il

=

_√

βA_   √

βIl

αIl –Il

–Il α(αα+–)Il

√

βA   √

βIl

. ()

The middle matrix in the above expression can be further written as

α – – α+ α(α–)

⊗Il.

Since

α – – _αα₍_α+_–)

, ∀α> ,

the right-hand side of () is also positive semi-deﬁnite. Thus, we have

βA_A –_αA –__αA αβ(α–)Il



α

αβA_A –A –A αβ Il

. ()

Substituting () into () and by the expression ofMHM, we obtain (). The lemma is

proved.
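The two relationships of the lemma can be checked numerically in the simplest setting where every block is a scalar. The sketch below is such a check, not a proof: it assumes $n_1=n_2=l=1$, and the names `build`, `gap_matrix` and the numerical values passed to them are hypothetical choices made for illustration.

```python
def matmul(X, Y):
    """Product of two small dense matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def build(alpha, beta, a, r1, r2):
    """Scalar-block versions of M, Q, H with A_2 = a, R_1 = r1, R_2 = r2."""
    c = (1.0 - alpha) / alpha
    t = 2.0 * alpha ** 2 - 2.0 * alpha + 1.0
    M = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, -beta * a, 1.0]]
    Q = [[r1, 0.0, 0.0],
         [0.0, r2 + (2.0 * alpha - 1.0) * beta * a * a, c * a],
         [0.0, -a, 1.0 / (alpha * beta)]]
    H = [[r1, 0.0, 0.0],
         [0.0, r2 + t / alpha * beta * a * a, c * a],
         [0.0, c * a, 1.0 / (alpha * beta)]]
    return M, Q, H


def gap_matrix(alpha, beta, a, r1, r2):
    """Q^T + Q - M^T H M - (alpha - 1)/(2 alpha) * M^T H M."""
    M, Q, H = build(alpha, beta, a, r1, r2)
    MtHM = matmul(transpose(M), matmul(H, M))
    Qt = transpose(Q)
    s = (alpha - 1.0) / (2.0 * alpha)
    return [[Qt[i][j] + Q[i][j] - (1.0 + s) * MtHM[i][j] for j in range(3)]
            for i in range(3)]
```

With scalar blocks the gap matrix is block diagonal (a 1-by-1 block and a symmetric 2-by-2 block), so its positive semi-definiteness can be read off from the sign of the first entry and from the trace and determinant of the lower block.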

3 Algorithm and convergence results

In this section, we first formally describe the symmetric version of the generalized alternating direction method of multipliers (SGADMM) for $\mathrm{VI}(\mathcal{W},F,\theta)$. Then we prove its global convergence from a contraction perspective and establish, step by step, its worst-case $O(1/k)$ convergence rate in both the ergodic and the non-ergodic senses, where $k$ denotes the iteration counter.

3.1 Algorithm

Algorithm .(SGADMM)

Step . Choose the parametersα∈[, +∞),β> ,Ri∈Rni×ni(i= , ), the tolerance

ε> and the initial iterate(x

,x,λ)∈X×X×Rl. Setk:= .

Step . Generate the new iteratewk+_{= (x}k+

 ,xk+,λk+)by

⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩

xk+

 ∈argminx∈X{θ(x) –x

Aλk+

αβ

Ax+Ax

k

–b

+_x–xkR},

xk+

 ∈argminx∈X{θ(x) –x

Aλk+( α–)β

 Axk++Ax–b +_x–xkR},

λk+=λk–β[αAxk+– ( –α)(Axk–b) +Axk+–b].

(9)

Step . If

maxRxk–Rxk+,Rxk–Rxk+,Axk–Axk+,λk–λk+<ε, ()

then stop and return an approximate solution(xk_+,xk_+,λk+₎_of_VI(_W_,_F,_θ₎_{; else}

setk:=k+ , and goto Step .

Remark . Obviously, the iterative scheme () reduces to the generalized ADMM when

α= , and further reduces to () whenRi=  (i= , ). That is to say, if the parameters

α=  andRi=  (i= , ), then the classical ADMM is recovered. Since the convergence

results of the (proximal) ADMM have been established in the literature [, , ], in the following, we only considerα∈(, +∞).

3.2 Global convergence

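For further analysis, we need to define an auxiliary sequence $\{\hat w^k\}$ as follows:

$$\hat w^k=\begin{pmatrix}\hat x_1^k\\ \hat x_2^k\\ \hat\lambda^k\end{pmatrix}
=\begin{pmatrix}x_1^{k+1}\\ x_2^{k+1}\\ \lambda^k-\alpha\beta(A_1x_1^{k+1}+A_2x_2^k-b)\end{pmatrix}. \tag{19}$$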

Lemma . Let{λk+}and{ˆλk}be the two sequences generated by SGADMM.Then

λk+=λˆk–β Axˆk–Axk

()

and

ˆ

λk–



α –  λˆ

k_–_λk₌_λk_{– (}_α_{– )}_β _A

xˆk+Axk–b

. ()

Proof From the deﬁnition ofλk+, we get

λk+=λk–βαAxˆk– ( –α) Axk–b

+Axˆk–b

=λk–βα Axˆk+Axk–b

+ Axˆk–Axk

=λˆk–β Axˆk–Axk

.

Then () is proved. For (), we have

ˆ

λk–



α –  λˆ

k_–_λk

=λk–αβ Axˆk+Axk–b

+



α– 

αβ Axˆk+Axk–b

=λk– (α– )β Axˆk+Axk–b

.

(10)

Thus, based on () and (), the two sequences{wk_}_and_{{ ˆ}_wk_}_{satisﬁes the following}

relationship:

wk+=wk–M wk–wˆk, ()

whereMis deﬁned in ().

The following lemma shows that the stopping criterion () of SGADMM is reasonable.

Lemma . If Rixki =Rixik+ (i= , ),Axk=Axk+andλk=λk+,then the iteratewˆk= (xˆk

,xˆk,λˆk)produced by SGADMM is a solution ofVI(W,F,θ).

Proof By invoking the optimality condition of the three sub-problems in (), we have the following mixed variational inequality problems: for anyw= (x,x,λ)∈W,

⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩

θ(x) –θ(xˆk) + (x–xˆk){–A[λk–αβ(Axˆk+Axk–b)] +R(xˆk–xk)} ≥,

θ(x) –θ(xˆk) + (x–xˆk){–A[λk– (α– )β(Axˆk+Axˆk–b)]} +R(xˆk–xk)≥,

(λ–λˆk)[αAxˆk– ( –α)(Axk–b) +Axˆk–b– (λk–λk+)/β]≥. Then, adding the above three inequalities and by (), (), we get

θ(x) –θ xˆk+ w–wˆk

⎧ ⎪ ⎨ ⎪ ⎩ ⎛ ⎜ ⎝

–A_λˆk

–A_λˆk

Axˆk+Axˆk–b

⎞ ⎟ ⎠

+

⎛ ⎜ ⎝

R(xˆk–xk)

(α– )βA_(Axˆk–Axk) + ( –α)A(λˆk–λk)/α+R(xˆk–xk) ( –α)(Ax˜k–Axk)/α+ (λk+–λk)/(αβ)

⎞ ⎟ ⎠ ⎫ ⎪ ⎬ ⎪ ⎭≥.

Then by (), we obtain

θ(x) –θ xˆk+ w–wˆk

⎧ ⎪ ⎨ ⎪ ⎩F wˆ

k

+

⎛ ⎜ ⎝

R(xˆk–xk)

(α– )βA_(Ax˜k–Axk) + ( –α)A(λˆk–λk)/α+R(xˆk–xk) –(Axˆk–Axk) + (λˆk–λk)/(αβ)

⎞ ⎟ ⎠ ⎫ ⎪ ⎬ ⎪ ⎭≥.

Then, by () (the deﬁnition ofQ), the above inequality can be rewritten as

θ(x) –θ xˆk+ w–wˆkF wˆk≥ w–wˆkQ wk–wˆk, () for anyw∈W. Therefore, ifRixik=Rixki+(i= , ),Axk=Axk+ andλk=λk+, then by (), we haveλk+=λˆk. Thenλˆk=λk. Thus, we have

(11)

which together with () implies that

θ(x) –θ xˆk+ w–wˆkF wˆk≥, ∀w∈W.

This indicates that the vectorwˆk _{is a solution of}_VI(_W_,_F,_θ_{). This completes the proof.}

Lemma . Let{wk_}_and_{{ ˆ}_wk_}_{be two sequences generated by SGADMM.}_Then,_{for any}

w∈W,we have

w–wˆkQ wk–wˆk≥ 

 w–w

k+

H–w–w k

H

+α–  α w

k_–_wk+

H. ()

Proof Applying the identity

(a–b)H(c–d) = 

 a–d 

H–a–cH

+

 c–b 

H–d–bH

,

with

a=w, b=wˆk, c=wk, d=wk+,

we obtain

w–wˆkH wk–wk+= 

 w–w

k+

H–w–w k

H

+  w

k_–_w_ˆk

H–w

k+_–_w_ˆk

H

.

This together with () and () implies that

w–wˆkQ wk–wˆk=   w–w

k+

H–w–w k

H

+  w

k_–_w_ˆk H–w

k+_–_w_ˆk H

. ()

Now let us deal with the last term in (), which can be written as

wk–wˆk_H–wk+–wˆk_H

=wk–wˆk_H– wk–wˆk– wk–wk+_H

=wk–wˆk_H– wk–wˆk–M wk–wˆk_H (using ())

=  wk–wˆkHM wk–wˆk– wk–wˆkMHM wk–wˆk

= wk–wˆk Q+Q–MHM wk–wˆk

≥α– 

α w

k_–_w_ˆk_M_HM _wk_–_w_ˆk _{(using ())}

=α–  α w

k_–_wk+

H (using ()).

(12)

Theorem . Let{wk_}_and_{{ ˆ}_wk_}_{be two sequences generated by SGADMM.}_Then,_{for any}

w∈W,we have

θ(x) –θ xˆk+ w–wˆkF(w)

≥

 w–w

k+

H–w–w k

H

+α–  α w

k_–_wk+

H. ()

Proof First, combining () and (), we get

θ(x) –θ xˆk+ w–wˆkF wˆk

≥

 w–w

k+

H–w–w k

H

+α–  α w

k_–_wk+

H.

From the monotonicity ofF(·), we have

w–wˆk F(w) –F wˆk≥.

Adding the above two inequalities, we obtain the assertion (). The proof is completed.

With the above theorem in hand, we are ready to establish the global convergence of SGADMM for solvingVI(W,F,θ).

Theorem . Let{wk}be the sequence generated by SGADMM.Ifα> ,Ri+βAiAi

(i= , ),then the corresponding sequence{wk}converges to some w∞,which belongs toW∗.

Proof Settingw=w∗in (), we have

wk–w∗_H–α– 

α w

k_–_wk+

H

≥θ xˆk_–_θ _x∗₊ _w_ˆk_–_w∗_F _w∗₊_wk+_–_w∗

H

≥wk+–w∗_H,

where the second inequality follows fromw∗∈W∗. Thus, we have

wk+–w∗_H≤wk–w∗_H–α– 

α w

k_–_wk+

H. ()

Summing overk= , , . . . ,∞, it yields

∞

k=

wk–wk+_H≤ α

α– w _–_w∗

H.

Byα>  and the positive semi-deﬁnite ofH, the above inequality implies that lim

k→∞

wk–wk+_H= .

Thus, by the deﬁnition ofH, we have

lim

k→∞x

k

–xk+ 

R=_klim_→∞v

k_–_vk+

(13)

where

H=

R+α

_–_α_+

α βAA –ααA –α

α A

 αβIl

,

is positive deﬁnite byR+βAA. From () again, we have

wk+–w∗_H≤w–w∗_H,

which indicates that the sequence{Hwk}is bounded. Thus,{Rxk}∞k=and{Hvk}∞k=are both bounded. Then{vk_}∞

k=is bounded. IfR,{xk}∞k=is bounded; otherwise,AA , that is, A is full-column rank, which together with Ax = (λk –λk+)/(αβ) + ( –

α)(Axk–b)/α– (Axk+–b)/α implies that {xk}∞k=is bounded. In conclusion,{wk}∞k= is bounded.

Then, from () andH, the sequence{vk}is convergent. Suppose it converges tov∞. Letw∞= (x∞_ ,v∞) be a cluster point of{wk_}_and_{_wkj}_{be the corresponding subsequence.} On the other hand, by () and (), we have

lim

k→∞R x

k

–xˆk

= , lim

k→∞ x

k

–xˆk

= 

and

lim

k→∞λ

k_–_λˆk₌ _lim k→∞λ

k_–_λk+₊_β _A

xˆk–Axk

= .

Thus,

lim

k→∞Q w

k_–_w_ˆk_{= .} _()

Then, taking the limit along the subsequence{wkj}_{in () and using (), for any}_w∈W_, we obtain

θ(x) –θ x∞+ w–w∞F w∞≥,

which indicates thatw∞ is a solution ofVI(W,F,θ). Then, sincew∗ in () is arbitrary, we can setw∗=w∞ and conclude that the whole generated sequence{wk_}_{converges by}

Ri+βAi Ai (i= , ). This completes the proof.

3.3 Convergence rate

Now, we are going to prove the worst-case $O(1/t)$ convergence rate of SGADMM in both the ergodic and the non-ergodic senses.

Theorem 3.3 Let $\{w^k\}$ and $\{\hat w^k\}$ be the sequences generated by SGADMM, and set

$$\bar w_t=\frac{1}{t+1}\sum_{k=0}^{t}\hat w^k.$$

Then, for any integer $t\ge 0$, we have $\bar w_t\in\mathcal{W}$, and

$$\theta(\bar x_t)-\theta(x)+(\bar w_t-w)^\top F(w)\le\frac{1}{2(t+1)}\|w-w^0\|_H^2,\quad\forall w\in\mathcal{W}. \tag{30}$$

Proof From () and the convexity of the set $\mathcal{W}$, we have $\bar w_t\in\mathcal{W}$. From (26), we have

$$\theta(x)-\theta(\hat x^k)+(w-\hat w^k)^\top F(w)+\frac{1}{2}\|w-w^k\|_H^2\ge\frac{1}{2}\|w-w^{k+1}\|_H^2,\quad\forall w\in\mathcal{W}.$$

Summing the above inequality over $k=0,1,\ldots,t$, we get

$$(t+1)\theta(x)-\sum_{k=0}^{t}\theta(\hat x^k)+\Bigl[(t+1)w-\sum_{k=0}^{t}\hat w^k\Bigr]^\top F(w)+\frac{1}{2}\|w-w^0\|_H^2\ge 0,\quad\forall w\in\mathcal{W}.$$

By the definition of $\bar w_t$ and the convexity of $\theta(\cdot)$, the assertion (30) follows immediately from the above inequality. This completes the proof.
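The ergodic iterate $\bar w_t$ in the theorem above is simply the arithmetic mean of the auxiliary iterates $\hat w^0,\ldots,\hat w^t$. A minimal sketch of that averaging step is given below; the function name `ergodic_average` and the list-of-lists representation of the iterates are assumptions made for illustration only.

```python
def ergodic_average(iterates):
    """Return (1/(t+1)) * sum_{k=0}^{t} w_hat^k.

    `iterates` is a non-empty list of equal-length vectors (lists of floats),
    one per iteration k = 0, ..., t.
    """
    count = len(iterates)
    dim = len(iterates[0])
    return [sum(w[i] for w in iterates) / count for i in range(dim)]


# Example with two hypothetical iterates in R^2.
print(ergodic_average([[1.0, 2.0], [3.0, 4.0]]))
```

In a long run one would normally update the mean incrementally instead of storing every iterate; the version above favors clarity over memory use.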

The proofs of the next two lemmas follow those of Lemmas . and . in []. For completeness, we give the detailed proofs.

Lemma . Let{wk}be the sequence generated by SGADMM.Then we have

wk–wk+H wk–wk+– wk+–wk+

≥α– 

α w

k_–_wk+_– _wk+_–_wk+

H. ()

Proof Settingw=wˆk+in (), we have

θ xˆk+–θ xˆk+ wˆk+–wˆkF wˆk≥ wˆk+–wˆkQwk–wˆk. Similarly settingw=wˆk_{in () for}_k_:=_k_{+ , we get}

θ xˆk–θ xˆk++ wˆk–wˆk+F wˆk+≥ wˆk–wˆk+Q wk+–wˆk+.

Then, adding the above two inequalities and using the monotonicity of the mappingF(·), we get

ˆ

wk–wˆk+Q wk–wˆk– wk+–wˆk+≥. ()

By (), we have

wk–wk+Q wk–wˆk– wk+–wˆk+

= wk–wˆk– wk+–wˆk++ wˆk–wˆk+Q wk–wˆk– wk+–wˆk+

= wk–wˆk– wk+–wˆk+_Q+ wˆk–wˆk+Q wk–wˆk– wk+–wˆk+

(15)

Using (), () andQ=HMon both sides of the above inequality, we get

wk–wk+H wk–wk+– wk+–wk+

= wk–wk+QM– wk–wk+– wk+–wk+

= wk–wk+Q wk–wˆk– wk+–wˆk+

≥ wk–wˆk– wk+–wˆk+_Q

= wk–wˆk– wk+–wˆk+Q wk–wˆk– wk+–wˆk+

= wk–wk+– wk+–wk+M–QM– wk–wk+– wk+–wk+

≥α– 

α w

k_–_wk+_– _wk+_–_wk+

×M–MHMM– wk–wk+– wk+–wk+

=α–  α w

k_–_wk+_– _wk+_–_wk+

H.

Then we get the assertion (). The proof is completed.

Lemma . Let{wk_}_{be the sequence generated by SGADMM.}_{Then we have}

wk+–wk+_H≤wk–wk+_H–α–  α w

k_–_wk+_– _wk+_–_wk+

H. ()

Proof Settinga:= (wk_–_wk+_{) and}_b_{:= (w}k+_–_wk+_{) in the identity}

a_H–b_H= aH(a–b) –a–b_H,

we can derive

wk–wk+_H–wk+–wk+_H

=  wk–wk+H wk–wk+– wk+–wk+– wk–wk+– wk+–wk+_H

≥α– 

α w

k_–_wk+_– _wk+_–_wk+

H– w

k_–_wk+_– _wk+_–_wk+

H

=α–  α w

k_–_wk+_– _wk+_–_wk+

H,

which completes the proof of the lemma.

Based on Lemma ., now we establish the worst-case O(/t) convergence rate of SGADMM in a non-ergodic sense.

Theorem . Let{wk_}_{be the sequence generated by SGADMM.}_Then,_{for any w}∗_∈_W∗

and integer t≥,we have

wt–wt+_H≤ α

(t+ )(α– )w _–_w∗

(16)

Proof By (), we get

α– 

α

t

k=

wk–wk+_H≤w–w∗_H.

This and () imply that

(t+ )(α– )

α w

t_–_wt+

H≤w

_–_w∗

H.

Therefore, the assertion of this theorem comes from the above inequality immediately.

The proof is completed.

Remark . From (), we see that the largerαis, the smaller_αα_–, which controls the up-per bounds ofwt_–_wt+

H. Therefore, it seems that larger values ofαare more beneﬁcial

for speeding up the convergence of SGADMM.
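To make the dependence on $\alpha$ concrete, the following sketch tabulates the factor $\frac{2\alpha}{(t+1)(\alpha-1)}$ that multiplies $\|w^0-w^*\|_H^2$ in the non-ergodic bound. The function name and the sample values are illustrative assumptions. Note that $H$, and hence $\|w^0-w^*\|_H^2$, also depends on $\alpha$, so the factor alone does not determine the full bound.

```python
def nonergodic_bound_factor(alpha, t):
    """Factor 2*alpha / ((t + 1) * (alpha - 1)) from the non-ergodic bound."""
    if alpha <= 1.0:
        raise ValueError("the bound is stated for alpha > 1")
    return 2.0 * alpha / ((t + 1) * (alpha - 1.0))


# Sample relaxation factors at a fixed iteration count t = 99.
for alpha in (1.1, 1.5, 2.0, 5.0, 50.0):
    print(alpha, nonergodic_bound_factor(alpha, t=99))
```

The factor decreases monotonically in $\alpha$ but levels off at $2/(t+1)$ as $\alpha\to+\infty$, so very large values of $\alpha$ yield diminishing improvement in this particular estimate.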

4 Numerical experiments

In this section, we present some numerical experiments to verify the eﬃciency of SGADMM for solving compressed sensing. Those numerical experiments are performed in Matlab Ra on a ThinkPad computer equipped with Windows XP,  MHz and  GB of memory.

Compressed sensing (CS) aims to recover a sparse signal $\bar x\in\mathbb{R}^n$ from an underdetermined linear system $y=A\bar x$, where $A\in\mathbb{R}^{m\times n}$ ($m\ll n$), and can be formulated as problem (2).

Obviously, problem (2) is equivalent to the following two models:

(a) Model 1: Problem (3).

(b) Model 2:

$$\min\ \mu\|x_1\|_1+\frac{1}{2}\|Ax_2-y\|^2\quad\text{s.t.}\ x_1-x_2=0,\quad x_1\in\mathbb{R}^n,\ x_2\in\mathbb{R}^n. \tag{35}$$

4.1 The iterative schemes for (3) and (35)

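Since (3) and (35) are both concrete instances of (1), SGADMM is applicable to them. Below, we elaborate on how to derive the closed-form solutions of the sub-problems generated by SGADMM.

For problem (3), the first two sub-problems generated by SGADMM are as follows.

• With the given $x_2^k$ and $\lambda^k$, the $x_1$-sub-problem in (17) is (here $R_1=0$)

$$x_1^{k+1}=\arg\min_{x_1\in\mathbb{R}^m}\Bigl\{\frac{1}{2}\|x_1\|^2+x_1^\top\lambda^k+\frac{\alpha\beta}{2}\|x_1-Ax_2^k+y\|^2\Bigr\},$$

which has the following closed-form solution:

$$x_1^{k+1}=\frac{1}{1+\alpha\beta}\bigl[\alpha\beta(Ax_2^k-y)-\lambda^k\bigr].$$

• With the updated $x_1^{k+1}$, the $x_2$-sub-problem in (17) is (here $R_2=\tau I_n-(2\alpha-1)\beta A^\top A$ with $\tau\ge(2\alpha-1)\beta\|A^\top A\|$)

$$x_2^{k+1}=\arg\min_{x_2\in\mathbb{R}^n}\Bigl\{\mu\|x_2\|_1-x_2^\top A^\top\lambda^k+\frac{(2\alpha-1)\beta}{2}\|-x_1^{k+1}+Ax_2-y\|^2+\frac{1}{2}\|x_2-x_2^k\|_{R_2}^2\Bigr\},$$

and its closed-form solution is given by

$$x_2^{k+1}=\mathrm{shrink}_{\mu/\tau}\Bigl((2\alpha-1)\beta A^\top(x_1^{k+1}+y)/\tau+\bigl(\tau I_n-(2\alpha-1)\beta A^\top A\bigr)x_2^k/\tau+A^\top\lambda^k/\tau\Bigr),$$

where, for any $c>0$, $\mathrm{shrink}_c(\cdot)$ is defined componentwise as

$$\mathrm{shrink}_c(g):=g-\min\{c,|g|\}\frac{g}{|g|},\quad\forall g\in\mathbb{R}^n,$$

and $(g/|g|)_i$ should be taken as 0 if $|g|_i=0$.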
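The two closed-form updates for problem (3) can be tried out on the smallest possible instance, where $A$ is a scalar $a$ and $m=n=1$. The sketch below is an illustration under stated assumptions, not the authors' Matlab code: the function names, the starting point $x_2^0=\lambda^0=0$, the default values $\alpha=1.5$, $\beta=0.5$, and the safety factor 1.01 in $\tau$ are all hypothetical choices. For the scalar problem the minimizer of $\mu|x|+\frac{1}{2}(ax-y)^2$ is known in closed form, which gives a reference value.

```python
def shrink(g, c):
    """Componentwise soft-thresholding: g_i - min(c, |g_i|) * sign(g_i)."""
    out = []
    for gi in g:
        if gi > c:
            out.append(gi - c)
        elif gi < -c:
            out.append(gi + c)
        else:
            out.append(0.0)
    return out


def sgadmm_scalar(a, y, mu, alpha=1.5, beta=0.5, iters=200):
    """SGADMM on min 0.5*x1**2 + mu*|x2| s.t. -x1 + a*x2 = y (all scalars)."""
    c = (2.0 * alpha - 1.0) * beta       # penalty weight of the x2-step
    tau = 1.01 * c * a * a               # assumed proximal parameter, tau >= c*a^2
    x1, x2, lam = 0.0, 0.0, 0.0          # assumed starting point
    for _ in range(iters):
        # x1-step: closed form with R_1 = 0.
        x1 = (alpha * beta * (a * x2 - y) - lam) / (1.0 + alpha * beta)
        # x2-step: soft-thresholding with R_2 = tau - c*a^2.
        g = (c * a * (x1 + y) + (tau - c * a * a) * x2 + a * lam) / tau
        x2_new = shrink([g], mu / tau)[0]
        # Multiplier step with A_1 = -1, A_2 = a, b = y.
        lam = lam - beta * (-alpha * x1 - (1.0 - alpha) * (a * x2 - y)
                            + a * x2_new - y)
        x2 = x2_new
    return x1, x2, lam


# For a = 2, y = 3, mu = 1 the minimizer of mu*|x| + 0.5*(a*x - y)**2
# is x = (a*y - mu) / a**2 = 1.25.
print(sgadmm_scalar(2.0, 3.0, 1.0))
```

In the vector case the same three lines apply with $a$ replaced by the matrix $A$ and the scalar products by matrix-vector products; the scalar version is used here only because it can be verified by hand.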

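Similarly, for problem (35), the first two sub-problems generated by SGADMM are as follows.

• With the given $x_2^k$ and $\lambda^k$, the $x_1$-sub-problem in (17) is (here $R_1=0$)

$$x_1^{k+1}=\arg\min_{x_1\in\mathbb{R}^n}\Bigl\{\mu\|x_1\|_1+\frac{\alpha\beta}{2}\Bigl\|x_1-\Bigl(x_2^k+\frac{1}{\alpha\beta}\lambda^k\Bigr)\Bigr\|^2\Bigr\},$$

$$x_1^{k+1}=\mathrm{shrink}_{\mu/(\alpha\beta)}\Bigl(x_2^k+\frac{1}{\alpha\beta}\lambda^k\Bigr).$$

• With the updated $x_1^{k+1}$, the $x_2$-sub-problem in (17) is (here $R_2=\tau I_n-A^\top A$ with $\tau\ge\|A^\top A\|$)

$$x_2^{k+1}=\arg\min_{x_2\in\mathbb{R}^n}\Bigl\{\frac{1}{2}\|Ax_2-y\|^2+x_2^\top\lambda^k+\frac{(2\alpha-1)\beta}{2}\|x_2-x_1^{k+1}\|^2+\frac{1}{2}\|x_2-x_2^k\|_{R_2}^2\Bigr\},$$

$$x_2^{k+1}=\frac{1}{\tau+(2\alpha-1)\beta}\bigl[A^\top y-\lambda^k+(2\alpha-1)\beta x_1^{k+1}+(\tau I_n-A^\top A)x_2^k\bigr].$$

Obviously, the above two iterative schemes both need to compute $A^\top A$ and $A^\top y$, which is quite time-consuming if $n$ is large. However, these two terms are invariant during the iteration process, so we need only compute them once before all iterations.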


β=mean(abs(y))/(α– ) in our algorithm. As for the parameterα, we have pointed out in Remark . that larger values ofα may be beneficial for our algorithm. Here, we use () to do a little experiment to test this. We choose different values ofαin the interval [, ]. Specifically, we chooseα∈ {., ., . . . , }. Other data about this experiment are as follows: the proximal parameterτ is set asτ = .(α– )βAA; the observed signal yis set asy=Ax+ .×randn(m, ) in Matlab; the sensing matrixAand the original signalxare generated by

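$$\bar A=\texttt{randn}(m,n),\quad [Q,R]=\texttt{qr}(\bar A^\top,0),\quad A=Q^\top,$$

and

$$x=\texttt{zeros}(n,1);\quad p=\texttt{randperm}(n);\quad x(p(1:k))=\texttt{randn}(k,1).$$

Then the observed signal $y$ is further set as $(R^\top)^{-1}y$. The initial points are set as $x_2^0=A^\top y$, $\lambda^0=Ax_2^0$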

. In addition, we set the regularization parameterμ= ., and the dimensions of the problem are set asn= ,,m= ,k= , wherekdenotes the number of the non-zeros in the original signalx. To evaluate the quality of the recovered signal, let us deﬁne the quantity ‘RelErr’ as follows:

$$\mathrm{RelErr}=\frac{\|\tilde x-x\|}{\|x\|},$$

where $\tilde x$ denotes the recovered signal. The stopping criterion is

fk–fk–

fk–

< –,

where $f_k$ denotes the function value of (2) at the iterate $x^k$.
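The two quantities used to monitor a run, the relative error and the relative change of the objective value, can be sketched as follows. The function names are illustrative, and the tolerance is left as a parameter `tol` because its value is not legible in this copy of the text; the Euclidean norm is assumed, in line with the convention $\|\cdot\|=\|\cdot\|_2$ of Section 2.

```python
import math


def rel_err(x_rec, x_true):
    """Relative error ||x_rec - x_true|| / ||x_true|| in the Euclidean norm."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_rec, x_true)))
    den = math.sqrt(sum(b * b for b in x_true))
    return num / den


def should_stop(f_curr, f_prev, tol):
    """True when the relative change |f_k - f_{k-1}| / |f_{k-1}| is below tol."""
    return abs(f_curr - f_prev) / abs(f_prev) < tol


# Hypothetical values, for illustration only.
print(rel_err([0.9, 0.0, 2.1], [1.0, 0.0, 2.0]))
print(should_stop(10.00001, 10.0, tol=1e-5))
```

Both helpers assume a nonzero reference value ($x\ne 0$ and $f_{k-1}\ne 0$); a production implementation would guard against division by zero.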

4.2 Numerical results

The numerical results are shown graphically in Figure 1. Clearly, the results in Figure 1 indicate that Remark 3.2 is reasonable: both the CPU time and the number of iterations decrease as $\alpha$ increases. Then, in the following, we set α= ., which is a moderate choice for $\alpha$.

Now, let us graphically show the recovered results of SGADMM for () and (). The proximal parameterτis set asτ= .(α– )βAAfor (), andτ= .AAfor (). The initial points are set asx_=Ay,λ₌_Ax

for (), andx=Ay,λ=xfor (). Other parameters are set the same as above. Figure  reports the numerical results of SGADMM for () and ().

The bottom two subplots in Figure 2 indicate that our new method SGADMM can be used to solve (3) and (35).

Figure 1 Sensitivity test on the parameter α.

Figure 2 Numerical results of SGADMM for (3) and (35). The top: the original signal; the second: the noisy measurement; the bottom two: recovered signal.

The numerical results are listed in Table 1, where 'Time' denotes the CPU time (in seconds), 'Iter' denotes the number of iterations required for the whole recovery process, and $m=\texttt{floor}(\gamma n)$, $k=\texttt{floor}(\sigma m)$. Each reported value is the average over ten runs for the given combination of $\gamma$ and $\sigma$.

4.3 Discussion

Table 1 Comparison of SGADMM1, SGADMM2 and ADMM

                      SGADMM1                   SGADMM2                   ADMM
n      γ    σ     Time    Iter   RelErr    Time    Iter   RelErr    Time    Iter   RelErr
1,000  0.3  0.2   0.6911  92.4   0.0387    0.9578  266.0  0.0394    0.9812  264.0  0.0393
       0.2  0.2   0.6661  118.6  0.0825    1.3915  421.8  0.0915    1.3603  419.6  0.0915
       0.2  0.1   0.5008  85.3   0.0609    0.4758  139.0  0.0579    0.5008  138.0  0.0539
2,000  0.3  0.2   2.1965  90.0   0.0437    3.6535  267.7  0.0447    3.5412  265.6  0.0447
       0.2  0.2   2.2339  109.6  0.0785    5.2182  431.4  0.0874    5.1543  429.0  0.0874
       0.2  0.1   1.5428  79.9   0.0534    1.7893  142.8  0.0467    1.7613  140.8  0.0513

better than the other two methods. In particular, the number of iterations of SGADMM is at most about two-thirds of that of the other two methods. This experiment also indicates that model (35) is an effective model for compressed sensing, and sometimes it is more efficient than model (3), though they are equivalent in theory. In conclusion, by choosing a suitable relaxation factor $\alpha\in[1,+\infty)$, SGADMM may be more efficient than the classical ADMM.

5 Conclusions

In this paper, we have proposed a symmetric version of the generalized ADMM (SGADMM), which generalizes the feasible set of the relaxation factor $\alpha$ from $(1,2)$ to $[1,+\infty)$. Under the same conditions, we have proved the convergence results of the new method. Some numerical results illustrate that it may perform better than the classical ADMM. In the future, we shall study SGADMM with $\alpha\in(0,1)$ to complete the theory.

Acknowledgements

The authors gratefully acknowledge the valuable comments of the anonymous reviewers. This work is supported by the National Natural Science Foundation of China (Nos. 11601475, 71532015), the foundation of First Class Discipline of Zhejiang-A (Zhejiang University of Finance and Economics - Statistics).

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

The ﬁrst author has proved the convergence results; the second author has accomplished the numerical experiment; and the third author has proposed the motivation of the manuscript. All authors read and approved the ﬁnal manuscript.

Author details

^{1}School of Economics and Management, Tongji University, Shanghai, 200092, P.R. China. ^{2}School of Data Sciences, Zhejiang University of Finance and Economics, Zhejiang, 310018, P.R. China. ^{3}School of Mathematics and Statistics, Zaozhuang University, Shandong, 277160, P.R. China. ^{4}School of Management, Qufu Normal University, Shandong, 276826, P.R. China.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional aﬃliations.

Received: 22 January 2017 Accepted: 10 May 2017

References

1. Beck, A, Teboulle, M: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2, 183-202 (2009)

2. Yang, JF, Zhang, Y, Yin, WT: A fast alternating direction method for TVL1-L2 signal reconstruction from partial Fourier data. IEEE J. Sel. Top. Signal Process. 4, 288-297 (2010)

3. Boyd, S, Parikh, N, Chu, E, Peleato, B, Eckstein, J: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3, 1-122 (2011)


5. Gabay, D, Mercier, B: A dual algorithm for the solution of nonlinear variational problems via finite-element approximations. Comput. Math. Appl. 2, 17-40 (1976)

6. Lions, PL, Mercier, B: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16, 964-979 (1979)

7. Gabay, D: Applications of the method of multipliers to variational inequalities. In: Fortin, M, Glowinski, R (eds.) Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems, pp. 299-331. North-Holland, Amsterdam (1988)

8. Eckstein, J, Bertsekas, DP: On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55, 293-318 (1992)

9. Esser, E: Applications of Lagrangian-based alternating direction methods and connections to split Bregman. TR. 09-31, CAM, UCLA (2009)

10. He, BS, Liao, LZ, Han, D, Yang, H: A new inexact alternating directions method for monotone variational inequalities. Math. Program. 92, 103-118 (2002)

11. Golshtein, EG, Tretyakov, NV: Modified Lagrangian in convex programming and their generalizations. Math. Program. Stud. 10, 86-97 (1979)

12. He, BS, Ma, F, Yuan, XM: Convergence study on the symmetric version of ADMM with larger step sizes. SIAM J. Imaging Sci. 9(3), 1467-1501 (2016)

13. Chen, CH, Ma, SQ, Yuan, JF: A general inertial proximal point algorithm for mixed variational inequality problem. SIAM J. Optim. 25(4), 2120-2142 (2015)

14. Fang, EX, He, BS, Liu, H, Yuan, XM: Generalized alternating direction method of multipliers: new theoretical insights and applications. Math. Program. Comput. 7(2), 149-187 (2015)

15. Sun, M, Liu, J: A proximal Peaceman-Rachford splitting method for compressive sensing. J. Appl. Math. Comput. 50, 349-363 (2016)

16. Chen, CH, Chan, RH, Ma, SQ, Yuan, JF: Inertial proximal ADMM for linearly constrained separable convex optimization. SIAM J. Imaging Sci. 8(4), 2239-2267 (2015)

17. Sun, HC, Sun, M, Zhou, HC: A proximal splitting method for separable convex programming and its application to compressive sensing. J. Nonlinear Sci. Appl. 9, 392-403 (2016)

18. Sun, M, Wang, YJ, Liu, J: Generalized Peaceman-Rachford splitting method for multiple-block separable convex programming with applications to robust PCA. Calcolo 54(1), 77-94 (2017)

19. Han, DR, He, HJ, Yang, H, Yuan, XM: A customized Douglas-Rachford splitting algorithm for separable convex minimization with linear constraints. Numer. Math. 127, 167-200 (2014)

20. He, HJ, Cai, XJ, Han, DR: A fast splitting method tailored for Dantzig selectors. Comput. Optim. Appl. 62, 347-372 (2015)

21. He, HJ, Han, DR: A distributed Douglas-Rachford splitting method for multi-block convex minimization problems. Adv. Comput. Math. 42, 27-53 (2016)

22. He, BS, Ma, F, Yuan, XM: Linearized alternating direction method of multipliers via positive-indefinite proximal regularization for convex programming. Manuscript (2016)

23. Sun, M, Liu, J: The convergence rate of the proximal alternating direction method of multipliers with indefinite proximal regularization. J. Inequal. Appl. 2017, 19 (2017)

24. He, BS, Liu, H, Wang, ZR, Yuan, XM: A strictly contractive Peaceman-Rachford splitting method for convex programming. SIAM J. Optim. 24(3), 1011-1040 (2014)

25. Hou, LS, He, HJ, Yang, JF: A partially parallel splitting method for multiple-block separable convex programming with applications to robust PCA. Comput. Optim. Appl. 63(1), 273-303 (2016)

26. Wang, K, Desai, J, He, HJ: A proximal partially-parallel splitting method for separable convex programs. Optim. Methods Softw. 32(1), 39-68 (2017)

27. Xu, YY: Accelerated first-order primal-dual proximal methods for linearly constrained composite convex programming. Manuscript (2016)

28. Guo, K, Han, DR, Wang, DZW, Wu, TT: Convergence of ADMM for multi-block nonconvex separable optimization models. Front. Math. China (2017). doi:10.1007/s11464-017-0631-6

29. Sun, HC, Sun, M, Zhou, HC: A new generalized alternating direction method of multiplier for separable convex programming and its applications to image deblurring with wavelets. ICIC Express Lett. 10(2), 271-277 (2016)

30. Rockafellar, RT: Convex Analysis. Princeton University Press, Princeton (1970)

31. Fukushima, M: Fundamentals of Nonlinear Optimization. Asakura, Tokyo (2001)

32. Bonnans, JF, Shapiro, A: Perturbation Analysis of Optimization Problems. Springer, New York (2000)

33. He, BS, Yuan, XM: On the O(1/n) convergence rate of the Douglas-Rachford alternating direction method. SIAM J. Numer. Anal. 50(2), 700-709 (2012)

34. He, BS, Yuan, XM: On non-ergodic convergence rate of Douglas-Rachford alternating direction method of multipliers. Numer. Math. 130, 567-577 (2015)