RESEARCH    Open Access

Analysis of the equivalence relationship between \(l_0\)-minimization and \(l_p\)-minimization

Changlong Wang and Jigen Peng*

*Correspondence: jgpengxjtu@126.com. School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China

Abstract

In signal processing theory, \(l_0\)-minimization is an important mathematical model. Unfortunately, \(l_0\)-minimization is actually NP-hard. The most widely studied approach to this NP-hard problem is based on solving \(l_p\)-minimization (\(0<p\le 1\)). In this paper, we present an analytic expression of \(p^*(A,b)\), formulated in terms of the dimension of the matrix \(A\in\mathbb{R}^{m\times n}\), the eigenvalues of the matrix \(A^TA\), and the vector \(b\in\mathbb{R}^m\), such that every \(k\)-sparse vector \(x\in\mathbb{R}^n\) can be exactly recovered via \(l_p\)-minimization whenever \(0<p<p^*(A,b)\); that is, \(l_p\)-minimization is equivalent to \(l_0\)-minimization whenever \(0<p<p^*(A,b)\). The superiority of our result is that the analytic expression and each of its parts can be easily calculated. Finally, we give two examples to confirm the validity of our conclusions.

MSC: 60E05; 94A12; 94A20

Keywords: sparse recovery; null space constant; null space property; \(l_0\)-minimization; \(l_p\)-minimization

1 Introduction

In sparse information theory, a central goal is to find the sparsest solutions of underdetermined linear systems arising in problems such as visual coding [1], matrix completion [2], source localization [3], and face recognition [4]. All these problems are popularly modeled by the following \(l_0\)-minimization:

\[
\min_{x\in\mathbb{R}^n}\ \|x\|_0\quad\text{s.t.}\quad Ax=b,\tag{1}
\]

where \(A\in\mathbb{R}^{m\times n}\) is an underdetermined matrix (i.e., \(m<n\)), and \(\|x\|_0\) is the number of nonzero elements of \(x\), which is commonly called the \(l_0\)-norm although it is not a true vector norm. If \(x\in\mathbb{R}^n\) is the unique solution of \(l_0\)-minimization, we also say that \(x\) can be recovered by \(l_0\)-minimization; we use these two statements interchangeably in this paper.

Since \(A\) has more columns than rows, the underdetermined linear system \(Ax=b\) admits an infinite number of solutions. To find the sparsest one, much excellent theoretical work (see, e.g., [5, 6], and [7]) has been devoted to \(l_0\)-minimization. However, Natarajan [8] proved that \(l_0\)-minimization is NP-hard. Furthermore, it is combinatorially and computationally intractable to solve \(l_0\)-minimization directly because of its discrete and discontinuous nature. Therefore, many works have put forward alternative strategies to get the sparsest solution (see, e.g., [5, 9-14], and [15]). Among these methods, the most popular one is \(l_p\)-minimization with \(0<p\le 1\), introduced by Gribonval and Nielsen [16]:

\[
\min_{x\in\mathbb{R}^n}\ \|x\|_p^p\quad\text{s.t.}\quad Ax=b,\tag{2}
\]

where \(\|x\|_p^p=\sum_{i=1}^{n}|x_i|^p\). In the literature, \(\|x\|_p\) is still called the \(p\)-norm of \(x\), though it is only a quasi-norm when \(0<p<1\) (because in this case it violates the triangle inequality). Due to the fact that \(\|x\|_0=\lim_{p\to 0}\|x\|_p^p\), \(l_0\)-minimization and \(l_p\)-minimization are collectively called \(l_p\)-minimization with \(0\le p\le 1\) in this paper.
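As a quick illustration of the limit \(\|x\|_0=\lim_{p\to 0}\|x\|_p^p\), the following small sketch (our own, not part of the paper) evaluates \(\|x\|_p^p\) for a fixed vector and decreasing \(p\):

```python
# Illustration (not from the paper): sum_i |x_i|^p approaches the number
# of nonzero entries ||x||_0 as p -> 0.
import numpy as np

x = np.array([0.5, 0.0, -2.0, 0.0, 1e-3])      # ||x||_0 = 3

def lp_pow(x, p):
    """Return ||x||_p^p = sum of |x_i|^p over the nonzero entries of x."""
    nz = np.abs(x[x != 0])
    return float(np.sum(nz ** p))

for p in (1.0, 0.5, 0.1, 0.01):
    print(f"p = {p:<5}: ||x||_p^p = {lp_pow(x, p):.4f}")
print("||x||_0 =", np.count_nonzero(x))        # prints 3
```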

However, to get the sparsest solution of \(Ax=b\) via \(l_p\)-minimization, we need certain conditions on \(A\) and/or \(b\), for example, the well-known restricted isometry property (RIP) of \(A\). A matrix \(A\) is said to have the restricted isometry property of order \(k\) with restricted isometry constant \(\delta_k\in(0,1)\) if \(\delta_k\) is the smallest constant such that

\[
(1-\delta_k)\|x\|_2^2\le\|Ax\|_2^2\le(1+\delta_k)\|x\|_2^2\tag{3}
\]

for all \(k\)-sparse vectors \(x\), where a vector \(x\) is said to be \(k\)-sparse if \(\|x\|_0\le k\).
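For intuition, \(\delta_k\) can be computed by brute force for very small matrices by enumerating all supports of size \(k\); the exponential cost of this enumeration is consistent with the NP-hardness of computing \(\delta_{2k}\) noted below. This is our own illustrative sketch, not a method from the paper:

```python
# Brute-force RIP constant of a small matrix A (ours; exponential in k).
import itertools
import numpy as np

def delta_k(A, k):
    """Smallest delta with (1-delta)||x||^2 <= ||Ax||^2 <= (1+delta)||x||^2
    for all k-sparse x, found by scanning the Gram spectra of all supports."""
    n = A.shape[1]
    delta = 0.0
    for S in itertools.combinations(range(n), k):
        eigs = np.linalg.eigvalsh(A[:, S].T @ A[:, S])
        delta = max(delta, eigs[-1] - 1.0, 1.0 - eigs[0])
    return delta
```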

There exist many sufficient conditions for exact recovery by \(l_1\)-minimization, such as \(\delta_{3k}+3\delta_{4k}<2\) in [10], \(\delta_{2k}<\sqrt{2}-1\) in [9], and \(\delta_{2k}<2(3-\sqrt{2})/7\) in [11]. Cai and Zhang [17] showed that for any given \(t\ge\frac{4}{3}\), the condition \(\delta_{tk}<\sqrt{\frac{t-1}{t}}\) guarantees recovery of every \(k\)-sparse vector by \(l_1\)-minimization. From the definition of the \(p\)-norm it seems more natural to consider \(l_p\)-minimization with \(0<p<1\) instead of \(l_0\)-minimization. Foucart [11] showed that the condition \(\delta_{2k}<0.4531\) guarantees exact \(k\)-sparse recovery via \(l_p\)-minimization for any \(0<p<1\). Chartrand [18] proved that if \(\delta_{2k+1}<1\), then a \(k\)-sparse vector can be recovered by \(l_p\)-minimization for some \(p>0\) small enough. However, it should be pointed out that the problem of calculating \(\delta_{2k}\) for a given matrix \(A\) is still NP-hard.

Recently, Peng, Yue, and Li [7] proved that there exists a constant \(p(A,b)>0\) such that every solution of \(l_p\)-minimization is also a solution of \(l_0\)-minimization whenever \(0<p<p(A,b)\). This result builds a bridge between \(l_p\)-minimization and \(l_0\)-minimization, and, importantly, the conclusion is not limited by the structure of the matrix \(A\). However, the paper [7] does not give an analytic expression of \(p(A,b)\), so the choice of \(p\) in \(l_p\)-minimization remains difficult.

As already mentioned, it is NP-hard to calculate \(\delta_{2k}\) for a given matrix \(A\in\mathbb{R}^{m\times n}\), and it is likewise hard to calculate this \(p\). On the other hand, recoverability of every \(k\)-sparse vector by \(l_0\)-minimization is just a necessary condition for the existence of such a \(\delta_{2k}\), and therefore results based on \(\delta_{2k}\) are limited in practical applications.

We have to emphasize that although \(l_p\)-minimization is also difficult because of its nonconvexity and nonsmoothness, many algorithms have been designed to solve \(l_p\)-minimization; see, e.g., [11, 19], and [20]. Moreover, a reasonable range of \(p\) is very important in these algorithms. In this paper, we devote ourselves to giving a complete answer to this problem.

Our paper is organized as follows. In Section 2, we present some preliminaries on the \(l_p\)-null space property, which plays a core role in the proof of our main theorem. In Section 3, we present an analytic expression of \(p^*(A,b)\) such that every \(k\)-sparse vector \(x\) can be recovered via \(l_p\)-minimization with \(0<p<p^*(A,b)\) as long as \(x\) can be recovered via \(l_0\)-minimization. Finally, we summarize our findings in the last section.

For convenience, for \(x\in\mathbb{R}^n\), its support is defined by \(\mathrm{support}(x)=\{i:x_i\ne 0\}\), and the cardinality of a set \(\Omega\) is denoted by \(|\Omega|\). Let \(\mathrm{Ker}(A)=\{x\in\mathbb{R}^n:Ax=0\}\) be the null space of a matrix \(A\), and denote by \(\lambda_{\min+}(A)\) the minimal nonzero absolute-value eigenvalue of \(A\) and by \(\lambda_{\max}(A)\) the maximal one. We denote by \(x_\Omega\) the vector that is equal to \(x\) on the index set \(\Omega\) and zero elsewhere, and by \(A_\Omega\) the submatrix whose columns are the columns of \(A\) indexed by \(\Omega\). Let \(\Omega^c\) be the complement of \(\Omega\).
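Since \(\lambda_{\min+}(A^TA)\) and \(\lambda_{\max}(A^TA)\) recur throughout the paper, a small numerical helper is convenient. The sketch below is our own, and the tolerance used to decide which eigenvalues count as nonzero is our choice:

```python
# Helper (ours): extreme eigenvalues of A^T A used throughout the paper.
# lambda_min_plus is the smallest *nonzero* eigenvalue; `tol` is our choice.
import numpy as np

def lambda_min_plus_max(A, tol=1e-10):
    eigs = np.linalg.eigvalsh(A.T @ A)          # ascending; A^T A is PSD
    nonzero = eigs[eigs > tol * max(eigs[-1], 1.0)]
    return nonzero[0], eigs[-1]
```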

2 Preliminaries

To investigate conditions under which both \(l_0\)-minimization and \(l_p\)-minimization have the same unique solution, it is convenient to work with a sufficient and necessary condition on the solutions of \(l_0\)-minimization and \(l_p\)-minimization. Therefore, in this preliminary section, we focus on introducing such a condition, namely the \(l_p\)-null space property.

Definition 1 ([16]) A matrix \(A\in\mathbb{R}^{m\times n}\) with \(m\le n\) is said to satisfy the \(l_p\)-null space property of order \(k\) if

\[
\|x_\Omega\|_p<\|x_{\Omega^c}\|_p\tag{4}
\]

for every \(x\in\mathrm{Ker}(A)\setminus\{0\}\) and every set \(\Omega\subset\{1,2,\ldots,n\}\) with \(|\Omega|\le k\).

In the literature, the null space property usually means the \(l_1\)-null space property. We now indicate the relation between the \(l_p\)-null space property and exact recovery via \(l_p\)-minimization with \(0\le p\le 1\).

Theorem 1 ([16, 21]) Given a matrix \(A\in\mathbb{R}^{m\times n}\) with \(m\le n\), every \(k\)-sparse vector \(x\in\mathbb{R}^n\) can be recovered via \(l_p\)-minimization with \(0\le p\le 1\) if and only if \(A\) satisfies the \(l_p\)-null space property of order \(k\).

Theorem 1 provides a sufficient and necessary condition for judging whether a vector can be recovered by \(l_p\)-minimization with \(0\le p\le 1\), which is the most important advantage of the \(l_p\)-null space property. However, the \(l_p\)-null space property is difficult to check for a given matrix. To reach our goal, we recall the concept of the null space constant (NSC), which is closely related to the \(l_p\)-null space property and offers tremendous help in illustrating the performance of \(l_0\)-minimization and \(l_p\)-minimization.

Definition 2 ([22]) For any \(0\le p\le 1\) and \(k>0\), the null space constant (NSC) \(h(p,A,k)\) is the smallest number such that

\[
\sum_{i\in\Omega}|x_i|^p\le h(p,A,k)\sum_{i\notin\Omega}|x_i|^p\tag{5}
\]

for every index set \(\Omega\subset\{1,2,\ldots,n\}\) with \(|\Omega|\le k\) and every \(x\in\mathrm{Ker}(A)\setminus\{0\}\).
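Definition 2 cannot be evaluated exactly in general, but it can be probed numerically: for a fixed \(x\in\mathrm{Ker}(A)\), the best index set \(\Omega\) simply collects the \(k\) largest values of \(|x_i|^p\), so sampling the null space yields a lower bound on \(h(p,A,k)\). The following Monte Carlo sketch is our own illustration (assuming \(A\) is underdetermined so that \(\mathrm{Ker}(A)\) is nontrivial), not an algorithm from the paper:

```python
# Monte Carlo lower bound on the null space constant h(p, A, k) (ours).
import numpy as np
from scipy.linalg import null_space

def nsc_lower_bound(A, p, k, trials=10000, seed=0):
    rng = np.random.default_rng(seed)
    N = null_space(A)                      # columns span Ker(A)
    best = 0.0
    for _ in range(trials):
        x = N @ rng.standard_normal(N.shape[1])
        v = np.sort(np.abs(x) ** p)[::-1]  # |x_i|^p, largest first
        best = max(best, v[:k].sum() / v[k:].sum())
    return best                            # h(p, A, k) >= best
```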

Similarly to the \(l_p\)-null space property, the NSC can also be used to characterize the performance of \(l_p\)-minimization. Combining the definition of the NSC with the results in [23], we obtain the following corollary.

Corollary 1 For any \(p\in[0,1]\), \(h(p,A,k)<1\) is a sufficient and necessary condition for recovery of all \(k\)-sparse vectors via \(l_p\)-minimization with \(0\le p\le 1\).

Proof The proof is very easy, and we leave it to the readers.

Corollary 2 Given a matrix \(A\in\mathbb{R}^{m\times n}\), if \(h(0,A,k)<1\), then:

(a) \(\|x\|_0\ge 2k+1\) for every \(x\in\mathrm{Ker}(A)\setminus\{0\}\);

(b) \(k\le\lfloor\frac{n-2.5}{2}+1\rfloor\), where \(\lfloor a\rfloor\) represents the integer part of \(a\).

Proof (a) Assume that there exists a vector \(x\in\mathrm{Ker}(A)\setminus\{0\}\) with \(\|x\|_0\le 2k\).

If \(\|x\|_0\le k\), let \(\Omega=\mathrm{support}(x)\); then \(\|x_\Omega\|_0=\|x\|_0>0=\|x_{\Omega^c}\|_0\).

If \(k<\|x\|_0\le 2k\), we consider an arbitrary set \(\Omega\subset\mathrm{support}(x)\) with \(|\Omega|=k\). Then \(\|x_\Omega\|_0=k\ge\|x_{\Omega^c}\|_0\).

According to the definition of \(h(p,A,k)\), both conclusions contradict \(h(0,A,k)<1\), and therefore \(\|x\|_0\ge 2k+1\) for any \(x\in\mathrm{Ker}(A)\setminus\{0\}\).

(b) As proved in (a), we get

\[
2k+1\le\|x\|_0\le n.\tag{6}
\]

Since \(\|x\|_0\) and \(k\) are integers, it is easy to get that

\[
k\le
\begin{cases}
\frac{n-1}{2} & \text{when } n \text{ is odd},\\[2pt]
\frac{n-2}{2} & \text{when } n \text{ is even}.
\end{cases}
\]

In total, we have \(k\le\lfloor\frac{n-2.5}{2}+1\rfloor\).

Remark 1 In Corollary 2, we obtained an inequality between \(n\) and \(k\) under the assumption \(h(0,A,k)<1\). Furthermore, Foucart [23, p. 49, Chapter 2] showed another inequality between \(m\) and \(k\): if every \(k\)-sparse vector \(x\in\mathbb{R}^n\) can be recovered via \(l_0\)-minimization, then \(m\ge 2k\); furthermore, \(k\le\lfloor\frac{m}{2}\rfloor\) since \(k\) is an integer.

Remark 2 Chen and Gu [22] showed some important properties of \(h(p,A,k)\). In particular, \(h(p,A,k)\) is a continuous function of \(p\in[0,1]\) when \(k\le\mathrm{spark}(A)-1\), where \(\mathrm{spark}(A)\) is the smallest number of columns of \(A\) that are linearly dependent. Therefore, if \(h(0,A,k)<1\) for some fixed \(A\) and \(k\), then there exists a constant \(p^*\) such that \(h(p,A,k)<1\) for \(p\in[0,p^*)\), that is, every \(k\)-sparse vector can be recovered via both \(l_0\)-minimization and \(l_p\)-minimization for \(p\in(0,p^*)\), which is a corollary of the main theorem in [7].

Theorem 2 ([7]) There exists a constant \(p(A,b)>0\) such that when \(0<p<p(A,b)\), every solution to \(l_p\)-minimization also solves \(l_0\)-minimization.

Theorem 2 is the main theorem in [7]. Obviously, this theorem qualitatively proves the effectiveness of solving the original \(l_0\)-minimization problem via \(l_p\)-minimization. Moreover, the theorem becomes more practical if \(p(A,b)\) is computable. At the end of this section, we point out a necessary and sufficient condition, closely related to the \(l_p\)-null space property, for \(h(0,A,k)<1\).

Lemma 1 Given an underdetermined matrix \(A\in\mathbb{R}^{m\times n}\) and an integer \(k\), the inequality \(h(0,A,k)<1\) holds if and only if there exist two constants \(0<u\le w\) with

\[
0<\lambda_{\min+}\bigl(A^TA\bigr)\le u^2\le w^2\le\lambda_{\max}\bigl(A^TA\bigr)\tag{7}
\]

such that

\[
u^2\|x\|_2^2\le\|Ax\|_2^2\le w^2\|x\|_2^2\tag{8}
\]

for every \(2k\)-sparse vector \(x\in\mathbb{R}^n\).

Proof Necessity. The proof is divided into two steps.

Step 1: Existence of \(u\). It suffices to prove that

\[
u_0:=\inf\bigl\{\|Ax\|_2/\|x\|_2 : x\ne 0,\ \|x\|_0\le 2k\bigr\}
\]

is nonzero.

Assume on the contrary that \(u_0=0\). Then, for every positive integer \(j\), there exists a \(2k\)-sparse vector \(x_j\) with \(\|x_j\|_2=1\) such that \(\|Ax_j\|_2\le j^{-1}\). We can extract a convergent subsequence of the bounded sequence \(\{x_j\}\), say \(x_{j_l}\to x_0\), and clearly \(Ax_0=0\) because the map \(y(x)=Ax\) is continuous.

Let \(J(x_0)=\{i:(x_0)_i\ne 0\}\). Each nonzero coordinate of \(x_0\) is the limit of the corresponding coordinates of the subsequence, so for all sufficiently large \(l\) we have \((x_{j_l})_i\ne 0\) for every \(i\in J(x_0)\); hence \(\|x_{j_l}\|_0\ge\|x_0\|_0\), and since \(\|x_{j_l}\|_0\le 2k\), we get \(\|x_0\|_0\le 2k\).

However, according to Corollary 2, \(\|x\|_0\ge 2k+1\) for any \(x\in\mathrm{Ker}(A)\setminus\{0\}\). Since \(x_0\in\mathrm{Ker}(A)\) and \(\|x_0\|_2=1\), the result \(\|x_0\|_0\le 2k\) contradicts Corollary 2.

Therefore, there exists a constant \(u>0\) such that \(\|Ax\|_2\ge u\|x\|_2\) for any \(x\in\mathbb{R}^n\) with \(\|x\|_0\le 2k\).

Step 2: Proof of \(u^2\ge\lambda_{\min+}(A^TA)\). By a compactness argument as in Step 1, the infimum is attained: there exists a vector \(x\in\mathbb{R}^n\) with \(\|x\|_0\le 2k\) such that \(\|Ax\|_2=u\|x\|_2\), where \(u=u_0\).

Let \(V=\mathrm{support}(x)\). It is easy to get that

\[
u^2 z^Tz\le z^TA_V^TA_Vz\tag{9}
\]

for all \(z\in\mathbb{R}^{|V|}\). Therefore, the smallest eigenvalue of \(A_V^TA_V\) is \(u^2\), since \(A_V^TA_V\in\mathbb{R}^{|V|\times|V|}\) is a symmetric matrix, and we can choose an eigenvector \(z\in\mathbb{R}^{|V|}\) for the eigenvalue \(u^2\).

If \(u^2<\lambda_{\min+}(A^TA)\), then consider the vector \(x\in\mathbb{R}^n\) with \(x_i=z_i\) when \(i\in V\) and zero otherwise. It is easy to get that \(A^TAx=u^2x\), which contradicts the definition of \(\lambda_{\min+}(A^TA)\).

Finally, notice that \(A^TA\) is a positive semidefinite matrix, so \(\|Ax\|_2^2=x^TA^TAx\le\lambda_{\max}(A^TA)\|x\|_2^2\) for all \(x\in\mathbb{R}^n\). Hence there exists a constant \(w\) with \(w^2\le\lambda_{\max}(A^TA)\) such that \(\|Ax\|_2^2\le w^2\|x\|_2^2\) for every \(2k\)-sparse vector \(x\).

Sufficiency. Suppose that (8) holds for every \(2k\)-sparse vector. Let \(x^*\) be a \(k\)-sparse solution of \(Ax=b\), and let \(x_1\) be any other \(k\)-sparse solution. Then \(x^*-x_1\) is \(2k\)-sparse and \(A(x^*-x_1)=0\), so that

\[
u^2\|x^*-x_1\|_2^2\le\bigl\|A(x^*-x_1)\bigr\|_2^2\le w^2\|x^*-x_1\|_2^2\tag{10}
\]

forces \(x^*=x_1\). Therefore, every \(k\)-sparse vector can be recovered by \(l_0\)-minimization (1), and this is equivalent to \(h(0,A,k)<1\) by Corollary 1.
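Lemma 1 can be checked numerically on small instances: the best constants \(u,w\) in (8) come from the extreme eigenvalues of the Gram matrices of the column submatrices with \(2k\) columns, and (7) says they are pinched between \(\lambda_{\min+}(A^TA)\) and \(\lambda_{\max}(A^TA)\). A brute-force sketch (ours, not the paper's):

```python
# Best constants u^2, w^2 in (8) over all 2k-sparse vectors (ours):
# scan the eigenvalues of the Gram matrices A_S^T A_S with |S| = 2k.
import itertools
import numpy as np

def sparse_bounds(A, k):
    n = A.shape[1]
    u2, w2 = np.inf, 0.0
    for S in itertools.combinations(range(n), 2 * k):
        eigs = np.linalg.eigvalsh(A[:, S].T @ A[:, S])
        u2, w2 = min(u2, eigs[0]), max(w2, eigs[-1])
    return u2, w2                # h(0, A, k) < 1 if and only if u2 > 0
```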

3 Main contribution

In this section, we address the proposed problem. By introducing a new technique and utilizing the preparations of Section 2, we present an analytic expression of \(p^*(A,b)\) such that every \(k\)-sparse vector \(x\) can be recovered via \(l_p\)-minimization with \(0<p<p^*(A,b)\) as long as it can be recovered via \(l_0\)-minimization. To this end, we first begin with two lemmas.

Lemma 2 For any \(x\in\mathbb{R}^n\) and \(0<p\le 1\), we have

\[
\|x\|_p\le\|x\|_0^{\frac{1}{p}-\frac{1}{2}}\|x\|_2.
\]

Proof This result can be easily proved by Hölder's inequality.
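For completeness, the Hölder step can be spelled out in one line (our own expansion of the proof sketch): applying Hölder's inequality with exponents \(\frac{2}{p}\) and \(\frac{2}{2-p}\) over the support of \(x\),

\[
\|x\|_p^p=\sum_{i\in\mathrm{support}(x)}|x_i|^p\cdot 1
\le\biggl(\sum_{i}|x_i|^2\biggr)^{\frac{p}{2}}\bigl|\mathrm{support}(x)\bigr|^{1-\frac{p}{2}}
=\|x\|_2^p\,\|x\|_0^{1-\frac{p}{2}},
\]

and raising both sides to the power \(\frac{1}{p}\) gives \(\|x\|_p\le\|x\|_0^{\frac{1}{p}-\frac{1}{2}}\|x\|_2\).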

Lemma 3 Given a matrix \(A\in\mathbb{R}^{m\times n}\), if \(u\|x\|_2\le\|Ax\|_2\le w\|x\|_2\) for all \(x\) with \(\|x\|_0\le 2k\), then

\[
\bigl|\langle Ax_1,Ax_2\rangle\bigr|\le\frac{w^2-u^2}{2}\|x_1\|_2\|x_2\|_2
\]

for all \(x_1\) and \(x_2\) with \(\|x_i\|_0\le k\) (\(i=1,2\)) and \(\mathrm{support}(x_1)\cap\mathrm{support}(x_2)=\emptyset\).

Proof By the assumption on the matrix \(A\) and the polarization identity, it is easy to get that

\[
\frac{|\langle Ax_1,Ax_2\rangle|}{\|x_1\|_2\|x_2\|_2}
=\biggl|\biggl\langle A\frac{x_1}{\|x_1\|_2},A\frac{x_2}{\|x_2\|_2}\biggr\rangle\biggr|
=\frac{1}{4}\biggl|\,\biggl\|A\biggl(\frac{x_1}{\|x_1\|_2}+\frac{x_2}{\|x_2\|_2}\biggr)\biggr\|_2^2
-\biggl\|A\biggl(\frac{x_1}{\|x_1\|_2}-\frac{x_2}{\|x_2\|_2}\biggr)\biggr\|_2^2\biggr|
\le\frac{1}{4}\biggl(w^2\biggl\|\frac{x_1}{\|x_1\|_2}+\frac{x_2}{\|x_2\|_2}\biggr\|_2^2
-u^2\biggl\|\frac{x_1}{\|x_1\|_2}-\frac{x_2}{\|x_2\|_2}\biggr\|_2^2\biggr).\tag{11}
\]

Since \(\mathrm{support}(x_1)\cap\mathrm{support}(x_2)=\emptyset\), we have

\[
\biggl\|\frac{x_1}{\|x_1\|_2}+\frac{x_2}{\|x_2\|_2}\biggr\|_2^2
=\biggl\|\frac{x_1}{\|x_1\|_2}-\frac{x_2}{\|x_2\|_2}\biggr\|_2^2=2,\tag{12}
\]

from which we get

\[
\bigl|\langle Ax_1,Ax_2\rangle\bigr|\le\frac{w^2-u^2}{2}\|x_1\|_2\|x_2\|_2.\tag{13}
\]

With the above lemmas in hand, we can now prove our main theorems.

Theorem 3 Given a matrix \(A\in\mathbb{R}^{m\times n}\) with \(m\le n\) and \(0<p\le 1\), if \(h(0,A,k)<1\), then

\[
h(p,A,k)\le h^*(p,A,k),\tag{14}
\]

where

\[
h^*(p,A,k)=\biggl(\frac{\sqrt{2}+1}{2}\biggr)^{p}\frac{k}{k+1}
\biggl[\frac{(\lambda-1)(n-2-k)}{2k}+\biggl(\lambda+\frac{1}{\sqrt{k+1}}\biggr)k^{-\frac{1}{2}}\biggr]^{p}\tag{15}
\]

with

\[
\lambda=\frac{\lambda_{\max}(A^TA)}{\lambda_{\min+}(A^TA)}.
\]

Proof According to Theorem 1 and Corollary 2, it is easy to get that \(\|x\|_0\ge 2k+1\) for every \(x\in\mathrm{Ker}(A)\setminus\{0\}\) since \(h(0,A,k)<1\). Furthermore, according to Lemma 1, we can find constants \(\lambda_{\min+}(A^TA)\le u^2\le w^2\le\lambda_{\max}(A^TA)\) such that

\[
u\|\tilde{x}\|_2\le\|A\tilde{x}\|_2\le w\|\tilde{x}\|_2\tag{16}
\]

for any \(\tilde{x}\in\mathbb{R}^n\) with \(\|\tilde{x}\|_0\le 2k\).

Now we consider a nonzero vector \(x\in\mathrm{Ker}(A)\setminus\{0\}\) and an arbitrary index set \(\Omega_0\subset\{1,2,\ldots,n\}\) with \(|\Omega_0|=k\). We partition the complement of \(\Omega_0\) as \(\Omega_0^c=\bigcup_{i=1}^{t}\Omega_i\), where

\(\Omega_1\) = {indices of the \(k+1\) largest absolute-value components of \(x\) on \(\Omega_0^c\)},

\(\Omega_2\) = {indices of the \(k\) largest absolute-value components of \(x\) on \(\Omega_0^c\setminus\Omega_1\)},

\(\Omega_3\) = {indices of the \(k\) largest absolute-value components of \(x\) on \(\Omega_0^c\setminus(\Omega_1\cup\Omega_2)\)},

...

\(\Omega_t\) = {indices of the remaining components of \(x\)}.

We know that \(\|x\|_0\ge 2k+1\), so both \(\Omega_0\) and \(\Omega_1\) are not empty, and there are only two cases:

(i) \(\Omega_0\) and \(\Omega_i\) (\(i=2,\ldots,t-1\)) all have \(k\) elements and \(\Omega_1\) has \(k+1\) elements, except, possibly, \(\Omega_t\);

(ii) \(\Omega_0\) has \(k\) elements, \(\Omega_1\) has fewer than \(k+1\) elements, and the \(\Omega_i\) (\(i=2,\ldots,t-1\)) are empty.

Furthermore, in both cases, the set \(\Omega_1\) can be divided into two parts:

\(\Omega_1^{(1)}\) = {indices of the \(k\) largest absolute-value components of \(x\) on \(\Omega_1\)},

\(\Omega_1^{(2)}\) = {indices of the remaining components of \(x\) on \(\Omega_1\)}.

It is obvious that \(\Omega_1=\Omega_1^{(1)}\cup\Omega_1^{(2)}\) and that \(\Omega_1^{(2)}\) contains at most one index.

Since \(u\|\tilde{x}\|_2\le\|A\tilde{x}\|_2\le w\|\tilde{x}\|_2\) for any \(\|\tilde{x}\|_0\le 2k\), we have

\[
\|x_{\Omega_0}\|_2^2+\|x_{\Omega_1}\|_2^2
=\|x_{\Omega_0}\|_2^2+\|x_{\Omega_1^{(1)}}\|_2^2+\|x_{\Omega_1^{(2)}}\|_2^2
=\bigl\|x_{\Omega_0}+x_{\Omega_1^{(1)}}\bigr\|_2^2+\|x_{\Omega_1^{(2)}}\|_2^2
\le\frac{1}{u^2}\bigl\|A\bigl(x_{\Omega_0}+x_{\Omega_1^{(1)}}\bigr)\bigr\|_2^2+\|x_{\Omega_1^{(2)}}\|_2^2.\tag{17}
\]

Since \(x=x_{\Omega_0}+x_{\Omega_1^{(1)}}+x_{\Omega_1^{(2)}}+x_{\Omega_2}+\cdots+x_{\Omega_t}\in\mathrm{Ker}(A)\setminus\{0\}\), we have

\[
\bigl\|A\bigl(x_{\Omega_0}+x_{\Omega_1^{(1)}}\bigr)\bigr\|_2^2
=\bigl\langle A\bigl(-x_{\Omega_0}-x_{\Omega_1^{(1)}}\bigr),A\bigl(x_{\Omega_1^{(2)}}+x_{\Omega_2}+\cdots+x_{\Omega_t}\bigr)\bigr\rangle
=\bigl\langle A\bigl(-x_{\Omega_0}-x_{\Omega_1^{(1)}}\bigr),Ax_{\Omega_1^{(2)}}\bigr\rangle
+\sum_{i=2}^{t}\Bigl(\bigl\langle A(-x_{\Omega_0}),Ax_{\Omega_i}\bigr\rangle
+\bigl\langle A\bigl(-x_{\Omega_1^{(1)}}\bigr),Ax_{\Omega_i}\bigr\rangle\Bigr).\tag{18}
\]

According to Lemma 3, for any \(i\in\{2,3,\ldots,t\}\), we get

\[
\bigl\langle A(-x_{\Omega_0}),Ax_{\Omega_i}\bigr\rangle\le\frac{w^2-u^2}{2}\|x_{\Omega_0}\|_2\|x_{\Omega_i}\|_2,
\qquad
\bigl\langle A\bigl(-x_{\Omega_1^{(1)}}\bigr),Ax_{\Omega_i}\bigr\rangle\le\frac{w^2-u^2}{2}\|x_{\Omega_1^{(1)}}\|_2\|x_{\Omega_i}\|_2.\tag{19}
\]

Substituting inequalities (19) into (18), we have

\[
\bigl\|A\bigl(x_{\Omega_0}+x_{\Omega_1^{(1)}}\bigr)\bigr\|_2^2
\le\bigl\|A\bigl(-x_{\Omega_0}-x_{\Omega_1^{(1)}}\bigr)\bigr\|_2\bigl\|Ax_{\Omega_1^{(2)}}\bigr\|_2
+\frac{w^2-u^2}{2}\sum_{i=2}^{t}\|x_{\Omega_i}\|_2\bigl(\|x_{\Omega_0}\|_2+\|x_{\Omega_1^{(1)}}\|_2\bigr)
\le w^2\bigl(\|x_{\Omega_0}\|_2+\|x_{\Omega_1^{(1)}}\|_2\bigr)\|x_{\Omega_1^{(2)}}\|_2
+\frac{w^2-u^2}{2}\sum_{i=2}^{t}\|x_{\Omega_i}\|_2\bigl(\|x_{\Omega_0}\|_2+\|x_{\Omega_1^{(1)}}\|_2\bigr).\tag{20}
\]

By the definitions of \(\Omega_1^{(1)}\) and \(\Omega_1\), it is easy to get that

\[
\|x_{\Omega_1^{(1)}}\|_2\le\|x_{\Omega_1}\|_2\tag{21}
\]

and

\[
\|x_{\Omega_1^{(2)}}\|_2\le\frac{1}{\sqrt{k+1}}\|x_{\Omega_1}\|_2.\tag{22}
\]

Therefore, we get that

\[
\|x_{\Omega_1^{(2)}}\|_2^2\le\frac{1}{\sqrt{k+1}}\bigl(\|x_{\Omega_0}\|_2+\|x_{\Omega_1}\|_2\bigr)\|x_{\Omega_1^{(2)}}\|_2.\tag{23}
\]

Substituting inequalities (20) and (23) into (17), we have

\[
\|x_{\Omega_0}\|_2^2+\|x_{\Omega_1}\|_2^2
\le\frac{1}{u^2}\bigl\|A\bigl(x_{\Omega_0}+x_{\Omega_1^{(1)}}\bigr)\bigr\|_2^2+\|x_{\Omega_1^{(2)}}\|_2^2
\le\bigl(\|x_{\Omega_0}\|_2+\|x_{\Omega_1}\|_2\bigr)
\biggl[\frac{w^2-u^2}{2u^2}\sum_{i=2}^{t}\|x_{\Omega_i}\|_2
+\biggl(\frac{w^2}{u^2}+\frac{1}{\sqrt{k+1}}\biggr)\|x_{\Omega_1^{(2)}}\|_2\biggr].\tag{24}
\]

For any \(2\le i\le t\) and any component \(a\) of \(x_{\Omega_i}\), it is easy to get that \(|a|^p\le\frac{1}{k+1}\|x_{\Omega_1}\|_p^p\), so we have the inequalities

\[
\|x_{\Omega_i}\|_2^2\le k(k+1)^{-\frac{2}{p}}\|x_{\Omega_1}\|_p^2\tag{25}
\]

and

\[
\|x_{\Omega_1^{(2)}}\|_2^2\le(k+1)^{-\frac{2}{p}}\|x_{\Omega_1}\|_p^2.\tag{26}
\]

Substituting inequalities (25) and (26) into (24), we derive that

\[
\|x_{\Omega_0}\|_2^2+\|x_{\Omega_1}\|_2^2
\le\bigl(\|x_{\Omega_0}\|_2+\|x_{\Omega_1}\|_2\bigr)
\biggl[\frac{w^2-u^2}{2u^2}k^{\frac{1}{2}}(k+1)^{-\frac{1}{p}}(t-1)
+\biggl(\frac{w^2}{u^2}+\frac{1}{\sqrt{k+1}}\biggr)(k+1)^{-\frac{1}{p}}\biggr]\|x_{\Omega_1}\|_p.\tag{27}
\]

Let \(r=\frac{w^2}{u^2}\) and

\[
B=\biggl[\frac{w^2-u^2}{2u^2}k^{\frac{1}{2}}(k+1)^{-\frac{1}{p}}(t-1)
+\biggl(\frac{w^2}{u^2}+\frac{1}{\sqrt{k+1}}\biggr)(k+1)^{-\frac{1}{p}}\biggr]\|x_{\Omega_1}\|_p.
\]

Then we can rewrite inequality (27) as

\[
\|x_{\Omega_0}\|_2^2+\|x_{\Omega_1}\|_2^2\le B\bigl(\|x_{\Omega_0}\|_2+\|x_{\Omega_1}\|_2\bigr),\tag{28}
\]

so that

\[
\biggl(\|x_{\Omega_0}\|_2-\frac{B}{2}\biggr)^2+\biggl(\|x_{\Omega_1}\|_2-\frac{B}{2}\biggr)^2\le\frac{B^2}{2}.
\]

Therefore, we get that \(\|x_{\Omega_0}\|_2\le\frac{\sqrt{2}+1}{2}B\).

According to Lemma 2, we have

\[
\|x_{\Omega_0}\|_p\le k^{\frac{1}{p}-\frac{1}{2}}\|x_{\Omega_0}\|_2\le k^{\frac{1}{p}-\frac{1}{2}}\,\frac{\sqrt{2}+1}{2}\,B.\tag{29}
\]

Substituting the expression of \(B\) into inequality (29), we obtain

\[
\begin{aligned}
\|x_{\Omega_0}\|_p
&\le k^{\frac{1}{p}-\frac{1}{2}}\,\frac{\sqrt{2}+1}{2}
\biggl[\frac{r-1}{2}k^{\frac{1}{2}}(k+1)^{-\frac{1}{p}}(t-1)
+\biggl(r+\frac{1}{\sqrt{k+1}}\biggr)(k+1)^{-\frac{1}{p}}\biggr]\|x_{\Omega_1}\|_p\\
&=\biggl(\frac{k}{k+1}\biggr)^{\frac{1}{p}}\frac{\sqrt{2}+1}{2}
\biggl[\frac{r-1}{2}(t-1)+\biggl(r+\frac{1}{\sqrt{k+1}}\biggr)k^{-\frac{1}{2}}\biggr]\|x_{\Omega_1}\|_p.
\end{aligned}\tag{30}
\]

We notice that the sets \(\Omega_0\) and \(\Omega_i\) (\(i=2,\ldots,t-1\)) all have \(k\) elements and the set \(\Omega_1\) has \(k+1\) elements, so that \(tk+2\le n\le(t+1)k+1\), and hence \(t\le\frac{n-2}{k}\). According to Lemma 1, we have \(r=\frac{w^2}{u^2}\le\lambda=\frac{\lambda_{\max}(A^TA)}{\lambda_{\min+}(A^TA)}\). Substituting these inequalities into (30), we obtain

\[
\|x_{\Omega_0}\|_p\le\frac{\sqrt{2}+1}{2}\biggl(\frac{k}{k+1}\biggr)^{\frac{1}{p}}
\biggl[\frac{(\lambda-1)(n-2-k)}{2k}+\biggl(\lambda+\frac{1}{\sqrt{k+1}}\biggr)k^{-\frac{1}{2}}\biggr]\|x_{\Omega_1}\|_p.
\]

It is obvious that \(\|x_{\Omega_1}\|_p\le\|x_{\Omega_0^c}\|_p\), and therefore we get

\[
\|x_{\Omega_0}\|_p^p\le h^*(p,A,k)\,\|x_{\Omega_0^c}\|_p^p,
\]

where \(h^*(p,A,k)\) is given in (15). According to the definition of \(h(p,A,k)\), we conclude that \(h(p,A,k)\le h^*(p,A,k)\).

Theorem 3 presents a result that is very similar to Theorem 1. However, it is worth pointing out that the constant \(h^*(p,A,k)\) plays the central role in Theorem 3. In fact, we can treat \(h^*(p,A,k)\) as an estimate of \(h(p,A,k)\): the former is calculable, whereas computing the latter is NP-hard, so \(h^*(p,A,k)\) may be considered a practical surrogate for \(h(p,A,k)\). According to Theorem 1, if we take \(k\) to be the \(l_0\)-norm of the unique solution of \(l_0\)-minimization, then we obtain our main contribution as soon as the inequality \(h^*(p,A,k)^{\frac{1}{p}}<1\) is satisfied.
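Since \(h^*(p,A,k)\) is meant to be easily calculable, a direct transcription of (15) into code is immediate. The sketch below is our own, with \(\lambda\) supplied by the caller (e.g., computed from the eigenvalues of \(A^TA\) with the helper given earlier):

```python
# Direct transcription of (15) (ours): an upper bound on h(p, A, k).
import numpy as np

def h_star(p, lam, n, k):
    bracket = ((lam - 1.0) * (n - 2.0 - k) / (2.0 * k)
               + (lam + 1.0 / np.sqrt(k + 1.0)) / np.sqrt(k))
    return ((np.sqrt(2.0) + 1.0) / 2.0) ** p * (k / (k + 1.0)) * bracket ** p
```

Recovery of all \(k\)-sparse vectors is then certified as soon as \(h^*(p,A,k)<1\) (equivalently, \(h^*(p,A,k)^{1/p}<1\)).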

Theorem 4 Let \(A\in\mathbb{R}^{m\times n}\) be an underdetermined matrix of full rank, and denote \(\ell^*=|\mathrm{support}(A^T(AA^T)^{-1}b)|\). If every \(k\)-sparse vector \(x\) can be recovered via \(l_0\)-minimization, then \(x\) also can be recovered via \(l_p\)-minimization with \(p\in(0,p^*(A,b))\), where

\[
p^*(A,b)=\max\biggl\{h(\ell^*),\,h\biggl(\biggl\lfloor\frac{n-2.5}{2}+1\biggr\rfloor\biggr),\,h\biggl(\biggl\lfloor\frac{m}{2}\biggr\rfloor\biggr)\biggr\}\tag{31}
\]

with

\[
h(x)=\frac{\ln(x+1)-\ln x}{\ln\Bigl[\bigl(\frac{\sqrt{2}+1}{2}\bigr)\bigl(\frac{(\lambda-1)(n-3)}{2}+\lambda+\frac{1}{\sqrt{2}}\bigr)\Bigr]}\tag{32}
\]

and

\[
\lambda=\frac{\lambda_{\max}(A^TA)}{\lambda_{\min+}(A^TA)}.
\]

Proof Recalling (15), we get the equivalence between \(l_0\)-minimization and \(l_p\)-minimization as long as \(h^*(p,A,k)^{\frac{1}{p}}<1\). However, \(k\) cannot be calculated directly, so we need to estimate \(k\) and turn the inequality \(h^*(p,A,k)^{\frac{1}{p}}<1\) into a computable one by inequality techniques.

Since \(k\) is a positive integer, we have

\[
\biggl(\lambda+\frac{1}{\sqrt{k+1}}\biggr)k^{-\frac{1}{2}}\le\lambda+\frac{1}{\sqrt{2}}.
\]

Noticing that \(\lambda\ge 1\), we have

\[
\frac{(\lambda-1)(n-2-k)}{2k}\le\frac{(\lambda-1)(n-3)}{2}.
\]

Furthermore, according to Corollary 2, we get \(2k+1\le n\), whence \((\lambda-1)(n-2-k)\ge 0\), so that

\[
h^*(p,A,k)^{\frac{1}{p}}\le\frac{\sqrt{2}+1}{2}\biggl(\frac{k}{k+1}\biggr)^{\frac{1}{p}}
\biggl[\frac{(\lambda-1)(n-3)}{2}+\lambda+\frac{1}{\sqrt{2}}\biggr].
\]

It is obvious that \(\frac{(\lambda-1)(n-3)}{2}+\lambda+\frac{1}{\sqrt{2}}>0\). Therefore, for \(x\in(0,+\infty)\) and \(p\in(0,1)\), it is easy to prove that the function

\[
f(x,p)=\frac{\sqrt{2}+1}{2}\biggl(\frac{x}{x+1}\biggr)^{\frac{1}{p}}
\biggl[\frac{(\lambda-1)(n-3)}{2}+\lambda+\frac{1}{\sqrt{2}}\biggr]
\]

increases in \(x\) when \(p\) is fixed and also increases in \(p\) when \(x\) is fixed.

According to Corollary 2 and Remark 1, we have \(k\le\lfloor\frac{n-2.5}{2}+1\rfloor\) and \(k\le\lfloor\frac{m}{2}\rfloor\); moreover, \(k\le\ell^*\), where \(\ell^*=|\mathrm{support}(A^T(AA^T)^{-1}b)|\), because \(x=A^T(AA^T)^{-1}b\) is obviously a solution of the underdetermined system \(Ax=b\).

Therefore, we get that

\[
f(k,p)\le\min\biggl\{f\biggl(\biggl\lfloor\frac{n-2.5}{2}+1\biggr\rfloor,p\biggr),\,f(\ell^*,p),\,f\biggl(\biggl\lfloor\frac{m}{2}\biggr\rfloor,p\biggr)\biggr\}.\tag{33}
\]

It is obvious that \(f(k,p)<1\) as long as one of the three inequalities \(f(\lfloor\frac{n-2.5}{2}+1\rfloor,p)<1\), \(f(\ell^*,p)<1\), and \(f(\lfloor\frac{m}{2}\rfloor,p)<1\) is satisfied.

Furthermore, for fixed \(x\), the inequality \(f(x,p)<1\) is easily solved, and the range of such \(p\) is

\[
p<h(x)=\frac{\ln(x+1)-\ln x}{\ln\Bigl[\bigl(\frac{\sqrt{2}+1}{2}\bigr)\bigl(\frac{(\lambda-1)(n-3)}{2}+\lambda+\frac{1}{\sqrt{2}}\bigr)\Bigr]}.
\]

Hence, for any \(0<p<p^*=\max\{h(\ell^*),h(\lfloor\frac{n-2.5}{2}+1\rfloor),h(\lfloor\frac{m}{2}\rfloor)\}\), we have \(h^*(p,A,k)^{\frac{1}{p}}\le f(k,p)<1\). Therefore, according to Theorem 1, every \(k\)-sparse vector \(x\in\mathbb{R}^n\) can be recovered via \(l_p\)-minimization with \(0<p<p^*(A,b)\).

Combining Theorems 3 and 4, we have reached the major goals of this paper. The most important result in these two theorems is the analytic expression of \(p^*(A,b)\), with which the specific range of \(p\) can be easily calculated.
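Indeed, the whole of Theorem 4 fits in a few lines of code. The following sketch is our own transcription of (31)-(32); the tolerance used for the support count is an assumption of ours:

```python
# p*(A, b) from Theorem 4 (our transcription of (31)-(32)).
import numpy as np

def p_star(A, b, tol=1e-10):
    m, n = A.shape
    eigs = np.linalg.eigvalsh(A.T @ A)
    lam = eigs[-1] / eigs[eigs > tol * eigs[-1]][0]   # lambda_max / lambda_min+
    x_ln = A.T @ np.linalg.solve(A @ A.T, b)          # A^T (A A^T)^{-1} b
    ell = int(np.count_nonzero(np.abs(x_ln) > tol))   # ell* = |support(...)|
    denom = np.log((np.sqrt(2) + 1) / 2
                   * ((lam - 1) * (n - 3) / 2 + lam + 1 / np.sqrt(2)))
    h = lambda x: np.log((x + 1) / x) / denom         # h(x) in (32)
    return max(h(ell), h(np.floor((n - 2.5) / 2 + 1)), h(np.floor(m / 2)))
```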

Next, we present two examples to demonstrate the validity of Theorem 4. We consider two matrices whose null spaces have different dimensions and obtain the unique solution to \(l_p\)-minimization to verify whether it is the unique solution to \(l_0\)-minimization.

Example 1 We consider an underdetermined system \(Ax=b\), where

\[
A=\begin{pmatrix}
1 & -1.5 & -0.7 & 0\\
0 & -2 & 0.5 & 1\\
1 & 0.5 & -1 & 1
\end{pmatrix}
\quad\text{and}\quad
b=\begin{pmatrix}0.5\\0\\0.5\end{pmatrix}.
\]

It is obvious that the sparsest solution is \(x^*=[0.5,0,0,0]^T\) and that \(\mathrm{Ker}(A)\) is spanned by \([-10,-2,-10,1]^T\), so the solutions of the equation \(Ax=b\) can be expressed in the form \(x=[0.5,0,0,0]^T+t[-10,-2,-10,1]^T\), where \(t\in\mathbb{R}\).

Therefore, the \(p\)-norm of \(x\) can be expressed as

\[
\|x\|_p^p=|0.5-10t|^p+|2t|^p+|10t|^p+|t|^p.
\]

Furthermore, it is easy to get that \(\lambda_{\max}(A^TA)=7.2583\), \(\lambda_{\min+}(A^TA)=1.1926\), and \(\lambda=\frac{\lambda_{\max}(A^TA)}{\lambda_{\min+}(A^TA)}=6.0856\).

We can get that

\[
A^T\bigl(AA^T\bigr)^{-1}b=[0.2561,-0.0488,-0.2439,0.0244]^T,
\]

and hence \(h(\ell^*)=0.0921\) and \(h(\lfloor\frac{n-2.5}{2}+1\rfloor)=h(\lfloor\frac{m}{2}\rfloor)=0.2862\), so \(p^*(A,b)=0.2862\).

As shown in Figure 1, we can get the solution of \(l_p\)-minimization in the cases \(p=0.2861\), \(0.2\), \(0.15\), and \(0.1\). It is obvious that \(l_{0.2861}\)-minimization, \(l_{0.2}\)-minimization, \(l_{0.15}\)-minimization, and \(l_{0.1}\)-minimization all reach their minimum at \(t=0\), which corresponds to the sparsest solution \(x^*=[0.5,0,0,0]^T\).

Figure 1 The \(p\)-norm of the solutions of the different cases in Example 1.
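The numbers in this example can be reproduced with the \(p^*\) routine sketched after Theorem 4; the scan of the solution line is our own check that the \(p\)-quasi-norm is minimized at \(t=0\):

```python
# Reproducing Example 1 (ours; reuses p_star from the sketch above).
import numpy as np

A = np.array([[1.0, -1.5, -0.7, 0.0],
              [0.0, -2.0,  0.5, 1.0],
              [1.0,  0.5, -1.0, 1.0]])
b = np.array([0.5, 0.0, 0.5])
print(p_star(A, b))                        # ~0.2862, as in the text

# ||x(t)||_p^p along the solution line x(t) = x* + t[-10, -2, -10, 1]^T:
p = 0.2
t = np.linspace(-0.2, 0.2, 4001)
norm_p = (np.abs(0.5 - 10 * t) ** p + np.abs(2 * t) ** p
          + np.abs(10 * t) ** p + np.abs(t) ** p)
print(t[np.argmin(norm_p)])                # 0.0: the sparsest solution wins
```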

Example 2 We consider a more complex situation with \(Ax=b\), where

\[
A=\begin{pmatrix}
1 & 0 & 3.5 & -3 & -2.7\\
0 & 2 & 0 & -1.5 & 4.5\\
2 & 2 & -4 & -0.5 & 1.5
\end{pmatrix}
\quad\text{and}\quad
b=\begin{pmatrix}0.5\\0\\1\end{pmatrix}.
\]

It is easy to get the sparsest solution \(x^*=[0.5,0,0,0,0]^T\), and the solutions of the underdetermined system \(Ax=b\) can be expressed in the parameterized form

\[
x=[0.5,0,0,0,0]^T
+s\biggl[\frac{213}{110},-\frac{9}{4},\frac{12}{55},0,1\biggr]^T
+t\biggl[\frac{17}{22},\frac{3}{4},\frac{7}{11},1,0\biggr]^T,
\quad\text{where } s,t\in\mathbb{R}.
\]

Therefore,

\[
\|x\|_p^p=\biggl|0.5+\frac{213}{110}s+\frac{17}{22}t\biggr|^p
+\biggl|-\frac{9}{4}s+\frac{3}{4}t\biggr|^p
+\biggl|\frac{12}{55}s+\frac{7}{11}t\biggr|^p+|s|^p+|t|^p.
\]

Furthermore, we can get \(\lambda=\frac{\lambda_{\max}(A^TA)}{\lambda_{\min+}(A^TA)}=4.1792\). It is easy to get that

\[
A^T\bigl(AA^T\bigr)^{-1}b=[0.1903,0.1083,-0.1188,-0.1527,-0.0990]^T.
\]

Hence \(h(\ell^*)=0.0801\), \(h(\lfloor\frac{n-2.5}{2}+1\rfloor)=0.1782\), and \(h(\lfloor\frac{m}{2}\rfloor)=0.3046\), so we take \(p^*(A,b)=0.3046\).

From Figure 2 we can also find the solutions in the cases \(p=0.3045\), \(0.3\), \(0.2\), and \(0.1\). It is obvious that the minimum is reached at \(s=t=0\), which corresponds to the sparsest solution \(x^*=[0.5,0,0,0,0]^T\). The result can be seen more clearly in Figure 3.

Figure 2 The \(p\)-norm of the solutions of the different cases in Example 2.

Figure 3 The \(p\)-norm of the solutions of the different cases in Example 2, shown on the \(s\)-\(t\) plane.
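The same check applies verbatim to Example 2 (again our own sketch, reusing `p_star` from above):

```python
# Reproducing Example 2 (ours).
import numpy as np

A = np.array([[1.0, 0.0,  3.5, -3.0, -2.7],
              [0.0, 2.0,  0.0, -1.5,  4.5],
              [2.0, 2.0, -4.0, -0.5,  1.5]])
b = np.array([0.5, 0.0, 1.0])
print(p_star(A, b))          # ~0.3046, matching the value taken in the text
```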

4 Conclusions

In this paper, we have studied the equivalence between \(l_0\)-minimization and \(l_p\)-minimization. By using the \(l_p\)-null space property and a sufficient and necessary condition for the uniqueness of the solution of \(l_0\)-minimization, we have presented an analytic expression of \(p^*(A,b)\) such that \(l_p\)-minimization is equivalent to \(l_0\)-minimization. Although it is NP-hard to find the global optimal solution of \(l_p\)-minimization, a local minimizer can be obtained by efficient algorithms, and one can find the sparse solution via \(l_p\)-minimization with \(0<p<p^*(A,b)\) as long as one starts with a good initialization.

However, in this paper we only consider the situation where \(l_0\)-minimization has a unique solution. The uniqueness assumption is vital for proving the main results. Nevertheless, from Lemma 1 we see that the uniqueness assumption is equivalent to a certain double-inequality condition, which looks like the RIP. The evident difference between them is that the former possesses homogeneity, whereas the latter does not. This implies that, unlike the RIP, the uniqueness assumption is consistent with the equivalence of all scaled linear systems \(\lambda Ax=\lambda b\), \(\lambda\ne 0\). Therefore, we think that the uniqueness assumption and, equivalently, the resulting double-inequality condition in Lemma 1 can replace the RIP in many cases.

Acknowledgements

The work was supported by the National Natural Science Foundation of China (11131006).

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

Both authors contributed equally to this work. Both authors read and approved the final manuscript.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Received: 7 August 2017 Accepted: 12 December 2017

References

1. Olshausen, B, Field, DJ: Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381(6583), 607-609 (1996)
2. Candes, EJ, Recht, B: Exact matrix completion via convex optimization. Found. Comput. Math. 9(6), 717-772 (2009)
3. Malioutov, D, Cetin, M, Willsky, AS: A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans. Signal Process. 53(8), 3010-3022 (2005)
4. Wright, J, Ganesh, A, Zhou, Z: Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 210-227 (2009)
5. Candes, EJ, Tao, T: Decoding by linear programming. IEEE Trans. Inf. Theory 51(12), 4203-4215 (2005)
6. Chen, L, Gu, Y: Local and global optimality of lp minimization for sparse recovery. In: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 3596-3600 (2015)
7. Peng, J, Yue, S, Li, H: NP/CMP equivalence: a phenomenon hidden among sparsity models, l0-minimization and lp-minimization for information processing. IEEE Trans. Inf. Theory 61(7), 4028-4033 (2015)
8. Natarajan, BK: Sparse approximate solutions to linear systems. SIAM J. Comput. 24(2), 227-234 (1995)
9. Candes, EJ: The restricted isometry property and its implications for compressed sensing. C. R. Math. 346(9-10), 589-592 (2008)
10. Candes, EJ, Tao, T: Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Trans. Inf. Theory 52(12), 5406-5425 (2006)
11. Foucart, S, Lai, MJ: Sparsest solutions of underdetermined linear systems via lq-minimization for 0 < q ≤ 1. Appl. Comput. Harmon. Anal. 26(3), 395-407 (2009)
12. Donoho, DL, Tanner, J: Sparse nonnegative solution of underdetermined linear equations by linear programming. Proc. Natl. Acad. Sci. USA 102(27), 9446-9451 (2005)
13. Tropp, JA: Greed is good: algorithmic results for sparse approximation. IEEE Trans. Inf. Theory 50(10), 2231-2242 (2004)
14. Petukhov, A: Fast implementation of orthogonal greedy algorithm for tight wavelet frames. Signal Process. 86(3), 471-479 (2006)
15. Gao, Y, Ma, M: A new bound on the block restricted isometry constant in compressed sensing. J. Inequal. Appl. 2017(1), Article ID 174 (2017)
16. Gribonval, R, Nielsen, M: Sparse representations in unions of bases. IEEE Trans. Inf. Theory 49(12), 3320-3325 (2003)
17. Cai, TT, Zhang, A: Sparse representation of a polytope and recovery of sparse signals and low-rank matrices. IEEE Trans. Inf. Theory 60(1), 122-132 (2014)
18. Chartrand, R: Exact reconstruction of sparse signals via nonconvex minimization. IEEE Signal Process. Lett. 14(10), 707-710 (2007)
19. Sun, Q: Recovery of sparsest signals via lq-minimization. Appl. Comput. Harmon. Anal. 32(3), 329-341 (2012)
20. Daubechies, I, DeVore, R, Fornasier, M, Güntürk, CS: Iteratively reweighted least squares minimization for sparse recovery. Commun. Pure Appl. Math. 63(1), 1-38 (2010)
22. Chen, L, Gu, Y: On the null space constant for lp minimization. IEEE Signal Process. Lett. 22(10), 1600-1603 (2015)
23. Foucart, S, Rauhut, H: A Mathematical Introduction to Compressive Sensing. Birkhäuser, Basel (2013)
