The nonnegative rank - Applications of optimization to factorization ranks and quantum informat

In this section we adapt the techniques for the cp-rank from Section 5.4 to the asymmetric setting of the nonnegative rank. We now view a factorization A = (a_i^T b_j)_{i∈[m], j∈[n]} by nonnegative vectors as a factorization by positive semidefinite diagonal matrices. That is, we write A_ij = Tr(X_i X_{m+j}), with X_i = Diag(a_i) and X_{m+j} = Diag(b_j). Note that we can view this as a “partial matrix” setting, where for the symmetric matrix (Tr(X_i X_k))_{i,k∈[m+n]} of size m + n, only the off-diagonal entries at the positions (i, m + j) for i ∈ [m], j ∈ [n] are specified.

This asymmetry requires rescaling the factors in order to get upper bounds on their maximal eigenvalues, which is needed to ensure the Archimedean property for the selected localizing polynomials. For this we use the well-known fact that for any A ∈ R^{m×n}_+ there exists a factorization A = (Tr(X_i X_{m+j})) by diagonal nonnegative matrices of size rank₊(A), such that

\[
\lambda_{\max}(X_i),\ \lambda_{\max}(X_{m+j}) \le \sqrt{A_{\max}} \quad \text{for all } i \in [m],\ j \in [n],
\]

where A_max := max_{i,j} A_ij. To see this, observe that for a rank-one matrix R = uv^T with 0 ≤ R ≤ A, one may assume 0 ≤ u_i, v_j ≤ √A_max for all i, j. Hence, the set

\[
S_A^+ = \big\{ \sqrt{A_{\max}}\, x_i - x_i^2 : i \in [m+n] \big\} \cup \big\{ A_{ij} - x_i x_{m+j} : i \in [m],\ j \in [n] \big\}
\]

is localizing for A; that is, there exists a minimal factorization X ∈ D(S_A⁺) of A.
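The rescaling behind this fact can be sketched in a few lines. The following is a minimal illustration, not code from the source: the function names `balance_factorization` and `product` and the example factors are assumptions made here. Each rank-one term u v^T is replaced by (t u)(v/t)^T with t chosen so that both factors have the same maximal entry; since max_i u_i · max_j v_j is an entry of u v^T ≤ A, both maxima are then at most √A_max.

```python
import math

def balance_factorization(U, V):
    """Rescale each rank-one term u_l v_l^T so that max(u_l) == max(v_l).

    U[l] is the nonnegative vector u_l (length m), V[l] is v_l (length n).
    The sum of the terms u_l v_l^T is left unchanged.
    """
    U_bal, V_bal = [], []
    for u, v in zip(U, V):
        mu, mv = max(u), max(v)
        if mu == 0 or mv == 0:
            # The term u v^T is zero, so zero vectors represent it equally well.
            U_bal.append([0.0] * len(u))
            V_bal.append([0.0] * len(v))
            continue
        t = math.sqrt(mv / mu)  # (t*u)(v/t)^T == u v^T
        U_bal.append([t * x for x in u])
        V_bal.append([y / t for y in v])
    return U_bal, V_bal

def product(U, V):
    """Return the matrix sum_l u_l v_l^T as a list of rows."""
    m, n = len(U[0]), len(V[0])
    return [[sum(u[i] * v[j] for u, v in zip(U, V)) for j in range(n)]
            for i in range(m)]

# Hypothetical example: a 2 x 2 matrix given by two rank-one terms.
U = [[4.0, 2.0], [0.0, 1.0]]
V = [[0.25, 0.5], [1.0, 3.0]]
A = product(U, V)
U_bal, V_bal = balance_factorization(U, V)
bound = math.sqrt(max(max(row) for row in A))  # sqrt(A_max)
```

In the notation of this section, the diagonal of X_i collects the i-th entries of the rescaled vectors u_l, and the diagonal of X_{m+j} the j-th entries of the rescaled vectors v_l.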

Given A ∈ R^{m×n}_{≥0}, for each t ∈ N ∪ {∞} we consider the semidefinite program

\[
\xi_t^+(A) = \min\Big\{ L(1) : L \in \mathbb{R}[x_1, \ldots, x_{m+n}]_{2t}^*,\ 
L(x_i x_{m+j}) = A_{ij} \text{ for } i \in [m],\ j \in [n],\ 
L \ge 0 \text{ on } \mathcal{M}_{2t}(S_A^+) \Big\}.
\]

Moreover, define ξ⁺_∗(A) by adding the constraint rank(M(L)) < ∞ to the program defining ξ⁺_∞(A). It is easy to check that ξ_t⁺(A) ≤ ξ⁺_∞(A) ≤ ξ_∗⁺(A) ≤ rank₊(A) for t ∈ N.

Denote by ξ_{t,†}⁺(A) the strengthening of ξ_t⁺(A) where we add the positivity constraints

\[
L(gu) \ge 0 \quad \text{for } g \in \{1\} \cup S_A^+ \text{ and } u \in [x]_{2t-\deg(g)}. \tag{5.17}
\]

Note that these extra constraints can help for finite t, but that they are redundant for t ∈ {∞, ∗}.
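For orientation, the first level t = 1 of the hierarchy (without the extra constraints (5.17)) can be written out explicitly. This is a sketch under the assumption that M_2(S_A⁺) is the truncated quadratic module, that is, the cone of sums of terms g p² with g ∈ {1} ∪ S_A⁺ and deg(g p²) ≤ 2. Since every element of S_A⁺ has degree 2, its multipliers are then nonnegative constants. The names y_0 = L(1), y_i = L(x_i) and Y_{ik} = L(x_i x_k) are introduced here for illustration only.

```latex
\xi_1^+(A) = \min\Big\{ y_0 :\;
  \begin{pmatrix} y_0 & y^T \\ y & Y \end{pmatrix} \succeq 0,\quad
  Y_{i,m+j} = A_{ij} \ \ (i \in [m],\ j \in [n]),\quad
  \sqrt{A_{\max}}\, y_i \ge Y_{ii} \ \ (i \in [m+n]),\quad
  A_{ij}\,(y_0 - 1) \ge 0 \ \ (i \in [m],\ j \in [n]) \Big\}.
```

The last family of constraints comes from applying L to A_ij − x_i x_{m+j} and using L(x_i x_{m+j}) = A_ij. In particular, under this reading, any feasible L has L(1) ≥ 1 as soon as A has a positive entry.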

5.5.1 Comparison to other bounds

As in the previous section, we compare our bounds to the bounds by Fawzi and Parrilo [FP16]. They introduce the following parameter τ₊(A) as an analogue of the bound τcp(A) for the nonnegative rank:

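\[
\tau_+(A) = \min\Big\{ \alpha : \alpha \ge 0,\ A \in \alpha \cdot \operatorname{conv}\big\{ R \in \mathbb{R}^{m \times n} : 0 \le R \le A,\ \operatorname{rank}(R) \le 1 \big\} \Big\},
\]

and the analogue τ₊^sos(A) of the bound τ_cp^sos(A) for the nonnegative rank:

\[
\begin{aligned}
\tau_+^{\mathrm{sos}}(A) = \inf\Big\{ \alpha :\ & X \in \mathbb{R}^{mn \times mn},\ \alpha \in \mathbb{R}, \\
& \begin{pmatrix} \alpha & \operatorname{vec}(A)^T \\ \operatorname{vec}(A) & X \end{pmatrix} \succeq 0, \\
& X_{(i,j),(i,j)} \le A_{ij}^2 \quad \text{for } 1 \le i \le m,\ 1 \le j \le n, \\
& X_{(i,j),(k,\ell)} = X_{(i,\ell),(k,j)} \quad \text{for } 1 \le i < k \le m,\ 1 \le j < \ell \le n \Big\}.
\end{aligned}
\]

First we give the analogue of Proposition 5.18, whose proof we omit since it is very similar.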

Proposition 5.26. Let A ∈ R^m×n+ . For every t ∈ N ∪ {∞, ∗} the optimum in ξ⁺_t(A) is attained, and ξ⁺_t(A) → ξ⁺_∞(A) = ξ⁺_∗(A) as t → ∞. If ξ⁺_t(A) admits a flat optimal solution, then ξ⁺_t(A) = ξ⁺_∗(A). Moreover, ξ_∞⁺(A) = ξ_∗⁺(A) is the minimum value of L(1) taken over all linear functionals L that satisfy A = (L(x_ix_m+j)) and that are conic combinations L of trace evaluations at elements of D(S_A⁺).

Now we observe that the parameters ξ_∞⁺(A) and ξ_∗⁺(A) coincide with τ+(A), so that we have a sequence of semidefinite programs converging to τ+(A).

Proposition 5.27. For any A ∈ R^m×n≥0 , we have ξ⁺_∞(A) = ξ⁺_∗(A) = τ+(A).

Proof. The discussion at the beginning of Section 5.5 shows that for any rank-one matrix R satisfying 0 ≤ R ≤ A we may assume that R = uv^T with (u, v) ∈ R^m_+ × R^n_+ and u_i, v_j ≤ √A_max for i ∈ [m], j ∈ [n]. Hence, τ₊(A) can be written as

\[
\begin{aligned}
& \min\Big\{ \alpha : \alpha \ge 0,\ A \in \alpha \cdot \operatorname{conv}\big\{ uv^T : (u, v) \in \big[0, \sqrt{A_{\max}}\big]^{m+n},\ uv^T \le A \big\} \Big\} \\
={} & \min\Big\{ \alpha : \alpha \ge 0,\ A \in \alpha \cdot \operatorname{conv}\big\{ uv^T : (u, v) \in \mathcal{D}(S_A^+) \big\} \Big\}.
\end{aligned}
\]

The equality ξ_∞⁺(A) = ξ_∗⁺(A) = τ₊(A) now follows from the reformulation of ξ⁺_∗(A) in Proposition 5.26 in terms of conic evaluations, after noting that for (u, v) in R^m × R^n we have (u, v) ∈ D(S_A⁺) if and only if the matrix R = uv^T satisfies 0 ≤ R ≤ A.

Analogously to the case of the completely positive rank we have the following proposition. The proof is similar to that of Proposition 4.2, considering now for M the principal submatrix of M_2(L) indexed by the monomials 1 and x_i x_{m+j} for i ∈ [m] and j ∈ [n].

Proposition 5.28. If A is a nonnegative matrix, then ξ⁺_2,†(A) ≥ τ₊^sos(A).

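In the remainder of this section we recall how τ₊(A) and τ₊^sos(A) compare to other bounds in the literature. These bounds can be divided into two categories: combinatorial lower bounds and norm-based lower bounds. The following diagram from [FP16] summarizes how τ₊^sos(A) and τ₊(A) relate to the combinatorial lower bounds:

\[
\begin{array}{ccccccccc}
 & & \tau_+^{\mathrm{sos}}(A) & \le & \tau_+(A) & \le & \operatorname{rank}_+(A) & & \\
 & & \rotatebox[origin=c]{90}{$\le$} & & \rotatebox[origin=c]{90}{$\le$} & & \rotatebox[origin=c]{90}{$\le$} & & \\
\omega(\mathrm{RG}(A)) & \le & \vartheta(\mathrm{RG}(A)) & \le & \chi_{\mathrm{frac}}(\mathrm{RG}(A)) & \le & \chi(\mathrm{RG}(A)) & = & \operatorname{rank}_B(A).
\end{array}
\]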

Here RG(A) is the rectangular graph, with V = {(i, j) ∈ [m] × [n] : A_ij > 0} as vertex set and E = {((i, j), (k, ℓ)) : A_iℓ A_kj = 0} as edge set. The coloring number of RG(A) coincides with the well-known rectangle covering number (also denoted rank_B(A)), which was used, e.g., in [FMP⁺15] to show that the extension complexity of the correlation polytope is exponential. The clique number of RG(A) is also known as the fooling set number (see, e.g., [FKPT13]). Observe that the above combinatorial lower bounds only depend on the sparsity pattern of the matrix A, and that they are all equal to one for a strictly positive matrix.

Fawzi and Parrilo [FP16] have furthermore shown that the bound τ₊(A) is at least as good as norm-based lower bounds:

\[
\tau_+(A) = \sup_{\substack{N \text{ monotone and} \\ \text{positively homogeneous}}} \frac{N^*(A)}{N(A)}.
\]

Here, a function N : R^{m×n}_+ → R_+ is positively homogeneous if N(λA) = λN(A) for all λ ≥ 0 and monotone if N(A) ≤ N(B) for A ≤ B, and N^∗(A) is defined as

\[
N^*(A) = \max\big\{ L(A) : L : \mathbb{R}^{m \times n} \to \mathbb{R} \text{ linear and } L(X) \le 1 \text{ for all } X \in \mathbb{R}^{m \times n}_+ \text{ with } \operatorname{rank}(X) \le 1 \text{ and } N(X) \le 1 \big\}.
\]

These bounds are called norm-based since norms often provide valid functions N. For example, when N is the ℓ_∞-norm, Rothvoß [Rot17] used the corresponding lower bound N^∗(A)/N(A) to show that the matching polytope has exponential extension complexity.

When N is the Frobenius norm, N(A) = (Σ_{i,j} A_ij²)^{1/2}, the parameter N^∗(A) is known as the nonnegative nuclear norm. In [FP15] it is denoted by ν₊(A) and it is shown to satisfy rank₊(A) ≥ (ν₊(A)/‖A‖_F)². Moreover, it is reformulated as

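\[
\begin{aligned}
\nu_+(A) &= \min\Big\{ \sum_i \lambda_i : A = \sum_i \lambda_i u_i v_i^T,\ (\lambda_i, u_i, v_i) \in \mathbb{R}^{1+m+n}_+,\ \|u_i\|_2 = \|v_i\|_2 = 1 \Big\} && (5.18) \\
&= \max\Big\{ \langle A, W \rangle : W \in \mathbb{R}^{m \times n},\ \begin{pmatrix} I & -W \\ -W^T & I \end{pmatrix} \text{ is copositive} \Big\}, && (5.19)
\end{aligned}
\]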

where the cone of copositive matrices is the dual of the cone of completely positive matrices. Fawzi and Parrilo [FP15] use the copositive formulation (5.19) to provide bounds ν₊^[k](A) (k ≥ 0), based on inner approximations of the copositive cone from [Par00], which converge to ν₊(A) from below. We now observe that by Theorem 4.12 the atomic formulation of ν₊(A) from (5.18) can be seen as a moment optimization problem:

\[
\nu_+(A) = \min \int_{V(S)} 1 \, d\mu(x) \quad \text{s.t.} \quad A_{ij} = \int_{V(S)} x_i x_{m+j} \, d\mu(x) \ \text{ for } i \in [m],\ j \in [n].
\]

Here, the optimization variable µ is required to be a Borel measure on the variety V(S), where

\[
S = \Big\{ \sum_{i=1}^{m} x_i^2 - 1,\ \sum_{j=1}^{n} x_{m+j}^2 - 1 \Big\}.
\]

(The same observation is made in [TS15] for the real nuclear norm of a symmetric 3-tensor and in [Nie17] for symmetric odd-dimensional tensors.) For t ∈ N ∪ {∞}, let µ_t(A) denote the parameter defined analogously to ξ⁺_t(A), where we replace the condition L ≥ 0 on M_2t(S_A⁺) by L ≥ 0 on M_2t({x₁, . . . , x_{m+n}}) and L = 0 on I_2t(S), and let µ_∗(A) be obtained by adding the constraint rank(M(L)) < ∞ to µ_∞(A). We have µ_t(A) → µ_∞(A) = µ_∗(A) = ν₊(A) by Theorem 4.12 and (a non-normalized analogue of) Theorem 4.13. One can show that µ_1(A), with the additional constraints L(u) ≥ 0 for all u ∈ [x]_2, is at least as good as ν₊^[0](A). It is not clear how the hierarchies µ_t(A) and ν₊^[k](A) compare in general.
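As a small sanity check of the Frobenius-norm bound rank₊(A) ≥ (ν₊(A)/‖A‖_F)² and of the two formulations (5.18) and (5.19), consider the identity matrix I_n. This worked example is not taken from the source; it only uses the formulations recalled above.

```latex
% Upper bound from the atomic formulation (5.18): I_n = \sum_{i=1}^n 1 \cdot e_i e_i^T.
\nu_+(I_n) \le n.
% Lower bound from (5.19) with W = I_n: the block matrix is positive semidefinite, hence copositive.
\begin{pmatrix} I_n & -I_n \\ -I_n & I_n \end{pmatrix} \succeq 0
\quad\Longrightarrow\quad
\nu_+(I_n) \ge \langle I_n, I_n \rangle = n.
% Resulting bound on the nonnegative rank:
\Big( \frac{\nu_+(I_n)}{\|I_n\|_F} \Big)^2 = \Big( \frac{n}{\sqrt{n}} \Big)^2 = n = \operatorname{rank}_+(I_n).
```

So for the identity matrix the norm-based bound is tight.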

5.5.2 Computational examples

We illustrate the performance of our approach by comparing our lower bounds ξ_{2,†}⁺ and ξ_{3,†}⁺ to the lower bounds τ₊ and τ₊^sos on the two examples considered in [FP16].

All nonnegative 2 × 2 matrices

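For A(α) = \begin{pmatrix} 1 & 1 \\ 1 & \alpha \end{pmatrix}, Fawzi and Parrilo [FP16] show that

\[
\tau_+(A(\alpha)) = 2 - \alpha \quad \text{and} \quad \tau_+^{\mathrm{sos}}(A(\alpha)) = \frac{2}{1 + \alpha} \quad \text{for all } 0 \le \alpha \le 1.
\]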

Since the parameters τ+(A) and τ₊^sos(A) are invariant under scaling and permuting rows and columns of A, one can use the identity

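\[
\begin{pmatrix} 1 & 1 \\ 1 & \alpha \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & \alpha \end{pmatrix}
\begin{pmatrix} 1 & 1 \\ 1 & 1/\alpha \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\]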

to see that their result describes the parameters for all nonnegative 2 × 2 matrices.

By using a semidefinite programming solver for α = k/100, k ∈ [100], we see that ξ⁺₂(A(α)) coincides with τ₊(A(α)).
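The closed forms from [FP16] and the scaling identity for A(α) are easy to check numerically. The sketch below only evaluates those formulas on the same grid α = k/100; it does not solve the semidefinite program ξ⁺₂, and the helper names `tau_plus`, `tau_sos` and `matmul` are assumptions made for this illustration.

```python
def tau_plus(alpha):
    """Closed form of tau_+(A(alpha)) reported in [FP16], for 0 <= alpha <= 1."""
    return 2 - alpha

def tau_sos(alpha):
    """Closed form of tau_+^sos(A(alpha)) reported in [FP16], for 0 <= alpha <= 1."""
    return 2 / (1 + alpha)

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Check the identity A(alpha) = Diag(1, alpha) * A(1/alpha) * P for one value of alpha.
alpha = 0.25
lhs = [[1, 1], [1, alpha]]
rhs = matmul(matmul([[1, 0], [0, alpha]], [[1, 1], [1, 1 / alpha]]),
             [[0, 1], [1, 0]])

# On the grid alpha = k/100, the gap tau_+ - tau_sos equals alpha*(1 - alpha)/(1 + alpha) >= 0.
gaps = [tau_plus(k / 100) - tau_sos(k / 100) for k in range(1, 101)]
```

The gap between the two bounds vanishes only at α = 1 on this grid, where both equal 1.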

The nested rectangles problem

Here we consider the nested rectangles problem as described in [FP16, Section 2.7.2] (see also [MSvS03]). This problem asks for which a, b ∈ [−1, 1] there exists a triangle T such that R(a, b) ⊆ T ⊆ P, where R(a, b) = [−a, a] × [−b, b] and P = [−1, 1]²; see Figure 5.1 for an illustration.

Figure 5.1: An example of the nested rectangles problem where a triangle exists.

In Chapter 2 we have seen how the nonnegative rank relates to the extension complexity of a polytope. In fact, it also relates to extended formulations of nested pairs of polytopes [BFPS15, GG12]. An extended formulation of a pair of polytopes P₁ ⊆ P₂ ⊆ R^d is a (possibly) higher-dimensional polytope K and a projection π such that π(K) is nested between P₁ and P₂. Suppose π(K) = {x ∈ R^d : ∃ y ∈ R^k_+, (x, y) ∈ K} and K = {(x, y) : Ex + Fy = g, y ∈ R^k_+}. Then k is the size of the extended formulation, and the smallest such k is called the extension complexity of the pair (P₁, P₂). It is known (cf. [BFPS15, Theorem 1]) that the extension complexity of the pair (P₁, P₂), where

P₁ = conv({v₁, . . . , v_n}) and P₂ = {x : a_i^T x ≤ b_i for i ∈ [m]},

is equal to the nonnegative rank of the generalized slack matrix S_{P₁,P₂} ∈ R^{m×n}, defined by

(S_{P₁,P₂})_ij = b_i − a_i^T v_j for i ∈ [m], j ∈ [n].

It is known that any nonnegative matrix is the slack matrix of some nested pair of polytopes [GPT13, Lemma 4.1] (see also [GG12]).
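For the nested rectangles pair, the generalized slack matrix can be computed directly from the definition. The sketch below is an illustration under stated assumptions: rows are indexed by the m inequalities of the outer polytope and columns by the n vertices of the inner polytope (matching S ∈ R^{m×n}), and the function names, the ordering of facets and vertices, and the sample values a = 0.5, b = 0.25 are all choices made here.

```python
def generalized_slack_matrix(facets, vertices):
    """Slack matrix with entry b - a^T v for each facet (a, b) and vertex v.

    facets: list of pairs (a, b) describing the inequality a^T x <= b.
    vertices: list of points, each given as a tuple of coordinates.
    """
    return [[b - sum(ai * vi for ai, vi in zip(a, v)) for v in vertices]
            for a, b in facets]

def nested_rectangles_slack(a, b):
    """Slack matrix of the pair (R(a, b), P) with P = [-1, 1]^2."""
    facets = [((1, 0), 1), ((-1, 0), 1), ((0, 1), 1), ((0, -1), 1)]
    vertices = [(a, b), (a, -b), (-a, b), (-a, -b)]
    return generalized_slack_matrix(facets, vertices)

S = nested_rectangles_slack(0.5, 0.25)
for row in S:
    print(row)
```

Each entry is of the form 1 ± a or 1 ± b, so the matrix is nonnegative exactly when the rectangle R(a, b) is contained in P.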

Applying this to the pair (R(a, b), P ), one immediately sees that there exists a polytope K with at most three facets whose projection T = π(K) ⊆ R² satisfies R(a, b) ⊆ T ⊆ P if and only if the pair (R(a, b), P ) admits an extended formulation of size 3. For a, b > 0, the polytope T has to be 2-dimensional, therefore K has to be at least 2-dimensional as well; it follows that K and T have to be triangles. Hence

