Commutative and tracial polynomial optimization

of Corollary 4.8.

Theorem 4.11. Let 1 ≤ δ ≤ t < ∞ and S, T ⊆ R[x]^2δ. If L ∈ R[x]^∗2t is δ-flat, nonnegative on M2t(S), and zero on I2t(T ), then L extends to a conic combination of evaluation maps at points in D(S) ∩ V (T ).

Proof. Here too we derive the result from its noncommutative analogue in Corol-lary 4.8. As in the above proof for the implication (1) =⇒ (2) in Theorem 4.10, define the sets S⁰, T⁰ ⊆ Rhxi and the linear form L⁰ ∈ Rhxi^∗2t by L⁰(p) = L(p^c) for p ∈ Rhxi^2t. Then L⁰ is symmetric, tracial, nonnegative on M2t(S⁰), zero on I2t(T⁰), and δ-flat. By Corollary 4.8, L⁰ is a conic combination of trace evalua-tion maps at elements of D(S⁰) ∩ V(T⁰). It suffices now to observe that such a trace evaluation LX is a conic combination of (scalar) evaluations at elements of D(S)∩V (T ). Indeed, as X ∈ V(T⁰), the matrices X₁, . . . , X_npairwise commute and thus can be assumed to be diagonal. Since X ∈ D(S⁰) ∩ V(T⁰), we have g(X) 0 for g⁰ ∈ S⁰ and h⁰(X) = 0 for h⁰ ∈ T⁰. This implies g((X₁)_jj, . . . , (X_n)_jj) ≥ 0 and h((X₁)_jj, . . . , (X_n)_jj) = 0 for all g ∈ S, h ∈ T , and j ∈ [d]. Thus L_X =P

jL_r_j, where r_j= ((X₁)_jj, . . . , (X_n)_jj) ∈ D(S) ∩ V (T ).

Unlike in the noncommutative setting, here we also have the following result, which permits to express any linear functional L nonnegative on an Archimedean quadratic module as a conic combination of evaluations at points, when restricting L to polynomials of bounded degree.

Theorem 4.12. Let S, T ⊆ R[x] such that M(S) + I(T ) is Archimedean. If L ∈ R[x]^∗ is nonnegative on M(S) and zero on I(T ), then for any integer k ∈ N the restriction of L to R[x]k extends to a conic combination of evaluations at points in D(S) ∩ V (T ).

Proof. By Theorem 4.9 there exists a probability measure µ on D(S) such that L(p) = L(1)

D(S)∩V (T )

p(x) dµ(x) for all p ∈ R[x].

A general version of Tchakaloff’s theorem, as explained in [BT06], shows that there exist r ∈ N, scalars λ¹, . . . , λr> 0 and points x1, . . . , xr∈ D(S) such that

D(S)∩V (T )

p(x) dµ(x) =

i=1

λ_ip(x_i) for all p ∈ R[x]k.

Hence the restriction of L to R[x]^k extends to a conic combination of evaluations at points in D(S).

4.2 Commutative and tracial polynomial optimization

We briefly discuss here the basic polynomial optimization problems in the com-mutative and tracial settings. We recall how to design hierarchies of semidefinite programming based bounds and we give their main convergence properties.

58 Chapter 4. Noncommutative polynomial optimization The classical commutative polynomial optimization problem asks to minimize a polynomial f ∈ R[x] over a feasible region of the form D(S) as defined in (4.16):

f_∗= inf_a∈D(S)f (a) = inff (a) : a ∈ Rⁿ, g(a) ≥ 0 for g ∈ S . (4.17) In tracial polynomial optimization, given f ∈ Sym Rhxi, this is modified to the problem of minimizing tr(f (X)) over a feasible region of the form D(S) ∩ V(T ) (as defined in (4.7) and (4.8)):

f_∗^tr= inftr(f (X)) : d ∈ N, X ∈ (H^d)ⁿ,

g(X) 0 for g ∈ S, h(X) = 0 for h ∈ T ,

where tr(·) is the normalized trace. Observe that the infimum does not change if we replace H^d by S^d in view of the embedding of H^d into S^2d that we have seen in Equation (2.1). Commutative polynomial optimization is recovered by restricting to 1 × 1 matrices.

For the commutative case, Lasserre [Las01] and Parrilo [Par00] have proposed hierarchies of semidefinite programming relaxations based on sums of squares of polynomials and the dual theory of moments. This approach has been extended to eigenvalue optimization in [PNA10, NPA12] and later to tracial optimization in [BCKP13, KP16]. The starting point in deriving these relaxations is to refor-mulate the above problems as minimizing L(f ) over all normalized trace evalua-tion maps L at points in D(S) or D(S) ∩ V(T ). We then express computaevalua-tionally tractable properties satisfied by such maps L and truncate to polynomials of finite degree t. Notice that in the commutative setting we do not need to work with the variety V (T ), since V (T ) = D(T ∪ −T ).

For a set S ⊆ R[x] and t ∈ N ∪ {∞}, recall the (truncated) quadratic module:

M2t(S) = conegp²: p ∈ R[x], g ∈ S ∪ {1}, deg(gp²) ≤ 2t . (4.18) For a polynomial f ∈ R[x] and t ≥ ddeg(f )/2e we can use the quadratic module to formulate the following semidefinite programming lower bound on f_∗:

ft= infL(f ) : L ∈ R[x]^∗2t, L(1) = 1, L ≥ 0 on M2t(S) . For t ∈ N we have ft≤ f_∞≤ f_∗.

In the same way, for sets S ∪ {f } ⊆ Sym Rhxi, and T ⊆ Rhxi, and t ∈ N ∪ {∞}

such that ddeg(f )/2e ≤ t ≤ ∞, we have the following semidefinite programming lower bound on f_∗^tr:

f_t^tr= infL(f ) : L ∈ Rhxi^∗2t tracial and symmetric, L(1) = 1, L ≥ 0 on M2t(S), L = 0 on I2t(T ) ,

where we now use definition (4.2) for M2t(S) and (4.3) for I2t(T ).

The next theorem from [Las01] gives fundamental convergence properties for the commutative case; see also, e.g., [Las09, Lau09] for a detailed exposition.

Theorem 4.13. Let 1 ≤ δ ≤ t < ∞ and S ∪ {f } ⊆ R[x]2δ with D(S) 6= ∅.

4.2. Commutative and tracial polynomial optimization 59 (i) If M(S) is Archimedean, then ft→ f_∞ as t → ∞, the optimal values in f_∞

and f_∗ are attained, and f_∞= f_∗.

(ii) If ftadmits an optimal solution L that is δ-flat, then L is a convex combination of evaluation maps at global minimizers of f in D(S), and ft= f_∞= f_∗. Proof. (i) By repeating the first part of the proof of Theorem 4.14 in the commu-tative setting we see that f_t → f_∞ and that the optimum is attained in f_∞. Let L be optimal for f_∞ and let k be greater than deg(f ) and deg(g) for g ∈ S. By Theorem 4.12, the restriction of L to R[x]^k extends to a conic combination of eval-uations at points in D(S). It follows that this extension if feasible for f_∗ with the same objective value, which shows f_∞= f_∗.

(ii) This follows in the same way as the proof of Theorem 4.14(ii) below, where, instead of using Corollary 4.8, we now use its commutative analogue, Theorem 4.11.

To discuss convergence for the tracial case we need one more optimization problem:⁵ f_II^tr

1 = infτ (f (X)) : A is a unital C^∗-algebra with tracial state τ,

X ∈ D_A(S) ∩ V_A(T ) . (4.19)

This problem can be seen as an infinite-dimensional analogue of f_∗^tr: if we re-strict to finite-dimensional C^∗-algebras in the definition of f_II^tr

1, then we recover the parameter f_∗^tr (use Theorem 4.6 to see this). Moreover, as we see in Theo-rem 4.14(ii) below, equality f_∗^tr = f_II^tr₁ holds if some flatness condition is satisfied.

Whether f_II^tr₁ = f_∗^tr is true in general is related to Connes’ embedding conjecture (see [KS08, KP16, BKP16]).

For all t ∈ N we have

f_t^tr≤ f_t+1^tr ≤ f_∞^tr ≤ f_II^tr₁≤ f_∗^tr,

where the last inequality follows by considering for A the full matrix algebra C^d×d. The next theorem from [KP16] summarizes convergence properties for these param-eters, its proof uses Lemma 4.15 below.

Theorem 4.14. Let 1 ≤ δ ≤ t < ∞, S ∪ {f } ⊆ Sym Rhxi2δ, and T ⊆ Rhxi2δ, with D(S) ∩ V(T ) 6= ∅.

(i) If M(S) + I(T ) is Archimedean, then f_t^tr → f_∞^tr as t → ∞, and the optimal values in f_∞^tr and f_II^tr

1 are attained and equal.

(ii) If f_t^tr has an optimal solution L that is δ-flat, then L is a convex combination of normalized trace evaluations at matrix tuples in D(S) ∩ V(T ), and f_t^tr = f_∞^tr = f_II^tr

1= f_∗^tr.

5The subscript II1 in f_II^tr

1 comes from the more usual definition of this parameter using von Neumann algebras of type II1, see [KP16]. See [GdLL19, App. A] for a short discussion on the equivalence of the two definitions.

60 Chapter 4. Noncommutative polynomial optimization Proof. We first show (i). As the Minkowski sum M(S)+I(T ) is Archimedean (4.4), R −Pn

i=1x²_i ∈ M2d(S)I2d(T ) for some R > 0 and d ∈ N. Since the bounds ft^tr

are monotone nondecreasing in t and upper bounded by f_∞^tr, the limit limt→∞f_t^tr exists and it is at most f_∞^tr.

Fix ε > 0. For t ∈ N let L^t be a feasible solution to the program defining f_t^tr with value Lt(f ) ≤ f_t^tr+ ε. As Lt(1) = 1 for all t we can apply Lemma 4.15 and conclude that the sequence (Lt)t has a convergent subsequence. Let L ∈ Rhxi^∗ be the pointwise limit. One can easily check that L is feasible for f_∞^tr. Hence we have f_∞^tr ≤ L(f ) ≤ limt→∞f_t^tr+ ε ≤ f_∞^tr + ε. Letting ε → 0 we obtain that f_∞^tr = limt→∞f_t^tr and L is optimal for f_∞^tr.

Next, since L is symmetric, tracial, nonnegative on M(S), and zero on I(T ), we can apply Theorem 4.5 to obtain a feasible solution (A, τ, X) to f_II^tr

1 satisfying (4.9) with objective value L(f ). This shows f_∞^tr = f_II^tr

1 and that the optima are attained in f_∞^tr and f_II^tr

Finally, part (ii) is derived as follows. If L is an optimal solution of f_t^tr that is δ-flat, then, by Corollary 4.8, it has an extension ˆL ∈ Rhxi^∗ that is a conic combination of trace evaluations at elements of D(S) ∩ V(T ). This shows that f_∗^tr≤ ˆL(f ) = L(f ), and thus the chain of equalities f_t^tr= f_∞^tr = f_∗^tr= f_Π^tr

1holds.

We conclude with the following technical lemma, based on the Banach-Alaoglu theorem. It is a well-known crucial tool for proving the asymptotic convergence result from Theorem 4.14(i) and it is used at other places in this thesis.

Lemma 4.15. Let S ⊆ Sym Rhxi, T ⊆ Rhxi, and assume R − (x²1+ · · · + x²_n) ∈ M2d(S) + I2d(T ) for some d ∈ N and R > 0. For t ∈ N assume L^t ∈ Rhxi^∗2t is tracial, nonnegative on M2t(S), and zero on I2t(T ). Then we have

|Lt(w)| ≤ R^|w|/2L_t(1) for all w ∈ hxi_2t−2d+2.

In addition, if sup_tLt(1) < ∞, then {Lt}t has a pointwise converging subsequence in Rhxi^∗.

Proof. We first use induction on |w| to show that L_t(w^∗w) ≤ R^|w|L_t(1) for all w ∈ hxi_t−d+1. For this, assume L_t(w^∗w) ≤ R^|w|L_t(1) and |w| ≤ t − d. Then we have

Lt((xiw)^∗xiw) = Lt(w^∗(x²_i − R)w) + R · Lt(w^∗w) ≤ R · R^|w|Lt(1) = R^|xⁱ^w|Lt(1).

For the inequality we use the fact that Lt(w^∗(x²_i− R)w) ≤ 0 since w^∗(R − x²_i)w can be written as the sum of a polynomial in M2t(S)+I2t(T ) and a sum of commutators of degree at most 2t, which follows using the following identity: w^∗qhw = ww^∗qh + [w^∗qh, w]. Next we write any w ∈ hxi_2(t−d+1)as w = w₁^∗w2 with w1, w2∈ hxit−d+1

and use the positive semidefiniteness of the principal submatrix of Mt(Lt) indexed by {w1, w2} to get

Lt(w)²= Lt(w^∗₁w2)²≤ Lt(w₁^∗w1)Lt(w^∗₂w2) ≤ R^|w¹^|+|w²^|Lt(1)²= R^|w|Lt(1)². This shows the first claim.

4.3. Advantages and disadvantages of noncommutativity 61

In document Applications of optimization to factorization ranks and quantum information theory (Page 65-69)