Unconstraint global polynomial optimization via Gradient Ideal

(1)

Unconstraint global polynomial optimization via

Gradient Ideal

Marta Abril Bucero, Bernard Mourrain, Philippe Tr´

ebuchet

To cite this version:

Marta Abril Bucero, Bernard Mourrain, Philippe Tr´ebuchet. Unconstraint global polynomial optimization via Gradient Ideal. 2013. <hal-00779666v3>

HAL Id: hal-00779666

https://hal.inria.fr/hal-00779666v3

Submitted on 21 Mar 2013

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche fran¸cais ou étrangers, des laboratoires publics ou privés.

(2)

GRADIENT IDEAL

MARTA ABRIL BUCERO, BERNARD MOURRAIN, PHILIPPE TREBUCHET

Abstract. In this paper, we describe a new method to compute the minimum of a real polynomial function and the ideal defining the points which minimize this polynomial func-tion, assuming that the minimizer ideal is zero-dimensional. Our method is a generalization of Lasserre relaxation method and stops in a finite number of steps. The proposed algo-rithm combines Border Basis, Moment Matrices and Semidefinite Programming. In the case where the minimum is reached at a finite number of points, it provides a border basis of the minimizer ideal.

1. Introduction

Optimization appears in many areas of Scientific Computing, since the solution of a problem can often be described as the minimum of an optimization problem. Local methods such as gradient descent are often employed to handle global minimization problems. They can be very efficient to compute a local minimum, but the output depends on the initial guess and they give no guarantee of a global solution.

In the case where the function f to minimize is a polynomial, it is possible to develop methods which ensure the computation of a global solution. Reformulating the problem as the computation of a (minimal) critical value of the polynomialf, different polynomial system solvers can be used to tackle it (see e.g. [27], [8]). But in this case, the complex solutions of the underlying algebraic system come into play and additional computation efforts should be spent to remove these extraneous solutions. Semi-algebraic techniques such as Cylindrical Algebraic Decomposition or extensions [29] may also be considered here but suffer from similar issues. Though the global minimization problem is known to be NP-hard (see e.g. [23]), a practical challenge is to device methods which take into account “only” the real solutions of the problem or which can approximate them efficiently.

Previous works. About a decade ago, a relaxation approach has been proposed in [14] (see also [26], [31]) to solve this difficult problem. Instead of searching points where the polynomial f reaches its minimumf∗_{, a probability measure which minimizes the function}_f _{is searched.}

This problem is relaxed into a hierarchy of finite dimensional convex minimization problems, that can be solved by Semi-Definite Programming (SDP) techniques, and which converges to the minimum f∗ _{[14]. This hierarchy of SDP problems can be formulated in terms of}

linear matrix inequalities on moment matrices associated to the set of monomials of degree

≤ t _∈ N _{for increasing values of} _{t. The dual hierarchy can be described as a sequence of} maximization problems over the cone of polynomials which are Sums of Squares (SoS). A feasibility condition is needed to prove that this dual hierarchy of maximization problems also converges to the minimumf∗_{, ie. that there is no duality gap.}

From a computational point of view, this approach suffers from two drawbacks:

(1) the hierarchy of optimization problems may not be exact, ie. it may not always convergence in a finite number of steps;

(2) the size of the SDP problems to be solvedgrows exponentially witht.

(3)

To address the first issue, the following strategy has been considered: add polynomial in-equalities or in-equalities satisfied by the points where the function f is minimum. Inequality constraints can for instance be added to restrict the optimization problem to a compact subset of Rn _{and to make the hierarchies exact [14], [19]. Natural constraints which do not require} apriori bounds on the solutions are for instance the vanishing of the partial derivatives of f. A result in [17] shows that if the gradient ideal (generated by the first differentials of f) is zero-dimensional, then the hierarchy extended with constraints from the gradient ideal is exact. It is also proved that there is no duality gap if “good” generators of the gradient ideal are used. In [25], it is proved that the extended hierarchy is exact when the gradient ideal is radical and in [24] the extended hierarchy is proved to be exact for any polynomial f when the minimum is reached inRn_{(see also [9]). In [30], the relaxation techniques are analyzed for}

functions for which the minimum is not reached and which satisfies some special properties “at infinity”.

From an algorithmic point of view, this is not ending the investigations since a criteria is needed to determine at which step the minimum is reached. It is also important to known if all the minimizers can be recovered at that step.

Methods based on moment matrices have been proposed to compute generators of the ideal characterizing the real solutions of a polynomial system. In [16], the computation of generators of the (real) radical of an ideal is based on moment matrices which involve all monomials of a given degree and a stopping criterion related to Curto-Fialkow flat extension condition [5] is used. This method is improved in [13]. Combining the border basis algorithm of [21] and a weaker flat extension condition [18], a new algorithm which involves SDP problems of significantly smaller size is proposed to compute the (real) radical of an ideal, when this (real) radical ideal is zero-dimensional. In [25], an algorithm is also proposed to compute the global minimum of the polynomial functionf based on techniques from [17]. It terminates when the gradient ideal is radical zero-dimensional.

An interesting feature of these hierarchies of SDP problems is that, at any step they provide a lower bound of f∗ _{and the SoS hierarchy gives certificates for these lower bounds (see e.g.}

[12] and reference therein). In [10, 11] it is also shown how to obtain “good” upper bounds by perturbation techniques, which can directly be generalized to the approach we propose in this paper.

Contributions. We show that if the minimumf∗_{is reached in}_Rn_{, a generalized hierarchy of}

relaxation problems which involve the gradient idealIgrad(f)is exact and yields the generators

of the minimizer ideal Imin(f) from a sufficiently high degree.

In the case where the minimizer ideal is zero dimensional, we give a criterion for deciding when the minimum is reached, based on the flat extension condition in [18].

This criterion is used in a new algorithm which computes a border basis of the minimizer ideal of a polynomial function, when this minimizer ideal is zero-dimensional.

The algorithm is an extension of the real radical algorithm described in [13]. The rows and columns of the matrices involved in the semi-definite programming problem are associated with the family of monomials candidates for being a basis of the quotient space, i.e, a subset of monomials of size much smaller than the number of monomials of the same degree. Thus, the size of the SDP problems involved in this computation is significantly smaller than the one in [14], [17] or [25].

We show that by solving this sequence of SDP problems, we obtain in a finite number of steps the minimum f∗ _of _f_{, with no duality gap. When this minimum is reached, the}

kernel of the Hankel matrix associated to the solution of the SDP problem yields generators of the minimizer ideal which are not in the gradient ideal. Assuming that the minimizer ideal Imin(f)is zero dimensional, computing the border basis of this kernel yields a representation

(4)

of the quotient algebra byImin(f)and thus a way to compute effectively the minimizer points,

using eigenvector solvers.

An implementation of this method has been developed, which integrates a border basis implementation and a numerical SDP solver. It is demonstrated on typical examples.

Content. The paper is organized as follows. Section 2 recalls the concepts of algebraic tools as ideals, varieties, dual space, quotient algebra, the definitions and theorems about border basis, and the Hankel Operators involved in the computation of (real) radical ideals and in the computation of our minimizer ideal. In Section 3, we describe the main results on the exactness of the hierarchy of SDP problems on truncated Hankel operators and show that the minimizer ideal can be computed from the kernel of these Hankel. In Section 4, we analyze more precisely the case where the minimizer ideal is zero-dimensional. In section 5, we describe our algorithm and we prove its correctness. Finally in Section 6, we illustrate the algorithm on some classical examples.

2. Ideals, dual space, quotient algebra and border basis

In this section, we set our notation and recall the eigenvalue techniques for solving polyno-mial equations and the border basis method.

2.1. Ideals and varieties. Let K_[_x_] _{be the set of the polynomials in the variables} _x ₌

(x1, . . ., xn), with coefficients in the field K. Hereafter, we will choose1 K = R or C. Let

K _{denotes the algebraic closure of} K_{. For} _α _∈ Nn_, _xα ₌ _xα1

1 · · ·xαnn is the monomial with

exponent α and degree _|α_|=P

iαi. The set of all monomials in x is denoted M =M(x).

We say that xα ≤ xβ if xα divides xβ, i.e., if α _≤ β coordinate-wise. For a polynomial

f =P

αfαxα, its support is supp(f) := {xα |fα 6= 0}, the set of monomials occurring with

a nonzero coefficient inf.

Fort_∈N_and _S_⊆K_[_x_]_{, we introduce the following sets:}

• St is the set of elements ofS of degree≤t, • hS_i= P

f∈Sλff |f ∈S, λf ∈K is the linear span ofS, • (S) = P

f∈Spff |pf ∈K[x], f ∈S is the ideal in K[x]generated by S, • hS_|t_i = P

f∈Stpff | pf ∈K[x]t−deg(f) is the vector space spanned by {x

α_f |f _∈ St,|α| ≤t−deg(f)},

• S+_:=_S_∪_x

1S∪. . .∪xnS is the prolongation of S by one degree, • ∂S:=S+_\_S _{is the border of}_S,

• S[t] _:= _St+times···+ _{is the result of applying} _t _{times the prolongation operator ‘}+_{’ on} _S,

withS[1] =S+ and, by convention, S[0]=S.

Therefore,St=S∩K[x]t, S[t]={xαf |f ∈S,|α| ≤t},hS|ti ⊆(S)∩K[x]t= (S)t, but the

inclusion may be strict.

If_{B ⊆ M} contains 1 then, for any monomial m_{∈ M}, there exists an integer k for which m_{∈ B}[k]_{. The}_B_-index_of _{m, denoted by}_δ

B(m), is defined as the smallest integer kfor which

m_{∈ B}[k]_.

A set of monomials_B is said to be connected to 1 if 1_{∈ B} and for every monomial m₆= 1

in_B,m=xi0m

′ _{for some}_i

0 ∈[1, n]andm′ ∈ B.

Given a vector spaceE_⊆K_{, its prolongation}_E+ _:=_E₊_x₁_E₊_{. . .}₊_x_n_E _{is again a vector} space.

The vector spaceE is said to beconnected to 1 if 1_∈E and any non-constant polynomial p_∈E can be written as p =p0+Pni=1xipi for some polynomials p0, pi ∈E withdeg(p0)≤ deg(p),deg(pi)≤deg(p)−1for i∈[1, n].

1_{For notational simplicity, we will consider only these two fields in this paper, but}_R_and_C_{can be replaced}

(5)

Obviously, E is connected to 1 when E = _hCi for some monomial set _{C ⊆ M} which is connected to 1. Moreover, E+=_hC+_i if E=_hCi.

Given an ideal I _⊆K_[_x_]_{and a field} L_⊇K_{, we denote by} VL(I) :=_{x_∈Ln_|_f₍_x_{) = 0} _∀_f _∈_I_}

its associated variety in Ln_{. By convention} _V₍_I_{) =}_V_K₍_I₎_{. For a set} _V _⊆Kn_{, we define its} vanishing ideal I(V) :=_{f _∈K_[_x_]_|_f₍_v_{) = 0}_∀_v_∈_V_}_. Furthermore, we denote by √ I :=_{f _∈K_[_x_] _| _fm_∈_I _{for some} _m_∈N_{\ {}₀_}} the radical of I.

For K₌ R_{, we have} _V₍_I_{) =} _V_C(_I₎_{, but one may also be interested in the subset of real} solutions, namely the real variety VR(I) = V(I)_∩Rn_. _{The corresponding vanishing ideal is} I(VR(I))and the real radical ideal is

R √ I :=_{p_∈R_[_x_]_|_p2m₊X j q_j2_∈I for some qj ∈R[x], m∈N\ {0}}. Obviously, I _⊆√I _⊆I(VC(I)), I _⊆ √RI _⊆I(VR(I)).

An ideal I is said to be radical (resp., real radical) if I = √I (resp. I = √R

I). Obviously, I _⊆ I(V(I)) _⊆ I(VR(I)). Hence, if I _⊆ R _{is real radical, then} _I _{is radical and moreover,} V(I) =VR(I)_⊆Rn_if _|_V_R(_I₎_|_<_∞_.

The following two famous theorems relate vanishing and radical ideals: Theorem 2.1.

(i) Hilbert’s Nullstellensatz(see, e.g., [4, §4.1])√I =I(VC(I))for an idealI _⊆C_[_x_]_. (ii) Real Nullstellensatz (see, e.g.,[1, §4.1]) √R_I ₌_I₍_V_R(_I₎₎ _{for an ideal}_I

⊆R_[_x_]_. 2.2. The roots from the quotient algebra structure. Given an ideal I _⊆ K_[_x_]_{, the} quotient set K_[_x_]_/I _{consists of all cosets} _[_f_{] :=}_f ₊_I ₌_{_f ₊_q _| _q _∈ _I_} _for _f _∈K_[_x_]_{, i.e.,} all equivalent classes of polynomials of K_[_x_] _{modulo the ideal} _I_{. The quotient set} K_[_x_]_/I is an algebra with addition [f] + [g] := [f +g], scalar multiplication λ[f] := [λf] and with multiplication [f][g] := [f g], for λ_∈R_,_{f, g}_∈K_[_x_]_.

We will say that an idealI is zero-dimensional if 0<_|VK(I)|<∞. ThenK[x]/I is a

finite-dimensional vector space and its dimension is the number of roots counted with multiciplicity (see e.g. [4], [6]). Thus _|VK(I)| ≤dimKK[x]/I, with equality if and only if I is radical.

Assume that 0 < _|VK(I)| < ∞ and set N := dimKK[x]/I so that |VK(I)| ≤ N < ∞.

Consider a set _B := _{b1, . . . , bN} ⊆ K[x] for which {[b1], . . . ,[bN]} is a basis of K[x]/I; by

abuse of language we also say that _B itself is a basis of K_[_x_]_/I_{. Then every} _f _∈ K_[_x_] _can be written in a unique way as f = PN

i=1cibi +p, where ci ∈ K, p ∈ I; the polynomial

πI,B(f) := PN_i₌₁cibi is called the remainder of f modulo I, or itsnormal form, with respect

to the basis _B. In other words,_hBi andK_[_x_]_/I _{are isomorphic vector spaces.} Given a polynomial h_∈K_[_x_]_{, we can define the} _{multiplication (by} _h_{) operator} _as

(1) Mh : K[x]/I −→ K[x]/I

[f] _{7−→ M}h([f]) := [hf],

Assume that N := dimKK_[_x_]_{/I <} _∞_{. Then the multiplication operator} _M_h _{can be} rep-resented by its matrix, again denoted _Mh for simplicity, with respect to a given basis B=_{b1, . . . , bN} of K[x]/I.

(6)

Namely, settingπI,B(hbj) :=PN_i₌₁aijbi for some scalarsaij ∈K, the jth column ofMh is

the vector (aij)Ni=1.

Define the vector ζB,v := (bj(v))Nj=1 ∈ K

N

, whose coordinates are the evaluations of the polynomials bj ∈ B at the point v ∈ K

n

. The following famous result (see e.g., [3, Chapter 2§4], [6]) relates the eigenvalues of the multiplication operators in K_[_x_]_/I _{to the algebraic} variety V(I). This result underlies the so-called eigenvalue method for solving polynomial equations [7] and plays a central role in many algorithms, also in the present paper.

Theorem 2.2. Let I be a zero-dimensional ideal inK_[_x_]_,_B _{a basis of} K_[_x_]_/I_{, and} _h_∈K_[_x_]_. The eigenvalues of the multiplication operator_Mh are the evaluationsh(v)of the polynomialh

at the pointsv _∈V(I). Moreover, (_Mh)TζB,v=h(v)ζB,v and the set of common eigenvectors

of (_Mh)h∈K_[_x_] are up to a non-zero scalar multiple the vectors ζB,v for v∈V(I).

Throughout the paper we also denote by _Mi := Mxi the matrix of the multiplication

operator by the variable xi. By the above theorem, the eigenvalues of the matrices Mi are

theith coordinates of the points v_∈V(I). Thus the task of solving a system of polynomial equations is reduced to a task of numerical linear algebra once a basis ofK_[_x_]_/I _{and a normal} form algorithm are available, permitting the construction of the multiplication matrices_Mi.

2.3. Border bases. The eigenvalue method for solving polynomial equations from the above section requires the knowledge of a basis of K_[_x_]_/I _{and an algorithm to compute the normal} form of a polynomial with respect to this basis. In this section we will recall a general method for computing such a basis and a method to reduce polynomials to their normal form.

Throughout_{B ⊆ M}is a finite set of monomials.

Definition 2.3. A rewriting family F for a (monomial) set _B is a set of polynomials F =

{fi}i∈I such that • supp(fi)⊆ B+,

• fi has exactly one monomial in∂B, denoted as γ(fi)and called the leading monomial

of fi. (The polynomial fi is normalized so that the coefficient of γ(fi) is 1.) • if γ(fi) =γ(fj) then i=j.

Definition 2.4. We say that the rewriting family F is graded if deg(γ(f)) = deg(f) for all f _∈F.

Definition 2.5. A rewriting familyF for _B is said to be complete in degree t if it is graded and satisfies(∂_B)t⊆γ(F); that is, each monomialm∈∂B of degree at most t is the leading

monomial of some (necessarily unique) f _∈F.

Definition 2.6. Let F be a rewriting family for _B, complete in degree t. Let πF,B be the

projection on_hBialongF defined recursively on the monomialsm_{∈ M}t in the following way: • if m_{∈ B}t, then πF,B(m) =m,

• if m _∈ (∂_B)t (= (B[1] \ B[0])t), then πF,B(m) = m −f, where f is the (unique)

polynomial in F for which γ(f) =m,

• if m _∈ (_B[k]_{\ B}[k−1]₎

t for some integer k ≥ 2, write m = xi0m

′_{, where} _m′

∈ B[k−1]

and i0 ∈[1, n] is the smallest possible variable index for which such a decomposition

exists, then πF,B(m) =πF,B(xi0πF,B(m

′₎₎_.

If F is a graded rewriting family, one can easily verify that deg(πF,B(m)) ≤ deg(m) for

m _{∈ M}t. The map πF,B extends by linearity to a linear map from K[x]t onto hBit. By

construction,f =γ(f)₋πF,B(γ(f))andπF,B(f) = 0for allf ∈Ft. The next theorems show

that, under some natural commutativity condition, the map πF,B coincides with the linear

projection fromK_[_x_]_t_onto_hBi_t_{along the vector space} _h_F_|_t_i_{. It leads to the notion of border} basis.

(7)

Definition 2.7. Let _{B ⊂ M} be connected to 1. A familyF _⊂K_[_x_] _{is a border basis for} _B _if it is a rewriting family for _B, complete in all degrees, and such that K_[_x_{] =}_{hBi ⊕}₍_F₎_.

An algorithmic way to check that we have a border basis is based on the following result, that we recall from [21]:

Theorem 2.8. Assume that_Bis connected to1and letF be a rewriting family for_B, complete in degree t_∈N_{. Suppose that, for all} _m_{∈ M}_t₋₂_,

(2) πF,B(xiπF,B(xjm)) =πF,B(xjπF,B(xim)) for all i, j ∈[1, n].

Then πF,B coincides with the linear projection of K[x]t on hBit along the vector space hF|ti;

that is, K_[_x_]_t₌_hBi_t_{⊕ h}_F_|_t_i_.

In order to have a simple test and effective way to test the commutation relations (2), we introduce now the commutation polynomials.

Definition 2.9. Let F be a rewriting family andf, f′

∈F. Let m, m′ _{be the smallest degree}

monomials for which m γ(f) = m′γ(f′). Then the polynomial C(f, f′) := mf ₋m′f′ =

m′_π

F,B(f′)−mπF,B(f) is called the commutation polynomial of f, f′.

Definition 2.10. For a rewriting family F with respecet to _B, we denote by C+₍_F₎ _{the set}

of polynomials of the form m f ₋m′f′, wheref, f′_∈F and m, m′

∈ {0,1, x1, . . . , xn} satisfy • either m γ(f) =m′_γ₍_f′₎_, • or m γ(f)_{∈ B} andm′_{= 0}_.

Therefore, C+₍_F₎ _{⊂ hB}+_i _and _C+₍_F₎ _{contains all commutation polynomials} _C₍_{f, f}′₎ _for

f, f′

∈Fwhose monomial multipliersm, m′_{are of degree}

≤1. The next result can be deduced using Theorem 2.8.

Theorem 2.11. Let_{B ⊂ M}be connected to1 and letF be a rewriting family for_B, complete in degree t. If for all c_∈C+(F) of degree _≤t, πF,B(c) = 0, then πF,B is the projection of Kt

on _hBit along hF|ti, ie. Kt=hBit⊕ hF|ti.

If such a property is satisfied we say that F is a border basis for B in degree _≤t.

Theorem 2.12 (border basis, [21]). Let _{B ⊂ M} be connected to 1 and let F be a rewriting family for _B, complete in any degree. Assume that πF,B(c) = 0 for all c∈C+(F).Then B is

a basis of K_[_x_]_/₍_F₎_, K ₌_{hBi ⊕}₍_F₎_{, and} ₍_F₎_t ₌_h_F_|_t_i _{for all} _t_∈ N_{; the set} _F _{is a} _border basis of the ideal I = (F) with respect to_B.

This implies the following characterization of border bases using the commutation property. Corollary 2.13 (border basis, [20]). Let _{B ⊂ M} be connected to 1 and let F be a rewriting family for _B, complete in any degree. If for all m_{∈ B} and all indices i, j_∈[1, n], we have:

πF,B(xiπF,B(xjm)) =πF,B(xjπF,B(xim)),

then _B is a basis ofK_/₍_F₎_, K₌_{hBi ⊕}₍_F₎_{, and} ₍_F₎_t₌_h_F_|_t_i _{for all} _t_∈N_.

2.4. Hankel operators and positive linear forms. This section is based on [13].

Definition 2.14. For Λ_∈R_[_x_]∗_{, the Hankel operator} _H_Λ _{is the operator from}R_[_x_]_to R_[_x_]∗ defined by

(3) HΛ:p∈R[x]7→p·Λ∈R[x]∗

Definition 2.15. We define the kernel of the Hankel operator:

(8)

To analyse the properties of Λ, we study the quotient algebra K_[_x_]_/_ker_H_Λ ₌_A_Λ_{. A first} result is the following (see e.g. [13]):

Lemma 2.16. The rank of the operatorHΛis finite if and only ifkerHΛis a zero-dimensional

ideal, in which casedimK_[_x_]_/_ker_H_Λ₌_rankH_Λ_.

For a zero-dimensional idealI _⊂K_[_x_]_{with simple zeros}_V₍_I_{) =}_{_ζ₁_{, . . . , ζ}_r_{} ⊂}Kn_{, we have} I⊥ ₌

h1ζ1, . . . ,1ζri and the ideal I is radical as a consequence of Hilbert’s Nullstellensatz.

WhenI = kerHΛ, this yields the following property.

Proposition 2.17. Let K₌C _{and assume that} _rankH_Λ₌_{r <} _∞_{. Then, the ideal} _ker_H_Λ is radical if and only if

(5) Λ = r X i=1 λi1ζi withλi ∈K− {0} and ζi ∈K n _{pairwise distinct}_,

in which case kerHΛ=I(ζ1, . . . , ζr) is the vanishing ideal of the points ζ1, . . . , ζr.

As a corollary, we deduce that whenK₌R_and _rankH_Λ₌_{r <}_∞_{, the ideal}_ker_H_Λ _{is real} radical if and only if the pointsζi are inRn.

Definition 2.18. We say that Λ _∈ R_[_x_]∗ _{is positive, which we denote} _Λ _< ₀_{, if} _Λ(_p2₎ _≥ ₀

for all p_∈ R_[_x_]_{. We have that} _Λ <0 iff HΛ <0. If moreover Λ(1) = 1, we say that Λ is a

probability measure.

This term is justified by a theorem of Riesz-Haviland [28], which states that a Lebesgue measure onRn_{is uniquely determined by its value on the polynomials}_∈R_[_x_]_{. In particular, if}

Λ<0and Λ(1) = 1, there exists a unique probability measureµsuch that _∀p_∈R_[_x_]_,_Λ(_p_{) =}

R

p dµ.

An important property of positive forms is the following:

Proposition 2.19. AssumerankHΛ=r <∞. ThenΛ<0 if and only if Λ has a

decompo-sition (5) with λi >0 and distinct ζi∈Rn, in which case V(kerHΛ) ={ζ1, . . . , ζr} ⊂Rn.

In particular, it shows that ifΛ<0, then kerHΛ is a real radical ideal.

3. Main Results

In this section, we give the main results which shows how the minimum off and the ideal defining the points where this minimum is reached, can be computed.

Hereafter, we will assume that the minimum f∗ _of_f _{is reached at a point}

x∗ ∈Rn_. Definition 3.1. We define the gradient ideal off(x):

(6) Igrad(f) = (∇f(x)) = ( ∂f ∂x1 , .., ∂f ∂xn ). Definition 3.2. We define the minimizer ideal of f(x):

(7) Imin(f) =I(x∗ ∈Rn s.t. f(x∗) is minimum).

By construction, Igrad(f) ⊂ Imin(f) and Imin(f) 6= (1) if the minimum f∗ is reached in

Rn_{. The objective of this section is to describe a method to compute generators of} _I_min₍_f₎ from generators of Igrad(f). For that purpose, first of all we need to restrict our analysis to

matrices of finite size. For this reason, we consider here truncated Hankel operators, which will play a central role in the construction of the minimizer ideal off.

Definition 3.3. For a vector spaceE _⊂R_[_x_]_{, let}_E_·_E _:=_{_{p q} _|_{p, q}_∈_E_}_{. For a linear form}

Λ _{∈ h}E_·E_i∗_{, we define the map} _HE

Λ :E → E∗ by HΛE(p) = p·Λ for p ∈ E. Thus HΛE is

(9)

Definition 3.4. We define the kernel of the truncated Hankel operator:

(8) kerH_ΛE =_{p_∈E_|p_·Λ = 0, i.e, Λ(pq) = 0 _∀q_∈E_}.

When E=R_[_x_]_t _for _t_∈N_,_HE

Λ is also denotedHΛt.

Given a subspace E0 ⊂ E, Λ induces a linear map on hE0·E0i and we can consider the

induced truncated Hankel operator HE0

Λ :E0 →(E0)

∗

as a restriction of H_ΛE.

Below we will deal with the following sets and minimums, in order to define our primal-dual problems.

Definition 3.5. Given a vector space E _⊂R_[_x_] _and_G_{⊂ h}_E_·_E_i_{, we define} (9) _LG,E,< :={Λ∈ hE·Ei∗|Λ⊥G, Λ(1) = 1, Λ(p2)≥0 ∀p∈E}.

If E=R_[_x_]_t _and _G′ ₌_h_G_|₂_t_i_{, we also denote} _L_G′_,E,< by LG,t,<.

Notice that if G _⊂ G′ _and _E

⊂ E′ _then

LG′_,E′_,< ⊂ LG,E,<. When E and G are vector

spaces of finite dimension, _LG,E,< is the intersection of the closed convex cone of semi-definite

positive quadratic forms onE_×E with a linear space, thus it is a convex closed semi-algebraic set. More details on its description will be given in Section 4. We will need the following result:

Lemma 3.6. For any vector space E andG=_{0_} and Λ,Λ′

∈ LG,E,<, we have:

• ∀p_∈E, Λ(p2) = 0 implies p_∈kerHE

Λ.

• kerHE

Λ+Λ′ = kerH_ΛE∩kerH_ΛE′.

Proof. The first point is a consequence of the positivity ofHE

Λ.

For the second point, _∀p, q _∈E, _∀t_∈ R_,_Λ((_p₊_tq₎2_{) =} _t2_Λ(_q2_{) + 2}_t_Λ(_{p q}₎ _≥₀_{. Dividing} by t and letting t go to zero yields Λ(p q) = 0, thus showing p _∈ kerHE

Λ. The inclusion kerH_ΛE_∩kerH_ΛE′ ⊂kerH_Λ+ΛE ′ is obvious.

Conversely, let p _∈ kerHΛ+Λ′. In particular, (Λ + Λ′)(p2) = 0, which implies Λ(p2) =

Λ′₍_p2_{) = 0}_(since _Λ(_p2₎_,_Λ′₍_p2₎_≥₀_{) and thus}_p_∈_ker_H

Λ∩kerHΛ′.

Definition 3.7. Given a vector space E _⊂R_[_x_] _and_G_{⊂ h}_E_·_E_i_{, we define} (10) _SG,E :={ p∈R[x]|p= s X i=1 h2_i +h, hi ∈E, h∈G}. If E=R_[_x_]_t _and _G′ ₌ hG_|2t_i, we also denote _SG′_,E by S_G,t.

Notice that if G_⊂G′ _and _E

⊂E′ _then

SG,E ⊂ SG′_,E′. When E and Gare vector spaces

of finite dimension, _SG,E is the projection of the sum of a linear space and the convex cone

of positive quadratic forms on E∗ ×E∗_.

Definition 3.8. Let E be a subspace of R_[_x_] _{such that} ₁ _∈ _E _and _f _{∈ h}_E _·_E_i _{and let} G_{⊂ h}E_·E_i. We assume that f attains its minimum at some pointsx∗ ∈Rn_{. We define the} following extrema:

• f∗_{= min}

x∈Rnf(x),

• f_G,Eµ = inf _{Λ(f)2 s.t. Λ_{∈ L}G,E,<},

• fsos

G,E = sup{λ∈R s.t. f −λ∈ SG,E}.

If E =R_[_x_]_t _(resp. _E ₌R_[_x_]_{) and} _G′ ₌_h_G_|₂_t_i _{we also denote}_fµ

G′_,E by f µ G,t (resp. f µ G) and fsos

G′_,E by f_G,tsos (resp. f_Gsos).

Remark 3.9. By convention if the sets are empty, fsos

G,E =−∞ and f µ

G,E= +∞. 2_{the notation}µ_{stands for measure}

(10)

We now analyse the relations between these different extrema. Remark 3.10. Let E _⊂ E′ _{be two subspaces of} _R_[

x] and G _{⊂ h}E_·E_i, G′ ⊂ hE′

·E′ i with G_⊂G′ _{then we directly deduce that}

• f_G,Eµ _≤f_Gµ′_,E′, • fsos

G,E ≤fGsos′,E′,

from the fact that_LG′_,E′_,<⊂ LG,E,< and SG,E ⊂ SG′_,E′.

If we takeG_⊂Imin(f), we have the following relations between the extrema.

Proposition 3.11. Let G_⊂Imin(f). Then fG,Esos ≤f µ G,E ≤f

∗_.

Proof. We havef_G,Esos _≤f_G,Eµ because if there existsλ_∈R_{such that}_f₋_λ_{∈ S}_G,E_{, i.e.} _f₋_λ₌

P

ih2i +g withhi ∈E and g ∈Gthen ∀Λ∈ LG,E,<,Λ(f −λ) = Λ(f)−λ=P_iΛ(h2_i)≥0.

We deduce thatΛ(f)_≥λand we concludefsos G,E ≤f

µ

G,E. For the second inequality, let x ∗ _be

a point of Rn _{such that} _f₍_x∗₎ _{is the mininum of} _f _{and let} ₁_x∗ ∈R[x]∗ :p 7−→ p(x∗) be the

evaluation atx∗. Then we have H₁

x∗ <0,1x∗(1) = 1and1x∗(G) = 0since G⊂Imin(f), so

that1x∗ ∈ L_G,E,_<.We deduce the inequalityf_G,Eµ ≤f∗.

Following the relaxation approach proposed in [14], we are now going to consider a hierarchy of convex optimization problems and show that for such hierarchy, the minimum f∗ _{is always}

reached in a finite number of steps. Let us consider the sequence of spaces

· · · ⊂ LG,t+1,< ⊂ LG,t,<⊂ · · · and · · · ⊂ SG,t ⊂ SG,t+1⊂ · · ·

fort_∈N_,_G_⊂_I_min₍_f₎_{. Using Remark 3.10 and the fact that}₍_G_{) =}_∪_t_∈_N_h_G_|_t_i_{, we check that}

• the increasing sequence_{· · ·}f_G,tµ _≤f_G,tµ ₊₁_{≤ · · · ≤}f∗_{, f or t}

∈N_{converges to}_fµ

(G)≤f

∗_, • the increasing sequence_{· · ·}fsos

G,t ≤fG,tsos+1≤ · · · ≤f∗, f or t∈Nconverges tof(sosG)≤f

∗_.

We are going to show that these limits are attained for some t_∈N_.

The next result which is a slight variation of a result in [15] (and also used in [13]) shows that for a high enough degree, the kernel of some truncated Hankel operators allows us to compute generators of the real radical of an ideal.

Proposition 3.12. For G_⊂R_[_x_]_with _I_grad₍_f₎_⊂₍_G₎_{, there exists} _t₀ _∈N _{such that} _∀_t_≥_t₀_,

∀Λ_{∈ L}G,t,<, R

p

Igrad(f)⊂(kerH_Λt∗).

Proof. Let g1, . . . , gn be generators of I := Igrad(f), ds := deg(gs), d := maxs=1,...,nds and

h1, . . . , hk be generators of the ideal J := R √

I. By the Real Nullstellensatz, for l_∈ 1, . . . , k, there existml∈N, ml ≥1and polynomialsu(

l)

j and sums of squaresσlsuch thath2lml+σl= Pn

j=1u (l)

j gj. AsI ⊂(G), there exist t′0 such that u (l)

j gj ∈ hG|t′0i. Set t0 := maxl≤k,j≤n(t′0,

d, deg(h2ml

l ), deg(σl)) and let t ≥ t0. As u

(l)

j gj ∈ hG|ti, uj(l)gj ∈ kerH_Λt for all Λ ∈ LG,t,<.

Henceh2ml

l +σl∈kerHΛt∗, which implies that hl∈kerH_Λt since H_Λt∗ <0.

To compute generators ofImin(f), we use the decomposition ofVR(Igrad(f))in components

where f has a constant value.

Lemma 3.13. LetVR(Igrad(f)) =W0∪W1∪. . .∪Ws be the decomposition of the variety in

disjoint real subvarieties, such that f(Wj) =fj ∈R withfi < fj ∀0≤i < j ≤s. Then there

exist polynomials p0, . . . , pr ∈ R[x] such that pi(Wj) = δij, where δij is the Kronecker delta

function.

Proof. As in [25], we decomposeVC(Igrad(f)) as an union of complex varieties

VC(Igrad(f)) = WC,0∪WC,1 ∪. . .∪WC,s ∪WC,s+1 such that WC,i, i = 0, . . . , s have real

(11)

these varieties so that f0 < f1 < · · · < fs and f0 = f∗. By construction, the varieties

Wi := WC,i∩Rn are disjoint, f is constant on Wi and VR(Igrad(f)) = W0∪W1∪. . .∪Ws.

Let us take pi=Li(f(x))whereL0, . . . , Lsare the Lagrange interpolation polynomials at the

value f0, . . . , fn∈R. They satisfy pi(Wj) =δij.

Remark 3.14. By definition the polynomials pi have the following properties: • p0+. . .+ps≡1 modulo R p Igrad(f). • (pi)2≡pi modulo R p Igrad(f) ∀i= 0, . . . , s.

The next results shows that in the sequence of optimization problems that we consider, the minimum of f is reached from a certain degree.

Theorem 3.15. For G_⊂R_[_x_] _with _I_grad₍_f₎_⊂₍_G₎ _⊂_I_min₍_f₎_{, there exists} _t₁ _≥₀ _{such that}

∀t _≥t1, fG,tsos = f µ G,t = f

∗ _and ∀Λ∗

∈ LG,t,< with Λ∗(f) = f_G,tµ , we have pi ∈ kerHΛt∗ ∀i =

1, . . . , s.

Proof. By [24][Theorem 2.3], there existst′₁ _∈ N _{such that} _∀_t _≥_t′

1, fD,tsos =f µ D,t = f ∗ where D = _{_∂x∂f 1, .., ∂f

∂xn}. As (D) = Igrad(f) ⊂ (G) = (g1, . . . , gs), there exists qi,j ∈ R[x] such

that _∂x∂f

i =

Ps

j=1qi,jgj. Let d = max{deg(qi,j), i = 1, . . . , n, j = 1, . . . , s}. Then hD|ti ⊂ hG_|t+d_i. By Remark 3.10 and Proposition 3.11, for t_≥t′

1, f∗=fD,tsos ≤fG,tsos+d≤f µ G,t+d≤ f∗_{. Thus for} _t ≥t′ 1+d, we have (11) fG,tsos=f µ G,t=f ∗ . Let J = pR _I

grad(f). By Lemma 3.13 and Remark 3.14, we can write

(12) f _≡

s X

i=0

fip2i modulo J,

where fi=f(Wi)∈R andf0 =f(W0) =f∗ . Then

(13) f _≡f∗p2₀+ s X i=1 fip2i ≡f ∗ (1₋ s X r=1 p2_r) + s X i=1 fip2i ≡f ∗ + s X i=1 (fi−f∗)p2i modulo J. Hence (14) f₋f∗ _≡ s X i=1 (fi−f∗)p2i +h.

withh_∈J. By Theorem 3.12, there existst′′

1 ≥t0 such thatΛ(h) = 0for allΛ∈kerLG,t′′

1,<.

Let us take t1 = max{t′1 +d, t′′1,deg(p1), . . . ,deg(ps)}, t ≥ t1 and Λ∗ ∈ LG,t,< such that Λ∗₍_f_{) =}_fµ

G,t. From Equation (11), we deduce that

(15) Λ∗(f ₋f∗) = 0 =

s X

i=1

(fi−f∗)Λ∗(p2i)

This implies that Λ∗₍_p2

i) = 0and pi ∈kerH_Λt∗ for i= 1, . . . , s, since fi−f∗>0 andΛ∗ <0

on R_[_x_]_t_.

The example 3.4 in [25] shows that it may not always be possible to writef₋f∗ _{as a sum of}

squares modulo Igrad(f)but, from the previous result we see that f−f∗ is the limit of sums

of squares modulo Igrad(f)∩R[x]tfor a fixed t≥t1. Under the assumption that there exists

x∗ ∈Rn _{such that} _f₍_x∗_{) =} _f∗_{, we can construct} _Λ∗ ₌ ₁_x∗ ∈ L_G,t,< such that Λ∗(f) = f∗.

(12)

A direct corollary of the previous theorem is that for anyG_⊂R_[_x_]_with_I_grad₍_f₎_⊂₍_G₎_⊂ Imin(f), we have f_Isos grad(f)=f sos (G)=f µ Igrad(f) =f µ (G)=f ∗ . As for the construction of generators of the real radical pR _I

grad(f) (Proposition 3.12), we

can construct generators ofImin(f)from the kernel of a truncated Hankel operator associated

to any linear form which minimizes f:

Theorem 3.16. For G_⊂R_[_x_] _with_I_grad₍_f₎ _⊂₍_G₎ _⊂_I_min₍_f₎_{, there exists} _t₂ _∈N _{such that}

∀t_≥t2, forΛ∗∈ LG,t,< with Λ∗(f) =f_G,tµ , we have Imin(f)⊂(kerH_Λt∗).

Proof. Let I =Igrad(f) and J = R √

I. We consider again the above decomposition VR(I) =

W0 ∪W1∪. . .∪Ws with f0 = f(W0) < f1 = f(W1) < · · · < fs = f(Ws). We denote by

h1, . . . , hka family of generators of the idealImin(f) =I(W0)andd= max{deg(h1), . . . ,deg(hk)}.

Let us fix 0 _≤ j _≤ k and show that hj ∈ kerH_Λt∗ for t sufficiently large. We know that

p0h2j |W0= 0and that fori= 1, . . . , s,p0h

2

j |Wi= 0which implies thatp0h

2

j ∈J := R p

Igrad(f).

By Theorem 3.12, there existst′

2,j such that Λ(p0h2j) = 0 for allΛ∈ LG,t,< witht≥t′₂_,j.

By Theorem 3.15, pi ∈ kerH_Λt∗ for t ≥t1. By Remark 3.14, p0+. . .+ps ≡1 modulo J.

By Theorem 3.12, there exists t′′

2 such that Λ(p0+. . .+ps−1) = 0 for all Λ ∈ LG,t,< with

t_≥t′′

2.

Let us taket2 := max{t1+ 2d, t′′2+ 2d, t′2,1, . . . , t′2,k}andt≥t2. Then forΛ∗ ∈ LG,t,< with Λ∗₍_f_{) =}_fµ

G,t, we have

Λ∗(h2_j) = Λ∗(h2_jp0) + Λ∗(h2jp1) +. . .+ Λ∗(h2jps) = 0.

Hence, hj ∈kerH_Λt∗ for j= 1, . . . , k andImin(f)⊂(kerH_Λt∗).

We introduce now the notion ofgeneric linear form for f _∈R_[_x_]_{. Such a linear form will} allow us to compute Imin(f) as we will see.

Proposition 3.17. For Λ∗

∈ LG,E,< andp∈R[x], the following assertions are equivalent:

(i) rankHE

Λ∗ =max_Λ∈L_G,E,<,Λ(p)=pµ_G,ErankH E

Λ.

(ii) _∀Λ_{∈ L}G,E,< such that Λ(p) =p_G,Eµ , kerH_ΛE∗ ⊂kerH_ΛE.

(iii) rankHE0

Λ∗ =maxΛ∈LG,E,<,Λ(p)=pµ_G,ErankH E0

Λ for any subspace E0 ⊂E.

We say thatΛ∗

∈ LG,E,< is generic forpif it satisfies one of the equivalent conditions (i)-(iii).

Proof. (i)=_⇒(ii): Note that 1₂(Λ+Λ∗₎

∈ LG,E,<and 1₂(Λ+Λ∗)(p) =pµ_G,EandkerHE1 2(Λ+Λ∗)

= kerHE

Λ ∩kerHΛE∗ (using Lemma 3.6). Hence, rankHE1

2(Λ+Λ∗) ≥

rankHE

Λ∗ and thus equality

holds. This implies thatkerHE

1 2(Λ+Λ∗)

= kerHE

Λ∗ is thus contained inkerH_ΛE.

(ii) =_⇒ (iii): Given E0 ⊂ E, we show that kerHΛE∗0 ⊂ kerH E0

Λ . By Lemma 3.6, we have kerHE0

Λ∗ ⊂kerH_ΛE∗ and, by the above, we have kerH_ΛE∗ ⊂kerH_ΛE.

(iii)=_⇒(i) This implication is obvious.

The next result, which refines Theorem 3.16, shows only elements in Imin(f) are involved

in the kernel of a truncated Hankel operator associated to a generic linear form for f. Theorem 3.18. LetE, Gbe as in Definition 3.8 withG_⊂Imin(f). IfΛ∗ ∈ LG,E,<is generic

for f and such that Λ∗₍_f_{) =}_f∗_{, then}_ker_HE

Λ∗⊂Imin(f).

Proof. Let x∗ ∈ Rn _{such that} _f₍_x∗_{) =} _f∗ _{is minimum. Let} ₁

x∗ denotes the evaluation

at x∗ restricted to _hE_·E_i and h _∈ kerHE

Λ∗. Our objective is to show that h(x

∗_{) = 0}_.

(13)

and 1_x∗(f) = f(x∗) = f∗. We define Λ =˜ 1₂(Λ∗ +1_x∗). By definition Λ˜ ∈ LG,E,< and ˜ Λ(f) = 1₂(Λ∗₍_f_{) +} 1_x∗(f)) = 1₂(Λ∗(f) +f(x∗)) =f∗. As h∈kerH_ΛE∗, ˜ Λ(h2) = 1 2(Λ ∗ (h2) +1_x∗(h2)) = 1 2h 2₍ x∗)₆= 0 thus h _∈ kerHE

Λ∗ \kerH_ΛE_˜ and by the maximality of the rank of H_ΛE∗, kerH_ΛE_˜ 6⊂ kerH_ΛE∗.

Hence there exits ˜h _∈ kerHE

˜ Λ \ kerH E Λ∗. Then 0 = H_ΛE_˜(˜h) = ₂1(H_ΛE∗(˜h) +H₁E x∗(˜h)) = 1 2(HΛE∗(˜h) + ˜h(x ∗₎ ·1_x∗). AsH_ΛE∗(˜h)6= 0implies ˜h(x ∗₎ 6

= 0. On the other hand,

0 =H_ΛE_˜(˜h)(h) = ˜Λ(hh˜) = 1 2(Λ ∗ (hh˜) +h(x∗)˜h(x∗)) = 1 2(H E Λ∗(h)(˜h) +h(x∗)˜h(x∗)). As h_∈HE Λ∗, then 0 =H_ΛE_˜(˜h)(h) = 1 2(h(x ∗ )˜h(x∗)).

As ˜h(x∗)₆= 0, since we have supposed that h(x∗)₆= 0 it yields a contradiction. The last result of this section shows that a generic linear form for f yields the generators of the minimizer ideal Imin(f) in high enough degree.

Theorem 3.19. For G_⊂R_[_x_]_with_I_grad₍_f₎_⊂₍_G₎_⊂_I_min₍_f₎_{, there exists}_t₂ _∈N_{(defined in} Theorem 3.16) such that such that _∀t_≥t2, for Λ∗ ∈ LG,t,< generic for f, we haveΛ∗(f) =f∗

and (kerHt

Λ∗) =Imin(f).

Proof. We obtain the result as consequence of Theorem 3.15, Theorem 3.16 and Theorem

3.18.

As a consequence, in a finite number of steps, the sequence of optimization problems that we consider gives the minimum of f and the generators of Imin(f).

4. Zero-dimensional Case

In this section, we describe a criterion to detect when the kernel of a truncated Hankel operator associated to a generic linear form for f yields the generators of the minimizer ideal. It is based on a flat extension property [18] and applies to global polynomial optimization problems where the minimizer ideal Imin(f) is zero-dimensional.

Definition 4.1. Given vector subspaces E0 ⊂E ⊂K[x] and Λ∈ hE·Ei∗, HΛE is said to be

a flat extension of its restriction HE0

Λ if rankHΛE =rankH

E0

Λ .

We recall here a result from [18], which gives a rank condition for the existence of a flat extension of a truncated Hankel operator.

Theorem 4.2. Consider a monomial set B _{⊂ M} connected to 1 and a linear function Λ

defined on _hB+_·_B+_i_{. Let} _E ₌ _h_B_i _and _E+ ₌_h_B+_i_{. Assume that} _rank_HE+

Λ = rankHΛE =

|B_|. Then there exists a (unique) linear formΛ˜ _∈K_[_x_]∗ _{which extends}_Λ_{, i.e.,}_Λ(˜ _p_{) = Λ(}_p₎_for

all p_{∈ h}B+_·B+_i, satisfyingrankH_Λ˜ = rankHE

+

Λ . Moreover, we have kerH_Λ˜ = (kerHE

+

Λ ).

In other words, the condition rankHE+

Λ = rankHΛE = |B| implies that the truncated

Hankel operator HB+

Λ has a (unique) flat extension to a (full) Hankel operator HΛ˜.

Proposition 4.3. Let E, G be as in Definition 3.8 with G _⊂ Imin(f). If Λ∗ ∈ LG,E,<

coincides with a probability measure µ on _hE_·E_i and satisfies Λ∗₍_f_{) =}_fµ

G,E, then Λ ∗₍_f_{) =}

f_G,Eµ =f∗_.

Proof. By Proposition 3.11, f_G,Eµ _≤ f∗_{. Conversely as} _f₍_x₎

≥ f∗ _{for all} _x

∈ Rn_{, we have}

Λ∗(f) =R

f dµ_≥R

(14)

Proposition 4.4. If there exists Λ_{∈ L}G,E,< withkerH_ΛE ={0}, then f_G,Esos =f_G,Eµ .

Proof. IfkerHE

Λ ={0} implies thatΛ≻0, i.e,HΛE >0. Hence by Slater’s Theorem of [2] we

have the strong duality thenfsos G,E =f

µ

G,E.

Theorem 4.5. Let B is a monomial set connected to 1, E = _hB+_i _and _G _{⊂ h}_B+_·_B+_{i ∩}

Imin(f). Let Λ∗ ∈ LG,E,< such that Λ∗ is generic for f and satisfies the flat extension

property: rankHB+

Λ∗ = rankH_ΛB∗ =|B|. Then there is no duality gap, f∗ =f µ

G,E =fG,Esos and

(kerH_ΛB∗+) =Imin(f).

Proof. As rankHB+

Λ∗ = rankH_ΛB∗ = |B|, Theorem 4.2 implies that there exists a (unique)

linear functionΛ˜∗

∈K_[_x_]∗ _{which extends} _Λ∗_{. As}_rank_H_˜

Λ∗ = rankH_ΛB∗ =|B|and kerH_Λ˜∗ =

(kerH_ΛB+), any polynomialp_∈R_[_x_]_{can be reduced modulo} _ker_H_˜

Λ∗ to a polynomial b∈ hBi

so thatp₋b_∈kerH_Λ˜∗. ThenΛ˜

∗₍_p2_{) = ˜}_Λ∗₍_b2_{) = Λ}∗₍_b2₎_≥₀ _since _Λ∗

∈ LG,E,<. This implies

thatΛ˜∗ <0. By Proposition 2.19,Λ˜∗ has a decomposition Λ˜∗ =Pr

i=1λi1ζi withλi >0 and

ζi ∈Rn.

AsΛ˜∗_{(1) = Λ}∗_{(1) = 1}_,_Λ_˜∗ _{is a probability measure. By Proposition 4.3,} _Λ_˜∗₍_f_{) = Λ}∗₍_f_{) =}

f_G,Eµ =f∗_. As Λ˜∗ ₌ Pr i=1λi1ζi with λi > 0 and ζi ∈ R n _{and as} _Λ_˜∗_{(1) =} Pr i=1λi = 1 and ˜ Λ∗₍_f_{) =}Pr

i=1λif(ζi) =f∗, we deduce that f(ζi) =f∗ for i= 1, . . . , r so that{ζ1, . . . , ζr} ⊂

V(Imin(f)).

By Proposition 2.17, Theorem 4.2 and Theorem 3.18, we have

kerH_Λ˜∗ =I(ζ1, . . . , ζr) = (kerH B+

Λ∗ )⊂Imin(f).

We deduce that_{ζ1, . . . , ζr}=V(Imin(f))so that(kerHB

+

Λ∗ ) =Imin(f).

As we have the flat extension condition,kerH_ΛB∗={0}and by Proposition 4.4 there is not

duality gap: f∗₌_fµ

G,E=fG,Esos.

Notice that if the hypotheses of this theorem are satisfied, then necessarily Imin(f) is

zero-dimensional.

5. Minimizer border basis algorithm

In this section we describe the algorithm to compute the global minimum of a polynomial, assuming f∗ _{is reached in} _Rn _and _I

min(f) is zero dimensional. It can be seen as a type of

border basis algorithm, for we insert an additional step in the main loop. It is closely connected to the real radical border basis algorithm presented in [13] but instead of “minimizing zero” to generate new elements in the real radical, we minimize f to compute generators of the minimizer ideal Imin(f).

5.1. Description. The convex optimization problems that we consider are the following: Algorithm 5.1: Optimal Linear Form

Input: f _∈R_[_x_]_,_M _{= (}_xα₎_α_∈_A_{a monomial set containing} ₁_with

f =P

α∈A+Afαxα ∈ hM·Mi,G⊂R[x].

Output: the minimum f∗ G,M of

P

α∈A+Aλαfα subject to:

– H_ΛM∗ = (hα,β)α,β∈A<0,

– HM

Λ∗ satisfies the Hankel constraints h0,0 = 1, and hα,β =hα′_,β′ =λ_α₊_β

if α+β =α′₊_β′_, – Λ∗₍_g_{) =}P α∈A+Agαλα= 0 for all g= P α∈A+A gαxα ∈G∩ hM·Mi. andΛ∗

∈ hM _·M_i∗ _{represented by the vector} _[_λ

(15)

This optimization problem is a Semi-Definite Programming problem, corresponding to the optimization of a linear functional on a linear subspace of the cone of Positive Semi-Definite matrices. It is a convex optimization problem, which can be solved efficiently by SDP solvers. If an Interior Point Methods is used, the solution Λ∗ _{is in the interior of a face on which the}

minimum Λ∗₍_f₎_{is reached so that}_Λ∗ _{is generic for}_f_{. This is the case for tools such as}

csdp or sdpa, that we will use in the experimentations.

In the case, where M = B is a monomial set connected to 1, F is a complete rewriting family for B in degree _≤2t, and G=_{b₋πF,B(b);b∈B·B}, we will also denote Optimal Linear Form (f, Bt, πF,B):=Optimal Linear Form (f, Bt, G).

Algorithm 5.2: Minimizer Ideal of f

Input: A real polynomial function f withImin(f)6= (1) and zero-dimensional.

F :=_{_∂x∂f

1, ..,

∂f

∂xn};t:=⌈deg(f)/2⌉; B := set of monomials of degree ≤t;

˜

f :=_−∞; stop:=false; While not stop

(1) Compute the commuting relations forF with respect to B in degree _≤2t. (2) Reduce them by the existing relationsF.

(3) Add the non-zero reduced relations to F and updateB. (4) Let[f∗

F,Bt,Λ

∗_{] :=}_{Optimal Linear Form}₍_{f, B}

t, πF,Bt)

(a) If there is a duality gap then go to 1 witht:=t+ 1. (b) If f˜₆=f∗

F,Bt thenf˜:=f

∗

F,Bt; go to 1 with t:=t+ 1.

(c) Compute the border basis F′ _of _F_{+ ker}_HBt−1

Λ∗ and the basis B

′ _{in degree} ≤2 (t₋1). (d) If there exists b′ ∈B′ _with_deg(_b′₎ ≥t₋1then go to (1) with t:=t+ 1. (e) Let [f∗ F′_,B′,Λ

′_{] :=}_{Optimal Linear Form}₍_{f, B}′_{, π} F′_,B′). (f) If f∗ F′_,B′ = ˜f and kerH B′ t−1 Λ′ ={0} ⇒stop=true, F =F ′_,_B ₌_B′_,_f_˜₌_f∗ F,Bt−1. else go to 1 witht:=t+ 1.

Output: A basis B of _A=R_[_x_]_/I_min₍_f₎_{, a border basis} _F _of_I_min₍_f₎ _for _B _{and the} minimum f˜.

5.2. Correctness of the algorithm. In this subsection, we analyse the correctness of the algorithm.

Lemma 5.1. Let t _∈ N_, _B _⊂ R_[_x_]2_t _{be a monomial set connected to} ₁_, _F _⊂ R_[_x_] _{be a} border basis for B in degree _≤ 2t and G = _hBt·Bti ∩ hF|2ti. Let E ⊂ R[x]t be a vector

space containing B+_∩R_[_x_]_t _and _G′ ₌ _h_E_·_E_{i ∩ h}_F_|₂_t_i_{. For all} _Λ _{∈ L}_G,B

t,<, there exists

a unique Λ˜ _{∈ L}G′_,E,< which extends Λ. Moreover, Λ˜ satisfies rankH_ΛE_˜ = rankH_ΛBt and kerHE_˜

Λ = kerH

Bt

Λ +hF|ti ∩E.

Proof. Suppose that F _⊂R_[_x_] _{be a border basis for} _B _{in degree} _≤₂_{t, that is, all boundary} polynomials of C+(F2t) reduces to 0 by F2t. Then by Theorem 2.11, we have R[x]2t = hB_i2t⊕ hF|2ti and hE·Ei = hBi2t ⊕G′, hBt·Bti = hB2t∩Bt·Bti ⊕G. Thus for all

Λ _{∈ L}G,Bt,<, there exists a unique Λ˜ ∈ hE·Ei

∗ _{such that} _Λ(_˜ _G′_{) = 0} _and _Λ(_˜ _b_{) = Λ(}_b₎ _{for all}

b_∈Bt·Bt.

AsE_·(E_{∩ h}F_|t_i)_{⊂ h}E_·E_{i ∩ h}F_|2t_i=G′_{, we have}_Λ(_˜ _E

·(_hF_|t_{i ∩}E)) = 0so that

(16) _hF_|t_{i ∩}E _⊂kerH_ΛE_˜.

For any elementb_∈kerHBt

Λ we have∀b′ ∈Bt,Λ(b b′) = ˜Λ(b b′) = 0. AsΛ(˜ Bt·(hF|ti∩E)) =

(16)

(17) kerHBt

Λ ⊂kerH_ΛE˜.

Conversely asE=_hBti ⊕(hF|ti ∩E), any element ofE can be reduced modulohF|ti ∩E

to an element of_hBti, which shows that

(18) kerH_ΛE_˜ _⊂kerHBt

Λ +hF|ti ∩E.

From the inclusions (16), (17) and (18), we deduce that kerH_ΛE_˜ = kerHBt

Λ + (hF|ti ∩E)

and that rankHE

˜

Λ = rankH

Bt

Λ .

AsΛ <0, by projection along (_hF_|t_{i ∩}E) _⊂kerHE

˜

Λ,∀p ∈E there exists b ∈ hBti such

thatΛ(˜ p2) = Λ(b2)_≥0. Thus Λ˜ <0which ends the proof of this lemma.

Lemma 5.2. If the algorithm terminates, thenF′ _{is a border basis of} _I

min(f) for B′.

Proof. If the algorithme stops, this can happen only in step (4) in some degreet+ 1such that

• [f∗

F,Bt+1,Λ

∗_{] :=}_{Optimal Linear Form}₍_{f, B}

t+1, F),

• F′ _{is a border basis of} _F_{+ ker}_HBt

Λ∗ for B

′ _{in degree} ≤2t,

• the monomials inB′ _{are of degree} _{< t}_{so that} _B′ t=B′, • [f∗

F′,B′,Λ

′_{] :=}_{Optimal Linear Form}₍_{f, B}′_{, F}′₎ _with_f˜_{= Λ}′₍_f_{) =}_f∗

F′,B′ = Λ ∗₍_f_{) =} f∗ F,Bt andkerH B′ Λ′ ={0}. Then _hB′+_i ₌ _h_B′ i ⊕ hF′

i. By Lemma 5.1 with E = _hB′+_i_{, the linear form} _Λ′

∈ LF′_,B′_,<

can be extended to a linear formΛ′′

∈ LF′′,B′+_,_< withF′′ =hF′|2ti ∩ hB′+·B′+i such that

kerHB′+ Λ′′ = hF ′ |t_{i ∩ h}B+_i ₌ _h_F′ i. As kerHB′ Λ′ = {0} and hB ′+_i ₌ _h_B′ i ⊕ hF′ i, we have rankHB′+ Λ′′ = rankHB ′ Λ′ = |B ′

| and the flat extension theorem (Theorem 4.2) applies: Λ′′ _is

the restriction to _hB′ ·B′

iof a positive linear form

˜ Λ′ = r X i=1 λi1ζi withr =_|B′

|,ζi ∈ Rn distinct, λi > 0 and Pri=1λi = 1. Then Λ′(f) =Pri=1λif(ζi) ≥f∗.

By hypothesis, Λ′₍_f_{) =}_f∗

F2t,Bt ≤f

∗ _since _F

⊂Igrad(f) ⊂Imin(f). We deduce that Λ′(f) =

Λ∗₍_f_{) =}_f∗_{, that} ₍_F

2t+ kerH_ΛB∗t) = (F ′₎

⊂Imin(f) by Theorem 3.18 and that(F′) =Imin(f)

by Theorem 4.5. ThereforeF′ _{is a border basis of}_I

min(f) for B′.

Proposition 5.3. Assume thatImin(f) is zero-dimensional. Then the algorithm terminates.

It outputs a border basisF forBconnected to 1, such thatR_[_x_{] =}_h_B_i⊕₍_F₎_and₍_F_{) =}_I_min₍_f₎_. Proof. First, we are going to prove by contradiction that when Imin(f) is zero-dimensional,

the algorithm terminates. Suppose that the loop goes for ever. Notice that at each step either F is extended by adding new linearly independent polynomials or it moves to t+ 1. Since the number of linearly independent polynomials added toF in degree _≤2tis finite, there is a step in the loop from which F is not modified any more in degree _≤ 2 t. In this case, all boundary C-polynomials of elements ofF of degree_≤2tare reduced to0byF2t andF2tis a

border basis forB2t in degree≤2t. By Lemma 5.1 with E=Rt,Λ∗ extends to a linear form

˜ Λ∗ ∈R_[_x_]∗ 2tsuch that Kt:= kerHt_˜ Λ∗ = kerH Bt Λ∗ +hF|ti.

By Theorem 3.19, for t _≥ t2 we have (Kt) = Imin(f) and Λ∗(f) = f∗. As Imin(f) is

zero-dimensional, for t high enough, a border basis F′ _of _Kt _{in degree} _≤ ₂_t _{is a border basis of}

(17)

of minimizers (ie. the points in V(Imin(f))). By Lemma 5.1 with E =hB′+i and Theorem

4.5, any linear form Λ′

∈ LF′_,B,_< generic forf is the restriction of a positive linear form

˜ Λ′= r X i=1 λi1ζi

with _{ζ1, . . . , ζr} = V(Imin(f)) and λi > 0 and Pr_i₌₁λi = 1. In this case, Λ′(f) = Pr

i=1λif(ζi) = f∗ and kerHB ′

Λ′ = {0}. We arrive at a contradiction, which shows that

the algorithm should stops in step (4), for some degree t. By Lemma 5.2, F′ _{is a border basis of} _I

min(f) with basisB′ andΛ′(f) =f∗.

6. Examples

This section contains examples which illustrate the behavior of the algorithm. In the first example of Motzkin polynomial, the gradient ideal is not zero-dimensional whereas that in the second example of Robinson polynomial, the gradient ideal is zero-dimensional. In all the examples the minimizer ideal is zero-dimensional hence when we apply our algorithm we obtain the good result in a finite number of steps.

The implementation of the previous algorithm has been performed using theborderbasix3

package of theMathemagix4software. borderbasixis aC++implementation of the border basis algorithm of [32].

Semidefinite positive Hankel operators are computed using the semi definite programming routine of sdpa5 software. For the link with sdpa we use a file interface since sdpa is not distributed as a library.

For the computation of border basis, we use as a choice function that is tolerant to nu-merical instability i.e. a choice function that chooses as leading monomial a monomial whose coefficient is maximal among the choosable monomials. This, according to [22], makes the border basis computation stable with respect to numerical perturbations. This property is fundamental as we use results from SDP solvers to get new equations. And as these equations are computed numerically, the computation is subject to numerical errors.

Once the border basis of the minimizer is computed, the roots are obtained using the numerical routines described in [7].

Experiments are made on an Intel Corei7 2.30GHz with 8Gb of RAM. In the following examples, we use the natation Ht_Λ=HBt

Λ

Example 6.1. We consider the Motzkin polynomial

f(x, y) = 1 +x4y2+x2y4₋3x2y2

which is non negative on R2 _{but not a sum of squares in} R_[_{x, y}_]_{. We compute its gradient} ideal, Igrad(f) = (−6xy2+2xy4+4x3y2,−6yx2+2yx4+4y3x2)which is not zero-dimensional. • In the first iteration the degree is 3, the size of the Hankel matrixH3_Λis 10,min Λ(f) =

−216, there is a duality gap hence we try with degree 4.

• In the second iteration the degree is 4, the size of the Hankel matrix H4_Λ is 15, min

Λ(f) = 0, there is no duality gap. The minimum differs from the previous minimum, so a new iteration is needed.

• In the third iteration the degree is 5, the size of the Hankel matrix H5_Λ is 19 and

min Λ(f) = 0. The minimums are equal hence we compute the kernel of H4_Λ, which

3_{www-sop.inria.fr/galaad/mmx/borderbasix} 4_{www.mathemagix.org}