Semidefinite Programs and Duality Theory - Semidefinite Programming Theory

3.1 Semidefinite Programming Theory

3.1.2 Semidefinite Programs and Duality Theory

Semidefinite programming consists in optimising a linear matrix function over the cone of positive semidefinite matrices subject to linear matrix constraints. A standard primal semidefinite programme can be written as

\[
\text{(PSDP)} \qquad \min\ \langle C, X \rangle \quad \text{s.t.} \quad \mathcal{A}X = b, \quad X \succeq 0, \tag{3.3}
\]

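where the objective coefficient matrix $C$ is usually required to be symmetric and the constraints $\langle A_i, X \rangle = b_i$, $A_i \in \mathcal{S}_n$, $b_i \in \mathbb{R}$ for all $i = 1, \dots, m$ are subsumed with the help of a vector $b \in \mathbb{R}^m$ and a linear operator $\mathcal{A} : \mathcal{S}_n \to \mathbb{R}^m$ defined as

\[
\mathcal{A}X = \begin{bmatrix} \langle A_1, X \rangle \\ \vdots \\ \langle A_m, X \rangle \end{bmatrix}. \tag{3.4}
\]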
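As a rough illustration of (3.4), the operator can be sketched in plain Python with matrices stored as nested lists. The helper names `inner` and `apply_A` and the sample data are assumptions made for this sketch only.

```python
def inner(A, B):
    """Trace inner product <A, B> = sum_ij A_ij * B_ij (matrices as nested lists)."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


def apply_A(constraint_mats, X):
    """Apply the linear operator X -> (<A_1, X>, ..., <A_m, X>)."""
    return [inner(A_i, X) for A_i in constraint_mats]


# Hypothetical data: m = 2 constraint matrices acting on a symmetric 2x2 matrix.
A1 = [[1, 0], [0, 0]]        # <A1, X> = x11
A2 = [[0, 0.5], [0.5, 0]]    # <A2, X> = x12
X = [[2, 1], [1, 3]]

AX = apply_A([A1, A2], X)    # vector in R^m with entries x11 and x12
```

Symmetric constraint matrices with entries 1/2 off the diagonal, as in `A2`, are the usual way to address a single off-diagonal entry of $X$.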

Note that semidefinite programming contains linear programming as a special case. This can be seen as follows: Semidefinite programs can contain several matrix variables $X_i$, $i = 1, \dots, k$, which could be subsumed in a big block diagonal matrix to arrive at a standard semidefinite programme like (PSDP). In an extreme case, these matrix variables $X_i \succeq 0$ can be just $1 \times 1$ matrices, i.e., ordinary nonnegative scalar variables $x_i \in \mathbb{R}_+$. Then the semidefinite programme reduces to a linear programme. It is also possible to transform convex quadratic programming problems with convex quadratic constraints into semidefinite programs. For details, we refer, e.g., to [65].
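The reduction to a linear programme can be written out for diagonal data. The vectors $c$, $a_i$ and $x$ below are illustrative symbols introduced for this sketch; they do not appear in the text above.

```latex
% Assumed (illustrative) data: C = Diag(c), A_i = Diag(a_i), X = Diag(x) with c, a_i, x in R^n.
\begin{align*}
\langle C, X \rangle &= \sum_{j=1}^{n} c_j x_j = c^T x, \\
\langle A_i, X \rangle &= a_i^T x = b_i, \qquad i = 1, \dots, m, \\
X = \operatorname{Diag}(x) \succeq 0 &\iff x_j \ge 0, \qquad j = 1, \dots, n.
\end{align*}
% Hence the semidefinite programme becomes the linear programme
%   min c^T x   s.t.   a_i^T x = b_i  (i = 1, ..., m),   x >= 0.
```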

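To express a dual semidefinite programme to (PSDP), we need the so-called adjoint operator $\mathcal{A}^T : \mathbb{R}^m \to \mathcal{S}_n$ to $\mathcal{A}$. It is defined so that

\[
\langle \mathcal{A}X, y \rangle = \langle X, \mathcal{A}^T y \rangle \tag{3.5}
\]

for all $X \in \mathcal{S}_n$ and $y \in \mathbb{R}^m$, i.e.,

\[
\mathcal{A}^T y = \sum_{i=1}^{m} y_i A_i . \tag{3.6}
\]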
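The defining identity (3.5) can be checked numerically on a small instance. The following is a minimal sketch in plain Python; the function names and the integer test data are hypothetical.

```python
def inner(A, B):
    """Trace inner product <A, B> of two matrices given as nested lists."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


def apply_A(mats, X):
    """Operator X -> (<A_1, X>, ..., <A_m, X>)."""
    return [inner(A_i, X) for A_i in mats]


def apply_AT(mats, y):
    """Adjoint y -> sum_i y_i * A_i, as in (3.6)."""
    n = len(mats[0])
    return [[sum(y_i * A_i[r][c] for y_i, A_i in zip(y, mats)) for c in range(n)]
            for r in range(n)]


# Hypothetical data with m = 2 and n = 2.
mats = [[[1, 0], [0, 0]], [[0, 1], [1, 0]]]
X = [[2, 1], [1, 3]]
y = [4, -1]

lhs = sum(u * v for u, v in zip(apply_A(mats, X), y))   # <AX, y> in R^m
rhs = inner(X, apply_AT(mats, y))                       # <X, A^T y> in S_n
```

Both sides evaluate to the same number because the identity is just a reordering of the double sum $\sum_i y_i \langle A_i, X \rangle$.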

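Now, consider the following relations.

\[
\begin{aligned}
\inf_{X \succeq 0} \{ \langle C, X \rangle : \mathcal{A}X = b \}
&= \inf_{X \succeq 0} \sup_{y \in \mathbb{R}^m} \langle C, X \rangle + \langle y, b - \mathcal{A}X \rangle \\
&\ge \sup_{y \in \mathbb{R}^m} \inf_{X \succeq 0} \langle C - \mathcal{A}^T y, X \rangle + \langle y, b \rangle \\
&= \sup_{y \in \mathbb{R}^m} \{ \langle y, b \rangle : C - \mathcal{A}^T y \succeq 0 \} .
\end{aligned} \tag{3.7}
\]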

The left-hand side is (PSDP) restated. For the first equality, we used a Lagrange multiplier $y \in \mathbb{R}^m$ to lift the equality constraint of (PSDP) into the objective function. The equality is always true, since $\langle C, X \rangle + \langle y, b - \mathcal{A}X \rangle = \langle C, X \rangle$ for all $y \in \mathbb{R}^m$ iff $\mathcal{A}X = b$, and for $\mathcal{A}X \neq b$ the inner maximisation over $y \in \mathbb{R}^m$ is unbounded. For the inequality, we exchanged inf and sup, which is justified by Lemma 36.1 of [114], and used (3.5) to regroup the inner product terms. For the last equality, observe that the inner minimisation will be finite ($> -\infty$) for a given $y \in \mathbb{R}^m$ iff $\langle C - \mathcal{A}^T y, X \rangle \ge 0$ for all $X \succeq 0$. But then, Fejér's trace theorem, as stated in Proposition 115, demands $C - \mathcal{A}^T y \succeq 0$. In that case the inner infimum equals 0, attained at $X = 0$, so only the term $\langle y, b \rangle$ remains.

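We now introduce a slack variable $Z = C - \mathcal{A}^T y$ to rewrite $\sup_{y \in \mathbb{R}^m} \{ \langle y, b \rangle : C - \mathcal{A}^T y \succeq 0 \}$ as

\[
\text{(DSDP)} \qquad \max\ \langle b, y \rangle \quad \text{s.t.} \quad \mathcal{A}^T y + Z = C, \quad y \in \mathbb{R}^m, \quad Z \succeq 0 . \tag{3.8}
\]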

For given primal and dual feasible solutions X and (y, Z), the difference between the primal and the dual objective value, which is called the duality gap, can be calculated and bounded below by zero as follows:

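\[
\langle C, X \rangle - \langle b, y \rangle = \langle \mathcal{A}^T y + Z, X \rangle - \langle \mathcal{A}X, y \rangle = \langle Z, X \rangle \ge 0 . \tag{3.9}
\]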

The inequality is again justified by Fejér's trace theorem, since $Z \succeq 0$ and $X \succeq 0$. The fact that any primal feasible solution yields an objective value at least as large as that of any dual feasible solution is called weak duality. We know that in linear programming the objective values of primal and dual optimal solutions are always equal if they exist, a property known as strong duality and used as a necessary and sufficient optimality condition in linear programming. For semidefinite programs, strong duality is not guaranteed to hold, as a well-known example, originated by Boyd and Vandenberghe [28] and extended by Helmberg [65], shows.
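To make the duality gap identity concrete, here is a small numerical sketch in plain Python. The instance (one constraint on a $2 \times 2$ matrix) and the chosen feasible points are hypothetical and were picked so that all arithmetic is exact.

```python
def inner(A, B):
    """Trace inner product <A, B> of two matrices given as nested lists."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


# Hypothetical instance: min <C, X> s.t. <A1, X> = 2, X PSD (n = 2, m = 1).
C = [[1, 0], [0, 1]]
A1 = [[0, 1], [1, 0]]
b = [2]

# Primal feasible point: x12 = 1, diagonal entries >= 0, determinant 1 >= 0.
X = [[2, 1], [1, 1]]

# Dual feasible point: y = 1/2, Z = C - y * A1 has diagonal 1 and determinant 3/4.
y = [0.5]
Z = [[C[i][j] - y[0] * A1[i][j] for j in range(2)] for i in range(2)]

primal_value = inner(C, X)                          # <C, X>
dual_value = sum(b_i * y_i for b_i, y_i in zip(b, y))  # <b, y>
gap = primal_value - dual_value                     # equals <Z, X> for feasible pairs
```

The pair is feasible but not optimal, so the gap is strictly positive here; it coincides with $\langle Z, X \rangle$ as the identity predicts.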

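Example 119. Let the following pair of primal and dual semidefinite programs be given (for their derivation we refer, e.g., to [65]).

\[
\min\ x_{12} \quad \text{s.t.} \quad
\begin{bmatrix} 0 & x_{12} & 0 \\ x_{12} & x_{22} & 0 \\ 0 & 0 & 1 + x_{12} \end{bmatrix} \succeq 0,
\qquad
\max\ y_1 \quad \text{s.t.} \quad
\begin{bmatrix} -y_2 & \frac{1+y_1}{2} & -y_3 \\ \frac{1+y_1}{2} & 0 & -y_4 \\ -y_3 & -y_4 & -y_1 \end{bmatrix} \succeq 0 . \tag{3.10}
\]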

By the positive semidefiniteness conditions (a zero diagonal entry of a positive semidefinite matrix forces the corresponding row and column to vanish), $x_{12} = 0$ and $y_1 = -1$, i.e., the duality gap of the optimal solutions is 1. Helmberg attributes the positive duality gap in this example to the fact that the dualisation approach followed in (3.7) does not take the geometry of the primal feasible set into account. With $x_{12} = 0$, it is easy to see that any primal feasible matrix has a zero eigenvalue with eigenvector $(1, 0, 0)^T$. Thus, by Proposition 117 the primal feasible set is contained in a face of $\mathcal{S}_n^+$ that is not full-dimensional,

\[
F = \left\{ P W P^T : P = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix},\ W \in \mathcal{S}_2^+ \right\} .
\]

If we wrote $X \in F$ instead of $X \succeq 0$ in (3.7), the necessary condition for finiteness of the inner minimisation in line two of (3.7) could be reduced from $\langle C - \mathcal{A}^T y, X \rangle \ge 0$ for all $X \succeq 0$ to $\langle C - \mathcal{A}^T y, X \rangle \ge 0$ for all $X \in F$. Then $Z = C - \mathcal{A}^T y$ is less restricted, and the dual programme is indeed able to produce an optimal solution with the same objective value as the optimal primal solution.
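The feasibility claims of Example 119 can be checked with exact arithmetic. The sketch below tests positive semidefiniteness through principal minors, which is practical only for tiny matrices; the helper names and the sample points are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def is_psd(M):
    """A symmetric matrix is PSD iff all of its principal minors are nonnegative."""
    n = len(M)
    return all(det([[M[i][j] for j in idx] for i in idx]) >= 0
               for k in range(1, n + 1) for idx in combinations(range(n), k))


def primal_matrix(x12, x22):
    return [[0, x12, 0], [x12, x22, 0], [0, 0, 1 + x12]]


def dual_matrix(y1, y2, y3, y4):
    h = Fraction(1 + y1) / 2
    return [[-y2, h, -y3], [h, 0, -y4], [-y3, -y4, -y1]]


primal_ok = is_psd(primal_matrix(0, 0))     # x12 = 0: feasible, objective value 0
primal_bad = is_psd(primal_matrix(1, 5))    # x12 != 0: minor 0 * x22 - x12^2 < 0
dual_ok = is_psd(dual_matrix(-1, 0, 0, 0))  # y1 = -1: feasible, objective value -1
dual_bad = is_psd(dual_matrix(Fraction(-1, 2), -10, 0, 0))  # y1 != -1: infeasible
gap = 0 - (-1)                              # primal optimum minus dual optimum
```

In both infeasible cases the violated condition is the $2 \times 2$ principal minor that contains the zero diagonal entry.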

The pitfall in the example above is that all primal feasible matrices are in fact only positive semidefinite, but not positive definite. Once we realise this, we could restrict ourselves to the minimal face of the primal cone that contains the feasible set completely, and strong duality would hold again. This problem does not occur if primal feasible solutions exist that are positive definite. Such solutions are called strictly feasible. Let us give a formal definition and proposition.

Definition 120. A matrix $X$ is strictly feasible for (PSDP) if it is feasible for (PSDP) and $X \succ 0$. In this case, (PSDP) is also called strictly feasible. A pair $(y, Z)$ is strictly feasible for (DSDP) if it is feasible for (DSDP) and $Z \succ 0$. In this case, (DSDP) is also called strictly feasible. One also says that (PSDP) and (DSDP) satisfy a Slater condition.

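Proposition 121 (Strong duality, see Corollary 2.2.6 in [65]). Define $p^* = \inf_{X \succeq 0} \{ \langle C, X \rangle : \mathcal{A}X = b \}$ and $d^* = \sup_{y \in \mathbb{R}^m} \{ \langle y, b \rangle : C - \mathcal{A}^T y \succeq 0 \}$.

1. If (PSDP) is strictly feasible with a finite $p^*$, then the value $d^* = p^*$ is attained for (DSDP).

2. If (DSDP) is strictly feasible with a finite $d^*$, then the value $p^* = d^*$ is attained for (PSDP).

3. If (PSDP) and (DSDP) are strictly feasible, a finite value $p^* = d^*$ is attained for both problems.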

Helmberg gives a folklore example to show that strict feasibility of (DSDP) is indeed necessary for attainment of the primal optimal solution.

Example 122 ([65]). For the following pair of primal and dual semidefinite programs,

\[
\min\ x_{11} \quad \text{s.t.} \quad \begin{bmatrix} x_{11} & 1 \\ 1 & x_{22} \end{bmatrix} \succeq 0,
\qquad
\max\ 2 y_1 \quad \text{s.t.} \quad \begin{bmatrix} 1 & -y_1 \\ -y_1 & 0 \end{bmatrix} \succeq 0, \tag{3.11}
\]

the solution $x_{11} = x_{22} = 2$ is strictly feasible for the primal problem, and the only dual feasible solution $y_1 = 0$ attains the optimal dual objective value 0, but is not strictly feasible. Weak duality implies that the infimum of the primal feasible objective values must be greater than or equal to the optimal dual objective value 0. We know from Proposition 113 that $X \succeq 0$ requires $x_{11} \ge 0$, $x_{22} \ge 0$ and $x_{11} x_{22} - 1 \ge 0$, i.e., $x_{11} \ge 1/x_{22} > 0$. Letting $x_{22}$ grow without bound drives $x_{11}$ arbitrarily close to 0, so the primal infimum is 0. Therefore, the primal optimal objective value 0 is not attained for any $X \succeq 0$ with finite entries.
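The non-attainment in Example 122 can be illustrated with a short sketch in exact arithmetic. The function name `primal_feasible` and the chosen sequence of points are hypothetical.

```python
from fractions import Fraction


def primal_feasible(x11, x22):
    """Principal-minor test for [[x11, 1], [1, x22]] being positive semidefinite."""
    return x11 >= 0 and x22 >= 0 and x11 * x22 - 1 >= 0


# Feasible points on the boundary x11 * x22 = 1, with x22 growing.
x22_values = [Fraction(10) ** k for k in range(0, 7, 2)]   # 1, 10^2, 10^4, 10^6
objectives = [1 / x22 for x22 in x22_values]                # objective x11 = 1 / x22
all_feasible = all(primal_feasible(1 / x22, x22) for x22 in x22_values)

# The limiting objective value x11 = 0 is infeasible for every finite x22.
zero_attained = any(primal_feasible(0, x22) for x22 in x22_values)
```

The objective values decrease towards 0 along the sequence, while the value 0 itself fails the determinant condition for each tested $x_{22}$.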

Under the assumption of strict feasibility of (PSDP) and (DSDP), we can now summarise necessary and sufficient optimality conditions:

Proposition 123. Let (PSDP) and (DSDP) be strictly feasible. Then X and (y, Z) are optimal if and only if

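\[
\begin{aligned}
&\mathcal{A}X = b, \quad X \succeq 0, \\
&Z = C - \mathcal{A}^T y, \quad y \in \mathbb{R}^m, \quad Z \succeq 0, \\
&\langle Z, X \rangle = 0 \quad (\text{equivalently, } XZ = 0) .
\end{aligned} \tag{3.12}
\]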

Note that the last equality is usually referred to as complementary slackness and uses Observation 114 and strong duality.
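Complementary slackness can be observed on a tiny instance. The sketch below uses a hypothetical problem, min trace(X) subject to $2 x_{12} = 2$ and $X \succeq 0$, for which both problems are strictly feasible and the optimal pair can be verified by hand: the trace is at least $2\sqrt{x_{11} x_{22}} \ge 2$, and the dual requires $|y| \le 1$.

```python
def inner(A, B):
    """Trace inner product <A, B> of two matrices given as nested lists."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


# Hypothetical instance: min <C, X> s.t. <A1, X> = 2, X PSD.
C = [[1, 0], [0, 1]]
A1 = [[0, 1], [1, 0]]
b = [2]

# Optimal pair worked out by hand (an assumption of this sketch, see lead-in).
X_opt = [[1, 1], [1, 1]]
y_opt = [1]
Z_opt = [[C[i][j] - y_opt[0] * A1[i][j] for j in range(2)] for i in range(2)]

gap = inner(C, X_opt) - b[0] * y_opt[0]   # duality gap of the pair
XZ = matmul(X_opt, Z_opt)                 # matrix product appearing in (3.12)
```

Both the scalar $\langle Z, X \rangle$ and the matrix product vanish for this pair, matching the two equivalent forms of the condition.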

Having thought about the faces of the positive semidefinite cone in Section 3.1.1, we may as well consider the facial structure of the feasible set of semidefinite programs, which is an intersection of the positive semidefinite cone and an affine subspace. The faces of this intersection arise from all possible intersections of faces of the cone and the subspace. Therefore, the facial structure of the cone should in general influence the facial structure of the intersection. It should, on the one hand, inherit the cone’s property to be nonpolyhedral. On the other hand, optimal solutions, which most likely occur in faces of small dimension, should have small rank. This intuition is influenced by Proposition 117 and was more accurately captured by Pataki as follows:

Proposition 124 (Pataki [109]). Let $F$ be a face of dimension $k$ of the feasible set of (PSDP). For $X \in F$, the rank $r = \operatorname{rank}(X)$ is bounded by

\[
\binom{r+1}{2} \le m + k . \tag{3.13}
\]

Let $F$ be a face of dimension $k$ of the set $\{ Z \succeq 0 : \exists\, y \in \mathbb{R}^m : Z + \mathcal{A}^T y = C \}$ of (DSDP). For $Z \in F$, the rank $r = \operatorname{rank}(Z)$ is bounded by

\[
\binom{r+1}{2} \le \binom{n+1}{2} - m + k . \tag{3.14}
\]
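To get a feeling for the bounds (3.13) and (3.14), the largest admissible rank can be computed directly. This is a minimal sketch; the function names and the sample sizes are hypothetical, and the right-hand sides are assumed to be nonnegative.

```python
from math import isqrt


def max_rank(bound, n):
    """Largest integer r in [0, n] with r * (r + 1) / 2 <= bound (bound >= 0 assumed)."""
    return min(n, (isqrt(8 * bound + 1) - 1) // 2)


def primal_rank_bound(n, m, k):
    """Rank bound from (3.13): binom(r + 1, 2) <= m + k."""
    return max_rank(m + k, n)


def dual_rank_bound(n, m, k):
    """Rank bound from (3.14): binom(r + 1, 2) <= binom(n + 1, 2) - m + k."""
    return max_rank(n * (n + 1) // 2 - m + k, n)


# Hypothetical sizes: n = 10, m = 5 constraints, extreme points (faces with k = 0).
r_primal = primal_rank_bound(10, 5, 0)   # needs r * (r + 1) / 2 <= 5
r_dual = dual_rank_bound(10, 5, 0)       # needs r * (r + 1) / 2 <= 55 - 5 = 50
```

For these sizes, extreme points of the primal feasible set have rank at most 2, which illustrates why few constraints lead to low-rank solutions in small faces.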

3.2 An SDP Relaxation for (MB) and an Equivalent Eigen-

In document Branch-and-Cut for a Semidefinite Relaxation of Large-scale Minimum Bisection Problems (Page 95-99)