• No results found

Improvement to Linear Running Time per Node

direct computation simply asks for solving a system of linear equations which takes cubic running time in the reduced dimension n−`. We remark that the box- constraints are taken into account implicitly by the branch-and-bound scheme. We prune all nodes with invalid variable fixings outside of the box, instead of computing the box-constrained continuous minimizer, since the latter problem is much harder to solve in practice. Only at depth n1, after all integer variables

have been fixed, we need to compute a feasibles point within the box. This can be done by any common method for box-constrained QP, for example active set or interior point methods.

Primal Bounds Since the problem is only box-constrained, any mixed-integer point x ∈ Zn1 × Rn−n1 ∩ [l, u]n is feasible and therefore at level n

1 of the enu-

meration tree, meaning all integer variables have been fixed, we get a primal bound. Alternatively, one can try to improve the primal bound during the enu- meration by using primal heuristics. The authors use a genetic algorithm for 0-1 programming to improve primal bounds [146].

4.3

Improvement to Linear Running Time per

Node

Preprocessing Since the dual bounds obtained by the continuous relaxation may be weak, the enumeration tree might become very large. Therefore the strat- egy of the branch-and-bound scheme is to enumerate the nodes quickly. Before starting the enumeration it takes advantage of a preprocessing phase that per- forms the most time-consuming computations in every node in advance. This is possible due to the fact that at every level ` of the branching tree, certain data that is needed to compute the continuous minimizer does not depend on the specific fixings but only on the current depth. To explain this, we first de- scribe a useful effect of their way to branch by fixing. After each branching a new subproblem of the same form in a reduced dimension is built by updating the problem data. In this way, no constraint needs to be added for the new fixing. In particular having fixed ` variables x1, . . . x` to integer values r1, . . . , r`, we get

the reduced objective function f : Rn−`→ R of the form

¯

f (x) = x>Q`x + ¯c>x + ¯d,

where the reduced constant term ¯d =P`

i=1ciri+

P` i=1

P`

j=1qijrirj contains the

fixings from the quadratic and linear part and the components of the reduced linear part ¯cj−d = cj+ 2P`i=1qijri, j = ` + 1, . . . , n, contain the fixings from the

66 CHAPTER 4. CONVEX QUADRATIC INTEGER PROGRAMMING

Algorithm 12: Basic Branch-and-Bound Scheme for CBMIQP

input : a strictly convex function f : Rn → R, f(x) = x>Qx + c>x + d

output: a vector x? ∈ Zn1 × Rn−n1 minimizing f

determine a variable order x1, . . . , xn1, set ` := 0, ub := ∞

let Q` be the submatrix of Q for rows and columns ` + 1, . . . , n

while 0 ≤ ` ≤ n1 do

define ¯f : Rn−`→ R by ¯f (x) := f (r1, . . . , r`, x1, . . . , xn−`)

compute ¯c and ¯d such that ¯f (x) = x>Q`x + ¯c>x + ¯d

// compute lower bound if ` = n1 then

compute ¯x := argmin{f (x) | a ≤ x ≤ b} else

compute ¯x = −12Q−1` ¯c ∈ Rn−` and set lb := ¯f (¯x)

// compute upper bound

set rj := ¯xj−` for j = ` + 1, . . . , n1 to form r ∈ Zn1× Rn−n1

// update solution

if ¯f (r`+1, . . . , rn) < ub then

set r∗ = r

set ub = ¯f (r`+1, . . . , rn)

// prepare next node if lb < ub then // branch on variable x`+1 set ` := ` + 1; if l1 ≤ b¯x1e ≤ b1 then set r` := b¯x1e; else

// prune current node set ` := ` − 1;

if ` > 0 then

// go to next node

increment r` in increasing distance to the value in the

4.3. IMPROVEMENT TO LINEAR RUNNING TIME PER NODE 67 ` rows and columns, and are therefore again positive definite. To compute the dual bounds according to (4.1) we need the inverse matrices Q−1` . An important observation is that Q` and therefore also Q−1` does not depend on the values ri,

i = 1, . . . , `, to which the first ` variables are fixed. Exploiting this useful fact, we do not change the order of fixing the variables in the branch-and-bound tree at each level `, i.e., we always fix the variables according to an order that is determined a priori, before starting the enumeration. On the one hand, we do not have the freedom to use sophisticated branching rules. But on the other hand, this implies that, in total, we only have to consider the n different matrices Q`

during the whole enumeration tree, which we know in advance, since the fixing order is predetermined. Thus, we avoid to compute an exponential number of reduced matrices that would be needed if the order of variables to be fixed were allowed to be chosen freely. The basic branch-and-bound scheme is outlined in Algorithm 12. Using the preprocessing phase, every node in the enumeration tree can be processed in quadratic running time in the reduced dimension n − `. Here, the most time-consuming operation is still the computation of the continuous minimum and its corresponding objective function value, given the precomputed inverse matrix Q−1` . In a first step, using the preprocessing leads to a reduction of the running time per node from O(n − `)3 to O(n − `)2.

Lemma 4.2 ([43]). After a polynomial-time preprocessing, the running time per node of Algorithm 12 can be reduced to O(n − `)2.

Proof. Since the inverse matrices Q−1` ∈ R(n−`)×(n−`) are computed in polynomial

time in the preprocessing, the running time is dominated by the computation of ¯

c>Q−1` ¯c which is quadratic in the reduced dimension n − `.

Incremental Computations Apart from precomputing the reduced matrices Q`, the preprocessing phase can be used to further decrease the running time of

every node at level ` to O(n−`). Again, this is achieved by improving the running time for the computation of the continuous minima of ¯f . To do this, the authors propose an incremental technique. It determines the new continuous minimum from the old one in linear time whenever a new variable is fixed. For this, the authors make use of the surprising fact that in a given node, the continuous minima corresponding to all possible fixings of the next variable lie on a line and the direction of this line only depends on which variables have been fixed so far, but not on the values to which they were fixed. Again, this allows to shift some expensive computations into the preprocessing since the direction of this line is fully determined by the level of the current node. This observation is illustrated in Figure 4.1. Recall that ¯f (x) = x>Q`x + ¯c>x + ¯d denotes the function obtained

from f by fixing variable xi to ri for i = 1, . . . , `, and ¯x ∈ Rn−` is its continuous

68 CHAPTER 4. CONVEX QUADRATIC INTEGER PROGRAMMING

x y

z`

Figure 4.1: Incremental computation of the minimizer y (red dots), depending on the fixings of x by moving along the direction z` (blue arrow).

If we define the vectors

z` := (1, (q`+1,`+2, q`+1,`+3, . . . , q`+1,n)Q−1`+1)>∈ Rn−`,

for every ` = 1, . . . , n1, we finally get the following result.

Proposition 4.3 ([43]). If we fix variable x`+1 to r`+1 and reoptimize ¯f , the

resulting continuous minimum is given by ¯x + (r`+1− ¯x1)z`.

Proof. Let ¯c(u) ∈ Rn−`−1 denote the linear term of ¯f : Rn−`−1 → R after fixing

x`+1 to u and removing the associated component. We have

¯ c(u)j−`−1 = cj + 2 ` X i=1 qi,jri+ 2q`+1,ju, j = ` + 2, . . . , n.

Moreover, we denote by ¯x ∈ R¯ n−` the continuous minimizer of ¯f after fixing

variable x`+1 to the value r`+1and keeping the associated component. This means

for the first component we have ¯x¯1 = r`+1, yielding

¯ ¯ x = (r`+1, (− 1 2 ¯ Q−1`+1c(r¯ `+1))>)>.

It is clear that in case r`+1 = ¯x1, we have ¯x = ¯¯ x, since in that case x`+1 is fixed to

its optimal continuous value and the continuous minimum is unchanged. Thus, we can write ¯x as ¯ x = (¯x1, (− 1 2Q¯ −1 `+1c(¯¯x1))>)>. If follows that ¯ ¯ x = ¯x + (r`+1 − ¯x1, (− 1 2 ¯ Q−1`+1(¯c(r`+1) − ¯c(¯x1)))>)>.

4.3. IMPROVEMENT TO LINEAR RUNNING TIME PER NODE 69 It remains to show that

(r`+1− ¯x1)z` = (r`+1− ¯x1, (− 1 2 ¯ Q−1`+1(¯c(r`+1) − ¯c(¯x1)))>)>, or equivalently (r`+1− ¯x1)(1, −(q`+1,`+2, q`+1,`+3, . . . , q`+1,n)Q−1`+1) > =(r`+1− ¯x1, (− 1 2 ¯ Q−1`+1(¯c(r`+1) − ¯c(¯x1)))>)>.

The equality for the first component is clear. For the other components, we ob- serve that

¯

c(r`+1) − ¯c(¯x1) = (2q`+1,`+2(r`+1− ¯x1), . . . , 2q`+1,n(r`+1− ¯x1))> ∈ Rn−`−1

and hence we get

(−1 2Q¯ −1 `+1(¯c(r`+1) − ¯c(¯x1))> =(− ¯Q−1`+1(q`+1,`+2, . . . , q`+1,n)>(r`+1− ¯x1))> =(r`+1− ¯x1)(− ¯Q`+1−>(q`+1,`+2, . . . , q`+1,n)>)> =(r`+1− ¯x1)(−(q`+1,`+2, . . . , q`+1,n) ¯Q−1`+1).

To guarantee sublinear running time of O(n − `) per node, we also have to reduce the running time for the evaluation of the objective function values of the contin- uous minima, since a simple evaluation would take quadratic time and therefore dominate the rest of the computations. As to this, if we define

v` := 2Q`z` ∈ Rn−`, s` := (z`)>Q`z` ∈ R,

we can state the following analogous result for the incremental computation of the associated optimal objective function value.

Proposition 4.4 ([43]). If we fix variable x`+1 to r`+1 and reoptimize ¯f , the

resulting optimal objective function value is given by ¯ f (¯x + αz`) = ¯f (¯x) + α(¯x>v`+ ¯c>z`) + α2s`, where α = r`+1− ¯x1. Proof. We have f (¯x + αz`) = (¯x + αz`)>Q¯`(¯x + αz`) + ¯c>(¯x + αz`) + ¯d = ¯x>Q¯`x + α¯ 2z`>Q¯`z`+ 2α¯x>Q¯`z`+ ¯c>x + α¯¯ c>x + α¯¯ c>z`+ ¯d = ¯f (¯x) + α(¯x>2 ¯Q`z`+ ¯c>z`) + α2z`>Q¯`z` = ¯f (¯x) + α(¯x>v`+ ¯c>z`) + α2s`

70 CHAPTER 4. CONVEX QUADRATIC INTEGER PROGRAMMING Since ¯c can be computed incrementally in O(n − `) time, the last two propositions can be summarized in the following main result.

Theorem 4.5 ([43]). If, in the preprocessing phase of Algorithm 12, we compute z`, v` and s

` as defined above for ` = 0, . . . , n − 1, then the computation of

the continuous minimizer and the associated lower bound can be carried out in O(n − `) time per node.

Proof. The result follows directly from Theorems 4.3 and 4.4. We close this chapter with an example.

Example 4.1. Consider the following unconstrained integer quadratic program min{x>Qx + c>x | x ∈ Z3}, where Q =   2 −1 −3 −1 2 4 −3 4 9  < 0 and c =   1 3 2  .

The continuous minimizer is ¯x = −12Q−1c = (1.5, −7, 3.5)>. Suppose we fix x1 to

r1 = 2. For the first iteration we compute in a preprocessing phase the vectors

z0 = (1, −(q1,2, q1,3)Q−11 ) > = (1, 1.5, 1)>, where ¯Q1 = 2 4 4 9  ,

v0 = 2Qz0 = (1, 0, 0)> and the scalar s

0 = z0>Qz0 = 12.

Hence the new minimizer is ¯ ¯

x = ¯x + (r1− ¯x1)z0 = (2, −7.75, 4)>

and its objective function value is ¯

f = ¯f (¯x) + (r1− ¯x1)(¯x>v0+ ¯c>z0) + (r1− ¯x1)2s0 = −6.125.

Buchheim et al. [43] showed by extensive numerical experiments that for the un- constrained case (x ∈ Zn) and the ternary case (x ∈ {−1, 0, 1}n) of CBMIQP the

branch-and-bound algorithm outperforms all competitive software on a test set of randomly generated instances as well as real-world instances from an applica- tion in electronics. They conclude that CPLEX as the most competitive solver was outperformed by even several orders of magnitude.

Chapter 5

Active Set Based

Branch-and-Bound

In this chapter we aim at extending the branch-and-bound Algorithm 12 of the previous Chapter 4 for mixed-integer optimization problems with strictly convex quadratic objective functions to the presence of additional linear inequalities:

min f (x) = x>Qx + c>x + d s.t. Ax ≤ b

xi ∈ Z, i = 1, . . . , n1

xi ∈ R, i = n1+ 1, . . . , n

(CMIQP)

where Q ∈ Rn×n is a positive definite matrix, c ∈ Rn, d ∈ R, A ∈ Rm×n, b ∈ Rm,

and n1 ∈ {0, . . . , n}. Therefore, we will present a generalization of the approach

for CBMIQP. Up to now, all solution methods for convex mixed-integer quadratic programming are based on the methods presented in Section 3.2. Agrawal [6] presented a cutting plane algorithm that starts from the solution of the QP- relaxation. If the solution has a non-integral component that is required to be in- teger it adds a Chv´atal-Gomory cut to the existing constraints and applies the Pa- rameter t-Algorithm by Beale [18] to find its optimal integral value. The procedure is repeated until integrality for all integer variables is obtained. Bienstock [29] proposed a branch-and-cut algorithm using disjunctive cuts for solving a family of mixed-integer quadratic programming problems arising in a portfolio optimiza- tion application. By reformulating the original problem, Lazimy [133, 134] applied the Generalized Benders Decomposition method yielding Benders cuts that are linear instead of quadratic in the integer variables. The original mixed-integer quadratic problem is decomposed into a finite sequence of integer linear master programs, and quadratic programming subproblems. Al-Khayyal and Larsen [8] developed a branch-and-bound algorithm that uses different types of piecewise affine convex functions to underestimate the objective function. The relaxed lin- ear problems are used to compute lower bounds. Instead of using linearizations

72 CHAPTER 5. ACTIVE SET BASED BRANCH-AND-BOUND Leyffer [137] showed how improved lower bounds from the QP-relaxations can be used within a branch-and-bound algorithm. In particular Fletcher and Leyf- fer [81] compute lower bounds by parametrically taking one step of the dual active set method in every node of the search tree. Their numerical experiments also show that branch-and-bound is the most effective approach out of all the com- mon methods to solve MIQP problems, since the convex QP-relaxations are easy to solve, e.g., by methods outlined in Section 2.4. Standard commercial solvers that can handle mixed-integer quadratically constrained programs (MIQCP), and therefore in particular also CMIQP, include CPLEX [57], Xpress [78], Gurobi [106] and MOSEK [156], while Bonmin [34] and SCIP [2] are well-known non-commercial software packages being capable of solving CMIQP, compare Section 3.3. Recent benchmarks show that Gurobi and CPLEX perform best on MIQP problems, both implementing state-of-the-art branch-and-bound algorithms.

We propose an extended version of the branch-and-bound algorithm 12 of Chap- ter 4 to solve CMIQP. Since dualizing the continuous relaxation yields another convex quadratic programming problem with only non-negativity constraints, as we will see in Section 5.4, a new feasible active set method for quadratic pro- grams of that form is specifically designed to compute lower bounds. The main feature of the branch-and-bound algorithm 12 for solving CBMIQP, i.e., the so- phisticated preprocessing phase, leading to a fast enumeration of the branch-and- bound nodes, can be easily adapted by small modifications to handle CMIQP. Moreover, our feasible active set method takes advantage of this preprocessing phase and is also well suited for reoptimization. Following the ideas of the branch- and-bound scheme sketched in the last chapter, we keep the fixed branching order that is determined in advance. Recall that on the one hand, we lose the flexibility of choosing sophisticated branching strategies usually used in recent state-of-the art software packages, as shortly reviewed in Section 3.1, but on the other hand, we gain the possibility of shifting expensive computations into a preprocessing phase. As one of the main ideas that differs from standard approaches, our algo- rithm solves the dual problem of the continuous relaxation in each node in order to determine a local lower bound. Using the concept of duality theory in nonlinear programming, as seen in Section 2.3, strong duality holds if the primal problem is feasible, since all constraints of the continuous relaxation of CMIQP are affine. Thus, computing a lower bound by the dual is useful since on the one hand it can be solved effectively by exploiting its structure and on the other hand we get a lower bound that is as strong as by solving the primal. If additionally the primal problem has a small number of constraints, the dual problem has a small dimension. Extending the sophisticated incremental computations of Chapter 4, the overall running time per node in our approach is still linear in the dimen- sion n if the number of constraints is fixed and if we assume that the solution of the dual problem, which is of fixed dimension in this case, can be computed in constant time. In this sense our approach can be seen as an extension of the

73 Algorithm 12 without increasing the order of running time per node.

For the solution of the dual problem we investigate the performance of two active set methods. The first one, discussed in Section 5.2, is an infeasible active set method by Kunisch and Rendl [129], denoted by KR. It computes a KKT-point by relaxing primal and dual feasibility while requiring stationarity and comple- mentary slackness at every iteration. This relaxation leads to a simplification of the KKT-system to a system of linear equations in the dimension of the set of inactive variables. A theoretical limitation of this algorithm is that it requires a strictly convex objective function. However, this problem can be tackled by adding a small multiple of the identity matrix to the diagonal of Q. To over- come this disadvantage, we propose as a second method, a new feasible active set method FAST-QPA, discussed in detail in Section 5.3. A key feature of that active set method is that each of its iterates is feasible. Combining this with properties of duality theory, it suffices to find an approximate solution, as each dual feasible solution yields a valid lower bound. We can thus prune the branch-and-bound node as soon as the current upper bound is exceeded by the value of any feasible iterate produced in a solution algorithm for the dual problem.

Another feature that is used in both active set algorithms within the branch-and- bound is the use of warmstarts: after each branching, corresponding to the fixing of a variable, we pass the optimal active set from the parent node as starting point to the child nodes. This leads to a significant reduction in the average number of iterations needed to solve the dual problem to optimality.

This chapter is organized as follows. In Section 5.1 we give a short overview of existing methods for non-negativity constrained convex quadratic programming problems. In Section 5.2 we discuss the infeasible active set method by Kunisch and Rendl for solving this kind of problem. In Section 5.3 we present our own