1. Fixed parameters: system aspects that cannot be changed and that therefore, from the perspective of the model, are constants
4.3 Optimizing Linear Systems
We often wish to maximize¹ an objective function O that is a linear combination of the control parameters $x_1, x_2, \ldots, x_n$ and can therefore be expressed as

$$O = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \qquad \text{(EQ 4.1)}$$

where the $c_i$ are real scalars. Moreover, all the $x_i$ are typically constrained by linear constraints in the following standard form:

$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\;\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m
\end{aligned} \qquad \text{(EQ 4.2)}$$

$$x_i \ge 0, \quad i = 1, \ldots, n \qquad \text{(EQ 4.3)}$$

or, using matrix notation:

$$Ax = b \qquad \text{(EQ 4.4)}$$

$$x \ge 0 \qquad \text{(EQ 4.5)}$$

where A is an $m \times n$ matrix, and x and b are column vectors with n and m elements, respectively, with $n \ge m$. Let A's rank be r. To allow optimization, the system must be underconstrained with $r < n$, so that some of the $x_i$ can be written as a linear combination of the others, which form a basis for A.²
Generalizing from Example 4.3, each equality corresponds to a hyperplane, which is a plane in more than two dimensions. (This is not as intuitive as it sounds; for instance, a three-plane is a solid that fills the entire Euclidean space.) The constraints ensure that valid $x_i$ lie at the intersection of these hyperplanes.
Note that we can always transform an inequality of the form

$$a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n \ge b_i$$

to an equality by introducing a new variable $s_i$, called the surplus variable, such that

$$a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n - s_i = b_i \qquad \text{(EQ 4.6)}$$

By treating the $s_i$ as virtual control parameters, we can convert a constraint that has a greater-than inequality into the standard form. (We ignore the value assigned to a surplus variable.) Similarly, introducing a slack variable converts less-than inequalities to equalities. Therefore, any linear system of equality and inequality constraints can be transformed into the standard form, which has only equality constraints. Once this is done, we can use linear programming (discussed below) to find the value of x that maximizes the objective function.
1. In this chapter, we always seek to maximize the objective function. Identical techniques can be used for minimization.
2. To understand this section more fully, the reader may wish to review Section 3.4.
EXAMPLE 4.4: REPRESENTING A LINEAR PROGRAM IN STANDARD FORM
Consider a company that has two network connections to the Internet through two providers (also called multihoming). Suppose that the providers charge per byte and provide different delays. For example, the lower-priced provider may guarantee that transit delays are under 50 ms, and the higher-priced provider may guarantee a bound of 20 ms. Suppose that the company has two commonly used applications, A and B, that have different sensitivities to delay. Application A is more tolerant of delay than application B is. Moreover, the applications, on average, generate a certain amount of traffic every day, which has to be carried by one of the two links. The company wants to allocate all the traffic from the two applications to one of the two links, maximizing their benefit while minimizing its payments to the link providers. Represent the problem in standard form.
Solution:
The first step is to decide how to model the problem. We must have variables that reflect the traffic sent by each application on each link. Call the lower-priced provider l and the higher-priced provider h. Then, we denote the traffic sent by A on l as $x_{Al}$ and the traffic sent by A on h as $x_{Ah}$. Define $x_{Bl}$ and $x_{Bh}$ similarly. The traffic sent is non-negative, so we have

$$x_{Al} \ge 0; \quad x_{Ah} \ge 0; \quad x_{Bl} \ge 0; \quad x_{Bh} \ge 0$$

If the traffic sent each day by application A is denoted $T_A$, and the traffic sent by B is denoted $T_B$, we have

$$x_{Al} + x_{Ah} = T_A; \quad x_{Bl} + x_{Bh} = T_B$$

Suppose that the providers charge $c_l$ and $c_h$ monetary units per byte. Then, the cost to the company is

$$c_l (x_{Al} + x_{Bl}) + c_h (x_{Ah} + x_{Bh})$$

What is the benefit to the company? Suppose that application A gains a benefit of $b_{Al}$ per byte from sending traffic on link l and $b_{Ah}$ on link h. Using similar notation for the benefits to application B, the overall benefit (i.e., benefit minus cost) that the company should maximize, which is its objective function, is

$$O = (b_{Al} - c_l) x_{Al} + (b_{Ah} - c_h) x_{Ah} + (b_{Bl} - c_l) x_{Bl} + (b_{Bh} - c_h) x_{Bh}$$

Thus, in standard form, the linear program is the preceding objective function, and the constraints on the variables expressed as

$$\begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} x_{Al} \\ x_{Ah} \\ x_{Bl} \\ x_{Bh} \end{bmatrix} = \begin{bmatrix} T_A \\ T_B \end{bmatrix}; \quad x \ge 0$$

Note that, in this system, n = 4 and m = 2. To allow optimization, the rank of the matrix A must be smaller than n = 4. In this case, the rank of A is 2, so the system is underconstrained and optimization is possible.
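The standard form of Example 4.4 can be written down concretely as a small sketch. The traffic volumes, prices, and per-byte benefits below are hypothetical values chosen only for illustration (the example leaves them symbolic), and the helper names `rank`, `objective`, and `is_feasible` are likewise assumptions rather than anything defined in the text. The sketch checks that the rank of A is 2 and evaluates O at two feasible allocations.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix, computed by Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0  # number of pivot rows found so far
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * p for a, p in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


# Variable order: x_Al, x_Ah, x_Bl, x_Bh
A = [[1, 1, 0, 0],   # x_Al + x_Ah = T_A
     [0, 0, 1, 1]]   # x_Bl + x_Bh = T_B

# Hypothetical numbers, not taken from the example.
T_A, T_B = 100, 60
b = [T_A, T_B]
c_l, c_h = 1, 3                                   # price per byte on l and h
benefit = {"Al": 2, "Ah": 5, "Bl": 1, "Bh": 6}    # benefit per byte

# Objective coefficients: benefit minus cost for each variable.
c = [benefit["Al"] - c_l, benefit["Ah"] - c_h,
     benefit["Bl"] - c_l, benefit["Bh"] - c_h]


def objective(x):
    """Value of O = c . x."""
    return sum(ci * xi for ci, xi in zip(c, x))


def is_feasible(x):
    """True if x >= 0 and Ax = b."""
    nonneg = all(xi >= 0 for xi in x)
    equal = all(sum(a * xi for a, xi in zip(row, x)) == bi
                for row, bi in zip(A, b))
    return nonneg and equal


print(rank(A))                        # rank of the constraint matrix
print(objective([100, 0, 0, 60]))     # A on l, B on h
print(objective([0, 100, 0, 60]))     # A on h, B on h
```

With these assumed numbers, the two allocations are both feasible but give different values of O, which is why a systematic search over feasible points is needed.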
How can we find values of the $x_{ij}$ such that O is maximized? Trying every possible value of x is an exponentially difficult task, so we have to be cleverer than that.
What we need is an algorithm that systematically chooses the $x_i$ that maximize or minimize O.
To solve a linear system in standard form, we draw on the intuition developed in Examples 4.2 and 4.3. Recall that in Example 4.3, the optimal value of O was reached at one of the vertices of the constraint plane because any other point has a neighbor that lies on a better isoquant. It is only at a vertex that we “run out” of better neighbors.³ Of course, in some cases, the isoquant can be parallel to one of the hyperedges of a constraint hyperplane. In this case, O attains a minimum or maximum along an entire edge.
In a general system, the constraint plane corresponds to a mathematical object called a polytope, defined as a convex hyperspace bounded by a set of hyperplanes.
In such a system, it can be shown that the extremal value of the objective function is attained at one of the vertices of the constraint polytope. It is worth noting that a polytope in more than three dimensions is rather difficult to imagine: For instance, the intersection of two four-dimensional hyperplanes is a three-dimensional solid.
The principal fact needed about a polytope when carrying out an optimization is that each of its vertices is defined by n coordinates, which are the values assumed by the xi at that vertex. The optimal value of O is achieved for the values of the xi corresponding to the optimal vertex.
The overall approach to finding the optimal vertex is, first, to locate any one vertex of the polytope; second, to move from this vertex to the neighboring vertex where the value of the objective function is the greatest; and finally, to repeat this process until it reaches a vertex such that the value of the objective function at this vertex is greater than the objective function’s value at all of its neighbors. This must be the optimal vertex. This algorithm, developed by G. Dantzig, is the famous simplex algorithm.
The simplex algorithm builds on two underlying procedures: finding any one vertex of the polytope and generating all the neighbors of a vertex. The first procedure is carried out by setting $n - r$ of the $x_i$ to 0, so that the resulting system has rank n, and solving the resultant linear system using, for example, Gaussian elimination.
The second procedure is carried out using the observation that because A’s rank is $r < n$, it is always possible to compute a new basis for A that differs from the current basis in only one column. It can be shown that this basis defines a neighboring vertex of the polytope.
3. For nonlinear objective functions, we could run out of better points even within the constraint plane, so the optimal point may not lie at a vertex.
To carry out simplex in practice, we have to identify whether the program has incompatible constraints. This is easy because, if this is the case, the Gaussian elimination in the first procedure fails. A more subtle problem is that it is possible for a set of vertices to have exactly the same value of O, which can lead to infinite loops. We can eliminate this problem by slightly jittering the value of O at these vertices or using other similar antilooping algorithms.
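The two procedures can be illustrated with a deliberately naive sketch. It is not a production simplex implementation: it assumes that A has full row rank (so a basis has m columns), that the starting basis is feasible, and that no vertex is degenerate. It moves only to strictly better neighbors, which avoids looping among ties but could stop early at a degenerate vertex. The function names and the numbers (the constraint matrix of Example 4.4 with hypothetical traffic volumes and objective coefficients) are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations


def solve_square(M, rhs):
    """Solve M y = rhs by Gauss-Jordan elimination; return None if M is singular."""
    k = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(t)] for row, t in zip(M, rhs)]
    for col in range(k):
        pivot = next((i for i in range(col, k) if aug[i][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for i in range(k):
            if i != col and aug[i][col] != 0:
                f = aug[i][col]
                aug[i] = [a - f * p for a, p in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]


def vertex_for_basis(A, b, basis):
    """First procedure: set non-basis variables to 0 and solve for the rest."""
    n = len(A[0])
    sub = [[row[j] for j in basis] for row in A]
    y = solve_square(sub, b)
    if y is None or any(v < 0 for v in y):
        return None  # columns are not a basis, or the point violates x >= 0
    x = [Fraction(0)] * n
    for j, v in zip(basis, y):
        x[j] = v
    return x


def neighbors(basis, n):
    """Second procedure: candidate bases that differ in exactly one column."""
    for out in basis:
        for inn in range(n):
            if inn not in basis:
                yield tuple(sorted((set(basis) - {out}) | {inn}))


def objective(c, x):
    return sum(ci * xi for ci, xi in zip(c, x))


def climb(A, b, c, start_basis):
    """Move to the best strictly better neighbor until none exists."""
    n = len(A[0])
    basis, x = start_basis, vertex_for_basis(A, b, start_basis)
    while True:
        best = None
        for nb in neighbors(basis, n):
            y = vertex_for_basis(A, b, nb)
            if y is None or objective(c, y) <= objective(c, x):
                continue
            if best is None or objective(c, y) > objective(c, best[1]):
                best = (nb, y)
        if best is None:
            return basis, x, objective(c, x)
        basis, x = best


def best_vertex(A, b, c):
    """Brute-force check: evaluate O at every feasible vertex."""
    m, n = len(A), len(A[0])
    vertices = (vertex_for_basis(A, b, B) for B in combinations(range(n), m))
    return max((v for v in vertices if v is not None),
               key=lambda v: objective(c, v))


# Example 4.4's constraints with hypothetical right-hand side and coefficients.
A = [[1, 1, 0, 0], [0, 0, 1, 1]]
b = [100, 60]
c = [1, 2, 0, 3]

basis, x, value = climb(A, b, c, (0, 2))
print(basis, [int(v) for v in x], value)
```

Comparing the result of `climb` with `best_vertex` on a small problem is a quick sanity check that the walk from vertex to vertex ends at the same point as exhaustive enumeration; the enumeration itself is exponential in general and is included only for that check.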
From the perspective of a practitioner, all that needs to be done to use linear programming is to specify the objective function and the constraints to a program called a Linear Program Solver, or LP Solver. CPLEX and CS2 are two examples of well-known LP Solvers. A solver returns either the optimal value of the objective function and the vertex at which it is achieved or declares the system to be unsolvable due to incompatible constraints. Today’s LP Solvers can routinely solve systems with more than 100,000 variables and tens of thousands of constraints.
The simplex algorithm has been found to work surprisingly well in dealing with most real-life problems. However, in the worst case, it can take time exponential in the size of the input (i.e., the number of variables) to find an optimal solution.
Another LP solution algorithm, called the ellipsoid method, is guaranteed to terminate in time polynomial in the size of the input, although its performance for realistic problems is not much faster than simplex. Yet another competitor to the simplex algorithm is the interior point method, which finds the optimal vertex not by moving from vertex to vertex but by using points interior to the polytope.
Linear programming is a powerful tool. With an appropriate choice of variables, it can be used to solve problems that, at first glance, may not appear to be linear programs. As an example, we now consider how to set up the network-flow problem as a linear program.
4.3.1 Network Flow
The network-flow problem models the flow of goods in a transportation network.
Goods may be temporarily stored in warehouses. We represent the transportation network by a graph. Each graph node corresponds to a warehouse, and each directed edge, associated with a capacity, corresponds to a transportation link. The source node has no edges entering it, and the sink node has no edges leaving it. The problem is to determine the maximum possible throughput between the source and the sink.
We can solve this problem by using LP, as the next example demonstrates.
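As a rough sketch of what such a formulation looks like, the code below builds the objective vector and the standard-form constraints $Ax = b$, $x \ge 0$ for a small graph. The graph, its capacities, and the function name `flow_lp` are hypothetical and are not the network of Figure 4.4. Each edge gets one flow variable and one slack variable: flow conservation at every intermediate node is an equality, and each capacity constraint (flow no greater than capacity) becomes an equality once its slack variable is added.

```python
def flow_lp(edges, source, sink):
    """Return (c, A, b) for: maximize c . x subject to Ax = b, x >= 0.

    x holds one flow variable per edge followed by one slack variable per edge.
    """
    k = len(edges)
    nodes = {u for u, v, cap in edges} | {v for u, v, cap in edges}
    inner = sorted(nodes - {source, sink})
    A, b = [], []
    # Flow conservation at each intermediate node: inflow - outflow = 0.
    for node in inner:
        row = [0] * (2 * k)
        for j, (u, v, cap) in enumerate(edges):
            if v == node:
                row[j] += 1
            if u == node:
                row[j] -= 1
        A.append(row)
        b.append(0)
    # Capacity of each edge: flow_j + slack_j = capacity_j.
    for j, (u, v, cap) in enumerate(edges):
        row = [0] * (2 * k)
        row[j] = 1
        row[k + j] = 1
        A.append(row)
        b.append(cap)
    # Objective: total flow leaving the source; slack variables contribute 0.
    c = [1 if u == source else 0 for u, v, cap in edges] + [0] * k
    return c, A, b


# Hypothetical graph: (from, to, capacity).
edges = [("s", "a", 5), ("s", "b", 4), ("a", "b", 2), ("a", "t", 3), ("b", "t", 6)]
c, A, b = flow_lp(edges, "s", "t")

# One feasible (not necessarily optimal) flow, in the same edge order.
flow = [5, 3, 2, 3, 5]
slack = [cap - f for (u, v, cap), f in zip(edges, flow)]
x = flow + slack

feasible = (all(xi >= 0 for xi in x) and
            all(sum(a * xi for a, xi in zip(row, x)) == bi
                for row, bi in zip(A, b)))
throughput = sum(ci * xi for ci, xi in zip(c, x))
print(len(A), len(A[0]), feasible, throughput)
```

The resulting `c`, `A`, and `b` are exactly the inputs an LP solver expects; the feasible flow shown here only demonstrates that the constraints are encoded consistently.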
EXAMPLE 4.5: NETWORK FLOW
Consider the network flow graph in Figure 4.4. Here, the node s represents the source and has a total capacity of 11.6 leaving it. The sink, denoted t, has a capacity of 25.4 entering it. The maximum capacity from s to t can be no larger than 11.6 but may be smaller, depending on the intermediate paths.