3.4 Stochastic Optimization
3.4.4 Monte Carlo Methods
[Figure: Relative Solutions for a Minimization Problem Under Uncertainty; labels shown: EEV, RP, WS, EV, EVPI, VSS]

For most problems it is not practical to solve the stochastic optimization problem against the complete set of possible realizations of the random vector, so the problem is typically solved for some sample of possible outcomes. Since we are solving against a sample of possible outcomes, the calculated optimal solution is a statistical estimate of the true solution, subject to statistical sampling error. A growing body of literature addresses the statistical properties of sampled stochastic programs under the general category of Monte Carlo methods. An overview of Monte Carlo methods is provided in (Birge and Louveaux 1997), Chapter 10. Monte Carlo methods for stochastic programming borrow many concepts from the simulation literature, and the interface between these two methodologies is discussed in (Pflug 1996).
An important consideration in Monte Carlo methods is the distinction between the true problem and the sampled problem. The true problem can be written as [32]:

$$z^* = \min_{x \in X} E\,f(x, \xi) \tag{3.65}$$

$$x^* \in \arg\min_{x \in X} E\,f(x, \xi) \tag{3.66}$$

We seek the decision vector $x^*$ and the corresponding objective value $z^*$ that minimize the expected value of the objective function $f(x, \xi)$, where the expectation is taken over the support of the random vector $\xi$. The actual problem we solve is the sample path approximation problem, an optimization over a finite set of samples $\xi^i$, $i = 1, \ldots, n$. The sample path approximation problem can then be written as:

$$z_n^* = \min_{x \in X} \frac{1}{n} \sum_{i=1}^{n} f(x, \xi^i) \tag{3.67}$$

$$x_n^* \in \arg\min_{x \in X} \frac{1}{n} \sum_{i=1}^{n} f(x, \xi^i) \tag{3.68}$$

An important issue in stochastic programming is the statistical convergence of the sample path solution $z_n^*$ to the true solution $z^*$. We are interested in how close our approximate solution is to the true solution. Conversely, we may be interested in determining how many samples (scenarios) are required in order to achieve a desired level of confidence. An important paper on this topic is (Dupacova and Wets 1988). This paper establishes the basic convergence properties of the sampled stochastic program, proving that the sample path solution converges to the true solution as the sample size goes to infinity. Other papers on this topic include (Shapiro 1991; King and Rockafellar 1993).

[32] The exact notation varies from paper to paper. Here I adopt a notation used in Mak, Morton, and Wood (1999).
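As a minimal sketch of the sample path approximation in (3.67) and (3.68), the following Python fragment solves the sampled problem by enumeration over a finite feasible set. The newsvendor-style cost function, the uniform demand distribution on [0, 200], the integer grid for X, and the names `cost` and `solve_sample_path` are all assumptions made for illustration; none of them come from the cited papers. For this particular choice the true problem can be solved analytically, giving x* = 150 and z* = 75, so the sampled values can be compared against a known target.

```python
import random

def cost(x, xi, h=1.0, b=3.0):
    """Hypothetical newsvendor cost f(x, xi): h per unsold unit, b per unit of unmet demand."""
    return h * max(x - xi, 0.0) + b * max(xi - x, 0.0)

def solve_sample_path(samples, candidates):
    """Return (z_n, x_n): the minimum and a minimizer of the sample-average cost over X."""
    def avg_cost(x):
        return sum(cost(x, xi) for xi in samples) / len(samples)
    x_n = min(candidates, key=avg_cost)      # enumeration stands in for a real solver
    return avg_cost(x_n), x_n

rng = random.Random(0)                        # fixed seed so the run is repeatable
candidates = [float(x) for x in range(0, 201)]  # assumed finite feasible set X

for n in (10, 100, 1000):
    samples = [rng.uniform(0.0, 200.0) for _ in range(n)]   # xi^1, ..., xi^n
    z_n, x_n = solve_sample_path(samples, candidates)
    print(f"n={n:5d}  z_n={z_n:7.2f}  x_n={x_n:6.1f}")
```

Enumeration is used only because the example is one-dimensional; any optimization algorithm could take its place without changing the structure of the approximation.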
While the sampled problem converges to the true solution, it generates a biased estimate of the true solution for finite samples. (Mak, Morton et al. 1999) show that the expected outcome of the sampled problem is optimistically biased and that the bias is decreasing in the number of samples. The convergence properties of the solution vector $x$ are discussed in the most detail in (Shapiro and Homem-de-Mello 2000). The paper shows that in the case of a two stage linear stochastic program with a discrete distribution, the optimal solution to the approximating problem will be exactly equal to the solution of the true problem for a large enough sample size N. They also show that the rate of convergence is exponential in the number of samples. An empirical assessment of sampling bias is provided in (Freimer, Thomas et al. 2006; Linderoth, Shapiro et al. 2006).

A method for constructing a confidence interval on the sampled problem is developed in (Mak, Morton et al. 1999). The solution to any sampled problem provides a point estimate of a lower bound for a stochastic minimization problem. Assume we choose to solve the approximation several times to improve our estimate. Let $\xi^{i1}, \ldots, \xi^{in}$, $i = 1, \ldots, n_\ell$, be a set of $n_\ell$ different batches of scenarios, each of which has $n$ observations. Define

$$z_n^{*i} = \min_{x \in X} \frac{1}{n} \sum_{j=1}^{n} f(x, \xi^{ij}) \tag{3.69}$$

to be the objective value found by solving the sample path approximation problem against the $i$th batch of sample scenarios.
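The reason each batch value estimates a lower bound can be seen in a short argument, sketched here under the assumption that every sampled $\xi^i$ has the same distribution as $\xi$. Minimizing inside the expectation can only help, because the sampled problem is free to pick a different $x$ for every sample, while the true problem must commit to a single $x$:

```latex
\begin{aligned}
E\left[z_n^*\right]
  &= E\left[\min_{x \in X} \frac{1}{n}\sum_{i=1}^{n} f(x, \xi^i)\right] \\
  &\le \min_{x \in X} E\left[\frac{1}{n}\sum_{i=1}^{n} f(x, \xi^i)\right] \\
  &= \min_{x \in X} E\,f(x, \xi) \;=\; z^*
\end{aligned}
```

This is the optimistic bias noted above: on average the sampled objective value lies at or below the true optimal value of the minimization problem.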
We can then define our estimated lower bound as

$$L(n_\ell) = \frac{1}{n_\ell} \sum_{i=1}^{n_\ell} z_n^{*i} \tag{3.70}$$

The lower bound is set as the average over the $n_\ell$ different batches of scenarios solved.
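The batch average in (3.70) can be sketched in a few lines of Python. As before, the newsvendor-style cost, the uniform demand on [0, 200], the coarse grid used for X, and the function names are assumptions chosen only to make the example self-contained; for this setup the true optimal value is 75. The sketch also reports the standard error of the batch average, computed from the sample standard deviation of the batch optima.

```python
import math
import random
import statistics

def cost(x, xi, h=1.0, b=3.0):
    """Hypothetical newsvendor cost used only for illustration."""
    return h * max(x - xi, 0.0) + b * max(xi - x, 0.0)

def batch_optimum(samples, candidates):
    """z_n^{*i} of (3.69): the minimum sample-average cost for one batch."""
    return min(sum(cost(x, xi) for xi in samples) / len(samples)
               for x in candidates)

def lower_bound(n_batches, n, candidates, rng):
    """Return L(n_l) of (3.70) and the standard error of that average."""
    z = [batch_optimum([rng.uniform(0.0, 200.0) for _ in range(n)], candidates)
         for _ in range(n_batches)]
    mean = statistics.mean(z)
    std_err = statistics.stdev(z) / math.sqrt(n_batches)
    return mean, std_err

rng = random.Random(1)
candidates = [float(x) for x in range(0, 201, 5)]   # assumed finite feasible set X

# Same number of batches, increasing scenarios per batch.
for n in (5, 20, 80):
    L, se = lower_bound(30, n, candidates, rng)
    print(f"n={n:3d}  L={L:6.2f}  std.err={se:5.2f}")
```

Running the loop with different batch sizes is one way to observe the bias discussed above: the estimate should tend to move toward the true value from below as the number of scenarios per batch grows, although any single run is subject to sampling noise.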
To calculate an upper bound, assume we have a candidate solution $\hat{x}$. This solution might be the result of solving the mean value problem, or it could be a solution obtained from the sample path approximation problems solved for the lower bound. We can then calculate an upper bound by estimating the expected cost of implementing the solution $\hat{x}$:

$$U(n_u) = \frac{1}{n_u} \sum_{i=1}^{n_u} f(\hat{x}, \xi^i) \tag{3.71}$$

The solution to the reference problem provides an unbiased estimate of the expected cost of implementing the stage one decision $\hat{x}$. We can then define an approximate $(1 - 2\alpha)$ confidence interval on the optimality gap as

$$\left[\, 0,\; \left[ U(n_u) - L(n_\ell) \right]^+ + \varepsilon_u + \varepsilon_\ell \,\right] \tag{3.72}$$

where $[a]^+ = \max(a, 0)$ and $(\varepsilon_\ell, \varepsilon_u)$ are sampling error terms for the lower and upper bounds, each computed from the estimated standard error of the corresponding bound.

These results suggest a general procedure for estimating a set of bounds on the optimal solution to a stochastic optimization problem.
1) Determine the number of batches ($n_\ell$) to be solved and the number of scenarios ($n$) to be used in each batch.
2) Solve each of the $n_\ell$ problems to optimality using any algorithm.
3) Calculate a point estimate for the lower bound using (3.70) (average objective value).
4) Calculate the sample variance of the lower bound. Use this statistic to estimate the standard error on the lower bound, $\varepsilon_\ell$.
5) Calculate a candidate solution $\hat{x} = n_\ell^{-1} \sum_{i=1}^{n_\ell} x_n^{*i}$, where $x_n^{*i}$ are the solutions to the individual batch problems.
6) Generate a set of $n_u$ scenarios to be used for the upper bound estimate. Note that it is often the case that the number of scenarios used for the upper bound is much larger than the number of scenarios used in the lower bound calculation.
7) Calculate an estimate of the upper bound by solving the reference problem with the stage 1 decision fixed at $\hat{x}$. Note that this solution finds the optimal recourse for each scenario given $\hat{x}$ and takes the average. Since the master problem is not optimized, we need only solve the subprogram $n_u$ times, which can be done relatively fast.
8) The optimal solution to the reference problem is the expected cost of implementing $\hat{x}$ and is our point estimate for the upper bound.
9) Calculate the sample variance of the upper bound and use that to calculate the standard error.
10) Calculate a point estimate of the optimality gap, $\left[ U(n_u) - L(n_\ell) \right]^+$.
11) Calculate a confidence interval on the optimality gap using (3.72).
Figure 3-4 Stochastic Bounding Procedure
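The bounding procedure can be sketched end to end in Python. This is an illustration under stated assumptions, not the implementation used in the cited work: the newsvendor-style cost, the uniform demand on [0, 200], the grid for X, and all function names are hypothetical. Two simplifications should be noted. First, the recourse cost is available in closed form here, so "solving the subprogram" reduces to evaluating `cost`. Second, the sampling error terms use a normal critical value from `statistics.NormalDist` as a stand-in for whatever critical value the analyst prefers. Averaging the batch solutions in step 5 also presumes that the average remains feasible, which holds here because any real order quantity can be evaluated.

```python
import math
import random
import statistics

def cost(x, xi, h=1.0, b=3.0):
    """Hypothetical newsvendor cost used only for illustration."""
    return h * max(x - xi, 0.0) + b * max(xi - x, 0.0)

def solve_batch(samples, candidates):
    """Return (objective value, solution) of one sample path problem."""
    def avg(x):
        return sum(cost(x, xi) for xi in samples) / len(samples)
    x_best = min(candidates, key=avg)
    return avg(x_best), x_best

def bounding_procedure(n_l, n, n_u, candidates, rng, alpha=0.05):
    def draw(k):
        return [rng.uniform(0.0, 200.0) for _ in range(k)]
    # Assumption: normal approximation for the critical value at level alpha.
    q = statistics.NormalDist().inv_cdf(1.0 - alpha)

    # Steps 1-2: solve n_l batch problems with n scenarios each.
    results = [solve_batch(draw(n), candidates) for _ in range(n_l)]
    z = [obj for obj, _ in results]
    # Steps 3-4: lower bound (3.70) and its sampling error.
    L = statistics.mean(z)
    eps_l = q * statistics.stdev(z) / math.sqrt(n_l)
    # Step 5: candidate solution as the average of the batch solutions.
    x_hat = statistics.mean([x for _, x in results])
    # Steps 6-9: upper bound (3.71) on a fresh, larger sample, with x_hat fixed.
    costs = [cost(x_hat, xi) for xi in draw(n_u)]
    U = statistics.mean(costs)
    eps_u = q * statistics.stdev(costs) / math.sqrt(n_u)
    # Steps 10-11: gap estimate and interval (3.72).
    gap = max(U - L, 0.0)
    return {"x_hat": x_hat, "L": L, "U": U, "gap": gap,
            "interval": (0.0, gap + eps_l + eps_u)}

candidates = [float(x) for x in range(0, 201, 5)]   # assumed finite feasible set X
result = bounding_procedure(n_l=20, n=50, n_u=5000,
                            candidates=candidates, rng=random.Random(3))
for key, value in result.items():
    print(key, value)
```

The upper bound sample is drawn independently of the batches that produced the candidate solution, which is what keeps the cost estimate for the fixed decision unbiased.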
This bounding technique forms the basis for a solution approach known as Sample Average Approximation (SAA), which is developed in (Kleywegt, Shapiro et al. 2001). The basic idea of SAA is that instead of solving the problem once with a large set of samples, we solve the problem multiple times with smaller samples and examine the statistical properties of the resulting solutions. This procedure is analogous to the multiple runs concept in discrete simulation.
The approach is a loose framework that does not prescribe particular solution algorithms, but rather a general iterative approach. I present a slightly simplified version of the algorithm presented in Section 3.5 of (Kleywegt, Shapiro et al. 2001).
1) Choose initial sample sizes $N$ and $N'$, a tolerance level $\varepsilon$, and a number of batches $M$.
2) For each batch $m = 1, \ldots, M$ perform the following:
   a) Generate a sample of size $N$ and solve the SAA problem, with objective value $\hat{v}_N^m$ and $\varepsilon$-optimal solution $\hat{x}_N^m$.
   b) Estimate the optimality gap and the variance of the gap estimator.
   c) If the optimality gap and the variance of the gap estimator are sufficiently small, go to step 4.
3) If the optimality gap or the variance of the estimate is too high, increase the sample size and go to step 1.
4) Choose the best solution $\hat{x}$ among all candidate solutions using a screening and selection process. Stop.
Figure 3-5 General Sample Average Approximation Procedure
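The loop in Figure 3-5 can be sketched as follows. This is a loose illustration and departs from the figure in ways that should be stated plainly: the gap is checked once per round after all M batches are solved rather than inside the batch loop, the screening step simply evaluates every distinct candidate on one common sample of size N' and keeps the cheapest, the sample size is doubled when the test fails, and a `max_rounds` cap is added so the loop always terminates. The cost function, demand distribution, default parameter values, and function names are hypothetical.

```python
import math
import random
import statistics

def cost(x, xi, h=1.0, b=3.0):
    """Hypothetical newsvendor cost used only for illustration."""
    return h * max(x - xi, 0.0) + b * max(xi - x, 0.0)

def solve_saa(samples, candidates):
    """Step 2a: return (v_hat, x_hat) for one sample of size N."""
    def avg(x):
        return sum(cost(x, xi) for xi in samples) / len(samples)
    x_hat = min(candidates, key=avg)
    return avg(x_hat), x_hat

def saa(candidates, rng, N=20, N_prime=2000, M=10, tol=2.0, max_rounds=6):
    def draw(k):
        return [rng.uniform(0.0, 200.0) for _ in range(k)]
    for _ in range(max_rounds):
        used_N = N
        # Step 2a: M independent SAA problems of size N.
        batches = [solve_saa(draw(N), candidates) for _ in range(M)]
        v = [v_hat for v_hat, _ in batches]
        # Step 4 (simplified screening): score each distinct candidate on a
        # common evaluation sample of size N' and keep the cheapest.
        eval_sample = draw(N_prime)
        scored = []
        for x in sorted({x_hat for _, x_hat in batches}):
            c = [cost(x, xi) for xi in eval_sample]
            scored.append((statistics.mean(c), statistics.variance(c) / N_prime, x))
        g_hat, g_var, x_best = min(scored)
        # Step 2b: gap estimate and the variance of the gap estimator.
        gap = g_hat - statistics.mean(v)
        gap_var = g_var + statistics.variance(v) / M
        # Step 2c: stop when both are small enough.
        if gap <= tol and math.sqrt(gap_var) <= tol:
            break
        N *= 2   # Step 3: increase the sample size and try again.
    return x_best, gap, gap_var, used_N

candidates = [float(x) for x in range(0, 201, 5)]   # assumed finite feasible set X
x_best, gap, gap_var, used_N = saa(candidates, random.Random(7))
print(f"x_best={x_best}  gap={gap:.2f}  sd={math.sqrt(gap_var):.2f}  N={used_N}")
```

Picking the cheapest candidate on the same sample that is then used for the gap estimate introduces a small selection effect; a more careful version would re-evaluate the selected solution on an independent sample.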