doi:10.4236/ajcm.2011.14036 Published Online December 2011 (http://www.SciRP.org/journal/ajcm)
First Order Convergence Analysis for Sparse Grid Method
in Stochastic Two-Stage Linear Optimization Problem
Shengyuan Chen
Department of Mathematics and Statistics, York University, Toronto, Canada E-mail: [email protected]
Received August 27, 2011; revised September 20,2011; accepted September 30, 2011
Abstract
Stochastic two-stage linear optimization is an important and widely used optimization model. Efficiency of numerical integration of the second stage value function is critical. However, the second stage value function is piecewise linear convex, which imposes challenges for applying the modern efficient spare grid method. In this paper, we prove the first order convergence rate of the sparse grid method for this important stochastic optimization model, utilizing convexity analysis and measure theory. The result is two-folded: it establishes a theoretical foundation for applying the sparse grid method in stochastic programming, and extends the convergence theory of sparse grid integration method to piecewise linear and convex functions.
Keywords:Convergence Analysis, Stochastic Optimization, Scenario Generation, Convex Analysis, Measure Theory
1. Introduction
Stochastic two-stage linear optimization, also called sto-chastic two-stage linear programming, models a sequen-tial decision structure, where the first stage decisions are made now before the random variable manifests itself; and the second stage decisions are made adaptively to the realized random variable and the first stage decisions. The adaptive decision model has been applied many im-portant application areas. For example, in the introduc-tory farmer’s problem [1], a farmer needs to divide the land for different vegetables in spring. The farmer’s ob-jective is to maximize profit in the harvest season. The profit is related to the market price at that time and the weather dependent yield. Neither the price nor the weather is known at the present time, hence the farmer’s decision in spring has to take into account multiple sce-narios. It is not a simple forecasting problem though, since the farmer’s second stage decision in fall, which adapts to different scenarios, also jointly determines the profit. The second stage decision problem is also called “recourse” problem. [2] collects more recent applications in engineering, manufacture, finance, transportation, tele- communication et al.
A stochastic two-stage linear problem with recourse has the following general representation:
min , d , and
, = min
. . = , 0,
T x X
T
c x x P
x s y
s t Qy Tx y
(1)
where is a random vector properly defined on
,,P
, X is a polytope feasible region for the first stage, Qm n , s and are a vector and a matrix of proper sizes,T
: X, ,
is a real valued function.
The high dimensional integration in (1) is difficult and is usually approximated by using a set of scenarios and weights
k, k
, 1, ,w k K
K as:
1min , ,and
, min
. . , 0.
T k k
x X k
k T
k
c x w x
x d y
s t Qy Tx y
(2)
Under this scenario approximation, the optimal objec-tive value *
K
z of (2) provides an approximation of the optimal objective value z* of (1). An optimal solution
* K
x of (2) provides an approximation of an optimal solu-tion x* of (1).
sampling points and wk = 1K. The convergence theory of Monte Carlo method has been extensively studied [3-6]. The core result is the epi-convergent theorem: un-der mild assumptions, zK* converges to w.p.1 as
; and any clustering point of the =1, which is the sequence of optimal solutions to (2), is in the op-timal solution set of the original problem. Quasi Monte- Carlo (QMC) method has also been recently studied [7], and similar convergence result has been achieved.
*
z
*
{xK}K
K
The sparse grid (SP) method is an established high dimensional quadrature rule, which was originally pro-posed by Smolyak [8], and has been studied by many authors in the context of numerical integration [9] (and references therein). Its application in the stochastic two- stage linear optimization is only shown in a recent nu- merical study in [10]. Though [10] shows the superior numerical performance of sparse grid method, compared with both MC and QMC, the convergence analysis is based on an assumption that the recourse function is in a Sobolev space, which only holds for a very narrow sub- set of the two-stage linear problems, i.e., separable prob- lems. The contribution of this paper are 1) establishing the epi-convergence of the sparse grid method for this important decision model; 2) prove the first order con- vergence rate of the method.
We first introduce the spare grid approximation error for integrand functions in Sobolev spaces.
Let Dj denote the partial derivative operator
j
=
jf
D f x x
x
Let =
1, ,d
,j
be a multi-index, and
define
=1
=1
= =
d j
j d ,
j j
j j
D D
x
where 1 d
. The Sobolev space with smooth- ness parameter r1 is defined as
: 2 0,1 for all , dr
d f D f r
where r means component-wisely j r, 1, ,
j
d . Sobolev spaces could also be defined using
p
norms, see Evans [11]. The derivatives in the defini-tion of Sobolev space are weak derivatives. Formally,
D f is the -derivative of f if for all 0
0,1 ,d
vC i.e., infinitely differentiable function on
0,1d, D fsatisfies the following equation:
0,1
[0,1]
d
1 d
d
d
D f x v x x
.
f x D v x x
For example, f x
= x defined in
0,1 has the first order weak derivative function Df sign
x ; but the function is nondifferentiable at 0 in the usual strong sense. It has been shown that weak derivative is essen-tially unique, and coincides with the classical strong rivative when it exists. Various properties of strong de-rivatives carry over to weak derivative as well, for ex-ample, D f D
D f
D
D f
for all multi- index , , r. For more calculus rules regard-ing weak derivative, includregard-ing the extended Leibniz theorem, see Evans ([11], Section 5.2.3).The norm of the defined Sobolev space is
2
2,
r
d r
f D f
where 2
is the L2 -norm of a function, r component-wisely, 2 is the Euclidian 2-norm of a finite vector:
2
1/ 2 [0,1]2 d d
f
f x x1/ 2 2 2
1
.
k
j j
h h
For , the sparse grid method achieves the fol-lowing convergence rate [12]:
r d
f
0,1
=1 1 1 ,
d
log
,
K
k k d
k d r
r
r d r d
f w f
K
f K
(3)
where K is the number of function evaluations, r d, is a constant independent of f , increasing with both d and r, see Brass [13]. Note that fdr implies
,
i d
f ir. Since the norm r d
f and d r, are non- decreasing in r, and the term r
ln
d1 r1K K is non-increasing in for large , it is none trivial to tell which space will yield the tightest bound. The prob-lem is called fat
r K
F problem in Wonzniakowski [14]. In this paper, as we shall see, only is relevant for our discussion.
= 1
r
The convergence result in (3) only holds for the two- stage stochastic linear programming (1) in the trivial case, i.e., when the integrand function is separable. For example,
x, :
1 1
1 2 1 2 1 1 2 2 ,
, , , = min
y y
x x y y y
1 1 1 = 1
yy x
2 2 = 2 2 yy x
1, 1, 2, 2 0 y y y y is equivalent to
x x1, 2, ,1 2
1 x1 2 2 , x
where 1 x1 11,
1
2 x2 1
and the conver-gence result in (3) can be applied directly.
However, in general,
x, is non-separable piece-wise function, see Birge and Louveaux [1], and does not belong to dr for any . For example, r
1 2 1 2
,
, , , = min
y y
x x y
y
2 1 2 1
=
yy x x
, 0
y y
is equivalent to
x x1, 2, ,1 2
= 1 2 x1 2 , x
and does not have the (weak) derivative D 1,1
x x1, 2, ,
discon and non-differentiable even in the we D 1,1 f =D 1,0
D 0,1 f
if nd in (3) can not be applied to two-stage linear problem directly. The major contribution of this paper is to prove the convergence of (2) to (1) in the rate specified in (3) with r= 1, i.e., the first order conver e rate, even though
, 1d
x
. On the other hand, extends the convergence theory of sparse grid method to convex multivariate piecewise linear functions since for such a function
: : i
since is tinuous
ak sense, while Hence the erro
genc this analys (0,1)
D f
exists.
1,1
D f r bou
is
f d x for xBi polyh
and
1,,Bm
partitions presented a
= mf x
; each B
, could be
i is edron
re s1
B
in
, = 1, , .
y
i
y yd x i m
The paper is organized as the followings. In Section 2, we introduce a logarithmic mollifier function and prove its various properties. The mollifier function is quite fa-miliar to the optimization community as it is the barrier function used in the Interior Point Method for linear pro-gramming. In Section 3, we use the limiting properties of the mollifier function to prove the uniform convergence and the first order convergence rate for the objective function. We also show the converging behaviour of the optimal solutions x*K in a subsequence. Finally, Section 4 presents our conc ions.
In the coming sections, we a lus
ssume . For a
m
= 0,1d ith a inve ore general continuous distribution w rse
cu-mulative function F1: 0,1
d , one can apply trans- formation
d = 0,1df F f
F1
d.
The transforma re complexity in the
an
2. Mollifier Function
We make the followi tions of the problem
:
tion brings in mo
ng mild assump
alysis without changing our conclusion, hence we as-sume
0,1d in the following sections and extend the an ugh inverse transformation and trunca-tion in the Appendix.alysis thro
(1):
A1 X is compact with nonempty relative interior;
, , <
x X, x
, or relative;
A2: completeness, and
x, has nonempty rela- tive interior;A3: rank
Q =m;A4: is a continuous random vector with an
in-ve cu
ysis using the In
rtible mulative distribution function. Assumption A1 is necessary for our anal
terior Point Method theory. Assumption A2 is for convenience since otherwise we need to discuss the case
x, = , which will drag our analysis to a different ption A3 is implicitly assumed in many analysis of linear programming, since the rows of Q could be preprocessed such that the reduced Q has fu row rank. Assumption A4 facilitates the conversion from a unit cube
focus. Assum
ll
0,1d to through the inverse c.d.f. trans- formation.We define a mollifier function , : :
min T
,
. . =
x
, s y
B y
Tx
s t Qy
=1
=
ni logyi. In the(4)
where B y
following, we call
x, the recourse function, and ,x
the molli-fier function. We let
1
1 1
, ,
T
n
g y B y
y y
2
2 2
1
1 1
diag , , .
n
H y B y
y y
(5)
is in fact a barrier function widely used in the
( )
B y
Interior Point Method for linear programming, and its properties are well studied. As 0, the convergence of ,x
and
* *
,x, ,x
y u i in the following
theo Theore
s stated rem.
m 2.1. For any xX,
0,1, let
y*,x,u*,x
e mollifier be the optimal primal and dual solutions of th
function, then
, 0
= ,
lim x x
* *
* * , ,0
, ,
lim yx ux y ux x
,
where
y u*x, *x
unctio
is an optimal primal and dual pair of the recourse f n
x, .Proof. Due to the barrier function , the objective fu
( )
B
nction of ,x
is strictly conv Together with the relative c eness assumption, it is clear that
,x
ex. omplet
has an unique optimal solution. Since the pro- non-empty relative interior by assumption, the central path
y* u*
blem has
,x, ,x
exists for each x, and con-
verges to the analytical center of the optimal set, see
property of
Roos et al. [15] Theorem I.30 and its Definition I.20 for analytic center.
The converging
* *
,x, ,x
y u
tant topic in
(central path) as
ap-pe
where . Clearly
0
has been an impor Interior Point Me d has been extensive studied by many authors. For interested readers, in addition to the reference given in the proof, we refer the extensive research in Megiddo [16], an early work Fiacco [17], and a survey of degen-eracy and IPM by Güler et al. [18]. For readers interested in the interior point method in general, we refer Nesterov and Nemirovskii [19], Renegar [20] and Wright [21].
The KKT condition of the optimization problem thod, an
ared in (4) is
, , , 0,
T x
s g Q u F y u
Qy Tx
( 1) , :
n m m
x
F F,x
, ,is infinitely differentiable. Furthermore,
* T
, ,
,
0
x x
y u
H y Q
F
Q
, 0
,
x
F
I
(6)
where I is an identity matrix, y u, F,x is invertible
since H( ) is positive definite. the implicit functio eorem, *, , *,
Hence by
n th yx ux are infinitely differentiable
functions of . So ,
= *,
*,T
x s yxB yx is
infinitely differ tiable. Proposition 2.1. For a
en
ny xX,
0,1 , ,x( ) : C .
In the following, we directly derive t e (strong) partial de
h rivatives of D ,x
for all 1. Note that there are 2d 1 number of s satisfying 1. We also prove these partial derivatives are finite for all
0,1 ,xX . Finally, we show that their limits are finite when
0
<
. Hereinafter, a vector v< or a matrix
M
Prop
means the inequality holds c nent-wisely. osition 2.2. For all xX,
0,1 ,ompo
*
,x u ,x
= < .
= <
x ux
Furthe lim0 ,
*rmore,
grangian functi o
.
ization
Proof. The La on of the ptim
problem in (4) is
, ,
L,x y u s yT B y
u QyT uTu Tx.Since
T
* *
,x ,x, ,x, = ,x
L y u
,
* *
, ,
, , , ,
* *
* ,
* * *
, , ,
* , *
, *
,
= ,
=
=
= ,
x x
x x x
, ,
* *
,
, ,
,
x x
x x
x x
x T
u
L u
Q u
x x x
x x
x
L y
L L y
y u
y
u s g y
u
Qy Tx
u
e last equality follows the KKT condition.
where th
Clearly u*,x
<. As 0
, u*,xu*x by Theo-
rem 2.1. We let
*
*,x diag y ,x , x diag y .
x
at the
Recall th *
x
y
1,
refers to the limiting point defined in the Theorem 2. not an arbitrary optimal solution of
x, .
Lemma 2.1. For any xX,
0,
1 ,
*
2
1, = , ,
T
x x
u Q Q
* 2 2
, = , , .
T T
x x x
y Q Q Q
1
Proof. F
,y,x,u,x
=* *
0 defines implicit functions
* *
,x, ,x
,
y u . With y u, F,x, F x given in (6), y u, F,1x
can be explicitly computed as
1 1
1 1 1 1 1
1
1 1 1
,
T T T T
T T
I Q QH Q QH H Q QH Q
QH Q QH QH Q
1
H
where is a shorthand notation of
* ,xH y
H . Hence
by the i licit function theorem
* mp
, 1
( , ) , , *
,
= ,
x
y u x x x
y
F F
u
tion 2.3. For all 1 . Furthermo essentia
a 2.1,
(7)
Hence and (5).
Proposi xX,
0, ,
12 2 T
re,
,x Q ,xQ
2
0 , 0
lim x
Proof. By Propositio
lly. n 2.2 and Lemm
,x Q ,xQ
2 2 T 1
2
,x < , ,x X.
By Theorem
2.
If the optimal set of 1,
12 2
, 0
= 0 .
lim x Q xQT
x,
ce the a
is non-degenerate, 2 T
xQ
is non-singular, hen bove limit is zero. If al set is degenerate, then the limit 0
Q
the optim is not
defined. However, in the following, we sh at the degenerated case has zero Lebesgue measure, hence the limit is zero with probability one. Let’s first consider a special case of degeneration. Let the first
m
columns of Q be an optimal basis B and1 , 2, , m 0, m1 n
y y y y y be e the set
* 1
= | = , = 0, , , > 0
E y B Tx y y y
ow th
degene
0
optimal b
asic feasible solu
rated
. Since
set 0 has zero
a tion. Defin
1 2 0
B m
m
1 2
= | = 0, , , m>
Y y y y y
easure in m
, and B is both
for an arbitrarily d
n.
Lebes-gue m injective and
sur-jective, hence m E
m B Y
0. Clearly the sameargument holds egenerated optimal
basis B with an arbitrarily chosen degenerated basic colum Furthermore, there are only finitely many ways to choose basis and degenerated columns. Hence the total measure of the set which leads to a degenerated
,x is zero. Hence we conclude that the limit is ntially.
We continue t ca zero esse
o lculate higher order partial deriva-tives. Note that 2,x
is a m m matrix, and
3
, 2
,
= .
x
x
i j k k ij
In general, for any kd and any set of i1,,ik,
2 , 3 x i ik s a m
i m matrix, and
, 2
,
1 2 3 1 2,
= .
k x
x
i i ik i ik
i i
Proposition 2.4. For any xX,
0,1 , and any set of indices i1,,ik,
, 1 < , k
, 0 1 = 0 lim k x i ik essentially.Proof. We first prove the conclusion f r , and extend by induction.
o k= 2
12 2 , ( ) = , T x x k k Q Q
1 12 2 2
, , , * 1, , * 2, , 1 2 , * , , 1 2 , = 0 0 0 0 = 2 0 0 0 ,
T T T
x x x
k x k x T x k n x k T T x
Q Q Q Q Q Q
y
y
Q Q Q
y
Q Q Q
(8) where
* 1, , * 2 2
, , ,
= =
j x T T
x jk x x
jk k
y
y Q Q Q
is shown in lemma 2.1. Clearly (8) is finite for > 0. Now taking limit 0, by Theorem 2.1 and
tion 2.3:
Proposi-
2 , 0= 0 essentially.
lim x k
Furthermore, we prove by induction that
2 , x i ik 3 i ik x is in the form 2 S Q
,,x,,x
,ere S
is an algebraic expression invowh lving
multi-plicati ummation
claim
on and s
is true for
of
2
1, , ,
T
x Q xQ
, Q. The
2
,x k
it is
in (8). Suppose
true for 2
, 3 1 x i ik
, then by the chain rule,
the term
2 ,T x
Q Q 1 panded into multiplica-
tion of th ,xQ
, Q, hence thewill be ex
e same terms
form
1 2 , , T x Q
, ,
2 S Q,x,x is established. It is clear that
0 2 0 essentially
lim S Q,,x,,x by the Theo-
rem 2.1 and Proposition 2.3. We also der e hiv igher order partial derivatives explicitly in the Appendix A.
Theorem 2.2. For any 1 ,
, 0,
1 ,
2
1 2 2
( ) < ;
x n
* *
1
, 1, ,
2 2
0
( ) , < .
lim x x m x
n
u u
Furthermore,
12 2 2
* *
1, ,
2 2
, .
sup x m x
x X
u u
Proof. follows the definition of the norm 1 n e
finite-, Propositions 2.2, 2.3, 2.4, and Theorem 2.1. Th
ness of u*j x, , j= 1,, mption of the
. Furthermo
m, follows the relative com-pleteness two-stage linear problem and
duality th re,
assu
eory X is comp t, hence
o (1 gence e.
ac
<
.
3. First Order Convergence Rate
We first discuss the convergence of the objective func-tion specified in the approximati n model (2) to the ob-jective function of the true model ), and the
rat
Theorem 3.1. For all xX ,
2 11, 0,1
=1
, k , k .
d d
k
x d w x
K
Hence,
K
log d
K K
.
Proof. For notation convenience, define operators and
,
Kk k
w x
0,1
=1
,
d k
x d uniformly
E
K
A as
0,1d
d ,E f
f
=
K
k k K
A f w f
=1 k
.Then for any xX,
0,1 ,K,
,
, ,
,
, ,
,
,
K
x
x K x
K x K
E x A x
E x E
E A
A A x
Taking limiting on both sides, then the first and third term on t nd side go to zero by Theo-rem 2.1, and the sec is bounded by the classical convergence rate grid method, see (3), Proposi-tion 2.3 and Theorem
Let the objective function of the true problem (1) and the approximated problem (2) be
0
he right ha ond term of sparse
2.2.
0,1
=1
= , d ,
= , ,
d K
T k k
K
k
z x c x P
z x c x w x
Tx
an
d let the optimal objective value and optimal solution set of the true model and approximated model be
* * * *
, , K, K
3.3 state the resu
z X z X respectively. Theorem 3.2 and Theorem lts for the optimal objective value and the optimal sets separately.
Theorem 3.2. The optimal objective value converges, i.e., limKz*K z*, and the rate of convergence is
2 1* *
1, log
.
d
K d
K z z
K
Proof. For the minimization problem we note that
*
z x z x for any * * ,
x X xX, and
K xK zK x for any *
, K K
z x X xX . Let
21, log =
d
K d
K 1
K
, then
* *
=
K K K K K K
K
z
K K K
z x z x z x z x z x z x
x z x
* * *
* ,
*
* =
K K K K K
K
z x z x z x z x x
z x x
K
where the inequalities follow Theorem 3.1. Theorem 3.3. For any
K
z
z x z
* * *
,
K K
x X x X ,
K
x
1) is feasible;
2)
2 1 *
1,
log 2
d
K d
K z x z x
K
;
*
of a subsequence 3) For any clustering point x
*
, ,
Kt Kt Kt
x t x X , * *
x X ; furthermore, *
* = if X x is a singleton, then x*=x*.Proof. xK satisfies the first stage constraints and by
the relative completeness assumption, xK is feasible.
To show that xK is also very close to a mal
solu-tio m
3.1.
ny opti n, we apply the similar technique used in Theore
*
*K K K K K
z x z x z x z x
2K, 2K
,
since
=
K
z x z x
K z xK zK xK K
by Theorem 3.1, and
*K z xK z x K
by the steps in the proof of
Theo x*
is a clustering point, rem 3.3. Since
* *
* 0limt Kt
z x z x z x z x by the ine-
quality above. As a special case, if X =
x* , then* *
=
x x . t
based the uniform conve
bsequ al sets are
eto , and on xpect
op-timality of clustering points. The result h [23,24
onverg c nvergenc
K
he sparse grid method for the stoch -mming not only converges but
order rate. Our constructive proof
Mathematics, Philadelphia, 2005.
R. J.-B Wets, “Epi-Convergency of Con- Programs,” Stochastic and Stochastic Re-ports, Vol. 34, 1991, pp. 83-92.
chastic Pro- rgence, see Römisch [22]. The result is stated in su ence since the optim
not necessarily singl ns e can only e
can also be proved by epi-convergence, see Attouc ]. Epi- c en e is implied by uniform co e, see
all [25].
4. Conclusions
The modern sparse grid method is very efficient in nu-merical integration for integrant functions the Sobolev space dr. However, the integrand function in two- stage linear programming does not belong to r. We prove that
d
astic two t
stage linear progra converges in the first
also
uses a logarithmic mollifier function from interior point method.
5. References
[1] J. R. Birge and F. Louveaux, “Introduction to Stochastic Programming,” Springer, New York, 1997.
[2] S. W. Wallace and W. T. Ziemba, Eds., “Applications of Stochastic Programming,” Society for Industrial and Ap-plied
[3] A. J. King and vex Stochastic
[4] A. J. King and R. T. Rockafellar, “Asymptotic Theory for Solutions in Statistical Estimation and Sto
gramming,” Mathematics for Operations Research, Vol. 18, No. 1, 1993, pp. 148-162. doi:10.1287/moor.18.1.148 [5] A. Shapiro, “Asymptotic Analysis of Stochastic Programs,”
Annals of Operations Resesrch, Vol. 30, No. 1, 1991, pp. 169-186. doi:10.1007/BF02204815
[6] J. Dupacova and R. Wets, “Asymptotic Behavior of Sta- tistical Estimators and of Optimal Solutions of Stochastic Optimization Problems,” Annals of Statistics, Vol. 16, No. 4, 1988, pp. 1517-1549. doi:10.1214/aos/1176351052 [7] T. Pennanen and M. Koivu, “Epi-Convergent Discretiza-
tion of Stochastic Programs via Integration Quadratures,”
Numerische Mathematik, Vol. 100, No. 1, 2005, pp. 141- 163. doi:10.1007/s00211-004-0571-4
[8] S. A. Smolyak, “Interpolation and Quadrature Formula for the Class a
s
W and a s
E ,” Doklady Akademii Nauk SSSR, Vol. 131, 1960, pp. 1028-1031. (in Russian, Eng- lish Translation: Soviet Mathematica Doklady, Vol. 4, 1963,
9129717644 pp. 240-243).
[9] T. Gerstner and M. Griebel, “Numerical Integration Us- ing Sparse Grid,” Numerical Algorithms, Vol. 18, No. 3-4, 1998, pp. 209-232. doi:10.1023/A:101
Cost
[10] M. Chen and S. Mehrotra, “Epiconvergent Scenario Gen-
eration Method for Stochastic Problems via Sparse Grid,”
Stochastic Programming E-Print Series, Vol. 2008, No. 7, 2008.
[11] L. C. Evans, “Partial Differential Equations,” American Mathematical Society, Vol. 37, No. 3, 1998, pp. 363-367. [12] G. W. Wasilkowsi and H. Wozniakowski, “Explicit
Bounds of Algorithms for Multivariate Tensor Product Problems,” Journal of Complexity, Vol. 11, No. 1, 1995, pp. 1-56. doi:10.1006/jcom.1995.1001
[13] H. Brass and G. H¨ammerlin, Eds., “Bounds for Peano kernels,” Vol. 112, Birkhäuser, Basel, 1993, pp. 39-55. [14] H. Wozniakowski, “Information-Based Complexity,” An-
nual Review of Computer Science, Vol. 1, No. 1, 1986, pp. 319-380. doi:10.1146/annurev.cs.01.060186.001535 [15] C. Roos, T. Terlaky and J.-P. Vial, “Interior Point Methods
for Linear Optimization,” Springer, New York, 1997.
ew De- [16] N. Megiddo, “Progress in Mathematical Programming, Chapter Pathways to the Optimal Set in Linear Program- ming,” Springer-Verlag, New York, 1989, p. 132. [17] A. V. Fiacco, “Introduction to Sensitivity and Stability Ana-
lysis in Nonlinear Programming,” Academic Press, N York, 1983.
[18] O. Güler, D. den Hertog, C. Roos and T. Terlaky, “ generacy in Interior Point Methods for Linear Program- ming: A Survey,” Annals of Operations Research, Vol. 46-47, No. 1, 1993, pp. 107-138.
doi:10.1007/BF02096259
[19] Y. Nesterov and A. Nemirovskii, “Interior Point Polyno- mial Algorithms in Convex Programming,” Society for
2001.
Interior-Point Methods,”
Soci-800
Industrial and Applied Mathematics, Philadelphia, 1994. [20] J. Renegar, “A Mathematical View of Interior-Point Me-
thods in Convex Optimization,” Society for Industrial and Applied Mathematics, Philadelphia,
[21] S. J. Wright, “Primal-Dual
ety for Industrial and Applied Mathematics, Philadelphia, 1997.
[22] W. Römisch, “An Approximation Method in Stochastic Optimal Control,” In: Optimization Techniques, Part 1,
Lecture Notes in Control and Information Sciences, Sprin- ger-Verlag, New York, 1980, pp. 169-178.
[23] H. Attouch, “Variational Convergence for Functions and Operators,” Pitman (Advanced Publishing Programs), 1984.
[24] H. Attouch and R. J.-B. Wets, “Quantitative Stability of Variational Systems: I. The Epigraphical Distance,” Tran- sactions of the American Mathematical Society, Vol. 328, No. 2, 1991, pp. 695-729. doi:10.2307/2001
Vol. 11, No. 1, 1998, pp. 9-18.
[25] P. Kall, “Approximation to Optimization Problems: An Elementary Review,” Mathematics of Operations Re- search,
doi:10.1287/moor.11.1.9
tion and
or a random vector on with invertible cumulative ion om
Appendix A. Inverse Transforma
Truncation
F
distribution function F1: 0,1
d , the integrat domain of a mollifier function can be transformed fr to
0,1d:
1
,x F d = ,x F d ,
[0,1]d
then we apply the sparse grid method to generate scenar- s and weights
io for on the cube . We need to
check the properties of the integrand . Its differentiability only depends on
0,1dfunction
1 ,x F
1
( )
F
since ,x C ( )
. Most commonly used invertible cu-mulative distribution functions, for e ple, inverse of normal distribution function 1
, is also in Cxam
. The finiteness of the partial derivatives of ,x F 1
only d
also epends on F1
since partial derivatives of
,x
are finite for any multi-index 1 compo-nent-wisely.
The higher order partial derivative of a posite function can be calculated explicitly. For
: f g
h x X y Y z , we can ap-ply the Faà di Bruno’s formula:
com
n
d
= = ,
d
n
n P B
n
P B P
f g x f g x f g x g x
x
where n is the set of all partit
ions of the set Jn of
integers 1,,n. A partition of Jn
s of
is a family o ir-wise di nempty subset
f pa
sjoint no Jn whose union is
n
J . A means the cardinality of the set A. For a vec-p
rm tor com site function:
: f g ,
h y Y z
We apply Tsoy-Wo Ma’s higher chain fo ula [26]: o
x X
, , =1
1 1 = !
! !
mk p
m s k
m pk
s p m k k k
z
z y
m p
x y x
where is the set of all decompositions of multi-index with multiplicities Calculation involving a multi-
x
m.
inde
1, ,
follows rules:
=1 =1
= , ! = !,
= .
j j
j j
j
z
=1 =1
= jj,
j j j
x x z
x x
A multi-index
decomposes into s parts 1, , s
p p in with multiplicities m1,,ms in
respectively if the decomposition equation
1 1 2 2
= m p m p m ps s
holds and all parts are different. The total multiplicity is defined as
1 2
= s.
m m m m
The list
s p m, ,
is called a -decomposition of . nsure all parts are different we may impose1 2
0
To e
s
p p p
, where means 1=j,,j1=j1 , but j<j , for a
j.
For the problem under discussion, let 1 compo-nent-wisely, i.e., r= 1, note that are 2d 1 number of such (s). Fol g the higher c formula, we get
lowin hain rule
1 ,
1 1
= .
x
k
F
,
!
m p
m s k
x
m pk
ce
, , =1 ! !
s p m k mk pk
Furthermore, sin m02,x
by Proposi-tion 2.3, the computation of
= 0 li
1 F
0 ,
lim x
can be simplified significantly. In this case, only the de-compositions
1, , ei
,-zeros in the
i
e is the th unit basis of cor-
respond to non ula. Otherwise, for
i
above form
d ,
, m=m1 ms, and lim 0 ,
= 0m x m
.
2
s
Hence
,0 0 =1
im
d
x i
i i w
1
, =
lim x F l i
* , =1
= .
d
i i i x i
u w
Hence,
1 *
, 1 ,
0
= <
d
x i
=1
2 1 2
,
lim i x i
d i
F u
if and only if
< i
almost surely 1, i= 1, ,d
. For some distributions, the condi-tion might not hold. For example, the inverse of a cumu-lative function of the normal distribution does not have this property nearby 0 or 1. To remove the singularities, truncation of the cube [0, 1] could be applied:
where
0,1dh x x( )d ,1dh x
d ,x
0 << 1 han we need t
is a small positive number. To
com-d sicom-de rd sparse grid
method, o change the variable to , where pute the right using the standa
= 1 2 : 0
x y ,1
d
,1
d1
. Hence
,1
d = 1 2
0,1
1 2
d . ddh x x dh y y
Hence for a two-stage linear problem with an invert-ible but unbounded cumulative distribution function
F , we shall first generate the standa weights
,
1
= 1 2 , = 1 2 d ,
k F k wk wk
finally use the
and
k, k
w
in the approximation model (2)
The er proximation model is exactly the .
ror of this ap
sum of truncation error et, and sparse grid
approxima-tion error es. Erro
rd grid points and
=1
K k k
w
using the sp
k arse grid method,
then scale and transform them to the original random variable by
r et goes down with and error
s
e goes down with increasing K at der rate sparse g ethod.
the first or