Buffered Probability of Exceedance: Mathematical Properties
and Optimization Algorithms
Alexander Mafusalov, Stan Uryasev RESEARCH REPORT 2014-1 Risk Management and Financial Engineering Lab Department of Industrial and Systems Engineering 303 Weil Hall, University of Florida, Gainesville, FL 32611.
E-mails: [email protected], [email protected]. First draft: October 2014, This draft: October 2014 Correspondence should be addressed to: Stan Uryasev
Abstract
This paper introduces a new probabilistic characteristic called buffered probability of exceedance (bPOE). This characteristic is an extension of the so-called buffered probability of failure, and it is equal to one minus the superdistribution function. The paper provides efficient calculation formulas for bPOE. bPOE is proved to be a quasi-convex function of the random variable w.r.t. the regular addition operation and a concave function w.r.t. the mixture operation; it is a monotonic function of the random variable. bPOE is proved to be a strictly decreasing function of the parameter on the interval between the mathematical expectation and the essential supremum. The multiplicative inverse of bPOE is proved to be a convex function of the parameter, and a piecewise-linear function in the case of a discretely distributed random variable. Minimization of bPOE can be reduced to a convex program for a convex feasible region and to an LP for a polyhedral feasible region. A family of bPOE minimization problems and the family of the corresponding CVaR minimization problems share the same frontier of optimal solutions and optimal values.
Keywords: probability of failure, probability of exceedance, buffered probability of failure, superdistribution, superquantile, Conditional Value-at-Risk, CVaR, parametric simplex method
1. Introduction
This paper uses the notation $\mathrm{CVaR}_\alpha(X)$ for the conditional value-at-risk (CVaR) of a random variable $X$ at a confidence level $\alpha \in [0,1]$, explored in [5]. For a more concise notation, the alternative name superquantile, $\bar q_\alpha(X)$, is used, by analogy with the regular quantile $q_\alpha(X)$. That is, $\bar q_\alpha(X) = \mathrm{CVaR}_\alpha(X)$. The notation $\bar q(\alpha; X)$ presents the superquantile $\bar q_\alpha(X) = \bar q(\alpha; X)$ as a function of the parameter $\alpha$. For example, $\bar q^{-1}(x; X)$ should be interpreted as the inverse of the superquantile viewed as a function of $\alpha$.
The probability of exceedance is defined as $p_x(X) = P(X > x) = 1 - F_X(x)$, where $F_X(x)$ is the distribution function of $X$. In engineering applications it is common to see an optimization problem with a probability of exceedance in the constraints or as the objective. Paper [4] suggests, as an alternative to the probability of failure $p(X) = P(X > 0)$, the buffered probability of failure, which is a value $\bar p(X)$ such that $\bar q_{1-\bar p(X)}(X) = 0$. This paper defines the buffered probability of exceedance $\bar p_x(X)$ in such a way that $\bar p_0(X) = \bar p(X)$ and $\bar p_x(X) = \bar p_0(X - x)$.
To define the buffered probability of exceedance, we introduce the following mathematical notions from paper [3]. For any random variable $X$ with distribution function $F_X(x)$ there is an auxiliary random variable $\bar X = \bar q(F_X(X); X)$ with distribution function $F_{\bar X}(x) = \bar F_X(x)$, called the superdistribution function, and
$$\bar F_X(x) = \begin{cases} 1, & \text{for } x \ge \sup X; \\ \bar q^{-1}(x; X), & \text{for } EX < x < \sup X; \\ 0, & \text{otherwise,} \end{cases}$$
where $\bar q^{-1}(x; X)$ is the inverse of the function $\bar q(\alpha; X)$ as a function of $\alpha$.
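Definition 1. For a random variable $X$ and $x \in \mathbb{R}$, the buffered probability of exceedance is defined as follows:
$$\bar p_x(X) = 1 - \bar F_X(x) = \begin{cases} 0, & \text{for } x \ge \sup X; \\ 1 - \bar q^{-1}(x; X), & \text{for } EX < x < \sup X; \\ 1, & \text{otherwise.} \end{cases}$$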
Book [9] considers a Chebyshev-type family of inequalities with CVaR deviation and shows that the tightest inequality in the family is obtained for $\alpha = \bar p_x(X)$, and that the tightest inequality itself reduces to
$$p_x(X) \le \bar p_x(X). \qquad (1)$$
Inequality (1) is similar to $q_\alpha(X) \le \bar q_\alpha(X)$. Inequality (1) is one of the motivations for introducing the buffered probability of exceedance instead of the regular probability of exceedance. Paper [4] uses inequality (1) to argue that the buffered probability of failure is a conservative estimate of the probability of failure. Similarly, the buffered probability of exceedance is a conservative estimate of the probability of exceedance.
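As a small illustration of inequality (1), consider a random variable taking the values $-1$ and $1$ with probability $1/2$ each and the threshold $x = 1/2$. This two-point example is chosen here for illustration and is not taken from the cited papers; it only applies Definition 1 and the tail-average form of the superquantile.

```latex
\begin{align*}
p_{1/2}(X) &= P(X > 1/2) = \tfrac12, \\
\bar q_\alpha(X) &= \frac{(-1)\left(\tfrac12 - \alpha\right) + (1)\,\tfrac12}{1-\alpha}
                  = \frac{\alpha}{1-\alpha}, \qquad 0 \le \alpha \le \tfrac12, \\
\bar q_\alpha(X) &= \tfrac12 \iff \alpha = \tfrac13, \\
\bar p_{1/2}(X) &= 1 - \bar q^{-1}\!\left(\tfrac12; X\right) = 1 - \tfrac13 = \tfrac23
                 \;\ge\; \tfrac12 = p_{1/2}(X).
\end{align*}
```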
Section 2 proves several formulas for efficient calculation of $\bar p_x(X)$. Section 3.1 investigates mathematical properties of $\bar p_x(X)$ w.r.t. the parameter $x$. Section 3.2 establishes mathematical properties of $\bar p_x(X)$ w.r.t. the random variable $X$. Section 4 studies minimization of $\bar p_x(X)$ over a feasible region $X \in \mathcal{X}$.
2. Calculation Formulas for bPOE
Note that, since $\bar q_\alpha(X) - x = \bar q_\alpha(X - x)$ for any constant $x$, see e.g. [6], we have $\overline{X - x} = \bar X - x$ and $\bar F_X(x) = \bar F_{X-x}(0)$. Therefore, $\bar p_x(X) = \bar p_0(X - x)$.
The following proposition is a slightly modified version of a proposition from paper [1], which studies applications of the buffered probability of exceedance in classification.
Proposition 1. For a random variable $X$ and $x \in \mathbb{R}$, the buffered probability of exceedance equals
$$\bar p_x(X) = \begin{cases} 0, & \text{if } x = \sup X; \\ \min_{a \ge 0} E[a(X - x) + 1]^+, & \text{otherwise.} \end{cases} \qquad (2)$$
Proof. In the definition of the buffered probability of exceedance we have three cases:
1. $\bar p_x(X) = 1 - \bar q^{-1}(x; X)$ when $EX < x < \sup X$,
2. $\bar p_x(X) = 1$ when $x < EX$ or $x = EX < \sup X$,
3. $\bar p_x(X) = 0$ when $x \ge \sup X$.
Let us prove the proposition case by case.
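1. Let $EX < x < \sup X$, and take $x = 0$. Since $\bar q_\alpha(X)$ is a strictly increasing function of $\alpha$ on $\alpha \in [0, 1 - P(X = \sup X)]$, the equation $\bar q_{1-p}(X) = 0$ has a unique solution $p^*$ for $EX < x < \sup X$. Then $\bar p_0(X) = p$ such that $\min_c \left\{ c + \frac{1}{p} E[X - c]^+ \right\} = 0$. Since $\bar q_\alpha(X)$ is an increasing function of the parameter $\alpha$, we can reformulate $\bar p_0(X) = \min_p p$ such that $\min_c \left\{ c + \frac{1}{p} E[X - c]^+ \right\} \le 0$. Therefore,
$$\bar p_0(X) = \min_{p,\,c}\; p \quad \text{s.t.} \quad c + \frac{1}{p} E[X - c]^+ \le 0.$$
The optimal $c^* < 0$, since $c^* \le c^* + \frac{1}{p^*} E[X - c^*]^+ \le 0$, and $c^* = 0$ implies $\sup X \le 0$, which is not the case we consider. Therefore, multiplying the constraint by $p/|c|$,
$$\bar p_0(X) = \min_{p,\,c}\; p \quad \text{s.t.} \quad p\,\frac{c}{|c|} + E\left[\frac{1}{|c|} X - \frac{c}{|c|}\right]^+ \le 0.$$
Since $c^* < 0$, then $c^*/|c^*| = -1$. Further, denoting $a = 1/|c|$, we have
$$\bar p_0(X) = \min_{p,\,a > 0}\; p \quad \text{s.t.} \quad E[aX + 1]^+ \le p,$$
and, therefore,
$$\bar p_0(X) = \min_{a \ge 0} E[aX + 1]^+.$$
Note that changing $a > 0$ to $a \ge 0$ adds the point $a = 0$, with objective value 1, to the feasible region, which does not affect the case considered. Finally, since $\bar p_x(X) = \bar p_0(X - x)$, then
$$\bar p_x(X) = \min_{a \ge 0} E[a(X - x) + 1]^+.$$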
2. When $EX \ge x$, we have $E[a(X - x) + 1]^+ \ge aE(X - x) + 1 \ge 1$. Note also that $E[a(X - x) + 1]^+ = 1$ for $a = 0$. Therefore, $\min_{a \ge 0} E[a(X - x) + 1]^+ = 1$.
3. For $x = \sup X$, by formula (2), $\bar p_x(X) = 0$. Consider $x > \sup X$, i.e., $X - x \le -\varepsilon < 0$. Taking $a = 1/\varepsilon$ makes $a(X - x) \le -1$, therefore, $\min_{a \ge 0} E[a(X - x) + 1]^+ = 0$.
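Formula (2) can be evaluated directly for a discretely distributed random variable. The sketch below is only an illustration under stated assumptions: the function name `bpoe` and the list-based representation of the distribution are hypothetical, and all listed probabilities are assumed positive. It relies on the fact that the objective $E[a(X - x) + 1]^+$ is convex and piecewise linear in $a$, so for $x < \sup X$ its minimum over $a \ge 0$ is attained at $a = 0$ or at a kink $a = 1/(x - x_i)$ with $x_i < x$.

```python
def bpoe(values, probs, x):
    """Buffered probability of exceedance of a discrete random variable.

    values, probs: atoms and their (positive) probabilities (hypothetical layout).
    x: threshold. Implements formula (2) of Proposition 1.
    """
    if x >= max(values):
        # Formula (2) sets bPOE to 0 at x = sup X; case 3 of the proof covers x > sup X.
        return 0.0

    def objective(a):
        # E[a(X - x) + 1]^+ for a fixed multiplier a >= 0.
        return sum(p * max(a * (v - x) + 1.0, 0.0) for v, p in zip(values, probs))

    # The objective is convex piecewise linear in a, with kinks at a = 1/(x - v), v < x.
    candidates = [0.0] + [1.0 / (x - v) for v in values if v < x]
    return min(objective(a) for a in candidates)
```

For the two-point distribution with values $-1$ and $1$ of equal probability, `bpoe([-1, 1], [0.5, 0.5], 0.5)` evaluates the objective at $a = 0$ and $a = 2/3$ and returns the smaller value.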
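Corollary 1. For $EX < x < \sup X$,
$$\bar p_x(X) = 1 - \bar q^{-1}(x; X) = \min_{c < x} \frac{E[X - c]^+}{x - c}. \qquad (3)$$
Furthermore, for $x = \bar q_\alpha(X)$, where $\alpha \in (0,1)$, it is valid that
$$q_\alpha(X) \in \arg\min_{c < x} \frac{E[X - c]^+}{x - c},$$
and, consequently,
$$\bar p_x(X) = \frac{E[X - q_\alpha(X)]^+}{\bar q_\alpha(X) - q_\alpha(X)}.$$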
Proof. Since EX < x < supX, then q¯−1(x;X) ∈ (0,1), therefore, a = 0 is not optimal formina≥0E[a(X−x) + 1]+. Therefore, change of variablea → x−1c leads to an equivalent
program: min a≥0 E[a(X−x) + 1] += min c<x E 1 x−c(X−x) + 1 + = min c<x E[X−c]+ x−c .
Note that if x = ¯qα(X), then px¯ (X) = 1−q¯−1(x;X) = 1−α. Since qα¯ (X) = qα(X) +
1 1−αE[X−qα] +, then ¯ px(X) = 1−α = E[X−qα(X)]+ ¯ qα(X)−qα(X) ,
that is, qα(X)∈arg minc<xE[X
−c]+
x−c .
Let $X$ be a discretely distributed random variable with atoms $\{x_i\}_{i=1}^N$ and probabilities $\{p_i\}_{i=1}^N$, where $x_i \le x_{i+1}$, $i = 1, \dots, N-1$, and $N$ is either finite or $N = \infty$. For confidence levels $\alpha_j = \sum_{i=1}^{j} p_i$, where $j = 0, \dots, N$, let us denote the corresponding superquantiles by $\bar x_j = \sum_{i=j+1}^{N} x_i p_i / (1 - \alpha_j)$, with $\bar x_N = x_N$ for finite $N$ and $\bar x_N = \lim_{i \to \infty} x_i$ for $N = \infty$. Then $\bar p_x(X) = 1$ for $x \le \bar x_0 = EX$, $\bar p_x(X) = 0$ for $x \ge \bar x_N = \sup X$, and $\bar p_{\bar x_j}(X) = 1 - \alpha_j$ for $j = 0, \dots, N-1$.
Corollary 2.
$$\bar p_x(X) = \frac{E[X - x_{j+1}]^+}{x - x_{j+1}} = \frac{\sum_{i=j+1}^{N} p_i [x_i - x_{j+1}]^+}{x - x_{j+1}}, \qquad (4)$$
for $\bar x_j < x < \bar x_{j+1}$, where $j = 0, \dots, N-1$.
Proof. Note that for $\bar x_j < \bar q_\alpha(X) < \bar x_{j+1}$ we have $\alpha_j < \alpha < \alpha_{j+1}$, therefore, $q_\alpha(X) = x_{j+1}$. Therefore, formula (4) is implied by Corollary 1.
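The closed form (4) suggests a direct procedure for a finite discrete distribution: walk through the superquantiles $\bar x_j$ until the threshold is bracketed, then apply the formula with the atom $x_{j+1}$. The following is a minimal sketch under that reading; the function name `bpoe_discrete` and the list-based inputs are assumptions made for illustration, and all probabilities are assumed positive.

```python
def bpoe_discrete(atoms, probs, x):
    """bPOE of a finite discrete distribution via formula (4) of Corollary 2.

    atoms, probs: values and positive probabilities (hypothetical layout).
    """
    pairs = sorted(zip(atoms, probs))
    xs = [v for v, _ in pairs]
    ps = [p for _, p in pairs]
    n = len(xs)
    mean = sum(v * p for v, p in pairs)
    if x <= mean:            # x <= xbar_0 = EX
        return 1.0
    if x >= xs[-1]:          # x >= xbar_N = sup X
        return 0.0

    # tail_sum / tail_mass is the superquantile xbar_j (0-based: atoms j..n-1).
    tail_sum, tail_mass = mean, 1.0
    j_star = n - 2           # fallback guards against rounding in the last step
    for j in range(n - 1):
        next_sum = tail_sum - xs[j] * ps[j]
        next_mass = tail_mass - ps[j]
        if x <= next_sum / next_mass:   # xbar_j < x <= xbar_{j+1}
            j_star = j
            break
        tail_sum, tail_mass = next_sum, next_mass

    pivot = xs[j_star]       # the atom x_{j+1} in the paper's 1-based indexing
    numerator = sum(p * max(v - pivot, 0.0) for v, p in zip(xs, ps))
    return numerator / (x - pivot)
```

At a breakpoint $x = \bar x_{j+1}$ the sketch applies (4) on the closed interval, which agrees with $1 - \alpha_{j+1}$ by continuity of bPOE on $[EX, \sup X)$.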
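The buffered probability of exceedance is calculated with a simple formula for the set of specific values $\bar x_j$, $j = 0, \dots, N$. The following corollary presents a formula for calculating bPOE at intermediate values, i.e., for $x$ such that $\bar x_j < x < \bar x_{j+1}$. Such a value $x$ can also be represented as a weighted combination of the values $\bar x_j$ and $\bar x_{j+1}$: $x = \mu \bar x_j + (1 - \mu) \bar x_{j+1}$ for some $\mu \in (0,1)$.
Corollary 3. For $\mu \in (0,1)$ and $j = 0, \dots, N-1$,
$$\bar p(\mu \bar x_j + (1 - \mu) \bar x_{j+1}; X) = \left( \frac{\mu}{\bar p(\bar x_j; X)} + \frac{1 - \mu}{\bar p(\bar x_{j+1}; X)} \right)^{-1}.$$
Proof. Corollary 2 implies
$$\frac{1}{\bar p_x(X)} = \frac{x - x_{j+1}}{\sum_{i=j+1}^{N} p_i [x_i - x_{j+1}]^+},$$
for $\bar x_j < x < \bar x_{j+1}$, where $j = 0, \dots, N-1$. Therefore, since $\bar p_x(X)$ is continuous for $x \in [EX, \sup X)$, see Proposition 2, $1/\bar p_x(X)$ is a piecewise-linear function of $x$. Linear interpolation of $1/\bar p_x(X)$ between the endpoints $\bar x_j$ and $\bar x_{j+1}$ gives the stated formula.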
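A short worked example may help to check Corollaries 2 and 3 against each other. The three-point distribution below is an assumption chosen for illustration: $X$ takes the values $0, 1, 2$ with probabilities $1/2, 1/4, 1/4$.

```latex
\begin{align*}
\bar x_0 &= EX = \tfrac34, & \bar p(\bar x_0; X) &= 1 - \alpha_0 = 1, \\
\bar x_1 &= \frac{1 \cdot \tfrac14 + 2 \cdot \tfrac14}{1 - \tfrac12} = \tfrac32,
         & \bar p(\bar x_1; X) &= 1 - \alpha_1 = \tfrac12, \\
x &= \tfrac12 \bar x_0 + \tfrac12 \bar x_1 = \tfrac98
         && (\mu = \tfrac12,\ j = 0), \\
\text{Corollary 3:}\quad \bar p_x(X)
  &= \left( \frac{1/2}{1} + \frac{1/2}{1/2} \right)^{-1} = \tfrac23, \\
\text{Formula (4):}\quad \bar p_x(X)
  &= \frac{E[X - x_1]^+}{x - x_1} = \frac{3/4}{9/8} = \tfrac23 .
\end{align*}
```

Both routes give the same value, as expected from the piecewise linearity of $1/\bar p_x(X)$ on $(\bar x_0, \bar x_1)$.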
3. Mathematical Properties of bPOE
3.1. Properties of bPOE w.r.t. Parameter x
Proposition 2. The distribution $F_{\bar X}(x) = \bar F_X(x)$ has no more than one atom, located at $\sup \bar X = \sup X$, with probability $P(\bar X = \sup X) = P(X = \sup X)$.
Proof. Note that if for $\alpha_1 < \alpha_2$ we have $\bar q_{\alpha_1}(X) = \bar q_{\alpha_2}(X)$, then, by the definition of CVaR,
$$\min_c \left\{ c + \frac{1}{1 - \alpha_1} E[X - c]^+ \right\} = \min_c \left\{ c + \frac{1}{1 - \alpha_2} E[X - c]^+ \right\}.$$
For each value of $c$, if $c < \sup X$, then $E[X - c]^+ > 0$ and $c + \frac{1}{1 - \alpha_1} E[X - c]^+ < c + \frac{1}{1 - \alpha_2} E[X - c]^+$. Therefore, $\arg\min_c \left\{ c + \frac{1}{1 - \alpha_1} E[X - c]^+ \right\} = \sup X$. This proves that $\bar q_\alpha(X)$ as a function of $\alpha$ can have only one interval of constancy, which is $\alpha \in [1 - P(X = \sup X), 1]$. On the interval $\alpha \in [0, 1 - P(X = \sup X)]$ the function $\bar q_\alpha(X)$ is strictly increasing in $\alpha$. This implies that if the superdistribution has an atom, then there are two possible locations. The first case is $x = EX$; but $\bar q_0(X) = EX$, therefore, $\bar F_X(EX - 0) = \bar F_X(EX + 0) = 0$, and $EX$ is a continuity point of the superdistribution. The second case is $x = \sup X$; then $\lim_{x \to \sup X - 0} \bar F_X(x) = 1 - P(X = \sup X)$. Since $\bar F_X(\sup X) = 1$ and $\bar X \ge X$, see [3], then $\sup \bar X = \sup X$ and $P(\bar X = \sup X) = P(X = \sup X)$.
Corollary 4. For any random variable $X$, the buffered probability of exceedance $\bar p_x(X)$ is a continuous strictly decreasing function of $x$ on the interval $x \in [EX, \sup X)$.
Proof. bPOE equals $1 - \bar q^{-1}(x; X)$ for $x \in [EX, \sup X)$. The function $\bar q(\alpha; X)$ is strictly increasing and continuous for $\alpha \in [0, 1 - P(X = \sup X)]$ (see, e.g., the proof of Proposition 2). Therefore, for $x \in (EX, \sup X)$ the function $\bar q^{-1}(x; X)$ is a strictly increasing continuous function of $x$. The point $x = EX$ can be added to the interval of continuity, since we have proved that it is a continuity point of $\bar F_X(x) = \bar q^{-1}(x; X)$.
Corollary 5. The buffered probability of exceedance $\bar p_x(X)$ is a non-increasing right-continuous function of $x$ with no more than one point of discontinuity.
Proof. This follows immediately from the definition $\bar p_x(X) = 1 - \bar F_X(x)$ and Proposition 2.
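Proposition 3. The function $\frac{1}{1 - \bar F_X(x)} = \frac{1}{\bar p_x(X)}$ is a convex function of $x$. If $X$ is discretely distributed, this function is piecewise linear on $x \in (-\infty, \sup X)$.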
Proof. Consider the interval $EX < x < \sup X$, where formula (3) is valid. Then
$$\frac{1}{\bar p_x(X)} = 1 \Big/ \min_{c < x} \frac{E[X - c]^+}{x - c} = \max_{c < x} \frac{x - c}{E[X - c]^+}.$$
Note that since $\max_{c < x} (x - c)/E[X - c]^+ > 0$, then
$$\max_{c < x} \frac{x - c}{E[X - c]^+} = \max_{c < x} \frac{[x - c]^+}{E[X - c]^+} = \max_{c} \frac{[x - c]^+}{E[X - c]^+}.$$
The last expression, $\max_c \{ [x - c]^+ / E[X - c]^+ \}$, is convex in $x$ as a maximum over a family of convex functions of $x$. $\bar p_x(X)$ is a continuous non-increasing function on $x \in (-\infty, \sup X)$, therefore, $1/\bar p_x(X)$ is a continuous non-decreasing function on $x \in (-\infty, \sup X)$. Then, extending the interval from $(EX, \sup X)$ to $(-\infty, \sup X)$ does not violate convexity of $1/\bar p_x(X)$, since $1/\bar p_x(X) = 1$, i.e., constant, for $x \in (-\infty, EX]$. Further extension of the interval from $(-\infty, \sup X)$ to $(-\infty, +\infty)$, i.e., $\mathbb{R}$, does not violate convexity either, since $1/\bar p_x(X) = +\infty$ for $x \ge \sup X$. That is, $1/\bar p_x(X)$ is a convex function of $x$.
Suppose that $X$ is discretely distributed. Again, $1/\bar p_x(X) = 1$ for $x \in (-\infty, EX]$, and that is the first interval of linearity. Consider a probability atom with value $x^*$, which the random variable $X$ takes with probability $p^*$. Denote $\alpha_1 = F_X^-(x^*) = P(X < x^*)$, $\alpha_2 = F_X(x^*) = P(X \le x^*) = \alpha_1 + p^*$, and $\bar x_i = \bar q_{\alpha_i}(X)$ for $i = 1, 2$. Then for $\bar x_1 < x < \bar x_2$ we have $x = \bar q_\alpha(X)$ with $\alpha \in (\alpha_1, \alpha_2)$, therefore, $q_\alpha(X) = x^*$. Applying Corollary 1 we find that $1/\bar p_x(X) = (x - x^*)/E[X - x^*]^+$ for $\bar x_1 < x < \bar x_2$. Therefore, $1/\bar p_x(X)$ is linear on $\bar x_1 < x < \bar x_2$. This way, all the atom probability intervals of type $(F_X^-(x^*), F_X(x^*)) \subseteq [0,1]$ project into the intervals of type $(\bar x_1, \bar x_2) \subseteq (EX, \sup X)$ between the corresponding superquantiles, covering the whole interval $(EX, \sup X)$. Therefore, $1/\bar p_x(X)$ is a piecewise-linear function on $x \in (-\infty, \sup X)$, and $1/\bar p_x(X) = +\infty$ on $x \in [\sup X, +\infty)$.
3.2. Properties of bPOE w.r.t. Random Variable
Proposition 4. The buffered probability of exceedance is a closed quasi-convex function of the random variable (w.r.t. the addition operation), i.e., the set $\{X \mid \bar p_x(X) \le p\}$ is a closed convex set of random variables for any $p \in \mathbb{R}$. Furthermore, for $p \in [0,1)$,
$$\bar p_x(X) \le p \iff \bar q_{1-p}(X) \le x.$$
Proof. If $p \ge 1$, then the inequality $\bar p_x(X) \le p$ holds for any $x$ and $X$. Therefore, the level set $\{X \mid \bar p_x(X) \le p\}$ is a closed convex set. For $p < 0$, $\{X \mid \bar p_x(X) \le p\} = \emptyset$. Consider $p \in [0,1)$. Suppose $\bar p_x(X) \le p$, then $\bar p_x(X) = p - \varepsilon$ for some $\varepsilon \ge 0$. Then either $\bar q_{1 - \bar p_x(X)}(X) = \bar q_{1 - p + \varepsilon}(X) = x$, therefore, $\bar q_{1-p}(X) \le x$, or $\sup X \le x$, therefore, $\bar q_{1-p}(X) \le \bar q_1(X) \le x$. Conversely, if $\bar q_{1-p}(X) \le x$, then either $\bar q_{1 - p + \varepsilon}(X) = x$ for some $\varepsilon \ge 0$, or $\sup X \le x$. In the first case, $\bar p_x(X) = p - \varepsilon \le p$, and
$$\bar p_x(X) \le p \iff \bar q_{1-p}(X) \le x.$$
If $\sup X \le x$, then $\bar p_x(X) = 0 \le p$. The function $\bar q_{1-p}(X)$ is a closed convex function of $X$, therefore, the set $\{X \mid \bar q_{1-p}(X) \le x\}$ is closed convex. Then the set $\{X \mid \bar p_x(X) \le p\}$ is closed convex.
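Example 1. The buffered probability of exceedance is not a convex function of the random variable (w.r.t. the addition operation), i.e., in general, $\bar p_x(\lambda X + (1 - \lambda) Y) \not\le \lambda \bar p_x(X) + (1 - \lambda) \bar p_x(Y)$. A counterexample is as follows. Take $x = 0$ and
$$X = \begin{cases} 1, & \text{with probability } 1/2, \\ -1, & \text{with probability } 1/2. \end{cases}$$
Take $Y \equiv 0$, $\lambda = 1/2$. Note that $\bar p_0(X) = 1$, since $\bar q_0(X) = EX = 0$, and $\bar p_0(Y) = 0$. Note also that $\lambda X + (1 - \lambda) Y = X/2$, therefore,
$$\bar p_0(\lambda X + (1 - \lambda) Y) = 1 \not\le 1/2 = \lambda \bar p_0(X) + (1 - \lambda) \bar p_0(Y).$$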
Denote by $B_\lambda$ the Bernoulli random variable that equals 1 with probability $\lambda$, i.e.,
$$B_\lambda = \begin{cases} 1, & \text{with probability } \lambda, \\ 0, & \text{with probability } 1 - \lambda. \end{cases}$$
Denote the mixture of random variables with coefficient $\lambda$ as
$$\lambda X \oplus (1 - \lambda) Y = X B_\lambda + Y (1 - B_\lambda),$$
where $B_\lambda$ is independent of $X$ and $Y$.
In words, a mixture of random variables with coefficient $\lambda$ is a random variable which takes a value of the first random variable with probability $\lambda$, and a value of the second random variable with probability $1 - \lambda$.
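The mixture operation results from the addition operation over measures. Suppose $\mu$ and $\nu$ are measures. Then the scaled measure $\lambda\mu$ is a measure satisfying $(\lambda\mu)(A) = \lambda\mu(A)$ for any measurable set $A$, $\lambda \in \mathbb{R}$. The sum of measures $\mu + \nu$ is a measure satisfying $(\mu + \nu)(A) = \mu(A) + \nu(A)$ for any measurable set $A$. A random variable $X$ defines a measure $\mu_X$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ such that $\mu_X(A) = P(X \in A)$ for any $A \in \mathcal{B}(\mathbb{R})$. Conversely, any nonnegative measure $\mu_X$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ such that $\mu_X(\mathbb{R}) = 1$ defines a random variable. Suppose that random variables $X$ and $Y$ correspond to measures $\mu_X$ and $\mu_Y$. The measure $\mu_Z = \lambda\mu_X + (1 - \lambda)\mu_Y$ for $\lambda \in [0,1]$ is a nonnegative measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$, and $\mu_Z(\mathbb{R}) = \lambda\mu_X(\mathbb{R}) + (1 - \lambda)\mu_Y(\mathbb{R}) = 1$. Therefore, $\mu_Z$ defines a random variable $Z$. We call $Z$ a mixture of random variables $X$ and $Y$ with coefficient $\lambda$ and denote $Z = \lambda X \oplus (1 - \lambda) Y$. In particular, $F_{\lambda X \oplus (1 - \lambda) Y}(z) = \lambda F_X(z) + (1 - \lambda) F_Y(z)$, where $F_Z$ is the cumulative distribution function of the random variable $Z$.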
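Proposition 5. $(1 - \alpha)\bar q_\alpha(X)$ is a concave function of $(X, \alpha)$ w.r.t. the mixture operation and the addition operation, respectively, i.e.,
$$(1 - (\lambda\alpha_1 + (1 - \lambda)\alpha_2))\, \bar q_{\lambda\alpha_1 + (1 - \lambda)\alpha_2}(\lambda X \oplus (1 - \lambda) Y) \ge \lambda (1 - \alpha_1) \bar q_{\alpha_1}(X) + (1 - \lambda)(1 - \alpha_2) \bar q_{\alpha_2}(Y).$$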
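Proof. Denote $\alpha_M = \lambda\alpha_1 + (1 - \lambda)\alpha_2$. Then, with the definitions of CVaR and of $\lambda X \oplus (1 - \lambda) Y$, we have
$$(1 - \alpha_M)\, \bar q_{\alpha_M}(\lambda X \oplus (1 - \lambda) Y) = \min_c \left\{ (1 - \alpha_M) c + E[B_\lambda X + (1 - B_\lambda) Y - c]^+ \right\}$$
$$= \min_c \left\{ (1 - \alpha_M) c + E\big([X - c]^+ I(B_\lambda = 1)\big) + E\big([Y - c]^+ I(B_\lambda = 0)\big) \right\}.$$
Since $B_\lambda$ is independent of $X$ and $Y$, then $E\big([X - c]^+ I(B_\lambda = 1)\big) = E[X - c]^+ \, E I(B_\lambda = 1) = \lambda E[X - c]^+$. Then
$$(1 - \alpha_M)\, \bar q_{\alpha_M}(\lambda X \oplus (1 - \lambda) Y) = \min_c \left\{ (1 - \alpha_M) c + \lambda E[X - c]^+ + (1 - \lambda) E[Y - c]^+ \right\}$$
$$\ge \min_{c_1, c_2} \left\{ \lambda (1 - \alpha_1) c_1 + \lambda E[X - c_1]^+ + (1 - \lambda)(1 - \alpha_2) c_2 + (1 - \lambda) E[Y - c_2]^+ \right\}$$
$$= \lambda (1 - \alpha_1) \bar q_{\alpha_1}(X) + (1 - \lambda)(1 - \alpha_2) \bar q_{\alpha_2}(Y).$$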
The following statement is similar to a proposition in [2], which motivated Proposition 5 in the first place. Here we show how that proposition can be proved from Proposition 5 as a corollary.
Corollary 6. Let $X(x, p)$ be a discretely distributed random variable taking values $x = (x_1, \dots, x_m)$ with probabilities $p = (p_1, \dots, p_m)$, $p_i \ge 0$, $\sum_{i=1}^{m} p_i = 1$. Then the function $\bar q_\alpha(X(x, p))$ is a concave function of $p$.
Proof. Note that if $p_M = \lambda p_1 + (1 - \lambda) p_2$, then $F_{X(x, p_M)}(x) = \lambda F_{X(x, p_1)}(x) + (1 - \lambda) F_{X(x, p_2)}(x)$. Therefore, $X(x, p_M) = \lambda X(x, p_1) \oplus (1 - \lambda) X(x, p_2)$. Then Proposition 5 implies the concavity of $\bar q_\alpha(X(x, p))$ w.r.t. the vector $p$.
The following proposition is similar to the one in [10]. Here we show how this proposition can be proved from Proposition 5 as a corollary.
Corollary 7. Let a random variable $X_p$ have the distribution $F(x; p) = \sum_{i=1}^{m} p_i F_i(x)$, where $F_i(x)$ for $i = 1, \dots, m$ are distribution functions, and $p = (p_1, \dots, p_m) \in \mathbb{R}^m$, $p_i \ge 0$, $\sum_{i=1}^{m} p_i = 1$. Then the function $\bar q_\alpha(X_p)$ is a concave function of $p$.
Proof. Note that if $p_M = \lambda p_1 + (1 - \lambda) p_2$, then $F(x; p_M) = \lambda F(x; p_1) + (1 - \lambda) F(x; p_2)$. Therefore, $X_{p_M} = \lambda X_{p_1} \oplus (1 - \lambda) X_{p_2}$. Then Proposition 5 implies the concavity of $\bar q_\alpha(X_p)$ w.r.t. the vector $p$.
Proposition 6. The buffered probability of exceedance is a concave function of the random variable w.r.t. the mixture operation, i.e., $\bar p_x(\lambda X \oplus (1 - \lambda) Y) \ge \lambda \bar p_x(X) + (1 - \lambda) \bar p_x(Y)$, $\lambda \in (0,1)$.
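Proof. Suppose $\bar p_x(X) = \alpha_1$ and $\bar p_x(Y) = \alpha_2$; then there are three possible cases. First, $\alpha_1 = \alpha_2 = 1$; then $EX \ge x$ and $EY \ge x$, therefore, $E(\lambda X \oplus (1 - \lambda) Y) = \lambda EX + (1 - \lambda) EY \ge x$, and $\bar p_x(\lambda X \oplus (1 - \lambda) Y) = \lambda \bar p_x(X) + (1 - \lambda) \bar p_x(Y) = 1$. Second, $\alpha_1 = \alpha_2 = 0$; then $\sup X \le x$ and $\sup Y \le x$, hence $\sup(\lambda X \oplus (1 - \lambda) Y) \le x$. Therefore, $\bar p_x(\lambda X \oplus (1 - \lambda) Y) = \lambda \bar p_x(X) + (1 - \lambda) \bar p_x(Y) = 0$. Third, in all other cases $\lambda\alpha_1 + (1 - \lambda)\alpha_2 \in (0,1)$. By Proposition 4, $\bar p_x(X) \le p \iff \bar q_{1-p}(X) \le x$ for $p \in (0,1)$, therefore, $\bar p_x(X) > p \iff \bar q_{1-p}(X) > x$ and, furthermore, $\bar p_x(X) \ge p \iff \bar q_{1-p}(X) \ge x$, $\sup X > x$. Note also that $\bar p_x(X) = p \in (0,1) \iff \bar q_{1-p}(X) = x$, $\sup X > x$. Then
$$\bar p_x(\lambda X \oplus (1 - \lambda) Y) \ge \lambda \bar p_x(X) + (1 - \lambda) \bar p_x(Y) = \lambda\alpha_1 + (1 - \lambda)\alpha_2 \in (0,1)$$
is equivalent to
$$\bar q_{1 - (\lambda\alpha_1 + (1 - \lambda)\alpha_2)}(\lambda X \oplus (1 - \lambda) Y) \ge x, \qquad \sup(\lambda X \oplus (1 - \lambda) Y) > x.$$
Note that
$$\alpha_1 \bar q_{1 - \alpha_1}(X) \ge \alpha_1 x. \qquad (5)$$
Indeed, if $\alpha_1 = 0$, then $0 \ge 0$. If $\alpha_1 = 1$, then $\bar q_{1-1}(X) = EX \ge x$. If $\alpha_1 \in (0,1)$, then $\bar q_{1 - \alpha_1}(X) = x$. Similarly,
$$\alpha_2 \bar q_{1 - \alpha_2}(Y) \ge \alpha_2 x. \qquad (6)$$
Applying Proposition 5 and inequalities (5), (6), we get
$$\bar q_{\lambda(1 - \alpha_1) + (1 - \lambda)(1 - \alpha_2)}(\lambda X \oplus (1 - \lambda) Y) \ge \frac{\lambda\alpha_1 \bar q_{1 - \alpha_1}(X) + (1 - \lambda)\alpha_2 \bar q_{1 - \alpha_2}(Y)}{\lambda\alpha_1 + (1 - \lambda)\alpha_2} \ge \frac{\lambda\alpha_1 x + (1 - \lambda)\alpha_2 x}{\lambda\alpha_1 + (1 - \lambda)\alpha_2} = x.$$
Since $\alpha_1 > 0$ or $\alpha_2 > 0$, then $\sup X > x$ or $\sup Y > x$, therefore, $\sup(\lambda X \oplus (1 - \lambda) Y) = \max\{\sup X, \sup Y\} > x$, which finishes the proof.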
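The concavity of bPOE w.r.t. the mixture operation can be checked numerically on small discrete examples. The sketch below is an illustration only: the helper names `bpoe` and `mixture` and the `(values, probs)` representation are assumptions, the bPOE helper uses the minimization formula (2), and the mixture is built by scaling probabilities, which matches $F_{\lambda X \oplus (1-\lambda)Y} = \lambda F_X + (1-\lambda) F_Y$ for $\lambda \in (0,1)$.

```python
def bpoe(values, probs, x):
    """bPOE of a discrete distribution via formula (2); atoms with zero mass are ignored."""
    support = [(v, p) for v, p in zip(values, probs) if p > 0]
    if x >= max(v for v, _ in support):
        return 0.0
    # Minimum of the convex piecewise-linear objective is at a = 0 or at a kink.
    candidates = [0.0] + [1.0 / (x - v) for v, _ in support if v < x]
    return min(
        sum(p * max(a * (v - x) + 1.0, 0.0) for v, p in support)
        for a in candidates
    )


def mixture(dist_x, dist_y, lam):
    """Distribution of lam*X (+) (1-lam)*Y: take X with probability lam, else Y."""
    (vx, px), (vy, py) = dist_x, dist_y
    values = list(vx) + list(vy)
    probs = [lam * p for p in px] + [(1.0 - lam) * p for p in py]
    return values, probs


# Example data (chosen for illustration): X = +/-1 with equal probability, Y = 0.
dist_x = ([-1.0, 1.0], [0.5, 0.5])
dist_y = ([0.0], [1.0])
lam, x = 0.5, 0.5

lhs = bpoe(*mixture(dist_x, dist_y, lam), x)
rhs = lam * bpoe(*dist_x, x) + (1.0 - lam) * bpoe(*dist_y, x)
```

Here `lhs` is the bPOE of the mixture and `rhs` is the convex combination of the individual bPOE values; Proposition 6 predicts `lhs >= rhs`.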
It was mentioned that distribution functions are linear w.r.t. the mixture operation: $F_{\lambda X \oplus (1 - \lambda) Y}(x) = \lambda F_X(x) + (1 - \lambda) F_Y(x)$. Note that Proposition 6 proves that superdistribution functions are convex w.r.t. the mixture operation: $\bar F_{\lambda X \oplus (1 - \lambda) Y}(x) \le \lambda \bar F_X(x) + (1 - \lambda) \bar F_Y(x)$.
Proposition 7. The buffered probability of exceedance $\bar p_x(X)$ is a monotonic function of the random variable, i.e., $\bar p_x(Y) \le \bar p_x(Z)$ for $Y \le Z$ almost surely.
Proof. Suppose $x \ne \sup Y$ and $x \ne \sup Z$. Then $\min_{a \ge 0} E[a(Y - x) + 1]^+ \le \min_{a \ge 0} E[a(Z - x) + 1]^+$, therefore, $\bar p_x(Y) \le \bar p_x(Z)$. Suppose $x = \sup Y$, then $\bar p_x(Y) = 0 \le \bar p_x(Z)$. Suppose $x = \sup Z$, then $x \ge \sup Y$, therefore, $\bar p_x(Y) = 0 = \bar p_x(Z)$.
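4. Optimization Problems with bPOE
4.1. Two Families of Optimization Problems
Denote program $P(x)$,
$$P(x): \quad \min \bar p_x(X) \quad \text{s.t.} \quad X \in \mathcal{X}.$$
Denote program $Q(\alpha)$,
$$Q(\alpha): \quad \min \bar q_\alpha(X) \quad \text{s.t.} \quad X \in \mathcal{X}.$$
For a set of random variables $\mathcal{X}$, define $e_{\mathcal{X}} = \inf_{X \in \mathcal{X}} EX$, $s_{\mathcal{X}} = \inf_{X \in \mathcal{X}} \sup X$.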
Proposition 8. Let $X_0 \in \mathcal{X}$ be an optimal solution to $P(x_0)$, where $e_{\mathcal{X}} < x_0 \le s_{\mathcal{X}}$. Then $X_0$ is an optimal solution to $Q(1 - \bar p_{x_0}(X_0))$.
Proof. Denote $p^* = \bar p_{x_0}(X_0)$. Since $x_0 > e_{\mathcal{X}}$, then $\bar p_{x_0}(X_0) < 1$. If $\bar p_{x_0}(X_0) = 0$, then $\sup X_0 \le x_0$, but $x_0 \le s_{\mathcal{X}}$, and $\sup X_0 \ge s_{\mathcal{X}}$ by definition, therefore, $\sup X_0 = x_0 = s_{\mathcal{X}}$. If $0 < \bar p_{x_0}(X_0) < 1$, then, by Definition 1 of bPOE, $\bar q_{1 - p^*}(X_0) = x_0$.
Suppose that $X_0$ is not an optimal solution to $Q(1 - \bar p_{x_0}(X_0))$; then there exists $X^* \in \mathcal{X}$ such that $\bar q_{1 - p^*}(X^*) < x_0$. Since $x_0 \le s_{\mathcal{X}}$, then $p^* > 0$ and $\sup X^* \ge x_0$. Therefore, there exists $p < p^*$ such that $\bar q_{1-p}(X^*) = x_0$, since $\bar q_{1-p}(X)$ is a continuous non-increasing function of $p$. There are two possible cases. First, if $\sup X^* = x_0$, then $\bar p_{x_0}(X^*) = 0 < p^*$, so $X_0$ is not an optimal solution to $P(x_0)$, a contradiction. Second, if $\sup X^* > x_0$, then $\bar p_{x_0}(X^*) = p < p^*$, so $X_0$ is not an optimal solution to $P(x_0)$, a contradiction.
Two intervals for $x_0$ are not covered in Proposition 8. Note that for $x_0 \le e_{\mathcal{X}}$ the optimal value of $P(x_0)$ is 1, therefore, any feasible solution is an optimal solution. As for the interval $x_0 > s_{\mathcal{X}}$, the optimal value of $P(x_0)$ is 0. If $s_{\mathcal{X}} < \sup X_0 \le x_0$, then $\bar p_{x_0}(X_0) = 0$, and $X_0$ is optimal for $P(x_0)$, but $\bar q_1(X_0) > s_{\mathcal{X}}$, and $X_0$ is not optimal for $Q(1)$.
Proposition 9. Let $X_0 \in \mathcal{X}$ be an optimal solution to $Q(\alpha_0)$. Then $X_0$ is an optimal solution to $P(\bar q_{\alpha_0}(X_0))$, unless $\sup X_0 > \bar q_{\alpha_0}(X_0)$ and there exists $X^* \in \mathcal{X}$ such that
1. $\sup X^* = \bar q_{\alpha_0}(X_0)$,
2. $P(X^* = \sup X^*) \ge 1 - \alpha_0$.
Proof. Denote $x_0 = \bar q_{\alpha_0}(X_0)$. First, suppose $\sup X_0 = x_0$. Then $\bar p_{x_0}(X_0) = 0$, and $X_0$ is an optimal solution to $P(x_0)$.
Second, suppose that $\sup X_0 > x_0$ and that there exists $X^* \in \mathcal{X}$ such that $\bar p_{x_0}(X^*) < \bar p_{x_0}(X_0)$. Since $x_0 = \bar q_{\alpha_0}(X_0)$ and $\sup X_0 > x_0$, then $\bar p_{x_0}(X_0) = 1 - \alpha_0$. Suppose $\sup X^* > x_0$; then $\bar q_\alpha(X^*)$ is strictly increasing on $[0, 1 - \bar p_{x_0}(X^*)]$. Therefore, $\bar q_{\alpha_0}(X^*) < \bar q_{1 - \bar p_{x_0}(X^*)}(X^*) = x_0$, which implies that $X_0$ is not an optimal solution to $Q(\alpha_0)$, a contradiction. Consequently, $\sup X^* = x_0$.
Suppose $P(X^* = x_0) < 1 - \alpha_0$. Then $\bar q_\alpha(X^*)$ is strictly increasing on $[0, 1 - P(X^* = x_0)]$, and $\bar q_{\alpha_0}(X^*) < x_0$. Therefore, $X_0$ is not an optimal solution to $Q(\alpha_0)$, a contradiction. Therefore, $P(X^* = x_0) \ge 1 - \alpha_0$.
The intuition behind Proposition 9 is as follows. Note that $X^*$ is also an optimal solution to $Q(\alpha_0)$. Therefore, we have two optimal solutions to the right-tail expectation minimization problem. The difference between the optimal solutions $X^*$ and $X_0$ is that $X^*$ is constant in its right $1 - \alpha_0$ tail, and $X_0$ is not, since $\bar q_{\alpha_0}(X_0) < \sup X_0$. Proposition 9 implies that $X^*$ is an optimal solution to $P(\bar q_{\alpha_0}(X_0))$, while $X_0$ is not, which is a very natural risk-averse decision. This implies that, for certain problems, formulations of type $P(x)$ provide more reasonable solutions than formulations of type $Q(\alpha)$.
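Corollary 8. Let $\mathcal{X}$ be a set of random variables such that $\sup X = \infty$ for all $X \in \mathcal{X}$. Then the program families $P(x)$, for $x > e_{\mathcal{X}}$, and $Q(\alpha)$, for $0 < \alpha < 1$, have the same set of optimal solutions. That is, if $X_0$ is optimal for $P(x_0)$, then $X_0$ is optimal for $Q(1 - \bar p_{x_0}(X_0))$, and if $X_0$ is optimal for $Q(\alpha_0)$, then $X_0$ is optimal for $P(\bar q_{\alpha_0}(X_0))$.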
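Proof. Proposition 8 implies that if $e_{\mathcal{X}} < x_0 \le s_{\mathcal{X}} = \infty$ and $X_0$ is optimal for $P(x_0)$, then $X_0$ is optimal for $Q(1 - \bar p_{x_0}(X_0))$. Note that since $e_{\mathcal{X}} < x_0 < s_{\mathcal{X}}$, then $\bar p_{x_0}(X_0) \in (0,1)$.
Proposition 9 implies that if $X_0$ is optimal for $Q(\alpha_0)$, then $X_0$ is optimal for $P(\bar q_{\alpha_0}(X_0))$, unless there exists $X^* \in \mathcal{X}$ such that $\sup X^* = \bar q_{\alpha_0}(X_0)$. This is impossible, since $\sup X^* = \infty > \bar q_{\alpha_0}(X_0)$. Note that since $\alpha_0 \in (0,1)$, then $e_{\mathcal{X}} < \bar q_{\alpha_0}(X_0) < \infty$.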
The assumption $\sup X = +\infty$ for all $X \in \mathcal{X}$ in Corollary 8 might be too strong for some practical problems, where it is common practice for all random variables to be defined on a finite probability space generated by system observations.
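Let us describe the sets of optimal points $(x, \alpha)$ for the problem families $P(x)$ and $Q(\alpha)$. Define
$$f_P(x) = \min_{X \in \mathcal{X}} \bar p_x(X), \qquad f_Q(\alpha) = \min_{X \in \mathcal{X}} \bar q_\alpha(X).$$
Then the sets of all optimal points of the $P(x)$ and $Q(\alpha)$ families are
$$S_P = \{(x, \alpha) \mid f_P(x) = 1 - \alpha\}, \qquad S_Q = \{(x, \alpha) \mid f_Q(\alpha) = x\}.$$
Finally, the reduced sets of optimal points are
$$S_P^- = \{(x, \alpha) \in S_P \mid e_{\mathcal{X}} \le x \le s_{\mathcal{X}}\}, \qquad S_Q^- = \{(x, \alpha) \in S_Q \mid x < s_{\mathcal{X}}\} \cup \{(s_{\mathcal{X}}, 1)\}.$$
For any random variable $X \in \mathcal{X}$ there is a set $S_X = \{(x, \alpha) \mid \bar q_\alpha(X) = x\}$. Let us define the union of such sets for $X \in \mathcal{X}$ as
$$S_{\mathcal{X}} = \bigcup_{X \in \mathcal{X}} S_X = \{(x, \alpha) \mid \text{there exists } X \in \mathcal{X}: \bar q_\alpha(X) = x\}.$$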
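Naturally, we prefer random variables with a superquantile as small as possible for a fixed confidence level and with a confidence level as big as possible for a fixed superquantile value. Therefore, for the set $S_{\mathcal{X}}$ we define a Pareto front, which is often called an efficient frontier in finance, as follows:
$$S_{\mathcal{X}}^- = \{(x, \alpha) \in S_{\mathcal{X}} \mid x < x' \text{ or } \alpha > \alpha' \text{ for all } (x', \alpha') \in S_{\mathcal{X}},\ (x', \alpha') \ne (x, \alpha)\}.$$
Proposition 10. $S_P \cap S_Q = S_P^- = S_Q^- = S_{\mathcal{X}}^-$.
Proof. Let us start with $S_P \cap S_Q = S_{\mathcal{X}}^-$. Notice that
$$(x, \alpha) \in S_P \iff \big( (x, \alpha') \in S_{\mathcal{X}} \rightarrow \alpha' \le \alpha \big), \qquad (7)$$
$$(x, \alpha) \in S_Q \iff \big( (x', \alpha) \in S_{\mathcal{X}} \rightarrow x' \ge x \big). \qquad (8)$$
Clearly, the right sides of (7) and (8) hold for $S_{\mathcal{X}}^-$, which implies $S_{\mathcal{X}}^- \subseteq S_P \cap S_Q$. Suppose $S_{\mathcal{X}}^- \subset S_P \cap S_Q$, i.e., for some $(x, \alpha) \in S_P \cap S_Q$ there exists $(x', \alpha') \in S_{\mathcal{X}}$ such that $x' \le x$, $\alpha' \ge \alpha$ and $(x', \alpha') \ne (x, \alpha)$. Notice that if $(x, \alpha) \in S_P \cap S_Q$, then $(x, \alpha') \in S_{\mathcal{X}} \rightarrow \alpha' \le \alpha$ and $(x', \alpha) \in S_{\mathcal{X}} \rightarrow x' \ge x$. Then $x' < x$ and $\alpha' > \alpha$. Consider the random variable $X^*$ which has generated the point $(x', \alpha')$. Since $\bar q_{\alpha'}(X^*) < x$, then $\bar q_\alpha(X^*) < x$. Therefore, there exists $(\bar q_\alpha(X^*), \alpha) \in S_{\mathcal{X}}$ with $\bar q_\alpha(X^*) < x$, while $(x, \alpha) \in S_Q$ and (8) holds. Contradiction.
Let us prove $S_P^- = S_Q^-$. Suppose $(x, \alpha) \in S_P^-$ and $x \ne e_{\mathcal{X}}$; then we can use Proposition 8 to conclude that $(x, \alpha) \in S_Q^-$. If $x = e_{\mathcal{X}}$, then $f_P(x) = 1$, therefore, $(x, \alpha) = (e_{\mathcal{X}}, 0) \in S_Q^-$. Let $(x, \alpha) \in S_Q^-$. If $x < s_{\mathcal{X}}$, then there is no $X^*$ such that $\sup X^* = x$, therefore, we can use Proposition 9 and conclude that $(x, \alpha) \in S_P^-$.
Finally, let us prove $S_P \cap S_Q = S_P^-$. Since $S_P^- \subseteq S_P$, $S_Q^- \subseteq S_Q$ and $S_P^- = S_Q^-$, then $S_P^- \subseteq S_P \cap S_Q$. Suppose $(x, \alpha) \in S_P \cap S_Q$. If $x < s_{\mathcal{X}}$, then $(x, \alpha) \in S_P^-$. If $x = s_{\mathcal{X}}$, then $\alpha = 1$, because there is only one $\alpha$ for any $x$ in $S_P$. Then $(x, \alpha) = (s_{\mathcal{X}}, 1) \in S_P^-$. A point with $x > s_{\mathcal{X}}$ cannot be in $S_Q$, since $f_Q(\alpha) = \min_{X \in \mathcal{X}} \bar q_\alpha(X) \le \min_{X \in \mathcal{X}} \sup X = s_{\mathcal{X}}$. Therefore, $S_P \cap S_Q \subseteq S_P^-$, which finalizes the proof.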
4.2. Parametric Simplex Algorithm for CVaR and bPOE Minimization
Suppose that we are interested in a solution to $P_1(x)$ for $x \le s_{\mathcal{X}}$. Suppose also that we have an algorithm for solving $P_2(\alpha)$, i.e., we can calculate the function $f_2(\alpha)$ for any $\alpha \in [0,1]$. Then we can find an approximation to $f_1(x)$ by calculating $f_2(\alpha)$ several times. First, calculate $f_2(1) = s_{\mathcal{X}}$. If $x > f_2(1)$, then problem $P_1(x)$ is inefficient. If $x = f_2(1)$, then $f_1(x) = 0$. If $x < f_2(1)$, continue. Calculate $f_2(0)$. If $x < f_2(0)$, then $P_1(x)$ is infeasible. If $x = f_2(0)$, then $f_1(x) = 1$. If $x > f_2(0)$, continue. Set $a = 0$, $b = 1$. The inequality $1 - b < f_1(x) < 1 - a$ holds. We calculate $f_2((a + b)/2)$ at each step of the binary search procedure to make the difference $b - a$ as small as needed.
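The binary search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `bpoe_min_by_bisection`, the oracle argument `f2`, and the tolerance `tol` are assumptions. The sketch returns 1 whenever $x \le f_2(0)$, following the remark after Proposition 8 that the optimal bPOE value is 1 for $x_0 \le e_{\mathcal{X}}$, and returns 0 whenever $x \ge f_2(1)$.

```python
def bpoe_min_by_bisection(f2, x, tol=1e-9):
    """Approximate f1(x) = min bPOE at threshold x, given an oracle for min CVaR.

    f2: callable alpha -> optimal value of the CVaR minimization problem
        (assumed non-decreasing in alpha).
    x: threshold; tol: final width of the bracketing interval for alpha.
    """
    if x >= f2(1.0):
        return 0.0  # threshold at or above the smallest attainable supremum
    if x <= f2(0.0):
        return 1.0  # threshold at or below the smallest attainable expectation
    a, b = 0.0, 1.0  # invariant: f2(a) < x <= f2(b), so 1 - b <= f1(x) < 1 - a
    while b - a > tol:
        mid = 0.5 * (a + b)
        if f2(mid) < x:
            a = mid
        else:
            b = mid
    return 1.0 - 0.5 * (a + b)
```

As a usage example, if the feasible set contains a single random variable taking values $\pm 1$ with equal probability, the oracle is $f_2(\alpha) = \alpha/(1 - \alpha)$ for $\alpha < 1/2$ and $f_2(\alpha) = 1$ otherwise, and the search at $x = 1/2$ converges to $\alpha = 1/3$.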
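Suppose that problem $P_2(\alpha)$ can be expressed as a linear program. Let $X_1, \dots, X_n$ be a set of random variables discretely distributed on a common set of $m$ scenarios, with scenario probabilities $p_1, \dots, p_m$. Let the random variable $X_i$ take the value $x_i^j$ under scenario $j$. Let $\lambda = (\lambda_1, \dots, \lambda_n)$ be a set of decision variables such that
$$X \in \mathcal{X} \iff X = \sum_{i=1}^{n} \lambda_i X_i \quad \text{for some } \lambda \in \Lambda,$$
where $\Lambda \subset \mathbb{R}^n$ is a polyhedral set. Then $P_2(\alpha)$ is equivalent to
$$\min_{\lambda}\; \bar q_\alpha\!\left( \sum_{i=1}^{n} \lambda_i X_i \right) \quad \text{s.t.} \quad \lambda \in \Lambda.$$
With the minimization form of CVaR, which is $\bar q_\alpha(X) = \min_c \left\{ c + \frac{1}{1 - \alpha} E[X - c]^+ \right\}$, we reformulate $P_2(\alpha)$ as
$$\min_{c,\,\lambda}\; c + \frac{1}{1 - \alpha} \sum_{j=1}^{m} p_j \left[ \sum_{i=1}^{n} \lambda_i x_i^j - c \right]^+ \quad \text{s.t.} \quad \lambda \in \Lambda.$$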
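For a fixed decision vector $\lambda$, the inner minimization over $c$ in the scenario formulation above can be carried out by enumeration, because the objective is convex and piecewise linear in $c$ with kinks at the scenario values. The sketch below evaluates the objective this way; the function name `portfolio_cvar` and the row-per-scenario data layout are assumptions made for illustration, and the outer minimization over $\lambda \in \Lambda$ is not attempted here.

```python
def portfolio_cvar(weights, scenarios, probs, alpha):
    """CVaR (superquantile) of sum_i weights[i] * X_i over a finite scenario set.

    scenarios[j][i]: value of X_i under scenario j (hypothetical layout).
    probs[j]: probability of scenario j; alpha: confidence level in [0, 1].
    """
    outcomes = [sum(w * v for w, v in zip(weights, row)) for row in scenarios]
    if alpha >= 1.0:
        return max(outcomes)  # the superquantile at level 1 is the supremum

    def objective(c):
        # c + E[X - c]^+ / (1 - alpha), the minimization form of CVaR.
        excess = sum(p * max(o - c, 0.0) for o, p in zip(outcomes, probs))
        return c + excess / (1.0 - alpha)

    # A convex piecewise-linear function of c attains its minimum at a kink.
    return min(objective(c) for c in outcomes)
```

For example, with two scenarios of equal probability in which the first variable takes the values $-1$ and $1$, `portfolio_cvar([1.0, 0.0], [[-1.0, 0.0], [1.0, 0.0]], [0.5, 0.5], alpha)` reproduces the superquantile $\alpha/(1-\alpha)$ for $\alpha \le 1/2$.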
We will slightly adapt the parametric simplex method, see e.g. [8], [7], to solve this problem for all values $\alpha \in [0,1]$. To start, we need to obtain a basic feasible solution for one of the extreme values, say $\alpha = 0$.
To get a solution for $\alpha = 0$ we need to find a random variable with minimal expectation. For example, if $\lambda \ge 0$ and $\sum_{i=1}^{n} \lambda_i = 1$, then we find $EX_i = \sum_j p_j x_i^j$ for all $i$ and then take $i^* = \arg\min_i EX_i$. Then the optimal solution is $\lambda_0^*$ such that $\lambda_{i^*} = 1$, $\lambda_j = 0$ for $j \ne i^*$.
After obtaining the first solution, denote $\mu = \frac{1}{1 - \alpha}$ and express the reduced costs for nonbasic decision variables as linear functions of $\mu$. Since the solution is optimal for $\mu_0 = 1$, all reduced costs are nonnegative at $\mu = 1$. The constraints do not depend on $\mu$, so as $\mu$ changes the solution remains primal feasible, but may become dual infeasible. Let us find the biggest parameter value $\mu_1$ at which the reduced costs are still nonnegative. For $\mu_0 \le \mu \le \mu_1$ the solution $\lambda_0^*$ remains optimal. When $\mu > \mu_1$ some reduced costs become negative, so we make primal pivots until we find a new optimal solution $\lambda_1^*$. Then we express the reduced costs as linear functions of $\mu$, find the next critical value $\mu_2$ at which some costs reduce to 0, and so forth, see [8] for details. At some point all reduced costs will remain nonnegative no matter how big $\mu$ is, which means that the current solution is optimal for $\mu$ up to $+\infty$, or up to $\alpha = 1$.
As a result of the algorithm we have: a sequence of parameters $1 = \mu_0, \dots, \mu_M$, which corresponds to the sequence $\alpha_0 = 0, \dots, \alpha_M = 1 - 1/\mu_M$, $\alpha_{M+1} = 1$; a sequence of optimal solutions $\lambda_0^*, \dots, \lambda_M^*$; and a sequence of optimal objective values $f_2(\alpha_i)$ (with $f_2(\alpha_{M+1}) = f_2(\alpha_M)$). To calculate $f_1(x)$, find the interval $[\alpha_j, \alpha_{j+1}]$ such that $f_2(\alpha_j) \le x \le f_2(\alpha_{j+1})$. Then the optimal solution is $\lambda_j^* = (\lambda_1^j, \dots, \lambda_n^j)$ and $f_1(x) = \bar p_x\!\left( \sum_{i=1}^{n} \lambda_i^j X_i \right)$.
4.3. Finite Probability Space Applications
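Proposition 11. Let $\mathcal{X}$ be a convex set of random variables. Then for the problem $\inf_{X \in \mathcal{X}} \bar p_x(X)$, there are two cases:
1. If $\inf_{X \in \mathcal{X}} \sup X$ is attained for some $X^* \in \mathcal{X}$, and $x = \min_{X \in \mathcal{X}} \sup X = \sup X^*$, then $\min_{X \in \mathcal{X}} \bar p_x(X) = 0$ with optimal solution $X^*$.
2. Otherwise, the problem $\inf_{X \in \mathcal{X}} \bar p_x(X)$ can be reformulated as the following problem:
$$\inf_{Y \in \mathcal{Y}} E[Y + 1]^+,$$
where $\mathcal{Y} = \mathrm{cl\,cone}(\mathcal{X} - x)$ is a closed convex cone.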
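Proof. If Case 1 is not valid, then with Proposition 1 we conclude that
$$\inf_{X \in \mathcal{X}} \bar p_x(X) = \inf_{X \in \mathcal{X}} \min_{a \ge 0} E[a(X - x) + 1]^+ = \inf_{X \in \mathcal{X},\, a \ge 0} E[a(X - x) + 1]^+.$$
Denote $Y = a(X - x)$. Since $\mathcal{X}$ is convex, the constraints $X \in \mathcal{X}$, $a \ge 0$ are equivalent to $Y \in \mathrm{cone}(\mathcal{X} - x)$. Suppose that a sequence $\{Y_i\}_{i=1}^{\infty} \subset \mathrm{cone}(\mathcal{X} - x)$ converges weakly to $Y \in \mathcal{Y} = \mathrm{cl\,cone}(\mathcal{X} - x)$. Since $\mathrm{cone}(\mathcal{X} - x)$ is a convex set, and for convex sets weak and $L^1$ convergences are equivalent, the sequence $\{Y_i\}_{i=1}^{\infty}$ is $L^1$-converging to $Y$. Therefore,
$$\lim_{i \to \infty} E[Y_i + 1]^+ = E[Y + 1]^+.$$
Then, finally,
$$\inf_{X \in \mathcal{X}} \bar p_x(X) = \inf_{Y \in \mathcal{Y}} E[Y + 1]^+.$$
Denote $\Pi^m = \{q = (q_1, \dots, q_m)^T \in \mathbb{R}^m \mid q_i \ge 0,\ i = 1, \dots, m;\ \sum_{i=1}^{m} q_i = 1\}$. Denote $\Pi^m_+ = \{q = (q_1, \dots, q_m)^T \in \mathbb{R}^m \mid q_i > 0,\ i = 1, \dots, m;\ \sum_{i=1}^{m} q_i = 1\}$.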
For the following proposition we suppose $\mathcal{X}$ to be a set of random variables defined on a common finite probability space with the vector of elementary events' probabilities $p = (p_1, \dots, p_m)^T \in \Pi^m_+$. Denote by $S \subseteq \mathbb{R}^m$ the set of vectors such that any random variable $X \in \mathcal{X}$ takes values $x_1, \dots, x_m$ with probabilities $p_1, \dots, p_m$, respectively, for some $x = (x_1, \dots, x_m)^T \in S$. Then $\mathcal{X}$ being a closed convex set is equivalent to $S$ being a closed convex set.
Let us say that a random variable $X$ takes values from $x = (x_1, \dots, x_m) \in \mathbb{R}^m$ with probabilities $p = (p_1, \dots, p_m) \in \Pi^m$ if $X$ is discretely distributed over $m$ atoms and takes the value $x_i$ with probability $p_i$ for $i = 1, \dots, m$.
Corollary 9. Let $\mathcal{X}$ be a set of random variables such that
$$X \in \mathcal{X} \iff X \text{ takes values from } x \in S \text{ with probabilities } p,$$
where $S \subseteq \mathbb{R}^m$ is a convex set, $p \in \Pi^m_+$. Then for the problem $\inf_{X \in \mathcal{X}} \bar p_x(X)$, there are two cases:
1. If $\inf_{x \in S} \max_i x_i$ is attained for some $x^* \in S$, and $x = \min_{x \in S} \max_i x_i = \max_i x_i^*$, then $\min_{X \in \mathcal{X}} \bar p_x(X) = 0$ with optimal solution $X^*$ taking values from $x^*$.
2. Otherwise, the problem $\inf_{X \in \mathcal{X}} \bar p_x(X)$ can be reformulated as the following problem:
$$\inf_{y \in C} p^T [y + e]^+,$$
where $C = \mathrm{cl\,cone}(S - xe)$ is a closed convex cone, $e = (1, \dots, 1)^T \in \mathbb{R}^m$.
Proof. Let us apply Proposition 11 to the specific case of a finite probability space. If we consider Case 2, then the problem $\inf_{X \in \mathcal{X}} \bar p_x(X)$ can be reformulated as
$$\inf_{y \in C} p^T [y + e]^+,$$
where $C = \mathrm{cl\,cone}(S - xe)$ is a closed convex cone, since $S$ is convex.
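Corollary 10. Let $\mathcal{X}$ be a set of random variables such that
$$X \in \mathcal{X} \iff X \text{ takes values from } x \in S \text{ with probabilities } p,$$
where $S = \{x \mid Ax \le b\} \subseteq \mathbb{R}^m$, $p \in \Pi^m_+$. Then for the problem $\inf_{X \in \mathcal{X}} \bar p_x(X)$, there are two cases:
1. If $x = \min_{x \in S} \max_i x_i$, then $\min_{X \in \mathcal{X}} \bar p_x(X) = 0$ with optimal solution $X^*$ taking values from $x^*$ such that $\max_i x_i^* = x$.
2. Otherwise, the problem $\inf_{X \in \mathcal{X}} \bar p_x(X)$ can be reformulated as an LP:
$$\inf\; p^T z \qquad (9)$$
$$\text{s.t.} \quad z \ge y + e, \qquad (10)$$
$$Ay - a(b - xAe) \le 0, \qquad (11)$$
$$z \ge 0, \quad a \ge 0. \qquad (12)$$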
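Proof. Corollary 9 implies that in Case 2 the problem $\inf_{X \in \mathcal{X}} \bar p_x(X)$ can be reformulated as $\inf_{y \in C} p^T [y + e]^+$, with $C = \mathrm{cl\,cone}(S - xe)$. Note that
$$S - xe = \{y \mid Ay \le b - xAe\}.$$
Note also that
$$\mathrm{cone}(S - xe) = \{ay \mid Ay \le b - xAe,\ a > 0\} \cup \{0\} = \{y \mid Ay \le a(b - xAe),\ a > 0\} \cup \{0\}.$$
Therefore,
$$\mathrm{cl\,cone}(S - xe) = \{y \mid Ay \le a(b - xAe),\ a \ge 0\} \cup \{0\} = \{y \mid Ay \le a(b - xAe),\ a \ge 0\}.$$
Finally, introducing $z = [y + e]^+$, we obtain reformulation (9)–(12).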
Consider a random real-valued function $f(w; X)$, where $w \in \mathbb{R}^k$ and $X^T = (X_1, \dots, X_n)$ is a random vector of dimension $n$. It is assumed here that the variables $X_1, \dots, X_n$ can be observed, but cannot be controlled. It is also assumed that the value $f(w; X)$ can be controlled by the vector $w \in W \subseteq \mathbb{R}^k$.
Proposition 12. Let $f(w; X)$ be a convex function of $w$. Then $\bar p_x(f(w; X))$ is a quasi-convex function of $w$.
Proof. Convexity of $f$ implies
$$f(w_M; X) \le \lambda f(w_1; X) + (1 - \lambda) f(w_2; X),$$
for $w_M = \lambda w_1 + (1 - \lambda) w_2$. Then, using monotonicity of $\bar p_x(X)$, see Proposition 7,
$$\bar p_x(f(w_M; X)) \le \bar p_x(\lambda f(w_1; X) + (1 - \lambda) f(w_2; X)).$$
$\bar p_x(X)$ is a quasi-convex function of $X$, see Proposition 4. Quasi-convexity of the function $\bar p_x(X)$ is equivalent to $\bar p_x(\lambda X_1 + (1 - \lambda) X_2) \le \max\{\bar p_x(X_1), \bar p_x(X_2)\}$. Then
$$\bar p_x(\lambda f(w_1; X) + (1 - \lambda) f(w_2; X)) \le \max\{\bar p_x(f(w_1; X)), \bar p_x(f(w_2; X))\}.$$
Therefore,
$$\bar p_x(f(w_M; X)) \le \max\{\bar p_x(f(w_1; X)), \bar p_x(f(w_2; X))\},$$
i.e., $\bar p_x(f(w; X))$ is a quasi-convex function of $w$.
Proposition 13. Let $X$ be a random vector and let $f(w; X)$ be a convex positive-homogeneous function of $w$. Assume that convergence of $\{w_i\}$ implies $L^1$-convergence of $\{f(w_i; X)\}$. Let $W$ be a convex set. Then for the problem $\inf_{w \in W} \bar p_x(f(w; X))$ there are two possible cases:
1. If $\inf_{w \in W} \sup f(w; X)$ is attained for some $w^* \in W$, and $x = \min_{w \in W} \sup f(w; X) = \sup f(w^*; X)$, then $\min_{w \in W} \bar p_x(f(w; X)) = 0$ with optimal solution $w^*$.
2. Otherwise, the problem $\inf_{w \in W} \bar p_x(f(w; X))$ can be reformulated as a convex programming problem:
$$\inf_{v \in V} E[\bar f(v; X) + 1]^+,$$
where $v^T = (v_1, \dots, v_{k+1}) \in \mathbb{R}^{k+1}$, $\bar f(v; X) = f((v_1, \dots, v_k)^T; X) - v_{k+1}$, and $V = \mathrm{cl\,cone}(W \times \{x\})$.
Proof. In Case 2 we can reformulate the problem $\inf_{w \in W} \bar{p}_x(f(w;X))$ as follows:
$$\inf_{w \in W} \bar{p}_x(f(w;X)) = \inf_{a \ge 0,\, w \in W} E[a(f(w;X) - x) + 1]^+.$$
Denote $v^T = (v_1, \dots, v_{k+1}) \in \mathbb{R}^{k+1}$ and $\bar{f}(v;X) = f((v_1, \dots, v_k)^T;X) - v_{k+1}$. Then,
$$\inf_{w \in W} \bar{p}_x(f(w;X)) = \inf_{a \ge 0,\, v \in W \times \{x\}} E[a\bar{f}(v;X) + 1]^+.$$
Note that $\bar{f}(v;X)$ is also a convex positive-homogeneous function of $v$. Then,
$$\inf_{w \in W} \bar{p}_x(f(w;X)) = \inf_{a \ge 0,\, v \in W \times \{x\}} E[\bar{f}(av;X) + 1]^+.$$
Note that since $W$ is a convex set, $W \times \{x\}$ is also a convex set. Therefore,
$$a \ge 0,\ v \in W \times \{x\} \;\Leftrightarrow\; av \in \operatorname{cone}(W \times \{x\}).$$
Note that the feasible region can be extended to $V = \operatorname{cl}\,\operatorname{cone}(W \times \{x\})$: since convergence of $\{w_i\}$ implies $L^1$-convergence of $\{f(w_i;X)\}$, convergence $v_i \to v$ implies $L^1$-convergence $\bar{f}(v_i;X) \to \bar{f}(v;X)$, and therefore $E[\bar{f}(v_i;X) + 1]^+ \to E[\bar{f}(v;X) + 1]^+$. Finally,
$$\inf_{w \in W} \bar{p}_x(f(w;X)) = \inf_{v \in V} E[\bar{f}(v;X) + 1]^+.$$
Corollary 11. Let $X = (X_1, \dots, X_n, 1)^T$ be a random vector, with the last component being a constant 1, and $E|X_i| < \infty$ for $i = 1, \dots, n$. Let $W \subseteq \mathbb{R}^{n+1}$ be a convex set. Then for the problem $\inf_{w \in W} \bar{p}_x(w^T X)$ there are two possible cases as follows:
1. If $\inf_{w \in W} \sup w^T X$ is attained for some $w^* \in W$, and $x = \min_{w \in W} \sup w^T X = \sup (w^*)^T X$, then $\min_{w \in W} \bar{p}_x(w^T X) = 0$ with optimal solution $w^*$.
2. Problem $\inf_{w \in W} \bar{p}_x(w^T X)$ can be reformulated as the following convex programming problem:
$$\inf_{v \in V} E[v^T X + 1]^+,$$
where $V = \operatorname{cl}\,\operatorname{cone}(W - xe_{n+1})$ is a closed convex cone and $e_{n+1} = (0, \dots, 0, 1)^T \in \mathbb{R}^{n+1}$.
Proof. Let us show that this corollary follows from Proposition 13. First, $f(w;X) = \sum_{i=1}^{n} w_i X_i + w_{n+1}$ is convex and positive-homogeneous w.r.t. $w$. Second, suppose that $w^j \to w$. Then
$$E|(w^j)^T X - w^T X| \le |w^j_{n+1} - w_{n+1}| + \sum_{i=1}^{n} |w^j_i - w_i|\, E|X_i| \to 0,$$
since $E|X_i| < \infty$. Therefore, convergence of $\{w^j\}$ implies $L^1$-convergence of $\{f(w^j;X)\}$. Note that in this particular case of the function $f$ there is no need to introduce a new parameter. It is sufficient to shift the feasible region for $w_{n+1}$ by $-x$: $\bar{W} = W - xe_{n+1}$. A further change of variables $v = aw$, and setting $V = \operatorname{cl}\,\operatorname{cone}(\bar{W})$, as is done in Proposition 13, finalizes the proof.
Corollary 12. Let $X = (X_1, \dots, X_n, 1)^T$ be a random vector, with the last component being a constant 1, and $E|X_i| < \infty$ for $i = 1, \dots, n$. Let $W = \{w \mid Aw \le b\} \subseteq \mathbb{R}^{n+1}$. Then for the problem $\inf_{w \in W} \bar{p}_x(w^T X)$ there are two possible cases:
1. If $\inf_{w \in W} \sup w^T X$ is attained for some $w^* \in W$, and $x = \min_{w \in W} \sup w^T X = \sup (w^*)^T X$, then $\min_{w \in W} \bar{p}_x(w^T X) = 0$ with optimal solution $w^*$.
2. Problem $\inf_{w \in W} \bar{p}_x(w^T X)$ can be reformulated as the linear programming problem:
$$\inf\; E[v^T X + 1]^+ \tag{13}$$
$$\text{s.t.}\;\; Av - a(b - xAe_{n+1}) \le 0, \tag{14}$$
$$a \ge 0. \tag{15}$$
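Proof. Let us prove that this corollary follows from Corollary 11. Note that
$$W - xe_{n+1} = \{w \mid Aw + xAe_{n+1} \le b\}.$$
Note further that
$$\operatorname{cone}(W - xe_{n+1}) = \{v \mid Av + axAe_{n+1} \le ab,\ a > 0\} \cup \{0\}.$$
Finally,
$$V = \operatorname{cl}\,\operatorname{cone}(W - xe_{n+1}) = \{v \mid Av - a(b - xAe_{n+1}) \le 0,\ a \ge 0\}.$$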
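To make problem (13)-(15) concrete, the sketch below solves it for a random vector with finitely many scenarios, using SciPy's `linprog`. Several choices here are assumptions for illustration only and are not specified in the paper: the function name `min_bpoe_linear`, the auxiliary variables $z_j \ge v^T X_j + 1$, $z_j \ge 0$ that linearize the positive part, and the two-asset data. The sketch presumes Case 2 applies, and the weights are recovered from $v = a(w - xe_{n+1})$ when $a > 0$.

```python
import numpy as np
from scipy.optimize import linprog


def min_bpoe_linear(A, b, scenarios, p, x):
    """Minimize bPOE of w^T X over W = {w | A w <= b} for a finite scenario set.

    scenarios: (m, n) array with outcomes of (X_1, ..., X_n); a constant 1 is
    appended as component n+1. LP variables are ordered [v (n+1), z (m), a].
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    p = np.asarray(p, dtype=float)
    X = np.hstack([np.asarray(scenarios, dtype=float), np.ones((p.size, 1))])
    m, k = X.shape
    e_last = np.zeros(k)
    e_last[-1] = 1.0

    # Objective (13) for finite scenarios: sum_j p_j z_j.
    cost = np.concatenate([np.zeros(k), p, [0.0]])
    # v^T X_j - z_j <= -1, which with z >= 0 gives z_j = [v^T X_j + 1]^+ at optimum.
    rows_z = np.hstack([X, -np.eye(m), np.zeros((m, 1))])
    # Constraint (14): A v - a (b - x A e_{n+1}) <= 0.
    shift = b - x * (A @ e_last)
    rows_cone = np.hstack([A, np.zeros((A.shape[0], m)), -shift[:, None]])

    res = linprog(
        cost,
        A_ub=np.vstack([rows_z, rows_cone]),
        b_ub=np.concatenate([-np.ones(m), np.zeros(A.shape[0])]),
        # v is free; z >= 0 and a >= 0 as in (15).
        bounds=[(None, None)] * k + [(0, None)] * (m + 1),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    v, a = res.x[:k], res.x[-1]
    w = v / a + x * e_last if a > 1e-12 else None
    return res.fun, w


# Assumed example: two assets, weights sum to 1, nonnegative, zero constant term.
A = [[1, 1, 0], [-1, -1, 0], [-1, 0, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]]
b = [1, -1, 0, 0, 0, 0]
scenarios = [[0, 2], [1, 0]]   # rows are scenarios, columns are asset losses
value, w = min_bpoe_linear(A, b, scenarios, p=[0.5, 0.5], x=0.6)
print(value, w)
```

In this instance every feasible portfolio has a worst-case loss of at least 2/3, which exceeds the threshold 0.6, so Case 2 applies. A hand calculation gives the optimal value 5/6 with all weight on the first asset.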
Corollary 13. Let $X$ be a random vector taking values $\mathbf{x}_1, \dots, \mathbf{x}_m \in \mathbb{R}^n$ with probabilities $p = (p_1, \dots, p_m) \in \Pi^m_+$. Let $f(w;X)$ be a convex positive-homogeneous function of $w \in \mathbb{R}^k$. Let $W \subseteq \mathbb{R}^k$ be a convex set. Then for the problem $\inf_{w \in W} \bar{p}_x(f(w;X))$ there are two possible cases:
1. If $\inf_{w \in W} \max_j f(w;\mathbf{x}_j)$ is attained for some $w^* \in W$, and $x = \min_{w \in W} \max_j f(w;\mathbf{x}_j) = \max_j f(w^*;\mathbf{x}_j)$, then $\min_{w \in W} \bar{p}_x(f(w;X)) = 0$ with optimal solution $w^*$.
2. Problem $\inf_{w \in W} \bar{p}_x(f(w;X))$ can be reformulated as the convex programming problem:
$$\inf_{v \in V} \sum_{j=1}^{m} p_j [\bar{f}(v;\mathbf{x}_j) + 1]^+,$$
where $v^T = (v_1, \dots, v_{k+1}) \in \mathbb{R}^{k+1}$, $\bar{f}(v;X) = f((v_1, \dots, v_k)^T;X) - v_{k+1}$, and $V = \operatorname{cl}\,\operatorname{cone}(W \times \{x\})$ is a closed convex cone.
Proof. Note that since there are finitely many scenarios for the random vector $X$, for $w_i \to w$, due to continuity of the function $f$ w.r.t. $w$, we have $\max_j |f(w_i;\mathbf{x}_j) - f(w;\mathbf{x}_j)| \to 0$. That is, convergence of $\{w_i\}$ implies $L^1$-convergence of $\{f(w_i;X)\}$. Therefore, this corollary follows directly from Proposition 13.

5. References
[1] Norton, M., and Uryasev, S. AUC and Buffered AUC Maximization. University of Florida, Research Report, in preparation, 2014.
[2] Pavlikov, K., and Uryasev, S. CVaR Distance between Distributions and Applications. University of Florida, Research Report, in preparation, 2014.
[3] Rockafellar, R. T., and Royset, J. O. Random variables, monotone relations and convex analysis. Mathematical Programming B, accepted.
[4] Rockafellar, R. T., and Royset, J. O. On buffered failure probability in design and optimization of structures. Reliability Engineering and System Safety 95, 5 (2010), 499-510.
[5] Rockafellar, R. T., and Uryasev, S. Conditional value-at-risk for general loss distributions. Journal of Banking and Finance (2002), 1443-1471.
[6] Rockafellar, R. T., and Uryasev, S. The Fundamental Risk Quadrangle in Risk Management, Optimization and Statistical Estimation. Surveys in Operations Research and Management Science 18 (2013).
[7] Ruszczynski, A., and Vanderbei, R. J. Frontiers of Stochastically Nondominated Portfolios. Econometrica 71, 4 (2003), 1287-1297.
[8] Vanderbei, R. J. Linear Programming: Foundations and Extensions. International Series in Operations Research & Management Science. Kluwer Academic, 2001.
[9] Zabarankin, M., and Uryasev, S. Statistical Decision Problems: Selected Concepts and Portfolio Safeguard Case Studies. Springer Optimization and Its Applications 85. New York, NY: Springer.
[10] Zdanovskaya, V., Pavlikov, K., and Uryasev, S. Estimation of Mixtures of Continuous Distributions: Mixtures of Normal Distributions and Applications. University of Florida, Research Report, in preparation, 2013.