[Figure 8.3: The second-stage path corresponding to the strategy “take the first D-train” (later called $A_{0,D}$) for a realization of $x$ and the case $\mathrm{dep}_S(2) + M_S < \Delta_D + M_D$. Labels in the figure: time axis with ticks $0$, $\Delta_S$, $\Delta_S + T_S$, $\Delta_D$, $\Delta_S + 2T_S$; $t = x$; 1st stage decision, 2nd stage decision.]
• If $\mathrm{suc}_S(x) + M_S < \mathrm{suc}_D(x) + M_D$, then he takes the next available S-train at time $\mathrm{suc}_S(x)$ and arrives at time $\mathrm{suc}_S(x) + M_S$.
• If $\mathrm{suc}_S(x) + M_S > \mathrm{suc}_D(x) + M_D$, then he takes the next available D-train at time $\mathrm{suc}_D(x)$ and arrives at time $\mathrm{suc}_D(x) + M_D$.
Thus in both cases the traveler arrives at time
\[
\min\{\mathrm{suc}_S(x) + M_S,\ \mathrm{suc}_D(x) + M_D\}. \tag{8.3}
\]
Without loss of generality, we assume that the traveler takes the S-train if $\mathrm{suc}_S(x) + M_S = \mathrm{suc}_D(x) + M_D$. Note that if the traveler is still waiting at the departure station at a point in time $t$ with $\mathrm{suc}_D(t) + M_D < \mathrm{suc}_S(t) + M_S$, then he should take the D-train anyway, whether the disruption is over or not.

Combining the first and second stage
As described before, if the disruption ends at time $x$ while the traveler is still waiting at the departure station, he follows the uniquely defined second-stage strategy $A^*(x)$. Hence, the practical strategies for the TRCP can be described as follows:
• The strategies $A_{r,S}$, for $r = 0, \dots, \infty$, connected to the $r$-th departure of an S-train:
– Wait at the departure station no longer than until $\min\{x, \mathrm{dep}_S(r)\}$, i.e., until the $r$-th S-train departs or until the disruption is over, whichever happens first.
– If $x \le \mathrm{dep}_S(r)$, use the second-stage strategy $A^*(x)$.
– If $\mathrm{dep}_S(r) < x$, take the $r$-th S-train.
• The strategies $A_{r,D}$, for $r = 0, \dots, \infty$, connected to the $r$-th departure of a D-train:
– Wait at the departure station no longer than until $\min\{x, \mathrm{dep}_D(r)\}$, i.e., until the $r$-th D-train departs or until the disruption is over, whichever happens first.
– If $x \le \mathrm{dep}_D(r)$, use the second-stage strategy $A^*(x)$.
– If $\mathrm{dep}_D(r) < x$, take the $r$-th D-train.
Note that the strategies above are exactly those that are not ruled out as impractical at first sight. In Section 8.7, on dominance among strategies for the TRCP, we show that some of them are clearly superior to others.
If $A$ is one of the strategies defined above for solving instances of the TRCP, we write $z(A, x)$ for the arrival time realized by applying strategy $A$ if the end time of the disruption turns out to be $x$. Based on these descriptions, the arrival times realized by the strategies $A_{r,S}$ and $A_{r,D}$ can be computed as follows:
\[
z(A_{r,S}, x) =
\begin{cases}
\min\{\mathrm{suc}_S(x) + M_S,\ \mathrm{suc}_D(x) + M_D\} & \text{if } x \le \mathrm{dep}_S(r), \text{ see (8.3)},\\
\mathrm{dep}_S(r) + M_S & \text{if } \mathrm{dep}_S(r) < x \text{ and } x \le \mathrm{dep}_S(r) + y, \text{ see (8.1)},\\
\mathrm{suc}_S(x - y) + M_S & \text{otherwise, see (8.1)},
\end{cases}
\]
\[
z(A_{r,D}, x) =
\begin{cases}
\min\{\mathrm{suc}_S(x) + M_S,\ \mathrm{suc}_D(x) + M_D\} & \text{if } x \le \mathrm{dep}_D(r), \text{ see (8.3)},\\
\mathrm{dep}_D(r) + M_D & \text{otherwise, see (8.2)}.
\end{cases}
\]
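The two case distinctions can be sketched in code. The following is a minimal illustration, not the authors' implementation. It assumes a periodic timetable in which departure $r$ of a line leaves at `first + r * period`, and it reads $\mathrm{suc}(t)$ as the earliest departure at or after time $t$. The class `Line`, the function names and all numerical values are hypothetical. The parameter `y` is simply the constant appearing in (8.1).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """Hypothetical periodic line: departure r leaves at first + r * period."""
    first: float   # departure time for r = 0
    period: float  # time between consecutive departures
    travel: float  # travel time M to the destination

    def dep(self, r: int) -> float:
        return self.first + r * self.period

    def suc(self, t: float) -> float:
        # Assumed meaning of suc: earliest departure at or after time t.
        r = max(0, math.ceil((t - self.first) / self.period))
        return self.dep(r)


def second_stage(S: Line, D: Line, x: float) -> float:
    """Arrival time (8.3) when the traveler is still waiting at time x."""
    return min(S.suc(x) + S.travel, D.suc(x) + D.travel)


def z_S(S: Line, D: Line, y: float, r: int, x: float) -> float:
    """Arrival time of strategy A_{r,S} if the disruption ends at time x."""
    if x <= S.dep(r):
        return second_stage(S, D, x)
    if x <= S.dep(r) + y:
        return S.dep(r) + S.travel
    return S.suc(x - y) + S.travel


def z_D(S: Line, D: Line, r: int, x: float) -> float:
    """Arrival time of strategy A_{r,D} if the disruption ends at time x."""
    if x <= D.dep(r):
        return second_stage(S, D, x)
    return D.dep(r) + D.travel


# Made-up instance, for illustration only.
S = Line(first=5, period=10, travel=20)
D = Line(first=20, period=30, travel=40)
Y = 8
```

With these made-up numbers, `z_S(S, D, Y, 1, 12)` falls into the first case (the disruption ends before the S-train with $r = 1$ departs), `z_S(S, D, Y, 1, 20)` into the second, and `z_S(S, D, Y, 1, 40)` into the third.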
8.5 Evaluating strategies under uncertainty
8.5.1 Quality measures for optimization under uncertainty
There are many different ways of measuring the quality of a strategy under uncertainty. Consider an optimization problem
\[
\min_{A \in S} z(A, x),
\]
where the objective function $z$ depends on a strategy $A$ and on an uncertain parameter $x$.
[...] measures used for evaluation under uncertainty. Note that most of the literature on robust and stochastic optimization deals with solutions; in the following description we transfer these concepts to strategies.
Robust optimization aims at the minimization of some objective function in the
worst case. In strict robustness, the worst-case absolute objective value
\[
g_{\text{wc-abs}}(A) := \sup_{x \in X} z(A, x)
\]
is used as an objective function, see e.g. Ben-Tal et al. (2009). Other robustness approaches do not evaluate the absolute objective value over all scenarios $x$, but compare the realized objective value of a strategy $A$ to the objective value of the best possible strategy that could have been obtained under prior knowledge of scenario $x$, denoted as $z^*(x)$. We call the difference
\[
z_{\text{rg}}(A, x) := z(A, x) - z^*(x)
\]
the absolute regret of the strategy $A$ in scenario $x$. Similarly, the ratio
\[
z_{\text{comp}}(A, x) := \frac{z(A, x)}{z^*(x)}
\]
is called the competitive ratio (or relative regret) of the strategy $A$ in scenario $x$. Minimizing the worst-case regret
\[
g_{\text{wc-rg}}(A) := \sup_{x \in X} \bigl( z(A, x) - z^*(x) \bigr)
\]
and the worst-case competitive ratio
\[
g_{\text{wc-comp}}(A) := \sup_{x \in X} \frac{z(A, x)}{z^*(x)}
\]
are common objective functions in robust optimization, see e.g. Kouvelis and Yu (1997). In online optimization, the competitive ratio is the most common quality measure, see e.g. Borodin and El-Yaniv (1998).
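As a small illustration of the three worst-case measures, the sketch below evaluates them on a finite scenario set, where the supremum becomes a maximum. The strategy names and arrival times are made up, and $z^*(x)$ is taken as the minimum over the listed strategies only, which is an assumption of this example.

```python
# Hypothetical arrival times z(A, x) for three strategies in four scenarios.
z = {
    "A1": [30, 30, 50, 70],
    "A2": [40, 40, 40, 40],
    "A3": [30, 35, 45, 60],
}
n_scenarios = 4

# z*(x): best value achievable in each scenario with prior knowledge of x
# (here: the minimum over the three listed strategies).
z_star = [min(z[a][i] for a in z) for i in range(n_scenarios)]


def wc_abs(a: str) -> float:
    """Worst-case absolute objective value."""
    return max(z[a])


def wc_rg(a: str) -> float:
    """Worst-case absolute regret."""
    return max(v - s for v, s in zip(z[a], z_star))


def wc_comp(a: str) -> float:
    """Worst-case competitive ratio."""
    return max(v / s for v, s in zip(z[a], z_star))


for a in z:
    print(a, wc_abs(a), wc_rg(a), wc_comp(a))
```

In this toy instance the constant strategy `A2` has the smallest value under all three worst-case measures, although it is never better than 40 in any scenario.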
If a probability distribution with probability density function $p$ on the uncertainty set $X$ is known, the conservative approach of robust optimization can be weakened by excluding unlikely scenarios with high objective values. For example, for any $\gamma \in [0, 1]$ and $f \in \{z, z_{\text{rg}}, z_{\text{comp}}\}$ we denote by
\[
g_{\text{reach-}\gamma}(A) := \min\Bigl\{\alpha : \int_{I(A,\alpha)} p(x)\,dx \ge \gamma\Bigr\},
\]
with
\[
I(A, \alpha) := \{x : f(A, x) \le \alpha\}, \tag{8.4}
\]
the minimal value of $f$ which can be guaranteed with probability $\gamma$. If we set $\gamma := 1$, we obtain the above-described problems of minimizing the worst-case absolute value, regret or competitive ratio from robust and online optimization. Examples of this objective function with $\gamma < 1$ can be found, e.g., in Daskin et al. (1997) and Gao (2011).
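For a discrete distribution the integral over $I(A, \alpha)$ becomes a sum, and the measure is the $\gamma$-quantile of $f(A, x)$. The sketch below assumes finitely many scenarios with made-up values and probabilities, and restricts $\gamma$ to $(0, 1]$. The function name `reach` is hypothetical.

```python
def reach(values, probs, gamma):
    """Smallest alpha with P(f(A, x) <= alpha) >= gamma (discrete scenarios)."""
    if not 0 < gamma <= 1:
        raise ValueError("gamma must lie in (0, 1]")
    cumulative = 0.0
    # Walk through the scenarios in order of increasing value of f.
    for value, prob in sorted(zip(values, probs)):
        cumulative += prob
        if cumulative >= gamma:
            return value
    raise ValueError("probabilities sum to less than gamma")


# Hypothetical values f(A, x) of one strategy and scenario probabilities.
values = [30, 30, 50, 70]
probs = [0.5, 0.25, 0.125, 0.125]

print(reach(values, probs, 0.75))  # value guaranteed with probability 0.75
print(reach(values, probs, 1.0))   # worst case over all scenarios
```

Setting `gamma` to 1 returns the largest value with positive probability, which matches the worst-case measure on this finite scenario set.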
A different objective which is often used in stochastic optimization is the expected value
\[
g_{\text{exp}}(A) := \int_X f(A, x)\, p(x)\,dx
\]
(where $p$ is the probability density function on the uncertainty set $X$), which, as before, can be evaluated for $f \in \{z, z_{\text{rg}}, z_{\text{comp}}\}$, see e.g. Birge and Louveaux (1997).
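In a discrete setting the expected value reduces to a probability-weighted sum. The following sketch evaluates it for all three choices of $f$ for a single strategy. The scenario data and the values of $z^*(x)$ are made up for illustration.

```python
# Hypothetical data: arrival times of one strategy, best achievable arrival
# times z*(x), and scenario probabilities.
z_A = [30, 30, 50, 70]
z_star = [30, 30, 40, 40]
p = [0.5, 0.25, 0.125, 0.125]


def expected(f_values, probs):
    """Discrete analogue of the integral of f(A, x) p(x) over X."""
    return sum(prob * value for value, prob in zip(f_values, probs))


exp_abs = expected(z_A, p)
exp_rg = expected([v - s for v, s in zip(z_A, z_star)], p)
exp_comp = expected([v / s for v, s in zip(z_A, z_star)], p)
print(exp_abs, exp_rg, exp_comp)
```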
If the traveler in the TRCP has an important appointment at his destination at a certain time, then he would choose a strategy with the highest probability of arriving on time. Likewise, the probability of achieving a certain regret value or competitive ratio can be maximized. Let $I(A, \alpha)$ be defined as in (8.4). We define
\[
g_{\text{prob-}\alpha}(A) := \int_{I(A,\alpha)} p(x)\,dx
\]
as the probability that $f(A, x)$ does not exceed the value $\alpha$, for $f \in \{z, z_{\text{rg}}, z_{\text{comp}}\}$. This function $g_{\text{prob-}\alpha}(A)$ is sometimes called the reliability of $A$ (Nie and Wu, 2009; Pan et al., 2013). Often, a lower bound on the reliability, a so-called chance constraint $g_{\text{prob-}\alpha}(A) \ge \gamma$, is imposed, see e.g. Valdebenito and Schuëller (2010) and Birge and Louveaux (1997). Approaches that find the most reliable strategy are less common, see e.g. Nie and Wu (2009); Gao (2011); Pan et al. (2013).
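The reliability and a chance constraint can be illustrated on a discrete scenario set as follows. This is a sketch with made-up arrival times, probabilities, deadline `alpha` and threshold `gamma`; the function name `reliability` is hypothetical.

```python
# Hypothetical arrival times z(A, x) and scenario probabilities.
z = {
    "A1": [30, 30, 50, 70],
    "A2": [40, 40, 40, 40],
    "A3": [30, 35, 45, 60],
}
p = [0.5, 0.25, 0.125, 0.125]


def reliability(values, probs, alpha):
    """Probability that f(A, x) <= alpha (discrete scenarios)."""
    return sum(prob for value, prob in zip(values, probs) if value <= alpha)


alpha = 45   # assumed appointment time
gamma = 0.8  # assumed required reliability

rel = {a: reliability(z[a], p, alpha) for a in z}
# Strategies that satisfy the chance constraint g_prob-alpha(A) >= gamma.
feasible = [a for a in z if rel[a] >= gamma]
# Most reliable strategy for this deadline.
most_reliable = max(z, key=lambda a: rel[a])
print(rel, feasible, most_reliable)
```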
As a last example of a quality measure, some papers, see e.g. Sigal et al. (1980), also investigate how to find a strategy which has the highest probability of being optimal, i.e., they take
\[
g_{\text{opt}}(A) := \int_{I_{\text{opt}}(A)} p(x)\,dx
\]
with $I_{\text{opt}}(A) := \{x : f(A, x) \le f(A', x) \text{ for all } A' \in S\}$ as an evaluation criterion. Note that for this measure it is irrelevant whether we choose $f$ as the absolute value, the regret or the competitive ratio.
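The remark that the choice of $f$ is irrelevant here can be checked on a toy instance: subtracting $z^*(x)$ or dividing by a positive $z^*(x)$ does not change which strategy is best within a fixed scenario. The sketch below uses made-up data and takes the listed strategies as the whole strategy set.

```python
# Hypothetical arrival times z(A, x) and scenario probabilities.
z = {
    "A1": [30, 30, 50, 70],
    "A2": [40, 40, 40, 40],
    "A3": [30, 35, 45, 60],
}
p = [0.5, 0.25, 0.125, 0.125]
scenarios = range(len(p))
z_star = [min(z[a][i] for a in z) for i in scenarios]


def g_opt(f):
    """Probability of being optimal, for f(strategy, scenario index)."""
    result = {}
    for a in z:
        optimal_in = [i for i in scenarios
                      if all(f(a, i) <= f(b, i) for b in z)]
        result[a] = sum(p[i] for i in optimal_in)
    return result


by_value = g_opt(lambda a, i: z[a][i])
by_regret = g_opt(lambda a, i: z[a][i] - z_star[i])
by_ratio = g_opt(lambda a, i: z[a][i] / z_star[i])
print(by_value, by_regret, by_ratio)
```

The three dictionaries coincide. Their entries can sum to more than 1, because several strategies may be optimal in the same scenario.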
Summarizing, which strategy is considered as optimal depends on:
1. The way of measuring the objective value for a given A and a given scenario x:
• Absolute objective value
• Absolute regret
• Competitive ratio (or relative regret)
2. The utility function, i.e., the way of aggregating over the uncertainty set:
• Minimize: Objective value that can be guaranteed with probability at least $\gamma$ (with robust and online optimization ($\gamma = 1$) as special cases)
• Minimize: Average value
• Maximize: Probability of reaching a predefined value $\alpha$
• Maximize: Probability of having found an optimal solution
3. The choice of the probability distribution.
Hence, while in the deterministic case the definition of an optimal solution is straightforward, this is not true in the uncertain case. In this chapter we compare different strategies $A$; the uncertain parameter is the time $x$ at which the disruption ends. In the next section we will see that different quality measures indeed lead to different “optimal” solutions.
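That the notion of optimality depends on the quality measure can already be seen on a small artificial instance. The sketch below is not taken from the TRCP; the three strategies, their arrival times and the probabilities are made up so that three of the measures each select a different strategy.

```python
# Hypothetical arrival times z(A, x) and scenario probabilities.
z = {
    "A1": [30, 30, 50, 70],
    "A2": [40, 40, 40, 40],
    "A3": [30, 35, 45, 60],
}
p = [0.5, 0.25, 0.125, 0.125]
z_star = [min(z[a][i] for a in z) for i in range(len(p))]


def worst_case(a):
    return max(z[a])


def expected(a):
    return sum(prob * value for value, prob in zip(z[a], p))


def prob_optimal(a):
    return sum(prob for value, best, prob in zip(z[a], z_star, p)
               if value == best)


best_worst_case = min(z, key=worst_case)      # robust choice
best_expected = min(z, key=expected)          # stochastic choice
best_prob_optimal = max(z, key=prob_optimal)  # most often optimal
print(best_worst_case, best_expected, best_prob_optimal)
```

Here the worst-case measure prefers the constant strategy, the expected value prefers the intermediate one, and the probability of being optimal prefers the strategy that is best in the most likely scenarios.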