CHAPTER 4 : Data-Driven Robust Resource Allocation
4.6 Data-Driven Evaluations
4.6.5 Compare robust solutions with non-robust solutions
To test the quality of the uncertainty sets used in the robust dispatch problems, we borrow the idea of cross-validation from machine learning. The dataset is split into a training set, used to build the uncertain demand model, and a testing set, used to compare the resulting dispatch solutions. The customer demand models used in the robust and non-robust optimization problems are different. For the non-robust dispatch problem, the demand prediction rk is a deterministic value; in this work we use the mean of the bootstrapped values of the training dataset. In the experiments, the idle geographical distance of a taxi between the drop-off of one passenger and the following pick-up is approximated by the one-norm distance between the 2D geographical coordinates of the two points (given as longitude and latitude values of the GPS data in the trip dataset). The corresponding idle miles on the ground are then converted from this geographical distance according to the geographical coordinates of New York City.
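The one-norm idle distance described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `idle_miles`, the constant of roughly 69 miles per degree of latitude, and the reference latitude of 40.7 degrees for New York City are not taken from the text, and the conversion actually used in the experiments may differ.

```python
import math

# Assumption: roughly 69 miles per degree of latitude (approximate value).
MILES_PER_DEG_LAT = 69.0


def idle_miles(drop_off, pick_up, ref_lat_deg=40.7):
    """One-norm idle distance in miles between two (longitude, latitude) points.

    ref_lat_deg is an assumed reference latitude for New York City; it is
    used to scale a degree of longitude, which shrinks with cos(latitude).
    """
    d_lon = abs(pick_up[0] - drop_off[0])  # longitude difference in degrees
    d_lat = abs(pick_up[1] - drop_off[1])  # latitude difference in degrees
    miles_per_deg_lon = MILES_PER_DEG_LAT * math.cos(math.radians(ref_lat_deg))
    return d_lon * miles_per_deg_lon + d_lat * MILES_PER_DEG_LAT
```

Summing the two coordinate differences, rather than taking the straight-line distance, mirrors the one-norm approximation stated in the text.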
[Figure 22 plot: histogram titled "Demand supply ratio error comparison". Horizontal axis: demand-supply ratio error range (0 to 240); vertical axis: percentage (0 to 60%). Series: robust solutions, non-robust solutions.]
Figure 22: Demand-supply ratio error distribution of the robust optimization solutions with the SOC type of uncertain demand set (ϵ = 0.25, i.e., probabilistic guarantee level 75%) and of the non-robust optimization solutions. The demand-supply ratio error of the robust solutions is smaller than that of the non-robust solutions: the average demand-supply ratio error is reduced by 31.7%.
In the robust dispatch problem, the part that directly involves the uncertain demand rk is the penalty function for violating the balanced demand-supply ratio requirement. For each testing sample rk, we denote the demand-supply ratio mismatch error of a dispatch solution as (5.18). We then compare the value of (5.18) for the robust dispatch solutions, computed with the SOC type of uncertainty set constructed in this work, with its value for the non-robust solutions over the testing samples. The distributions of these values are shown in Figure 22. With robust solutions, the average demand-supply ratio error is reduced by 31.7%.
We compare the distributions of the total idle distance in Figure 23. The average total idle distance is reduced by 10.13%. Over all testing samples, no robust dispatch solution results in an idle distance greater than 0.8 × 10^5, whereas 48% of the samples under non-robust solutions have an idle distance greater than 0.8 × 10^5. The cost of robust dispatch (4.11) is a weighted sum of the demand-supply ratio error and the estimated total idle driving distance, and the average cost is reduced by 11.8% with robust solutions. This means that, when the true demand deviates from the average historical value, system performance improves if model uncertainty information is taken into account in the dispatch process. It is worth noting that the total idle distance shown in this figure is the direct calculation result of the robust dispatch problem. When we convert this number to an estimated value in miles over one year, the result is a total reduction of 20 million miles in NYC.
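The comparison above rests on two simple computations: a per-sample cost formed as a weighted sum, and the relative reduction of an average. A minimal sketch follows, assuming a single hypothetical weight `beta` on the ratio error; the exact form and weights of (4.11) are not reproduced here, and the function names are illustrative only.

```python
def dispatch_cost(idle_distance, ratio_error, beta=1.0):
    """Weighted sum of idle distance and demand-supply ratio error.

    beta is a hypothetical weight; the actual weighting in (4.11) may differ.
    """
    return idle_distance + beta * ratio_error


def average_reduction(robust_values, non_robust_values):
    """Relative reduction of the robust average with respect to the non-robust average."""
    mean_robust = sum(robust_values) / len(robust_values)
    mean_non_robust = sum(non_robust_values) / len(non_robust_values)
    return (mean_non_robust - mean_robust) / mean_non_robust
```

Applied to the per-sample errors, idle distances, or costs of the testing set, `average_reduction` yields the kind of percentage reported in this section (for example, a return value of 0.10 corresponds to a 10% reduction).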
[Figure 23 plot: histogram titled "Total idle distance comparison". Horizontal axis: total idle distance range (0.7 to 1.3, × 10^5); vertical axis: percentage (0 to 75%). Series: robust solutions, non-robust solutions.]
Figure 23: Total idle distance comparison of the robust optimization solutions with the SOC type of uncertain demand set (ϵ = 0.25, i.e., probabilistic guarantee level 75%) and the non-robust optimization solutions. The average total idle distance is reduced by 10.13%. Over all samples used in testing, no robust dispatch solution results in an idle distance greater than 0.8 × 10^5, whereas 48% of the samples under non-robust solutions have an idle distance greater than 0.8 × 10^5. The total idle distance shown in this figure is the direct calculation result of the robust dispatch problem; when we convert this number to an estimated value in miles over one year, the result is a total reduction of 20 million miles in NYC.
robust dispatch problems with the uncertainty set should guarantee that, with probability at least (1−ϵ), when the system applies the robust dispatch solutions, the actual dispatch cost under the true demand is smaller than the optimal cost of the robust dispatch problem. Figures 4.24(a) and 4.24(b) show the cross-validation testing result that the probabilistic guarantee level is reached for both
[Figure 24(a) plot: "Percentage of experiments with satisfied constraints". Horizontal axis: ϵ (0 to 1); vertical axis: percentage value (0 to 100%). Series: experiments, 1−ϵ.]
(a) Comparison result with the box type of uncertainty set.
[Figure 24(b) plot: "Percentage of satisfied constraints for SOC model". Horizontal axis: ϵ (0 to 1); vertical axis: percentage value (0 to 100%). Series: experiments, 1−ϵ.]
(b) Comparison result with the SOC type of uncertainty set. The true percentage value is closer to the value of 1−ϵ compared with the solution given a box type uncertainty set.
Figure 24: The percentage of tests that have a smaller true dispatch cost than the optimal cost of the robust dispatch problem with the box and SOC types of uncertainty sets constructed from data. When 1−ϵ decreases, the percentage value also decreases, but it is always greater than 1−ϵ.
[Figure 25(a) plot: "Optimal and average cost of robust dispatch", box uncertainty set. Horizontal axis: ϵ (0 to 1); vertical axis: cost (0.75 to 1.25, × 10^5). Series: robust solutions, optimal cost of (11), non-robust solutions.]
(a) Comparison result with the box type of uncertainty set. When ϵ = 0.3, the average cost is the smallest.
[Figure 25(b) plot: SOC uncertainty set. Horizontal axis: ϵ (0 to 1); vertical axis: cost (0.75 to 1.25, × 10^5). Series: robust solutions, non-robust solutions.]
(b) Comparison result with the SOC type of uncertainty set. When ϵ = 0.25, the average cost is the smallest.
Figure 25: Comparisons of the optimal cost of the robust dispatch problem with box and SOC types of uncertainty sets and the average cost when applying the robust solutions for the test subset of sampled rc.
box type and SOC type of uncertainty sets via solving (4.25) and (4.26), respectively. Comparing these two figures, one key insight is that the robust dispatch solution with an SOC type uncertainty set provides a tighter bound on the probabilistic guarantee level that can be reached under the true random demand than the solutions with the box type uncertainty set. This shows the advantage of considering second-order moment information of the random vector, although problem (4.26) is computationally more expensive to solve than problem (4.25).
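The cross-validation check of the probabilistic guarantee level can be sketched as follows: count the fraction of testing samples whose actual dispatch cost does not exceed the optimal cost of the robust problem, and compare it with 1−ϵ. The function names and the use of a non-strict inequality are assumptions made for illustration; the costs themselves would come from solving (4.25) or (4.26) and evaluating the solution on each testing sample.

```python
def empirical_guarantee(test_costs, robust_optimal_cost):
    """Fraction of testing samples whose actual cost is at most the robust optimum."""
    satisfied = sum(1 for cost in test_costs if cost <= robust_optimal_cost)
    return satisfied / len(test_costs)


def guarantee_reached(test_costs, robust_optimal_cost, epsilon):
    """True if the empirical fraction is at least the target level 1 - epsilon."""
    return empirical_guarantee(test_costs, robust_optimal_cost) >= 1.0 - epsilon
```

Under this reading, a tighter uncertainty set is one for which the empirical fraction stays above 1−ϵ while remaining close to it.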
How the probabilistic guarantee level affects the average cost: There exists a trade-off between the probabilistic guarantee level and the average cost with respect to a random vector rc. The choice of ϵ is case by case, depending on whether a performance guarantee for the worst-case scenario or the average cost performance is more important. For a high probabilistic guarantee level, i.e., a large 1−ϵ value, the average cost may not be good enough, since we minimize a worst case that rarely happens in the real world. When the 1−ϵ value is relatively small, the average cost can also be large, since many possible values of the random vector are not considered.
We compare the optimal cost of the robust solutions and the average cost of the empirical tests for the two types of uncertainty sets, obtained by solving (4.25) and (4.26), in Figures 4.25(a) and 4.25(b), respectively. The optimal cost of the robust dispatch framework is the minimized worst-case cost over all possible rc included in the uncertainty set, and the average cost of the empirical tests reflects the real-world scenario in which we apply the optimal solution to dispatch taxis under random demand rc. The horizontal line shows the average cost of the non-robust solutions, since this cost does not depend on ϵ. According to the experiments, the ϵ values that give the best average costs are not exactly the same for the different types of uncertainty sets. For the box type of uncertainty set shown in Figure 4.25(a), ϵ = 0.3 gives the smallest average experimental cost, and for the SOC type of uncertainty set shown in Figure 4.25(b), ϵ = 0.25 gives the smallest average experimental cost.
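Choosing ϵ by average empirical cost, as described above, amounts to a small grid search: for each candidate ϵ, average the test costs of the corresponding robust solution and keep the minimizer. The sketch below assumes the per-sample test costs have already been computed for each ϵ; the function names are hypothetical and the numbers are placeholders for illustration, not experimental results.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def best_epsilon(test_costs_by_epsilon):
    """Return (epsilon, average cost) for the epsilon with the smallest average test cost."""
    averages = {eps: mean(costs) for eps, costs in test_costs_by_epsilon.items()}
    eps_star = min(averages, key=averages.get)
    return eps_star, averages[eps_star]


# Placeholder per-sample costs for three candidate epsilon values (illustrative only).
placeholder_costs = {0.1: [1.10, 1.20], 0.25: [0.90, 1.00], 0.5: [1.00, 1.10]}
eps_star, avg_cost = best_epsilon(placeholder_costs)
```

This selects ϵ for average performance only; when the worst-case guarantee matters more, a smaller ϵ may still be preferable despite a higher average cost.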
The minimum average cost of the SOC robust dispatch solution is smaller than that of the box type. This indicates that, for the dataset used in this section, the second-order moment information of the random variable should be included when modeling the uncertainty set and computing robust dispatch solutions, even though its computational cost is higher.