Sequential decision-making under uncertainty


5 UNCERTAINTY ANALYSIS

5.2 Methods to address uncertainty in energy modelling

5.2.4 Sequential decision-making under uncertainty

Sequential decision-making under uncertainty differs from the previously discussed methods in that optimal policies are determined at more than one point in time, taking learning into account. Manne et al. (1991, p. 545) have described uncertainty propagation in optimisation models as “learn now then act” and sequential decision-making, in contrast, as “act now then learn”. In uncertainty propagation all input variables are known in advance, whereas in sequential decision-making not all information is available at the beginning of the model period, so the model has to “act” first and later adapt to new information once uncertainty is resolved. It is assumed that there are one or more points in time at which policy makers make decisions in reaction to outcomes, and that their knowledge increases over time.

Sequential decision-making under uncertainty is implemented in energy models via stochastic optimisation. Two methods can be distinguished for converting problems into solvable stochastic optimisation problems: decision trees and mean-variance modelling. The latter is based on Markowitz’ mean-variance method (Markowitz 1952), in which parameters are replaced by a distribution function characterised by a mean and a variance in a linear optimisation approach (for an example see Yu 2003).

The most common way of applying two-stage stochastic programming is via decision trees. The analyst has to define the uncertain variable(s) and decide how many alternatives, i.e. branches, should be considered for each of them. These alternatives are either states of the world or new distributions with a different mean and/or a reduced variance. In the next step, a probability for each branch and the period in which uncertainty is resolved have to be defined. In the case of multi-stage stochastic programming, where uncertainty is not completely resolved at one point in time, multiple uncertainty resolution times are determined. Finally, the model is solved to obtain results on optimal decision-making under uncertainty.
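The steps of the decision tree formulation can be illustrated with a minimal sketch. It is not taken from any of the cited models: the three branches, their probabilities, the required abatement levels and both cost parameters are assumptions chosen only to keep the arithmetic transparent, and a brute-force search stands in for a proper optimisation solver.

```python
# Toy two-stage decision tree; every number below is an illustrative assumption.
# Each branch: (probability, abatement required once uncertainty is resolved).
BRANCHES = {
    "lenient target": (0.5, 20),
    "medium target": (0.3, 40),
    "strict target": (0.2, 60),
}
EARLY_COST = 1.0  # cost per unit abated in the first stage (before resolution)
LATE_COST = 3.0   # cost per unit abated in the second stage (after resolution)


def recourse(first_stage, required):
    """Second-stage abatement still needed in one branch."""
    return max(0, required - first_stage)


def expected_cost(first_stage):
    """First-stage cost plus probability-weighted second-stage cost."""
    late = sum(p * LATE_COST * recourse(first_stage, req)
               for p, req in BRANCHES.values())
    return EARLY_COST * first_stage + late


# Brute-force search over integer first-stage decisions (stand-in for a solver).
candidates = range(0, 61)
hedge = min(candidates, key=expected_cost)

# The second-stage decision is branch-specific; the first-stage one is not.
recourse_plan = {name: recourse(hedge, req)
                 for name, (p, req) in BRANCHES.items()}

print("first-stage abatement:", hedge)
print("second-stage abatement by branch:", recourse_plan)
print("expected cost:", round(expected_cost(hedge), 2))
```

With these assumed numbers the single first-stage decision lies between the extremes of the branches, and additional second-stage abatement is only needed in the strict branch.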

In this context, two sets of decisions can be distinguished. On the one hand, a number of decisions are taken before the resolution of uncertainty; this period is called the first stage or hedging period. On the other hand, a number of decisions are taken after the resolution of uncertainty; the period associated with those decisions is called the second stage or recourse period (Birge and Louveaux 1997, p. 52). The set of second-stage decisions can differ depending on the outcome, while the set of first-stage decisions cannot. During the first stage, a strategy composed of contingent actions is followed that takes into account all possible outcomes and their probabilities.

The main goal of stochastic programming is to identify hedging strategies, which balance the risks of waiting against those of premature action (Rotmans and van Asselt 2001, p. 118). Hedging can be regarded as a strategy that builds a contingency plan and responds to opportunities and dangers as they are resolved (Kann and Weyant 2000, p. 38). This is in contrast to a strategy that only takes the average of different policies, each of which is optimal for a different state of the world. Stochastic modelling can thus give insights beyond those gained from comparing several runs of a deterministic model. An illustrative example is the stochastic definition of a CO2 reduction target, where the model chooses an emission path in the first stage from which it is always possible to meet all specified final targets. Deliberations include the trade-off between waiting to learn more and incurring higher damage, or between waiting to learn more and forgoing beneficial effects from induced technological learning. In addition, stochastic programming can yield interesting results on robust technologies, i.e. those that are chosen during the first stage of the optimisation problem. Furthermore, the recourse strategy after the resolution of uncertainty can reveal insights into the flexibility of the energy system if an unlikely event occurs. It could be interesting, for example, to see what the consequences are if an investment opportunity in a low-carbon technology opens up after uncertainty is resolved.
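The difference between a hedging strategy and the comparison of deterministic runs can be illustrated with a small cost table. The sketch below is hypothetical throughout: the three portfolios, the two states of the world, their probabilities and all cost figures are assumed for illustration only.

```python
# Illustrative cost table (assumed numbers): total system cost of three
# hypothetical first-stage portfolios in two states of the world.
PROBABILITY = {"low carbon price": 0.5, "high carbon price": 0.5}
COST = {
    "fossil-heavy":    {"low carbon price": 10, "high carbon price": 40},
    "renewable-heavy": {"low carbon price": 30, "high carbon price": 14},
    "mixed":           {"low carbon price": 16, "high carbon price": 20},
}

# Deterministic runs: the best portfolio if each state were known for certain.
best_per_state = {
    state: min(COST, key=lambda name: COST[name][state])
    for state in PROBABILITY
}


def expected_cost(name):
    """Probability-weighted cost of one portfolio across all states."""
    return sum(PROBABILITY[s] * COST[name][s] for s in PROBABILITY)


# Stochastic run: the best portfolio in expectation over both states.
hedge = min(COST, key=expected_cost)

print("deterministic optima:", best_per_state)
print("hedging portfolio:", hedge, "with expected cost", expected_cost(hedge))
```

In this constructed case the mixed portfolio is not optimal in either deterministic run, yet it has the lowest expected cost, so it would not be found by comparing deterministic runs alone.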

Peterson (2006, p. 11) summarises several models that have applied sequential decision-making under uncertainty, with Peck et al. (1993) and Manne et al. (1991) being among the first to apply stochastic programming to an energy model.

The expected value of perfect information (EVPI) is often used in the context of stochastic programming to quantify the value of having the information about the uncertain variables available from the start. More precisely, the EVPI is the difference between the expected value obtained if the state of the world is known before a policy must be adopted and the expected value obtained if a single policy must be adopted and then applied across all possible states of the world. The EVPI measures the maximum amount a decision maker would be willing to pay in return for complete information about the development of the uncertain variable(s) concerned (Birge and Louveaux 1997, p. 137). Peck et al. (1993, p. 94) noted in this context that the value of information for two or more variables treated together can be larger than the sum of the values of information for the individual variables.
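The definition can be made concrete with a minimal sketch. All numbers are assumed for illustration, and because the toy problem minimises cost rather than maximising value, the sign of the difference is reversed: the EVPI is computed here as the expected cost of the single policy minus the expected cost with perfect information.

```python
# Toy numbers, all assumed for illustration.
# Each state of the world: (probability, abatement required).
STATES = [(0.5, 20), (0.3, 40), (0.2, 60)]
EARLY_COST, LATE_COST = 1.0, 3.0   # cost per unit abated early / late
DECISIONS = range(0, 61)           # candidate early abatement levels


def cost(early, required):
    """Cost of early abatement plus late abatement to close any gap."""
    late = max(0, required - early)
    return EARLY_COST * early + LATE_COST * late


# Single policy: one early decision must serve all states of the world.
here_and_now = min(
    sum(p * cost(x, req) for p, req in STATES) for x in DECISIONS
)

# Perfect information: the best early decision is chosen for each state.
wait_and_see = sum(
    p * min(cost(x, req) for x in DECISIONS) for p, req in STATES
)

evpi = here_and_now - wait_and_see
print("single policy:", round(here_and_now, 2))
print("perfect information:", round(wait_and_see, 2))
print("EVPI:", round(evpi, 2))
```

Under these assumptions the single policy has an expected cost of 52 and the perfect-information case one of 34, so a decision maker would pay at most 18 cost units for complete information.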

A drawback of this concept is that the value of information depends largely on the dispersion of the distribution assigned to a variable, which is a subjective estimate. Usher (2011) found the EVPI to be at a maximum when uncertainty is maximised, i.e. when all possible outcomes have the same probability. Although the EVPI gives a precise number for the value of having information available, this number is based on subjective assumptions. Moreover, a value of information for an individual input variable is most likely not what decision makers are looking for. They are more interested in joint values, which are difficult to obtain due to complex calculations and correlations among variables.

The concept of sequential decision-making under uncertainty comes with several shortcomings. In general, energy modelling involves a very large number of uncertainties, which cannot all be taken into account due to incomplete knowledge and computational limitations. As the number of branches increases exponentially with the number of uncertain inputs, the analyst needs to limit the number of uncertain variables that are considered. This makes an exhaustive representation of uncertainty impossible. In addition, stochastic modelling assumes only a few variables to be uncertain, whereas the others are assumed to be fully known. Results therefore reflect the certainty associated with the deterministic variables, which can lead to a preference for a known technology independent of the uncertainty characterisation of the stochastic variables.
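The exponential growth in the number of branches can be sketched as follows. The three uncertain inputs and their alternatives are hypothetical, and the sketch assumes a single resolution time at which all inputs are resolved jointly.

```python
from itertools import product

# Hypothetical uncertain inputs, each discretised into three alternatives.
ALTERNATIVES = {
    "gas price": ["low", "medium", "high"],
    "carbon target": ["lenient", "medium", "strict"],
    "technology cost": ["low", "medium", "high"],
}

# Every combination of alternatives is one branch of the decision tree.
branches = list(product(*ALTERNATIVES.values()))
print(len(branches))  # 3 alternatives ** 3 inputs = 27


def branch_count(alternatives_per_input, n_inputs):
    """Number of branches if every input has the same number of alternatives."""
    return alternatives_per_input ** n_inputs


for n in (1, 3, 5, 10):
    print(n, "uncertain inputs:", branch_count(3, n), "branches")
```

With three alternatives per input, ten uncertain inputs already produce 59,049 branches, each of which needs its own set of second-stage decision variables.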

Concerning the assessment of variable uncertainties, stochastic programming suffers from the same problem as uncertainty propagation. While in some cases it is possible to

generally not available. This makes the analyst’s choice of probabilities arbitrary. Furthermore, the extent of possible correlations among uncertain variables is not known. An expert elicitation, i.e. a survey among experts, is one way to obtain a meaningful uncertainty profile.

Stochastic programming using the decision tree formulation with states of the world suggests that there is one point in time in the future at which perfect information becomes available all at once. This is not the case. Instead, there is a continuing process of updating best estimates over time as information is developed (Peck and Teisberg 1993, p. 86). Reducing decision-making to one decision every 20 years or so is an oversimplification of the process, since adjustments to policies are made continuously as information is updated. In addition, released information is obscured by noise and by an imperfect understanding of social and technological dynamics, so that it cannot be considered perfect.

In document Decomposing long-run carbon abatement cost curves - robustness and uncertainty (Page 173-176)