3.6 Online Stochastic Optimisation

Online stochastic optimisation algorithms make decisions one time step at a time, using stochastic information for any unknowns. After each time step, more parameter values become known and the results of the actions taken are revealed. Decisions for the next period are then computed, and the process is repeated. Online stochastic optimisation has been used successfully on a wide variety of problems (e.g., see Powell et al. [2012], Van Hentenryck and Bent [2006]). Our algorithms make use of a receding horizon, as illustrated in figure 3.2. Optimisation is performed for each horizon using stochastic information for any unrevealed parameters, and then the actions for the first time step are executed in the real world. In the formulations that follow we assume that the horizon is T time steps long and that the first step in the horizon corresponds to t = 1. The online algorithm performs its optimisation for the horizon when the current time is equal to the time at the start of the horizon, in this case τ = τ0.

Figure 3.2: Receding horizon for 3 consecutive iterations.
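The receding-horizon loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work: the callables `forecast`, `optimise`, `observe` and `execute`, and the toy tank model in the demo, are all hypothetical names and numbers introduced only for the example.

```python
def run_receding_horizon(state, n_steps, horizon, forecast, optimise, observe, execute):
    """Generic receding-horizon loop (all callables are hypothetical placeholders)."""
    history = []
    for tau0 in range(n_steps):
        predicted = forecast(tau0, horizon)   # stochastic information for unrevealed parameters
        plan = optimise(state, predicted)     # decisions for t = 1, ..., T
        realised = observe(tau0)              # uncertainty revealed during the first step
        state, applied = execute(state, plan[0], realised)  # only the first step is executed
        history.append(applied)
    return state, history


# Toy demo (assumed numbers): keep a store of energy near a target of 5 units.
def forecast(tau0, horizon):
    return [1.0] * horizon                    # predicted demand for each step

def optimise(state, predicted):
    plan, energy = [], state
    for demand in predicted:
        power = max(0.0, 5.0 - energy + demand)
        plan.append(power)
        energy = energy + power - demand
    return plan

def observe(tau0):
    return [0.5, 1.5, 1.0][tau0 % 3]          # demand that actually occurs

def execute(state, action, realised):
    applied = min(action, max(0.0, 6.0 - state + realised))  # never exceed 6 units
    return state + applied - realised, applied

final_state, history = run_receding_horizon(5.0, 3, 3, forecast, optimise, observe, execute)
```

Only `plan[0]` is ever applied; the rest of each plan is discarded and recomputed at the next iteration once more information is available.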

The next section discusses the executives that are used to implement the scheduled decisions. The sections after that present two approaches to solving the stochastic optimisation part of the problem for each horizon: an expectation algorithm and a 2-stage algorithm.

3.6.1 Executives

Some parameters might still be uncertain in the first time step, which means that it might not be possible to execute the decisions made by the optimisation directly as given. For example, if the hot water heater is scheduled to run at full power, but the demand for hot water turns out to be less than expected, then the tank might be heated beyond its limits.

In order to manage this problem, we develop simple executives which take in the scheduled actions and modify them based on what actually occurs in real-time as the uncertainty is revealed. Our EMS therefore comprises two parts: an online scheduling algorithm, and an executive controller that implements the scheduled decisions for the next time step using very simple policies that can run in real-time. When shorter time steps are used in the online scheduling, the intervention of an executive becomes less necessary.

The core action variables for the devices are: the power into the device for the battery, EV and hot water tank; the heating and cooling power of the underfloor heat pump; and the start time of the shiftable loads. The executive modifies these actions based on what happens during the first time step in real-time, and calculates the resulting values of all other state or auxiliary variables at the end of the time step. Using the hot water heating example from above: if the tank reaches its upper limit during the first time step, then the executive cuts back the scheduled power so that the tank is not overheated.
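To make the hot water example concrete, the following sketch shows one way such an executive could cut back the scheduled power. It assumes a lossless tank with a simple energy balance (end energy = start energy + heating energy - energy drawn), which is a simplification and not necessarily the tank model of section 3.4; the function name, parameters and numbers are hypothetical.

```python
def hot_water_executive(p_sched, e0, demand_energy, dt_hours, e_min, e_max, p_rated):
    """Clamp the scheduled heating power (kW) so the tank energy (kWh) stays in range.

    Assumes a lossless tank; all names and values are illustrative only.
    Returns the power actually applied and the tank energy at the end of the step.
    """
    # Power that would bring the tank exactly to its upper / lower energy limit.
    p_hi = (e_max - e0 + demand_energy) / dt_hours
    p_lo = (e_min - e0 + demand_energy) / dt_hours
    # Respect the physical range of the heating element.
    p_hi = min(max(p_hi, 0.0), p_rated)
    p_lo = min(max(p_lo, 0.0), p_hi)
    p_applied = min(max(p_sched, p_lo), p_hi)
    e_end = e0 + p_applied * dt_hours - demand_energy
    return p_applied, e_end


# Scheduled at full power, but less hot water is drawn than expected:
p_low, e_low = hot_water_executive(3.6, 9.0, 0.5, 0.5, 2.0, 10.0, 3.6)
# Demand as large as expected, so the schedule is executed unchanged:
p_ok, e_ok = hot_water_executive(3.6, 9.0, 2.0, 0.5, 2.0, 10.0, 3.6)
```

In the first call the executive reduces the power so that the tank finishes the step exactly at its upper limit; in the second call no intervention is needed.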

3.6.2 Expectation Formulation

The expectation online algorithm takes the conditional expected value of any unknown parameters in the optimisation horizon, and solves the deterministic version of the problem given in equations (3.6–3.9). We use the term expected value loosely because, in truth, the expected value is used only where it makes sense, which is typically for continuous inputs. For the rest of the inputs, the most likely value is calculated instead. For example, the expected value is used for outdoor temperatures and the most likely value is used for the washing machine requests.
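The distinction between the two kinds of point estimate can be illustrated with a short sketch. It approximates both quantities from a set of conditional samples and takes the most likely value separately at each time step; both choices are assumptions made for the example rather than the calculation used in this work, and the function name and data are hypothetical.

```python
from collections import Counter
from statistics import fmean

def point_forecast(samples, continuous):
    """Collapse sampled paths into one value per time step.

    samples: list of paths, each a list with one value per time step.
    continuous: True  -> sample mean (estimate of the expected value),
                False -> most frequent value (estimate of the most likely value).
    """
    n_steps = len(samples[0])
    forecast = []
    for t in range(n_steps):
        column = [path[t] for path in samples]
        if continuous:
            forecast.append(fmean(column))
        else:
            forecast.append(Counter(column).most_common(1)[0][0])
    return forecast


# Outdoor temperature (continuous) and washing machine request (0 or 1).
temperature = point_forecast([[20.0, 22.0], [22.0, 24.0], [21.0, 23.0]], continuous=True)
request = point_forecast([[0, 1], [0, 1], [1, 1]], continuous=False)
```

Averaging the request samples would give a fractional request such as 1/3, which has no physical meaning for a shiftable load; this is why the most likely value is the sensible choice for such inputs.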

Both of these calculations are performed using the joint distribution for any unknown parameters in the horizon, conditioned on any known parameters in the horizon and prior to it. For example, assuming that the random parameters have a dependence that stretches at most Q time steps into the past, the probabilities of the unknown random parameters in the horizon are given by:

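\[
P\bigl( W_{t,k} = w_{t,k} \ \ \forall k \notin K_{\tau_0,t},\ \forall t \in \{1, \dots, T\} \ \bigm|\ W_{t,k} = w^{*}_{t,k} \ \ \forall k \in K_{\tau_0,t},\ \forall t \in \{-Q, \dots, T\} \bigr) \tag{3.25}
\]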

3.6.3 2-Stage Formulation

In this algorithm, 2-stage stochastic programming is used within each horizon. This provides an approximation to a full multi-stage stochastic program, a class of problem that is, in general, known to be extremely computationally challenging [Shapiro, 2006].

The 2-stage algorithm uses more information from the random parameter probability distributions by working directly with samples instead of the expectation. Figure 3.3 provides a selection of samples and the expected value for outdoor temperature over 12 hours. This highlights how a collection of samples can capture the variance of the random parameter.
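The point about variance can be illustrated numerically. The sketch below draws temperature paths from a first-order autoregressive model, which is purely an assumption for the example (the stochastic model used in this work is not restated here), and compares the sample mean with the spread of the samples at each hour. All parameter values are hypothetical.

```python
import random
from statistics import fmean, pstdev

def sample_temperature_paths(last_obs, n_samples, horizon,
                             mean=24.0, phi=0.9, sigma=1.0, seed=0):
    """Draw paths from an assumed AR(1) model, conditioned on the last observation."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_samples):
        temp, path = last_obs, []
        for _ in range(horizon):
            # Revert toward the long-run mean and add Gaussian noise.
            temp = mean + phi * (temp - mean) + rng.gauss(0.0, sigma)
            path.append(temp)
        paths.append(path)
    return paths


paths = sample_temperature_paths(last_obs=20.0, n_samples=200, horizon=12)
expected = [fmean([p[t] for p in paths]) for t in range(12)]   # one value per hour
spread = [pstdev([p[t] for p in paths]) for t in range(12)]    # lost when only the mean is kept
```

The `expected` path is a single smooth trajectory, while `spread` grows with the look-ahead time; an optimisation that sees only `expected` has no information about how far the realised temperature may deviate from it.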

Traditionally, in 2-stage stochastic programming there is no uncertainty in the first stage [Shapiro et al., 2009]. The most natural choice of a first stage is the first time step but, as we have discussed, the first time step may have uncertainty in it. There are two ways forward.

[Figure 3.3 plot: outdoor temperature (degC, axis from 16 to 34) against time (hrs, axis from 0 to 12).]

Figure 3.3: Five conditional samples and expected value (dashed line) for outdoor temperatures over 12 hours.

The first approach is to treat the first time step as fully deterministic, by taking the expected values of uncertain parameters in the first time step (as done in the previous section). The first stage includes time step 1, and the second stage time steps 2, . . . , T. The scenarios in the second stage are then sampled from the joint distribution of random parameters in the second stage, conditioned on any known parameters in and prior to the second stage, and the expected values that were calculated for the unknown first stage parameters. Using $w^{\dagger}_{1,k}$ to represent the expected values in the first stage, the joint distribution to sample from is:

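\[
P\bigl( W_{t,k} = w_{t,k} \ \ \forall k \notin K_{\tau_0,t},\ \forall t \in \{2, \dots, T\} \ \bigm|\ W_{t,k} = w^{*}_{t,k} \ \ \forall k \in K_{\tau_0,t},\ \forall t \in \{-Q, \dots, T\},\ \ W_{1,k} = w^{\dagger}_{1,k} \ \ \forall k \notin K_{\tau_0,1} \bigr) \tag{3.26}
\]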

The sample average approximation is used to limit the problem to m scenarios, which are given by the set S := {1, . . . , m}. We subscript variables and parameters by an s ∈ S to indicate which scenario they belong to. The 2-stage optimisation problem is:

\begin{align}
\min_{x_{d,t,s},\,P_{t,s}} \quad & \frac{1}{m} \sum_{s \in S} \left[ \sum_{t=1}^{T} \Delta\tau_t \lambda_{t,s} P_{t,s} + \sum_{d \in D} \sum_{t=1}^{T} f_d(x_{d,t,s}, r_{d,t,s}) \right] \tag{3.27} \\
\text{s.t.} \quad & x_{d,1,s_1} = x_{d,1,s_2} && \forall d \in D,\ s_1, s_2 \in S \tag{3.28} \\
& P_{1,s_1} = P_{1,s_2} && \forall d \in D,\ s_1, s_2 \in S \tag{3.29} \\
& P_{t,s} \in [\underline{P}, \overline{P}] && \forall t \in \{1, \dots, T\},\ s \in S \tag{3.30} \\
& P_{t,s} = P^{b}_{t,s} + \sum_{d \in D} h_d(x_{d,t,s}, r_{d,t,s}) && \forall t \in \{1, \dots, T\},\ s \in S \tag{3.31} \\
& g_d(x_{d,t,s}, r_{d,t,s}, x_{d,t-1,s}, r_{d,t-1,s}) \le 0 && \forall d \in D,\ t \in \{1, \dots, T\},\ s \in S \tag{3.32}
\end{align}

Equations (3.28) and (3.29) tie together the variables in the first time step (first stage) so that they share a common value. As discussed above, expected values were used for all uncertain parameters in the first time step, which means that for a device d and any two scenarios $s_1$ and $s_2$: $r_{d,1,s_1} = r_{d,1,s_2}$.

The second 2-stage approach is to represent the behaviour of the executives for the first time step within the scheduler. The first stage then only represents the first time step device action variables. The second stage covers the whole horizon, where the scenarios are sampled from the joint distribution given by (3.25). The first stage variables are linked with the first time step in the second stage through constraints that implement the device executives.

For example, for the hot water heating (with reference to its model in section 3.4), we link the first stage action variable $P_0$ to the second stage variable $P_{1,s}$ in the first time step for scenario s with the relation:

\[
P_{1,s} = \min\bigl(\max(P_0, \underline{P}_s), \overline{P}_s\bigr) \tag{3.33}
\]

The parameters $\underline{P}_s$ and $\overline{P}_s$ are calculated for the particular scenario of interest, depending on the values of $P^{d}_{1,s}$ and $T^{o}_{1,s}$, and the amount of energy in the tank at the beginning of the horizon, $E_0$. They represent the minimum and maximum amount of power that can go toward heating the tank in order to satisfy the minimum tank energy heating trigger, and so that the tank does not exceed its maximum energy. The relation above can be implemented as a piecewise-linear constraint between the two variables $P_{1,s}$ and $P_0$.
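To see why relation (3.33) is piecewise linear, it can be written out case by case. Assuming that the lower bound does not exceed the upper bound for the scenario in question, the nested min and max are equivalent to:

```latex
\[
P_{1,s} =
\begin{cases}
\underline{P}_s & \text{if } P_0 < \underline{P}_s, \\
P_0 & \text{if } \underline{P}_s \le P_0 \le \overline{P}_s, \\
\overline{P}_s & \text{if } P_0 > \overline{P}_s.
\end{cases}
\]
```

The scheduled power is therefore passed through unchanged whenever it lies within the scenario's feasible range, and is clipped to the nearest bound otherwise.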

We do not formalise the second approach here or provide its results in detail because, as we will discuss in section 3.8.2, it does not produce results that are much different from the first approach.
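As a closing illustration of the first 2-stage approach, the structure of problem (3.27) to (3.32) can be reproduced on a toy instance. The sketch below is not the formulation solved in this work: it assumes a horizon of two one-hour time steps, a single load whose energy need differs by scenario, non-negative prices, and a grid search in place of a proper solver. The function name and all numbers are hypothetical.

```python
def solve_two_stage(price_1, scenarios, p_max, n_grid=30):
    """Toy sample average approximation with one shared first-stage decision.

    scenarios: list of (price_2, energy_need) pairs, one per scenario.
    The first-step power p1 takes the same value in every scenario, which
    plays the role of the tying constraints (3.28) and (3.29).
    """
    m = len(scenarios)
    best = None
    for i in range(n_grid + 1):
        p1 = p_max * i / n_grid
        total, feasible = 0.0, True
        for price_2, need in scenarios:
            # Second-stage recourse: buy only what is still needed
            # (cheapest choice when prices are non-negative).
            p2 = max(0.0, need - p1)
            if p2 > p_max + 1e-9:
                feasible = False   # this p1 cannot serve the scenario
                break
            total += price_1 * p1 + price_2 * p2
        if feasible:
            cost = total / m       # sample average over the m scenarios
            if best is None or cost < best[1]:
                best = (p1, cost)
    return best


best_p1, best_cost = solve_two_stage(
    price_1=0.28,
    scenarios=[(0.10, 2.0), (0.30, 4.0), (0.50, 3.0)],
    p_max=3.0,
)
```

In this instance the first-stage decision hedges between scenarios: buying more in the first step protects against the expensive second-step prices, but is wasted in the scenario where little energy is needed. A practical implementation would pass the full problem to a linear or mixed-integer solver rather than enumerate a grid.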