2.3 An Approximate Dynamic Programming Formulation
2.3.2 Value function approximation
In this subsection, we discuss how to update the value function gradient approximation $\bar{V}_h^{+,n-1}$. Let $v_h^{+,n}$ denote a sample estimate of the marginal value of increasing the post-decision PHEV backlog at time $h$, $y_h^{+,n}$, by one unit. The proposed scheme to obtain $v_h^{+,n}$ involves approximating and updating wholesale electricity prices. We use $\bar{P}_h^n$ to denote the approximation of the wholesale electricity price at hour $h$, computed in iteration $n$. The initial wholesale electricity price approximation associated with any hour is assumed to be 0; that is, $\bar{P}_h^1 = 0$, $1 \le h \le H$. Let $p_h^n$ denote a new estimate of the wholesale electricity price at time $h$, obtained at iteration $n$. Starting from iteration $n = 2$, at each hour $h$, after a charging decision $z_h^{+,n}$ is determined from (2.18)–(2.24) and a specific realization of exogenous information at time $h$, $\omega_h^n$, is known to the system, a real-time economic dispatch problem is solved to obtain $p_h^n$. The real-time economic dispatch is performed by a system operator to determine the after-the-fact wholesale electricity price at time $h$. The objective of the real-time economic dispatch is to minimize the cost of satisfying the actual electricity demand, written as
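\[
\min_{g_{hj},\, w_h,\, q_h} \; C_h^{\mathrm{disp}}(S_h^n, x_h), \tag{2.26}
\]
subject to the following constraints:
\[
\sum_{j=1}^{J} g_{hj} + w_h + q_h = D_h + D_h^{0} + \sum_{l=1}^{L} CP \times z^{+,n}_{\{h-l+1\}>0}; \tag{2.27}
\]
\[
0 \le g_{hj} \le G_j, \quad 1 \le j \le J; \tag{2.28}
\]
\[
0 \le w_h \le \beta_h^n \times W; \tag{2.29}
\]
\[
q_h \ge 0. \tag{2.30}
\]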
A particular realization of the wind availability factor $\beta_h^n$ in (2.29) is sampled for iteration $n$ using Monte Carlo simulation based on a time-series model. Details on the modeling of wind power production will be presented in Section 2.4.3. The dual variable of the power balance constraint (2.27) is the ex post wholesale electricity price associated with this particular sample path, and it can be used as a new estimate of the wholesale electricity price.
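To make the link between the dispatch problem and the price estimate concrete, the following is a minimal sketch under assumptions that are not stated in the text: every source has a constant marginal cost, wind is free, and $q_h$ is treated as an uncapped resource with a high cost. Under those assumptions the linear program can be solved in merit order, and away from degenerate points (demand exactly equal to a cumulative capacity) the cost of the last unit dispatched coincides with the dual of the power balance constraint. The function name `dispatch_price` and all numerical values are hypothetical.

```python
from typing import List, Tuple


def dispatch_price(
    demand_kw: float,
    gen_caps_kw: List[float],
    gen_costs: List[float],
    wind_cap_kw: float,
    wind_factor: float,
    backup_cost: float,
) -> Tuple[float, float]:
    """Merit-order solution of a dispatch problem shaped like (2.26)-(2.30).

    Assumptions (not from the text): constant marginal costs, zero-cost wind,
    and q_h modeled as an uncapped resource priced at backup_cost.
    Returns (total cost, cost of the marginal unit, used as the price).
    """
    # (capacity, marginal cost) pairs; wind is capped at beta * W as in (2.29).
    sources = [(wind_factor * wind_cap_kw, 0.0)]
    sources += list(zip(gen_caps_kw, gen_costs))  # 0 <= g_hj <= G_j, (2.28)
    sources.append((float("inf"), backup_cost))   # q_h >= 0, (2.30)
    sources.sort(key=lambda s: s[1])              # cheapest source first

    remaining, total_cost, price = demand_kw, 0.0, 0.0
    for cap, cost in sources:
        if remaining <= 0:
            break
        used = min(cap, remaining)
        if used > 0:
            total_cost += used * cost
            price = cost  # cost of the last unit dispatched so far
            remaining -= used
    return total_cost, price


# Hypothetical data: 50 kW of wind, then 100 kW at 20 and 50 kW at 50.
cost, price = dispatch_price(200.0, [100.0, 100.0], [20.0, 50.0], 100.0, 0.5, 1000.0)
print(cost, price)
```

With a general cost function $C_h^{\mathrm{disp}}$, a linear or nonlinear programming solver that reports constraint duals would be needed instead of the merit-order shortcut.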
We now use the new estimate $p_h^n$ to update the wholesale electricity price approximation according to the following equation
\[
\bar{P}_h^n = (1-\alpha^P_{n-1}) \times \bar{P}_h^{n-1} + \alpha^P_{n-1} \times p_h^n, \quad 2 \le n \le N, \; 1 \le h \le H; \tag{2.31}
\]
where $\alpha^P_{n-1} \in (0,1)$ is a step-size; and, the common practice is to use a constant step-size.
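The smoothing update (2.31) can be sketched in a few lines. This is an illustrative sketch only: the function name `update_price_approx`, the list-based storage of hourly prices, and the step-size value 0.5 are assumptions made for the example.

```python
from typing import List


def update_price_approx(
    p_bar_prev: List[float], p_new: List[float], alpha: float
) -> List[float]:
    """Apply (2.31) hour by hour: (1 - alpha) * P_bar^{n-1} + alpha * p^n."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("step-size must lie strictly between 0 and 1")
    if len(p_bar_prev) != len(p_new):
        raise ValueError("both price vectors must cover the same hours")
    return [(1.0 - alpha) * old + alpha * new for old, new in zip(p_bar_prev, p_new)]


# Start from the zero initialization P_bar^1 = 0 and apply two updates.
p_bar = [0.0, 0.0]
p_bar = update_price_approx(p_bar, [40.0, 20.0], 0.5)
p_bar = update_price_approx(p_bar, [40.0, 20.0], 0.5)
print(p_bar)
```

Because the approximation starts at zero, a constant step-size leaves the early iterates below the observed prices; in this example the approximation moves from 0 to 20 to 30 for an observed price of 40.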
Using $\bar{P}_h^n$, $1 \le h \le H$, a new sample estimate of the marginal value of increasing the post-decision PHEV backlog $y_h^{+,n}$ (denoted as $v_h^{+,n}$) can be obtained, as illustrated in Figure 2.3. We could increase the number of empty batteries at time $h$, $y_h^{+,n}$, by one unit by charging one less unit of batteries at time $h$. Doing so has two consequences in the remaining hours of the day. First, in the very next $L-1$ hours, $h+1 \le \tau \le h+L-1$, $CP$ [kW] of electricity generation at a marginal cost equal to $\bar{P}_\tau^n$ will be saved. $CP$ represents the charging power rate, and $L$ denotes the number of hours needed to fully charge a battery. The reduction in electricity generation costs in the future hours is given by
\[
\sum_{\tau=h+1}^{h+L-1} CP \times \bar{P}_\tau^n, \tag{2.32}
\]
which can be rewritten (by letting $\tau = h+l-1$) as
\[
\sum_{l=2}^{L} CP \times \bar{P}_{h+l-1}^n. \tag{2.33}
\]
The second thing that will occur is that we need to fully charge the one unit of batteries by the end of the day because of the charging due time constraint. The lowest cost to charge the additional unit can be estimated by solving a trivial optimization problem of finding an optimal start time of charging to minimize the associated electricity generation costs incurred during a charging cycle that lasts for L hours. The optimization problem can be written as follows
\[
\min_{h+1 \le \tau \le H-L+1} \; \sum_{l=1}^{L} CP \times \bar{P}_{\tau+l-1}^n. \tag{2.34}
\]
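The search over start times in (2.34) can be sketched as a direct enumeration. The function name `cheapest_charging_window`, the convention that list position `t-1` holds the price approximation for hour `t`, and the sample prices are assumptions made for illustration.

```python
from typing import List, Tuple


def cheapest_charging_window(
    p_bar: List[float], h: int, L: int, cp: float
) -> Tuple[int, float]:
    """Solve (2.34): choose the start hour tau in [h+1, H-L+1] with lowest cost.

    p_bar[t-1] is the price approximation for hour t (hours are 1-based).
    Returns (best start hour, cost of charging for L consecutive hours).
    """
    H = len(p_bar)
    if h + 1 > H - L + 1:
        raise ValueError("no full charging cycle fits after hour h")
    best_tau, best_cost = -1, float("inf")
    for tau in range(h + 1, H - L + 2):
        # Hours tau, ..., tau+L-1 map to the slice [tau-1 : tau-1+L].
        cost = cp * sum(p_bar[tau - 1 : tau - 1 + L])
        if cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau, best_cost


# Hypothetical prices for H = 6 hours, L = 2 hours per cycle, CP = 3 kW.
print(cheapest_charging_window([50.0, 40.0, 30.0, 10.0, 20.0, 60.0], 1, 2, 3.0))
```

The explicit check at the top covers late hours for which the feasible set of (2.34) is empty; how that case is handled in the original formulation is not stated here, so the sketch simply raises an error.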
To summarize, the marginal value of increasing the PHEV backlog by one unit can be estimated by the net reduction in electricity generation costs, written as
\[
v_h^{+,n} = \sum_{l=2}^{L} CP \times \bar{P}_{h+l-1}^n - \min_{h+1 \le \tau \le H-L+1} \; \sum_{l=1}^{L} CP \times \bar{P}_{\tau+l-1}^n. \tag{2.35}
\]
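As a worked illustration of (2.35), the sketch below evaluates both terms for a small, made-up price profile. The function name `marginal_backlog_value`, the 1-based hour convention (`p_bar[t-1]` is the approximation for hour `t`), and the numbers are assumptions for the example.

```python
from typing import List


def marginal_backlog_value(p_bar: List[float], h: int, L: int, cp: float) -> float:
    """Evaluate (2.35) at hour h, with p_bar[t-1] the price approximation of hour t."""
    H = len(p_bar)
    if h + 1 > H - L + 1:
        raise ValueError("no full charging cycle fits after hour h")
    # First term of (2.35): hours h+1, ..., h+L-1, indexed by l = 2, ..., L.
    saved = sum(cp * p_bar[h + l - 2] for l in range(2, L + 1))
    # Second term: cheapest L-hour cycle with start hour tau in [h+1, H-L+1].
    recharge = min(
        sum(cp * p_bar[tau + l - 2] for l in range(1, L + 1))
        for tau in range(h + 1, H - L + 2)
    )
    return saved - recharge


prices = [50.0, 40.0, 30.0, 10.0, 20.0, 60.0]  # hypothetical P_bar for H = 6
print(marginal_backlog_value(prices, 1, 2, 3.0))  # cheap hours still ahead
print(marginal_backlog_value(prices, 3, 2, 3.0))  # cheapest hours are imminent
```

At hour 1 the saved cost is 3 × 40 = 120 and the cheapest later cycle (hours 4 and 5) costs 3 × (10 + 20) = 90, so the estimate is 30. At hour 3 the saved cost is only 3 × 10 = 30 against the same recharge cost of 90, giving an estimate of −60.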
From (2.35) we can see that when future electricity prices are low, the gains from increasing the PHEV backlog will be relatively large, meaning that more vehicles' charging will be delayed to take advantage of low electricity prices in future hours. This shows that using the designed value function approximation, combined with the iterative updating operation, a closed feedback loop is created to make better and better decisions.

Fig. 2.3. Illustrating how to obtain a new sample estimate of the value function gradient approximation given the wholesale electricity price approximations
We now use the new estimate $v_h^{+,n}$ to update the value function gradient approximation according to the following equation
\[
\bar{V}_h^{+,n} = (1-\alpha^+_{n-1}) \times \bar{V}_h^{+,n-1} + \alpha^+_{n-1} \times v_h^{+,n}, \quad 2 \le n \le N, \; 1 \le h \le H; \tag{2.36}
\]
where $\alpha^+_{n-1}$ is a step-size between 0 and 1; and, the common practice is to use a