Model predictive control - Optimization for online energy management

4.4 Optimization for online energy management

4.4.3 Model predictive control

System dynamics: the system to be controlled is described by means of a discrete-time model:

$$B(t+1) = B(t) + T(t) + W(t), \tag{4.14}$$

where $t$ is the current time slot. The $M \times n_s$ matrix $B(t)$ with elements $[B(t)]_{k,n} = B_n(k)$ denotes the system state, representing for each BS $n \in S$ the energy buffer level for time slot $k$, with $k = t, t+1, \dots, t+M-1$, where $M$ is the optimization horizon. Note that the system state in the first time slot $t$ is known, whereas those in the following $M-1$ time slots have to be estimated. Referring to Section 4.4.2, we thus have $M = N_* + 1$. The $M \times n_s$ matrix $T(t)$ with elements $[T(t)]_{k,n} = T_n(k)$ denotes the control matrix, representing the amount of energy that each BS $n$ shall either transfer (if $T_n(k) < 0$) or receive (if $T_n(k) > 0$) in time slot $k = t, \dots, t+M-1$. The $M \times n_s$ matrix $W(t)$ with elements $[W(t)]_{k,n} = H_n(k) - O_n(k)$ models the effective energy income,

i.e., the stochastic behavior of the forecast profiles (harvested and consumed energy), with:

$$W(t) \sim \mathcal{N}(\bar{W}(t), \Sigma_W(t)), \tag{4.15}$$

where $\bar{W}(t)$ and $\Sigma_W(t)$ contain the mean and variance of the forecast estimates, respectively. Note that processes $H_n(k)$ and $O_n(k)$ are statistically characterized through the prediction framework of Section 4.4.2, and their difference is still a Gaussian r.v. (in fact, $O_n(k)$ is derived from $L_n(k)$ through a linear model, and as such is still Gaussian distributed). Following [138], due to the stochastic nature of Eq. (4.15), the system state $B(t)$ can also be written in a probabilistic way:

$$B(t) \sim \mathcal{N}(\bar{B}(t), \Sigma_B(t)), \tag{4.16}$$

where $\bar{B}(t)$ and $\Sigma_B(t)$ are the mean and the variance of $B(t)$, respectively.
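To make the Gaussian description in Eq. (4.16) concrete, the sketch below propagates the mean and variance of a single BS's buffer through Eq. (4.14). It assumes, beyond what the text states, that the control actions are deterministic and that the forecast errors in different time slots are independent, so that variances simply add. The function and argument names are illustrative only.

```python
def propagate_moments(b_now, transfers, w_mean, w_var):
    """Propagate buffer mean and variance for one BS over the horizon.

    b_now     : known buffer level in the current slot t (variance 0)
    transfers : planned T_n(k) for the next M-1 steps (assumed deterministic)
    w_mean    : forecast mean of H_n(k) - O_n(k) for the same steps
    w_var     : forecast variance for the same steps (independence assumed)
    Returns two lists of length M: means and variances of B_n(k).
    """
    means, variances = [float(b_now)], [0.0]
    for t_k, m_k, v_k in zip(transfers, w_mean, w_var):
        # The mean follows Eq. (4.14) applied to expected values.
        means.append(means[-1] + t_k + m_k)
        # Variances of independent Gaussian terms add up.
        variances.append(variances[-1] + v_k)
    return means, variances


# Toy numbers (made up): M = 3, i.e. one known state and two predicted ones.
means, variances = propagate_moments(5.0, [1.0, -1.0], [0.5, 0.5], [0.1, 0.2])
```

Under these assumptions the uncertainty grows along the horizon regardless of the chosen transfers, which only shift the mean.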

Objective functions: the goal of the MPC controller is to determine the amount $T_n(k)$ that each BS $n$ should either transfer or receive in time slots $k = t, \dots, t+M-1$, so that all the energy buffers remain as close as possible to the reference value $B_{\mathrm{ref}}$. A first quadratic cost function tracks the total amount of energy that is to be exchanged among BSs in the optimization horizon $k = t, \dots, t+M-1$:

$$f_1^{\mathrm{MPC}}(T(t)) = \sum_{k=t}^{t+M-1} \sum_{n=1}^{n_s} T_n(k)^2. \tag{4.17}$$

$f_1^{\mathrm{MPC}}(\cdot)$ is used to minimize the total amount of energy exchanged, so as to keep the energy losses low during the subsequent energy transfer phase. Through a second objective function, the MPC controller seeks to keep the BS energy buffer levels as close as possible to the reference threshold $B_{\mathrm{ref}}$ (defined in Section 4.3.4). This second cost function is defined as follows:

$$f_2^{\mathrm{MPC}}(B(t), B_{\mathrm{ref}}) = \sum_{k=t}^{t+M-1} \sum_{n=1}^{n_s} \left(B_n(k) - B_{\mathrm{ref}}\right)^2. \tag{4.18}$$
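As a small illustration of Eqs. (4.17) and (4.18), the following sketch evaluates both cost functions on matrices stored as nested lists, with rows indexing time slots $k$ and columns indexing BSs $n$. The function names and the toy values are assumptions made for this example.

```python
def f1_mpc(T):
    """Eq. (4.17): sum of squared transfers over all slots and BSs."""
    return sum(t_nk ** 2 for row in T for t_nk in row)


def f2_mpc(B, b_ref):
    """Eq. (4.18): sum of squared deviations of buffer levels from B_ref."""
    return sum((b_nk - b_ref) ** 2 for row in B for b_nk in row)


# Toy horizon with M = 2 slots and n_s = 2 BSs (values are made up).
T = [[1.0, -1.0], [2.0, -2.0]]
B = [[5.0, 7.0], [6.0, 6.0]]
cost_transfer = f1_mpc(T)        # 1 + 1 + 4 + 4
cost_tracking = f2_mpc(B, 6.0)   # 1 + 1 + 0 + 0
```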

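Control problem: the following finite-horizon multi-objective optimization problem is formulated:

$$\begin{aligned}
\min_{T(t)} \quad & \mathbb{E}\left[\alpha f_1^{\mathrm{MPC}}(T(t)) + (1-\alpha) f_2^{\mathrm{MPC}}(B(t), B_{\mathrm{ref}})\right] && \text{(4.19a)} \\
\text{subject to:} \quad & B(t) \sim \mathcal{N}(\bar{B}(t), \Sigma_B(t)), && \text{(4.19b)} \\
& W(t) \sim \mathcal{N}(\bar{W}(t), \Sigma_W(t)), && \text{(4.19c)} \\
& B_{\mathrm{low}} \le B_n(k) \le B_{\max}, && \text{(4.19d)} \\
& T_n(k)^{\min} \le T_n(k) \le T_n(k)^{\max}, && \text{(4.19e)}
\end{aligned}$$

with: $k = t, t+1, \dots, t+M-1$,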

where $\alpha \in [0, 1]$ is a weight that balances the relative importance of the two cost functions. $B_{\mathrm{low}}$ and $B_{\max}$ are the energy buffer limits defined in Section 4.3.4. Finally, constraint Eq. (4.19e) bounds the amount of energy that each BS $n \in S$ can exchange in slot $k$. These bounds depend on the system state, i.e., the energy buffer level $B_n(k)$, the expected harvested energy and the expected traffic load: the system state defines the limits of the control action for each $k$.
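One way to see why the expectation in Eq. (4.19a) stays tractable is the standard identity for the second moment of a random variable. Assuming the control actions are deterministic, so that they shift the mean of $B_n(k)$ but not its variance, each tracking term splits as sketched below. This is offered as supporting reasoning under that assumption, not necessarily the derivation followed in [138].

```latex
\mathbb{E}\!\left[\left(B_n(k) - B_{\mathrm{ref}}\right)^2\right]
  = \left([\bar{B}(t)]_{k,n} - B_{\mathrm{ref}}\right)^2
  + [\Sigma_B(t)]_{k,n}
```

Under this assumption the variance term does not depend on $T(t)$, so minimizing the expected tracking cost amounts to minimizing the same quadratic evaluated at the predicted means.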

For any fixed value of $\alpha$, and since the optimization problem must be solved at runtime, it is strongly preferable to choose a convex optimization formulation such as Eq. (4.19), which can be solved through standard techniques. Here, we have used the CVX tool [139] to obtain the optimal solution $T(t)^* = [T_n(k)^*]$, which represents the amount of energy that each BS $n \in S$ shall either offer or demand in time slot $k = t, \dots, t+M-1$.
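As a worked instance of why convexity helps, consider a deliberately reduced case: one BS, one step ahead, with the buffer replaced by its predicted mean. This reduction is an assumption made here for illustration and is not the horizon-wide problem that the text solves with CVX. The cost $\alpha T^2 + (1-\alpha)(B + T + \bar{W} - B_{\mathrm{ref}})^2$ is a convex quadratic in $T$, so its constrained minimizer is the unconstrained one clipped to the feasible interval. The function name and the numbers are hypothetical.

```python
def one_step_transfer(b, w_mean, b_ref, alpha, t_min, t_max, b_low, b_max):
    """Minimise alpha*T**2 + (1-alpha)*(b + T + w_mean - b_ref)**2 over T.

    The bounds mirror Eqs. (4.19d)-(4.19e), applied to the predicted mean.
    """
    # Setting the derivative to zero gives the unconstrained minimiser.
    t_free = (1.0 - alpha) * (b_ref - b - w_mean)
    # Buffer limits on the predicted next level translate into bounds on T.
    lo = max(t_min, b_low - b - w_mean)
    hi = min(t_max, b_max - b - w_mean)
    if lo > hi:
        raise ValueError("no feasible transfer for this state")
    # For a 1-D convex cost, clipping to [lo, hi] gives the constrained optimum.
    return min(max(t_free, lo), hi)


# Buffer at 2.0, expected net income 1.0, reference 6.0 (made-up values).
t_balanced = one_step_transfer(2.0, 1.0, 6.0, 0.5, -2.0, 2.0, 0.0, 10.0)
t_tracking_only = one_step_transfer(2.0, 1.0, 6.0, 0.0, -2.0, 2.0, 0.0, 10.0)
```

With $\alpha = 0.5$ the controller closes half of the predicted gap, while with $\alpha = 0$ it would close the whole gap but is limited by the transfer bound.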

Optimization algorithm: the MPC controller performs as follows [140]:

1. Step 1: at the beginning of time slot $k$, the system state is obtained, that is, the energy buffer levels for all BSs and the harvested energy and traffic load forecasts for the next $M$ time slots (the optimization horizon).

2. Step 2: the control problem in Eq. (4.19) is solved, yielding a sequence of control actions over the horizon $M$.

3. Step 3: only the first control action is performed and the system state is updated upon implementing the required energy transfers.

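4. Step 4: the optimization horizon is shifted forward by one time slot and the procedure is repeated from Step 1.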
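The steps above can be sketched as a receding-horizon loop for a single BS. The solver is passed in as a callable. The `proportional_plan` function used in the example is a simple stand-in assumed for illustration, not the CVX-based optimizer of Eq. (4.19), and all names and numbers are hypothetical.

```python
def run_mpc(b0, net_income, forecast, solve, n_pred):
    """Receding-horizon loop: plan over the horizon, apply the first action.

    b0         : initial buffer level
    net_income : realised H(k) - O(k) for each simulated slot
    forecast   : callable (slot, n_pred) -> list of predicted net incomes
    solve      : callable (buffer, predicted) -> list of planned transfers
    n_pred     : number of predicted steps in the horizon
    """
    b, applied = b0, []
    for k in range(len(net_income)):
        predicted = forecast(k, n_pred)   # Step 1: state and forecasts
        plan = solve(b, predicted)        # Step 2: plan over the horizon
        t_k = plan[0]                     # Step 3: apply the first action only
        b = b + t_k + net_income[k]       # state update as in Eq. (4.14)
        applied.append(t_k)
        # The next iteration restarts from Step 1 with a shifted horizon.
    return b, applied


def proportional_plan(b, predicted, b_ref=6.0, gain=0.5):
    """Stand-in planner: close a fixed fraction of the predicted gap per step."""
    plan, level = [], b
    for w in predicted:
        t = gain * (b_ref - level - w)
        plan.append(t)
        level = level + t + w
    return plan


# Constant, perfectly forecast net income of 1.0 per slot (made-up scenario).
final_b, applied = run_mpc(
    2.0, [1.0, 1.0, 1.0], lambda k, n: [1.0] * n, proportional_plan, n_pred=2
)
```

Only the first entry of each plan is ever used. The remaining entries matter because they shape that first action when a real horizon-wide optimizer is plugged in as `solve`.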

In document Bayesian Learning Strategies in Wireless Networks (Page 116-119)