resource allocation problems
Zawar Qureshi, Sebastian East, Mark Cannon
University of Oxford
July 10, 2019
Optimize power delivered by i.c. engine and electric motor while meeting driver demand
[Block diagram: driver demand, supervisory controller, actuator setpoints, vehicle, power; signals: SoC, torque, speed]
Minimize fuel consumption over a future horizon given
• limits on energy capacity (battery SoC, fuel), power flows, torques
Piecewise quadratic maps fitted to fuel map & electrical loss map
[Diagram: battery → Pbat → motor-generator → Pmot and fuel → Pfuel → i.c. engine → Peng; Pmot and Peng sum to Pout (storage, conversion)]
[Figure: Engine fuel map. Contours of BSFC (g/kWh) and Pf (kW) against engine speed (rad/s) and Teng (Nm)]
[Figure: Electrical losses. Contours of Ploss (kW) and Pbat (kW) against motor speed (rad/s) and Tmot (Nm)]
This paper:
• Stochastic demand & robust optimization
• Multiple resources
• Quadratic losses & quadratic costs
• Parallel ADMM implementation
[Diagram: sources 1, ..., n, each with its own loss (storage, conversion), summed to give the output]
[Block diagrams: each allocation $x_k^{(i)}$ passes through $g_k^{(i)}(\cdot)$ and $f_k^{(i)}(\cdot)$; discrete-time integrators $1/(z-1)$ accumulate the resource use $\sum_k g_k^{(i)}(x_k^{(i)})$ of each source and the total cost $\sum_{i,k} f_k^{(i)}(x_k^{(i)})$; the allocations sum to the output $\sum_i x_k^{(i)}$]
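If the demand sequence $\{y_1, \dots, y_n\}$ is known:
Optimal allocation for no uncertainty
\[
\begin{aligned}
\underset{x_k^{(i)} \in [\underline{x}_k^{(i)},\, \bar{x}_k^{(i)}]}{\text{minimize}} \quad & \sum_{i,k} f_k^{(i)}(x_k^{(i)}) && \leftarrow \text{total cost} \\
\text{subject to} \quad & \sum_{i} x_k^{(i)} \ge y_k \quad \forall k && \leftarrow \text{demand} \\
& \sum_{k} g_k^{(i)}(x_k^{(i)}) \le c^{(i)} \quad \forall i && \leftarrow \text{resource capacity}
\end{aligned}
\]
Assumption
$f_k^{(i)}(\cdot)$, $g_k^{(i)}(\cdot)$ are convex and quadratic:
\[
\begin{aligned}
f_k^{(i)}(x) &= \alpha_{k,2}^{(i)} x^2 + \alpha_{k,1}^{(i)} x + \alpha_{k,0}^{(i)} \\
g_k^{(i)}(x) &= \beta_{k,2}^{(i)} x^2 + \beta_{k,1}^{(i)} x + \beta_{k,0}^{(i)}
\end{aligned}
\]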
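The problem data above can be made concrete with a small evaluation routine. The following is a minimal sketch, not the authors' code: the function names, the nested-list data layout and the toy numbers are all assumptions chosen for illustration. It evaluates the total cost and checks the demand and capacity constraints for a given allocation.

```python
def quad(coef, x):
    """Evaluate c2*x**2 + c1*x + c0 for coef = (c2, c1, c0)."""
    c2, c1, c0 = coef
    return c2 * x * x + c1 * x + c0


def evaluate_allocation(x, y, alpha, beta, c, tol=1e-9):
    """x[i][k]: allocation of source i at step k; y[k]: demand at step k.
    alpha[i][k], beta[i][k]: coefficient triples of f and g (hypothetical layout).
    Returns (total_cost, demand_ok, capacity_ok)."""
    n_src, n = len(x), len(y)
    cost = sum(quad(alpha[i][k], x[i][k]) for i in range(n_src) for k in range(n))
    demand_ok = all(
        sum(x[i][k] for i in range(n_src)) >= y[k] - tol for k in range(n)
    )
    capacity_ok = all(
        sum(quad(beta[i][k], x[i][k]) for k in range(n)) <= c[i] + tol
        for i in range(n_src)
    )
    return cost, demand_ok, capacity_ok


# Toy instance (made-up numbers): source 0 is costly, source 1 is capacity-limited.
y = [3.0, 4.0]
alpha = [[(1.0, 0.0, 0.0)] * 2, [(0.0, 0.0, 0.0)] * 2]
beta = [[(0.0, 0.0, 0.0)] * 2, [(0.1, 1.0, 0.0)] * 2]
c = [0.0, 5.0]
x = [[1.0, 2.0], [2.0, 2.0]]
cost, demand_ok, capacity_ok = evaluate_allocation(x, y, alpha, beta, c)
```

Serving all demand from the capacity-limited source (x = [[0, 0], [3, 4]]) would cost nothing here but violates its capacity, which is the trade-off the optimization resolves.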
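Unknown demand sequence with samples $y^{(j)} = \{y_1^{(j)}, \dots, y_n^{(j)}\}$, $j = 1, \dots, q$
Robust optimal allocation
\[
\begin{aligned}
\underset{x_k^{(i,j)} \in [\underline{x}_k^{(i)},\, \bar{x}_k^{(i)}],\; x_1^{(i)}}{\text{minimize}} \quad & \frac{1}{q} \sum_{i,j,k} f_k^{(i,j)}(x_k^{(i,j)}) \\
\text{subject to} \quad & \sum_{i} x_k^{(i,j)} \ge y_k^{(j)} \quad \forall k, j \\
& \sum_{k} g_k^{(i,j)}(x_k^{(i,j)}) \le c^{(i)} \quad \forall i, j \\
& x_1^{(i,j)} = x_1^{(i)} \quad \forall i, j && \leftarrow \text{common 1st decision}
\end{aligned}
\]
Assumption
Samples $y^{(1)}, \dots, y^{(q)}$ are i.i.d.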
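Equivalent problem:
\[
\begin{aligned}
\underset{x_k^{(i,j)},\, x_1^{(i)},\, z_k^{(i,j)},\, s_k^{(j)},\, t^{(i,j)}}{\text{minimize}} \quad & \sum_{i,j,k} \Bigl[ \tfrac{1}{q} f_k^{(i,j)}(x_k^{(i,j)}) + I_{[\underline{x}_k^{(i)},\, \bar{x}_k^{(i)}]}(x_k^{(i,j)}) \Bigr] + \sum_{j,k} I_{\ge 0}(s_k^{(j)}) + \sum_{i,j} I_{\le c^{(i)}}(t^{(i,j)}) \\
\text{subject to} \quad & g_k^{(i,j)}(x_k^{(i,j)}) = z_k^{(i,j)} \quad \forall i, j, k \\
& \sum_{i} x_k^{(i,j)} - y_k^{(j)} = s_k^{(j)} \quad \forall j, k \\
& \sum_{k} z_k^{(i,j)} = t^{(i,j)} \quad \forall i, j \\
& x_1^{(i,j)} = x_1^{(i)} \quad \forall i, j
\end{aligned}
\]
where the indicator function is
\[
I_S(x) = \begin{cases} 0 & x \in S \\ +\infty & x \notin S \end{cases}
\]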
Capacity constraints are split into separable nonlinearities, $g_k^{(i,j)}(x_k^{(i,j)}) = z_k^{(i,j)}$, and linear constraints, $\sum_k z_k^{(i,j)} = t^{(i,j)}$.
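Augmented Lagrangian:
\[
\begin{aligned}
L = {} & \sum_{i,j,k} \tfrac{1}{q} f_k^{(i,j)}(x_k^{(i,j)}) \\
& + \sum_{i,j,k} \Bigl[ I_{[\underline{x}_k^{(i)},\, \bar{x}_k^{(i)}]}(x_k^{(i,j)}) + \tfrac{\rho_1}{2} \bigl( z_k^{(i,j)} - g_k^{(i,j)}(x_k^{(i,j)}) + \lambda_k^{(i,j)} \bigr)^2 \Bigr] \\
& + \sum_{i,j} \Bigl[ I_{\le c^{(i)}}(t^{(i,j)}) + \tfrac{\rho_2}{2} \bigl( t^{(i,j)} - \textstyle\sum_k z_k^{(i,j)} + p^{(i,j)} \bigr)^2 \Bigr] \\
& + \sum_{j,k} \Bigl[ I_{\ge 0}(s_k^{(j)}) + \tfrac{\rho_3}{2} \bigl( s_k^{(j)} - \textstyle\sum_i x_k^{(i,j)} + y_k^{(j)} + \mu_k^{(j)} \bigr)^2 \Bigr] \\
& + \sum_{i,j} \tfrac{\rho_4}{2} \bigl( x_1^{(i)} - x_1^{(i,j)} + \nu^{(i,j)} \bigr)^2
\end{aligned}
\]
• $\lambda_k^{(i,j)}$, $\mu_k^{(j)}$, $\nu^{(i,j)}$, $p^{(i,j)}$: Lagrange multipliers
• $\rho_1$, $\rho_2$, $\rho_3$, $\rho_4$: multiplier update step size parameters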
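Because $g$ is quadratic, the $\rho_1$ penalty is quartic in each $x_k^{(i,j)}$. The sketch below collects the quartic coefficients for a single $x_k^{(i,j)}$ with $k \ge 2$ (so the $\rho_4$ term is absent), under the assumption that every other variable, including the other sources' allocations, is held fixed. That update ordering, the function names and the lumped quantity `v` are assumptions for illustration, not something the slides specify.

```python
def x_update_quartic(alpha, beta, q, rho1, rho3, z, lam, v):
    """Coefficients (A, B, C, D) of A*x**4 + B*x**3 + C*x**2 + D*x, i.e. the
    x-dependent part of the Lagrangian with the constant term dropped.
    alpha = (a2, a1, a0) and beta = (b2, b1, b0) are the coefficients of f and g.
    v = s + y + mu - (sum of the other sources' x), treated as fixed (assumption)."""
    a2, a1, _ = alpha
    b2, b1, b0 = beta
    w = z + lam - b0  # constant part of z - g(x) + lambda
    A = 0.5 * rho1 * b2 * b2
    B = rho1 * b1 * b2
    C = a2 / q + 0.5 * rho1 * (b1 * b1 - 2.0 * w * b2) + 0.5 * rho3
    D = a1 / q - rho1 * w * b1 - rho3 * v
    return A, B, C, D


def lagrangian_terms(x, alpha, beta, q, rho1, rho3, z, lam, v):
    """Direct evaluation of the same terms, used to check the expansion."""
    a2, a1, a0 = alpha
    b2, b1, b0 = beta
    g = b2 * x * x + b1 * x + b0
    f = a2 * x * x + a1 * x + a0
    return f / q + 0.5 * rho1 * (z - g + lam) ** 2 + 0.5 * rho3 * (v - x) ** 2


# Made-up numbers for a consistency check of the expansion.
alpha, beta = (0.3, -1.2, 0.7), (0.05, 0.8, 0.1)
params = dict(q=10, rho1=2.0, rho3=1.5, z=0.4, lam=-0.2, v=1.1)
A, B, C, D = x_update_quartic(alpha, beta, **params)
```

The resulting (A, B, C, D) are in the form required by a scalar quartic minimizer, with A > 0 whenever the quadratic coefficient of g is nonzero.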
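ADMM iteration: primal update
\[
\begin{aligned}
x_k^{(i,j)} &\leftarrow \arg\min_{x_k^{(i,j)}} L = \Pi_{[\underline{x}_k^{(i)},\, \bar{x}_k^{(i)}]} \bigl\{ \text{minimizer of quartic in } x_k^{(i,j)} \bigr\} \\
z_k^{(i,j)} &\leftarrow \arg\min_{z_k^{(i,j)}} L = g_k^{(i,j)}(x_k^{(i,j)}) - \lambda_k^{(i,j)} + \frac{\rho_2}{\rho_1 + n \rho_2} \Bigl[ t^{(i,j)} + p^{(i,j)} - \sum_k \bigl( g_k^{(i,j)}(x_k^{(i,j)}) - \lambda_k^{(i,j)} \bigr) \Bigr] \\
x_1^{(i)} &\leftarrow \arg\min_{x_1^{(i)}} L = \frac{1}{q} \sum_j \bigl( x_1^{(i,j)} - \nu^{(i,j)} \bigr) \\
t^{(i,j)} &\leftarrow \arg\min_{t^{(i,j)}} L = \Pi_{\le c^{(i)}} \Bigl\{ \sum_k z_k^{(i,j)} - p^{(i,j)} \Bigr\} \\
s_k^{(j)} &\leftarrow \arg\min_{s_k^{(j)}} L = \Pi_{\ge 0} \Bigl\{ \sum_i x_k^{(i,j)} - y_k^{(j)} - \mu_k^{(j)} \Bigr\}
\end{aligned}
\]
Each update either can be implemented in parallel or is partially parallelizable.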
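The closed-form updates for $z$, $x_1$, $t$ and $s$ can be sketched in a few lines. This is an illustrative serial version for one source and one scenario with made-up numbers; the function names are hypothetical, and the common first decision is taken as the average over scenarios, which is what minimizing the $\rho_4$ term gives.

```python
def z_update(g_vals, lam, t, p, rho1, rho2):
    """z_k for all k of one (i, j) pair; the shift is shared by every k."""
    n = len(g_vals)
    a = [g - l for g, l in zip(g_vals, lam)]
    shift = rho2 / (rho1 + n * rho2) * (t + p - sum(a))
    return [ak + shift for ak in a]


def x1_update(x1_scen, nu):
    """Common first decision: average of (x_1^(i,j) - nu^(i,j)) over scenarios j."""
    return sum(x - v for x, v in zip(x1_scen, nu)) / len(x1_scen)


def t_update(z, p, c):
    """Projection of sum_k z_k - p onto (-inf, c]."""
    return min(sum(z) - p, c)


def s_update(x_sources, y, mu):
    """Projection of sum_i x_k^(i,j) - y - mu onto [0, inf)."""
    return max(sum(x_sources) - y - mu, 0.0)


# Made-up numbers for one (i, j) pair over a horizon of n = 3 steps.
g_vals, lam = [1.0, 2.0, 3.0], [0.1, -0.2, 0.3]
t, p, rho1, rho2 = 4.0, 0.5, 2.0, 1.0
z = z_update(g_vals, lam, t, p, rho1, rho2)
```

Each of these updates needs a sum (over k, j or i) before the elementwise step, which is one plausible reading of "partially parallelizable": a reduction followed by independent per-element work.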
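ADMM iteration: dual update
\[
\begin{aligned}
\mu_k^{(j)} &\leftarrow \mu_k^{(j)} + s_k^{(j)} - \sum_i x_k^{(i,j)} + y_k^{(j)} \\
\lambda_k^{(i,j)} &\leftarrow \lambda_k^{(i,j)} + z_k^{(i,j)} - g_k^{(i,j)}(x_k^{(i,j)}) \\
\nu^{(i,j)} &\leftarrow \nu^{(i,j)} + x_1^{(i)} - x_1^{(i,j)} \\
p^{(i,j)} &\leftarrow p^{(i,j)} + t^{(i,j)} - \sum_k z_k^{(i,j)}
\end{aligned}
\]
Each update either can be implemented in parallel or is partially parallelizable.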
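Each dual update adds the residual of one equality constraint to its multiplier, so the multipliers stop changing once the constraints hold. A minimal scalar sketch follows; the function name and argument list are assumptions for illustration, with the sums precomputed by the caller.

```python
def dual_update(mu, lam, nu, p, s, x_sum, y, z, g_val, x1_common, x1_scen, t, z_sum):
    """One scalar instance of each multiplier update.
    x_sum = sum_i x_k^(i,j) and z_sum = sum_k z_k^(i,j) are computed beforehand."""
    mu_new = mu + s - x_sum + y          # residual of  sum_i x - y = s
    lam_new = lam + z - g_val            # residual of  g(x) = z
    nu_new = nu + x1_common - x1_scen    # residual of  x_1^(i,j) = x_1^(i)
    p_new = p + t - z_sum                # residual of  sum_k z = t
    return mu_new, lam_new, nu_new, p_new


# At a primal-feasible point every residual is zero and the multipliers are unchanged.
feasible = dual_update(0.1, 0.2, 0.3, 0.4, s=1.0, x_sum=3.0, y=2.0, z=5.0,
                       g_val=5.0, x1_common=1.5, x1_scen=1.5, t=7.0, z_sum=7.0)
```

The $\lambda$ and $\nu$ updates involve no sums and are independent across all indices, while $\mu$ and $p$ first need a reduction over $i$ or $k$.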
CUDA heterogeneous programming model
CUDA kernels run in parallel on the GPU
Threads execute same instructions simultaneously using different data
[Diagram: execution alternates between serial code on the CPU and parallel kernels (kernel 0, kernel 1) on the GPU]
[Diagram: CPU with main memory; GPU with device memory, L1 cache, control and data paths, and GPU cores]
e.g. Nvidia GTX 1060 3GB GPU has 1152 cores and up to 18432 threads
• $x^* = \arg\min_x \; A x^4 + B x^3 + C x^2 + D x$
• Fast algebraic solution based on Vieta's and Cardano's methods:
Input: coefficients A, B, C, D
b ← 3B/(4A), c ← C/(2A), d ← D/(4A)
Q ← c/3 − b²/9, R ← bc/6 − b³/27 − d/2, Δ ← Q³ + R²
if Δ > 0 then
    x* ← (R + √Δ)^(1/3) + (R − √Δ)^(1/3) − b/3
else if Q = R = 0 then
    x* ← −b/3
else
    θ ← cos⁻¹(R/|Q|^(3/2))
    x_a ← 2|Q|^(1/2) cos(θ/3) − b/3
    x_b ← 2|Q|^(1/2) cos(θ/3 + 2π/3) − b/3
    x_c ← 2|Q|^(1/2) cos(θ/3 + 4π/3) − b/3
    (x1, x2, x3) ← sort(x_a, x_b, x_c)
    δf ← (1/4)(x1⁴ − x3⁴) + (b/3)(x1³ − x3³) + (c/2)(x1² − x3²) + d(x1 − x3)
    if δf > 0 then x* ← x3 else x* ← x1
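The algorithm above translates almost line for line into a scalar routine. The following is a serial Python sketch for checking the logic, not the CUDA implementation from the repository; it assumes A > 0, and the helper `cbrt` and the clamping of the `acos` argument are additions for numerical safety.

```python
import math


def cbrt(v):
    """Real cube root that also handles negative arguments."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)


def argmin_quartic(A, B, C, D):
    """Minimizer of A*x**4 + B*x**3 + C*x**2 + D*x, assuming A > 0."""
    # Stationary points solve the monic cubic x^3 + b x^2 + c x + d = 0.
    b, c, d = 3.0 * B / (4.0 * A), C / (2.0 * A), D / (4.0 * A)
    Q = c / 3.0 - b * b / 9.0
    R = b * c / 6.0 - b ** 3 / 27.0 - d / 2.0
    delta = Q ** 3 + R ** 2
    if delta > 0.0:  # one real stationary point
        sq = math.sqrt(delta)
        return cbrt(R + sq) + cbrt(R - sq) - b / 3.0
    if Q == 0.0 and R == 0.0:  # triple root
        return -b / 3.0
    # Three real stationary points (Q < 0 here); clamp guards against round-off.
    theta = math.acos(max(-1.0, min(1.0, R / abs(Q) ** 1.5)))
    m = 2.0 * math.sqrt(abs(Q))
    roots = sorted(
        m * math.cos(theta / 3.0 + k * 2.0 * math.pi / 3.0) - b / 3.0
        for k in range(3)
    )
    x1, x3 = roots[0], roots[2]  # the middle root is a local maximum

    def F(x):  # quartic scaled by 1/(4A), constant dropped
        return x ** 4 / 4.0 + b * x ** 3 / 3.0 + c * x ** 2 / 2.0 + d * x

    return x3 if F(x1) - F(x3) > 0.0 else x1


x_two_wells = argmin_quartic(1.0, 0.0, -2.0, 1.0)  # three stationary points
x_one_well = argmin_quartic(1.0, 0.0, 1.0, -2.0)   # single stationary point
```

In an ADMM x-update the returned value would then be projected onto the interval bounds, as in the primal update slide.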
Computation time for minimization of N quartics with random coefficients (CPU speed optimizations via compiler flags /Ox and /Od)
[Figure: Time (s) against number of equations (N), logarithmic axes; curves: CPU single-core with optimisations off, CPU multi-core with optimisations on, GPU]
• $y_k^{(j)}$: samples of stochastic driver power demand
• $x_k^{(1,j)}$: drive power from i.c. engine
• $x_k^{(2,j)}$: drive power from electric motor
• Objective: minimize fuel consumption ($f_k^{(2,j)} = 0 \;\forall j, k$)
• Constraints: battery capacity & power flows ($g_k^{(1,j)} = 0 \;\forall j, k$)
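Robust optimization problem
\[
\begin{aligned}
\underset{x_k^{(i,j)} \in [\underline{x}_k^{(i)},\, \bar{x}_k^{(i)}],\; x_1^{(i)}}{\text{minimize}} \quad & \frac{1}{q} \sum_{j=1}^{q} \sum_{k=1}^{n} f_k^{(1,j)}(x_k^{(1,j)}) \\
\text{subject to} \quad & x_k^{(1,j)} + x_k^{(2,j)} \ge y_k^{(j)}, \quad j \in \{1, \dots, q\},\; k \in \{1, \dots, n\} \\
& \sum_{k} g_k^{(2,j)}(x_k^{(2,j)}) \le \Delta E, \quad j \in \{1, \dots, q\} \\
& x_1^{(i,j)} = x_1^{(i)}, \quad i = 1, 2,\; j \in \{1, \dots, q\}
\end{aligned}
\]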
Optimal predicted battery state of charge profiles with 100 power demand scenarios generated from random perturbations of FTP-75 drive cycle
Average computation times
[Figure: CPU runtime and GPU runtime, Time (s), and number of iterations against number of demand scenarios q; logarithmic time and q axes]
• Parallel implementation is between 10× and 20× faster than serial
• 0.25 s for 100 scenarios (0.6 s for 200 scenarios) is acceptable with $T_{\text{samp}} = 1$ s
Contributions
• ADMM algorithm for robust quadratic resource allocation problems
• Parallel implementation on GPU coded in CUDA
Observations
• Choice of operator splitting is important for parallel implementation
• Robust optimal energy management problem is solvable in real time using cheap, non-specialized hardware
• State-of-the-art low-cost parallel processing hardware is evolving fast
Code: https://github.com/qureshizawar/CUDA-quartic-solver
Questions?