• No results found

A Splitting Method for Embedded Optimal Control

N/A
N/A
Protected

Academic year: 2021

Share "A Splitting Method for Embedded Optimal Control"

Copied!
47
0
0

Loading.... (view fulltext now)

Full text

(1)

A Splitting Method for Embedded Optimal Control

B. O’Donoghue G. Stathopoulos S. Boyd

Stanford University

EMBOPT, 8/9/14, IMT Lucca

1

(2)

Outline

Convex optimal control problem

Operator splitting method

Examples

Conclusion

(3)

Convex optimal control problem

I

we consider discrete-time, deterministic, finite-horizon control

I

linear-convex optimal control problem:

minimize P

T

t=0

`

t

(x

t

, u

t

)

subject to x

t+1

= A

t

x

t

+ B

t

u

t

+ c

t

, t = 0, . . . , T − 1 x

0

= x

init

I

variables: states x

t

∈ R

n

and actions u

t

∈ R

m

, t = 0, . . . , T

I

stage cost `

t

: R

n

× R

m

→ R ∪ {∞} convex

I

infinite values of `

t

encode state/action constraints

Convex optimal control problem 3

(4)

Solution methods

I

many methods to solve convex optimal control problem

I interior-point methods

I accelerated (primal or dual) proximal gradient

I explicit MPC

I active set

I

each has advantages, disadvantages, limitations

(5)

This talk

yet another method for convex control problem, that

I

is fast and reliable

I

is implementable in light, library free code

I

can take advantage of parallelism

I

scales to large problems

I

can be implemented in fixed point arithmetic (in many cases)

Convex optimal control problem 5

(6)

Stage cost decomposition

I

stage cost decomposed as

`

t

= φ

t

+ ψ

t

I

φ

t

convex quadratic

I

ψ

t

non-quadratic, possibly infinite (but convex)

I

(decomposition not unique)

(7)

Quadratic stage cost

convex quadratic terms φ

t

: R

n

× R

m

→ R have the form

φ

t

( x, u) = (1/2)

 x u 1

T

Q

t

S

t

q

t

S

Tt

R

t

r

t

q

Tt

r

Tt

0

 x u 1

 where

 Q

t

S

t

S

Tt

R

t



 0 (i.e., symmetric positive semidefinite)

Convex optimal control problem 7

(8)

Decomposed problem

minimize P

T

t=0

t

(x

t

, u

t

) + ψ

t

(x

t

, u

t

))

subject to x

t+1

= A

t

x

t

+ B

t

u

t

+ c

t

, t = 0, . . . , T − 1

x

0

= x

init

(9)

Variable-term graph structure

circles: objective function terms; rectangles: variables

φ

0

(x

0

, u

0

)

ψ

0

φ

t

(x

t

, u

t

)

ψ

t

φ

t+1

ψ

t+1

φ

T

(x

T

, u

T

)

ψ

T

(x

t+1

, u

t+1

)

Convex optimal control problem 9

(10)

Notation

I

x = (x

0

, . . . , x

T

), u = (u

0

, . . . , u

T

), (x, u) denote whole trajectories

I

define trajectory costs

φ(x, u) = X

T

t=0

φ

t

(x

t

, u

t

), ψ(x, u) = X

T

t=0

ψ

t

(x

t

, u

t

)

I

D is set of trajectories that satisfy dynamics

D = {(x, u) | x

0

= x

init

, x

t+1

= A

t

x

t

+ Bu

t

+ c

t

, t = 0, . . . , T − 1}

I

I

D

is indicator function of D I

D

( x, u) =

0 (x, u) ∈ D

∞ otherwise

(11)

Optimal control problem

minimize I

D

( x, u) + φ(x, u) + ψ(x, u)

I

I

D

(x, u) encodes linear equality (dynamics) constraints

I

φ( x, u) is separable convex quadratic

I

ψ( x, u) is separable non-quadratic convex

Convex optimal control problem 11

(12)

Outline

Convex optimal control problem

Operator splitting method

Examples

Conclusion

(13)

Consensus form

I

replicate x and u, and add consensus constraints:

minimize (I

D

( x, u) + φ(x, u)) + ψ(˜x, ˜u) subject to (x, u) = (˜x, ˜u)

over (x, u) ∈ R

(n+m)(T+1)

and (˜x, ˜u) ∈ R

(n+m)(T+1)

Operator splitting method 13

(14)

Graph structure (original problem)

φ

0

(x

0

, u

0

)

ψ

0

φ

t

(x

t

, u

t

)

ψ

t

φ

t+1

ψ

t+1

φ

T

(x

T

, u

T

)

ψ

T

(x

t+1

, u

t+1

)

(15)

Graph structure (consensus form)

φ

0

(x

0

, u

0

)

φ

t

(x

t

, u

t

)

φ

t+1

φ

T

(x

T

, u

T

) (x

t+1

, u

t+1

)

( ˜x

0

, ˜u

0

)

ψ

0

(˜x

t

, ˜u

t

)

ψ

t

ψ

t+1

(˜x

T

, ˜u

T

)

ψ

T

( ˜x

t+1

, ˜u

t+1

)

Operator splitting method 15

(16)

Proximal operator

I

define prox operator

prox

f

(v) = argmin

x



f (x) + (ρ/2)kx − vk

22



with parameter ρ > 0

I

generalizes notion of projection

I

prox operators of many functions have simple forms

(17)

Douglas-Rachford splitting for consensus convex optimization

I

consensus convex optimization problem minimize f (x) + g(z) subject to x = z

I

DR splitting algorithm: starting from any z

0

, λ

0

, for k = 0, 1, . . . , x

k+1

:= prox

f

(z

k

+ λ

k

)

z

k+1

:= prox

g

(x

k+1

− λ

k

) λ

k+1

:= λ

k

+ x

k+1

− z

k+1

I

λ is (scaled) dual variable associated with consensus constraint

I

λ

k

is running summing of errors x

k

− z

k

(integral control)

I

converges to solution, if one exists

Operator splitting method 17

(18)

Operator splitting for control (OSC)

I

consensus form optimal control problem:

minimize (I

D

( x, u) + φ(x, u)) + ψ(˜x, ˜u) subject to (x, u) = (˜x, ˜u)

I

OSC: starting from any (˜x

0

, ˜u

0

), (z

0

, y

0

), for k = 0, 1, . . . , (x

k+1

, u

k+1

) := prox

I

D

(˜x

k

+ z

k

, ˜u

k

+ y

k

) ( ˜x

k+1

, ˜u

k+1

) := prox

ψ

(x

k+1

− z

k

, u

k+1

− y

k

)

(z

k+1

, y

k+1

) := (z

k

, y

k

) + (˜x

k+1

− x

k+1

, ˜u

k+1

− u

k+1

)

(19)

Stopping criterion

I

primal residual r

k

= (x

k

, u

k

) − ( ˜x

k

, ˜u

k

)

I

dual residual s

k

= ρ((˜x

k

, ˜u

k

) − (˜x

k−1

, ˜u

k−1

))

I

both converge to zero

I

stopping criterion:

kr

k

k

2

6 

pri

, ks

k

k

2

6 

dual

with tolerances 

pri

> 0 and 

dual

> 0

Operator splitting method 19

(20)

Consensus form graph

φ

0

(x

0

, u

0

)

φ

t

(x

t

, u

t

)

φ

t+1

φ

T

(x

T

, u

T

) (x

t+1

, u

t+1

)

( ˜x

0

, ˜u

0

)

ψ

(˜x

t

, ˜u

t

)

ψ ψ

(˜x

T

, ˜u

T

)

ψ

( ˜x

t+1

, ˜u

t+1

)

(21)

With dual variables

φ

0

(x

0

, u

0

)

φ

t

(x

t

, u

t

)

φ

t+1

φ

T

(x

T

, u

T

) (x

t+1

, u

t+1

)

( ˜x

0

, ˜u

0

)

ψ

0

(˜x

t

, ˜u

t

)

ψ

t

ψ

t+1

(˜x

T

, ˜u

T

)

ψ

T

( ˜x

t+1

, ˜u

t+1

)

(z

0

, y

0

) (z

t

, y

t

) (z

t+1

, y

t+1

) (z

T

, y

T

)

Operator splitting method 21

(22)

Douglas-Rachford Splitting

φ

0

(x

0

, u

0

)

φ

t

(x

t

, u

t

)

φ

t+1

φ

T

(x

T

, u

T

) (x

t+1

, u

t+1

)

( ˜x

0

, ˜u

0

) (˜x

t

, ˜u

t

) ( ˜x

t+1

, ˜u

t+1

) (˜x

T

, ˜u

T

)

(z

0

, y

0

) (z

t

, y

t

) (z

t+1

, y

t+1

) (z

T

, y

T

)

(23)

Sub-problems

φ

0

(x

0

, u

0

)

φ

t

(x

t

, u

t

)

φ

t+1

φ

T

(x

T

, u

T

) (x

t+1

, u

t+1

)

( ˜x

0

, ˜u

0

)

ψ

0

(˜x

t

, ˜u

t

)

ψ

t

ψ

t+1

( ˜x

T

, ˜u

T

)

ψ

T

( ˜x

t+1

, ˜u

t+1

)

(z

0

, y

0

) (z

t

, y

t

) (z

t+1

, y

t+1

) (z

T

, y

T

)

Quadratic

Affine equality constraints

Operator splitting method 23

(24)

Sub-problems

φ

0

(x

0

, u

0

)

φ

t

(x

t

, u

t

)

φ

t+1

φ

T

(x

T

, u

T

) (x

t+1

, u

t+1

)

( ˜x

0

, ˜u

0

) (˜x

t

, ˜u

t

) (˜x

t+1

, ˜u

t+1

) ( ˜x

T

, ˜u

T

) (z

0

, y

0

) (z

t

, y

t

) (z

t+1

, y

t+1

) (z

T

, y

T

)

Quadratic

Affine equality constraints

(25)

Linear quadratic step

I

OSC first step is solving a linearly constrained quadratic problem minimize (1/2)w

T

Ew + f

T

w

subject to Gw = h over variable w ∈ R

(T+1)(n+m)

I

E has block structure

I

optimality conditions: KKT system

 E G

T

G 0

  w λ



=

 −f h



λ ∈ R

(T+1)n

dual variable associated with Gw = h

I

in each iteration of OSC we solve KKT system with same KKT matrix

Operator splitting method 25

(26)

Sparse LDL

T

decomposition

I

factor KKT matrix as

 E G

T

G 0



= PLDL

T

P

T

I P is a permutation matrix

I L is unit lower triangular

I D is diagonal

I

P chosen to yield a factor L with few nonzeros

I

can choose P such that L is block banded

I

factorize, then cache P, L, D

−1

(27)

Solve step

I

solve KKT system using

 w λ



= P

 L

−T

 D

−1

 L

−1

 P

T

 −f h



I multiplication by L−1is forward substitution

I multiplication by L−Tis backward substitution

I

these operations do not require division

I

factor cost: O(T(m + n)

3

), solve cost: O(T(m + n)

2

)

I

same as Riccati recursion, but (much) more general

I

(we also use regularization and iterative refinement)

Operator splitting method 27

(28)

Non-quadratic prox step

I

OSC second step separable across time

I

solve for each t:

minimize ψ

t

( ˜x

t

, ˜u

t

) + (ρ/2)k(˜x

t

, ˜u

t

) − (v

t

, w

t

)k

22

over ˜x

t

∈ R

n

and ˜u

t

∈ R

m

I

in many cases we have analytic or semi-analytic solutions

I

can be solved in parallel

(29)

OSC summary

in each step:

1.

solve linear-quadratic regulator problem

2.

T + 1 parallel prox steps

3.

dual update

Operator splitting method 29

(30)

Usage scenarios

I

cold start

I solve optimal control problem once

I

warm start

I solve many times with similar data

I initialize algorithm using previous solution

I

constant quadratic

I solve many times, where Qt, Rt, St, At, Btdo not change

I perform LDLTfactorization once, offline

I can yield division free algorithm

I

warm start constant quadratic

(31)

Outline

Convex optimal control problem

Operator splitting method

Examples

Conclusion

Examples 31

(32)

Examples

I

three examples, three instances of each

I

timing results for

I cold start: initialize variables to zero

I warm start: perturb xinit

I

at termination no instance was more than 1% suboptimal

I

implemented in C

I

Tim Davis’ AMD and LDL packages for factorization and solve steps

I

run on 4 core Intel Xeon processor (3.4GHz, 16Gb of RAM)

I

and, for fun, Rasberry Pi

(33)

Box-constrained quadratic optimal control

I

box-constrained problem:

minimize (1/2) P

T

t=0

(x

Tt

Qx

t

+ u

Tt

Ru

t

) subject to x

t+1

= Ax

t

+ Bu

t

, t = 0, . . . , T − 1

x

0

= x

init

ku

t

k

6 1 Q  0 and R  0

I

data randomly generated; A scaled so that ρ(A) = 1

I

x

init

scaled so inputs saturated for at least 2/3 of horizon

I

ψ

t

(x

t

, u

t

) = I

kutk61

, so

prox

ψt

(v, w) = (v, sat

[−1,1]

(w))

Examples 33

(34)

Results

(all times in milliseconds)

small medium large

state dimension n 5 20 50

input dimension m 2 5 20

horizon length T 10 20 30

total variables 77 525 2170

CVX solve time 400 500 3400

fast MPC solve time 1.5 14.2 2710

factorization time 0.1 1.3 16.8

KKT solve time 0.0 0.1 0.9

OSC iterations 92 46 68

OSC solve time 0.4 4.4 60.5

warm start OSC iterations 72.6 35.1 39.5

warm start OSC solve time 0.3 3.4 35.2

(35)

Results

(all times in milliseconds)

small medium large

state dimension n 5 20 50

input dimension m 2 5 20

horizon length T 10 20 30

total variables 77 525 2170

CVX solve time 400 500 3400

fast MPC solve time 1.5 14.2 2710

factorization time 0.1 1.3 16.8

KKT solve time 0.0 0.1 0.9

OSC iterations 92 46 68

OSC solve time 0.4 4.4 60.5

warm start OSC iterations 72.6 35.1 39.5 warm start OSC solve time 0.3 3.4 35.2

Examples 34

(36)

Multi-period portfolio optimization

I

manage a portfolio of n assets over t = 0, . . . , T

I

x

t

∈ R

n

vector of portfolio positions at time t (in dollars)

I (xt)i<0: short position in asset i in period t

I

u

t

∈ R

n

vector of trades at time t (in dollars)

I (ut)i<0: asset i is sold in period t

I

dynamics

x

t+1

= diag(r

t

)(x

t

+ u

t

), t = 0, . . . , T − 1

I

r

t

> 0 (estimated) returns in period t

(37)

Stage cost

gross cash in

z}|{

1

T

u

t

+

price-impact

z }| { u

Tt

diag(s)u

t

+

quadratic risk

z }| { (x

t

+ u

t

)

T

Σ(x

t

+ u

t

) +

bid-ask spread

z }| { κ

T

|u

t

| +

trading constraint

z }| { I

Ct

(x

t

, u

t

)

I

κ > 0, s > 0, and Σ  0 are data

I

negative stage cost means (risk-adjusted) revenue extracted

I

trading constraints

I long-only:Ct={(xt, ut)| xt+ut> 0}, t , T

I liquidate position:CT={(xT, uT)| xT+uT=0}

Examples 36

(38)

Splitting

gross cash in

z}|{

1

T

u

t

+

price-impact

z }| { u

Tt

diag(s

t

)u

t

quadratic risk

z }| { (x

t

+ u

t

)

T

Σ

t

(x

t

+ u

t

)

| {z } φ

t

+

bid-ask spread

z }| { κ

T

|u

t

| +

trading constraint

z }| { I

Ct

(x

t

, u

t

)

| {z } ψ

t

note: ψ

t

separable across assets

(39)

Proximal operator for ψ

t

I

for t < T prox step given by solution to

minimize κ

i

|u|

i

+ (ρ/2) (x

i

− v

i

)

2

+ (u

i

− w

i

)

2

 subject to x

i

+ u

i

> 0

with scalar variables x

i

and u

i

I

solution easily expressed using soft-thresholding operator S

γ

(z) S

γ

(z) = argmin

y

γ|y| + (1/2)(y − z)

2

 = z(1 − γ/|z|)

+

Examples 38

(40)

Results

(all times in milliseconds)

small medium large

number of assets n 10 30 50

horizon length T 30 60 100

total variables 620 3660 10100

CVX solve time 800 2100 10750

factorization time 0.7 13.3 73.6

KKT solve time 0.1 0.7 3.2

OSC iterations 27 41 53

OSC solve time 1.5 30.8 177.7

warm start OSC iterations 5.1 5.9 4.8

warm start OSC solve time 0.3 4.4 16.1

(41)

Results

(all times in milliseconds)

small medium large

number of assets n 10 30 50

horizon length T 30 60 100

total variables 620 3660 10100

CVX solve time 800 2100 10750

factorization time 0.7 13.3 73.6

KKT solve time 0.1 0.7 3.2

OSC iterations 27 41 53

OSC solve time 1.5 30.8 177.7

warm start OSC iterations 5.1 5.9 4.8 warm start OSC solve time 0.3 4.4 16.1

Examples 39

(42)

Supply chain management

I

single commodity supply chain on a directed graph

I n nodes: warehouses or storage locations

I m edges: shipment links between warehouses, sources, and sinks

I

x

t

∈ R

n+

amount of the commodity stored in warehouses

I

u

t

∈ R

m+

amount shipped across links

I

dynamics

x

t+1

= x

t

+ (B

+

− B

)u

t

I B+ij =1 if edge j enters node i

(43)

Stage cost

storage cost

z }| { q

Tt

x + ˜q

Tt

x

2

+

transportation cost

z}|{

r

Tt

u

t

| {z } φ

t

+

warehouse capacities

z }| { I

06xt6C

+

link capacities

z }| { I

06ut6U

+

can’t ship more than on hand

z }| { I

But6xt

| {z } ψ

t

transportation costs include

I

cost of acquisition

I

revenue from sales

prox step of ψ

t

solved via saturation and bisection

Examples 41

(44)

Results

(all times in milliseconds)

small medium large

warehouses n 10 20 40

edges m 25 118 380

horizon length T 20 20 20

total variables 735 2898 8820

CVX solve time 500 1200 3300

factorization time 0.3 1.3 4.7

KKT solve time 0.0 0.1 0.3

single-thread prox step time 0.1 0.4 1.3 multi-thread prox step time 0.0 0.1 0.4

OSC iterations 82 77 116

OSC solve time 4.6 19.1 88.1

warm start OSC iterations 21.9 31.0 24.2

warm start OSC solve time 1.2 7.5 18.5

(45)

Results

(all times in milliseconds)

small medium large

warehouses n 10 20 40

edges m 25 118 380

horizon length T 20 20 20

total variables 735 2898 8820

CVX solve time 500 1200 3300

factorization time 0.3 1.3 4.7

KKT solve time 0.0 0.1 0.3

single-thread prox step time 0.1 0.4 1.3 multi-thread prox step time 0.0 0.1 0.4

OSC iterations 82 77 116

OSC solve time 4.6 19.1 88.1

warm start OSC iterations 21.9 31.0 24.2 warm start OSC solve time 1.2 7.5 18.5

Examples 42

(46)

Outline

Convex optimal control problem

Operator splitting method

Examples

Conclusion

(47)

Summary

I

decompose convex optimal control problem into

I convex linear quadratic control problem

I time-separable nonquadratic problems

I

yields fast, reliable algorithm

I small problems solved in microseconds

I large problems solved in milliseconds

I

if dynamics matrices don’t change, yields division-free method

I

can be improved by diagonal scaling, computed on-line or off-line

Conclusion 44

References

Related documents

I Simplex and interior point methods produce optimal primal and dual solutions.. I Interior point methods typically maintain feasible primal and dual solutions in

Optimization, numerical methods, linear programming, optimal con- trol, robust control, convex programming, interior-point methods, FIR lter design, conjugate

The development of this cross-disciplinary problem-solving ability has increasingly become part of clinical legal education.' The pediatric medical-legal partnership, a

As one of the largest convention centres in the province and the Okanagan Valley’s only full-service facility, we’ve got you covered with 60,000 sq ft of versatile, flexible space

The data envelopment analysis (DEA) conducted on a data set of 78 state universities and colleges (SUCs) provides empirical evidence on the inefficiency of the majority of the SUCs

ay be counting on this and therefore h t market at this point. Banks May Value Alternative Data including Mobile T Banks still use a more judgmental approach to evalu type of

Now, if we consider this problem as an online problem with an adversary specifying the routing requests, a memoryless algorithm will imply that the adversary is also memoryless;

Mathematicians, teachers and instructional designers regard symbols as carriers of meaning (Stacey, Chick &amp; Kendal, 2006). Learners, however, lack the necessary