An indirect approach using gradient information


6.1 An indirect approach using gradient information

The proposed indirect method is based on the key idea already discussed in Section 3.1. We use the a-priori bound $K$ given by (3.1.4) in Theorem 3.1.1 for the number of essential switches in the optimal $q^*(\cdot)$, fix $q(0)$ by $q_0 \in Q = \{0, 1\}$ and thus all essential subsequent modes, and then obtain $\tilde{q}^*(\cdot)$ by considering the switching times $\tau_k$, $k = 1, \dots, K$, as continuous optimization variables subject to the inequality constraints

$$0 \le \tau_1 \le \tau_2 \le \dots \le \tau_K \le T. \tag{6.1.1}$$

In order to compute a point $(\tau_1, \dots, \tau_K)$ satisfying the optimality condition of Theorem 3.2.2, we adapt a gradient projection method with an Armijo stepsize suggested in [17] for switching ordinary differential equations. To this end, let $\Upsilon$ denote the feasible set, i.e.,

$$\Upsilon = \{\tau = (\tau_1, \dots, \tau_K) \in \mathbb{R}^K : 0 \le \tau_1 \le \tau_2 \le \dots \le \tau_K \le T\}, \tag{6.1.2}$$


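Algorithm 6.1.1 (Gradient-projection method [17]):

Require: $0 < \alpha < 0.5$, $0 < \beta < 1$, $\bar{z} > 0$, $\varepsilon > 0$.
Initialize: Choose $\tau^0 = (\tau^0_1, \dots, \tau^0_K)$. Set $i := 0$.

while $\|\bar{h}(\tau^i)\| > \varepsilon$ do
    if $\tau^i + \bar{z}\,\bar{h}(\tau^i) \notin \Upsilon$ then
        $z_{\max} := \max\{z > 0 : \tau^i + z\,\bar{h}(\tau^i) \in \Upsilon\}$.
    else
        set $z_{\max} := \bar{z}$.
    end if
    Step a: Compute a stepsize $\zeta_i$ by
        $\zeta_i = \max\{z = z_{\max}\beta^k : k = 0, 1, 2, \dots\}$
        subject to $J(\tau^i + z\,\bar{h}(\tau^i)) - J(\tau^i) \le \alpha z\,\nabla J(\tau^i)^\top \bar{h}(\tau^i)$. (6.1.5)
    Step b: Set $\tau^{i+1} := \tau^i + \zeta_i\,\bar{h}(\tau^i)$, $i := i + 1$.
end while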

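and let $\Psi(\tau)$ denote the set of feasible directions from a point $\tau \in \Upsilon$, i.e.,

$$\Psi(\tau) = \{h \in \mathbb{R}^K : \text{for some } \bar{\zeta} > 0 \text{ and all } \zeta \in [0, \bar{\zeta}),\ \tau + \zeta h \in \Upsilon\}. \tag{6.1.3}$$

Let $\bar{h}(\tau)$ denote the projection of the vector $-\nabla J(\tau)$ onto $\Psi(\tau)$ with

$$\nabla J(\tau) = \left(\frac{\partial J(\tau)}{\partial \tau_1}, \dots, \frac{\partial J(\tau)}{\partial \tau_K}\right) \tag{6.1.4}$$

and $\partial J(\tau)/\partial \tau_k$ given by Theorem 3.2.1, $k = 1, \dots, K$. The gradient descent method in Algorithm 6.1.1 uses the Armijo stepsize rule in this direction.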

It is well known that any limit point of the sequence of iterates $\{\tau^i\}_{i \in \mathbb{N}}$ produced by Algorithm 6.1.1 satisfies the Karush-Kuhn-Tucker optimality conditions and, hence, the optimality condition of Theorem 3.2.2. Moreover, Algorithm 6.1.1 is stable in the sense that it converges for every starting point $\tau^0 \in \Upsilon$, and it has a linear asymptotic convergence rate. For details, see [17] and the references therein or the textbook [32].
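The loop of Algorithm 6.1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used here: the function names `gradient_projection`, `project_1d` and `max_step_1d` are hypothetical, the projection and the largest feasible step are passed in as callables, and the demonstration uses the simplest case $K = 1$, where $\Upsilon = [0, T]$.

```python
import numpy as np

def gradient_projection(J, grad_J, project, max_step, tau0,
                        alpha=0.25, beta=0.5, z_bar=1.0, eps=1e-8, max_iter=200):
    """Gradient projection with Armijo backtracking (sketch of Algorithm 6.1.1).

    project(tau, g)  -> projected descent direction h_bar(tau)   (assumed callable)
    max_step(tau, h) -> largest z with tau + z*h still feasible  (assumed callable)
    """
    tau = np.asarray(tau0, dtype=float)
    for _ in range(max_iter):
        g = grad_J(tau)
        h = project(tau, g)
        if np.linalg.norm(h) <= eps:          # stopping test ||h_bar|| <= eps
            break
        z = min(z_bar, max_step(tau, h))      # z_max of the algorithm
        J_tau, slope = J(tau), float(g @ h)
        # Step a: largest z = z_max * beta**k satisfying the Armijo condition (6.1.5)
        while J(tau + z * h) - J_tau > alpha * z * slope and z > 1e-14:
            z *= beta
        tau = tau + z * h                     # Step b
    return tau

# Demonstration for K = 1, so that the feasible set is the interval [0, T].
T = 1.0

def project_1d(tau, g):
    h = -g.copy()
    # at a bound, directions pointing out of [0, T] are not feasible
    if (tau[0] <= 0.0 and h[0] < 0.0) or (tau[0] >= T and h[0] > 0.0):
        h[0] = 0.0
    return h

def max_step_1d(tau, h):
    if h[0] > 0.0:
        return (T - tau[0]) / h[0]
    if h[0] < 0.0:
        return tau[0] / (-h[0])
    return np.inf

interior = gradient_projection(lambda t: float((t[0] - 0.3) ** 2),
                               lambda t: np.array([2.0 * (t[0] - 0.3)]),
                               project_1d, max_step_1d, [0.8])
boundary = gradient_projection(lambda t: float((t[0] + 1.0) ** 2),
                               lambda t: np.array([2.0 * (t[0] + 1.0)]),
                               project_1d, max_step_1d, [0.5])
```

In the first call the minimizer lies in the interior of $[0, T]$; in the second the unconstrained minimizer is infeasible and the iteration stops at the bound $\tau = 0$, where the projected direction vanishes.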

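Algorithm 6.1.2 (Computing $\bar{h}_i(\tau)$, $i = k, \dots, n$ [17]):

Require: $\tau = (\tau_1, \dots, \tau_K)$, $T$, $\frac{\partial J(\tau)}{\partial \tau_1}, \dots, \frac{\partial J(\tau)}{\partial \tau_K}$.
Initialize: Set $l := k$ and $m := 0$.

while $m < n$ do
    Compute $r_{\max}$ defined by
        $r_{\max} := \max\left\{ \frac{1}{i - k + 1} \sum_{j=k}^{i} \frac{\partial J(\tau)}{\partial \tau_j} : i = l, \dots, n \right\}$. (6.1.6)
    Set $m := \max\left\{ i = k, \dots, n : \frac{1}{i - k + 1} \sum_{j=k}^{i} \frac{\partial J(\tau)}{\partial \tau_j} = r_{\max} \right\}$.
    for all $i \in \{l, \dots, m\}$ do
        if ($r_{\max} > 0$ and $\tau_m = 0$) or ($r_{\max} < 0$ and $\tau_m = T$) then
            $\bar{h}_i(\tau) = 0$
        else
            $\bar{h}_i(\tau) = -r_{\max}$.
        end if
    end for
    Set $l := m + 1$.
end while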

For a given $\tau \in \Upsilon$, the direction $\bar{h}(\tau)$ used in Algorithm 6.1.1 can be computed as follows. Define a block to be a contiguous set of integers $\{k, \dots, n\} \subset \{1, \dots, K\}$ such that $\tau_k = \tau_n$ (and hence $\tau_i = \tau_n$ for all $i = k, \dots, n$).

Clearly, every contiguous subset of a block is also a block, and a block is called maximal if no proper superset of it is a block. The set $\{1, \dots, K\}$ is thus partitioned into disjoint maximal blocks. The computation of $\bar{h}(\tau)$ is then done one block at a time for every maximal block $\{k, \dots, n\}$ using Algorithm 6.1.2.
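The partition into maximal blocks amounts to grouping consecutive equal switching times. A minimal Python sketch follows; the name `maximal_blocks` and the tolerance `tol` are hypothetical, and indices are 0-based, unlike the 1-based indices in the text.

```python
def maximal_blocks(tau, tol=0.0):
    """Return the maximal blocks of tau as (first, last) index pairs, 0-based.

    A block is a contiguous index range on which all switching times coincide
    (up to the assumed tolerance tol); tau is assumed to be sorted.
    """
    blocks = []
    start = 0
    for i in range(1, len(tau) + 1):
        # close the current block at the end of tau or where the value changes
        if i == len(tau) or abs(tau[i] - tau[i - 1]) > tol:
            blocks.append((start, i - 1))
            start = i
    return blocks

blocks = maximal_blocks([0.0, 0.0, 0.3, 0.7, 0.7, 0.7, 1.0])
```

For this input the switching times at positions 0 to 1 and 3 to 5 coincide, so they form two blocks with more than one element, while the remaining times form singleton blocks.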

Observe that if the optimality condition of Theorem 3.2.2 holds, then $\bar{h}(\tau)$ given by Algorithm 6.1.2 satisfies $\bar{h}(\tau) = 0$. If it is not satisfied, then Algorithm 6.1.2 indicates which variables $\tau_i$, $i \in \{k, \dots, n\}$, should be increased and which ones should be decreased. Thus, $\bar{h}(\tau)$ is indeed the projection of $-\nabla J(\tau)$ onto $\Psi(\tau)$.
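The block-wise computation can be sketched as follows for a single maximal block. This is a hedged illustration, not a transcription of Algorithm 6.1.2: the name `project_block` and the tolerance `tol` are hypothetical, indices are 0-based, and the running average is taken from the current start index `l` of the sub-block (the usual pooling form of a projection onto ordered directions), which is an assumption of this sketch.

```python
def project_block(grad, tau_value, T, tol=1e-12):
    """Projected direction on one maximal block (sketch).

    grad      : partial derivatives dJ/dtau_j for the indices of the block
    tau_value : common switching time of the block
    """
    n = len(grad)
    h = [0.0] * n
    l = 0
    while l < n:
        # largest running average of grad[l..i] and the largest index attaining it
        r_max, m, total = None, l, 0.0
        for i in range(l, n):
            total += grad[i]
            avg = total / (i - l + 1)
            if r_max is None or avg >= r_max:   # ">=" keeps the largest maximizer
                r_max, m = avg, i
        # moving against the gradient would leave [0, T]: keep these times fixed
        blocked = (r_max > 0 and tau_value <= tol) or (r_max < 0 and tau_value >= T - tol)
        for i in range(l, m + 1):
            h[i] = 0.0 if blocked else -r_max
        l = m + 1
    return h

pooled = project_block([1.0, 3.0, 2.0], tau_value=0.5, T=1.0)
split = project_block([3.0, 1.0], tau_value=0.5, T=1.0)
at_zero = project_block([1.0, -1.0], tau_value=0.0, T=1.0)
```

In the first call all three coinciding switching times move together, because separating them would violate the ordering (6.1.1). In the second call the block splits, and in the third the first switching time is held at the bound $0$ while the second one is increased.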


[...] is the fact that, in order to evaluate the cost function $J(\tau)$ and the partial derivatives $\partial J(\tau)/\partial \tau_k$, $k = 1, \dots, K$, one needs to discretize and solve the PDE constraint in time and space in every step of Algorithm 6.1.1. In particular, it is necessary to ensure that the discretized solution $\tilde{x}(\cdot, \cdot)$ depends continuously on the optimization variables $\tau_1, \dots, \tau_K$. Careless re-meshing in every step of the optimizer may easily destroy this property. For ODEs, this continuous dependence of the map $(\tau_1, \dots, \tau_K) \mapsto \tilde{x}$ can be achieved by using adaptive time steps $\Delta t$, which may become very small. But in the PDE case under consideration here, for time steps $\Delta t$ much smaller than the discretization step size $h$ of any fixed Eulerian grid in space, the numerical dissipation, e.g.,

$$2\,(\lambda(t, s)\,\Delta t - h)\,\lambda(t, s)\,\frac{\partial^2}{\partial s^2} x(t, s) \tag{6.1.7}$$

for upwind finite differencing discretization schemes, becomes large and causes inaccurate solution approximations.
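The effect can be observed on a small model problem that is not taken from the text: a step profile transported with constant speed by a first-order upwind scheme on a fixed grid. The sketch below uses hypothetical names (`upwind_advect_step`, `smeared_cells`) and compares a time step with $\lambda \Delta t = h$ against one with $\lambda \Delta t = 0.05\,h$.

```python
import numpy as np

def upwind_advect_step(n_cells, cfl, t_end=0.5, speed=1.0):
    """Transport a step profile with first-order upwind differences.

    cfl = speed * dt / h is the ratio of time step to grid size (assumed <= 1).
    """
    h = 1.0 / n_cells
    dt = cfl * h / speed
    n_steps = int(round(t_end / dt))
    s = (np.arange(n_cells) + 0.5) * h          # cell centres of the fixed grid
    x = np.where(s < 0.25, 1.0, 0.0)            # discontinuous initial profile
    for _ in range(n_steps):
        x_left = np.concatenate(([1.0], x[:-1]))  # inflow value 1 at s = 0
        x = x - cfl * (x - x_left)              # upwind update for positive speed
    return s, x

def smeared_cells(x, lo=0.01, hi=0.99):
    """Count cells whose value lies strictly between the two plateau values."""
    return int(np.sum((x > lo) & (x < hi)))

_, sharp = upwind_advect_step(100, cfl=1.0)
_, smeared = upwind_advect_step(100, cfl=0.05)
n_sharp, n_smeared = smeared_cells(sharp), smeared_cells(smeared)
```

With $\lambda \Delta t = h$ the scheme shifts the profile by exactly one cell per step and the jump stays sharp, whereas for the small time step the jump is spread over many cells of the same grid.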

We overcome this difficulty by using a meshfree numerical solver for the transport problem. Points representing the solution are moved according to their characteristic velocity. These meshfree schemes are capable of propagating discontinuities in the solution with correct speed and they are free of numerical dissipation. In the case of a semilinear equation, the method is easy to implement but rarely used. The main ideas are the following.

1. The initial solution is the approximation of the initial data $\bar{x}$ by a finite number of points $s_1 < \dots < s_m \in (0, 1)$ with function values $x_1, \dots, x_m$ for some $m \in \mathbb{N}$.

2. The solution over time is found by

a) Moving each point $s_i$ with speed $\lambda(t, s)$ as suggested by the characteristic equation (2.0.24).

b) Updating the function values $x_i$ by solving an integral formulation of $\dot{x} = f(t, s, u)$, compare (2.0.27).

c) Inserting points where the distance between two points or their distance to $s = 0$ becomes unsatisfyingly large. When points are inserted at $s = 0$, their function value is taken from an approximation of the boundary data $u_{q(t)}(t)$.

d) Dropping all points that are no longer needed, i.e., those with $s_i > 1$.

Algorithm 6.1.3 (Meshfree solver):

Require: $\lambda$, $f$, $\bar{x}$, $u$, $\tau_1, \dots, \tau_K$.
Initialize: $\tau_0 := 0$, $\tau_{K+1} := 1$, $\Delta h := \frac{1}{m}$
    $[s] := [s_1, \dots, s_m]$ with $s_i = i\,\Delta h$
    $[x] := [x_1, \dots, x_m]$ with $x_i = \bar{x}(s_i)$

for $k = 0, \dots, K$ do
    $\Delta t := (\tau_{k+1} - \tau_k)/N$
    for $j = 1, \dots, N$ do
        $t := \tau_k + j\,\Delta t$
        Memorize: $\tilde{x}(t, [s]) := [x]$
        Move: $[s] := [s] + \Delta t\,\lambda(t, [s])$
        Update: $[x] := [x] + \Delta t\,f(t, [s], [x])$
        for all $i$ such that $s_{i+1} - s_i > \Delta h$ do
            Insert: $[s] := \left[[s]_{\le i}, \frac{s_i + s_{i+1}}{2}, [s]_{\ge i+1}\right]$, $[x] := \left[[x]_{\le i}, \frac{x_i + x_{i+1}}{2}, [x]_{\ge i+1}\right]$
        end for
        if $s_1 > \Delta h$ then
            Insert: $[s] := [0, [s]]$, $[x] := [u_{q(t)}(t), [x]]$
        end if
        Drop: $[s] := [s]_{I \setminus J}$ with $I = \{1, \dots, \mathrm{length}([s])\}$, $J = \{i \in I : s_i > 1\}$
    end for
end for

Many efficient adaptive sampling strategies for the initial and boundary data can be used because there is no requirement on the point distribution $s_i$. In particular, one may approximate the boundary data $u_{q(\cdot)}(\cdot)$ at the switching times $\tau_k$ and at a fixed number of equidistantly placed time instances during the interswitching intervals $[\tau_k, \tau_{k+1}]$.

This strategy ensures that the discretized solution depends continuously on the switching times, as desired. The accuracy of the method is determined by the accuracy with which the movement of the points $s_i$ and the updates of the values $x_i$ are realized. In particular, using explicit Euler methods makes the BV solution approximation $\tilde{x}(\cdot, \cdot)$ first-order accurate everywhere. A detailed description of the meshfree solver that we used is given in Algorithm 6.1.3.
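The steps of the meshfree solver (memorize, move, update, insert, drop) can be sketched in Python with NumPy. The sketch rests on several assumptions that are not part of the text: the name `meshfree_solve`, the signature `u(q, t)` for the boundary data in mode `q`, a mode that alternates between 0 and 1 at every switching time, the final time passed as `T`, and function values being dropped together with their points.

```python
import numpy as np

def meshfree_solve(lam, f, x_bar, u, taus, q0=0, T=1.0, m=50, N=20):
    """Move points along characteristics with explicit Euler steps (sketch)."""
    dh = 1.0 / m
    s = dh * np.arange(1, m + 1)                 # s_i = i * dh
    x = np.asarray(x_bar(s), dtype=float)        # values of the initial data
    grid = [0.0, *taus, T]                       # tau_0, tau_1, ..., tau_K, final time
    history = []
    for k in range(len(grid) - 1):
        q = (q0 + k) % 2                         # assumed: mode toggles at each switch
        dt = (grid[k + 1] - grid[k]) / N         # N steps per interswitching interval
        for j in range(1, N + 1):
            t = grid[k] + j * dt
            history.append((t, s.copy(), x.copy()))      # Memorize
            s = s + dt * lam(t, s)                       # Move
            x = x + dt * f(t, s, x)                      # Update
            idx = np.nonzero(np.diff(s) > dh)[0]         # gaps that became too large
            if idx.size:
                s_mid = 0.5 * (s[idx] + s[idx + 1])
                x_mid = 0.5 * (x[idx] + x[idx + 1])
                s = np.insert(s, idx + 1, s_mid)         # Insert midpoints
                x = np.insert(x, idx + 1, x_mid)
            if s.size == 0 or s[0] > dh:
                s = np.insert(s, 0, 0.0)                 # Insert at the boundary s = 0
                x = np.insert(x, 0, u(q, t))
            keep = s <= 1.0                              # Drop points that left (0, 1)
            s, x = s[keep], x[keep]
    return s, x, history

# Hypothetical test case: unit speed, no source term, zero initial data,
# boundary value 1 in mode 1 and 0 in mode 0, one switch at t = 0.4.
s_end, x_end, hist = meshfree_solve(
    lam=lambda t, s: np.ones_like(s),
    f=lambda t, s, x: np.zeros_like(x),
    x_bar=lambda s: np.zeros_like(s),
    u=lambda q, t: float(q),
    taus=[0.4], q0=1, T=0.8)
```

Because each interswitching interval is divided into the same number `N` of steps, the time grid deforms continuously when a switching time is moved. In the test case the boundary value 1 that entered before the switch occupies roughly $0.4 < s < 0.8$ at the final time, with intermediate values only at the few points inserted across the two jumps.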


[...]ization of any other further processing of the solution approximation requires non-standard methods on irregular grids. For similar reasons, a generalization to (fully coupled) systems of equations is not straightforward. Nevertheless, an extension to scalar non-linear conservation laws seems possible, noting that an appropriate particle management for that case is proposed in [18].

For a solution approximation $\tilde{x}(t, [s])$, $t \in [t]$, on the free grid $[t]$, $[s]$ obtained by Algorithm 6.1.3, we approximate the integral part of the cost function $J$ by a rectangle rule, i.e.,

$$\int_0^T \int_0^1 |x(t, s) - x_d(t, s)|^2 \, ds \, dt \approx \sum_{i=1}^{\mathrm{length}([t])-1} \sum_{j=1}^{\mathrm{length}([s])-1} \left| \tilde{x}\!\left(t_i + \tfrac{1}{2}|t_{i+1} - t_i|,\ s_j + \tfrac{1}{2}|s_{j+1} - s_j|\right) - x_d\!\left(t_i + \tfrac{1}{2}|t_{i+1} - t_i|,\ s_j + \tfrac{1}{2}|s_{j+1} - s_j|\right) \right|^2 |s_{j+1} - s_j|\,|t_{i+1} - t_i| \tag{6.1.8}$$

and take $c \bigvee_0^T q(t)\,\delta t$ given by $c$ times the number of essential switches in $q(\cdot)$. We denote the sum of these values by $\tilde{J}(\tilde{x})$ as an approximation of $J(x)$.
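A rectangle rule of the form (6.1.8) can be sketched as follows. The sketch simplifies the setting and rests on assumptions: the names `tracking_cost` and `approximate_cost` are hypothetical, a single spatial grid is used for all times, $\tilde{x}$ and $x_d$ are passed as vectorized callables that can be evaluated at cell midpoints, and the number of essential switches is supplied by the caller.

```python
import numpy as np

def tracking_cost(x_tilde, x_d, t_grid, s_grid):
    """Midpoint rectangle rule for the double integral of |x_tilde - x_d|^2."""
    t = np.asarray(t_grid, dtype=float)
    s = np.asarray(s_grid, dtype=float)
    dt = np.abs(np.diff(t))                      # |t_{i+1} - t_i|
    ds = np.abs(np.diff(s))                      # |s_{j+1} - s_j|
    t_mid = t[:-1] + 0.5 * dt
    s_mid = s[:-1] + 0.5 * ds
    tt, ss = np.meshgrid(t_mid, s_mid, indexing="ij")
    err2 = (x_tilde(tt, ss) - x_d(tt, ss)) ** 2
    return float(np.sum(err2 * dt[:, None] * ds[None, :]))

def approximate_cost(x_tilde, x_d, t_grid, s_grid, c, n_essential_switches):
    """Tracking part plus c times the number of essential switches."""
    return tracking_cost(x_tilde, x_d, t_grid, s_grid) + c * n_essential_switches

# Check against a case with known integral: x_tilde = t + s, x_d = 0 on [0, 1]^2.
grid = np.linspace(0.0, 1.0, 101)
value = tracking_cost(lambda t, s: t + s, lambda t, s: np.zeros_like(t), grid, grid)
total = approximate_cost(lambda t, s: t + s, lambda t, s: np.zeros_like(t),
                         grid, grid, c=0.1, n_essential_switches=2)
```

For this check the exact value of the integral is $7/6$, which the rule reproduces up to the discretization error of the grid.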

Similarly, for an approximation $\tilde{d}_k$ of $\partial J(\tau)/\partial \tau_k$ given by (3.2.7) in Theorem 3.2.1 for $k = 1, \dots, K$, we use an explicit Euler scheme on the time grid $[t]$ to obtain an approximation of $s^*(\cdot\,; \tau_k, 0)$ as a solution of the ODE (2.0.24) through the point $(\tau_k, 0)$. An approximation of $\frac{\partial}{\partial \tau_k} s^*(\cdot\,; \tau_k, 0)$ for the evaluation of (3.2.7) is obtained by solving (3.2.11) in Proposition 3.2.1 approximately by means of a central finite difference

$$\frac{\partial}{\partial s} \lambda(\theta, s) \approx \left(\lambda(\theta, s_{i+1}) - \lambda(\theta, s_{i-1})\right) / \Delta h \tag{6.1.9}$$

and using explicit Euler backward in time on the same time grid $[t]$. Finally, the integration in (3.2.7) in order to obtain $\tilde{d}_k$ is also realized by the rectangle rule on $[t]$.
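Two ingredients of this gradient approximation can be sketched in Python: tracing the characteristic through $(\tau_k, 0)$ with explicit Euler steps, and a symmetric difference quotient for $\partial \lambda / \partial s$. The sketch assumes that the characteristic equation has the form $\dot{s} = \lambda(t, s)$, in line with the points being moved with speed $\lambda$; the names `trace_characteristic` and `dlam_ds` and the offset `delta` are hypothetical. The quotient divides by the distance between its two evaluation points. The sensitivity equation (3.2.11) is not restated here and is therefore not part of the sketch.

```python
import numpy as np

def trace_characteristic(lam, tau_k, t_grid):
    """Explicit Euler for ds/dt = lam(t, s) with s(tau_k) = 0 (assumed form)."""
    times = [tau_k] + [t for t in t_grid if t > tau_k]   # grid times after tau_k
    s = [0.0]
    for t0, t1 in zip(times[:-1], times[1:]):
        s.append(s[-1] + (t1 - t0) * lam(t0, s[-1]))     # one Euler step
    return np.array(times), np.array(s)

def dlam_ds(lam, theta, s, delta):
    """Symmetric difference quotient for the derivative of lam with respect to s."""
    return (lam(theta, s + delta) - lam(theta, s - delta)) / (2.0 * delta)

# Hypothetical speed with known characteristic s(t) = exp(t - tau_k) - 1.
speed = lambda t, s: 1.0 + s
times, path = trace_characteristic(speed, 0.5, np.linspace(0.0, 1.0, 1001))
slope = dlam_ds(speed, 0.0, 0.3, 0.01)
```

For the chosen speed the exact end point of the characteristic at $t = 1$ is $e^{0.5} - 1$, and the Euler approximation approaches it at first order in the time step.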

The gradient projection method in Algorithms 6.1.1 and 6.1.2 with $x$, $J(\tau)$ and $\partial J(\tau)/\partial \tau_k$ replaced by the respective approximations $\tilde{x}$, $\tilde{J}(\tau)$ and $\tilde{d}_k$ obtained by Algorithm 6.1.3 and the formulas (6.1.8) and (6.1.9), [...] viability. The results for two numerical examples are summarized in Section 6.3.

6.2 Direct Approach: A mixed integer formulation

In document: dissertation submitted to the Faculty of Sciences of Friedrich-Alexander-Universität Erlangen-Nürnberg for the award of the doctoral degree Dr. rer. nat. (pages 108-114)