switching boundary controls
6.1 An indirect approach using gradient information
The proposed indirect method is based on the key idea already discussed in Section 3.1. We use the a-priori bound $K$ given by (3.1.4) in Theorem 3.1.1 for the number of essential switches in the optimal $q^*(\cdot)$, fix $q(0)$ by $q_0 \in Q = \{0, 1\}$, and thus all essential subsequent modes, and then obtain $\tilde q^*(\cdot)$ by considering the switching times $\tau_k$, $k = 1, \dots, K$, as continuous optimization variables subject to the inequality constraints
\[ 0 \le \tau_1 \le \tau_2 \le \dots \le \tau_K \le T. \tag{6.1.1} \]
In order to compute a point $(\tau_1, \dots, \tau_K)$ satisfying the optimality condition of Theorem 3.2.2, we adapt a gradient projection method with an Armijo stepsize suggested in [17] for switching ordinary differential equations. To this end, let $\Upsilon$ denote the feasible set, i.e.,
\[ \Upsilon = \{\tau \in \mathbb{R}^K : 0 \le \tau_1 \le \tau_2 \le \dots \le \tau_K \le T\}, \tag{6.1.2} \]
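Algorithm 6.1.1 (Gradient-projection method [17]):
Require: $0 < \alpha < 0.5$, $0 < \beta < 1$, $\bar z > 0$, $\varepsilon > 0$.
Initialize: Choose $\tau^0 = (\tau^0_1, \dots, \tau^0_K)$. Set $i := 0$.
while $\|\bar h(\tau^i)\| > \varepsilon$ do
    if $\tau^i + \bar z\,\bar h(\tau^i) \notin \Upsilon$ then
        $z_{\max} := \max\{z > 0 : \tau^i + z\,\bar h(\tau^i) \in \Upsilon\}$.
    else
        set $z_{\max} := \bar z$.
    end if
    Step a: Compute a stepsize $\zeta_i$ by
        $\zeta_i = \max\{z = z_{\max}\beta^k : k = 0, 1, 2, \dots\}$
        subject to $J(\tau^i + z\,\bar h(\tau^i)) - J(\tau^i) \le \alpha z\,\nabla J(\tau^i)^\top \bar h(\tau^i)$. (6.1.5)
    Step b: Set $\tau^{i+1} := \tau^i + \zeta_i\,\bar h(\tau^i)$, $i := i + 1$.
end while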
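and let $\Psi(\tau)$ denote the set of feasible directions from a point $\tau \in \Upsilon$, i.e.,
\[ \Psi(\tau) = \{h \in \mathbb{R}^K : \text{for some } \bar\zeta > 0 \text{ and all } \zeta \in [0, \bar\zeta),\ \tau + \zeta h \in \Upsilon\}. \tag{6.1.3} \]
Let $\bar h(\tau)$ denote the projection of the vector $-\nabla J(\tau)$ onto $\Psi(\tau)$ with
\[ \nabla J(\tau) = \left( \frac{\partial J(\tau)}{\partial \tau_1}, \dots, \frac{\partial J(\tau)}{\partial \tau_K} \right) \tag{6.1.4} \]
and $\partial J(\tau)/\partial \tau_k$ given by Theorem 3.2.1, $k = 1, \dots, K$. The gradient descent method in Algorithm 6.1.1 uses the Armijo stepsize rule in this direction.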
It is well known that any limit point of the sequence of iterates $\{\tau^i\}_{i \in \mathbb{N}}$ produced by Algorithm 6.1.1 satisfies the Karush-Kuhn-Tucker optimality conditions and, hence, the optimality condition of Theorem 3.2.2. Moreover, Algorithm 6.1.1 is stable in the sense that it converges for every starting point $\tau^0 \in \Upsilon$, and it has a linear asymptotic convergence rate. For details, see [17] and the references therein or the textbook [32].
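The structure of Algorithm 6.1.1 can be illustrated by a minimal Python sketch. It is not the implementation used here: the function names, the iteration caps `max_backtracks` and `max_iter`, and the callable `direction` (which is assumed to return the projected direction $\bar h(\tau)$) are hypothetical, and the toy cost with a single switching time only serves as a usage example.

```python
import math

def max_feasible_step(tau, h, T, z_bar):
    """Largest z in [0, z_bar] keeping 0 <= tau_1 + z*h_1 <= ... <= tau_K + z*h_K <= T."""
    a = [0.0] + list(tau) + [T]      # pad with the fixed bounds 0 and T
    d = [0.0] + list(h) + [0.0]      # the bounds themselves do not move
    z_max = z_bar
    for j in range(len(a) - 1):
        gap = a[j + 1] - a[j]
        closing = d[j] - d[j + 1]    # rate at which neighbouring entries approach
        if closing > 0.0:
            z_max = min(z_max, max(gap, 0.0) / closing)
    return z_max

def armijo_stepsize(J, tau, h, grad, z_max, alpha, beta, max_backtracks=60):
    """Backtracking search for the Armijo condition (6.1.5), starting at z_max."""
    slope = sum(g * hi for g, hi in zip(grad, h))
    J0 = J(tau)
    z = z_max
    for _ in range(max_backtracks):  # cap added for safety (assumption)
        trial = [t + z * hi for t, hi in zip(tau, h)]
        if J(trial) - J0 <= alpha * z * slope:
            return z
        z *= beta
    return z

def gradient_projection(J, grad_J, direction, tau0, T,
                        alpha=0.25, beta=0.5, z_bar=1.0, eps=1e-8, max_iter=500):
    """Iterate tau := tau + zeta * h_bar(tau) until ||h_bar(tau)|| <= eps."""
    tau = list(tau0)
    for _ in range(max_iter):
        grad = grad_J(tau)
        h = direction(tau, grad)
        if math.sqrt(sum(hi * hi for hi in h)) <= eps:
            break
        z_max = max_feasible_step(tau, h, T, z_bar)
        zeta = armijo_stepsize(J, tau, h, grad, z_max, alpha, beta)
        tau = [t + zeta * hi for t, hi in zip(tau, h)]
    return tau

# Toy usage with K = 1 (hypothetical cost, not from the text).
def toy_J(tau):
    return (tau[0] - 0.3) ** 2

def toy_grad(tau):
    return [2.0 * (tau[0] - 0.3)]

def toy_direction(tau, grad, T=1.0):
    h = -grad[0]
    if (tau[0] <= 0.0 and h < 0.0) or (tau[0] >= T and h > 0.0):
        h = 0.0                      # do not leave [0, T]
    return [h]

if __name__ == "__main__":
    print(gradient_projection(toy_J, toy_grad, toy_direction, [0.8], 1.0))
```

In the sketch the stepsize bound $z_{\max}$ is obtained in closed form, because the constraints (6.1.1) are linear in the step length.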
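Algorithm 6.1.2 (Computing $\bar h_i(\tau)$, $i = k, \dots, n$ [17]):
Require: $\tau = (\tau_1, \dots, \tau_K)$, $T$, $\frac{\partial J(\tau)}{\partial \tau_1}, \dots, \frac{\partial J(\tau)}{\partial \tau_K}$.
Initialize: Set $l := k$ and $m := 0$.
while $m < n$ do
    Compute $r_{\max}$ defined by
        $r_{\max} := \max\left\{ \frac{1}{i - k + 1} \sum_{j=k}^{i} \frac{\partial J(\tau)}{\partial \tau_j} : i = l, \dots, n \right\}$. (6.1.6)
    Set $m := \max\left\{ i = k, \dots, n : \frac{1}{i - k + 1} \sum_{j=k}^{i} \frac{\partial J(\tau)}{\partial \tau_j} = r_{\max} \right\}$.
    for all $i \in \{l, \dots, m\}$ do
        if ($r_{\max} > 0$ and $\tau_m = 0$) or ($r_{\max} < 0$ and $\tau_m = T$) then
            $\bar h_i(\tau) = 0$
        else
            $\bar h_i(\tau) = -r_{\max}$.
        end if
    end for
    Set $l := m + 1$.
end while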
For a given $\tau \in \Upsilon$, $\bar h(\tau)$ used in Algorithm 6.1.1 can be computed as follows. Define a block to be a contiguous set of integers $\{k, \dots, n\} \subset \{1, \dots, K\}$ such that $\tau_k = \tau_n$ (and hence $\tau_i = \tau_n$ for all $i = k, \dots, n$).
Clearly, every contiguous subset of a block is also a block, and a block is called maximal if no proper superset thereof is a block. The set $\{1, \dots, K\}$ is partitioned into disjoint maximal blocks. The computation of $\bar h(\tau)$ is then done one block at a time for every maximal block $\{k, \dots, n\}$ using Algorithm 6.1.2.
Observe that if the optimality condition of Theorem 3.2.2 holds, then $\bar h(\tau)$ given by Algorithm 6.1.2 satisfies $\bar h(\tau) = 0$. If it is not satisfied, then Algorithm 6.1.2 indicates which variables $\tau_i$, $i \in \{k, \dots, n\}$, should be increased and which ones should be decreased. Thus, $\bar h(\tau)$ indeed is the projection of $-\nabla J(\tau)$ onto $\Psi(\tau)$.
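The block-wise computation can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of [17]: all function names are hypothetical, indices are zero-based, and in each pass the averages are taken from the current start index $l$ rather than from $k$. The two choices agree in the first pass; starting at $l$ is an assumption of the sketch, chosen so that the direction is nondecreasing within each block.

```python
def maximal_blocks(tau, tol=0.0):
    """Partition indices 0..K-1 into maximal runs of (numerically) equal tau values."""
    blocks, start = [], 0
    for i in range(1, len(tau) + 1):
        if i == len(tau) or abs(tau[i] - tau[start]) > tol:
            blocks.append((start, i - 1))
            start = i
    return blocks

def project_block(grad, tau_b, T, k, n):
    """Direction entries for one maximal block {k, ..., n} with common value tau_b."""
    h = {}
    l = k
    while l <= n:
        best, m, running = None, l, 0.0
        for i in range(l, n + 1):
            running += grad[i]
            avg = running / (i - l + 1)   # average from l (assumption, see lead-in)
            if best is None or avg >= best:
                best, m = avg, i          # '>=' keeps the largest maximizing index
        for i in range(l, m + 1):
            if (best > 0.0 and tau_b <= 0.0) or (best < 0.0 and tau_b >= T):
                h[i] = 0.0                # moving would leave [0, T]
            else:
                h[i] = -best
        l = m + 1
    return h

def projected_direction(tau, grad, T, tol=0.0):
    """Assemble the direction block by block."""
    h = [0.0] * len(tau)
    for k, n in maximal_blocks(tau, tol):
        for i, value in project_block(grad, tau[k], T, k, n).items():
            h[i] = value
    return h

if __name__ == "__main__":
    tau = [0.2, 0.5, 0.5, 0.5, 1.0]
    grad = [1.0, 3.0, 1.0, 2.0, -4.0]
    print(maximal_blocks(tau))
    print(projected_direction(tau, grad, T=1.0))
```

In this example the three coinciding switching times form one maximal block. The first of them is moved backward on its own, while the other two share the averaged value and stay together.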
is the fact that, in order to evaluate the cost function $J(\tau)$ and the partial derivatives $\partial J(\tau)/\partial \tau_k$, $k = 1, \dots, K$, one needs to discretize and solve the PDE constraint in time and space in every step of Algorithm 6.1.1. In particular, it is necessary to ensure that the discretized solution $\tilde x(\cdot, \cdot)$ depends continuously on the optimization variables $\tau_1, \dots, \tau_K$. Careless re-meshing in every step of the optimizer may easily destroy this property. For ODEs, this continuous dependence of the map $(\tau_1, \dots, \tau_K) \mapsto \tilde x$ can be achieved by using adaptive time steps $\Delta t$, which may become very small. But in the PDE case under consideration here, for time steps $\Delta t$ much smaller than the discretization step size $h$ of any fixed Eulerian grid in space, the numerical dissipation, e.g.,
\[ \frac{1}{2} \left( \lambda(t, s)\Delta t - h \right) \lambda(t, s)\, \frac{\partial^2}{\partial s^2} x(t, s) \tag{6.1.7} \]
for upwind finite differencing discretization schemes, becomes large and causes inaccurate solution approximations.
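To illustrate how the coefficient in (6.1.7) behaves, the following small Python sketch evaluates it for a constant speed and several ratios $\lambda \Delta t / h$. The function name and the sample values are hypothetical and only serve as an example.

```python
def upwind_dissipation_coefficient(lam, dt, h):
    """Coefficient 0.5 * (lam*dt - h) * lam multiplying the second derivative in (6.1.7)."""
    return 0.5 * (lam * dt - h) * lam

if __name__ == "__main__":
    lam, h = 1.0, 0.01  # sample speed and spatial step (assumptions)
    for ratio in (1.0, 0.5, 0.1, 0.001):
        dt = ratio * h / lam
        print(ratio, upwind_dissipation_coefficient(lam, dt, h))
```

The coefficient vanishes for $\lambda \Delta t = h$, and its magnitude approaches $\frac{1}{2} \lambda h$ as $\Delta t \to 0$, so shrinking the time step on a fixed spatial grid does not reduce the dissipation.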
We overcome this difficulty by using a meshfree numerical solver for the transport problem. Points representing the solution are moved according to their characteristic velocity. These meshfree schemes are capable of propagating discontinuities in the solution with correct speed and they are free of numerical dissipation. In the case of a semilinear equation, the method is easy to implement but rarely used. The main ideas are the following.
1. The initial solution is the approximation of the initial data $\bar x$ by a finite number of points $s_1 < \dots < s_m \in (0, 1)$ with function values $x_1, \dots, x_m$ for some $m \in \mathbb{N}$.
2. The solution over time is found by
   a) Moving each point $s_i$ with speed $\lambda(t, s)$ as suggested by the characteristic equation (2.0.24).
   b) Updating the function values $x_i$ by solving an integral formulation of $\dot x = f(t, s, u)$, compare (2.0.27).
   c) Inserting points where the distance between two points or their distance to $s = 0$ becomes too large. When points are inserted at $s = 0$, their function value is taken from an approximation of the boundary data $u_{q(t)}(t)$.
   d) Dropping all points that are no longer needed, i.e., those with $s_i > 1$.
Algorithm 6.1.3 (Meshfree solver):
Require: $\lambda$, $f$, $\bar x$, $u$, $\tau_1, \dots, \tau_K$.
Initialize: $\tau_0 := 0$, $\tau_{K+1} := 1$, $\Delta h := 1/m$,
    $[s] := [s_1, \dots, s_m]$ with $s_i = i\,\Delta h$,
    $[x] := [x_1, \dots, x_m]$ with $x_i = \bar x(s_i)$.
for $k = 0, \dots, K$ do
    $\Delta t := (\tau_{k+1} - \tau_k)/N$
    for $j = 1, \dots, N$ do
        $t := \tau_k + j\,\Delta t$
        Memorize: $\tilde x(t, [s]) := [x]$
        Move: $[s] := [s] + \Delta t\,\lambda(t, [s])$
        Update: $[x] := [x] + \Delta t\,f(t, [s], [x])$
        for all $i$ such that $s_{i+1} - s_i > \Delta h$ do
            Insert: $[s] := [[s]_{\le i}, \frac{s_i + s_{i+1}}{2}, [s]_{\ge i+1}]$, $[x] := [[x]_{\le i}, \frac{x_i + x_{i+1}}{2}, [x]_{\ge i+1}]$
        end for
        if $s_1 > \Delta h$ then
            Insert: $[s] := [0, [s]]$, $[x] := [u_{q(t)}(t), [x]]$
        end if
        Drop: $[s] := [s]_{I \setminus J}$, $[x] := [x]_{I \setminus J}$ with $I = \{1, \dots, \mathrm{length}([s])\}$, $J = \{i \in I : s_i > 1\}$
    end for
end for
Many efficient adaptive sampling strategies for the initial and boundary data can be used because there is no requirement on the point distribution $s_i$. In particular, one may approximate the boundary data $u_{q(\cdot)}(\cdot)$ at the switching times $\tau_k$ and at a fixed number of equidistantly placed time instances during the interswitching intervals $[\tau_k, \tau_{k+1}]$. This strategy ensures that the discretized solution depends continuously on the switching times as desired. The method is as accurate as the schemes used to realize the movement of $s_i$ and the updates of $x_i$. In particular, using explicit Euler methods makes the BV solution approximation $\tilde x(\cdot, \cdot)$ first order accurate everywhere. A detailed description of the meshfree solver that we used is given in Algorithm 6.1.3.
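A compact Python sketch of such a meshfree solver is given below. It follows the move, update, insert and drop steps described above, but it is an illustration only: the function name, the default values of `m` and `N`, the explicit final time `T`, and the callable `u(t)` (assumed to return the boundary value of the currently active mode) are hypothetical. The snapshot is stored at the beginning of each step, which is a choice of the sketch.

```python
def meshfree_solve(lam, f, x_bar, u, taus, T, m=20, N=10):
    """Explicit Euler along characteristics; returns final points, values and snapshots."""
    dh = 1.0 / m
    s = [i * dh for i in range(1, m + 1)]       # initial points s_i = i * dh
    x = [x_bar(si) for si in s]                 # sampled initial data
    grid = [0.0] + list(taus) + [T]             # tau_0, tau_1, ..., tau_K, final time
    snapshots = []
    for k in range(len(grid) - 1):
        dt = (grid[k + 1] - grid[k]) / N        # N steps per interswitching interval
        if dt <= 0.0:
            continue                            # coinciding switching times
        for j in range(N):
            t = grid[k] + j * dt                # time at the start of the step
            snapshots.append((t, list(s), list(x)))
            # Move the points, then update the values at the moved positions.
            s = [si + dt * lam(t, si) for si in s]
            x = [xi + dt * f(t, si, xi) for si, xi in zip(s, x)]
            # Insert midpoints where neighbouring points are further apart than dh.
            new_s, new_x = s[:1], x[:1]
            for i in range(1, len(s)):
                if s[i] - s[i - 1] > dh:
                    new_s.append(0.5 * (s[i - 1] + s[i]))
                    new_x.append(0.5 * (x[i - 1] + x[i]))
                new_s.append(s[i])
                new_x.append(x[i])
            s, x = new_s, new_x
            # Insert a boundary point at s = 0 carrying the boundary data.
            if not s or s[0] > dh:
                s.insert(0, 0.0)
                x.insert(0, u(t + dt))
            # Drop the points that have left the domain.
            keep = [i for i, si in enumerate(s) if si <= 1.0]
            s = [s[i] for i in keep]
            x = [x[i] for i in keep]
    return s, x, snapshots

if __name__ == "__main__":
    # Pure transport with unit speed: the boundary value 1 enters a zero initial state.
    s, x, _ = meshfree_solve(lambda t, s: 1.0, lambda t, s, x: 0.0,
                             lambda s: 0.0, lambda t: 1.0, taus=[0.25], T=0.5)
    print(list(zip(s, x)))
```

In the usage example the jump between the boundary value and the initial state travels with the correct speed. Only the midpoint inserted across the jump carries an intermediate value.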
ization of any other further processing of the solution approximation requires non-standard methods on irregular grids. For similar reasons, a generalization to (fully coupled) systems of equations is not straightforward. Nevertheless, an extension to scalar nonlinear conservation laws seems possible, noting that an appropriate particle management for that case is proposed in [18].
For a solution approximation $\tilde x(t, [s])$, $t \in [t]$, on the free grid $[t]$, $[s]$ obtained by Algorithm 6.1.3, we approximate the integral part of the cost function $J$ by a rectangle rule, i.e.,
\[ \int_0^T \int_0^1 |x(t, s) - x_d(t, s)|^2 \, ds \, dt \approx \sum_{i=1}^{\mathrm{length}([t])-1} \sum_{j=1}^{\mathrm{length}([s])-1} \left| \tilde x\!\left(t_i + \tfrac{1}{2}|t_{i+1} - t_i|,\, s_j + \tfrac{1}{2}|s_{j+1} - s_j|\right) - x_d\!\left(t_i + \tfrac{1}{2}|t_{i+1} - t_i|,\, s_j + \tfrac{1}{2}|s_{j+1} - s_j|\right) \right|^2 |s_{j+1} - s_j|\,|t_{i+1} - t_i| \tag{6.1.8} \]
and take $c \bigvee_0^T q(t)\,\delta t$ given by $c$ times the number of essential switches in $q(\cdot)$. We denote the sum of these values by $\tilde J(\tilde x)$ as an approximation of $J(x)$.
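The quadrature (6.1.8) together with the switching cost can be sketched in Python as follows. The sketch makes simplifying assumptions that are not part of the text: `x_tilde` and `x_d` are callables that can be evaluated at the cell midpoints (for the free grid this requires some interpolation of the point values), a single spatial grid `s_grid` is used for all times, and all names are hypothetical.

```python
def tracking_cost(x_tilde, x_d, t_grid, s_grid):
    """Midpoint evaluation of |x_tilde - x_d|^2 on each cell, weighted by the cell area."""
    total = 0.0
    for i in range(len(t_grid) - 1):
        dt = abs(t_grid[i + 1] - t_grid[i])
        t_mid = t_grid[i] + 0.5 * dt
        for j in range(len(s_grid) - 1):
            ds = abs(s_grid[j + 1] - s_grid[j])
            s_mid = s_grid[j] + 0.5 * ds
            diff = x_tilde(t_mid, s_mid) - x_d(t_mid, s_mid)
            total += diff * diff * ds * dt
    return total

def approximate_cost(x_tilde, x_d, t_grid, s_grid, c, n_switches):
    """Tracking term plus c times the number of essential switches."""
    return tracking_cost(x_tilde, x_d, t_grid, s_grid) + c * n_switches

if __name__ == "__main__":
    t_grid = [0.0, 0.3, 1.0]          # irregular grids are allowed
    s_grid = [0.0, 0.25, 0.5, 1.0]
    print(approximate_cost(lambda t, s: 2.0, lambda t, s: 0.0,
                           t_grid, s_grid, c=0.5, n_switches=2))
```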
Similarly, for an approximation $\tilde d_k = \tilde d(\tau_k)$ of $\partial J(\tau)/\partial \tau_k$ given by (3.2.7) in Theorem 3.2.1 for $k = 1, \dots, K$, we use an explicit Euler scheme on the time grid $[t]$ to obtain an approximation of $s^*(\cdot; \tau_k, 0)$ as a solution of the ODE (2.0.24) through the point $(\tau_k, 0)$. An approximation of $\frac{\partial}{\partial \tau_k} s^*(\cdot; \tau_k, 0)$ for the evaluation of (3.2.7) is obtained by solving (3.2.11) in Proposition 3.2.1 approximately by means of a central finite difference
\[ \frac{\partial}{\partial s} \lambda(\theta, s) \approx \left( \lambda(\theta, s_{i+1}) - \lambda(\theta, s_{i-1}) \right) / \Delta h \tag{6.1.9} \]
and using explicit Euler backward in time on the same time grid $[t]$. Finally, the integration in (3.2.7) in order to obtain $\tilde d_k$ is also realized by the rectangle rule on $[t]$.
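The first of these steps, the explicit Euler approximation of the characteristic through $(\tau_k, 0)$, can be sketched in Python as follows. The sketch assumes that the characteristic equation has the form $\dot s = \lambda(t, s)$ and that the time grid is sorted in ascending order. The function name and the early stop once the curve leaves the spatial domain are hypothetical choices.

```python
def characteristic_from_boundary(lam, tau_k, t_grid):
    """Explicit Euler for ds/dt = lam(t, s) with s(tau_k) = 0 on the grid points after tau_k."""
    ts, ss = [tau_k], [0.0]
    for t in t_grid:
        if t <= tau_k:
            continue                       # only grid points after the switching time
        s_next = ss[-1] + (t - ts[-1]) * lam(ts[-1], ss[-1])
        ts.append(t)
        ss.append(s_next)
        if s_next >= 1.0:
            break                          # the characteristic has left the domain
    return ts, ss

if __name__ == "__main__":
    ts, ss = characteristic_from_boundary(lambda t, s: 2.0, 0.1, [0.0, 0.2, 0.4, 0.6, 0.8])
    print(list(zip(ts, ss)))
```

For a constant speed the Euler iterates lie on the exact characteristic $s = \lambda\,(t - \tau_k)$, which gives a simple check of the implementation.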
The gradient projection method in Algorithms 6.1.1 and 6.1.2 with $x$, $J(\tau)$ and $\partial J(\tau)/\partial \tau_k$ replaced by the respective approximations $\tilde x$, $\tilde J(\tau)$ and $\tilde d_k$ obtained by Algorithm 6.1.3 and the formulas (6.1.8) and (6.1.9),
viability. The results for two numerical examples are summarized in Section 6.3.