Dimension Reduction for Systems with Slow Relaxation
SIAM DS17 May 24, 2015
Raman Venkataramani and Juan Restrepo Shankar Venkataramani
‘Oil’ consists of $I$ distinct species with concentrations $c_i(t)$, $i = 1, 2, \ldots, I$, each decaying at a constant rate $\alpha_i$:
$$\partial_t c_i(t) = -\alpha_i c_i(t), \quad \alpha_i > 0, \quad 1 \le i \le I.$$
Single observable: $M(t)$ is a weighted average of the concentrations $c_i$, with weights $\beta_i$:
$$M(t) = \sum_i \beta_i c_i(t) = \sum_i \beta_i c_i(0) e^{-\alpha_i t}.$$
Impractical/impossible to separately measure the concentrations/amounts ci of all the individual species.
Question: Can we use the measured quantity $M(t)$ to extract the various decay rates $\alpha_i$ using nonlinear fitting?
No!
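One way to see why the answer is negative: quite different sets of decay rates can produce nearly indistinguishable curves $M(t)$. The sketch below is only an illustration with hypothetical rates and weights (none of the numbers come from the talk). It compares a single species decaying at rate 0.5 with an even mixture of two species decaying at rates 0.45 and 0.55.

```python
import math

def total_mass(rates, weights, t):
    """M(t) = sum_i weights[i] * exp(-rates[i] * t), with c_i(0) absorbed into the weights."""
    return sum(b * math.exp(-a * t) for a, b in zip(rates, weights))

# Hypothetical example values, chosen only for illustration.
times = [0.1 * k for k in range(301)]            # t in [0, 30]
one_species = [total_mass([0.5], [1.0], t) for t in times]
two_species = [total_mass([0.45, 0.55], [0.5, 0.5], t) for t in times]

max_gap = max(abs(a - b) for a, b in zip(one_species, two_species))
print(f"largest difference between the two curves: {max_gap:.4f}")
```

With these values the two curves never differ by more than about 0.3% of the initial mass, so even small measurement noise would make the two rate sets hard to tell apart by fitting.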
Continuum limit
• One cannot hope to extract the decay rates $\alpha_i$, $i = 1, 2, \ldots, I$, from the measured function $M(t)$.
• We therefore consider the complementary limit, where the number of distinct species $I \gg 1$.
The model: Linear evaporation process
$$\partial_t \rho(w, t) = -w \rho(w, t), \qquad M(t) = \int_0^1 \rho(w, t)\, dw.$$
Nondimensional evaporation rate: $0 \le w \le 1$. Continuum limit: $c_i \to \rho(w)$.
$\rho(w, 0)$ is “random” and $E[\rho(w, 0)] = 1$.
Schrödinger picture of the evolution of the system:
$$\rho(w, t) = \rho(w, 0) e^{-wt}.$$
“Dual” Heisenberg picture:
Observable:
$$G(t) = \int g(w) \rho(w, t)\, dw = \int g(w) e^{-wt} \rho(w, 0)\, dw = \int \cdots$$
Evolution of the total mass
Discrete time setting
Discrete time = Takens delay-coordinate embedding
$$\rho_{n+1}(w) = \Lambda^T \rho_n(w), \qquad g_{(n+1)\tau}(w) = \Lambda g_{n\tau}(w),$$
$$\Lambda : C([0, 1]) \to C([0, 1]), \qquad \Lambda g(w) = e^{-w\tau} g(w).$$
$\Lambda[1] \ne 1$, so $\Lambda$ is not the Koopman operator for a dynamical system!
Nonetheless, we can “formally” apply the Mori-Zwanzig projection operator technique.
$$E[M_n] = \int_0^1 E[\rho_n(w)]\, dw = \int_0^1 e^{-nw\tau}\, dw = \frac{1 - e^{-n\tau}}{n\tau}.$$
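The closed form for the expected mass can be checked numerically. The following is a minimal sketch, assuming a hypothetical sampling interval `tau = 0.5` and a plain midpoint rule. The function names are illustrative and not part of the talk.

```python
import math

def expected_mass_closed_form(n, tau):
    """E[M_n] = (1 - exp(-n*tau)) / (n*tau), with the n = 0 limit equal to 1."""
    x = n * tau
    return 1.0 if x == 0 else (1.0 - math.exp(-x)) / x

def expected_mass_quadrature(n, tau, cells=20000):
    """Midpoint rule for the integral of exp(-n*w*tau) over 0 <= w <= 1."""
    dw = 1.0 / cells
    return sum(math.exp(-n * tau * (i + 0.5) * dw) for i in range(cells)) * dw

tau = 0.5  # hypothetical sampling interval
for n in (0, 1, 10, 100):
    print(n, expected_mass_closed_form(n, tau), expected_mass_quadrature(n, tau))
```

For large $n\tau$ the expected mass decays like $1/(n\tau)$, which is algebraic rather than exponential. This is the slow relaxation the talk is concerned with.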
Mori-Zwanzig projection
$$M_n = \sum_{k=1}^{n} h_k M_{n-k} + \xi_n,$$
$h_k$ = memory kernel, $\xi_n$ = orthogonal dynamics (“noise”).
This equation is exact. Intuition: it is a good place to start approximating.
The memory kernel can be solved for analytically:
$$h_k \sim \frac{1}{k \log^2(k)} \quad \text{as } k \to \infty.$$
Although $\sum_k h_k$ converges to $\hat H(1) = 1$, the partial sums approach 1 extremely slowly: $1 - \sum_{k=1}^{N} h_k \sim \log(N)^{-1}$.
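To get a feel for how slow this convergence is, the sketch below sums a surrogate kernel. It assumes $h_k = 1/(k \log^2 k)$ exactly for $k \ge 2$, which is only the stated large-$k$ behaviour and not the true kernel, and compares the tail of the sum with the integral estimate $1/\log N - 1/\log K$. The truncation point `K` is an arbitrary choice for the illustration.

```python
import math

def surrogate_kernel(k):
    """Assumed stand-in for h_k: only the large-k behaviour 1/(k log^2 k) is used."""
    return 1.0 / (k * math.log(k) ** 2)

def tail_sum(start, stop):
    """Sum of the surrogate kernel over start <= k < stop."""
    return sum(surrogate_kernel(k) for k in range(start, stop))

K = 200_000  # truncation point for the illustration
for N in (10, 100, 1000, 10_000):
    tail = tail_sum(N + 1, K + 1)
    estimate = 1.0 / math.log(N) - 1.0 / math.log(K)
    print(f"N={N:6d}  tail={tail:.5f}  1/log(N) - 1/log(K)={estimate:.5f}")
```

Increasing $N$ by a factor of ten only shrinks the remaining tail by a modest amount, so truncating the memory sum at any practical length discards a non-negligible part of the kernel.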
Filtering, estimation and prediction
Given a sequence of noisy measurements $\tilde M_k = \int \rho_k\, dw + \eta_k$, where the $\eta_k$ are uncorrelated normal variates.
Question: What is the “best” prediction for $M_n$ in terms of the measurements $\tilde M_k$ for $k < n$?
Abstractly, optimal estimate = conditional expectation
$$\bar M_n = E[M_n \mid \tilde M_{n-1}, \tilde M_{n-2}, \ldots, \tilde M_1, \tilde M_0].$$
Goal: Concrete representation for optimal estimator = explicit functions $F_n$ such that
$$E[M_n \mid \tilde M_{n-1}, \tilde M_{n-2}, \ldots, \tilde M_1, \tilde M_0] \approx F_n(\tilde M_{n-1}, \tilde M_{n-2}, \ldots, \tilde M_j, \ldots).$$
Classification of filters
• Autonomous = shift-invariant = $F_n \equiv F$ independent of $n$.
• $F_n$ only depends on $\tilde M_{n-1}, \tilde M_{n-2}, \ldots, \tilde M_{n-L}$ = Finite impulse response with $L$ taps.
• Fn is genie-aided if it has access to future information. Like a Maxwell demon, this fictional construct is useful because it allows us to bound the best-case behavior of constructible filters.
• Filter is empirical or data-driven = coefficients obtained through regression on one or many realizations of the underlying random process $\tilde M_k$.
Reduced model: If $F_n$ is a (close to) optimal filter, then
$$\widehat M_n = F_n(\widehat M_{n-1}, \widehat M_{n-2}, \ldots, \widehat M_j, \ldots) + \theta_n,$$
with $\theta_n$ stochastic with appropriate statistics = good surrogate for the process $M_n$.
Empirical filters
Assume no measurement error. State-space model is:
$$M_n = \sum_{k=1}^{n} h_k M_{n-k} + \xi_n$$
$\xi_n$ is a non-stationary random process.
Find the weights $h'_k$ by minimizing the sum of the normalized squared residuals
$$\sum_{j=1}^{J} \sum_{n=L+1}^{N} \left[ \frac{M_n^{(j)} - \sum_{k=1}^{L} h'_k M_{n-k}^{(j)}}{\sum_{k=1}^{L} M_{n-k}^{(j)}} \right]^2,$$
where the outer sum is over different realizations, and the inner sum is over all subsequences of $L$ consecutive values of $M_k^{(j)}$.
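The normalized least-squares fit described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code. It forms the normal equations for the normalized regressors and solves them by Gaussian elimination, and the helper names `solve_linear_system` and `fit_empirical_filter` are hypothetical. The test signal is a made-up two-species realization, for which a 2-tap recurrence is exact, so the fitted taps can be compared with known values.

```python
import math

def solve_linear_system(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def fit_empirical_filter(realizations, L):
    """Minimize the sum over realizations and times n of
    ((M_n - sum_k h_k M_{n-k}) / sum_k M_{n-k})**2 via the normal equations."""
    gram = [[0.0] * L for _ in range(L)]
    rhs = [0.0] * L
    for M in realizations:
        for n in range(L, len(M)):
            past = [M[n - k] for k in range(1, L + 1)]
            scale = sum(past)
            x = [p / scale for p in past]   # normalized regressors
            y = M[n] / scale                # normalized target
            for a in range(L):
                rhs[a] += x[a] * y
                for b in range(L):
                    gram[a][b] += x[a] * x[b]
    return solve_linear_system(gram, rhs)

# Hypothetical test signal: two species, so M_n obeys an exact 2-tap recurrence.
tau, rates = 1.0, (0.2, 1.0)
M = [0.5 * sum(math.exp(-w * n * tau) for w in rates) for n in range(12)]
taps = fit_empirical_filter([M], L=2)
r1, r2 = (math.exp(-w * tau) for w in rates)
print(taps, [r1 + r2, -r1 * r2])
```

For realizations with many species, no finite recurrence is exact and the fitted taps depend on the ensemble and on the window length. The normal equations can also become ill-conditioned for larger $L$, in which case a QR or SVD based solver would be a safer choice.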
Distribution of initial conditions
$$E[\rho_0(w)] = 1, \qquad E[\rho_0(w)\rho_0(w')] = 1 + \bar\sigma^2 \delta(w - w').$$
We can construct a sequence of point mass (i.e. discretized) initial conditions whose weak limits satisfy these conditions.
$$E[M_n] = \frac{1 - e^{-n\tau}}{n\tau}, \qquad E[M_n M_j] = E[M_n] E[M_j] + \bar\sigma^2 \frac{1 - e^{-(n+j)\tau}}{(n+j)\tau}.$$
Regression: optimal AR($L$) filter of the form
$$M_n = q_n M_0 + h_1^{(n)} M_{n-1} + h_2^{(n)} M_{n-2} + \cdots + h_L^{(n)} M_{n-L} + \theta_n.$$
Nonautonomous optimal filters
Yule-Walker equations
$$\frac{1 - e^{-(2n-k)\tau}}{(2n-k)\tau} = \sum_{j=1}^{L} h_j^{(n)} \frac{1 - e^{-(2n-k-j)\tau}}{(2n-k-j)\tau}, \quad k = 1, 2, \ldots, L.$$
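Hilbert matrix!
$$\begin{pmatrix} \frac{1}{2n-1} \\ \frac{1}{2n-2} \\ \vdots \\ \frac{1}{2n-L} \end{pmatrix} = \begin{pmatrix} \frac{1}{2n-2} & \frac{1}{2n-3} & \cdots & \frac{1}{2n-L-1} \\ \frac{1}{2n-3} & \frac{1}{2n-4} & \cdots & \frac{1}{2n-L-2} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{1}{2n-L-1} & \frac{1}{2n-L-2} & \cdots & \frac{1}{2n-2L} \end{pmatrix} \begin{pmatrix} h_1^{(n)} \\ h_2^{(n)} \\ \vdots \\ h_L^{(n)} \end{pmatrix}.$$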
Asymptotic filter
$$h_j^{(n)} = \prod_{i \ne j}^{L} \frac{i}{i - j} \prod_{i=1}^{L} \frac{2n - i - j}{2n - i} = (-1)^{j-1} \binom{L}{j} + (-1)^{j} \frac{L^2}{2n} \binom{L-1}{j-1} + O(n^{-2}).$$
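$$h_1^{(n)} = 6 - \frac{36}{2n-1},$$
$$h_2^{(n)} = -15 + \frac{630}{2n-1} - \frac{225}{n-1},$$
$$h_3^{(n)} = 20 - \frac{3360}{2n-1} + \frac{2100}{n-1} - \frac{1200}{2n-3},$$
$$h_4^{(n)} = -15 + \frac{7560}{2n-1} - \frac{6300}{n-1} + \frac{6300}{2n-3} - \frac{450}{n-2},$$
$$h_5^{(n)} = 6 - \frac{7560}{2n-1} + \frac{7560}{n-1} - \frac{10080}{2n-3} + \frac{1260}{n-2} - \frac{180}{2n-5},$$
$$h_6^{(n)} = -1 + \frac{2772}{2n-1} - \frac{3150}{n-1} + \frac{5040}{2n-3} - \frac{840}{n-2} + \frac{210}{2n-5} - \frac{3}{n-3}.$$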
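The closed-form coefficients can be cross-checked against a direct solve of the Hilbert-type linear system in exact rational arithmetic. The sketch below is a minimal check, assuming the large-$n\tau$ form of the system in which the exponentials are dropped. The helper names are illustrative, and `n = 10`, `L = 6` are arbitrary test values.

```python
from fractions import Fraction
from math import comb

def solve_exact(A, b):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def taps_from_linear_system(n, L):
    """Solve sum_j h_j / (2n - k - j) = 1 / (2n - k) for k = 1..L."""
    A = [[Fraction(1, 2 * n - k - j) for j in range(1, L + 1)] for k in range(1, L + 1)]
    b = [Fraction(1, 2 * n - k) for k in range(1, L + 1)]
    return solve_exact(A, b)

def taps_from_product_formula(n, L):
    """h_j = (-1)^(j-1) * C(L, j) * prod_i (2n - i - j) / (2n - i)."""
    taps = []
    for j in range(1, L + 1):
        h = Fraction((-1) ** (j - 1) * comb(L, j))
        for i in range(1, L + 1):
            h *= Fraction(2 * n - i - j, 2 * n - i)
        taps.append(h)
    return taps

n, L = 10, 6   # any n with 2n > 2L keeps all denominators positive
print(taps_from_linear_system(n, L) == taps_from_product_formula(n, L))
print(taps_from_product_formula(n, L)[0], 6 - Fraction(36, 2 * n - 1))
```

Exact fractions are used here because matrices of this type are badly conditioned, so a floating-point solve could obscure the comparison for larger $L$.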
Universal filter
Asymptotic filter coefficients converge as $n \to \infty$:
$$\lim_{n \to \infty} h_j^{(n)} = (-1)^{j-1} \binom{L}{j}.$$
Post facto justification for averaging over $n$:
$$\sum_{i=0}^{L} \binom{L}{i} \frac{(-1)^i}{n - i} = \frac{(-1)^L L!}{n(n-1)(n-2) \cdots (n-L)} \sim \frac{(-1)^L L!}{n^{L+1}},$$
$$M_n \approx L M_{n-1} - \frac{L(L-1)}{2} M_{n-2} + \cdots - (-1)^{L} M_{n-L}.$$
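The universal filter is simple enough to write down directly. The sketch below is a minimal illustration with hypothetical function names. It builds the binomial taps, uses them for one-step prediction, and evaluates the alternating binomial sum exactly so it can be compared with $(-1)^L L! / (n(n-1)\cdots(n-L))$. The test signal $1/n$ is an assumed stand-in for a slowly decaying observable.

```python
from fractions import Fraction
from math import comb, factorial

def universal_taps(L):
    """Limiting coefficients h_j = (-1)^(j-1) * C(L, j) for j = 1..L."""
    return [(-1) ** (j - 1) * comb(L, j) for j in range(1, L + 1)]

def universal_predict(history, L):
    """Predict the next value from the last L entries of history (most recent last)."""
    return sum(h * history[-j] for j, h in enumerate(universal_taps(L), start=1))

def binomial_sum(n, L):
    """Exact value of sum_{i=0}^{L} (-1)^i * C(L, i) / (n - i), for n > L."""
    return sum(Fraction((-1) ** i * comb(L, i), n - i) for i in range(L + 1))

L = 3
samples = [Fraction(1, n) for n in range(1, 41)]   # hypothetical slowly decaying signal 1/n
print(float(universal_predict(samples[:-1], L)), float(samples[-1]))
print(binomial_sum(40, L), Fraction((-1) ** L * factorial(L), 40 * 39 * 38 * 37))
```

Because the taps are those of an $L$-th order finite difference, the predictor reproduces any polynomial of degree below $L$ exactly. This is one way to read the remark that it tracks smooth, slowly varying signals very well.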
Universal filter and slow decay of correlations
$f(x)$ is algebraically decaying. Among all sets of coefficients $\alpha_0, \alpha_1, \alpha_2, \ldots, \alpha_L$, normalized by $\alpha_0 = 1$, the linear combination
$$\sum_{i=0}^{L} \binom{L}{i} (-1)^i f(n - i) \sim \frac{d^L}{dx^L} f\left(n - \frac{L}{2}\right)$$
is asymptotically “the smallest” possible.
Not true for exponentially decaying functions!
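The contrast between algebraic and exponential decay can be made concrete by comparing the size of the binomially differenced signal with the signal itself. The sketch below is illustrative only. The two test signals, $1/x$ and $0.5^x$, are hypothetical choices, and `relative_residual` is a name introduced here.

```python
from math import comb

def differenced(f, n, L):
    """[(1 - R)^L f]_n = sum_{i=0}^{L} (-1)^i * C(L, i) * f(n - i), with R the backward shift."""
    return sum((-1) ** i * comb(L, i) * f(n - i) for i in range(L + 1))

def relative_residual(f, n, L):
    """Size of the differenced signal relative to the signal itself."""
    return abs(differenced(f, n, L)) / abs(f(n))

algebraic = lambda x: 1.0 / x      # hypothetical algebraically decaying signal
exponential = lambda x: 0.5 ** x   # hypothetical exponentially decaying signal

L = 3
for n in (10, 100, 1000):
    print(n, relative_residual(algebraic, n, L), relative_residual(exponential, n, L))
```

For the algebraic signal the relative residual shrinks roughly like $n^{-L}$, whereas for $r^x$ it equals $|1 - 1/r|^L$ for every $n$, so the binomial weights give no asymptotic advantage in the exponential case.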
Slowly decaying correlations imply stochastic parameterization:
$$[(1 - R)^L f]_n = \sum_{i=0}^{L} \binom{L}{i} (-1)^i f_{n-i} = \sigma_n \theta_n,$$
The time and temperature dependence of the evaporation curves is best fit by one of the following two equations:
$\%E = (0.165(\%D) + 0.045(T - 15)) \log(t)$ and
Nondimensional evaporation curves
Time scale is set by most volatile species: $w_{\max} = 1$.
$$M(t) = 1 - a \log(1 + t/t_0) \quad \text{and} \quad M(t) = 1 - a\left(\sqrt{1 + t/t_0} - 1\right)$$
$t$, $t_0$ (small scale cutoff) and $a \ll 1$ are all dimensionless.
$-\dot M(t) \lesssim \alpha_{\max} M(t) = M(t)$, so that $a/t_0 \lesssim 1$.
Ranges of validity: $T_{\max} \sim t_0 e^{1/a}$ for the logarithmic equation and $T_{\max} \sim t_0/a^2$ for the square-root equation.
Nondimensional evaporation curves and filtering
Empirical
Log-convexity
$$\frac{d^2}{dt^2} \log M(t) = \frac{\int w^2 \rho(w,t)\,dw \int \rho(w,t)\,dw - \left(\int w \rho(w,t)\,dw\right)^2}{\left(\int \rho(w,t)\,dw\right)^2} \ge 0$$
This relation has to hold for every realization
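Log equation: $T_{\text{crit}} \approx \frac{1}{e} T_{\max}$, $M(T_{\text{crit}}) \approx a \ll 1$.
Sqrt equation: $T_{\text{crit}} \approx \frac{1}{4} T_{\max}$, $M(T_{\text{crit}}) \approx \frac{1}{2}$.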
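The critical times can be located numerically from the sign condition on $(\log M)''$. The sketch below is a rough check under stated assumptions: $T_{\max}$ is taken to be the time at which the fitted curve reaches $M = 0$, the parameter values `a = 0.05` and `t0 = 1.0` are hypothetical, and the derivatives are written out by hand for the two nondimensional curves $M(t) = 1 - a\log(1 + t/t_0)$ and $M(t) = 1 - a(\sqrt{1 + t/t_0} - 1)$.

```python
import math

def log_curve(t, a, t0):
    """M, M', M'' for M(t) = 1 - a*log(1 + t/t0)."""
    s = t0 + t
    return 1.0 - a * math.log(s / t0), -a / s, a / s**2

def sqrt_curve(t, a, t0):
    """M, M', M'' for M(t) = 1 - a*(sqrt(1 + t/t0) - 1)."""
    r = math.sqrt(1.0 + t / t0)
    return 1.0 - a * (r - 1.0), -a / (2.0 * t0 * r), a / (4.0 * t0**2 * r**3)

def convexity_margin(curve, t, a, t0):
    """M*M'' - (M')^2, which has the same sign as (log M)''."""
    M, dM, d2M = curve(t, a, t0)
    return M * d2M - dM**2

def critical_time(curve, t_max, a, t0, iterations=200):
    """Bisection for the time at which (log M)'' changes sign on (0, t_max)."""
    lo, hi = 0.0, t_max
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if convexity_margin(curve, mid, a, t0) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a, t0 = 0.05, 1.0   # hypothetical parameter values with a << 1
t_max_log = t0 * (math.exp(1.0 / a) - 1.0)        # M = 0 for the log curve
t_max_sqrt = t0 * ((1.0 + 1.0 / a) ** 2 - 1.0)    # M = 0 for the sqrt curve
t_crit_log = critical_time(log_curve, t_max_log, a, t0)
t_crit_sqrt = critical_time(sqrt_curve, t_max_sqrt, a, t0)
print(t_crit_log / t_max_log, 1.0 / math.e)
print(t_crit_sqrt / t_max_sqrt, 0.25)
```

Beyond these critical times the fitted curves violate the sign condition that every realization of the linear evaporation model must satisfy, so the empirical fits cannot be consistent with the model there.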
Universal vs. asymptotic filter
The ability of a filter to track/predict these functions accurately is not necessarily a positive feature.
Conclusions
• Mori-Zwanzig does poorly on systems with slow decay of correlations.
• The universal filter has very small error as $n \to \infty$, but is not very discriminating.
• The empirical linear filter is very discriminating/nearly optimal among all
linear filters with fixed coefficients and L (a given number of) taps. Floor
for its error – Sloppy model.
• The extended asymptotic filter is (essentially) time varying so it has