Application to molecular dynamics
Dissertation approved by the Department of Mathematics and Computer Science of Freie Universität Berlin
for the award of the academic degree of Doctor of Natural Sciences
submitted by
Diplom-Mathematiker Philipp Metzner
Freie Universität Berlin
Fachbereich Mathematik und Informatik, Arnimallee 6
14195 Berlin
Reviewer: Prof. Dr. Christof Schütte
1. Introduction
2. Theory: Time-continuous Markov Processes
2.1. Markov Diffusion Processes
2.1.1. Markov Processes
2.1.2. The Infinitesimal Operator
2.1.3. Diffusion Processes
2.1.4. Reversed-time Diffusion Process
2.1.5. Backward and Forward Equations
2.1.6. Partial Differential Operators
2.1.7. Relation between L_bw and L_fw
2.1.8. Stochastic Representation of Solutions of Boundary Value Problems
2.1.9. Adjoint Boundary Condition
2.1.10. Langevin and Smoluchowski Dynamics
2.2. Markov Jump Processes
3. Transition Path Theory for Diffusion Processes
3.1. Theory: Transition Path Theory
3.1.1. Ensemble of Reactive Trajectories
3.1.2. Committor Function
3.1.3. Probability Density Function of Reactive Trajectories
3.1.4. Probability Current and Transition Rate
3.1.5. Transition Tubes
3.2. TPT in the Smoluchowski Case
3.3. TPT in the Langevin Case
3.4. Numerical Aspects
3.5. Diffusion in the Double-Well Potential
3.5.1. Committor Function
3.5.2. Probability Density Function of Reactive Trajectories
3.5.3. Probability Current of Reactive Trajectories and its Streamlines
3.5.4. Reaction Rate
3.6. Entropic Barriers: Pure Diffusion
3.7. Entropic Switching
3.7.1. Diffusion in a Three-Hole Potential
3.7.2. Diffusion in a Rough Three-Hole Potential
3.8. Different Time-Scales: Fast-Slow Diffusion in a Double-Well Potential
3.9. Langevin Dynamics
3.9.2. Medium Friction Case, γ = 1
3.9.3. Low Friction Case, γ = 0.001
3.9.4. Rough Potential Landscape
4. Transition Path Theory for Markov Jump Processes
4.1. Theoretical Aspects
4.1.1. Preliminaries: Notations and Assumptions
4.1.2. Reactive Trajectories
4.1.3. Probability Distribution of Reactive Trajectories
4.1.4. Discrete Committor Equations
4.1.5. Probability Current of Reactive Trajectories
4.1.6. Transition Rate and Effective Current
4.1.7. Relations with Electrical Resistor Networks
4.1.8. Dynamical Bottlenecks and Reaction Pathways
4.1.9. Relation with Laplacian Eigenmaps and Diffusion Maps
4.2. Algorithmic Aspects
4.2.1. Computation of Dynamical Bottlenecks and Representative Dominant Reaction Pathways
4.3. Illustrative Examples
4.3.1. Discrete Analog of a Diffusion in a Potential Landscape
4.3.2. Molecular Dynamics: Glycine
4.3.3. Chemical Kinetics
5. Generator Estimation of Markov Jump Processes
5.1. The Embedding Problem
5.2. The Maximum Likelihood Method
5.2.1. Continuous and Discrete Likelihood Functions
5.2.2. Likelihood Approach Revisited
5.2.3. Enhanced Computation of the Maximum Likelihood Estimator
5.2.4. Reversible Case
5.2.5. Scaling
5.2.6. Enhanced MLE-Method vs. MLE-Method
5.3. An Alternative Approach: The Quadratic Optimization Method
5.4. Numerical Examples for Equidistant Observation Times
5.4.1. Preparatory Considerations
5.4.2. Transition Matrix with Underlying Generator
5.4.3. Transition Matrix without Underlying Generator
5.4.4. Transition Matrix with Exact Generator under Perturbation
5.4.5. Application to a Time Series from Molecular Dynamics
5.5. Numerical Examples for Non-Equidistant Observation Times
5.5.1. Test Example
5.5.2. Application to a Genetic Toggle Switch Model
6. Detecting Reaction Pathways via Shortest Paths in Graphs
6.1. Shortest Path in Graphs
6.1.1. Dijkstra Algorithm
6.1.2. Bidirectional Dijkstra Algorithm
6.2.1. Likelihood Approach
6.2.2. Free Energy Approach
6.3. Numerical Experiments
7. Variance of the Committor Function
7.1. The Discrete Committor Function
7.2. Metropolis Markov Chain Monte Carlo
7.3. Ensemble of Transition Matrices via MCMC
7.3.1. Dynamics on the Transition Matrix Space
7.3.2. MCMC on the Frequency Matrix Space
7.3.3. Proof of Correctness
7.4. Numerical Experiments
7.4.1. Dirichlet Distribution
7.4.2. Small Example
7.4.3. Glycine
8. Summary and Conclusion
A. Appendix
A.1. Discretization of the Committor Equation
A.1.1. Discretization via Finite Differences
A.1.2. Finite Difference Discretization of the Smoluchowski Committor Equation
A.1.3. Finite Difference Discretization of the Langevin Committor Equation
A.2. Weak Formulation for the Elliptic Mixed-Boundary Value Problem
A.2.1. Existence of a Weak Solution
A.2.2. Classical Solution vs. Weak Solution
A.3. Approximation of Diffusion Processes via Markov Jump Processes
A.4. Proofs
A.4.1. Proof for the Representation of the Probability Current of Reactive Trajectories
A.4.2. Proof for the Representation of the Transition Rate via a Volume Integral
A.5. Short Account to Free Energy
Transition events between long-lived states are a key feature of many complex systems arising in physics, chemistry, biology, etc. It was recognized early on that transition processes are characterized by rare but important events, i.e., they take place on a time scale that is long compared to the time scale characterizing the states of local stability, also called metastable states. For example, the time scale for the folding of a small protein, i.e. the transition from an unfolded to a folded state, is in the range of microseconds to milliseconds, whereas that for small-amplitude motions of amino acid side chains and water solvent is 1 femtosecond.
The first step towards an understanding of rare events was to realize that escape from a metastable state can only happen via noise-assisted hopping events, where the amplitude of the noise reflects the finite temperature at which the process takes place. In other words, the dynamics of the process is subject to random perturbations. If we relate the fluctuation induced by the noise to an appropriate energy scale $E_{\mathrm{noise}}$, escape from a metastable state will be rare whenever the condition
$$E_{\mathrm{barrier}}/E_{\mathrm{noise}} \gg 1$$
holds, where $E_{\mathrm{barrier}}$ denotes the height of the energy barrier which separates the metastable states.
Under physical assumptions on the governing dynamics of the process, the time scale of escape from a metastable state depends exponentially on the ratio $E_{\mathrm{barrier}}/E_{\mathrm{noise}}$. This means that one has to wait exponentially long to observe a single transition. On the other hand, the impact of the motion on the fastest time scale on the global behavior of the process is not negligible, so a simulation has to resolve it. Consequently, any direct numerical simulation of the dynamics aimed at gathering sufficient statistics on transition events would fail. Hence, alternative and effective strategies are required and have been developed, such as Transition State Theory, Transition Path Sampling, and more recently Transition Path Theory.
In the present work we give a unified presentation of Transition Path Theory (TPT) for time-continuous Markov processes and we elucidate its range of applicability on the example of conformational dynamics of bio-molecules.
We consider the most interesting results to include the following:
• Illustration of TPT on several low dimensional examples for Smoluchowski and Langevin dynamics arising from the stochastic modeling of molecular dynamics.
• Derivation of a stable finite discretization scheme of the committor function equation associated with the hypoelliptic Langevin dynamics.
• Adaptation of TPT to the class of time-continuous Markov processes with discrete state space (Markov jump processes).
• Development of efficient graph algorithms for identifying transition pathways
• Presentation, improvement and comparison of methods to estimate an infinitesimal generator of a Markov jump process if only an incomplete observation of the process is available.
• Derivation of a Metropolis Markov chain Monte Carlo method to investigate the error propagation in the discrete committor function computation for Markov chains.
Rare Events in Molecular Dynamics. In the classical description of molecular processes, the dynamics of the molecule's microscopic configurations (positions and momenta) is mathematically modeled in terms of ordinary differential equations resulting from the formulations of Lagrange and Hamilton. Within these models, the physical interactions of atoms are encoded in the interaction potential, which is composed of sums of contributions of different physical origin, such as the bond structure of the molecule and electrostatic interactions. But most biomolecular processes can only be understood within a thermodynamical context; instead of a single molecular system as a solution of the classical equations, one is interested in statistical ensembles, since only such ensembles can be the object of experimental investigation. Throughout this thesis we will focus on that ensemble view.
Functions of bio-molecules depend on their dynamical properties, and especially on their ability to undergo transitions between long-lived states, called conformations. A conformation of a molecule is understood as a mean geometric structure of the molecule which is conserved on a large time scale compared to the fastest molecular motions, while the system may well rotate, oscillate or fluctuate. From the dynamical point of view, a conformation typically persists for a long time (again compared to the fastest molecular motions) such that the associated subset of microscopic configurations is almost invariant or metastable [82] with respect to the dynamics. Hence transitions between different conformations of a molecule are rare events compared to the fluctuations within each conformation.
A very popular model to describe molecular systems including thermal noise is the stochastic Langevin dynamics or Smoluchowski dynamics. A Langevin system can be regarded as a mechanical system with additional noise and friction, where the noise can be thought of as modeling the influence of a heat bath surrounding the molecule and the friction is chosen so as to counterbalance the energy fluctuations due to the noise [45]. The Smoluchowski dynamics [87] is a Brownian motion which results from the Langevin dynamics in the high friction limit and acts only on the position space.
Mathematically, the Langevin and Smoluchowski dynamics are time-continuous Markov diffusion processes on a continuous state space. Under weak conditions both admit a unique stationary (equilibrium) distribution in configuration space which corresponds to the stationary (canonical) ensemble in experiments under constant volume and temperature, respectively.
As mentioned above, the problem of identifying conformations amounts to the identification of metastable sets in configuration space. The characterization of metastability within the canonical ensemble hence requires the mathematical description of the propagation of sub-ensembles. This is accomplished by the transfer operator approach [80]: if we define the transition probability $p(\tau, C, C)$ as the probability that a system from a sub-ensemble $C$ stays in $C$ after time $\tau$, then $C$ is regarded as metastable if this probability is almost one, i.e. $p(\tau, C, C) \approx 1$ [51]. Finally, the algorithmic strategy to decompose the state space into metastable states is based on spectral properties of the transfer operator [24].
Transition State Theory. Since the 1930s, transition state theory (TST) and evolutions thereof based on the reactive flux formalism have provided the main theoretical framework for the description of rare events [37, 95, 97, 7, 15]. Originally, TST was derived in the context of analyzing the rate of chemical reactions $R \to P$, where $R$ denotes the reactant and $P$ the product. The idea behind TST is to approximate the reaction rate $k$ by the mean crossing frequency $k_{TST}$ of transitions from $R$ to $P$ through a transition state, the dynamical bottleneck for the reaction. Generally, the transition state can be any dividing surface separating the reactant state $R$ from the product state $P$. Then the TST rate, $k_{TST}$, is proportional to the total flux of reactive trajectories, i.e., trajectories from the reactant to the product side of the dividing surface, and can be expressed in terms of thermodynamical quantities. The TST rate is always an upper bound of the true reaction rate because reactive trajectories can recross the transition state many times during one reaction. Therefore, the true rate is given by
$$k = \kappa\, k_{TST},$$
where $\kappa$, the transmission coefficient, is a correcting factor accounting for these recrossings. Due to this overestimation, several strategies have been proposed to improve the TST rate. For example, the earliest one is called variational TST [50] and amounts to choosing the dividing surface which minimizes the TST rate constant (see also [91, 94]).
Performing the computation in practice, however, may prove very challenging, and this difficulty is related to a deficiency of the theory. TST is based on partitioning the system into two, leaving the reactant state on one side of a dividing surface and the product state on the other, and the theory only tells how this surface is crossed during the reaction. As a result, TST provides very little information about the mechanism of the transition, which has bad consequences, e.g., if this mechanism is totally unknown a priori. In this case, it is difficult to choose a suitable dividing surface, and a bad choice will lead to a very poor estimate of the rate by TST (too many spurious crossings of the surface that do not correspond to actual reactive events). The TST estimate is then extremely difficult to correct. The situation is even worse when the reaction is of diffusive type, since in this case all surfaces are crossed many times during a single reactive event and there simply exists no good TST dividing surface.
Transition Path Sampling. How to go beyond TST and describe rare events whose mechanism is unknown a priori is an active area of research, and several new techniques have been developed to tackle these situations. Most notable among these techniques are the transition path sampling (TPS) technique of Bolhuis, Chandler, Dellago, and Geissler [72, 21] and the action method of Elber [35, 36], which allow one to sample directly the ensemble of reactive trajectories, i.e. the trajectories by which the reaction occurs.
The basic idea behind TPS is a generalization of standard Markov chain Monte Carlo (MCMC) [39, 56] procedures to the trajectory space of the considered dynamics. Generally, an MCMC procedure performs a biased random walk on the configuration space such that the number of visits of a configuration $x$ is proportional to its probability $p(x)$. In TPS, a configuration $X(T) = (x_0, x_{\Delta t}, \ldots, x_T)$ is a sequence of states representing a time-discretization of a true dynamical trajectory of fixed length $T$ rather than an individual state of the dynamics itself. The statistical weight $p(X(T))$ depends on the initial conditions and on the underlying dynamics. Since one is only interested in reactive trajectories connecting $A$ and $B$, TPS finally performs a random walk on the transition path ensemble with respect to the reactive path probability
$$p_{AB}(X(T)) = Z_{AB}^{-1}(T)\, \mathbf{1}_A(x_0)\, p(X(T))\, \mathbf{1}_B(x_T),$$
where $Z_{AB}$ normalizes the distribution of the transition path ensemble and the characteristic function $\mathbf{1}_A(x)$ is equal to one if $x \in A$ and zero otherwise ($\mathbf{1}_B(x)$ is defined analogously).
Following [72]:
Metaphorically, TPS is akin to "throwing ropes over rough mountain passes, in the dark", where "throwing ropes" stands for shooting trajectories, attempting to reach one metastable state from another, and "in the dark" because high-dimensional systems are so complex that it is generally impossible to make any prediction on the relevant energy surfaces.
We want to emphasize that reactive trajectories in the transition path ensemble are true dynamical trajectories, free of any bias by non-physical forces, constraints or assumptions on the reaction mechanism. The mechanism of the reaction and possibly its rate can then be obtained a posteriori by analyzing the ensemble of reactive trajectories. However, these operations are far from trivial. TPS or the action method per se do not tell how this analysis must be done, and simple inspection of the reactive trajectories may not be sufficient to understand the mechanism of the reaction. This may sound paradoxical at first, but the problem is that the reactive trajectories may be very complicated objects from which it is difficult to extract the quantities of real interest, such as the probability density that a reactive trajectory be at a given location in state space, the probability current of these reactive trajectories, or their rate of appearance. In a way, this difficulty is the same that one would encounter having generated a long trajectory from the laws of classical mechanics but knowing nothing about statistical mechanics: how to interpret this trajectory would then be unclear. Similarly, the statistical framework to interpret the reactive trajectories is not given by the trajectories themselves, and further analysis beyond TPS or the action method is necessary (for an attempt in this direction, see [52]).
Transition Path Theory. Recently, a theoretical framework to describe the statistical properties of the reactive trajectories in the context of Markov diffusion processes has been introduced [34, 92]. This framework, termed transition path theory (TPT), goes beyond standard equilibrium statistical mechanics and accounts for the non-trivial bias that the very definition of the reactive trajectories implies: they must be involved in a reaction.
TPT characterizes the ensemble of all reactive trajectories (not only reactive trajectories with respect to a fixed length as in TPS) by giving precise answers to the following questions:
• What is the probability to encounter a reactive trajectory in a given state, i.e. what is the probability density function of reactive trajectories?
• What is the net amount of reactive trajectories going through a given state, i.e. what is the probability current of reactive trajectories?
• What is the mean frequency of transitions between two sets, say A and B, i.e. what is the rate of reaction?
• What are the mechanisms of transitions, i.e. what are the transition tubes or transition pathways?
The key ingredient in the main objects provided by TPT is the committor function $q_{AB}(x) \equiv q(x)$, which is the probability to go to the set $B$ rather than to the set $A$, conditional on the process having started in the state $x$. The committor function $q(x)$ can be seen as an abstract reaction coordinate because, under appropriate conditions on the dynamics, the level sets of the committor function foliate the state space into sets of equal probability to end up in $B$ rather than in $A$, i.e. it describes the progress of the reaction from $A$ to $B$ in terms of probabilities.
For Markov diffusion processes, the committor function satisfies a boundary value problem where the involved partial differential operator is the generator of the diffusion process under consideration. Solving the committor equation numerically in high dimensions is infeasible and, hence, TPT is impractical for the analysis of high dimensional complex processes.
As a remedy to avoid the "curse of dimension" we will follow a two-step procedure. Instead of considering the system in all its degrees of freedom, we will choose appropriate low-dimensional observables which allow us to describe the effective dynamics of the system. In the second step the dynamics in these observables is considered on a coarse grained level, e.g. on a discretization of the image space of the observables, and modeled as a Markov jump process. As a result the essential dynamics of the complex system is captured in a discrete transition network (see Figure 1.1).
For discrete representatives of the sets A and B, discrete TPT [66] allows us to analyze the statistical properties of the associated reactive trajectories, i.e. those trajectories by which the walkers transit on the discrete state space from A to B, driven by the underlying Markov jump process. Discrete TPT provides discrete analogs of the probability density, the transition rate and the probability current of reactive trajectories. Again, these objects depend on a discrete committor function which satisfies a linear system of equations involving the infinitesimal generator of the considered jump process. Within this discrete setting, it is then easy to compute transition rates and, moreover, to identify transition pathways by utilizing graph algorithms.
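To make the linear system for the discrete committor concrete, the following Python sketch solves it for a small, purely hypothetical four-state chain. It assumes the usual form of the discrete committor equations, namely $\sum_j L_{ij} q_j = 0$ for states outside $A \cup B$ with $q = 0$ on $A$ and $q = 1$ on $B$; the generator matrix `L`, the function name `committor` and the choice of sets are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def committor(L, A, B):
    """Solve sum_j L[i, j] q[j] = 0 for i outside A and B, with q = 0 on A and q = 1 on B."""
    n = L.shape[0]
    B_idx = sorted(B)
    q = np.zeros(n)
    q[B_idx] = 1.0
    inner = [i for i in range(n) if i not in A and i not in B]
    # Move the known boundary values (q = 1 on B) to the right-hand side.
    rhs = -L[np.ix_(inner, B_idx)].sum(axis=1)
    q[inner] = np.linalg.solve(L[np.ix_(inner, inner)], rhs)
    return q

# Hypothetical 4-state chain 0 - 1 - 2 - 3 with unit jump rates between neighbours.
# Each row of the generator sums to zero.
L = np.array([
    [-1.0,  1.0,  0.0,  0.0],
    [ 1.0, -2.0,  1.0,  0.0],
    [ 0.0,  1.0, -2.0,  1.0],
    [ 0.0,  0.0,  1.0, -1.0],
])
q = committor(L, A={0}, B={3})
print(q)
```

For this symmetric nearest-neighbour chain the committor increases linearly from the state in A to the state in B, which is what one would expect from the analogy with a one-dimensional random walk.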
Finally, it is worth pointing out that TPT is the theoretical background behind the string method [30, 31, 32, 33, 75, 60], which is a numerical technique to compute the statistical properties of the reactive trajectories directly (that is, without having to identify these trajectories themselves beforehand as in TPS or the action method) in complicated systems with many degrees of freedom.
Figure 1.1.: In this figure we exemplify our strategy to capture the essential dynamics of a bio-molecule in a coarse grained model. The top left panel shows the ball-and-stick representation of the trialanine dipeptide analog. Top right: Projection of the time series (all atomic positions) onto the torsion angle space spanned by Φ and Ψ, which reveals the metastable behavior. Bottom left: The Ramachandran plot of the torsion angle time series. At first glance, trialanine attains three different conformations, indicated by the three clusters. Bottom right: The discrete free energy, − log π, associated with the stationary distribution π of a Markov jump process which models the effective dynamics of a system in terms of the torsion angles Φ and Ψ. The jump process was estimated from the underlying time series with respect to a 20 × 20 box discretization of the torsion angle space. The lighter the color of a box, the more probable it is to encounter the process in that box.
the possibility to complete this thesis. In particular, I would like to thank my supervisors Prof. Dr. Christof Schütte and Prof. Dr. Eric Vanden-Eijnden for their continuous support and patience during my studies.
Special thanks to Jessica Walter for convincing me to do my Ph.D. in the Bio Computing group, Alexander Fischer and Illia Horenko for taking me by the hand on my first steps into the field of molecular dynamics, and Eike Meerbach for providing me with data from molecular dynamics simulations (not to mention his tolerance for thousands of hours of stimulating dark wave music). I am indebted to Heidi for not just simply being there during one year. Finally, without the support of my parents, my sister and my friends this all would not have been possible: thank you. This work was funded by the DFG Research Center Matheon "Mathematics for Key Technologies" (FZT86) in Berlin.
2. Theory: Time-continuous Markov Processes
The purpose of this chapter is to give an introduction to the theoretical framework of time-continuous Markov processes on a continuous and a discrete state space.
2.1. Markov Diffusion Processes
2.1.1. Markov Processes
In this section we give a brief mathematical description of Markov processes. For a detailed introduction see, e.g., [3],[86].
To begin at the beginning, a $d$-dimensional stochastic process $\{X_t, t \ge 0\}$ is a collection of random variables taking values in $\mathbb{R}^d$ (for $d \ge 1$), where the index $t$ is referred to as the time. Formally, $\{X_t, t \ge 0\}$ is defined on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, where $\Omega = \{f : [0, \infty) \to \mathbb{R}^d\}$ is the set of $\mathbb{R}^d$-valued functions defined on the interval $[0, \infty)$, $\mathcal{F}$ is the sigma-algebra generated by the sets $\{f \in \Omega : f(s) \in B\}$, $0 \le s < \infty$, $B \in \mathcal{B}^d$, where $\mathcal{B}^d$ denotes the sigma-algebra of Borel sets in $\mathbb{R}^d$, $\mathbb{P}$ is the probability measure defined by the finite-dimensional distributions of the process $\{X_t, t \ge 0\}$ on the space $(\Omega, \mathcal{F})$, and $X_t(\omega) = \omega(t)$ for all $\omega \in \Omega$. A sample path (realization, trajectory) $X_t(\omega)$ of the stochastic process is therefore an $\mathbb{R}^d$-valued function defined on the time interval $[0, \infty)$. In the following, we shall briefly denote the process by $X_t$.
Let $\mathcal{F}_T$ for $T \ge 0$ denote the sigma-algebra which is generated by the sets $\{f \in \Omega : f(s) \in B\}$, $0 \le s < T$, $B \in \mathcal{B}^d$. A stochastic process $X_t$ is called a Markov process if the so-called Markov property is satisfied:
$$\mathbb{P}(X_t \in B \mid \mathcal{F}_s) = \mathbb{P}(X_t \in B \mid X_s), \quad \forall\, 0 \le s < t,\ \forall\, B \in \mathcal{B}^d. \tag{2.1}$$
A verbal formulation of the Markov property (2.1) is as follows [3]:
If the state of the process at a particular time s (the present) is known, additional information regarding the behavior of the process at r < s (the past) has no effect on our knowledge of the probable development of the process at t > s (in the future).
A Markov process is called a homogeneous Markov process if the right-hand side of (2.1) depends only on the time difference $t - s$, i.e.
$$\mathbb{P}(X_{t+h} \in B \mid X_t) = \mathbb{P}(X_h \in B \mid X_0), \quad \forall\, t, h \ge 0,\ \forall\, B \in \mathcal{B}^d.$$
We write $X_0 \sim v_0$ if the Markov process $X_t$ is initially distributed according to the probability density $v_0$, i.e. if $\mathbb{P}(X_0 \in B) = \int_B v_0(x)\,dx$ for all $B \in \mathcal{B}^d$.
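Let $X_t$ be a homogeneous Markov process with initial distribution $v_0$. The probability $\mathbb{P}(X_t \in B)$ to observe $X_t$ at time $t$ in the subset $B \in \mathcal{B}^d$ of the state space is given by
$$\mathbb{P}(X_t \in B) = \int_{\mathbb{R}^d} p(t, x, B)\, v_0(x)\,dx,$$
where the function $p : [0, \infty) \times \mathbb{R}^d \times \mathcal{B}^d \to [0, 1]$ is called the stochastic transition function and is defined according to
$$p(s, x, B) \overset{\mathrm{def}}{=} \mathbb{P}(X_s \in B \mid X_0 = x), \quad s \in [0, \infty),\ x \in \mathbb{R}^d,\ B \in \mathcal{B}^d. \tag{2.2}$$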
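The function $p : [0, \infty) \times \mathbb{R}^d \times \mathcal{B}^d \to [0, 1]$ has the following properties:
1. $x \mapsto p(s, x, B)$ is measurable for fixed $s \in [0, \infty)$ and fixed $B \in \mathcal{B}^d$.
2. $B \mapsto p(s, x, B)$ is a probability measure for fixed $s \in [0, \infty)$ and fixed $x \in \mathbb{R}^d$.
3. $p(0, x, \mathbb{R}^d \setminus \{x\}) = 0$ for all $x \in \mathbb{R}^d$.
4. The Chapman-Kolmogorov equation
$$p(t + s, x, B) = \int_{\mathbb{R}^d} p(t, x, dz)\, p(s, z, B) \tag{2.3}$$
holds for all $t, s \in [0, \infty)$, $x \in \mathbb{R}^d$ and $B \in \mathcal{B}^d$.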
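As a numerical illustration of the Chapman-Kolmogorov equation (2.3), the following Python sketch checks it for the transition density of a one-dimensional standard Wiener process, which is the Gaussian density with mean $x$ and variance $t$. The choice of process, the evaluation points and the quadrature grid are assumptions made only for this example; the identity is verified at the level of densities rather than for a set $B$.

```python
import numpy as np

def p(t, x, y):
    """Transition density of a standard 1-D Wiener process: Gaussian in y with mean x, variance t."""
    return np.exp(-(y - x) ** 2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

t, s, x, y = 0.3, 0.5, 0.2, -0.4
z = np.linspace(-12.0, 12.0, 4001)   # quadrature grid for the intermediate state z
dz = z[1] - z[0]

lhs = p(t + s, x, y)                          # density for going from x to y in time t + s
rhs = np.sum(p(t, x, z) * p(s, z, y)) * dz    # Riemann sum over all intermediate states z
print(lhs, rhs)
```

Both numbers agree up to quadrature error: passing through an arbitrary intermediate state at time $t$ and integrating it out reproduces the transition density over the full time $t + s$.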
We say that the Markov process $X_t$ admits an invariant probability measure $\mu$ if
$$\int_{\mathbb{R}^d} p(t, x, B)\, \mu(dx) = \mu(B) \quad \forall\, t \in [0, \infty),\ \forall\, B \in \mathcal{B}^d. \tag{2.4}$$
In many applications, it is important to guarantee that the Markov property (2.1) even holds if the fixed time $s$ is replaced by a stopping time. A random variable $\nu : \Omega \to \mathbb{R}^+ \cup \{0\}$ is said to be a stopping time with respect to the Markov process $X_t$ if
$$\{\nu \le t\} = \{\omega \in \Omega : \nu(\omega) \le t\} \in \mathcal{F}_t, \quad \forall\, t \ge 0.$$
In words, it should be possible to decide whether or not $\nu \le t$ has occurred on the basis of the knowledge of the process up to the time $t$. A time-homogeneous Markov process $X_t$ has the strong Markov property with respect to a stopping time $\nu$ if
$$\mathbb{P}(X_{\nu+h} \in B \mid X_\nu) = \mathbb{P}(X_h \in B \mid X_0), \quad \forall\, h \ge 0,\ \forall\, B \in \mathcal{B}^d. \tag{2.5}$$
2.1.2. The Infinitesimal Operator
To every homogeneous Markov process $X_t$ one can assign a semigroup of Markov operators $\{T_t, t \ge 0\}$, defined for any suitable function $u : \mathbb{R}^d \to \mathbb{R}$ by
$$T_t u(x) \overset{\mathrm{def}}{=} \mathbb{E}_x[u(X_t)] = \int_{\mathbb{R}^d} u(y)\, p(t, x, dy), \tag{2.6}$$
where $\mathbb{E}_x[u(X_t)]$ denotes the expectation of the observable $u$ at time $t$ conditional on $X_0 = x$. Moreover, the operator $T_0$ is the identity operator and the semigroup property, that is,
$$T_{t+s} = T_t T_s, \quad \forall\, s, t \ge 0,$$
follows from the Chapman-Kolmogorov equation (2.3). The generator $L_{bw}$ of a homogeneous Markov process $X_t$ is defined as the operator representing the derivative of the family $\{T_t, t \ge 0\}$ at the point $t = 0$,
$$L_{bw} u(x) \overset{\mathrm{def}}{=} \lim_{t \downarrow 0} \frac{T_t u(x) - u(x)}{t}. \tag{2.7}$$
The domain $D_{L_{bw}}$ of definition of the operator $L_{bw}$ is a subset of the space of bounded measurable scalar functions defined on $\mathbb{R}^d$ and consists of all functions for which the limit in (2.7) exists. The quantity $L_{bw} u(x)$ is interpreted as the mean infinitesimal rate of change of $u(X_t)$ at $t = 0$ in case $X_0 = x$.
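The limit (2.7) can be observed numerically. The sketch below is a minimal example under the assumption that $X_t$ is a one-dimensional standard Wiener process started at $x$, so that $X_t = x + \sqrt{t}\,Z$ with $Z$ standard normal and the generator acts as $\tfrac{1}{2} u''$. The observable $u = \cos$, the point `x` and the helper name `T` are illustrative choices; the expectation in (2.6) is evaluated by Gauss-Hermite quadrature.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Gauss-Hermite nodes and weights for the weight exp(-z^2 / 2);
# dividing by sqrt(2 pi) turns the rule into an expectation over a standard normal Z.
nodes, weights = hermegauss(40)
weights = weights / np.sqrt(2.0 * np.pi)

def T(t, u, x):
    """T_t u(x) = E_x[u(X_t)] for a standard Wiener process, X_t = x + sqrt(t) * Z."""
    return np.sum(weights * u(x + np.sqrt(t) * nodes))

x = 0.7
# Difference quotients (T_t u(x) - u(x)) / t for decreasing t, with u = cos.
quotients = [(T(t, np.cos, x) - np.cos(x)) / t for t in (1e-1, 1e-2, 1e-3)]
limit = -0.5 * np.cos(x)   # (1/2) u''(x) for u = cos
print(quotients, limit)
```

The difference quotients approach $-\tfrac{1}{2}\cos(x)$ as $t$ decreases, in line with the interpretation of $L_{bw} u(x)$ as a mean infinitesimal rate of change.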
2.1.3. Diffusion Processes
Diffusion processes are special cases of Markov processes with continuous sample functions. There are basically two different approaches to the class of diffusion processes. On the one hand, one can define them in terms of conditions on the stochastic transition function introduced above. On the other hand, one can study the state $X_t$ itself and its variation with respect to time. This leads to a stochastic differential equation. That is what we shall do in the present section. A detailed introduction to stochastic differential equations can be found in, e.g., [70, 40].
In what follows, we restrict ourselves to time-homogeneous Markov diffusion processes $X_t$ which are solutions of (or which are generated by) a stochastic differential equation (SDE) of the form
$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t, \tag{2.8}$$
where $X_t \in \mathbb{R}^d$ and $W_t = (W_t^1, \ldots, W_t^d)$ is a $d$-dimensional standard Wiener process (see Definition A.6.1 in the Appendix). The real vector field $b : \mathbb{R}^d \to \mathbb{R}^d$ is called the drift field or mean velocity field of the diffusion. The real symmetric matrix $a(x) = (a_{ij}(x)) \in \mathbb{R}^{d \times d}$, defined for all $x \in \mathbb{R}^d$ via the real matrix $\sigma(x) \in \mathbb{R}^{d \times d}$ according to
$$a(x) \overset{\mathrm{def}}{=} \frac{1}{2}\, \sigma(x)\, \sigma^T(x), \tag{2.9}$$
is called the diffusion matrix. Here $\sigma^T(x)$ denotes the transpose of the real matrix $\sigma(x)$.
Assumption 2.1.1. Henceforth, we make the following additional assumptions on the coefficients of the SDE (2.8):
• The diffusion matrix $a(x)$ is non-negative definite for all $x \in \mathbb{R}^d$, i.e.,
$$\sum_{i,j=1}^{d} a_{ij}(x)\, \xi_i \xi_j \ge 0, \quad \forall\, \xi \in \mathbb{R}^d. \tag{2.10}$$
• The drift field $b(x)$ and the diffusion matrix $a(x)$ are such that there exists a unique solution of (2.8) (see Theorem A.6.1 in the Appendix).
• The drift field $b(x)$ and the diffusion matrix $a(x)$ are such that the diffusion process $X_t$ is ergodic with respect to a unique invariant probability measure $d\mu(x) = \rho(x)\,dx$, i.e.,
$$\lim_{T \to \infty} \frac{1}{T} \int_0^T f(X_s)\,ds = \int_{\mathbb{R}^d} f(y)\, \rho(y)\,dy \tag{2.11}$$
for all $f \in L^1(\mathbb{R}^d)$.
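The ergodicity assumption (2.11) can be illustrated by simulating the SDE (2.8) and comparing a time average with the corresponding ensemble average. The following Python sketch uses the Euler-Maruyama scheme, which is one standard discretization and not necessarily the one used elsewhere in this work. The concrete coefficients are an illustrative assumption: $b(x) = -x$ and $\sigma = \sqrt{2}$ in one dimension, for which $a = 1$ and the invariant density is the standard normal density, and $f(x) = e^{-x^2/2}$, for which the right-hand side of (2.11) equals $1/\sqrt{2}$.

```python
import numpy as np

def euler_maruyama(b, sigma, x0, dt, n_steps, rng):
    """Simulate dX = b(X) dt + sigma(X) dW in one dimension with the Euler-Maruyama scheme."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    dW = rng.standard_normal(n_steps) * np.sqrt(dt)   # Wiener increments over one time step
    for k in range(n_steps):
        x[k + 1] = x[k] + b(x[k]) * dt + sigma(x[k]) * dW[k]
    return x

rng = np.random.default_rng(0)
# Illustrative coefficients: b(x) = -x and sigma = sqrt(2), i.e. a = 1.
path = euler_maruyama(lambda x: -x, lambda x: np.sqrt(2.0), 0.0, 0.01, 200_000, rng)

f = lambda x: np.exp(-x ** 2 / 2.0)
time_average = np.mean(f(path))            # discrete version of (1/T) * integral of f(X_s) ds
ensemble_average = 1.0 / np.sqrt(2.0)      # integral of f against the standard normal density
print(time_average, ensemble_average)
```

The two averages agree up to statistical and discretization error; both errors shrink when the trajectory is made longer and the step size `dt` smaller.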
2.1.4. Reversed-time Diffusion Process
Let $\{X_t, 0 \le t \le T\}$, $T > 0$, be a Markov diffusion process satisfying the SDE
$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t, \quad 0 \le t \le T,$$
and denote by $v(t, x)$ the probability density of the law of $X_t$ at time $t$, i.e.,
$$\mathbb{P}[X_t \in C] = \int_C v(t, y)\,dy, \quad \forall\, C \in \mathcal{B}^d.$$
A Markov process remains a Markov process under time reversal, i.e., the reversed-time process $\{X_t^R, 0 \le t \le T\}$ defined according to
$$X_t^R \overset{\mathrm{def}}{=} X_{T-t}$$
is again a Markov process, but in general the diffusion property is not preserved. Under mild conditions on the drift field $b(x)$, the matrix $\sigma(x)$ and the probability density $v_0(x)$ of the law of $X_0$, it is proven in [47] that the reversed-time process $X_t^R$ is again a Markov diffusion process. In particular, it is shown that $X_t^R$ satisfies the SDE
$$dX_t^R = b^R(t, X_t^R)\,dt + \sigma(X_t^R)\,dW_t, \tag{2.12}$$
where the time-dependent reversed drift field $b^R(t, x) : \mathbb{R}^{d+1} \to \mathbb{R}^d$ is given by
$$b^R(t, x) = -b(x) + \frac{2}{v(T - t, x)}\, \mathrm{div}\big(a(x)\, v(T - t, x)\big). \tag{2.13}$$
If the diffusion process $\{X_t, 0 \le t < \infty\}$ admits an invariant probability measure $\mu$, induced by the probability density $\rho(x)$, then (2.13) reduces to
$$b^R(x) = -b(x) + \frac{2}{\rho(x)}\, \mathrm{div}\big(a(x)\, \rho(x)\big) \tag{2.14}$$
and $d\mu(x) = \rho(x)\,dx$ is the invariant probability measure of the reversed process too. If the diffusion process $X_t$ is such that
$$b \equiv b^R,$$
then the original process $X_t$ and the reversed process $X_t^R$ are statistically indistinguishable and the process $X_t$ is called reversible.
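The reversibility condition $b \equiv b^R$ can be checked numerically from (2.14). The Python sketch below does this in one dimension for an overdamped dynamics of the form $dX_t = -V'(X_t)\,dt + \sqrt{2\beta^{-1}}\,dW_t$ with invariant density proportional to $e^{-\beta V}$. This form of the dynamics, the double-well potential $V(x) = (x^2 - 1)^2$ and the value of $\beta$ are assumptions made for the illustration only; the divergence in (2.14) is approximated by central finite differences.

```python
import numpy as np

beta = 2.0                                  # inverse temperature (illustrative value)
x = np.linspace(-1.5, 1.5, 2001)
V = (x ** 2 - 1.0) ** 2                     # hypothetical double-well potential
dV = 4.0 * x * (x ** 2 - 1.0)               # its exact derivative

b = -dV                                     # drift field of the assumed dynamics
a = 1.0 / beta                              # a = sigma^2 / 2 with sigma = sqrt(2 / beta), cf. (2.9)
rho = np.exp(-beta * V)                     # invariant density up to normalisation

# Reversed drift (2.14) in one dimension: b_R = -b + (2 / rho) * d(a * rho)/dx.
# The normalisation constant of rho cancels in this expression.
b_R = -b + 2.0 / rho * np.gradient(a * rho, x)

# Compare on interior points only, since the boundary stencils are one-sided.
max_gap = np.max(np.abs(b_R - b)[5:-5])
print(max_gap)
```

The gap between $b^R$ and $b$ is of the size of the finite-difference error, which matches the short analytic computation $b^R = V' + 2\beta^{-1}\rho'/\rho = V' - 2V' = -V' = b$ under the stated assumptions.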
2.1.5. Backward and Forward Equations
For a Markov diffusion process Xtof the form (2.8), the infinitesimal operator Lbwis a linear second order partial differential operator whose coefficients are determined by the drift field b(x) and the diffusion matrix a(x),
Lbwu = d X i,j=1 aij ∂ 2u ∂xi∂xj + d X i=1 bi∂x∂u i (2.15) acting formally on the space of twice partially differentiable functions u : Rd → R. The first double sum in (2.15) is called the principle part of the differential operator. Next, we establish the relation between the semigroup {Tt, 0 ≤ t < ∞} and the partial differential operator Lbw.
Theorem 2.1.1. ([3], pages 42-43) Let $g : \mathbb{R}^d \to \mathbb{R}$ denote a continuous bounded scalar function such that the function $u : [0, \infty) \times \mathbb{R}^d \to \mathbb{R}$ defined by
$$u(t, x) \overset{\mathrm{def}}{=} E_x[g(X_t)]$$
is continuous and bounded, as are its derivatives $\partial u/\partial x_i$ and $\partial^2 u/\partial x_i \partial x_j$. Then $u(t, x)$ satisfies Kolmogorov's backward equation
$$\begin{cases} \dfrac{\partial u}{\partial t} = L_{bw} u & \text{in } (0, \infty) \times \mathbb{R}^d, \\ u(0, \cdot) = g & \text{on } \mathbb{R}^d. \end{cases} \tag{2.16}$$
Loosely speaking, the backward equation describes the evolution of conditional expectations of observables with respect to $X_t$. The evolution of the probability density of the law of a diffusion process $X_t$ is governed by Kolmogorov's forward equation, also known as the Fokker-Planck equation.
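Theorem 2.1.2. ([57], page 360) If the functions $\sigma_{ij}$, $\partial \sigma_{ij}/\partial x_k$, $\partial^2 \sigma_{ij}/\partial x_k \partial x_l$, $b_i$, $\partial b_i/\partial x_j$, $\partial v/\partial t$, $\partial v/\partial x_i$, and $\partial^2 v/\partial x_i \partial x_j$ are continuous for $t > 0$ and $x \in \mathbb{R}^d$, and if $b_i$, $\sigma_{ij}$ and their first derivatives are bounded, then $v(t, x)$ satisfies the equation
$$\begin{cases} \dfrac{\partial v}{\partial t} = L_{fw} v & \text{in } (0, \infty) \times \mathbb{R}^d, \\ v(0, \cdot) = v_0 & \text{on } \mathbb{R}^d, \end{cases} \tag{2.17}$$
where $X_0 \sim v_0$ and the operator $L_{fw}$ is a linear second-order partial differential operator, defined according to
$$L_{fw} v \overset{\mathrm{def}}{=} \sum_{i,j=1}^{d} \frac{\partial^2 (a_{ij} v)}{\partial x_i \partial x_j} - \sum_{i=1}^{d} \frac{\partial (b_i v)}{\partial x_i} = \sum_{i=1}^{d} \frac{\partial}{\partial x_i} \Big[ \sum_{j=1}^{d} \frac{\partial (a_{ij} v)}{\partial x_j} - b_i v \Big]. \tag{2.18}$$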
Notice that the probability density function $\rho$ of the invariant measure $\mu$ is the steady-state solution of the Fokker-Planck equation (2.17), i.e., $L_{fw}\rho = 0$.
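Remark 2.1.2. The generator of a Markov diffusion process plays a key role in Transition Path Theory. For the sake of a compact presentation, we introduce a short notation for differential operations on functions. Let $u : \mathbb{R}^d \to \mathbb{R}$; then the nabla operator $\nabla$ is defined as
$$\nabla u = \Big( \frac{\partial u}{\partial x_1}, \ldots, \frac{\partial u}{\partial x_d} \Big)$$
and the Laplace operator $\Delta$ is given by
$$\Delta u = \sum_{i=1}^{d} \frac{\partial^2 u}{\partial x_i^2}.$$
Moreover, we abbreviate the divergence of a vector field $b : \mathbb{R}^d \to \mathbb{R}^d$ by
$$\nabla \cdot b \overset{\mathrm{def}}{=} \mathrm{div}(b) = \sum_{i=1}^{d} \frac{\partial b_i}{\partial x_i}.$$
The divergence $\nabla \cdot a$ of a matrix $a(x) = (a_{ij}(x)) \in \mathbb{R}^{d \times d}$ is a vector field whose $i$th component is defined by
$$(\nabla \cdot a)_i \overset{\mathrm{def}}{=} \sum_{j=1}^{d} \frac{\partial a_{ij}}{\partial x_j}, \quad i = 1, \ldots, d.$$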
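Henceforth, we will write the generator (2.15) of a diffusion process as
$$L_{bw} u = a : \nabla\nabla u + b \cdot \nabla u, \tag{2.19}$$
where we additionally abbreviate the principal part of $L_{bw}$ by
$$a : \nabla\nabla u \overset{\mathrm{def}}{=} \sum_{i,j=1}^{d} a_{ij} \frac{\partial^2 u}{\partial x_i \partial x_j}$$
and $b \cdot \nabla u$ denotes the scalar product between the vector field $b(x)$ and the gradient $\nabla u(x)$. In the introduced notation, the operator $L_{fw}$, defined in (2.18), takes the form
$$L_{fw} v = \nabla \cdot \big[ \nabla \cdot (a v) - b v \big],$$
where the vector field
$$J(x) \overset{\mathrm{def}}{=} -\big[ \nabla \cdot \big( a(x) v(x) \big) - b(x) v(x) \big] \tag{2.20}$$
is referred to as the (probability) current, so that $L_{fw} v = -\nabla \cdot J$.
2.1.6. Partial Differential Operators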
In this work we are mainly concerned with two types of linear second-order partial differential operators: the elliptic and the degenerate elliptic type.
Consider the general linear second-order partial differential operator
$$G u = a : \nabla\nabla u + b \cdot \nabla u + c u \tag{2.21}$$
with real coefficients $a_{ij}(x)$, $b_i(x)$, $c(x)$ defined on a domain (open and connected) $\Omega \subset \mathbb{R}^d$. Because the Hessian matrix of a function $u \in C^2(\mathbb{R}^d)$ is symmetric, we may assume without loss of generality that the matrix $a(x) = (a_{ij}(x))$ is symmetric. Second-order partial differential operators are classified according to the behavior of the quadratic form associated with their principal parts.
Definition 2.1.3. The operator $G$ is said to be of elliptic type (or elliptic) at a point $x_0 \in \Omega$ if the matrix $a(x_0)$ is positive definite, i.e.,
$$\sum_{i,j=1}^{d} a_{ij}(x_0)\, \xi_i \xi_j > 0, \quad \forall\, \xi \in \mathbb{R}^d : \xi \neq 0. \tag{2.22}$$
The operator $G$ is called elliptic in $\Omega$ if the matrix $a(x)$ is positive definite for all $x \in \Omega$. If there exists a constant $\theta > 0$ such that
$$\sum_{i,j=1}^{d} a_{ij}(x)\, \xi_i \xi_j \ge \theta \|\xi\|^2$$
for all $x \in \Omega$, $\xi \in \mathbb{R}^d$, then we say that $G$ is uniformly elliptic in $\Omega$. If the matrix $a(x)$ is nonnegative definite, i.e.,
$$\sum_{i,j=1}^{d} a_{ij}(x)\, \xi_i \xi_j \ge 0 \tag{2.23}$$
for all $x \in \Omega$, $\xi \in \mathbb{R}^d$, then $G$ is called degenerate elliptic [90].
Remark 2.1.4. Notice that besides the elliptic operators, the class of degenerate elliptic operators includes operators of parabolic type, first-order equations, ultra-parabolic equations, and others. In the literature, a degenerate elliptic operator is also called semi-elliptic [70] or of nonnegative characteristic form [71].
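Definition 2.1.3 can be checked numerically at a single point by inspecting the smallest eigenvalue of the symmetric matrix $a(x_0)$. The following is a minimal sketch for the $2 \times 2$ case only; the function name `classify_2x2`, the tolerance and the parameter values are illustrative assumptions and not part of the thesis. The second test matrix has the block form $\beta^{-1}\,\mathrm{diag}(0, \gamma)$ that reappears for the Langevin dynamics in Section 2.1.10.

```python
import math

def classify_2x2(a11, a12, a22, tol=1e-12):
    """Classify the symmetric matrix [[a11, a12], [a12, a22]] at a single point."""
    # Closed-form eigenvalues of a symmetric 2x2 matrix: mean -/+ radius.
    mean = 0.5 * (a11 + a22)
    radius = math.hypot(0.5 * (a11 - a22), a12)
    lam_min = mean - radius
    if lam_min > tol:
        return "elliptic"             # quadratic form (2.22) is strictly positive
    if lam_min >= -tol:
        return "degenerate elliptic"  # quadratic form (2.23) is only nonnegative
    return "neither"

beta, gamma = 2.0, 0.5  # hypothetical parameter values

# Isotropic diffusion matrix a = (beta * gamma)^{-1} I in two dimensions.
isotropic = classify_2x2(1.0 / (beta * gamma), 0.0, 1.0 / (beta * gamma))
# Matrix that vanishes in the first coordinate direction: a = beta^{-1} diag(0, gamma).
degenerate = classify_2x2(0.0, 0.0, gamma / beta)
# Indefinite matrix with eigenvalues -1 and 3.
indefinite = classify_2x2(1.0, 2.0, 1.0)
```

The function returns the sharpest label that applies; by Remark 2.1.4, a matrix labelled "elliptic" here also belongs to the degenerate elliptic class.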
2.1.7. Relation between Lbw and Lf w
In the language of the theory of partial differential equations, the operator $L_{fw}$ (2.18) is the formal $L^2$-adjoint of the operator $L_{bw}$ (2.15), i.e.,
$$\int_{\mathbb{R}^d} v\, L_{bw} u \, dx = \int_{\mathbb{R}^d} u\, L_{fw} v \, dx, \quad \forall u, v \in L^2(\mathbb{R}^d), \tag{2.24}$$
where $L^2(\mathbb{R}^d) = \{ v : \mathbb{R}^d \to \mathbb{R} : \int_{\mathbb{R}^d} |v(x)|^2 dx < \infty \}$. The operator $L_{bw}$ is called self-adjoint if $L_{bw} \equiv L_{fw}$. If the domain of integration in (2.24) is restricted to a bounded domain $\Omega \subset \mathbb{R}^d$ with a sufficiently smooth boundary $\partial\Omega$, then by virtue of Green's theorem the identity (2.24) takes the form
$$\int_{\Omega} v\, L_{bw} u \, dx = \int_{\Omega} u\, L_{fw} v \, dx + \int_{\partial\Omega} R \cdot \hat{n} \, d\sigma_{\partial\Omega}(x), \tag{2.25}$$
where $\hat{n}$ is the outward unit normal to the boundary $\partial\Omega$, $d\sigma_{\partial\Omega}$ is the surface element on $\partial\Omega$ and the real vector field $R : \mathbb{R}^d \to \mathbb{R}^d$ (the concomitant of $L_{bw}$ [79]) is given by
$$R = v\, a \nabla u - u\, a \nabla v + u v\, [\, b - \nabla \cdot a \,]. \tag{2.26}$$
The identity (2.25) will be useful to define adjoint boundary conditions in Section 2.1.9.
2.1.8. Stochastic Representation of Solutions of Boundary Value Problems
Theorem 2.1.1 states that for any suitable function $g$ the function $u(t, x) = E_x[g(X_t)]$ satisfies the initial value problem
$$\begin{cases} \dfrac{\partial u}{\partial t} - L_{bw} u = 0 & \text{in } (0, \infty) \times \Omega, \\ u(0, \cdot) = g & \text{on } \Omega, \end{cases} \tag{2.27}$$
where $L_{bw}$ is the generator of the considered diffusion process $X_t$. In other words, the solution of (2.27) can be expressed in terms of the Markov diffusion process $X_t$ associated with the generator $L_{bw}$. Therefore, it is natural to ask the following question: Let a degenerate elliptic differential operator acting on $C^2(\mathbb{R}^d)$ of the form
$$G u = a : \nabla\nabla u + b \cdot \nabla u$$
be given, and let $\Omega \subset \mathbb{R}^d$ be a domain (open and connected). Under what conditions on the coefficients $a(x)$, $b(x)$ does there exist a Markov diffusion process $X_t$ such that the solution $u \in C^2(\Omega) \cap C(\overline{\Omega})$ of the Dirichlet-Poisson problem for given functions $f \in C(\Omega)$ and $g \in C(\partial\Omega)$,
$$\begin{cases} G u = f & \text{in } \Omega, \\ u = g & \text{on } \partial\Omega, \end{cases} \tag{2.28}$$
can be expressed in terms of the Markov diffusion process $X_t$?
The idea is to find a diffusion process $X_t$ whose generator $L_{bw}$ coincides with $G$ on $C^2(\mathbb{R}^d)$. This is formally achieved by setting
$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t, \tag{2.29}$$
where $\sigma(x) \in \mathbb{R}^{d \times d}$ is chosen such that
$$\tfrac{1}{2}\, \sigma(x) \sigma(x)^T = a(x).$$
In order to guarantee that (2.29) admits a unique solution, we assume that the conditions on $b(x)$ and $a(x)$ in Theorem A.6.1 are satisfied. In particular, conditions which guarantee the Lipschitz continuity of the square root of $a(x)$ are given in [40], Theorem 1.2, page 129.
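Theorem 2.1.3. Suppose the function $g \in C(\partial\Omega)$ is bounded and the function $f \in C(\Omega)$ satisfies
$$E_x \Big[ \int_0^{\tau_\Omega} |f(X_s)| \, ds \Big] < \infty, \quad \forall x \in \Omega,$$
where $\tau_\Omega = \inf\{ t : X_t \in \partial\Omega \}$ is the first exit time from $\Omega$. Suppose further that
$$\tau_\Omega < \infty \quad \text{a.s.}, \ \forall x \in \Omega.$$
Then if $u \in C^2(\Omega) \cap C(\overline{\Omega})$ is a solution of the Dirichlet-Poisson problem (2.28), we have
$$u(x) = E_x[g(X_{\tau_\Omega})] - E_x \Big[ \int_0^{\tau_\Omega} f(X_s) \, ds \Big]. \tag{2.30}$$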
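The representation (2.30) suggests a simple Monte Carlo estimator: simulate paths of (2.29) until they leave $\Omega$ and average $g(X_{\tau_\Omega}) - \int_0^{\tau_\Omega} f(X_s)\,ds$. The sketch below is an illustration under stated assumptions, not the method used in this thesis: it is restricted to one dimension, uses the Euler-Maruyama scheme with a fixed step `dt`, and projects the overshoot at exit onto the boundary. The function name `dirichlet_poisson_mc` and all parameter values are hypothetical. The test case is $dX_t = \sqrt{2}\,dW_t$ on $\Omega = (0, 1)$ with $f \equiv -1$ and $g \equiv 0$, for which $a = 1$, $Gu = u''$, and the exact solution of (2.28) is $u(x) = x(1 - x)/2$ (the mean exit time).

```python
import math
import random

def dirichlet_poisson_mc(x0, b, sigma, f, g, lower, upper,
                         dt=5e-4, n_paths=2000, seed=0):
    """Monte Carlo estimate of u(x0) in (2.30) on the interval (lower, upper)."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x, integral_f = x0, 0.0
        while lower < x < upper:                 # path has not yet left the domain
            integral_f += f(x) * dt              # left-endpoint rule for the time integral of f
            x += b(x) * dt + sigma(x) * sqrt_dt * rng.gauss(0.0, 1.0)
        x_exit = lower if x <= lower else upper  # project the overshoot onto the boundary
        total += g(x_exit) - integral_f
    return total / n_paths

# Test case: a = sigma^2 / 2 = 1, so G u = u''; f = -1, g = 0 gives the mean exit time.
u_mc = dirichlet_poisson_mc(
    x0=0.5, b=lambda x: 0.0, sigma=lambda x: math.sqrt(2.0),
    f=lambda x: -1.0, g=lambda x: 0.0, lower=0.0, upper=1.0)
u_exact = 0.5 * 0.5 * (1.0 - 0.5)  # x (1 - x) / 2 evaluated at x = 0.5
```

The estimate carries both a statistical error of order `n_paths**-0.5` and a time-discretization bias, because exits between grid times go undetected; both shrink as `n_paths` grows and `dt` decreases.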
Next we address the question of existence of a solution of the Dirichlet-Poisson problem (2.28). Under the assumption that the operator $G$ is uniformly elliptic in $\Omega$, the following theorem holds:
Theorem 2.1.4 ([40], page 144). Suppose the following conditions hold:
• $(a_{ij})$ and $b_i$ are uniformly Lipschitz continuous in $\Omega$,
• $f$ is uniformly Hölder continuous in $\Omega$,
• $g$ is continuous on $\partial\Omega$,
• $\partial\Omega \in C^2$.
Then (2.30) is the unique classical solution of the Dirichlet-Poisson problem (2.28).
Unfortunately, the existence problem for the case where $G$ is degenerate elliptic but not elliptic is a difficult question. To our knowledge, there is no result which provides conditions under which a classical solution of (2.28) exists. For results on the existence of weak solutions of (2.28) we refer the interested reader to [71, 88].
2.1.9. Adjoint Boundary Condition
To motivate the concept of adjoint boundary conditions, suppose we are interested in the invariant probability distribution of a Markov diffusion process restricted to a domain $\Omega \subset \mathbb{R}^d$. By "restricted" we mean that the process must not escape the domain. As pointed out in Section 2.1.5, the probability density function $\rho(x)$ of the invariant probability distribution is the steady-state solution of the Kolmogorov forward equation (2.17); hence we are interested in the solution of the equation
$$L_{fw} v = 0 \quad \text{in } \Omega.$$
In order to reflect that the process must not escape the domain $\Omega$, we have to impose additional conditions on the probability density function $v(x)$ on the boundary $\partial\Omega$. The natural choice is to require that the probability current (2.20) is tangential to the boundary, which leads to the boundary conditions
$$BC(v) = J \cdot \hat{n} = 0 \quad \text{on } \partial\Omega, \tag{2.31}$$
where $\hat{n}$ is the outward unit normal to $\partial\Omega$. The adjoint boundary conditions $BC^*(u) = 0$ are chosen such that the operators $L_{fw}$ and $L_{bw}$ are adjoint to each other in the domain $\Omega$, i.e.,
$$\int_{\Omega} v\, L_{bw} u \, dx = \int_{\Omega} u\, L_{fw} v \, dx.$$
Recalling the integral identity (2.25), the adjoint boundary conditions $BC^*(u) = 0$ are formally defined [28] as a minimal set of homogeneous conditions on $u$ such that
$$BC(v) = BC^*(u) = 0 \ \text{on } \partial\Omega \implies R \cdot \hat{n} = 0 \ \text{on } \partial\Omega.$$
A short calculation shows that the adjoint boundary conditions of the boundary conditions (2.31) take the form
$$BC^*(u) = a \nabla u \cdot \hat{n} = 0 \quad \text{on } \partial\Omega. \tag{2.32}$$
Indeed, by (2.26) and (2.20) we have $R = v\, a \nabla u + u\, J$, so under (2.31) the boundary term reduces to $R \cdot \hat{n} = v\, (a \nabla u \cdot \hat{n})$. Notice that in the case $a = I = \mathrm{diag}(1, \ldots, 1) \in \mathbb{R}^{d \times d}$ the conditions (2.32) reduce to Neumann conditions.
2.1.10. Langevin and Smoluchowski Dynamics
In this work, we are mainly concerned with two classes of time-homogeneous Markov diffusion processes which arise from the stochastic modeling of the dynamics of particles in a potential landscape. Both dynamics incorporate a physical temperature and friction.
Langevin Dynamics
The first class of time-homogeneous diffusion processes we are interested in is generated by the famous Langevin equation, which in its traditional form is given componentwise by [76]
$$\dot{x}_i(t) = m_i^{-1} p_i(t), \qquad \dot{p}_i(t) = -\frac{\partial V(x(t))}{\partial x_i} - \gamma_i m_i^{-1} p_i(t) + \sqrt{2 \gamma_i \beta^{-1}}\, \zeta_i(t), \tag{2.33}$$
where $x = (x_1, \ldots, x_d)$ is the position of the particles, $p = (p_1, \ldots, p_d)$ is the momentum of the particles, $m_i > 0$ is the mass of $x_i$, the function $V(x)$ is the potential, $\gamma_i > 0$ is the friction coefficient on $x_i$ and $\zeta_i(t)$ is a white noise (see Definition A.6.1 in the Appendix). The inverse temperature $\beta > 0$ is related to the physical temperature $T$ by $\beta = 1/(k_B T)$, where $k_B$ is the Boltzmann constant. A system governed by the Langevin dynamics can be regarded as a mechanical system with additional noise and dissipation (friction). The noise can be thought of as modeling the influence of a heat bath surrounding the molecule, and the dissipation is chosen so as to counterbalance the energy fluctuations due to the noise.
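The Langevin dynamics (2.33) is ergodic with respect to the equilibrium measure (invariant probability measure)
$$d\mu(x, p) = Z^{-1} \exp(-\beta H(x, p)) \, dx \, dp, \tag{2.34}$$
where the Hamiltonian $H(x, p)$ is defined as
$$H(x, p) = V(x) + \tfrac{1}{2}\, p^T M^{-1} p, \qquad M^{-1} = \mathrm{diag}(m_1^{-1}, \ldots, m_d^{-1}),$$
and $Z = \int_{\mathbb{R}^d \times \mathbb{R}^d} e^{-\beta H(x, p)} \, dx \, dp$ is the normalization constant. Notice that (2.33) can be put in the form of (2.8) by setting
$$b(x, p) = \big( M^{-1} p,\ -\nabla V(x) - \Gamma M^{-1} p \big)^T \in \mathbb{R}^{2d}, \qquad \sigma = \sqrt{2 \beta^{-1}} \begin{pmatrix} 0 & 0 \\ 0 & \Gamma^{1/2} \end{pmatrix} \in \mathbb{R}^{2d \times 2d},$$
where $\Gamma^{1/2} = \mathrm{diag}(\sqrt{\gamma_1}, \ldots, \sqrt{\gamma_d})$.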
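According to (2.19), the generator of the Langevin dynamics (2.33) takes the form
$$L_{bw} u = \beta^{-1} \Gamma : \nabla_p \nabla_p u + M^{-1} p \cdot \nabla_x u - \nabla_x V \cdot \nabla_p u - \Gamma M^{-1} p \cdot \nabla_p u, \tag{2.35}$$
where $\nabla_x$ and $\nabla_p$ act only on the positions and momenta, respectively.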
Remark 2.1.5. Notice that the diffusion matrix of the Langevin dynamics
$$a = \beta^{-1} \begin{pmatrix} 0 & 0 \\ 0 & \Gamma \end{pmatrix} \in \mathbb{R}^{2d \times 2d}$$
is not positive definite but only nonnegative definite. Hence the operator $L_{bw}$ is not elliptic but degenerate elliptic. In the literature, e.g. in [74], the Langevin process is also called a hypoelliptic diffusion process (see Definition A.6.2 in the Appendix).
Next, we turn our attention to the reversed-time Langevin dynamics. Recalling the relation (2.14) between the drift fields of a diffusion process and its reversed-time process, the drift field of the reversed-time Langevin dynamics is given by
$$b^R(x, p) = \big( -M^{-1} p,\ \nabla V(x) - \Gamma M^{-1} p \big)^T$$
and the generator of the reversed-time Langevin dynamics takes the form
$$L^R_{bw} u = \beta^{-1} \Gamma : \nabla_p \nabla_p u - M^{-1} p \cdot \nabla_x u + \nabla_x V \cdot \nabla_p u - \Gamma M^{-1} p \cdot \nabla_p u. \tag{2.36}$$
Since $b(x, p) \neq b^R(x, p)$, the Langevin dynamics is a non-reversible diffusion process on the phase space $(x, p)$.
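The reversed drift field stated above can be cross-checked against the general formula (2.14). The following sketch does this numerically for one particle in one dimension, approximating the divergence of $a\rho$ by a central finite difference. The double-well potential, the parameter values and all function names are illustrative assumptions; the density is left unnormalized because the constant $Z$ cancels in (2.14).

```python
import math

beta, gamma, m = 1.5, 0.7, 2.0           # hypothetical parameter values

def V(x):                                # hypothetical double-well potential
    return (x * x - 1.0) ** 2

def dV(x):
    return 4.0 * x * (x * x - 1.0)

def rho(x, p):
    """Unnormalized equilibrium density exp(-beta * H(x, p))."""
    return math.exp(-beta * (V(x) + 0.5 * p * p / m))

def b(x, p):
    """Langevin drift field in one dimension."""
    return (p / m, -dV(x) - gamma * p / m)

def reversed_drift(x, p, h=1e-5):
    """Evaluate (2.14) with a = beta^{-1} diag(0, gamma)."""
    # Only the (p, p) entry of a is nonzero, so div(a * rho) has a single
    # nonzero component, the p-derivative of (gamma / beta) * rho.
    div_x = 0.0
    div_p = (gamma / beta) * (rho(x, p + h) - rho(x, p - h)) / (2.0 * h)
    bx, bp = b(x, p)
    r = rho(x, p)
    return (-bx + 2.0 * div_x / r, -bp + 2.0 * div_p / r)

def reversed_drift_closed_form(x, p):
    """Closed-form reversed drift as stated in the text, in one dimension."""
    return (-p / m, dV(x) - gamma * p / m)

points = [(0.3, -0.8), (-1.1, 0.4), (0.0, 1.5)]
max_diff = max(
    abs(u - w)
    for (x, p) in points
    for u, w in zip(reversed_drift(x, p), reversed_drift_closed_form(x, p)))
```

In this setup `max_diff` should be at the level of the finite-difference error, which supports the sign pattern in $b^R$: the Hamiltonian part of the drift is reversed while the friction term keeps its sign.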
Smoluchowski Dynamics
A second important class of time-homogeneous diffusion processes is generated by the overdamped Langevin or Smoluchowski dynamics, which arises in the high-friction limit of the Langevin equation (2.33),
$$\dot{x}_i(t) = -\gamma_i^{-1} \frac{\partial V(x)}{\partial x_i} + \sqrt{2 \beta^{-1} \gamma_i^{-1}}\, \zeta_i(t), \tag{2.37}$$
where $x = (x_1, \ldots, x_d)$ denotes the position of the particles and the other quantities are as in (2.33). For a sketch of the derivation of the Smoluchowski dynamics see [51]. The Smoluchowski dynamics (2.37) is ergodic with respect to the invariant measure $d\mu(x) = \rho(x)\,dx$, induced by the equilibrium probability density function
$$\rho(x) = Z^{-1} e^{-\beta V(x)}, \tag{2.38}$$
where $Z = \int_{\mathbb{R}^d} e^{-\beta V(x)} \, dx$ is the normalization constant. In contrast to the Langevin dynamics, (2.37) defines a reversible diffusion process on the position space and the generator is given by the elliptic operator
$$L_{bw} u = \beta^{-1} \Gamma^{-1} : \nabla\nabla u - \Gamma^{-1} \nabla V \cdot \nabla u, \tag{2.39}$$
where $\Gamma^{-1} = \mathrm{diag}(\gamma_1^{-1}, \ldots, \gamma_d^{-1})$.
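Reversibility of the Smoluchowski dynamics can be made concrete through the probability current (2.20): for the equilibrium density (2.38), the current vanishes pointwise, and the reversed drift (2.14) coincides with the original drift. The sketch below checks both statements by finite differences in one dimension. The potential, the parameter values and the function names are illustrative assumptions; the density is unnormalized, which does not affect either check.

```python
import math

beta, gamma = 2.0, 0.5                   # hypothetical parameter values

def V(x):                                # hypothetical double-well potential
    return (x * x - 1.0) ** 2

def dV(x):
    return 4.0 * x * (x * x - 1.0)

def rho(x):
    """Unnormalized equilibrium density exp(-beta * V(x)), cf. (2.38)."""
    return math.exp(-beta * V(x))

a = 1.0 / (beta * gamma)                 # diffusion coefficient beta^{-1} gamma^{-1}

def drift(x):
    return -dV(x) / gamma

def d_a_rho(x, h=1e-5):
    """Central difference approximation of d/dx (a * rho)."""
    return a * (rho(x + h) - rho(x - h)) / (2.0 * h)

def current(x):
    """Probability current (2.20) evaluated at v = rho."""
    return -(d_a_rho(x) - drift(x) * rho(x))

def reversed_drift(x):
    """Reversed drift (2.14) in one dimension."""
    return -drift(x) + 2.0 * d_a_rho(x) / rho(x)

xs = [-1.5 + 0.25 * i for i in range(13)]            # sample points in [-1.5, 1.5]
max_current = max(abs(current(x)) for x in xs)
max_drift_gap = max(abs(reversed_drift(x) - drift(x)) for x in xs)
```

Both `max_current` and `max_drift_gap` should be zero up to finite-difference error, in contrast to the Langevin case, where the equilibrium current does not vanish.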
2.2. Markov Jump Processes
In this section we will introduce time-continuous Markov processes on a discrete state space and will provide the basic facts about this class of processes which will be relevant for the derivation of discrete transition path theory. For further readings, see e.g. [86, 13, 69].
Let $\{X(t),\ t \ge 0\}$ be an $S$-valued stochastic process on a probability space $(\Omega, \mathcal{F}, P)$, with a discrete (countable) state space $S$ and a continuous (time) parameter $0 \le t < \infty$. We will denote by $\{X(t)\}_{t \in \mathbb{R}}$ an equilibrium sample path (or trajectory) of the Markov process, i.e. any path obtained from $\{X(t)\}_{t \in [T, \infty)}$ by pushing back the initial condition, $X(T) = x$, to $T = -\infty$.
A continuous-time stochastic process $\{X(t),\ t \ge 0\}$ with discrete state space $S$ is called a Markov process if for any $t_{k+1} > t_k > \cdots > t_0 \ge 0$ and any $j, i_1, \ldots, i_k \in S$
$$P(X(t_{k+1}) = j \mid X(t_k) = i_k, \ldots, X(t_1) = i_1) = P(X(t_{k+1}) = j \mid X(t_k) = i_k) \tag{2.40}$$
holds. A continuous-time Markov process is called homogeneous if the right-hand side of (2.40) depends only on the time increment $\tau_k = t_{k+1} - t_k$. The probability distribution $\mu_0$ satisfying
$$\mu_0(i) = P(X(0) = i), \quad \forall i \in S,$$
is called the initial distribution. In the following we will focus on homogeneous continuous-time Markov processes on a finite state space $S \cong \{1, \ldots, d\}$, and we will refer to this class of processes as Markov jump processes.
For a fixed time $t$, the transition probabilities
$$p_{ij}(t) = P(X(t) = j \mid X(0) = i)$$
define a transition matrix $P(t) = (p_{ij}(t))_{i,j \in S}$, where $p_{ij}(0) = \delta_{ij}$ and $\delta_{ij} = 1$ if $i = j$ and zero otherwise. By definition, $P(t)$ is a stochastic matrix, i.e.,
$$p_{ij}(t) \ge 0 \quad \text{and} \quad \sum_{k \in S} p_{ik}(t) = 1, \quad \forall i, j \in S. \tag{2.41}$$
Throughout this thesis, we assume that the transition probabilities are continuous at $t = 0$, i.e.
$$\lim_{t \downarrow 0} p_{ij}(t) = \delta_{ij}, \quad \forall i, j \in S, \tag{2.42}$$
which guarantees that a trajectory of $\{X(t),\ t \ge 0\}$ is a right-continuous function with left limits (càdlàg).
The family of transition matrices $\{P(t),\ t \ge 0\}$ is called the transition semigroup of the Markov jump process, which is justified by the fact that $\{P(t),\ t \ge 0\}$ obeys the Chapman-Kolmogorov equation
$$P(t + s) = P(t) P(s), \quad s, t \ge 0,$$
with $P(0) = I$, where $I = \mathrm{diag}(1, \ldots, 1) \in \mathbb{R}^{d \times d}$ is the identity matrix.
Furthermore, a local characterization of the transition semigroup of a Markov jump process can be obtained by considering the infinitesimal changes of the transition probabilities. Under the assumption made in (2.42), one can show that the right-sided limit [13]
$$L = \lim_{t \to 0+} \frac{P(t) - I}{t}$$
exists (entrywise). The matrix $L = (l_{ij})_{i,j \in S}$ is referred to as the infinitesimal generator of the transition semigroup $\{P(t),\ t \ge 0\}$ because $L$ 'generates' the transition semigroup via the relation
$$P(t) = \exp(tL) = \sum_{n=0}^{\infty} \frac{t^n}{n!} L^n.$$
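The relation $P(t) = \exp(tL)$ can be illustrated by summing the power series directly for a small generator and then checking that the result is a stochastic matrix satisfying the Chapman-Kolmogorov equation. This is a minimal sketch in plain Python: the three-state generator is a made-up example, the helper names are hypothetical, and the truncated series is adequate only for moderate values of $t \|L\|$.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_exp(L, t, n_terms=60):
    """Truncated power series sum_{n < n_terms} (t L)^n / n!."""
    d = len(L)
    result = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    term = [row[:] for row in result]              # current term, starts at the identity
    for n in range(1, n_terms):
        term = mat_mul(term, L)
        term = [[t * v / n for v in row] for row in term]   # now equals (t L)^n / n!
        for i in range(d):
            for j in range(d):
                result[i][j] += term[i][j]
    return result

# Hypothetical generator: nonnegative off-diagonal rates, rows summing to zero.
L = [[-3.0, 2.0, 1.0],
     [1.0, -1.0, 0.0],
     [0.5, 0.5, -1.0]]

P_s = mat_exp(L, 0.5)
P_t = mat_exp(L, 0.7)
P_st = mat_exp(L, 1.2)

row_sums = [sum(row) for row in P_st]
ck_defect = max(abs(x - y)
                for row_a, row_b in zip(mat_mul(P_s, P_t), P_st)
                for x, y in zip(row_a, row_b))
```

Here `row_sums` should be close to one in every row and `ck_defect`, the largest entrywise deviation from $P(s)P(t) = P(s+t)$, should be at round-off level.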
Due to the finite state space $S$, the matrix $L$ has a special structure, namely,
$$0 \le l_{ij} < \infty \quad \text{and} \quad \sum_{k \in S} l_{ik} = 0, \quad \forall\, i, j \in S,\ i \neq j, \tag{2.43}$$
where an entry $l_{ij}$, $i \neq j$, is interpreted as a transition rate: the average number of transitions from state $i$ to state $j$ per time unit. The diagonal entries of $L$ are given by
$$l_{ii} = -\sum_{k \neq i} l_{ik}, \quad \forall\, i \in S,$$
and their magnitudes $-l_{ii}$ are called the escape rates of the states.
The Markov property (2.40) of a Markov jump process even holds for a certain class of random times, the so-called stopping times. A real, non-negative random variable ν is called a stopping time with respect to the process {X(t), t ≥ 0} if for all t ≥ 0, the event {ν ≤ t} is expressible in terms of (X(s), s ∈ [0, t]), i.e. it should be possible to decide whether or not ν ≤ t has occurred on the basis of the knowledge of the process up to the time t.
Now let {X(t), t ≥ 0} be a Markov jump process with generator L, ν a stopping time with respect to {X(t), t ≥ 0} and i ∈ S an arbitrary state. Then, given that
X(ν) = i,
the process after ν and the process before ν are independent, and
The property (2.44) is called the strong Markov property.
Analogously to the case of a continuous state space, the evolution of conditional expectations of observables is governed by the infinitesimal generator. To be more precise, let $f : S \to \mathbb{R}$ be an observable. Then the conditional expectations $u(i, t) = E[f(X(t)) \mid X(0) = i]$, $i \in S$, satisfy the backward Kolmogorov equations
$$\frac{d}{dt} u(i, t) = \sum_{j \in S} l_{ij}\, u(j, t), \quad u(i, 0) = f(i), \quad \forall i \in S,\ t \ge 0, \tag{2.45}$$
or, in matrix-vector notation,
$$\frac{du}{dt} = L u, \quad u(0) = f, \quad t \ge 0.$$
Similarly, let $\mu(t) = (\mu_i(t))^T_{i \in S} = (P(X(t) = i))^T_{i \in S}$ be the probability distribution of the Markov jump process at time $t$. Then the distribution $\mu(t)$ evolves in time according to the forward Kolmogorov equation
$$\frac{d\mu^T}{dt} = \mu^T L, \quad t \ge 0, \tag{2.46}$$
also known as the Master equation. A probability distribution $\pi = (\pi_i)_{i \in S}$ is called a stationary distribution if it satisfies
$$0 = \pi^T L.$$
In other words, $\pi$ is a left eigenvector associated with the zero eigenvalue of $L$.
To further illuminate the characteristics of Markov jump processes, denote by $t_0 = 0 < t_1 < t_2 < \ldots$ the random jump times at which the Markov process changes its state. For notational convenience, we denote the left-sided limit of the process at time $t$ by
$$X^*(t) \overset{\mathrm{def}}{=} \lim_{s \to t-} X(s). \tag{2.47}$$
Then the sequence of jump times $\{t_n,\ n \in \mathbb{N} \cup \{0\}\}$, formally given by
$$t_0 = 0, \qquad \forall n \in \mathbb{N} : \ t_n = \inf\{ s : s > t_{n-1},\ X(s) \neq X^*(s) \},$$
defines according to
$$X_n \overset{\mathrm{def}}{=} X(t_n)$$
the embedded process $\{X_n,\ n \in \mathbb{N}_0\}$ associated with the Markov jump process. It can be shown that $\{X_n,\ n \in \mathbb{N}_0\}$ is a discrete-time Markov chain and its transition matrix $P = (p_{ij})_{i,j \in S}$ is related to the infinitesimal generator $L$ by
$$p_{ij} = \begin{cases} -\dfrac{l_{ij}}{l_{ii}}, & i \neq j, \\ 0, & \text{otherwise.} \end{cases} \tag{2.48}$$
A Markov jump process is called irreducible if the embedded process is irreducible, i.e., if for any pair (i, j), i 6= j of states there exists an m ∈ N such that (Pm)i,j > 0 (cf. Sect. A.6).
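The construction of the embedded chain via (2.48) and the irreducibility criterion can be sketched in a few lines. The code below assumes that every state has a strictly positive escape rate (no absorbing states), so that the division in (2.48) is well defined; the two generators and the function names are made-up illustrations. Irreducibility is tested by a graph search, which is equivalent to the existence of some $m$ with $(P^m)_{i,j} > 0$ for every pair of distinct states.

```python
def embedded_chain(L):
    """Transition matrix (2.48) of the embedded chain; assumes L[i][i] < 0 for all i."""
    d = len(L)
    P = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            if i != j:
                P[i][j] = -L[i][j] / L[i][i]
    return P

def is_irreducible(P):
    """Check that every state can be reached from every other state."""
    d = len(P)
    for start in range(d):
        seen, stack = {start}, [start]
        while stack:                       # depth-first search along positive entries
            i = stack.pop()
            for j in range(d):
                if P[i][j] > 0.0 and j not in seen:
                    seen.add(j)
                    stack.append(j)
        if len(seen) < d:
            return False
    return True

# Hypothetical irreducible generator.
L_irr = [[-3.0, 2.0, 1.0],
         [1.0, -1.0, 0.0],
         [0.5, 0.5, -1.0]]
# Hypothetical reducible generator: state 2 cannot be reached from states 0 and 1.
L_red = [[-1.0, 1.0, 0.0],
         [1.0, -1.0, 0.0],
         [1.0, 1.0, -2.0]]

P_irr = embedded_chain(L_irr)
P_red = embedded_chain(L_red)
```

Each row of the embedded transition matrix sums to one because the off-diagonal rates in a row of $L$ add up to the escape rate $-l_{ii}$.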
Next, we turn our attention to the reversed-time process $\{X^R(t),\ t \in \mathbb{R}\}$ defined by
$$X^R(t) \overset{\mathrm{def}}{=} X^*(-t),$$
where $X^*(-t)$ denotes the left-sided limit of the process at time $-t$. If we assume that $\{X(t),\ t \in \mathbb{R}\}$ is irreducible and that it admits a unique stationary distribution $\pi = (\pi_i)_{i \in S}$, then the process $\{X^R(t),\ t \in \mathbb{R}\}$ is again a càdlàg Markov jump process with the same stationary distribution $\pi$ as $\{X(t)\}_{t \in \mathbb{R}}$ and the infinitesimal generator $L^R = (l^R_{ij})_{i,j \in S}$ given by
$$l^R_{ij} = \frac{\pi_j}{\pi_i}\, l_{ji}. \tag{2.49}$$
If in particular the infinitesimal generator $L$ satisfies the detailed balance equations
$$\pi_i\, l_{ij} = \pi_j\, l_{ji}, \quad \forall i, j \in S, \tag{2.50}$$
then $L^R \equiv L$ and hence the direct and the reversed-time process are statistically indistinguishable. Such a process is called reversible.
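To make (2.49) and (2.50) concrete, the sketch below computes the stationary distribution of two small generators, builds the reversed generator and compares it with the original one. The stationary distribution is obtained by power iteration on the uniformized matrix $I + L/c$, which is one simple choice among several and an assumption of this sketch; the generators and function names are made up for illustration. The first generator is a birth-death chain, which satisfies detailed balance, while the second describes a one-directional cycle.

```python
def stationary_distribution(L, n_iter=20000):
    """Left null vector of L via power iteration on I + L / c (uniformization)."""
    d = len(L)
    c = 2.0 * max(-L[i][i] for i in range(d))   # c exceeds every escape rate
    pi = [1.0 / d] * d
    for _ in range(n_iter):
        pi = [pi[j] + sum(pi[i] * L[i][j] for i in range(d)) / c
              for j in range(d)]
    return pi

def reversed_generator(L, pi):
    """Generator of the reversed-time process according to (2.49)."""
    d = len(L)
    return [[pi[j] * L[j][i] / pi[i] for j in range(d)] for i in range(d)]

def max_abs_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

# Hypothetical birth-death generator (satisfies detailed balance).
L_rev = [[-1.0, 1.0, 0.0],
         [2.0, -5.0, 3.0],
         [0.0, 1.0, -1.0]]
# Hypothetical cyclic generator 0 -> 1 -> 2 -> 0 (violates detailed balance).
L_cyc = [[-1.0, 1.0, 0.0],
         [0.0, -1.0, 1.0],
         [1.0, 0.0, -1.0]]

pi_rev = stationary_distribution(L_rev)
pi_cyc = stationary_distribution(L_cyc)
gap_rev = max_abs_diff(reversed_generator(L_rev, pi_rev), L_rev)
gap_cyc = max_abs_diff(reversed_generator(L_cyc, pi_cyc), L_cyc)
```

For the birth-death chain the stationary distribution is $(1/3, 1/6, 1/2)$ and `gap_rev` vanishes up to round-off, whereas for the cycle the reversed generator is the transpose of $L$ and `gap_cyc` equals one.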
We end this section by stating a strong law of large numbers for Markov jump processes, which says that the time average of an observable $f : S \to \mathbb{R}$ with respect to the process equals the expectation of $f$ with respect to the stationary distribution. Formally, we have
$$\lim_{t \to \infty} \frac{1}{t} \int_0^t f(X(s)) \, ds = \sum_{i \in S} f(i)\, \pi_i \tag{2.51}$$
almost surely for all initial distributions $\mu_0$, where $\pi$ is the stationary distribution
of the Markov jump process. In particular, the Markov jump process is said to be
3. Transition Path Theory for Diffusion Processes
As explained in the introduction of this thesis, Transition Path Theory (TPT) provides a powerful framework to describe the statistical properties of the ensemble of reactive trajectories. In this chapter, we recall the theoretical aspects of TPT in the context of Markov diffusion processes (Sect. 3.1) and, in particular, derive the main objects of TPT for the Smoluchowski dynamics (Sect. 3.2) and for the Langevin dynamics (Sect. 3.3). The remainder of this chapter illustrates TPT on several low-dimensional examples, where we also explain briefly how the various quantities of TPT were computed. For the details of the numerical considerations, especially how we numerically solved the committor equation, see Section A.1 in the Appendix. For more details on the theory, we refer the reader to the original references [34, 92, 65].
3.1. Theory: Transition Path Theory
Consider a system whose dynamics is governed by the following stochastic differential equation
$$dX_t = b(X_t)\,dt + \sigma\, dW_t, \tag{3.1}$$
where $X_t \in \mathbb{R}^d$, $b(x) = (b_1(x), \ldots, b_d(x))^T \in \mathbb{R}^d$ is the drift vector, $\sigma \in \mathbb{R}^{d \times d}$ is a real matrix and $W_t$ is a $d$-dimensional standard Wiener process. The generator associated with the dynamics (3.1) is given by
$$L_{bw} u(x) = \sum_{i,j=1}^{d} a_{ij} \frac{\partial^2 u(x)}{\partial x_i \partial x_j} + \sum_{i=1}^{d} b_i(x) \frac{\partial u(x)}{\partial x_i} = a : \nabla\nabla u(x) + b(x) \cdot \nabla u(x), \tag{3.2}$$
where $a = \tfrac{1}{2} \sigma \sigma^T$ is the diffusion matrix.
3.1.1. Ensemble of Reactive Trajectories
Let $X(t)$, $-\infty < t < \infty$, be an infinitely long trajectory solving (3.1) which is ergodic with respect to the equilibrium probability density function $\rho(x)$, i.e. given any suitable observable $\phi(x)$, we have
$$\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} \phi(X(t)) \, dt = Z^{-1} \int_{\mathbb{R}^d} \phi(x) \rho(x) \, dx, \tag{3.3}$$
where $Z = \int_{\mathbb{R}^d} \rho(x) \, dx$. Equation (3.3) is a property of any generic trajectory in the system
Figure 3.1: Schematic representation of the reactant state A, the product state B and a piece of an equilibrium trajectory (shown in thin black). The sub-pieces connecting ∂A to ∂B (shown in thick black) are each a reactive trajectory, and the collection of all of them is the ensemble of reactive trajectories.
times when T is large (and infinitely often as T → ∞). Suppose however that one is not interested in the statistical properties of such a generic trajectory, but rather in the statistical properties that this trajectory displays while involved in a reaction. This question can be made precise as follows. Suppose that A ⊂ Rd and B ⊂ Rd are two regions in configuration space that characterize the system while it is in the reactant and the product states, respectively, of a given reaction. Then, given any generic trajectory, x(t), −∞ < t < ∞, we can prune this trajectory as illustrated in Figure 3.1 to consider only the pieces of this trajectory that connect ∂A (the boundary of A) to ∂B (the boundary of B). Each such piece is a reactive trajectory and the collection of all of them is the ensemble of reactive trajectories. By ergodicity, the statistical properties of this ensemble are independent of the particular trajectory used to generate the ensemble, and these properties are the object of TPT.
Formally, the ensemble of reactive trajectories is defined as follows.
Definition 3.1.1 (ensemble of reactive trajectories).
$$\text{ensemble of reactive trajectories} = \{ X(t) : t \in R \}, \tag{3.4}$$
where $R$ denotes the set of reactive times, i.e., $t \in R$ if and only if
$$X(t) \notin A \cup B, \quad X(t^+_{AB}(t)) \in B \quad \text{and} \quad X(t^-_{AB}(t)) \in A,$$
with
$$t^+_{AB}(t) = \text{smallest } t' \ge t \text{ such that } X(t') \in A \cup B, \qquad t^-_{AB}(t) = \text{largest } t' \le t \text{ such that } X(t') \in A \cup B. \tag{3.5}$$
Each continuous piece of the trajectory going from $A$ to $B$ in the ensemble (3.4) is a specific reactive trajectory. The main objects of TPT are then defined in terms of the reactive trajectories and expressed in terms of $\rho(x)$ and the committor functions $q(x)$ and $q_b(x)$, which will be defined in the next section.
3.1.2. Committor Function
We will see in the next sections that two functions are the crucial objects for expressing, e.g., the probability density function of reactive trajectories: the forward committor function $q(x)$, defined as the probability that the trajectory starting from $x \notin A \cup B$ reaches $B$ first rather than $A$, and the backward committor function $q_b(x)$, defined as the probability that the trajectory arriving at $x \notin A \cup B$ came from $A$ rather than from $B$.
Formally, the forward committor function $q(x)$ satisfies the backward Kolmogorov equation associated with (3.1):
$$\begin{cases} L_{bw} q = 0 & \text{in } \mathbb{R}^d \setminus (A \cup B), \\ q = 0 & \text{on } \partial A, \\ q = 1 & \text{on } \partial B, \end{cases} \tag{3.6}$$
where $L_{bw}$ is the operator in (3.2). To see (3.6), notice that the committor function $q(x)$ can be expressed as a conditional expectation, i.e.,
$$q(x) = E_x\big[ 1_B(X(\tau_{A \cup B})) \big],$$
where $\tau_{A \cup B}$ is the first hitting time of the process $X_t$ with respect to the set $A \cup B$. If we define the auxiliary function $g : \partial A \cup \partial B \to \mathbb{R}$ by
$$g(x) = \begin{cases} 0, & \text{if } x \in \partial A, \\ 1, & \text{if } x \in \partial B, \end{cases}$$
and set $f \equiv 0$, then it follows by virtue of Theorem 2.1.3 that if (3.6) possesses a (classical) solution, say $u(x)$, then $q \equiv u$, and therefore $q(x)$ satisfies (3.6). For conditions on the differential operator $L_{bw}$ and the boundary of the set $A \cup B$ which ensure the existence of a classical solution, see Theorem 2.1.4.
A similar reasoning shows that the backward committor function $q_b(x)$ satisfies the backward Kolmogorov equation associated with the reversed-time process (cf. Sect. 2.1.4):
$$\begin{cases} L^R_{bw} q_b = 0 & \text{in } \mathbb{R}^d \setminus (A \cup B), \\ q_b = 1 & \text{on } \partial A, \\ q_b = 0 & \text{on } \partial B, \end{cases} \tag{3.7}$$
where
$$L^R_{bw} q_b = a : \nabla\nabla q_b(x) + b^R(x) \cdot \nabla q_b(x) \tag{3.8}$$
with the drift field (cf. Theorem A.6.2)
$$b^R(x) = -b(x) + \frac{2}{\rho(x)}\, \mathrm{div}\big( a(x) \rho(x) \big).$$
Notice that if the process $X_t$ is reversible, then in particular $L_{bw} \equiv L^R_{bw}$, and it follows that the backward committor function $q_b(x)$ can be expressed in terms of the forward committor function:
$$q_b(x) = 1 - q(x). \tag{3.9}$$
In high-dimensional systems, the main question of interest then becomes how to solve (3.6), which is a highly nontrivial problem since (3.6) is a partial differential equation for a function of many variables. The string method is one way to deal with this issue. In the context of the two-dimensional examples considered in this chapter, however, standard numerical techniques based on discretizing (3.6) by finite differences can be applied, as explained in the Appendix, Section A.1.
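As a simplified illustration of the finite-difference approach, the committor equation (3.6) can be discretized in one dimension, where a closed-form reference solution is available. The sketch below is not the scheme of Section A.1: it assumes the Smoluchowski generator (2.39) with $\gamma = 1$ in one dimension, takes $A = (-\infty, a]$ and $B = [b, \infty)$, uses central differences on a uniform grid and solves the resulting tridiagonal system with the Thomas algorithm. The potential, parameter values and function names are hypothetical. In this setting the exact committor is $q(x) = \int_a^x e^{\beta V(y)}\,dy \,/ \int_a^b e^{\beta V(y)}\,dy$, which is evaluated by the trapezoidal rule for comparison.

```python
import math

def committor_fd(dV, beta, a, b, n):
    """Solve beta^{-1} q'' - V' q' = 0 on (a, b) with q(a) = 0, q(b) = 1."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    lower, diag, upper, rhs = [], [], [], []
    for i in range(1, n):                          # interior grid points
        d = dV(xs[i])
        lower.append(1.0 / (beta * h * h) + d / (2.0 * h))
        diag.append(-2.0 / (beta * h * h))
        upper.append(1.0 / (beta * h * h) - d / (2.0 * h))
        rhs.append(0.0)
    rhs[-1] -= upper[-1] * 1.0                     # boundary value q(b) = 1; q(a) = 0 adds nothing
    m = n - 1
    for k in range(1, m):                          # Thomas algorithm: forward elimination
        w = lower[k] / diag[k - 1]
        diag[k] -= w * upper[k - 1]
        rhs[k] -= w * rhs[k - 1]
    q = [0.0] * m
    q[-1] = rhs[-1] / diag[-1]
    for k in range(m - 2, -1, -1):                 # back substitution
        q[k] = (rhs[k] - upper[k] * q[k + 1]) / diag[k]
    return xs, [0.0] + q + [1.0]

def committor_quadrature(V, beta, xs):
    """Reference solution via cumulative trapezoidal integration of exp(beta V)."""
    w = [math.exp(beta * V(x)) for x in xs]
    cum = [0.0]
    for i in range(1, len(xs)):
        cum.append(cum[-1] + 0.5 * (xs[i] - xs[i - 1]) * (w[i - 1] + w[i]))
    return [c / cum[-1] for c in cum]

V = lambda x: (x * x - 1.0) ** 2                   # hypothetical double-well potential
dV = lambda x: 4.0 * x * (x * x - 1.0)
beta, a, b, n = 1.0, -1.0, 1.0, 400

xs, q_fd = committor_fd(dV, beta, a, b, n)
q_ref = committor_quadrature(V, beta, xs)
max_err = max(abs(u - w) for u, w in zip(q_fd, q_ref))
q_b = [1.0 - v for v in q_fd]                      # backward committor via (3.9), reversible case
```

By symmetry of the potential the committor equals one half at the barrier top $x = 0$. Central differencing of the drift term is adequate here because the grid is fine relative to $\beta |V'|$; for strongly drift-dominated problems a different discretization may be needed.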
Remark 3.1.2. Let $r(x)$ denote the mean first passage time (mean first hitting time) of the process $X_t$ with respect to the set $S \subset \mathbb{R}^d$, conditional on $X(0) = x$. Formally, $r(x)$ is given by
$$r(x) = E_x[\tau_S],$$
where $\tau_S$ is the first hitting time of the process $X_t$ with respect to the set $S$. If we set $g \equiv 0$ and $f \equiv -1$, then a similar reasoning as for the committor function shows that $r(x)$ satisfies
$$\begin{cases} L_{bw} r = -1 & \text{in } \mathbb{R}^d \setminus S, \\ r = 0 & \text{on } \partial S, \end{cases} \tag{3.10}$$
where $L_{bw}$ is the operator in (3.2).
3.1.3. Probability Density Function of Reactive Trajectories
Let $A \subset \mathbb{R}^d$ and $B \subset \mathbb{R}^d$ denote the reactant and product states, respectively. What is the probability density to observe a reactive trajectory at position $x \notin A \cup B$ at time $t$, conditional on it being reactive at time $t$?
Intuitively, the probability density to observe a reactive trajectory at point $x$ is given by the probability density to observe any trajectory (reactive or not) at $x$, which is $\rho(x)$, times the probability $q_b(x)$ that the trajectory came from $A$ rather than from $B$, times the probability $q(x)$ that the trajectory reaches $B$ first rather than $A$.
Formally, the probability density function of reactive trajectories $\rho_{AB}(x)$ is defined such that, given any observable $\phi(x)$, we have
$$\lim_{T \to \infty} \frac{\int_{R \cap [-T, T]} \phi(X(t)) \, dt}{\int_{R \cap [-T, T]} dt} = \int_{\Omega_{AB}} \phi(x) \rho_{AB}(x) \, dx, \tag{3.11}$$
where $\Omega_{AB} = \mathbb{R}^d \setminus (A \cup B)$. Indeed, it is proven in [34], by exploiting both ergodicity and the strong Markov property of the dynamics, that the intuitive picture is right, namely that $\rho_{AB}(x)$ can be expressed in terms of $\rho(x)$, $q(x)$ and $q_b(x)$ as
$$\rho_{AB}(x) = Z_{AB}^{-1}\, q(x)\, q_b(x)\, \rho(x), \tag{3.12}$$
where the normalization constant $Z_{AB}$,
$$Z_{AB} = \int_{\Omega_{AB}} q(x)\, q_b(x)\, \rho(x) \, dx, \tag{3.13}$$
is the total probability to encounter a reactive trajectory.
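The formulas (3.12) and (3.13) can be evaluated directly once $\rho$, $q$ and $q_b$ are known. The sketch below does this on a grid for a one-dimensional reversible example and is meant only as an illustration under stated assumptions: Smoluchowski dynamics in a hypothetical double-well potential, $A = (-\infty, a]$ and $B = [b, \infty)$, the one-dimensional closed-form committor $q(x) = \int_a^x e^{\beta V} dy / \int_a^b e^{\beta V} dy$, $q_b = 1 - q$ by (3.9), and a normalized $\rho$ whose normalization integral is truncated to $[-3, 3]$. All names and parameter values are made up.

```python
import math

beta = 3.0                                   # hypothetical inverse temperature
a, b = -0.8, 0.8                             # A = (-inf, a], B = [b, inf)

def V(x):                                    # hypothetical double-well potential
    return (x * x - 1.0) ** 2

def trapezoid(ys, h):
    """Composite trapezoidal rule on a uniform grid with spacing h."""
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

# Normalization of rho; the tails beyond [-3, 3] are negligible for this potential.
n_full = 6000
h_full = 6.0 / n_full
Z = trapezoid([math.exp(-beta * V(-3.0 + i * h_full)) for i in range(n_full + 1)], h_full)

def rho(x):
    return math.exp(-beta * V(x)) / Z

# Grid on the transition region Omega_AB = (a, b).
n = 1600
h = (b - a) / n
xs = [a + i * h for i in range(n + 1)]

# Forward committor by cumulative trapezoidal integration of exp(beta V).
w = [math.exp(beta * V(x)) for x in xs]
cum = [0.0]
for i in range(1, n + 1):
    cum.append(cum[-1] + 0.5 * h * (w[i - 1] + w[i]))
q = [c / cum[-1] for c in cum]
q_b = [1.0 - v for v in q]                   # reversible case, cf. (3.9)

unnormalized = [q[i] * q_b[i] * rho(xs[i]) for i in range(n + 1)]
Z_AB = trapezoid(unnormalized, h)            # probability of being reactive, cf. (3.13)
rho_AB = [u / Z_AB for u in unnormalized]    # density of reactive trajectories, cf. (3.12)
total_mass = trapezoid(rho_AB, h)
```

By construction `rho_AB` integrates to one over $\Omega_{AB}$, and `Z_AB` lies strictly between zero and one because $q\, q_b \le 1/4$ and $\rho$ is normalized. For a symmetric potential and symmetric sets the resulting density is symmetric about the barrier top.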
3.1.4. Probability Current and Transition Rate
The probability density ρAB(x) is not the only quantity of interest as it may not be sufficient to characterize the reaction pathway. To get a better understanding of this pathway, we may also ask about the probability current of reactive trajectories. Roughly, this current is such that, integrated over any surface in ΩAB, it gives the probability flux of reactive trajectories across this surface, that is, the net balance