III – 2 Spatio-Temporal Mixed Effects Models

data, called the generic spatio-temporal model. This model relies on the notion of parallel variations of a curve in a Riemannian manifold, which we therefore briefly recall. Indeed, this model no longer assumes that individuals follow a geodesic evolutionary trajectory but rather a trajectory parallel to the representative curve, which is still assumed to be geodesic.

We want to generalize the notion of parallel lines to Riemannian manifolds. Let $M$ be a geodesically complete Riemannian manifold, $\gamma \colon I \subset \mathbb{R} \to M$ a differentiable curve on $M$, $t_0 \in I$ and $w \in T_{\gamma(t_0)}M$ a tangent vector. We call parallel variation of $\gamma$ along the direction $w$ the curve $\eta^w(\gamma; \cdot) \colon I \to M$ defined by

$$\eta^w(\gamma; \cdot) \colon t \mapsto \mathrm{Exp}_{\gamma(t)}\big( P_{\gamma,t_0,t}(w) \big),$$

where $P_{\gamma,t_0,t}(w)$ is the parallel transport of the vector $w$ along the curve $\gamma$ between the points $\gamma(t_0)$ and $\gamma(t)$. In the Euclidean case, the parallel variation of a straight line is again a straight line, parallel to the first one. In the Riemannian case this need not hold: because of the curvature of the manifold, the parallel variation of a geodesic has a priori no reason to remain a geodesic. The Earth offers a convincing example: the equator is a geodesic, whereas its parallels, including the two tropics, are not.
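The Earth example can be checked numerically. The following is a minimal sketch on the unit sphere, assuming the standard closed-form exponential map and parallel transport of the sphere; the function names `sphere_exp`, `sphere_transport` and `parallel_variation` are hypothetical helpers introduced for illustration, and the step-by-step transport is exact here only because the discretized curve (the equator) is itself a geodesic.

```python
import numpy as np

def sphere_exp(p, v):
    """Riemannian exponential on the unit sphere at p, applied to the tangent vector v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p.copy()
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def sphere_transport(p, q, w):
    """Parallel transport of w from T_p to T_q along the minimizing geodesic (q != -p)."""
    return w - (np.dot(q, w) / (1.0 + np.dot(p, q))) * (p + q)

def parallel_variation(curve, w0):
    """Discrete parallel variation of `curve` (array of points, curve[0] = gamma(t0)):
    transport w0 from point to point, then shoot with the exponential map."""
    w = w0.copy()
    points = [sphere_exp(curve[0], w)]
    for p, q in zip(curve[:-1], curve[1:]):
        w = sphere_transport(p, q, w)
        points.append(sphere_exp(q, w))
    return np.array(points)

# Half of the equator, sampled finely enough that consecutive points are close.
t = np.linspace(0.0, np.pi, 200)
equator = np.stack([np.cos(t), np.sin(t), np.zeros_like(t)], axis=1)

# Direction w: pointing north, orthogonal to the velocity of the equator.
theta = 0.4
w0 = theta * np.array([0.0, 0.0, 1.0])

eta = parallel_variation(equator, w0)
heights = eta[:, 2]  # third coordinate of every point of the parallel variation
```

In this sketch every point of `eta` stays on the sphere at the constant height sin(theta), so the parallel variation of the equator is a circle of latitude. Such a circle lies in a plane that does not pass through the center of the sphere, hence it is not a great circle and not a geodesic.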

In particular, computing the parallel variation of a given curve first requires transporting the direction vector $w$ along the curve $\gamma$ for every time $t$. This can be numerically very expensive; numerical schemes were recently designed to reduce the computational cost (Louis et al., 2017, 2018).

The model proposed by Schiratti et al. (2015, 2017) is to be compared to the model developed in Durrleman et al. (2009). Consider a longitudinal dataset $(y_i, t_i)$ where, for each subject $i$, we observe $y_i = (y_{i,j})_{j \in \llbracket 1,k_i \rrbracket} \in \mathbb{R}^{k_i}$ at times $t_i = (t_{i,j})_{j \in \llbracket 1,k_i \rrbracket}$. The idea of Schiratti et al. (2015, 2017) is to see these observations as noisy samples along individual-specific trajectories, themselves derived from a representative path of the global evolution of the population through spatio-temporal deformations. Moreover, as explained previously and for numerical reasons, the authors want to build a parametric model.

To do so, they require the representative trajectory to be geodesic. This trajectory is then naturally parameterized by the triplet $(t_0, p_0, v_0)$, where $t_0 \in \mathbb{R}$ is a reference time, $p_0 \in M$ the value of the curve at that time and $v_0 \in T_{p_0}M$ the value of its velocity vector (Gallot et al., 2004). The representative curve $\gamma_0 \colon \mathbb{R} \to M$ then writes

$$\gamma_0 \colon t \mapsto \mathrm{Exp}_{p_0,t_0}(v_0)(t),$$

where $\mathrm{Exp}_{p_0,t_0}(v_0)$ denotes, through the exponential map, the geodesic passing through $p_0$ with velocity $v_0$ at time $t_0$. At the individual scale, the trajectories are seen as parallel variations, with time warps, of this curve $\gamma_0$; that is to say, for each individual $i$ and given a vector $w_i \in T_{p_0}M$,

$$\gamma_i \colon t \mapsto \eta^{w_i}\big(\gamma_0; \psi_i(t)\big), \qquad \psi_i(t) = \alpha_i (t - t_0 - \tau_i) + t_0,$$

where the random effects $(\alpha_i, \tau_i)$ are to be interpreted as acceleration factors and individual time shifts. In other words, Schiratti et al. (2015, 2017) allow each subject to follow its own spatial trajectory of evolution at its own pace $\alpha_i$ while being ahead of or behind the mean trajectory, according to the sign of $\tau_i$, by an amount $|\tau_i|$. Figure 2.5 illustrates this construction. The statistical model is then given by

$$y_{i,j} = \eta^{w_i}\big( \mathrm{Exp}_{p_0,t_0}(v_0) ;\, \alpha_i (t_{i,j} - t_0 - \tau_i) + t_0 \big) + \varepsilon_{i,j},$$

with $(\varepsilon_{i,j})$ an independent and identically distributed Gaussian white noise. For the model to be identifiable, the vectors $w_i$ have to be chosen orthogonal to the trajectory $\gamma_0$. Moreover, in order to reduce the dimension of the space of parameters to estimate (Giraud, 2014), they are required to arise from an independent components statistical model (Hyvärinen et al., 2004). In other words, Schiratti et al. (2015, 2017) assume that the $w_i$ are all linear combinations of independent sources and, instead of directly estimating the $w_i$, they propose to estimate a design matrix $A$ and sources $s_i$ such that $w_i = A s_i$. The parameters are estimated through a well-defined maximum a posteriori, whose existence is proved by Schiratti (2016) in his PhD work, using the MCMC-SAEM algorithm introduced in Section I.3.
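The decomposition $w_i = A s_i$ under the orthogonality constraint can be illustrated as follows. This is only a sketch under simplifying assumptions that are not in the text: the tangent space at $p_0$ is identified with $\mathbb{R}^d$ equipped with the Euclidean inner product, the sizes are arbitrary, and the sources are drawn from a Laplace distribution purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: tangent space dimension, number of sources, number of subjects.
d, n_sources, n_subjects = 5, 2, 4

# Velocity v0 of the representative geodesic at p0 (arbitrary values).
v0 = rng.normal(size=d)

# Start from an arbitrary matrix, then remove from each column its component
# along v0, so that every column of A is orthogonal to v0.
A_raw = rng.normal(size=(d, n_sources))
A = A_raw - np.outer(v0, v0 @ A_raw) / (v0 @ v0)

# Independent sources s_i, one column per subject (Laplace law chosen arbitrarily).
S = rng.laplace(size=(n_sources, n_subjects))

# Space shifts: column i of W is w_i = A s_i.
W = A @ S

# Number of scalars to estimate with and without the decomposition.
n_params_decomposed = d * n_sources + n_sources * n_subjects
n_params_direct = d * n_subjects
```

Because the columns of `A` are orthogonal to `v0`, every `w_i` built this way is orthogonal to `v0` as well, whatever the sources. The parameter count only becomes favorable when the number of subjects is large compared with the number of sources, which is the regime the decomposition targets.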

This model was notably applied to the early detection of Alzheimer’s disease, where it proved effective. The study of Bilgel et al. (2016) on β-amyloid plaques, one of the hallmarks of Alzheimer’s disease, is another illustration of the applicability of this model.

[Figure 2.5: schematic on the manifold $M$ showing the representative curve $\gamma_0$, an individual trajectory $\gamma_i$, the vector $w_i$, the labels $\eta^{w_i}$ and $\psi_i$, and observations marked +.]

Figure 2.5 – Generic spatio-temporal model of Schiratti et al. (2015, 2017). We observe noisy samples along individual-specific trajectories, constructed as spatio-temporal deformations of a geodesic representative trajectory. In particular, each trajectory $\gamma_i$ evolves at its own pace, given by the time warp $\psi_i$, and with its own geometry, given by the translation vector $w_i$.

In Section I.1, we discussed the need to be able to estimate the time $t_0$. Let us check that in the one-dimensional case, i.e. with $M = \mathbb{R}$, the model of Schiratti et al. (2015, 2017) actually differs from the random slope and intercept model. Assume that $\mathbb{R}$ is endowed with its usual metric. Then, the generic model writes, for each subject $i$ and each observation $j$,

$$y_{i,j} = p_0 + v_0 \alpha_i (t_{i,j} - t_0 - \tau_i) + \varepsilon_{i,j} = v_0 \alpha_i (t_{i,j} - t_0) + p_0 - v_0 \alpha_i \tau_i + \varepsilon_{i,j}.$$

This model is nonlinear since it involves a product of the random effects $\alpha_i$ and $\tau_i$. On the other hand, the random slope and intercept model, with notations consistent with those of Schiratti et al. (2015, 2017), can be written in the form

$$y_{i,j} = (v_0 + \alpha_i)(t_{i,j} - t_0) + (p_0 + \beta_i) + \varepsilon_{i,j}.$$

More precisely, we can interpret these models as follows: the random slope and intercept model compares the distribution of the observations at a reference time $t_0$, while the model of Schiratti et al. (2015, 2017) compares the observations with respect to a reference measurement value $p_0$ and thus allows for the estimation of the time $t_0$.
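The comparison between the two models can be made concrete with a short numerical sketch of the noise-free trajectories in the case $M = \mathbb{R}$. The function names and all numerical values below are hypothetical and chosen only for illustration.

```python
import numpy as np

def generic_model_1d(t, p0, v0, t0, alpha, tau):
    """Noise-free trajectory of the generic model when M = R."""
    return p0 + v0 * alpha * (t - t0 - tau)

def slope_intercept_model(t, p0, v0, t0, a, b):
    """Noise-free trajectory of the random slope and intercept model,
    with slope offset a and intercept offset b."""
    return (v0 + a) * (t - t0) + (p0 + b)

# Hypothetical population parameters and random effects of one subject.
p0, v0, t0 = 2.0, 0.5, 70.0
alpha, tau = 1.3, -4.0
t = np.linspace(60.0, 80.0, 5)

y_generic = generic_model_1d(t, p0, v0, t0, alpha, tau)

# The same straight line written with a slope offset and an intercept offset.
a = v0 * (alpha - 1.0)
b = -v0 * alpha * tau  # involves the product of the two random effects
y_linear = slope_intercept_model(t, p0, v0, t0, a, b)

# Time at which this subject reaches the reference value p0.
t_reach = t0 + tau
value_at_t_reach = generic_model_1d(t_reach, p0, v0, t0, alpha, tau)
```

For a single subject the two parameterizations describe the same line, but the intercept offset `b` depends on the product of `alpha` and `tau`, so the two models do not put the same structure on the random effects. The last lines also show the interpretation of $\tau_i$: the subject reaches the reference value $p_0$ at time $t_0 + \tau_i$ rather than at $t_0$.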

In the same vein, Kim et al. (2017) also proposed a mixed effects model for the study of longitudinal data: the Riemannian nonlinear mixed effects model. It appears as a kind of generalization of the geodesic hierarchical model introduced by Muralidharan and Fletcher (2012) and described on page 43. More precisely, given a dataset $(y_i, t_i)$ arising from the observation of $y_i = (y_{i,j})_{j \in \llbracket 1,k_i \rrbracket}$ at times $t_i = (t_{i,j})_{j \in \llbracket 1,k_i \rrbracket}$, the model writes

$$\begin{cases} y_{i,j} = \mathrm{Exp}\Big( \mathrm{Exp}\big( b_i ;\, \Gamma_{b,b_i}(v)\, \alpha_i (t_{i,j} - \tau_i - t_0) \big) ;\, \varepsilon_{i,j} \Big) \\ b_i = \mathrm{Exp}(b ; u_i), \end{cases}$$

where $\Gamma_{b,b_i}(v) \in T_{b_i}M$ is the parallel transport of the vector $v \in T_bM$ along a geodesic connecting $b$ and $b_i$. In other words, if we denote by $\gamma_i$ such a geodesic, $\Gamma_{b,b_i} = P_{\gamma_i,0,1}$ with the previous notations. The model of Kim et al. (2017) shares an affine time warp with that of Schiratti et al. (2015, 2017). However, unlike the latter, its parameters cannot be estimated exactly because of the high complexity of the model. The authors thus introduce an approximate estimation technique to cope with this issue. Lastly, this model was also applied to cortical atrophy.

The model of Schiratti et al. (2015, 2017) was recently improved in order to broaden its applicability. Indeed, in the form described above, the model may involve high-dimensional parameters, which makes their estimation difficult. Koval et al. (2017) propose a first upgrade of the model for the study of networks, i.e. of measurements varying over time on a fixed graph, and apply it to cortical atrophy (Koval et al., 2018).

Bône et al. (2018) propose an instance of the model of Schiratti et al. (2015, 2017) in the case of shapes in the LDDMM framework. As explained on page 41, this model relies on a sparse representation of the group of deformations via control points (Durrleman et al., 2011, 2013). Let $c_0 \in \mathbb{R}^{n_{cp} d}$ be a set of $n_{cp}$ control points, $y_0 \in M \subset \mathbb{R}^d$ a shape and $m_0 \in \mathbb{R}^{n_{cp} d}$ a momentum. The representative path then writes $\gamma_0 \colon t \mapsto \mathrm{Exp}_{c_0,t_0,t}(m_0) \circ y_0$ and the individual trajectories are obtained through parameterized parallels of this representative curve; in other words,

$$\gamma_i \colon t \mapsto \eta^{w_i}(\psi_i(t)) \circ y_0, \quad \text{where} \quad \eta^{w_i} \colon t \mapsto \mathrm{Exp}_{c(t),0,1}\big( P_t(w_i) \big),$$

and where, for every momentum vector $w$, $P_t(w)$ is the parallel transport of $w$ along the curve $\gamma_0$ between times $t_0$ and $t$. We denote by $c(t) \in \mathbb{R}^{n_{cp} d}$ the control points at time $t$, namely $c(t) = \mathrm{Exp}_{c_0,t_0,t}(m_0) \circ c_0$. As in the generic case, the vectors $w_i$ are assumed to be linear combinations of independent sources and the parameters of the model are estimated via an MCMC-SAEM algorithm.

Finally, a last upgrade of the generic model was proposed by Bône et al. (2019) with neural networks, whose use makes it possible to handle very high-dimensional data. The generic model is still the subject of intensive research; for example, the work of Debavelaere et al. (2019) seeks to include the generic model in a mixture model in order to classify individuals from the same cohort into different sub-populations. This makes it possible, for example, to distinguish the evolution of healthy subjects from that of ill patients.

IV. Thesis Outline

The model we propose in this manuscript builds on the generic model and seeks to generalize it to situations where the evolution dynamic is not univariate, which is the standard situation in most applications. This is particularly the case in chemotherapy monitoring, to which we devote a full chapter: Chapter 5. A patient given a new cancer treatment typically goes through three phases: a decrease in tumor size while the patient responds to therapy, then a stable phase and, most of the time, an escape from the treatment marked by a new increase in tumor size. Therefore, it is neither realistic nor reasonable to model the representative trajectory of the population with a geodesic curve. We want to overcome this constraint by using a piecewise-geodesic representative path. The progression of the disease can then go through distinct evolution phases: one for each geodesic piece of the representative curve.

Furthermore, despite being an innovative approach that opened up a new research field, the generic model developed by Schiratti et al. (2015, 2017) lacks theoretical guarantees, so that the reliability of its results can only be assessed through graphical validation. Hence, we focused on proving the consistency of our model, obtaining as a byproduct a consistency result for the model of Schiratti et al. (2015, 2017) and for the extensions described in the previous section.

Following standard methods for statistical inference in nonlinear mixed effects models, the parameters of our model are estimated through an MCMC-SAEM algorithm. However, our numerical experiments suffered from technical limitations, such as a high sensitivity of the SAEM algorithm to its initial conditions. We therefore worked on a potential upgrade of this algorithm building on simulated annealing or tempering techniques.

Finally, the work presented in this document stands at the crossroads of what we believe to be the foundations of mathematical modeling in medicine: clinical applications, with for instance an ongoing collaboration with oncologists and radiologists at the Georges Pompidou European Hospital (HEGP, for hôpital européen Georges-Pompidou in French); mathematical theory, as a guarantor of the reliability of the proposed models; and the development of efficient numerical tools allowing for the analysis of complex heterogeneous datasets, which become increasingly massive as imaging techniques improve.

In document Modèles statistiques et algorithmes stochastiques pour l’analyse de données longitudinales à dynamiques multiples et à valeurs sur des variétés riemaniennes (Page 79-84)