Towards a spatio-temporal framework
4.1 The spatio-temporal modelling framework: DSTM
The dynamic spatio-temporal model (DSTM) framework is investigated first. The DSTM is a widely used family of models for spatio-temporal data analysis. Cressie & Wikle (2011) describe the essence of this type of model as the ‘hierarchical state space framework’.
It is assumed that the true process of interest cannot be observed perfectly, so the first level employs ‘a mapping that relates a set of observations to the true process of interest’.
The second level then specifies ‘a model for this true (hidden/latent/state) process’, which typically involves some form of Markovian dependence. There is usually a third level providing assumptions on the model parameters.
Following Cressie & Wikle (2011), the DSTM can be written schematically in terms of a data model, a process model and a parameter model. At the top level is the data model,
which associates the observed data Z(x; r) over the spatial domain D (x ∈ D) and the time domain T (r ∈ T) with a latent/hidden process Y(s; t),
\[ \left[ \{Z(x; r) : x \in D,\ r \in T\} \,\middle|\, \{Y(s; t) : s \in B_s,\ t \in B_t\},\ \Psi_D \right]. \]
Here Bs and Bt represent the neighbourhoods of s and t respectively, and ΨD is the collection of parameters of this mapping. In the middle level is the process model,
\[ \left[ Y(s; t) \,\middle|\, \{Y(w; t-\tau_1) : w \in B_s^{(1)}\}, \cdots, \{Y(w; t-\tau_q) : w \in B_s^{(q)}\},\ \Psi_P \right], \]
where τ1, · · · , τq are the time lags, $B_s^{(1)}, \cdots, B_s^{(q)}$ represent the neighbourhoods of s at the different time lags and ΨP is the collection of process model parameters. The process model describes the spatio-temporal dynamics of the hidden process. Finally, the parameter model at the bottom level is
\[ \left[ \Psi_D, \Psi_P \mid \Psi_H \right], \]
with ΨH representing the collection of ‘hyperparameters’. Various types of models can be built on this framework through the specification of the data, process and parameter models and the associated hierarchy (Cressie & Wikle, 2011; Wikle & Hooten, 2010).
For the data model written specifically as
\[ Z(\cdot; t) = A_t Y(\cdot; t) + \varepsilon(\cdot; t), \]
both linear and non-linear mappings between the observations Z(·; t) and the latent process Y(·; t) can be considered through the design of At. It also provides the possibility of dimension reduction (Wikle & Cressie, 1999) through the basis representation of Y(·; t) as
\[ Y(\cdot; t) = \Phi(\cdot)\beta_t + \omega(\cdot; t). \]
Typical choices of the basis functions Φ(·) include Fourier bases, empirical orthogonal functions (EOFs), wavelets, splines and bi-square functions. A related approach using the idea of low-rank representation can be found in Mardia et al. (1998), for the Kriged Kalman filter.
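To illustrate the dimension reduction, the following is a minimal Python sketch (not taken from the cited works) that evaluates bi-square basis functions on a one-dimensional domain and projects one noisy time slice onto them by ordinary least squares. The function name `bisquare_basis`, the domain [0, 1], the number of centres, the radius and the test field are all assumptions made for illustration only.

```python
import numpy as np

def bisquare_basis(locations, centres, radius):
    """Evaluate bi-square basis functions at 1-D locations.

    Returns an (n_locations, n_centres) matrix Phi with
    Phi[i, j] = (1 - (d_ij / radius)^2)^2 if d_ij <= radius, else 0,
    where d_ij is the distance between location i and centre j.
    """
    d = np.abs(locations[:, None] - centres[None, :])
    return np.where(d <= radius, (1.0 - (d / radius) ** 2) ** 2, 0.0)

rng = np.random.default_rng(0)
locations = np.linspace(0.0, 1.0, 200)  # n = 200 observation sites (assumed)
centres = np.linspace(0.0, 1.0, 10)     # p = 10 basis functions (assumed)
Phi = bisquare_basis(locations, centres, radius=0.25)

# A smooth illustrative field observed with noise at a single time point.
y_true = np.sin(2 * np.pi * locations)
z = y_true + 0.1 * rng.standard_normal(locations.size)

# Least-squares estimate of the p coefficients beta_t for this time slice.
beta_hat, *_ = np.linalg.lstsq(Phi, z, rcond=None)
y_hat = Phi @ beta_hat  # low-rank reconstruction of the field
```

In this sketch a field observed at n = 200 sites is summarised by p = 10 coefficients; the residual `z - y_hat` plays the role of ω(·; t) together with the measurement error.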
For the process model, a Markov-type dynamic is often used to describe the evolution of the latent process,
\[ Y(\cdot; t) = M\,Y(\cdot; t-1) + u(\cdot; t), \]
where M is the propagator matrix. This has the advantage of avoiding the specification of a joint spatio-temporal covariance structure, which is usually impractical in real applications (Cressie & Wikle, 2011). There are various designs of the matrix M, e.g. spatio-temporal random walk, ‘lagged nearest-neighbour’ models, vector auto-regressive (VAR) models, PDE/IDE-based models and non-linear specifications (Wikle & Hooten, 2010). It is worth pointing out that the estimation of M can be difficult for a high-dimensional process, especially if the number of time points T is small, so a parameterization of M is often adopted to reduce the estimation complexity.
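The effect of parameterizing M can be sketched in Python as follows. This is an illustrative example under assumed settings, not a method from the cited references: sites lie on a line, each site depends only on itself and its two neighbours at the previous time point, the innovations are independent standard Gaussian, and the three coefficients are estimated by ordinary least squares. The helper name `lagged_nn_propagator` and all numerical values are hypothetical.

```python
import numpy as np

def lagged_nn_propagator(n, a_self, a_left, a_right):
    """n x n propagator in which site i at time t depends on
    sites i-1, i and i+1 at time t-1 (1-D lagged nearest neighbour)."""
    M = a_self * np.eye(n)
    M += a_left * np.eye(n, k=-1)   # weight on the left neighbour
    M += a_right * np.eye(n, k=1)   # weight on the right neighbour
    return M

rng = np.random.default_rng(1)
n, T = 50, 100
theta_true = (0.6, 0.15, 0.15)      # assumed values for illustration
M = lagged_nn_propagator(n, *theta_true)

# The process is stable when the spectral radius of M is below one.
spectral_radius = np.max(np.abs(np.linalg.eigvals(M)))

# Simulate Y_t = M Y_{t-1} + u_t with standard Gaussian innovations.
Y = np.zeros((T, n))
for t in range(1, T):
    Y[t] = M @ Y[t - 1] + rng.standard_normal(n)

# Estimate the three parameters by least squares, using interior sites only.
X = np.column_stack([
    Y[:-1, 1:-1].ravel(),  # same site at t-1
    Y[:-1, :-2].ravel(),   # left neighbour at t-1
    Y[:-1, 2:].ravel(),    # right neighbour at t-1
])
target = Y[1:, 1:-1].ravel()
theta_hat, *_ = np.linalg.lstsq(X, target, rcond=None)
```

An unrestricted M would have n² = 2500 free entries here, which cannot be estimated reliably from T = 100 time points, whereas the parameterized version has only three.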
The specification of the parameter model is usually associated with the hierarchical design of the DSTM. An example is parameterizing the error covariance matrix at the data or process model level (Xu & Wikle, 2007). Priors may be assigned to the parameters in a Bayesian setting. It is also possible to incorporate random parameters, since a deterministic model might not be able to describe a complex process. However, one needs to be aware of the interpretation and identifiability issues of such settings (Cressie & Wikle, 2011). In general, a sensible design of the parameter model can simplify both the evaluation of the model distribution and its computation.
Estimation of the DSTM falls into two general categories, summarised by Cressie & Wikle (2011) as empirical hierarchical modelling (EHM) and Bayesian hierarchical modelling (BHM). Both approaches estimate the models using sequential implementation in an iterative manner. For inference on the model parameters, EHM often adopts an EM-type algorithm, whereas BHM applies MCMC methods such as Gibbs samplers, or other sampling techniques. For the update of the system states, EHM often uses the Kalman filter/smoother in linear Gaussian models, while BHM implementation usually involves sampling from the filtering and prediction distributions. A Kalman filter step can be added to the sampling procedure to update the system states and speed up convergence, provided the dependencies between the current states and previous states are relatively strong.
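The state update used in EHM for linear Gaussian models can be sketched with a textbook Kalman filter. The Python code below is a generic illustration, assuming the model z_t = A y_t + ε_t with ε_t ~ N(0, R) and y_t = M y_{t−1} + u_t with u_t ~ N(0, Q); the function name `kalman_filter`, the dimensions and all parameter values are assumptions for demonstration and are not the implementation used in the cited works.

```python
import numpy as np

def kalman_filter(z, A, M, R, Q, m0, P0):
    """Filtered means and covariances for a linear Gaussian state space model.

    z : (T, n) observations; A : (n, p) observation matrix;
    M : (p, p) propagator; R, Q : observation and innovation covariances;
    m0, P0 : mean and covariance of the state before the first time point.
    """
    T, p = z.shape[0], m0.size
    means = np.empty((T, p))
    covs = np.empty((T, p, p))
    m, P = m0, P0
    for t in range(T):
        # Prediction step: propagate the state one time point ahead.
        m_pred = M @ m
        P_pred = M @ P @ M.T + Q
        # Update step: correct the prediction using the observation z[t].
        S = A @ P_pred @ A.T + R               # innovation covariance
        K = np.linalg.solve(S, A @ P_pred).T   # gain K = P_pred A^T S^{-1}
        m = m_pred + K @ (z[t] - A @ m_pred)
        P = P_pred - K @ S @ K.T
        means[t], covs[t] = m, P
    return means, covs

# Simulated example with assumed dimensions and parameters.
rng = np.random.default_rng(2)
T, n, p = 100, 5, 2
A = rng.standard_normal((n, p))
M = np.array([[0.8, 0.1], [0.0, 0.7]])
Q = 0.1 * np.eye(p)
R = 0.5 * np.eye(n)

z = np.zeros((T, n))
state = np.zeros(p)
for t in range(T):
    state = M @ state + rng.multivariate_normal(np.zeros(p), Q)
    z[t] = A @ state + rng.multivariate_normal(np.zeros(n), R)

means, covs = kalman_filter(z, A, M, R, Q, m0=np.zeros(p), P0=np.eye(p))
```

In an EM-type algorithm, a filter of this kind (followed by a smoother) would supply the state moments in the E-step, with the parameters A, M, R and Q updated in the M-step.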
The DSTM has been widely applied to remote-sensing data, ranging from research on ocean water temperature by Berliner et al. (2000) and Stroud et al. (2001), and tropical ocean surface wind by Wikle et al. (2001) and Wikle & Berliner (2005), to global CO2 by Katzfuss & Cressie (2011, 2012), Nguyen et al. (2014) and many others. The method shows distinctive advantages in these applications, e.g. its power in dimension reduction, its flexibility in describing the system dynamics and its ability to accommodate different spatial resolutions.
This thesis considers one particular type of DSTM, consisting of three levels.
(a) A data model exploiting ‘dimension reduction’ through basis representation, similar to the one proposed in Wikle & Cressie (1999),
\[ Z(s; t) = Y(s; t) + \varepsilon(s; t), \tag{4.1} \]
with the latent process Y (s; t) specified using basis function representation
\[ Y(s; t) = \Phi_\beta(s)\beta_t + \zeta(s; t). \tag{4.2} \]
(b) A process model describing the dynamics through lagged temporal dependence, such as in a vector auto-regressive model
\[ \beta_t = \sum_{q} M_q \beta_{t-\tau_q} + u_t. \tag{4.3} \]
(c) A parameter model putting constraints/priors on ε(s; t), ζ(s; t), Mq and ut, which completes the hierarchical design and makes the model identifiable.
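The three levels (a) to (c) can be made concrete with a small simulation. The Python sketch below generates data from equations (4.1) to (4.3) under assumptions that are not part of the model specification above: ε(s; t), ζ(s; t) and ut are taken to be independent zero-mean Gaussian with fixed standard deviations, the initial coefficients are zero, and a three-term Fourier basis is used. The function name `simulate_dstm` and every numerical value are hypothetical.

```python
import numpy as np

def simulate_dstm(Phi, M_list, lags, T, sd_u, sd_zeta, sd_eps, rng):
    """Simulate T time points from equations (4.1)-(4.3).

    Phi : (n, p) basis matrix; M_list : list of (p, p) matrices M_q;
    lags : list of positive integer lags tau_q, one per matrix.
    Returns beta (T, p), the latent field Y (T, n) and the data Z (T, n).
    """
    n, p = Phi.shape
    max_lag = max(lags)
    beta = np.zeros((T + max_lag, p))   # zero initial conditions (assumed)
    for t in range(max_lag, T + max_lag):
        # Process model (4.3): sum of lagged terms plus innovation u_t.
        beta[t] = sum(M @ beta[t - tau] for M, tau in zip(M_list, lags))
        beta[t] += sd_u * rng.standard_normal(p)
    beta = beta[max_lag:]
    # Latent process (4.2): basis part plus residual variation zeta.
    Y = beta @ Phi.T + sd_zeta * rng.standard_normal((T, n))
    # Data model (4.1): latent process plus measurement error epsilon.
    Z = Y + sd_eps * rng.standard_normal((T, n))
    return beta, Y, Z

s = np.linspace(0.0, 1.0, 30)           # n = 30 sites on a line (assumed)
Phi = np.column_stack([np.ones_like(s),
                       np.sin(2 * np.pi * s),
                       np.cos(2 * np.pi * s)])
M_list = [0.5 * np.eye(3), 0.2 * np.eye(3)]
lags = [1, 2]
beta, Y, Z = simulate_dstm(Phi, M_list, lags, T=50, sd_u=0.3,
                           sd_zeta=0.1, sd_eps=0.2,
                           rng=np.random.default_rng(3))
```

The fixed standard deviations here stand in for the parameter model in (c); in practice these quantities would be constrained or given priors rather than set by hand.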
Note that the component ζ(s; t) in equation (4.2) is introduced to account for the remaining spatial or spatio-temporal variation which cannot be accommodated by the system dynamic component Φβ(s)βt. It is sometimes assumed that ζ(s; t) is a random component which depends only on the data at time t. This type of model is often referred to as a spatio-temporal random effect (STRE) model. The STRE model has received great interest in recent years; see, for example, Cressie et al. (2010), Kang & Cressie (2010) and Katzfuss & Cressie (2011).
It is possible to further decompose ζ(s; t) as
\[ \zeta(s; t) = \Phi_\eta(s)\eta_t + \omega(s; t), \]
where the basis representation Φη(s)ηt is used to transform a high-dimensional process into a low-dimensional one. The choice of basis Φη can be different from Φβ to reflect different spatial contents, such as the macro and micro spatial scales in Wikle et al. (2001). Alternatively, using the same basis yields the following,
\[ \begin{aligned} Z(s; t) &= Y(s; t) + \varepsilon(s; t) \\ &= \Phi(s)\beta_t + \Phi(s)\eta_t + \varepsilon^{*}(s; t), \end{aligned} \tag{4.4} \]
where ε∗(s; t) = ω(s; t) + ε(s; t).
There are two reasons why this approach is of interest. First, it is flexible in design. The model above allows dimension reduction through basis/spectral representation, as described in Wikle & Cressie (1999). Specifically, the state transition equation with respect to the high-dimensional vector, Yt = (Y(s1; t), · · · , Y(sn; t))⊤, can be transformed into a low-dimensional transition equation of βt without loss of information. This means that the functional representation used in Chapter 3 can be carried into this new setting. Second, the system dynamic Φβ(s)βt can be efficiently estimated using the classical Kalman filter and smoother. All the parameters associated with the system dynamic can be estimated using an EM-type algorithm (Katzfuss & Cressie, 2011).
The DSTM described here is in its most general form. Various models can be built on this framework through the specification of the model components, which makes it possible to describe many different spatio-temporal contents. Associated with these models are a variety of estimation methods. In the next two sections, several aspects of the DSTM framework are investigated, including its connection with the state space model, its estimation using the Kalman filter/smoother within the EM algorithm and the frequently used model specifications. These are crucial to the development of the spatio-temporal model for the remote-sensing image time series.