Chapter 4 Signal exchange of the oceanic common bottlenose dolphin ( Tursiops
4.2.2 Mechanical (tactile) and photic data processing, definition and
4.2.3.2 Multivariate mixed hidden Markov model of subsurface
4.2.3.2.2 Baseline model
To fit a hidden Markov model in this study, a strong simplifying assumption was applied: the data from all recordings were combined and a single model was fitted. The observed data sequences were assumed to be stable and to arise from the same stochastic process. This was deemed appropriate in this case due to its application in other similar datasets and the identical recording protocols across the dataset (DeRuiter et al., 2016; Zucchini et al., 2016; Popov et al., 2017).
For the baseline model, homogeneous (time-constant) state transition probabilities were assumed, $\gamma_{ij} = \Pr(S_t = j \mid S_{t-1} = i)$ (summarised in equation 3 in Table 4.2). In this homogeneous state process, based on call data, steady state (equilibrium) was assumed at the beginning of the observation period. This is because the call behaviour will already have been under way for a period of time, rather than being the result of the recording starting (as per other acoustic studies, e.g., Popov et al., 2017). The initial distribution was therefore taken to be the stationary distribution, $\delta$. Here, $\delta$ was calculated via $\delta(I_N - \Gamma + U) = \mathbf{1}'$, where $I_N$ is an N x N identity matrix, U is an N x N matrix of ones and $\mathbf{1}'$ is a row vector of ones (as per Popov et al., 2017; Zucchini et al., 2016). The homogeneity assumption was, however, relaxed in the covariate models, requiring $\delta$ to be estimated.
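The stationary distribution calculation described above can be sketched as follows. This is an illustration only: the analysis itself was run in R, the function names `solve_linear` and `stationary_distribution` are hypothetical, and the example t.p.m. values are made up rather than taken from the fitted models.

```python
def solve_linear(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting (small dense systems only)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def stationary_distribution(gamma):
    """Return delta solving delta (I_N - Gamma + U) = 1'."""
    n = len(gamma)
    # A = I_N - Gamma + U, with U a matrix of ones
    a = [[(1.0 if i == j else 0.0) - gamma[i][j] + 1.0 for j in range(n)]
         for i in range(n)]
    # delta A = 1' is equivalent to A' delta' = 1
    a_t = [[a[j][i] for j in range(n)] for i in range(n)]
    return solve_linear(a_t, [1.0] * n)


# Hypothetical two-state t.p.m. (values are illustrative only)
gamma = [[0.9, 0.1],
         [0.3, 0.7]]
delta = stationary_distribution(gamma)
```

For a two-state chain the result can be checked against the closed form in equation 7 of Table 4.2, i.e. the off-diagonal transition probabilities divided by their sum.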
It can be problematic to interpret Markov chain parameters directly, thus both the stationary distributions and the transition probability matrix (t.p.m.) were examined for fixed ratio values (as suggested in Patterson et al., 2009). Consequently, the behaviour of the stationary distribution and t.p.m. was examined for all model levels (Popov et al., 2017).
Table 4.2: Components of hidden Markov models used in this study (modified from Popov et al., 2017; Jurafsky & Martin, 2018):

| # | Equation | Description |
|---|---|---|
| 1 | $S_t,\ t = 1, \ldots, n$ | State process for a single dataset. Here a series of states from 1 to N is considered, which determines the one of N distributions that $X_t$ (see equation 2) is derived from. |
| 2 | $X_t,\ t = 1, \ldots, n$ | Observable time series for a single dataset. |
| 3 | $\Gamma = \begin{pmatrix} \gamma_{11} & \cdots & \gamma_{1N} \\ \vdots & \ddots & \vdots \\ \gamma_{N1} & \cdots & \gamma_{NN} \end{pmatrix}$ | An N x N transition probability matrix (t.p.m.). $\Gamma_t$ would indicate the transition probability matrix at time t. |
| 4 | $p_i(x) = \dfrac{\Gamma(x + n_i)}{\Gamma(n_i)\,\Gamma(x + 1)} \left(\dfrac{n_i}{n_i + \mu_i}\right)^{n_i} \left(\dfrac{\mu_i}{n_i + \mu_i}\right)^{x}$ | A negative binomial distribution probability mass function with q = 2 parameters. Note, $n_i$ = state-specific size, $\mu_i$ = mean and $\Gamma(\cdot)$ = the gamma function. |
| 5 | $\Gamma = \begin{pmatrix} 1 - \gamma_{12} & \gamma_{12} \\ \gamma_{21} & 1 - \gamma_{21} \end{pmatrix}$ | A two-state hidden Markov model with t.p.m. to illustrate how the likelihood function was constructed. |
| 6 | $X_1, \ldots, X_n$ | Likelihood of an observed time series. Each term expresses the probability of an observation $X_t$ being generated from a state. Under the assumption of independence of the recording time series, the joint log-likelihood was the sum of the log-likelihoods of the individual recordings. This joint log-likelihood was maximised numerically, using Newton-Raphson-type optimisation in the free statistical software R, using the routine nlm() (R Core Development Team, 2014, RStudio for Mac version 1.0.136). |
| 7 | $\delta = \dfrac{1}{\gamma_{12} + \gamma_{21}} (\gamma_{21}, \gamma_{12})$ | The stationary distribution, when the Markov chain was in equilibrium at the start of the time series. |
| 8 | $\ell(\theta \mid 19, 0, \ldots, 0) = \dfrac{1}{\gamma_{12} + \gamma_{21}} (\gamma_{21}, \gamma_{12}) \begin{pmatrix} p_1(19) & 0 \\ 0 & p_2(19) \end{pmatrix} \times \begin{pmatrix} 1 - \gamma_{12} & \gamma_{12} \\ \gamma_{21} & 1 - \gamma_{21} \end{pmatrix} \begin{pmatrix} p_1(0) & 0 \\ 0 & p_2(0) \end{pmatrix} \cdots \begin{pmatrix} 1 - \gamma_{12} & \gamma_{12} \\ \gamma_{21} & 1 - \gamma_{21} \end{pmatrix} \times \begin{pmatrix} p_1(0) & 0 \\ 0 & p_2(0) \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ | The corresponding likelihood of the initial datasets, where $\theta = (\gamma_{12}, \gamma_{21}, \mu_1, n_1, \mu_2, n_2)'$. In the first dataset, the call rates or mean call frequencies were given by (1.3, 0, ..., 0); the ellipsis does not indicate zeros, but general observations. |
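As an illustration of the state-dependent distribution in equation 4 of Table 4.2, the negative binomial probability mass function in its size/mean parameterisation can be sketched as below. The function name `nbinom_pmf` and the parameter values are hypothetical and chosen only for demonstration; the computation is done on the log scale to avoid overflow in the gamma function.

```python
import math


def nbinom_pmf(x, size, mean):
    """Negative binomial pmf with state-specific size n_i and mean mu_i.

    p(x) = Gamma(x + n) / (Gamma(n) Gamma(x + 1))
           * (n / (n + mu))^n * (mu / (n + mu))^x
    """
    log_p = (math.lgamma(x + size) - math.lgamma(size) - math.lgamma(x + 1)
             + size * math.log(size / (size + mean))
             + x * math.log(mean / (size + mean)))
    return math.exp(log_p)


# Illustrative values only (not estimates from this study)
size, mean = 0.5, 3.0
probabilities = [nbinom_pmf(x, size, mean) for x in range(2000)]
total_mass = sum(probabilities)                                 # close to 1
empirical_mean = sum(x * p for x, p in enumerate(probabilities))  # close to `mean`
```

A small size parameter gives a strongly overdispersed distribution, which is why the truncated sum above runs over a long range of counts before the remaining tail mass becomes negligible.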
4.2.3.2.3 ‘Hidden’ state covariates in hidden Markov model
4.2.3.2.3.1 Development of covariates
Subsurface and surface influences on call behaviour were considered by including these variables as ‘hidden’ state covariates on the probability of transitions between states of call behaviour. The assumption was made that the covariates provided explanatory information about oceanic bottlenose dolphins’ likelihood of transitioning between call states. All the covariates considered in this analysis originated from instantaneous focal follow data, both surface and subsurface. The covariates were unlikely to be independent of one another, so an all-encompassing analysis would not have offered a good compromise between the number of parameters, improved model fit, and biological inference (Popov et al., 2017).
Each model was considered with one covariate (listed in Table 4.3). The only disparity between models of mean frequency and call rate was group size, which was also added as a covariate for mean frequency. Since call rate was calculated based on the number of individuals, group size was not included in the base comparison. Model formulation included only a single covariate at any time to avoid numerical instability for the hidden Markov model analysis performed. Additional covariates were considered separately in subsequent analyses, if required, to maintain numerical stability.
Table 4.3: Potential covariates for hidden Markov models from the original dataset of oceanic common bottlenose dolphins (T. truncatus) and pilot whales (Globicephala sp.). * only included in mean frequency models, ** variable combined with only top models.

| Covariate | Variable definition |
|---|---|
| Year | 1st September 2013 – 31st August 2014 and 1st September 2014 – 31st August 2015 |
| Month | Defined as lunar month |
| Water depth | Water depth (m) at time of recording |
| Substrate | Predominant substrate type – rocky, sandy, vegetation |
| BSS | Beaufort sea state; 1, 2, 3 or 4 |
| Wind speed | 0 – 5, 6 – 10, 11 – 15, 16 – 20 (knots) |
| Surface behaviour | Travelling, milling, resting, foraging, socialising, and diving |
| Calf presence (y/n) | Refer to Appendix 2.2 for calf definition |
| Species ratio | Group size of each species present and whole group were logged according to the same categories as group size. |
| Mixed species (y/n) | Yes/no, indicates if group is oceanic bottlenose dolphin only or pilot whale only (pilot whale only) if no, and mixed oceanic bottlenose dolphin with pilot whale (oceanic bottlenose dolphin mixed) if yes |
| Tactile type | Predominant tactile per minute (Section 4.2.2.) |
| Posture type | Predominant posture per minute (Section 4.2.2.) |
| Tactile/posture rate | Mean number of tactile/posture per minute |
| Detection range | Secchi disk measurements of visibility (m) |
| Surface cohesion | The elliptical spread area was included ([group size/spread area]*100) in the estimation of group density (number of dolphins per 100 m²) |
| Synchrony | The number of animals surfacing in sequential 3-second intervals for a 30-second period |
| Group size* | Collective group size was logged according to three categories: minimum, maximum and the best estimate (Dwyer et al., 2016; Peters & Stockin, 2016; section 4.2.1.1.1.). Best estimate was modelled. |
| Time to change (TTCh)** | Additionally, time to change (TTCh) was included in analysis for the top covariates indicated (as per Popov et al., 2017). The time (minutes) to the nearest (before or after the current time bin) change in group call behaviour is measured. This was included as oceanic bottlenose dolphins coordinate their behaviour (with calls) before, during, and after a change in context. |
4.2.3.2.3.2 Covariates in the t.p.m.
Covariates in the hidden Markov model were assumed to influence the transitions between states, not the state-dependent distributions (which were fixed for a given state; Zucchini et al., 2016). The t.p.m. row constraints, i.e., $\gamma_{ij} \in [0,1]$ and $\sum_{j=1}^{N} \gamma_{ij} = 1$ for $i = 1, \ldots, N$, were handled with a multinomial logistic link function (Popov et al., 2017):

$$\gamma_{ij} = \frac{\exp(z_t' \beta_{ij})}{\sum_{l=1}^{N} \exp(z_t' \beta_{il})},$$

where the vector $z_t' = (1, z_{1,t}, \ldots, z_{k,t})$ included k covariates at time t. Additionally, $\beta_{ij}$ is a (k + 1)-dimensional column vector of estimated coefficients (Popov et al., 2017). This analysis set $\beta_{ii} = 0$ for i = 1,..., N; this is standard practice in multinomial logit modelling (McFadden, 1984; as used in Popov et al., 2017). With the addition of covariates in the t.p.m., the number of parameters increased from N(N - 1) in the baseline model to N(N - 1)(k + 1) (Popov et al., 2017).
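The multinomial logistic link can be illustrated with a short sketch that builds the t.p.m. at one time step from a covariate vector and a set of coefficients. The function name `tpm_from_covariates`, the nested-list layout of the coefficients and all numerical values are assumptions made for this example, not taken from the fitted models.

```python
import math


def tpm_from_covariates(beta, z):
    """Build an N x N t.p.m. from coefficients and covariates.

    beta[i][j] is a list of k + 1 coefficients for the transition i -> j,
    with beta[i][i] fixed to zeros (reference category).
    z is the covariate vector [1, z_1, ..., z_k] at one time step.
    """
    n = len(beta)
    gamma = []
    for i in range(n):
        # Linear predictors z' beta_ij for each destination state j
        eta = [sum(b * v for b, v in zip(beta[i][j], z)) for j in range(n)]
        shift = max(eta)  # subtracted for numerical stability
        weights = [math.exp(e - shift) for e in eta]
        total = sum(weights)
        gamma.append([w / total for w in weights])
    return gamma


# Hypothetical two-state model with k = 1 covariate:
# N(N - 1)(k + 1) = 4 free coefficients.
beta = [
    [[0.0, 0.0], [-2.0, 0.5]],   # from state 1: stay (reference), move to 2
    [[-1.0, 0.0], [0.0, 0.0]],   # from state 2: move to 1, stay (reference)
]
z_t = [1.0, 2.0]                 # intercept term and one covariate value
gamma_t = tpm_from_covariates(beta, z_t)
```

Every row of the resulting matrix sums to one by construction, and setting all covariate coefficients (but not the intercepts) to zero recovers a time-constant t.p.m., as in the baseline model.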
In the baseline model all covariate coefficients were treated as zero. The addition of covariates influenced the transition probabilities and the Markov chain was no longer homogeneous. The state process could no longer be described as stationary, thus the initial state distribution was estimated alongside the other model parameters.
4.2.3.2.3.3 Model selection and checking - AIC protocol
In order to choose between different candidate models, Akaike’s Information Criterion with correction for small samples (AICc) was applied as the model selection criterion (Akaike, 1973). AICc is a version of AIC created to deal with small sample sizes, and its use is advised as a default (Symonds & Moussalli, 2011). Each base and covariate model was assessed with AICc scores, which account for both the fit to the data and the complexity of the model. This approach results in the simpler model being favoured if two models have a similar fit. The calculation of AICc scores was as follows (Akaike, 1973):

$$\mathrm{AIC} = -2\ln(L) + 2k$$

$$\mathrm{AICc} = \mathrm{AIC} + \frac{2k^2 + 2k}{n - k - 1} = -2\ln(L) + 2k + \frac{2k(k + 1)}{n - k - 1} \qquad (10)$$

where L is the maximised likelihood, k is the number of parameters (including the intercept) and n is the sample size. The joint log-likelihood for all encounter datasets was used, with k taken as p, the number of parameters estimated (i.e., the length of the parameter vector $\theta$), and n taken as $n_{tot}$, the total number of observations.
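The AICc calculation can be written out as a small worked sketch. The function names and the input values (a joint log-likelihood of -100 with 6 parameters and 50 observations) are hypothetical and serve only to show the arithmetic.

```python
def aic(log_lik, k):
    """AIC = -2 ln(L) + 2k, with log_lik the maximised log-likelihood."""
    return -2.0 * log_lik + 2.0 * k


def aicc(log_lik, k, n):
    """Small-sample corrected AIC; requires n > k + 1."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return aic(log_lik, k) + (2.0 * k * k + 2.0 * k) / (n - k - 1)


# Illustrative values only: 6 parameters, 50 observations
example_aic = aic(-100.0, 6)         # 212.0
example_aicc = aicc(-100.0, 6, 50)   # 212 + 84/43
```

The correction term shrinks towards zero as the number of observations grows relative to the number of parameters, so AICc and AIC agree closely for large datasets.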
Finally, evidence ratios (ER; calculated as the ratio of two model likelihoods) quantified the relative empirical support for any two models in the set. The process of model selection provided evidence as to which factors were better predictors of the response variable (i.e., which factors had stronger effects). As an additional model check, ordinary pseudo-residual plots were produced to assess the goodness-of-fit of the hidden Markov model (following Popov et al., 2017 and Zucchini et al., 2016). The resulting plots allowed any outliers or inadequacy of fit to be identified (following Popov et al., 2017).
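A minimal sketch of the evidence ratio calculation follows. It assumes the common convention in which the relative likelihood of a model is exp(-Δ/2), with Δ the difference between its AICc score and the lowest score in the set; whether exactly this convention was used here is an assumption. The function names and scores are hypothetical.

```python
import math


def akaike_weights(aicc_scores):
    """Normalised relative model likelihoods exp(-delta_i / 2)."""
    best = min(aicc_scores)
    relative = [math.exp(-0.5 * (score - best)) for score in aicc_scores]
    total = sum(relative)
    return [r / total for r in relative]


def evidence_ratio(aicc_a, aicc_b):
    """Empirical support for model a relative to model b."""
    return math.exp(0.5 * (aicc_b - aicc_a))


# Illustrative AICc scores for three candidate models
scores = [100.0, 102.0, 110.0]
weights = akaike_weights(scores)
er_first_vs_second = evidence_ratio(scores[0], scores[1])
```

Under this convention the evidence ratio of two models equals the ratio of their weights, so it does not depend on which other models are in the candidate set.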
4.2.3.2.3.4 Interpretation of the t.p.m. parameters
For models with covariates in the t.p.m., the effect of each covariate on the t.p.m. was described using the stationary (or equilibrium) distribution, calculated with the covariate held at a fixed level (Patterson et al., 2009; Popov et al., 2017). Together, the covariate value and the estimated coefficients provided information on the model’s marginal behaviour, and the influence of the covariate could be assessed by comparing the results at different fixed levels. Confidence intervals for the stationary distributions were calculated using the delta method (Oehlert, 1992). The same fixed covariate values were also used to calculate the corresponding transition probabilities. This allowed interpretation of how likely the oceanic bottlenose dolphins were to switch states under those conditions.
4.2.3.2.3.5 Likelihood estimation
Models were fitted via numerical maximum likelihood estimation, using the nlm optimiser in R, primarily due to its low computational cost (see Altman, 2007 for implementation details). For an observed time series (Table 4.2, equation 6), the likelihood was calculated by considering all possible hidden state sequences that may have given rise to the call observations. The models were run 100 times to achieve maximisation and check numerical stability (Quick et al., 2017). The two state-dependent distributions were negative binomial, with $p_i(x)$ given in Table 4.2, equation 4, and parameters $(\mu_1, n_1)$ and $(\mu_2, n_2)$, respectively (the corresponding likelihood is detailed in Table 4.2, equation 8; Popov et al., 2017).
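The likelihood evaluation can be sketched with the scaled forward recursion, which sums over all hidden state sequences without enumerating them and returns the log-likelihood directly. This is an illustrative Python sketch, not the R code used in the analysis; the function names, the observation series and the parameter values are all hypothetical. An optimiser would call such a function repeatedly while varying the parameters.

```python
import math


def nbinom_pmf(x, size, mean):
    """Negative binomial pmf in the size/mean parameterisation."""
    log_p = (math.lgamma(x + size) - math.lgamma(size) - math.lgamma(x + 1)
             + size * math.log(size / (size + mean))
             + x * math.log(mean / (size + mean)))
    return math.exp(log_p)


def hmm_log_likelihood(obs, gamma, delta, sizes, means):
    """Log-likelihood of a count series under an N-state HMM.

    obs: list of counts; gamma: N x N t.p.m.; delta: initial distribution;
    sizes, means: state-specific negative binomial parameters.
    """
    n_states = len(delta)
    # Forward probabilities at the first observation: delta P(x_1)
    phi = [delta[i] * nbinom_pmf(obs[0], sizes[i], means[i])
           for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # One step of the recursion: phi Gamma P(x_t)
            phi = [sum(phi[i] * gamma[i][j] for i in range(n_states))
                   * nbinom_pmf(obs[t], sizes[j], means[j])
                   for j in range(n_states)]
        scale = sum(phi)           # rescale to avoid numerical underflow
        log_lik += math.log(scale)
        phi = [v / scale for v in phi]
    return log_lik


# Illustrative two-state example starting from the stationary distribution
gamma = [[0.8, 0.2], [0.4, 0.6]]
delta = [2.0 / 3.0, 1.0 / 3.0]
example_ll = hmm_log_likelihood([19, 0, 0, 2, 0], gamma, delta,
                                sizes=[0.5, 2.0], means=[0.3, 8.0])
```

Under the independence assumption across recordings, a joint log-likelihood would be obtained by calling the function once per recording and summing the results.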
4.2.3.2.3.6 Viterbi algorithm - sequence of hidden states
For a model, “the most likely sequence of hidden states, given the likelihood of observations under the state-dependent distributions and the transition probabilities between states” (Quick et al., 2017, p. 10), was estimated using the Viterbi algorithm (Forney, 1973). The Viterbi hidden Markov model package in R was utilised (R Core Development Team, 2014). This approach produced the most likely sequence of the ‘hidden’ states based on the data available (Quick et al., 2017). Specifically, the aim was to produce the sequence $s_1, s_2, \ldots, s_n$ that maximises the conditional probability (Popov et al., 2017):

$$\Pr(S_1 = s_1, \ldots, S_n = s_n \mid X_1 = x_1, \ldots, X_n = x_n)$$
The state sequences for each encounter were decoded separately, due to the independence assumption (introduced in Table 4.2, equation 6).
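The decoding step can be sketched as a log-space Viterbi recursion. This is a generic illustration rather than the R routine used in the analysis; the function name `viterbi`, the zero-based state labels and the example probabilities are assumptions made for the sketch. The state-dependent probabilities are passed in directly so that the example does not depend on any particular distribution.

```python
import math


def viterbi(log_delta, log_gamma, log_emission):
    """Most likely hidden state sequence (states labelled 0, ..., N - 1).

    log_delta[i]: log initial probability of state i
    log_gamma[i][j]: log transition probability i -> j
    log_emission[t][i]: log probability of observation t under state i
    """
    n_states = len(log_delta)
    score = [log_delta[i] + log_emission[0][i] for i in range(n_states)]
    back_pointers = []
    for t in range(1, len(log_emission)):
        previous = score
        score, pointers = [], []
        for j in range(n_states):
            # Best predecessor state for ending in state j at time t
            best_i = max(range(n_states),
                         key=lambda i: previous[i] + log_gamma[i][j])
            pointers.append(best_i)
            score.append(previous[best_i] + log_gamma[best_i][j]
                         + log_emission[t][j])
        back_pointers.append(pointers)
    # Trace back from the best final state
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Illustrative two-state example with made-up probabilities
delta = [0.5, 0.5]
gamma = [[0.9, 0.1], [0.1, 0.9]]
emission = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
decoded = viterbi([math.log(p) for p in delta],
                  [[math.log(p) for p in row] for row in gamma],
                  [[math.log(p) for p in row] for row in emission])
```

Decoding each encounter separately corresponds to calling the function once per encounter with that encounter's own observation probabilities.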
4.3 Results