3. Data and Methodology
3.6 Method Procedure
3.6.1 Parametric Bayesian Model
The parametric Bayesian model follows the method procedure of the nonparametric Bayesian model with one exception: the model becomes a finite model and is further simplified by the author for computational ease. To recap, according to Karabatsos (2016), a nonparametric Bayesian model is often referred to as an infinite-mixture model. Essentially, a nonparametric Bayesian model assumes an infinite number of clusters, whereas a parametric Bayesian model assumes a finite number of clusters (Jensen and Maheu, 2018; Karabatsos, 2016). The parametric Bayesian model is a finite model because, by definition, a parametric model has a fixed number of parameters that does not grow with its sample size (Jin, 2017).
Therefore, although the parametric Bayesian model takes asymmetric properties into account, it does so only to a limited extent because the model is essentially parametric (Jin, 2017). Hence, the parametric model cannot accommodate an infinite number of possibilities and so cannot account for every risk-return relationship that could exist (Demirer et al., 2019). Consequently, there is no need to apply a prior to reduce an infinite model to a finite model, since it is already finite (Karabatsos, 2016). However, in order to derive the posterior parameter estimates, two posterior sampling methods are still applied: the slice sampler of Kalli et al. (2011) and the Gibbs sampler.
3.6.1.1 Joint
Following Jensen and Maheu (2018), the first step in this study's procedure for uncovering the risk-return relationship is to model the joint distribution of excess returns and realised variance. This results in a number of bivariate density functions, which are the possible densities of the joint model of the two variables, as shown in Equation 46:
$$p(r_t, \log(RV_t) \mid I_{t-1}, \Omega, \Theta) = \sum_{j=1}^{\infty} w_j \, f(r_t, \log(RV_t) \mid \theta_j, I_{t-1}) \tag{46}$$
where the probability of excess returns and log realised variance is conditional on the following: the information set $I_{t-1}$, the mixture weights $\Omega = w_j$ where $\sum_{j=1}^{\infty} w_j = 1$, and the mixture parameters $\Theta = \theta_j$ where $j = 1, \ldots, \infty$, which refers to the number of clusters of mixture parameters. This is equivalent to the sum of all the weights and functions of excess returns and log realised variance, given the parameters and information set. The next step involves deriving a parametric version of the risk-return relationship by reducing the notation of Equation 46 to only the necessary components and rewriting Equation 46 as Equation 47:
$$f(r_t, \log(RV_t) \mid \theta_j, I_{t-1}) \equiv f(r_t \mid \log(RV_t), \theta_j, I_{t-1}) \, f(\log(RV_t) \mid \theta_j, I_{t-1}) \tag{47}$$
where the right-hand side is simply the product of the conditional distribution (term 1) and the marginal distribution (term 2), by the product rule of probability (Chan, Guo, Lee and Li, 2018; Jensen and Maheu, 2018). Since excess returns and log realised variance theoretically tend to a normal distribution, Equation 47 allows for the representations in Equations 48 and 49, respectively (Jensen and Maheu, 2018).
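As a worked illustration, substituting the factorisation of Equation 47 into Equation 46 and truncating the sum at a finite number of clusters would give the parametric mixture sketched below. The symbol $K$ for the finite cluster count is an assumption introduced here for illustration only; it does not appear in the source equations.

```latex
p(r_t, \log(RV_t) \mid I_{t-1}, \Omega, \Theta)
  = \sum_{j=1}^{K} w_j \,
    f(r_t \mid \log(RV_t), \theta_j, I_{t-1}) \,
    f(\log(RV_t) \mid \theta_j, I_{t-1}),
\qquad \sum_{j=1}^{K} w_j = 1 .
```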
Equations 48 and 49 follow a functional form of normality, like conventional parametric methods such as the GARCH approach (Madaleno and Vieira, 2018). However, the mixing of the finite number of cluster parameters allows for a wider array of joint densities, including asymmetric densities, in a finite sample space (Karabatsos, 2016). This suggests that the parametric Bayesian model is more robust than other conventional parametric models such as the GARCH approach (Jensen and Maheu, 2018).
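The claim that mixing a finite number of normal clusters can produce asymmetric densities can be checked with a minimal sketch. The example below is univariate for simplicity, and the two-cluster weights, means and standard deviations are hypothetical values chosen for illustration, not estimates from this study.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, sds):
    """Density of a finite mixture of normals: sum_j w_j * N(x | m_j, s_j^2)."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))


def mixture_skewness(weights, means, sds):
    """Skewness of a finite normal mixture, computed from its exact moments."""
    mu = sum(w * m for w, m in zip(weights, means))
    var = sum(w * (s * s + (m - mu) ** 2) for w, m, s in zip(weights, means, sds))
    third = sum(w * ((m - mu) ** 3 + 3.0 * (m - mu) * s * s)
                for w, m, s in zip(weights, means, sds))
    return third / var ** 1.5


# Hypothetical two-cluster example: a large "calm" cluster and a small,
# more dispersed cluster centred on negative values.
weights = [0.8, 0.2]
means = [0.0, -2.0]
sds = [1.0, 2.0]
```

A single normal cluster has zero skewness, whereas the two-cluster mixture above is negatively skewed even though each component is symmetric.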
$$f(r_t \mid \log(RV_t), \theta_j, I_{t-1}) = f_N(r_t \mid \alpha_0 + \alpha_1 RV_t, \eta_1^2 RV_t) \tag{48}$$
From Equation 48 by Jensen and Maheu (2018), the conditional density of excess returns, given log realised variance, the parameters and the information set, is a normal density with mean $\alpha_0 + \alpha_1 RV_t$ and variance $\eta_1^2 RV_t$. The coefficient $\alpha_1$ on $RV_t$ represents the persistence of risk on realised variance, and this term represents volatility feedback. The $\eta_1^2$ on $RV_t$ indicates the systematic error on realised variance. This term refers to the error surrounding the stochastic measure of realised variance, which is unavoidable regardless of the number of times the model is run (Beyhaghi, Alimo and Bewley, 2018; Jensen and Maheu, 2018).
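A minimal sketch of how the conditional density in Equation 48 could be evaluated for a single observation is given below. The function and argument names are hypothetical, and the density is read as a normal with mean alpha0 + alpha1 * RV_t and variance eta1_sq * RV_t.

```python
import math


def normal_pdf(x, mean, var):
    """Normal density f_N(x | mean, var), parameterised by the variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def return_density(r_t, rv_t, alpha0, alpha1, eta1_sq):
    """Equation 48: f_N(r_t | alpha0 + alpha1 * RV_t, eta1_sq * RV_t)."""
    if rv_t <= 0.0:
        raise ValueError("realised variance must be positive")
    mean = alpha0 + alpha1 * rv_t   # conditional mean of excess returns
    var = eta1_sq * rv_t            # conditional variance scales with RV_t
    return normal_pdf(r_t, mean, var)
```

Because the variance is proportional to realised variance, a higher value of RV_t both shifts the mean (through alpha1) and flattens the density.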
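$$f(\log(RV_t) \mid \theta_j, I_{t-1}) = f_N\left(\log(RV_t) \,\middle|\, \gamma_0 + \gamma_1 \log(RV_{t-1}) + \gamma_2 \log(RV_{t-i}) + \gamma_3 \frac{r_{t-1}}{\sqrt{RV_{t-1}}} + \gamma_4 \left|\frac{r_{t-1}}{\sqrt{RV_{t-1}}}\right|, \; \eta_2^2\right) \tag{49}$$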
From Equation 49 by Jensen and Maheu (2018), the conditional density of log realised variance, given the parameters and information set, is a normal density whose mean depends on the coefficients $\gamma_1$, $\gamma_2$, $\gamma_3$ and $\gamma_4$, which refer to the persistence of the variables. The first two terms cater for volatility feedback, whereas the last two terms cater for the leverage effect. Although the latter two terms are included in the model, the leverage effect is not within the scope of this study and is therefore not examined (Jensen and Maheu, 2018).
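The mean function of Equation 49 can be sketched as follows. This assumes the leverage terms are built from the standardised lagged return, r_{t-1} divided by the square root of RV_{t-1}; the function name, argument names and coefficient values used with it are hypothetical.

```python
import math


def log_rv_conditional_mean(gammas, log_rv_lag1, log_rv_lag_i, r_lag1, rv_lag1):
    """Mean of Equation 49 for one observation.

    gammas is the tuple (gamma0, gamma1, gamma2, gamma3, gamma4).
    """
    g0, g1, g2, g3, g4 = gammas
    # Standardised lagged return (assumed form of the leverage regressor).
    z = r_lag1 / math.sqrt(rv_lag1)
    return g0 + g1 * log_rv_lag1 + g2 * log_rv_lag_i + g3 * z + g4 * abs(z)
```

With a negative gamma3, a negative lagged return raises the expected log realised variance by more than a positive return of the same size, which is how the last two terms capture an asymmetric (leverage) response.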
Equations 48 and 49 are the main equations of interest for the parametric Bayesian model. Conditioning has been dropped for convenience following Jensen and Maheu (2018). The addition of the innovation terms $\varepsilon_t$ and $\upsilon_t$ shown in both equations is to aid understanding. Further simplifications made by the author for computational ease are discussed.
$$r_t = \alpha_0 + \alpha_1 RV_t + \varepsilon_t \tag{50}$$
$$\log(RV_t) = \gamma_0 + \gamma_1 \log(RV_{t-1}) + \gamma_2 \frac{1}{2488} \sum_{i=1}^{2488} \log(RV_{t+1-i}) + \gamma_3 \frac{r_{t-1}}{\sqrt{RV_{t-1}}} + \gamma_4 \left|\frac{r_{t-1}}{\sqrt{RV_{t-1}}}\right| + \upsilon_t \tag{51}$$
In Equation 50, the realised variance measure is not introduced into the innovation term. This is because the error variance in the model accounts for all unexplained variance that arises from sources such as uncertainty and measurement errors (Chakraborty and Lozano, 2019). This includes the systematic error on the realised variance measure, which refers to the error surrounding the stochastic nature of the risk measure (Beyhaghi et al., 2018). Hence, the innovation terms $\varepsilon_t$ and $\upsilon_t$ of Equations 50 and 51 capture the possible deviations between the observed and expected values (Jegadeesh et al., 2019).
These deviations are accounted for and reflected through the error variances $\sigma_1^2$ and $\sigma_2^2$ of the two equations (Karabatsos, 2016). Further, according to Jensen and Maheu (2018), the coefficient $\gamma_2$ in Equation 49 is supposed to cater for volatility feedback over a six-month period. However, the six-month window is not shown because, in this study, the entire sample period is taken into account, as shown by Equation 51. The purpose is to determine the presence and persistence of volatility over time in the South African market and so provide an overall view of the state and condition of the market (Jensen and Maheu, 2018).
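To make the role of the innovation term concrete, the sketch below computes the deviations between observed and expected returns under Equation 50 and summarises them as an average squared residual. The function names and the coefficient values used with them are hypothetical; the variance summary is a simple illustrative estimate, not the posterior estimate obtained in this study.

```python
def residuals_eq50(returns, realised_var, alpha0, alpha1):
    """Innovations of Equation 50: e_t = r_t - (alpha0 + alpha1 * RV_t)."""
    return [r - (alpha0 + alpha1 * rv) for r, rv in zip(returns, realised_var)]


def error_variance(residuals):
    """Average squared residual, a simple summary of the error variance."""
    return sum(e * e for e in residuals) / len(residuals)
```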
3.6.1.2 Posterior
The posterior procedure consists of a number of steps. Initially, random samples are drawn from a joint distribution by means of the slice sampling technique of Kalli et al. (2011). The slice sampler is applied to Equation 46, with an additional random variable, $u_t$, introduced as shown in Equation 52 by Jensen and Maheu (2018):
$$p(r_t, \log(RV_t), u_t \mid \Omega, \Theta, I_{t-1}) = \sum_{j=1}^{\infty} \mathbf{1}(u_t < w_j) \, f(r_t, \log(RV_t) \mid \theta_j, I_{t-1}) \tag{52}$$
where the aim of adding the variable $u_t$ is to ensure that only the components whose weights exceed $u_t$ are retained, while all remaining components are “sliced away” (Karabatsos, 2016; Liu and Luger, 2018; Jensen and Maheu, 2018). Thereafter, the iteration method of Jensen and Maheu (2018) is applied, which consists of repeatedly resampling through a collection of steps.
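The effect of the indicator in Equation 52 can be sketched in a few lines. The function names, the example weights and the pre-evaluated component densities are hypothetical; the weight list is finite here purely so that the example can run.

```python
def active_components(weights, u):
    """Indices j whose weight exceeds the slice variable, i.e. 1(u < w_j) = 1."""
    return [j for j, w in enumerate(weights) if u < w]


def augmented_density(u, weights, component_densities):
    """Equation 52 at one observation: sum_j 1(u < w_j) * f_j.

    component_densities[j] holds f(r_t, log(RV_t) | theta_j, I_{t-1}),
    already evaluated at the observation.
    """
    return sum(component_densities[j] for j in active_components(weights, u))
```

Since the integral of the indicator 1(u < w_j) over u in (0, 1) equals w_j, integrating the slice variable out of Equation 52 recovers the weighted sum in Equation 46.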
Firstly, a Gibbs sampling technique is applied, which is often used when the joint distribution is unknown and it is simpler to draw samples from the known conditional distributions (Merel, Shababo, Naka, Adesnik and Paninski, 2016). In this case, the conditional distributions are those of the cluster mixture parameters and weights (Jensen and Maheu, 2018). Secondly, since the priors are strong, this allows for the formation of a conjugate conditional posterior, that is, a conditional posterior that belongs to the same distributional family as the prior (Gu et al., 2019). Thirdly, as a consequence, each of the random variables tends to form a homogeneous distribution given the weights and parameter space (Jensen and Maheu, 2018). Finally, if the cluster count is amended, there may be further prior draws (Merel et al., 2016).
This procedure is repeated; however, the Gibbs sampling process is subject to a burn-in period, in which samples from the earlier iterations that are not yet representative of the required distribution are discarded (Merel et al., 2016). The original base distribution is then updated to the posterior distribution (Cai, Mitzemacher and Adams, 2018). Hence, so are the coefficients and parameter estimates of Equations 50 and 51, from which conclusions can be drawn with respect to the risk-return relationship and volatility feedback (Jensen and Maheu, 2018).
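The Gibbs steps and burn-in described above can be illustrated with a minimal single-cluster sketch for Equation 50. It is not the full mixture sampler of Jensen and Maheu (2018): there is no cluster allocation or slice step, and the priors (normal on the coefficients, inverse-gamma on the error variance), their hyperparameters and all function names are assumptions made for illustration.

```python
import math
import random


def gibbs_eq50(returns, realised_var, n_iter=3000, burn_in=1000,
               prior_var=100.0, a0=2.0, b0=1.0, seed=0):
    """Gibbs sampler for r_t = alpha0 + alpha1 * RV_t + e_t, e_t ~ N(0, sigma2).

    Assumed priors (illustrative): alpha0, alpha1 ~ N(0, prior_var) and
    sigma2 ~ Inverse-Gamma(a0, b0). Returns the post-burn-in draws.
    """
    rng = random.Random(seed)
    n = len(returns)
    pairs = list(zip(returns, realised_var))
    sum_x2 = sum(x * x for x in realised_var)
    alpha0, alpha1, sigma2 = 0.0, 0.0, 1.0  # arbitrary starting values
    draws = []
    for it in range(n_iter):
        # alpha0 | alpha1, sigma2: conjugate normal update
        prec = n / sigma2 + 1.0 / prior_var
        mean = sum(r - alpha1 * x for r, x in pairs) / sigma2 / prec
        alpha0 = rng.gauss(mean, math.sqrt(1.0 / prec))
        # alpha1 | alpha0, sigma2: conjugate normal update
        prec = sum_x2 / sigma2 + 1.0 / prior_var
        mean = sum(x * (r - alpha0) for r, x in pairs) / sigma2 / prec
        alpha1 = rng.gauss(mean, math.sqrt(1.0 / prec))
        # sigma2 | alpha0, alpha1: conjugate inverse-gamma update
        sse = sum((r - alpha0 - alpha1 * x) ** 2 for r, x in pairs)
        shape = a0 + 0.5 * n
        rate = b0 + 0.5 * sse
        sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        # Discard the early draws, which still depend on the starting values.
        if it >= burn_in:
            draws.append((alpha0, alpha1, sigma2))
    return draws


def posterior_means(draws):
    """Average each parameter over the retained draws."""
    m = len(draws)
    return tuple(sum(d[k] for d in draws) / m for k in range(3))
```

Each parameter is drawn in turn from its conditional distribution given the current values of the others, and only the draws after the burn-in period are averaged to form the posterior estimates.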