Adjusting the acceptance rate of the Metropolis step with the variance C

5.9 Applications

5.9.3 Adjusting the acceptance rate of the Metropolis step with the variance C

In this section, we highlight that care is needed when parameterizing the variance C of the proposal distribution in the Metropolis-Hastings algorithm (see Algorithm 7). We can compute the acceptance rate, denoted ar, i.e. the proportion of iterations of the MCMC procedure for which the proposed value is accepted. For a multidimensional random walk Metropolis algorithm, which is the case when simulating U, Roberts et al. (1997) show that the optimal acceptance rate is 0.234. In the literature, however, an acceptance rate between 0.20 and 0.40 is generally considered acceptable. In this section we try to approach the optimal acceptance rate for our simulated data sets by scaling the value of C when the variance B of the prior of U is equal to the identity: B = I_p (see Section 5.9.2).

Small values of C reduce the variance of the proposal around the previous value of U in the MCMC algorithm, so that each simulated U^(i) lies very close to the mean of the proposal distribution, i.e. the previous value U^(i-1). The acceptance ratio α (see Algorithm 7) is then more likely to be equal to 1, and the acceptance rate is more likely to be close to 1. In that case we need to increase the size of C so that the acceptance rate moves closer to the optimal value 0.234 (see Roberts et al., 1997). Conversely, if the norm of C is large, the values U^(i) simulated from the proposal distribution are more likely to be far from the previous value U^(i-1), making the acceptance ratio more likely to be less than 1. This brings the acceptance rate down to smaller values.
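The effect of the proposal variance on the acceptance rate can be illustrated with a minimal sketch. It is not the procedure of Algorithm 7: the target here is a standard Gaussian N(0, I_p) chosen purely as a stand-in for the posterior of U, and the function name `rw_metropolis_acceptance`, the dimension p = 10 and the scales tried are all assumptions made for illustration.

```python
import numpy as np

def rw_metropolis_acceptance(scale, p=10, n_iter=5000, seed=0):
    """Random-walk Metropolis on a toy N(0, I_p) target; returns the acceptance rate.

    The proposal is N(current, scale * I_p), so `scale` plays the role of C.
    """
    rng = np.random.default_rng(seed)

    def log_target(u):
        # Log-density of N(0, I_p) up to an additive constant (toy assumption).
        return -0.5 * float(np.dot(u, u))

    current = np.zeros(p)
    current_lp = log_target(current)
    accepted = 0
    for _ in range(n_iter):
        proposal = current + np.sqrt(scale) * rng.standard_normal(p)
        proposal_lp = log_target(proposal)
        # Symmetric proposal: accept with probability min(1, target ratio).
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
    return accepted / n_iter

scales = [0.05, 0.5, 5.0]
rates = [rw_metropolis_acceptance(s) for s in scales]
for s, ar in zip(scales, rates):
    print(f"C = {s} x I_p -> ar = {ar:.3f}")
```

On this toy target the acceptance rate should fall as the scale grows, which mirrors the qualitative argument above; the numerical values are specific to the toy target and say nothing about the rates obtained for P1 and P2.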

A good acceptance rate leads to more volatility in the values of U, as shown at the top of Figure 5.5. However, if the acceptance rate is too low, the simulated U^(i) are rejected at almost every step i of the MCMC algorithm. This example is a little extreme, but at the bottom of Figure 5.5 we can see that the trace plots of the coefficients U21 and U32 stay constant for almost all iterations of the MCMC algorithm. As a consequence, the covariance matrix S = (UU')^+ simulated at each iteration becomes almost fixed during the MCMC algorithm, because the previous value is retained. We therefore come back to the problem where S is fixed (see Section 5.3.2). The resulting independent cointegrating relations, presented in Table 5.1 below and obtained from Algorithm 10, are not correctly found.

Figure 5.5: Trace plots of the coefficients U21 and U32 for the first simulated data set P1. Top: C = 0.3 × I_p and ar = 0.227. Bottom: C = I_p and ar = 0.021.

Table 5.1: Cointegrating relations for the first simulated data set P1 with B = I_p, C = I_p and ar = 0.021.

x7t   x5t   x4t   x3t   x6t      x2t      x1t
1     0     0     0     -0.662   -0.239   -0.831
0     1     0     0      0.168    1.147    0.121
0     0     1     0      0.921   -0.601    1.032
0     0     0     1      2.709    0.481    2.281

Table 5.2 reports the acceptance rates obtained for the two simulated data sets P1 and P2 after running the MCMC procedure defined by Algorithms 8 and 9 in Section 5.8.2. We use the same hyperparameters as defined in Algorithm 8 and change the value of C only. At the end of each MCMC procedure, we collect the corresponding acceptance rate ar. From Table 5.2 we can see that as we reduce the amplitude of C, the acceptance rate increases towards the desired optimal acceptance rate.

Table 5.2: Acceptance rates (ar) for the two simulated data sets P1 and P2.

Simulated data set P1        Simulated data set P2
C            ar              C            ar
1 × I_p      0.021           1 × I_p      0.135
0.5 × I_p    0.112           0.8 × I_p    0.157
0.3 × I_p    0.227           0.5 × I_p    0.254

Table 5.3 shows the independent cointegrating relations for the first simulated data set, using the acceptance rate from Table 5.2 that is closest to the optimal value 0.234 (see Roberts et al., 1997). This is ar = 0.227, so we choose C = 0.3 × I_p. We then run the algorithm of Section 5.8 with m = 30,000 iterations and a burn-in period of 20,000 iterations. We use the fixed hyperparameters defined in Algorithm 9.
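The selection rule described above, picking the scale of C whose acceptance rate is closest to 0.234, can be written as a short sketch. The acceptance rates are copied from Table 5.2; the names `table_5_2` and `closest_scale` are hypothetical and only serve the illustration.

```python
TARGET_AR = 0.234  # optimal rate reported by Roberts et al. (1997)

# Acceptance rates copied from Table 5.2, keyed by the scale c in C = c * I_p.
table_5_2 = {
    "P1": {1.0: 0.021, 0.5: 0.112, 0.3: 0.227},
    "P2": {1.0: 0.135, 0.8: 0.157, 0.5: 0.254},
}

def closest_scale(rates, target=TARGET_AR):
    """Return the scale whose acceptance rate is closest to the target."""
    return min(rates, key=lambda c: abs(rates[c] - target))

chosen = {name: closest_scale(rates) for name, rates in table_5_2.items()}
print(chosen)  # {'P1': 0.3, 'P2': 0.5}
```

This reproduces the choices C = 0.3 × I_p for P1 and C = 0.5 × I_p for P2 used in Tables 5.3 and 5.4. It only compares the scales that were actually tried; a finer grid could land closer to 0.234.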

Table 5.3: Cointegrating relations for the first simulated data set P1: C = 0.3 × I_p, ar = 0.227.

x7t   x5t   x4t   x3t   x6t      x2t      x1t
1     0     0     0     -0.952    0.911    0.070
0     1     0     0      0.009    0.076   -0.946
0     0     1     0     -0.043   -0.950   -0.069
0     0     0     1     -0.024   -0.946   -0.993

Table 5.4 shows the cointegrating relations obtained from the second simulated data set P2. Here we can see that when C is well chosen, we obtain accurate estimates of the cointegrating coefficients.

Table 5.4: Cointegrating relations for the second simulated data set P2: C = 0.5 × I_p, ar = 0.254.

x5t   x4t   x3t   x2t      x1t
1     0     0      0.001   -0.991
0     1     0     -0.997   -0.002
0     0     1     -0.989   -0.995

Comparison with the static model of Chapter 3 for the simulated data sets

In this section, we recall the cointegrating relations found by applying the methods of Chapter 3 with Algorithms 1, 2 and 3. The cointegrating relations found with the methods described in this chapter are not very different, whether we compare the first simulated data set P1 (see Table 5.3 and Table 3.3) or the second simulated data set P2 (see Table 5.4 and Table 3.4).

Table 5.5: Cointegrating relations for the first simulated data set P1 with the static model of Chapter 3.

x7t   x5t   x4t   x3t   x6t      x2t      x1t
1     0     0     0     -0.941    1.009    0.043
0     1     0     0     -0.058    0.043   -0.971
0     0     1     0     -0.014   -0.965    0.047
0     0     0     1     -0.056   -0.961   -0.961

Table 5.6: Cointegrating relations for the second simulated data set P2 with the static model of Chapter 3.

x5t   x4t   x3t   x2t      x1t
1     0     0     -0.012   -1.009
0     1     0     -0.983    0.014
0     0     1     -1.019   -1.009
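To put a number on how close the two sets of estimates are, one possible summary is the largest absolute difference between the free (non-identity) coefficients of the corresponding tables. This measure is an assumption made here for illustration and is not used in the text. The values are copied from Tables 5.3 to 5.6, and the array names are hypothetical.

```python
import numpy as np

# Free coefficient blocks (columns x6t, x2t, x1t) for P1.
p1_ch5 = np.array([[-0.952,  0.911,  0.070],   # Table 5.3
                   [ 0.009,  0.076, -0.946],
                   [-0.043, -0.950, -0.069],
                   [-0.024, -0.946, -0.993]])
p1_ch3 = np.array([[-0.941,  1.009,  0.043],   # Table 5.5
                   [-0.058,  0.043, -0.971],
                   [-0.014, -0.965,  0.047],
                   [-0.056, -0.961, -0.961]])

# Free coefficient blocks (columns x2t, x1t) for P2.
p2_ch5 = np.array([[ 0.001, -0.991],           # Table 5.4
                   [-0.997, -0.002],
                   [-0.989, -0.995]])
p2_ch3 = np.array([[-0.012, -1.009],           # Table 5.6
                   [-0.983,  0.014],
                   [-1.019, -1.009]])

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two coefficient blocks."""
    return float(np.max(np.abs(a - b)))

p1_gap = max_abs_diff(p1_ch5, p1_ch3)
p2_gap = max_abs_diff(p2_ch5, p2_ch3)
print(f"P1: {p1_gap:.3f}")  # P1: 0.116
print(f"P2: {p2_gap:.3f}")  # P2: 0.030
```

For both data sets the largest gap is small relative to the coefficients of magnitude close to 1, which is consistent with the statement that the two approaches give similar cointegrating relations.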

The advantage of the method of Chapter 3 is that the cointegration rank is evaluated during the MCMC procedure. The novelty of this chapter, however, lies in the use of singular distributions to infer the cointegrating matrix. This chapter constitutes a first step in a new approach for inferring the cointegrating matrix in the VECM.

5.9.4 Posterior summaries

In this section, we present posterior summaries of some parameters of the VECM for the first simulated data set P1. The trace plots shown in Figure 5.6 are obtained after running the MCMC procedure presented in Algorithms 8 and 9 with the parameters C = 0.3 × I_p and B = I_p, which were found appropriate in Sections 5.9.3 and 5.9.2, respectively. We also display trace plots of some coefficients of the hyperparameter U, which show convergence as well (see Figure 5.7).

Figure 5.6: Trace plots of the coefficients Π63, Π71, Ψ52 and Σ42 for the first simulated data set P1.

Figure 5.7: Trace plots of the coefficients U64, U42, U31 and U24 for the first simulated data set P1.

We can see convergence of those parameters in these trace plots (see Figure 5.6). Their respective densities are shown in Figure 5.8. We show two coefficients of the singular long-run impact matrix (Π63 and Π71). The singularity of the distribution of Π makes the shape of the distribution non-Gaussian: a singular distribution has no density with respect to the Lebesgue measure, so it is not entirely correct to expect these coefficients to have a Gaussian-shaped distribution. The posterior distribution of Ψ, however, is non-singular and Gaussian, and we should expect a symmetric distribution for its coefficients (see Ψ52 in Figure 5.8).

Figure 5.8: Posterior densities of the coefficients Π63, Π71, Ψ52 and Σ42 for the first simulated data set P1.
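Numerical summaries of the kind plotted above can be computed from a stored chain along the following lines. This is a minimal sketch under stated assumptions: the chain below is a synthetic Gaussian placeholder, not output of Algorithms 8 and 9, the function name `posterior_summary` is hypothetical, and only the run length (30,000 iterations) and burn-in (20,000) are taken from Section 5.9.3. The sample skewness is included as one simple check of the symmetry expected for the coefficients of Ψ.

```python
import numpy as np

def posterior_summary(chain, burn_in):
    """Summarise one scalar coefficient of an MCMC chain after discarding burn-in."""
    kept = np.asarray(chain, dtype=float)[burn_in:]
    mean = kept.mean()
    q025, median, q975 = np.quantile(kept, [0.025, 0.5, 0.975])
    # Sample skewness: close to 0 for a symmetric (e.g. Gaussian) posterior.
    skewness = np.mean((kept - mean) ** 3) / kept.std() ** 3
    return {
        "n_kept": kept.size,
        "mean": float(mean),
        "sd": float(kept.std(ddof=1)),
        "q025": float(q025),
        "median": float(median),
        "q975": float(q975),
        "skewness": float(skewness),
    }

# Synthetic placeholder chain (assumption): 30,000 independent N(0, 1) draws.
rng = np.random.default_rng(1)
placeholder_chain = rng.standard_normal(30_000)

summary = posterior_summary(placeholder_chain, burn_in=20_000)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Applied to the draws of a coefficient such as Ψ52, a skewness near zero would agree with the symmetric shape described above, whereas the coefficients of Π need not show it.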

5.9.5 Comparison with the static model of Chapter 3 for the European

In document Novel Bayesian methods on multivariate cointegrated time series. (Page 180-188)