5.4.3 Comparison with the classical estimation of the pricing functional

In this subsection we compare the Bayesian method proposed in this paper for recovering the asset pricing functional with the classical solution to the integral equation (5.7) computed in Carrasco et al. (2007) [10]. The classical solution requires no regularization scheme, since the operator $(I - K)$ is continuously invertible. Since $K$ is unknown, it is replaced by $\hat K$ as defined in subsection 5.3.1, and the estimated pricing functional $\hat p$ is

$$\hat p = (I - \hat K)^{-1}\hat d,$$

with $\hat d$ defined in subsection 5.3.1. By applying Theorem 7.2 in Carrasco et al. [10], the squared norm of the asymptotic bias is of order

$$\|\hat p - p_*\|^2 \sim O_p\left(\frac{1}{T h^n} + h^{2\rho}\right).$$

The optimal speed of convergence is obtained when $\frac{1}{T h^n} = h^{2\rho}$, that is, when $h = c_1 (1/T)^{1/(2\rho+n)}$.

With this optimal choice of bandwidth the classical estimator $\hat p$ converges at the rate $(1/T)^{2\rho/(2\rho+n)}$: $\|\hat p - p_*\|^2 \sim O_p\big((1/T)^{2\rho/(2\rho+n)}\big)$.
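The bandwidth balance can be checked numerically. The following is a minimal sketch, assuming the bound has exactly the two terms $1/(Th^n)$ and $h^{2\rho}$ and taking $c_1 = 1$; the function names are hypothetical and do not come from the source.

```python
from fractions import Fraction


def classical_bandwidth_exponent(rho, n):
    """Exponent e such that h = (1/T)**e equates 1/(T h^n) and h^(2 rho)."""
    return Fraction(1) / (2 * Fraction(rho) + Fraction(n))


def classical_rate_exponent(rho, n):
    """Exponent r such that the squared error is O_p((1/T)**r) at that bandwidth."""
    return 2 * Fraction(rho) / (2 * Fraction(rho) + Fraction(n))


def bias_terms(T, h, rho, n):
    """Return the two terms of the bound: 1/(T h^n) and h^(2 rho)."""
    return 1.0 / (T * h ** n), h ** (2 * rho)


if __name__ == "__main__":
    rho, n, T = 2, 1, 10_000  # illustrative values, not taken from the source
    e = classical_bandwidth_exponent(rho, n)
    h = (1.0 / T) ** float(e)  # constant c1 set to 1 (assumption)
    print("bandwidth exponent:", e)
    print("rate exponent:", classical_rate_exponent(rho, n))
    print("two terms at the balanced bandwidth:", bias_terms(T, h, rho, n))
```

At the balanced bandwidth both terms coincide up to floating-point error, which is the property that defines the optimal $h$.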

We compare this rate of convergence with the rate of the estimated regularized posterior mean obtained when a classical Tikhonov scheme and the optimal $\alpha$ are used: $\|\hat E_\alpha(p|\hat R) - p_*\|^2 \sim O_p\big((1/T)^{\beta/(\beta+1)}\big)$. The comparison is possible only in the subset $\Phi_\beta \subset X$ of the pricing functionals $p$ such that $\Omega_0^{-1/2}(p - p_0) \in R\big((\Omega_0^{1/2} H^* H \Omega_0^{1/2})^{\beta/2}\big)$, since we are able to compute the Bayesian speed of convergence only for a true value $p_*$ belonging to this set. In this subset, our solution converges faster if $\beta > 2\rho/n$. This condition is more likely to be satisfied when the parameter $\rho$ (a measure of the regularity of the transition density function) is small or, for a given value of $\rho$, when the dimension $n$ of $Y_t$, i.e. the number of conditioning variables in the transition probability, increases.

However, with Tikhonov regularization the qualification matters, so that we can only exploit a regularity $\beta$ of the function $p$ that is less than or equal to 2. Therefore, for the condition $\beta > 2\rho/n$ to be satisfied, we need $2\rho/n < 2$, which holds when $\rho < n$.
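The comparison between the Tikhonov posterior mean and the classical estimator reduces to comparing two exponents of $1/T$. The sketch below assumes the Tikhonov exponent is $\beta/(\beta+1)$, the classical exponent is $2\rho/(2\rho+n)$, and that regularity beyond the qualification 2 cannot be exploited; all function names are hypothetical.

```python
from fractions import Fraction

# Largest regularity beta that a classical Tikhonov scheme can exploit (from the text).
TIKHONOV_QUALIFICATION = 2


def classical_exponent(rho, n):
    """Exponent of 1/T in the rate of the classical estimator."""
    rho, n = Fraction(rho), Fraction(n)
    return 2 * rho / (2 * rho + n)


def tikhonov_exponent(beta):
    """Exponent of 1/T for the Tikhonov posterior mean, with beta capped at the qualification."""
    b = min(Fraction(beta), Fraction(TIKHONOV_QUALIFICATION))
    return b / (b + 1)


def bayes_is_faster(beta, rho, n):
    """True when the Tikhonov posterior mean has the larger exponent, i.e. the faster rate."""
    return tikhonov_exponent(beta) > classical_exponent(rho, n)


if __name__ == "__main__":
    # Illustrative values only: (beta, rho, n)
    for beta, rho, n in [(1, 1, 4), (1, 2, 2), (5, 3, 2)]:
        print(beta, rho, n, bayes_is_faster(beta, rho, n))
```

In the third case $\rho$ exceeds $n$, so no admissible $\beta$ makes the Bayesian rate faster, which illustrates the role of the qualification.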

Let us now consider the regularized posterior mean obtained through a Tikhonov scheme in a Hilbert scale. In this case the comparison is possible only on the subspace $X_{\beta+1}$. With the optimal regularization parameter $\alpha_*$ the rate of convergence is $\|E_s(p|\hat R) - p_*\|^2 \sim O_p\big((1/T)^{(\beta+1)/(a+\beta)}\big)$, and it is faster than the rate of convergence of the classical solution if $\beta > \frac{2\rho(a-1)}{n} - 1$. When $a > 2$ and $\rho < \frac{n}{2(a-2)}$, this condition is less stringent than the condition $\beta > 2\rho/n$ required for the Tikhonov regularized posterior mean to converge faster than the classical estimator $\hat p$. When the degree of ill-posedness $a$ is less than 2, the condition $\beta > \frac{2\rho(a-1)}{n} - 1$ is less stringent than the condition $\beta > 2\rho/n$ if $\rho > \frac{n}{2(a-2)}$; since this bound is negative, this holds for every positive $\rho$.
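The two thresholds in this comparison follow from elementary algebra on the rate exponents. The derivation below is a sketch, assuming the Hilbert-scale exponent is $(\beta+1)/(a+\beta)$, the classical exponent is $2\rho/(2\rho+n)$, and that $a+\beta$, $\rho$ and $n$ are positive.

```latex
\begin{align*}
\frac{\beta+1}{a+\beta} > \frac{2\rho}{2\rho+n}
  &\iff (\beta+1)(2\rho+n) > 2\rho(a+\beta) \\
  &\iff \beta n + 2\rho + n > 2\rho a \\
  &\iff \beta > \frac{2\rho(a-1)}{n} - 1, \\[4pt]
\frac{2\rho(a-1)}{n} - 1 < \frac{2\rho}{n}
  &\iff 2\rho(a-2) < n.
\end{align*}
```

The last line shows that the Hilbert-scale requirement on $\beta$ is weaker than the Tikhonov one exactly when $2\rho(a-2) < n$.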

In summary, under some conditions on the regularity of the function $p_*$, in particular if the price function is highly smooth, or if $n$ is large or $\rho$ is small, our Bayesian estimator converges faster than the classical one. The price to pay for this faster speed of convergence is a regularity assumption on the price functional that the classical resolution method does not impose.

5.5 A g-prior with Regularizing Power

We have shown in preceding sections that, in general, the prior distribution does not regularize and we need to artificially introduce a regularization scheme in order to obtain consistency of the posterior distribution.

Nevertheless, there exists a particular specification of the prior distribution that has a regularizing power, in the sense that the prior-to-posterior transformation has the same effect as the application of a regularization scheme, so that the recovered posterior mean is consistent. This type of prior distribution is suggested by Zellner's (1986) g-prior, but it extends the latter because it is linked to a slightly modified sampling mechanism. More precisely, it is linked to the sampling mechanism of the non-projected model $\hat d = (I - \hat K)p + \text{error}$. This extended g-prior was introduced in Chapter 3, where its regularizing power was shown.

Suppose that the prior measure specified in 5.3.2 is replaced by the extended g-prior with a covariance operator related to the operator $K$ in the sampling mechanism:

$$p \sim GP\left(p_0, \frac{\sigma^2}{g}(K^*K)^s\right), \quad \text{for some } s > 0, \qquad (5.20)$$

with $g = g(T)$ a function of the sample size $T$ such that $g \to \infty$ with $T$. We use the notation $\Omega_0 = (K^*K)^s$. Let $\alpha = g/T$ be the parameter playing the role of regularization parameter. For that, it must go to zero with $T$ and it must be such that $\alpha^2 T \to \infty$. These conditions imply that $g$ must go to infinity faster than $\sqrt{T}$ and slower than $T$.
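The two requirements on the regularization parameter translate directly into a range for the growth of $g$. The sketch below reads $\alpha$ as $g/T$ and assumes, purely for illustration, a polynomial growth $g = T^{\kappa}$; the exponent $\kappa$ and the function names are hypothetical.

```python
from fractions import Fraction


def alpha_exponents(kappa):
    """For g = T**kappa, return the exponents of T in alpha = g/T and in alpha**2 * T."""
    k = Fraction(kappa)
    return k - 1, 2 * k - 1


def g_is_admissible(kappa):
    """alpha -> 0 needs a negative first exponent; alpha**2 * T -> infinity needs a positive second one."""
    alpha_exp, alpha2_T_exp = alpha_exponents(kappa)
    return alpha_exp < 0 and alpha2_T_exp > 0


if __name__ == "__main__":
    for kappa in (Fraction(1, 2), Fraction(3, 4), Fraction(1)):
        print(kappa, alpha_exponents(kappa), g_is_admissible(kappa))
```

Under this parametrization the admissible range is $1/2 < \kappa < 1$, matching the statement that $g$ grows faster than $\sqrt{T}$ and slower than $T$.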

Equation (5.14) implies an operator $A = (K^*K)^s \hat H^* \big(\alpha (K^*K) + \hat H (K^*K)^s \hat H^*\big)^{-1}$ that, as $T \to \infty$, is well defined if it is applied to $(\hat R - \hat H p_0)$. The fact that the operator $(K^*K)$ multiplying $\alpha$ can be factorized out allows us to directly obtain a regularization of the inverse of the limit of $(K^*K)^{-1/2} \hat H (K^*K)^s \hat H^* (K^*K)^{-1/2}$. Using equation (5.15) for defining $A$ we have

$$A = \frac{\sigma^2}{g}(K^*K)^s \hat H^* \Big(\hat\Sigma_T + \frac{\sigma^2}{g}\hat H (K^*K)^s \hat H^*\Big)^{-1} = \big((\hat K^*\hat K)^{-1/2}\hat H (K^*K)^s\big)^* \Big(\alpha I + (\hat K^*\hat K)^{-1/2}\hat H (K^*K)^s \hat H^* (\hat K^*\hat K)^{-1/2}\Big)^{-1} (\hat K^*\hat K)^{-1/2}$$

that is a continuous operator. This is due to the fact that $R(K^*K) \subset R(K) = D(K^{-1}) \subset D\big((K^*K)^{-1/2}\big)$, so that $(K^*K)^{-1/2}H$ is well defined. The posterior mean and variance are $E_g(p|\hat R) = A(\hat R - \hat H p_0) + p_0$ and $\mathrm{Var}_g(p|\hat R) = (K^*K)^s - A\hat H (K^*K)^s$. Because the operators $K$ and $K^*$ are unknown, they must be replaced by their consistent estimators in the prior covariance. We denote by $\hat E_g(p|\hat R)$ and $\widehat{\mathrm{Var}}_g(p|\hat R)$ the corresponding estimated mean and variance.

The study of the asymptotic behavior of the posterior distribution is based on the decompositions:

$$\hat E_g(p|\hat R) - p_* = [\hat E_g(p|\hat R) - \tilde E_g(p|\hat R)] + [\tilde E_g(p|\hat R) - E_g(p|\hat R)] + [E_g(p|\hat R) - p_*],$$

$$\widehat{\mathrm{Var}}_g(p|\hat R) = [\widehat{\mathrm{Var}}_g(p|\hat R) - \widetilde{\mathrm{Var}}_g(p|\hat R)] + [\widetilde{\mathrm{Var}}_g(p|\hat R) - \mathrm{Var}_g(p|\hat R)] + \mathrm{Var}_g(p|\hat R).$$

The only difference between $\hat E_g(p|\hat R)$ and $\tilde E_g(p|\hat R)$ is that in the former the prior covariance operator is estimated, while in the latter it is known. The same difference characterizes $\widehat{\mathrm{Var}}_g(p|\hat R)$ and $\widetilde{\mathrm{Var}}_g(p|\hat R)$. Hence, in both decompositions the first term in square brackets is due to the estimation of $\Omega_0$, the second is due to the estimation of all the other operators, and the last term is the bias and the variance, respectively, for known operators.

We show in the following theorems that the posterior distribution corresponding to the g-prior is consistent. This is guaranteed by the convergence to zero of the bias and of the posterior variance.

Theorem 23 Let (5.20) be the prior distribution for the functional $p$ in the sampling equation (5.12). If, for some $\gamma > 0$, $(K^*K)^{s\gamma}$ is trace class and if $(p_* - p_0) \in R\big(\Omega_0^{\beta/(2s)}\big)$, then $\|\hat E_g(p|\hat R) - p_*\|^2$ converges to zero with respect to the sampling probability at the speed

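$$\|\hat E_g(p|\hat R) - p_*\|^2 \sim O_p\left(\alpha^{\beta/s} + \frac{1}{T}\alpha^{-\gamma} + \frac{1}{\alpha^2}\left(\frac{1}{Th^n} + h^{2\rho}\right)\left(\alpha^{\frac{3s-\beta}{\beta+s}} + \frac{1}{T}\alpha^{-\gamma}\right) + \frac{1}{\alpha^2}\left(\frac{1}{T} + h^{2\rho}\right)\frac{1}{T}\alpha^{1-\gamma}\right).$$

Furthermore, if $\alpha = c_1 (1/T)^{s/(\beta+\gamma s)}$ and $h = c_2 (1/T)^{1/(2\rho)}$ for some constants $c_1$ and $c_2$, then

$$T^{\beta/(\beta+\gamma s)}\,\|\hat E_g(p|\hat R) - p_*\|^2 \sim O_p(1)$$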

Assumption 24 if β ≥ 1.

The fastest speed of convergence of the posterior mean is of order $T^{-\beta/(\beta+\gamma s)}$. It is faster than the rate of the classical resolution method (illustrated in subsection 5.4.3) if $\beta > \frac{2\rho}{n}\gamma s$.

Theorem 24 Let (5.20) be the prior distribution for the functional $p$ in the sampling equation (5.12). If $s \geq 2$ then $\|\widehat{\mathrm{Var}}_g(p|\hat R)\|^2$ converges to zero with respect to the sampling probability. Moreover, $\forall \varphi \in X$ such that $\Omega_0^{1/2}\varphi \in R\big(\Omega_0^{(\beta-s)/(2s)}\big)$, the posterior variance converges at the speed

$$\|\widehat{\mathrm{Var}}_g(p|\hat R)\|^2 \sim O_p\left(\alpha^{\beta/s} + \frac{1}{\alpha^2}\left(\frac{1}{Th^n} + h^{2\rho}\right)\alpha^{\beta/s}\right).$$

When $\alpha$ is set equal to the optimal one, i.e. $\alpha = c_1 (1/T)^{s/(\beta+\gamma s)}$, the posterior variance converges to zero if $\frac{n}{2\rho} \leq \frac{\beta+\gamma s-2s}{\beta+\gamma s}$.

The value of $g$ corresponding to the optimal $\alpha$ is $g = (1/T)^{-(\beta+\gamma s-s)/(\beta+\gamma s)}$. It goes to infinity faster than $\sqrt{T}$ and slower than $T$ if $\beta > (2-\gamma)s$. In particular, divergence at a slower rate than $T$ is always guaranteed.
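The exponents appearing in this section can be tabulated for given parameter values. The sketch below assumes $\alpha = c_1 (1/T)^{s/(\beta+\gamma s)}$, $g = \alpha T$, a g-prior rate exponent $\beta/(\beta+\gamma s)$ and a classical rate exponent $2\rho/(2\rho+n)$, and it ignores constants; the function names and the numerical values are hypothetical.

```python
from fractions import Fraction


def gprior_rate_exponent(beta, gamma, s):
    """Exponent r such that the squared error of the posterior mean is O_p(T**(-r))."""
    beta, gamma, s = Fraction(beta), Fraction(gamma), Fraction(s)
    return beta / (beta + gamma * s)


def optimal_g_exponent(beta, gamma, s):
    """Exponent k such that g = alpha * T grows like T**k at the optimal alpha."""
    beta, gamma, s = Fraction(beta), Fraction(gamma), Fraction(s)
    return 1 - s / (beta + gamma * s)


def classical_exponent(rho, n):
    """Exponent of 1/T in the rate of the classical estimator."""
    rho, n = Fraction(rho), Fraction(n)
    return 2 * rho / (2 * rho + n)


def gprior_faster_than_classical(beta, gamma, s, rho, n):
    """Equivalent to beta > (2 rho / n) * gamma * s for positive parameters."""
    return gprior_rate_exponent(beta, gamma, s) > classical_exponent(rho, n)


def g_growth_is_admissible(beta, gamma, s):
    """True when g grows faster than sqrt(T) and slower than T, i.e. beta > (2 - gamma) s."""
    return Fraction(1, 2) < optimal_g_exponent(beta, gamma, s) < 1


if __name__ == "__main__":
    beta, gamma, s, rho, n = 3, Fraction(1, 2), 1, 1, 2  # illustrative values
    print("g-prior rate exponent:", gprior_rate_exponent(beta, gamma, s))
    print("growth exponent of g:", optimal_g_exponent(beta, gamma, s))
    print("faster than classical:", gprior_faster_than_classical(beta, gamma, s, rho, n))
    print("g admissible:", g_growth_is_admissible(beta, gamma, s))
```

Since $s/(\beta+\gamma s)$ is positive, the growth exponent of $g$ is always below 1, so only the lower bound $1/2$ can fail.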

In document Bayesian Analysis of Linear Inverse Problems with Applications in Economics and Finance (Page 122-125)