Nonparametric Estimation of Scalar Di¤usion
Models of Interest Rates Using Asymmetric Kernels
Nikolay Gospodinovy Masayuki Hirukawaz Concordia University and CIREQ Setsunan University
This Version: February 2012
Abstract
This paper proposes an asymmetric kernel-based method for nonparametric estimation of scalar di¤usion models of spot interest rates. We derive the asymptotic theory for the asymmetric kernel estimators of the drift and di¤usion functions for general and positive recurrent processes and illustrate the advantages of the Gamma kernel for bias correction and e¢ ciency gains. The …nite-sample properties and the practical relevance of the pro-posed nonparametric estimators for bond and option pricing are evaluated using actual and simulated data for U.S. interest rates.
Keywords: Nonparametric regression; Gamma kernel; di¤usion estimation; spot interest rate; derivative pricing.
JEL classi…cation: C13; C14; C22; E43; G13.
We would like to thank the Co-Editor (Franz Palm), three anonymous referees, Evan Anderson, Zongwu Cai, Bruce Hansen, Arthur Lewbel, Peter Phillips, Jeroen Rombouts, Ximing Wu, and participants at the 2007 MEG, 2008 CESG and 2008 SEA meetings and seminars at DePaul, Loyola, Purdue, and Queen’s University for helpful comments. The …rst author gratefully acknowledges …nancial support from FQRSC, IFM2 and SSHRC. The second author gratefully acknowledges …nancial support from Japan Society of the Promotion of Science (grant number 23530259).
yDepartment of Economics, Concordia University, 1455 de Maisonneuve Blvd. West, Montreal,
Quebec H3G 1M8, Canada; phone: (+1)514-848-2424 (ext. 3935); fax: (+1)514-848-4536; e-mail: [email protected]; web: http://alcor.concordia.ca/~gospodin/.
zFaculty of Economics, Setsunan University, 17-8 Ikeda Nakamachi, Neyagawa, Osaka, 572-8508,
Ja-pan; phone: (+81)72-839-8095; fax: (+81)72-839-8138; e-mail: [email protected]; web: http://www.setsunan.ac.jp/~hirukawa/.
1
Introduction
Nonparametric methods have attracted growing attention in econometrics due to their
‡ex-ibility for handling possible nonlinearities in conditional moment function estimation. This
paper focuses on nonparametric estimation of time-homogeneous drift and di¤usion functions
in continuous-time models that are used to describe the underlying dynamics of spot interest
rates. More speci…cally, we improve the nonparametric estimators of drift and di¤usion
func-tions of Stanton (1997) by means of a nonstandard smoothing technique. The main source
for this improvement comes from asymmetric kernel functions which are employed in place of
standard symmetric kernels.
Kernel smoothing has been widely applied for estimating continuous-time di¤usion
pro-cesses; examples of recent contributions include Florens-Zmirou (1993), Aït-Sahalia (1996a,b),
Jiang and Knight (1997), Stanton (1997), Chapman and Pearson (2000), Bandi (2002), Bandi
and Phillips (2003), Fan and Zhang (2003), Nicolau (2003), and Arapis and Gao (2006). An
interesting situation that often arises in economics and …nance is when the conditioning
vari-ables are nonnegative (i.e., they have a natural boundary at the origin) such as nominal interest
rates, volatility, etc. In this case, the Nadaraya-Watson (“NW”) regression estimator based on
a standard symmetric kernel is not appropriate without a boundary correction for the region
near the origin. This has generated a large literature on correction methods for boundary
e¤ects in nonparametric regression estimation which includes Gasser and Müller (1979), Rice
(1984), Müller (1991), Fan and Gijbels (1992), Fan (1993), among others.
As a viable alternative to these boundary correction methods, Chen (2000) and Scaillet
(2004) advocate the use of asymmetric kernel functions1 that have the same support as the
1
Strictly speaking, asymmetric kernel functions should be referred to as kernel-type weighting functions. In a slightly di¤erent context, Gouriéroux and Monfort (2006) and Jones and Henderson (2007) argue that unlike the case of symmetric kernels, the roles of the data pointX and the design pointxin asymmetric kernels are not exchangeable, which leads to a lack of normalization in density estimation using these kernels. Nevertheless, we follow the adopted convention in the literature for these kernel-type functions. Also, note that the lack of normalization is not an issue in regression estimation.
marginal density of the conditioning variables. In addition to eliminating boundary e¤ects,
asymmetric kernels have a number of important advantages in the analysis of economic and
…nancial data.
First, the shape of the asymmetric kernels varies according to the position at which
smoothing is made; in other words, the amount of smoothing changes in an adaptive
man-ner. Figure 1 plots the shapes of a Gamma kernel function for …ve di¤erent design points
(x= 0:00;0:02;0:04;0:08 and 0:16) at which the smoothing is performed.2 It is worth noting that for all plotted functions, the smoothing parameter value of the Gamma kernel is …xed. In
contrast, the amount of smoothing by symmetric kernels with a single bandwidth parameter
is …xed everywhere. Since the distribution of interest rates is empirically characterized by a
mode near the boundary and the sparse tail region, a single bandwidth does not su¢ ce in this
situation; while it is appropriate to employ a short bandwidth in the region near the boundary,
a longer bandwidth is required to capture the shape of the tail. As a result, when the data
in-dicate that the distribution contains both dense and sparse regions, it may prove useful to turn
to variable bandwidth methods (e.g. Abramson, 1982; Fan and Gijbels, 1992). The adaptive
smoothing property of asymmetric kernels bears some similarities to the variable bandwidth
methods, but unlike these methods, the adaptive smoothing for asymmetric kernels is achieved
by a single smoothing parameter which makes them much more appealing in empirical work.
Second, asymmetric kernels achieve the optimal rate of convergence (in mean integrated
squared error sense) within the class of nonnegative kernel estimators. Third, unlike the case
with symmetric kernels, the variances of asymmetric kernel estimators tend to decrease as the
design point moves away from the boundary. This property is particularly advantageous for
the analysis of interest rate data since the support of the density has a sparse region for high
interest rate values. In sum, we can view asymmetric kernels as a combination of a boundary
2
correction device and a “variable bandwidth” method.
Although asymmetric kernels are relatively new in the literature, several papers report
favorable evidence from applying them to empirical models in economics and …nance. A
non-exhaustive list includes: (i) estimation of recovery rate distributions on defaulted bonds
(Renault and Scaillet, 2004), (ii) income distribution estimation (Bouezmarni and Scaillet,
2005; Hagmann and Scaillet, 2007); (iii) actuarial loss distribution estimation (Hagmann and
Scaillet, 2007; Gustafsson et al., 2009); (iv) hazard estimation (Bouezmarni and Rombouts,
2008); (v) regression discontinuity design (Fé, 2010); and (vi) realized integrated volatility
estimation (Kristensen, 2010).
Incorporating asymmetric kernel smoothing into the nonparametric drift and di¤usion
es-timators of Stanton (1997) is expected to shed additional light on the nonparametric estimation
of spot rate di¤usion models. The Monte Carlo simulations in Chapman and Pearson (2000)
indicate that when the true drift is linear in the level of the spot interest rate, there are two
biases in Stanton’s (1997) symmetric kernel estimate of the drift function; namely, a bias near
the origin and a pronounced downward bias in the region of high interest rates where the
data are sparse. While the existing boundary correction methods may provide a remedy for
improving the performance of the drift estimate near the boundary, they are of little help for
removing the bias that occurs for high values of the spot rate. In contrast, given the appealing
properties of asymmetric kernels listed above, asymmetric kernel smoothing is expected to
reduce substantially both of these biases in the estimate of the drift function.
The remainder of the paper is organized as follows. Section 2 develops the asymmetric
ker-nel estimators of the drift and di¤usion functions and establishes their asymptotic properties.
It also discusses the smoothing parameter selection and the implementation of a bootstrap
procedure for inference. In Section 3, the proposed estimators are implemented to study the
simula-tion experiment that examines the …nite-sample performance of the symmetric and asymmetric
kernel estimators of di¤usion models and their practical relevance for bond and option pricing.
Section 4 summarizes the main results of the paper. Assumptions and proofs are provided in
the Appendix.
2
Estimation of Scalar Di¤usion Models via Asymmetric
Ker-nel Smoothing
2.1 NW Estimators Using the Gamma Kernel
The estimation of scalar di¤usion models plays an important role in the analysis of term
struc-ture of interest rates and derivative pricing. The general form of the underlying continuous-time
process for the spot rate Xt is represented by the stochastic di¤erential equation (“SDE”)
dXt= (Xt)dt+ (Xt)dWt; (1)
whereWtis a standard Brownian motion, (Xt)is the drift function and (Xt)is the di¤usion
(or instantaneous volatility) function. We assume that the initial condition X0 is …xed, i.e.
X0 =x0 2(0;1).3
Stanton (1997) develops a nonparametric approach to estimating the dynamics of spot
interest rate driven by the scalar di¤usion process (1). More speci…cally, Stanton (1997) uses
the in…nitesimal generator and a Taylor series expansion to give the …rst-order approximations
to (X)and 2(X)as
E(Xt+ XtjXt) = (Xt) +o( ); (2)
E (Xt+ Xt)2 jXt = 2(Xt) +o( ); (3)
where is a discrete, arbitrarily small, time step and o( ) denotes a remainder term which goes to 0 as ! 0.4 Each of these conditional expectations at the design point x can
3
This particular initialization is used in Florens-Zmirou (1993), among others. Alternative initializations are also possible.
4
be estimated by running a nonparametric regression. The approach by Stanton (1997) is
convenient because it enables us to estimate ( )and 2( ) separately, whereas the approaches by Aït-Sahalia (1996a) and Jiang and Knight (1997) require their sequential estimation; see
Arapis and Gao (2006) for a comparison of these three methods.
Since the conditioning variableXtis nonnegative, it is not appropriate to employ symmetric
kernels for estimating (X)and 2(X)without a boundary correction. The key building block
for nonparametric drift and di¤usion estimation that we propose in this paper is the NW kernel
regression estimator based on the Gamma kernel function (Chen, 2000)5
KG(x=b+1;b)(u) = u
x=bexp ( u=b)
bx=b+1 (x=b+ 1)1fu 0g; (4)
where ( ) = R01y 1exp ( y)dy; > 0 is the Gamma function and b is the smoothing parameter. Since the density of a Gamma distribution has support[0;1), the Gamma kernel function does not generate a boundary bias.
Suppose that we observe a discrete samplefXi gni=1 atnequally spaced time points from
the short-rate di¤usion process fXt: 0 t Tg satisfying the SDE (1). Here is the step
size between observations and T = n is the time span of the sample. Given a design point
x >0, we de…ne the Gamma-NW estimator of the drift (X) and di¤usion function 2(X) as
bb(x) = 1 Pni=11 X(i+1) Xi KG(x=b+1;b)(Xi ) Pn 1 i=1 KG(x=b+1;b)(Xi ) (5) and b2b(x) = 1 Pni=11 X(i+1) Xi 2KG(x=b+1;b)(Xi ) Pn 1 i=1 KG(x=b+1;b)(Xi ) : (6)
more accurate. However, Fan and Zhang (2003) show that higher-order approximations tend to reduce biases at the expense of in‡ating variances nearly exponentially, and thus we focus only on …rst-order approximations.
5Two other asymmetric kernels, namely, the Inverse Gaussian and Reciprocal Inverse Gaussian kernels
(Scail-let, 2004), can be also applied in this context, but we focus exclusively on the Gamma kernel due to its its better …nite-sample performance (see Gospodinov and Hirukawa, 2007).
2.2 Asymptotic Properties of Gamma-NW Estimators
In this section, we present some large sample results for the Gamma-NW estimator in the
di¤usion model (1). Following Bandi and Phillips (2003; abbreviated as “BP” hereafter), we
explore the in-…ll and long span asymptotics such that n ! 1, T ! 1 (long span), and
= T =n ! 0 (in-…ll). It is worth mentioning that the asymptotic results in BP are based on the assumption of symmetric, nonnegative kernel functions; see Assumption 2 in BP. None
of the asymmetric kernels proposed by Chen (2000) and Scaillet (2004) can be expressed in
the form of K(X x;b) for a data point X, design point x, and smoothing parameter b. Therefore, establishing the limiting theory for the Gamma estimator does not directly follow
from BP and is new to the literature.
The …rst results are based on the assumption that the short-rate process Xt is recurrent.
LetS(x) denote the natural scale function of Xt which is de…ned, for some generic constant
c2(0;1), as S(x) = Z x c exp Z y c 2 (u) 2(u) du dy; (7)
ands(x)be the speed function of Xt given by
s(x) = 2 2
(x)S0(x): (8)
Because the range ofXtis(0;1),Xtbecomes recurrent if and only iflimx!0S(x) = 1and
limx!1S(x) =1 (Assumption 1(iii) in Appendix A). The full set of regularity conditions is
provided in Appendix A.
In the subsequent analysis, the concept of local time plays a key role. The chronological
local time of the di¤usion process (1) is de…ned as
LX(T; x) = 1 2(x)lim!0 1Z T 0 1[x;x+ )(Xs) 2(Xs)ds (9)
for every x > 0, where 1A( ) denotes the indicator function on the set A. LX(T; x) is a
whenXtis strictly stationary,LX(T; x)=T a:s:
! f(x), wheref(x)is the time-invariant marginal density ofXt; see Bosq (1998, Theorem 6.3), for instance. In contrast, for a general recurrent
di¤usion, LX(T; x) diverges to in…nity at a rate no faster thanT. Using the Gamma kernel,
LX(T; x)for a …xed T can be (strongly) consistently estimated by
b LX(T; x; b) = n X i=1 KG(x=b+1;b)(Xi ): (10)
To describe the di¤erent asymptotic properties of the estimator depending on the position
of the design point x, we denote by “interior x” and “boundary x” a design point x that
satis…es x=b ! 1 and x=b ! for some > 0 as n; T ! 1, respectively. Also, let “)” denote weak convergence to a random sequence. Theorem 1 states the asymptotic properties
of the Gamma-NW estimator for general recurrent di¤usions.
Theorem 1 (general recurrent case).
(i) (Drift estimator) Suppose that Assumptions 1 and 3 in Appendix A hold. In addition, if b5=2LX(T; x) =Oa:s:(1), then q b1=2Lb X(T; x; b) bb(x) (x) BR(x)b )N 0; 2(x) 2p x1=2 (11)
for interior x, and q
bLbX(T; x; b)fbb(x) (x)g )N 0;
(2 + 1) 2(x)
22 +1 2( + 1) (12)
for boundary x, where BR(x) = 0(x)f1 +xs0(x)=s(x)g+ (x=2) 00(x).
(ii) (Di¤usion estimator) Suppose that Assumptions 1 and 3 in Appendix A hold. In addition, if b5=2LX(T; x)= =Oa:s:(1), then
s b1=2Lb X(T; x; b) b2b(x) 2(x) BR2(x)b )N 0; 4(x) p x1=2 (13)
for interior x, and s bLbX(T; x; b) b2b(x) 2(x) )N 0; (2 + 1) 4(x) 22 2( + 1) (14)
for boundary x, where BR2(x) = 2(x)
0
f1 +xs0(x)=s(x)g+ (x=2) 2(x) 00.
Proof. See Appendix B.
Theorem 1 establishes the limiting behavior of the drift and di¤usion function estimators
when a single smoothing parameter b is used for both interior and boundary regions. For
each of the results, the rate on b balances orders of magnitude in squared bias and variance
for interior x. This rate undersmooths the curve for boundary x, and as a consequence, the
O(b) leading bias term becomes asymptotically negligible. The theorem also suggests that the di¤usion estimator has a faster convergence rate than the drift estimator, regardless of
the position of the design point x. This result is consistent with Remark 8 in BP. However,
the chronological local time is random, and thus convergence rates of the two estimators are
path-dependent (BP, Remark 4).
The existing literature often assumes stationarity or ergodicity of the processXt. To derive
the distributional theory of the Gamma-NW estimator in this case, we impose the additional
condition thatR01s(x)dx <1(Assumption 2 in Appendix A). Using this assumption, it can be demonstrated that LbX(T; x; b)=T
a:s:
! f(x) and that s0(x)=s(x) = f0(x)=f(x).
Accord-ingly, we have the following corollary.
Corollary 1 (positive recurrent case).
(i) (Drift estimator) Suppose that Assumptions 1, 2 and 3 in Appendix A hold. In addition, if b=O T 2=5 , then p T b1=2 b b(x) (x) BS(x)b )N 0; 2(x) 2p x1=2f(x) (15)
for interior x, and
p
T bfbb(x) (x)g )N 0; (2 + 1)
2(x)
22 +1 2( + 1)f(x) (16)
for boundary x, where BS(x) = 0(x)f1 +xf0(x)=f(x)g+ (x=2) 00(x).
(ii) (Di¤usion estimator) Suppose that Assumptions 1, 2 and 3 in Appendix A hold. In addition, if b=O n 2=5 , then p nb1=2 b2 b(x) 2(x) BS2(x)b )N 0; 4(x) p x1=2f(x) (17)
for interior x, and
p
nb b2b(x) 2(x) )N 0; (2 + 1)
4(x)
22 2( + 1)f(x) (18)
for boundary x, where BS2(x) = 2(x) 0f1 +xf0(x)=f(x)g+ (x=2) 2(x) 00.
In Corollary 1, the leading bias terms inbb(x)andb2b(x)for interiorx, namely,BS(x)band
BS
2(x)b, are from nonparametric estimation of the conditional expectation using the Gamma
kernel; on the other hand, the bias terms for boundaryxbecome asymptotically negligible due
to undersmoothing as in Theorem 1. While the bias terms are of order O(b), it follows from
=T =n !0 thatbb(x) has a slower convergence rate of orderO T 1=2b 1=4 for interiorx
andO T 1=2b 1=2 for boundary x. Therefore, if the sample size is not su¢ ciently large, it is
much harder to estimate the drift accurately, especially for design points in the region of high
values where the data are sparse. However, the property of asymmetric kernel smoothing that
the variance of the estimator decreases withx, is expected to provide an e¤ective remedy.
Furthermore, Corollary 1 and the trimming argument in Chen (2000) yield the mean
integ-rated squared error (“MISE”) of each estimator given byM ISEfbb(x)g=O b2+T 1b 1=2
and M ISE b2b(x) = O b2+n 1b 1=2 . Then, the MISE-optimal smoothing parameters are b = O T 2=5 and b 2 = O n 2=5 for bb(x) and b2b(x), respectively. This result
in Chapman and Pearson (2000, p.367). As a consequence, the optimal MISEs are of order
M ISE fbb(x)g=O T 4=5 and M ISE b2b(x) =O n 4=5 .
2.3 Selection of Smoothing Parameter
The practical implementation of the proposed nonparametric estimator requires a choice of a
smoothing parameter.6 Indeed, as argued in Chapman and Pearson (2000), the performance
of the NW estimator depends crucially on the choice of a bandwidth parameter, as well as
the persistence of the data. In this respect, asymmetric kernels are not an exception, even if
they possess the property of adaptive smoothing with a single smoothing parameter. While
the result in the previous section provides some guidance in this direction, the expression for
the optimal smoothing parameter in mean squared error (“MSE”) or MISE sense involves
unknown quantities and a “plug-in rule”is di¢ cult to obtain. Note also that the MSE-optimal
smoothing parameter depends explicitly on the design point and, in principle, they should
take di¤erent values at each x. Hagmann and Scaillet (2007), however, argue for a single,
global smoothing parameter since the dependence on the design point x may deteriorate the
adaptability of asymmetric kernels.
In this paper, we adopt a cross-validation (“CV”) approach to choosing a global smoothing
parameter for nonparametric curve estimation based on asymmetric kernels. Since the data
are dependent, the leave-one-out CV is not appropriate. Instead, we work with theh-block CV
version of Györ…et al. (1989) and Burmanet al. (1994), whereh data points on both sides of
thes-th observation are removed from T observations fXtgTt=1 and the unknown function of
interestm(Xs)is estimated from the remainingT (2h+ 1)observations. The idea behind this
method is that for ergodic processes, the blocks of length h are asymptotically independent
although the block size may need to shrink (at certain rate) relative to the total sample size in
6Bandwidth selection for kernel estimators of continuous-time di¤usion models is an on-going research
pro-gram. Some recent contributions tailored to continous-time Markov processes include Bandiet al. (2010) and Kanaya and Kristensen (2011).
order to ensure the consistency of the procedure. Hallet al. (1995) establish the asymptotic
properties of theh-block CV method.
In our context with equally spaced observationsfXi gni=1, letm^ (i h) :(i+h) (Xi )denote
the estimate of the drift (Xi ) or the di¤usion 2(Xi ) from n (2h + 1) observations
X ; X2 ; :::; X(i h 1) ; X(i+h+1) ; :::; Xn (=XT) . Then, the smoothing parameter can
be selected by minimizing the least squares cross-validation function
CV (b) = arg min b2B n hX i=h+1 Yi m^ (i h) :(i+h) (Xi ) 2 (Xi ); (19) where Yi = X(i+1) Xi = and X(i+1) Xi 2
= when estimating the drift and
di¤usion functions, respectively, B is a predetermined range of b, and ( ) is a weighting function that has compact support and is bounded by 1. In our numerical work, we set
( ) 1. This procedure could be further extended to the modi…ed (asymptotically optimal)
h-block CV method proposed by Racine (2000).
Since formal results on the optimal selection of the block size h are not available in the
literature, we propose an automatic, data-dependent choice of h that takes into account the
persistence of the data. While Burmanet al. (1994) and Racine (2000) adopt an ad hoc …xed
fraction rule (h =n1=4), this choice ofh does not account for di¤erent degrees of persistence in the underlying process. Given some similarities between the choice of h and the selection
of a bandwidth parameter in the heteroskedasticity and autocorrelation consistent (“HAC”)
variance estimation, we follow Andrews (1991) and modify the plug-in-rule forhtoh= ( n)1=4, where = (1 )42(1+ )2 2 and <1is the AR coe¢ cient from an AR(1) regression forfXi g
n i=1.7
When = 0 (or equivalently = 0), this procedure naturally reduces to the leave-one-out CV for serially uncorrelated data.
2.4 Bootstrap Procedure
Inference on the conditional expectations (Xt)and 2(Xt)can be conducted using the
asymp-totic distributions derived in Theorem 1 or Corollary 1 above. This involves explicit estimation
of unknown functions of the data that enter the asymptotic distributions. A more practical
approach, that typically provides better approximations to the …nite-sample distributions of
the statistics of interest, employs simulation-based or bootstrap methods.8 In particular, we
propose a parametric bootstrap procedure based on the discretized version of model (1)
X(i+1) Xi = (Xi ) + (Xi )
p
"i ; (20)
where"i iid
N(0;1). Our bootstrap method also deals explicitly with the discretization bias induced by the discrete-time version of the underlying model. This is achieved by directly
sampling from the continuous-time model with a step size set to4=M for someM 1(in our empirical application,M = 100) while keeping the design points the same across the bootstrap samples.9 More speci…cally, using the estimated drift and di¤usion functionsbb(x)andbb(x),10
we setx0=Xi and compute recursively
xk=xk 1+bb(xk 1)
M +bb(xk 1)
r
M"k; (21)
fork= 1; :::M, where "kiidN(0;1). We then set X(i+1) =xM and repeat this procedure for each design point to construct the bootstrap sample nX(i+1) on
i=1: Finally, each bootstrap
seriesX(i+1) Xi is used to re-estimate (X) and (X) at the same design points.
Let b(x)and b(x)denote the corresponding bootstrap estimates. Repeating this proced-ure a large number of times provides an approximation to the distributions of b(x) bb(x)
8Bootstrap methods for inference in nonparametric regression models have been developed by Härdle and
Marron (1991) and Frankeet al. (2002), among others.
9We would like to thank an anonymous referee for suggesting this to us. 1 0
The di¤usion estimates are obtained by imposing the restriction (0) = 0 as in Stanton (1997) which substantially reduces the bias documented in Fan and Zhang (2003) and ensures the nonnegativity of the simulated interest rates as4 !0.
and b(x) bb(x) at each design point. Then, pointwise con…dence bands can be
construc-ted as bb(x) q 1 2 ; bb(x) q 2 and bb(x) q 1 2 ; bb(x) q 2 , where
q ( ) and q ( ) denote the -quantiles of the bootstrap distributions of b(x) bb(x) and
b(x) bb(x). This method can also be used for bias correction of the original estimates
bb(x) and bb(x) which could further improve the accuracy of the approximation in a double
bootstrap procedure.
3
Empirical and Simulation Analysis
3.1 Dynamics of U.S. Risk-Free RateOne outstanding question in the analysis of the term structure of U.S. interest rates is
con-cerned with the presence of possible nonlinearities in the underlying dynamics of the spot rate.
Despite the large literature on this issue (Aït-Sahalia, 1996a,b; Stanton, 1997; Chapman and
Pearson, 2000; Bandi, 2002; Durham, 2003; Fan and Zhang, 2003; Jones, 2003; Arapis and
Gao, 2006; among others), there is still no consensus on the presence of statistically signi…cant
nonlinearities in short-term interest rates and their economic importance for pricing bonds
and interest rate derivative products. For instance, Chapman and Pearson (2000) argue that
the nonlinearity in the drift of the spot rate at high values of interest rates documented by
Stanton (1997) could be spurious due to the poor …nite-sample properties of Stanton’s (1997)
estimator. Fan and Zhang (2003) conclude that there is little evidence against linearity in the
short rate drift function. Similarly, Bandi (2002), Durham (2003) and Jones (2003) …nd no
empirical support for nonlinear mean reversion in short-term rates. In contrast, Arapis and
Gao (2006) report that their speci…cation testing strongly rejects the linearity of the short rate
drift at both daily and monthly frequency.
Here, we revisit this issue by estimating the drift function using our proposed Gamma kernel
estimator. The data used for estimation are 564 monthly observations for the annualized
January 1952 – December 1998. As it is typically the case for interest rate data, the U.S.
risk-free rate exhibits high persistence and strong conditional heteroskedasticity. Given the
sensitivity of the nonparametric estimates of the drift function to the bandwidth (Chapman
and Pearson, 2000), this high persistence should be properly incorporated into the selection
procedure for a smoothing parameter.
Since our interest lies in identifying possible nonlinearities in the drift function of the spot
rate, it proves instructive to estimate the CIR (Coxet al., 1985) model
dXt= ( Xt)dt+ Xt1=2dWt; (22)
which speci…es the drift as a linear function in X of the form (Xt) = ( Xt), where
denotes the speed of mean reversion and is the long-run mean of X. Given the nature
of the observed data for the spot rate, the unknown parameters ( ; ; ) are estimated from the discretized version of the model with = 1=12 which induces a discretization bias. To correct for the discretization bias, we follow Brozeet al. (1995) and estimate the parameters
of the CIR model by indirect inference using the same auxiliary model and discretization step
as suggested by Broze et al. (1995). The resulting estimates for the U.S. risk-free rate are
b= 0:2804,b= 0:0541and b= 0:0876.
Figure 2 presents the nonparametric estimates of the drift function for the risk-free rate
from the Gamma NW and Gaussian NW regressions along with the estimated parametric drift
function from the CIR model. The smoothing parameter for the Gamma kernel estimator is
chosen byh-block CV described in Section 2.3 while the smoothing parameter for the Gaussian
kernel is set equal to four times the bandwidth foriiddata as in Stanton (1997). The di¤erences
in the nonparametric estimates reveal that part of the accelerated mean reversion at higher
interest rates, reported in the previous literature, may be due to larger biases of the symmetric
kernel-based estimators although the Gamma kernel estimator also suggests that the mean
signi…cance of these …ndings, we also report 95% con…dence bands for the Gamma kernel
estimates obtained by the bootstrap procedure outlined in Section 3.4 with 999 bootstrap
replications. Despite the mild nonlinearity of the nonparametric estimates, the parametric
estimate of the drift function from the CIR model falls within the 95% con…dence bands (with
the exception of very low interest rates) and it appears that the null hypothesis of a linear drift
cannot be rejected for this particular sample. Also, the nonparametric estimate based on the
Gaussian kernel lies inside the 95% Gamma kernel-based con…dence bands except for very large
interest rates where the bias due to sparseness of the data becomes more pronounced. The
Gamma kernel estimate of the di¤usion function and the associated 95% bootstrap con…dence
bands are plotted in Figure 3. To gain further insights into the …nite-sample properties of the
nonparametric estimators and the economic importance of these seemingly small statistical
di¤erences for bond and option pricing, we conduct a simulation experiment whose design
closely mimics our empirical application.
3.2 Monte Carlo Experiment
The data for the simulation experiment is generated from the CIR model (22) with true
parameter values that are set equal to the estimates from U.S. data in the previous section,
( ; ; ) = (0:2804;0:0541;0:0876);and a time step between two consecutive observation equal to = 1=12 corresponding to monthly data. The CIR model is convenient because the transition and marginal densities are known and the bond and call option prices are available in
closed form (Coxet al., 1985). 5;000sample paths for the spot interest rate of600observations (n= 600; T = 50) are constructed recursively by drawing random numbers from the transition non-central chi-square density and using the values for , ; and (see Chapman and
Pearson, 2000, for more details).
The expressions for the price of a coupon discount bond and a call option on a
Jiang (1998) and Phillips and Yu (2005) and compute the prices of a three-year zero-coupon
discount bond and a one-year European call option on a three-year discount bond with a face
value of $100 and an exercise price of $87 with an initial interest rate of 7% by simulating spot
rate data from the estimated di¤usion process. The simulated bond and derivative prices are
then compared to the analytical prices based on the true values of the parameters.
More speci…cally, the price of a zero-coupon bond with face value P0 and maturity( t)
is computed as Pt =P0Et exp Z t ~ Xudu ; (23)
where the risk-neutral process X~t evolves as dX~t = b( ~Xt)dt+b( ~Xt)dWt, and b( ) and b( )
denote the nonparametric estimates of the drift and di¤usion functions, respectively.11 The
expectation is evaluated by Monte Carlo simulation using a discretized version of the dynamics
of the spot rate.
The price of a call option with maturity (m t) on a zero-coupon bond with maturity
( t);face value P0 and exercise priceK is computed as
Ctm = Et exp Z m t ~ Xudu max (Pm K;0) = Et exp Z m t ~ Xudu max P0Em exp Z m ~ Xudv K;0 ; (24)
where m < and sample paths for X~t are simulated from the nonparametrically estimated
discretized model of spot rate.
We consider the NW estimators based on Gaussian and Gamma kernels with smoothing
parameters that are selected by h-block CV. The average Monte Carlo estimates and 95%
Monte Carlo con…dence bands for the drift function are plotted in Figures 4 and 5. While
the Gamma NW kernel estimator is practically unbiased, the Gaussian NW kernel exhibits
a downward bias for interest rates higher than 9-10%. Furthermore, the asymmetric kernel
1 1
For simplicity, the market price of risk is assumed to be equal to zero since its computation requires another interest rate process of di¤erent maturity.
estimator exhibits smaller variability both near the boundary and in the tail region where the
data are sparse.12
The economic signi…cance of the improved estimation of di¤usion models of spot rate is
evaluated by comparing bond and option pricing errors based on symmetric and asymmetric
kernel estimators13 for the CIR model. The results are presented in Table 1. The bond and
derivative prices based on the asymmetric kernel estimator are less biased than its Gaussian
counterpart which could translate in potentially substantial …nancial bene…ts and investment
pro…ts for large portfolios. Another important feature of the results is that the Gamma-based
bond and option prices enjoy much smaller variability and tighter con…dence intervals than the
symmetric kernel-based option prices. Finally, while the con…dence intervals for the Gamma
kernel-based call prices appear symmetric around the median estimate, the con…dence intervals
for the Gaussian kernel-based call prices tend to be highly asymmetric (with long right tail)
which illustrates the economic impact of the boundary problems of symmetric kernels.
4
Conclusion
This paper proposes an asymmetric kernel estimator for estimating the underlying scalar
di¤u-sion model of spot interest rate. The asymptotic properties of the Gamma kernel estimator of
the drift and di¤usion functions are established for both interior and boundary design points.
We show that the Gamma kernel estimator possesses some appealing properties such as lack
of boundary bias and adaptability in the amount of smoothing. The paper adopts a block
cross-validation method for dependent data in choosing the smoothing parameter. The
…nite-sample performance of the asymmetric kernel estimator and its practical relevance for bond
1 2
The estimates of the di¤usion function ( ) are not reported to conserve space. Overall, the di¤usion function estimates based on the Gamma and Gaussian kernels are similar due to the zero restriction of the di¤usion function at the origin (see footnote 10). When this restriction is removed, the Gamma kernel estimator tends to perform better as a result of its intrinsic boundary correction property.
1 3
Kristensen (2008) demonstrates that when the processXtis stationary, derivative prices obtained by plug-in
nonparametric estimates of the drift and di¤usion functions have the same rate of convergence as parametric estimators.
and option pricing are evaluated using simulated data of spot interest rate. The empirical
ana-lysis of the U.S. risk-free rate reveals the existence of some mild (and statistically insigni…cant)
nonlinearity in the drift function although the estimated increased mean reversion at higher
interest rate levels is substantially smaller than originally reported by Stanton (1997).
A
Appendix: Assumptions
This appendix provides a set of assumptions for Theorem 1 and Corollary 1 that are similar
to those in Bandi (2002) and BP. Our …rst assumption is basically the same as Assumption
1 in BP, which ensures that the SDE (1) has a unique strong solution Xt and that Xt is
recurrent. Assumption 2, together with Assumption 1, implies that the processXt is positive
recurrent (or ergodic) and ensures the existence of a time-invariant distributionP0with density
f(x) =s(x)=R01s(x)dx. Moreover, ifX0 has distributionP0,Xtbecomes strictly stationary.
Assumption 3 controls the rates of convergence or divergence of the sequences used in the
asymptotic results for general and positive recurrent cases.
Assumption 1.
(i) ( ) and ( ) are time-homogeneous,B-measurable functions on (0;1), where B is the -…eld generated by Borel sets on (0;1). Both functions are at least twice continuously di¤erentiable. Hence, they satisfy local Lipschitz and growth conditions. Thus, for every
compact subset J of the range (0;1), there exist constants C1J and C2J such that, for all
x; y2J,j (x) (y)j+j (x) (y)j CJ
1 jx yjand j (x)j+j (x)j C2Jf1 +jxjg.
(ii) 2( )>0 on(0;1).
(iii) The natural scale functionS(x) satis…eslimx!0S(x) = 1and limx!1S(x) =1.
Assumption 3. Asn; T ! 1, (=T =n)!0,b(=bn;T)!0such that
LX(T; x)
b
s
log 1 =oa:s:(1): (A1)
B
Appendix: Preliminary Lemma and Proof of Theorem 1
The proof of Theorem 1 closely follows the proof of Theorem 3 in BP by adapting it to the
case of the Gamma kernel. We provide only the proof for the drift estimator bb(x), because the one for the di¤usion estimatorb2b(x)follows similar arguments. To establish the results in Theorem 1, we need the following lemma.
Lemma 1. Under Assumptions 1 and 3 in Appendix A,
n 1 X i=1 KG(x=b+1;b)(Xi ) (Xi ) = Z T 0 KG(x=b+1;b)(Xs) (Xs)ds+oa:s:(1): (B1)
Proof of Lemma 1. Consider that
n 1 X i=1 KG(x=b+1;b)(Xi ) (Xi ) Z T 0 KG(x=b+1;b)(Xs) (Xs)ds n 1 X i=0 Z (i+1) i KG(x=b+1;b)(Xi ) KG(x=b+1;b)(Xs) (Xi )ds + n 1 X i=0 Z (i+1) i KG(x=b+1;b)(Xs)f (Xs) (Xi )gds + KG(x=b+1;b)(X0) (X0) A1(x) +A2(x) +Oa:s:( ); (B2)
where the bound of the third term is obtained fromX0 =x0, Assumption 1 and boundedness
of the kernel. Observe that
A1(x) n 1 X i=0 Z (i+1) i KG0 (x=b+1;b) X~is jXs Xi j j (Xi )jds (B3)
forX~is on the line segment connectingXs and Xi . Since
kn;T max i n i ssup(i+1) jXs Xi j=Oa:s: s log 1 ! (B4)
(see p.267 in BP),X~is=Xs+oa:s:(1)uniformly over i n, and thus A1 kn;T n 1 X i=0 Z (i+1) i KG0 (x=b+1;b)(Xs+oa:s:(1)) j (Xs+oa:s:(1))jds = kn;T Z T 0 KG0(x=b+1;b)(Xs+oa:s:(1)) j (Xs+oa:s:(1))jds = kn;T b Z 1 0 b KG0(x=b+1;b)(u+oa:s:(1)) j (u)jLX(T; u)du: (B5) Note that KG0(x=b+1;b)(u) = x b ux=b 1exp ( u=b) bx=b+1 (x=b+ 1) 1 b ux=bexp ( u=b) bx=b+1 (x=b+ 1): (B6)
Then, by the properties of the Gamma function,
Z 1 0 b KG0(x=b+1;b)(u) du = x Z 1 0 ux=b 1exp ( u=b) bx=b+1 (x=b+ 1)du+ Z 1 0 ux=bexp ( u=b) bx=b+1 (x=b+ 1)du = x b x=b (x=b) bx=b+1 (x=b+ 1) + 1 = 1 + 1 = 2: (B7)
Combining this result with the continuity of ( ) and LX(T; ) and Assumption 3, we can
conclude that A1 is bounded by Oa:s: LX(bT;x)
q
log 1 = oa:s:(1). Similarly, A2(x) is
bounded byOa:s: LX(bT;x)
q
log 1 =oa:s:(1).
Proof of Theorem 1. Observe that
bb(x) (x) = Pn 1 i=1 KG(x=b+1;b)(Xi )f (Xi ) (x)g Pn 1 i=1 KG(x=b+1;b)(Xi ) + Pn 1 i=1 KG(x=b+1;b)(Xi ) nX (i+1) Xi (Xi ) o Pn 1 i=1 KG(x=b+1;b)(Xi ) B (x) +V (x): (B8)
Using Lemma 1 above and Lemma 6 in BP, we have
B (x) = RT 0 KG(x=b+1;b)(Xs)f (Xs) (x)gds+oa:s:(1) RT 0 KG(x=b+1;b)(Xs)ds+oa:s:(1) = R1 0 KG(x=b+1;b)(u)f (u) (x)gs(u)ds+oa:s:(1) R1 0 KG(x=b+1;b)(u)s(u)ds+oa:s:(1) + (s:o:); (B9)
where(s:o:)denotes the term that is of smaller order than the …rst term. Ignoring smaller order terms, taking a second-order Taylor expansion for (u)aroundu=x, and usingE( x x) =b
and E( x x)2 =b(x+ 2b) for x Gamma (x=b+ 1; b), we obtain the following
approxim-ation of the …rst term:
R1 0 KG(x=b+1;b)(u)f (u) (x)gs(u)ds R1 0 KG(x=b+1;b)(u)s(u)ds = 0(x) 1 +xs 0(x) s(x) + x 2 00(x) b+o(b): (B10)
On the other hand, V (x) can be rewritten as
V (x) = Pn 1 i=1 KG(x=b+1;b)(Xi ) 1 R(i+1) i f (Xs) (Xi )gds Pn 1 i=1 KG(x=b+1;b)(Xi ) + Pn 1 i=1 KG(x=b+1;b)(Xi ) 1 R(i+1) i (Xs)dW s Pn 1 i=1 KG(x=b+1;b)(Xi ) V1(x) +V2(x); (B11)
whereV1(x) is bounded by Oa:s:
q
log 1 =oa:s:(1) following the same argument as in
the proof of Lemma 1 above. The leading variance term of the numerator of V2(x) can be
determined by n 1 X i=0 KG2(x=b+1;b)(Xi ) 1 Z (i+1) i 2(X s)ds+oa:s:(1) = Z T 0
KG2(x=b+1;b)(Xs+oa:s:(1)) 2(Xs) +oa:s:(1) ds+oa:s:(1): (B12)
Ignoring smaller order terms, and de…ning Ab(x) = b 1 (2x=b+ 1)= 22x=b+1 2(x=b+ 1) ,
we have Z T 0 KG2(x=b+1;b)(Xs) 2(Xs)ds = Ab(x) Z T 0 KG(2x=b+1;b=2)(Xs) 2(Xs)ds = Ab(x) Z T 0 KG(2x=b+1;b=2)(u) 2(u)LX(T; u)du; (B13)
where (see Chen, 2000)
Ab(x) = ( b 1=2 2p x1=2 +o b 1=2 for interior x b 1 (2 +1) 22 +1 2( +1) +o b 1 for boundaryx: (B14)
Hence, for interiorx, b1=2 Z T 0 KG2(x=b+1;b)(Xs) 2(Xs)ds a:s: ! 2(x) 2p x1=2LX(T; x): (B15)
It follows from the arguments in the proof of Theorem 3 in BP that
b1=4V2(x))M N 0; 2(x) 2p x1=2L X(T; x) : (B16) Therefore, q b1=2L X(T; x)V2(x))N 0; 2(x) 2p x1=2 : (B17)
The result for interior x can be established by combining (B8), (B10) and (B17) under the
assumption that b5=2LX(T; x) = Oa:s:(1) and then replacing LX(T; x) by LbX(T; x; b). The
References
[1] Abramson, I. S. (1982): “On Bandwidth Variation in Kernel Estimates - A Square Root Law,”Annals of Statistics, 10, 1217 - 1223.
[2] Aït-Sahalia, Y. (1996a): “Nonparametric Pricing of Interest Rate Derivative Securities,”
Econometrica, 64, 527 - 560.
[3] Aït-Sahalia, Y. (1996b): “Testing Continuous Time Models of the Spot Interest Rate,”
Review of Financial Studies, 9, 385 - 426.
[4] Andrews, D. W. K. (1991): “Heteroskedasticity and Autocorrelation Consistent Covari-ance Matrix Estimation,”Econometrica, 59, 817 - 858.
[5] Arapis, M., and J. Gao (2006): “Empirical Comparisons in Short-Term Interest Rate Models Using Nonparametric Methods,”Journal of Financial Econometrics, 4, 310 - 345.
[6] Bandi, F. M. (2002): “Short-Term Interest Rate Dynamics: A Spatial Approach,”Journal of Financial Economics, 65, 73 - 110.
[7] Bandi, F. M., V. Corradi, and G. Moloche (2010): “Bandwidth Selection for Continuous-Time Markov Processes,” Unpublished manuscript.
[8] Bandi, F. M., and P. C. B. Phillips (2003): “Fully Nonparametric Estimation of Scalar Dif-fusion Models,”Econometrica, 71, 241 - 283.
[9] Bosq, D. (1998): Nonparametric Statistics for Stochastic Processes: Estimation and Pre-diction, 2nd edition. New York: Springer-Verlag.
[10] Bouezmarni, T., and J. V. K. Rombouts (2008): “Density and Hazard Rate Estimation for Censored and -Mixing Data Using Gamma Kernels,”Journal of Nonparametric Statist-ics, 20, 627 - 643.
[11] Bouezmarni, T., and O. Scaillet (2005): “Consistency of Asymmetric Kernel Density Es-timators and Smoothed Histograms with Application to Income Data,”Econometric The-ory, 21, 390 - 412.
[12] Broze, L., O. Scaillet, and J.-M. Zakoïan (1995): “Testing for Continuous-Time Models of the Short-Term Interest Rate,”Journal of Empirical Finance, 2, 199 - 223.
[13] Burman, P., E. Chow, and D. Nolan (1994): “A Cross-Validatory Method for Dependent Data,”Biometrika, 81, 351 - 358.
[14] Chapman, D. A., and N. D. Pearson (2000): “Is the Short Rate Drift Actually Nonlinear?,”
Journal of Finance, 55, 355 - 388.
[15] Chen, S. X. (2000): “Probability Density Function Estimation Using Gamma Kernels,”
Annals of the Institute of Statistical Mathematics, 52, 471 - 480.
[16] Cox, J. C., J. E. Ingersoll, Jr., and S. A. Ross (1985): “A Theory of the Term Structure of Interest Rates,”Econometrica, 53, 385 - 408.
[17] Durham, G. B. (2003): “Likelihood-Based Speci…cation Analysis of Continuous-Time Models of the Short-Term Interest Rate,”Journal of Financial Economics, 70, 463 - 487.
[18] Fan, J. (1993): “Local Linear Regression Smoothers and Their Minimax E¢ ciencies,”
Annals of Statistics, 21, 196 - 216.
[19] Fan, J., and I. Gijbels (1992): “Variable Bandwidth and Local Linear Regression Smooth-ers,”Annals of Statistics, 20, 2008 - 2036.
[20] Fan, J., and C. Zhang (2003): “A Reexamination of Di¤usion Estimators with Applications to Financial Model Validation,”Journal of the American Statistical Association, 98, 118 -134.
[21] Fé, D. (2010): “An Application of Local Linear Regression with Asymmetric Kernels to Regression Discontinuity Designs,” Economics Discussion Paper Series EDP-1016, Uni-versity of Manchester.
[22] Florens-Zmirou, D. (1993): “On Estimating the Di¤usion Coe¢ cient from Discrete Ob-servations,”Journal of Applied Probability, 30, 790 - 804.
[23] Franke, J., J.-P. Kreiss, and E. Mammen (2002): “Bootstrap of Kernel Smoothing in Nonlinear Time Series,”Bernoulli, 8, 1 - 37.
[24] Gasser, T., and H.-G. Müller (1979): “Kernel Estimation of Regression Functions,” in T. Gasser and M. Rosenblatt (eds.), Smoothing Techniques for Curve Estimation: Pro-ceedings of a Workshop Held in Heidelberg, April 2 - 4, 1979. Berlin: Springer-Verlag, 23 - 68.
[25] Gospodinov, N., and M. Hirukawa (2007): “Time Series Nonparametric Regression Using Asymmetric Kernels,” CIRJE Discussion Paper No. CIRJE-F-573, CIRJE, Faculty of Economics, University of Tokyo.
[26] Gouriéroux, C., and A. Monfort (2006): “(Non)Consistency of the Beta Kernel Estimator for Recovery Rate Distribution,” CREST Discussion Paper n 2006-31.
[27] Gustafsson, J., M. Hagmann, J. P. Nielsen, and O. Scaillet (2009): “Local Transformation Kernel Density Estimation of Loss Distributions,”Journal of Business & Economic Stat-istics, 27, 161 - 175.
[28] Györ…, L., W. Härdle, P. Sarda, and P. Vieu (1989): Nonparametric Curve Estimation from Time Series, Lecture Notes in Statistics, Springer-Verlag, Berlin.
[29] Hagmann, M., and O. Scaillet (2007): “Local Multiplicative Bias Correction for Asymmet-ric Kernel Density Estimators,”Journal of Econometrics, 141, 213 - 249.
[30] Hall, P., S. N. Lahiri, and J. Polzehl (1995): “On Bandwidth Choice in Nonparametric Regression with Both Short- and Long-Range Dependent Errors, ”Annals of Statistics, 23, 1921 - 1936.
[31] Härdle, W., and J. S. Marron (1991): “Bootstrap Simultaneous Error Bars for Nonpara-metric Regression,”Annals of Statistics, 19, 778 - 796.
[32] Jiang, G. J. (1998): “Nonparametric Modeling of U.S. Interest Rate Term Structure Dynamics and Implications on the Prices of Derivative Securities,”Journal of Financial and Quantitative Analysis, 33, 465 - 497.
[33] Jiang, G. J., and J. L. Knight (1997): “A Nonparametric Approach to the Estimation of Di¤usion Processes with an Application to a Short-Term Interest Rate Model,” Econo-metric Theory, 13, 615 - 645.
[34] Jones, C. S. (2003): “Nonlinear Mean Reversion in the Short-Term Interest Rate,”Review of Financial Studies,16, 793 - 843.
[35] Jones, M. C., and D. A. Henderson (2007): “Kernel-Type Density Estimation on the Unit Interval,”Biometrika, 24, 977 - 984.
[36] Kanaya, S., and D. Kristensen (2011): “Optimal Sampling and Bandwidth Selection for Nonparametric Kernel Estimators of Di¤usion Processes,” Unpublished manuscript.
[37] Kristensen, D. (2008): “Estimation of Partial Di¤erential Equations with Applications in Finance,”Journal of Econometrics, 144, 392 - 408.
[38] Kristensen, D. (2010): “Nonparametric Filtering of the Realized Spot Volatility: A Kernel-Based Approach,”Econometric Theory, 26, 60 - 93.
[39] Müller, H.-G. (1991): “Smooth Optimum Kernel Estimators near Endpoints,”Biometrika, 78, 521 - 530.
[40] Nicolau, J. (2003): “Bias Reduction in Nonparametric Di¤usion Coe¢ cient Estimation,”
Econometric Theory, 19, 754 - 777.
[41] Phillips, P. C. B., and J. Yu (2005): “Jackkni…ng Bond Option Prices,”Review of Financial Studies, 18, 707 - 742.
[42] Racine, J. S. (2000): “A Consistent Cross-Validatory Method for Dependent Data: hv -Block Cross-Validation,”Journal of Econometrics, 99, 39 - 61.
[43] Renault, O., and O. Scaillet (2004): “On the Way to Recovery: A Nonparametric Bias Free Estimation of Recovery Rate Densities,”Journal of Banking & Finance, 28, 2915 - 2931.
[44] Rice, J. (1984): “Boundary Modi…cation for Kernel Regression,”Communications in Stat-istics: Theory and Methods, 13, 893 - 900.
[45] Scaillet, O. (2004): “Density Estimation Using Inverse and Reciprocal Inverse Gaussian Kernels,”Journal of Nonparametric Statistics, 16, 217 - 226.
[46] Stanton, R. (1997): “A Nonparametric Model of Term Structure Dynamics and the Market Price of Interest Rate Risk,”Journal of Finance, 52, 1973 - 2002.
Table 1. Monte Carlo statistics of bond and option prices based on nonparametric (Gaussian NW and Gamma NW) estimators in the CIR model with = 0:2804; = 0:0541 and = 0:0876.
bond price call option price true price 82.425 1.889 Gaussian NW estimator median estimate 82.359 1.656 standard deviation 1.322 0.515 95% con…dence interval [80.420, 85.573] [1.014, 3.026] Gamma NW estimator median estimate 82.447 1.704 standard deviation 1.115 0.347 95% con…dence interval [80.665, 85.058] [1.133, 2.463]
Notes: The statistics in the table are computed from 5,000 samples generated from the CIR model with4= 1=12 andn= 600. The prices of a three-year zero-coupon discount bond and a one-year European call option on a three-year bond with face value of $100, strike price of $87 and initial interest rate of 7% are computed by Monte Carlo simulation.
Figure 1. Shapes of the Gamma kernel function with a …xed smoothing parameter (0.02) at various design points.
Figure 2. Estimates of the drift function ( )of the risk-free rate with the NW estimator based on the Gamma kernel (along with 95% bootstrap con…dence bands), NW estimator based on the Gaussian kernel and indirect inference estimator of the CIR model. The smoothing parameter for the Gamma kernel estimator is chosen byh-block cross-validation withh= ( n)1=4, where is de…ned in Section 2.3. The smoothing parameter for the Gaussian kernel estimator is set equal to four times the bandwidth parameter foriiddata (Stanton, 1997).
Figure 3. Estimate of the di¤usion function ( )of the risk-free rate with the NW estimator based on the Gamma kernel (along with 95% bootstrap con…dence bands). The smoothing parameter for the Gamma kernel estimator is chosen by h-block cross-validation with h = ( n)1=4, where is de…ned in Section 2.3.
Figure 4. Average Monte Carlo drift (Gamma and Gaussian NW) estimates from CIR model with ( ; ; ) = (0:2804; 0:0541; 0:0876). The smoothing parameters are selected by h-block cross validation withh= ( n)1=4, where is de…ned in Section 2.3.
Figure 5. 95% Monte Carlo con…dence intervals of the drift (Gamma and Gaussian NW) estimates from CIR model with( ; ; ) = (0:2804;0:0541;0:0876)and smoothing parameters selected byh-block cross validation withh= ( n)1=4, where is de…ned in Section 2.3.