Estimation with Unknown f - Three Essays in Micro-Econometrics

$f_v$ is usually unknown. In this section, we discuss the case in which $f_v$ is estimated nonparametrically.

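We consider here the modified estimator (1.2.8), $\hat{\theta}_n$, with $f_v$ replaced by the estimate $\hat{f}_v$, where

$$\hat{f}_v(v_i) = \frac{1}{n-1} \sum_{j=1,\, j \neq i}^{n} \frac{1}{h} K\!\left(\frac{v_j - v_i}{h}\right). \tag{1.3.1}$$

$K(\cdot)$ is the standard kernel function defined in Assumption 37.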

The estimation of $f_v$ introduces some new problems: $f_v$ must be estimated on the expanding sets $[\gamma_0, \gamma_n]$, and the estimator now needs a linear representation. As shown in Wooldridge (2007), Hirano, Imbens, and Ridder (2003), Magnac and Maurin (2007), and many others, the estimator with estimated $f_v$ can have a smaller variance than the one using the true $f_v$. This is also the case here; however, the rate remains the same. For the convenience of inference, we prove the consistency of the bootstrap when we are in the nice world. Note that the convergence rate in this model is slower than root-n.

1.3.1 The Consistency of $\hat{f}_v(v)$

To have a point-wise consistent estimate $\hat{f}_v(v)$, we need the number of observations around the point $v$ to tend to infinity. We know that $f_v(\gamma_n) \le \inf_{v \in [\gamma_0, \gamma_n]} f_v(v)$ for $n$ large enough. So if $\hat{f}_v(\gamma_n)$ is consistent for $f_v(\gamma_n)$, the point-wise consistency of $\hat{f}_v(v)$ on the whole interval $[\gamma_0, \gamma_n]$ should hold.

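The standard nonparametric analysis (e.g., Li and Racine 2007) gives that

$$E\left[\hat{f}_v(v)\right] = f_v(v) + \frac{\kappa_q}{q!} f_v^{(q)}(v)\, h^q, \tag{1.3.2}$$

$$\operatorname{var}\left[\hat{f}_v(v)\right] = \frac{\kappa f_v(v)}{nh}, \tag{1.3.3}$$

where $\kappa_q \equiv \int v^q K(v)\,dv$, $\kappa \equiv \int K(v)^2\,dv$, and $q$ is the order of the kernel function $K$. From equations (1.3.2) and (1.3.3),

$$\frac{\hat{f}_v(v)}{f_v(v)} = 1 + \frac{\kappa_q}{q!} \frac{f_v^{(q)}(v)\, h^q}{f_v(v)} + O_p\!\left(\frac{1}{\sqrt{n h f_v(v)}}\right). \tag{1.3.4}$$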

To control the variance term, we need the number of observations used to estimate $f_v(\gamma_n)$, which is of order $n h f_v(\gamma_n)$, to tend to infinity. The bias term can be controlled by using a high-order kernel function with a bandwidth $h \propto n^{-c}$ for some $c > 0$.
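The following Python sketch illustrates the leave-one-out kernel density estimate and the effective local sample size $n h f_v(v)$ discussed above. It is only an illustration under stated assumptions: the Gaussian kernel is a second-order stand-in for the kernel of Assumption 37 (which is not reproduced here), the data are simulated standard normal draws, and the function names are hypothetical.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density; an illustrative second-order kernel (assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def loo_density(v, i, h, kernel=gaussian_kernel):
    """Leave-one-out kernel density estimate at v[i] with bandwidth h."""
    n = len(v)
    total = sum(kernel((v[j] - v[i]) / h) for j in range(n) if j != i)
    return total / ((n - 1) * h)


def effective_count(n, h, f_at_v):
    """Order of the number of observations used at a point: n * h * f(v)."""
    return n * h * f_at_v


if __name__ == "__main__":
    random.seed(0)
    n = 2000
    h = n ** (-0.2)  # bandwidth h = n^(-c) with c = 0.2 (illustrative choice)
    v = [random.gauss(0.0, 1.0) for _ in range(n)]
    i_center = min(range(n), key=lambda k: abs(v[k]))  # point near the mode
    i_tail = max(range(n), key=lambda k: v[k])          # point far in the tail
    for label, i in (("center", i_center), ("tail", i_tail)):
        true_f = gaussian_kernel(v[i])
        print(label, v[i], loo_density(v, i, h), true_f,
              effective_count(n, h, true_f))
```

In the tail, the effective count is small even for large $n$, which is why the relative error in (1.3.4) is hardest to control at the boundary of the expanding set.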

The optimal convergence rate condition (1.2.15) and the tail condition (1.2.14) imply that $n f_v(\gamma_n) \to \infty$. For the consistency of $\hat{f}_v(v)$ on $[\gamma_0, \gamma_n]$, we need a slightly stronger condition:

$$n^{1 - c_h} f_v(\gamma_n) \to \infty, \tag{1.3.5}$$

for some $0 < c_h < 1$. The optimal convergence rate condition remains the same:

$$\frac{1}{n} \int^{\gamma_n} \frac{E\left(Y^2 D \mid v\right)}{f_v(v)}\,dv = \operatorname{cov}(Y, U)^2. \tag{1.3.6}$$

However, conditions (1.3.5) and (1.3.6) together place a more restrictive condition on $f_v(v)$:

$$\left(\int^{\gamma_n} \frac{1}{f_v(v)}\,dv\right)^{1 - c_h} f_v(\gamma_n) \to \infty, \tag{1.3.7}$$

for some $0 < c_h < 1$. This is the new and stronger tail condition needed to be in the nice world with the estimated instead of the true density. Condition (1.3.7) rules out $f_v(v) \propto \exp(-v^c)$ for $c < 1$ in Example 2. This is because the tail of that $f_v(v)$ is too thin to ensure the consistency of $\hat{f}_v$ on the entire expanding sets if we choose $h = n^{-c}$ for some $c > 0$.
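To see how the tail condition (1.3.7) separates thick from thin tails, the sketch below evaluates the left-hand side in closed form for two illustrative densities. Both densities, the lower integration limits (1 for the Pareto case, 0 for the exponential case), and the function names are assumptions made for this illustration; they are not the examples of Section 1.2.

```python
import math


def tail_statistic(integral_of_inverse_density, density_at_gamma, c_h):
    """Left-hand side of (1.3.7): (integral of 1/f)^(1 - c_h) * f(gamma)."""
    return integral_of_inverse_density ** (1.0 - c_h) * density_at_gamma


def pareto_case(gamma, a, c_h):
    """f(v) = a * v^(-(a+1)) on v >= 1 (assumed thick-tailed example)."""
    # integral_1^gamma v^(a+1) / a dv = (gamma^(a+2) - 1) / (a * (a + 2))
    integral = (gamma ** (a + 2) - 1.0) / (a * (a + 2))
    density = a * gamma ** (-(a + 1))
    return tail_statistic(integral, density, c_h)


def exponential_case(gamma, c_h):
    """f(v) = exp(-v) on v >= 0 (assumed thin-tailed example)."""
    integral = math.expm1(gamma)  # integral_0^gamma exp(v) dv = exp(gamma) - 1
    density = math.exp(-gamma)
    return tail_statistic(integral, density, c_h)


if __name__ == "__main__":
    for gamma in (10.0, 100.0, 1000.0):
        print(gamma, pareto_case(gamma, a=1.0, c_h=0.2),
              exponential_case(gamma, c_h=0.2))
```

For the Pareto density the statistic grows like $\gamma^{(a+2)(1-c_h)-(a+1)}$, so it diverges whenever $c_h < 1/(a+2)$; for the exponential density it behaves like $e^{-c_h \gamma}$ and vanishes for every $c_h > 0$.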

Assumption 11 (Restriction on $f_v$) For $\gamma_n$ chosen from condition (1.3.6), $\frac{f_v(v + h)}{f_v(v)} = 1 + o(1)$ for $v \in [\gamma_0, \gamma_n]$, where $h$ is the bandwidth used in the kernel function, $h \propto n^{-c}$ for some $c > 0$.

Assumption 11 is needed for the consistency of $\hat{f}_v(v)$. Intuitively, it says that the density at the observations used in the estimation should be close to the density being estimated. It is not hard to verify that the densities $f_v$ in Lemma 1.2.10 satisfy Assumption 11, so it is reasonable to impose this assumption.

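Lemma 1.3.1 For $\gamma_n$ chosen from condition (1.3.6), under Assumption 11, if $h \propto n^{-c_h}$ for some $0 < c_h < 1$, using the kernel defined in Assumption 37 with $q > \frac{1 - c_h}{c_h}$,

$$\sup_{v \in [\gamma_0, \gamma_n]} \left| \hat{f}_v(v) - f_v(v) \right| = O_p\!\left(\left(\frac{\ln n}{nh}\right)^{1/2}\right).$$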

Note that condition (1.3.6) can possibly give $\gamma_n$ growing as fast as $n^{1/2}$ if the tail of $f_v(v)$ is thick enough. Hansen (2008) also obtains the uniform convergence rate of $\sup_{v \in [\gamma_0, \gamma_n]} \left| \hat{f}_v(v) - f_v(v) \right|$ on expanding sets. However, that result does not cover ours, because our $\gamma_n$ may go to infinity faster.

1.3.2 The First-Order Asymptotics

We consider the first-order asymptotics of $\frac{1}{n}\sum_{i=1}^{n} \hat{\theta}_{ni}$. To simplify notation, let m_ni DiTniYi n E(Uⁿ); then $\theta_{ni} \equiv m_{ni} / f_v(v_i)$ and $\hat{\theta}_{ni} \equiv m_{ni} / \hat{f}_v(v_i)$. Observe that

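$$\hat{\theta}_{ni} = \frac{m_{ni}}{\hat{f}_v(v_i)} = \frac{m_{ni}}{f_v(v_i)} + \frac{m_{ni}\left[f_v(v_i) - \hat{f}_v(v_i)\right]}{f_v^2(v_i)} + \frac{m_{ni}\left[f_v(v_i) - \hat{f}_v(v_i)\right]^2}{f_v^2(v_i)\,\hat{f}_v(v_i)}, \tag{1.3.8}$$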

where the first two terms on the right-hand side are the influence terms and can be analyzed using standard U-statistics, and the last term is the residual term, which is asymptotically negligible.
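The decomposition in (1.3.8) is an exact algebraic identity, which a short numerical check makes concrete. The sketch below is illustrative only; the function name and the sample values of $m_{ni}$, $f_v(v_i)$, and $\hat{f}_v(v_i)$ are hypothetical.

```python
def expansion_terms(m, f, f_hat):
    """Split m / f_hat into the three terms on the right-hand side of (1.3.8)."""
    leading = m / f                                      # term using the true density
    linear = m * (f - f_hat) / f ** 2                    # term linear in the estimation error
    residual = m * (f - f_hat) ** 2 / (f ** 2 * f_hat)   # term quadratic in the error
    return leading, linear, residual


if __name__ == "__main__":
    m, f, f_hat = 1.5, 0.40, 0.35  # hypothetical values
    terms = expansion_terms(m, f, f_hat)
    print(terms, sum(terms), m / f_hat)
```

The residual is quadratic in $f_v(v_i) - \hat{f}_v(v_i)$, which is why uniform convergence of $\hat{f}_v$ on the expanding sets is what makes it negligible.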

With the uniform convergence of $\hat{f}_v(v)$ over the expanding sets, the following theorem gives the linear representation by applying the standard U-statistic technique (see Powell et al. 1989) to the influence term and showing that the residual term is asymptotically negligible.

Theorem 1.3.2 Suppose $f_v(v)$ satisfies condition (1.3.7). Let Assumption 3, 35 v 11, 37 hold. For $\gamma_n$ chosen from condition (1.3.6), we set $h = n^{-c_h}$, 0 < c_h c_h, and $q > \frac{1 - c_h}{c_h}$

where the influence term is asymptotically normal and achieves the fastest rate of convergence, and

q 2 n

nBn 1:

By Theorem 1.3.2 and for the same reason as in Corollary 1.2.12, we have the following Corollary.

Corollary 1.3.3 Suppose all assumptions in Theorem 1.3.2 hold. Then

bn E (Y ) Bn= 1

where the influence term is asymptotically normal and achieves the fastest rate of convergence, and

q ₂

nBn 1:

Proof. It is not hard to see that the Lindeberg condition (1.2.10) also works for $\frac{1}{n}\sum_{i=1}^{n} \hat{\theta}_{ni}$. The rest of the proof follows from Theorem 1.3.2 and the delta method.

The variance here is smaller than that in Corollary 1.2.12 with the same degree of trimming, confirming previous results. However, the convergence rate remains the same.

1.3.3 Bootstrapping the Estimator

Suppose we have data $\{z_i\}_{i=1}^{n}$ and a statistic $\varrho$ formed from $\{z_i\}_{i=1}^{n}$. The bootstrap randomly generates a series $\{z_i^*\}_{i=1}^{n}$ many times according to the empirical distribution of the original series $\{z_i\}_{i=1}^{n}$, and then computes a new statistic $\varrho^*$ based on $\{z_i^*\}_{i=1}^{n}$. The distribution of $\varrho^*$ is used to approximate the distribution of $\varrho$. The consistency of the bootstrap has been discussed intensively in the literature. For a comprehensive review, see Horowitz (2001) and references therein.
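The resampling step just described can be sketched in a few lines of Python. This is a generic nonparametric bootstrap under simplifying assumptions (i.i.d. data, a user-supplied statistic); it does not implement the trimmed estimator of this chapter, and the function and parameter names are hypothetical.

```python
import random
import statistics


def bootstrap_statistics(data, statistic, n_boot=500, seed=0):
    """Recompute `statistic` on n_boot resamples drawn with replacement.

    Each resample places mass 1/n on every original observation.
    """
    rng = random.Random(seed)
    n = len(data)
    draws = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        draws.append(statistic(resample))
    return draws


if __name__ == "__main__":
    rng = random.Random(42)
    sample = [rng.gauss(1.0, 2.0) for _ in range(200)]  # simulated data (assumption)
    draws = bootstrap_statistics(sample, statistics.mean, n_boot=1000, seed=1)
    # Spread of the bootstrap draws approximates the sampling spread of the mean.
    print(statistics.mean(sample), statistics.stdev(draws))
```

The sample mean is used only because it is the simplest regular statistic; whether the same resampling scheme is valid for an estimator that converges more slowly than root-n is exactly what has to be proved.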

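Estimator (1.3.1), with a nonparametrically estimated component, is essentially a U-statistic. After some transformation, equation (1.3.8) becomes

$$\frac{1}{n}\sum_{i=1}^{n} \hat{\theta}_{ni} = \frac{1}{n}\sum_{i=1}^{n} \frac{2 m_{ni}}{f_v(v_i)} - \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} Q_n(z_i, z_j) + \frac{1}{n}\sum_{i=1}^{n} \frac{m_{ni}\left[f_v(v_i) - \hat{f}_v(v_i)\right]^2}{f_v^2(v_i)\,\hat{f}_v(v_i)}, \tag{1.3.11}$$

where $Q_n(z_i, z_j) \equiv \frac{1}{2}\left[\frac{m_{ni}}{f_v^2(v_i)} + \frac{m_{nj}}{f_v^2(v_j)}\right] \frac{1}{h} K\!\left(\frac{v_j - v_i}{h}\right)$, and $z$ denotes all the variables involved.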
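The step from (1.3.8) to the U-statistic form relies on symmetrizing the double sum $\frac{1}{n}\sum_i m_{ni}\hat{f}_v(v_i)/f_v^2(v_i)$. The sketch below checks numerically that the symmetrized kernel $Q_n$ reproduces that average. It assumes a symmetric (Gaussian) kernel in place of the kernel of Assumption 37, and uses arbitrary simulated values for $m_{ni}$ and $f_v(v_i)$; all function names are hypothetical.

```python
import math
import random


def kernel(u):
    """Gaussian kernel; symmetric, used here as an illustrative assumption."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def q_n(m_i, f_i, v_i, m_j, f_j, v_j, h):
    """Symmetrized kernel of the U-statistic, following the definition of Q_n."""
    return 0.5 * (m_i / f_i ** 2 + m_j / f_j ** 2) * kernel((v_j - v_i) / h) / h


def u_statistic(m, f, v, h):
    """(1 / (n (n - 1))) * sum over i and j != i of Q_n(z_i, z_j)."""
    n = len(v)
    total = sum(q_n(m[i], f[i], v[i], m[j], f[j], v[j], h)
                for i in range(n) for j in range(n) if j != i)
    return total / (n * (n - 1))


def direct_average(m, f, v, h):
    """(1 / n) * sum over i of m_i * f_hat(v_i) / f_i^2, with leave-one-out f_hat."""
    n = len(v)
    out = 0.0
    for i in range(n):
        f_hat = sum(kernel((v[j] - v[i]) / h)
                    for j in range(n) if j != i) / ((n - 1) * h)
        out += m[i] * f_hat / f[i] ** 2
    return out / n


if __name__ == "__main__":
    random.seed(0)
    n = 50
    v = [random.gauss(0.0, 1.0) for _ in range(n)]
    m = [random.random() for _ in range(n)]        # arbitrary stand-ins for m_ni
    f = [0.2 + random.random() for _ in range(n)]  # arbitrary positive stand-ins for f_v(v_i)
    print(u_statistic(m, f, v, 0.5), direct_average(m, f, v, 0.5))
```

The two numbers agree up to floating-point error because the kernel is symmetric, so swapping $i$ and $j$ leaves the kernel weight unchanged.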

The bootstrap for U-statistics was first discussed by Bickel and Freedman (1981), who give conditions for the bootstrap to work. One condition is that the second moment of $Q_n(z_i, z_j)$ is uniformly bounded, which is not the case here. Chen, Linton, and Van Keilegom (2003) show that the bootstrap works for semiparametric estimators when the criterion function is not smooth, but their results are for the regular (root-n) case. So we need to show that the bootstrap works for estimator (1.3.1).

For notation, we let variables with superscript $*$ be the newly generated variables from the empirical distribution of $\{z_i\}_{i=1}^{n}$ with mass $\frac{1}{n}$ on each $z_i$; i.e., $\{z_i^*\}_{i=1}^{n}$ and $\hat{\theta}_{ni}^*$ are the newly generated counterparts of $\{z_i\}_{i=1}^{n}$ and $\hat{\theta}_{ni}$, respectively.

The theorem below says that the bootstrap indeed works for our estimator when we are in the nice world. The proof is tedious, but its idea is simple: we follow the standard proof of the consistency of the bootstrap for U-statistics while showing that the residual terms are asymptotically negligible, as in Section 1.3.2.

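Theorem 1.3.4 Under the same conditions as in Theorem 1.3.2, if the bootstrap series $\{z_i^*\}_{i=1}^{n}$ is drawn from the empirical distribution of $\{z_i\}_{i=1}^{n}$ with mass $\frac{1}{n}$ on each $z_i$, then

$$\sqrt{\frac{n}{E\left\{\left[\theta_{ni} - E\left(\theta_{ni} \mid v_i\right)\right]^2\right\}}} \left(\frac{1}{n}\sum_{i=1}^{n} \hat{\theta}_{ni}^* - \frac{1}{n}\sum_{i=1}^{n} \hat{\theta}_{ni}\right) \xrightarrow{d} N(0, 1).$$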
