Online Appendix B: Finite-Sample Corrected Inference for Two-Step GMM in Time Series


Jungbin Hwang, University of Connecticut, Department of Economics

Gonzalo Valdés, Universidad de Tarapacá, Departamento de Ingeniería Industrial y de Sistemas

B.1 Finite-sample correction formula for non-linear moments

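In the non-linear moment case, recall that the Taylor expansion of the FOC in (3) gives the following expansion of $\sqrt{T}(\hat\theta_2 - \theta_0)$:

\[
\sqrt{T}(\hat\theta_2 - \theta_0) = -\left[A(\theta_0; S_T(\hat\theta_1))\right]^{-1} G_T(\theta_0)' S_T^{-1}(\hat\theta_1) \sqrt{T} f_T(\theta_0) + O_p\!\left(\frac{1}{\sqrt{T}}\right), \tag{B.1}
\]

where the matrix $A(\theta_0; S_T(\hat\theta_1))$ of second-order derivatives is given as

\[
A(\theta_0; S_T(\hat\theta_1)) = G_T(\theta_0)' S_T^{-1}(\hat\theta_1) G_T(\theta_0) + H_T(\theta_0)' \left(I_d \otimes S_T^{-1}(\hat\theta_1) f_T(\theta_0)\right)
\]

and

\[
H_T(\theta_0) = \begin{bmatrix} \left.\dfrac{\partial G_T(\theta)}{\partial \theta_1}\right|_{\theta=\theta_0} \\ \vdots \\ \left.\dfrac{\partial G_T(\theta)}{\partial \theta_d}\right|_{\theta=\theta_0} \end{bmatrix}.
\]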

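Then we can formulate an estimator for the asymptotic variance of $\hat\theta_2$ as

\[
\widehat{\mathrm{var}}(\hat\theta_2) = \frac{1}{T} \left[A(\hat\theta_2; S_T(\hat\theta_1))\right]^{-1} G_T'(\hat\theta_2) S_T^{-1}(\hat\theta_1) G_T(\hat\theta_2) \left[A(\hat\theta_2; S_T(\hat\theta_1))\right]^{-1\prime}.
\]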

Note that the form of $\widehat{\mathrm{var}}(\hat\theta_2)$ in (4) is different from the standard asymptotic variance estimate $(G_T'(\hat\theta_2) S_T^{-1}(\hat\theta_1) G_T(\hat\theta_2))^{-1}$, because $A(\hat\theta_2; S_T(\hat\theta_1))$ contains the second term $H_T(\hat\theta_2)'(I_d \otimes S_T^{-1}(\hat\theta_1) f_T(\hat\theta_2))$. Keeping this term, however, could potentially improve the finite-sample performance of $\widehat{\mathrm{var}}(\hat\theta_2)$: although it is of stochastic order $O_p(T^{-1/2}) = o_p(1)$, it is non-zero in a finite sample in over-identified GMM (e.g., Windmeijer (2005)).

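Similarly to the expansion in (8) and taking a further expansion of (B.1), we have that

\[
\sqrt{T}(\hat\theta_2 - \theta_0) = -\left(A(\theta_0; S_T(\theta_0))\right)^{-1} G_T(\theta_0)' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0) + D(\theta_0; S_T(\theta_0)) \sqrt{T}(\hat\theta_1 - \theta_0) + O_p\!\left(\frac{1}{\sqrt{T}}\right), \tag{B.2}
\]

where

\[
D(\theta_0; S_T(\theta_0)) = \left.\frac{\partial \left[-\left(A(\theta_0; S_T(\theta))\right)^{-1} G_T(\theta_0)' S_T^{-1}(\theta) f_T(\theta_0)\right]}{\partial \theta'}\right|_{\theta=\theta_0}
\]

is a $d \times d$ matrix. The $j$-th column of $D(\theta_0; S_T(\theta_0))$ is expressed as

\begin{align}
D(\theta_0; S_T(\theta_0))[:, j] ={}& \left(A(\theta_0; S_T(\theta_0))\right)^{-1} \left.\frac{\partial A(\theta_0; S_T(\theta))}{\partial \theta_j}\right|_{\theta=\theta_0} \left(A(\theta_0; S_T(\theta_0))\right)^{-1} \tag{B.3} \\
& \times G_T(\theta_0)' S_T^{-1}(\theta_0) f_T(\theta_0) + \left(A(\theta_0; S_T(\theta_0))\right)^{-1} G_T(\theta_0)' S_T^{-1}(\theta_0) \tag{B.4} \\
& \times \left.\frac{\partial S_T(\theta)}{\partial \theta_j}\right|_{\theta=\theta_0} S_T^{-1}(\theta_0) f_T(\theta_0). \tag{B.5}
\end{align}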

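From the FOC, $G_T(\hat\theta_2)' S_T^{-1}(\hat\theta_1) f_T(\hat\theta_2) = 0$, so the first term in (B.3) is always estimated as zero, and the feasible estimator for $D(\theta_0; S_T(\theta_0))[:, j]$ is

\[
D(\hat\theta_2; S_T(\hat\theta_1))[:, j] = \left(A(\hat\theta_2; S_T(\hat\theta_1))\right)^{-1} G_T(\hat\theta_2)' S_T^{-1}(\hat\theta_1) \left.\frac{\partial S_T(\theta)}{\partial \theta_j}\right|_{\theta=\hat\theta_1} S_T^{-1}(\hat\theta_1) f_T(\hat\theta_2),
\]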

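where the formula for $\partial S_T(\theta)/\partial \theta_j |_{\theta=\hat\theta_1}$ is provided in (10). The finite-sample corrected formula is then given by

\[
\widehat{\mathrm{var}}_c(\hat\theta_2) = \widehat{\mathrm{var}}(\hat\theta_2) + D(\hat\theta_2; S_T(\hat\theta_1)) \widehat{\mathrm{var}}(\hat\theta_2) + \widehat{\mathrm{var}}(\hat\theta_2) D(\hat\theta_2; S_T(\hat\theta_1))' + D(\hat\theta_2; S_T(\hat\theta_1)) \widehat{\mathrm{var}}(\hat\theta_1) D(\hat\theta_2; S_T(\hat\theta_1))',
\]

where

\[
\widehat{\mathrm{var}}(\hat\theta_1) = \frac{1}{T} \left[G_T(\hat\theta_2)' W_T^{-1} G_T(\hat\theta_2)\right]^{-1} \left[G_T(\hat\theta_2)' W_T^{-1} S_T(\hat\theta_1) W_T^{-1} G_T(\hat\theta_2)\right] \left[G_T(\hat\theta_2)' W_T^{-1} G_T(\hat\theta_2)\right]^{-1},
\]

\[
\widehat{\mathrm{var}}(\hat\theta_2) = \frac{1}{T} \left[A(\hat\theta_2; S_T(\hat\theta_1))\right]^{-1} G_T'(\hat\theta_2) S_T^{-1}(\hat\theta_1) G_T(\hat\theta_2) \left[A(\hat\theta_2; S_T(\hat\theta_1))\right]^{-1\prime}.
\]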

Unlike the linear moment condition illustrated in Section 2, we note that there still exists a remainder term of order $O_p(T^{-1/2})$ in (B.2), which arises from the Jacobian function $G_T(\theta)$. The remainder term is of the same order as our correction term, $D(\theta_0; S_T(\theta_0))$. This implies that the finite-sample improvement obtained by accounting for $D(\hat\theta_2; S_T(\hat\theta_1))$ in our corrected formula can depend on its magnitude relative to that of the remainder term in (B.2). The same points are mentioned in Windmeijer (2005) and Hwang et al. (2020).
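As a rough illustration of how the pieces of the corrected formula fit together, the following NumPy sketch assembles $D$ column by column and then forms the corrected variance. The function name `corrected_variance`, the argument names, and the toy numbers are all hypothetical; the derivative matrices $\partial S_T/\partial\theta_j$ are taken as given inputs rather than computed from (10), and the toy $A$ omits the $H_T$ term for simplicity.

```python
import numpy as np

def corrected_variance(A, G, S_inv, dS_list, f, var1, T):
    """Sketch of the corrected variance of the two-step estimator.

    A       : (d, d) second-derivative matrix A(theta2_hat; S_T(theta1_hat))
    G       : (m, d) Jacobian G_T(theta2_hat)
    S_inv   : (m, m) inverse weight matrix S_T^{-1}(theta1_hat)
    dS_list : list of d (m, m) matrices dS_T/dtheta_j evaluated at theta1_hat
    f       : (m,)   sample moment f_T(theta2_hat)
    var1    : (d, d) variance estimate of the one-step estimator
    """
    A_inv = np.linalg.inv(A)
    # Uncorrected estimate: (1/T) A^{-1} G' S^{-1} G (A^{-1})'
    var2 = A_inv @ (G.T @ S_inv @ G) @ A_inv.T / T
    # j-th column of D: A^{-1} G' S^{-1} (dS/dtheta_j) S^{-1} f
    D = np.column_stack(
        [A_inv @ G.T @ S_inv @ dS @ S_inv @ f for dS in dS_list]
    )
    var_c = var2 + D @ var2 + var2 @ D.T + D @ var1 @ D.T
    return var2, D, var_c

# Toy inputs (hypothetical numbers, only to exercise the formula)
rng = np.random.default_rng(0)
m, d, T = 3, 2, 200
G = rng.normal(size=(m, d))
S_inv = np.linalg.inv(np.eye(m) + 0.1 * np.ones((m, m)))
f = rng.normal(size=m) / np.sqrt(T)
A = G.T @ S_inv @ G                      # linear-moment case: H_T term dropped
dS_list = []
for _ in range(d):
    U = rng.normal(size=(m, m))
    dS_list.append(U + U.T)              # symmetric derivative matrices
var1 = np.eye(d) / T
var2, D, var_c = corrected_variance(A, G, S_inv, dS_list, f, var1, T)
```

When every $\partial S_T/\partial\theta_j$ is zero, $D$ vanishes and the corrected and uncorrected estimates coincide, which is a convenient sanity check.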

B.2 Iterated GMM

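Let $\hat\theta^0_{iter}$ be the two-step estimator $\hat\theta_2$. For $j \geq 1$, the $j$-th iterated GMM estimator $\hat\theta^j_{iter}$ is defined as the solution to the following minimization problem:

\[
\hat\theta^j_{iter} = \arg\min_{\theta} M\left(\theta; S_T(\hat\theta^{j-1}_{iter})\right), \tag{B.6}
\]

where $M(\theta; S_T(\hat\theta^{j-1}_{iter})) = f_T(\theta)' S_T^{-1}(\hat\theta^{j-1}_{iter}) f_T(\theta)$. The asymptotic distribution of $\sqrt{T}(\hat\theta^1_{iter} - \theta_0)$ can be represented as follows:

\begin{align}
\sqrt{T}(\hat\theta^1_{iter} - \theta_0) ={}& -\left(G_T' S_T^{-1}(\theta_0) G_T\right)^{-1} G_T' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0) \tag{B.7} \\
& + D(\theta_0; S_T(\theta_0)) \sqrt{T}(\hat\theta_2 - \theta_0) + o_p\!\left(\frac{1}{\sqrt{T}}\right). \tag{B.8}
\end{align}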

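Substituting the expansion in (8) into (B.8), we can represent the first iteration estimator as

\[
\sqrt{T}(\hat\theta^1_{iter} - \theta_0) = -\left(I_d + D(\theta_0; S_T(\theta_0))\right) \left(G_T' S_T^{-1}(\theta_0) G_T\right)^{-1} G_T' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0) + \left[D(\theta_0; S_T(\theta_0))\right]^2 \sqrt{T}(\hat\theta_1 - \theta_0) + o_p\!\left(\frac{1}{\sqrt{T}}\right).
\]

The leading term in $\sqrt{T}(\hat\theta^1_{iter} - \theta_0)$ consists of an asymptotic normal distribution, part of which is scaled by $I_d + D(\theta_0; S_T(\theta_0))$. Also, the effect of the one-step estimator $\sqrt{T}(\hat\theta_1 - \theta_0)$ decays through the iteration procedure when we keep repeating this substitution until the $j$-th iteration: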

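\[
\sqrt{T}(\hat\theta^j_{iter} - \theta_0) = -\left(I_d + \sum_{i=1}^{j} \left[D(\theta_0; S_T(\theta_0))\right]^i\right) \left(G_T' S_T^{-1}(\theta_0) G_T\right)^{-1} G_T' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0) + \left[D(\theta_0; S_T(\theta_0))\right]^{j+1} \sqrt{T}(\hat\theta_1 - \theta_0) + o_p\!\left(\frac{1}{\sqrt{T}}\right).
\]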
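The role of the matrix powers in this expansion can be checked numerically. The sketch below, with a hypothetical small matrix `D` chosen only for illustration, shows that the partial sums $I_d + \sum_{i=1}^{j} D^i$ approach $(I_d - D)^{-1}$ as $j$ grows whenever the powers of $D$ shrink, so the weight $D^{j+1}$ on the one-step estimator dies out.

```python
import numpy as np

def partial_sum(D, j):
    """Return I + D + D^2 + ... + D^j."""
    d = D.shape[0]
    total = np.eye(d)
    power = np.eye(d)
    for _ in range(j):
        power = power @ D          # D^i for i = 1, ..., j
        total = total + power
    return total

# Hypothetical "small" D, standing in for a correction term of order T^{-1/2}
D = np.array([[0.10, -0.05],
              [0.02,  0.08]])
limit = np.linalg.inv(np.eye(2) - D)

# Maximum absolute gap between the partial sum and (I - D)^{-1}
gaps = [np.max(np.abs(partial_sum(D, j) - limit)) for j in (1, 5, 20)]
# Size of the weight D^{j+1} that multiplies the one-step estimator
weights = [np.max(np.abs(np.linalg.matrix_power(D, j + 1))) for j in (1, 5, 20)]
```

Both `gaps` and `weights` shrink geometrically in `j` for this choice of `D`; a matrix with spectral radius at or above one would not have this property.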

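When the number of iterations $j$ goes to infinity, $\hat\theta^j_{iter}$ is expected to converge to a random variable $\hat\theta^{\infty}_{iter}$.¹ Then the impact of $\sqrt{T}(\hat\theta_1 - \theta_0)$ on $\sqrt{T}(\hat\theta^j_{iter} - \theta_0)$ through $[D(\theta_0; S_T(\theta_0))]^{j+1} = O_p(T^{-(j+1)/2})$ can be perfectly removed, and we have that

\[
\sqrt{T}(\hat\theta^{\infty}_{iter} - \theta_0) = -\left(I_d - D(\theta_0; S_T(\theta_0))\right)^{-1} \left(G_T' S_T^{-1}(\theta_0) G_T\right)^{-1} G_T' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0) + o_p\!\left(\frac{1}{\sqrt{T}}\right),
\]

assuming that $I_d - D(\theta_0; S_T(\theta_0))$, which is $I_d + o_p(1)$, is invertible. The corrected variance estimate for $\hat\theta^{\infty}_{iter}$ is constructed as follows: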

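\[
\widehat{\mathrm{var}}_c(\hat\theta^{\infty}_{iter}) = \left(I_d - D(\hat\theta^{\infty}_{iter}; S_T(\hat\theta^{\infty}_{iter}))\right)^{-1} \widehat{\mathrm{var}}(\hat\theta^{\infty}_{iter}) \left[\left(I_d - D(\hat\theta^{\infty}_{iter}; S_T(\hat\theta^{\infty}_{iter}))\right)^{-1}\right]',
\]

where $\widehat{\mathrm{var}}(\hat\theta^{\infty}_{iter}) = T^{-1}(G_T' S_T^{-1}(\hat\theta^{\infty}_{iter}) G_T)^{-1}$ is the standard sandwich variance formula. The corrected formula for $\hat\theta^{\infty}_{iter}$ extends that of Windmeijer (2000), which is formulated in an i.i.d. setting. The corrected Wald statistic is

\[
F_c(\hat\theta^{\infty}_{iter}) = \frac{1}{p} \left(R\hat\theta^{\infty}_{iter} - r\right)' \left[R\, \widehat{\mathrm{var}}_c(\hat\theta^{\infty}_{iter}) R'\right]^{-1} \left(R\hat\theta^{\infty}_{iter} - r\right).
\]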
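A minimal sketch of these two steps is given below, reading the corrected variance as $(I_d - D)^{-1}\,\widehat{\mathrm{var}}\,[(I_d - D)^{-1}]'$ and the Wald statistic as the quadratic form in $R\hat\theta - r$ divided by $p$. The function name `iterated_corrected_wald` and all inputs are hypothetical placeholders; in practice $G_T$, $S_T^{-1}$ and $D$ would be evaluated at the iterated estimate.

```python
import numpy as np

def iterated_corrected_wald(theta, G, S_inv, D, R, r, T):
    """Corrected variance and Wald statistic for an iterated GMM estimate.

    theta : (d,) iterated estimate;  G : (m, d) Jacobian;  S_inv : (m, m)
    D     : (d, d) correction matrix evaluated at the iterated estimate
    R, r  : (p, d) restriction matrix and (p,) restriction value
    """
    d = G.shape[1]
    p = R.shape[0]
    var = np.linalg.inv(G.T @ S_inv @ G) / T        # standard sandwich form
    M = np.linalg.inv(np.eye(d) - D)                # (I_d - D)^{-1}
    var_c = M @ var @ M.T                           # corrected variance
    diff = R @ theta - r
    F_c = float(diff @ np.linalg.solve(R @ var_c @ R.T, diff)) / p
    return var, var_c, F_c

# Toy inputs (hypothetical numbers)
rng = np.random.default_rng(2)
m, d, T = 4, 2, 300
G = rng.normal(size=(m, d))
S_inv = np.eye(m)
D = np.array([[0.05, 0.01], [-0.02, 0.03]])
theta = np.array([0.9, -0.4])
R = np.array([[1.0, 0.0]])
r = np.array([1.0])
var, var_c, F_c = iterated_corrected_wald(theta, G, S_inv, D, R, r, T)
```

With $p = 1$, the signed square root of the statistic gives the corresponding corrected t statistic.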

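Similarly, one can construct the corrected t statistic when $p = 1$. The asymptotic distribution of $F_c(\hat\theta^{\infty}_{iter})$ can be characterized as

\begin{align*}
F_c(\hat\theta^{\infty}_{iter}) = \frac{1}{p} & \left[R \left(I_d - D(\theta_0; S_T(\theta_0))\right)^{-1} \left(G_T' S_T^{-1}(\theta_0) G_T\right)^{-1} G_T' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0)\right]' \\
& \times \left(R \left(I_d - D(\hat\theta^{\infty}_{iter}; S_T(\hat\theta^{\infty}_{iter}))\right)^{-1} \left(G_T' S_T^{-1}(\hat\theta^{\infty}_{iter}) G_T\right)^{-1} \left[\left(I_d - D(\hat\theta^{\infty}_{iter}; S_T(\hat\theta^{\infty}_{iter}))\right)^{-1}\right]' R'\right)^{-1} \\
& \times \left[R \left(I_d - D(\theta_0; S_T(\theta_0))\right)^{-1} \left(G_T' S_T^{-1}(\theta_0) G_T\right)^{-1} G_T' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0)\right] + o_p(1).
\end{align*}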

¹ Hansen and Lee (2020) provide some regularity conditions that guarantee that the iteration sequence, $\hat\theta^j_{iter}$ for $j = 1, 2, \ldots$, is a contraction mapping, which implies that the iterated estimator $\hat\theta^{\infty}_{iter}$ is the fixed point.

Under the fixed-smoothing asymptotics, we have that $S_T(\theta_0) \overset{a}{\sim} \tilde S_T$. The asymptotically equivalent distribution of $F_c(\hat\theta^{\infty}_{iter})$ is then given by

\[
F^{iter} = \frac{1}{p} \left[\tilde R (G_T' \tilde S_T^{-1} G_T)^{-1} G_T' \tilde S_T^{-1} Z\right]' \left[\tilde R (G_T' \tilde S_T^{-1} G_T)^{-1} \tilde R'\right]^{-1} \left[\tilde R (G_T' \tilde S_T^{-1} G_T)^{-1} G_T' \tilde S_T^{-1} Z\right],
\]

where $\tilde R = R(I_d - D(\theta_0; S_T(\theta_0)))^{-1}$ is a $p \times d$ matrix. Considering $\tilde R = R + o_p(1)$ and Theorem 1 in Sun (2014), we obtain $F^{iter} = F + o_p(1)$. Thus, instead of approximating $F_c(\hat\theta^{\infty}_{iter})$ by a conventional $\chi^2_p/p$ distribution, the standard t and F distributions shown in Theorem 3 and (18), together with the corrected variance estimate $\widehat{\mathrm{var}}_c(\hat\theta^{\infty}_{iter})$, the J-statistic modification, and the finite-sample adjustments in subsection 3.2, can be used to obtain asymptotic critical values for $t_c^{adj}(\hat\theta^{\infty}_{iter})$ and $F_c^{adj}(\hat\theta^{\infty}_{iter})$, respectively.

B.3 Finite-sample corrected formula for linear-IV model

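Let $X = (x_1, \ldots, x_T)' \in \mathbb{R}^{T \times d}$, $Z = (z_1, \ldots, z_T)' \in \mathbb{R}^{T \times m}$, and $y = (y_1, \ldots, y_T)'$. We choose the initial weight matrix $W_T$ as $Z'Z/T$. This makes the initial one-step estimator $\hat\theta_1$ equivalent to the two-stage least squares (2SLS) estimator, which is formulated as

\[
\hat\theta_1 = \left(X'Z W_T^{-1} Z'X\right)^{-1} \left(X'Z W_T^{-1} Z'y\right),
\]

and the corresponding asymptotic variance estimator is

\[
\widehat{\mathrm{var}}(\hat\theta_1) = T \left(X'Z W_T^{-1} Z'X\right)^{-1} \left(X'Z W_T^{-1} S_T(\hat\theta_1) W_T^{-1} Z'X\right) \left(X'Z W_T^{-1} Z'X\right)^{-1},
\]

where

\[
S_T(\hat\theta_1) = \frac{1}{T} \sum_{t=1}^{T} \sum_{s=1}^{T} Q_h\!\left(\frac{t}{T}, \frac{s}{T}\right) \left(z_t \epsilon_{y,t}(\hat\theta_1) - \frac{1}{T} \sum_{j=1}^{T} z_j \epsilon_{y,j}(\hat\theta_1)\right) \left(z_s \epsilon_{y,s}(\hat\theta_1) - \frac{1}{T} \sum_{j=1}^{T} z_j \epsilon_{y,j}(\hat\theta_1)\right)'
\]

and $\epsilon_{y,t}(\hat\theta_1) = y_t - x_t'\hat\theta_1$. The efficient two-step GMM estimator is

\[
\hat\theta_2 = \left(X'Z S_T^{-1}(\hat\theta_1) Z'X\right)^{-1} X'Z S_T^{-1}(\hat\theta_1) Z'y,
\]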

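and the corresponding uncorrected sandwich variance estimator and the corrected variance estimator are

\[
\widehat{\mathrm{var}}(\hat\theta_2) = T \left(X'Z S_T^{-1}(\hat\theta_1) Z'X\right)^{-1}
\]

and

\[
\widehat{\mathrm{var}}_c(\hat\theta_2) = \widehat{\mathrm{var}}(\hat\theta_2) + D(\hat\theta_2; S_T(\hat\theta_1)) \widehat{\mathrm{var}}(\hat\theta_2) + \widehat{\mathrm{var}}(\hat\theta_2) D(\hat\theta_2; S_T(\hat\theta_1))' + D(\hat\theta_2; S_T(\hat\theta_1)) \widehat{\mathrm{var}}(\hat\theta_1) D(\hat\theta_2; S_T(\hat\theta_1))',
\]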

respectively, where the j-th column of D(^2; S_T(^1)) is given by

The iterated GMM estimator, $\hat\theta^{\infty}_{iter}$, which repeats the loop of the iteration sequence in (B.6), is given by

\[
\hat\theta^{\infty}_{iter} = \left(X'Z S_T^{-1}(\hat\theta^{\infty}_{iter}) Z'X\right)^{-1} X'Z S_T^{-1}(\hat\theta^{\infty}_{iter}) Z'y,
\]

and the corresponding uncorrected sandwich variance estimator and the corrected variance estimator are
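The linear-IV estimators of this subsection can be sketched in a few lines of NumPy. The code below is only an illustration under stated assumptions: the weighting function $Q_h$ is taken to be a Bartlett kernel with a user-chosen bandwidth (the appendix allows other choices), the function names `lrv`, `gmm_linear` and `two_step_and_iterated` are hypothetical, and the data are simulated. It computes 2SLS, the two-step estimator with its uncorrected variance $T(X'ZS_T^{-1}(\hat\theta_1)Z'X)^{-1}$, and the fixed point of the iteration in (B.6).

```python
import numpy as np

def lrv(Z, resid, bandwidth):
    """Kernel estimate of S_T with demeaned moments (Bartlett kernel assumed)."""
    u = Z * resid[:, None]                     # rows are z_t * eps_t
    u = u - u.mean(axis=0)                     # demean, as in S_T
    T = u.shape[0]
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    Q = np.clip(1.0 - lags / bandwidth, 0.0, None)
    return u.T @ Q @ u / T

def gmm_linear(X, Z, y, W_inv):
    """Linear GMM estimate (X'Z W^{-1} Z'X)^{-1} X'Z W^{-1} Z'y."""
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ W_inv @ XZ.T, XZ @ W_inv @ (Z.T @ y))

def two_step_and_iterated(X, Z, y, bandwidth, max_iter=200, tol=1e-10):
    T = len(y)
    theta1 = gmm_linear(X, Z, y, np.linalg.inv(Z.T @ Z / T))    # 2SLS
    S1_inv = np.linalg.inv(lrv(Z, y - X @ theta1, bandwidth))
    theta2 = gmm_linear(X, Z, y, S1_inv)                        # two-step
    var2 = T * np.linalg.inv(X.T @ Z @ S1_inv @ Z.T @ X)        # uncorrected
    theta = theta2
    for _ in range(max_iter):                                   # iterate (B.6)
        S_inv = np.linalg.inv(lrv(Z, y - X @ theta, bandwidth))
        new = gmm_linear(X, Z, y, S_inv)
        converged = np.max(np.abs(new - theta)) < tol
        theta = new
        if converged:
            break
    return theta1, theta2, var2, theta

# Simulated over-identified design (hypothetical numbers): m = 3, d = 2
rng = np.random.default_rng(1)
T, theta0 = 400, np.array([1.0, -0.5])
Z = rng.normal(size=(T, 3))
v = rng.normal(size=(T, 2))
e = 0.5 * v[:, 0] + rng.normal(size=T)         # endogeneity through v
X = Z @ np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]) + v
y = X @ theta0 + e
theta1, theta2, var2, theta_iter = two_step_and_iterated(X, Z, y, bandwidth=5.0)
```

In an exactly identified design ($m = d$) the weight matrix drops out and every estimator above reduces to $(Z'X)^{-1}Z'y$, which gives a simple check of the implementation.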

B.4 Technical lemmas and proof of Theorem 4

Lemma B.1 Under Assumption 1, together with $h \to \infty$ and $T \to \infty$ such that $h/T \to 0$, the

Proof of Lemma B.1. We start by proving the results for the case when $Q_h(r, s) = k((r - s)/b)$;

2; and this enables us to apply the dominated convergence theorem and obtain Z 1

Similarly, for part (b), we have

sup

because the first term on the right-hand side of the first equality (B.9) is $O(B_T/T) = o(1)$. Also, Assumption 1 implies that $\sup_{0 \leq \tau_1 \leq 1} |k((\tau_1 - \tau_2)/b)| \to 0$ for almost all $\tau_2$, and this enables us to apply the dominated convergence theorem to show that the second term on the right-hand side of the first equality in (B.9) is $o(1)$ when $b = B_T/T \to 0$.

For part (c),

where the last equality follows from the proof of part (b):

Next, we consider the case of the OS-LRV with $Q_h(r, s) = K^{-1} \sum_{j=1}^{K} \Phi_j(r) \Phi_j(s)$ and $K \to \infty$ such that $K/T \to 0$. Then the result of part (a) follows from

since $\int_0^1 \Phi_j(\tau)\, d\tau = 0$ by Assumption 1. Part (b) follows in a similar manner using Assumption 1, since

Lemma B.2 Let us de…ne

j( ) = 1

Proof of Lemma B.2. For each $j \in \{1, \ldots, d\}$, we have that term A (above) and obtain

where the second equality follows from summation by parts. For the last term on the right-hand side of the second equality in (B.13),

$e_T(T) H_{T,j}(\bar\theta_{T,j}) \sqrt{T}(\hat\theta - \theta_0) = O_p(1/T)$ by Assumptions 1 and 4. For any $m$-dimensional vector $a$,

E by Assumption 5: Together with Markov’s inequality, this leads us to get

where the last equality follows fromp

T (^ ) = O_p(1); sup_{1 t T} _j;t = o_p(1); and sup_{1 t T}e_T(t) = O(1=T ); which leads us to D = O_p(1=T ) : Incorporating this, together with (B.11) and (B.12), into (B.10), we obtain

j(^) = _j(^) + Op

1 T

= _j(^) + op(1) ; (B.14)

which is the desired result. of the convergence on the right-hand sides of (B.15) and (B.16) becomes $o(1)$ instead of $O(1/T)$.

Using (B.15), one can show that the last term in (B.10) is $o_p(1)$ when $(h, T) \to \infty$ such that $h/T \to 0$. Also, a careful inspection of the proof for the case where $h$ is fixed as $T \to \infty$ indicates that (B.16) leads to the conclusion that

$A = O_p(1)$ and $D = o_p(1)$

hold when $h \to \infty$ such that $h/T \to 0$. Similarly, we can prove that $B = o_p(1)$ and $C = O_p(1)$ also hold when $h \to \infty$ such that $h/T \to 0$. Incorporating these results into (B.10), we can conclude that the $o_p(1)$ approximation in (B.14) also holds when $(h, T) \to \infty$ such that $h/T \to 0$.

Proof of Lemma B.3. We use the formula for summation by parts:

for any conformable vectors at and bt: Consider 1 We …rst apply the formula in (B.17) to the expression inside the parentheses by setting at = Q_h(t=T; =T ); bt= At; and Ct=Pt

for each 2 f1; : : : ; T g: Incorporating this into (B.18); we have We repeatedly apply the formula in (B.17) to the terms on the right-hand side. For the …rst term, setting a = Q_h 1;_T ST(A); b = B⁰; and C =P Incorporating these results into (B.19), we obtain
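The summation-by-parts device used in this proof can be checked numerically. The sketch below verifies the standard Abel identity $\sum_{t=1}^{T} a_t b_t = a_T C_T - \sum_{t=1}^{T-1} (a_{t+1} - a_t) C_t$ with partial sums $C_t = \sum_{s=1}^{t} b_s$; this is the textbook form of the identity and may differ in indexing from the paper's own statement in (B.17). The function names and random inputs are hypothetical.

```python
import numpy as np

def sum_direct(a, b):
    """Direct evaluation of sum_t a_t * b_t for scalar a_t and vector b_t."""
    return (a[:, None] * b).sum(axis=0)

def sum_by_parts(a, b):
    """Same sum via Abel summation with partial sums C_t = b_1 + ... + b_t."""
    C = np.cumsum(b, axis=0)
    increments = a[1:] - a[:-1]                 # a_{t+1} - a_t, t = 1..T-1
    return a[-1] * C[-1] - (increments[:, None] * C[:-1]).sum(axis=0)

rng = np.random.default_rng(3)
T = 50
a = rng.normal(size=T)          # plays the role of the kernel weights
b = rng.normal(size=(T, 3))     # plays the role of the vector summands
direct = sum_direct(a, b)
by_parts = sum_by_parts(a, b)
```

The identity moves the differencing onto the smooth weights $a_t$, which is why it is useful when the weights come from a kernel or basis function and the partial sums $C_t$ are easier to bound than the individual summands.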

Lemma B.4 Let Assumptions 1–3 hold with $\hat\theta_{CU}$. Then the CU-FOC function $Q(\theta; S_T(\theta))$ satisfies

Proof of Lemma B.4. We start by showing that $\partial Q(\theta; S_T(\theta))/\partial \theta'$ coincides with the definition of $\tilde A_1(\theta; S_T(\theta))$. We have that

The right-hand side of the last equality can be re-written as 0

and using the linearity of $f(v_t, \theta)$ in $\theta$, we obtain

where, because of the linearity of the moment process, $\Upsilon^{(2)}_{ji}$ does not depend on $\theta$. We have that

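For $A_j(\theta)$, it is straightforward to obtain

\begin{align*}
A_j(\theta) ={}& \left[\frac{\partial}{\partial \theta_j} \tilde G_T'(\theta)\right] S_T^{-1}(\theta) \tilde G_T(\theta) - \tilde G_T'(\theta) S_T^{-1}(\theta) \frac{\partial S_T(\theta)}{\partial \theta_j} S_T^{-1}(\theta) \tilde G_T(\theta) + [\tilde G_T(\theta)]' S_T^{-1}(\theta) \left[\frac{\partial}{\partial \theta_j} \tilde G_T(\theta)\right] \\
={}& \left[\tilde G^{(a)}_{j,2}(\theta) + \tilde G^{(b)}_{j,2}(\theta) + \tilde G^{(c)}_{j,2}(\theta)\right]' S_T^{-1}(\theta) \tilde G_T(\theta) - \tilde G_T'(\theta) S_T^{-1}(\theta) \left[\Upsilon_j(\theta) + \Upsilon_j(\theta)'\right] S_T^{-1}(\theta) \tilde G_T(\theta) \\
& + [\tilde G_T(\theta)]' S_T^{-1}(\theta) \left[\tilde G^{(a)}_{j,2}(\theta) + \tilde G^{(b)}_{j,2}(\theta) + \tilde G^{(c)}_{j,2}(\theta)\right].
\end{align*}

For $B_j(\theta)$, its $i$-th row is expressed as

\begin{align*}
\frac{\partial}{\partial \theta_j} \left[f_T(\theta)' S_T^{-1}(\theta) \frac{\partial \tilde G_T(\theta)}{\partial \theta_i}\right] ={}& \left[\frac{\partial}{\partial \theta_j} f_T(\theta)\right]' S_T^{-1}(\theta) \frac{\partial \tilde G_T(\theta)}{\partial \theta_i} - f_T(\theta)' S_T^{-1}(\theta) \frac{\partial S_T(\theta)}{\partial \theta_j} S_T^{-1}(\theta) \frac{\partial \tilde G_T(\theta)}{\partial \theta_i} + f_T(\theta)' S_T^{-1}(\theta) \frac{\partial^2 \tilde G_T(\theta)}{\partial \theta_j \partial \theta_i} \\
={}& G_{jT}' S_T^{-1}(\theta) \left[\tilde G^{(a)}_{i,2}(\theta) + \tilde G^{(b)}_{i,2}(\theta) + \tilde G^{(c)}_{i,2}(\theta)\right] \\
& - f_T(\theta)' S_T^{-1}(\theta) \left[\Upsilon_j(\theta) + \Upsilon_j(\theta)'\right] S_T^{-1}(\theta) \left[\tilde G^{(a)}_{i,2}(\theta) + \tilde G^{(b)}_{i,2}(\theta) + \tilde G^{(c)}_{i,2}(\theta)\right] \\
& + f_T(\theta)' S_T^{-1}(\theta) \left(\frac{\partial \tilde G^{(a)}_{i,2}(\theta)}{\partial \theta_j} + \frac{\partial \tilde G^{(b)}_{i,2}(\theta)}{\partial \theta_j} + \frac{\partial \tilde G^{(c)}_{i,2}(\theta)}{\partial \theta_j}\right).
\end{align*}

Lastly, the $i$-th row of $C_j(\theta)$ is expressed as

\begin{align*}
\frac{\partial}{\partial \theta_j} \left[f_T(\theta)' S_T^{-1}(\theta) \Upsilon_i(\theta) S_T^{-1}(\theta) \tilde G_T(\theta)\right] ={}& G_{jT}' S_T^{-1}(\theta) \Upsilon_i(\theta) S_T^{-1}(\theta) \tilde G_T(\theta) - f_T(\theta)' S_T^{-1}(\theta) \frac{\partial S_T(\theta)}{\partial \theta_j} S_T^{-1}(\theta) \Upsilon_i(\theta) S_T^{-1}(\theta) \tilde G_T(\theta) \\
& + f_T(\theta)' S_T^{-1}(\theta) \Upsilon^{(2)}_{ij} S_T^{-1}(\theta) \tilde G_T(\theta) - f_T(\theta)' S_T^{-1}(\theta) \Upsilon_i(\theta) S_T^{-1}(\theta) \frac{\partial S_T(\theta)}{\partial \theta_j} S_T^{-1}(\theta) \tilde G_T(\theta) \\
& + f_T(\theta)' S_T^{-1}(\theta) \Upsilon_i(\theta) S_T^{-1}(\theta) \left(\frac{\partial \tilde G^{(a)}_{i,2}(\theta)}{\partial \theta_j} + \frac{\partial \tilde G^{(b)}_{i,2}(\theta)}{\partial \theta_j} + \frac{\partial \tilde G^{(c)}_{i,2}(\theta)}{\partial \theta_j}\right).
\end{align*}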

From the proof of Lemma 1, it is straightforward to show that

Proof of Theorem 4. From the second-order Taylor expansion of the FOC, we have that 0 = Q(^_CU; S_T(^_CU)) = ~G_T(^_CU)h

where each component of $\bar\theta_T$ is located between the corresponding components of $\theta_0$ and $\hat\theta_{CU}$. By Assumption (6), we obtain

where the first term on the right-hand side is $O_p(T^{-1/2})$. Lemma B.4 provides that

@²Q( ; S_T( ))

Therefore, we have that

Combining these results into the second-order Taylor expansion of the FOC, we obtain 0 = Q ^_CU; S_T(^_CU) = ~G_T(^_CU)h

which leads to the desired result, pT ^_CU ₀ = h

[1] Hansen, B. E., & Lee, S. (2019). Inference for iterated GMM under misspecification. Working paper.

[2] Hwang, J., Kang, B., & Lee, S. (2020). A doubly corrected robust variance estimator for linear GMM. Working paper, Department of Economics, University of Connecticut.

[3] Sun, Y. (2014). Fixed-smoothing asymptotics in a two-step generalized method of moments framework. Econometrica, 82(6), 2327–2370.

[4] Windmeijer, F. (2005). A finite sample correction for the variance of linear efficient two-step GMM estimators. Journal of Econometrics, 126(1), 25–51.
