“Finite-sample Corrected Inference for Two-Step GMM in Time Series”
Jungbin Hwang University of Connecticut Department of Economics
Gonzalo Valdés Universidad de Tarapacá
Departamento de Ingeniería Industrial y de Sistemas
B.1 Finite-sample correction formula for non-linear moments
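In the non-linear moment case, recall that the Taylor expansion of the FOC in (3) for $\sqrt{T}(\hat\theta_2 - \theta_0)$ is
\[
\sqrt{T}(\hat\theta_2 - \theta_0) = -\left[A(\theta_0, S_T(\hat\theta_1))\right]^{-1} G_T(\theta_0)' S_T^{-1}(\hat\theta_1) \sqrt{T} f_T(\theta_0) + O_p\!\left(\frac{1}{\sqrt{T}}\right), \tag{B.1}
\]
where the matrix $A(\theta_0, S_T(\hat\theta_1))$ of second-order derivatives is given as
\[
A(\theta_0, S_T(\hat\theta_1)) = G_T(\theta_0)' S_T^{-1}(\hat\theta_1) G_T(\theta_0) + H_T(\theta_0)' \left(I_d \otimes S_T^{-1}(\hat\theta_1) f_T(\theta_0)\right)
\]
and
\[
H_T(\theta_0) = \begin{bmatrix} \left.\dfrac{\partial G_T(\theta)}{\partial \theta_1}\right|_{\theta=\theta_0} \\ \vdots \\ \left.\dfrac{\partial G_T(\theta)}{\partial \theta_d}\right|_{\theta=\theta_0} \end{bmatrix}.
\]
Then we can formulate an estimator for the asymptotic variance of $\hat\theta_2$ as
\[
\widehat{\mathrm{var}}(\hat\theta_2) = \frac{1}{T} \left[A(\hat\theta_2, S_T(\hat\theta_1))\right]^{-1} G_T'(\hat\theta_2) S_T^{-1}(\hat\theta_1) G_T(\hat\theta_2) \left[A(\hat\theta_2, S_T(\hat\theta_1))^{-1}\right]'.
\]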
Note that the form of $\widehat{\mathrm{var}}(\hat\theta_2)$ in (4) differs from the standard asymptotic variance estimate $(G_T'(\hat\theta_2) S_T^{-1}(\hat\theta_1) G_T(\hat\theta_2))^{-1}$. The difference is the second term in the expression for $A(\hat\theta_2, S_T(\hat\theta_1))$, namely $H_T(\hat\theta_2)'(I_d \otimes S_T^{-1}(\hat\theta_1) f_T(\hat\theta_2))$. Keeping this term could potentially improve the finite-sample performance of $\widehat{\mathrm{var}}(\hat\theta_2)$: although it is of stochastic order $O_p(T^{-1/2}) = o_p(1)$, it is non-zero in a finite sample in over-identified GMM (e.g., Windmeijer (2005)).
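Similarly to the expansion in (8), and taking a further expansion of (B.1), we have that
\[
\sqrt{T}(\hat\theta_2 - \theta_0) = -\left(A(\theta_0, S_T(\theta_0))\right)^{-1} G_T(\theta_0)' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0) + D(\theta_0, S_T(\theta_0)) \sqrt{T}(\hat\theta_1 - \theta_0) + O_p\!\left(\frac{1}{\sqrt{T}}\right), \tag{B.2}
\]
where
\[
D(\theta_0, S_T(\theta_0)) = \left.\frac{\partial \left\{ -\left(A(\theta_0, S_T(\theta))\right)^{-1} G_T(\theta_0)' S_T^{-1}(\theta) f_T(\theta_0) \right\}}{\partial \theta'}\right|_{\theta=\theta_0}
\]
is a $d \times d$ matrix. The $j$-th column of $D(\theta_0, S_T(\theta_0))$ is expressed as
\begin{align}
D(\theta_0, S_T(\theta_0))[:, j] &= \left(A(\theta_0, S_T(\theta_0))\right)^{-1} \left.\frac{\partial A(\theta_0, S_T(\theta))}{\partial \theta_j}\right|_{\theta=\theta_0} \left(A(\theta_0, S_T(\theta_0))\right)^{-1} \tag{B.3} \\
&\quad \times G_T(\theta_0)' S_T^{-1}(\theta_0) f_T(\theta_0) + \left(A(\theta_0, S_T(\theta_0))\right)^{-1} G_T(\theta_0)' S_T^{-1}(\theta_0) \tag{B.4} \\
&\quad \times \left.\frac{\partial S_T(\theta)}{\partial \theta_j}\right|_{\theta=\theta_0} S_T^{-1}(\theta_0) f_T(\theta_0). \tag{B.5}
\end{align}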
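From the FOC, $G_T(\hat\theta_2)' S_T^{-1}(\hat\theta_1) f_T(\hat\theta_2) = 0$, so the first term in (B.3) is always estimated as zero, and the feasible estimator for $D(\theta_0, S_T(\theta_0))[:, j]$ is
\[
D(\hat\theta_2, S_T(\hat\theta_1))[:, j] = \left(A(\hat\theta_2, S_T(\hat\theta_1))\right)^{-1} G_T(\hat\theta_2)' S_T^{-1}(\hat\theta_1) \left.\frac{\partial S_T(\theta)}{\partial \theta_j}\right|_{\theta=\hat\theta_1} S_T^{-1}(\hat\theta_1) f_T(\hat\theta_2),
\]
where the formula for $\partial S_T(\theta)/\partial \theta_j |_{\theta=\hat\theta_1}$ is provided in (10). The finite-sample corrected formula is then given by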
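\[
\widehat{\mathrm{var}}_c(\hat\theta_2) = \widehat{\mathrm{var}}(\hat\theta_2) + D(\hat\theta_2, S_T(\hat\theta_1)) \widehat{\mathrm{var}}(\hat\theta_2) + \widehat{\mathrm{var}}(\hat\theta_2) D(\hat\theta_2, S_T(\hat\theta_1))' + D(\hat\theta_2, S_T(\hat\theta_1)) \widehat{\mathrm{var}}(\hat\theta_1) D(\hat\theta_2, S_T(\hat\theta_1))',
\]
where
\[
\widehat{\mathrm{var}}(\hat\theta_1) = \frac{1}{T} \left(G_T(\hat\theta_2)' W_T^{-1} G_T(\hat\theta_2)\right)^{-1} G_T(\hat\theta_2)' W_T^{-1} S_T(\hat\theta_1) W_T^{-1} G_T(\hat\theta_2) \left(G_T(\hat\theta_2)' W_T^{-1} G_T(\hat\theta_2)\right)^{-1},
\]
\[
\widehat{\mathrm{var}}(\hat\theta_2) = \frac{1}{T} \left[A(\hat\theta_2, S_T(\hat\theta_1))\right]^{-1} G_T'(\hat\theta_2) S_T^{-1}(\hat\theta_1) G_T(\hat\theta_2) \left[A(\hat\theta_2, S_T(\hat\theta_1))^{-1}\right]'.
\]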
Unlike the linear moment condition illustrated in Section 2, there still exists a remainder term of order $O_p(T^{-1/2})$ in (B.2), which arises from the Jacobian function $G_T(\theta)$. This remainder term is of the same order as our correction term, $D(\theta_0, S_T(\theta_0))$. The finite-sample improvement obtained by accounting for $D(\hat\theta_2, S_T(\hat\theta_1))$ in our corrected formula can therefore depend on its magnitude relative to that of the remainder term in (B.2). The same points are made in Windmeijer (2005) and Hwang et al. (2020).
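As a rough numerical illustration of the B.1 formulas, the following NumPy sketch assembles $A$, the uncorrected variance estimate, and the corrected variance estimate from precomputed ingredients. The function names and the argument layout (for example, passing $H_T$ as a $(dm) \times d$ stacked array) are assumptions made for this sketch and are not notation from the paper.

```python
import numpy as np

def second_order_matrix(G, H, S, f):
    """A = G' S^{-1} G + H' (I_d kron S^{-1} f).

    G: (m, d) Jacobian; H: (d*m, d) vertical stack of dG/dtheta_k;
    S: (m, m) weighting (LRV) matrix; f: (m,) sample moment vector.
    """
    d = G.shape[1]
    S_inv_f = np.linalg.solve(S, f)                    # S^{-1} f, shape (m,)
    kron_term = np.kron(np.eye(d), S_inv_f[:, None])   # shape (d*m, d)
    return G.T @ np.linalg.solve(S, G) + H.T @ kron_term

def var_two_step(G, H, S, f, T):
    """(1/T) A^{-1} G' S^{-1} G (A^{-1})'."""
    A_inv = np.linalg.inv(second_order_matrix(G, H, S, f))
    return A_inv @ (G.T @ np.linalg.solve(S, G)) @ A_inv.T / T

def corrected_variance(var2, var1, D):
    """var_c = var2 + D var2 + var2 D' + D var1 D'."""
    return var2 + D @ var2 + var2 @ D.T + D @ var1 @ D.T
```

When `H` is a zero array (linear moments), `var_two_step` reduces to the standard $(G'S^{-1}G)^{-1}/T$, and when `D` is zero the corrected and uncorrected estimates coincide.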
B.2 Iterated GMM
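Let $\hat\theta^{0}_{iter}$ be the two-step estimator $\hat\theta_2$. For $j \geq 1$, the $j$-th iterated GMM estimator $\hat\theta^{j}_{iter}$ is defined as the solution to the following minimization problem:
\[
\hat\theta^{j}_{iter} = \arg\min_{\theta \in \Theta} M\left(\theta, S_T(\hat\theta^{j-1}_{iter})\right), \tag{B.6}
\]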
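where $M(\theta, S_T(\hat\theta^{j-1}_{iter})) = f_T(\theta)' S_T^{-1}(\hat\theta^{j-1}_{iter}) f_T(\theta)$. The asymptotic distribution of $\sqrt{T}(\hat\theta^{1}_{iter} - \theta_0)$ can be represented as follows:
\begin{align}
\sqrt{T}(\hat\theta^{1}_{iter} - \theta_0) &= -(G_T' S_T^{-1}(\theta_0) G_T)^{-1} G_T' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0) \tag{B.7} \\
&\quad + D(\theta_0, S_T(\theta_0)) \sqrt{T}(\hat\theta_2 - \theta_0) + o_p\!\left(\frac{1}{\sqrt{T}}\right). \tag{B.8}
\end{align}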
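Substituting the expansion in (8) into (B.8), we can represent the first iteration estimator as
\[
\sqrt{T}(\hat\theta^{1}_{iter} - \theta_0) = -\left(I_d + D(\theta_0, S_T(\theta_0))\right) (G_T' S_T^{-1}(\theta_0) G_T)^{-1} G_T' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0) + [D(\theta_0, S_T(\theta_0))]^{2} \sqrt{T}(\hat\theta_1 - \theta_0) + o_p\!\left(\frac{1}{\sqrt{T}}\right).
\]
The leading term in $\sqrt{T}(\hat\theta^{1}_{iter} - \theta_0)$ consists of an asymptotically normal term, part of which is scaled by $I_d + D(\theta_0, S_T(\theta_0))$. Also, the effect of the one-step estimator $\sqrt{T}(\hat\theta_1 - \theta_0)$ decays through the iteration procedure when we keep repeating this substitution until the $j$-th iteration:
\[
\sqrt{T}(\hat\theta^{j}_{iter} - \theta_0) = -\left[ I_d + \sum_{i=1}^{j} [D(\theta_0, S_T(\theta_0))]^{i} \right] (G_T' S_T^{-1}(\theta_0) G_T)^{-1} G_T' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0) + [D(\theta_0, S_T(\theta_0))]^{j+1} \sqrt{T}(\hat\theta_1 - \theta_0) + o_p\!\left(\frac{1}{\sqrt{T}}\right).
\]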
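The bracketed factor $I_d + \sum_{i=1}^{j} D^{i}$ in the $j$-th iteration expansion is a truncated Neumann series. Assuming the spectral radius of $D$ is below one, its partial sums approach $(I_d - D)^{-1}$ as $j$ grows. The sketch below checks this numerically for a hypothetical $2 \times 2$ matrix $D$; the entries are illustrative values, not estimates from any data.

```python
import numpy as np

def partial_sum(D, j):
    """I_d + sum_{i=1}^{j} D^i, the bracketed factor after j iterations."""
    d = D.shape[0]
    total, power = np.eye(d), np.eye(d)
    for _ in range(j):
        power = power @ D        # D^i for i = 1, ..., j
        total = total + power
    return total

# Hypothetical D with small entries (illustrative values only).
D = np.array([[0.10, -0.05],
              [0.02,  0.08]])
limit = np.linalg.inv(np.eye(2) - D)

# Largest absolute deviation from the limit after 1, 3, and 6 iterations.
gap = [np.max(np.abs(partial_sum(D, j) - limit)) for j in (1, 3, 6)]
```

The entries of `gap` shrink as the number of iterations increases, which mirrors the decay of the $[D]^{j+1}$ term in the expansion.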
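When the number of iterations $j$ goes to infinity, $\hat\theta^{j}_{iter}$ is expected to converge to a random variable $\hat\theta^{\infty}_{iter}$.$^{1}$ Then the impact of $\sqrt{T}(\hat\theta_1 - \theta_0)$ on $\sqrt{T}(\hat\theta^{j}_{iter} - \theta_0)$ through $[D(\theta_0, S_T(\theta_0))]^{j+1} = O_p(T^{-(j+1)/2})$ can be perfectly removed, and we have that
\[
\sqrt{T}(\hat\theta^{\infty}_{iter} - \theta_0) = -\left(I_d - D(\theta_0, S_T(\theta_0))\right)^{-1} (G_T' S_T^{-1}(\theta_0) G_T)^{-1} G_T' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0) + o_p\!\left(\frac{1}{\sqrt{T}}\right),
\]
assuming that $I_d - D(\theta_0, S_T(\theta_0))$, which is $I_d + o_p(1)$, is invertible. The corrected variance estimate for $\hat\theta^{\infty}_{iter}$ is constructed as follows: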
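\[
\widehat{\mathrm{var}}_c(\hat\theta^{\infty}_{iter}) = \left(I_d - D(\hat\theta^{\infty}_{iter}, S_T(\hat\theta^{\infty}_{iter}))\right)^{-1} \widehat{\mathrm{var}}(\hat\theta^{\infty}_{iter}) \left[\left(I_d - D(\hat\theta^{\infty}_{iter}, S_T(\hat\theta^{\infty}_{iter}))\right)'\right]^{-1},
\]
where $\widehat{\mathrm{var}}(\hat\theta^{\infty}_{iter}) = T^{-1}(G_T' S_T^{-1}(\hat\theta^{\infty}_{iter}) G_T)^{-1}$ is the standard sandwich variance formula. The corrected formula for $\hat\theta^{\infty}_{iter}$ extends that of Windmeijer (2000), which is formulated in an i.i.d. setting. The corrected Wald statistic is
\[
F_c(\hat\theta^{\infty}_{iter}) = \frac{1}{p} \left(R\hat\theta^{\infty}_{iter} - r\right)' \left(R\, \widehat{\mathrm{var}}_c(\hat\theta^{\infty}_{iter}) R'\right)^{-1} \left(R\hat\theta^{\infty}_{iter} - r\right).
\]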
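The corrected variance for the iterated estimator and the corrected Wald statistic can be sketched numerically as follows. This is a minimal illustration under the assumption that the standard sandwich estimate `var_iter`, the correction matrix `D`, and the restriction pair `(R, r)` have already been computed; the function names are hypothetical.

```python
import numpy as np

def iterated_corrected_variance(var_iter, D):
    """(I_d - D)^{-1} var_iter ((I_d - D)')^{-1}."""
    M = np.linalg.inv(np.eye(D.shape[0]) - D)
    return M @ var_iter @ M.T    # M.T equals ((I_d - D)')^{-1}

def corrected_wald(theta, R, r, var_c):
    """F_c = (1/p) (R theta - r)' (R var_c R')^{-1} (R theta - r)."""
    diff = R @ theta - r
    p = R.shape[0]               # number of restrictions
    return float(diff @ np.linalg.solve(R @ var_c @ R.T, diff)) / p
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse of $R\,\widehat{\mathrm{var}}_c R'$ explicitly, which is a common numerical choice rather than something prescribed by the text.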
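Similarly, one can construct the corrected $t$ statistic when $p = 1$. The asymptotic distribution of $F_c(\hat\theta^{\infty}_{iter})$ can be characterized as
\begin{align*}
F_c(\hat\theta^{\infty}_{iter}) &= \frac{1}{p} \left[ R \left(I_d - D(\theta_0, S_T(\theta_0))\right)^{-1} (G_T' S_T^{-1}(\theta_0) G_T)^{-1} G_T' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0) \right]' \\
&\quad \times \left( R \left(I_d - D(\hat\theta^{\infty}_{iter}, S_T(\hat\theta^{\infty}_{iter}))\right)^{-1} \left(G_T' S_T^{-1}(\hat\theta^{\infty}_{iter}) G_T\right)^{-1} \left[\left(I_d - D(\hat\theta^{\infty}_{iter}, S_T(\hat\theta^{\infty}_{iter}))\right)'\right]^{-1} R' \right)^{-1} \\
&\quad \times \left[ R \left(I_d - D(\theta_0, S_T(\theta_0))\right)^{-1} (G_T' S_T^{-1}(\theta_0) G_T)^{-1} G_T' S_T^{-1}(\theta_0) \sqrt{T} f_T(\theta_0) \right] + o_p(1).
\end{align*}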
$^{1}$ Hansen and Lee (2020) provide regularity conditions that guarantee that the iteration sequence $\hat\theta^{j}_{iter}$, for $j = 1, 2, \ldots$, is a contraction mapping, which implies that the iterated estimator $\hat\theta^{\infty}_{iter}$ is the fixed point.
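Under the fixed-smoothing asymptotics, we have that $S_T(\theta_0) \overset{a}{\sim} S_T$. The asymptotically equivalent distribution of $F_c(\hat\theta^{\infty}_{iter})$ is then given by
\[
F_{iter} = \frac{1}{p} \left[ \tilde R (G_T' S_T^{-1} G_T)^{-1} G_T' S_T^{-1} Z \right]' \left[ \tilde R \left(G_T' S_T^{-1} G_T\right)^{-1} \tilde R' \right]^{-1} \left[ \tilde R (G_T' S_T^{-1} G_T)^{-1} G_T' S_T^{-1} Z \right],
\]
where $\tilde R = R(I_d - D(\theta_0, S_T(\theta_0)))^{-1}$ is a $p \times d$ matrix. Considering $\tilde R = R + o_p(1)$ and Theorem 1 in Sun (2014), we obtain $F_{iter} = F + o_p(1)$. Thus, instead of approximating $F_c(\hat\theta^{\infty}_{iter})$ by a conventional $\chi^2_p / p$ distribution, the standard $t$ and $F$ distributions shown in Theorem 3 and (18), together with the corrected variance estimate $\widehat{\mathrm{var}}_c(\hat\theta^{\infty}_{iter})$, the $J$-statistic modification, and the finite-sample adjustments in subsection 3.2, can be used to obtain asymptotic critical values for $t^{adj}_c(\hat\theta^{\infty}_{iter})$ and $F^{adj}_c(\hat\theta^{\infty}_{iter})$, respectively.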
B.3 Finite-sample corrected formula for linear-IV model
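Let $X = (x_1, \ldots, x_T)' \in \mathbb{R}^{T \times d}$, $Z = (z_1, \ldots, z_T)' \in \mathbb{R}^{T \times m}$, and $y = (y_1, \ldots, y_T)'$. We choose the initial weight matrix $W_T$ as $Z'Z/T$. This makes the initial one-step estimator $\hat\theta_1$ equivalent to the two-stage least squares (2SLS) estimator, which is formulated as
\[
\hat\theta_1 = \left(X'Z W_T^{-1} Z'X\right)^{-1} \left(X'Z W_T^{-1} Z'y\right),
\]
and the corresponding asymptotic variance estimator is
\[
\widehat{\mathrm{var}}(\hat\theta_1) = T \left(X'Z W_T^{-1} Z'X\right)^{-1} X'Z W_T^{-1} S_T(\hat\theta_1) W_T^{-1} Z'X \left(X'Z W_T^{-1} Z'X\right)^{-1},
\]
where
\[
S_T(\hat\theta_1) = \frac{1}{T} \sum_{t=1}^{T} \sum_{s=1}^{T} Q_h\!\left(\frac{t}{T}, \frac{s}{T}\right) \left( z_t \varepsilon_{y,t}(\hat\theta_1) - \frac{1}{T} \sum_{j=1}^{T} z_j \varepsilon_{y,j}(\hat\theta_1) \right) \left( z_s \varepsilon_{y,s}(\hat\theta_1) - \frac{1}{T} \sum_{j=1}^{T} z_j \varepsilon_{y,j}(\hat\theta_1) \right)'
\]
and $\varepsilon_{y,t}(\hat\theta_1) = y_t - x_t'\hat\theta_1$. The efficient two-step GMM estimator is
\[
\hat\theta_2 = \left(X'Z S_T^{-1}(\hat\theta_1) Z'X\right)^{-1} X'Z S_T^{-1}(\hat\theta_1) Z'y,
\]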
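and the corresponding uncorrected sandwich variance estimator and the corrected variance estimator are
\[
\widehat{\mathrm{var}}(\hat\theta_2) = T \left(X'Z S_T^{-1}(\hat\theta_1) Z'X\right)^{-1}
\]
and
\[
\widehat{\mathrm{var}}_c(\hat\theta_2) = \widehat{\mathrm{var}}(\hat\theta_2) + D(\hat\theta_2, S_T(\hat\theta_1)) \widehat{\mathrm{var}}(\hat\theta_2) + \widehat{\mathrm{var}}(\hat\theta_2) D(\hat\theta_2, S_T(\hat\theta_1))' + D(\hat\theta_2, S_T(\hat\theta_1)) \widehat{\mathrm{var}}(\hat\theta_1) D(\hat\theta_2, S_T(\hat\theta_1))',
\]
respectively, where the $j$-th column of $D(\hat\theta_2, S_T(\hat\theta_1))$ is given by the feasible column formula of B.1 specialized to the linear model, that is, with $G_T = -Z'X/T$, $f_T(\theta) = Z'(y - X\theta)/T$, and no second-derivative term in $A$:
\[
D(\hat\theta_2, S_T(\hat\theta_1))[:, j] = -\left(X'Z S_T^{-1}(\hat\theta_1) Z'X\right)^{-1} X'Z S_T^{-1}(\hat\theta_1) \left.\frac{\partial S_T(\theta)}{\partial \theta_j}\right|_{\theta=\hat\theta_1} S_T^{-1}(\hat\theta_1) Z'(y - X\hat\theta_2).
\]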
The iterated GMM estimator, $\hat\theta^{\infty}_{iter}$, which repeats the loop of the iteration sequence in (B.6), is given by
\[
\hat\theta^{\infty}_{iter} = \left(X'Z S_T^{-1}(\hat\theta^{\infty}_{iter}) Z'X\right)^{-1} X'Z S_T^{-1}(\hat\theta^{\infty}_{iter}) Z'y,
\]
and the corresponding uncorrected sandwich variance estimator and the corrected variance estimator are
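The B.3 quantities for the two-step estimator can be put together in a short NumPy sketch. Several ingredients here are assumptions rather than the paper's specification: the weighting function is taken to be a Bartlett kernel, $Q_h(t/T, s/T) = k((t-s)/B_T)$, with a user-supplied bandwidth; the derivative $\partial S_T(\theta)/\partial\theta_j$ is obtained by differentiating the displayed $S_T$ directly, standing in for formula (10), which is not reproduced in this appendix; and the $D$ column is the B.1 feasible formula specialized to $G_T = -Z'X/T$ and $f_T(\theta) = Z'(y - X\theta)/T$. All function names are hypothetical.

```python
import numpy as np

def bartlett_weights(T, bandwidth):
    """T x T matrix of k((t - s) / bandwidth) for the Bartlett kernel (assumed)."""
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    return np.clip(1.0 - lags / bandwidth, 0.0, None)

def demean(M):
    """Subtract column means, as in the centered moment process of S_T."""
    return M - M.mean(axis=0)

def two_step_gmm(y, X, Z, bandwidth):
    """Return (theta1, theta2, var2, var_c) for the linear IV model."""
    T, d = X.shape
    Q = bartlett_weights(T, bandwidth)
    ZX, Zy = Z.T @ X, Z.T @ y
    W_inv = np.linalg.inv(Z.T @ Z / T)

    # One-step (2SLS) estimator and its sandwich variance.
    B1 = np.linalg.inv(ZX.T @ W_inv @ ZX)
    theta1 = B1 @ (ZX.T @ W_inv @ Zy)
    V1 = demean(Z * (y - X @ theta1)[:, None])   # rows: centered z_t * residual_t
    S = V1.T @ Q @ V1 / T                        # S_T(theta1)
    var1 = T * B1 @ ZX.T @ W_inv @ S @ W_inv @ ZX @ B1

    # Efficient two-step estimator and uncorrected variance.
    S_inv = np.linalg.inv(S)
    B2 = np.linalg.inv(ZX.T @ S_inv @ ZX)
    theta2 = B2 @ (ZX.T @ S_inv @ Zy)
    var2 = T * B2

    # Correction matrix D, one column per parameter.
    Zu2 = Z.T @ (y - X @ theta2)
    D = np.empty((d, d))
    for j in range(d):
        dV = -demean(Z * X[:, [j]])              # derivative of centered moments
        dS = (dV.T @ Q @ V1 + V1.T @ Q @ dV) / T # dS_T/dtheta_j at theta1
        D[:, j] = -B2 @ ZX.T @ S_inv @ dS @ S_inv @ Zu2

    var_c = var2 + D @ var2 + var2 @ D.T + D @ var1 @ D.T
    return theta1, theta2, var2, var_c
```

In a just-identified model ($m = d$) the sample moments are exactly zero at $\hat\theta_2$, so `D` vanishes and `var_c` equals `var2`; the correction only matters in the over-identified case.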
B.4 Technical lemmas and proof of Theorem 4
Lemma B.1 Under Assumption 1, together with $h \to \infty$ and $T \to \infty$ such that $h/T \to 0$, the
Proof of Lemma B.1. We start by proving the results for the case when $Q_h(r, s) = k((r - s)/b)$;
2; and this enables us to apply the dominated convergence theorem and obtain Z 1
Similarly, for part (b), we have
sup
because the first term on the right-hand side of the first equality (B.9) is $O(B_T/T) = o(1)$. Also, Assumption 1 implies that $\sup_{0 \leq \tau_1 \leq 1} |k((\tau_1 - \tau_2)/b)| \to 0$ for almost all $\tau_2$, and this enables us to apply the dominated convergence theorem to show that the second term on the right-hand side of the first equality in (B.9) is $o(1)$ when $b = B_T/T \to 0$.
For part (c),
where the last equality follows from the proof of part (b):
Next, we consider the case of the OS-LRV with $Q_h(r, s) = K^{-1} \sum_{j=1}^{K} \Phi_j(r) \Phi_j(s)$ and $K \to \infty$ such that $K/T \to 0$. Then the result of part (a) follows from
1
since $\int_0^1 \Phi_j(\tau)\, d\tau = 0$ by Assumption 1. Part (b) follows in a similar manner using Assumption 1, since
Lemma B.2 Let us de…ne
j( ) = 1
Proof of Lemma B.2. For each $j \in \{1, \ldots, d\}$, we have that term A (above) and obtain
p1
where the second equality follows from summation by parts. For the last term on the right-hand side of the second equality in (B.13),
$e_T(T) H_{T,j}(\bar\theta_{T,j}) \sqrt{T}(\hat\theta - \theta_0) = O_p(1/T)$ by Assumptions 1 and 4. For any $m$-dimensional vector $a$,
E by Assumption 5: Together with Markov’s inequality, this leads us to get
p1
where the last equality follows fromp
T (^ ) = Op(1); sup1 t T j;t = op(1); and sup1 t TeT(t) = O(1=T ); which leads us to D = Op(1=T ) : Incorporating this, together with (B.11) and (B.12), into (B.10), we obtain
j(^) = j(^) + Op
1 T
= j(^) + op(1) ; (B.14)
which is the desired result. of the convergence on the right-hand sides of (B.15) and (B.16) becomes $o(1)$ instead of $O(1/T)$.
Using (B.15), one can show that the last term in (B.10) is $o_p(1)$ when $(h, T) \to \infty$ such that $h/T \to 0$. Also, a careful inspection of the proof for the case where $h$ is fixed as $T \to \infty$ indicates that (B.16) leads to the conclusion that
$A = O_p(1)$ and $D = o_p(1)$
hold when $h \to \infty$ such that $h/T \to 0$. Similarly, we can prove that $B = o_p(1)$ and $C = O_p(1)$ also hold when $h \to \infty$ such that $h/T \to 0$. Incorporating these results into (B.10), we can conclude that the $o_p(1)$ approximation in (B.14) also holds when $(h, T) \to \infty$ such that $h/T \to 0$.
Proof of Lemma B.3. We use the formula for summation by parts:
1
for any conformable vectors $a_t$ and $b_t$. Consider
We first apply the formula in (B.17) to the expression inside the parentheses by setting $a_t = Q_h(t/T, \tau/T)$, $b_t = A_t$, and $C_t = \sum^{t}$
for each $\tau \in \{1, \ldots, T\}$. Incorporating this into (B.18), we have
We repeatedly apply the formula in (B.17) to the terms on the right-hand side. For the first term, setting a = Qh 1;T ST(A); b = B0; and C =P Incorporating these results into (B.19), we obtain
1
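The summation-by-parts step used in this proof can be checked numerically. The sketch below uses the standard (Abel) form $\sum_{t=1}^{T} a_t b_t = a_T C_T - \sum_{t=1}^{T-1} (a_{t+1} - a_t) C_t$ with $C_t = \sum_{s=1}^{t} b_s$; the exact normalization of (B.17) is not reproduced in this appendix, so this form and the function name are assumptions made for illustration.

```python
import numpy as np

def summation_by_parts(a, b):
    """Right-hand side of the Abel identity for scalar a_t and vector b_t.

    a: (T,) weights; b: (T, k) array whose rows are the vectors b_t.
    """
    C = np.cumsum(b, axis=0)                      # C_t = b_1 + ... + b_t
    increments = np.diff(a)[:, None]              # a_{t+1} - a_t, t = 1..T-1
    return a[-1] * C[-1] - np.sum(increments * C[:-1], axis=0)
```

The identity moves the differencing from the partial sums $C_t$ onto the weights $a_t$, which is what allows smoothness of $Q_h$ in its first argument to be exploited.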
Lemma B.4 Let Assumptions 1–3 hold with $\hat\theta_{CU}$. Then the CU-FOC function $Q(\theta, S_T(\theta))$ satisfies
Proof of Lemma B.4. We start by showing that $\partial Q(\theta, S_T(\theta))/\partial \theta'$ coincides with the definition of $\tilde A_1(\theta, S_T(\theta))$. We have that
The right-hand side of the last equality can be re-written as
and using the linearity of $f(v_t, \theta)$ in $\theta$, we obtain
where, because of the linearity of the moment process, $\Upsilon^{(2)}_{ji}$ does not depend on $\theta$. We have that
@
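For $A_j(\theta)$, it is straightforward to obtain
\begin{align*}
A_j(\theta) &= \left[\frac{\partial}{\partial \theta_j} \tilde G_T'(\theta)\right] S_T^{-1}(\theta) \tilde G_T(\theta) - \tilde G_T'(\theta) S_T^{-1}(\theta) \frac{\partial S_T(\theta)}{\partial \theta_j} S_T^{-1}(\theta) \tilde G_T(\theta) + [\tilde G_T(\theta)]' S_T^{-1}(\theta) \frac{\partial}{\partial \theta_j} \tilde G_T(\theta) \\
&= \left(\tilde G^{(a)}_{j,2}(\theta) + \tilde G^{(b)}_{j,2}(\theta) + \tilde G^{(c)}_{j,2}(\theta)\right)' S_T^{-1}(\theta) \tilde G_T(\theta) - \tilde G_T'(\theta) S_T^{-1}(\theta) \left(\Upsilon_j(\theta) + \Upsilon_j(\theta)'\right) S_T^{-1}(\theta) \tilde G_T(\theta) \\
&\quad + [\tilde G_T(\theta)]' S_T^{-1}(\theta) \left(\tilde G^{(a)}_{j,2}(\theta) + \tilde G^{(b)}_{j,2}(\theta) + \tilde G^{(c)}_{j,2}(\theta)\right).
\end{align*}
For $B_j(\theta)$, its $i$-th row is expressed as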
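\begin{align*}
\frac{\partial}{\partial \theta_j} \left( f_T(\theta)' S_T^{-1}(\theta) \frac{\partial \tilde G_T(\theta)}{\partial \theta_i} \right) &= \left[\frac{\partial}{\partial \theta_j} f_T(\theta)'\right] S_T^{-1}(\theta) \frac{\partial \tilde G_T(\theta)}{\partial \theta_i} - f_T(\theta)' S_T^{-1}(\theta) \frac{\partial S_T(\theta)}{\partial \theta_j} S_T^{-1}(\theta) \frac{\partial \tilde G_T(\theta)}{\partial \theta_i} + f_T(\theta)' S_T^{-1}(\theta) \frac{\partial^2 \tilde G_T(\theta)}{\partial \theta_j \partial \theta_i} \\
&= G_{jT}' S_T^{-1}(\theta) \left(\tilde G^{(a)}_{i,2}(\theta) + \tilde G^{(b)}_{i,2}(\theta) + \tilde G^{(c)}_{i,2}(\theta)\right) \\
&\quad - f_T(\theta)' S_T^{-1}(\theta) \left(\Upsilon_j(\theta) + \Upsilon_j(\theta)'\right) S_T^{-1}(\theta) \left(\tilde G^{(a)}_{i,2}(\theta) + \tilde G^{(b)}_{i,2}(\theta) + \tilde G^{(c)}_{i,2}(\theta)\right) \\
&\quad + f_T(\theta)' S_T^{-1}(\theta) \left( \frac{\partial \tilde G^{(a)}_{i,2}(\theta)}{\partial \theta_j} + \frac{\partial \tilde G^{(b)}_{i,2}(\theta)}{\partial \theta_j} + \frac{\partial \tilde G^{(c)}_{i,2}(\theta)}{\partial \theta_j} \right).
\end{align*}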
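Lastly, the $i$-th row of $C_j(\theta)$ is expressed as
\begin{align*}
\frac{\partial}{\partial \theta_j} \left( f_T(\theta)' S_T^{-1}(\theta) \Upsilon_i(\theta) S_T^{-1}(\theta) \tilde G_T(\theta) \right) &= G_{jT}' S_T^{-1}(\theta) \Upsilon_i(\theta) S_T^{-1}(\theta) \tilde G_T(\theta) - f_T(\theta)' S_T^{-1}(\theta) \frac{\partial S_T(\theta)}{\partial \theta_j} S_T^{-1}(\theta) \Upsilon_i(\theta) S_T^{-1}(\theta) \tilde G_T(\theta) \\
&\quad + f_T(\theta)' S_T^{-1}(\theta) \Upsilon^{(2)}_{ij} S_T^{-1}(\theta) \tilde G_T(\theta) - f_T(\theta)' S_T^{-1}(\theta) \Upsilon_i(\theta) S_T^{-1}(\theta) \frac{\partial S_T(\theta)}{\partial \theta_j} S_T^{-1}(\theta) \tilde G_T(\theta) \\
&\quad + f_T(\theta)' S_T^{-1}(\theta) \Upsilon_i(\theta) S_T^{-1}(\theta) \left( \frac{\partial \tilde G^{(a)}_{i,2}(\theta)}{\partial \theta_j} + \frac{\partial \tilde G^{(b)}_{i,2}(\theta)}{\partial \theta_j} + \frac{\partial \tilde G^{(c)}_{i,2}(\theta)}{\partial \theta_j} \right).
\end{align*}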
From the proof of Lemma 1, it is straightforward to show that
Proof of Theorem 4. From the second-order Taylor expansion of the FOC, we have that 0 = Q(^CU; ST(^CU)) = ~GT(^CU)h
where each component of $\bar\theta_T$ is located between the corresponding components of $\theta_0$ and $\hat\theta_{CU}$. By Assumption (6), we obtain
where the first term on the right-hand side is $O_p(T^{-1/2})$. Lemma B.4 provides that
@2Q( ; ST( ))
Therefore, we have that
Combining these results into the second-order Taylor expansion of the FOC, we obtain 0 = Q ^CU; ST(^CU) = ~GT(^CU)h
which leads to the desired result, pT ^CU 0 = h
[1] Hansen, B. E., & Lee, S. (2019). Inference for iterated GMM under misspecification. Working paper.
[2] Hwang, J., Kang, B., & Lee, S. (2020). A doubly corrected robust variance estimator for linear GMM. Working paper, Department of Economics, University of Connecticut.
[3] Sun, Y. (2014). Fixed-smoothing asymptotics in a two-step generalized method of moments framework. Econometrica, 82(6), 2327–2370.
[4] Windmeijer, F. (2005). A finite sample correction for the variance of linear efficient two-step GMM estimators. Journal of Econometrics, 126(1), 25–51.