A Appendix section
A.5 Consistency of Sieve Semiparametric MLE
Lemma 3 (B.1) Let bn= bn; bfn;bn be such that bQn(bn) sup 2AnQbn( ) Op( n) with n = op(1) : (ii)bn is well de…ned and measurable.
B.1.4 (i) Let c (m (n)) = sup 2An Qbn( ) Q ( ) = op(1);
(ii) Uniformly over " > 0
max fc (m (n)) ; n; jQ ( n 0) Q ( 0)jg = o (1)
Then d (bn; 0) = op(1).
Note that since there is no penalty term, Qn(:) = Q (:) = Q (:) in the original Lemma B1.
Let us now check the conditions of Lemma B:1 above.
Condition B:1:1(i) is satis…ed by assumptions C1 and C2. In order for the criterion function Q ( 0) = Et;xlog p (tjx; 0) < 1 and in anticipation of the information inequality, I show that
Et;xp (tjx; 0) < 1
where p (tjx; 0) > 0. The joint probability distribution of T and X is denoted as P (t; x), while the marginal densities of XjT and of T are denoted as t(x) and (t), respectively. Then
Et;xp (tjx; 0) =
Condition B:1:1(ii) is implied by assumptions C1 and C2. Since 0 is identi…ed and Et;xp (tjx; 0) < 1, by the information inequality, Q ( ) Q ( 0) < 0 for 2 An with 6= 0.
De…ne:
(m (n) ; ") sup
2An:jj 0jj1 "
Q ( ) Q ( 0)
Since An is compact, there exists a n2 An with jj n 0jj1 " > 0 such that
n = arg max
2An:jj n 0jj1 "Q ( ) Then, for some constant C > 0,
(m (n) ; ") = Q ( n) Q ( 0)
= Q ( n) Q ( n 0) + Q ( n 0) Q ( 0) C jj n n 0jj21+ o (1)
Suppose Q ( n) Q ( 0) ! 0, then jj n n 0jj21! 0. However, since
jj n 0jj21 jj n n 0jj21+ jj n 0 0jj21
then jj n 0jj21! 0, which is a contradiction to jj n 0jj21 " > 0. Therefore
lim inf
n!1 (m (n) ; ") > 0
Condition B:1:2 is implied by the way the parameter and the sieve spaces are de…ned in (20a) (20b) and (21a) (21b).
Condition B:1:3 is implied by assumptions C1, C2, and C3. In order to check B.1.3 I apply Remark B.1(1)(a) in Chen and Pouzo (2012). First note that by construction, An is a compact subset of A for each n under the norm de…ned in (22). Before showing the continuity of the criterion function in the consistency norm, let = ( ; k) and de…ne the following terms:
(u; x) = (x) f (u) ( ; t; k) =
Z t 0
(x; u) 1( (t u) ; k) du
1( (t u) ; k) = f (u) (1 (t u) ( ; t; k)) 1( (t u) ; k) + (x) f2(u) (t u) 11( (t u) ; k)
2( (t u) ; k) = (x) (1 (t u) ( ; t; )) 1( (t u) ; k) + 2(x) f (u) (t u) 11( (t u) ; k)
3( (t u) ; k) = (x) f (u) 12( (t u) ; k) ( ; t; k) 2( (t u) ; k)
By assumption C2 and by letting Mf suptf (t) and M supx (x), we have that
sup 1( (t u) ; k) MfM1
sup 2( (t u) ; k) M M1 sup 3( (t u) ; k) M MfM12
and by assumption C1(i), we have that for all and x and almost all t :
( ; t; k) 6= 0
Condition B:1:4(i) is implied by assumptions C1 through C3. To show the uniform convergence of the criterion function over the sieve space, we have to show that:
sup
2Am(n)
Qbn( ) Q ( ) = op(1)
which holds if the class of functions indexing the criterion is Glivenko-Cantelli. That is, we need to show that the class of functions (43) is Glivenko-Cantelli
L = fl (tjx; ) = log p (tjx; ) : 2 Ang (43)
By Theorem 2.4.1 of van der Vaart and Wellner (1986) if the bracketing number N[]("; L; L1) is …nite for all " > 0, then L is Glivenko-Cantelli. I proceed now to calculate the bracketing number of the class L.
De…ne
Combining (44) and (45) obtains
By using a result of Shen and Wong (1994) (page 597) and by using a bracketing entropy preservation result of Kosorok (2008) (2008, Lemma 9:25) I show below that
lUij(t; ; k) lijL(t; ; k)
1 C"
First, notice that kUi kiL "=2 holds as k is a …nite dimensional parameter and the covering number of is of order O 1" . Then I show thatRt
0 U
j L
j du "=2 holds. According to a result on page 597 of Shen and Wong (1994), the bracketing entropy of n is bounded by
log N[] 2M" ; n; jj:jj1 C0mnlog 2M"
where the envelope of the class of functions indexing n is M and where I used that if F is a class of functions with envelope equal to 1, then M F, where M is a constant, has N[]("M; F; jj:jj) = N M"; F; jj:jj . Also, Fnint; the space of functions indexed byRt
0fn(u) du is a …nite dimensional linear space with envelope Rt
0fn(u) du Mf. Applying the same result in Shen and Wong (1994), we have that the bracketing entropy of Fnint is bounded by
log N[] 2M"
f; Fnint; jj:jj1 C00m (n) log 2Mf
"
By bracketing entropy preservation results,6 since both Mn(x) and M1
f
Rt
0fn(u) du are uniformly bounded by 1, letting K = max C0; C00 and de…ning the class of functions indexed by (x)Rt
0fn(u) du as , we
6Let F and G be classes of measurable functions. Then for any probability measure P and any 1 r 1, provided f 2 F : jfj Land g 2 G : jgj K
N[]("; F G; Lr(P )) N[] "
2L; F; Lr(P ) N[] "
2K; G; Lr(P ) (46)
have that the class is bounded by
log N[]("; ; jj:jj1) Kmnlog 4M Mf
"
which means there exists a set of functions n L
jfjL; UjfjU
o(4M Mf=")Kmn
j=1 such that the following two expres-sions hold for some j = 1; :::; 4M M" f Kmn
L
jfjL UjfjU
L
jfjL UjfjU
1 "=2 Then, the class of functions L is bounded by
log N[]("; L; jj:jj1) log N[]("; ; jj:jj1) + log N[]("; ; jj:jjE)
= Kmnlog 4M Mf
" + log 2
"
so that the class L is Glivenko-Cantelli. Moreover, L is Donsker. Then we can …nd cQb(mn) explicitly by calculating the integral below
Z 1 0
s
Kmnlog 4M Mf
" + log 2
" d"
which obtains a result of order O p
1 + mn . Therefore:
cQb(mn) = 1 + mn
n
1=2
(ii) The second part of condition B.1.4 states that
cQb(mn) = o (1) (47a)
jQ ( n 0) Q ( 0)j = o (1) (47b)
n = o (1) (47c)
(47a) holds since d is …xed and by construction mn ! 1 at a rate slower than n: (47b) is satis…ed by the continuity of Q ( ) and by n 0 ! 0 from Condition B.1.2(ii). (47c) holds with n small enough by the uniform convergence of the criterion function.
References
Aalen, O., O. Borgan, andH. Gjessing (2008): Survival and Event History Analysis: A Process Point of View. Springer.
Aalen, O., andN. Hjort (2002): “Frailty Models that Yield Proportional Hazards,” Statistics and Prob-ability Letters, 58, 335–342.
Abbring, J. (2012): “Mixed Hitting-Time Models,” Econometrica, 80, 783–819.
Ariga, K., Y. Ohkusa, andG. Brunello (1999): “Fast Track: Is It in the Genes? The Promotion Policy of a Large Japanese Firm,” Journal of Economic Behavior and Organization, 38(4), 385–402.
Attfield, M.,andT. Hodous (1992): “Pulmonary Function of US Coal Miners Related to Dust Exposure Estimates,” American Review of Respiratory Diseases, 145(3), 605 –609.
Berg, C., Y. Chen, andM. Ismail (2002): “Small Eigenvalues of Large Henkel Matrices: The Indetermi-nate Case,” Mathematica Scandinavica, 91(1), 67–81.
Bertoin, J. (1996): Lévy Processes. Cambridge University Press.
BLS (2008): Musculoskeletal Disorders and Days Away from Work in 2007 Washington D.C. Bureau of Labor Statistics.
Bound, J., T. Stinebrickner, andT. Waidmann (2010): “Health, Economic Resources, and the Work Decision of Older Men,” Journal of Econometrics, 156(1), 106–129.
Chen, X.,andD. Pouzo (2012): “Estimation of Nonparametric Conditional Moment Models With Possibly Nonsmooth Generalized Residuals,” Econometrica, 20(1), 277–321.
Christensen, B., andM. Kallestrup-Lamb (2012): “The Impact of Health Changes on Labor Supply:
Evidence from Merged Data on Individual Objective Medical Diagnosis Codes and Early Retirement Behavior,” Health Economics, 21, 56 –100.
Dabrowska, D. (2006): “Estimation in a Class of Semiparametric Transformation Models,”Lecture Notes-Monograph Series, 49, 131–169.
Dykstra, R., and P. Laud (1981): “A Bayesian Nonparametric Approach to Reliability,” Annals of Statistics, 9(2), 356–367.
Elbers, C., and G. Ridder (1982): “True and Spurious Duration Dependence: The Identi…ability of the Proportional Hazard Model,” Review of Economic Studies, 49(3), 403–409.
Gerbrich, S., T. Church, P. McGovern, H. Hansen, N. Nachreiner, M. Geisser, A. Ryan, S. Mongin, and G. Watt (2004): “An Epidemiological Study of the Magnitude and Consequences of Work Related Violence: The Minnesota Nurses Study,” Occupational and Environmental Medicine, 61(6), 495 –503.
Gibbons, R.,andM. Waldman (2004): “Task-Speci…c Human Capital,”The American Economic Review, 94(2), 203–207.
Gjessing, H., O. Aalen, and N. Hjort (2003): “Frailty Models Based on Lévy Processes,” Advances in Applied Probability, 35, 532–550.
Gnedin, A., and J. Pitman (2008): “Moments of Convex Distribution Functions and Completely Alter-nating Sequences,”in Probability and Statistics: Essays in Honor of David A. Freedman, ed. by D. Nolan, andT. Speed, pp. 30–41. Beachwood, Ohio, USA: Institute of Mathematical Statistics.
Heckman, J.,andB. Singer (1984): “Econometric Duration Analysis,”Journal of Econometrics, 24(1-2), 63–132.
Heckman, J., and C. Taber (1994): “Econometric Mixture Models and More General Models for Unob-servables in Duration Analysis,” Statistical Methods in Medical Research, 3(3), 279–302.
Honoré, B. (1991): “Identi…cation Results for Duration Models with Multiple Spells or Time-Varying Covariates,” working paper, Northwestern University.
Kambourov, G., and I. Manovskii (2009): “Occupational Speci…city of Human Capital,” International Economic Review, 50(1), 63–115.
Kebir, Y. (1991): “On Hazard Rate Processes,” Naval Research Logistics, 6(38), 865–876.
Kosorok, M. (2008): Introduction to Empirical Processes and Semiparametric Inference. Springer: New York.
McConnochie, K. (1990): “Environmental Lung Disorders: Mineral Pneumoconiosis,”in Radiologic Diag-nosis of Chest Disease, ed. by M. Sperber, M., pp. 386–401. Springer US.
Pope, C., R. Burnett, M. Thun, E. Calle, D. Krewski, K. Ito, and G. Thurston (2002): “Lung Cancer, Cardiopulmonary Mortality, and Long-Term Exposure to Fine Particulate Air Pollution,”Journal of the Americal Medical Association, 287, 1132–1141.
Shen, X., andW. Wong (1994): “Convergence Rate of Sieve Estimates,” The Annals of Statistics, 22(2), 580–615.
Singpurwalla, N. (1995): “Survival in Dynamic Environments,” Statistical Science, 10(1), 86–103.
(2006): “The Hazard Potential: Introduction and Overview,” Journal of the American Statistical Association, 101(476), 1705–1717.
Stock, J. (1988): “Estimating Continuous-Time Processes Subject to Time Deformation: An Application to Postwar U.S. GNP,” Journal of the American Statistical Association, 83(401), 77–85.
USEPA (2003): Fourth External Review Draft of Air Quality Criteria for Particulate Matter US Environ-mental Protection Agency, National Center for EnvironEnviron-mental Assessment, O¢ ce of Research and Devel-opment, Research Triangle Park NC.
Van den Berg, G. (2001): “Duration Models: Speci…cation, Identi…cation and Multiple Durations,” in Handbook of Econometrics, ed. by J. Heckman, and E. Leamer, vol. 5 of Handbook of Econometrics, chap. 55, pp. 3381–3460. Elsevier.
van der Vaart, A.,andJ. Wellner (1986): Weak Convergence and Empirical Processes with Application to Statistics. New York: Springer-Verlag.