Estimators of the process - Efficient nonparametric inference for discretely observed compound

Most of the discussion so far has concerned estimating the parameters separately. To make inference on the whole underlying compound Poisson process, one typically shows joint convergence of the estimators of all the parameters. When $dF$ has an absolutely continuous component, and with the exception of the work of Coca [2015], this has not been addressed in the literature discussed above. There are two different reasons. On the one hand, Buchmann and Grübel [2003] and van Es et al. [2007] assumed $\gamma = 0$ and that the intensity is known. On the other, Duval [2013a] and Comte et al. [2014] performed estimation in the high-frequency regime, so as $\Delta \to 0$ the setting tends to the continuous observation regime of Section 1.3 and the estimators of the parameters are asymptotically independent. Notice that, due to the asymptotic properties of our estimator for $\gamma$ included in the previous section, making inference on the process reduces to making joint inference on $F$ and $\lambda$. This is in line with the remark in Trabs [2015a], where it was pointed out that the addition of a drift to the model does not affect the information lower bounds. The exact asymptotic dependence of the estimators of $F$ and $\lambda$, and of the estimators for the mass of the atoms, was given by Coca [2015] and is part of the main result of Chapter 3. Therefore, joint estimation is possible using these results, and in Section 4.4 we show their joint behaviour in practice. We remark that, when only a discrete component is present, joint estimation was addressed by Buchmann and Grübel [2003] and Buchmann and Grübel [2004], and is asymptotically equivalent to that following from our results.

parameter. This is precisely the role of the Lévy distribution $N$ introduced in Section 2.2, which carries all the information of the process. There, we constructed an estimator of it, namely $\hat N_n$. In Coca [2015] and in Chapter 3, we show that, under the same assumptions considered in Section 2.2 and as $n \to \infty$, the estimator is well-defined in sets of probability approaching 1 and

$$\sqrt{n}\,\big(\hat N_n - N\big) \to^{\mathcal D} \mathbb B^N \quad \text{in } \ell^\infty(\mathbb R), \tag{2.5.1}$$

where $\mathbb B^N$ is a centred Gaussian process with covariance structure

$$\Sigma^{\mathbb B^N}_{x,y} := \frac{1}{\Delta^2} \int_{\mathbb R} \Big(f_x^{(N)} * \mathcal F^{-1}\big[\varphi^{-1}(-\cdot)\big]\Big)(z)\, \Big(f_y^{(N)} * \mathcal F^{-1}\big[\varphi^{-1}(-\cdot)\big]\Big)(z)\, dG(z), \qquad x, y \in \mathbb R,$$

with $f_x^{(N)} := 1_{(-\infty,x]}\,1_{\mathbb R \setminus \{0\}}$, $x \in \mathbb R$. As we see in the next section, the covariance structure of this process coincides with the information lower bound developed by Trabs [2015a], so $\hat N_n$ is asymptotically efficient and optimal estimation of the process can also be achieved this way. Furthermore, and as pointed out in Coca [2015], it turns out to give important insights that allow us to justify why our estimators are efficient and others are not.
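As a concrete picture of the discrete observation scheme underlying (2.5.1), the following minimal Python sketch simulates the increments $Z_k = X_{k\Delta} - X_{(k-1)\Delta}$ of a compound Poisson process. It is only an illustration under stated assumptions: zero drift ($\gamma = 0$), standard normal jumps, and the function name `simulate_increments` and all parameter values are hypothetical choices that do not come from the text.

```python
import numpy as np

def simulate_increments(n, delta, lam, jump_sampler, rng):
    """Draw n increments of a driftless compound Poisson process observed every delta.

    jump_sampler(m, rng) is an assumed helper returning m i.i.d. jump sizes.
    """
    counts = rng.poisson(lam * delta, size=n)   # number of jumps in each window
    jumps = jump_sampler(counts.sum(), rng)     # all jump sizes, drawn i.i.d.
    window = np.repeat(np.arange(n), counts)    # window index of each jump
    # Sum the jumps window by window; windows without jumps stay exactly 0.
    return np.bincount(window, weights=jumps, minlength=n)

rng = np.random.default_rng(0)
n, delta, lam = 200_000, 1.0, 0.8               # illustrative values only
z = simulate_increments(n, delta, lam, lambda m, r: r.normal(size=m), rng)
zero_fraction = np.mean(z == 0.0)               # share of null observations
```

With no drift and an absolutely continuous jump law, an increment is zero exactly when no jump falls in its window, which has probability $e^{-\lambda\Delta}$, so `zero_fraction` can be compared with that value.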

Moreover, note the similarity between $\Sigma^{\mathbb B^F}_{x,y}$ and the covariance $\Sigma^{G}_{x,y}$ from Nickl and Reiß [2012] given in Section 2.2.2. Indeed, $N^G$ is the generalised version of $N$ and next we see that, under the assumptions in Nickl and Reiß [2012], the two estimators are, up to exponentially negligible terms, the same. The strategy is to convert $\hat N_n(x)$ into $\hat N_n^G(x)$ and, for simplicity, we assume $\gamma = 0$ and $x \le -\zeta < 0$. Then, we can introduce any of the estimators of the intensity, say $\hat\lambda_n$, in the Fourier inverse in (2.2.14) by simply adding and subtracting a (uniformly in $x$) negligible term, and

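$$
\begin{aligned}
\hat N_n(x) &\approx \frac{1}{\Delta} \int_{-\infty}^{x} \mathcal F^{-1}\Big[\big(\mathrm{Log}(\mathcal F dG_n) + \hat\lambda_n\big)\, \mathcal F K_{h_n}\Big](y)\, dy \\
&= \frac{1}{\Delta} \int_{-\infty}^{x} \frac{1}{iy}\, iy\, \mathcal F^{-1}\Big[\big(\mathrm{Log}(\mathcal F dG_n) + \hat\lambda_n\big)\, \mathcal F K_{h_n}\Big](y)\, dy \\
&= \frac{1}{\Delta} \int_{-\infty}^{x} \frac{1}{iy}\, \mathcal F^{-1}\Big[\big(\mathrm{Log}(\mathcal F dG_n) + \hat\lambda_n\big)'\, \mathcal F K_{h_n} + \big(\mathrm{Log}(\mathcal F dG_n) + \hat\lambda_n\big)\, (\mathcal F K_{h_n})'\Big](y)\, dy,
\end{aligned}
$$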

where the equalities hold in sets of probability approaching 1 as $n \to \infty$. The first summand in the Fourier inverse gives rise to exactly $\hat N_n^G(x)$ because of the finite moment assumption, so we need to show that the second is uniformly negligible. In the next chapter, and in Nickl and Reiß [2012], we assume the tails of $K$ decay faster than quadratically. Therefore, $(\mathcal F K_{h_n})' = i\,\mathcal F[\,\cdot\, K_{h_n}] = i h_n\, \mathcal F[(\cdot K)_{h_n}]$ and this is supported in $[-h_n^{-1}, h_n^{-1}]$, so $(\mathcal F K_{h_n})' \in L^r(\mathbb R)$ for all $r \ge 1$. In this compact interval, $\mathrm{Log}(\mathcal F dG_n) + \hat\lambda_n$ is bounded above and below in sets of probability approaching 1 and its product with $(\mathcal F K_{h_n})'$ has $L^2(\mathbb R)$ norm of order $h_n^{1/2}$ if $\cdot K \in L^2(\mathbb R)$, which we will assume. Furthermore, the supremum in $x \le -\zeta < 0$ of the $L^2(\mathbb R)$ norm of $y^{-1} 1_{(-\infty,x]}(y)$ is bounded above and therefore the second summand in the last display is of order $h_n$ uniformly in $x$, which we will take to decay to zero exponentially fast. We note that, if $\hat N_n^G(x)$, $x \le -\zeta < 0$, is rescaled by $\hat\lambda_n$ then, by the same arguments, it is equal to $\hat F_n(x)$ up to uniformly and exponentially negligible additive terms.
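The bandwidth scaling of the $L^2(\mathbb R)$ norm of $(\mathcal F K_{h_n})'$ used in this argument can be checked numerically in the Fourier domain. The sketch below is an illustration only and rests on assumptions that are not in the text: the convention $K_h = h^{-1} K(\cdot/h)$, so that $\mathcal F K_h(u) = \mathcal F K(hu)$, and a hypothetical kernel with $\mathcal F K(u) = (1 - u^2)^3$ on $[-1, 1]$ and zero elsewhere, which is smooth enough for the tails of $K$ to decay faster than quadratically. All function names are illustrative.

```python
import numpy as np

def ft_kernel_prime(u):
    """Derivative of the assumed FK(u) = (1 - u^2)^3 on [-1, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, -6.0 * u * (1.0 - u**2) ** 2, 0.0)

def l2_norm_of_derivative(h, n_grid=200_001):
    """L2 norm of (FK_h)' computed with the trapezoid rule on [-1/h, 1/h]."""
    # FK_h(u) = FK(h u), hence (FK_h)'(u) = h * (FK)'(h u), supported in [-1/h, 1/h].
    u = np.linspace(-1.0 / h, 1.0 / h, n_grid)
    g = h * ft_kernel_prime(h * u)
    du = u[1] - u[0]
    integral = du * (np.sum(g**2) - 0.5 * (g[0] ** 2 + g[-1] ** 2))
    return np.sqrt(integral)

bandwidths = [0.5, 0.1, 0.02]                   # illustrative bandwidths
# If the norm is of order h^{1/2}, these ratios do not depend on h.
ratios = [l2_norm_of_derivative(h) / np.sqrt(h) for h in bandwidths]
```

The ratios agree across bandwidths because the change of variables $v = hu$ gives $\|(\mathcal F K_h)'\|_{L^2}^2 = h\, \|(\mathcal F K)'\|_{L^2}^2$, which is the $h_n^{1/2}$ rate quoted above for this particular kernel.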

Another interesting question is whether, as $\Delta \to 0$ and under the assumptions of Nickl et al. [2016], the appropriate modification of $\hat N_n$ is, up to uniformly negligible terms, equal to the estimators $\tilde N_n$ and $\hat N_n$. This is indeed the case, and for the latter it can be argued as in the previous paragraph assuming $K$ decays fast enough. For the former, we argue as follows: the appropriate modification of $\hat N_n$ is

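$$\frac{1}{\Delta} \int_{-\infty}^{x} \rho(y)\, y^2\, \mathcal F^{-1}\big[\mathrm{Log}(1 + \mathcal F dG_n - 1)\, \mathcal F K_{h_n}\big](y)\, dy.$$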

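Arguing as at the end of Section 2.4.2, this can be shown to be, up to terms that are negligible uniformly in $x$, equal to

$$\frac{1}{\Delta} \int_{-\infty}^{x} \rho(y)\, y^2\, \mathcal F^{-1}\big[(\mathcal F dG_n - 1)\, \mathcal F K_{h_n}\big](y)\, dy = \frac{1}{\Delta} \int_{-\infty}^{x} \rho(y)\, y^2\, dG_n * K_{h_n}(y)\, dy - \frac{1}{\Delta} \int_{-\infty}^{x} \rho(y)\, y^2\, K_{h_n}(y)\, dy.$$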

As expected, the first summand is the kernel-regularised version of $\tilde N_n$, and the second is uniformly negligible because $\rho(y) \le C(1 \wedge y^{-2})$ for some $C > 0$ by assumption on $\rho$. Hence, we also conclude that the existing estimators developed from the spectral approach are equal up to uniformly negligible terms. We remark that, in the compound Poisson case, the Lévy measure has no singularity at the origin and the same calculation shows that, as $\Delta \to 0$, our estimator $\hat N_n$ is asymptotically equivalent to the cumulative function of the linear Lévy density estimator

$$\frac{1}{\Delta n} \sum_{k=1}^{n} K_{h_n}(\cdot - Z_k)\, 1_{\{0\}^c}(Z_k)$$

(note the similarity of this estimator with the intensity and drift estimators in (2.4.6) and (2.4.11)). This estimator and its wavelet versions resemble those of Comte et al. [2014] and Duval [2013a] with $I = 1$ (cf. (2.1.14) and the discussions after it), with the distinction that they use the non-zero observations, conditioning on the total number of them, while the last display takes all observations and automatically discards the null ones. We note in passing that in this form they have not been studied in the literature and preliminary calculations indicate they enjoy the same features (minimax optimality, concentration of measure, etc.) as the respective estimators in the standard i.i.d. case. Furthermore, if we rescale our estimator $\hat N_n$ by $\hat\lambda_n$, the resulting estimator is asymptotically equivalent to the empirical

In document Efficient nonparametric inference for discretely observed compound Poisson processes (Page 64-67)