

5.3.1 Performance of the approximate MLE for Gaussian-type DPPs

$$K^{\rho,\alpha}(x, y) = \rho \exp\left(-\frac{\|y - x\|^2}{\alpha^2}\right), \qquad (5.3.1)$$

where $\pi\rho\alpha^2 \leq 1$. Here, the parameter $\rho$ controls the number of points of the point process while $\alpha$ controls its repulsiveness. The bound on $\pi\rho\alpha^2$ is a consequence of the eigenvalues of $K$ lying in $[0, 1]$; it can be interpreted as a trade-off between the repulsiveness of the DPP and its density. The exponential decay of $K_0^{\rho,\alpha}$ ensures that this family satisfies Condition (H) when $\pi\rho\alpha^2$ is bounded by a constant strictly less than 1. Moreover, since this family satisfies (5.2.10), as explained in Section 5.2.5, we do not estimate $(\rho, \alpha)$ jointly by MLE. Instead, we estimate $\rho$ by $\hat\rho = N(X \cap W_n)/\mu(W_n)$ and $\alpha$ by the maximizer of $\alpha \mapsto \tilde{l}_n(X \mid \hat\rho, \alpha)$.
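As a minimal sketch of the first step described above, the following Python snippet builds the kernel matrix of (5.3.1) on a point configuration in dimension 2 and computes the intensity estimate $\hat\rho = N(X \cap W)/\mu(W)$. NumPy is assumed, and the function names `gaussian_kernel_matrix` and `estimate_rho` are illustrative; they do not come from the thesis, whose simulations were run in R. The points are uniform placeholders rather than a DPP realization, and the maximization of the approximate likelihood over $\alpha$ is not shown.

```python
import numpy as np

def gaussian_kernel_matrix(points, rho, alpha):
    """Matrix with entries rho * exp(-||x_i - x_j||^2 / alpha^2), as in (5.3.1).

    The check pi * rho * alpha^2 <= 1 is the constraint stated for dimension 2.
    """
    if np.pi * rho * alpha**2 > 1:
        raise ValueError("pi * rho * alpha^2 must be at most 1")
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff**2, axis=-1)
    return rho * np.exp(-sq_dist / alpha**2)

def estimate_rho(points, window_area):
    """Intensity estimate: number of observed points divided by the window area."""
    return len(points) / window_area

# Placeholder configuration: 100 uniform points on [0, 1]^2 (NOT a DPP sample).
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 1.0, size=(100, 2))

rho_hat = estimate_rho(pts, window_area=1.0)
K = gaussian_kernel_matrix(pts, rho_hat, alpha=0.03)
```

The diagonal of `K` equals `rho_hat`, which reflects that $K^{\rho,\alpha}(x, x) = \rho$ is the intensity of the process.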

Figure 10: Examples of realizations of a Gaussian-type DPP on $[0, 1]^2$ with parameters $\rho^* = 100$ and $\alpha^* \in \{0.01, 0.03, 0.05\}$.

We consider realizations of a DPP with Gaussian kernel where $\rho^* = 100$, $\alpha^* \in \{0.01, 0.03, 0.05\}$ and the observation window is either $[0, 1]^2$, $[0, 2]^2$ or $[0, 3]^2$. When $\rho^* = 100$, $\alpha$ takes values in the interval $(0, (10\sqrt{\pi})^{-1})$, where $(10\sqrt{\pi})^{-1} \approx 0.056$. Therefore, $\alpha^* = 0.01$ corresponds to a weakly repulsive Gaussian DPP, close to a Poisson point process, while $\alpha^* = 0.03$ corresponds to a mildly repulsive DPP and $\alpha^* = 0.05$ to a strongly repulsive DPP. Examples of such realizations are shown in Figure 10. We estimate $\alpha^*$ by the approximate MLE defined in (5.2.5) and compare it to its edge-corrected version defined in (5.2.9), as well as to minimum contrast estimators (MCE) based on the pair correlation function (pcf) or Ripley's K function (see [16]), both being common second-order moment estimators in spatial statistics. All simulations of the DPP were done in R [87] using the spatstat [7] package, and both moment estimators were computed with the function dppm of the same package.
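To make the three repulsiveness regimes concrete, the constraint $\pi\rho\alpha^2 \leq 1$ can be evaluated at $\rho^* = 100$. This is only arithmetic on the stated bound; the symbol $\alpha_{\max}$ and the ratio $\alpha^*/\alpha_{\max}$ are introduced here for illustration and are not notation from the thesis.

$$\pi\rho\alpha^2 \leq 1 \iff \alpha \leq \alpha_{\max} := \frac{1}{\sqrt{\pi\rho}}, \qquad \alpha_{\max} = \frac{1}{10\sqrt{\pi}} \approx 0.0564 \ \text{ for } \rho = 100,$$

$$\begin{aligned}
\alpha^* = 0.01 &: \quad \pi\rho^*(\alpha^*)^2 \approx 0.031, \quad \alpha^*/\alpha_{\max} \approx 0.18, \\
\alpha^* = 0.03 &: \quad \pi\rho^*(\alpha^*)^2 \approx 0.283, \quad \alpha^*/\alpha_{\max} \approx 0.53, \\
\alpha^* = 0.05 &: \quad \pi\rho^*(\alpha^*)^2 \approx 0.785, \quad \alpha^*/\alpha_{\max} \approx 0.89.
\end{aligned}$$

The last case sits close to the boundary of the parameter space, which is where the estimation is expected to be most delicate.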

Boxplots of the difference between the estimators $\hat\alpha$ and the true value $\alpha^*$ over 500 simulations in each of the nine cases are displayed in Figure 11, and the corresponding mean square errors are given in Table 5.
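For reference, the mean square error reported in Table 5 is the average of $(\hat\alpha - \alpha^*)^2$ over the simulations, scaled by $10^4$. The sketch below shows that computation in Python with NumPy; the function name `mse_times_1e4` and the four estimates are hypothetical and are not values from the thesis.

```python
import numpy as np

def mse_times_1e4(estimates, true_alpha):
    """Mean square error of the estimates around true_alpha, scaled by 10^4."""
    estimates = np.asarray(estimates, dtype=float)
    return 1e4 * np.mean((estimates - true_alpha) ** 2)

# Hypothetical estimates of alpha for a true value of 0.03 (illustration only).
alpha_hats = [0.028, 0.031, 0.033, 0.030]
scaled_mse = mse_times_1e4(alpha_hats, true_alpha=0.03)
```

With this scaling, a table entry of 0.81 corresponds to a root mean square error of $\sqrt{0.81 \times 10^{-4}} = 0.009$.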

Figure 11: Boxplots of $\hat\alpha - \alpha^*$ generated from 500 simulations of Gaussian-type DPPs (5.3.1) on windows $W = [0, 1]^2$, $[0, 2]^2$ or $[0, 3]^2$ with true parameters $\rho^* = 100$ and $\alpha^* = 0.01$, $0.03$ or $0.05$. The first two estimators are the approximate MLE from $\tilde{l}_n^T(X \mid \hat\rho, \alpha)$ and $\tilde{l}_n(X \mid \hat\rho, \alpha)$, while the last two are MCE based on the pair correlation function and Ripley's K function.

Note that, when $\alpha^* = 0.01$ and $\alpha^* = 0.03$, inference based on the approximate likelihood $\tilde{l}_n(X \mid \hat\rho, \alpha)$ outperforms moment-based inference for windows at least as large as $[0, 2]^2$. These results are expected from maximum likelihood based inference and show that hundreds of points are enough for $\tilde{l}_n(X \mid \hat\rho, \alpha)$ to be a good approximation of the true likelihood when the underlying DPP is not too repulsive. The difficulty lies in the case $\alpha^* = 0.05$, where the estimation is heavily biased: strong edge effects make $\tilde{l}_n(X \mid \hat\rho, \alpha)$ a poor approximation of the true likelihood for low values of $n$. As can be seen in Figure 12, $-\tilde{l}_n(X \mid \hat\rho, \alpha)$ is usually decreasing with respect to $\alpha$ when the underlying DPP is very repulsive. The estimator therefore often returns the highest value the parameter can take, which is $1/\sqrt{\pi\hat\rho}$, and this explains its behaviour.

$\alpha^*$   Window        MLE based on $\tilde{l}_n^T$   MLE based on $\tilde{l}_n$   MCE (pcf)   MCE (K)
0.01         $[0, 1]^2$    0.83                           1.25                         0.86        1.81
             $[0, 2]^2$    0.21                           0.24                         0.31        0.74
             $[0, 3]^2$    0.090                          0.095                        0.17        0.48
0.03         $[0, 1]^2$    0.81                           1.75                         0.77        1.17
             $[0, 2]^2$    0.18                           0.23                         0.27        0.46
             $[0, 3]^2$    0.079                          0.10                         0.17        0.23
0.05         $[0, 1]^2$    0.41                           0.54                         0.74        0.51
             $[0, 2]^2$    0.088                          0.28                         0.23        0.21
             $[0, 3]^2$    0.051                          0.20                         0.19        0.12

Table 5: Estimated mean square errors ($\times 10^4$) of $\hat\alpha$ for Gaussian-type DPPs on different windows and with different values of $\alpha^*$, each computed from 500 simulations. The first two columns correspond to estimators based on the approximate MLEs (5.2.5) and (5.2.9). The last two columns correspond to MCE estimators based on the pcf and Ripley's K function.

Figure 12: Comparison between $-\tilde{l}_n(X \mid 100, \alpha)$ (solid lines) and $-\tilde{l}_n^T(X \mid 100, \alpha)$ (dashed lines) as functions of $\alpha$, where $X$ has been simulated from a DPP on $[0, 1]^2$ with a Gaussian-type kernel (5.3.1) with true parameters $\rho^* = 100$ and, from left to right, $\alpha^* = 0.01$, $0.03$ or $0.05$.

On the other hand, the correction introduced in (5.2.9) gives more accurate values of the likelihood for high values of $\alpha$, as can be seen in Figure 12. This explains why this estimator outperforms the others in nearly every case, and especially in the most repulsive ones. The main limitation of this correction is that it is restricted to rectangular windows. However, these results suggest that, for a window with a different shape, a similar idea should give similar results: replace the Euclidean distance in the expression of $l_n(X \mid \theta)$ with a distance that brings points on the edge closer to each other.
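One distance with the property described above, for a rectangular window, is the periodic (toroidal) distance, in which opposite edges of the window are identified. The sketch below is written under the assumption that this is the kind of distance meant; this section does not spell out the exact form of the correction (5.2.9), so the code is an illustration and not the thesis's definition. NumPy is assumed, and the function name `torus_sq_dist` is hypothetical.

```python
import numpy as np

def torus_sq_dist(points, side_lengths):
    """Pairwise squared distances on a rectangle with opposite edges identified.

    Assumes every coordinate of `points` lies in [0, side_length] for its axis.
    """
    points = np.asarray(points, dtype=float)
    side = np.asarray(side_lengths, dtype=float)
    diff = np.abs(points[:, None, :] - points[None, :, :])
    # Along each axis, take the shorter of the direct and the wrapped path.
    diff = np.minimum(diff, side - diff)
    return np.sum(diff**2, axis=-1)

# Two points near opposite vertical edges of [0, 1]^2.
pts = np.array([[0.05, 0.5], [0.95, 0.5]])
d2 = torus_sq_dist(pts, side_lengths=(1.0, 1.0))
```

Here the two points are at Euclidean distance 0.9 but at toroidal distance 0.1, so a kernel evaluated with this distance treats them as close neighbours.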

Finally, the main drawback of the MLE is its heavy computation time, since it requires optimizing a function defined through the log-determinant of an $n \times n$ matrix, where $n$ is the number of observed points. For comparison, computing each MCE took less than 1 second on a regular computer in every case considered in Figure 11, whereas each computation of the approximate MLE took about one second when $W = [0, 1]^2$, about 20 seconds when $W = [0, 2]^2$ and about 100 seconds when $W = [0, 3]^2$.
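To illustrate where the cost comes from, the sketch below evaluates the log-determinant of a dense $n \times n$ matrix for $n = 100$, $400$ and $900$, the expected numbers of points $\rho^* \mu(W)$ on the three windows. With standard dense factorizations this costs on the order of $n^3$ operations per evaluation, and an optimizer over $\alpha$ repeats it many times. The matrix used here (identity plus a Gaussian Gram matrix on uniform points) is a stand-in chosen for numerical stability; it is an assumption and not the approximate likelihood of (5.2.5). NumPy is assumed, and the function names are illustrative.

```python
import numpy as np

def log_det(matrix):
    """Log-determinant of a matrix with positive determinant, via slogdet."""
    sign, logabsdet = np.linalg.slogdet(matrix)
    if sign <= 0:
        raise ValueError("determinant is not positive")
    return logabsdet

def gaussian_gram(points, alpha):
    """Matrix with entries exp(-||x_i - x_j||^2 / alpha^2)."""
    diff = points[:, None, :] - points[None, :, :]
    return np.exp(-np.sum(diff**2, axis=-1) / alpha**2)

rng = np.random.default_rng(1)
log_dets = {}
for n in (100, 400, 900):
    side = np.sqrt(n / 100)                      # windows [0,1]^2, [0,2]^2, [0,3]^2
    pts = rng.uniform(0.0, side, size=(n, 2))    # placeholder points, NOT a DPP sample
    stand_in = np.eye(n) + gaussian_gram(pts, alpha=0.03)
    log_dets[n] = log_det(stand_in)
```

Using `slogdet` instead of `log(det(...))` avoids overflow or underflow of the determinant itself for larger $n$.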

5.4 Proofs of Section 5.2

In this section, we give the proofs of the propositions stated in Section 5.2. We will repeatedly use the following two expressions for the expectation of a functional of $X$:

$$\mathbb{E}_{\theta^*}[f(X \cap W)] = \det(\mathrm{Id} - K_W^{\theta^*}) \sum_{k \geq 0} \frac{1}{k!} \int_{W^k} f(x) \det(L_W^{\theta^*}[x]) \, d\mu(x), \qquad (5.4.1)$$

$$\mathbb{E}_{\theta^*}\left[\sum^{\neq}_{x_1, \dots, x_k \in X \cap W} f(x_1, \dots, x_k)\right] = \int_{W^k} f(x) \det(K^{\theta^*}[x]) \, d\mu(x), \qquad (5.4.2)$$

for all compact $W$ and all symmetric, integrable $f : \bigcup_{n \geq 0} W^n \to \mathbb{R}$, where the symbol $\neq$ in the second expression means that the sum is over $k$-tuples of distinct points of $X \cap W$. Here, (5.4.1) is a consequence of Theorem 5.2.1, while (5.4.2) is the definition of the $k$-th order intensity function of a determinantal point process. In the proofs of Propositions 5.2.3 and 5.2.4, as explained in Section 5.2, we work with DPPs on $((\varepsilon\mathbb{Z})^d, \mu)$, where $\mu$ is the counting measure on $(\varepsilon\mathbb{Z})^d$, with kernels $K_\varepsilon^\theta(x, y) := \varepsilon^d K^\theta(x, y)$. In this case, the associated integral operator $K_\varepsilon$ is simply the infinite matrix $K_\varepsilon[(\varepsilon\mathbb{Z})^d]$, and $L_\varepsilon$ can be defined as the infinite matrix $K_\varepsilon[(\varepsilon\mathbb{Z})^d](\mathrm{Id} - K_\varepsilon[(\varepsilon\mathbb{Z})^d])^{-1}$ when $\|K_\varepsilon\| < 1$. Finally, by Lemma 5.A.5, $\varepsilon$ is chosen small enough that the DPPs are well-defined and satisfy the following assumptions.

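Condition $H_\varepsilon$: $\{K_\varepsilon^\theta, \theta \in \Theta\}$ is a compact family of stationary kernels on $((\varepsilon\mathbb{Z})^d, \mu)$, where $\mu$ is the counting measure on $(\varepsilon\mathbb{Z})^d$, satisfying

$$\inf_{\theta \in \Theta} \lambda_{\min}(K_\varepsilon^\theta) > 0 \quad \text{and} \quad \sup_{\theta \in \Theta} \lambda_{\max}(K_\varepsilon^\theta) < 1.$$

Moreover, there exist constants $C, C', \alpha, \alpha' > 0$ such that, for all $x \in (\varepsilon\mathbb{Z})^d$ and $\theta \in \Theta$,

$$|K_{0,\varepsilon}^\theta(x)| \leq \frac{C}{1 + \|x\|^{\alpha + d}} \quad \text{and} \quad |L_{0,\varepsilon}^\theta(x)| \leq \frac{C'}{1 + \|x\|^{\alpha' + d}}. \qquad (5.4.3)$$

In the proofs of Propositions 5.2.3 and 5.2.4, we mostly omit $\varepsilon$ from indices to avoid cluttering the notation. Finally, we will consider the following constants: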

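$$\lambda_m^\theta := \lambda_{\min}(K_\varepsilon^\theta), \quad \lambda_M^\theta := (1 - \lambda_{\max}(K_\varepsilon^\theta))^{-1} \quad \text{and} \quad A_\Theta := \sup_{\theta \in \Theta} \max(|\log(\lambda_m^\theta)|, |\log(\lambda_M^\theta)|).$$

In particular, $A_\Theta < +\infty$ under Condition $H_\varepsilon$.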
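The discretized objects above can be illustrated numerically on a finite piece of the lattice. The sketch below restricts $(\varepsilon\mathbb{Z})^2$ to a $20 \times 20$ grid, builds $K_\varepsilon = \varepsilon^d K$ for the Gaussian kernel (5.3.1), forms $L_\varepsilon = K_\varepsilon(\mathrm{Id} - K_\varepsilon)^{-1}$ and evaluates the constants for a single $\theta = (\rho, \alpha)$. The finite truncation, the values of $\varepsilon$, $\rho$ and $\alpha$, and all variable names are assumptions made for illustration; the proofs work with the infinite matrices, and $A_\Theta$ additionally takes a supremum over $\theta \in \Theta$. NumPy is assumed.

```python
import numpy as np

# Illustrative values (assumptions, not taken from the proofs).
eps, rho, alpha = 0.05, 100.0, 0.03

# Finite truncation of the lattice (eps * Z)^2: a 20 x 20 grid.
axis = eps * np.arange(20)
grid = np.array([(x, y) for x in axis for y in axis])
d = grid.shape[1]

# Discretized kernel K_eps(x, y) = eps^d * K(x, y), with K the Gaussian kernel (5.3.1).
diff = grid[:, None, :] - grid[None, :, :]
K_eps = eps**d * rho * np.exp(-np.sum(diff**2, axis=-1) / alpha**2)

# Spectrum of the truncated matrix; the eigenvalues lie strictly between 0 and 1 here.
eigvals = np.linalg.eigvalsh(K_eps)

# L_eps = K_eps (Id - K_eps)^{-1}; the two factors commute, so a linear solve suffices.
identity = np.eye(len(grid))
L_eps = np.linalg.solve(identity - K_eps, K_eps)

# Constants for this single theta.
lam_m = eigvals.min()
lam_M = 1.0 / (1.0 - eigvals.max())
A_theta = max(abs(np.log(lam_m)), abs(np.log(lam_M)))
```

The identity $L_\varepsilon - L_\varepsilon K_\varepsilon = K_\varepsilon$ gives a quick consistency check on the computed matrix, for instance with `np.allclose(L_eps - L_eps @ K_eps, K_eps)`.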

In document Statistiques asymptotiques des processus ponctuels déterminantaux stationnaires et non stationnaires (Page 97-101)