
EXPLORATORY DATA ANALYSIS

6.4 Quadrat counting

If it is suspected that the intensity may be inhomogeneous, it can be estimated nonparametrically by techniques such as quadrat counting and kernel estimation.

6.4.1 Quadrat counts

A simple way to check for inhomogeneity is to check whether regions of equal area contain roughly equal numbers of points (as they must do if the point process is homogeneous).

In quadrat counting, the observation window $W$ is divided into subregions $B_1, \ldots, B_m$ called quadrats.[2] For simplicity, suppose the regions have equal area. We count the numbers of points falling in each quadrat, $n_j = n(x \cap B_j)$ for $j = 1, \ldots, m$. Since these counts are unbiased estimators of the corresponding expected values $E[n(X \cap B_j)]$, they should be equal ‘on average’ if the intensity is homogeneous. In particular, any apparent spatial trend in the counts $n_j$ suggests that the intensity is inhomogeneous.
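The counting step can be sketched in base R without any point-pattern package. This is only an illustration under stated assumptions: the points are simulated uniformly in a 9.6 by 10 rectangle (they are not real data), and the names npts, xbreaks, ybreaks and counts are introduced here for the example.

```r
# Simulated points in a 9.6 x 10 window (illustrative, not real data)
set.seed(42)
npts <- 71
x <- runif(npts, min = 0, max = 9.6)
y <- runif(npts, min = 0, max = 10)

# Boundaries of a 3 x 3 grid of equal-area rectangular quadrats
xbreaks <- seq(0, 9.6, length.out = 4)
ybreaks <- seq(0, 10, length.out = 4)

# n_j = number of points falling in quadrat B_j
counts <- table(y = cut(y, ybreaks, include.lowest = TRUE),
                x = cut(x, xbreaks, include.lowest = TRUE))
counts

# Every point lies in exactly one quadrat, so the counts sum to npts
sum(counts)
```

Note that table lists the y intervals in increasing order, so the first row of this table is the bottom row of the window.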

Quadrat counting is performed in spatstat by the function quadratcount.

> swp <- rescale(swedishpines)

> Q3 <- quadratcount(swp, nx=3, ny=3)

> Q3
             x
y             [0,3.2] (3.2,6.4] (6.4,9.6]
  (6.67,10]         8         6         7
  (3.33,6.67]       8        11         9
  [0,3.33]          5         6        11

The value returned by quadratcount is an object belonging to the special class "quadratcount".

We used the print method for this class to obtain the text output above, and the plot method to get the display in the left panel of Figure 6.6. Note that the plot and print methods display the counts in the same spatial arrangement.

Figure 6.6. Quadrat counting for Swedish Pines data. Left: quadrat counts. Right: intensity estimates (points per square metre).

The arguments nx,ny to quadratcount specify that the quadrats should be an nx by ny grid of rectangles. Other options are discussed below.

[2] A ‘quadrat’ (German for ‘square’) originally meant a rectangular wooden frame used for random sampling in the field. It now refers to a spatial sampling region of any shape. A quadrat is very different from a ‘quadrant’, which is one-quarter of the infinite Euclidean plane.

In choosing the size of quadrats, there is a tradeoff between bias and variability: choosing larger quadrats reduces the relative error (standard error divided by mean) of the counts $n_j$ but also obliterates the spatial variation in intensity within each quadrat.
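The variability side of this tradeoff can be illustrated numerically. The sketch below assumes, purely for illustration, that the counts are Poisson, so that the relative error of a count with mean $\mu$ is $1/\sqrt{\mu}$. It uses 71 points (the total of the counts in Figure 6.8) in a 9.6 by 10 window; the grid sizes and variable names are hypothetical choices.

```r
# Assumption: quadrat counts are Poisson, so sd = sqrt(mean)
lambda <- 71 / (9.6 * 10)                  # overall intensity, points per square metre
grid_sizes <- c(2, 3, 5, 10)               # k x k grids of equal quadrats
quadrat_area <- (9.6 * 10) / grid_sizes^2  # area of one quadrat in each grid
expected_count <- lambda * quadrat_area    # mean count per quadrat
relative_error <- 1 / sqrt(expected_count) # sd / mean for a Poisson count

data.frame(grid = grid_sizes, area = quadrat_area,
           expected = expected_count, rel_error = relative_error)
```

Finer grids resolve more spatial detail, but each count becomes relatively noisier.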

If the quadrat counts are divided by the areas of the corresponding quadrats, we obtain the average intensity in each quadrat, which is a simple estimate of the intensity function. The method intensity.quadratcount calculates these intensity estimates from a "quadratcount" object.

> intensity(Q3)
             x
y             [0,3.2] (3.2,6.4] (6.4,9.6]
  (6.67,10]    0.7500    0.5625    0.6562
  (3.33,6.67]  0.7500    1.0312    0.8437
  [0,3.33]     0.4687    0.5625    1.0312

Use intensity with the argument image=TRUE to obtain a pixel image that is an estimate of the intensity function. Setting L3 <- intensity(Q3, image=TRUE) and calling plot(L3) gives the display in the right panel of Figure 6.6.
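The arithmetic behind these intensity estimates is simply count divided by area. As a minimal base R sketch, the observed counts shown in Figure 6.8 can be divided by the common quadrat area, assuming a 9.6 by 10 window split into a 3 by 3 grid as in the interval labels above. The names counts, quadrat_area and lambda_hat are ours.

```r
# Observed counts from Figure 6.8, rows listed from top to bottom of the window
counts <- matrix(c(8,  6,  7,
                   8, 11,  9,
                   5,  6, 11), nrow = 3, byrow = TRUE)

# Each tile is 3.2 by 3.33 metres
quadrat_area <- (9.6 / 3) * (10 / 3)

# Average intensity in each quadrat (points per square metre)
lambda_hat <- counts / quadrat_area
round(lambda_hat, 4)
```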

Quadrat counts, in quadrats of equal size and shape, can also be used to calculate a standard error for the overall estimate of intensity. If we are not willing to assume a Poisson process, but are willing to assume that the counts in different quadrats are approximately independent variables with the same (unknown) distribution, we can apply the usual estimator of standard error for a mean:

> l3 <- as.numeric(intensity(Q3))

> sem <- sqrt(var(l3)/(length(l3)-1))

> sem
[1] 0.07118

For comparison, the estimated standard error assuming CSR was sdX = 0.08777.
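The two standard errors can be reproduced in base R from the counts shown in Figure 6.8. This is a sketch under assumptions: the window is taken to be 9.6 by 10, and the CSR value is assumed to come from the Poisson formula $\sqrt{n}/a$, which agrees with the quoted 0.08777. The code above divides by length(l3)-1; dividing by length(l3) is the more common convention for the standard error of a mean and gives a slightly smaller value, so both are shown.

```r
counts <- c(8, 6, 7, 8, 11, 9, 5, 6, 11)   # observed counts from Figure 6.8
window_area <- 9.6 * 10
m <- length(counts)
l3 <- counts / (window_area / m)           # intensity estimate in each quadrat

sem_source <- sqrt(var(l3) / (m - 1))      # divisor used in the code above
sem_usual  <- sqrt(var(l3) / m)            # common convention: divide by m
se_csr     <- sqrt(sum(counts)) / window_area  # assumed Poisson formula sqrt(n)/a

c(sem_source = sem_source, sem_usual = sem_usual, se_csr = se_csr)
```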

In general, the quadrats could have unequal sizes and shapes. The alternative argument tess to quadratcount allows the quadrats to be any tessellation of the window. For example, hexagonal tiles can be created using hextess:

> H <- hextess(swp, 1)

> hQ <- quadratcount(swp, tess=H)

The counts are plotted in the left panel of Figure 6.7. Since the quadrat areas are not all equal, the counts of points in these quadrats should not be compared directly. Under the assumption of homogeneity, equation (6.1), the expected count in each quadrat is proportional to the area of the quadrat. The average intensity (6.2) in each quadrat is an unbiased estimator of the homogeneous intensity $\lambda$. For exploratory purposes, we can plot the average intensity in each quadrat. The right panel of Figure 6.7 shows a plot of intensity(hQ, image=TRUE).

Note that quadratcount is not designed to handle very large numbers of quadrats. To count the number of points falling in each pixel of a fine grid of pixels, use pixellate.

6.4.2 Quadrat counting test of homogeneity

Historically the swedishpines dataset has been analysed under the assumption of homogeneous intensity [643, 575]. However, the quadrat counts in Figures 6.6 and 6.7 suggest the intensity may be slightly elevated along a diagonal swath from top left to bottom right of the plot.

One way to assess the evidence for inhomogeneity is to conduct a formal test of statistical significance. The principles of hypothesis tests are explained in Chapter 10: here we just run through the procedure for the test.

Figure 6.7. Quadrat counting with hexagonal quadrats. Left: quadrat counts superimposed on point pattern. Right: intensity estimates (points per square metre). Swedish Pines data.

The null hypothesis is that the intensity is homogeneous, and the alternative hypothesis is that the intensity is inhomogeneous in some unspecified fashion. For practical purposes we will assume, provisionally, that the point process is Poisson. Then the null hypothesis is CSR and the alternative is an inhomogeneous Poisson process.

As before we divide the window $W$ into quadrats $B_1, \ldots, B_m$ and count the numbers of points $n_1, \ldots, n_m$ in each quadrat. If the null hypothesis is true, the $n_j$ are realisations of independent Poisson random variables with expected values $\mu_j = \lambda a_j$, where $\lambda$ is the unknown intensity and $a_j$ is the area of $B_j$. If the quadrats all have equal area $a_j = a$, then the counts $n_j$ are independent Poisson random variables with equal mean $\lambda a$.

The $\chi^2$ (chi-squared) test could be used in two different ways here: to test goodness-of-fit to the Poisson distribution assuming homogeneity [113], or to test homogeneity assuming independence [296]. Our focus is on homogeneity, so we shall do the latter, applying the ‘$\chi^2$ test of uniformity’.

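Given the total number of points $n = \sum_j n_j$, and the total window area $a = \sum_j a_j$, the estimated intensity is $\hat\lambda = n/a$ and the expected count in quadrat $B_j$ is $e_j = \hat\lambda a_j = n a_j / a$. The test statistic is

$$X^2 = \sum_j \frac{(n_j - e_j)^2}{e_j}.$$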

If the quadrats all have equal area, then the $n_j$ are independent with equal expected value under the null hypothesis. The test statistic reduces to

$$X^2 = \sum_j \frac{(n_j - n/m)^2}{n/m}. \qquad (6.6)$$

Under the null hypothesis, the distribution of the test statistic is approximately a $\chi^2$ distribution with $m - 1$ degrees of freedom. The approximation is traditionally deemed to be acceptable when the expected counts $e_j$ are greater than 5 for all quadrats.
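As a worked check of (6.6), the statistic can be computed by hand in base R from the 3 by 3 counts shown in Figure 6.8. The two-sided p-value below is taken as twice the smaller tail probability; that convention is an assumption on our part, chosen because it agrees with the p-value reported for these data. All variable names are ours.

```r
counts <- c(8, 6, 7, 8, 11, 9, 5, 6, 11)   # observed counts from Figure 6.8
n <- sum(counts)                            # total number of points
m <- length(counts)                         # number of quadrats
e <- n / m                                  # common expected count under CSR

X2 <- sum((counts - e)^2 / e)               # equation (6.6)
df <- m - 1

p_upper <- pchisq(X2, df, lower.tail = FALSE)  # large X2: counts too variable
p_lower <- pchisq(X2, df, lower.tail = TRUE)   # small X2: counts too even
p_two_sided <- 2 * min(p_upper, p_lower)       # assumed two-sided convention

c(X2 = X2, df = df, p.value = p_two_sided)
```

Here the expected count is about 7.9 in every quadrat, so the traditional rule of thumb (expected counts greater than 5) is satisfied.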

The quadrat counting test is performed in spatstat by quadrat.test.

> tS <- quadrat.test(swp, 3,3)

> tS

Pearson X2 statistic
data: swp
X2 = 4.7, df = 8, p-value = 0.4
alternative hypothesis: two.sided
Quadrats: 3 by 3 grid of tiles

The value returned by quadrat.test is an object of class "htest" (the standard R class for hypothesis tests). Printing the object (as shown above) gives comprehensible output about the outcome of the test. Inspecting the p-value, we see that the test does not reject the null hypothesis of CSR for the Swedish Pines data. The p-value can also be extracted by

> tS$p.value
[1] 0.4169

The return value of quadrat.test also belongs to the special class "quadrattest". Plotting the object will display the quadrats, annotated by their observed and expected counts and the Pearson residuals. Figure 6.8 shows the result of plot(swp); plot(tS, add=TRUE). In each quadrat the observed counts $n_j$ are displayed at top left; expected counts $e_j$ at top right; and Pearson residuals $r_j = (n_j - e_j)/\sqrt{e_j}$ at bottom.
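The Pearson residuals can be recomputed directly from the observed counts. The following base R sketch uses the counts shown in Figure 6.8 and the equal-area expected count $e_j = n/m$; the names counts, e and pearson are introduced for illustration.

```r
# Observed counts from Figure 6.8, rows listed from top to bottom of the window
counts <- matrix(c(8,  6,  7,
                   8, 11,  9,
                   5,  6, 11), nrow = 3, byrow = TRUE)

e <- sum(counts) / length(counts)     # expected count, the same in every quadrat
pearson <- (counts - e) / sqrt(e)     # Pearson residuals r_j
round(pearson, 2)

# The squared residuals add up to the Pearson X2 statistic
sum(pearson^2)
```

Residuals much larger than 2 in absolute value would point to quadrats that depart from the null hypothesis; none of the residuals here is that large.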

Figure 6.8. Quadrat counting test of CSR for Swedish Pines data. Observed counts (rows from top to bottom): 8 6 7; 8 11 9; 5 6 11. Expected count in every quadrat: 7.9. Pearson residuals: 0.04 −0.67 −0.32; 0.04 1.1 0.4; −1 −0.67 1.1.

Other arguments to quadrat.test make it possible to conduct a one-sided test, and to compute the p-value using Monte Carlo simulation instead of the $\chi^2$ approximation.

> quadrat.test(swp, 5, alternative="regular", method="MonteCarlo")
Conditional Monte Carlo test of CSR using quadrat counts
Pearson X2 statistic
data: swp
X2 = 18, p-value = 0.2
alternative hypothesis: regular
Quadrats: 5 by 5 grid of tiles
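The idea of a conditional Monte Carlo test can be sketched in base R. Under CSR with quadrats of equal area, and conditional on the total number of points, the counts are multinomial with equal cell probabilities. The sketch below is an assumption-laden illustration, not necessarily the procedure implemented in quadrat.test: it uses the 3 by 3 counts from Figure 6.8 (the 5 by 5 counts are not listed here), an arbitrary seed and number of simulations, and treats small values of $X^2$ as evidence of regularity.

```r
set.seed(2024)                              # arbitrary seed, for reproducibility
counts <- c(8, 6, 7, 8, 11, 9, 5, 6, 11)   # observed counts from Figure 6.8
n <- sum(counts)
m <- length(counts)
e <- n / m

pearson_X2 <- function(k) sum((k - e)^2 / e)
X2_obs <- pearson_X2(counts)

# Simulate counts under CSR, conditional on the total n
nsim <- 1999
sim_counts <- rmultinom(nsim, size = n, prob = rep(1 / m, m))  # m x nsim matrix
X2_sim <- apply(sim_counts, 2, pearson_X2)

# Alternative "regular": counts more even than CSR, so small X2 is extreme
p_regular <- (1 + sum(X2_sim <= X2_obs)) / (nsim + 1)
p_regular
```

Because the p-value is estimated by simulation, it changes slightly with the seed and with nsim.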

The function quadrat.test is generic, with methods for "ppp" objects (which we have used above), but also for "splitppp" and "quadratcount" objects. It is possible to perform a $\chi^2$ test using previously computed counts, for example the counts Q3 above:

> quadrat.test(Q3)

Chi-squared test of CSR using quadrat counts
Pearson X2 statistic
data:
X2 = 4.7, df = 8, p-value = 0.4
alternative hypothesis: two.sided
Quadrats: 3 by 3 grid of tiles

The results of several quadrat tests can also be pooled. For example, suppose an ecologist has recorded the spatial pattern of trees in three separate plots in the same forest. The data from each plot have been subjected to a quadrat counting test as described above. Then an overall test of uniform intensity is performed by applying pool.quadrattest to the three test results:

test1 <- quadrat.test(X1, 3)
test2 <- quadrat.test(X2, 3)
test3 <- quadrat.test(X3, 5)
pool(test1, test2, test3)

The quadrat test of homogeneity can be generalised to a test of any model for the intensity: see Section 10.4. More powerful tests become available if it is possible to specify a covariate upon which the intensity might depend (instead of being homogeneous). Specifying such a covariate makes the alternative hypothesis more precise, allowing the analyst to choose a more powerful test. Covariate-dependent quadrat tests are discussed in Sections 6.7.1 and 10.4. Other covariate-dependent tests are introduced in Sections 6.7 and 10.5.

The $\chi^2$ test is simple to apply, but it is not necessarily the best test to apply to the quadrat counts.

Other options are the likelihood ratio test and the Cressie-Read [566] divergence family, which can be selected in quadrat.test using the argument CR. The values CR=1, CR=0, and CR=-1/2 correspond to the $\chi^2$ test, likelihood ratio test, and Freeman-Tukey test, respectively.
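To see how the three named tests arise from a single family, the sketch below implements the standard power-divergence statistic $\frac{2}{\mathrm{CR}(\mathrm{CR}+1)} \sum_j n_j \left[ (n_j/e_j)^{\mathrm{CR}} - 1 \right]$ in base R, with the case CR = 0 handled as a limit. The function cressie_read is a hypothetical helper written for this illustration and is not part of spatstat; the counts are those shown in Figure 6.8.

```r
# Power-divergence statistic with exponent CR (CR = -1 is not handled here)
cressie_read <- function(counts, expected, CR) {
  if (CR == 0) {
    # Limit CR -> 0: likelihood ratio statistic; empty quadrats contribute 0
    pos <- counts > 0
    return(2 * sum(counts[pos] * log(counts[pos] / expected[pos])))
  }
  2 / (CR * (CR + 1)) * sum(counts * ((counts / expected)^CR - 1))
}

counts <- c(8, 6, 7, 8, 11, 9, 5, 6, 11)   # observed counts from Figure 6.8
expected <- rep(sum(counts) / length(counts), length(counts))

cr_stats <- c(pearson       = cressie_read(counts, expected, 1),
              likelihood    = cressie_read(counts, expected, 0),
              freeman_tukey = cressie_read(counts, expected, -1/2))
cr_stats
```

With CR = 1 the formula reduces algebraically to the Pearson statistic, and with CR = -1/2 to $4 \sum_j (\sqrt{n_j} - \sqrt{e_j})^2$, provided the expected counts sum to $n$.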

6.4.3 Critique

Since this technique is often used in the applied literature, a few comments are appropriate.

The main critique of the quadrat test described above is the lack of information. This is a goodness-of-fit test in which the alternative hypothesis $H_1$ is simply the negation of $H_0$, that is, the alternative is that ‘the process is not a homogeneous Poisson process’. A point process may fail to be a homogeneous Poisson process either because it fails to have homogeneous intensity, or because it violates the property of independence between points. There are too many types of departure from $H_0$.

The usual justification for the classical $\chi^2$ goodness-of-fit test is to assume that the counts are independent, and derive a test of the null hypothesis that all counts have the same expected value.

Invoking it here is slightly naive, since the independence of counts is also open to question.

Indeed we can also turn things around and view the $\chi^2$ test as a test of the independence property of the Poisson process, assuming that the intensity is homogeneous. The Pearson $\chi^2$ test statistic (6.6) coincides, up to a constant factor, with the sample variance-to-mean ratio of the counts $n_j$, which is often interpreted as a measure of over-/underdispersion of the counts $n_j$ assuming they have constant mean.
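The constant factor is $m - 1$ when the sample variance uses the usual divisor $m - 1$, since $\sum_j (n_j - \bar{n})^2 = (m-1) s^2$ and $\bar{n} = n/m$. A quick numerical check in base R, using the counts shown in Figure 6.8 (variable names are ours):

```r
counts <- c(8, 6, 7, 8, 11, 9, 5, 6, 11)   # observed counts from Figure 6.8
m <- length(counts)

X2  <- sum((counts - mean(counts))^2 / mean(counts))  # statistic (6.6)
vmr <- var(counts) / mean(counts)                     # sample variance-to-mean ratio

c(X2 = X2, vmr = vmr, ratio = X2 / vmr)               # ratio equals m - 1
```

For a Poisson count the variance equals the mean, so a ratio well above 1 suggests overdispersion and a ratio well below 1 suggests underdispersion.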

The power of the quadrat test depends on the size of quadrats, and is optimal when the quadrats are neither very large nor very small. The power also depends on the alternative hypothesis, in particular on the ‘spatial scale’ of any departures from the assumptions of constant intensity and independence of points. The choice of quadrat size is also an implicit choice of spatial scale, because

where it is recommended to compute the variance-to-mean ratio or $\chi^2$ statistic for different sizes of quadrats, and to plot the statistic against quadrat size [296, 462]. We return to the topic of spatial scale in Chapter 7.

6.5 Smoothing estimation of intensity function