Point Process Methods
5.2 Basic definitions ∗
Here we define some mathematical notation and give a few basic definitions. For full details see [198, 199, 484].
5.2.1 Basic notation
For most of this book we are working in the two-dimensional plane of Euclidean geometry, denoted by ℝ². A point location in the plane is denoted by a lower case letter like u. Any location u can be specified by its Cartesian coordinates u = (u1, u2); we shall usually not need to mention the coordinates explicitly. The Euclidean distance between two points u = (u1, u2) and v = (v1, v2) is
\[ \|u - v\| = \left[ (u_1 - v_1)^2 + (u_2 - v_2)^2 \right]^{1/2}. \]
A ‘region’ is a subset of the plane, denoted by a capital letter like A. The rectangle A = [a,b] × [c,d] is the set of all points u = (u1, u2) with a ≤ u1 ≤ b and c ≤ u2 ≤ d. For example [0,1] × [0,1] = [0,1]² is the unit square with bottom-left corner at the origin. The disc with centre point u and radius r > 0 is
\[ b(u,r) = \{ v : \|u - v\| \le r \}, \]
the set of points lying at most r units away from u. The circle with centre u and radius r is the boundary of the disc,
\[ \partial b(u,r) = \{ v : \|u - v\| = r \}, \]
the set of points lying exactly r units away from u.
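The distance and disc definitions translate directly into a few lines of base R. The following is only a minimal sketch: the function names euclid_dist, in_disc and on_circle are hypothetical helpers introduced for illustration (they are not spatstat functions), and the tolerance tol is an assumption needed because exact equality is unreliable in floating-point arithmetic.

```r
# Euclidean distance between two locations u = (u1, u2) and v = (v1, v2).
euclid_dist <- function(u, v) {
  sqrt((u[1] - v[1])^2 + (u[2] - v[2])^2)
}

# TRUE if location v lies in the closed disc b(u, r).
in_disc <- function(v, u, r) {
  euclid_dist(u, v) <= r
}

# TRUE if location v lies on the circle (the boundary of the disc),
# up to a small numerical tolerance 'tol' (an assumption of this sketch).
on_circle <- function(v, u, r, tol = 1e-12) {
  abs(euclid_dist(u, v) - r) <= tol
}

u <- c(0, 0)
v <- c(3, 4)
euclid_dist(u, v)        # 5
in_disc(v, u, r = 5)     # TRUE: the disc includes its boundary
in_disc(v, u, r = 4.9)   # FALSE
on_circle(v, u, r = 5)   # TRUE
```

Because the disc is defined with ≤, a point lying exactly on the circle belongs to both the disc and its boundary.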
A point pattern is denoted by a bold lower case letter like x. It is a set x = {x1, x2, ..., xn} of points xi in two-dimensional space ℝ². The number n = n(x) of points in the pattern is not fixed in advance, and may be any finite nonnegative number including zero. In practice, the data points are obviously recorded in some order x1, ..., xn; but this ordering is artificial, and we treat the pattern x as an unordered set of points. Duplicated points are allowed, that is, it is possible that xi = xj for two different indices i and j. However, many methods in this book require that there should be no duplicated points.
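As a small illustration of the remark about duplicated points, a pattern can be stored as a two-column matrix of coordinates and checked for coincident points in base R. This is only a sketch with made-up coordinates; the variable names are hypothetical, and exact equality of coordinates is assumed to be the relevant notion of duplication.

```r
# Made-up coordinates of a pattern with four points; points 2 and 4 coincide.
x <- c(0.1, 0.5, 0.9, 0.5)
y <- c(0.2, 0.5, 0.3, 0.5)
xy <- cbind(x, y)

n_points  <- nrow(xy)               # n(x) = 4, the duplicate is counted
has_dups  <- anyDuplicated(xy) > 0  # TRUE: at least one row is repeated
xy_unique <- unique(xy)             # keeps the 3 distinct locations
```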
If x is a point pattern and B is a region, we write x ∩ B for the subset of x consisting of points that fall in B. The number of points of x falling in B is n(x ∩ B). See Figure 5.2.
Figure 5.2. A point pattern (dots) and a test set B (shaded).
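The operation x ∩ B and the count n(x ∩ B) can be sketched in base R for a rectangular test set. The coordinates and the choice B = [0.3, 0.7] × [0.3, 0.7] below are invented for illustration, and a rectangle is assumed only because membership is then a pair of interval checks.

```r
# Made-up pattern of five points in the unit square.
x <- c(0.10, 0.40, 0.55, 0.80, 0.95)
y <- c(0.20, 0.45, 0.60, 0.10, 0.90)

# Test region B = [0.3, 0.7] x [0.3, 0.7], a closed rectangle.
in_B <- (x >= 0.3 & x <= 0.7) & (y >= 0.3 & y <= 0.7)

x_in_B <- cbind(x = x[in_B], y = y[in_B])  # the sub-pattern 'x intersect B'
n_in_B <- sum(in_B)                        # the count n(x intersect B), here 2
```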
5.2.2 Point processes
A point process is a random mechanism whose outcomes are point patterns. Point processes are denoted by capital letters X,Y, and so on.
The rigorous mathematical definition of a point process is quite technical: see [197, 198, 199] for the general theory, and [484] for the theory that is particularly relevant to spatial statistics.
We can avoid many of these technicalities because, for statistical applications, we can usually assume that the number of points in the process is finite. A finite point process is a random mechanism for which (1) every possible outcome is a point pattern with a finite number of points; and (2) for any¹ region B, the number n(X ∩ B) of points falling in B is a well-defined random variable.
These conditions are enough to support a full statistical theory for analysing spatial point patterns. They guarantee that all the statistics we might wish to calculate for a point process will be well-defined random variables.
It is sometimes necessary to consider a random pattern of infinitely many points scattered over the infinite two-dimensional plane. For this we need a slight extension of the previous definitions.
A locally finite point pattern is a set x = {x1, x2, ...} of points in two-dimensional space ℝ² which has only a finite number of points in any bounded region B, that is, n(x ∩ B) is finite. The total number of points in x may be infinite. A locally finite point process X in two-dimensional space is a random mechanism for which (1) every possible outcome is a locally finite point pattern, and (2) for any bounded test region B, the number n(X ∩ B) of points falling in B is a well-defined random variable.
¹ The region B should be bounded and topologically closed.
5.2.3 Uniformly random points
A very simple point process is one that consists of a single random point. Suppose we need to pick one point at random: since a spatial location u is determined by its Cartesian coordinates (u1, u2), we only need to assign random values U1,U2 to the coordinates. This gives us a random point U = (U1,U2).
Figure 5.3. Uniform random point. Left: a pair of random coordinates which are jointly uniformly distributed. Right: probability of falling in a test region B (shaded) is proportional to its area.
To know the statistical properties of this random point, it is enough to know the joint probability distribution of the random coordinates U1 and U2. This could be given by a joint probability density f(u1, u2). Then the probability that the random point U = (U1, U2) falls in a test region B is
\[ P\{U \in B\} = \int_B f(u_1, u_2) \, du_1 \, du_2. \tag{5.1} \]
An important question in spatial statistics is whether points are ‘uniformly spread’ over the survey region. Say that a random point U is uniformly distributed in a spatial region W if its Cartesian coordinates (U1, U2) have a joint probability density which is constant inside W and zero outside W. Since a probability density must integrate to 1, the constant value can be determined: it must be 1/|W|, and the density is
\[ f(u_1, u_2) = \begin{cases} 1/|W| & \text{if } (u_1, u_2) \text{ falls in } W \\ 0 & \text{if not} \end{cases} \]
where |W| is the area of W. This is only meaningful if |W| is non-zero and finite.
For example if W is a rectangle, say W = [0,w] × [0,h], then (u1, u2) falls in W if and only if 0 ≤ u1 ≤ w and 0 ≤ u2 ≤ h. The joint probability density factorises into f(u1, u2) = f1(u1) f2(u2) where
\[ f_1(u_1) = \begin{cases} 1/w & \text{if } 0 \le u_1 \le w \\ 0 & \text{if not} \end{cases} \]
is the probability density of a random variable that is uniformly distributed on [0,w], and similarly f2 is the density of a random variable uniformly distributed on [0,h]. That is, to generate a uniformly distributed random point in a rectangle, we simply have to choose random coordinates U1 and U2 which are independent and uniformly distributed. A computer random number generator can provide such numbers. See Figure 5.3.
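The recipe for a rectangle (independent uniform coordinates) can be sketched with the base R generator runif. The function name runif_rect is a hypothetical helper introduced here for illustration; it is not the spatstat implementation, and the window size and seed are arbitrary.

```r
# Draw n independent, uniformly distributed points in W = [0, w] x [0, h]
# by generating each coordinate independently.
runif_rect <- function(n, w, h) {
  cbind(u1 = runif(n, min = 0, max = w),   # U1 ~ Uniform[0, w]
        u2 = runif(n, min = 0, max = h))   # U2 ~ Uniform[0, h]
}

set.seed(1)                       # arbitrary seed, for reproducibility only
U <- runif_rect(1, w = 2, h = 1)  # a single random point U = (U1, U2)
U
```

This factorisation into independent coordinates relies on W being a rectangle aligned with the axes; for a window of another shape the coordinates are in general not independent.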
An important property of uniformly distributed random points is that, if B is a test region contained in W, the probability that U falls in B is
\[ P\{U \in B\} = \int_B f(u_1, u_2) \, du_1 \, du_2 = \frac{1}{|W|} \int_B 1 \, du_1 \, du_2 = \frac{|B|}{|W|}, \tag{5.2} \]
the fraction of area occupied by B within W. This probability depends only on the area of the test set B, and not on its location. Indeed this is what would be expected for a random point that has no preference for particular locations.
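Equation (5.2) can be checked informally by simulation. The sketch below, in base R, draws many uniform points in W = [0,2] × [0,1] and estimates the probability of falling in two test rectangles of equal area placed in different parts of W. The window, the test rectangles, the seed and the number of simulations are all arbitrary choices made for illustration.

```r
set.seed(42)      # arbitrary seed
w <- 2
h <- 1
n_sim <- 100000   # number of simulated random points

u1 <- runif(n_sim, 0, w)
u2 <- runif(n_sim, 0, h)

# Two test rectangles of equal area 0.5 * 0.4 = 0.2, at different locations:
# B1 = [0, 0.5] x [0, 0.4] and B2 = [1.5, 2] x [0.6, 1].
in_B1 <- u1 >= 0.0 & u1 <= 0.5 & u2 >= 0.0 & u2 <= 0.4
in_B2 <- u1 >= 1.5 & u1 <= 2.0 & u2 >= 0.6 & u2 <= 1.0

p_exact <- (0.5 * 0.4) / (w * h)  # |B| / |W| = 0.1
p_hat1  <- mean(in_B1)            # Monte Carlo estimate for B1
p_hat2  <- mean(in_B2)            # Monte Carlo estimate for B2
c(exact = p_exact, B1 = p_hat1, B2 = p_hat2)
```

Both estimates should be close to 0.1 up to simulation error, consistent with the probability depending on the area of B but not on its position.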
5.2.4 Binomial point process
The next simplest example of a point process is one in which the number of points n is fixed, and only the locations of the points are random. This point process is a random set X containing exactly n random points X1, ..., Xn.
To make the points ‘uniformly spread’ over a region W, let us assume that X1, ..., Xn are independent random locations, and that each Xi is uniformly distributed over the region W. This is enough information to generate simulated realisations of the process: see Figure 5.4.
If B is a test region, the number n(X ∩ B) of random points falling in B is the number of indices i such that the random point Xi falls in B. It is clear that this is a well-defined random variable, as required, so that X satisfies the requirements of a finite point process.
Figure 5.4. Ten different realisations of the binomial point process with n = 8 points in a square.
To determine the probability distribution of n(X ∩ B), where B is a subset of W, notice that n(X ∩ B) is the number of successes in n independent trials. That is, if we treat each random point Xi as a ‘success’ when it falls inside B and a ‘failure’ otherwise, these trials are independent, and each trial has success probability p = |B|/|W|. Consequently n(X ∩ B) has a binomial distribution
\[ P\{n(X \cap B) = k\} = \binom{n}{k} p^k (1 - p)^{n-k} \]
for k = 0, 1, ..., n. For this reason the model is known as the binomial point process.
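The binomial distribution of the count can be illustrated by simulation in base R. The sketch below assumes W is the unit square and takes B = [0, 0.5] × [0, 0.5], so that p = |B|/|W| = 0.25; the choice of B, the seed and the number of simulated patterns are arbitrary, and n = 8 simply mirrors Figure 5.4. The empirical frequencies of n(X ∩ B) are compared with the binomial probabilities computed by dbinom.

```r
set.seed(2024)   # arbitrary seed
n     <- 8       # number of points in each pattern
n_sim <- 20000   # number of simulated patterns
p     <- 0.25    # |B| / |W| for B = [0, 0.5]^2 inside the unit square

# Each row of x and y holds the coordinates of one simulated pattern:
# n independent uniform points in the unit square.
x <- matrix(runif(n_sim * n), nrow = n_sim)
y <- matrix(runif(n_sim * n), nrow = n_sim)

# Number of points of each pattern that fall in B.
counts <- rowSums(x <= 0.5 & y <= 0.5)

# Empirical frequencies of the counts versus binomial probabilities.
empirical   <- as.numeric(table(factor(counts, levels = 0:n))) / n_sim
theoretical <- dbinom(0:n, size = n, prob = p)
round(cbind(k = 0:n, empirical, theoretical), 4)
```

The two columns should agree up to simulation error, and the average count should be close to np = 2.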
In spatstat the function runifpoint generates a random realisation of the binomial point process. (By convention, random number generators in R have names beginning with r.) The argument n gives the number of points, and win is the window in which to generate the points.
Figure 5.4 was generated by runifpoint(8, square(1), nsim=10).