• No results found

Estimated intensity

4.2 Manipulating point patterns and windows

We now show how to manipulate point pattern data, for example, how to extract or change the spatial coordinates or the marks, extract a subset of the point pattern, split a pattern into several subsets, or combine several point patterns to make a single pattern.

4.2.1 Basic operations on a point pattern

A planar point pattern is represented in spatstat by an object of the class "ppp". This object consists effectively of the coordinates of the points, the window or study region in which the points were observed, and optionally the mark values associated with each of the points.

Extracting data from a point pattern

Although it is possible to directly access the internal structure of a point pattern object, this is not advisable. Modifying the internal components is dangerous, because the data can become internally inconsistent. The internal format of objects in a package can change from one version of the package to another.

It is much safer and neater to use facilities defined in the package to extract and modify or ma-nipulate the data in a point pattern object. Table 4.3 lists the most important functions for extracting data from a point pattern. These are all generic functions with methods for class "ppp". For help on extracting the coordinates of a point pattern, see the help for coords.ppp.

The Frame of a spatial dataset, extracted by Frame(X), is a rectangle which always contains the entire spatial extent of the dataset. The Frame is part of the data structure, and is used in many calculations which require a rectangle containing X, for example when allocating space for plotting the data, or when converting the data to a pixel image.

The Frame always contains the observation window Window(X) and is often the smallest possi-ble rectangle containing the window, although it can be larger in some circumstances.

The bounding box of a spatial object in spatstat, computed by boundingbox(X), is the small-est rectangle having sides parallel with the coordinate axes that encloses all of the ‘contents’ of the

npoints(X) number of points in X

marks(X) marks of X

coords(X) coordinates of points in X as.data.frame(X) coordinates and marks of X

Window(X) window of X

Frame(X) containing rectangle of X

Table 4.3.Important functions for extracting data from a point pattern.

object. For a point pattern the bounding box is the smallest rectangle containing all the points of the pattern. For a window (object of class "owin") the bounding box is the smallest rectangle con-taining the window. The bounding box of a point pattern is always contained in the bounding box of the window.

Note that the ripras function mentioned in Section 3.3, if called with shape="rectangle", returns a slightly larger rectangle than does boundingbox. The ripras function attempts to ac-commodate the negative bias which is present when the window is ‘estimated’ by the bounding box [580].

Figure 4.16.Point pattern dataset showing the bounding box of the points (dotted lines), bounding box of the Window (dashed lines), the Window (grey shading), and the Frame (solid lines).

Changing data in a point pattern

marks(X) <- value change the marks of X coords(X) <- value change the coordinates of X Window(X) <- value change the window of X

Frame(X) <- value change the containing rectangle of X Table 4.4.Important commands for altering the data in a point pattern.

Table 4.4 lists the most important commands for altering data in a point pattern. For example,

> X <- redwood

> marks(X) <- nndist(X)

nearest other point. The side effect of marks(X) <- m is that the dataset X is changed.

It is useful to understand how these commands are executed. In R syntax, an assignment such as marks(X) <- m is translated into a call to the assignment function "marks<-". This is a generic function, with a method for point patterns, "marks<-.ppp". Consult the help for "marks<-.ppp"

for information about assigning marks to a point pattern.

Note that R is clever enough to handle multiple layers of assignment in one expression. For example

> Y <- chorley

> levels(marks(Y)) <- c("case", "control") is equivalent to

> m <- marks(Y)

> levels(m) <- c("case", "control")

> marks(Y) <- m

Similarly marks(X)[3] <- 5 would set the mark value of the third data point in X to be equal to 5, leaving other mark values unchanged. It is syntactically expanded to something like

> m <- marks(X)

> m[3] <- 5

> marks(X) <- m

In commands like Window(X) <- W, if the new window W is smaller than the current Window(X), points falling outside W will be discarded. Similarly for Frame(X) <- B, all data outside the box B will be removed, and the window will be intersected with B.

Note that it is quite possible to assign new marks or new values to any object, even a ‘standard’

or ‘installed’ dataset (i.e. one of the example datasets provided by spatstat.) In the examples above, rather than creating a new dataset X, we could have typed (for example):

> marks(redwood) <- nndist(redwood)

It is, however, rather unwise to do this because it is easy to forget that you have changed the data and assigning new data to a standard name like redwood can easily cause confusion. Note that assigning marks(redwood) <- nndist(redwood) would create a new pattern redwood in your workspace (‘Global environment’). It would not change or overwrite the copy of redwood that exists in the package:spatstat position on the search path. The current workspace is searched before the libraries, so that the name redwood would be matched to the modified redwood object.

The original spatstat package dataset redwood would be said to be ‘masked’ by the new object redwood in the global environment. Warnings about ‘masked’ objects are sometimes given when a workspace or package is loaded.

4.2.2 Conversion of window types

As indicated above it is possible to change the observation window of a point pattern. One reason for doing this may be that another representation of the window is wanted. As described in Section 3.5 there are three different window formats in spatstat: rectangles, polygonal regions, and binary pixel masks. Table 4.5 lists some useful tools for converting between different window formats.

When applied to a window object of type "rectangle", the function as.polygonal just changes the internal structure of that object to that of a window object of type polygonal. When applied to a window of type "mask" it creates a polygon all of whose edges are either horizontal or vertical, these edges being edges of the outer pixels comprising the mask-type window.

as.polygonal convert a window to a polygonal window

as.mask convert a window to a binary image mask window as.rectangle extract the Frame of a window

as.matrix.owin convert a window to a logical matrix pixellate.owin convert window to pixel image simplify.owin approximate window by a polygon raster.xy raster coordinates of a pixel mask is.rectangle determine if a window is a rectangle is.polygonal determine if a window is polygonal

is.mask determine if a window is a binary pixel mask Table 4.5.Tools for window format testing and conversion.

The simplify.owin function converts a complicated polygon into one with fewer edges. We demonstrate using the clmfires dataset.

> W <- Window(clmfires)

> U <- simplify.owin(W,10) # Gives a polygon with about 200 edges.

The resulting windows are shown in Figure 4.17.

Figure 4.17.Left: complicated polygonal window W; Right: simplified window U created in the text.

4.2.3 Extracting subsets of a point pattern

In spatstat the tools for extracting subsets from a point pattern dataset are called "[.ppp" and subset.ppp and these are methods for the R generic commands "[" and subset. This supports a versatile and easy-to-use system of subset extraction which is consistent with conventional R syntax.

Subset defined by an index

Subsets of a point pattern can be specified by indices using "[", in the same manner as for subsets of a vector. That is, if X is a point pattern and s is a vector of positive integers, of negative integers, or of logical values, then X[s] is also a point pattern, consisting of those points of X selected by s.

For example we could take s to be the vector of positive integers 1:10.

> bei

Planar point pattern: 3604 points

window: rectangle = [0, 1000] x [0, 500] metres

Planar point pattern: 10 points

window: rectangle = [0, 1000] x [0, 500] metres

This produces a point pattern of 10 points, namely the first ten points in the bei dataset. The window of the resulting pattern is the same as that of bei.

Note that, although a point pattern should in principle be treated as an unordered set, the coordi-nates are obviously stored in a particular order, and can be addressed using that order. If the points of bei had been written down in some other order, equally arbitrary but different, then bei[1:10]

would consist of a different set of 10 points.

The index x can be a vector of negative integers:

> bei[-c(2,3,7)]

yields a point pattern consisting of all except the second, third, and seventh points of bei.

The index can be a vector of logical values with one entry for each point of the pattern, with value TRUE if the corresponding point is to be retained. For example,

> swedishpines[nndist(swedishpines) > 10]

extracts all points of the swedishpines pattern for which the nearest other point lies further than 10 units (1 metre) away; and

> longleaf[marks(longleaf) >= 42]

gives a (marked) point pattern consisting of all those points from longleaf where the mark value (the diameter at breast height of the tree) is at least 42 centimetres. The latter example could be done more elegantly using the subset function, explained below. A logical vector shorter than the required length will be ‘recycled’, so that

> longleaf[c(FALSE,TRUE)]

would extract every second point in the sequence.

Tip: Put quotes around the subset operator when asking for help about it. The generic subset operator is "["; the right-hand square bracket that you type when using this operator is really just punctuation. The help file for this operator has to be summoned by typing help("[") or

?"[", otherwise an error will result. Likewise the subset method for point patterns is called

"[.ppp"; the help file for this method is summoned by typing help("[.ppp") or ?"[.ppp".

Subset defined by a window

In spatstat the indexing operator "[.ppp" can also extract a subset of a point pattern correspond-ing to a spatial region. If X is a point pattern and W is a spatial window (object of class "owin") then X[W] is the point pattern consisting of all points of X that lie inside W.

> W <- owin(c(100,800), c(100,400))

> W

window: rectangle = [100, 800] x [100, 400] units

> bei[W]

Planar point pattern: 918 points

window: rectangle = [100, 800] x [100, 400] units The window of the resulting pattern is W.

Note that there is also a "[" method for the class "owin". The expression A[B] is equivalent to intersect.owin(A,B).

Subset defined by an expression

The method subset.ppp can be used to extract a subset of a point pattern defined by an expression involving the names of spatial coordinates x, y, the marks, and the names of individual columns of marks, if there are several columns. For example

> subset(cells, x > 0.5 & y < 0.4)

extracts the subset of the cells data consisting of all points with x-coordinate greater than 0.5 and y-coordinate less than 0.4. Note the single &, the vector-wise ‘and’ operator, so that the result of evaluating the expression is a logical vector of length equal to the number of points. Similarly

> subset(longleaf, marks >= 42)

extracts the trees from the Longleaf Pines data with mark value greater than or equal to 42. The finpines dataset has two columns of marks, named diameter and height, so

> subset(finpines, diameter > 2 & height < 4)

extracts the trees with diameter more than 2 centimetres and height less than 4 metres.

The argument select can be used to retain only some of the columns of marks. It is another expression, involving the name marks or the names of columns of marks. For example

> subset(finpines, diameter > 2, select=height)

retains only the height column of marks. In the expression select, the variable names are inter-preted as column numbers in the data frame of marks, so

> subset(nbfires, year == 1999, select=cause:fnl.size)

is meaningful; it extracts the subset of the New Brunswick forestfires data where the year was 1999, and retains only the columns cause, ign.src, and fnl.size. We could also do

> subset(finpines, select = -height) to remove the height column from the marks.

4.2.4 Manipulating marks

The commands m <- marks(X) for extracting marks, and marks(X) <- m for assigning marks, were discussed in Section 4.2.1 above.

To delete marks from a point pattern, assign the value NULL to the marks:

marks(X) <- NULL

For convenience, you can also perform these operations inside an expression, using the function unmark to remove marks and the binary operator %mark% to add marks. It is frequently convenient to make use of this facility when plotting. For example,

> plot(unmark(anemones))

> radii <- rexp(npoints(redwood), rate=10)

> plot(redwood %mark% radii)

The last line above plots the redwood data as a marked point pattern with random numeric marks.

A common task is to attach marks to a point pattern where the marks depend on another spatial dataset. If X is a point pattern and Z is a pixel image, then Z[X] gives the values of the image at the points of X, so X %mark% Z[X] attaches these values to the points.

> Y <- bei %mark% elev[bei]

To attach an additional column of marks to a dataset which already has marks, one would typically use cbind or data.frame.

> X <- amacrine

> marks(X) <- data.frame(type=marks(X), nn=nndist(amacrine))

> Y <- finpines

> vol <- with(marks(Y), (100 * pi/12) * height * diameter^2)

> marks(Y) <- cbind(marks(Y), volume=vol)

The generic function cut, briefly mentioned in Section 2.1.9, has a method for point patterns, cut.ppp. For a point pattern with real-valued marks, cut.ppp will divide the range of mark values into several discrete bands, yielding a point pattern with categorical marks:

> Y <- cut(longleaf, breaks=c(0, 5, 20, Inf))

> Y

Marked planar point pattern: 584 points

Multitype, with levels = (0,5], (5,20], (20,Inf]

window: rectangle = [0, 200] x [0, 200] metres

Here Inf is positive infinity, so the third category includes all values higher than 20. If breaks is a single number, it determines the number of bands:

> Y <- cut(longleaf, breaks=3)

> Y

Marked planar point pattern: 584 points

Multitype, with levels = (1.93,26.6], (26.6,51.3], (51.3,76]

window: rectangle = [0, 200] x [0, 200] metres

Used in this context, cut.ppp operates on the marks rather than the spatial coordinates.

If W is a window (class "owin"), then cut(X, W) will attach a logical-valued mark to each point, equal to TRUE if the point lies inside W, and FALSE otherwise. If A is a tessellation (class

"tess", discussed in Section 4.5) then cut(X, A) will attach a factor-valued mark to each point, indicating which tile of the tessellation contains the point.

4.2.5 Converting to another unit of length

To convert the spatial coordinates to a different unit of length, use the function rescale:

Y <- rescale(X, s)

The spatial coordinates in the dataset X will be re-expressed in terms of a new unit of length that is s times the current unit of length given in X.

The spatstat generic function rescale is designed so that the rescaled object is equivalent to the original. For example if X is a dataset giving coordinates in metres, then rescale(X,1000) will divide the coordinate values by 1000 to obtain coordinates in kilometres, and the unit name will be changed from metres to 1000 metres. The resulting object is equivalent to the original; the values have just been converted to another unit of measurement.

If the argument unitname is given, it will be taken as the new name of the unit of length.

It should be a valid name for the unit of length, as described in Section 3.3.3. In the example above, rescale(X, 1000, "km") will divide the original coordinate values (in metres) by 1000

to obtain coordinate values in kilometres, and the unit name will be changed to km (rather than to 1000 metres).

The unit name of a spatstat object may also be a multiple of a standard ‘base’ unit. For example, the Lansing Woods trees were originally observed in a square of physical side length 924 feet, then the coordinates were scaled to a unit square, so the lansing dataset in spatstat has unitname(lansing) = 924 feet. For such ‘composite’ units, the command rescale(X) with no further arguments will convert X to the base unit:

> rescale(lansing)

Marked planar point pattern: 2251 points

Multitype, with levels = blackoak, hickory, maple, misc, redoak, whiteoak window: rectangle = [0, 924] x [0, 924] feet

Note that rescaling the point pattern has no effect on the marks. Conversion of the mark values to another unit would have to be done separately.

Sometimes spatial data are stored or presented as a list of related spatial objects. If rescaling is applied to one of the objects in such a list then the same rescaling almost certainly should be applied to all of the objects in the list so as to maintain consistency. For example, to convert the murchison coordinate data from metres to kilometres:

> murch2 <- solapply(murchison, rescale, s=1000, unitname="km")

Since murchison is a "solist", we have ensured that the transformed data also belong to this class by using solapply or equivalently as.solist(lapply(...)).

4.2.6 Geometrical transformations

Many geometrical transformations of spatial data are supported in spatstat. They are listed in Table 4.6.

shift(X) translate (shift)

rotate(X) rotate

reflect(X) reflect about the origin flipxy(X) swap x- and y-coordinates

scalardilate(X) expand or contract by a scale factor affine(X) general affine transformation convexhull(X) convex hull

Table 4.6. Common geometrical transformations for point patterns, windows, and images sup-ported by spatstat (for point patterns the transformation applies to both the points and the obser-vation window). That is, the class of X can be "ppp", "owin" or "im".

By default, rotation is performed about the origin in the spatial coordinate system. To rotate about another centre of rotation, specify the argument centre, which may be either a pair of coor-dinates, or a string indicating a special location. For example

> rotate(chorley, pi/2, centre="centroid")

applies a 90-degree anticlockwise rotation to the Chorley-Ribble data around the centroid of the window.

The command scalardilate(X, f) multiplies all the spatial coordinates of the object X by the factor f without changing any other information. It produces a dataset which is not equivalent to the original.

Table 4.7 lists some commonly used geometrical operations that only apply to windows (i.e.

there is no method for "ppp" objects).

union.owin union of two windows setminus.owin set difference of two windows complement.owin swap inside and outside

inside.owin determine whether a point is inside a window is.subset.owin determine whether one window contains another is.convex test whether a window is convex

centroid.owin compute centroid (centre of mass) of window incircle find largest circle inside window

trim.rectangle cut off side(s) of a rectangle

Table 4.7.Geometrical operations on windows (objects of class "owin").

Further, spatstat supports the operations from mathematical morphology [611, 622] listed in Table 4.8. For a window W the morphological dilation W⊕rby a disc of radius r is the set of points lying at most r units away from W . A point u belongs to W⊕r if a disc of radius r centred at u intersects W . The morphological erosion W⊖r is the set of points inside W lying at least r units away from the boundary of W . A point u belongs to W⊖rif a disc of radius r centred at u is entirely contained in W . Erosion and dilation are often used in statistical calculations for spatial data.

dilation.owin morphological dilation by disc erosion.owin morphological erosion by disc opening.owin morphological opening by disc closing.owin morphological closing by disc

border create a border region around a window Table 4.8.Morphological operations on windows.

The morphological opening is the result of erosion followed by dilation, (W⊖r)⊕r. A point u belongs to the opening of W if there is some disc of radius r that contains u and lies entirely inside W. The morphological closing is the result of dilation then erosion, (W⊕r)⊖r. A point u belongs to the closing of W if u does not lie inside any disc of radius r which avoids W entirely. Morphological closing is particularly useful for ‘cleaning up’ irregular windows and for filling tiny gaps.

In future versions of spatstat, morphological operations based on other structuring elements (rather than the disc) will be implemented.

In future versions of spatstat, morphological operations based on other structuring elements (rather than the disc) will be implemented.