Normality Sampling - JOINT COUNT STATISTICS

PATTERN DESCRIPTORS

2. choose No when the procedure asks users to select all features when users forgot to select features before proceeding

5.5 JOINT COUNT STATISTICS

5.5.1 Normality Sampling

In both normality sampling and randomization sampling, calculating joint count statistics involves estimation of the expected number of black-black, white-white, and black-white joints and their standard deviations. The expected numbers of these joints reflect a random pattern, or a pattern with no significant spatial autocorrelation of any type. The number of black-black and white-white joints indicates the magnitude of positive spatial autocorrelation, while the number of black-white or white-black joints indicates the magnitude of negative spatial autocorrelation.

These observed values are compared with their expected counterparts to derive their differences. These differences are then standardized by their corresponding standard deviations to obtain standardized scores. Using these scores, we can decide whether there is significant positive or negative spatial autocorrelation in the pattern. In other words, three pairs of comparisons have to be conducted. For illustrative purposes, we show in detail only how negative spatial autocorrelation can be tested. The other two situations can be handled in the same way or derived using the accompanying ArcView project file.

For normality sampling, the equations for the expected number of black-black and white-white joints are

$$E_{BB} = \tfrac{1}{2} W p^{2} \quad \text{and} \quad E_{WW} = \tfrac{1}{2} W q^{2}.$$

The equation for the expected number of black-white joints is

$$E_{BW} = W p q,$$

where

$E_{BB}$, $E_{WW}$, and $E_{BW}$ are the expected numbers of black-black, white-white, and black-white joints, respectively,

p is the probability that an area will be black, and q is the probability that an area will be white.

The two probabilities must sum to 100% (p + q = 1.0). If no other information is available, a common method is to set $p = n_B/n$, the proportion of areas that are black. There are other considerations in determining p, which are discussed later. If the spatial weights matrix is a binary matrix, the expected values can be simplified to:

$$E_{BB} = J p^{2}, \qquad E_{WW} = J q^{2}, \qquad E_{BW} = 2 J p q,$$

where J is the total number of joints in the study area.
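The expected-value formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the book's ArcView implementation. It assumes that W is the sum of all entries of the spatial weights matrix, so that W = 2J for a binary matrix; the function names are hypothetical. The values J = 11 and p = 0.3 are the ones used in the worked example later in this section.

```python
def expected_joins(W, p):
    """Expected BB, WW and BW joint counts under normality sampling.

    W: sum of all entries of the spatial weights matrix (assumed meaning).
    p: probability that an area is black; q = 1 - p is the white probability.
    """
    q = 1.0 - p
    e_bb = 0.5 * W * p ** 2
    e_ww = 0.5 * W * q ** 2
    e_bw = W * p * q
    return e_bb, e_ww, e_bw


def expected_joins_binary(J, p):
    """Binary-matrix shortcut: every joint appears twice in the matrix, so W = 2J."""
    return expected_joins(2 * J, p)


e_bb, e_ww, e_bw = expected_joins_binary(J=11, p=0.3)
print(round(e_bb, 2), round(e_ww, 2), round(e_bw, 2))
```

Because p + q = 1, the three expected values always add up to J in the binary case, which is a quick sanity check on any hand calculation.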

To test whether the measured pattern is statistically significant, a statistical test known as the Z-test can be applied. To perform this test, the standard deviations of the expected joints, $\sigma_{BB}$, $\sigma_{WW}$, and $\sigma_{BW}$, are also needed. One set of formulas for the three standard deviations applies when a stochastic weights matrix is used. If a binary matrix is used, these formulas reduce to expressions in J, p, q, and K, where the $\sigma$ are the standard deviations of the corresponding types of joints and J, p, and q are as defined previously. K is $\sum_{i=1}^{n} L_i(L_i - 1)$, where n is the total number of polygons and $L_i$ is the number of joints between polygon i and all the polygons adjacent to it.
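For reference, the binary-matrix standard deviations under normality (free) sampling are commonly written as below in the join-count literature. This is an assumption about the form of the formulas, not a quotation from this text. As a consistency check, the $\sigma_{BW}$ expression gives approximately 2.1 for J = 11, K = 52, p = 0.3, and q = 0.7, which is the value used in the Z-scores later in this section.

```latex
\sigma_{BB} = \sqrt{p^{2} J + p^{3} K - p^{4} (J + K)}
\qquad
\sigma_{WW} = \sqrt{q^{2} J + q^{3} K - q^{4} (J + K)}
\qquad
\sigma_{BW} = \sqrt{2 p q J + p q K - 4 p^{2} q^{2} (J + K)}
```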

Let us focus on testing negative spatial autocorrelation (i.e., black-white joints) using the binary matrix and its corresponding formulas. When calculating the expected black-white joints, the probabilities of any polygon’s being black or white, namely p and q, must be known. Once these probabilities are known, the only remaining step is to count the number of joints, J.

Calculating J is quite straightforward, since it only involves counting the number of joints between the polygons or summing up all values of the binary connectivity matrix and then dividing this number in half. For the example in Figure 5.5, the number of joints is 11. Consequently, we will have J = 11. Please note that all joints are to be counted for J, including all black-black, black-white, and white-white joints.
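The matrix route to J can be sketched as follows. The four-polygon matrix is a hypothetical layout chosen for brevity, not the county structure of Figure 5.5, and the function name is an assumption.

```python
def count_joints(C):
    """Total number of joints J from a symmetric binary connectivity matrix.

    Each joint between polygons i and j is recorded twice (C[i][j] and
    C[j][i]), so the sum of all entries is divided by two.
    """
    total = sum(sum(row) for row in C)
    return total // 2


# Hypothetical layout: polygons 0-1, 0-2, 1-2 and 2-3 share borders.
C = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
print(count_joints(C))  # 4
```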

Two or more polygons may be contiguous at only a single point rather than along a shared border. In this case, the analyst must decide whether that point counts as a joint, that is, whether the rook’s case or the queen’s case should be used in constructing the spatial weights matrix. If one joint of this type is included, all such joints must be counted, in both the total number of joints and the number of black-white joints. In this illustrative example, we adopt the rook’s case in defining neighbors. Therefore, Cuyahoga County and Portage County are not neighbors. Similarly, Geauga and Summit counties are not treated as neighbors.
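The effect of the rook/queen choice on J is easiest to see on a regular grid. The sketch below is a simplified illustration on hypothetical square cells, not on the county polygons of Figure 5.5: under the rook's case only shared edges count, while the queen's case also counts cells that touch at a corner point.

```python
def grid_joints(rows, cols, case="rook"):
    """Count joints between the cells of a rows x cols grid.

    case="rook":  cells are neighbors only if they share an edge.
    case="queen": cells touching at a single corner point also count.
    """
    if case not in ("rook", "queen"):
        raise ValueError("case must be 'rook' or 'queen'")
    # Only "forward" offsets are used so that each joint is counted once.
    offsets = [(0, 1), (1, 0)]
    if case == "queen":
        offsets += [(1, 1), (1, -1)]
    joints = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    joints += 1
    return joints


print(grid_joints(2, 2, "rook"))   # 4
print(grid_joints(2, 2, "queen"))  # 6
```

Whichever case is chosen, the same rule has to be applied when counting the observed black-white joints, otherwise the observed and expected counts are not comparable.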

With regard to the p and q ratios, the analyst often needs to research or look for proper values from theories, literature, historical documents, or past experience with the subject. For example, if the subject shown by the maps in Figure 5.5 is the ratio of infant mortality exceeding a preset threshold at the county level, it would make sense to use the statewide trend to derive proper values for p and q.

Figure 5.5 Joint structure.

Similarly, if the maps in Figure 5.5 show the preferences for either the Democratic or the Republican party at the county level, it may also make sense to research how the two parties fare for the entire state and use that trend to determine appropriate values for p and q. To illustrate how to calculate $E_{BW}$ and $\sigma_{BW}$, let’s assume that p = 0.3 and q = 0.7. Please note that p + q = 1 should hold for all cases.
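The two ways of setting p mentioned in this section, from the map itself as n_B/n or from an external source such as a statewide rate, can be contrasted in a short sketch. The list of labels is hypothetical and is not read from Figure 5.5; the function name is an assumption.

```python
def black_share(colors):
    """Estimate p as n_B / n from a list of 'B'/'W' labels, one per area."""
    if not colors:
        raise ValueError("need at least one area")
    n_black = sum(1 for c in colors if c == "B")
    return n_black / len(colors)


# Hypothetical labels for seven areas.
colors = ["B", "W", "W", "B", "W", "W", "W"]
p_from_map = black_share(colors)   # n_B / n = 2 / 7

# Externally chosen value, as assumed in the worked example (p = 0.3).
p_external = 0.3
q_external = 1.0 - p_external      # guarantees p + q = 1
```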

The next step is to calculate the value of $\sum L(L - 1)$. While there may be several ways in which this number can be calculated, we suggest that a table be set up to calculate the values of L, then L − 1, and then L(L − 1) for each polygon. In this manner, the task of calculating $\sum L(L - 1)$ will be less confusing and easier to manage.

In Figure 5.5d, the polygons are labeled A, B, C, D, E, F, and G. The number of joints for each polygon is its L. As shown in Table 5.5, summing the number of all joints yields $\sum_{i=A}^{G} L_i = 22$. Reducing each L by 1 gives L − 1. The value of $\sum L(L - 1)$ is derived by multiplying each L by its L − 1 and then adding up the products. With the seven polygons in Figure 5.5, $\sum L(L - 1) = 52$, as shown in Table 5.5.
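The tabular procedure can be sketched as below. The per-polygon L values here are placeholders, not the values of Table 5.5. They were chosen only so that the column totals match the sums reported in the text (22 and 52).

```python
# Placeholder joint counts per polygon. These L values are NOT taken from
# Table 5.5; they are hypothetical and were picked only so that the totals
# agree with the sums reported in the text (22 and 52).
L = {"A": 2, "B": 2, "C": 3, "D": 3, "E": 4, "F": 4, "G": 4}

print(f"{'Polygon':<8}{'L':>4}{'L-1':>6}{'L(L-1)':>8}")
sum_L = 0
K = 0
for name, li in L.items():
    term = li * (li - 1)
    print(f"{name:<8}{li:>4}{li - 1:>6}{term:>8}")
    sum_L += li
    K += term
print(f"{'Total':<8}{sum_L:>4}{'':>6}{K:>8}")

# Each joint is counted once from each of its two polygons, so J = sum(L) / 2.
J = sum_L // 2
```

The last line doubles as a check on the table: the sum of the L column should equal twice the number of joints counted directly from the map.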

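All the necessary values have now been calculated and can be substituted into the equations for $E_{BW}$ and $\sigma_{BW}$. With J = 11, p = 0.3, and q = 0.7, the expected number of black-white joints is $E_{BW} = 2Jpq = 2(11)(0.3)(0.7) = 4.62$. The corresponding standard deviation, computed from J, K = 52, p, and q, is $\sigma_{BW} \approx 2.1$.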
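The substitution can be checked numerically. The sketch below assumes the standard binary-matrix, free-sampling variance for black-white joints, 2pqJ + pqK − 4p²q²(J + K). That expression comes from the general join-count literature rather than from this text, but it reproduces the standard deviation of about 2.1 used in the Z-scores that follow. The function name is hypothetical.

```python
import math


def sigma_bw_binary(J, K, p):
    """Standard deviation of black-white joints, binary matrix, free sampling.

    Assumed formula: sqrt(2pqJ + pqK - 4 p^2 q^2 (J + K)), with q = 1 - p
    and K = sum of L_i (L_i - 1) over all polygons.
    """
    q = 1.0 - p
    variance = 2 * p * q * J + p * q * K - 4 * p ** 2 * q ** 2 * (J + K)
    return math.sqrt(variance)


J, K, p = 11, 52, 0.3
e_bw = 2 * J * p * (1 - p)        # expected black-white joints
s_bw = sigma_bw_binary(J, K, p)   # roughly 2.1
print(round(e_bw, 2), round(s_bw, 2))
```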

TABLE 5.5 Joint Count Statistics: Calculating $\sum L(L - 1)$

A single expected number of black-white joints, such as 4.62, does not by itself show whether there is significant negative spatial autocorrelation in the patterns we have observed. We need to compare that expected number with the observed number of black-white joints in each polygon pattern, which can be done by calculating the difference between the two. If we define $O_{BW}$ to be the observed number of black-white joints in a polygon pattern, we have $O_{BW}$ = 4, 6, and 8 for Figure 5.5a, Figure 5.5b, and Figure 5.5c, respectively. With these numbers, we know that Figure 5.5a is more clustered than a random pattern, since 4 < 4.62. Figure 5.5b and Figure 5.5c are more dispersed than a random pattern, since 6 and 8 both exceed 4.62.

The difference between an expected number and an observed number of black-white joints can help us assess a polygon pattern in terms of its clustering or dispersion. However, we do not know how far away each pattern is from a random pattern. Measuring how much each pattern differs from a random pattern requires that we standardize the calculated difference, $O_{BW} - E_{BW}$.

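To standardize the comparison, we use the Z-score, which takes the standard deviation of the difference as the denominator:

$$Z = \frac{O_{BW} - E_{BW}}{\sigma_{BW}}.$$

For the three patterns, this gives

$$Z = \frac{4 - 4.62}{2.1} \approx -0.30 \text{ for Figure 5.5a},$$

$$Z = \frac{6 - 4.62}{2.1} \approx 0.66 \text{ for Figure 5.5b},$$

and

$$Z = \frac{8 - 4.62}{2.1} = 1.61 \text{ for Figure 5.5c}.$$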
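The same standardization can be applied to all three patterns at once. This sketch uses only numbers given in the text (observed counts of 4, 6, and 8, an expected count of 4.62, and a standard deviation of 2.1); the function and variable names are assumptions.

```python
def z_score(observed, expected, sigma):
    """Standardized difference between observed and expected joint counts."""
    return (observed - expected) / sigma


E_BW = 4.62      # expected black-white joints
SIGMA_BW = 2.1   # standard deviation of black-white joints (rounded, as in the text)

observed = {"Figure 5.5a": 4, "Figure 5.5b": 6, "Figure 5.5c": 8}
z = {name: z_score(o, E_BW, SIGMA_BW) for name, o in observed.items()}

for name, value in z.items():
    print(f"{name}: Z = {value:.2f}")
```

A negative Z means fewer black-white joints than expected (a tendency toward clustering), and a positive Z means more than expected (a tendency toward dispersion).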

According to the probability distribution of the Z-score, a Z value less than −1.96 or greater than 1.96 would occur by chance in fewer than 5 out of 100 cases (α = 0.05). Since the Z-scores of the three patterns all fall between −1.96 and 1.96, we can conclude that none of the patterns in Figure 5.5 has a statistically significant negative spatial autocorrelation or a dispersion pattern.
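The decision rule can be sketched with the Python standard library alone. The two-sided p-value below uses the standard normal identity p = erfc(|Z| / √2); this is a general statistical fact rather than something stated in the text, and the function names are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def is_significant(z, critical=1.96):
    """True if |Z| exceeds the critical value (1.96 for alpha = 0.05)."""
    return abs(z) > critical


# Z-scores recomputed from the observed counts 4, 6 and 8,
# with E_BW = 4.62 and sigma_BW = 2.1 as given in the text.
z_values = [(o - 4.62) / 2.1 for o in (4, 6, 8)]
results = [(round(z, 2), is_significant(z)) for z in z_values]
print(results)
```

Even the largest score, 1.61 for Figure 5.5c, stays below 1.96, so none of the three patterns is flagged as significant at α = 0.05.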

In document Statistical Analysis With Arcview Gis (Page 162-167)