Figure 3.1 Distribution of a set of 25 pH data (horizontal axis: pH)
procedure (CFIT, Dept. Soil Science, Massey University) to conform to the normal distribution $f(x)$ ($R^2 = 0.94$, $p < 0.1\%$) where (Clarke, 1980):
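$$ f(x) = \frac{1}{\hat{\sigma}\sqrt{2\pi}} \exp\left[ -\frac{(w - \hat{\mu})^2}{2\hat{\sigma}^2} \right] \tag{3.3} $$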
Here, $\hat{\mu}$ and $\hat{\sigma}^2$ are best estimates of the population mean and variance, $\mu$ and $\sigma^2$, and $w$ is the mid-value for each pH class.
$\hat{\mu}$ is calculated as an average of all the values of $Z(x_i)$, and it should give a good estimate of the property $z$, in this case soil pH, when that is measured at any point $x_i$ along the transect. We therefore say that the expected value of $z$ at any point $x_i$ along the transect is given by (Webster, 1985):
$$ E[Z(x_i)] = \mu \tag{3.4} $$
where $E$ denotes expectation. Taking this idea to its logical extension and considering just the first point on the transect, it can be argued that if the value of $z$ at this first point is unknown, it could be expected to be equal to $\hat{\mu}$, which is an estimate of $\mu$ in equation (3.4); i.e. 4.89. Conversely, if the mean was unknown but the value of $z$ at this point was known, and assuming that the distribution of pH values along this transect was normal, $\hat{\mu}$ would be expected to be equal to the value of $z$, i.e. 5.38. Using this argument, and equation (3.1) to calculate the value of $\hat{\mu}$ when $n$ is greater than 1, the change in the estimated mean and variance with increasing sample number was investigated by starting at one end of the transect and moving along it, one sample separation at a time to include the other points, until the whole data set was included. The results are shown in Figure 3.2 as a plot of the mean and variance vs. the number of values used to calculate them. Since the sample variance of a single realization is zero, the change in $s^2$ was calculated for $n = 2$ to $n = 25$.
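The cumulative procedure described above can be sketched in a few lines of Python. This is only an illustration: it assumes that equation (3.1) is the ordinary arithmetic mean and that the sample variance uses the $n - 1$ denominator, and the readings in `ph` are hypothetical values, not the 25 pH data of the transect. The function name `running_mean_variance` is likewise invented for this sketch.

```python
import statistics

def running_mean_variance(values):
    """Return (n, mean, variance) for the first n values, n = 1..len(values).

    The variance (n - 1 denominator) is reported as None for n = 1,
    matching the text, which starts the variance curve at n = 2.
    """
    results = []
    for n in range(1, len(values) + 1):
        subset = values[:n]
        mean = statistics.fmean(subset)
        variance = statistics.variance(subset) if n > 1 else None
        results.append((n, mean, variance))
    return results

# Hypothetical pH readings in transect order (NOT the thesis data).
ph = [5.4, 4.9, 4.7, 5.0, 4.6, 4.8]
for n, mean, variance in running_mean_variance(ph):
    print(n, round(mean, 3), None if variance is None else round(variance, 3))
```

Plotting the second and third columns against the first would give a figure of the same kind as Figure 3.2.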
By definition, the variance $\sigma^2$ is a measure of the scatter or dispersion of the values of $Z(x_i)$ about the mean $\mu$ (Clarke, 1980). $s^2$ and $\hat{\mu}$ are also assumed to be good estimates of the true variance and mean of the population from which the sample data are drawn. The size of $s^2$ tells us something about the precision with which $\hat{\mu}$ is measured and we can assume that the more precisely $\hat{\mu}$ is measured, the nearer it will approach to $\mu$. It is therefore interesting to note from Figure 3.2 that as $n$ increases, the change in the value of $\hat{\mu}$ calculated for $n$ and $n + 1$ realizations of $Z(x_i)$ decreases, to the extent that for values of $n$ greater than 15, there is very little fluctuation in the estimated value of $\hat{\mu}$ relative to the fluctuation when $n$ is less than 9. That is, the greater the value of $n$, the lower the value of $s^2$ (Figure 3.2), and consequently the more precise the estimate of $\mu$ by $\hat{\mu}$. This idea of precision will be seen to be important in the following sections when the spatial distribution of the $Z(x_i)$ is considered.
Figure 3.2 Change in the mean ($\hat{\mu}$) and variance ($s^2$) with increasing sample number (vertical axes: mean and variance; horizontal axis: no. of samples)
In addition to the concepts of the mean and variance, it is also an important pre-requisite of spatial analysis to understand the concept of covariance. Suppose that, in addition to soil pH, the C.E.C. had been measured at each site, $x_i$, along the transect. As part of the data analysis, it may be useful to have a measure of the correlation between the two properties, denoted here by $Z$ and $Y$. This can be estimated by the covariance, COV, where (Clarke, 1980):
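$$ \mathrm{COV} = \frac{\sum_{i=1}^{n} \left[ Z(x_i) - \hat{\mu}_Z \right] \left[ Y(x_i) - \hat{\mu}_Y \right]}{n - 1} \tag{3.5} $$
where $\hat{\mu}_Z$ and $\hat{\mu}_Y$ are the mean values of the sample data of $Z$ and $Y$. It is shown below that the concept of covariance is important to geostatistical theory.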
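As a small worked illustration of the covariance with its $n - 1$ denominator, the following Python sketch computes it for two properties measured at the same sites. The paired values are hypothetical, chosen only to show the calculation, and the function name `sample_covariance` is an assumption of this sketch.

```python
def sample_covariance(z, y):
    """Sample covariance of two equally long sequences (n - 1 denominator)."""
    if len(z) != len(y) or len(z) < 2:
        raise ValueError("need two sequences of equal length, n >= 2")
    n = len(z)
    mean_z = sum(z) / n
    mean_y = sum(y) / n
    cross_products = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y))
    return cross_products / (n - 1)

# Hypothetical paired measurements at five sites (NOT the thesis data).
ph = [4.6, 4.8, 4.9, 5.1, 5.4]
cec = [12.0, 13.5, 13.0, 15.0, 16.5]
print(sample_covariance(ph, cec))
```

A positive result indicates that the two properties tend to rise and fall together along the transect; the covariance of a property with itself is its sample variance.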
iii. and the semi-variance
Although equation (3.4) states that the expected value of $Z$ at any point $x_i$ is $\mu$, it is clear from Table 3.1 and implicit in Figure 3.2 that the value of $z$ will in fact vary from place to place. This would more obviously be the case if, for example, the transect crossed the boundary between two distinct soil types. Thus, $Z(x_i)$ is called a random variable; geostatistics is concerned with identifying its spatial structure. By spatial structure is meant the spatial correlation of the variable with itself, which can be described by properties of its probability distribution. The first property is the mean, defined by equation (3.4); the second is the spatial covariance, defined below.
When the mean value of $Z(x_i)$ does not vary along the transect, the condition of first-order stationarity is said to hold (Webster, 1985):
$$ E[Z(x_i)] = \mu = \text{constant} \tag{3.6} $$
and it would be expected that a plot like Figure 3.2 would show a straight horizontal line corresponding to a pH of 4.89. If equation (3.6) holds, the expected difference between any two values of $Z(x_i)$ separated by a distance or lag, $h$, would be zero (Trangmar et al., 1985):
$$ E[Z(x_i) - Z(x_i + h)] = 0 \tag{3.7} $$
If the mean does vary, drift is said to be present and the changing value of the mean can be described by the drift function, $d(x_i)$ (Starks & Fang, 1982), and equation (3.6) can be re-written more generally as:
$$ Z(x_i) = d(x_i) + w(x_i) \tag{3.8} $$
where $w(x_i)$ is a random function of zero mean and finite, fixed variance. $w(x_i)$ depends on the variation between values of $Z(x_i)$ and $Z(x_i + h)$, for all values of $h$.
One of the aims of geostatistics is to quantify the degree of spatial correlation between the values of $Z(x_i)$ and $Z(x_i + h)$. This can be done using the concept of covariance, expressed mathematically in equation (3.5). Thus, the spatial covariance of $Z(x_i)$, $C(h)$, is given by:
$$ C(h) = E\left\{ \left[ Z(x_i) - \mu \right] \left[ Z(x_i + h) - \mu \right] \right\} \tag{3.9} $$
Unlike (3.5), equation (3.9) has no denominator because $1/(n - 1) = 1$ when there are only two observation points, $x_i$ and $(x_i + h)$. Second-order stationarity exists if the value of $C(h)$ for each pair of property values $Z(x_i)$ and $Z(x_i + h)$ is the same, and independent of its position in the sampling region; that is, $C(h)$ depends only on $h$ (Trangmar et al., 1985), and the variability of $z$ is the same throughout the region (Russo & [...]). [...] to the variance of $z$, often denoted by $C(0)$. The ratio of the spatial covariance to the sample variance is called the spatial auto-correlation coefficient, $\rho(h)$, given by:
$$ \rho(h) = \frac{C(h)}{C(0)} \tag{3.10} $$
Thus, under second-order stationarity, the mean and variance do not vary. $\rho(h) = 1$ when $h = 0$, and the spatial covariance decreases as $h$ increases, so $\rho(h)$ becomes a useful geostatistical tool, since a plot of $\rho(h)$ against $h$ will give an indication of the size of $h$ for which values of $z$ remain correlated, or are spatially dependent.
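The ratio of spatial covariance to variance can be illustrated with a short Python sketch for a transect sampled at equal spacing, with the lag counted in sample separations. The estimator used here, which averages the products of deviations over the pairs available at each lag, is an assumption of this sketch; the text defines the covariance only as an expectation. The function names and the data values are hypothetical.

```python
def spatial_covariance(z, lag):
    """Estimate the spatial covariance at an integer lag (in sample spacings)
    by averaging products of deviations from the overall mean over all pairs."""
    n = len(z)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(z)")
    mean = sum(z) / n
    pairs = n - lag
    return sum((z[i] - mean) * (z[i + lag] - mean) for i in range(pairs)) / pairs

def autocorrelation(z, lag):
    """Ratio of the covariance at the given lag to the covariance at lag 0."""
    return spatial_covariance(z, lag) / spatial_covariance(z, 0)

# Hypothetical pH readings at equal spacing (NOT the thesis data).
ph = [5.4, 5.2, 5.1, 4.8, 4.7, 4.9, 4.6, 4.5]
for lag in range(4):
    print(lag, round(autocorrelation(ph, lag), 3))
```

By construction the ratio is 1 at lag 0. A series with no variation at all would make the lag-0 covariance zero, so the ratio is undefined in that case.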
The assumption of second-order stationarity upon which $\rho(h)$ and $C(h)$ depend is regarded by many geostatisticians as too strong for many spatial variables because of the tendency of estimates of the variance to vary without limit as the size of the area under investigation is extended (Oliver, 1987). As an alternative to assuming second-order stationarity, the intrinsic hypothesis of regionalized variable theory may be used. This assumes that equation (3.4) holds and that for a given value of $h$, the difference between $Z(x_i)$ and $Z(x_i + h)$ has a finite variance which is independent of $x_i$, the position of the sample (Webster, 1985):
$$ \mathrm{VAR}[Z(x_i) - Z(x_i + h)] = E\left\{ [Z(x_i) - Z(x_i + h)]^2 \right\} = 2\gamma(h) \tag{3.11} $$
where $\gamma$ is the semi-variance.
Implicit in the assumptions underlying equations (3.4) and (3.11) is that the soil property follows the following model of variation:
$$ Z(x_i) = \mu_v + \varepsilon(x_i) \tag{3.12} $$
where $\mu_v$ is the mean value of $Z$ in a region, $v$, and $\varepsilon(x_i)$ is a spatially dependent random component with zero mean, and a variance defined by:
$$ \mathrm{VAR}[\varepsilon(x_i) - \varepsilon(x_i + h)] = E\left\{ [\varepsilon(x_i) - \varepsilon(x_i + h)]^2 \right\} = 2\gamma(h) \tag{3.13} $$
Thus, under the constraints of the intrinsic hypothesis, variables need only be locally stationary. It will be assumed for the rest of the analysis of the 25 pH data that local stationarity applies.
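As an aside that is not part of the original argument, the semi-variance of equation (3.11) and the spatial covariance can be related directly in the special case where the stronger assumption of second-order stationarity does hold. The derivation below is a standard result, written with $\gamma(h)$ for the semi-variance, $C(h)$ for the spatial covariance and $C(0)$ for the variance; it assumes a constant mean $\mu$ and a covariance that depends only on $h$.

```latex
\begin{align*}
2\gamma(h) &= E\left\{ [Z(x_i) - Z(x_i + h)]^2 \right\} \\
           &= E\left\{ \left( [Z(x_i) - \mu] - [Z(x_i + h) - \mu] \right)^2 \right\} \\
           &= E\left\{ [Z(x_i) - \mu]^2 \right\} + E\left\{ [Z(x_i + h) - \mu]^2 \right\}
              - 2E\left\{ [Z(x_i) - \mu][Z(x_i + h) - \mu] \right\} \\
           &= 2C(0) - 2C(h) \\
\gamma(h)  &= C(0) - C(h)
\end{align*}
```

Dividing through by $C(0)$ shows that, in this special case, the semi-variance expressed as a fraction of the variance is one minus the spatial auto-correlation coefficient of equation (3.10). Under the intrinsic hypothesis alone, $C(0)$ need not be finite, so only the semi-variance is guaranteed to exist.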
iv. The
The semi-variance, $\gamma(h)$, is estimated by $\hat{\gamma}(h)$ for each value of $h$ where (Webster, 1985):
$$ \hat{\gamma}(h) = \frac{1}{2m(h)} \sum_{i=1}^{m(h)} \left\{ Z(x_i) - Z(x_i + h) \right\}^2 \tag{3.14} $$
$\hat{\gamma}(h)$ is equivalent to half the sum of the squared difference between pairs of values of $Z(x_i)$ and $Z(x_i + h)$ averaged according to the number of pairs, $m$, at each value of the lag $h$.
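The estimator in equation (3.14) can be sketched in Python for a transect with equal sample spacing, so that the lag is an integer number of sample separations and the number of pairs at each lag is the number of values minus the lag. Equal spacing is an assumption of this sketch, the function names are invented for it, and the data are hypothetical rather than the 25 pH values.

```python
def semivariance(z, lag):
    """Half the mean squared difference between values `lag` spacings apart."""
    m = len(z) - lag  # number of pairs available at this lag
    if lag < 1 or m < 1:
        raise ValueError("lag must satisfy 1 <= lag < len(z)")
    squared_differences = sum((z[i] - z[i + lag]) ** 2 for i in range(m))
    return squared_differences / (2 * m)

def experimental_variogram(z, max_lag):
    """Return (lag, number of pairs, semi-variance) for lags 1..max_lag."""
    return [(lag, len(z) - lag, semivariance(z, lag))
            for lag in range(1, max_lag + 1)]

# Hypothetical pH readings at equal spacing (NOT the thesis data).
ph = [5.4, 5.2, 5.1, 4.8, 4.7, 4.9, 4.6, 4.5]
for lag, pairs, gamma_hat in experimental_variogram(ph, 4):
    print(lag, pairs, round(gamma_hat, 4))
```

The middle column makes explicit that each estimate at a longer lag rests on fewer pairs of points.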
A plot of $\hat{\gamma}(h)$ against $h$ for a range of separation distances is the semi-variogram, which for simplicity will henceforth be called the variogram. The variogram represents the average rate of change of a property with distance (Oliver, 1987). Figure 3.3 shows the experimental variogram for the 25 pH data from the transect. Although there is much fluctuation in the value of $\hat{\gamma}(h)$, it can be seen that the trend is for $\hat{\gamma}(h)$ to increase as the lag increases; i.e. samples closer together have a lower semi-variance than those farther apart, such that the variance of the property is said to be spatially dependent.
Figure 3.3 also shows how the number of pairs of points declines with increasing lag. From Figure 3.2 and the preliminary analysis described in section (ii), it would seem likely that the values of $\hat{\gamma}(h)$ at large lags have low precision compared with those at small lags. Oliver (1987) noted that the precision of the variogram depends on the effective degrees of