Median indicator kriging - Estimation algorithms


Algorithm 7.3 Median indicator kriging

1: for each location u in the grid do
2:     Get the conditioning data n
3:     if n is large enough then
4:         Solve the simple indicator kriging system and store the vector of kriging weights
5:     else
6:         Set node as uninformed, move to the next location
7:     end if
8:     for each category k do
9:         Compute the kriging estimate i_k^∗(u) with the kriging weights found in step 4
10:     end for
11:     Correct F_Z(u) for order violations
12: end for
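The loop body of Algorithm 7.3 can be sketched for a single location as follows. This is a minimal illustration, not the SGeMS implementation: the function names, the spherical covariance with unit sill and range 10, and the order-relation correction (clip to [0, 1], then average an upward and a downward monotone pass) are all assumptions chosen for the example. The point it shows is that the kriging system is solved once and the same weights are reused for every threshold.

```python
import math

def spherical_cov(h, sill=1.0, a=10.0):
    """Spherical covariance with range a; sill and range are illustrative values."""
    if h >= a:
        return 0.0
    r = h / a
    return sill * (1.0 - 1.5 * r + 0.5 * r ** 3)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def correct_order(p):
    """Assumed correction: clip to [0, 1], then average upward and downward passes."""
    p = [min(1.0, max(0.0, v)) for v in p]
    up = p[:]
    for k in range(1, len(up)):
        up[k] = max(up[k], up[k - 1])
    down = p[:]
    for k in range(len(down) - 2, -1, -1):
        down[k] = min(down[k], down[k + 1])
    return [(a + b) / 2.0 for a, b in zip(up, down)]

def median_ik(u, coords, indicators, marginals, cov=spherical_cov, min_data=1):
    """Estimate the K ccdf values at u; return None if the node stays uninformed."""
    n = len(coords)
    if n < min_data:
        return None
    # Simple kriging system, built from the single (median) indicator covariance.
    A = [[cov(math.dist(p, q)) for q in coords] for p in coords]
    b = [cov(math.dist(p, u)) for p in coords]
    w = solve(A, b)  # one weight vector, shared by all thresholds
    est = [m + sum(w[a] * (indicators[a][k] - m) for a in range(n))
           for k, m in enumerate(marginals)]
    return correct_order(est)

# Two hard data coded at three thresholds; marginals are hypothetical.
coords = [(0.0, 0.0), (5.0, 0.0)]
indicators = [[0, 1, 1], [0, 0, 1]]
print(median_ik((2.0, 0.0), coords, indicators, [0.2, 0.5, 0.8]))
```

Because the weights do not depend on the threshold, the cost of the linear solve is paid once per node rather than K times, which is the practical appeal of median indicator kriging.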

Different types of data can be coded into a vector of K indicator values I(u) = [i(u; z^1), ..., i(u; z^K)].

Hard data

The value of Z at a given location u_α is known, equal to z(u_α) with no uncertainty.

The corresponding K indicator values are all valued 0 or 1:

$$
i(u_\alpha; z^k) =
\begin{cases}
1 & \text{if } z(u_\alpha) \le z^k \\
0 & \text{otherwise}
\end{cases}
\qquad k = 1, \ldots, K.
$$
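As a small illustration of this coding, the sketch below turns one hard datum into its indicator vector. The function name and the threshold values are assumptions made for the example.

```python
def code_hard_datum(z, thresholds):
    """Return [i(u; z^1), ..., i(u; z^K)], with i = 1 if z <= z^k and 0 otherwise."""
    return [1 if z <= zk else 0 for zk in thresholds]

# Illustrative thresholds; a datum of 5.0 lies above the first and below the other two.
print(code_hard_datum(5.0, [4.0, 5.5, 7.0]))  # [0, 1, 1]
```

Because the thresholds are increasing, the coded vector of a hard datum is always a run of 0s followed by a run of 1s.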

7.2 INDICATOR KRIGING

interval datum [2.1, 4.9] is coded as:

I =

where the question mark ? denotes an undefined value (missing values are represented by the integer −9966699 in SGeMS), which would have to be estimated by kriging. See Section 2.2.7 on how to enter missing values in SGeMS.
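The coding of an interval datum can be sketched as below. The function name and the thresholds are hypothetical, chosen only to illustrate the rule; only the interval [2.1, 4.9] and the missing-value code come from the text. An indicator is set to 1 when the threshold is at or above the upper bound, to 0 when it is below the lower bound, and left undefined in between.

```python
MISSING = -9966699  # SGeMS missing-value code quoted in the text

def code_interval_datum(lower, upper, thresholds):
    """Code a datum known only to lie in [lower, upper] into K indicator values."""
    coded = []
    for zk in thresholds:
        if zk >= upper:
            coded.append(1)        # z <= upper <= z^k, so the indicator is surely 1
        elif zk < lower:
            coded.append(0)        # z >= lower > z^k, so the indicator is surely 0
        else:
            coded.append(MISSING)  # threshold falls inside the interval: undefined
    return coded

# Hypothetical thresholds 1, 2, ..., 6 (not taken from the text).
print(code_interval_datum(2.1, 4.9, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))
```

With these hypothetical thresholds, the two indicators for thresholds 3 and 4 are the ones left undefined, since both thresholds fall inside the interval.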

Categorical variable

INDICATOR KRIGING can be applied to categorical variables, i.e. variables that take a finite number K of discrete values (also called classes or categories): z(u) ∈ {0, ..., K − 1}. The indicator variable for class k is defined as:

$$
I(u, k) =
\begin{cases}
1 & \text{if } Z(u) = k \\
0 & \text{otherwise}
\end{cases}
$$

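and the probability I^∗(u, k) for Z(u) belonging to class k is estimated by simple kriging:

$$
I^*(u, k) - E\{I(u, k)\} = \sum_{\alpha=1}^{n} \lambda_\alpha(u) \left[ I(u_\alpha, k) - E\{I(u_\alpha, k)\} \right]
$$

where E{I (u, k)} is the indicator mean (marginal probability) for class k and the λ_α(u) are the simple kriging weights.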

In the case of categorical variables, the estimated probabilities must all be in [0, 1] and verify:

$$
\sum_{k=1}^{K} I^*(u, k) = 1. \tag{7.2}
$$

If not, they are corrected as follows:

1. If I^∗(u, k) ∉ [0, 1], reset it to the closest bound. If all the probability values are less than or equal to 0, no correction is made and a warning is issued.

2. Standardize the values so that they sum to 1:

$$
I^*_{\text{corrected}}(u, k) = \frac{I^*(u, k)}{\sum_{i=1}^{K} I^*(u, i)}
$$

Parameters description

The INDICATOR KRIGING algorithm is activated from Estimation → Indicator Kriging in the algorithm panel. The INDICATOR KRIGING interface contains three pages: “General”, “Data” and “Variogram” (see Fig. 7.4). The text inside “[ ]” is the corresponding keyword in the INDICATOR KRIGING parameter file.

1. Estimation Grid Name [Grid Name] Name of the estimation grid.

2. Property Name Prefix [Property Name] Prefix for the estimation output. The suffix real# is added for each indicator.

3. # of indicators [Nb Indicators] Number of indicators to be estimated.

Figure 7.4 User interface for INDICATOR KRIGING

4. Categorical variable [Categorical Variable Flag] Indicates if the data are categorical or not.

5. Marginal probabilities [Marginal Probabilities]
   If continuous: probability to be below the thresholds. There must be [Nb Indicators] entries, monotonically increasing.
   If categorical: proportion for each category. There must be [Nb Indicators] entries adding to 1. The first entry corresponds to the category coded 0, the second to the category coded 1, and so on.

6. Indicator kriging type If Median IK [Median Ik Flag] is selected, the program uses median indicator kriging to estimate the ccdf. Otherwise, if Full IK [Full Ik Flag] is selected, a different IK system is solved for each threshold/class.

7. Hard Data Grid [Hard Data Grid] Grid containing the conditioning hard data.

8. Hard Data Indicators Properties [Hard Data Property] Conditioning primary data for the estimation. There must be [Nb Indicators] properties selected, the first one being for class 0, the second for class 1, and so on. If Full IK [Full Ik Flag] is selected, a location may not be informed for all thresholds.

9. Min Conditioning data [Min Conditioning Data] Minimum number of data to be retained in the search neighborhood.

10. Max Conditioning data [Max Conditioning Data] Maximum number of data to be retained in the search neighborhood.

11. Search Ellipsoid Geometry [Search Ellipsoid] Parametrization of the search ellipsoid, see Section 6.4.

12. Variogram [Variogram] Parametrization of the indicator variograms, see Section 6.5. Only one variogram is necessary if Median IK [Median Ik Flag] is selected. Otherwise there are [Nb Indicators] indicator variograms.

Example

The INDICATOR KRIGING algorithm is run on the point-set presented in Fig. 4.1a.

Probabilities of having a value below 4, 5.5 and 7 are computed with a median IK regionalization. The resulting conditional probabilities (ccdf) for these three thresholds are: 0.15, 0.5 and 0.88. The estimated probabilities for each threshold are shown in Fig. 7.5. The variogram model for the median indicator is:

γ (hx,hy)= 0.07Sph

Figure 7.5 Median indicator kriging: estimated probability to be less than (a) 4, (b) 5.5 and (c) 7 (color scale from 0 to 1)

Figure 7.6 Median indicator kriging with inequality data: estimated probability to be less than (a) 4, (b) 5.5 and (c) 7 (color scale from 0 to 1)

The search ellipsoid is of size 80 × 80 × 1, with a minimum of 5 and maximum of 25 conditioning data.
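How the search ellipsoid and the minimum and maximum numbers of conditioning data interact can be sketched as follows. This is an illustration under stated assumptions, not the SGeMS search routine: the function name is hypothetical, the three sizes are read as half-axis lengths, the ellipsoid is axis-aligned (no rotation angles), and the closest data are kept when more than the maximum are found.

```python
def search_neighborhood(u, data, radii=(80.0, 80.0, 1.0), min_data=5, max_data=25):
    """Return up to max_data closest points inside the ellipsoid, or None if too few."""
    inside = []
    for p in data:
        # Squared anisotropic distance: 1.0 marks the ellipsoid boundary.
        d2 = sum(((pc - uc) / r) ** 2 for pc, uc, r in zip(p, u, radii))
        if d2 <= 1.0:
            inside.append((d2, p))
    if len(inside) < min_data:
        return None  # node left uninformed
    inside.sort(key=lambda t: t[0])
    return [p for _, p in inside[:max_data]]

# Thirty points along the x axis, one unit apart, searched from the origin.
data = [(float(i), 0.0, 0.0) for i in range(30)]
print(len(search_neighborhood((0.0, 0.0, 0.0), data)))  # 25
```

All thirty points fall inside the ellipsoid here, so the cap of 25 applies. With fewer than five points in range the function returns None, mirroring the "set node as uninformed" branch of the algorithm.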

A set of 200 interval-type data is added to the data set; these interval data only tell whether the z-values at these locations are above or below 5.5. The coding of these inequality data is:

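$$
\text{if } z(u) < 5.5 \text{ then } i(u) = \begin{bmatrix} ? \\ 1 \\ 1 \end{bmatrix};
\qquad
\text{if } z(u) > 5.5 \text{ then } i(u) = \begin{bmatrix} 0 \\ 0 \\ ? \end{bmatrix}.
$$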

Figure 7.6 shows the resulting estimation maps for the same thresholds used in Fig. 7.5. The inequality data mostly modify the probability map for the 5.5 threshold; their impact on the low and high thresholds is not as strong.

7.3 COKRIGING: kriging with secondary data

The COKRIGING algorithm integrates the information carried by a secondary variable related to the primary attribute being estimated. The kriging system of equations is then extended to take into account that extra information. A coregionalization model must be provided to integrate secondary variables. SGeMS offers three choices: the linear model of coregionalization (LMC), the Markov Model 1 (MM1) and the Markov Model 2 (MM2). The LMC accounts for all secondary data within the search neighborhood while the Markov models only retain those secondary data that are co-located with the primary data; see Section 3.6.4.

The LMC option can be used with either simple or ordinary cokriging. The Markov models (MM1 or MM2) can only be solved with simple cokriging; using ordinary cokriging would lead to ignoring the secondary variable since the sum of weights for the secondary variable must be equal to zero (Goovaerts, 1997, p. 236).
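The argument can be made explicit in the collocated case. As a sketch, assume a single secondary datum Y(u) is retained at the estimation location; the symbols λ_α, λ′ and Y are notational assumptions, not taken from the text. The traditional ordinary cokriging constraints then read:

$$
Z^*(u) = \sum_{\alpha=1}^{n} \lambda_\alpha Z(u_\alpha) + \lambda' Y(u),
\qquad \sum_{\alpha=1}^{n} \lambda_\alpha = 1,
\qquad \lambda' = 0,
$$

so the secondary datum receives zero weight and the estimate reduces to ordinary kriging of the primary data alone.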

The detailed COKRIGING algorithm is presented in Algorithm 7.4.

Algorithm 7.4 COKRIGING

for each location u in the grid do
    Get the primary conditioning data n
    Get the secondary conditioning data n′
    if n is large enough then
        Solve the cokriging system
        Compute the cokriging estimate and cokriging variance
    else
        Set node as uninformed, move to the next location
    end if
end for
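One node of Algorithm 7.4 can be sketched for the simplest configuration: simple cokriging with a single collocated secondary datum under the Markov Model 1. The sketch assumes both variables are standardized (zero mean, unit variance), a spherical correlogram of range 10 for the primary variable, and the MM1 cross-correlogram ρ_ZY(h) = ρ_ZY(0) ρ_Z(h); the function names and parameter values are hypothetical and do not reproduce the SGeMS code.

```python
import math

def corr(h, a=10.0):
    """Spherical correlogram of the primary variable; range a is illustrative."""
    if h >= a:
        return 0.0
    r = h / a
    return 1.0 - 1.5 * r + 0.5 * r ** 3

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def collocated_sck_mm1(u, coords, z, y_u, rho0, min_data=1):
    """Return (estimate, variance) at u, or None if the node stays uninformed.

    coords, z : locations and standardized values of the primary data
    y_u       : standardized secondary value collocated with u
    rho0      : correlation between primary and secondary at lag 0
    """
    n = len(coords)
    if n < min_data:
        return None
    # Cross terms between each primary datum and Y(u), from the MM1 assumption.
    cross = [rho0 * corr(math.dist(p, u)) for p in coords]
    A = [[corr(math.dist(p, q)) for q in coords] + [cross[i]]
         for i, p in enumerate(coords)]
    A.append(cross + [1.0])
    b = [corr(math.dist(p, u)) for p in coords] + [rho0]
    w = solve(A, b)
    est = sum(w[i] * z[i] for i in range(n)) + w[n] * y_u
    var = 1.0 - sum(wi * bi for wi, bi in zip(w, b))
    return est, var

# One primary datum 5 units away from u, plus a collocated secondary value.
print(collocated_sck_mm1((5.0, 0.0), [(0.0, 0.0)], [2.0], 1.0, rho0=0.8))
```

Setting `rho0` to 0 removes the cross terms, the secondary weight becomes 0, and the result is the simple kriging estimate from the primary data alone. A non-zero correlation lowers the cokriging variance.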

In document Applied-Geostatistics-With-SGeMS-a-User-s-Guide.pdf (Page 135-140)