Estimation algorithms
Algorithm 7.3 Median indicator kriging
1: for each location u in the grid do
2:   Get the conditioning data n
3:   if n is large enough then
4:     Solve the simple indicator kriging system and store the vector of kriging weights
5:   else
6:     Set node as uninformed, move to the next location
7:   end if
8:   for each category k do
9:     Compute the kriging estimate i*_k(u) with the kriging weights found in step 4
10:   end for
11:   Correct F_Z(u) for order violations
12: end for
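The steps of the median indicator kriging listing can be sketched for a single location as follows. This is a minimal illustration under stated assumptions, not the SGeMS implementation: the function names, the spherical covariance with range `a`, and the order-relation correction (averaging an upward and a downward monotone pass, one common choice) are all assumptions, since the text does not specify them.

```python
import numpy as np

def spherical_cov(h, sill, a):
    """Covariance of a spherical model with the given sill and range a (assumed model)."""
    r = np.minimum(np.asarray(h, dtype=float) / a, 1.0)
    return sill * (1.0 - 1.5 * r + 0.5 * r**3)

def correct_order(cdf):
    """Clip to [0, 1], then average an upward and a downward monotone pass.
    This correction rule is an assumption; the text only says violations are corrected."""
    cdf = np.clip(cdf, 0.0, 1.0)
    up = np.maximum.accumulate(cdf)
    down = np.minimum.accumulate(cdf[::-1])[::-1]
    return 0.5 * (up + down)

def median_ik(u, coords, indicators, marginals, sill, a, n_min=1):
    """coords: (n, d) data locations; indicators: (n, K) hard indicator data;
    marginals: (K,) indicator means. Returns K probabilities, or None if uninformed."""
    if len(coords) < n_min:
        return None  # node left uninformed
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d0 = np.linalg.norm(coords - u, axis=-1)
    # One simple kriging system, solved once and shared by all thresholds
    weights = np.linalg.solve(spherical_cov(d, sill, a), spherical_cov(d0, sill, a))
    # Simple kriging estimate for every threshold with the same weights
    est = marginals + weights @ (indicators - marginals)
    return correct_order(est)
```

Because the same covariance model is used for every threshold, the kriging system is solved only once per location, which is what makes the median variant cheaper than solving one system per threshold.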
Different types of data can be coded into a vector of K indicator values I(u) = [i(u; z1), ..., i(u; zK)].
Hard data
The value of Z at a given location uα is known and equal to z(uα), with no uncertainty.
The corresponding K indicator values are all valued 0 or 1:
\[
i(\mathbf{u}_\alpha; z_k) =
\begin{cases}
1 & \text{if } z(\mathbf{u}_\alpha) \le z_k \\
0 & \text{otherwise}
\end{cases}
\qquad k = 1, \ldots, K.
\]
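The hard-data coding above amounts to one comparison per threshold. A minimal sketch, assuming NumPy and a hypothetical helper name `code_hard_datum`:

```python
import numpy as np

def code_hard_datum(z, thresholds):
    """Return the K indicator values i(u; z_k) = 1 if z <= z_k, 0 otherwise."""
    return (z <= np.asarray(thresholds, dtype=float)).astype(int)

# Illustrative datum value 5.0 (hypothetical) with thresholds 4, 5.5 and 7
print(code_hard_datum(5.0, [4, 5.5, 7]))  # [0 1 1]
```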
7.2 INDICATOR KRIGING
interval datum [2.1, 4.9] is coded as:
I =
where the question mark ? denotes an undefined value (missing values are represented by the integer −9966699 in SGeMS), which would have to be estimated by kriging. See Section 2.2.7 on how to enter missing values in SGeMS.
Categorical variable
INDICATOR KRIGING can be applied to categorical variables, i.e. variables that take a finite number K of discrete values (also called classes or categories): z(u) ∈ {0, ..., K − 1}. The indicator variable for class k is defined as:
\[
I(\mathbf{u}, k) =
\begin{cases}
1 & \text{if } Z(\mathbf{u}) = k \\
0 & \text{otherwise}
\end{cases}
\]
and the probability I∗(u, k) for Z(u) belonging to class k is estimated by simple kriging:
where E{I (u, k)} is the indicator mean (marginal probability) for class k.
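For reference, the standard simple kriging form of this estimator can be written as below. The notation is an assumption introduced here for illustration: n conditioning data at locations u_α and kriging weights λ_α(u, k), neither of which is named in the surrounding text.

```latex
\[
I^*(\mathbf{u}, k) - E\{I(\mathbf{u}, k)\}
  = \sum_{\alpha=1}^{n} \lambda_\alpha(\mathbf{u}, k)
    \bigl[ I(\mathbf{u}_\alpha, k) - E\{I(\mathbf{u}_\alpha, k)\} \bigr]
\]
```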
In the case of categorical variables, the estimated probabilities must all be in [0, 1] and verify:
\[
\sum_{k=1}^{K} I^*(\mathbf{u}, k) = 1. \tag{7.2}
\]
If not, they are corrected as follows:
1. If I∗(u, k) ∉ [0, 1], reset it to the closest bound. If all the probability values are less than or equal to 0, no correction is made and a warning is issued.
2. Standardize the values so that they sum up to 1:
\[
I^*_{\text{corrected}}(\mathbf{u}, k) = \frac{I^*(\mathbf{u}, k)}{\sum_{i=1}^{K} I^*(\mathbf{u}, i)}.
\]
Parameters description
The INDICATOR KRIGING algorithm is activated from Estimation → Indicator Kriging in the algorithm panel. The INDICATOR KRIGING interface contains three pages: “General”, “Data” and “Variogram” (see Fig. 7.4). The text inside
“[ ]” is the corresponding keyword in the INDICATOR KRIGING parameter file.
1. Estimation Grid Name [Grid Name] Name of the estimation grid.
2. Property Name Prefix [Property Name] Prefix for the estimation output.
The suffix real# is added for each indicator.
3. # of indicators [Nb Indicators] Number of indicators to be estimated.
Figure 7.4 User interface for INDICATOR KRIGING
4. Categorical variable [Categorical Variable Flag] Indicates if the data are categorical or not.
5. Marginal probabilities [Marginal Probabilities]
If continuous: probability to be below each threshold. There must be [Nb Indicators] entries, monotonically increasing.
If categorical: proportion of each category. There must be [Nb Indicators] entries adding up to 1. The first entry corresponds to the category coded 0, the second to the category coded 1, and so on.
6. Indicator kriging type If Median IK [Median Ik Flag] is selected, the program uses median indicator kriging to estimate the ccdf. Otherwise, if Full IK [Full Ik Flag] is selected, a different IK system is solved for each threshold/class.
7. Hard Data Grid [Hard Data Grid] Grid containing the conditioning hard data.
8. Hard Data Indicators Properties [Hard Data Property] Conditioning primary data for the estimation. There must be [Nb Indicators] properties selected, the first one being for class 0, the second for class 1, and so on. If Full IK [Full Ik Flag] is selected, a location does not need to be informed for all thresholds.
9. Min Conditioning data [Min Conditioning Data] Minimum number of data to be retained in the search neighborhood.
10. Max Conditioning data [Max Conditioning Data] Maximum number of data to be retained in the search neighborhood.
11. Search Ellipsoid Geometry [Search Ellipsoid] Parametrization of the search ellipsoid, see Section 6.4.
12. Variogram [Variogram] Parametrization of the indicator variograms, see Section 6.5. Only one variogram is necessary if Median IK [Median Ik Flag] is selected. Otherwise there are [Nb Indicators] indicator variograms.
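The constraints stated for the marginal probabilities (parameter 5) can be checked before a run. The sketch below is a hypothetical helper, not part of SGeMS; the name `check_marginals`, the tolerance `tol`, and the choice to accept non-decreasing (rather than strictly increasing) values are assumptions.

```python
import numpy as np

def check_marginals(probs, categorical, tol=1e-6):
    """Return True if the marginal probabilities satisfy the stated constraints."""
    p = np.asarray(probs, dtype=float)
    if np.any(p < 0.0) or np.any(p > 1.0):
        return False  # every entry must be a probability
    if categorical:
        # Category proportions must add up to 1 (within an assumed tolerance)
        return bool(abs(p.sum() - 1.0) <= tol)
    # Continuous case: cumulative probabilities must not decrease with the threshold
    return bool(np.all(np.diff(p) >= 0.0))
```

For instance, `check_marginals([0.15, 0.5, 0.88], categorical=False)` accepts the three threshold probabilities quoted in the example of this section.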
Example
The INDICATOR KRIGING algorithm is run on the point-set presented in Fig. 4.1a.
Probabilities of having a value below 4, 5.5 and 7 are computed with a median IK regionalization. The resulting conditional probabilities (ccdf) for these three thresholds are: 0.15, 0.5 and 0.88. The estimated probabilities for each threshold are shown in Fig. 7.5. The variogram model for the median indicator is:
γ (hx,hy)= 0.07Sph
Figure 7.5 Median indicator kriging. Panels: (a) estimated probability to be less than 4; (b) estimated probability to be less than 5.5; (c) estimated probability to be less than 7. Probability scale from 0 to 1.
Figure 7.6 Median indicator kriging with inequality data. Panels: (a) estimated probability to be less than 4; (b) estimated probability to be less than 5.5; (c) estimated probability to be less than 7. Probability scale from 0 to 1.
The search ellipsoid is of size 80 × 80 × 1, with a minimum of 5 and maximum of 25 conditioning data.
A set of 200 interval-type data is added to the data set; these interval data only tell whether the z-values at these locations are above or below 5.5. The coding of these inequality data is:
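\[
\text{if } z(\mathbf{u}) < 5.5 \text{ then } \mathbf{i}(\mathbf{u}) =
\begin{bmatrix} ? \\ 1 \\ 1 \end{bmatrix};
\qquad
\text{if } z(\mathbf{u}) > 5.5 \text{ then } \mathbf{i}(\mathbf{u}) =
\begin{bmatrix} 0 \\ 0 \\ ? \end{bmatrix}.
\]
With thresholds 4, 5.5 and 7, a value below 5.5 is also below 7, so only the indicator at threshold 4 remains undefined; a value above 5.5 is also above 4, so only the indicator at threshold 7 remains undefined.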
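The coding of an inequality datum follows from the hard-data indicator definition: an indicator is 0 when the threshold lies at or below the lower bound, 1 when it lies at or above the upper bound, and undefined in between. A minimal sketch, assuming strict bounds (lower < z < upper), a hypothetical helper name `code_interval_datum`, and NaN as a stand-in for the undefined value "?" (SGeMS itself uses the integer −9966699):

```python
import numpy as np

def code_interval_datum(lower, upper, thresholds):
    """Indicator coding of a datum known only to satisfy lower < z < upper."""
    zk = np.asarray(thresholds, dtype=float)
    coded = np.full(zk.shape, np.nan)  # NaN stands for the undefined value "?"
    coded[zk <= lower] = 0.0           # z is certainly above these thresholds
    coded[zk >= upper] = 1.0           # z is certainly below these thresholds
    return coded

# Data known only to be below or above 5.5, with thresholds 4, 5.5 and 7
below = code_interval_datum(-np.inf, 5.5, [4, 5.5, 7])
above = code_interval_datum(5.5, np.inf, [4, 5.5, 7])
```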
Figure 7.6 shows the resulting estimation maps for the same thresholds used in Fig. 7.5. The inequality data mostly modify the probability map for the 5.5 threshold; their impact on the low and high thresholds is weaker.
7.3 COKRIGING: kriging with secondary data
The COKRIGING algorithm integrates the information carried by a secondary variable related to the primary attribute being estimated. The kriging system of equations is then extended to take into account that extra information. A coregionalization model must be provided to integrate secondary variables. SGeMS offers three choices: the linear model of coregionalization (LMC), the Markov Model 1 (MM1) and the Markov Model 2 (MM2). The LMC accounts for all secondary data within the search neighborhood while the Markov models only retain those secondary data that are co-located with the primary data; see Section 3.6.4.
The LMC option can be used with either simple or ordinary cokriging. The Markov models (MM1 or MM2) can only be solved with simple cokriging; using ordinary cokriging would lead to ignoring the secondary variable since the sum of weights for the secondary variable must be equal to zero (Goovaerts, 1997, p. 236).
The detailed COKRIGING algorithm is presented in Algorithm 7.4.
Algorithm 7.4 COKRIGING
1: for each location u in the grid do
2:   Get the primary conditioning data n
3:   Get the secondary conditioning data n′
4:   if n is large enough then
5:     Solve the cokriging system
6:     Compute the cokriging estimate and cokriging variance
7:   else
8:     Set node as uninformed, move to the next location
9:   end if
10: end for
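As an illustration of steps 2 to 6 for one location, the sketch below solves a simple cokriging system with a single co-located secondary datum under MM1. It is a simplified example under stated assumptions, not the SGeMS implementation: both variables are assumed standardized (mean 0, variance 1), the exponential correlogram with practical range `a` is an arbitrary choice, and the function and parameter names are hypothetical. Under MM1 the cross-correlogram is taken as `rho12` times the primary correlogram, where `rho12` is the co-located correlation coefficient.

```python
import numpy as np

def exp_corr(h, a):
    """Exponential correlogram with practical range a (illustrative choice)."""
    return np.exp(-3.0 * np.asarray(h, dtype=float) / a)

def collocated_sck_mm1(u, coords, z1, z2_at_u, rho12, a, n_min=1):
    """Simple co-located cokriging of standardized variables under MM1.
    Returns (estimate, variance), or None if the node is left uninformed."""
    n = len(coords)
    if n < n_min:
        return None
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    c0 = exp_corr(np.linalg.norm(coords - u, axis=-1), a)
    # Cokriging matrix: primary-primary block, MM1 cross terms, secondary variance
    A = np.empty((n + 1, n + 1))
    A[:n, :n] = exp_corr(d, a)
    A[:n, n] = A[n, :n] = rho12 * c0
    A[n, n] = 1.0
    b = np.append(c0, rho12)           # right-hand side: primary and cross terms
    w = np.linalg.solve(A, b)          # n primary weights, then the secondary weight
    estimate = w[:n] @ z1 + w[n] * z2_at_u
    variance = 1.0 - w @ b             # simple cokriging variance (unit primary variance)
    return estimate, variance
```

With `rho12 = 0` the secondary weight vanishes and the result reduces to simple kriging of the primary data alone.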