Data sets and SGeMS EDA tools
This chapter presents the data sets used to demonstrate the geostatistics algorithms in the following chapters. It also provides an introduction to the exploratory data analysis (EDA) tools of the SGeMS software.
Section 4.1 presents the two data sets: one in 2D and one in 3D. The smaller 2D data set is sufficient to illustrate most geostatistics algorithms (kriging and variogram-based simulation). The 3D data set, which mimics a large deltaic channel reservoir, is used to demonstrate these algorithms on large 3D applications; it is also used for the EDA illustrations.
Section 4.2 introduces the basic EDA tools, such as the histogram, Q-Q (quantile–quantile) plot, P-P (probability–probability) plot and scatter plot.
4.1 The data sets
4.1.1 The 2D data set
This 2D data set is derived from the published Ely data set (Journel and Kyriakidis, 2004) by taking the logarithm of all positive values and discarding the negative ones. The original data are elevation values in the Ely area, Nevada. The corresponding SGeMS project is located at DataSets/Ely1.prj. This project contains two SGeMS objects: Ely1 pset and Ely1 pset samples.
• The Ely1 pset object is a point set grid with 10 000 points, constituting a reference (exhaustive) data set. This point set grid holds three properties: the local varying mean data (“lvm”), the values of the primary variable (“Primary”) and the values of a co-located secondary property (“Secondary”). The point set grid and its properties are shown in Fig. 4.1a–d. This object can also be used to hold properties obtained from kriging algorithms or stochastic simulations.
Figure 4.1 The Ely data set: (a) simulation grid (point sets); (b) local varying mean; (c) reference values; (d) secondary data; (e) hard data
• The Ely1 pset samples object provides 50 well data (“samples”), which can be used as hard primary data to constrain the geostatistical estimations or simulations. These data, shown in Fig. 4.1e, were sampled from the reference data set (Fig. 4.1c).
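The way the 2D data set is derived can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used to build the SGeMS project: the function names, the toy input values and the random seed are hypothetical, the natural logarithm is assumed (the text does not give the base), zeros are discarded along with negative values, and the 50 sample locations are assumed to be drawn at random.

```python
import math
import random


def log_transform(values):
    """Keep strictly positive values and return their natural logarithms.

    Assumptions: natural log; zeros are dropped together with negatives.
    """
    return [math.log(v) for v in values if v > 0]


def draw_samples(reference, n_samples, seed=0):
    """Draw n_samples distinct (index, value) pairs from the reference data."""
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(len(reference)), n_samples))
    return [(i, reference[i]) for i in indices]


# Hypothetical elevation-like values, including entries that are discarded.
raw = [-3.0, 0.0, 1.0, 20.0, 150.0, 400.0]
reference = log_transform(raw)            # plays the role of the exhaustive data
samples = draw_samples(reference, n_samples=2)
print(len(reference), samples)
```

With the actual data set, `reference` would hold the 10 000 reference values and `n_samples` would be 50.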
4.1.2 The 3D data set
The 3D data set retained in this book is extracted from a layer of Stanford VI, a synthetic data set representing a fluvial channel reservoir (Castro, 2007). The corresponding SGeMS project is located at DataSets/stanford6.prj. This project contains three SGeMS objects: well, grid and container.
• The well object contains the well data set. There are 26 wells in total (21 vertical wells, four deviated wells and one horizontal well). The six properties associated with these wells are bulk density, a binary facies indicator (sand channel or mud floodplain), P-wave impedance, P-wave velocity, permeability and porosity. These data will be used as hard or soft conditioning data in the example runs of Chapters 7 to 9. Figure 4.2 shows the well locations and the porosity distribution along the wells.
Figure 4.2 Well locations and the porosity distribution along the Stanford VI wells
• The grid object is a Cartesian grid (its rectangular boundary is shown in Fig. 4.2), with
– grid size: 150 × 200 × 80,
– origin point at (0,0,0),
– unit cell size in each x/y/z direction.
This reservoir grid holds the following two variables.
1. Probability data. The facies probability data were calibrated from the original seismic impedance data using the well data (facies and P-wave impedance). Two sand probability cubes (properties P(sand|seis) and P(sand|seis) 2) are provided: the first displays sharp channel boundaries (best quality data, see Fig. 4.3a); the second displays fuzzier channel boundaries (poor quality data, see Fig. 4.3b). These probability data will be used as soft data to constrain the facies modeling.
2. Region code. Typically a large reservoir would be divided into different regions, with each individual region having its own characteristics, for instance, different channel orientations and channel thicknesses. The regions associated with the Stanford VI reservoir are rotation regions (property angle) corresponding to different channel orientations (Fig. 4.4), and affinity (scaling) regions (property affinity) corresponding to different channel thicknesses (Fig. 4.5). Each rotation region is labeled with an indicator number and is assigned a rotation angle value, see Table 4.1. The affinity indicators and the attached affinity values are given in Table 4.2. An affinity value must be assigned to each x/y/z direction; the larger the affinity value, the thicker the channel in that direction.
Figure 4.3 Two Stanford VI sand probability cubes: (a) good quality data; (b) poor quality data
Figure 4.4 Angle indicator cube
Figure 4.5 Affinity indicator cube
• The container object is composed of all the reservoir nodes located inside the channels, hence it is a point set with (x,y,z) coordinates. The user can perform geostatistics on this channel container, for example, to estimate the within-channel petrophysical properties. In Fig. 4.6 the channel container is represented by all nodes with value 1 (gray), and the non-reservoir area is in black.
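The grid geometry and the construction of a channel container can be illustrated with a short Python sketch. It is not SGeMS code: all function names are hypothetical, the node ordering (x varying fastest, then y, then z) and the placement of node (0,0,0) at the origin are assumptions made for the illustration, and the small facies array is invented.

```python
NX, NY, NZ = 150, 200, 80        # grid size given in the text
ORIGIN = (0.0, 0.0, 0.0)         # origin point given in the text
CELL = (1.0, 1.0, 1.0)           # unit cell size in x, y and z


def node_id(i, j, k, nx=NX, ny=NY):
    """Flat index of node (i, j, k); assumes x varies fastest, then y, then z."""
    return i + nx * (j + ny * k)


def node_ijk(n, nx=NX, ny=NY):
    """Inverse of node_id: recover (i, j, k) from the flat index."""
    k, rest = divmod(n, nx * ny)
    j, i = divmod(rest, nx)
    return i, j, k


def node_xyz(i, j, k, origin=ORIGIN, cell=CELL):
    """Coordinates of node (i, j, k), assuming node (0, 0, 0) sits at the origin."""
    return tuple(o + idx * c for o, idx, c in zip(origin, (i, j, k), cell))


def channel_container(facies, nx, ny):
    """Return the (x, y, z) of every node whose facies indicator is 1 (sand)."""
    return [node_xyz(*node_ijk(n, nx, ny))
            for n, f in enumerate(facies) if f == 1]


n_nodes = NX * NY * NZ           # total number of grid nodes

# Hypothetical 3 x 2 x 1 facies indicator, flattened with x fastest.
toy_facies = [0, 1, 1,
              0, 0, 1]
points = channel_container(toy_facies, nx=3, ny=2)
```

The result is a list of coordinates rather than a regular array, which is why the container is stored as a point set.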
Table 4.1 Rotation region indicators for Stanford VI
Angle category          0    1    2    3    4    5    6    7    8    9
Angle value (degree)  −63  −49  −35  −21   −7    7   21   35   49   63
Table 4.2 Affinity region indicators for Stanford VI
Affinity category          0            1            2
Affinity value ([x,y,z])   [2, 2, 2]    [1, 1, 1]    [0.5, 0.5, 0.5]
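As an illustration of how the region codes can be used, Tables 4.1 and 4.2 can be encoded as lookup tables that turn the indicator stored at a grid node into the rotation angle and affinity values of its region. The dictionary and function names below are hypothetical; only the numerical values come from the tables.

```python
# Table 4.1: rotation category -> rotation angle (degrees).
ANGLE_BY_CATEGORY = {
    0: -63, 1: -49, 2: -35, 3: -21, 4: -7,
    5: 7, 6: 21, 7: 35, 8: 49, 9: 63,
}

# Table 4.2: affinity category -> affinity values in the x, y and z directions.
AFFINITY_BY_CATEGORY = {
    0: (2.0, 2.0, 2.0),
    1: (1.0, 1.0, 1.0),
    2: (0.5, 0.5, 0.5),
}


def region_parameters(angle_category, affinity_category):
    """Return (angle in degrees, (ax, ay, az)) for a pair of region indicators."""
    return (ANGLE_BY_CATEGORY[angle_category],
            AFFINITY_BY_CATEGORY[affinity_category])


# A node in rotation region 3 and affinity region 2.
angle, affinity = region_parameters(3, 2)
```

The ten rotation angles are evenly spaced, 14 degrees apart, and symmetric about zero.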
Figure 4.6 Stanford VI channel container (gray nodes)
Although this 3D data set is taken from a reservoir model, it could represent any 3D spatially distributed attribute and be used for testing applications in fields other than reservoir modeling. For example, one can interpret each 2D horizontal layer of the seismic data cube as coarse satellite measurements defined over the same area but recorded at different times. The application would then be the modeling of landscape change in both space and time.