Step 2: Conduct a Preliminary Data Review
2.3.9 Plots for Spatial Data
The graphical representations of the preceding sections may be useful for exploring spatial data. However, an analyst examining spatial data may be interested in the location of extreme values, overall spatial trends, and the degree of continuity among neighboring locations.
Graphical representations for spatial data include postings, symbol plots, correlograms, h-scatter plots, and contour plots.
EPA QA/G-9 Final
QA00 Version 2 - 37 July 2000
The graphical representations presented in this section are recommended for all spatial data, regardless of whether geostatistical methods will be used to analyze the data. They help identify spatial patterns that need to be accounted for in the analysis of the data. If the analyst uses geostatistical methods such as kriging, these representations will also play an important role in the geostatistical analysis.
2.3.9.1 Posting Plots
A posting plot (Figure 2-14) is a map of data locations along with corresponding data values. Data posting may reveal obvious errors in data location and identify data values that may be in error. The graph of the sampling locations gives the analyst an idea of how the data were collected (i.e., the sampling design), areas that may have been inaccessible, and areas of special interest to the decision maker which may have been heavily sampled. It is often useful to mark the highest and lowest values of the data to see if there are any obvious trends. If all of the highest concentrations fall in one region of the plot, the analyst may consider some method such as post-stratifying the data (stratification after the data are collected and analyzed) to account for this fact in the analysis. Directions for generating a posting of the data (a posting plot) are contained in Box 2-26.
Figure 2-14. Example of a Posting Plot [map of the sample locations, each labeled with its data value as listed in Box 2-26; map labels: Creek, Road]
2.3.9.2 Symbol Plots
For large amounts of data, a posting plot may not be feasible and a symbol plot (Figure 2-15) may be used instead. A symbol plot is the same as a posting plot, except that instead of posting individual data values, symbols are posted for ranges of the data values. For example, the symbol '0' could represent all concentration levels less than 100 ppm, the symbol '1' could represent all concentration levels between 100 ppm and 200 ppm, etc. Directions for generating a symbol plot are contained in Box 2-26.
Figure 2-15. Example of a Symbol Plot [map of the sample locations, each marked with its range symbol as listed in Box 2-26; map labels: Creek, Road]
Box 2-26: Directions for Generating a Posting Plot and a Symbol Plot with an Example
On a map of the site, plot the location of each sample. At each location, either post the value of the data point (a posting plot) or post a symbol indicating the data range within which the value falls (a symbol plot), using one unique symbol per data range.
Example: The spatial data displayed in the table below contain both a location (Northing and Easting) and a concentration level ([c]). The data range from 4.0 to 35.5, so intervals of width 5 were chosen to group the data:
Range          Symbol      Range          Symbol
 0.0 -  4.9      0         20.0 - 24.9      4
 5.0 -  9.9      1         25.0 - 29.9      5
10.0 - 14.9      2         30.0 - 34.9      6
15.0 - 19.9      3         35.0 - 39.9      7
The data values with corresponding symbols then become:
Northing  Easting   [c]   Symbol      Northing  Easting   [c]   Symbol
  25.0      0.0     4.0     0           15.0     15.0    16.5     3
  25.0      5.0    11.6     2           10.0      0.0     8.9     1
  25.0     10.0    14.9     2           10.0      5.0    14.7     2
  25.0     15.0    17.4     3           10.0     10.0    10.9     2
  20.0      0.0    17.7     3           10.0     15.0    12.4     2
  20.0      5.0    12.4     2            5.0      0.0    22.8     4
  20.0     10.0    28.6     5            5.0      5.0    19.1     3
  20.0     15.0     7.7     1            5.0     10.0    10.2     2
  15.0      0.0    15.2     3            5.0     15.0     5.2     1
  15.0      5.0    35.5     7            0.0      5.0     4.9     0
  15.0     10.0    14.7     2            0.0     15.0    17.2     3
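The directions in Box 2-26 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of the guidance: the function names symbol_for and text_plot are hypothetical, only the first eight rows of the table (Northings 25.0 and 20.0) are used, and a coarse text grid stands in for a map of the site.

```python
# (northing, easting, concentration) for the first eight rows of the Box 2-26 table.
samples = [
    (25.0, 0.0, 4.0), (25.0, 5.0, 11.6), (25.0, 10.0, 14.9), (25.0, 15.0, 17.4),
    (20.0, 0.0, 17.7), (20.0, 5.0, 12.4), (20.0, 10.0, 28.6), (20.0, 15.0, 7.7),
]

def symbol_for(value, width=5.0):
    """Range symbol for a value: 0 for 0.0-4.9, 1 for 5.0-9.9, and so on."""
    return int(value // width)

def text_plot(samples, label):
    """Place one label per sample on a text grid, with north at the top."""
    northings = sorted({n for n, _, _ in samples}, reverse=True)
    eastings = sorted({e for _, e, _ in samples})
    cells = {(n, e): label(c) for n, e, c in samples}
    lines = []
    for n in northings:
        # Locations without a sample are left blank.
        lines.append("".join(f"{cells.get((n, e), ''):>6}" for e in eastings))
    return "\n".join(lines)

posting = text_plot(samples, lambda c: f"{c:.1f}")          # posting plot: data values
symbols = text_plot(samples, lambda c: str(symbol_for(c)))  # symbol plot: range symbols

print(posting)
print(symbols)
```

Because the ranges all have the same width, the symbol is simply the value divided by 5 and rounded down; unequal ranges would need an explicit lookup table instead.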
2.3.9.3 Other Spatial Graphical Representations
The two plots described in Sections 2.3.9.1 and 2.3.9.2 provide information on the location of extreme values and spatial trends. The graphs below address another item of interest to the data analyst: the continuity of the spatial data. These graphical representations are not described in detail because they are used more for preliminary geostatistical analysis, and they can be difficult to develop and interpret. For more information on these representations, consult a statistician.
An h-scatter plot is a plot of all possible pairs of data whose locations are separated by a fixed distance in a fixed direction (indexed by h). For example, an h-scatter plot could be based on all the pairs whose locations are 1 meter apart in a southerly direction. An h-scatter plot is similar in appearance to a scatter plot (Section 2.3.7.2). The shape of the spread of the data in an h-scatter plot indicates the degree of continuity among data values a certain distance apart in a particular direction. If all the plotted values fall close to a fixed line, then the data values at locations separated by a fixed distance in a fixed direction are very similar. As data values become less and less similar, the spread of the data around the fixed line increases outward. The data analyst may construct several h-scatter plots with different distances to evaluate the change in continuity in a fixed direction.
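The pairing step behind an h-scatter plot can be sketched as follows. This is a minimal illustration, assuming samples on a regular grid so that a location's partner can be found by exact coordinate lookup; the function name h_scatter_pairs is hypothetical, and the data are the first eight rows of the Box 2-26 table.

```python
# (northing, easting, concentration) for the first eight rows of the Box 2-26 table.
samples = [
    (25.0, 0.0, 4.0), (25.0, 5.0, 11.6), (25.0, 10.0, 14.9), (25.0, 15.0, 17.4),
    (20.0, 0.0, 17.7), (20.0, 5.0, 12.4), (20.0, 10.0, 28.6), (20.0, 15.0, 7.7),
]

def h_scatter_pairs(samples, d_north, d_east):
    """Pair each value with the value at the location offset by h = (d_north, d_east)."""
    by_location = {(n, e): c for n, e, c in samples}
    pairs = []
    for n, e, c in samples:
        # Assumption: grid coordinates are exact, so a dictionary lookup finds the partner.
        partner = by_location.get((n + d_north, e + d_east))
        if partner is not None:
            pairs.append((c, partner))
    return pairs

# All pairs whose locations are 5 units apart in a southerly direction.
pairs_south = h_scatter_pairs(samples, -5.0, 0.0)
for here, there in pairs_south:
    print(here, there)
```

Plotting the first member of each pair against the second gives the h-scatter plot for that distance and direction. Irregularly spaced data would need a distance and angle tolerance rather than an exact lookup.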
A correlogram is a plot of the correlations of the h-scatter plots. Because an h-scatter plot only displays the correlation between the pairs of data whose locations are separated by one fixed distance in a fixed direction, it is useful to have a graphical representation of how these correlations change for different separation distances in that direction. The correlogram is such a plot; it allows the analyst to evaluate the change in continuity in a fixed direction as a function of the distance between two points. A spatial correlogram is similar in appearance to a temporal correlogram (Section 2.3.8.2). The correlogram spans opposite directions, so the correlogram for a fixed direction of due north is identical to the correlogram for a fixed direction of due south.
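As a rough sketch of the idea, the correlation of the h-scatter pairs can be computed for several separation distances in one direction. The example below assumes a regular grid and uses the Pearson correlation of the paired values; the function names are hypothetical, and the data are the twelve Box 2-26 rows with Northings 25.0, 20.0, and 15.0.

```python
import math

# (northing, easting, concentration) for the Box 2-26 rows with Northing 25, 20, and 15.
samples = [
    (25.0, 0.0, 4.0), (25.0, 5.0, 11.6), (25.0, 10.0, 14.9), (25.0, 15.0, 17.4),
    (20.0, 0.0, 17.7), (20.0, 5.0, 12.4), (20.0, 10.0, 28.6), (20.0, 15.0, 7.7),
    (15.0, 0.0, 15.2), (15.0, 5.0, 35.5), (15.0, 10.0, 14.7), (15.0, 15.0, 16.5),
]

def pairs_at_lag(samples, d_north, d_east):
    """Value pairs for locations separated by the offset (d_north, d_east)."""
    by_location = {(n, e): c for n, e, c in samples}
    return [(c, by_location[(n + d_north, e + d_east)])
            for n, e, c in samples
            if (n + d_north, e + d_east) in by_location]

def pearson(pairs):
    """Pearson correlation between the first and second members of the pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlogram(samples, unit_north, unit_east, distances):
    """Correlation of the h-scatter pairs at each distance along one direction."""
    return {d: pearson(pairs_at_lag(samples, d * unit_north, d * unit_east))
            for d in distances}

distances = [5.0, 10.0, 15.0]
east = correlogram(samples, 0.0, 1.0, distances)
west = correlogram(samples, 0.0, -1.0, distances)
for d in distances:
    print(d, round(east[d], 3), round(west[d], 3))
```

The easterly and westerly results agree because reversing the direction only swaps the two members of each pair, which leaves the correlation unchanged. With so few pairs per distance, the values themselves are not meaningful estimates.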
Contour plots are used to reveal overall spatial trends in the data by interpolating data values between sample locations. Most contour procedures depend on the density of the grid covering the sampling area (higher density grids usually provide more information than lower density grids). A contour plot gives one of the best overall pictures of the important spatial features. However, contouring often requires that the actual fluctuations in the data values be smoothed, so many spatial features of the data may not be visible. The contour map should be used with other graphical representations of the data and requires expert judgment to interpret adequately.
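The interpolation step that precedes contouring can be illustrated with a simple gridding sketch. Inverse-distance weighting is used here only because it is short; it is an assumption of this example, not a method named in this guidance, and the function names, the power of 2, and the grid spacing of 2.5 are likewise illustrative choices. The data are the twelve Box 2-26 rows with Northings 25.0, 20.0, and 15.0.

```python
# (northing, easting, concentration) for the Box 2-26 rows with Northing 25, 20, and 15.
samples = [
    (25.0, 0.0, 4.0), (25.0, 5.0, 11.6), (25.0, 10.0, 14.9), (25.0, 15.0, 17.4),
    (20.0, 0.0, 17.7), (20.0, 5.0, 12.4), (20.0, 10.0, 28.6), (20.0, 15.0, 7.7),
    (15.0, 0.0, 15.2), (15.0, 5.0, 35.5), (15.0, 10.0, 14.7), (15.0, 15.0, 16.5),
]

def idw(north, east, samples, power=2.0):
    """Inverse-distance-weighted estimate at (north, east)."""
    num = den = 0.0
    for n, e, c in samples:
        d2 = (north - n) ** 2 + (east - e) ** 2
        if d2 == 0.0:
            return c  # exactly on a sample location: return the measured value
        weight = 1.0 / d2 ** (power / 2.0)
        num += weight * c
        den += weight
    return num / den

def gridded_surface(samples, step=2.5):
    """Interpolated values on a regular grid over the sampled area, north at the top."""
    norths = [n for n, _, _ in samples]
    easts = [e for _, e, _ in samples]
    rows = int(round((max(norths) - min(norths)) / step)) + 1
    cols = int(round((max(easts) - min(easts)) / step)) + 1
    return [[idw(max(norths) - i * step, min(easts) + j * step, samples)
             for j in range(cols)]
            for i in range(rows)]

surface = gridded_surface(samples)
for row in surface:
    print(" ".join(f"{v:5.1f}" for v in row))
```

A contouring routine would then draw lines of equal value through this grid. Every interpolated value is a weighted average of the measurements, which shows the smoothing described above: the surface can never exceed the highest measured value or fall below the lowest.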
2.4 Probability Distributions