Publisher: Taylor & Francis
International Journal of Remote Sensing
Publication details, including instructions for authors and subscription information: http://www.informaworld.com/smpp/title~content=t713722504
Gaps-fill of SLC-off Landsat ETM+ satellite image using a geostatistical approach
C. Zhang (a); W. Li (a); D. Travis (b)
(a) Department of Geography, Kent State University
(b) Department of Geography and Geology, University of Wisconsin-Whitewater
Online Publication Date: 01 January 2007
To cite this Article: Zhang, C., Li, W. and Travis, D. (2007) 'Gaps-fill of SLC-off Landsat ETM+ satellite image using a geostatistical approach', International Journal of Remote Sensing, 28:22, 5103 - 5122
To link to this article: DOI: 10.1080/01431160701250416 URL:http://dx.doi.org/10.1080/01431160701250416
Downloaded By: [Kent State University] At: 17:22 23 October 2007
Gaps-fill of SLC-off Landsat ETM+ satellite image using a geostatistical approach
C. ZHANG*†, W. LI† and D. TRAVIS‡
†Department of Geography, Kent State University, USA
‡Department of Geography and Geology, University of Wisconsin-Whitewater, USA
(Received 19 May 2006; in final form 17 December 2006)
Using appropriate techniques to fill the data gaps in SLC-off ETM+ imagery may enable more scientific use of the data. The local linear histogram-matching technique chosen by USGS has limitations if the scenes being combined exhibit high temporal variability and radical differences in target radiance due, for example, to the presence of clouds. This study proposes using an alternative interpolation method, the kriging geostatistical technique, for filling the data gaps. The case study shows that the ordinary kriging techniques may provide a powerful tool for interpolating the missing pixels in the SLC-off ETM+ imagery. While the standardized ordinary cokriging has been shown to be particularly useful when samples of the variable to be predicted are sparse and samples of a second, related variable are plentiful, the case study demonstrates that it provides little improvement in interpolating the data gap in the SLC-off imagery.
1. Introduction
The Scan Line Corrector (SLC), a small mirror in the optical path of the Enhanced Thematic Mapper Plus (ETM+) instrument, failed on May 31, 2003. The function of the SLC is to compensate for the forward motion of the satellite during data acquisition. As a result of the SLC failure, individual image scans overlap in some parts of images acquired thereafter, while leaving large physical gaps in others. Only portions of the images, near the centre, are left completely unaffected and valid. On average, about 22% of each scene is missing data. NASA has made efforts to correct the SLC malfunction. However, these have not been successful and the problem appears to be permanent. Without the operating SLC, the resulting gaps on images are as large as 14 pixels near the edges (figure 1). Further, the areas that have missing pixels are not identical across all multi-spectral bands. It appears that gaps shift positions slightly with spectral bands, resulting in valid data in some bands and no data in others. The non-identical missing pixels in different bands could present a problem in trying to identify and fill the data gaps (USGS, NASA and Landsat 7 Science Team 2003).
Although the aforementioned problems clearly reduce usability, the anomalous Landsat ETM+ data retain significant utility for some scientific applications, and some users still prefer these data over more costly alternatives (USGS, NASA and Landsat 7 Science Team 2003). Development of new tools or techniques to compensate for the nonavailability of the SLC may enable scientific use of the data. The USGS/NASA (United States Geological Survey/National Aeronautics and Space Administration) Landsat team has been trying to develop methods to fill the data gaps in the ETM+ imagery. The technique chosen by USGS Earth Resources Observation System (EROS) Data Center (EDC) is the local linear histogram-matching, which consists of a localized linear transform performed in a moving window throughout the missing pixel region. The histogram-based compositing algorithm was selected by USGS/NASA partly based on its simplicity and ease of rapid implementation. The merged images generated by the local linear histogram-matching technique potentially resolve most of the missing-data problems if the merged images and selected input scenes satisfy pre-determined criteria such as minimum cloud, snow cover or fires, low temporal variability and minimum date separation (Scaramuzza et al. 2004, USGS 2004). Under such conditions, for most locations these merged image products appear similar in quality to the previous single acquisition scenes. However, the local linear histogram algorithm can yield poor results if the scenes being combined exhibit radical differences in target radiance due, for example, to the presence of clouds, snow, or sun glint (USGS 2004). Although the algorithm performs well in homogeneous regions such as agricultural fields in the American Midwest and West where there is little temporal change, it has difficulty when the size of the regions exhibiting the temporal change becomes small, such as in Europe and Asia. If the images chosen to generate the composite product contain transient data such as clouds, snow cover, or fires, the histogram-matching technique will not work in these areas (Commonwealth of Australia 2006). Thus, alternative methods are needed under those situations to fill the data gaps in the ETM+ imagery.
*Corresponding author.
International Journal of Remote Sensing, Vol. 28, No. 22, 20 November 2007, 5103–5122. ISSN 0143-1161 print/ISSN 1366-5901 online © 2007 Taylor & Francis. http://www.tandf.co.uk/journals
Figure 1. The resulting gaps in an ETM+ image occurring without the SLC function enabled. Available in colour online.
Interpolation can be one of the alternative methods to fill the data gaps in the ETM+ imagery. There are many interpolation methods such as inverse distance weighted methods, splines and triangulation methods. The USGS EROS Data Center currently utilizes the Akima (1978) method when interpolation is applied to the SLC-off imagery. Limitations of this method are that higher order derivatives are not continuous across the sides of the triangles, and that if the property changes drastically over a short distance, there tend to be some oscillations close to the vertices (Cooke et al. 1993). Most importantly, this method does not take full advantage of the spatial information in the image. However, spatial independence rarely occurs in image scenes and there is spatial dependence of variability in the image which is reflected in the variability of digital number (DN) values (Jupp et al. 1988, 1989). Adjacent pixels tend to be spatially autocorrelated and it is expected that two adjacent pixels will generally be more similar in their DN values than would two pixels separated by a greater distance (Lark 1996). Because spatial structure occurs in remotely sensed images and DN does not act uniformly across the landscape, understanding the magnitude and pattern in spatial variability is necessary for accurately interpolating the missing pixels in the ETM+ imagery. Geostatistical techniques, which were founded and developed by Georges Matheron in France in the 1960s, are designed for spatial data and take full advantage of the spatial correlation information. These techniques can be used to explore and describe spatial patterns and variations in remotely sensed data. In this paper, an alternative interpolation method, ordinary kriging established by Matheron (1962, 1965, 1971), is described and used for filling the data gaps in the ETM+ imagery, which would enable the users to make use of anomalous ETM+ data by considering spatial structure in the image.
Specifically, the geostatistical measure, variogram, which relates variance to spatial separation and provides a concise description of the pattern of spatial variability, was used to measure and model the spatial correlation of the digital numbers in the imagery. The variogram models of spatial correlation were then used along with ordinary kriging or cokriging techniques to interpolate the missing pixels in the ETM+ imagery.
While many geostatistical applications in remote sensing have been documented in the literature (Curran 2001, Curran and Atkinson 1998), to the best of the authors' knowledge, kriging methods have been applied as interpolators only to image-derived or classified data (e.g. de Bruin 2000, Herzfeld 1999, Lark 1996, Rossi et al. 1994), and studies applying them directly to the radiometric digital numbers of imagery to interpolate values of missing pixels are rare and limited (e.g. Addink and Stein 1999). Based on the fundamental idea that radiation is spatially correlated and can be estimated locally, we demonstrate the usefulness and accuracy of kriging, which provides unbiased estimates with minimum and known error (Matheron 1971), in interpolating DN values for filling the data gaps in the SLC-off Landsat ETM+ imagery.
2. Method
Geostatistics comprise a set of spatial statistical techniques for evaluating the autocorrelation observed in spatial data and estimating the local values of properties that vary in space from sample data (Matheron 1963, Isaaks and Srivastava 1989). The fundamental idea of geostatistics is that spatial data values from locations close to each other are more similar than data values from locations far apart, which is also the first law of geography. Autocorrelation structure is depicted by the variogram, which allows attribute values to be estimated at unsampled locations. Kriging is a family of generalized least-square regression algorithms that account for the spatial dependence represented by the variogram. Kriging ensures optimal and minimum error-variance estimation by weighting more heavily the observations that are close in space (Matheron 1963, Journel and Huijbregts 1978), as determined from the mathematical model of the variogram. The following sections briefly introduce knowledge about the variogram used for modelling spatial dependence and ordinary kriging and cokriging employed for estimation of gap locations.
2.1 Modelling spatial dependence
The digital number (DN) $Z$ of pixel $x_i$ is a regionalized variable, because the position of pixel $x_i$ in space is known. Thus, variograms can be used to characterize the spatial structure and measure the spatial variation in remotely-sensed images. Variograms model the spatial dependence in a regionalized variable $Z$ under the 'intrinsic' hypothesis that the increments $Z(x_i+h) - Z(x_i)$ associated with a small distance $h$ are weakly stationary (Matheron 1971). A semivariogram is computed using the DN values in each band of the images as follows:

$$\gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ Z(x_i) - Z(x_i+h) \right]^2 \tag{1}$$

where $N(h)$ is the number of pairs of data locations a vector $h$ apart. The semivariogram is a useful measure of dissimilarity between spatially separate pixels (Jupp et al. 1988). The larger the value of $\gamma$, the less similar the pixels are.
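Equation (1) can be illustrated with a minimal sketch. It assumes a single band stored as a list of rows, gap pixels marked as `None`, and lags taken along image rows only; the function name `row_semivariogram` and the toy values are hypothetical and are not taken from the case study.

```python
def row_semivariogram(image, lag):
    """Experimental semivariance of equation (1) for a horizontal lag in pixels.

    `image` is a list of rows; gap pixels are None and are skipped, so only
    pairs with two valid DN values contribute to N(h).
    """
    total, pairs = 0.0, 0
    for row in image:
        for col in range(len(row) - lag):
            a, b = row[col], row[col + lag]
            if a is None or b is None:
                continue  # a pair needs two valid DN values
            total += (a - b) ** 2
            pairs += 1
    if pairs == 0:
        raise ValueError("no valid pixel pairs at this lag")
    return total / (2 * pairs)


# Toy band with two gap pixels (illustrative values only).
toy = [
    [10, 12, None, 15],
    [11, 13, 14, None],
]
gamma_1 = row_semivariogram(toy, 1)  # three valid pairs
gamma_2 = row_semivariogram(toy, 2)  # two valid pairs
```

In this toy band the semivariance grows with the lag, which is the behaviour expected when neighbouring pixels are more alike than distant ones.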
To describe the sample semivariogram for use in geostatistical techniques such as kriging, a mathematical model has to be fit to the sample variogram. The model fitted is defined by its type and the model coefficients, which may include the nugget variance, structured variance, sill, range and gradient. The five most frequently used basic models are nugget effect model, spherical model, exponential model, Gaussian model and power model (Goovaerts 1997). In fact, the semivariogram of DN values represents all the components of image texture (Lark 1996). The magnitude of spatial variability is represented by the sill and the spatial dependence of this variability, or homogeneity, is described by the range. Any periodicity is shown by a ‘hole effect’ in the semivariogram. Moreover, anisotropy is seen in differences in semivariance for the same large distance but different directions.
The joint spatial dependence of two codependent variables is often modelled using a cross-semivariogram (in our case this refers to the two codependent images). A cross-semivariogram is computed using the DN values in each band of the images as follows:

$$\gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ Z_1(x_i) - Z_1(x_i+h) \right] \cdot \left[ Z_2(x_i) - Z_2(x_i+h) \right] \tag{2}$$

where $Z_1$ represents DN values in image 1 and $Z_2$ represents DN values in image 2.
Models for cross-semivariograms are fit to the data using nonlinear least-squares methods. The most frequently used basic types of mathematic models for cross-semivariograms are linear, spherical and exponential models (Deutsch and Journel 1998). Unlike semivariogram models, the cross-semivariogram may have negative values because of a negative cross-correlation between the two images.
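The possibility of negative values can be seen in a small sketch of equation (2). It assumes two co-registered bands stored as lists of rows with `None` for gap pixels and lags along rows only; `row_cross_semivariogram` and the sample values are hypothetical.

```python
def row_cross_semivariogram(image1, image2, lag):
    """Experimental cross-semivariance of equation (2) for a horizontal lag."""
    total, pairs = 0.0, 0
    for row1, row2 in zip(image1, image2):
        for col in range(len(row1) - lag):
            values = (row1[col], row1[col + lag], row2[col], row2[col + lag])
            if any(v is None for v in values):
                continue  # all four DN values must be valid
            total += (values[0] - values[1]) * (values[2] - values[3])
            pairs += 1
    if pairs == 0:
        raise ValueError("no valid pixel pairs at this lag")
    return total / (2 * pairs)


# Two toy bands that vary in opposite directions along the row.
band_a = [[1, 2, 3, 4]]
band_b = [[8, 6, 4, 2]]
cross = row_cross_semivariogram(band_a, band_b, 1)
```

Because the increments of the two toy bands always have opposite signs, every product in the sum is negative and so is the cross-semivariance.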
2.2 Ordinary kriging and standardized ordinary cokriging
Ordinary kriging refers to the kriging of a random function with stationary increments and it is the most common and robust form of kriging. It accounts for local fluctuations of the mean by limiting the domain of stationarity of the mean to the local neighbourhood (Matheron 1962, 1965, 1971; Goovaerts 1997). Consider the problem of estimating the value of a piecewise continuous DN value $z$ in an ETM+ image at any missing pixel location $x$ using only the $n$ data $\{z(x_i), i = 1, \ldots, n\}$ available over the image. The ordinary kriging estimator $Z^*_{OK}(x)$ is written as a linear combination of the $n(x)$ random variables $Z(x_i)$:

$$Z^*_{OK}(x) = \sum_{i=1}^{n(x)} \lambda_i^{(OK)}(x)\, Z(x_i) \quad \text{with} \quad \sum_{i=1}^{n(x)} \lambda_i^{(OK)}(x) = 1 \tag{3}$$

where $\lambda_i^{(OK)}$ is the ordinary kriging weight. The ordinary kriging estimator is made unbiased by forcing the kriging weights to sum to 1.
Matheron's equation of ordinary kriging is expressed in terms of semivariograms as:

$$\begin{cases} \displaystyle\sum_{j=1}^{n(x)} \lambda_j^{(OK)}(x)\, \gamma(x_i - x_j) - \mu_{OK}(x) = \gamma(x_i - x), & i = 1, \ldots, n(x) \\[2ex] \displaystyle\sum_{j=1}^{n(x)} \lambda_j^{(OK)}(x) = 1 \end{cases} \tag{4}$$

Ordinary kriging requires neither knowledge nor stationarity of the mean over the entire images. It is a nonstationary algorithm and most often applied within moving search neighbourhoods (Deutsch and Journel 1998). The robustness of the ordinary kriging algorithm is due to its ability to rescale locally the random function model $Z(x)$ to a different mean value in different locations.
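The ordinary kriging system in equation (4) is a small linear system, which the following sketch solves with NumPy. It is an illustration under stated assumptions rather than the authors' implementation: the exponential model uses the practical-range convention (factor 3), and the function names, sample coordinates and model parameters are hypothetical.

```python
import numpy as np


def exponential_variogram(h, nugget, structured, practical_range):
    """Exponential model; the factor 3 assumes a practical-range convention."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + structured * (1.0 - np.exp(-3.0 * h / practical_range))
    return np.where(h > 0, gamma, 0.0)  # gamma(0) = 0 by definition


def ordinary_kriging_weights(sample_xy, target_xy, variogram):
    """Solve equation (4) for the weights lambda_j and the Lagrange parameter mu."""
    samples = np.asarray(sample_xy, dtype=float)
    target = np.asarray(target_xy, dtype=float)
    n = len(samples)
    # Distances between samples, and between each sample and the target.
    d = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=2)
    d0 = np.linalg.norm(samples - target, axis=1)
    # Rows 0..n-1: sum_j lambda_j * gamma(x_i - x_j) - mu = gamma(x_i - x)
    # Row n:       sum_j lambda_j = 1
    a = np.zeros((n + 1, n + 1))
    a[:n, :n] = variogram(d)
    a[:n, n] = -1.0
    a[n, :n] = 1.0
    b = np.append(variogram(d0), 1.0)
    solution = np.linalg.solve(a, b)
    return solution[:n], solution[n]


# Four samples placed symmetrically around the missing pixel at the origin.
model = lambda h: exponential_variogram(h, nugget=0.0, structured=1.0, practical_range=10.0)
weights, mu = ordinary_kriging_weights([(1, 0), (-1, 0), (0, 1), (0, -1)], (0, 0), model)
```

With this symmetric layout each sample receives the same weight, so the unbiasedness constraint alone fixes every weight at one quarter.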
Cokriging is the extension of kriging to more than one variable (Matheron 1973, 1979). Using the cokriging approach, secondary information can be incorporated to increase the accuracy of estimates. The estimation particularly improves when the primary data are sparse and poorly correlated in space compared to the secondary data. Cokriging takes into consideration both the autocorrelation in each variable via the variograms and the correlation between the variables via cross-variograms. Because of the extremely demanding computation of the covariance functions, cokriging is seldom used in practice (Deutsch and Journel 1998). The traditional ordinary cokriging estimator of DN value $z_1$ in an ETM+ image at any missing pixel location $x$ is

$$Z^{(1)*}_{COK}(x) = \sum_{i=1}^{n_1(x)} \lambda_i^{OCK}(x)\, Z_1(x_i) + \sum_{j=1}^{n_2(x)} \lambda_j^{OCK}(x)\, Z_2(x_j) \tag{5}$$

with two nonbias conditions: $\sum_{i=1}^{n_1(x)} \lambda_i^{OCK}(x) = 1$ and $\sum_{j=1}^{n_2(x)} \lambda_j^{OCK}(x) = 0$.

With the traditional ordinary cokriging, however, there may be a large number of negative cokriging estimates because of the nonbias condition $\sum_{j=1}^{n_2(x)} \lambda_j^{OCK}(x) = 0$ (Isaaks and Srivastava 1989, pp. 400–416). To reduce the occurrence of negative weights and avoid artificially limiting the impact of secondary data, a standardized ordinary cokriging (Isaaks and Srivastava 1989, p. 416) is used in this study. The standardized ordinary cokriging estimator of DN value $z_1$ in an ETM+ image at any missing pixel location $x$ is

$$Z^{(1)*}_{COK}(x) = \sum_{i=1}^{n_1(x)} \lambda_i^{OCK}(x)\, Z_1(x_i) + \sum_{j=1}^{n_2(x)} \lambda_j^{OCK}(x) \left[ Z_2(x_j) + m_1 - m_2 \right] \tag{6}$$

with the single condition that all weights must sum to one: $\sum_{i=1}^{n_1(x)} \lambda_i^{OCK}(x) + \sum_{j=1}^{n_2(x)} \lambda_j^{OCK}(x) = 1$, where $m_1 = E\{Z_1(x)\}$ and $m_2 = E\{Z_2(x)\}$ are the stationary means of $Z_1$ and $Z_2$. Though the standardized ordinary cokriging requires a prior estimation of the global means of the two variables, it can substantially improve the estimation in terms of the bias and the spread of the errors and the lower incidence of negative estimates (Isaaks and Srivastava 1989, p. 416).
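A short check, offered as a sketch, shows why the single constraint is enough for unbiasedness. It assumes the standardized estimator shifts each secondary datum by $m_1 - m_2$ inside the second sum, and that $E\{Z_1(x_i)\} = m_1$ and $E\{Z_2(x_j)\} = m_2$ at every location:

$$\begin{aligned} E\{Z^{(1)*}_{COK}(x)\} &= \sum_{i=1}^{n_1(x)} \lambda_i^{OCK}(x)\, m_1 + \sum_{j=1}^{n_2(x)} \lambda_j^{OCK}(x) \left[ m_2 + m_1 - m_2 \right] \\ &= m_1 \left[ \sum_{i=1}^{n_1(x)} \lambda_i^{OCK}(x) + \sum_{j=1}^{n_2(x)} \lambda_j^{OCK}(x) \right] = m_1 . \end{aligned}$$

Rescaling the secondary data to the primary mean is what removes the need to force the secondary weights to sum to zero.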
The cokriging weights are obtained by solving the following cokriging system:

$$\begin{cases} \displaystyle\sum_{j_1=1}^{n_1(x)} \lambda_{j_1}(x)\, C_{11}(x_{i_1} - x_{j_1}) + \sum_{j_2=1}^{n_2(x)} \lambda_{j_2}(x)\, C_{12}(x_{i_1} - x_{j_2}) + \mu_1(x) = C_{11}(x_{i_1} - x), & i_1 = 1, \ldots, n_1(x) \\[2ex] \displaystyle\sum_{j_1=1}^{n_1(x)} \lambda_{j_1}(x)\, C_{21}(x_{i_2} - x_{j_1}) + \sum_{j_2=1}^{n_2(x)} \lambda_{j_2}(x)\, C_{22}(x_{i_2} - x_{j_2}) + \mu_2(x) = C_{21}(x_{i_2} - x), & i_2 = 1, \ldots, n_2(x) \\[2ex] \displaystyle\sum_{j_1=1}^{n_1(x)} \lambda_{j_1}(x) + \sum_{j_2=1}^{n_2(x)} \lambda_{j_2}(x) = 1 \end{cases} \tag{7}$$

where $C_{11}$ is the auto-covariance of the first variable; $C_{22}$ is the auto-covariance of the second variable; $C_{12}$ and $C_{21}$ are the cross-covariances between the two variables; $\lambda_{j_1}$ and $\lambda_{j_2}$ are the cokriging weights; and $\mu_1(x)$ and $\mu_2(x)$ are the Lagrange parameters.
2.3 A simplified algorithm for estimating the gap data in the SLC-off Landsat ETM+ imagery
Matheron's equation (4) of the ordinary kriging system and the ordinary cokriging system (equation (7)) (Matheron 1971, 1979; Journel and Huijbregts 1978; Isaaks and Srivastava 1989) provide general solutions when the known values of function $Z$ are arbitrarily located. The periodic stripes of the data gaps in SLC-off ETM+ imagery and the regular grid structure of the data allow the solution of equations (4) and (7) to be simplified for this study. This is because the known values of $Z$ are no longer arbitrarily located, and the neighbourhood samplings around the pixels to be estimated fall into a limited set of patterns. Once these patterns are identified, the weights of the neighbourhood samplings with the same pattern become constant. Thus it is not necessary to recalculate the neighbourhood sampling weights with equation (4) or (7) each time the value of a missing pixel is estimated. Equation (4) or (7) needs to be solved only once per pattern to calculate the weights of its neighbourhood samplings. Once the weights of the neighbourhood samplings in a pattern have been obtained, they can be applied to other pixels with the same pattern of neighbourhood sampling.
The ordinary kriging estimator $Z^*_{OK}(x)$ can then be simplified and written as a linear combination of several weighted $Z(x_i)$:

$$Z^*_{OK}(x) = \sum_{i=1}^{n} \lambda_i\, Z(x_i) \quad \text{with} \quad \sum_{i=1}^{n} \lambda_i = 1 \tag{8}$$

where $n$ is the number of neighbourhood samples, $\lambda_i$ represents the constant weight for each neighbourhood sampling pixel in a pattern, and $Z(x_i)$ is the neighbourhood sampling value. Note: in this simplified formula the sampling weight is decided only by its relative position in a neighbourhood sampling pattern and is independent of the location of the pattern. Thus ordinary kriging reduces to a small set of weighted averages, which is simple and efficient.
The standardized ordinary cokriging estimator $Z^{(1)*}_{COK}(x)$ can then be simplified as a linear combination of several weighted $Z_1(x_i)$ and $Z_2(x_j)$:

$$Z^{(1)*}_{COK}(x) = \sum_{i=1}^{n_1} \lambda_i^{OCK}\, Z_1(x_i) + \sum_{j=1}^{n_2} \lambda_j^{OCK} \left[ Z_2(x_j) + m_1 - m_2 \right] \tag{9}$$

with the single condition that all weights must sum to one: $\sum_{i=1}^{n_1} \lambda_i^{OCK} + \sum_{j=1}^{n_2} \lambda_j^{OCK} = 1$, where $m_1 = E\{Z_1(x)\}$ and $m_2 = E\{Z_2(x)\}$ are the stationary means of $Z_1$ and $Z_2$, and $\lambda_i^{OCK}$ and $\lambda_j^{OCK}$ are the constant weights for each neighbourhood sampling pixel in a pattern.
Based on the above principle, the simplified algorithm for interpolating the missing pixels in the ETM+ imagery consists of the following steps:
Step 1: set up the size of the neighbourhood sampling.
Step 2: identify the patterns of the neighbourhood sampling in the SLC-off ETM+ imagery.
Step 3: calculate the constant weights $\lambda_i$ for each neighbourhood sampling pixel in a pattern using equation (4) of the ordinary kriging system, or calculate the constant weights $\lambda_i^{OCK}$ and $\lambda_j^{OCK}$ for each neighbourhood sampling pixel in a pattern using equation (7) of the standardized ordinary cokriging system.
Step 4: estimate each missing pixel with the same neighbourhood sampling pattern using equation (8) of the simplified ordinary kriging system, or estimate each missing pixel with the same neighbourhood sampling pattern using equation (9) of the simplified standardized ordinary cokriging to incorporate the secondary information.
Step 5: repeat steps 3 and 4 for other patterns until all unknown locations are calculated.
Because of the use of the simplified ordinary kriging system in equation (8) or the simplified standardized ordinary cokriging system in equation (9), the algorithm works faster than the traditional kriging system algorithm.
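The weight-reuse idea behind the steps above can be sketched as follows. This is a simplified illustration, not the authors' code: the pattern is taken to be the tuple of relative offsets of the k nearest valid pixels, the second selection constraint used in the case study (samples from different sides) is omitted, and the weight solver is passed in as a function. The demo uses a hypothetical equal-weight stand-in where a real run would solve equation (4) for each new pattern.

```python
def nearest_valid_offsets(image, row, col, k, radius):
    """Relative offsets (dr, dc) of the k nearest valid pixels around a gap pixel."""
    candidates = []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            r, c = row + dr, col + dc
            inside = 0 <= r < len(image) and 0 <= c < len(image[0])
            if inside and image[r][c] is not None:
                candidates.append((dr * dr + dc * dc, dr, dc))
    candidates.sort()  # by squared Euclidean distance, ties broken by offset
    return tuple((dr, dc) for _, dr, dc in candidates[:k])


def fill_gaps(image, k, radius, solve_weights):
    """Fill None pixels, solving for weights only once per neighbourhood pattern."""
    weight_cache = {}
    filled = [list(r) for r in image]
    for row in range(len(image)):
        for col in range(len(image[0])):
            if image[row][col] is not None:
                continue
            pattern = nearest_valid_offsets(image, row, col, k, radius)
            if not pattern:
                continue  # no valid neighbours in the window; leave unfilled
            if pattern not in weight_cache:
                weight_cache[pattern] = solve_weights(pattern)  # step 3
            weights = weight_cache[pattern]
            filled[row][col] = sum(  # step 4, equation (8)
                w * image[row + dr][col + dc] for w, (dr, dc) in zip(weights, pattern)
            )
    return filled, weight_cache


calls = []

def equal_weights(pattern):
    """Hypothetical stand-in for a kriging solve: equal weights summing to one."""
    calls.append(pattern)
    return [1.0 / len(pattern)] * len(pattern)


toy = [
    [1, 2, 3, 4, 5, 6],
    [None] * 6,  # one-pixel-wide gap stripe
    [3, 4, 5, 6, 7, 8],
]
filled, cache = fill_gaps(toy, k=2, radius=1, solve_weights=equal_weights)
```

In the toy stripe every gap pixel shares the same two-neighbour pattern, so the solver runs once although six pixels are filled.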
3. Case study
3.1 Background
The Lunan Stone Forest is the world's premier pinnacle karst landscape (Zhang et al. 2005). Located among the plateau karstlands of Yunnan Province, in southwest China, it has an unusual climate that causes cloud cover to occur virtually every day. Thus, while the study of this unique landscape has significant value in karst sciences, it is difficult to find a cloud-free image to allow such study using satellite imagery. As a means to test the application of the geostatistical techniques to the gaps-fill of SLC-off Landsat ETM+ satellite images, an SLC-off ETM+ scene for April 6, 2005 was purchased from the USGS/EROS Data Center. For illustration purposes, a subset of the scene instead of the whole scene was used for the case study. The subset of the scene is located close to the edge of the scene and contains 196 rows and 196 columns. Figure 5(A) shows the subset original image (multispectral display of Bands 2, 3, 4). The technique used by USGS EROS Data Center for the initial gaps-filled SLC-off product (phase 1 algorithm of the local linear histogram-matching technique) was first attempted to replace all missing image pixels with values derived from a coregistered, histogram-matched SLC-on scene. Obvious artifacts appeared in the gaps-filled image because the only SLC-on scenes we could find for this location were either not cloud free or cloud free but for a different season (the vegetation cover in the area changes greatly between seasons). A phase 2 algorithm was also applied to combine clear SLC-off scenes for a histogram-based compositing. The clear SLC-off scene used was acquired on March 05, 2005, which is approximately one month earlier than the previous one. Figure 5(C) shows the gaps-filled image using the phase 2 algorithm. The result is better than that of the phase 1 algorithm because of the clear image used and improved temporal match. However, visual artifacts still exist because the number of clear SLC-off scenes available to fill the gaps is insufficient.
Some zero value cells remained unfilled because of the non-identical missing pixels in different bands and the lack of available data (black colour cells in figure 5(C)). While the local linear histogram-matching technique used by USGS EROS Data Center can successfully fill the data gaps for regions where radiometrically matching scenes are available, it turned out not to be suitable for filling the data gaps in the Lunan Stone Forest area because of its particular climate and the fast temporal change of the landscape in this region. The purpose of this case study is to propose an alternative approach, geostatistical methods, for filling the data gaps of the SLC-off ETM+ image for this region and other regions with similar conditions. Variograms are used to characterize the spatial structure and measure spatial dependence in the ETM+ image; and ordinary kriging and standardized ordinary cokriging are used to interpolate DN values for filling the data gaps in the ETM+ imagery. The premise of the case study is that the 'gaps' in the SLC anomaly data sets can be effectively estimated and filled using the digital number predictions derived from the digital number values of the same data set with the gaps.
Because semivariograms are sensitive to strong skewness of the input data, kriging may yield negative predicted values or values exceeding the range of the input data if the input data are strongly skewed. Kriging works best if the input data have a normal distribution. Thus, for this case study, the normality of the data was first tested. The histograms of two band examples, as shown in figure 2, indicate that the input data are not normally distributed. As a result, a normal score transformation procedure was performed. Normal scores of the DN values were used in the variogram model and ordinary kriging system to calculate estimates of missing pixels in the ETM+ imagery, and then the estimated values were back transformed. The normal score transformation and back normal transformation were performed with the GSLIB routines nscore and backtr (Deutsch and Journel 1998) in this case study.
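The idea of a normal score transform and its back transform can be sketched with the Python standard library. This is not the GSLIB `nscore`/`backtr` code: the plotting position `(rank + 0.5) / n`, the lack of tie handling, the clamped tails and the function names are all simplifying assumptions made for illustration.

```python
from statistics import NormalDist


def normal_scores(values):
    """Rank-based normal scores plus a sorted (score, value) back-transform table."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, index in enumerate(order):
        p = (rank + 0.5) / n  # cumulative probability strictly inside (0, 1)
        scores[index] = NormalDist().inv_cdf(p)
    table = sorted(zip(scores, values))
    return scores, table


def back_transform(score, table):
    """Piecewise-linear interpolation in the table, clamped at both ends."""
    if score <= table[0][0]:
        return table[0][1]
    if score >= table[-1][0]:
        return table[-1][1]
    for (s0, v0), (s1, v1) in zip(table, table[1:]):
        if s0 <= score <= s1:
            return v0 + (v1 - v0) * (score - s0) / (s1 - s0)


# Skewed toy DN sample (illustrative values only).
dn = [52, 60, 55, 90, 58]
scores, table = normal_scores(dn)
restored = [back_transform(s, table) for s in scores]
```

Kriging would operate on `scores`; an estimate in normal-score space is then mapped back to a DN value with `back_transform`, which by construction stays within the range of the observed data.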
3.2 Modelling spatial dependence in the SLC-off Landsat ETM+ imagery using variogram
The DN values of the Landsat ETM+ imagery contain spatial trends due to large-scale physical processes. The variogram model can describe spatial continuity and characterize spatial correlation between samples. It can act as a quantified summary of all the available structural information. Figure 3 shows the experimental variograms of Band 1 and Band 6 calculated from the normal scores of the DN values of the ETM+ image. Because the number of samples in the image used to compute the variogram is large, the experimental variograms have smooth, gradual or rounded shapes and are gently sloping near the sill. This makes it easy to fit the experimental variograms with standard mathematical models such as linear, spherical and exponential models for ordinary kriging interpolation. Model selection is usually based on a criterion of goodness of fit, which, in our case, involves fitting the model to data using nonlinear least-squares methods. Exponential mathematical models were chosen to fit the experimental variograms (see figure 3). Table 1 shows the parameters of the fitted mathematical variogram models for ordinary kriging. Ideally the variogram nugget, a nonzero variance at the origin that represents unexplained or random variance (Deutsch and Journel 1998), should be zero for remotely sensed imagery. However, the nugget is nonzero for Bands 1, 5 and 6. The main reasons for the different nugget effects in different bands are the following: a nugget effect $\sigma$ may have two distinct causes, namely the impact of all structures of a smaller scale, such as $\sigma'$, plus the variance of the errors, such as $\sigma''$, so that $\sigma = \sigma' + \sigma''$. The second term is perhaps the same for all bands, but a priori the first is not. Moreover, if the area 'a' of the rectangular pixel changes (e.g. another satellite) and becomes 'b', then we have $\sigma' = k/a$ for the first satellite and $\sigma' = k/b$ for the second one, with the same $k$. The variance of the errors $\sigma''$ does not follow this law of inverse proportionality with the area, so the two terms will appear differently. Also, if we assume that the images are piecewise continuous at the scale under study, which is realistic, then the point variogram is linear near the origin, i.e. $\gamma(h) = a\,|h|$, and the smoothing by the pixel size transforms the point variogram into $\gamma(h) = a\,h^{2}\log|h|$ (near the origin). In the present case, this smoothing effect seems to partly compensate for the jump of the nugget effect, hence resulting in linearity appearing on the variogram plots.
Figure 2. Histogram of DN values of the SLC-off Landsat ETM+ satellite image of Band 1 and Band 6 (Note: DN values of missing pixels have been removed).
We also compared the experimental residual variogram and the experimental variogram. The comparison results showed that they are almost identical, and therefore the assumption of stationarity is justified and ordinary kriging may be applied to the gap data interpolation of the SLC-off ETM+ image.
3.3 Estimation of gap locations in the SLC-off Landsat ETM+ imagery using ordinary kriging
In the case study eight neighbourhood samples were used to estimate each missing pixel in the ordinary kriging. Two constraints were employed to choose the eight neighbourhood samples: the first is to select the closest samples (measured by Euclidean distance), and the second is to draw samples from different sides of the point being estimated when candidates are equally close. We applied the simplified algorithm to the case study. There are a few patterns of the neighbourhood sampling which can be identified and used to estimate the missing pixel DN value in the case study. Figure 4 shows two such pattern examples. Note: because there are different gap ranges in different locations in the ETM+ image, the neighbourhood sampling patterns will change according to the size of the chosen neighbourhood sampling. For example, the geometry and total number of available patterns of eight chosen neighbourhood samples will be different from those of 15 selected neighbourhood samples.
To calculate the constant weights $\lambda_i$ for each pixel in a pattern, Matheron's equation (4) of the ordinary kriging system was used in the case study:
Table 1. Parameters of normal score variogram models for ordinary kriging.
Band     Type          Nugget   Sill   Structured variance   Range
Band 1   Exponential   0.03     0.66   0.63                  440
Band 2   Exponential   0        0.63   0.63                  420
Band 3   Exponential   0        0.68   0.68                  500
Band 4   Exponential   0        0.72   0.72                  350
Band 5   Exponential   0.04     0.75   0.71                  450
Band 6   Exponential   0.03     0.79   0.76                  440
Downloaded By: [Kent State University] At: 17:22 23 October 2007 l~ l1 l2 l3 l4 l5 l6 l7 l8 u 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 ~ C11 C12 C13 C14 C15 C16 C17 C18 1 C21 C22 C23 C24 C25 C26 C27 C28 1 C31 C32 C33 C34 C35 C36 C37 C38 1 C41 C42 C43 C44 C45 C46 C47 C48 1 C51 C52 C53 C54 C55 C56 C57 C58 1 C61 C62 C63 C64 C65 C66 C67 C68 1 C71 C72 C73 C74 C75 C76 C77 C78 1 C81 C82 C83 C84 C85 C86 C87 C88 1 1 1 1 1 1 1 1 1 1 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 {1 C10 C20 C30 C40 C50 C60 C70 C80 1 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 ð10Þ
where $C_{11}, C_{12}, \ldots, C_{88}$ are the covariances between paired samples, $C_{10}, \ldots, C_{80}$ are the covariances between the pixel to be estimated and the eight samples, and $\mu$ is the Lagrange parameter. The covariances can be computed from the fitted exponential variogram model. The relationship between the covariance function and the variogram model is described by the following formula:
$$
C(h) = c - \gamma(h) \tag{11}
$$
where $c$ is the sill.
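As an illustration of equations (10) and (11), the sketch below builds and solves an ordinary kriging system for eight samples with NumPy. It is a minimal example under stated assumptions, not the code used in the study: the exponential model is written with the practical-range convention $\gamma(h) = c_0 + c_1(1 - e^{-3h/a})$, the range in Table 1 is taken to be in metres, the pixel size is taken as 30 m, and the sample offsets form a hypothetical pattern. The lower-right entry of the system matrix is zero, as in the standard form of the ordinary kriging system.

```python
import numpy as np

def exp_variogram(h, nugget, structured, a):
    """Exponential variogram with practical range a (assumed convention)."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + structured * (1.0 - np.exp(-3.0 * h / a))
    return np.where(h > 0.0, gamma, 0.0)  # gamma(0) = 0 by definition

def covariance(h, nugget, structured, a):
    """Equation (11): C(h) = c - gamma(h), with sill c = nugget + structured."""
    return (nugget + structured) - exp_variogram(h, nugget, structured, a)

def ok_weights(samples, target, nugget, structured, a):
    """Solve the ordinary kriging system of equation (10).

    Returns the kriging weights and the Lagrange parameter.
    """
    samples = np.asarray(samples, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(samples)
    d_ss = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=-1)
    d_s0 = np.linalg.norm(samples - target, axis=-1)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = covariance(d_ss, nugget, structured, a)
    lhs[n, n] = 0.0  # unbiasedness constraint row and column
    rhs = np.ones(n + 1)
    rhs[:n] = covariance(d_s0, nugget, structured, a)
    solution = np.linalg.solve(lhs, rhs)
    return solution[:n], solution[n]

PIXEL = 30.0  # assumed pixel size in metres
# Hypothetical pattern: eight samples on the two rows bordering a gap.
offsets = np.array([(-2, 0), (2, 0), (-2, -1), (2, 1),
                    (-2, 1), (2, -1), (-2, -2), (2, 2)]) * PIXEL
# Band 1 parameters from Table 1.
weights, mu = ok_weights(offsets, (0.0, 0.0), nugget=0.03, structured=0.63, a=440.0)
```

Because the weights depend only on the geometry of the pattern and the variogram model, they can be computed once per pattern and reused for every missing pixel that shares that pattern.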
Since eight neighbouring samples were chosen in the case study, the largest kriging distance involved on the variogram is greater than seven pixels for most patterns. The experimental variograms show that the variograms are practically linear up to four pixels for all bands of the ETM+ image. Because the kriging distances involved exceed this linear range, the exponential models were used to calculate the covariances. Note that if fewer neighbouring samples were chosen, for example four, the variograms would be linear up to the largest distance involved. This would simplify Matheron's equation (4), because the kriging weights then become independent of the slope of the linear variogram and the nugget effect. The standardized linear variogram model can be described by the following simplified formula:
$$
\gamma(h) = h \tag{12}
$$
Figure 4. Two pattern examples with eight chosen neighbourhood samples. Available in colour online.
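The slope independence noted for the linear variogram can be checked numerically. The sketch below is an illustration only: it solves the ordinary kriging system in its variogram form for a hypothetical four-sample pattern, with the nugget set to zero, and compares the weights obtained for two different slopes. The function name and the sample offsets are assumptions made for this example.

```python
import numpy as np

def ok_weights_linear(samples, target, slope):
    """Ordinary kriging weights for the linear variogram gamma(h) = slope * h."""
    samples = np.asarray(samples, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(samples)
    d_ss = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=-1)
    d_s0 = np.linalg.norm(samples - target, axis=-1)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = slope * d_ss  # variogram values between samples
    lhs[n, n] = 0.0             # unbiasedness constraint
    rhs = np.ones(n + 1)
    rhs[:n] = slope * d_s0      # variogram values to the target pixel
    return np.linalg.solve(lhs, rhs)[:n]

# Hypothetical four-sample pattern, in pixel units.
samples = [(-2, 0), (2, 0), (-2, 1), (2, -1)]
w_unit = ok_weights_linear(samples, (0, 0), slope=1.0)
w_steep = ok_weights_linear(samples, (0, 0), slope=5.0)
```

Multiplying the variogram by a constant rescales only the Lagrange parameter, so `w_unit` and `w_steep` agree to numerical precision, which is why the standardized form $\gamma(h) = h$ suffices in this case.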
3.4 Results
Figure 5 shows the interpolation results using ordinary kriging for Bands 1 and 6. No obvious interpolation-related artifacts are visually apparent throughout most of the area shown, and the interpolated gap areas appear to be good, usable data. Figure 6(b) displays the interpolated multispectral image of Bands 2, 3 and 4, which allows interpolation-related artifacts to be checked more clearly. Compared with the gap-filled image produced by the USGS local linear histogram-matching method (figure 6(c)), the visual artifacts of the interpolated image are greatly reduced, although some remain, such as smoothing effects. Visual inspection of the interpolated multispectral image (figure 6(b)) suggests that interpolation does have a few noticeable deleterious impacts on image quality, although they are not severe. Linear features are particularly affected, as in the image shown, where the roads become somewhat blurred. While this would not be a problem for most scientific analyses, it might affect the usability of the image in some applications such as transportation planning.
Figure 5. Image pairs showing interpolation results using the ordinary kriging method. Original images are of Bands 1 and 6 (a, c) with interpolated images for each, respectively, to the immediate right (b, d). Available in colour online.
To verify the feasibility and reliability of the method and to assess its accuracy, two strips similar to the data gaps in the SLC-off ETM+ image were cut manually from the SLC-off ETM+ image (figure 7(a)). Figure 7(b) shows the image after the two strips were cut. Ordinary kriging with the same parameters as in the previous gap fill was used to fill the data gaps in the two strips, where the truth is known. Figure 7(c) illustrates the interpolated image using ordinary kriging (the double, thin black lines indicate the locations of the two cut-off strips). Compared visually with the true data in the original image, the continuity of the majority of the scene in the interpolated region is good and spatial patterns are restored well. Q–Q plots were employed to determine whether the DN values of the interpolated pixels and the DN values of the true pixels in the two cut-off strips share a common distribution. Figure 8 shows Q–Q plots of the distribution of true DN values versus that of estimated DN values for Band 1 and Band 6 of the image. Both bands show discrepancies at the upper tails of the two distributions: the large quantile values of the estimated DN values are smaller than the corresponding quantile values of the true DN values (dots are below the 45° line). Such Q–Q plots indicate that the larger DN values are underestimated. Both bands, especially Band 6, also display discrepancies at the lower tails of the two distributions: the small quantile values of the estimated DN values are larger than the corresponding quantile values of the true DN values (dots are above the 45° line). Such Q–Q plots suggest that the smaller DN values are overestimated. In general, the Q–Q plots show the smoothing effect that inherently accompanies the ordinary kriging algorithm and most other interpolation methods (Deutsch and Journel 1998). Because ordinary kriging estimates are weighted moving averages of the original data values, they usually show less spatial variability than the true data.
Figure 6. Comparison of ordinary kriging and linear histogram-matching methods in gap filling. (a) Original image with Bands 2, 3, 4. (b) Interpolated image using ordinary kriging (Bands 2, 3, 4). (c) Gap-filled image using the USGS local linear histogram-matching method (Bands 2, 3, 4). Available in colour online.
Figure 7. Images with manually cut strips for validation purposes. (a) Original image (the double, thin black lines indicate the locations of the two cut-off strips). (b) The image with the two strips cut. (c) Interpolated image (the double, thin black lines indicate the locations of the two cut-off strips). Available in colour online.
Figure 8. Q–Q plot of the distribution of true DN values versus that of estimated DN values in the two cut-off strips for different bands of the image (Bands 1 and 6).
To assess the accuracy of ordinary kriging, the errors (true DN value minus estimated DN value) between the truth and the estimate were computed. Figure 9 shows the histograms of the errors for
Bands 1 and 6. The histograms are fairly symmetrical for both bands, which indicates that overestimation and underestimation are almost equally frequent. More importantly, the histograms indicate that the majority of interpolated pixels in the two strips have small errors and that only a few pixels have DN values that differ substantially from their corresponding true DN values. The mean of the errors ranges from 0.41 to 1.06, the upper quantile of the errors ranges from 1.64 to 5.7, the lower quantile of the errors ranges from −4.9 to −1.16, and the median of the errors ranges from 0.36 to 1.09. The small values of the mean, quantiles and median of the errors suggest that ordinary kriging performed well overall. However, performance may be location-dependent. To assess the effectiveness of ordinary kriging spatially, we created a corresponding error distribution map. The error distribution map is useful in analysing the reliability of the DN value of each pixel in the two strips of gap-filled imagery. It also helps to determine where more information is needed in real-world applications in order to make decisions effectively using the gap-filled imagery. Figure 10 displays the spatial distribution of the errors for Bands 1 and 6. Green indicates locations of accurate estimation, while red and blue indicate areas of positive errors (i.e. underestimation) and negative errors (i.e. overestimation), respectively. One expected pattern is that the error is greatest in those areas that have the least consistent DN values. For example, one red location is composed of pixels whose DN values represent the ‘lake’ feature. Lakes represent a small land cover class in this region and the pixels that depict them have a relatively small amount of local data. Because it is difficult to interpolate the DN values of such a feature accurately from the DN values of its neighbours, it is reasonable that the ‘lake’ feature has a greater estimation error.
In general, the validation results of the gap fill of the two strips show that it is feasible to fill the data gaps in SLC-off ETM+ imagery using the ordinary kriging interpolation approach. Although there are some artifacts in the interpolated images, such as smoothing effects and disrupted linear features, the general results are good and the errors are within acceptable limits for classification in many applications. This suggests that ordinary kriging is a useful tool for interpolating the missing pixels in SLC-off ETM+ imagery.
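The validation statistics described above can be sketched as follows. This is an illustrative Python fragment using synthetic DN values, not data from the study: the estimates are generated as a smoothed version of the "true" values to mimic the smoothing effect of a weighted moving average, and the upper and lower quantiles are taken here as the 75% and 25% quantiles, which is an assumption about the text. All function names are hypothetical.

```python
import numpy as np

def error_summary(true_dn, est_dn):
    """Summary statistics of the errors (true DN minus estimated DN)."""
    errors = np.asarray(true_dn, dtype=float) - np.asarray(est_dn, dtype=float)
    return {
        "mean": float(errors.mean()),
        "median": float(np.median(errors)),
        "lower": float(np.quantile(errors, 0.25)),  # assumed 25% quantile
        "upper": float(np.quantile(errors, 0.75)),  # assumed 75% quantile
    }

def qq_points(true_dn, est_dn, probs):
    """Matching quantiles of the true and estimated DN distributions."""
    return np.quantile(true_dn, probs), np.quantile(est_dn, probs)

# Synthetic stand-in for the pixels of the cut-off strips (not study data).
rng = np.random.default_rng(0)
true_dn = rng.integers(40, 200, size=5000).astype(float)
# Smoothed estimate: pulled towards the mean, plus a small random error.
est_dn = 0.7 * true_dn + 0.3 * true_dn.mean() + rng.normal(0.0, 2.0, true_dn.size)

summary = error_summary(true_dn, est_dn)
q_true, q_est = qq_points(true_dn, est_dn, probs=[0.05, 0.5, 0.95])
```

With this synthetic smoothing, the upper-tail quantile of the estimates falls below that of the true values and the lower-tail quantile lies above it, which is the same qualitative pattern the Q–Q plots show for the kriged strips.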
Figure 10. The spatial distribution map of the errors (true value minus estimated value) for Bands 1 and 6. Available in colour online.
To consider other, potentially better geostatistical interpolation methods, we also attempted to use standardized ordinary cokriging to interpolate DN values for filling the data gap in the ETM+ imagery. A Landsat orthorectified TM mosaic image of the same study area was used as the secondary variable for the cokriging. Variogram and cross-variogram models were fitted to represent the co-regionalization of the two variables. Each fitted variogram and cross-variogram model is composed of three structures: a nugget, an exponential model and a Gaussian model. The co-regionalization matrices are positive semi-definite. Figure 11(c) displays the interpolated multispectral image (Bands 2, 3 and 4) using standardized ordinary cokriging. However, a visual comparison with the image interpolated using ordinary kriging showed no obvious improvement. To quantify the distinction between the two methods, the difference between the DN values interpolated by the two methods was calculated. Figure 12 shows the histograms of the difference for Bands 1 and 6. The histograms demonstrate that there is little difference between the DN values interpolated by the two methods. The mean of the difference ranges from −0.029 to 0.1 for Bands 1–6. The difference is so small that it can be considered negligible for image classification. This also shows that, as expected, when the primary attribute being interpolated has been reasonably well sampled, the secondary data are effectively redundant, and hence ordinary kriging and standardized ordinary cokriging give similar results. Cokriging is helpful only when the primary attribute being interpolated is sparsely sampled.
Figure 11. Comparison of ordinary kriging and cokriging. (a). Original image (Bands 2, 3, 4). (b). Interpolated image using ordinary kriging (Bands 2, 3, 4). (c). Interpolated image using cokriging (Bands 2, 3, 4). Available in colour online.
Figure 12. Histograms showing change in DN values between interpolated cells of the image using ordinary kriging versus cokriging for different bands (Bands 1 and 6).
4. Discussion and conclusions
Despite the malfunction of the SLC, the radiometric and geometric quality of the Landsat 7 ETM+ data sets is still excellent for many applications on regional and global scales. It is therefore important to find alternative techniques to fill the data gaps in SLC-off imagery. We provide a case study which demonstrates that geostatistical interpolation methods such as ordinary kriging are a good tool for filling the data gaps of an SLC-off ETM+ image and that the interpolated pixels are reasonable approximations of the ‘real’ ones. By calculating the variogram, geostatistical interpolation methods take into account the spatial variability and dependence inherent in the images. Because of the abundant data available in the SLC-off ETM+ image, it is possible for the experimental variograms to be well structured and to be fitted well using the available standard mathematical models. Ordinary kriging interpolation results in an optimal unbiased estimate for the data gap because its estimate is based on the structural characteristics of the image. The images under study are piecewise continuous (i.e. with some lines of discontinuous cliffs and continuity between the cliffs), and this structural feature is reflected in the quasi-linear behaviour of the variogram. It is because of this piecewise continuity that kriging is a well-adapted alternative to other interpolation techniques such as triangulation. Further, several other factors help ordinary kriging to estimate more accurately, such as the relatively small size of the data gaps, the strip shape of the missing pixels and the parallel distribution of the data gaps. Although the interpolation approach obviously cannot restore the image perfectly, the continuity and accuracy of the gap-filled image are good. No obvious visual differences were noted between interpolated and non-interpolated pixels in most areas. The summary statistics, histograms and spatial distribution of the errors indicate that the ordinary kriging results are accurate. While there are still some subtle differences or errors compared with the non-interpolated areas, such differences are typically minor and within the error limits acceptable to most image classification methods. Classification of such interpolated imagery should give results similar to those obtained from images without gap problems and should thus be applicable to regional and global scale studies. While some evidence of artifacts still exists where the gaps were previously located, the approach has great promise for filling the data gaps in SLC-off ETM+ images.
While standardized ordinary cokriging has been shown to be particularly useful when samples of the variable to be predicted are sparse and samples of a second, related variable are plentiful or exhaustive (Goovaerts 1997), the case study demonstrated that it provided little improvement over ordinary kriging in interpolating the data gap in the SLC-off image. The cokriging method produces a gap-filled image similar to that produced by interpolating only the primary data set using ordinary kriging. This can be explained by the structural characteristics of the primary and secondary image data. Consider, for example, the same function $Z$ for the first and second image. Clearly, cokriging cannot bring more information than ordinary kriging, even though the two inputs are perfectly correlated. Taking now the function $Z$ for the first image and a white noise $W$ for the second, again we find that there is no improvement, and this time the sources are completely decorrelated. Keeping $Z$ as the first image, and taking for the second image the version $Z'$ of $Z$ shifted by the vector of the range in a direction not parallel to the data gap stripes, we find that cokriging will obviously improve the estimation of $Z$ in the data gap, even though $Z(x)$ and $Z'(x)$ are completely uncorrelated. Similarly, one can take for $Z'$ a function which surrounds $Z$, as might be found in certain metallic deposits. Indeed, shift and crown are the two major structural situations in which cokriging yields an improvement. Both structures induce a hole effect in the cross-covariograms (i.e. at least one peak, followed by a minimum). The variograms and cross-variograms in the case study show that they are not in this situation, so cokriging turns out to be cumbersome. In addition, two other reasons may explain the lack of improvement from cokriging: (1) since the data gap in the SLC-off image is narrow and covers a relatively small percentage of the image, the primary data (the SLC-off image itself in the case study) are abundant and the secondary data (the mosaic image) do not provide much additional information; (2) the primary data are given more weight than nearby secondary data in the vicinity of a point being interpolated, since the primary and secondary data are close or even collocated. Considering the complex and tedious joint modelling of the co-regionalization required by standardized ordinary cokriging, it may not be practical to use it for filling data gaps in SLC-off satellite imagery.
The ordinary kriging interpolation approach suggested for filling data gaps in the SLC-off satellite imagery in this study is particularly suitable for regions where cloud-free imagery is hard to obtain and frequent temporal change happens. It allows users to extract the maximum information from the individual scene of the SLC-off image by considering spatial variability and dependence. It overcomes the inherent problems caused by radiometric differences and small georeferencing errors with the USGS local linear histogram algorithm. Although the interpolation results are not accurate enough for certain small-scale applications such as detailed mapping of small regions, the approach is appropriate for many other applications, such as assessment of mapping in large-scale agricultural landscape areas where rapid seasonal changes take place.
One problem encountered in the case study concerns the selection of neighbouring samples. Considering the large number of sample data available from the SLC-off ETM+ image, clarification of several issues would benefit further research, such as how many neighbouring samples should be used and how to select them efficiently. In the case study, eight data points were used for ordinary kriging. On the one hand, such a small number of neighbouring samples makes the ordinary kriging algorithm fast; on the other hand, it may reduce the interpolation accuracy because it ignores the abundance of sample data in the satellite image. Further research is needed to find how many neighbouring samples work best for filling the data gaps of the SLC-off ETM+ image, considering the specific spatial distribution of the missing pixels in the image.
Acknowledgements
Support from the NASA Wisconsin Space Grant Consortium is appreciated. Constructive suggestions from two anonymous reviewers substantially improved the manuscript and are greatly appreciated.
References
AKIMA, H., 1978, A method of bivariate interpolation and smooth surface fitting for irregularly distributed data points. ACM Trans. Mathematical Software, 2, pp. 148–159.
ADDINK, E.A. and STEIN, A., 1999, A comparison of conventional and geostatistical methods to replace clouded pixels in NOAA-AVHRR images. International Journal of Remote Sensing, 20, pp. 961–977.
COMMONWEALTH OF AUSTRALIA 2006, Landsat 7 ETM+ SLC-off composite products. Available online at: http://www.ga.gov.au/acres/referenc/slcoff_composite.jsp (accessed 12 April 2006).
COOKE, R., MOSTAGHIMI, S. and PARKER, J.C., 1993, Estimating oil spill characteristics from oil heads in scattered monitoring wells. Environmental Monitoring and Assessment, 28, pp. 33–51.
CURRAN, P.J., 2001, Remote sensing: Using the spatial domain. Environmental and Ecological Statistics, 8, pp. 331–344.
CURRAN, P.J. and ATKINSON, P.M., 1998, Geostatistics and remote sensing. Progress in Physical Geography, 22, pp. 61–78.
DEBRUIN, S., 2000, Predicting the areal extent of land-cover types using classified imagery and geostatistics. Remote Sensing of Environment, 74, pp. 387–396.
DEUTSCH, C.V. and JOURNEL, A.G., 1998, GSLIB: Geostatistical Software Library and User’s Guide (New York: Oxford University Press), 369 pp.
GOOVAERTS, P., 1997, Geostatistics for Natural Resources Evaluation (New York: Oxford University Press).
HERZFELD, U.C., 1999, Geostatistical interpolation and classification of remote sensing data from ice surfaces. International Journal of Remote Sensing, 20, pp. 307–327.
ISAAKS, E.H. and SRIVASTAVA, R.M., 1989, An Introduction to Applied Geostatistics (New York: Oxford University Press), 561 pp.
JOURNEL, A.G. and HUIJBREGTS, C.J., 1978, Mining Geostatistics (London: Academic Press).
JUPP, D.L.B., STRAHLER, A.H. and WOODCOCK, C.E., 1988, Autocorrelation and regularization in digital images: I. Basic theory. IEEE Transactions on Geoscience and Remote Sensing, 26, pp. 463–473.
JUPP, D.L.B., STRAHLER, A.H. and WOODCOCK, C.E., 1989, Autocorrelation and regularization in digital images: II. Simple image models. IEEE Transactions on Geoscience and Remote Sensing, 27, pp. 247–258.
LARK, R.M., 1996, Geostatistical description of texture on an aerial photograph for discriminating classes of land cover. International Journal of Remote Sensing, 17, pp. 2115–2133.
MATHERON, G., 1962, Traité de géostatistique appliquée (Editions Technip).
MATHERON, G., 1963, Principles of geostatistics. Economic Geology, 58, pp. 1246–1266.
MATHERON, G., 1965, Les variables régionalisées et leur estimation (Masson, Paris).
MATHERON, G., 1971, The Theory of Regionalized Variables and its Applications (Les Cahiers du Centre de Morphologie Mathématique de Fontainebleau 5. CMMF, Fontainebleau).
MATHERON, G., 1973, The intrinsic random functions and its applications. Advances in Applied Probability, 5, pp. 439–468.
MATHERON, G., 1979, Recherche de simplification dans un problème de cokrigeage (Rapport N-628, CG, Ecole des Mines de Paris). Available online at: http://cg.ensmp.fr/bibliotheque/1979/MATHERON/Rapport/DOC_00218/MATHERON_Rapport_00218.pdf.
ROSSI, R.E., DUNGAN, J.L. and BECK, L.R., 1994, Kriging in the shadows: geostatistical interpolation for remote sensing. Remote Sensing of Environment, 49, pp. 32–40.
SCARAMUZZA, P., MICIJEVIC, E. and CHANDER, G., 2004, SLC gap-filled products: Phase one methodology. Available online at: http://landsat.usgs.gov/data_products/slc_off_data_products/documents/SLC_Gap_Fill_Methodology.pdf (accessed 12 April 2006).
USGS, NASA and LANDSAT 7 SCIENCE TEAM 2003, Preliminary assessment of the value of Landsat 7 ETM+ data following Scan Line Corrector malfunction. Available online at: http://landsat.usgs.gov/data_products/slc_off_data_products/documents/SLC_off_Scientific_Usability.pdf (accessed 12 April 2006).
USGS 2004, Phase 2 gap-fill algorithm: SLC-off gap-filled products gap-fill algorithm methodology. Available online at: http://www.ga.gov.au/image_cache/GA4861.pdf (accessed 12 April 2006).
ZHANG, C., LI, W. and DAY, M., 2005, Towards establishing effective protective boundaries for the Lunan Stone Forest using an online spatial decision support system. Acta Carsologica, 34, pp. 178–193.