Object based image analysis of high resolution data in the alpine forest area


R. de Kok, T. Schneider, M. Baatz, U. Ammer

Lehrstuhl für Landnutzungsplanung und Naturschutz, Am Hochanger 13, 85354 Freising

KEYWORDS: Image-object analysis, Data fusion, Texture analysis, Forest-GIS

ABSTRACT

A typical multi-resolution satellite sensor operates in several multispectral modes as well as in a panchromatic mode with a much higher spatial resolution. Integrating panchromatic data into the standard multispectral analysis is not a straightforward procedure. Combined with the multispectral bands, however, the additional information in the panchromatic band allows the maximum image information to be retrieved from a given dataset. In particular, the second order statistics of the panchromatic band show very characteristic behaviour when different forest stand parameters are analysed.

Human vision is quite capable of selecting image objects in high resolution panchromatic data. Each ‘object of interest’ is then analysed further according to the attributes linked to the selected set of ‘objects of interest’. Traditional image fusion techniques, such as the IHS and Brovey transformations, greatly aid visual interpretation but hamper spectral classification, because the original pixel values are altered by the panchromatic values.

Object oriented classification techniques, well known from radar analysis and GIS based classification of raster images, can deal with topology descriptions and spatial object statistics. Manual construction of spatial objects, however, is expensive and time consuming. A suitable solution is an advanced segmentation technique, such as the one used in the eCognition software of Delphi2 Creative Technologies GmbH, which allows advanced (semi-)automatic object building and object analysis.

In object based image analysis, image fusion is a trivial issue, as the objects in the panchromatic band can be displayed with any given attribute, including the original pixel values of the multispectral bands. The object oriented classification method allows a proper segmentation of the panchromatic data into a set of spatial objects, which makes a pre-selection of the ‘objects of interest’ possible. Object based image analysis therefore offers the possibility of continuing the spectral analysis with ‘fused’ images. As spatial objects are selected on the basis of the image resolution preferred by the operator, the image layers of all other resolutions become attributes of the selected object layer. This allows any resolution and any type of surface description to be integrated, depending on the aggregation preferences of the user.

The construction of pixel objects and the object based image analysis as a whole allow an image interpretation that surpasses traditional spectral analysis. Object topology and object texture allow new ways of defining mixed pixel analysis. It also becomes very interesting to redefine image texture analysis, not only as the analysis of variance among neighbouring pixels (a filter operation), but also as the spatial relationship among image objects at different levels of resolution (Baatz, 1999).

In forestry applications in the difficult terrain of the Bavarian Alps, GIS updating with cheap data is becoming increasingly important. Extreme snowfall and rain in this and the past year, with losses of infrastructure and property amounting to billions of German marks, show again how important it is to gain a proper grip on the management of geo-factors in the region. Accurate and up-to-date maps are therefore a must, and only modern sensors can deliver material quickly enough to acquire the parameters needed to construct an optimal decision support system for the Bavarian state officials. Cheap data in this sense means not only the affordable acquisition of quality satellite imagery, but also cheap methods of processing it with reliable accuracy. This study shows the potential of object based image analysis with a seamless forest-GIS synergy. Typical forest parameters such as stand closure, erosion hazards and forest species composition can be derived from images with 1 meter panchromatic and 5 meter multispectral resolution, the resolution expected from the new satellite generations.


1. INTRODUCTION

Modern earth observation satellite sensors are capable of creating images in several multispectral bands as well as in a panchromatic band. To derive the optimal image information, visual interpretation of high resolution panchromatic data is still common practice in applied remote sensing. To increase the capability of visual interpretation, different data fusion techniques have been developed to enhance the panchromatic data with multispectral information. Gorte (1998) points out that there is a clear difference between digital image processing and information extraction (from satellite imagery); the image processing part should be well adapted to the further steps in the (automatic) extraction phase of the image analysis. Very High Resolution (VHR) data have a military origin and are mainly focused on the detection of specific objects, which can only be achieved at a certain image resolution and for which visual interpretation is preferred. In remote sensing applications in the agricultural and forestry domain, the main aim is to characterise surface areas rather than special single objects. This large-area classification changes the approach to the automatic analysis of such data.

Landgrebe (1999) distinguishes between the image domain, the spectral domain and the feature space domain to clarify the difference between human visual perception and computer ‘vision’. The well known image domain allows the operator to analyse spatial relationships among image objects visually. The decision rules are operator dependent; they are not available to automatic image analysis, and the results are recorded as digitised GIS layers. High resolution is in such cases a definite advantage, provided sufficient context is offered for the analysis procedure. For analysis in the spectral domain and the feature space domain, where computers play a dominant role, high resolution imagery is becoming difficult to handle. Visual analysis of high resolution imagery, however, is very costly, time consuming and biased. A proper semi-automatic analysis of such imagery is possible if suitable knowledge from the application domain is fed into the analysis process.

Spectral analysis of high resolution imagery without knowledge based decision rules will prove extremely difficult, as the problem of mixed pixels grows while the number of detectable objects increases as well. In this case, algorithms that are applied to the whole image are not very effective, and feature space alone cannot be used to create all decision boundaries between classes. As we deal with different resolutions and therefore different levels of object detection, for example the forest stand, the tree groups and the single crowns, it becomes necessary to analyse each object level with its own semantics and its relationship to the other object layers. To deal with such a complicated situation, eCognition (Delphi2 Creative Technologies) follows a hierarchical approach that allows a clear description of the object semantic network.

This method allows the efficient integration of the spectral and spatial properties of imagery from different resolutions, including additional GIS information.

The specific features of object based analysis are highlighted in the following sections, which cover an important part of the image analysis research field.

2. DATA FUSION

In remote sensing research, the discussion about the exact definition of image fusion is moving towards the reconstruction of a simulated sensor that records all detectable wavelengths at the highest resolution possible with the given dataset. In the classical example of SPOT 3 imagery, the goal would be to construct an image set with 10 meter panchromatic and also 10 meter multispectral resolution. The original 10 meter panchromatic image delivers a proper description of the real world objects responsible for the multispectral image characteristics. Most of the known algorithms for image fusion are not based on a reconstruction of the spectral values per image object, but are applied to the image as a whole. Unfortunately, different objects, that is, landcover surfaces, behave quite differently at higher resolutions. For typical homogeneous objects with a relatively large area, such as a water surface, a simple resampling of the pixels in the multispectral bands would not lead to information loss while increasing the resolution. For forest areas the amount of shadow is quite important, and with increasing resolution the spectral characteristics of the tree crowns deviate considerably from those of their shadowed spatial neighbourhood. In imagery with pixels of around 2 meters or finer, the spectral characteristics of tree crowns can be sampled separately from the rest of the forest stand. In such cases, image fusion is not a trivial issue. Reconstructing objects of interest in raster imagery is best done by visual digitizing from high resolution imagery. This procedure is biased, not repeatable and very expensive. Alternatively, proper segmentation techniques can create spatial objects with an acceptable level of accuracy in an unbiased, repeatable and cheap way.

The following definition of data fusion was adopted by the EARSeL - SEE - EMP working group (Wald, 1998):

"data fusion is a formal framework in which are expressed means and tools for the alliance of data originating from different sources. It aims at obtaining information of greater quality; the exact definition of 'greater quality' will depend upon the application".

In object based analysis, data fusion means incorporating attribute values from lower resolution data, or even GIS information, into the objects derived from the highest resolution level. Greater quality means giving the best attributes to the lowest level of objects that can be derived from VHR data. The visual quality of such a procedure may be lower than that of traditional image transformations, but the main aim is to assist automatic image analysis. The original VHR data can still be used as backdrop information, by making the object layers transparent or by displaying the objects as polygons. From the application side, this requires a good definition of the objects of interest in advance. Choosing the proper sensors is then crucially important.

Realising the link between image objects, resolution and spectral characteristics leads to the conclusion that it is more useful to predefine the objects of interest in the landscape and then to determine their different attributes, including topology and spectral characteristics.

3. MIXED-PIXEL ANALYSIS

In a satellite image with different landcover classes, each patch or spatial object can be presumed to have a characteristic histogram of spectral distribution (Steinnocher, 1997). Mixed pixels are those that belong to more than one such distribution and can be found on the borderlines between them. Furthermore, mixed pixels are the opposite of pure pixels, which are characteristic of the landcover class to be classified. In VHR images, this approach leads to problems, as pure pixels cannot easily be defined for ‘traditional’ landcover classes. In VHR data it is necessary to redefine the concept of pure pixels. The Delphi2 eCognition team is convinced that the spectral and spatial characteristics of the group of pixels that form an image object should be the very basis of the classification procedure (Baatz, 1999). As there are no pure pixels in VHR data, mixed pixel analysis seems to be restricted to Landsat type data, depending on that scale of measurement (30 m × 30 m pixels). For VHR data, the textural differences inside an object are very characteristic of that object and should be taken into account.

In this study, information from three levels is used in the analysis, each with its own objects:

• A Forest GIS Layer

• A SPOT 4 Image

• A Panchromatic band with 1 meter resolution resampled from a scanned orthophoto

The objects in all three layers have their own semantic networks, both within each layer and with the other layers (see Figure 1).

A typical example is the forest, where forest stands are influenced by the structure of their tree crowns, which contribute to the spectral characteristics of the class ‘forest’. In high resolution imagery with a pixel size of about 2 meters or finer, individual tree crowns are distinctly different from their shadows. This becomes even more typical in pure coniferous stands of higher age (>50 years), where the typical cone shape is responsible for a high shadow content in the stand characteristics. The effect is so pronounced that forest masks can be derived from panchromatic data with 5 meter resolution. With increasing resolution, the crown characteristics become quite distinct from their dark surroundings, as high reflectance values are received from the illuminated crown areas. The pixels over a forest area are in this sense always mixed pixels, and their textural characteristics at different resolutions define the unique properties of the object ‘forest stand’. For the forest application, the forest stand is the basic object, and a proper attribute is needed in forest GIS updating. Although high resolution imagery can deliver much more detailed information, an aggregated label parameter is in higher demand. The simplest way to analyse mixed pixels in a SPOT 4 image is to register each pixel as a square polygon and display it inside a high resolution panchromatic image. In one of the experiments in this study, a 1 meter panchromatic orthophoto was used to simulate the resolution of new satellite generations comparable with Ikonos and Quickbird.

Figure 1. The panchromatic layer with 1 meter resolution is segmented into different image objects, which receive their mean pixel value. In the visualisation, a small amount of image sharpness is sacrificed as the objects lose their variance, but only in the display. A SPOT image of the same area is segmented per pixel, creating a regular grid of polygons of 529 m^2 each. The borders of a forest GIS map are taken into account and used as the top level objects. All objects are linked within a semantic network that is automatically derived from the multi-layer segmentation procedure in Delphi2 eCognition. The operator then uses the links between objects within and among the layers to formulate class decision rules.

The SPOT pixels are registered as image objects with a surface of 23 × 23 meters (see Figure 2A and Figure 3). The selection of SPOT pixels that represent ‘pure stands’ can be checked immediately against the backdrop panchromatic image. At the same time, each SPOT pixel contains the data from the database that has been constructed during the segmentation procedure (Figure 2B). This makes it possible, for example, to check the typical number of panchromatic objects per SPOT object as a textural characteristic of such data. For comparison, the mean value of the first moment filter operation, a classical textural measurement of image contrast (ENVI 2.7 software, see Russ 1992), and its image are displayed in Figure 2A. Mixed pixels in this combination can be divided into several categories. The most prominent category consists of erosion gullies on dolomite chalk, which influence the spectral value of bordering forest stands. Because of their typical spectral value and spatial properties, they can be filtered out automatically. The most interesting mixed pixels for this application are those SPOT image objects that have a characteristic tree crown distribution which labels a specific forest stand, such as young pine trees. This study shows the state of the art and the way to proceed further, but for the typical stands in the alpine environment, the list of spatial characteristics per stand has not yet been completed.
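The number of panchromatic objects per SPOT object mentioned above can be sketched as follows. The sketch assumes a label image from a segmentation of the 1 meter panchromatic band, perfectly aligned with a grid of 23 × 23 pixel cells; the function name `subobjects_per_cell` and the toy label image are hypothetical.

```python
import numpy as np

CELL = 23  # one SPOT object covers 23 x 23 one-meter pixels (529 m^2)

def subobjects_per_cell(labels, cell):
    """Count the distinct object ids inside each cell x cell block."""
    rows, cols = labels.shape[0] // cell, labels.shape[1] // cell
    counts = np.zeros((rows, cols), dtype=int)
    for r in range(rows):
        for c in range(cols):
            block = labels[r * cell:(r + 1) * cell, c * cell:(c + 1) * cell]
            counts[r, c] = np.unique(block).size
    return counts

# Hypothetical segmentation result covering 2 x 2 SPOT objects.
labels = np.zeros((2 * CELL, 2 * CELL), dtype=int)  # top-left: one object
labels[:CELL, CELL:] = 1            # top-right cell is split into
labels[:10, CELL:CELL + 10] = 2     # four sub-objects, e.g. crowns
labels[:10, CELL + 10:] = 3         # and shadow patches
labels[10:CELL, CELL + 10:] = 4
labels[CELL:, :] = 5                # bottom row: one large object

counts = subobjects_per_cell(labels, CELL)
print(counts)
```

A high count flags a SPOT object with strong internal structure, whereas a count of one indicates a homogeneous surface at the chosen segmentation level.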

4. TEXTURAL ANALYSIS

Image texture analysis in satellite remote sensing has been dominated by filter operations, among which the Haralick textural filters, for example, have become popular in image analysis software. Texture can be approached in two fundamental ways, structurally and statistically. As raster images of landcover patterns cannot be described in terms of predefined primitive elements, the statistical approach has gained more influence in satellite image analysis (Musick, 1990). For practical applications, Steinnocher (1997) shows the use of high versus low texture in panchromatic IRS-1C data (5 meter resolution) to discriminate urban areas from agricultural areas. Relatively small objects (depending on the resolution) and edges are filtered out of the panchromatic data, leaving homogeneous forest and agricultural areas for further analysis. This practical example is very useful for explaining how textural analysis changes when an object oriented approach is used.

It is important to realise that borders between objects are not objects themselves; they have attributes but occupy no real surface (a one-dimensional arc). Consider ‘homogeneous’ landcover surfaces: a potato field, for example, shows a characteristic spectral distribution, and a neighbouring field with a different landcover or growing stage shows a different one. A mixed pixel on the border is influenced by both distributions, so that it deviates from the characteristic histograms of the landcover types on either side. A ‘line’ of mixed pixels between two landcover classes is very likely to be encountered in Landsat type imagery. These ‘lines’ have a high response when a first moment filter is applied to the image.

In object based analysis, texture can be measured inside a certain surface in a statistical as well as a structural way. The border is described by the topology of that surface. Contrast can be measured between existing objects (defined primitive elements as used by Musick, 1990) or among the pixels inside a spatial object. The number of sub-objects and their relative brightness are then a strong measure of contrast inside the surface. In this study, Figure 2A shows that a textural analysis (in the panchromatic band) using a first moment filter operation is highly correlated with the border pixels between the different objects. The number of sub-objects per SPOT image object is therefore an acceptable alternative to statistically based filter operators. Although Musick (1990) points out the problems of the structural approach to textural analysis for landcover classes, Delphi2 eCognition allows both approaches when the operator is familiar with the object of interest. Forest stands do have predefined primitive elements, with the tree crowns as the most prominent ones.

Choosing the image resolution, the objects of interest and the segmentation levels in advance allows the operator to focus on conglomerates of small objects, as in urban environments, and to separate them from agricultural field boundaries.

For forestry applications, the textural measurements inside a forest stand are related to the stand structure, its species composition, the stand closure and so on. The borderlines between the forest stands are less interesting in terms of contrast characteristics. Their length relative to the enclosed surface, however, is an important feature, which can be derived from the topological database linked to the spatial objects after the segmentation process.
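The border length relative to the enclosed surface can be approximated on a raster as sketched below. This is an illustrative raster calculation under the assumption of square pixels and 4-neighbour borders, not the topological database of the software; the function name `border_to_area` and the example stands are hypothetical, and the object is assumed to be present in the label image.

```python
import numpy as np

def border_to_area(labels, object_id, pixel_size=1.0):
    """Border length divided by area for one object in a label image."""
    mask = labels == object_id
    padded = np.pad(mask, 1, constant_values=False)
    # Every change between object and non-object along a row or a column
    # corresponds to one pixel edge on the object's border.
    edges = (np.sum(padded[1:, :] != padded[:-1, :])
             + np.sum(padded[:, 1:] != padded[:, :-1]))
    perimeter = edges * pixel_size
    area = mask.sum() * pixel_size ** 2
    return perimeter / area

# Hypothetical stands with equal area but different shapes.
labels = np.zeros((10, 10), dtype=int)
labels[1:5, 1:5] = 1   # compact 4 x 4 stand
labels[6:8, 1:9] = 2   # elongated 2 x 8 stand

compact = border_to_area(labels, 1)
elongated = border_to_area(labels, 2)
print(compact, elongated)
```

For the same area, the elongated stand yields the higher ratio, which is what makes this attribute useful for separating narrow features such as roads or gullies from compact stands.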

5. FOREST CLASSIFICATION WITH MULTI RESOLUTIONAL IMAGERY AND GIS

Using sensor derived imagery from airborne and satellite data in an existing forest GIS is different from standard image classification. First of all, important forest parameters such as species composition are already well known from the GIS data. Merely confirming that forest stand ‘X’ is correctly classified as having the spectral properties of a coniferous stand is not enough and often superfluous. What is needed is to point out which GIS objects are candidates for updating and, more importantly, which stands deviate from the general stand of that type: has the stand become less dense, is undergrowth present, and so on. Dealing with sensor derived objects while at the same time having the GIS attributes available, in an analysis package that also allows fuzzy logic decision functions, offers a very strong tool for forest GIS modelling. In Delphi2 eCognition, the interchangeability of objects from the image domain (satellite data) and the administrative domain (GIS data) makes a so-called synergy of these two domains straightforward to handle. Fuzzy logic rules are available for the classification phase, which makes it easier to reconcile the lack of sharp borderlines between image objects with the predefined sharp borderlines of administrative GIS boundaries.
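As a rough illustration of how a fuzzy decision function can combine an image derived attribute with a GIS attribute, consider the sketch below. It is not the eCognition implementation: the ramp-shaped membership function, the thresholds, and the names `ramp` and `update_candidate` are all hypothetical, and the minimum is used as one common choice for a fuzzy AND.

```python
def ramp(x, low, high):
    """Linear fuzzy membership: 0 at or below `low`, 1 at or above `high`."""
    if high <= low:
        raise ValueError("high must exceed low")
    return min(1.0, max(0.0, (x - low) / (high - low)))

def update_candidate(texture, gis_is_dense_conifer, low=5.0, high=15.0):
    """Degree to which a stand is a candidate for GIS updating.

    Fuzzy AND (minimum) of 'image texture is high' and 'the GIS records
    a dense coniferous stand'. The GIS attribute is crisp (0 or 1).
    """
    gis_membership = 1.0 if gis_is_dense_conifer else 0.0
    return min(ramp(texture, low, high), gis_membership)

# Hypothetical texture values, e.g. sub-object counts per stand.
for texture in (2.0, 10.0, 20.0):
    print(texture, update_candidate(texture, True))
```

A stand recorded as dense conifer in the GIS but showing high texture in the image receives a membership close to one, which points the operator to a stand that may have opened up since the last inventory.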

6. DISCUSSION

High resolution imagery of the IKONOS and Quickbird generation should not be treated purely as an alternative to the visual interpretation of aerial photography. As these are digital data, they allow further automatic analysis using a new generation of hardware and software. The latest developments in this field are impressive and convincing. VHR data have a military background, and for special object detection visual interpretation is still the rule. The new generation of satellite data allows users who are interested in monitoring large areas of landcover classes to obtain parameters that are detailed enough and, with automatic analysis, cheap enough to incorporate such data in daily practice. At a price of $20 per km^2, 1 meter panchromatic and 5 meter multispectral imagery are still too expensive. When prices drop to $5-$8 per km^2, use in forestry applications in non-productive areas (national parks, environmental protection areas) might have a chance. With a swath width of 11 km, as for IKONOS, it will take quite some time to cover a single SPOT scene with IKONOS data. It is worth finding out which part of a scene should be ordered as VHR data, and using Landsat 7 and SPOT 4 data to detect the important changes or events that make it necessary to acquire high resolution imagery of that part of the scene. The lower resolution data then function as a change detector that points operators to areas where higher resolution imagery can define the exact nature of sudden changes. Selling VHR data per Mbyte instead of per scene would then be helpful.
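To put the prices and the swath width above in perspective, the arithmetic can be sketched as follows. The 60 km × 60 km SPOT scene size is an assumption that is not stated in the text (it is the nominal SPOT scene footprint), and the sketch ignores strip overlap, minimum order sizes and off-nadir acquisition.

```python
import math

SPOT_SCENE_SIDE_KM = 60.0  # assumption: nominal SPOT scene, not from the text
IKONOS_SWATH_KM = 11.0     # swath width quoted in the text

def coverage_cost(side_km, price_per_km2):
    """Cost in dollars of covering a square area at a given price per km^2."""
    return side_km ** 2 * price_per_km2

def strips_needed(side_km, swath_km):
    """Number of parallel strips needed to cover the width of the area."""
    return math.ceil(side_km / swath_km)

area_km2 = SPOT_SCENE_SIDE_KM ** 2
cost_now = coverage_cost(SPOT_SCENE_SIDE_KM, 20.0)   # price quoted as too high
cost_low = coverage_cost(SPOT_SCENE_SIDE_KM, 5.0)    # lower end of $5-$8
cost_high = coverage_cost(SPOT_SCENE_SIDE_KM, 8.0)   # upper end of $5-$8
strips = strips_needed(SPOT_SCENE_SIDE_KM, IKONOS_SWATH_KM)

print(area_km2, cost_now, cost_low, cost_high, strips)
```

Under these assumptions a full scene at VHR remains costly even at the lower prices, which supports ordering VHR data only for the parts of a scene flagged by the lower resolution change detection.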

7. CONCLUSION

The main aim, characterising larger areas with VHR data, can be achieved if the original texture of the image objects is used in the automatic image analysis. This means that the focus is not on pure pixels; instead, the characteristics of the pixel distributions among image objects are the very basis of the analysis.

For this, object based analysis offers a very good tool to deal with the high demands of automatic image analysis of VHR data.

The construction of a semantic network between the different object layers becomes necessary in order to handle these layers at the same time. For eCognition, Delphi2 Creative Technologies GmbH has chosen a reliable hierarchical structure of image objects that allows this type of multi-level, multi-resolution object analysis.

Textural analysis of Landsat-like data has often been based on filter operations, using a statistical approach. A structural approach is possible in VHR data when the objects of interest contain definable primitive elements.

Object analysis bridges the gap between GIS information, sensor derived parameters and expert knowledge. This allows an impressive way of dealing with automatic object analysis, the limits of which have not yet been reached.

8. REFERENCES

Baatz, M., Schäpe, A., 1999. Delphi2 Creative Technologies GmbH, eCognition, Software tutorial. München.

Carl, S., 1996. Klassifikation landwirtschaftlicher Kulturen aus ERS-1 SAR Satellitendaten mit Hilfe neuronaler Netze. Dissertation, Technical University Munich, Germany.

Cross, A. M., Mason, D.C., 1988. Segmentation of remote-sensed images by split-and-merge process. Int. J. Remote Sensing, 9(8):1329-1345.

Flack, J., 1995. Interpretation of remotely sensed data using guided techniques, Ph.D. Thesis, School of Computer Science, Curtin University of Technology, Western Australia.

Gorte, B., 1996. Multi-spectral quadtree based image segmentation. Int'l Archives of Photogrammetry and Remote Sensing, Vol. 31, Part B3, pp. 251-256.

Gorte, B., 1998. Probabilistic segmentation of remotely sensed images. ITC, publication 63, PhD Thesis, ITC, Enschede.

Haralick, R. M., Shanmugam, K., and Dinstein, I., 1973. Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics, SMC-3(6):610-621.

Kenneweg, H., Förster, B., Runkel, M., 1991. Diagnose und Erfassung von Waldschäden auf der Basis von Spektralsignaturen. BMFT, Techn. Univ. Berlin.

Kettig, R.L., Landgrebe, D.A., 1976. Classification of Multispectral Image Data by Extraction and Classification of Homogeneous Objects. IEEE Transactions on Geoscience Electronics, Vol. GE-14, No. 1, pp. 19-26.

De Kok, R., 1999. Object based classification and applications in the alpine forest environment. International Archives of Photogrammetry and Remote Sensing, Vol. 32, Part 7-4-3 W6, Valladolid, Spain, 3-4 June 1999.

Landgrebe, D., 1999. Some fundamentals and methods for hyperspectral image data analysis. SPIE International Symposium on Biomedical Optics (Photonics West), San Jose, California, January 23-29. In Proc. of SPIE, Vol. 3603.

Musick, H.B., Grover, H.D., 1990. Image textural measures as indices of landscape pattern. In: Turner, M.G. and R.H. Gardner (Eds.), Quantitative methods in landscape ecology, Springer Verlag, New York, pp. 77-103.

Richards, J.A., 1992. Remote Sensing Digital Image Analysis. Springer Verlag, Berlin.

Russ, J. C., 1992. The image processing handbook. CRC Press Inc., Boca Raton, Florida.

Schowengerdt, R., 1997. Remote sensing models and methods for image processing. Academic Press, San Diego.

Steinnocher, K., 1997. Texturanalyse zur Detektion von Siedlungsgebieten in hochauflösenden panchromatischen Satellitenbilddaten. In: Dollinger, F. and J. Strobl (Eds.), Proc. of Angewandte Geographische Informationsverarbeitung IX, Salzburger Geographische Materialien, No. 26, pp. 143-152. Published by the Institute for Geography, University of Salzburg.

Wald, L., 1998. A European proposal for terms of reference in data fusion. International Archives of Photogrammetry and Remote Sensing, Vol. XXXII, Part 7, pp. 651-654.

Figure 2A. The image is a subset of a SPOT image of 182 × 205 meters, showing a forest road with brighter beech tree crowns and darker coniferous tree crowns. The black raster on the left side shows the SPOT objects (529 m^2) displayed as polygons. The right side shows the same area, where a first moment filter has been applied to the panchromatic data (1 meter resolution). The operator notices the strong response of bright individual tree crowns against their dark surroundings and can therefore select this feature for further classification procedures. The black square in the image is an activated SPOT polygon. Its attributes can be displayed on screen in a pop-up menu similar to Figure 2B.

Figure 2B. A pop-up menu with object information at the second level, showing only a small part of the available database with the attributes of all objects at all levels. The construction of this database is directly linked to the segmentation process and is an important, automatically derived product of the available data layers.

Figure 3. A SPOT image overlaying a 1 meter panchromatic image of approx. 700 × 700 meters, showing a forest road with different stands. The road is derived from an existing forest GIS.

After a first moment filter operation on the 1 meter panchromatic data, the SPOT polygons of 23 × 23 meters with high values for this first moment textural filter are classified as white opaque surfaces in the right part of the image. When the SPOT polygons are displayed as closed, square white arcs, the original panchromatic image shines through. The operator then notices that high textural values are associated with forest stands that are less dense or that show a strong contrast between crowns and shadow areas (in young stands). Additional features should then be selected to separate open forest stands from dense young tree stands. Note: the spectral values of the SPOT multispectral bands are not yet taken into account.