IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 48, NO. 8, AUGUST 2010


Rule-Based Classification of a Very High Resolution Image in an Urban Environment Using Multispectral Segmentation Guided by Cartographic Data

Mourad Bouziani, Kalifa Goita, and Dong-Chen He

Abstract—Classification algorithms based on single-pixel analysis often do not give the desired result when applied to high-spatial-resolution remote-sensing data. In such cases, classification algorithms based on object-oriented image segmentation are needed. There are many segmentation algorithms in the literature, but few have been applied in urban studies to classify a high-spatial-resolution remote-sensing image. Furthermore, the user must specify the spectral and spatial parameters that are data dependent. In this paper, we propose an automatic multispectral segmentation algorithm inspired by the specific idea of guiding a classification process for a high-spatial-resolution remote-sensing image of an urban area using an existing digital map of the same area. The classification results could be used, for example, for high-scale database updating or change-detection studies. The algorithm developed uses digital maps and spectral data as inputs. It generates the segmentation parameters automatically. The algorithm is able to provide a segmented image with accuracy greater than 90%. The segmentation results are then used in a rule-based classification using spectral, geometric, textural, and contextual information. The classification accuracy of the proposed rule-based classification is at least 17% greater than the maximum-likelihood classification results. Results and future improvements will be discussed.

Index Terms—Geographic database, high-resolution satellite imagery, rule-based classification, urban environment.

I. INTRODUCTION

THE AVAILABILITY of new satellite sensors that are capable of providing very high spatial resolution (VHSR) images has encouraged the scientific community to study how these sensors can be used in high-scale cartography [1]. Ikonos, the first commercial satellite with VHSR images, became available in September 1999. Other similar sensors have since become accessible: QuickBird, since October 2001, and OrbView, since June 2003.

Using VHSR images for urban environments has drawn the interest of many mapping agencies. Some research has

Manuscript received July 31, 2009; revised January 7, 2010. Date of publication April 19, 2010; date of current version July 21, 2010. This work was supported by the Natural Sciences and Engineering Research Council of Canada.

M. Bouziani is with the Institut Agronomique et Vétérinaire Hassan II, 10101 Rabat, Morocco (e-mail: m.bouziani@iav.ac.ma).

K. Goita and D.-C. He are with the Department of Applied Geomatics, University of Sherbrooke, Sherbrooke, QC J1K 2R1, Canada (e-mail: kalifa.goita@usherbrooke.ca; Dong-Chen.He@USherbrooke.ca).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TGRS.2010.2044508

been carried out on the characteristics of these images and their cartographic potential [2], [3]. Applications such as topographic cartography, change detection, and cartography of land occupation were considered by several researchers [3]–[7]. Their results showed that VHSR images have potential for urban cartography and updating of several objects of high-scale topographic maps. Objects such as roads, railroads, water surfaces, and most buildings can clearly be identified in these images. Some smaller elements, such as little constructions, are difficult to identify and, in some cases, impossible to detect.

It is evident from these studies that VHSR images can be used as an information source complementary to photogrammetric and direct survey. In countries with no detailed cartography and lack of resources to produce and analyze aerial photography, VHSR images can be used as a surrogate for topographic and thematic cartography [8]. On the other hand, in countries that have a cartographic tradition and established geographic databases, as is the case with Canada, the U.S., and western European countries, VHSR images can be used to update and control the quality of existing maps [1].

VHSR sensors provide images that yield more details about objects. Nevertheless, information extraction using classification techniques is more complex [9]. Indeed, as spatial resolution increases, the internal spectral variability of objects increases, resulting in a reduction of spectral differences between object classes [10]–[12]. Spectral resolution is another drawback of current VHSR images. Indeed, while the spatial resolution of these images is high, the spectral resolution is limited in comparison with other sensors, such as Landsat TM [13], [14]. Therefore, different materials appear with similar spectral signatures, making spectral discrimination between objects difficult.

Because of these limitations, classification problems are unavoidable when using only spectral information. For example, confusion between buildings and roads always occurs because materials for these surfaces are spectrally similar. If advantage is to be taken of the spatial resolution of VHSR sensors, it is therefore important to use information to supplement spectral information for urban image analysis [15]. This additional information can help to solve the problem of spectral resemblance between some classes [11], [15]–[17].

This additional information can be introduced by segmenting the image before analysis. After segmentation, every image segment becomes a homogeneous unit for which attributes can be calculated and used in the analysis. The attributes


most frequently used are spectral and textural parameters, area, perimeter, and compactness [15], [18], [19].

Several segmentation methods are available, some of which can be found in [13]. However, these methods require segmentation parameters (seeds, thresholds, and criteria of fusion). These parameters must be determined by the user, who must then perform many tests before choosing the final parameters, which depend on the image and application.

In this paper, a new multispectral segmentation method is proposed to automate the choice of segmentation parameters. It uses the valuable information content of existing maps. Then, the segmented image is used in a classification method that considers spectral, geometric, and contextual information. The approach produces a meaningful improvement of classification results compared to standard maximum-likelihood classification results.

The remainder of this paper is organized as follows. The study area and the data are presented in Section II. The methodology proposed is presented in Section III. The results are presented in Section IV. In Section V, we discuss the methodology and the results obtained. Finally, the conclusions are presented in Section VI.

II. STUDY AREAS AND DATA

This study concerns two cities: Sherbrooke (Canada) and Rabat (Morocco). The city of Sherbrooke is situated southeast of Montreal, in the province of Quebec. Its geographical location is 45°24′00″ north and 71°54′00″ west.

Sherbrooke has a population of about 147 000. As in most Canadian cities, streets are generally large, and the buildings are of different sizes, shapes, and colors. City topography is hilly with significant variations in altitude (from 140 to 340 m). The city also contains several vegetation zones (forests, trees, and lawn). A number of Sherbrooke’s residential districts have houses with small outside swimming pools, generally circular in shape.

Available data consist of an Ikonos multispectral image and an Ikonos panchromatic image of Sherbrooke acquired in July 2006. The multispectral image has a spatial resolution of 4 m and contains four bands (red, green, blue, and near infrared). This image was merged with the panchromatic image, which has a spatial resolution of 1 m in order to make use of the spectral information and the spatial accuracy of both images.

An area of 5 km × 5 km, corresponding to the part for which validation data are available, has been selected as the study area of Sherbrooke (Fig. 1). The digital map of Sherbrooke extracted from the topographic database for Quebec (BDTQ) is also available at the scale of 1:20 000. The BDTQ is a digital cartographic product of the Direction Générale de l'Information Géographique. The BDTQ data were updated in 2000. It consists of 25 layers describing the classes of water, roads, buildings, infrastructure, and vegetation. The number of layers is high because, in the digital map, some classes are presented in many layers. For example, different layers exist for buildings (administrative buildings, residential, etc.), which we need to regroup together for our purpose.

Fig. 1. Ikonos image of Sherbrooke, acquired in July 2006, showing the study area.

Fig. 2. QuickBird image of Rabat, acquired in August 2007, showing the study area.

Rabat is the capital city of Morocco. It is situated on the Atlantic coastline of the country. Its geographical location is 33°13′1″ north and 6°53′10″ west. Rabat has a population of about 1.6 million according to the official census of 2004. Rabat contains different types of districts. Three main types can be distinguished: residential districts with buildings and small individual houses; old districts, such as the former Medina, characterized by a high density of houses and very narrow alleys; and newly planned residential districts laid out on a regular grid and consisting of villas, large buildings, and wide streets. The relief of the city of Rabat is generally flat except in the periphery, where altitude variation is important.

Available data for Rabat consist of a QuickBird multispectral image and a QuickBird panchromatic image acquired in August 2007. The multispectral image has a spatial resolution of 2.4 m and contains four bands (red, green, blue, and near infrared). This image was merged with the panchromatic image, which has a spatial resolution of 0.6 m.

Validation data are available for an area of 4 km × 3 km (Fig. 2). The digital map used is extracted from the topographic


Fig. 3. Diagram of the proposed method to classify a very high resolution image in an urban environment.

Fig. 4. Subset images of the study area of Sherbrooke.

database of Rabat, which is available at the scale of 1:10 000. This map was updated in 2002.

III. METHODOLOGY

The proposed method uses per-pixel classification results, the existing geodatabase, and a rule base to improve the classification accuracy of a very high resolution multispectral image in an urban environment. The methodology involves the following steps (Fig. 3):

1) preprocessing that consists of image fusion and map data registration;

2) image classification using a standard maximum-likelihood method;

3) multispectral image segmentation to obtain homogeneous groups of pixels;

4) object rule-based classification.

A. Preprocessing

The main objective of image fusion is to create a new image integrating complementary information from the original images. The challenge is thus to merge these two types of images by forming new images that integrate both the spectral aspects of the low-resolution images and the spatial aspects of the high-resolution images. Several image fusion methods are available [20]. The most commonly used methods have a shared limitation: they do not faithfully preserve the colors of the original multispectral image. For our work, we used an algorithm developed by He et al. [21], which is capable of preserving the spectral aspect of the low-resolution multispectral image while integrating the spatial information from the high-resolution panchromatic image.

The digital map objects are used in some steps of this work. In this case, it is important to have the corresponding objects in both data sets (image and map) projected in the same spatial reference system. The image-to-map registration was done using a first-order polynomial function. The accuracy obtained is less than 1.5 pixels for the study areas.
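A first-order polynomial registration of this kind can be sketched with a least-squares fit over control points; the function names and control-point interface below are assumptions for illustration, not the authors' implementation:

```python
# Sketch of a first-order polynomial (affine) image-to-map registration,
# assuming control-point pairs (image coordinates -> map coordinates).
import numpy as np

def fit_first_order_polynomial(image_xy, map_xy):
    """Fit x' = a0 + a1*x + a2*y (and likewise y') by least squares."""
    image_xy = np.asarray(image_xy, dtype=float)
    map_xy = np.asarray(map_xy, dtype=float)
    # Design matrix: constant term plus the two image coordinates.
    A = np.column_stack([np.ones(len(image_xy)), image_xy])
    coeffs, *_ = np.linalg.lstsq(A, map_xy, rcond=None)
    return coeffs  # shape (3, 2): one column per output coordinate

def apply_transform(coeffs, image_xy):
    """Transform image coordinates into the map reference system."""
    image_xy = np.asarray(image_xy, dtype=float)
    A = np.column_stack([np.ones(len(image_xy)), image_xy])
    return A @ coeffs
```

The residuals on the control points give the registration accuracy (reported above as less than 1.5 pixels).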

For the presentation of the results of the proposed method-ology, we chose two subset images of Sherbrooke (Fig. 4) and two subset images of Rabat (Fig. 5). This is only for illustration purposes in the paper. The approach is used and evaluated on the entire study areas.

B. Per-Pixel Classification

The merged images comprising the four spectral bands were used in a standard maximum-likelihood classification. The land cover classes used are roads, buildings, trees, grass,


Fig. 5. Subset images of the study area of Rabat.

TABLE I. PIXEL NUMBERS BY CLASS

bare soil, water, and shadow. Many sites within the study areas were used for training and assessment. The validation of the classification was done using reference pixels that are different from those used for training. The intersection between training and validation sets is empty. For the Ikonos image of Sherbrooke, a total of 126 test sites—randomly distributed and representative of the different classes—were considered. These sites contain 19 076 training pixels and 61 075 reference pixels for validation. For the QuickBird image of Rabat, a total of 152 test sites—randomly distributed and representative of the different classes—were considered. These sites contain 19 345 training pixels and 76 818 reference pixels for validation. The distribution of training and validation pixels by class is given in Table I.

The same reference sites were used to assess the different classification results presented herein. The confusion matrix and overall accuracy were calculated using the validation data for the entire study areas. Illustrations are given only for the subsets presented hereinafter. Some additional information was used to attempt to improve the classification of the images. The first information added was texture. It contains information about the spatial distribution of spectral variations in a band within a given window.

When combining spectral and textural information, confusion between some classes can be reduced, resulting in better classification accuracy. Texture can be used to discriminate between trees and grass. These two classes have similar spectral signatures but have different textures. Zones covered with grass are more homogeneous than those covered with trees. This difference in homogeneity can be used to reduce confusion between the two classes. Several texture measures have been

recommended in the literature. Herein, we evaluated the variance, homogeneity, and entropy calculated from the co-occurrence matrix on the panchromatic band. Every measure of texture was calculated on different windows (5 × 5, 7 × 7, 9 × 9, 11 × 11, 13 × 13, and 15 × 15). Texture bands were used together with the four other spectral bands in a classification scheme based on the maximum-likelihood method. Homogeneity with a 7 × 7 window gave an increase of 6% in the classification accuracy for the grass and tree classes.
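As a sketch of how such window-based co-occurrence measures are computed (homogeneity shown; the quantization levels and the single horizontal offset are illustrative assumptions):

```python
# Minimal gray-level co-occurrence matrix (GLCM) and homogeneity for one window.
import numpy as np

def glcm(window, levels, offset=(0, 1)):
    """Co-occurrence counts for one offset; window values must lie in [0, levels)."""
    dr, dc = offset
    rows, cols = window.shape
    P = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                P[window[r, c], window[r2, c2]] += 1
    if P.sum() > 0:
        P /= P.sum()  # normalize counts to joint probabilities
    return P

def homogeneity(P):
    """Sum of P[i, j] / (1 + (i - j)^2): 1.0 for a perfectly uniform window."""
    i, j = np.indices(P.shape)
    return float((P / (1.0 + (i - j) ** 2)).sum())
```

Sliding this over the panchromatic band with 5 × 5 to 15 × 15 windows produces the texture bands described above.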

The second information added is of geometric nature. There are differences between parcels containing trees and those containing grass. In urban environments, trees are generally spaced and circular in shape, whereas grass parcels have regular shapes. Compactness can be used to discriminate between the two classes.

Texture is not appropriate to differentiate between roads and buildings. In general, their textures are similar. Geometry and context information can help differentiate between them. Concerning geometry, roads are elongated features, whereas buildings are rather rectangular or square. We used two parameters: compactness Icm and elongation Ie [22]:

Icm = 2·√(π · Area(object)) / perimeter(object)  (1)

Ie = Area(object) / [length(object)]²  (2)

Another important point for differentiating between the two classes is that buildings are elevated objects. They cast shadow on the ground opposite to the direction of the sun. The quantity of shadow depends on the positions of the sensor and the sun. If shadow is visible in an image, it can be used to identify buildings. For the Ikonos image of Sherbrooke used in this study, the sensor position is defined by an azimuth of 145° and an elevation of 79°. Sun position is defined by an azimuth of 169° and an elevation of 31°. For the QuickBird image of Rabat, the sensor position is defined by an azimuth of 215° and an elevation of 88°. Sun position is defined by an azimuth of 165° and an elevation of 70°. Consequently, shadow is visible in the two images.
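A minimal sketch of the two geometric indices, assuming the normalized reading of (1) in which compactness equals 1 for a circle (consistent with the 0–1 value ranges reported later); `length` denotes the object's longest extent:

```python
# Geometric indices used to separate roads from buildings (illustrative sketch).
import math

def compactness(area, perimeter):
    """Icm = 2*sqrt(pi*area)/perimeter: 1 for a circle, lower for elongated shapes."""
    return 2.0 * math.sqrt(math.pi * area) / perimeter

def elongation(area, length):
    """Ie = area / length**2, where length is the object's longest extent."""
    return area / (length ** 2)
```

A circle of radius 1 (area π, perimeter 2π) gives compactness 1; a long thin road segment of the same area gives a much smaller value.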

Trees are vegetation that can also generate shadow because of their height. Shadow can then help detect trees. This could


eliminate some classification errors due to confusion between trees and grass.

To summarize, the use of additional information plays an important role in reducing classification errors due to spectral confusion between trees and grass, on the one hand, and between roads and buildings, on the other hand. Texture, geometry, and context were used to distinguish between trees and grass. Trees are more compact and have a stronger texture than grass parcels. In addition, trees generate shadow. Geometry and context were used to differentiate between roads and buildings. Therefore, the following five facts were taken into consideration to develop an approach to improve the classification accuracy of a VHSR image of an urban environment: 1) trees generate shadow, whereas grass does not; 2) trees are circular in shape, whereas grass parcels are oblong; 3) texture is strong for trees and weak for grass; 4) buildings have compact geometry, whereas roads are elongated; and 5) buildings generate shadow, whereas roads do not.

In the classification approach, nonspectral information is used only for classes where it can bring improvements. For example, texture is used to classify vegetation (trees and grass), and elongation is applied to classify roads and buildings. We try to minimize errors from spectral confusion between these classes. A blind application of nonspectral information to all classes could decrease the overall classification results.

The pixel-oriented approach is not appropriate for using geometric and contextual attributes; an object-oriented approach is needed. Thus, a multispectral image segmentation method should be used to generate objects.

C. Multispectral Segmentation

Segmentation aims at grouping similar pixels to get homogeneous zones within the image. Segments deliver more spectral, geometric, and contextual information than pixels, which is important for classification. Many segmentation algorithms are available. A review of some can be found in [13] and [23]–[25]. Few segmentation methods permit the use of multispectral images. Unfortunately, segmentation parameters such as seeds and thresholds must be determined by the user. These parameters are data and application dependent, which means that several tests are needed before the definitive parameters can be selected.

A new segmentation method using a multispectral image is proposed here to automate the selection of segmentation parameters. Existing digital maps are used as auxiliary data to improve segmentation results.

An automatic growing-region segmentation algorithm was used, involving an existing digital map and spectral data as input to automatically determine the segmentation parameters. The segmentation seeds used are centroids of geographic information system (GIS) map objects. Image segmentation begins with these points to extract homogeneous objects from the image. A spectral threshold is calculated for every band of the image. The digital map is used to automatically determine thresholds as follows: pixels that correspond to every object in the digital map are processed to calculate the standard deviation in every band, which is the spectral threshold (St) in this band.

Geometric thresholds (object minimum area Am and object minimum compactness value Cm) are determined using objects from the digital map.

It is important to notice that seeds are positioned first only where objects exist on the map. A first segmentation is performed using these seeds. Then, other seeds are randomly positioned in areas where map objects do not exist. The thresholds determined as explained earlier are applied to the entire image to be segmented.

A growing-region algorithm starts from seed points and groups the adjacent pixels according to a criterion of homogeneity based on the spectral distance between adjacent pixels (Dsk). To respect the spectral homogeneity criteria, Dsk must be lower than the adopted spectral threshold in each band:

Dsk(i, j) = |Vi,k − Vj,k|  (3)

where Vi,k is the value of pixel i in band k and Vj,k is the value of pixel j in band k.

A diagonal eight-neighborhood function was used to determine whether an adjacent pixel should be merged with an existing image object or whether it should be part of a new object. An image object will stop growing once it exceeds the thresholds (St and Cm).

The segmentation starts from the seed points. Every seed pixel has eight neighboring pixels. The spectral distance between the seed and each neighbor is calculated, and then, the spectral homogeneity test is applied. For each neighbor pixel, if the distance is less than the spectral threshold, the seed and pixel are merged to form a segment. A segment's spectral value is the average of spectral values of pixels that belong to it. The procedure continues by comparing the spectral value of the segment formed with the values of all its neighboring pixels. The procedure stops when, for a segment, no neighboring pixel satisfies the criterion of homogeneity. The procedure is then applied to a new seed point and so forth, until the entire image has been segmented.
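The seeded growing procedure described above can be sketched as follows; this is a simplified illustration (one segment grown from one seed, with the segment mean updated as pixels join), not the authors' implementation:

```python
# Simplified seeded region growing with per-band spectral thresholds St.
from collections import deque
import numpy as np

def grow_region(image, seed, thresholds):
    """Grow one segment from `seed` in an image of shape (rows, cols, bands).

    A neighbor joins the segment when |segment_mean - pixel| <= St in every band.
    Returns the list of (row, col) pixels belonging to the segment.
    """
    rows, cols, _ = image.shape
    visited = np.zeros((rows, cols), dtype=bool)
    visited[seed] = True
    segment = [seed]
    total = image[seed].astype(float).copy()  # running sum of band values
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        mean = total / len(segment)  # current segment spectral value
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):  # diagonal eight-neighborhood
                r2, c2 = r + dr, c + dc
                if (dr or dc) and 0 <= r2 < rows and 0 <= c2 < cols \
                        and not visited[r2, c2] \
                        and np.all(np.abs(mean - image[r2, c2]) <= thresholds):
                    visited[r2, c2] = True
                    segment.append((r2, c2))
                    total += image[r2, c2]
                    queue.append((r2, c2))
    return segment
```

Running this from every seed in turn, then post-processing with the Am and Icm merging rules described below, yields the full segmentation.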

The objects acquired by segmentation need further processing to remove unnecessary holes inside objects and to merge spectrally similar segments and small segments along object boundaries. To do this, adjacent segments with similar spectral features are merged. The criterion of homogeneity explained earlier is applied. Moreover, every small segment is merged with the adjacent segment that is spectrally nearest. The area of all segments must necessarily be greater than the minimal area value Am. Compactness of all segments must be greater than the minimal compactness value Icm. Thus, every segment whose compactness is less than Icm is merged with the adjacent segment that is spectrally nearest to get a more compact segment.

After the image is segmented, geometric and spectral attributes are calculated for each segment. These attributes are the mean mi and variance σi² in every spectral band, homogeneity, area, perimeter, and the geometric indicators of compactness and elongation.

The visual and quantitative assessments of segmentation were done. Visual analysis reveals whether the resulting segments follow the limits of the actual objects. The quantitative assessment


makes it possible to estimate the percentage of segmentation errors by comparing the results to reference data provided by an independent photo interpreter. Two measures of quality were used.

The first measure is a ratio between the number of segments from our approach and the number of segments in the reference segmentation. It is given by the following formula [23]:

Rseg = Nseg / Nref.  (4)

This measure reveals whether the segmentation algorithm yields more or fewer segments than the actual segments.

The second measure of quality represents the error of segmentation. It gives the proportion of pixels wrongly segmented in the image in relation to the total number of pixels in the image [13], [23]. To assess the segmentation, the result has been compared to a reference segmentation done by a skilled interpreter. It is obtained from a visual segmentation digitized manually in GIS software. In each study area, two subsets are visually segmented. The extent of the subsets is 250 pixels by 250 pixels. We used the same method defined in [13]: the boundaries of adjacent regions are placed at the center of the transition zone between them. The total error is calculated as follows:

ET = ( Σi=1..N Σj=1..N NPi,j − Σk=1..N NPk,k ) / ( Σi=1..N Σj=1..N NPi,j )  (5)

where N is the number of segments from the reference segmentation, NPi,j represents the number of pixels of segment j that have been assigned to segment i by the proposed segmentation algorithm, and NPk,k represents the number of pixels assigned to the correct segments.
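Both quality measures can be computed from a produced-versus-reference pixel-count matrix; a sketch, assuming a square matrix indexed as NP[i, j] as in (5):

```python
# Segmentation quality: Rseg from (4) and total error ET from (5).
import numpy as np

def segmentation_quality(confusion):
    """confusion[i, j] = pixels of reference segment j assigned to produced segment i."""
    NP = np.asarray(confusion, dtype=float)
    total = NP.sum()
    correct = np.trace(NP)                 # pixels assigned to the correct segments
    ET = (total - correct) / total         # proportion of wrongly segmented pixels
    # Rseg compares segment counts; with a square matrix, count nonempty rows/columns.
    n_seg = int((NP.sum(axis=1) > 0).sum())
    n_ref = int((NP.sum(axis=0) > 0).sum())
    return n_seg / n_ref, ET
```

A diagonal matrix gives ET = 0 and Rseg = 1 (a perfect match to the reference).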

Now that we have a segmented image, properties such as shape and context can be used. Moreover, spectral statistics like the mean and variance of each segment can be calculated.

D. Object Rule-Based Classification

The goal of rule-based classification is to improve the accuracy of classifying urban land cover. We aim to minimize confusion errors between spectrally similar classes. The strategy for discriminating between grass and trees is to label all pixels classified as grass and those classified as trees as vegetation. Object classification is then used to differentiate between trees and grass. The attributes used are the class of pixels forming the segment, homogeneity, compactness, and adjacency to a shadow segment.

The strategy for discriminating between roads and buildings is to label all pixels classified as roads and those classified as buildings as built-up areas. Object classification is then used to differentiate between roads and buildings. The attributes used are the class of pixels forming the segment, compactness, elongation, and adjacency to a shadow segment. The rule base comprises four types: pixel classification, texture, geometric, and contextual rules.

Since the image has already been classified using maximum-likelihood classification in a pixel approach, image segments can be initially assigned by analyzing the classes of pixels belonging to every segment.

The percentage pi of pixels of each class present in every segment is calculated using the results of the per-pixel classification already performed. Thus, a certainty factor Fc is assigned to this classification information, which corresponds to the calculated percentage pi. Segments composed essentially of pixels belonging to only one class will have a certainty factor close to one for this class and zero for the other classes. Segments composed of pixels belonging to several different classes, however, will not have a certainty factor close to one for any class, thereby reflecting the ambiguousness resulting from the per-pixel classification.

The certainty factor Fci on the spectral classification information represents a segment's membership to the class i. It is given as

Fci = npi / Np  (6)

where npi is the number of pixels of the class i present in the segment and Np is the number of all pixels in the segment.

The homogeneity for the entire image is calculated first, and the mean value of homogeneity for each vegetation segment is used to differentiate between trees and grass. Thus, the mean homogeneity value is calculated for segments in the class vegetation with an Fc superior to 0.5. The segments with a strong texture are probably trees.
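The certainty factor in (6) is simply the class histogram of a segment's pixels; a minimal sketch:

```python
# Per-class certainty factors Fc_i = np_i / Np for one segment.
import numpy as np

def certainty_factors(pixel_classes, n_classes):
    """Fraction of the segment's pixels in each class, from per-pixel labels."""
    counts = np.bincount(np.asarray(pixel_classes), minlength=n_classes)
    return counts / counts.sum()
```

A segment whose pixels all share one label gets Fc = 1 for that class and 0 elsewhere; a mixed segment spreads its certainty across classes.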

Geometric rules consist of comparing the segment shape (compactness and elongation) to the threshold values of some characteristic objects.

Buildings generally have high values of compactness and elongation, whereas roads have low values of these geometric indices. The user can easily determine the compactness and the elongation characteristic values of these objects in the study area. The proposed method uses the existing objects of the digital map to calculate and analyze these indices. In our study, we analyze many buildings of different shapes. We find that most of them have compactness ranging from 0.4 to 0.8 and elongation varying from 0.3 to 1. The same analysis was done for roads of the study area. We find that the compactness of most roads is lower than 0.3 and that the elongation is lower than 0.2.

The comparison for geometry uses the mean and standard deviation of the attributes calculated from the training sites. Several tests brought us to choose a function h() of resemblance for the geometric attributes. A geometric attribute has a mean of mIg and standard deviation of σIg. The geometric resemblance is given by the following function:

h(Ig) = 1 − |Ig − mIg| / σIg,  if 0 ≤ |Ig − mIg| ≤ σIg
h(Ig) = 0,  otherwise  (7)

where Ig represents the value of the geometric attribute for a given object. The values of a geometric attribute situated between the limits defined by the mean and standard deviation correspond to a degree of geometric resemblance situated


between zero and one, so that the values near the mean value correspond to objects having a large resemblance to the objects of the class analyzed.

The geometric attribute can be either compactness or elongation. Two factors of certainty are therefore calculated: Fcpi for compactness and Fei for elongation.
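The resemblance function in (7) is straightforward to implement:

```python
# Geometric resemblance h(Ig) from (7): linear falloff within one standard deviation.
def geometric_resemblance(Ig, mean, std):
    """1 at the class mean, 0 beyond one standard deviation, linear in between."""
    deviation = abs(Ig - mean)
    return 1.0 - deviation / std if deviation <= std else 0.0
```

Applying it to a segment's compactness and elongation against the building (or road) training statistics yields the certainty factors Fcpi and Fei.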

The context rule makes it possible to verify the presence of shadow for a segment opposite to the sun. In the following, we explain how this rule is used:

Ok  object k in the segmented image;
C(Ok)  class of the object Ok;
A(Ok)  list of objects adjacent to Ok.

Thus, the context rule is written as follows.

1) For an object Ok such that C(Ok) = "built-up area": if ∃ Oj ∈ A(Ok) such that C(Oj) = "shadow" and Azimuth(Oj, Ok) ≈ AzimuthSun, then C(Ok) = "building".

2) For an object Ok such that C(Ok) = "vegetation": if ∃ Oj ∈ A(Ok) such that C(Oj) = "shadow" and Azimuth(Oj, Ok) ≈ AzimuthSun, then C(Ok) = "trees".

The procedure begins by identifying the segments that have a certainty factor (membership) in the class of shadow greater than or equal to 0.5. Then, the segments adjacent to these likely shadow segments in the direction of the solar azimuth are extracted. The segments that have an important Fc for vegetation are probably trees, and those that have an important Fc for built-up areas are probably buildings. The segments that do not have an adjacency length greater than one-third of the length of the likely shadow segment are not considered in this analysis. This makes it possible to avoid identification problems.
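The context rule can be sketched as below; all record fields (`class`, `fc_shadow`, `adjacent`, `length`) and the 20° azimuth tolerance are illustrative assumptions, since the paper does not specify data structures:

```python
# Sketch of the shadow-adjacency context rule for built-up and vegetation segments.

def apply_context_rule(segment, segments, sun_azimuth, tol=20.0):
    """Relabel a segment that has a likely shadow neighbor toward the sun's azimuth."""
    for nb_id, azimuth, shared_len in segment["adjacent"]:
        nb = segments[nb_id]
        if nb["class"] != "shadow" or nb["fc_shadow"] < 0.5:
            continue  # only likely shadow segments count
        if abs((azimuth - sun_azimuth + 180) % 360 - 180) > tol:
            continue  # neighbor is not in the solar-azimuth direction
        if shared_len < nb["length"] / 3:
            continue  # one-third adjacency-length criterion
        if segment["class"] == "built-up":
            return "building"
        if segment["class"] == "vegetation":
            return "trees"
    return segment["class"]  # no qualifying shadow neighbor: keep the label
```

A built-up segment with a strong shadow neighbor on the sun side is relabeled "building"; otherwise its label is unchanged.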

The rule base used in the object classification is defined as follows.

Rule 1) If a segment contains a high rate of pixels belonging to one class, then the segment potentially belongs to the same class.

Rule 2) If the homogeneity of a vegetation segment is not high, the segment likely contains trees.

Rule 3) If the homogeneity of a vegetation segment is high, the segment likely contains grass.

Rule 4) If the compactness of a vegetation segment is close to that of a circle, the segment is likely an isolated tree.

Rule 5) If a vegetation segment is adjacent to a shadow segment in the direction of the sun's azimuth, then the segment contains trees.

Rule 6) If a segment of built-up class has compactness close to a rectangle, then it is likely a building.

Rule 7) If the elongation of a segment of the built-up class is not high, then it is likely a road.

Rule 8) If a segment of built-up class is adjacent to a segment of shadow in the direction of the sun’s azimuth, then the segment is a building.

To take the inaccuracy of rules into consideration, a confidence degree is obtained for each rule. The confidence degree of rule conclusions could be positive or negative depending on the satisfaction of the rules by the segment. The value of the confidence degree of a rule conclusion depends on how the

TABLE II. WEIGHTS OF RULES USED

image segment satisfies the condition of the rule. For example, the value of the confidence degree of Rule 6) depends on the value of the segment compactness. Indeed, the confidence degree value is high if the segment compactness is close to the compactness mean obtained from training data. It decreases as the compactness moves away. It could be negative if the compactness is very far from the mean.

The quality of a rule-based system’s results also depends on rule weighting, since the rules do not all have the same importance. The importance of each rule can change according to the application’s context. There is no automatic method to determine rule weights [22]; thus, the user must define a weight for every rule based on experience and knowledge of the environment. In this study, we propose a semiautomatic technique to help the user determine the rule weights. The technique works as follows: the user chooses an image subset with available validation data for all classes (buildings, roads, trees, grass, bare soil, water, and shadow). The classification rules are then applied separately, and the results are compared to the validation data. This yields the identification accuracy of each rule, which is taken as its weight. The weights for the rules used in this study were obtained using a small subset image of 150 m × 120 m. The results are given in Table II.
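The semiautomatic weighting procedure above can be sketched as follows (function names are hypothetical): each rule is applied alone on the validation subset, and its identification accuracy becomes its weight.

```python
def rule_weight(predicted, reference):
    """Fraction of validation segments a single rule labels correctly;
    predicted/reference are equal-length label lists."""
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)

def compute_weights(rules, segments, reference_labels):
    """rules: dict mapping a rule name to a callable that labels all
    segments when applied alone; returns per-rule accuracy as weight."""
    return {name: rule_weight(rule(segments), reference_labels)
            for name, rule in rules.items()}
```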

Several rules can apply to a segment. The more rules a segment satisfies, the more likely it is that the result is accurate. This provides a means of increasing or decreasing classification certainty while combining the knowledge contained in the rules. We combine several rules by adopting the same strategy as in the MYCIN system [26]. If two different rules R1 and R2 give the same conclusion C, with DC1 and DC2 being the confidence degrees of the deductions of R1 and R2, respectively, then the certainty factor of the conclusion C is defined as FC(C) = f(DC1, DC2), where

f(x, y) =
  x + y − x·y,                    if x > 0 and y > 0
  x + y + x·y,                    if x < 0 and y < 0
  (x + y) / (1 − min(|x|, |y|)),  if x and y have opposite signs.   (8)

The MYCIN combination can be extended to more than two rules. For example, the certainty factor of the conclusion C for three rules R1, R2, and R3 is defined as

FC(C) = f(f(DC1, DC2), DC3).

This strategy combines the rules efficiently: if several rules lead to the same conclusion, its certainty is high, whereas if some rules lead to opposite conclusions, its certainty is low. A diagram summarizing the different rules used in the classification approach is shown in Fig. 6.
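The combination rule (8) and its extension to any number of rule conclusions can be sketched directly (a minimal sketch of the MYCIN-style combination; the fold over more than two rules follows FC(C) = f(f(DC1, DC2), DC3)):

```python
from functools import reduce

def combine(x, y):
    """MYCIN-style combination of two confidence degrees in [-1, 1],
    as in (8): agreement reinforces, disagreement attenuates."""
    if x > 0 and y > 0:
        return x + y - x * y
    if x < 0 and y < 0:
        return x + y + x * y
    # opposite signs (assumes min(|x|, |y|) < 1, so no division by zero)
    return (x + y) / (1 - min(abs(x), abs(y)))

def certainty_factor(degrees):
    """Fold the pairwise rule over any number of rule conclusions."""
    return reduce(combine, degrees)

# Two agreeing rules: combine(0.6, 0.5) -> 0.8
# Opposite signs:     combine(0.6, -0.5) -> 0.2
```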


Fig. 6. Diagram of the rules used in the classification approach.

Rules 1), 2), 3), 4), and 5), representing spectral, textural, geometric, and shadow information, are used for the vegetation class. Rules 1), 6), 7), and 8), representing spectral, geometric, and shadow information, are used for the built-up class. For the Bare Soil, Shadow, and Water classes, only Rule 1) (spectral information) is used to generate the certainty factor value.

The certainty factor values for each class are calculated in parallel, so the rule order has no influence on the final result. The result is a classification of the image segments, with each segment having final certainty factor values in each of the classes: Road, Building, Grass, Tree, Bare Soil, Water, and Shadow. To assign each image segment to a single class, a comparison of the certainty factors is performed using the maximum function so that each segment is assigned to the class with the highest certainty factor value.
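The final per-segment assignment described above reduces to an argmax over the per-class certainty factors (the dict layout here is an assumption):

```python
# Classes used in the final labeling step
CLASSES = ["Road", "Building", "Grass", "Tree", "Bare Soil", "Water", "Shadow"]

def assign_class(fc_values):
    """fc_values: class name -> final certainty factor for one segment;
    returns the class with the maximum certainty factor."""
    return max(CLASSES, key=lambda c: fc_values.get(c, float("-inf")))
```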

IV. RESULTS

A. Per-Pixel Classification Results

Tables III and IV show the confusion matrix for the per-pixel maximum-likelihood classification of the entire study areas of Sherbrooke and Rabat, respectively. The overall accuracy is computed by dividing the number of correctly classified reference pixels by the total number of reference pixels. The overall accuracies obtained for this first classification are 74% and 72% for Ikonos and QuickBird images, respectively.
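The overall-accuracy computation just defined is the confusion-matrix diagonal over all reference pixels:

```python
def overall_accuracy(confusion):
    """confusion[i][j]: number of reference-class-i pixels labeled as
    class j; returns correctly classified pixels over all pixels."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total
```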

The classification maps of subsets of Figs. 4 and 5 are shown in Figs. 7 and 8 for illustration purposes. Analysis of the classification maps shows that the most important source of error is confusion between roads, on the one hand, and buildings and other asphalt surfaces (mainly parking lots), on the other. For the Ikonos image, 27% of reference pixels belonging to the road class were classified as buildings, and 25% of reference pixels belonging to the building class were classified as roads. For the QuickBird image, 28% of reference pixels belonging to the road class were classified as buildings, and 24% of reference pixels belonging to the building class were classified as roads. The second most important source of error is confusion between trees and grass: in the two images, 14% of reference pixels belonging to the tree class were classified as grass, and 17% of reference pixels belonging to the grass class were classified as trees. The third source of error is confusion between bare soil and roads/buildings: between 19% and 23% of reference pixels belonging to the bare soil class were classified as roads or buildings in the two images.

The classes of roads, buildings, and bare soil, on the one hand, and the classes of grass and trees, on the other, are spectrally similar and present a major zone of spectral overlap. This is the main reason for the classification errors. Supervised classification methods that consider only spectral information, such as the maximum-likelihood method used here, cannot differentiate between these classes with significant accuracy. Thus, other information must be used to achieve better classification.

B. Segmentation Results

Overall, the resultant segmentation largely respects object perimeters and shapes (Figs. 9 and 10). Some objects are represented by only one segment that closely adheres to the object’s limits. Several roads were well segmented into a single segment, whereas others resulted in several segments. As for buildings, each building generally corresponds to a single segment. The zones of shadow and vegetation were segmented very well. Other objects comprise two or more segments. A positive point is the absence of small trivial segments, which indicates that using a geometric threshold in the segmentation algorithm is useful in this kind of application.

The total error of segmentation (ET) for the entire Ikonos image of Sherbrooke is about 9%. This means that 91% of the segments accurately represent the corresponding real objects. The mean value of the segmentation ratio Rseg is 3. For the QuickBird image of Rabat, the error of segmentation is about 10%, and Rseg is 3.5. These results indicate that the segmentation is of good quality; therefore, the segmentation results can be used in the proposed object-classification approach.
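On one plausible reading of these metrics (ET and Rseg are defined earlier in the paper, so the formulas below are assumptions for illustration), the figures above correspond to:

```python
def segmentation_error(n_inaccurate_segments, n_segments):
    """Assumed reading of ET: fraction of segments that do not
    accurately represent their corresponding real object."""
    return n_inaccurate_segments / n_segments

def segmentation_ratio(n_segments, n_objects):
    """Assumed reading of Rseg: mean number of segments per object."""
    return n_segments / n_objects
```

For instance, 90 inaccurate segments out of 1000 would give ET = 9%, and 3000 segments over 1000 reference objects would give Rseg = 3, matching the Sherbrooke figures.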

C. Rule-Based Classification Results

The proposed rule-based method was applied to the Ikonos image of Sherbrooke and the QuickBird image of Rabat. We used the same training sites as for the preceding per-pixel classification. The confusion matrix was calculated from the same reference sites used for the maximum-likelihood classification (see Tables V and VI). The overall accuracies obtained are 92% and 89% for the Ikonos and QuickBird images, respectively. The results are illustrated for the subset images (Figs. 11 and 12). The visual quality of the object rule-based classification compared to the per-pixel classification is evident: roads and buildings are more homogeneous and, overall, closer to reality. The classification results for vegetation are also better.


TABLE III

CONFUSION MATRIX FOR THE PER-PIXEL MAXIMUM-LIKELIHOOD CLASSIFICATION OF THE STUDY AREA OF SHERBROOKE

TABLE IV

CONFUSION MATRIX FOR THE PER-PIXEL MAXIMUM-LIKELIHOOD CLASSIFICATION OF THE STUDY AREA OF RABAT

Fig. 7. Maximum-likelihood classification results of the two subsets of Sherbrooke.

The same comments can be made for the other classes (water, shadow, and soil). Isolated trees have also been identified.

The rule-based classification method applied to the Sherbrooke image improved accuracy by approximately 18% over the maximum-likelihood accuracy. This improvement is particularly visible for the classes showing significant spectral confusion. Accordingly, the classification accuracy for buildings increased from 72% to 89%, while that for roads increased from 69% to 94%. The confusion error between these two classes decreased from 26% with the per-pixel classification to only 7% with the object rule-based classification. The classification accuracy of trees increased from 83% to 91%, and the


Fig. 8. Maximum-likelihood classification results of the two subsets of Rabat.

Fig. 9. Segmentation results of the two subsets of Sherbrooke.

Fig. 10. Segmentation results of the two subsets of Rabat.

classification accuracy of grass went from 82% to 98%. The confusion errors between these two classes decreased from 15% to 3%. For the Rabat image, the improvement in classification accuracy is about 15%. The classification accuracy increased from 69% to 86% for buildings and from 68% to 87% for roads. The confusion errors between these two classes decreased from 24% with the per-pixel classification to 11% with the object rule-based classification. The classification accuracy of trees increased from 84% to 94%, and that of grass went from 80% to 96%. The confusion errors between these two classes decreased from 17% to only 4%.

V. DISCUSSION

The results of this paper demonstrate some problems with conventional per-pixel classification of high-resolution satellite imagery for urban land-cover mapping. Indeed, the maximum-likelihood method results in large confusion errors between some classes; for example, the building and road classes presented a misclassification error of about 24%. We presented an object rule-based classification method to resolve this issue. This method uses a rule base that combines spectral, textural, geometric, and contextual information. The method requires, however, a new segmentation approach


TABLE V

CONFUSION MATRIX FOR THE OBJECT RULE-BASED CLASSIFICATION OF THE STUDY AREA OF SHERBROOKE

TABLE VI

CONFUSION MATRIX FOR THE OBJECT RULE-BASED CLASSIFICATION OF THE STUDY AREA OF RABAT

for multispectral images, which was proposed. This approach uses an existing map to automatically generate segmentation parameters.

Fig. 11. Object rule-based classification results for the two subsets of Sherbrooke.

In our study, we propose a region-based segmentation algorithm adapted to the urban environment, whose parameters are generated automatically from an existing map. The assessment of the new segmentation approach revealed a segmentation accuracy of 90%. The segmentation errors fall primarily in the transition zones between spectrally similar objects, such as building roofs and parking lots.


Fig. 12. Object rule-based classification results for the two subsets of Rabat.

TABLE VII

RULE IMPORTANCE

The segmentation results depend largely on the segmentation parameters. Usually, the user must interactively experiment with several parameters (spectral and geometric thresholds) until the best segmentation is achieved [13], [18]. In our method, the parameters are deduced automatically and adapted to each analyzed image: the spectral and geometric thresholds for each image are calculated using the existing map of the zone. As for classification, combining many rules proved beneficial: each type of rule improves classification by adding information about the studied classes.

We demonstrated that a per-pixel classification does not yield good results for a high-resolution image of the urban environment. To study the importance of the proposed rules, we first implemented the object rule-based classification using all the rules; these results were presented in the preceding section. We then implemented the rules separately: classification with the spectral and geometric rules, classification with the spectral and contextual rules, and finally classification with the geometric and contextual rules. This analysis makes it possible to determine the improvement yielded by each piece of information added during classification. The results are summarized in Table VII.

This analysis brings out the importance of each type of rule. Using all the rules yields classification accuracies of 92% and 89% for the Sherbrooke and Rabat images, respectively. When the contextual rules were excluded, the classification accuracy decreased by 9% for the Sherbrooke image and by 6% for the Rabat image. The exclusion of the geometric rules decreases the classification accuracy by 7% for the two images. When

TABLE VIII

COMPARISON OF THE CLASSIFICATIONS OF THE IKONOS IMAGE

implementing the geometric and contextual rules without the spectral rules, the classification accuracy is reduced by 18% for Sherbrooke and by 21% for Rabat. Using the three types of rules together improves the results.

The object classification improved the overall classification accuracy compared to the per-pixel classification. Confusion between classes also decreased. Thus, separation between roads and buildings was significantly improved.

In order to evaluate the statistical significance of the differences in accuracy between the per-pixel maximum-likelihood classification and the proposed object rule-based classification, the results were compared using the McNemar test, a nonparametric test for comparing proportions of correct classifications when the samples are not independent [27]. The test statistic may be expressed as

χ² = (f12 − f21)² / (f12 + f21).   (9)

fij indicates the number of pixels correctly classified by one method and misclassified by the other. χ² is a chi-squared statistic with 1 DOF; the calculated value, compared against tabulated chi-squared values, indicates its statistical significance [27]. Tables VIII and IX show the comparison between the two classifications of the two images using the McNemar test. The results indicate that the differences obtained between the two classification methods are statistically significant: the P value is very low, and the calculated test values are very high compared to the tabulated chi-squared values. This indicates that the improvement in accuracy yielded by the proposed object rule-based classification is significant
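The McNemar comparison of (9) uses only the discordant counts, so it can be sketched in a few lines (the significance helper and its default threshold are an illustration, not from the paper):

```python
def mcnemar_chi2(f12, f21):
    """Chi-squared statistic with 1 DOF for two paired classifiers:
    f12 = pixels correct under one method and wrong under the other,
    f21 = the reverse, as in (9)."""
    return (f12 - f21) ** 2 / (f12 + f21)

def is_significant(f12, f21, critical=3.841):
    """Compare against the tabulated chi-squared value at the 5%
    level (3.841 for 1 DOF)."""
    return mcnemar_chi2(f12, f21) > critical
```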


TABLE IX

COMPARISON OF THE CLASSIFICATIONS OF THE QUICKBIRD IMAGE

TABLE X

CLASSIFICATION ACCURACY OF NEW BUILDINGS AND ROADS

compared to that yielded by the per-pixel maximum-likelihood classification.

In the two study areas of Sherbrooke and Rabat, there are many zones where changes have occurred since the dates of map production, mainly the construction of new buildings and new roads. It is therefore interesting to evaluate the results of the proposed method in these zones to examine its potential to classify new parts of the images. This analysis was applied to 3906 pixels of new buildings and 4074 pixels of new roads in the Ikonos image, and to 4298 pixels of new buildings and 4066 pixels of new roads in the QuickBird image. Table X summarizes the results obtained, giving the classification accuracy of new buildings and new roads compared to that of all buildings and roads (old and new).

This analysis shows that the classification accuracy of new structures (buildings and roads) with the proposed algorithm is about 87% for the two images, better than the 69% obtained by the maximum-likelihood classification. Table X also indicates a small difference in classification accuracy between new and old structures (2% on average). In the Sherbrooke image, the classification accuracies of new buildings and new roads are, respectively, 3% and 2% lower than those of all buildings and all roads. For the Rabat image, the classification accuracy of new buildings is 4% lower than that of all buildings, but the classification accuracy of new roads is 1% higher than that of all roads. The algorithm thus gives similar results for old and newly constructed areas.

VI. CONCLUSION

In this paper, we have presented an object rule-based approach to classify urban land cover from a very high resolution multispectral image. First, a per-pixel classification was applied to classify individual image pixels. Next, a multispectral segmentation technique using an existing digital map was proposed to facilitate object analysis. Several information sources were used by the object rule-based method: spectral statistics, geometric information, and segment neighborhood analysis.

In our method, many known parameters are gathered into a single procedure. They are used in an object rule-based procedure and combined with the per-pixel classification and with other parameters such as shadow. In the classification approach, nonspectral information is applied only to the classes where it can bring improvements, not to all the classes.

With the proposed classifier, the results were considerably improved in comparison to the per-pixel maximum-likelihood classification. Buildings were identified with 89% accuracy for the Ikonos image and 86% for the QuickBird image, and roads with 94% accuracy for the Ikonos image and 87% for the QuickBird image. Trees were identified with 91% accuracy for the Ikonos image and 94% for the QuickBird image, and grass with 98% accuracy for the Ikonos image and 96% for the QuickBird image. The algorithm performs well for new and old constructions in both cities.

Further work is needed to improve the proposed method. Concerning segmentation, a thematic segmentation method is under assessment; it consists of choosing the seed points according to the probable pixel class. In the current method, the user selects the training zones; the proposed method could become independent of this step through automatic training from the existing digital maps. Further work is also necessary to take advantage of the available data in order to automate the whole classification process.

REFERENCES

[1] D. A. Holland, D. S. Boyd, and P. Marshall, “Updating topographic mapping in Great Britain using imagery from high-resolution satellite sensors,” ISPRS J. Photogramm. Remote Sens., vol. 60, no. 3, pp. 212–223, May 2006.

[2] D. Tuia, F. Pacifici, M. Kanevski, and W. J. Emery, “Classification of very high spatial resolution imagery using mathematical morphology and support vector machines,”IEEE Trans. Geosci. Remote Sens., vol. 47, no. 11, pp. 3866–3879, Nov. 2009.

[3] M. Ettarid and F. Degaichia, “Potentiel cartographique de l’imagerie Ikonos Geo,”Int. Archives Photogramm. Remote Sens., vol. 35, pt. B4, pp. 1161–1166, 2004.

[4] Q. Zhang and I. Couloigner, “A framework for road change detection and map updating,”Int. Archives Photogramm. Remote Sens., Spatial Inf. Sci., vol. 35, pt. B2, pp. 729–734, 2004.

[5] A. Touzani and R. Aguejdad, “Investigation du Potentiel Cartographique de L’imagerie Satellitale à Très Haute Résolution Ikonos,” M.S. thesis, Institut Agronomique et Vétérinaire Hassan II, Rabat, Morocco, 2001.

[6] D. Holland and P. Marshall, “Using high-resolution imagery in a well-mapped country,” presented at the ISPRS/EARSEL Int. Workshop ‘High Resolution Mapping from Space’, Hannover, Germany, 2003.

[7] E. P. Baltsavias, “Object extraction and revision by image analysis using existing geodata and knowledge: Current status and steps towards operational systems,” ISPRS J. Photogramm. Remote Sens., vol. 58, no. 3/4, pp. 129–151, Jan. 2004.

[8] M. Kumar and O. Castro, “Practical aspects of IKONOS imagery for mapping,” inProc. 22nd Asian Conf. Remote Sens., Singapore, Nov. 2001, pp. 1181–1185.

[9] J. Inglada and J. Michel, “Qualitative spatial reasoning for high-resolution remote sensing image analysis,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 2, pp. 599–612, Feb. 2009.

[10] M. Kumar and O. Castro, “Practical aspects of ikonos imagery for map-ping,” inProc. 22nd Asian Conf. Remote Sens., Singapore, Nov. 2001, pp. 1181–1185.

(14)

[11] Y. J. Zhang, “Texture-integrated classification of urban treed areas in high-resolution color-infrared imagery,”Photogramm. Eng. Remote Sens., vol. 67, no. 12, pp. 1359–1365, Dec. 2001.

[12] N. Thomas, C. Hendrix, and R. G. Congalton, “A comparison of urban mapping methods using high-resolution digital imagery,”Photogramm. Eng. Remote Sens., vol. 69, no. 9, pp. 963–972, Sep. 2003.

[13] A. P. Carleer, O. Debeir, and E. Wolff, “Assessment of very high spa-tial resolution satellite image segmentations,”Photogramm. Eng. Remote Sens., vol. 71, no. 11, pp. 1285–1294, Nov. 2005.

[14] M. Herold, M. E. Gardner, and D. A. Roberts, “Spectral resolution requirement for mapping urban areas,” IEEE Trans. Geosci. Remote Sens., vol. 41, no. 9, pp. 1907–1919, Sep. 2003.

[15] A. P. Carleer and E. Wolff, “Urban land cover multi-level region-based classification of VHR data by selecting relevant features,”Int. J. Remote Sens., vol. 27, no. 6, pp. 1035–1051, Mar. 2006.

[16] B. Guindon, “A framework for the development and assessment of object recognition modules from high-resolution satellite image,” Can. J. Remote Sens., vol. 26, no. 4, pp. 334–348, 2000.

[17] T. Matsuyama, “Knowledge-based aerial image understanding systems and expert systems for image processing,”IEEE Trans. Geosci. Remote Sens., vol. GE-25, no. 3, pp. 305–316, May 1987.

[18] A. K. Shackelford and C. H. Davis, “A combined fuzzy pixel-based and object-based approach for classification of high-resolution multispectral data over urban areas,”IEEE Trans. Geosci. Remote Sens., vol. 41, no. 10, pp. 2354–2363, Oct. 2003.

[19] J. R. Jensen,Introductory Digital Image Processing—A Remote Sensing Perspective, 3rd ed. Upper Saddle River, NJ: Pearson Prentice-Hall, 2004.

[20] Z. Wang, D. Ziou, C. Armenakis, D. Li, and Q. Li, “A comparative analysis of image fusion methods,”IEEE Trans. Geosci. Remote Sens., vol. 43, no. 6, pp. 1391–1402, Jun. 2005.

[21] D.-C. He, L. Wang, and M. Amani, “A new technique for multi-resolution image fusion,” inProc. Geosci. Remote Sens. Symp., Anchorage, AK, 2004, pp. 4901–4904.

[22] Y. Voirin, “Élaboration d’un Système à Base de Règles Pour l’identification des Zones Perturbées en Milieu Forestier,” Ph.D. dissertation, Université de Sherbrooke, Sherbrooke, QC, Canada, 2004.

[23] M. Neubert and G. Meinel, “Evaluation of segmentation programs for high resolution remote sensing applications,” in Proc. Joint ISPRS/ EARSel Workshop ‘High Resolution Mapping from Space’, Hanover, Germany, Oct. 6–8, 2003.

[24] D. Lu and Q. Weng, “A survey of image classification methods and techniques for improving classification performance,” Int. J. Remote Sens., vol. 28, no. 5, pp. 823–870, Jan. 2007.

[25] O. Debeir, “Segmentation Supervisée d’Images,” Ph.D. dissertation, Faculté des Sciences Appliquées, Université Libre de Bruxelles, Brussels, Belgium, 2001.

[26] E. H. Shortliffe, Computer Based Medical Consultations: MYCIN. New York: American Elsevier, 1976.

[27] G. M. Foody, “Thematic map comparison: Evaluating the statistical significance of differences in classification accuracy,” Photogramm. Eng. Remote Sens., vol. 70, no. 5, pp. 627–633, May 2004.

Mourad Bouziani received the Engineering degree in surveying engineering from the Institut Agronomique et Vétérinaire Hassan II, Rabat, Morocco, in 1996, the M.Sc. degree in geomatic sciences from the Université Laval, Québec City, QC, Canada, in 2004, and the Ph.D. degree in remote sensing from the Université de Sherbrooke, Sherbrooke, QC, Canada, in 2007.

He is currently a Professor of geomatics (Filière des Sciences Géomatiques et Ingénierie Topographique) and a Research Scientist with the Unité de recherche en développement de concepts, d’outils et de modèles en géomatique, Institut Agronomique et Vétérinaire Hassan II. His research activities include applications of remote sensing, geographic information system modeling, construction and cadastral applications of surveying, and GPS.

Kalifa Goita received the Engineering degree in surveying engineering from the École Nationale d’Ingénieurs, Bamako, Mali, in 1987 and the M.Sc. and Ph.D. degrees in remote sensing from the Université de Sherbrooke, Sherbrooke, QC, Canada, in 1991 and 1995, respectively.

From 1995 to 1997, he was a Postdoctoral Fellow with the Climate Research Branch of Environment Canada, Toronto, ON. From 1997 to 2002, he was a Professor of remote sensing and geographic information system (GIS) with the Faculty of Forestry, Université de Moncton, Moncton, NB, Canada. Since June 2002, he has been a Professor of geomatics with the Département de Géomatique Appliquée, Université de Sherbrooke, where he has also been a Research Scientist with the Centre d’Applications et de Recherches en Télédétection. His expertise includes both physics and applications of remote sensing (extraction of biophysical parameters from the visible to the microwave spectrum) and GIS applications to environmental modeling.

Dr. Goita received the Canadian Remote Sensing Society award for the best Ph.D. thesis in remote sensing in 1995.

Dong-Chen He received the B.S. degree from the Petroleum University of China, Dongying, China, in 1982 and the D.E.A. (Diplôme d’Études Approfondies) and Ph.D. degrees in automatics and signal processing from the Université de Nice, Nice, France, in 1984 and 1988, respectively.

Since 1989, he has been with the Centre d’Applications et de Recherches en Télédétection, University of Sherbrooke, Sherbrooke, QC, Canada, where he is currently a Professor. His research activities center around image processing, pattern recognition, texture analysis, and remote sensing.
