Individual Building Rooftop Segmentation from High resolution Urban Single Multispectral Image Using Superpixels

(1)

2019 International Conference on Information Technology, Electrical and Electronic Engineering (ITEEE 2019) ISBN: 978-1-60595-606-0

Individual Building Rooftop Segmentation from High-resolution Urban

Single Multispectral Image Using Superpixels

Cheng ZHANG, Ji-chao JIAO

*

, Zhong-liang DENG and Yan-song CUI

Beijing University of Posts and Telecommunications, School of Electronic Engineering, Beijing, China 100081

*Corresponding author

Keywords: High-resolution urban aerial optical image, Refined superpixel, Rooftop segmentation, Salient feature.

Abstract. This paper proposes an algorithm to segment individual building rooftops from optical photographs based on superpixels. A novel method based on image complexity is proposed to calculate the number of superpixels and refine their boundaries. In order to differentiate buildings from ground and roads, the salient feature vectors are built by using features extracted from refined superpixels. In particular, those feature vectors are used to classify refined superpixels as object or non-object. Our segmentation algorithm is compared with two other state-of-the-art segmentation algorithms in terms of the recall and precision. Experiments show that the proposed method can segment building rooftops appropriately, yielding higher precision and better recall rates.

Introduction

Building rooftop segmentation from aerial images is used in different applications and analysis, for example, mapping, video games, 3D modeling, environmental management and monitoring, and disaster management. Many techniques segment building rooftops by depending on accruate height information that is obtained by stereo photogrammetry or LiDAR [1,2] Sun and his colleague [3] presented a method to create 3D watertight building models from airborne LiDAR point clouds. This proposed method analyzed the scene content and produces multi-layer rooftops with complex boundaries. Cote and Saeedi proposed a method for extracting 2-D rooftop footprints from nadir aerial imagery through a fully automatic approach, which combined the strength of energy-based approaches with distinctiveness of corners [4]. Then, Cheng and his co-workers gave an algorithm for automatic derivation of building region information from LiDAR, in which the reversed iteration makes that the thresholds can be determined in a simple way [5]. Yuan and his fellows utilized both spectral and texture information to segment building rooftops from remote sensing images, which was sensitive to the noise because of using the least squares solution [7]. Wang et al. conducted building extraction based on distinctive image primitives such as lines and line intersections, which extracted building rapidly from very high resolution (VHR) optical satellite imagery [8]. Kluckner and Bischof [9] proposed an automatic building extraction method based on superpixels, which is similar with our method.

Above all, we are given a high resolution aerial or satellite photograph from a nadir viewpoint that includes visible color (RGB or HSV) that is an important feature for building rooftop segmentation. Moreover, image corners are a kind of salient features. Therefore, a novel function which is used to calculate the threshold is proposed in this paper. Among various approaches for extracting texture features, Gabor filter has emerged as one of the most popular ones [11].

Proposed Algorithm

(2)

extracted and the training data is generated. And then, a classifier is generated by training these features. The building rooftop are marked from the image by this classifier, and the shadow can be cast after finding the building. Finally, buildings are segmented with a high accuracy.

[image:2.595.70.532.119.241.2]

Figure 1. The original image with its zoom-in area. Figure 2. The flow chart of our algorithm.

Pre-segmentation Based on Simple Refined Superpixel

In this paper, a novel method is proposed to pre-segment the large-scale image with a high accuracy. This method is based on SLIC superpixels proposed by Achanta etc. al. [12], whose number of superpixels was given by users, and this method cannot be used to process image sequences. However, different with the SLIC superpixel, our method used two formulas to overcome the under-segmentation caused by SLIC and achieved to process the high-resolution image sequences.

In this paper, we use the edge ratio κ to measure the level of the object number not the value of the number, which can indicate the image complexity. The following formula is used to calculate:

κ=Γedge

Μ (1)

where M is the size of the image, Γedge is the pixels number of all objects edges. This function was

original developed for automatic target recognition. However, according to lots of experiments, we suggest another novel use of this metric to choose the number of superpixels by using the following formula:

Nsuperpixel=κ×M_ϕ (2)

where φ is a weighted value. According to Eq.2, we can find that the number of superpixels is dependent on the image complexity and the image size. The bigger the κ is, the more useful information is included in the aerial image. Moreover, the bigger M is, the more objects are contained in the aerial image. Therefore, the Nsuperpixel must be bigger, which means that more superixels whose

boundaries lie on edges between objects in the image are needed to segment the images.

In this paper, we propose a function to determine which superpixels need to be refined. The function is shown as follows,

{_σσi> thr Detection

i≤ thr No Detection (3)

where σi is the standard deviation of the ith superpixel, thr is the threshold that is decimal. In this

paper, we suggest that thr is the threshold whose interval is [4,10].

Salient Features Extraction from Refined Superpixels

(3)

Sl={

d, Il≤Icp-TFast_corner(darker)

s, Icp-TFast_corner≤Il≤Icp+TFast_corner(similar)

b, I_l≤I_cp+T_{Fast_corner}(brighter)

(4)

where Il is the gray values of candidate corners in a superpixel, Icp is the gray values of neighbors

around Il, d, s and b are just used to mark the gray value of each pixel around the candidate FAST

corner. and TFast−corner is the FAST corner number which is calculated by the following equation.

TFast_corner=

f2(xi,yi) 1

n∑ni=1f(xi,yi)

+σ (5)

where σ is the variance of an aerial image, n is the number of the pixels in the image f(x,y), and f(xi,yi)

is the value of the ith pixel. According to the Eq.4, we can find that the more complexcity of the image is, the bigger σ is, then the more FAST corners are extracted from the superpixels.

Because more salient corners are in an image whose variance is bigger.

In our method, we use the Gabor filter to extract structure features of buildings. In the spatial domain, a 2D Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave with a complex sinusoid.The function is shown as follows,

g(x,y)= 1

2πσxσyexp[ -1 2(

x2 σ_x2+

y2

σ_y2)+2πjΩx] (6)

where Ω is the frequency of the Gabor function, and σx and σy determine its bandwidth.

Through concatenating the above features, a salient feature vector S FV{CHSV,S l,Gvs} is generated

for each superpixel. It is noticed that CHSV is indicated the color feature, and S l indicates the FAST

corners, and the Gvs indicates the Gabor feature. The S FV is a feature vector that represents the

properties of the color, FAST corners and Gabor textures. This salient feature vector is calculated by the following function:

S FV(x,y) = ‖Vμ-Vwhc(x,y)‖ (7)

Results

Our method was implemented by using Matlab 2015a, and this method was coded by integrating C# and matlab. The proposed algorithm was tested on many large high-resolution urban images taken from Phoenix, Arizona, USA.

Study Area

[image:3.595.194.401.634.738.2]

The study area is located at Phoenix, Arizona, United States and its location is 33◦270N,112◦40W. The area of the study region is about 291.7264km2. One subarea of Phoenix is shown by Fig. 3 whose resolution is 2048 × 1642pixels. Moreover, in order to train the classifier, 50 images that are labeled {Building rooftops, others} are used and the examples of the original labeled images are shown by Fig. 4. Furthermore, 500 images are used to evaluate the proposed algorithm.

(4)

[image:4.595.91.507.72.224.2]

(a) (b) (c)

Figure 4. Three examples of the original labeled images for classifier training.

Pre-segmentation Result

[image:4.595.115.487.353.499.2]

Fig. 3 is an original image that includes trees, grass, buildings and other objects. The zoom-in area of Fig. 4 shows that the rooftop color is similar to the ground, the textures of building rooftop are coarse and salient, while the textures of ground are smooth. Fig. 5 shows the pre-segmentation result. By comparing the Fig. 5(a),(b) and (c), we can find that the surperpixel in Fig. 5(a) were refined by our method. It is noted that the number of superpixels is 8200 before the aerial images are pre-segmented by using SLIC.

(a) (b) (c)

Figure 5.Pre-segmentation result and its zoom-in regions.

Fig. 5(a) shows the result which is obtained by Eq.(3). The boundaries are marked by red, and the boundaries that are got by Eq.(3) are marked by green. This function is used to constraint the boundaries of superpixels when the superpixel’s boundary cannot match the edge of an object. In this method, the first step is to find the superpixels whose boundaries is not matched with the edges of the objects. Then, the snake is introduced to find the exact boundary which match the object’s edges very well. By subjectively comparing these three results, we found that the classification result got by our method is the best.

Building Rooftop Classification and Segmentation Result

Besides, the proposed pre-segmentation approach was also tested on the image sequence. According to the Fig. 6, we can find that the white building rooftop is precisely covered by several RS superpixel, and the edges of rooftops are consistent with edges of RS superpixels. Moreover, the rooftops could be pre-segmented by our proposed approach even though the image is scaled, which illustrates that the proposed approach is robust. Fig. 7 shows three classification results. From Fig. 7, we can find that yellow rectangles are used to mark the same regions in the result images.

(5)

5 (c) and (d). Moreover, the features extracted from images cannot train classifiers to distinguish between ground and rooftops, which are shown by Fig. 8 (c) and (d).

[image:5.595.117.481.97.305.2] [image:5.595.201.392.341.553.2]

(a)1st Frame (b)15th Frame (c)30th Frame (d)50th Frame (e)60th Frame Figure 6. Pre-segmentation result based on our method.

Figure 7. Classification on aerial images from different regions.

(a) Ground truth of the building rooftop segmentation (b) Building rooftop segmentation based on our method

(c) Building rooftop segmentation based on ERS (d) Ground truth of the building rooftop segmentation

[image:5.595.61.538.571.763.2]

(6)

Conclusion and Future Work

In this paper, we proposed a building rooftop segmentation algorithm by using improved superpixels from large high-resolution urban images. A function was proposed to automatically calculate the number of superpixels. We evaluated our method by using many aerial images and compared our method with two other state-of-the-art methods. Experiments showed that our method is fast and robust, while still being simple and efficient. The results of our method can be implemented for generating 3D urban models.

Acknowledgement

This work was supported by the National Key Research and Development Program under Grant 2017YFB0503701.

References

[1]G. Sohn, J. Jung, Y. Jwa, and C. Armenakis, “Sequential modeling of building rooftops by integrating airborne lidar data and optical imagery: Preliminary results,” Proceedings of VCM (2013).

[2]Q.-Y. Zhou and U. Neumann, “2.5 d dual contouring: A robust approach to creating building models from aerial lidar point clouds,” in Computer Vision–ECCV 2010, 115–128, Springer (2010).

[3]S. Sun and C. Salvaggio, “Aerial 3d building detection and modeling from airborne lidar point clouds,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 6(3), 1440–1449 (2013)

[4]M. Cote and P. Saeedi, “Automatic rooftop extraction in nadir aerial imagery of suburban regions using corners and variational level set evolution,” IEEE transactions on geoscience and remote sensing 51(1), 313–328 (2013)

[5]L. Cheng, W. Zhao, P. Han, W. Zhang, J. Shan, Y. Liu, and M. Li, “Building region derivation from lidar data using a reversed iterative mathematic morphological algorithm,” Optics Communications 286, 244–250 (2013).

[6]L. Cheng, L. Tong, Y. Chen, W. Zhang, J. Shan, Y. Liu, and M. Li, “Integration of lidar data and optical multi-view images for 3d reconstruction of building roofs,” Optics and Lasers in Engineering 51(4), 493–502 (2013).

[7]J. Yuan, D. Wang, and R. Li, “Remote sensing image segmentation by combining spectral and texture features,” Geoscience and Remote Sensing, IEEE Transactions on 52(1), 16–24(2014).

[8]J. Wang, X. Yang, X. Qin, X. Ye, and Q. Qin, “An efficient approach for automatic rectangular building extraction from very high resolution optical satellite imagery,” IEEE Geoscience and Remote Sensing Letters 12(3), 487–491 (2015).

[9]S. Kluckner and H. Bischof, “Image-based building classification and 3d modeling with super-pixels,” in ISPRS Technical Commission III Symposium on Photogrammetry Computer Vision and Image Analysis, 233–238 (2010).

[10]E. Rosten and T. Drummond, “Machine learning for high-speed corner detection,” in Computer Vision–ECCV 2006, 430–443, Springer (2006).

[11]W. Li and K. Mao, “Selection of gabor filters for improved texture feature extraction,” in Image Processing, 2010 17th IEEE International Conference on, 361–364 (2010).