Multiple Object Extraction of Remote Sensing Images Based on Convolutional Neural Networks and Support Vector Machines

2018 International Conference on Communication, Network and Artificial Intelligence (CNAI 2018)
ISBN: 978-1-60595-065-5

Tao-xing FENG, Wei-yu WANG, Xiao-meng QIAN, Zhan-hong ZHANG, Hui LIU, Ying XING and Bin YANG

Beijing University of Posts and Telecommunications, Beijing 100876, China

Keywords: Machine learning, Deep learning, Convolutional neural network (CNN), Support vector machine (SVM), Remote sensing images, Multiple object extraction.

Abstract. The data provided by remote sensing technology is characterized by wide coverage, high real-time performance and a supply of rich and objective information. Therefore, extracting geo-objects such as roads from remote sensing images plays an important role in many urban applications and is used in a wide range of areas. However, in the past, such tasks were generally performed manually, which is very costly and time-consuming, and many existing automated attempts are designed for specific features and classifiers and are therefore limited. This paper proposes a CNN-SVM-based road extraction system (a convolutional neural network with three support vector machines), which takes raw pixel values in aerial imagery as inputs and predicts three-channel label images (background-buildings-roads) as outputs. By training a single CNN efficiently, feature extractors and classifiers are constructed automatically and multiple kinds of objects are extracted simultaneously. Using SVM, the classification results can be optimized. At the same time, we improve our experimental performance with the Dropout method and by predicting with spatial displacement. Finally, comparing our model with previous methods, we find that the precision of CNN-SVM increases by 8.37%.

Introduction

Extraction of multiple urban objects such as roads and buildings from aerial imagery has many applications in military and civilian fields, such as aerospace, military reconnaissance, disaster monitoring and forecasting, land planning and utilization, etc. However, due to the various shapes of those objects and some interference factors, accurate pixel labeling of remote sensing images is still a task of high research value. Thus, effective extraction of buildings and roads is highly demanded, and there have been plenty of attempts utilizing different methods.

Object extraction methods can be classified into two categories, semi-automatic and fully automatic [1]. The former requires human interaction while the latter does not [2]. Information extraction from satellite images by semi-automatic methods is usually expensive and time-consuming. To overcome these limitations, automatic methods have been explored and applied.

Based on various features and classifiers, numerous automatic methods have been proposed. In Singh P.P. et al. (2014), assisted by a weighted membership function (W-mf), the fuzzy C-means (FCM) technique successfully enhanced the classification results and detected the objects [3]. Using two object-based filters and a support vector machine (SVM), Zelang Miao et al. (2015) computed object features to select road candidates and then extracted the road class [4]. Wang J.H. et al. (2016) utilized a knowledge-based method to extract the spatial texture feature for image segmentation and worked out more suitable features to construct the model [6]. In 2017, Kumar K.M. et al. proposed a new approach using a particle filter and novel hybrid multi-kernel partial least squares (PLS) [5]. Among those automatic methods, SVM is [...]

In actual applications, a single object provides less information and reference value than multiple objects. Therefore, we put the emphasis on the simultaneous extraction of multiple objects. In recent years, approaches exploiting the convolutional neural network (CNN) have been proposed [9][10][11] and applied to multi-object extraction problems with good results. Such methods divide images into patches and obtain good feature extractors from raw pixel values without complex pre-processing. In 2016, Shunta Saito et al. generated three-channel maps from raw remote sensing imagery input. They detected road and building pixels from remote sensing imagery by considering multiple labels simultaneously, and used a new output function called channel-wise inhibited Softmax to train the CNN [8]. On the basis of the previous work, Rasha Alshehhi et al. (2017) combined convolutional neural network (CNN) features with several low-level features of roads and buildings in the post-processing section so as to smooth the irregular and disjoint regions [12].

In this article, we combine the high speed and multiple-object extraction advantages of the convolutional neural network with the good classification results of SVM to construct the model. Following the previous work, we use support vector machines (SVM) trained on the CNN features, and we adopt the Dropout method and prediction with spatial displacement to improve performance. Our experiments were conducted on the Massachusetts road aerial imagery datasets, and the results showed that our proposed methods outperformed the previous achievements. Our methods avoid complicated feature extraction and the design of multiple classifiers that are trained independently for each object to be extracted. Pixels in our chosen images were more accurately classified into background, buildings, and roads. In other words, in our experiments, three-channel label images could be constructed more exactly.

The rest of this article is organized as follows. The second and third sections introduce the public dataset we utilized and the environment configuration, including the hardware part and the software part. The fourth section presents the basic knowledge of convolution neural networks and support vector machines. We then present the methods used in this article in detail in the fifth section. The sixth section describes the experiment, including training, prediction, and evaluation. The seventh section discusses the SVM classification strategy and the superiority of our model compared with the other three models. In the final section, we present a conclusion on our important findings.

Data Set

We obtained the publicly available data provided by Mnih on the website: http://www.cs.toronto.edu/~vmnih/data/. Merging data from the Massachusetts Buildings Dataset and the Massachusetts Roads Dataset, we created the Massachusetts Buildings and Roads dataset. We took building, road, and background as the three channels. Background labels were created by calculating the XOR of the building and road label images. All images in this dataset are 1500×1500 pixels in size, and the resolution is 1 m²/pixel.

Convolution Neural Network and Support Vector Machine

The convolution neural network and the support vector machine are the most essential parts of our extractor. Therefore, we give the details of these two parts and of the basic architecture used in this article as follows.

Basic Theory of CNN

We present the CNN in three basic parts: convolution layers, pooling layers, and fully connected layers.

Convolutional Layer. The convolution operation uses a convolution kernel to convolve with the corresponding region of the image to obtain a value, and then continuously moves the convolution kernel to complete the convolution of the entire image. After the convolution operation, each filter yields an image of the corresponding extracted features. We then input these features into an activation function to get the final output. In this article, we use the ReLU function, which has the expression f(x) = max(0, x). The advantages of ReLU are fast convergence and a simple gradient.

Pooling. A conventional CNN is a sequence of convolution operations, and pooling is an important step that reduces parameters and makes feature extraction more concise. The most common pooling methods are max-pooling and mean-pooling. We adopt the former in this article. The extracted features are treated as a matrix, and this matrix is divided into several non-overlapping regions. We calculate the maximum of the features in each region, and those values are used in the subsequent training. Taking the maximum value means that we retain only the strongest of these features and discard the other, weaker features of this type.
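The ReLU activation and non-overlapping max-pooling described above can be illustrated with a small NumPy sketch. The function names, the toy 4×4 feature map, and the choice to drop rows and columns that do not fill a whole region are assumptions made for illustration; they are not taken from the authors' implementation.

```python
import numpy as np


def relu(x):
    """ReLU activation: f(x) = max(0, x), applied element-wise."""
    return np.maximum(0, x)


def max_pool(feature_map, size):
    """Max-pool a 2-D feature map over non-overlapping size x size regions.

    Rows and columns that do not fill a whole region are dropped
    (an assumption of this sketch).
    """
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))


# Toy feature map standing in for the output of one convolution filter.
features = np.array([[-1.0, 2.0, 0.5, -3.0],
                     [4.0, -2.0, 1.0, 0.0],
                     [0.0, 1.0, -5.0, -6.0],
                     [3.0, 2.0, -1.0, -2.0]])
pooled = max_pool(relu(features), size=2)  # one maximum per 2x2 region
```

Each entry of `pooled` is the strongest activation of its 2×2 region, which is the "retain only the strongest feature" behaviour the text refers to.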

Fully Connected Layer. Fully connected layers connect all of their nodes to the nodes in the previous layer and map the learned features to the sample label space.

Basic Theory of SVM

The purpose of SVM is to find a hyperplane that divides the samples into two categories with the largest margin. We use $\omega$ to represent the coefficients of the hyperplane that we aim to find. It can be represented by Eq. 1.

$$\max_{\omega, b} \frac{1}{\|\omega\|}, \quad \text{s.t.} \quad y_i(\omega^T x_i + b) \ge 1, \quad i = 1, \dots, n. \tag{1}$$
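For reference, the maximization in Eq. 1 is usually solved in an equivalent minimization form. The block below is the standard textbook reformulation, not something stated in the source; it uses the same symbols $\omega$, $b$, $x_i$, $y_i$ as Eq. 1.

```latex
% The supporting hyperplanes \omega^T x + b = +1 and \omega^T x + b = -1
% are separated by a distance of 2 / \|\omega\|, so maximizing 1 / \|\omega\|
% maximizes the margin. Because 1 / \|\omega\| decreases as \|\omega\| grows,
% Eq. (1) is equivalent to the convex quadratic program
\min_{\omega, b} \; \frac{1}{2}\|\omega\|^{2},
\quad \text{s.t.} \quad y_i(\omega^{T} x_i + b) \ge 1,
\quad i = 1, \dots, n.
```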

The schematic diagram of SVM is presented in Figure 1.

Figure 1. Schematic diagram of SVM.

Since SVM itself is a binary classifier, we classify one type of sample into one class and the remaining samples into another class. Therefore, when there are k types of samples, we need to [...]

Basic Architecture

Figure 2. Basic architecture of this article.

Figure 2 shows the base architecture we use in this article. Our architecture follows the typical structure of a CNN, in which convolutional layers and spatial pooling layers are stacked and followed by fully connected layers. We then input the features extracted by the CNN into three SVM classifiers.

There are five layers containing trainable parameters. We take a 64*64-sized three-channel RGB aerial imagery patch as the input and a 768-dimensional vector as the output. We then reshape the output into a 16*16-sized three-channel patch made up of buildings-roads-background channels. We assume that C(a, b*b/c) is a convolutional layer with a filters of size b*b and convolution stride c, P(a/b) is an a*a max pooling layer with stride b, and FC(a) is a fully connected layer with a units. In this way, the architecture can be written as C(64, 16*16/4)-P(2/1)-C(112, 4*4/1)-C(80, 3*3/1)-FC(4096)-FC(768).
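To make the notation concrete, the following sketch traces the spatial size of the feature maps through C(64, 16*16/4)-P(2/1)-C(112, 4*4/1)-C(80, 3*3/1) for a 64*64 input. It assumes no zero padding in any layer, which the text does not state; the helper `conv_out` and the layer names are hypothetical.

```python
def conv_out(size, kernel, stride):
    """Output width of a convolution or pooling window without padding."""
    return (size - kernel) // stride + 1


# (layer name, kernel size, stride) read off the architecture string.
layers = [("C1", 16, 4), ("P1", 2, 1), ("C2", 4, 1), ("C3", 3, 1)]

size = 64  # width and height of the input patch
sizes = {}
for name, kernel, stride in layers:
    size = conv_out(size, kernel, stride)
    sizes[name] = size

flattened = 80 * size * size   # 80 filters in the last convolutional layer
output_units = 16 * 16 * 3     # FC(768) reshaped to a 16*16 three-channel patch
```

Under the no-padding assumption the feature maps shrink from 64 to 13, 12, 9 and 7 pixels per side, so the first fully connected layer receives 80 × 7 × 7 values, and the 768 output units match a 16 × 16 × 3 label patch exactly.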

Methodology

By training our CNN, we can directly learn a mapping from the raw pixels in an input aerial image $S$ to a true label image $\tilde{M}$, and we aim to predict a multi-channel label image $\hat{M}$ from $S$. In this article, a label image consists of three channels: buildings, roads, and background. We map these three channels to RGB channels (R-roads, G-buildings, B-background) so that each pixel of the label image is a 3-dimensional vector. Since each pixel in a label image should always be either background, buildings or roads, the sum over all elements of a pixel vector should always be 1. Figure 3 is an example.

Figure 3. Example of an input image and its true label image.

Patch-based Formulation

Our pixel labeling method is similar to the method proposed by Mnih et al. [1]. We use a $w_s \times w_s$-sized aerial imagery patch $s$ to obtain a $w_m \times w_m$-sized true label patch $\tilde{m}$ by training the CNN, and we use $\hat{m}$ to denote the predicted patch. We describe the pixel located at $i$ in $\tilde{m}$ as a 3-dimensional one-hot vector, $\tilde{m}_i = [\tilde{m}_{i1}, \tilde{m}_{i2}, \tilde{m}_{i3}]$. In a predicted label patch $\hat{m}$, each pixel at $i$ is also a 3-dimensional vector. Given $s$, all pixels $\tilde{m}_i$ ($i = 1, \dots, w_m^2$) in a true label patch are conditionally independent of each other. Therefore, the posterior of a true label patch can be expressed as Eq. 2.

$$p(\tilde{m} \mid s) = \prod_{i=1}^{w_m^2} p(\tilde{m}_i \mid s). \tag{2}$$

The loss function can be described as Eq. 3.

$$L = -\sum_{i=1}^{w_m^2} \ln p(\tilde{m}_i \mid s). \tag{3}$$
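A minimal sketch of the loss in Eq. 3 follows. It assumes that $p(\tilde{m}_i \mid s)$ is read off as the predicted probability of the true class of pixel $i$, which is the usual cross-entropy interpretation and is not spelled out in the text. The function name `patch_nll` and the small constant `eps` are illustrative.

```python
import numpy as np


def patch_nll(true_onehot, pred_prob, eps=1e-12):
    """Negative log-likelihood of a label patch, summed over its pixels.

    true_onehot: (w_m, w_m, 3) one-hot true label patch.
    pred_prob:   (w_m, w_m, 3) predicted class probabilities per pixel.
    eps guards against log(0) and is an assumption of this sketch.
    """
    # Probability that the prediction assigns to the true class of each pixel.
    p_true = np.sum(true_onehot * pred_prob, axis=-1)
    return -np.sum(np.log(p_true + eps))


# Toy 2x2 patch: every pixel is background (first channel) and the
# prediction is uniform over the three classes.
true_patch = np.zeros((2, 2, 3))
true_patch[..., 0] = 1.0
uniform_pred = np.full((2, 2, 3), 1.0 / 3.0)
loss = patch_nll(true_patch, uniform_pred)
```

Because the pixels are treated as conditionally independent (Eq. 2), the patch loss is just the sum of per-pixel terms, here four times ln 3.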

Figure 4 shows the input and output patches. A small-region patch on its own is so abstract that we cannot recognize what it shows. Therefore, we consider a wider region and use context information to help predict the labels. In this way, we can see that the patch shows part of a building. To take context into account, the size of an input patch $w_s$ is set larger than the size of a predicted label patch $w_m$. This technique is also used by Mnih et al. to obtain better performance [14].

Figure 4. Input and output patches.

Channel-wise Inhibited Softmax [8]

In this article, $w_s = 64$ and $w_m = 16$. We reshape a 768-dimensional vector into a 16×16×3-sized image patch. We use $x_i = [x_{i1}, x_{i2}, x_{i3}]^T$ to denote the $i$th pixel of the output patch. The Softmax is defined as Eq. 4.

$$\hat{m}_{ik} = \frac{\exp(x_{ik})}{\sum_{j} \exp(x_{ij})}. \tag{4}$$

We convert $x_i$ to the label probability vector $\hat{m}_i = [\hat{m}_{i1}, \hat{m}_{i2}, \hat{m}_{i3}]^T$ as Eq. 5.

$$\hat{m}^{CIS}_{ik} = \frac{\exp(c_k x_{ik})}{\sum_{j} \exp(c_j x_{ij})}, \quad c_k = \begin{cases} 0, & \text{if } k = 1, \\ 1, & \text{otherwise.} \end{cases} \tag{5}$$

In the city, the number of background pixels is much smaller than the number of pixels of buildings and roads. Therefore, we set $c_k = 0$ for $k = 1$ to eliminate the influence of the background.
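Eq. 4 and Eq. 5 can be compared numerically with a short NumPy sketch. It assumes the first element of each pixel vector (k = 1) is the background channel, as in Eq. 5; the function names and the max-subtraction trick for numerical stability are additions of this sketch.

```python
import numpy as np


def softmax(x):
    """Plain Softmax over the last axis (Eq. 4)."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))  # shift for stability
    return e / e.sum(axis=-1, keepdims=True)


def cis_softmax(x, c=(0.0, 1.0, 1.0)):
    """Channel-wise inhibited Softmax (Eq. 5): scale each channel by c_k first.

    With c_1 = 0 the background activation is replaced by 0 before the
    Softmax, so it can no longer dominate the other two channels.
    """
    return softmax(x * np.asarray(c))


# One pixel with a very strong background activation.
x = np.array([[5.0, 0.0, 0.0]])
plain = softmax(x)          # background receives almost all probability mass
inhibited = cis_softmax(x)  # all three channels end up equal
```

In this toy case the plain Softmax puts nearly all of the probability on the background, while the inhibited version treats the background activation as 0 and returns a uniform vector.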

CNN-SVM

Feature extraction plays an important role in classification. The features we extract should effectively distinguish between different classes. For SVM, extracting features with traditional methods is a difficult and time-consuming task, whereas a trained CNN can simply and efficiently extract high-level features that cannot be described semantically. Therefore, we aim to improve the performance of the model by combining CNN and SVM.

First, we train a CNN model. Then, we replace the CNN's output layer with three SVMs while the rest of the CNN model remains unchanged. After that, we use the features extracted by the CNN, together with the labels, to fit these three SVMs. When we use CNN-SVM to predict remote sensing images, we can use appropriate strategies to adjust the precision and recall.

Dropout

We use Dropout to effectively prevent overfitting. During network training in deep learning, neural network units are temporarily discarded from the network with a certain probability. For stochastic gradient descent, each mini-batch effectively trains a different network because of the random dropping. In this way, the update of the weights no longer depends on the cooperation of hidden nodes with fixed relations, which prevents certain features from being effective only in the presence of other specific features.
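The random dropping described above can be sketched as follows. This is the common "inverted dropout" variant, in which surviving activations are rescaled during training so that nothing needs to change at prediction time; the rescaling and the function signature are assumptions of this sketch, not a description of the configuration used in the experiments.

```python
import numpy as np


def dropout(activations, drop_prob, rng, train=True):
    """Zero each unit with probability drop_prob during training.

    Surviving units are scaled by 1 / (1 - drop_prob) (inverted dropout,
    an assumption of this sketch) so the expected activation is unchanged.
    At prediction time the activations pass through untouched.
    """
    if not train or drop_prob == 0.0:
        return activations
    keep_mask = rng.random(activations.shape) >= drop_prob
    return activations * keep_mask / (1.0 - drop_prob)


rng = np.random.default_rng(0)
hidden = np.ones((4, 5))
hidden_train = dropout(hidden, 0.5, rng)              # a fresh mask per call
hidden_test = dropout(hidden, 0.5, rng, train=False)  # identity at test time
```

Because a new mask is drawn on every call, each mini-batch sees a different thinned network, which is what keeps units from relying on fixed co-adaptations.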

Predicting with Spatial Displacement

Figure 5 shows the spatial displacement we use in prediction. We displace the original input aerial imagery patches seven times, by one pixel each time. The predicted label patches of those versions then have the same displacement. We tile those predicted label patches together and divide all pixel values by eight to get an average.

Figure 5. Spatial displacement in prediction.

For the predicted label patches, this method plays an important role in smoothing the outputs over the patch boundaries and can significantly improve the performance. Figure 6 shows its effectiveness.

Figure 6. Comparison chart of using spatial displacement.
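The displace-and-average step described in this section can be sketched as below. Several details are assumptions: the displacement is taken along the diagonal (the text does not give the direction), `predict_fn` is a hypothetical callable that returns a label map with the same height and width as its input, and each pixel is divided by the number of displaced versions that actually cover it, which equals eight everywhere except in a narrow border.

```python
import numpy as np


def predict_with_displacement(image, predict_fn, n_shifts=8):
    """Average predictions over the original and displaced versions.

    image:      (H, W, C) input aerial image.
    predict_fn: hypothetical callable mapping an (h, w, C) image to an
                (h, w, 3) label map of the same height and width.
    n_shifts:   8 = the original position plus seven one-pixel displacements.
    """
    h, w = image.shape[:2]
    total = np.zeros((h, w, 3))
    count = np.zeros((h, w, 1))
    for d in range(n_shifts):
        # Start the prediction grid d pixels further down and to the right.
        pred = predict_fn(image[d:, d:])
        total[d:, d:] += pred
        count[d:, d:] += 1
    return total / count


# Toy check with a stand-in predictor that simply echoes its input.
toy_image = np.random.default_rng(0).random((12, 12, 3))
averaged = predict_with_displacement(toy_image, lambda patch: patch)
```

With a real patch-based predictor, each displaced version places the patch boundaries at different pixels, so averaging smooths the seams between neighbouring label patches.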

Experiment

Environment Configuration

Hardware Configuration. CPU: 2 Intel(R) Xeon(R) CPU [email protected], which has 12 cores. GPU: NVIDIA Corporation GK106GL [Quadro K4000], which has 3017 MiB memory. Hard Disk: Crucial_CT512MX100SSD1, which has 512 GB capacity.

Software Configuration. Anaconda3 (python3.6): Cython; Chainer 1.5.0.2; NumPy; Tqdm; Lmdb; OpenCV2; OpenCV 3.1.0; Boost 1.59.0; Boost.NumPy (26aaa5b).

We train four models, Mnih-CNN, Mnih-CNN with CIS (Channel-wise Inhibited Softmax) [3][4], Mnih-CNN-SVM and Mnih-CNN-SVM with CIS, to show the effectiveness of our CNN-SVM.

During training, we adopt mini-batch stochastic gradient descent with momentum. In the learning stage, the hyper-parameters are the mini-batch size, the learning rate (LR) $\eta$, the LR reducing rate $\gamma$, the LR reducing frequency $\tau$, the weight of the momentum term $\alpha$, and the weight of the L2 weight decay $\beta$. The values used for all experiments in this article are chosen based on Ref. 4 as follows: $\eta = 0.0005$, $\tau = 10^4$, $\gamma = 0.1$, $\alpha = 0.9$, $\beta = 0.0005$, with a mini-batch size of 128.

We train the SVMs with features extracted from the trained CNN. We design a multi-class SVM to classify background, buildings, and roads and adopt the one-versus-rest method. The training sets are drawn as follows:

1. Take the vectors corresponding to the background as the positive set, and the vectors corresponding to buildings and roads as the negative set;

2. Take the vectors corresponding to buildings as the positive set, and the vectors corresponding to the background and roads as the negative set;

3. Take the vectors corresponding to roads as the positive set, and the vectors corresponding to the background and buildings as the negative set.
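The three one-versus-rest training sets listed above can be built from a single vector of class labels. The sketch below assumes the classes are indexed 0 = background, 1 = buildings, 2 = roads and that positive and negative samples are encoded as +1 and -1; both conventions, and the function name, are illustrative rather than taken from the authors' code.

```python
import numpy as np

CLASSES = ("background", "buildings", "roads")


def one_vs_rest_targets(labels):
    """Turn class indices into one +1/-1 target column per binary SVM.

    labels: (n,) integer class index of each feature vector.
    Returns an (n, 3) array; column k is +1 where the sample belongs to
    CLASSES[k] (positive set) and -1 otherwise (negative set).
    """
    labels = np.asarray(labels)
    return np.where(labels[:, None] == np.arange(len(CLASSES)), 1, -1)


# Four feature vectors labelled background, roads, buildings, roads.
targets = one_vs_rest_targets([0, 2, 1, 2])
```

Column k of `targets`, paired with the same CNN feature vectors, is the training set for the k-th SVM.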

Using these three training sets, we get three SVMs. For prediction, we reshape the three prediction results into an n × 3 matrix. After that, we input the mini-batch labels and the features extracted from the trained models Mnih-CNN and Mnih-CNN with CIS to fit the three SVMs separately.

Prediction

To demonstrate the effectiveness of our CNN-SVM, we predict 10 remote sensing images of the test set with the four models Mnih-CNN, Mnih-CNN with CIS [8], Mnih-CNN-SVM and Mnih-CNN-SVM with CIS. Then, we map the background, building and road components of the prediction results to the B, G, and R components of RGB, respectively, and save them into one image, as in Figure 7.

Figure 7. Example of prediction results.

Evaluation

We use precision and recall to evaluate the extraction results. Precision is the ratio of the number of correctly predicted building or road pixels to the number of such pixels in the predicted label images. Recall is the ratio of the number of correctly predicted pixels to the number of true pixels in the true label images. Table 1 compares the precision and recall of the four models.

Table 1. Precision and recall on the test dataset.

                         Building Channel                      Road Channel
Model                    Average Precision   Average Recall    Average Precision   Average Recall
Mnih-CNN                 0.90004357          0.90021818        0.85319922          0.85342776
Mnih-CNN with CIS        0.89444072          0.89417839        0.85100188          0.8510249
Mnih-CNN-SVM             0.8987596           0.87341092        0.93471307          0.71739616
Mnih-CNN-SVM with CIS    0.89154397          0.89168285        0.84641856          0.84685401
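For one channel, precision and recall can be computed from binary masks as in the sketch below. It uses the strict pixel-wise definitions; whether the reported numbers use a relaxed variant that tolerates small spatial offsets is not stated in the text, so this is only an illustration of the metric. The function name and toy masks are made up for the example.

```python
import numpy as np


def precision_recall(pred_mask, true_mask):
    """Pixel-wise precision and recall for one channel (e.g. roads).

    pred_mask, true_mask: arrays of 0/1 (or booleans) of the same shape.
    """
    pred_mask = np.asarray(pred_mask, dtype=bool)
    true_mask = np.asarray(true_mask, dtype=bool)
    true_positive = np.logical_and(pred_mask, true_mask).sum()
    precision = true_positive / pred_mask.sum() if pred_mask.sum() else 0.0
    recall = true_positive / true_mask.sum() if true_mask.sum() else 0.0
    return float(precision), float(recall)


# Toy 2x3 masks: three predicted pixels, three true pixels, two in common.
pred = [[1, 1, 0], [0, 1, 0]]
true = [[1, 0, 0], [0, 1, 1]]
precision, recall = precision_recall(pred, true)
```

Averaging these two values over the test images, channel by channel, would give entries of the kind reported in Table 1.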

Figure 8 and Figure 9 show the precision-recall curves of Mnih-CNN and Mnih-CNN with CIS.

Figure 8. Precision-recall curve of Mnih-CNN.

Figure 9. Precision-recall curve of Mnih-CNN with CIS.

In Figure 10, the Mnih-CNN and Mnih-CNN with CIS points represent the precision of the Mnih-CNN model and the Mnih-CNN with CIS model after 400 epochs. After 11 iterations of our Mnih-CNN-SVM model and Mnih-CNN-SVM with CIS model, we evaluated them with a test set and obtained the two corresponding curves. Comparing the curves and points, it is evident that classifying the extracted features with the Mnih-CNN-SVM with CIS model can improve precision.

Figure 10. Precision comparison of 4 models.

Discussion

Table 1 and Figure 10 show the prediction results of the building channel and the road channel for the four models, respectively. It can be seen that our model (CNN-SVM with CIS) has the best performance in prediction. We predict with spatial displacement in order to smooth the outputs over the boundaries, which significantly improves the performance of the CNN. The better performance of our SVM is related to our classification strategy, which is used to deal with controversial predicted pixels.

A predicted pixel is controversial when its vector $[\hat{m}_{i1}, \hat{m}_{i2}, \hat{m}_{i3}]$ is mapped to [1,1,1], because a pixel cannot belong to three classes (background, buildings, and roads) at the same time. Since only the background class may share features with buildings and roads, for the prediction result [1,1,1] we assume that the pixel belongs to the road class, which means that the prediction result is changed to [0,0,1].

Pixels with vector $\hat{m}_i$ equal to [0,0,0] are also controversial, because a pixel must belong to one of the three classes (background, buildings, and roads). Since the background has no significant features, we classify such pixels into the background class.

We also take the prediction results [1,1,0], [1,0,1] and [0,1,1] into consideration. The first two results both contain the background class. Owing to the interference of the background, we set these prediction results to [0,1,0] and [0,0,1], respectively. In addition, the building class can interfere with the detection of roads, so we take the third prediction result as [0,0,1].

With the other prediction results unchanged, the prediction strategy is shown in Table 2.

Table 2. SVM classification strategy.

SVMs prediction results                  Final prediction result   Class
[1,0,0], [0,0,0]                         [1,0,0]                   Background
[0,1,0], [1,1,0]                         [0,1,0]                   Buildings
[0,0,1], [1,1,1], [1,0,1], [0,1,1]       [0,0,1]                   Roads
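Table 2 amounts to a lookup from the eight possible SVM output triples to one of three final labels. A direct encoding, with the vector order [background, buildings, roads] taken from the table and with hypothetical names `STRATEGY`, `CLASS_NAMES` and `resolve`, could look like this:

```python
# Keys: raw outputs of the three SVMs as (background, buildings, roads).
# Values: the final one-hot prediction after applying the strategy of Table 2.
STRATEGY = {
    (1, 0, 0): (1, 0, 0),
    (0, 0, 0): (1, 0, 0),
    (0, 1, 0): (0, 1, 0),
    (1, 1, 0): (0, 1, 0),
    (0, 0, 1): (0, 0, 1),
    (1, 1, 1): (0, 0, 1),
    (1, 0, 1): (0, 0, 1),
    (0, 1, 1): (0, 0, 1),
}

CLASS_NAMES = {
    (1, 0, 0): "background",
    (0, 1, 0): "buildings",
    (0, 0, 1): "roads",
}


def resolve(svm_outputs):
    """Map the three binary SVM outputs of one pixel to a final label."""
    final = STRATEGY[tuple(int(v) for v in svm_outputs)]
    return final, CLASS_NAMES[final]


final_vector, final_class = resolve([1, 1, 1])  # controversial pixel
```

The table favours roads whenever the road SVM fires, which is consistent with the higher road precision and lower road recall trade-off being adjustable through the strategy.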

Conclusion and Future Work

In this article, we propose a CNN-SVM model for extracting roads from remote sensing imagery. We study the difference between the prediction results and the true labels to develop a proper conversion strategy, which can effectively resolve the interference between background, buildings, and roads and significantly improve the performance of the model. For specific needs, the conversion strategy can help to obtain better precision. In the future, we will improve our CNN-SVM model to identify the context of the picture and change the conversion strategy according to the semantics, so that we obtain better predictions. Finally, we implemented our models with a new and flexible deep learning framework, Chainer. We will publish our datasets and the code of our methods and experiments at https://github.com/natrueSwitch/CNN-SVM.

Acknowledgement

This research was financially supported by the Research Innovation Fund for College Students of Beijing University of Posts and Telecommunications, the National Natural Science Foundation of China (No. 61702044), and the Fundamental Research Funds for the Central Universities (No. 2017RC27).

References

[1] X. Lin, Z. Liu, J. Zhang, J. Shen, Combining multiple algorithms for road network tracking from multiple sources remotely sensed imagery: a practical system and performance evaluation. Sensors 9 (2009) 1237-1258.
[2] A.P. Dal Poz, R.B. Zanin, G.M. do Vale, Automated extraction of road network from medium- and high-resolution images. Pattern Recog. Image Anal. 16 (2006) 239-248.
[3] Singh P.P., Garg R.D., Classification of high-resolution satellite images using spatial constraints-based fuzzy clustering, Journal of Applied Remote Sensing, 2014.
[4] Miao Z.L., Shi W.Z., Gamba P., Li Z.B., An Object-Based Method for Road Network Extraction in VHR Satellite Images, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2015, pp. 4853-4862.
[5] Kumar K.M., Velayudham A., Kanthavel R., An Efficient Method for Road Tracking from Satellite Images Using Hybrid Multi-Kernel Partial Least Square Analysis and Particle Filter. Journal of Circuits Systems and Computers, 2017.
[6] Wang J.H., Qin Q.M., Gao Z.L., Zhao J.H., Ye X., A New Approach to Urban Road Extraction Using High-Resolution Aerial Image. ISPRS International Journal of Geo-information, 2016.
[7] W. Shi, Z. Miao, and J. Debayle, "An integrated method for urban main-road centerline extraction from op[...]
[8] Saito S., Yamashita T., Aoki Y., Multiple Object Extraction from Aerial Imagery with Convolutional Neural Networks. Journal of Imaging Science and Technology, 2016.
[9] K. Fukushima, Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, Biol. Cybern. 36 (1980) 193-202.
[10] Y. LeCun, B. Boser, J.S. Denker, D. Henderson, R.E. Howard, W. Hubbard, and L.D. Jackel, Backpropagation applied to handwritten zip code recognition, Neural Comput. 1 (1989) 541-551.
[11] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE 86 (1998) 2278-2324.
[12] Alshehhi R., Marpu P.R., Woon W.L., Dalla Mura M., Simultaneous extraction of roads and buildings in remote sensing imagery with convolutional neural networks. ISPRS Journal of Photogrammetry and Remote Sensing, 2017, pp. 139-149.
[13] V. Mnih, Machine Learning for Aerial Image Labeling, Ph.D. thesis (2013).
[14] V. Mnih and G. Hinton, Learning to detect roads in high-resolution aerial images, Proc. 11th