Background modeling for intelligent video surveillance system.

(1)

University of Windsor University of Windsor

Scholarship at UWindsor

Electronic Theses and Dissertations Theses, Dissertations, and Major Papers

1-1-2006

Background modeling for intelligent video surveillance system.

Simeon Immanuel Kiran Indupalli

University of Windsor

Follow this and additional works at: https://scholar.uwindsor.ca/etd

Recommended Citation Recommended Citation

Indupalli, Simeon Immanuel Kiran, "Background modeling for intelligent video surveillance system." (2006). Electronic Theses and Dissertations. 7100.

https://scholar.uwindsor.ca/etd/7100

This online database contains the full-text of PhD dissertations and Masters’ theses of University of Windsor students from 1954 forward. These documents are made available for personal study and research purposes only, in accordance with the Canadian Copyright Act and the Creative Commons license—CC BY-NC-ND (Attribution, Non-Commercial, No Derivative Works). Under this license, works must always be attributed to the copyright holder (original author), cannot be used for any commercial purposes, and may not be altered. Any other use would require the permission of the copyright holder. Students may inquire about withdrawing their dissertation and/or thesis from this database. For additional inquiries, please contact the repository administrator via email

(2)

BA C K G R O U N D M O DELING FOR

INTELLIG ENT VIDEO SURVEILLANCE SYSTEM

by

Sim eon Im m anuel K iran Indupalli

A Thesis

Subm itted to th e Faculty of G raduate Studies and Research through th e C om puter Science

in P artial Fulfillment of the Requirem ents for the Degree of M aster of Science at the University of W indsor

W indsor, O ntario, C anada

(3)

Library and Archives Canada

Bibliotheque et Archives Canada

Published Heritage Branch

395 Wellington Street Ottawa ON K1A 0N4 Canada

Your file Votre reference ISBN: 978-0-494-35961-7 Our file Notre reference ISBN: 978-0-494-35961-7

Direction du

Patrimoine de I'edition

395, rue Wellington Ottawa ON K1A 0N4 Canada

NOTICE:

The author has granted a non

exclusive license allowing Library

and Archives Canada to reproduce,

publish, archive, preserve, conserve,

communicate to the public by

telecommunication or on the Internet,

loan, distribute and sell theses

worldwide, for commercial or non

commercial purposes, in microform,

paper, electronic and/or any other

formats.

AVIS:

L'auteur a accorde une licence non exclusive

permettant a la Bibliotheque et Archives

Canada de reproduire, publier, archiver,

sauvegarder, conserver, transmettre au public

par telecommunication ou par I'lnternet, preter,

distribuer et vendre des theses partout dans

le monde, a des fins commerciales ou autres,

sur support microforme, papier, electronique

et/ou autres formats.

The author retains copyright

ownership and moral rights in

this thesis. Neither the thesis

nor substantial extracts from it

may be printed or otherwise

reproduced without the author's

permission.

L'auteur conserve la propriete du droit d'auteur

et des droits moraux qui protege cette these.

Ni la these ni des extraits substantiels de

celle-ci ne doivent etre imprimes ou autrement

reproduits sans son autorisation.

In compliance with the Canadian

Privacy Act some supporting

forms may have been removed

from this thesis.

While these forms may be included

in the document page count,

their removal does not represent

any loss of content from the

thesis.

Conformement a la loi canadienne

sur la protection de la vie privee,

quelques formulaires secondaires

ont ete enleves de cette these.

Bien que ces formulaires

aient inclus dans la pagination,

il n'y aura aucun contenu manquant.

i*i

(4)

(5)

A bstract

D espite th e dram atic grow th of digital image and video in recent years, m any chal lenges rem ain in enabling com puters to interpret visual content. This thesis addresses th e initial stages of building an intelligent video surveillance system . T he central con trib u tio n of th is work is developing a framework which models the scene background and segm enting the moving foreground objects in video image d ata. This work is divided into two sections.

In th e first section, a background model is being designed dynam ically w ith no prior knowledge of the scene. A simple histogram based m ethod is used to accomplish this task.

T he second step involves segmenting the moving foreground objects using a clustering mechanism. We have used a simple K-m eans clustering, where the value of K is two. We have im plem ented our m ethods in HSV color space. T he results obtained were quite satisfactory in different real environm ents.

(6)

D edication

To m y parents

(7)

A cknow ledgem ents

’’The fear o f the LO R D is the beginning o f knowledge” Proverbs 1:7

W ith a grateful heart I th an k Him for all th e blessings and knowledge in pursue of th is worldly wisdom.

Secondly, I would like to express my sincere gratitu d e to my advisor Dr. Boubakeur Boufama. W ithout his support this thesis work would be impossible in m any ways. His continuous support and insightful guidance helped in every stage of m y m aster’s degree. I feel privileged to have him as my advisor.

I would like to th an k th e Networks Centres of Excellence Auto21 for funding th is thesis. I th an k Dr. Im ran Ahm ed and Dr. Jo n ath an W u for serving on my thesis advisory com m ittee and providing me w ith valuable feedback.

I gratefully acknowledge my colleague Ahsan Ali for his helpful discussions and suggestions. I render special thanks to my friend M ayuran: he has been w ith me right from th e first day of my arrival here on campus. I acknowledge his help and support.

Finally, m any thanks to all my friends on and off th e campus; it is hard to imagine a life in W indsor w ithout them .

(8)

C ontents

A b stra ct iv

D ed ica tio n v

A ck now ledgem en ts vi

List o f Tables x

List o f Figures xi

1 In trod u ction and M otivation 1

1.1 I n tr o d u c tio n ... 1

1.2 Problem of Visual R e p r e s e n ta tio n ... 1

1.3 T he O r ig in ... 2

1.4 Intelligent Video Surveillance Systems ... 3

1.5 M o tiv a tio n ... 3

1.6 Issues addressed in this thesis and C o n trib u tio n s ... 4

1.7 Thesis S tru c tu re ... 4

1.8 C o n c lu sio n ... 4

2 In telligen t V id eo Surveillance System : A n O verview 5 2.1 Overview of Intelligent Video Surveillance S y s t e m ... 5

2.1.1 In tr o d u c tio n ... 6

2.1.2 Pixel processing l e v e l ... 7

2.1.3 Segm entation l e v e l ... 10

2.1.4 Tracking l e v e l ... 11

2.2 Background M odeling and Segm entation: An o v e r v i e w ... 11

2.2.1 Characterizing the issues in Background M o d e lin g ... 11

(9)

2.2.2 Background modeling in real t i m e ... 12

2.2.3 Functional expectations of Background M odeling ... 14

2.2.4 Segm entation of moving o b j e c t s ... 14

2.3 C o n c lu sio n ... 15

3 L iterature R eview 16 3.1 Non-recursive T echniques... 16

3.1.1 M edian filter ... 16

3.1.2 Linear predictive f i l t e r ... 17

3.1.3 Histogram based m o d e l s ... 17

3.1.4 N on-param etric m o d e l ... 18

3.2 Recursive T e c h n iq u e s ... 18

3.2.1 A pproxim ated m edian f i l t e r ... 18

3.2.2 K alm an f i l t e r ... 19

3.2.3 M ixture of G au ssian(M oG )... 19

3.2.4 S tatistical m ethods w ith com bination of M o G ... 22

3.2.5 O ther S tatistical m o d e l s ... 24

3.3 C o n c lu sio n ... 24

4 D yn am ic B ackground M od elin g and S egm en tation 25 4.1 Dynam ic Background M o d e li n g ... 25

4.1.1 Histogram based Background Model ... 27

4.2 Segm entation by C l u s t e r i n g ... 28

4.2.1 K-M eans Clustering te c h n iq u e ... 29

4.2.2 Problem s in S e g m e n ta tio n ... 31

4.3 C o n c lu sio n ... 31

5 Im p lem en tation and R esu lts 32 5.1 Background Modeling using H is to g r a m ... 32

5.1.1 D is c u ss io n ... 36

5.2 Segm entation by K-m eans C l u s te r in g ... 36

5.2.1 D is c u ss io n ... 37

5.3 C o n c lu sio n ... 38

6 C onclusions 40

(10)

B ib liograp h y 42

V ita A uctoris 48

(11)

List o f Tables

4.1 H istogram based Background M o d e lin g ... 27 4.2 Segm entation by k-means c lu s te rin g ... 30

(12)

List o f Figures

2.1 Building blocks of intelligent video surveillance system [ 4 2 ] ... 6

2.2 A Scenario a t pixel processing level [4 5 ]... 7

2.3 A person identified using background s u b t r a c t i o n ... 8

2.4 An example of frame difference technique: (a) and (b) are two consec utive frames (c) is th e resulting image obtained using frame differencing. 9 2.5 An example of optical flow: Moving objects identified using these mo tion v e c t o r s ... 10

2.6 Some tracking results taken from [ 3 5 ] ... 11

2.7 G radual lighting change during the d a y ... 12

2.8 Moving branches of the trees a t th e b a c k g r o u n d ... 13

2.9 Continuous moving objects in th e b a c k g r o u n d ... 13

4.1 R epresentation of HSV color space in a 3 dim ensional space [53] . . . 26

4.2 H istory m ap: A sam ple overview of how pixels are being captured over t i m e ... 26

4.3 Sam ple histogram formed of 25 image frames on V com ponent . . . . 28

4.4 Sample clustering mechanism using K-means [54] ... 29

5.1 A series of images extracted from th e video took inside th e building . 33 5.2 Identified background of th e series shown in Fig 5 . 1 ... 33

5.3 Image series of th e video took inside th e shopping m a l l ... 34

5.4 Identified background of the series shown in Fig 5 . 3 ... 34

5.5 A nother image series of th e video inside the shopping m a l l ... 35

5.6 Background of th e series 5 . 5 ... 35

5.7 Result of segm entation in a shopping m a l l ... 36

5.8 Result of segm entation in a c o rr id o r ... 36

(13)

5.9 Problem s in Segm entation: (a)P artially segmented p a rt of shirt,(b)alm ost detected region after considering chrom aticity i n f o r m a t i o n ... 37 5.10 Experim ented Image of the above r e s u l t s ... 37 5.11 T he present m ethod is com pared to different other m ethods as m en

tioned in Walflower[28] a l g o r i t h m ... 39

(14)

C h ap ter 1

Introduction and M otivation

1.1 Introduction

This chapter gives an introduction to th e origin of com puter vision, and related fields associated w ith it. In th e subsequent sections, we have given a brief narratio n on the concept of intelligent video surveillance systems. T he m otivation and contributions follows this section. A detailed sum m ary on the stru ctu re of this thesis was given at th e end.

1.2 Problem o f Visual R epresentation

” ....Consider th e nam e of signs th e m ind makes use of for th e understanding of things, or conveying its knowledge to others. For, since th e things th e m ind contem plates are none of them , besides itself, present to th e understanding, it is necessary th a t som ething else, as a sign or representation of the thing it considers, should be present to it.”

-Jon Locke

E ssay Concerning Hum an Understanding - 1690

H um an brains respond alike when they see a known object. For instance, if two people see a cat they respond alike saying it is a cat. W h at m ight be the under lying factor behind this analogy? This question leads us to th e philosophy of m ind (Cum m ins, 1989). M any applications axe possible if we can interpret this analogy to m achines, b u t m any implications follow this delegation of hum an in terpretation to machines.

C om puter Vision is th e enterprise of autom ating and integrating a wide range

(15)

Chapter 1 Background Modeling fo r Intelligent Video Surveillance Systems

of processes and representations used for vision perception. T he field of C om puter Vision partially collide w ith other research areas like image processing (transform ing, encoding and tran sm ittin g images) and statistical pattern classification (statistical decision theory applied to general pattern s, visual or otherwise).

Visual perception is the relation of visual input to previously existing models of th e world. There is a large representational gap between th e image and th e models (’im age’, ’concepts’) which explain and describe th e image inform ation. To bridge the gap, com puter vision system s have a range of representations connecting the input and th e ’o u tp u t’ (a final description, decision, or interpretation). C om puter vision th en involves the design of these interm ediate representations and th e im plem entation of th e algorithm s to construct them and relate them to one another [52].

1.3 The Origin

T he field of C om puter Vision was not very famous until th e late 1970’s. T he actual work sta rte d when com puters could m anage th e processing of large d a ta sets such as images. However, m any m ethods and applications are still in the s ta te of basic research, b u t m any m ethods find their way into commercial products, where they constitute a small p a rt of a larger system and can solve th e complex tasks [51].

As m entioned earlier, com puter vision collaborates w ith different other fields of science. In physics it plays a significant role in understanding th e processes in which electrom agnetic radiation, (typically in th e visible or th e infra-red range and th e surface reflectance of the objects) are m easured by th e image sensor to produce image d ata. In th e area of artificial intelligence, C om puter Vision deals w ith robot navigation through some environm ent. This type of processing needs input d a ta provided by a com puter vision system . M any more scientific dom ains including signal processing, neurobiology, statistics, optim ization and geom etry need com puter vision system s in their respective applications.

W ith all of th e dem and in various application dom ains as m entioned above, com puter vision system s have been introduced in th e field of video surveillance ap plications. As defined in [36], ”It is th e science and technology of machines th a t see” . T his technology provides th e background to develop intelligent system s based on the inform ation obtained from th e images. T he applicability of vision system s in video surveillance are given below:

(16)

Chapter 1 Background Modeling for Intelligent Video Surveillance Systems

• D etecting events for visual security surveillance.

• M odeling objects or environm ents for industrial inspection or topographical m od eling.

And m any more real-tim e applications are possible w ith the robust expansion of this field.

1.4 Intelligent V ideo Surveillance System s

Someone approaches the door o f a national security services facility. A security cam

era watching this person does more than sim ply sending this inform ation to a guard.

I t scans the person’s face and runs this video through facial recognition program that

checks against a crim inal watch list and get list. The person tries to break in, the

perim eter breach alarm sets o ff and alerts the control room. It locks all the entrance

doors. Text messages will be sent to all the personnel across the facility indicating the

possible threat. The cameras tracking this person would now initiate an activity recog

nition program to identify possible suspicious actions and use o f weapons possessed

by this person.

T he above scene im itates th e scenario in a science fiction story. It is not sur prising to say, th a t we will have this system s in action very soon. M any organizations prefer video surveillance system s doing more th a n ju st capturing th e videos.

T he intelligent video surveillance systems improve th e security standards and solve th e scalability issues arising due to th e traditional video surveillance systems. Intelligent system will enable its users to understand as m uch as 10 tim es m ore because they can ju st look a t mug shots instead of raw footage.

1.5 M otivation

Keeping th e versatility of th e above system in mind, a num ber of issues have to be taken care of, to model such system. Background subtraction is at th e heart of intelligence video surveillance applications. To perform this operation, one has to provide a background image th a t has no foreground objects. However, in real-tim e conditions it is not always possible to capture th e background in advance. In th a t situation dynam ic background modeling plays an im portant role. In other words,

(17)

nam ic background modeling is a desired startin g step for any robust video surveillance system .

Segmenting the foreground objects follows the background modeling. In order to segment a foreground object, one has to provide a proper threshold value based on which segm entation shall be performed, b u t it needs m anual observation. A utom atic threshold calculation will considerably elim inate th e hum an intervention a t th e m iddle of th e process.

1.6 Issues addressed in this thesis and Contributions

T his thesis m ainly focuses on modeling th e background in busy areas and segmenting th e foreground moving objects. We have modeled the background using a histogram based m ethod and a segm entation technique was also proposed to segment th e moving foreground objects from th e background. We have used k-means clustering technique in our segm entation algorithm . We did an informal com parison on various color spaces in im plem entation phase. We chose HSV color space, which has an edge over o ther color spaces in regard to shadows and noise.

1.7 Thesis Structure

T he rest of th e thesis is organized as follows:

In chapter 2, we have given a brief introduction to video surveillance sys tem s followed by background modeling and segm entation issues. C hapter 3 gives an overview of th e m ethods proposed by different researchers in th e literature. C hapter 4 is divided into two sections: section one deals w ith background modeling, the other section explains th e detailed process of segm entation for segmenting th e foreground objects. Results of th e proposed m ethods were presented in C hapter 5. We conclude this thesis w ith some rem arks and future work in C hapter 6.

1.8 Conclusion

T his is an introductory chapter on various issues relating to video surveillance and com puter vision. We briefly m entioned th e topic of intelligent video surveillance w ith a narrative description. O ur m otivation behind this work was presented in the m otivation section.

(18)

C h ap ter 2

Intelligent V ideo Surveillance

System : A n O verview

T his chapter focuses on th e overview of an intelligent video surveillance system . We have given a detailed description of th e functional architecture of an intelligent video surveillance system . In th e subsequent sections, a detailed elaboration of issues re lating to background modeling and segm entation were given.

2.1 Overview o f Intelligent V ideo Surveillance System

W ith recent advances in com puter technology visual surveillance has become one of th e popular areas for research and development. In the earlier days, visual surveil lance involved m anual observation of video sequences for several hours. From the perspective of real-tim e th re a t detection, it is a well known fact th a t hum an visual atten tio n drops below th e acceptable level even when tra in ed personnel are assigned to th e task of visual m onitoring [41].

As a whole, visual surveillance system s seek to autom atically identify events of interest in a variety of situations [30]. Surveillance cam eras are installed in m any public areas to improve safety, and com puter-based image processing is a promising m eans to handle th e vast am ount of image d a ta generated by large networks of cam eras. T he task of an integrated surveillance system is to warn an operator when it detects events which m ay require hum an intervention, for example to avoid possible accidents or vandalism. A lot of promising applications are based on successful vi sual surveillance systems, such as vehicle guidance [24], object tracking for security

(19)

surveillance [25], traffic m onitoring, detection of abnorm al activity and th re a t evalua tion [44]. T he general framework of these applications groups th e num ber of different com puter vision tasks such as detection, tracking and classification of objects of in terest in image sequences, and understands and describes the activities involving the objects.

2.1.1 In trodu ction

Tracking Level Tim e of

the day Light Cyclei

Object Parameters Extraction O bject Tracking Scene Geometry Blob Segm entation Size

Filtering Segm entation Level

Pixel Classification Noise

Filtering Pixel P ro cessin g Level

Image Sequence

Figure 2.1: Building blocks of intelligent video surveillance system [42]

In Fig. 2.1, an ab stract overview of an intelligent video surveillance system has been given. T he process s ta rts from the collection of video frames sequentially from an uncalibrated cam era. These video frames undergo further processing, for analysis and inform ation retrieval. T he whole process has been divided into th e following functional blocks:

1. Pixel processing level 2. O bject segm entation level 3. Tracking level

(20)

This thesis m ainly focused on the first building block of th e system . Wece have assum ed to model a system which can dynam ically model th e background from a busy scene and segment th e moving foreground objects.

2 .1.2 P ix e l processin g level

Pixel classification is a basic step for any intelligent video surveillance system . The accuracy and versatility of th e whole application depends on this step. Different techniques are being used to accurately classify the pixels, based on th e intensity, color and pixel position [34], [8], [25].

list of segmented occluded regions

r L heck new pixel values agamsr model

m

new frame

■

! k v Connect non-background pixels s

into connected components

■

■" Filter components

■

Update background model

■

Figure 2.2: A Scenario a t pixel processing level [45]

T he pixel processing can be perform ed a t three different levels of abstraction, depending on th e type of application and processing speed in real-tim e.

B ackground Subtraction

T he m ost popular m ethod is to classify pixels into foreground and background cat egories. ft is a simple m ethod by which incoming video frames shall be subtracted from a background image. T he sample result of the background subtraction can be seen Fig 2.3.

However, one should know th e background scene before subtracting th e in coming image frame. T he background image can be a tta in e d in two ways, one by capturing th e em pty scene w ith no moving objects prior to th e startin g of th e process. D ynam ic background modeling is another approach, which models th e background dynam ically w ith no prior knowledge of the scene.

(21)

F igure 2.3: A person identified using background su btraction

T he dynam ic background modeling capable to model th e approxim ate back ground using various techniques. T he G aussian M ixture model, linear predictive filter, K alm an filter were among th e famous contributions in this area. In this thesis, we have proposed a histogram based model to model th e background. T he proposed model make use of some statistical inform ation captured from th e image sequence over a period of tim e. T he detailed explanation can be found in th e later chapters.

L im itations o f Background Subtraction

Background subtraction heavily depends on th e background model of the scene. A constant updating of the background model is necessary for b e tte r seg m entation* in later stages.

(22)

Frame D ifferencing

Fram e differencing is another technique to identify th e moving foreground objects. T he difference of two consecutive frames is further processed to identify th e foreground regions. However, this m ethod has some severe drawbacks in accurate segm entation of th e moving objects. T he result of such operation is shown Fig. 2.3.

(c)

Figure 2.4: A n exam ple of fram e difference technique: (a) and (b) are two consecutive frames (c) is th e resulting image obtained using frame differencing.

Lim itations o f Frame Differencing

One m ajor problem w ith frame differencing technique is it leaves holes in the resulting image. This is quite problem atic in later stages of th e application, especially w hen up d atin g th e model of th e background as this error linger for a longers period of tim e.

O ptical Flow

O ptical flow is a concept for estim ating th e m otion of th e objects w ithin a visual rep resentation. Typically the m otion is represented as vectors originating or term inating a t pixels in a digital image sequence [36].

Lim itations o f Optical Flow methods

(23)

* i * + +

* * 4 ♦ *

F igure 2.5: A n example of optical flow: Moving objects identified using these m otion vectors

Once th e optical flow is estim ated from the consecutive image frames the segm entation is implied based on th e flow of motion. O ptical flow techniques are considerably slow due to the complexity and com putational tim e.

2.1.3 S egm en tation level

Segm entation is th e next level after th e pixel process. It deals w ith th e identification of th e moving region in th e image frame. T he object segm entation has different problem s to deal w ith ranging from noise, object shape and size of th e object. For segm enting th e objects people have used different techniques ranging from threshold calculation [15], G aussian m ixture models [5],[25], Hidden M arcov M odels etc.

Segm entation suffers from different problems arising due to different condi tions. T he accuracy of tracking is very much depending on th e perfect segm entation of th e foreground object. T he following factors cause problem s in segmentation:

• Foreground aperture - the foreground color has a close m atch w ith the background color.

• Choosing threshold value - we need to choose possible threshold value in order to separate the actual foreground pixels, which need some observation on pixel intensities.

This thesis also contributes a m ethod addressing th e above m entioned prob lems. T he detailed discussion on this topic can be found in later chapters.

(24)

2.1.4 Tracking level

Tracking is th e final stage of th e video surveillance system . At th e tracking level th e movements of th e objects are tracked and suspicious behavior of th e objects are closely observed. Various factors are taken into consideration such as object motion, geom etry etc. T he results of tracking can be used in different real-tim e application.

Figure 2.6: Some tracking results taken from [35]

The detailed im plem entation and applications of various tracking algorithm s can be found in th e thesis work of Ali et al. [35].

2.2 Background M odeling and Segm entation: A n overview

In all the video surveillance applications, th e effective way for segmenting foreground objects is to suppress th e background points. To achieve this goal an accurate and adaptive background model is often desirable.

In this section we have addressed various stages in background modeling and problem s in real-tim e. We have also given a brief explanation of different problems while segm enting th e foreground objects.

2.2.1 C haracterizing th e issues in B ackground M od elin g

T he background modeling process has been characterized in three different sections:

B ackground M odel Initialization

This is th e first step in background modeling algorithm s. T he initial model is obtained by a short training period w ith no foreground objects when they are present in the scene [7],

(25)

B ackground M od el G eneration

This step gives the estim ated background from a short training period. In other words, it dynam ically models the background w ith moving objects present in the scene [43].

B ackground M od el M aintenance

This is a difficult step in background modeling. In real tim e situations th e observed scene undergoes different changes. T he background m odel has to be u p d a te d accord ingly w ith th e changes in th e scene.

2.2.2 B ackground m od eling in real tim e

Background usually contains nonliving objects; those rem ain passive in the scene. T he background objects m ay be stationary, such as walls, doors and room furniture, or they m ay be non stationary objects like tree branches, wavering bushes and moving escalators. These background objects often undergo various changes in a course of tim e due to th e changes in brightness caused by changing w eather conditions or the switching off th e lights.

W ith respective to th e above reasons, background image can be described as a com bination of static and dynam ic pixels. T he static pixels belong to th e stationary objects, and th e dynam ic pixels are associated w ith non-stationary objects. Below are th e examples of various scene characteristics:

T im e o f th e day

Tim e of th e day comes under updation of the background scene. This is a common problem while u p dating th e background when th e light gradually fades out.

F igure 2.7: G radual lighting change during th e day

(26)

Sudden illum ination changes

Quick illum ination changes completely alter th e color characteristics of th e back ground, thus increases th e intensity of background pixels. T he result of this change will cause a false detection of background as foreground, T he worst cases, th e whole image m ight appear as foreground.

M oving branches at th e background

This is a very common scenario in alm ost all th e outdoor video surveillance systems. T he moving branches of th e trees in th e background will cause problem s to accurately model the background.

F igure 2.8: Moving branches of th e trees a t th e background

U ncontrolled m oving ob jects in th e background

It is hard to identify the actual background scene if objects in th e background move continuously. M any algorithm s use a scene w ithout moving objects, b u t this kind of assum ption will p u t some serious lim itations in highly densed areas.

Figure 2.9: Continuous moving objects in th e background

(27)

S h a d o w s

Shadows create problems while modeling the background. Shadows cast by th e ob jects are classified as foreground, due to false illum ination in th e shadow region.

2 .2.3 Functional ex p ecta tio n s o f B ackground M od elin g

To develop a general background scene, a background m odel m ust be able to satisfy th e following conditions:

1. Should represent th e appearance of a static background pixel 2. Should represent th e appearance of a dynam ic background pixel 3. Self-evolve to gradual background changes

4. Self-evolve to sudden once-off background changes

Background is usually represented by image features a t each pixel. T he fea tures extracted from an image sequence can be classified into three types: spectral, spatial, and tem poral features. Spectral features can be associated w ith gray-scale or color inform ation. Spatial features can be associated w ith gradient or a local struc ture, and tem poral features can be associated w ith inter fram e changes a t the pixel. M any existing m ethods utilize spectral features (distributions of intensities or colors a t each pixel) to model the background [4],[5],[7],[9].. In order to be robust to illumi nation changes, some spatial features are also exploited [2],[10],[12]. T he spectral and spatial features are suitable to describe th e appearance of sta tic background pixels.

Recently, few m ethods have introduced tem poral features to describe th e dy nam ic background pixels associated w ith non statio n ary objects [6],[13],[14]. There is, however, a lack of system atic approaches to incorporate all three types of fea tures into a representation of a complex background containing b o th stationary and non statio n ary objects. T he features th a t characterize statio n ary and dynam ic back ground objects should be different. If a background m odel can describe a general background, it should be able to learn the significant features of th e background at each pixel and provide the inform ation for foreground and background classification.

2.2.4 S egm en tation o f m oving o b jects

Video object segm entation is a challenging step. Segm entation is nothing b u t figuring w hat is background and w hat is foreground. Segm entation can be done on variety of

(28)

factors like intensity, color, size and location.

P r o b le m s in s e g m e n t a tio n

Segm entation stage contributes a lot while tracking th e moving objects. T he accurate segm entation of foreground objects is difficult due to shadows in th e object region, and foreground aperture.

S h a d o w s in t h e f o re g ro u n d re g io n

Shadows create problems during segm entation, shadows are categorized into cast shadows and self shadows. C ast shadows are caused by the moving objects on the side of th e object, where as self shadows are cast by th e object on th e object itself (shadows cast by th e folding of a shirt).

In [50], P ra ti et al conducted a comprehensive survey on th e shadow detection of moving objects. A study was conducted on a determ inistic character, shadows are elim inated based on the assum ption th a t the chrom aticity of th e shadow region is slightly dark com pared to actual object region.

Foreground ap erture is another problem occurs while segm entation, when the foreground object color m atches close th e background object, p a rt of th e foreground object disappears after segm entation. A b e tte r segm entation m echanism can identify these problems.

2.3 Conclusion

This chapter is a introductory chapter on intelligent video surveillance applications. The filed of intelligent video surveillance, gaining popularity in th e recent years, m any interesting methodologies being proposed by different researchers for real tim e applications. Freshers to this field can benefit from this chapter, as they come across various problem s in modeling th e background and segm entation.

(29)

C h ap ter 3

L iterature R eview

B ackground m odeling is a t th e heart of any background su b traction algorithm . These algorithm s have been classified into two broad categories, they are non-recursive and recursive [1].

3.1 Non-recursive Techniques

In Non-recursive techniques, th e algorithm s have to process a buffer of previous N video fram es and estim ate th e background image based on th e tem poral variation of each pixel w ithin th e buffer. These Non-recursive algorithm s are highly adaptive and they do not depend on the history beyond the stored buffer. However, it is expensive to model a scene w ith slow moving objects using these algorithm s. Following are the m ajor algorithm s, th a t fall under this category:

3.1.1 M ed ian filter

A m edian based m ethod was proposed in [6] and [19]. A slightly modified version was proposed by Howe et al in [3], this m ethod recursively u p d ates th e background based on th e previous three image frames. W ith a sim ilar idea, Long et al proposed an adaptive sm oothness algorithm , which finds th e intervals of stable intensity and uses th e heuristics to choose th e longest interval [18]. Similar to this proposed approach, G utchess et al. in [7], proposed a m ethod called th e local image flow algorithm . This m ethod [7] also depends on th e intervals of stable intensity of an image sequence. Along w ith observing th e intervals of stable intensity, th e vicinity of neighboring pix els are also taken into consideration for estim ating th e background. T his m easure elim inates th e blending problem , when foreground objects stay in th e image for a

(30)

longer period. An optical flow technique is being implem ented, by sum m ing the dis tances of each vector head in the local neighborhood using th e G aussian m ixture. The m ain stren g th of this algorithm in [7], is to take a decision, w hether th e neighboring pixel belongs to either a background or a foreground.

3 .1.2 Linear p red ictive filter

Toyam a et al. proposed an approach which uses wiener filter[28], for background modeling th e background. Addressing various issues in background modeling, they have proposed an approach a t pixel, region and frame level to construct a robust background m odel which can deal w ith m ost of th e real tim e problems. A t pixel level if th e prediction deviates far from th e expected value, th e pixel shall be considered as foreground. For a given pixel, th e linear prediction of its next value is given by

p

St =

^2

akSt-k (3-1)

K= 1

This filter uses the past P values to predict pixel S t a t tim e t, s t-k is th e past value of a pixel and ak is th e prediction coefficient. At region level, they consider to segm ent th e whole object rath e r th a n some isolated pixels. At frame level,a repre sentative set of scene background models will switch autom atically to represent the current background. This algorithm is a m ilestone contribution in this area, which addresses and solves various problems in th e background modeling.

3 .1.3 H istogram b ased m odels

In [15], K um ar et al. proposed a histogram based background modeling algorithm , to model busy scenes. Here they used a queue concept w ith three channels of color m odel Y ,C b and Cr. W hen th e queue gets full, a calculation would be perform ed to regenerate th e background model. This m echanism is called blind u pdation. T he pixel values are p lotted on a histogram , th e values which belong to background will take th e higher values in th e histogram . T he value of W (G aussian m ixture coefficient), is set to 5 if th e background has m otion due to moving branches, twinkling surfaces etc,. Each pixel will be modeled using adoptive m ixture of Gaussians.

In th e segm entation level, a pixel considered to be p a rt of th e foreground, when th e current value of th e pixel is far from th e m ean relative to th e variance of G aussian m ixture.

(31)

3 .1 .4 N on-param etric m odel

In [4],Elgammal et al. described a basic background model and a subtraction process. T he m ain objective of this algorithm , is to capture th e inform ation about th e scene region. A continuous updation of th e background inform ation, to capture th e fast changes in th e background has also been proposed by this m ethod. Here ( x i,X2,..., xn )

are th e recent intensity values of a pixel. Using this m ethod, a P D F (probability Den sity Function) can be estim ated. T he non-param etric equation to estim ate a pixel intensity x t a t tim e t, is given below:

Here K is the kernal estim ator function, it can be calculated using a norm al distribution. A pixel is considered as foreground pixel, if th e probability P r ( x t) < th.

Here th is th e global threshold, which can be adjusted to achieve th e desired

percent-on the current history of pixel values.

3.2 R ecursive Techniques

Recursive techniques do not m aintain a buffer for background estim ation. Instead, they recursively u p d ate a single background model based on each input frame. Due to this factor, th e input frames from d istant past do not have any effect on th e current background model. Com pared to the non-recursive techniques, recursive techniques require less storage. However, any error in th e background model can linger for a longer period of tim e. Some of th e recursive techniques are described below:

3.2.1 A pp roxim ated m edian filter

To slightly improve th e m edian based m ethod proposed in [19]. In [2], Chung et al took th e confidence values of histogram , to identify th e stable intensity of each pixel over a training period. Similar m ethod was ad apted in [20], for developing th e initial background model before im plem enting a shadow removal technique while segm entation of foreground objects in HSV color space. In [14], K ornprobst et al. assum ed, th a t background can be defined as the m ost often observed p a rt over the sequence. T hey have presented an approach to deal w ith background reconstruction

University o f W indsor, 2006 18

(3.2)

(32)

and m otion segm entation based on P artial Differential E quations (PD E). T he results of this m ethod were good, b u t choosing th e param eters is difficult. Based on the same idea Hou et al. also proposed a m ethod in [12]. In th is approach, pixel intensity difference between inter-fram es is calculated, and pixel intensity values are classified based on th is difference. T he process is perform ed a t 4 different steps. In the initial step, classification th e intensities w ith stable intervals and calculate th e average of intensities a t each interval. In th e second step, classify intensities of stable intervals, w ith close average intensity values and take th e count of th e pixels in th a t group. Finally select th e intensity value, w ith m axim um num ber of pixels w ith homogeneous interval as background intensity value. The sim ulation results were good using this approach.

3.2.2 K alm an filter

A K alm an filter based mechanism was proposed in [24], which estim ates th e expected pixel value based on the previous observations. It explicitly considers th e noise on the m easurem ent of th e system values. T he com ponents include m easurem ent m atrix, system in put and K alm an gain m atrix in predicting th e future observation. The system estim ates th e values recursively and will assign th e weights according to the accuracy of prediction. This model is good, in handling th e illum ination changes in th e scene model. However, it involves higher complexity in updating.

3 .2.3 M ix tu re o f G aussian(M oG )

T his is a popular m ethod in background modeling, it was first proposed in [5] for background modeling. T hey have im plem ented this m ethod to explicitly classify the pixel values into three separate states corresponding to road, shadow color and colors corresponding to vehicles. A nother m ethod proposed by Stauffer et al, in [25] has draw n th e atten tio n of m any researchers. In this approach rath e r explicitly modeling th e values of all th e pixels as one p articular type of distribution, they sim ply model the values of a p articu lar pixel as a m ixture of Gaussians. Based on th e persistence and th e variance of each of the Gaussians of the m ixture, th ey determ ine which Gaussians correspond to background colors.

Their m ethod contains two significant param eters a - th e learning constant and

(33)

T- th e proportion of the d a ta th a t should be accounted for by th e background. In online m ixture model, they consider th e series of pixel values for a particular tim e as a ’’pixel process” . These are scalars for gray images and vectors for color images. T he following equation represents a pixel {ar0, 2/o} at tim e t, where I is th e intensity of th e image sequence.

{ A i ,...., A t} = { /( x 0, i/o, i) : 1 < i < t} (3.3) Here A i..A t are th e sequence of images in a tim e frame. Different guiding factors were taken into consideration while modeling and u p d atin g th e background. T he following equation shows the probability of observing th e current pixel value from the recent history and models it using K m ixture of Gaussian.

K

P {A t) = Y , wi,t * V(At, pa,u S M) (3.4)

i=i

where K is th e num ber of distributions and u>itt Sj,t are th e estim ates of weight (w hat portion of th e d a ta is accounted for by this G aussian), m ean value and covariance m atrix of th e ith G aussian in th e m ixture a t tim e t. W here rj is a Gaussian probability density function as shown below.

>)(*., M, S) = -

1 Ax,-Hrz-Hx,-K)

(3.5)

2 j | 2

K determ ined by the available mem ory and com putational power, currently 3 to 5 are used. T he covariance m atrix is assum ed to be of the form

E M = a \ l (3.6) this assumes th a t th e red, green, blue pixel values are independent and have th e same variance. P rior to th e pixel classification the weights are assigned to K distributions a t tim e t.

A pixel in every image frame has an image m easurem ent vector of th e form

[I, I x , I y, I t] was proposed in [23]. T he advantage of having m ulti-dim ensional G aussian distributions allows a greater freedom to represent th e distribution of th e m easure m ents occurring in th e background. This concept was based on th e m ethod proposed in [9], called TA PM O G (Tim e A daptive Per-Pixel MoG), where each pixel observation consists of a color and a d ep th m easurem ent. T he color is represented in YUV space, which helps to elim inate lum inance and chrom aticity, the d ep th m easurem ents are de noted by D. T he observation of a pixel a t tim e t is w ritten as X ^ t = [Yi>t, U^t , V}t , D ht].

(34)

T he history of observations a t a given pixel is [X^i, modeled by K gaussian distributions. K is same for all pixels and it is determ ined to be between 3 to 5.

T he new pixel values are compared w ith th e existing ones to find a m atch between th e current pixel value w ith th e existing Gaussian If a m atch is found, then its param eters are upd ated using current observation. If not, it will be replaced as a new observation. A lthough this m ethod is robust w ith respect to foreground segm entation and background modeling, it has slow learning rate and high com putational resources.

In [30], a m ethod nam ed ’PixelM ap’ was developed based on G aussian M ethod ology. T he m otivation behind their work is, in a fast varying background few G aus sians m ay not be helpful, so they have extended th eir scope from pixel to region and fram e levels. As explained before each pixel is modeled using K Gaussians categorized based on these characteristics [M e a n rgb,Varrgb ,M a x rgb

, M in rgb, F lag, T im e ]

At pixel level, each incoming pixel is checked against th e K G aussian until a m atch is found. T he moving objects have the larger variance th a n th e background pixels. So these distributions are ordered based on decreasing order of weights. In th e fram e level processing each fram e is sub tracted from previous and next frame. T he pixels which are identical in th e three frames are tre a te d as background. And a m ask is obtained from this process is treated as foreground. In region level process, a shadow elim inating mechanism and a foreground connected com ponents m ethod is im plem ented to remove shadows and fill gaps in foreground regions. T his m ethod showed fairly good results, b u t th e com putation cost is high.

T he improved version of G aussian M ixture was proposed in [32]. This m ethod adaptively u p d ates the background, w ith respect to newly introduced objects and lighting conditions. T he training set has been up d ated in reasonable tim e period T and at tim e t they m aintain a d a ta set Xt = { x * , x t~T}. For each new sample d a ta set, Xt th e background will be re estim ated as p { x \X r , B G ). However, among th e samples from th e recent history there could be some values, th a t belong to the foreground objects. T he estim ate of those objects can be done as follows.

M

p ( x \X T , B G + F G ) = Y 1 (3.8)

m=1

(35)

where estim ates of m eans and estim ates of th e variances m th a t describe th e G aussian components. A series of recursive u p d ate of equations were presented for this purpose. T he clustering of samples enable to identify th e background sam ples from th e foreground. After identification of background clusters, they m ust be arranged in descending order of weights for future process. T he value of Gaussian is set to 4 while experiments.

Above all are the various Gaussian methodologies th a t were proposed in the literature. M ost of th e above described m ethods work reasonably outdoor video sur veillance system s. B ut th e m ain draw back of G aussian models are th e tim e complex ity. T he per pixel processing of each image frame, consumes enorm ous com putational resources in real time.

3 .2 .4 S ta tistica l m eth od s w ith com bination o f M oG

In [13], Kim et al, proposed a statistical model w ith th e com bination of Gaussians in their process. In this model, X is a training set for a single pixel consisting of N RGB-vectors.

X = {xi, X2,..., xpj} and C = Ci, c2, ..., c/v represents th e codebook consisting of L

codewords. Each pixel has different codebook size based on its sample variation. Each codeword consisting of c*, i = 1...L, consists of an RGB vector vt = {R i, Gi, B,} and a six tuple auxi = (I i,I i,fi,X i,P i,q i), this tuple contains intensity values and tem poral variables described below.

1 ,1 - th e m in and m ax brightness, respectively

/ - th e frequency w ith which the codeword has occurred A - th e longest period since this code word has occurred

p,q - first and access tim es of each codeword th a t has occurred.

Based on th e color and brightness conditions we assign th e codewords for each pixel. A fter constructing the code book th e next step would be elim inating th e foreground codewords from th e constructed codebook. T he new code book, after th is tem poral filtering, will satisfy th e following condition.

M = {cTO|cm € C A Am < Tm} (3.9)

(36)

where a threshold TM is equal to half the num ber of training frames, y . A prob abilistic model using Bayesian decision theory was proposed in [16]. A t the lowest level , video background segm entation is a binary classification problem. A decision to be m ade for each pixel in th e fram e a t tim e t as foreground or background. Prom Bayesian perspective, a pixel being background P ( B \x ) , where x denotes the pixel observed in th e frame at tim e t and B denotes the background class. To make this decision a gaussian should be underlying the process and th e classification as follows:

K K

P ( x) = ] £ p ( G * ) P ( x |G * ) = Y J W k.g {x ,li k,a k) (3.10)

k= 1 k= 1

where Gfcis th e k-th Gaussian and gk(x) = g(x, fik, a k) is th e norm al density function. Segm entation in this approach consists of two independent problems: estim ating the distribution of all observations at the pixel level as a G aussian m ixture and evaluating how likely each G aussian in th e m ixture being background. T hey have also proposed a generalized formula to elim inate th e discontinuities of Gaussians switching from background to foreground during th e process. Recently in [17], a unim odel gaussian distribution w ith some updation equations for foreground segm entation was proposed.

A ddressing th e problems w ith th e various problem s in outdoor surveillance H orprasert et al. proposed a color model th a t separates th e brightness from chro- m aticity [10]. Each pixel was expressed in 3 dimensional RGB space, where Ei =

[ER( i) ,E c { i),E B (i)\ are expected color value of a pixel i in the reference or back ground image. 7j = [Ir, Ig, Ib] are th e intensity of th e pixel in the current image. T he m easure of distance between th e expected value and th e current value using the following equation gives th e brightness (a) and color distortion(C D ).

4 (oi) = (U - o a E if (3.11)

where a* represents the pixel’s stren g th of brightness w ith respect to expected value,

C D = ||ii - aiEi\\ (3.12) T hey have considered th e following a ttrib u te vector (E t, S t, au bi), where E t is the expected color value,s* is th e standard deviation of color value and a* is th e variation of brightness distortion and bi is th e variation of chrom aticity distortion. T his m ethod is able to deal w ith various outdoor conditions b u t th e u p d atin g of th e model is poorly addressed. Instead of traditional pixel based background m odeling, a novel approach was presented in [31]. This m ethod based on frequency dom ain analysis.

(37)

Each fram e is divided into 8X8 blocks, and two features Jd cJ ac extracted from th e D C T coefficients and each block is individually m odeled using single Gaussian m odel w ith m ean and stan d ard deviation (fi,cr). These two a ttrib u tes are initially estim ated from previous frames, possibly w ith some moving objects. T he advantage of this kind of background modeling is, it is not sensitive to pixel noise scene changes w ith respective to lighting.

3.2.5 O ther S ta tistica l m odels

In [22], P a n et al. proposed a statistical model from first N background frames. In the modeling phase, they have considered th e M ean and sta n d a rd deviation of each pixel in th e respective Y ,C b,C r color space. T he segm entation of foreground followed by th e fram e differencing. W hile segm entation, they have used a characteristic gain in th e commercial cam era and based on th a t factor a unim odel Gaussian was employed to segment th e foreground pixels from th e background. A different statistical model was presented in [8], they assumed, th e pixel’s intensity distribution is bim odal. The background scene is then modeled by three values, m inim um pixel intensity m (x), m axim um pixel intensity n(x) and th e m axim um difference d(x), betw een the con secutive frames observed during th e training period. T his process held in two stages, in th e first stage use the m edian filter to distinguish th e moving pixels from th e sta tionary ones, in th e second stage those stationary ones are processed to construct the initial background model. Let V be an array containing N consecutive im agesV l (x) is th e intensity of a pixel location x in th e ith image of V. T he initial background model is obtained as follows:

m (x ) = m in z { V z (x )} ;n (x ) = m a x z { V z (x)}; d(x) — m a x z { \V z(x) — E 2_1(:r)|} (3.13) where \V z {x) — A(x)| > 2 * a (x ) Hence,V z (x) is classified as moving pixel.

3.3 Conclusion

This chapter gives an brief description of various m ethods proposed in literature. Every m ethod has its significance in modeling the background and segm entation. However, each m ethod has its own drawbacks to be functional in real-tim e.

(38)

C h ap ter 4

D ynam ic Background M odeling

and Segm entation

This chapter gives a detailed explanation and contribution of th is thesis. A histogram based background modeling algorithm is presented, which dynam ically models the background w ith no prior knowledge of th e scene. A segm entation technique is also presented, which segments th e foreground objects from th e background [43]. The subsequent sections present the m ethodology and algorithm s, results can be found in th e late r chapters.

4.1 D ynam ic Background M odeling

T he key assum ption in developing this m ethod is because of increasing autom ation in this area. Instead of using a off line picture as a sta rtin g point, if we can able to develop a model, which can model the background by itself w ithout user inter vention would be much robust and serve the purpose in real tim e under different circum stances. We have also experim ented widely in different color spaces, to find th e suitable and reliable color space, which can handle noise and shadows a t a con siderable level.

HSV stands for Hue, S aturation and Value, which can also be tre a te d as Hue, S aturation and Brightness (HSB). HSV color space created by Alvy Ray Sm ith in 1978. T his is a nonlinear transform ation of RGB color space. Possibly used in color progression. O ur m otivation to use this color space came from [39]. C om pared to the other color spaces, it is capable to deal w ith noise and shadow in th e image region

(39)

very effectively. A pictorial representation of HSV color space is given in Fig. 4.1.

G reen

Cyan

'Red

Gray, S cale

| / * 0 '

(Black)

S (S atu ratio n )

Figure 4.1: R epresentation of HSV color space in a 3 dim ensional space [53]

In Fig 4.1, S'(saturation) and V(value) param eter values have a range of [0,1] and th e H(hue) param eter has a range of [0,360 degrees]. In th e figure 4.1, V is the axis pointing up in th e picture. W ith V = l, at the top represents relatively bright colors. Due to this unique separation of brightness property, we used HSV space rath e r th e traditional RGB space.

M any other works in HSV color space can be found in [37], [38],[33].

O bserved Images

Cell String

Pixel (m,n)

D 1R 1C 2R 2C N1 N2

; H*i*s"[v ] i hj s [v "i T h;s iv 1

I i , I_____ | i_I______________________________ 4 .___

[ 'hTs" !v' ! i'hTs'iv' ] i'hTs’ Fv I j— ».— i—- i j— i.—i— ^ i— L— ,—

/ ^— |

D 1R 1C 2R 2C N1 N2

□ [

Figure 4.2: H istory map: A sam ple overview of how pixels are being captured over tim e

(40)

As shown in Fig 4.2, th e history map represents intensity characteristics of a pixel over a tim e period. T he observed images are the image frames obtained from th e video over a period of tim e. T he cell string represents entire intensity variation a t each pixel location over all th e frames. Each pixel has three com ponents H ,S and

V. T he inform ation can be used to estim ate th e variation of pixel intensity over a tim e period. T he conclusions draw n from these, can be used for th e identification of stable intensity p a tte rn over tim e. Several m ethods had been proposed underlying this concept. T he simple m ethod to sta rt w ith is th e m edian calculation of this interval. Some other m ethods nam ely estim ation of G aussian coefficient or Gaussian m ixture.

In th e present context, we have used a histogram based m ethod. This his togram m ethod identifies th e intervals of stable intensity of each pixel. Later, these pixel intensities for each pixel are back su b stitu ted to form an approxim ate back ground model.

4.1.1 H istogram based B ackground M odel

In general, a histogram displays continuous d a ta in ordered columns. It an effective way to represent the continuous measures, such as tim e, intensity etc. Histogram is th e best way, to identify th e stable intensity of each pixel in certain tim e interval.

T he histogram in this case is formed of HSV com ponents, for each individual com ponent H,S and V, we have identified a stable intensity occurred m ost of the observed period of tim e and took into account. T he values are separated among four bins, we have chosen four bins because of th e stability in results. Values in each bin represents th e color information(mfensz£?/)at different tim e intervals. T he same procedure has been applied for all th e pixels. T he procedure for the histogram based m ethod given below:

1. Convert each fram e in HSV color space.

2. C alculate th e histogram of H, S and V com ponent for a p articu lar pixel in all th e fames. 3. F ind th e histogram bin w ith highest value and assign the m edian of this bin to th e

H, S and V com ponent of th e pixel in th e background model.

4. Perform step 1,2 and 3 for all th e pixels in th e background m odel until th e pixel values are stabilized.

Table 4.1: H istogram based Background M odeling

(41)

A sam ple graph is formed of ’V ’ com ponent of th e HSV color space in Fig 5.3. T he horizontal axes represent th e intensity value and vertical axes represents the count of each bin.

Pixel Intensity range

F igure 4.3: Sample histogram formed of 25 image frames on V com ponent

Once th e background model has been identified, th e next step is to segment th e foreground moving objects from th e background. In m any previous m ethods, an approxim ate threshold value is being used to differentiate th e foreground pixels from th e background. However, setting a threshold value is not so robust and m ay effect when th e scene changes dynamically. Because of this reason, we have used K- m eans clustering technique, which autom atically calculates th e threshold value. The algorithm and explanation can be found in th e subsequent chapters.

4.2 Segm entation by Clustering

C lustering is a classification technique. In m easurem ent space it is used as an indi cator of sim ilarity of image regions. It m ay also be used for segm entation purposes. Sim ilarity betw een image regions or pixels can be grouped, by using clustering (small separation distances) in th e feature space. These clustering m ethods come under the earliest d a ta segm entation techniques.

In Fig 5.4, a sample visible p artitio n of two different groups using K-means clustering can be seen. K-M eans clustering finds, a grouping of th e pixels which are sim ilar in color, position or a com bination of b oth etc. T he sim ilarity m easure betw een th e feature vectors is calculated by the Euclidian distance. Here th e value of ’K ’ represents num ber of clusters to be formed among th e pixels of th e image. W hile

Background modeling for intelligent video surveillance system.

Scholarship at UWindsor

Scholarship at UWindsor

Background modeling for intelligent video surveillance system.

Background modeling for intelligent video surveillance system.

BA C K G R O U N D M O DELING FOR

INTELLIG ENT VIDEO SURVEILLANCE SYSTEM

NOTICE:

The author has granted a non­

exclusive license allowing Library

and Archives Canada to reproduce,

publish, archive, preserve, conserve,

communicate to the public by

telecommunication or on the Internet,

loan, distribute and sell theses

worldwide, for commercial or non­

commercial purposes, in microform,

paper, electronic and/or any other

formats.

AVIS:

L'auteur a accorde une licence non exclusive

permettant a la Bibliotheque et Archives

Canada de reproduire, publier, archiver,

sauvegarder, conserver, transmettre au public

par telecommunication ou par I'lnternet, preter,

distribuer et vendre des theses partout dans

le monde, a des fins commerciales ou autres,

sur support microforme, papier, electronique

et/ou autres formats.

The author retains copyright

ownership and moral rights in

this thesis. Neither the thesis

nor substantial extracts from it

may be printed or otherwise

reproduced without the author's

permission.

L'auteur conserve la propriete du droit d'auteur

et des droits moraux qui protege cette these.

Ni la these ni des extraits substantiels de

celle-ci ne doivent etre imprimes ou autrement

reproduits sans son autorisation.

In compliance with the Canadian

Privacy Act some supporting

forms may have been removed

from this thesis.

While these forms may be included

in the document page count,

their removal does not represent

any loss of content from the

thesis.

Conformement a la loi canadienne

sur la protection de la vie privee,

quelques formulaires secondaires

ont ete enleves de cette these.

Bien que ces formulaires

aient inclus dans la pagination,

il n'y aura aucun contenu manquant.

A bstract

D edication

A cknow ledgem ents

C ontents

List o f Tables

List o f Figures

C h ap ter 1

Introduction and M otivation

1.1

Introduction

1.2

Problem o f Visual R epresentation

1.3

The Origin

1.4

Intelligent V ideo Surveillance System s

1.5

M otivation

1.6

Issues addressed in this thesis and Contributions

1.7

Thesis Structure

1.8

The author has granted a non

worldwide, for commercial or non