Real time face matching with multiple cameras using principal component analysis


Rochester Institute of Technology

RIT Scholar Works

Theses

Thesis/Dissertation Collections

6-2006

Real time face matching with multiple cameras using principal component analysis

Andrew Mullen

Follow this and additional works at:

http://scholarworks.rit.edu/theses

This Thesis is brought to you for free and open access by the Thesis/Dissertation Collections at RIT Scholar Works. It has been accepted for inclusion in Theses by an authorized administrator of RIT Scholar Works. For more information, please contact [email protected].


Real Time Face Matching With Multiple Cameras Using Principal Component Analysis

by

Andrew Mullen

A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of

Master of Science in Computer Engineering

Approved By:

Supervised by

Dr. Andreas Savakis

Department of Computer Engineering

Kate Gleason College of Engineering

Rochester Institute of Technology

Rochester, NY

June 2006

Andreas Savakis

Dr. Andreas Savakis - Professor and Department Head

Primary Advisor - R.I.T. Dept. of Computer Engineering

Shanchieh Jay Yang

Dr. Shanchieh Jay Yang - Assistant Professor

Secondary Advisor - R.I.T. Dept. of Computer Engineering

M. Shaaban

Thesis Release Permission Form

Rochester Institute of Technology

Kate Gleason College of Engineering

Title: Real Time Face Matching with Multiple Cameras Using Principal Component Analysis

I, Andrew Mullen, hereby grant permission to the Wallace Memorial Library to reproduce my thesis in whole or part.

Andy Mullen

Andrew Mullen

Acknowledgements

I would like to express my sincere appreciation to Dr. Andreas Savakis for his guidance and encouragement. His constant interest helped at every stage to enable this thesis to be completed. I am also grateful to Dr. Shanchieh Yang and Dr. Muhammad Shaaban for their continual support and efforts to allow me to overcome obstacles.

Other people who provided significant assistance were

Abstract

Face recognition is a rapidly advancing research topic due to the large number of applications that can benefit from it. Face recognition consists of determining whether a known face is present in an image and is typically composed of four distinct steps. These steps are face detection, face alignment, feature extraction, and face classification [1].

The leading application for face recognition is video surveillance. The majority of current research in face recognition has focused on determining if a face is present in an image, and if so, which subject in a known database is the closest match. This thesis deals with face matching, which is a subset of face recognition, focusing on face identification, yet it is an area where little research has been done. The objective of face matching is to determine, in real-time, the degree of confidence to which a live subject matches a facial image. Applications for face matching include video surveillance, determination of identification credentials, computer-human interfaces, and communications security.

The method proposed here employs principal component analysis [16] to create a method of face matching which is both computationally efficient and accurate. This method is integrated into a real time system that is based upon a two camera setup. It is able to scan the room, detect faces, and zoom in for a high quality capture of the facial features. The image capture is used in a face matching process to determine if the person found is the desired target. The performance of the system is analyzed based upon the

Table of Contents

Acknowledgements
Abstract
Table of Contents
List of Figures
List of Tables
Glossary
1.0 Introduction
1.1 Motivation
1.2 Problem Statement
1.3 Outline
2.0 Background
2.1 Current Research in Video Face Recognition
2.2 Key Attributes of the Face Detection Algorithm
2.3 Face Identification Strategies
2.4 Key Attributes of the Face Classification Algorithm
2.5 Face Recognition Strategies
2.6 In Depth Look at PCA
3.0 PCA Implementation
3.1 Source Images
3.2 PCA Algorithm
3.2.1 Difficulties
3.2.2 Limitations
3.3 Distance Measure Implementation and Testing
3.4 Distance Measure Results and Selection
3.5 Eigenfaces
4.0 Evaluation of Principal Component Analysis
4.1 Data Gathered
4.1.1 Image Size and Normalization Method
4.1.2 Number of Eigenfaces
4.1.3 Number of Images per Subject
4.1.4 Number of Subjects for Training
4.1.5 Evaluation of the First Three Eigenfaces
4.1.6 Evaluation of Image Masks
5.0 Face Matching
5.1 Thresholds and Characteristics
6.0 Real Time System
6.1 Components
6.1.1 GUI
6.1.2 Camera
6.1.3 Frame
6.1.4 Face
6.1.5 Eigenspace
6.1.6 Tracking
6.1.7 Threads
6.2 Data Path
6.3 Integration
6.4 OpenCV
6.5 Increased Precision Eigenfaces
6.5.1 Eigenspace Evaluation
7.0 Evaluation
7.1 Initial Test
7.2 Second Test
7.2.1 Inclusion of Subject Tracking
7.2.2 Confidence Levels
7.3 Digital Camera Images
7.4 System Speed
7.5 System Limitations
8.0 Conclusion
8.1 Future Work
9.0 Appendix A

List of Figures

Figure 1 - Skin Tone Concentration in HSV Colorspace
Figure 2 - Average Face Template & Edge Template [10]
Figure 3 - SVM Classifying Model
Figure 4 - Face Rotations
Figure 5 - Eigenface Creation Flowchart
Figure 6 - Distance Metric Testing Results
Figure 7 - Average Face
Figure 8 - The First Five Eigenfaces
Figure 9 - Eigenfaces 50 & 51, 100 & 101, and 150
Figure 10 - Normalization Test 1
Figure 11 - Normalization Test 2
Figure 12 - Normalization Test 3
Figure 13 - Eigenface Test 1
Figure 14 - Eigenface Test 2
Figure 15 - Images per Subject Test
Figure 16 - Number of Subjects Test
Figure 17 - Evaluation of the First Three Eigenfaces
Figure 18 - Mask Images
Figure 19 - Masking Test Results
Figure 20 - Male
Figure 23 - Retest of the Distance Metrics
Figure 24 - Threshold Selection for Face Matching
Figure 25 - Threshold Data for Face Matching
Figure 26 - GUI Screenshot
Figure 27 - Two views of the Frame Controls
Figure 28 - Frame Display Area
Figure 29 - Face Component
Figure 30 - Eigenspace Component
Figure 31 - Match Target Information
Figure 32 - Training the Tools
Figure 33 - Camera View Correspondence Problem
Figure 34 - 2D View of Simple CVC
Figure 35 - Face Offset
Figure 36 - General Face Proportions
Figure 37 - Tan-Sigmoid and Log-Sigmoid Transfer Functions
Figure 38 - Flowchart of the Data Transmission between Components
Figure 39 - Eigenspace Average Image Comparison
Figure 40 - Eigenface Comparison
Figure 41 - Eigenspace Comparison
Figure 42 - Examples of Unmatched Images
Figure 43 - False Positive Images
Figure 44 - Correctly Matched Images and their Match Scores
Figure 46 - Matches for a Threshold of 0%
Figure 47 - Matches for a Threshold of 15%
Figure 48 - Matches for a Threshold of 25%
Figure 49 - Matches for a Threshold of 40%
Figure 50 - Images of Andy in Environment A (left) and Environment B (right)
Figure 51 - Image Comparison, Environment A (left), Video System (right)
Figure 52 - Equalized, Grayscale version of Figure 51
Figure 53 - Square Image Size Testing - No Normalization - Data Set 2
Figure 54 - Rectangular Image Size Testing - No Normalization - Data Set 2
Figure 55 - Square Image Size Testing - Contrast Stretching - Data Set 2
Figure 56 - Rectangular Image Size Testing - Contrast Stretching - Data Set 2
Figure 57 - Square Image Eigenvector Testing - Data Set 2
Figure 58 - Rectangular Image Eigenvector Testing - Data Set 2
Figure 59 - Testing for number of Images for Eigenface Training - Data Set 2
Figure 60 - Evaluation of the first Eigenfaces

List of Tables

Table 1 - Data Processing Pipeline
Table 2 - Results from the Initial Test
Table 3 - Results from the Second Test
Table 4 - Results for using 10 frames for recognition
Table 5

Glossary

Face Recognition - Process of identifying a face in an image.
Face Detection - Determining the location and scale of a face in an image.
Face Alignment - Determining where facial components are located and performing geometric normalization.
Feature Extraction - Finding features to distinguish between face images.
Face Classification - Matching a subject in an image to an image in a database.
Face Matching - Real-time determination of the degree of confidence to which a live subject matches a facial image.
Eigenface - An Eigenvector from a set of face images.
Yaw - Rotation around the Y axis.
Pitch - Rotation around the X axis.
Roll - Rotation around the Z axis.
PCA - Principal Component Analysis
SVM - Support Vector Machine
SVC - Scene View Camera
OVC - Object View Camera
CVC - Camera View Correspondence
ROI

Chapter 1

Introduction

Face recognition has been a topic of great interest to people in many different fields. Numerous researchers from the disciplines of Psychology, Engineering, Image Processing, and Computer Science are interested in developing intelligent technology which mimics the workings of the human mind. Face recognition has attracted significant interest because many of the approaches used to develop algorithms are based on knowledge of the human brain. The human brain is wondrously complex and allows people to perceive and interpret visual information about an object in moments. Identifying a face is so simple and commonplace for a person and yet inherently difficult to duplicate for a computer. This is largely because our understanding of the human brain is so limited. This lack of knowledge has led to a plethora of proposed algorithms for the development of computer systems which seek to recognize faces.

Face recognition may be viewed as a combination of four distinct processing steps. These steps are face detection, face alignment, feature extraction, and face classification [1]. Face detection determines whether a face is present in the image and finds the location and the size of the face. In face alignment, facial features, such as the eyes and mouth, are located and used to normalize the geometry of the face. Feature extraction refers to the process of selecting a set of distinguishing characteristics for a specific image. Finally, face identification uses the features to determine a match in a database of known images.

The majority of current research in face recognition is focused more on the theoretical application of the knowledge and less on the practical aspects. Existing

database which is the closest match to a source image. The images are gathered using digital cameras and thus, there are no time constraints for processing of the image. In addition, the lighting is controlled and a flash is used to produce well illuminated faces. While this research is useful, it cannot be effectively implemented in the majority of real-world situations. The use of large databases makes the system inherently slow, causing unacceptable delays if the subjects are being monitored through a video system. In general, there is no guarantee that the recognition of the subject is correct, because if the subject is not in the database, it could be recognized incorrectly.

This thesis focuses on the problem of face identification in a real-time, dynamic environment. Instead of using a large database of images, the system typically uses a single target image. However, it can also use a small set of images. These images include subjects of interest for which the camera system is searching in the environment. The camera system employs face matching to decide if any of the desired targets exist and determines the confidence of a match between a subject in the dynamic environment and a face target in pictorial form.

The differences between the common face recognition system and a face matching system can be easily explained through an example. If, for instance, a common face recognition system was implemented in an airport, it would require a massive database of face images and would identify every person passing by (assuming an enormous amount of processing power). If the identity of every person in the vicinity was important, this system would be justified. However, if the system was attempting to

system would then be able to select the criminals from the multitude of faces. This system would examine how closely each person resembled the criminals in the database, searching for matches. If the common system were to use this smaller database, it would identify every person as the criminal they most closely resembled in the database.
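The contrast between closest-match recognition and thresholded face matching can be illustrated with a minimal Python sketch. The feature vectors, the Euclidean distance, and the threshold value are all assumptions made for illustration; they are not the measures or thresholds selected later in the thesis.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognize(probe, database):
    """Closest-match recognition: always returns some identity."""
    return min(database, key=lambda name: distance(probe, database[name]))

def match(probe, database, threshold):
    """Face matching: return the closest target only if it is close enough."""
    name = recognize(probe, database)
    return name if distance(probe, database[name]) <= threshold else None

# Hypothetical 3-element feature vectors for two targets.
targets = {"target_a": [0.9, 0.1, 0.3], "target_b": [0.2, 0.8, 0.5]}
bystander = [5.0, 5.0, 5.0]   # resembles neither target
suspect = [0.85, 0.15, 0.3]   # close to target_a
```

With these made-up values, `recognize` labels the bystander as one of the targets simply because some entry must be nearest, while `match` rejects the bystander and still accepts the suspect.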

The face matching technology could also be used in a manner similar to a fingerprint scan in order to give someone privileges or access to a system. An ID card could have an encrypted image of a person's face embedded in it which could be read when the card is swiped at a terminal. A camera system could then be used to verify that the person who swiped the card is the actual owner of the ID card. If the two match, the identification of the subject is verified and he or she can be granted access.

The result is that while current face recognition techniques and face matching are similar concepts, the development, implementation, and outcome of these systems are significantly different.

1.1. Motivation

Face recognition is a rapidly advancing research topic in image processing due to the large number of applications that can benefit from it. The leading application for face recognition is video surveillance. Other applications include determination of identification credentials, human-computer interfaces, and communications security. The intent of such a system is simply to extract information from a live video stream and convey it to a useful source.

If a system is developed that can match the face of a subject, the subject can be

which can receive the identification from the camera system. This opens possibilities for a lot of opportunities aside from standard video surveillance.

1.2. Problem Statement

The goal of this thesis is to develop a real-time, face matching system which can be employed in a dynamic environment. The critical requirements for this system include the ability to obtain high quality face images and to match faces in a fraction of a second. In order to get the precision required to manipulate and identify a face with a video camera system, two cameras will be used. The reason for this is that one of the cameras will be functioning at high levels of optical zoom in order to get the most accurate face image possible. However, while zoomed in, it will only be able to track a very small portion of the environment. While it is possible to start zoomed out, find a face, zoom in on it, and capture an image, each of these camera motions requires motor movement and is inherently slow. This process takes time and unless the subject is stationary, it is likely to fail. A more effective method would be to have one camera searching the environment at low zoom. This camera is called the Scene View Camera (SVC). This camera finds faces and gives the location of these faces to the camera which is at a higher zoom level. This second camera is the Object View Camera (OVC). The two cameras are able to work in tandem, finding the location of the face and capturing a high resolution image of the face.

In addition to the obvious hardware requirements, the system needs sophisticated

images, put them in a standardized form, and finally attempt to match the face image to the desired subject.

In order to improve the face matching process, the system employs an implementation of principal components analysis that has been specifically tailored to face matching. It is common practice to develop a principal components algorithm to determine if a region of the image is, in fact, a face. In order to recognize faces at a variety of pose angles, yet match faces accurately, a support vector machine will be integrated for face detection. The use of principal components will be solely for the purposes of face identification.

Each frame captured by the SVC undergoes color segmentation to determine possible face regions. These regions are examined by the support vector machine to determine if the region is a face or not. The best face region is used as a guide to determine the location of the subject in the environment. This information is used to determine the pan, tilt, and zoom of the OVC. Each image captured by the OVC undergoes color segmentation as well to aid in locating the face in the new image. This high quality face image will be examined for its facial features. Once the eyes and mouth have been found, the image will go through both geometric normalization and histogram equalization. The image resulting from this process is then examined using the principal components algorithm to determine if it is a match for one of the target faces.
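The histogram equalization step mentioned above can be sketched in plain Python as follows. This is a generic textbook formulation operating on a flat list of gray levels, offered as an assumption about what such a step might look like; the function name and interface are hypothetical and are not taken from the thesis implementation.

```python
def equalize_histogram(pixels, levels=256):
    """Histogram-equalize a flat list of integer gray levels in [0, levels)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:  # constant image: nothing to spread out
        return list(pixels)
    # Stretch the cumulative distribution over the full output range.
    scale = (levels - 1) / (total - cdf_min)
    return [round((cdf[p] - cdf_min) * scale) for p in pixels]
```

A low-contrast input such as `[100, 100, 101, 102]` is spread so that its darkest level maps to 0 and its brightest to 255, which reduces the effect of lighting differences before the faces are compared.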

Lastly, in order to allow the system to run in real time, all of the software must be streamlined and efficient so that all of this processing can be done on multiple images each second. This ensures the system runs smoothly, matching faces of those who pass

1.3. Outline

The implementation of a multiple-camera, real-time, face recognition system is presented in this thesis. This system implements a PCA based feature detection system that has been tailored for face matching. The following chapter discusses prior research focusing on the myriad of strategies for face recognition. The development and analysis of the PCA algorithm are detailed in Chapter 3 and Chapter 4. The threshold selection and final tweaking of the face matching algorithm are detailed in Chapter 5. The structure and implementation of the real-time system are described in Chapter 6. The performance and results obtained from the system are examined in Chapter 7, followed

Chapter 2

Background

A Ph.D. student by the name of Takeo Kanade, who is currently a Professor at Carnegie Mellon University, is considered to have designed and implemented the first face recognition system in 1973 [2]. Kanade used two computers to examine approximately 800 faces. His goal was to examine and extract the eyes, nose, and mouth of the subjects. His research into examining faces excited interest in facial image processing, leading to the development of facial recognition as it is known today.

Serious activity in face recognition began in the late 80's and early 90's. One of the major reasons for this was that processing power was finally reaching the point where large amounts of image processing could be done without significant waiting time. Early image processing was all done on still images stored on a hard drive. It was not possible to integrate the algorithms into real time because the system simply couldn't do it fast enough. As the amount of processing power increases, so does the ability to run more and more complex algorithms.

As computing power continues to increase, many of these algorithms can be placed into a real time video system which allows the process to move from static images to a dynamic environment. However, with this change a significant number of new factors come into play. The environment is no longer as controlled, and the lighting, pose

2.1. Current Research in Video Face Recognition

Although significant amounts of research have been done on still image face recognition, research in video face recognition has been sparse according to a literature survey done by Zhao et al. [3]. Some of the notable approaches include the work done by Liu et al. [4] using adaptive hidden Markov models to model human faces in video sequences and Shaohua Kevin Zhou and Rama Chellappa's [5] work using mean dependent image groups to identify subjects. There are four primary reasons research in this area has been limited. Typical video environments have poor video quality, variations in illumination, significant changes in subject pose, and low image resolution. This combination of factors makes it difficult to obtain a gallery of image subjects which can be used for face recognition and system training.

In order to address the difficulties with video and image quality, a multiple camera system has been used to enable the maximum effective level of zoom to be selected for tracking a subject. Multiple camera systems have been shown to be effective for aiding in locating small features on a subject [6], yet have not been proven to be effective for face recognition. The combination of the two cameras at different zoom levels will allow subjects to be effectively monitored and tracked at high image quality.

Instead of obtaining an extensive gallery of subjects through the real time system, a database of still images will be used to create a gallery of subjects which will be used to train the system. The FERET database contains a large number of facial images which were captured in a controlled environment [7]. The variations present in subjects

2.2. Key Attributes for the Face Detection Algorithm

In developing a real time face matching system, it is important to make sure that the desire for accuracy does not overcome the desire for speed. The combination of these two metrics determines the overall performance. Therefore, even though an algorithm may be very accurate, the processing time required per image must be weighed. If the processing time is simply too long, the performance of the system is unacceptable in practice.

Face matching with multiple cameras requires that the subject is looking at one camera, but can be detected with the other camera. Therefore, the face detection method must be able to recognize profile views of a subject's face in addition to a direct frontal view. In addition, the distance of the subject from the cameras is variable, requiring the method used to be able to compensate for faces of varying sizes. Lastly, since the subject is able to move freely while the system is functioning, the system must be able to detect subjects who are looking up or down. All of these variations require a face detection method that is versatile, yet very efficient. The detection of the face in the frame is but the first step in a very complicated process, and thus it cannot require too much processing time.

Face candidates will be located in the image using color segmentation. Color segmentation will allow the system to locate regions which contain a large amount of skin tones which should be examined to determine if the region is a face or not. For this reason, the face detection method will not be used across the entire image, but will be

HSV (Hue Saturation Value) color space and that image processing in this space is simple and effective [8,9]. A depiction of the spatial locality of skin tones in the HSV space is shown in Figure 1.

Figure 1 - Skin Tone Concentration in HSV Color Space
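The idea of color segmentation in HSV space can be sketched as a per-pixel test, shown here in Python using only the standard library. The hue, saturation, and value bounds below are illustrative assumptions, not the thresholds used in the thesis or in [8,9], and the function names are hypothetical.

```python
import colorsys

# Hypothetical skin-tone bounds in HSV (each channel scaled to 0..1).
HUE_MAX = 50 / 360          # reddish to yellowish hues
SAT_RANGE = (0.15, 0.70)
VAL_MIN = 0.35

def is_skin(r, g, b):
    """Return True if an 8-bit RGB pixel falls inside the assumed skin range."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h <= HUE_MAX and SAT_RANGE[0] <= s <= SAT_RANGE[1] and v >= VAL_MIN

def skin_mask(image):
    """Map a 2-D list of (r, g, b) tuples to a 2-D list of booleans."""
    return [[is_skin(*px) for px in row] for row in image]
```

Connected regions of `True` values in the mask would then be the face candidates handed to the face detector, so the detector never has to scan the whole frame.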

2.3. Face Detection Strategies

Template matching is one of the simplest methods for determining whether a face object is present. Template matching is a method which attempts to search for a surface that is similarly contoured to a face [10]. This is done by creating a surface whose intensity values match those predicted for a face. The eyes contain the darkest portions along with the mouth and nostrils, while the forehead and nose are the brightest portions. This template is scaled and placed over portions of the image in order to determine likely

can be used to detect areas in which the prominent edges match up with expected facial features. Although reasonable success at face detection can be obtained using this method, template matching is limited because it cannot account well for image rotations which deform the face and may result in a poor match. An example of template matching is shown in Figure 2 below.

Figure 2 - Average Face Template & Edge Template [10]
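A minimal Python sketch of intensity-based template matching follows. It scores each image position with normalized cross-correlation, which is one common similarity measure and is an assumption here; the measure used in [10] may differ. The sketch also omits the template scaling described above, and the function names are hypothetical.

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation of two equal-size 2-D intensity lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0  # flat patches carry no contour information

def best_match(image, template):
    """Slide the template over the image; return (score, row, col) of the best fit."""
    th, tw = len(template), len(template[0])
    best = (-2.0, 0, 0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            best = max(best, (ncc(patch, template), r, c))
    return best
```

Because the score compares intensities pixel by pixel at a fixed orientation, a rotated face lowers the correlation, which is the limitation noted in the text.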

A more complex method which is similar to template matching is 3-D face modeling [11]. This method uses a set of generic three-dimensional face objects and attempts to match different views of a person onto one of these fully three-dimensional objects. As an image is mapped onto the generic face object, it will be deformed to match the subject. The deforming process is done using a distance map which maps features on the image to the features expected on the 3-D model. Once created, the 3-D subject model will allow accurate representations of the subject's face independent of facial rotation. Unfortunately, due to its complexity it cannot be done in real time and it is not guaranteed that enough images of a subject will be available to create a complete 3-D face model.

Another approach to face detection is the use of support vector machines [12]. A

margin between sets of data. The support vector machine takes two sets of input data and attempts to classify the two types without being subject to overtraining. A graphical representation of how the SVM classifies linearly separable data is shown below in Figure 3. In the figure below, the dark dots are being separated from the light ones.

Figure 3 - SVM Classifying Model
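For linearly separable data, the way a trained linear SVM separates two classes can be sketched in Python as below. The sketch covers only the decision rule and the margin width, not training; the weight vector, bias, and sample points are hypothetical values chosen so that the nearest points of each class lie exactly on the margin.

```python
import math

def decision(w, b, x):
    """Signed score w.x + b; the sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision(w, b, x) >= 0 else -1

def margin_width(w):
    """Distance between the margin hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical hyperplane x1 + x2 = 3 separating two small point sets.
w, b = [1.0, 1.0], -3.0
dark = [[3.0, 1.0], [2.0, 2.0], [3.0, 3.0]]    # label +1
light = [[1.0, 1.0], [0.0, 2.0], [0.0, 0.0]]   # label -1
```

Training an SVM amounts to choosing `w` and `b` so that `margin_width(w)` is as large as possible while every training point stays on its correct side.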

In most cases, support vector machines are used to identify a particular object class by separating it from other possible objects. In the case of a face detecting SVM, it is attempting to find the optimal method of differentiating between a face object and the rest of the world. Support vector machines are generally quite effective, and can be trained using images selected by the user. The selection of a training set would allow the SVM to recognize faces at various poses. In addition, the input of the SVM must be regulated to a specific number of pixels. Therefore, any face region selected by the color segmentation algorithm would be resized, allowing the SVM to be independent to the

The last face detection approach is the artificial neural network [13]. An artificial neural network (ANN) is a learning system that is motivated by the internal connections of the human brain. An ANN is made up of interconnected neurons called nodes which respond in parallel to a set of input signals. A neural network consists of an input layer of nodes and an output layer of nodes which the user interacts with. A number of hidden layers of nodes may be present in between these two layers to improve the accuracy of the system. Signals from the input layer are passed through the nodes and are subject to an activation rule. If the correspondence at the node is not high enough, it will not produce a positive output signal. Unfortunately, as the complexity of the network increases, the processing time required to compute the results of the network rises sharply. Artificial neural networks often require an extensive training set and may not be able to effectively capture the variability of faces present in the real time system.
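The layered structure and activation rule described above can be sketched as a forward pass in Python. The log-sigmoid activation, the layer sizes, and all weights are assumptions chosen for illustration; the network is hand-set rather than trained and is not a face detector.

```python
import math

def log_sigmoid(z):
    """Logistic activation, squashing any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """Each node sums its weighted inputs, adds a bias, and applies the activation."""
    return [log_sigmoid(sum(w * x for w, x in zip(node_w, inputs)) + b)
            for node_w, b in zip(weights, biases)]

def forward(inputs, network):
    """Pass a signal through successive (weights, biases) layers."""
    signal = inputs
    for weights, biases in network:
        signal = layer(signal, weights, biases)
    return signal

# Hypothetical 2-input network: one hidden layer of two nodes, one output node.
network = [
    ([[4.0, 4.0], [-4.0, -4.0]], [-2.0, 6.0]),  # hidden layer
    ([[6.0, 6.0]], [-9.0]),                      # output layer
]
```

With these weights the output node fires (value above 0.5) only when exactly one input is active. Every added node or layer adds more multiply-and-sum work per input, which is why processing time grows with network complexity.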

Examination of the face detection methods outlined above shows that the support vector machine allows for the largest amount of variability while still maintaining efficient performance. Therefore, the support vector machine will be used for face detection in the real time system.

2.4. Key Attributes of the Face Classification Algorithm

When snapping a photograph or a series of photographs of a person, a large number of variations can come into play. The lighting, level of zoom, camera focus, and quality of the captured image can vary due to the characteristics of the camera selected. If the subject being photographed is not completely still, then there can be changes in the

roll, only roll can be compensated for in a photograph (or video frame). Figure 4 shows the relationship of these rotations. Due to the abundance of possible variations, the algorithm selected must be largely independent of these sources of error.

Figure 4 - Face Rotations (yaw, pitch, and roll axes)
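Roll is an in-plane rotation, so it can be estimated from two landmarks and undone with a 2-D rotation. The Python sketch below assumes the two eye centers have already been located and uses them as the landmarks; this choice and the function names are illustrative assumptions rather than the thesis's geometric normalization procedure.

```python
import math

def roll_angle(left_eye, right_eye):
    """In-plane rotation (degrees) of the line joining two (x, y) eye centers."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

def rotate_point(point, center, degrees):
    """Rotate a point about a center by the given angle (image coordinates)."""
    t = math.radians(degrees)
    x, y = point[0] - center[0], point[1] - center[1]
    return (center[0] + x * math.cos(t) - y * math.sin(t),
            center[1] + x * math.sin(t) + y * math.cos(t))
```

Rotating every pixel position by `-roll_angle(left_eye, right_eye)` about the left eye brings both eyes onto the same horizontal line. Yaw and pitch move parts of the face out of view, so no comparable 2-D correction exists for them.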

With this in mind, the existing face classification algorithms must be evaluated based on three important criteria. These criteria are the accuracy of the algorithm, the algorithm's robustness in the face of error, and the processing time required.

2.5. Face Identification Strategies

A promising method for matching faces appears to be principal component analysis [14, 15, 16, 17, 18]. This method, pioneered by M. Turk and A. Pentland [16], uses objects which are called Eigenfaces. Eigenfaces are simply the Eigenvectors derived from a set of face images. The Eigenvectors represent a new set of coordinates, called an Eigenspace, which allow efficient encoding of the most important facial

onto the Eigenfaces. The sets of weights for two separate images can then be compared in order to assess the degree of correlation between the images. Images of the same person should have very high correlation. The use of Eigenfaces has recently started to emerge as the research standard for face detection because it is quite reliable and not too computationally intensive.
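Projecting images onto a set of Eigenfaces and comparing the resulting weights can be sketched in Python as follows. The tiny four-pixel "images", the two orthonormal Eigenfaces, and the use of Euclidean distance as the comparison are all assumptions for illustration; the distance measures actually evaluated are the subject of Chapter 3.

```python
import math

def project(image, mean_face, eigenfaces):
    """Weights of a flattened image along each eigenface (dot products)."""
    centered = [p - m for p, m in zip(image, mean_face)]
    return [sum(c * e for c, e in zip(centered, face)) for face in eigenfaces]

def weight_distance(w1, w2):
    """Euclidean distance between two weight vectors; smaller means more alike."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(w1, w2)))

# Hypothetical 4-pixel "images" and two orthonormal eigenfaces.
mean_face = [100.0, 100.0, 100.0, 100.0]
eigenfaces = [[0.5, 0.5, -0.5, -0.5], [0.5, -0.5, 0.5, -0.5]]
```

Each image is reduced from one value per pixel to one weight per Eigenface, so two faces are compared using a handful of numbers instead of every pixel.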

Researchers have been working on ways to improve the standard PCA algorithm and have succeeded in multiple ways for specific systems. One of the most promising methods is based on Linear Discriminant Analysis (LDA). LDA reduces the dimensionality of a face image in a similar method to PCA, yet instead of producing orthogonal vectors, it produces vectors which linearly separate the data as much as possible [19]. Therefore, it theoretically produces a set of vectors which are more effective at face classification than Eigenfaces. The LDA vectors are similar to Eigenfaces and are often called Fisherfaces, since they are based on the linear discriminant developed by Ronald Fisher. Unfortunately, in order to train such a system to be more effective than PCA, all of the variables which will be encountered by test subjects must be in the training set. Such variables include lighting, pose angle, and expression. Since the system will be run in a dynamic environment, ensuring incorporation of all of the variables may be impossible.

Another method, Evolutionary Pursuit, attempts to directly improve PCA by manipulating the Eigenfaces themselves [20]. Evolutionary Pursuit (EP) requires an initial set of Eigenfaces generated by a standard PCA algorithm. The resulting Eigenfaces are then put through a large series of tests in which the vectors are rotated and the performance is measured. The performance of the resulting vectors is measured with

vectors discriminate well between faces, yet are still effective at matching a wide range of faces. While this method can generate vectors which are more effective than Eigenfaces, it is limited by the efficiency of the Eigenspace and the additional computation is not justified by the performance increase.

Independent Component Analysis (ICA) was initially developed to separate individual signals from a combination and can be employed for face recognition [21]. It decomposes a complicated signal into additive subcomponents which are statistically independent and was first used for face recognition by Bartlett and Sejnowski [22]. ICA differs from PCA in that it uses higher order moments in order to separate the data. PCA uses 2nd order moments and decorrelates the data. When used for face recognition, ICA assumes that a face is a combination of a set of unknown source images. These independent images can be used to identify a particular subject. While ICA has been shown to be more effective at distinguishing differences between images of a set, it does not generalize well to images which are not in the set.

Every human face has a unique, yet similar topological structure. A method which examines this structure in order to determine the identity of the face is Elastic Bunch Graph Matching (EBGM) [23]. In this system, faces are represented as graphs which have nodes at important features such as the eyes, nose, and mouth. The edges between the nodes are weighted with the distance between the nodes. In addition, each node contains information about its feature in the form of Gabor wavelets. While this method differs greatly from the other methods examined above, it does not achieve the


A final method, Bayesian statistics, has been used in conjunction with both Eigenfaces and Fisherfaces to improve the performance of the feature vectors in face classification. This approach uses a statistical classifier to help determine the importance of the components of the feature vector, allowing for more accurate face classification [24,25]. However, the use of Bayesian statistics is heavily dependent on the underlying feature set selection and can only marginally improve performance. Thus Bayesian statistics will not be employed in this thesis.

2.6. In Depth Look at Principal Component Analysis

The methodology behind principal component analysis (PCA) is related to the entropy of a system, where entropy is a measure of the amount of energy in a system. PCA works by first gathering a large set of target images. The covariance of this image set is then found. The covariance is a measure of the relative variation of the images with respect to each other. With this information, the Eigenvectors and Eigenvalues of the system can be produced. The Eigenvectors form a coordinate system where an image, of predetermined size, can be located. If the set of Eigenvectors is complete, the image can be completely recreated from its location in this coordinate system. The advantage of using the Eigenvectors to describe the image is that the Eigenvectors are produced in such a way that the axes try to minimize the entropy of the system. Minimizing the entropy of the system allows the maximum amount of data to be encoded in the minimum amount of space. This means that if only a small number of the most important Eigenvectors, determined by larger Eigenvalues, are used, the majority of the image will be


hundred or so values which correspond to its coefficients in the Eigenspace. While a reconstructed image is not perfect, it contains the principal components of the face by which the person can be identified. On a side note, if the Eigenvectors are viewed as images, they look like ghostly faces, and they are often referred to as Eigenfaces.

In order to determine the Eigenfaces, many steps must be taken. The general form of this process was developed by Matthew Turk and Alex Pentland [16]. The initial set of training images is defined as $\Gamma_1, \Gamma_2, \Gamma_3, \ldots, \Gamma_M$, where there are $M$ images ready for processing. The average of these images is $\Psi$ and can be found by the equation

$$\Psi = \frac{1}{M} \sum_{N=1}^{M} \Gamma_N$$

Each face image will then be processed by subtracting the average image to create the new set of face images marked by $\Phi_N$, where $\Phi_N = \Gamma_N - \Psi$ and $N = 1, 2, 3, \ldots, M$.

The covariance matrix, $C$, of this set of images can be determined by

$$C = \frac{1}{M} \sum_{N=1}^{M} \Phi_N \Phi_N^T$$

The covariance is generally in the form of $C = AA^T$ and can be put into this form (up to the constant factor $1/M$, which does not change the Eigenvectors) if $A$ is defined as the set of mean adjusted images $A = [\Phi_1, \Phi_2, \Phi_3, \ldots, \Phi_M]$. With the covariance of the system defined, it is possible to find the Eigenvectors of the covariance matrix. The Eigenvectors of this matrix with the largest Eigenvalues correspond to the dimensions of the system with the strongest correlation in the dataset.

Unfortunately, the covariance matrix is of the size P x P where P is the number of pixels in the source images. For images of decent quality, 128 x 128 pixels, this matrix


needed. It is possible to find the Eigenvectors and Eigenvalues, $v_i$ and $\lambda_i$, of the matrix $A^TA$ such that $A^TAv_i = \lambda_i v_i$. If both sides of this equation are pre-multiplied by $A$, the equation becomes $AA^TAv_i = \lambda_i Av_i$. In this equation, it is obvious that $Av_i$ are the Eigenvectors of the matrix $AA^T$, which is the covariance matrix. The benefit of this method is that the size of the matrix $A^TA$ is not based upon the number of pixels, but on the number of images. Therefore, with this method, the principal $M$ Eigenvectors of a system can be determined, where $M$ is the number of images used.
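The identity above can be checked numerically. The following is a minimal sketch, assuming NumPy and small random arrays standing in for real face images. The function name, the array sizes, and the use of `numpy.linalg.eigh` on the small matrix are illustrative choices and are not taken from the thesis implementation.

```python
import numpy as np

def eigenvectors_via_small_matrix(A):
    """Eigenvalues and unit-length Eigenvectors of A @ A.T, obtained from
    the much smaller matrix A.T @ A.

    A has shape (P, M): one mean-adjusted image per column.
    """
    small = A.T @ A                        # M x M instead of P x P
    eigvals, V = np.linalg.eigh(small)     # symmetric input, ascending order
    order = np.argsort(eigvals)[::-1]      # largest Eigenvalue first
    eigvals, V = eigvals[order], V[:, order]
    U = A @ V                              # columns A v_i are Eigenvectors of A A^T
    keep = eigvals > 1e-10 * eigvals[0]    # drop numerically zero directions
    U = U[:, keep] / np.linalg.norm(U[:, keep], axis=0)
    return eigvals[keep], U

# Hypothetical data: 5 "images" of 64 pixels each.
rng = np.random.default_rng(0)
P, M = 64, 5
images = rng.random((P, M))
A = images - images.mean(axis=1, keepdims=True)   # subtract the average image

eigvals, U = eigenvectors_via_small_matrix(A)
C = A @ A.T                                        # the full P x P matrix
residual = np.abs(C @ U - U * eigvals).max()       # should be close to zero
print(U.shape, residual)
```

Because the average image has been subtracted, the mean-adjusted images are linearly dependent, so at most M - 1 of the Eigenvectors carry information; the sketch discards the remaining direction.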

In order to make use of this information, a "scrambled covariance matrix" is calculated using the equation

$$C_s(m, n) = \Phi_m^T \Phi_n$$

thus allowing $C_s = A^TA$. The Eigenvectors of this "scrambled covariance matrix" can be determined using singular value decomposition. These Eigenvectors must now be multiplied by the matrix $A$ in order to get the principal Eigenvectors of the real covariance matrix. Since the matrix $A$ is a listing of the images, this can be done through a process of summing the multiplication of the Eigenvectors and the images from the data set. The process to develop the real Eigenvectors of the system, $u_L$, therefore is

$$u_L = \sum_{K=1}^{M} v_{LK} \Phi_K$$

where $L = 1, 2, 3, \ldots, M$ and $v_{LK}$ is the $K$th entry of the Eigenvector $v_L$. With this process complete, the Eigenfaces are simply the set of Eigenvectors $u_L$. The Eigenfaces form a set of vectors which constitutes a new coordinate system through which distances can be measured. In order to get the coordinates of an image in


the image into the Eigenspace. This process is rather simple and involves the use of the inner product. It consists of subtracting the average image from the image to project and then multiplying by the Eigenfaces to obtain the set of weights. The weight vector has an entry corresponding to each Eigenface, and each entry is a floating point number. The creation of these weights is done by

$$\omega_K = u_K^T (\Gamma - \Psi)$$

where $K = 1, 2, 3, \ldots, M$.
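The projection step can be sketched in a few lines. This is an illustration only, assuming NumPy, random stand-in images, and unit-length Eigenfaces stored as columns. The names `project` and `reconstruct` are hypothetical and do not come from the thesis software.

```python
import numpy as np

def project(image, mean_face, eigenfaces):
    """Weights w_K = u_K^T (image - mean_face), one per Eigenface column."""
    return eigenfaces.T @ (image - mean_face)

def reconstruct(weights, mean_face, eigenfaces):
    """Rebuild an image from its coordinates in the Eigenspace."""
    return mean_face + eigenfaces @ weights

# Hypothetical training set: 6 "images" of 48 pixels each, one per column.
rng = np.random.default_rng(1)
faces = rng.random((48, 6))
mean_face = faces.mean(axis=1)
A = faces - mean_face[:, None]

# Eigenfaces from the small M x M matrix, normalised to unit length.
vals, V = np.linalg.eigh(A.T @ A)
V = V[:, vals > 1e-10 * vals.max()]      # keep informative directions only
eigenfaces = A @ V
eigenfaces /= np.linalg.norm(eigenfaces, axis=0)

weights = project(faces[:, 0], mean_face, eigenfaces)
rebuilt = reconstruct(weights, mean_face, eigenfaces)
print(weights.shape, np.abs(rebuilt - faces[:, 0]).max())
```

A training image is recovered almost exactly here because every Eigenface is kept. Truncating `eigenfaces` to its leading columns would give the approximate reconstruction discussed earlier in this section.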

The weight vector is the set of coordinates by which the image can be located in the Eigenspace. Therefore, once the weights of two images have been determined, the "distance" between the two images can be calculated relative to the Eigenspace. There are many different distance metrics that can be used. The distance metrics that were examined are shown in the equations below [14].

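$$\mathrm{Dist}(x, y) = \sum_{i=1}^{k} |x_i - y_i|$$

Equation 1 - L1 Distance

$$\mathrm{Dist}(x, y) = \sum_{i=1}^{k} (x_i - y_i)^2$$

Equation 2 - L2 Distance

$$\mathrm{Dist}(x, y) = -\frac{\sum_{i=1}^{k} x_i y_i}{\sqrt{\sum_{i=1}^{k} x_i^2 \sum_{i=1}^{k} y_i^2}}$$

Equation 3 - Angle between Vectors

$$\mathrm{Dist}(x, y) = -\sum_{i=1}^{k} x_i y_i z_i$$

Equation 4 - Mahalanobis Distance

$$\mathrm{Dist}(x, y) = \sum_{i=1}^{k} |x_i - y_i| \, z_i$$

Equation 5 - L1 & Mahalanobis Distance

$$\mathrm{Dist}(x, y) = \sum_{i=1}^{k} (x_i - y_i)^2 z_i$$

Equation 6 - L2 & Mahalanobis Distance

$$\mathrm{Dist}(x, y) = -\frac{\sum_{i=1}^{k} x_i y_i z_i}{\sqrt{\sum_{i=1}^{k} x_i^2 \sum_{i=1}^{k} y_i^2}}$$

Equation 7 - Angle and Mahalanobis

where $z_i$ is a weighting factor determined by $\lambda_i$, the Eigenvalue of the $i$th Eigenvector.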

The distance measures above correspond to two vectors of weights corresponding to two separate images. The first image is assumed to be X where $x_i$ is the ith weight in the vector. The second image is assumed to be Y where $y_i$ is the ith weight in the second image's weight vector. Equation 1 shows that the L1 distance is simply the sum of the absolute value of differences between the weights. The L2 distance is similar to the L1 distance, yet it computes the total square error between the weights. While this norm is generally considered more useful than the L1 metric, both will be included in the tests. The Mahalanobis distance metric takes into account the properties of the Eigenspace. It applies a weighting factor, z, to the system. This factor relates the Eigenvector with the strength by which it is correlated to the dataset. Therefore, Eigenvectors with higher correlation to the dataset should be weighted more heavily since they are more important. Thus, the Eigenvalues allow the distance metric to weight the distance based upon how important the difference between the two measures actually is.
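The metrics described above can be sketched as short NumPy functions. This is a hedged illustration rather than the thesis code: the function names are hypothetical, and the weighting factor is passed in as an argument `z`. The example at the bottom uses z_i = 1 / sqrt(lambda_i) purely as an assumption, since the exact definition of z is not recoverable from the text here.

```python
import numpy as np

def l1(x, y):
    """Sum of absolute differences between the weights."""
    return float(np.sum(np.abs(x - y)))

def l2(x, y):
    """Total squared error between the weights."""
    return float(np.sum((x - y) ** 2))

def angle(x, y):
    """Negative cosine of the angle between the two weight vectors."""
    return float(-np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y)))

def mahalanobis(x, y, z):
    """Negative weighted inner product, with one factor z_i per Eigenvector."""
    return float(-np.sum(x * y * z))

def l1_mahalanobis(x, y, z):
    return float(np.sum(np.abs(x - y) * z))

def l2_mahalanobis(x, y, z):
    return float(np.sum((x - y) ** 2 * z))

def angle_mahalanobis(x, y, z):
    return float(-np.sum(x * y * z) / np.sqrt(np.dot(x, x) * np.dot(y, y)))

# Hypothetical weight vectors and Eigenvalues for illustration.
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 0.0, 5.0])
eigenvalues = np.array([4.0, 1.0, 0.25])
z = 1.0 / np.sqrt(eigenvalues)      # ASSUMED form of the weighting factor

print(l1(x, y), l2(x, y), l1_mahalanobis(x, y, z), l2_mahalanobis(x, y, z))
```

In every case a smaller value means a closer match, which is why the angle-based and Mahalanobis measures carry a leading minus sign.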

The selection of principal component analysis as the algorithm for face matching


this method. The algorithm must, therefore, be adapted to obtain the maximum level of


Chapter 3

PCA Implementation

Developing an implementation of the PCA algorithm was no simple task. In order to obtain an accurate set of Eigenfaces and successfully test the system, over 600 images were individually prepared by hand. The algorithm was then implemented and tested while varying many parameters in order to obtain efficient, yet accurate results.

3.1. Source Images

In order to obtain the sheer volume of images needed to develop, train, and test the system, a database of faces was necessary. One such database is the FERET database [26]. The FERET database was sponsored by the Face Recognition Technology (FERET) program which was supported by the Department of Defense's Counterdrug Technology Program. The goal of the FERET program was to develop automatic face recognition technologies that could be used to assist the law enforcement community.

In order to accomplish this task, a large database of face images was gathered by professors at George Mason University, independently of the algorithm developers. The images were collected in a partially controlled environment in order to maintain a degree of consistency. The same setup was used in each photography session, yet the images were collected on many different days with slightly different equipment.

The database was fully assembled after fifteen sessions which took place between the fall of 1993 and the spring of 1996. The FERET database totals more than fourteen thousand images in about fifteen hundred sets. Over one thousand subjects were


In order to develop a cohesive set of Eigenfaces, the images in the database were examined and selected. Since the objective of the testing would be to match an image of a subject to another image of the same subject, two images of each individual were needed. The images had to be similar, yet not identical. Images with variations in facial pose angles, expression, and lighting were acceptable candidates. Images with variations in facial hair, glasses, or severe differences were discarded. The images were then grouped into three distinct data sets.

The data sets were determined by the perceived level of differences in the images. The first data set contained pairs of images which were deemed the most similar to each other. The images in this set were also taken at the same level of zoom and lighting, and had small variations in facial pose angles. The second set consisted of images with slightly larger variations. The images in the third set were considered the ones with the greatest variation. Images in this set varied by zoom level, lighting, larger facial pose angles, and more extreme expressions. It was this third set that was intended to measure the generalization of the algorithm developed. This set is also the most important since once the algorithm is integrated into the real time camera system, all of these variations would come into play.

Data Sets 1 and 2 each consisted of 100 subjects, fifty of which were male and fifty of which were female. Data Set 3 consisted of 77 subjects, 51 of which were male and 26 of which were female. There were two images of each subject selected and no subject appeared in more than one data set. The images were finally cropped in a


3.2. PCA Algorithm

In order to calculate the Eigenfaces, the program input requires a listing of images, the desired width and length of the Eigenfaces, and the desired number of Eigenfaces. In order to produce solid Eigenface images, the source images must be standardized in the same way. A flowchart of the steps taken is shown below in Figure 5.

[Figure 5 - Eigenface Creation Flowchart. Steps: Load Listing File; Load Image; Convert to Grayscale; Resize; Enough images loaded? (No: load the next image; Yes: continue); Calculate Covariance; Calculate Eigenvectors; Calculate Eigenfaces; Save to File.]

The covariance of the image set relies upon the important features from different subjects matching up in the same pixel locations. This means that each of the source images should be cropped in the same manner, centered on the eyes, and rotated so that the face image contains no more than 15 degrees of roll. For initial training and development purposes this was all done by hand. The faces were cropped at the top of the forehead, bottom of the chin, and at the base of the ears for over 500 faces. If the face had more than 15 degrees of roll, it was digitally rotated so that the roll was as close to zero as possible.

After all of the images were cropped, listings of the images were created so that


created with the full name of the images desired with a single filename per line in alphabetical order. This simplified the process of reading in the images, counting them, and most importantly, keeping track of which faces were matches.

The images are then read into the system one at a time with each image being processed before being added to a large input image array. The processing of the images is relatively simple. Each image was resized to a standard width and length in pixels. Once this was completed, the image was normalized to help reduce the effect of lighting upon the image. At the completion of this step, the system contained an array of up to 500 images which had been properly processed.

In order to determine the Eigenvectors of the images, a standard covariance matrix could not be used since the covariance matrix of a set of images sized at 128 by 128 pixels would contain over 250 million entries ($2^{28}$). This data would be nearly impossible to contend with. It is for this reason that the "scrambled" covariance matrix is calculated as mentioned in Section 2.6. The scrambled covariance matrix contains only M x M entries, where M is the number of input images, so its size does not depend on the number of pixels. From this covariance matrix a set of Eigenvectors is determined along with the set of corresponding Eigenvalues. The Eigenvectors are used to create the Eigenfaces through a process of scaling and adding the initial images to one another as described in Section 2.3. The resulting Eigenface vectors are stored to a text file for later retrieval and use.
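As a worked check of the sizes involved, assuming 128 x 128 pixel images, the upper limit of M = 500 input images mentioned above, and 4 bytes per entry (32 bit floating point values):

$$P = 128 \times 128 = 16{,}384 = 2^{14}, \qquad P^2 = 2^{28} = 268{,}435{,}456 \text{ entries} \approx 1 \text{ GiB}$$

$$M = 500, \qquad M^2 = 250{,}000 \text{ entries} \approx 1 \text{ MB}$$

Under these assumptions the scrambled covariance matrix is roughly one thousand times smaller than the full covariance matrix.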

In order to effectively use the PCA algorithm, Eigenfaces should be created and used at the same resolution. Resizing the Eigenface objects creates distortions which


addition, any image which is going to be used in the Eigenspace should be processed in exactly the same manner as the images which were used in the creation of the Eigenfaces.

3.2.1 Difficulties

There are many challenges inherent in developing a successful set of Eigenfaces. The first and foremost among them is the need for clear, pronounced, relevant Eigenfaces. The only way to achieve such results is to use a large number of faces which have been cropped and rotated in the same manner so that every principal component of the different face images lines up with each other. Unfortunately, it is nearly impossible to crop each image by hand in exactly the same manner. To make matters worse, the pitch and yaw of the face in the images cannot be controlled. While these slight deviations do not matter to the human eye, they have unknown and possibly profound effects upon the creation of the Eigenfaces.

Each error in cropping or poor image selection potentially introduces error into the development of the Eigenfaces. If enough error is introduced, an entire Eigenface could be devoted to it, reducing the effectiveness of the other Eigenfaces. Unfortunately, a visual examination of the Eigenfaces cannot reveal if one is valid or not. Furthermore, when the matching procedure fails to match a set of faces it is often extremely difficult to


3.2.2 Limitations

There are many limitations placed upon the effectiveness of the Eigenfaces based upon their desired use. For this system, the Eigenfaces are being designed for the identification of the principal components of a subject's face and not for directly identifying faces. This means that the Eigenfaces are always assuming a frontal and non-rotated view of the face and cannot work effectively with a side view of the face. Eigenfaces are generally trained with tilted faces as well as frontal views to allow them to recognize faces that are tilted, as face objects. In this work it is assumed that the face will already be identified before it is passed to the Eigenspace for calculations. It is also desired that the pose angle of the face should not play an important factor in identifying whether or not the face matches a particular subject. This line of reasoning clearly leads to the fact that in training the Eigenfaces, it is desired that no principal components in the Eigenspace correspond to the angle of the face. If too many of the Eigenfaces correspond to the angle of the face instead of the facial features, the system will match subjects whose heads are posed in the same manner.

Another limitation of the algorithm is that it is significantly more dependent upon lighting variation than previously reported. While this is not surprising in retrospect, it is still a difficult issue to handle. Research showed that an effective Eigenspace was relatively independent of the light source, as long as there was enough light to view the face. For the purposes of matching specific faces, the camera needs to produce an image with crisp, clear features instead of just a general view that can be identified as a face.


intensity and angle of the light. This means that the environment needs to be more controlled than initially believed if performance is to be maintained. It also may have an effect on how the algorithm generalizes across different environments.

A third important limit of the algorithm is that the ears, a rather distinct feature on most people, cannot be used to identify the face without serious restrictions placed upon the camera system's environment. The reason for this is that the ears stick out from the face and therefore create pockets above and below them, in a cropped image, which contain a view of the background. Since the background is ever-changing, this adds a significant amount of noise to the processed image and clouds the Eigenfaces. This could be avoided by putting the camera system into a room where everything was featureless and white, but that would render the system impractical. It was found during early testing that this was the case, hence it is desirable to crop the face at the base of the ears.

Finally, the goal of face matching is not to foil disguises, but to identify a more cooperative subject. In this regard, disturbances such as removing glasses, modifying facial hair, and other face occlusions were not trained into the Eigenfaces. It is desired that the system work on effectively matching clear facial images. To this end, facial occlusions were not present in any of the training images and were not taken into account in the final system design.

3.3. Distance Measure Implementation & Testing

Previous research showed that there exist many different ways to measure the distance between two images which had been projected into an Eigenspace [14]. While


instead of selecting the best resulting measure from another experiment. The reason for this decision was that the Eigenfaces were trained specifically for this system, focusing on certain variables and being used in a unique way. The effectiveness of the distance measures would most likely vary, since so much of the system would not remain the same. In order to test each of the distance measures, a systematic process of examining the faces was developed.

The method developed allowed a large set of images to be examined. In the set each subject was represented by two images. The process functioned by taking the first image, and calculating its distance to each of the other images in the set. The image which was closest, yet not identical, was considered the best match. The system displayed the initial image, and the two best matches for each of the distance metrics, to allow visual confirmation of the accuracy of the statistical measure being calculated. The system would record whether or not the images selected were correct matches. After the calculations were complete the system would pause momentarily and then move on to the next image. This pattern would continue until the entire set of images had been processed. The resulting statistical information was logged to a file.
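The matching test described above can be sketched as a leave-one-out nearest neighbour search over the weight vectors. This is a minimal illustration, assuming NumPy and hand-made two-dimensional weight vectors. The function names and the toy data are hypothetical, and the display and logging steps of the real system are omitted.

```python
import numpy as np

def l1_distance(x, y):
    return float(np.sum(np.abs(x - y)))

def best_matches(weights, distance, count=2):
    """For each image, return the indices of the `count` closest other images."""
    results = []
    for i in range(len(weights)):
        others = [j for j in range(len(weights)) if j != i]   # skip the image itself
        others.sort(key=lambda j: distance(weights[i], weights[j]))
        results.append(others[:count])
    return results

def match_accuracy(weights, subject_ids, distance):
    """Fraction of images whose closest other image shows the same subject."""
    matches = best_matches(weights, distance, count=1)
    hits = sum(subject_ids[i] == subject_ids[m[0]] for i, m in enumerate(matches))
    return hits / len(weights)

# Toy data: three subjects, two images each, as 2-D weight vectors.
weights = np.array([
    [0.0, 0.0], [0.5, 0.2],      # subject 0
    [10.0, 0.0], [10.3, -0.4],   # subject 1
    [0.0, 10.0], [0.4, 9.7],     # subject 2
])
subject_ids = [0, 0, 1, 1, 2, 2]

accuracy = match_accuracy(weights, subject_ids, l1_distance)
print(accuracy)
```

Any of the distance metrics from Section 2.6 could be passed in place of `l1_distance`, which is how one metric at a time can be compared on the same set of images.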

In order to gather useful statistical information, only one parameter was varied at a time. The test results for each of the variables are displayed in Section 4.1. The important variables in the testing were:

- Distance Metric Selection
- The Normalization Method
- Number of Images per Subject
- The Number of Eigenfaces to Keep
- The Effectiveness of the First Eigenface
- The Effectiveness of Masking the Image
- The Effectiveness of Male vs. Female Eigenfaces

The number of Eigenfaces to keep was an important variable because while the first few Eigenfaces are relatively clear, the last Eigenfaces are significantly distorted by noise. As a result, it is important to determine the point at which an Eigenface becomes useless because it no longer contains any valuable data. In addition to this, the effectiveness of the first Eigenface is called into question. The first few Eigenfaces, it is generally believed, are related to the most prominent angles of lighting in the image set [14]. Therefore, it may be wise to remove these Eigenfaces since it is desired that the Eigenspace is unaffected by the lighting conditions.

A second important variable is whether or not to mask the corners of the image to remove noise. While this is a pretty standard practice, once again the focus of the Eigenfaces being developed is different and may require a different approach. The primary concern of the Eigenfaces, for this system, is to be able to characterize the principal components of the face. To this end, the image was cropped very closely about the face and contains very little of the noisy background scene. This makes it possible that masking the image would remove important portions of the image instead of aiding in removing noise from the image. The use of a mask could also add unexpected


Lastly, an intriguing question arose. The human visual system can easily recognize the difference between male and female faces. Would it be useful to develop Eigenfaces which could be specifically male or female? This would be possible in this system, since user input is required to select the face that is desired for matching. It would be a simple matter to force the user to input a gender with the desired face. Despite the seemingly intuitive nature of this question, very little information seemed to be available on the wisdom of such an approach.

3.4. Distance Measure Results & Selection

After some initial testing to ensure that the Eigenface testing algorithm was working correctly, testing began to determine the most effective distance measurement. With so many variables, taking measurements with more than two different distance metrics would simply create too much data and begin to cause confusion.

In order to draw a useful conclusion, the algorithm was set up in a manner similar to what was found in other papers. The image size was set at 128 x 128 pixels, the images were normalized with histogram equalization, 200 images of unique subjects were used to train the Eigenfaces, and 100 Eigenfaces were used in the distance calculations.


[Figure 6 - Distance Metric Testing Results. Chart title: Comparison of Measurement Metrics, Training Set - Data Set 1. Horizontal axis: Distance Metrics (including Angle, Mahal, L1 & M, L2 & M, M & A).]

While the L1 distance metric scored the highest in the first data set, it was not selected because it did not appear to generalize well across the different data sets. Since the Eigenspace is intended for a real-life, dynamic environment, the most weight was put on the accuracy of the results in measuring Data Set 3. The L1 & Mahalanobis distance metric scored the highest in Data Set 3 and had the highest average score. The tight grouping of the scores and the average above 85% matching accuracy gave it the definitive lead in the selection of the distance metric. The L1, L2, Mahalanobis, L1 & M, and L2 & M metrics are all highly correlated measurements. Since the L1 & Mahalanobis metric was the most successful of these five, the second metric to test was selected as the Mahalanobis & Angle measurement. This measurement scored the second highest for the first data set and was better than the Angle measurement alone.

3.5. Eigenfaces

In order to compute the Eigenfaces for a particular set of data images, the average


in the data set before processing begins. The average face for Data Set 1 is shown below in Figure 7.

Figure 7 - Average Face

Almost all of the calculations in regards to developing the covariance matrix, Eigenvectors, and the resulting Eigenfaces are done with 32 bit floating point precision. In order to maintain this level of accuracy, all Eigenfaces and information stored in the database files for future use are also stored in the 32 bit floating point format. In order to view the Eigenfaces however, it is necessary to scale them to the standard 8 bit grayscale version suitable for a bitmap image.

To this effect, when saving or viewing an Eigenface through the system, a linear scaling is done whereby the lowest value is scaled to 0 and the highest to 255. This scaling has been performed on the first five Eigenfaces and can be seen in Figure 8,


Figure 8 - The First Five Eigenfaces
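The linear scaling used to view an Eigenface as an 8 bit image can be sketched as follows. This is an illustration under the assumption that the Eigenface is held as a NumPy floating point array. The function name `to_uint8` and the sample values are hypothetical.

```python
import numpy as np

def to_uint8(eigenface):
    """Linearly map the lowest value to 0 and the highest to 255."""
    lo, hi = eigenface.min(), eigenface.max()
    if hi == lo:                                   # flat image: nothing to stretch
        return np.zeros(eigenface.shape, dtype=np.uint8)
    scaled = (eigenface - lo) * (255.0 / (hi - lo))
    return np.round(scaled).astype(np.uint8)

# Hypothetical 2 x 2 "Eigenface" with positive and negative entries.
eigenface = np.array([[-1.0, 0.0],
                      [2.0, 4.0]])
viewable = to_uint8(eigenface)
print(viewable)
```

The scaled copy is only for display. Distance calculations would continue to use the floating point Eigenfaces.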

The first Eigenfaces are relatively clear as intended. The features seem to match up quite well and a distinct set of eyes, a nose, and a mouth are clearly visible in each of them. An important feature to note is the general lack of extremes and how the contours on the Eigenfaces appear as one would expect. A good feature to note is the lack of strange artifacts or noise in the corners of the images. The corners of the image contain background image and are not important facial feature data. While it is generally very difficult to determine what principal components each Eigenface characterizes, the second and third Eigenface appear to be related to the horizontal and vertical lighting of the subject. This may be important when moving the algorithm into the real-time system where lighting will by no means be constant.

As the order of the Eigenfaces increases, and their respective Eigenvalues decrease, the images become noisier. This progression is expected because the size of the Eigenvalue determines the relevance of the corresponding Eigenface. Therefore as the Eigenvalue decreases, the corresponding Eigenface should be capturing less significant


Figure 9 - Eigenfaces 50 & 51, 100 & 101, and 150

The 50th and 51st Eigenfaces are still relatively clear face images, but the clarity rapidly continues to degrade until the 150th Eigenface only hints at facial features. While the later Eigenfaces may appear to be worthless images containing largely noise, a


Chapter 4

Evaluation of Principal Component Analysis

The data presented in this section was recorded after initial testing confirmed that the principal component analysis algorithm was working. It was also confirmed that the testing and statistical measurement aspect of the algorithm was working as well. The initial results will not be presented however, since the data does not reflect the final implementation of the algorithm.

While it is often accepted practice to use square images to train and produce the Eigenfaces, it seemed prudent to test the images in a more natural shape for a face as well. To this end, the average length and width of the cropped face images were measured and calculated. It was found that the cropped images were in the ratio of 1.38 to 1 for the height with respect to the width. Using this ratio, several image sizes were selected and subsequently tested.

In order to examine the effectiveness of the system, and how the algorithm generalized across data sets, the system was trained and tested with both Data Set 1 and Data Set 2. The results for the two different training sets were then examined together to help determine the best method for training the final Eigenfaces.

4.1. Data Gathered

In the following section, results were obtained to allow the PCA algorithm to be optimized to produce Eigenfaces which would be effective in the real time implementation. Data was gathered for each of the points of interest outlined in Section


4.1.1 Image Size and Normalization Method

In order to examine the effects of normalization on the images, the first three tests were done using equalization, contrast stretching, and no normalization. Equalization, or histogram equalization, is an image transformation whereby the image is adjusted so that each intensity level is represented by an equal number of pixels in the image [27]. In most cases, this improves the contrast in the image. Unfortunately, it can also cause imperfections in the image to stand out. Contrast stretching attempts to adjust the code values in the image so that a larger range of intensities is used. The new intensity of the pixel can be calculated by

$$I_N(x, y) = \left( I(x, y) - \mathit{Min} \right) \left( \frac{255}{\mathit{Max} - \mathit{Min}} \right)$$

for standard grayscale images. Max and Min are the maximum and minimum intensity levels in the image. For example, if all of the pixels in an 8 bit image fall in the range of 20 to 200, applying contrast stretching linearly scales the pixel values to be in the range of 0 to 255. This increase of seventy five levels of image intensity allows for a better defined image.

The results for the normalization tests are shownb