Face Identification by Human and by Computer: Two Sides of the Same Coin, or Not? Tsuhan Chen


Face Identification by Human and by Computer:

Two Sides of the Same Coin, or Not?

Tsuhan Chen

Carnegie Mellon University

Pittsburgh, USA

TsuhanChen2004

What do you see?


What do you see?


What do you see?

[http://www.palmyra.demon.co.uk]


[Adam Finkelstein, “Mona”]


From Human to Computer


Face Identification: A Generalization Problem

“Single gallery image”

Pattern Recognition

“Single probe image”

Need to generalize for all variations, without observing those variations


Computer vs. Human

Feng-Shui (風水) as an example

Ancient Chinese room arrangement technique

Way 1:

Write down all the rules

Too many and do not generalize

Way 2:

Imagine how a dragon would move through the room to arrange it in a livable manner

Intuitive and creative

Done by Feng-Shui masters

“Biomimetics”?

Neural networks

Among the first biologically motivated pattern recognition (PR) approaches; some success in face detection and recognition


Lesson from Deep Blue

May 11, 1997: Deep Blue beat Kasparov (2 wins, 1 loss, 3 draws), the first match win by a computer over a reigning world chess champion

Deep Blue was not designed to mimic humans. Instead, it was designed to take advantage of a computer’s strengths, i.e., speed and memory

Deep Blue beat Kasparov by memorizing a large amount of information and using table lookup

Kasparov said it best: “quantity is sometimes quality”


Quantity is not always quality

Unfortunately, most object recognition work is under the “Deep Blue” paradigm

Deeper search, more data, faster computers, etc.

Object recognition requires computers to have some more human-like attributes

- Generalization (intuition)
- Adapting from the past
- Bounds on knowledge

Face perception as example

Initial overall examination of external features, followed by a sequential analysis of internal features [Matthews, 1978] [Fraser and Parker, 1986]


“Generalization”

Banca Database

(ICPR2004)

“Controlled”
- Studio lighting
- High quality camera
- Minimal pose variation

“Degraded”
- Varying lighting
- Low quality web camera
- Some pose variation

“Adverse”
- Varying lighting
- High quality camera
- Noticeable pose and other variations


DCT/Gabor Transform

“Parts” Representation Of The Face
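One way to read the “parts” representation is as block-wise feature extraction: cut the face image into small blocks and keep a few low-frequency DCT coefficients per block. The sketch below is a minimal pure-Python illustration under assumptions that are not stated on the slide: non-overlapping 8×8 blocks, an orthonormal DCT-II, and a hypothetical `keep` parameter for how many coefficients per axis are retained. The names `dct_2d` and `block_features` are illustrative only.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, v in enumerate(values))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    # cols[v][u] holds vertical frequency u, horizontal frequency v;
    # transpose back so the result is indexed as [u][v].
    return [list(row) for row in zip(*cols)]


def block_features(image, block=8, keep=3):
    """Return one feature vector per non-overlapping block (one "part").

    Assumption: the top-left keep x keep DCT coefficients summarize a block.
    """
    height, width = len(image), len(image[0])
    features = []
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            patch = [row[left:left + block] for row in image[top:top + block]]
            coeffs = dct_2d(patch)
            features.append([coeffs[u][v]
                             for u in range(keep) for v in range(keep)])
    return features
```

Each returned vector describes one local part of the face; a Gabor filter bank could be substituted for the DCT without changing the surrounding structure.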


GMM

EM Learner

Estimating Parametric Model

DCT/Gabor Transform
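The “EM Learner” box estimates the parameters of a Gaussian mixture model (GMM) from the transformed features. As a rough illustration of that step, the sketch below runs expectation-maximization on scalar data. It is a simplification made for brevity: real DCT/Gabor features are vectors, and the initial means, iteration count, and variance floor are hypothetical choices, not values from the talk.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, init_means, iterations=50, var_floor=1e-6):
    """Fit a 1-D GMM with EM; returns (weights, means, variances)."""
    k, n = len(init_means), len(data)
    means = list(init_means)
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, var_floor)
    variances = [overall_var] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint] if total > 0
                        else [1.0 / k] * k)

        # M-step: re-estimate weight, mean, and variance per component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj <= 0:
                continue  # component received no mass; leave it unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj,
                var_floor,
            )
    return weights, means, variances
```

The variance floor keeps a component from collapsing onto a single sample, a common failure mode when little training data is available per subject.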


Combining Representations

Hypothesis

Monolithic representation: low-frequency information

Parts representation: both low- and high-frequency information

Weighting can be view-dependent

Combine the scores with sum rule

Input face → LDA-COS and FSC-GMM → + → COMB
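The sum-rule combination of a holistic score and a parts score can be sketched as follows. This is an illustrative reading, not the talk's implementation: the min-max normalization step, the function names, and the single weight `w_holistic` (which could be made view-dependent) are assumptions, and the example scores are invented.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1] so two matchers become comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.5 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def sum_rule(holistic_scores, parts_scores, w_holistic=0.5):
    """Weighted sum of normalized scores, one entry per gallery identity."""
    if len(holistic_scores) != len(parts_scores):
        raise ValueError("both matchers must score the same identities")
    h = min_max_normalize(holistic_scores)
    p = min_max_normalize(parts_scores)
    w_parts = 1.0 - w_holistic
    return [w_holistic * a + w_parts * b for a, b in zip(h, p)]


def best_identity(fused_scores):
    """Index of the gallery identity with the highest fused score."""
    return max(range(len(fused_scores)), key=fused_scores.__getitem__)


# Hypothetical scores for three gallery identities (higher is better).
holistic = [0.2, 0.9, 0.4]        # e.g. cosine similarities
parts = [-120.0, -80.0, -60.0]    # e.g. log-likelihoods
fused = sum_rule(holistic, parts, w_holistic=0.5)
```

Normalizing first matters because a cosine similarity and a log-likelihood live on very different scales; without it the sum would be dominated by whichever matcher has the larger numeric range.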

Comparative Results

Algorithm/Protocol   Mc     Ud     Ua     P
LDA-NC [1]           4.93   15.99  20.24  14.79
ORG-SVM [1]          5.43   25.43  30.11  20.33
PCA-MAH              10.2   17.84  26.63  21.57
LDA-COS              6.46   10.99  20.39  14.96
FSC-GMM              2.14   24.78  17.06  21.97
COMB                 1.42   9.65   16.51  12.52

[1] M. Sadeghi, J. Kittler, A. Kostin, and K. Messer, “A comparative study of automatic face …”


Some Motivations


Holistic vs. Parts


‘Thatcher Illusion’

[Thompson, 1980]

‘Thatcher Illusion’


Holistic vs. Parts

Parts of faces are [Tanaka and Farah, 1993]
- easily recognized in typical whole-face configuration
- less easily in new configuration
- most poorly recognized in isolation

Chin differences detected first [Sargent, 1984]
- Not as obvious when faces are inverted

These suggest
- Face perception is holistic and by parts
- Orientation is important


“Bounds on Knowledge” and “Adapt from Past”


Bounds on Knowledge

"The only true wisdom is in knowing you know nothing."

Socrates (470-399 B.C.)

Computers are nowhere near this yet. A computer thinks it knows every conceivable variation (but in fact is limited to what has been programmed into it)

Adapt from Past

Humans are good at adapting using past experience. Can computers do the same?

Yes, it is called relevance adaptation (RA)

- Previously used in speech recognition
- Obtains a subject-dependent model from a subject-independent average distribution (the past), using a small amount of adaptation data
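The adaptation step described above can be illustrated with the mean-update rule commonly used for GMM adaptation in speaker recognition. The slide gives no formula, so everything here is an assumption: a 1-D mixture, adaptation of the means only, and a hypothetical `relevance` factor that controls how strongly the prior (the subject-independent model) resists the new data.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def relevance_adapt_means(weights, means, variances, data, relevance=16.0):
    """Shift subject-independent GMM means toward a subject's data.

    For component j with soft count n_j and data mean E_j:
        alpha_j  = n_j / (n_j + relevance)
        new_mean = alpha_j * E_j + (1 - alpha_j) * old_mean
    """
    k = len(means)
    counts = [0.0] * k
    sums = [0.0] * k
    for x in data:
        joint = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        if total <= 0:
            continue  # sample is implausible under every component
        for j in range(k):
            r = joint[j] / total
            counts[j] += r
            sums[j] += r * x

    adapted = []
    for j in range(k):
        if counts[j] == 0.0:
            adapted.append(means[j])  # no evidence: keep the prior mean
            continue
        alpha = counts[j] / (counts[j] + relevance)
        adapted.append(alpha * (sums[j] / counts[j]) + (1.0 - alpha) * means[j])
    return adapted
```

Components that see little adaptation data stay close to the prior, which is what allows a usable subject-dependent model to be built from only a few images.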


No Relevance Adaptation


Relevance Adaptation


Another Aspect: 3D/Video

3/4-View

Frontal/profile views result in poorer recognition by humans than 3/4-view for unfamiliar faces [Baddeley & Woodhead, 1981; Bruce, 1982]

3/4-view looks good too!

[Figure panels: frontal view, ¾ view, ¾ view]


“Face Mosaic”



Face in Video

Moving faces are recognized significantly better by humans than still images

Movement provides 3D structure of the face and allows recognition of facial gestures [Knight and Johnston, 1997]

Under pixelation or blurring, moving images of faces are recognized better than still images [Lander et al., 1999] (perhaps “masking” or “super-resolution”)

Face Recognition from Video

Computers can use video too

More than simple majority voting or frame selection

Integration of temporal/motion/geometry information

Updating over time

Most variations are continuous (at 30Hz): pose, illumination, expression, registration, etc.
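To make the contrast between majority voting and updating over time concrete, the sketch below compares a per-frame vote with a recursive Bayesian update of the identity posterior. It is only an illustration: the per-frame likelihoods are invented, the function names are hypothetical, and the update treats frames as conditionally independent, which ignores the temporal continuity noted in the list above.

```python
from collections import Counter


def majority_vote(frame_likelihoods):
    """Pick the identity that wins the most individual frames."""
    votes = Counter(max(range(len(f)), key=f.__getitem__)
                    for f in frame_likelihoods)
    return votes.most_common(1)[0][0]


def update_posterior(posterior, likelihood):
    """One Bayesian update: posterior is proportional to prior times likelihood."""
    unnormalized = [p * l for p, l in zip(posterior, likelihood)]
    total = sum(unnormalized)
    if total == 0:
        return list(posterior)  # uninformative frame; keep previous belief
    return [u / total for u in unnormalized]


def identify_from_video(frame_likelihoods, prior=None):
    """Accumulate evidence frame by frame; return (best identity, posterior)."""
    n = len(frame_likelihoods[0])
    posterior = list(prior) if prior is not None else [1.0 / n] * n
    for likelihood in frame_likelihoods:
        posterior = update_posterior(posterior, likelihood)
    best = max(range(n), key=posterior.__getitem__)
    return best, posterior


# Hypothetical likelihoods for 3 identities over 3 frames.
frames = [
    [0.5, 0.4, 0.1],   # weak preference for identity 0
    [0.5, 0.4, 0.1],   # weak preference for identity 0
    [0.1, 0.8, 0.1],   # strong evidence for identity 1
]
```

With these numbers the vote follows the two weak frames, while the accumulated posterior is dominated by the one confident frame, so the two procedures return different identities.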


Face-in-Action (FiA) Database


Other aspects to be explored...


Stages of Face Identification

Face → Identity → Name [Young et al., 1982]

Common situations:

Case 1: Cannot recognize the face

Case 2: “The face looks familiar” without identity

Case 3: Identify the face (e.g., occupation) but can’t recall the name

Own-Race Bias

We are better at identifying faces belonging to races with which we are familiar [Shapiro and Penrod, 1986]


Own-Race Bias


Independent Modules

Facial expression identified independently of face identity [Bruce, 1986; Young et al., 1986]

Prosopagnosia patients can still identify facial emotion

Some patients cannot identify facial emotion, but could identify famous faces


Independent Modules

McGurk effect

A prosopagnosia patient can still experience the McGurk effect [Campbell et al., 1986], suggesting that holistic face recognition is affected, but not by-part recognition

Audio + Visual → Perceived
ba + ga → da
pa + ga → ta
ma + ga → na

Internet Psychology Lab http://kahuna.psych.uiuc.edu//ipl [McGurk and MacDonald ‘76]


[Baker and Kanade, “Hallucinating Faces”]

How many samples for a face?

One Single Image

Number of all possible 16×12 images:

$$2^{16 \times 12 \times 8} \gg 30 \times 60 \times 60 \times 24 \times 365 \times (\text{human history}) \times (\text{world population}) \gg \text{number of all possible face images}$$

Reconstruct

→ Power of prior; “adapt from past”
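The first inequality in the counting argument is easy to check with exact integer arithmetic. The sketch below assumes round illustrative values for world population and the length of human history in years; neither number comes from the slide, and the conclusion does not depend on them in any sensitive way.

```python
def frames_ever_seen(world_population, human_history_years, frames_per_second=30):
    """Upper bound on images seen if everyone watched 30 frames/s, all year."""
    seconds_per_year = 60 * 60 * 24 * 365
    return frames_per_second * seconds_per_year * human_history_years * world_population


def possible_images(width=16, height=12, bits_per_pixel=8):
    """Number of distinct grayscale images at the given resolution."""
    return 2 ** (width * height * bits_per_pixel)


# Hypothetical round numbers, chosen only for illustration.
WORLD_POPULATION = 6_000_000_000
HUMAN_HISTORY_YEARS = 10_000

seen = frames_ever_seen(WORLD_POPULATION, HUMAN_HISTORY_YEARS)
total = possible_images()

print("frames ever seen:   ", seen)
print("bits needed for the image count:", total.bit_length())
print("image space is larger:", total > seen)
```

The number of frames fits comfortably in a 128-bit integer, while the count of 16×12 8-bit images needs more than 1,500 bits, which is the gap that a strong prior over faces has to bridge.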

Some Art Work

12 x 16 LEDs, 8-bit Grayscale [Jim Campbell, “Portrait of a Portrait of Harry Nyquist”]


More…

12 x 16 LEDs, 8-bit Grayscale [Jim Campbell, “Portrait of a Portrait of Claude Shannon”]

Finally…

“The most compelling shapes are those near to our hearts: people’s faces, a gracefully moving body, a natural scene with rustling leaves and flowing water. Evolution has tuned us to these sights. By combining vision and graphics, capturing and creating images of these scenes may soon be within reach. …”


Try this…

[http://www.palmyra.demon.co.uk]


Advanced Multimedia Processing Lab

Please visit us at: http://amp.ece.cmu.edu