RIT Scholar Works
Theses
Thesis/Dissertation Collections
10-3-2000
Eye movements and natural tasks in an extended
environment
Roxanne Canosa
Eye Movements and Natural Tasks in an Extended Environment
Roxanne Canosa
B.S. State University of New York College at Brockport (1998)
A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in the Center for Imaging Science in the College of Science, Rochester Institute of Technology
October 3, 2000
Signature of Author: Roxanne Canosa
Accepted by
THESIS RELEASE PERMISSION
ROCHESTER INSTITUTE OF TECHNOLOGY
COLLEGE OF SCIENCE
Eye Movements and Natural Tasks in an Extended Environment
I, Roxanne Canosa, hereby grant permission to the Wallace Memorial Library of RIT to reproduce my thesis in whole or in part. Any reproduction will not be for commercial use or profit.
Signature of Author: Roxanne Canosa
CHESTER F. CARLSON CENTER FOR IMAGING SCIENCE
COLLEGE OF SCIENCE
ROCHESTER INSTITUTE OF TECHNOLOGY
ROCHESTER, NEW YORK
CERTIFICATE OF APPROVAL
M.S. DEGREE THESIS
The M.S. Degree of Roxanne Canosa has been examined and approved by the thesis committee as satisfactory for the thesis requirement for the Master of Science degree in Imaging Science
Dr. Jeff B. Pelz, Thesis Advisor
Dr. Eriko Miyahara
Abstract
Eye movements can be thought of as a window onto pre-conscious thought. Patterns of visual fixations over time as well as space can reveal cognitive strategies that are not amenable to conscious control or verbalization. A spatial analysis of an eye movement trace usually emphasizes the role that eye movements have in moving the retinal image of an object of interest from the periphery to the fovea for closer inspection. It is generally believed that a sequence of fixations across a region of space builds up the perception of a high-resolution field of view everywhere. Recent studies have shown that this perception is largely illusory. The visual-perceptual system prefers to maintain a limited internal representation of physical objects in the world and uses the environment as an external source of information, accessing the information only at the time it is needed.

The goal of this research effort was to investigate the role that eye movements have in the performance of everyday tasks in a natural environment. A series of four experiments was conducted that represents an attempt to step away from the classical psychophysical approach of studying eye movements within the confines and control of the laboratory. There exists little precedent for this kind of approach, partly because past research efforts have emphasized a linear systems method to render the analysis tractable, and partly because the technology that is required to perform these experiments has not existed until recently. The hardware that was developed by the Visual Perception Laboratory at RIT specifically addresses the portability concerns that are crucial for successfully studying eye movements during natural tasks in a non-linear extended environment.

A model was developed to describe the temporal sequencing of eye movements in terms of a hierarchical structure of goal-oriented tasks, with individual fixations considered the lowest level of the hierarchy. The analysis gives evidence for the sequencing of eye movements based on a desire to maximize the efficiency of task performance over time by anticipating future activities. The purpose of this sequencing is to enhance interaction with the world under conditions of limited memory representations rather than to create the
I would like to thank and acknowledge all those who helped to make this research project possible. Thanks to Jason Babcock for the countless hours spent developing and fine-tuning the eye-tracker, and for his help in recruiting and running subjects. Thanks also to Amy Silver for her coding expertise and her insights into what constitutes a good track, and to Diane Kucharczyk for her lab experience. I would also like to thank all those who volunteered to be subjects for the experiments, and those who tolerated the sometimes odd intrusions into the hallways and bathrooms. Thanks to Jeff Pelz, Eriko Miyahara, and Mary Hayhoe for their thoughtful insights and recommendations. Finally, thanks to my family for
Table of Contents
List of Figures
List of Tables
1. Introduction
1.1 Overview
1.2 Objectives (Statement of Work)
2. Background
2.1 Historical Perspective
2.2 Eye Movements
2.3 Visual Attention and Selection
2.3.1 Saliency Maps
2.3.2 Feature Integration Theory of Attention
2.4 The World as Anchor
2.4.1 Semantic Consistency
2.4.2 Change Blindness
2.4.3 Exocentric Reference Frames
2.4.4 Position Constancy During Passive Movement
2.5 Perceiving the Direction of Heading During Motion
2.5.1 Retinal vs. Extra-retinal Information
2.5.2 Differential Motion Parallax
2.6 The Effects of Freeing the Head
2.7 Natural Tasks
2.7.1 Memory Representation of a Simple Natural Task - Blocks Copying
2.7.2 Sequential Looking Task - Tapping vs. Looking Only
2.7.3 Visual Memory in Problem Solving - Geometry
2.7.4 Eye Movements and Work Load During Driving
2.7.5 The Direction of Gaze During Driving
2.7.6 Eye Movements While Making Tea
3. Approach
3.1 History of Eye-tracking Methods
3.1.1 Electrical Methods
3.1.2 Optical Methods
3.2 The VPL Portable, Wearable Eye-Tracker
3.2.1 The Custom Goggles Headgear
3.2.2 Other System Components
3.2.3 Theory of Operation
3.2.4 Eye-Tracker Set-Up and Calibration
3.2.5 Eye Movement Monitoring
3.3 Example of Real-Time Data Capture
3.4 Data Analysis
3.4.1 Coding the Data
3.4.2 Fixation Durations and Saccade Size
3.4.3 Statistical Analysis
4. Experiment 1 - Moving Through a Hallway
4.1 Objective
4.2 Experimental Design and Conditions
4.3 Data Analysis and Results
4.4 Conclusion
5. Experiment 2 - Fixation Stability
5.1 Objective
5.2 Experimental Design and Conditions
5.3 Data Analysis and Results
5.4 Conclusion
6. Experiment 3 - Handwashing
6.1 Objective
6.2 Experimental Design and Conditions
6.3 Data Analysis and Results
6.3.1 Fixation Durations
6.3.2 Saccade Size
6.3.3 Major Sub-tasks
6.3.4 "Look-aheads"
6.4 Conclusion
7. Experiment 4 - Making a Selection From a Vending Machine
7.1 Objective
7.2 Experimental Design and Conditions
7.3 Data Analysis and Results
7.3.1 Fixation Durations
7.3.2 Saccade Size
7.3.3 Comparison of Handwashing and Vending Machine Experiments
7.3.5 "Look-aheads"
7.4 Conclusion
8. Conclusion and Recommendations
8.1 The Effect of a Real Environment
8.2 Eye Movements Extended Over Time
8.3 Applications for Artificial Systems
8.4 Recommendations for Future Work
Appendix
List of Figures
Figure 2-1 Humans have a perceptual bias toward seeing the triangle as a whole
Figure 2-2 Scanpaths are task dependent
Figure 2-3 The distribution of rods and cones is unevenly distributed across the retina
Figure 2-4 Vergence eye movement
Figure 2-5 Experimental conditions for sufficiency of retinal information
Figure 2-6 The effect of freeing the head on fixation stability
Figure 2-7 The effects of freeing the head on saccades
Figure 2-8 Blocks copying task
Figure 2-9 Eye movement strategies used for blocks copying task
Figure 2-10 Tapping vs. Looking only
Figure 2-11 Fixation durations for expert and novice drivers, for several conditions
Figure 2-12 Fixation duration as a function of task difficulty for a driving task
Figure 3-1 Portable, wearable eye-tracking headgear
Figure 3-2 Optics module (HMO) of headgear
Figure 3-3 Top view of headgear
Figure 3-6 Close up front panel of control unit
Figure 3-7 Subject wearing eye-tracking gear, ready to perform an experiment
Figure 3-8 Image of pupil (white outline) and corneal reflection (black outline)
Figure 3-9 Calculation of the line of gaze
Figure 3-10 Diffraction pattern used for calibration
Figure 3-11 Real Data Capture
Figure 3-12 Trace of vertical eye position
Figure 3-13 Trace of horizontal eye position
Figure 3-14 Expanded view of vertical eye position
Figure 3-15 Expanded view of horizontal eye position
Figure 3-16 Eye-tracker noise, no averaging
Figure 3-17 Eye-tracker noise, 2 field averaging
Figure 3-18 Eye-tracker noise, 4 field averaging
Figure 3-19 Eye-tracker noise, 8 field averaging
Figure 3-20 Calculation of visual angle from field of view
Figure 3-21 The gamma density function with B = 1 and A = 2
Figure 4-1 Simulated condition set-up for Exp. 1
Figure 4-2 First hallway for Exp. 1
Figure 4-3 Second hallway for Exp. 1
Figure 4-4 Third hallway for Exp. 1
Figure 4-5 Fourth hallway for Exp. 1
Figure 4-6 Simulated/Active
Figure 4-8 Real/Active
Figure 4-9 Real/Passive
Figure 4-10 Fixation Durations - Simulated/Active - JB
Figure 4-11 Fixation Durations - Simulated/Active - JP
Figure 4-12 Fixation Durations - Simulated/Active - MA
Figure 4-13 Fixation Durations - Simulated/Active - RC
Figure 4-14 Fixation Durations - Simulated/Passive - JB
Figure 4-15 Fixation Durations - Simulated/Passive - JP
Figure 4-16 Fixation Durations - Simulated/Passive - MA
Figure 4-17 Fixation Durations - Simulated/Passive - RC
Figure 4-18 Fixation Durations - Real/Active - JB
Figure 4-19 Fixation Durations - Real/Active - JP
Figure 4-20 Fixation Durations - Real/Active - MA
Figure 4-21 Fixation Durations - Real/Active - RC
Figure 4-22 Fixation Durations - Real/Passive - JB
Figure 4-23 Fixation Durations - Real/Passive - JP
Figure 4-24 Fixation Durations - Real/Passive - MA
Figure 4-25 Fixation Durations - Real/Passive - RC
Figure 4-26 Fixation Durations - Strings of Length x - Simulated/Active - JB
Figure 4-27 Fixation Durations - Strings of Length x - Simulated/Passive - JB
Figure 4-28 Fixation Durations - Strings of Length x - Simulated/Active - JP
Figure 4-29 Fixation Durations - Strings of Length x - Simulated/Passive - JP
Figure 4-30 Fixation Durations - Strings of Length x - Simulated/Active - MA
Figure 4-31 Fixation Durations - Strings of Length x - Simulated/Passive - MA
Figure 4-32 Fixation Durations - Strings of Length x - Simulated/Active - RC
Figure 4-33 Fixation Durations - Strings of Length x - Simulated/Passive - RC
Figure 4-34 Fixation Durations - Strings of Length x - Real/Active - JB
Figure 4-35 Fixation Durations - Strings of Length x - Real/Passive - JB
Figure 4-36 Fixation Durations - Strings of Length x - Real/Active - JP
Figure 4-37 Fixation Durations - Strings of Length x - Real/Passive - JP
Figure 4-38 Fixation Durations - Strings of Length x - Real/Active - MA
Figure 4-39 Fixation Durations - Strings of Length x - Real/Passive - MA
Figure 4-40 Fixation Durations - Strings of Length x - Real/Active - RC
Figure 4-41 Fixation Durations - Strings of Length x - Real/Passive - RC
Figure 4-42 Saccade Size - Simulated/Active - JB
Figure 4-43 Saccade Size - Simulated/Active - JP
Figure 4-44 Saccade Size - Simulated/Active - MA
Figure 4-45 Saccade Size - Simulated/Active - RC
Figure 4-46 Saccade Size - Simulated/Passive - JB
Figure 4-47 Saccade Size - Simulated/Passive - JP
Figure 4-48 Saccade Size - Simulated/Passive - MA
Figure 4-49 Saccade Size - Simulated/Passive - RC
Figure 4-50 Saccade Size - Real/Active - JB
Figure 4-51 Saccade Size - Real/Active - JP
Figure 4-52 Saccade Size - Real/Active - MA
Figure 4-53 Saccade Size - Real/Active - RC
Figure 4-54 Saccade Size - Real/Passive - JB
Figure 4-55 Saccade Size - Real/Passive
Figure 4-56 Saccade Size - Real/Passive - MA
Figure 4-57 Saccade Size - Real/Passive - RC
Figure 4-58 Fixation Duration vs. Saccade Size - Simulated/Active - JB
Figure 4-59 Fixation Duration vs. Saccade Size - Simulated/Passive - JB
Figure 4-60 Fixation Duration vs. Saccade Size - Simulated/Active - JP
Figure 4-61 Fixation Duration vs. Saccade Size - Simulated/Passive - JP
Figure 4-62 Fixation Duration vs. Saccade Size - Simulated/Active - MA
Figure 4-63 Fixation Duration vs. Saccade Size - Simulated/Passive - MA
Figure 4-64 Fixation Duration vs. Saccade Size - Simulated/Active - RC
Figure 4-65 Fixation Duration vs. Saccade Size - Simulated/Passive - RC
Figure 4-66 Fixation Duration vs. Saccade Size - Real/Active - JB
Figure 4-67 Fixation Duration vs. Saccade Size - Real/Passive - JB
Figure 4-68 Fixation Duration vs. Saccade Size - Real/Active - JP
Figure 4-69 Fixation Duration vs. Saccade Size - Real/Passive - JP
Figure 4-70 Fixation Duration vs. Saccade Size - Real/Active - MA
Figure 4-71 Fixation Duration vs. Saccade Size - Real/Passive - MA
Figure 4-72 Fixation Duration vs. Saccade Size - Real/Active - RC
Figure 4-73 Fixation Duration vs. Saccade Size - Real/Passive - RC
Figure 5-1 Set-up for Condition 1: Small Screen
Figure 5-2 Set-up for Condition 2: Large Screen
Figure 5-3 Set-up for Condition 3: Real Walking
Figure 5-4 Fixation stability for target on right for subject JB
Figure 5-5 Fixation stability for target straight ahead for subject JB
Figure 5-7 Fixation stability for target straight ahead for subject JK
Figure 5-8 Fixation stability for small screen condition
Figure 5-9 Fixation stability for large screen condition
Figure 5-10 Fixation stability for real walking condition
Figure 6-1 Men's washroom used for Exp. 3
Figure 6-2 Women's washroom used for Exp. 3
Figure 6-3 Fixation Durations - Handwashing - Subject AS
Figure 6-4 Fixation Durations - Handwashing - Subject JB
Figure 6-5 Fixation Durations - Handwashing - Subject JP
Figure 6-6 Fixation Durations - Handwashing - Subject MH
Figure 6-7 Fixation Durations - Not Including Wash Hands - Subject AS
Figure 6-8 Fixation Durations - Not Including Wash Hands - Subject JB
Figure 6-9 Fixation Durations - Not Including Wash Hands - Subject JP
Figure 6-10 Fixation Durations - Not Including Wash Hands - Subject MH
Figure 6-11 Saccade Size - Handwashing - Subject AS
Figure 6-12 Saccade Size - Handwashing - Subject JB
Figure 6-13 Saccade Size - Handwashing - Subject JP
Figure 6-14 Saccade Size - Handwashing - Subject MH
Figure 6-15 Saccade Size - Not Including Wash Hands - Subject AS
Figure 6-16 Saccade Size - Not Including Wash Hands - Subject JB
Figure 6-17 Saccade Size - Not Including Wash Hands - Subject JP
Figure 6-18 Saccade Size - Not Including Wash Hands - Subject MH
Figure 6-19 Handwashing - Subject AS
Figure 6-20 Handwashing - Subject JB
Figure 6-21 Handwashing - Subject JP
Figure 6-22 Handwashing - Subject MH
Figure 6-23 Handwashing - Subject AS
Figure 6-24 Handwashing - Subject JB
Figure 6-25 Handwashing - Subject JP
Figure 6-26 Handwashing - Subject MH
Figure 6-27 Elapsed time between object fixation and object manipulation for Handwashing - Subject AS
Figure 6-28 Elapsed time between object fixation and object manipulation for Handwashing - Subject JB
Figure 6-29 Elapsed time between object fixation and object manipulation for Handwashing - Subject JP
Figure 6-30 Elapsed time between object fixation and object manipulation for Handwashing - Subject MH
Figure 6-31 JB's first fixation on garbage can
Figure 6-32 JB's first manipulation of garbage can, 7.1 seconds later
Figure 6-33 Handwashing - Subject AS
Figure 6-34 Handwashing - Subject JB
Figure 6-35 Handwashing - Subject JP
Figure 6-36 Handwashing - Subject MH
Figure 7-1 The alcove with the four vending machines used in Experiment 4
Figure 7-2 Coffee machine
Figure 7-3 Soda machine
Figure 7-4 Candy/chip machine
Figure 7-5 Sandwich machine
Figure 7-6 Fixation Durations - Vending Machine - Subject AS
Figure 7-7 Fixation Durations - Vending Machine - Subject JB
Figure 7-8 Fixation Durations - Vending Machine - Subject JT
Figure 7-9 Fixation Durations - Vending Machine - Subject MH
Figure 7-10 Saccade Size - Vending Machine - Subject AS
Figure 7-11 Saccade Size - Vending Machine - Subject JB
Figure 7-12 Saccade Size - Vending Machine - Subject JT
Figure 7-13 Saccade Size - Vending Machine - Subject MH
Figure 7-14 Vending Machine - Soda Machine - Subject AS
Figure 7-15 Vending Machine - Sandwich Machine - Subject JB
Figure 7-16 Vending Machine - Candy/Chip Machine - Subject JT
Figure 7-17 Vending Machine - Coffee Machine - Subject MH
Figure 7-18 Vending Machine - Subject AS
Figure 7-19 Vending Machine - Subject JB
Figure 7-20 Vending Machine - Subject JT
Figure 7-21 Vending Machine - Subject MH
Figure 7-22 Elapsed time between object fixation and object manipulation for Vending Machine - Subject AS
Figure 7-23 Elapsed time between object fixation and object manipulation for Vending Machine - Subject JB
Figure 7-24 Elapsed time between object fixation and object manipulation for Vending Machine - Subject JT
Figure 7-25 Elapsed time between object fixation and object manipulation for Vending Machine - Subject MH
Figure 7-26 Vending Machine - Subject AS
Figure 7-27 Vending Machine - Subject JB
Figure 7-28 Vending Machine - Subject JT
Figure 7-29 Vending Machine - Subject MH
Figure 7-30 JB looks at exit of candy/chip machine at time code of 00:08:05:02
Figure 7-31 JB retrieves purchase from machine at time code of 00:08:11:26, 6.8 seconds later
Figure 7-32 MH looks at coffee cup lids at a time code of 00:10:03:27
Figure 7-33 MH looks at coffee cup lids at a time code of 00:10:40:10
Figure 7-34 MH retrieves coffee cup lid at a time code of 00:10:56:27, 53 seconds after first locating the lids, and 16.5 seconds after a second look to the lids
Figure 7-35 Fixation Durations and Subtasks - Vending Machine - Subject AS
Figure 7-36 Fixation Durations and Subtasks - Vending Machine - Subject JB
Figure 7-37 Fixation Durations and Subtasks - Vending Machine - Subject JT
Figure 7-38 Fixation Durations and Subtasks - Vending Machine - Subject MH
Figure 7-39 Fixation Durations and Subtasks - Handwashing - Subject AS
Figure 7-40 Fixation Durations and Subtasks - Handwashing - Subject JB
Figure 7-41 Fixation Durations and Subtasks - Handwashing - Subject JP
Figure 7-42 Fixation Durations and Subtasks - Handwashing - Subject MH
Figure 7-43 Saccade Size and Subtasks - Vending Machine - Subject AS
Figure 7-44 Saccade Size and Subtasks - Vending Machine - Subject JB
Figure 7-45 Saccade Size and Subtasks - Vending Machine - Subject JT
Figure 7-46 Saccade Size and Subtasks - Vending Machine - Subject MH
Figure 7-47 Saccade Size and Subtasks - Handwashing - Subject AS
Figure 7-48 Saccade Size and Subtasks - Handwashing - Subject JB
Figure 7-49 Saccade Size and Subtasks - Handwashing - Subject JP
Figure 7-50 Saccade Size and Subtasks - Handwashing - Subject MH

List of Tables
Table 2-1 Some computations can be simplified by making assumptions about behavior
Table 3-1 Example of code from eye-tracking experiment
Table 7-1 Comparison of fixation durations for handwashing and vending machine

1. Introduction
1.1 Overview
The purpose of vision is to serve the needs of the individual. As an individual goes about performing day-to-day activities, the visual system is continually monitoring the environment to provide information about each interaction; information that enables meaningful interactions with that environment for the fulfillment of a plan of action. In this sense, vision is not a passive process whereby information is merely collected, processed, and stored for later retrieval, but rather an active process that integrates goal-oriented behavior with proprioceptive signals from the individual's physical state, and exteroceptive information about the layout of the environment.

Visual perception is essentially a selective process. The particular sequence of selections is largely dependent upon the task to be performed, and as such is driven by the goals of the individual, but each discrete selection occurs mostly at a subconscious level.

Eye movements are one of the mechanisms by which the selection process proceeds. The angle is surrounded by a low resolution periphery. An eye movement is required to bring an object of interest to the fovea, and is the means for sustaining overt attention on the object. The purpose of eye movements thus appears to be that of allowing for the impression of a broad, high resolution visual field from multiple sequential fixations. This observation, based on the verifiable physiology of the human eye, does not adequately offer an answer to a central question regarding the role of eye movements in visual perception: where is attention to be focused next?

This research effort is largely concerned with providing a framework within which that question may be approached. There is obviously not a single answer that will apply to every situation requiring focused visual attention, but it is possible to extract a certain amount of commonality in everyday tasks that gives rise to particular patterns of selection.
A primary objective of this research is to study the eye movements of subjects while they perform everyday tasks in a natural environment. Much of what is currently understood about human eye movements, and also about visual perception in general, is based on psychophysical studies conducted in the confines of a laboratory setting. Since humans did not evolve their sensory-perceptual abilities in such a restricted environment, it is valid to question whether or not the results obtained from such studies apply in a practical sense. It is also possible that subjects may exhibit an unconscious bias while performing in a laboratory, providing results that are valid in the laboratory, but not necessarily so in the world outside of it.

The psychophysical "black-box" approach to studying eye movements in isolation, and not in the context of a rich, interactive environment, suffers from other methodological
input can be isolated from all other possible inputs, its effect on the outcome can be precisely measured and quantified. All such inputs, when taken together, describe the system response. In this context, linearity refers to the idea that the whole is equal to the sum of its parts. The assumption of linearity as applied to human eye movements has not been adequately shown to be valid. A secondary goal of this research project is to provide grounds either for or against that assumption. The means for doing this is provided by the RIT portable, wearable eye-tracker, which was developed for the purpose of studying subjects' eye movements while they are performing common, everyday tasks in a natural, unrestricted environment.
A portable, head-mounted eye-tracker was used for this research, as well as hardware and software that enabled a computation of the line-of-gaze for a subject who is wearing the eye-tracker and performing tasks in a natural environment. The line-of-gaze is displayed as a cursor superimposed on a video scene of the environment as seen by the subject. Data analysis of the cursor position as a function of time correlates to eye movements and affords an indirect method of determining the cognitive processes underlying visual perception.

A final objective of this research was to consider the implications of a sequential fixation strategy and of non-uniform sampling, or foveation, for an artificial vision system. Researchers in robotics and computer vision often consider the human visual system as a model for artificial vision systems. It would be beneficial to be able to describe the high-level cognitive processes underlying visual perception in a way that would be amenable to a computer program. Active computer vision is an area of current research, and much work
those algorithms.
In summary, the objectives of this research project are three-fold: to consider the effects of freeing the subject from the restraints of the laboratory setting during eye-tracking experiments, to develop a framework for describing the temporal sequencing of fixations across tasks as well as within tasks, and to evaluate the appropriateness of such a framework for serving as a model for an artificial vision system. Since the purpose of vision is to serve the needs of the individual, it seems reasonable to conclude that a hypothesis about where to look next can best be formulated by considering the cooperative relationship between vision

Following are the objectives mandated for this research:
a) Conduct a literature review of the subject. Related topics are: eye movements, visuo-motor coordination, selective attention, plan schemata, active vision, and animate vision.
b) Design a series of experiments to monitor subjects' eye movements as they perform a range of common, everyday tasks selected to gain an understanding of the interaction between vision and action. Such tasks include:
   i) Walking along a corridor, being pushed in a wheelchair, and watching a videotape of someone walking along a corridor or being pushed in a wheelchair
   ii) Maintaining fixation on an object while walking along a corridor
   iii) Washing one's hands in a lavatory
   iv) Making a selection from a vending machine
c) Recruit subjects and carry out the experimentation.
d) Analyze the data collected in terms of eye movement metrics. Examples of such metrics are: fixation duration, number of fixations, saccade length, saccades per second, etc.
e) Study the results to determine the pre-conscious strategies used by individuals as they performed the tasks.
f) Modify and repeat the experimentation and data analysis as necessary to investigate any interesting or emergent pattern of oculomotor behavior.
g) Formulate conclusions based on results. Demonstrate the usefulness of results, and
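
The eye movement metrics named in objective (d) can be illustrated with a minimal sketch. It assumes, purely for illustration, that the coded data have already been reduced to a time-ordered list of fixations, each with onset and offset times in seconds and a gaze position in degrees of visual angle. The Fixation record, its field names, and the summarize function are hypothetical and are not taken from the analysis software used in this research. Saccade size is approximated here as the straight-line distance between consecutive fixation positions.

```python
from dataclasses import dataclass
from math import hypot
from statistics import mean


@dataclass
class Fixation:
    """One coded fixation (hypothetical record layout, for illustration only)."""
    start_s: float  # fixation onset, in seconds
    end_s: float    # fixation offset, in seconds
    x_deg: float    # horizontal gaze position, degrees of visual angle
    y_deg: float    # vertical gaze position, degrees of visual angle


def summarize(fixations):
    """Return basic eye movement metrics for a time-ordered list of fixations."""
    if not fixations:
        raise ValueError("need at least one fixation")
    durations = [f.end_s - f.start_s for f in fixations]
    # Assumption: a saccade is the jump between two consecutive fixations,
    # and its size is the Euclidean distance between their positions.
    saccade_sizes = [
        hypot(b.x_deg - a.x_deg, b.y_deg - a.y_deg)
        for a, b in zip(fixations, fixations[1:])
    ]
    total_time = fixations[-1].end_s - fixations[0].start_s
    return {
        "number_of_fixations": len(fixations),
        "mean_fixation_duration_s": mean(durations),
        "mean_saccade_size_deg": mean(saccade_sizes) if saccade_sizes else 0.0,
        "saccades_per_second": (
            len(saccade_sizes) / total_time if total_time > 0 else 0.0
        ),
    }
```

Calling summarize on three fixations recorded over one second, for example, would report three fixations, two saccades per second, and the mean duration and mean saccade size for that interval. Treating positions in degrees as planar coordinates is a small-angle simplification and would need revisiting for large gaze shifts.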
2. Background
2.1 Historical Perspective
In 1867 Hermann von Helmholtz published his thoughts on the nature of visual perception in a book entitled Treatise on Physiological Optics (Helmholtz, 1867/1925). This work laid the foundation for the classical approach to the philosophical treatment of vision known as constructivism. The goal of constructivism was to explain visual perception as arising from the confluence of many local information processing units, which when combined together, construct a global percept of the world. A central tenet of modern constructivism is the belief that perception relies upon a process of unconscious inference. In other words, in order for local information to be bound with other local information in a meaningful way, an inference must be made about the most likely interpretation.

An example of how unconscious inference could be used to explain perception is shown in Figure 2-1. Two possible interpretations of image A are shown as image B and
inference to explain the human perceptual bias of choosing image B as the correct interpretation. Image B is chosen because it is the most likely possibility.

Figure 2-1. Humans have a perceptual bias toward seeing the triangle as a whole. (Panels A, B, and C.)
The inference is largely unconscious in that the observer is generally not aware that probabilities are being compared, and that logical inferences are being made.

A constructivist approach to the inverse problem, that is, the problem of how 2-D retinal images are transformed into a perception of the 3-D environment, would be to consider the 2-D retinal image as belonging to the most likely state of affairs in the environment that would give rise to such an image.
In contrast to the constructivist theory of unconscious inference, an ecological perspective was espoused by James Gibson (Gibson, 1966), who argued that direct perception of the environment is sufficient for solving the inverse problem. He believed that all visual perception is the result of the interaction between the observer and surfaces, or more specifically the light reflected off surfaces, in the environment. Surfaces are composed of texture elements, and it is the structure that exists in the surfaces that in turn structures the light that reaches the eye of the observer. When the observer moves around the surfaces, the changing ambient optic array of light reaching
Thus, the inverse problem is solved by considering the movement of the observer as integral to the reconstruction. Change in structure over time supplies the missing dimension.
In the late 1970's David Marr (Marr, 1982) combined the theoretical constructs from both constructivism and ecological perception to create the first computational approach for describing visual processes. He used mathematical techniques to develop computer programs that simulated biological vision, and led the early efforts of computational and computer scientists who designed the first machine vision systems. Marr disagreed with Gibson, however, on the issue of representation. Gibson held that the environment is the repository for all of the information that is necessary for visual interaction, whereas Marr believed that the external world is represented internally, in all of its detail. An example of the internal representation is what Marr calls the "2½ dimension" sketch, an internal retinotopic image with the potential for a 3-D representation.
Marr's work has had a strong influence on the current understanding of early vision, and this understanding has led to a number of computational approaches based on early, or low-level, biological vision. It is assumed that in order to simulate a process as complex as high-level visual perception, one must begin with, and correctly implement, the lower level processes. Only then will the "correct" way to implement the higher-level cognitive processes become apparent.
Ballard and Brown pointed out several weaknesses of this approach (Ballard & Brown, 1992). First, early visual processes do not take into account the motivation of the observer. Marr's treatment of the visual process as purely passive precludes a
of cognition as the driving force behind the collection of low-level information, instead of thinking of it as merely the result of a collection of responses. Second, the early vision approach does not take into account the sequentialization and gaze control that humans use to make efficient use of the multi-resolution capabilities of the human eye. Finally, Marr's model does not make use of learning strategies or adaptational responses to the environment. His model of perception is essentially a rich, highly detailed, task independent description of the world, which is continually being called upon by cognition for performing specific tasks. Ballard and Brown (1992) describe an alternate way of approaching the complexities imposed by vision, and suggest numerous simplifications that would result from taking behavioral assumptions into account. Their findings, which are exemplified by a construct called animate vision, are summarized in Table 2-1 below.
Computations Simplified by Behavioral Assumptions

Agent's Behavior      Behavioral Assumption
Shape from shading    Light source not directly behind viewer
Time to adjacency     Rectilinear motion; gaze in the direction of motion
Kinetic depth         Lateral head motion while fixating a point in a stationary world
Color homing          Target object is distinguished by its color spectrum
Optic flow            Texture-rich environment
Stereo depth          System can fixate environmental points
Edge homing           Target position can be described by approximate directions from texture in its surround
Object tracking       Vergence can be used to improve tracking performance

Another objection to the early-vision approach toward computational vision is suggested by the work conducted by Yarbus in the 1960's. Yarbus showed how high-level cognitive events are reflected in the patterns of eye-movement traces (Yarbus, 1967). He found that different patterns of eye-movement traces, or scan-paths, could be elicited from subjects when they performed context-sensitive tasks. For example, when subjects were shown a painting depicting a scene of several people greeting an unexpected visitor, a specific question posed to the subjects elicited a specific "signature" pattern of eye movements. Different questions elicited different "signature" patterns. Figure 2-2 below shows the painting and typical scanpaths for a subject formulating an answer to the various questions.
Figure 2-2. Scanpaths are task dependent. From Palmer, 1999 and Yarbus, 1967. Panels: original painting; free viewing; estimate the economic level of the people; judge their ages; guess what they had been doing before the visitor's arrival; remember the clothes worn by the people.
The observation that oculomotor behavior is largely task dependent leads one to
of the observer. David Lee has suggested that information processing by humans should be considered in the context of a unified perceptuo-motor system, which is itself a part of the organism-environment system (Lee, 1978, 1980). His ideas pertaining to the functions of vision are an extension of the ecological perceptual model set forth by Gibson a decade earlier. In his view, the human visual system must be studied not only in an environmental context, but also in the context of the individual's sensory-motor system. Vision is functionally inseparable from the motor system. Information becomes available to the individual via three separate sources: exteroceptive, proprioceptive, and exproprioceptive. The exteroceptive source delivers information about the layout and affordances of the environment. The proprioceptive source delivers information about the position, orientation, and movement of the body or parts of the body. The exproprioceptive source delivers information about the union of the exteroceptive and proprioceptive sources, information about the movement of the body relative to the environment. The exproprioceptive information represents the interaction between the individual and space over time.
Taken together, the three sources provide the means for a cooperative relationship to exist between vision and action. Goal-oriented behavior, planning, and decision-making all play a significant part in the visual perception experienced by the individual.
To summarize the history of formal thinking about the nature of human visual perception, the constructivist and computational early-vision approaches taken by Helmholtz and Marr emphasize the autonomy of the individual and unconscious mechanisms to guide the visual perceptual process. This is the foundation for many of the current linear-systems methods for teasing apart the factors that influence perception. [...] the interaction between the individual and the environment according to goals, actions, motivation and behavior. For them, trying to understand how vision works by studying subjects' responses to artificial stimuli in a laboratory setting is like trying to understand how fish swim by putting them in a sandbox. From this point of view then, the factors that have had a major evolutionary influence on vision and that have largely shaped human visual perception are precisely those factors that are missing from the laboratory setting.
2.2 Eye Movements
The binocular visual field subtends an area approximately 130° vertically and 180° horizontally. Most of that area contains low-resolution peripheral information. In order to obtain detailed, high-resolution information from different areas in the environment, the eyes must move. The purpose of an eye movement is to bring the most visually relevant part of a scene onto the area of the retina with the highest visual acuity, and to keep it there during focused attention. This area is called the fovea and subtends approximately one degree of visual angle, covering an area of the visual field approximately equal to the size of a thumbnail extended at arm's length. Attention can then be re-deployed to another area in the visual field to initiate the next eye movement.
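The thumbnail comparison can be checked with elementary trigonometry. The following is a minimal sketch, not part of the original analysis; the arm length of 60 cm is an assumed, illustrative value, and the function names are hypothetical.

```python
import math


def size_subtending(angle_deg, distance_cm):
    """Linear size (cm) of an object that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an object of size_cm at distance_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Assumed arm length; this value is illustrative and does not come from the text.
ARM_LENGTH_CM = 60.0

# Width of the region covered by a one-degree fovea at arm's length.
foveal_patch_cm = size_subtending(1.0, ARM_LENGTH_CM)
print(f"1 degree at {ARM_LENGTH_CM:.0f} cm covers about {foveal_patch_cm:.2f} cm")
```

Under this assumption the foveal region covers roughly one centimeter at arm's length, which is on the order of the width of a thumbnail.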
The photoreceptors of the human eye consist of both rods and cones, the cones being the photoreceptors responsible for color perception and visual acuity. As shown in Figure 2-3, the population of cones is highest in the fovea and falls off rapidly toward the periphery. There is a 1:1 or greater correspondence between photoreceptors and ganglion cells in the fovea; however, this ratio increases continuously toward the periphery. This fact, along with the higher concentration of cones in the fovea, accounts for the higher visual acuity there.
[Figure 2-3: plot against visual angle (degrees from fovea), 80 to 0 to 80; curve labeled Cones]
Figure 2-3. Rods and cones are unevenly distributed across the retina. The fovea contains the highest concentration of cones, for high visual acuity in that region. From Palmer, 1998.
Traditionally, eye movements have been classified into six categories:
1. Miniature eye movements - These are the only type of eye movements that do not have a selective function. They include tremors in the extraocular muscles that control rotation of the eyes in their spherical socket, drift of the foveated image, and microsaccades to bring the drifted image back to the fovea. The result is constant motion of the optical image on the retina.
2. Saccades - These are high velocity, ballistic eye movements that have the function of bringing images of objects of interest to the fovea. It is generally believed that once a saccadic eye movement has begun, it cannot be altered. A typical saccade takes approximately 150 - 200 msec to plan and execute; planning takes about 150 msec on average, and the duration of the eye movement is approximately 20 msec plus 2 msec per degree of visual angle (Carpenter, 1988). Saccades can reach velocities up to 600° per second, and individuals typically make 3 or 4 saccades per second, depending on the [...] Studies on eye movements during reading have shown that saccades during reading are typically seven letters long, which is a saccade length of between 1° and 2° for reading standard size text at a distance of 40 cm (O'Regan, 1990). It has also been found that there is a wide distribution of within-word target landing positions for reading text. In other words, there is no precise position within the word that the eye targets the saccade to land on; anywhere within the word is sufficient for comprehension (Morgan et al., 1990). Fixations are defined as the time between successive saccades; a typical fixation duration for reading is between 200 and 300 msec.
3. Smooth pursuit - These
eye movements track the position of a moving object, with the purpose of keeping the image in the foveal region. Ideally, the image remains stationary on the retina. After an initial saccade to track the moving object, the eye movement is smooth and continuous, as opposed to the abruptness of saccades. Constant correction of image position on the fovea is maintained by means of a feedback signal from the brain that senses the position of the object as it moves. Thus smooth pursuit cannot usually be maintained in the absence of a moving target. The maximum velocity is approximately 100° per second; target velocities higher than that cause retinal slippage and disable the tracking mechanism. During pursuit, the image of the pursued object is clear, with all other untracked objects smeared due to their relative motion on the retina.
4. Vergence - When an observer fixates an object, the eyes converge toward one another, with the degree of convergence depending upon the distance between the observer and the object. Vergence eye movements are disconjugate in the sense that the eyes rotate opposite to one another. For a conjugate movement such as pursuit, the eyes rotate in the same direction. If an object is moving both in depth and in direction, a disconjugate [...]
Figure 1A. Vergence eye movement
5. Vestibular - When the head rotates, the vestibular ocular reflex (VOR) allows us to fixate an object in the environment without visual feedback. The information necessary to control eye movements when the head moves originates in the vestibular system of the inner ear, which senses the orientation of the head. Vestibular eye movements are faster than pursuit movements; however, high velocity head movements such as those encountered while running or walking fast cannot be fully compensated for by a vestibular eye movement (Palmer, 1999). When this happens, objects in the environment that require high visual acuity for perception (such as lettering on signs) will appear blurred.
6. Optokinetic - An optokinetic eye movement is a response to the rapid translation of the entire visual field, or a large part of it. For example, if an observer is looking through a window at a train passing by, fixating and tracking a spot on the train will cause the observer to exhibit the optokinetic response. It is characterized by a slow, tracking phase in which the image is stabilized on the retina, followed by a rapid, saccade-like snap of the eyes in the direction opposite to the image motion. This is known as [...]
Recent studies have suggested that there are actually only two categories of eye movements: saccadic and smooth (Steinman, Kowler, and Collewijn, 1990). The claim is that the classification into six categories is artificial, a result of the early laboratory methods that studied simple tasks in a constrained and sparse visual environment. The experimental results of such early studies reflected the low-level and involuntary aspects of oculomotor control, and were simply responses to sensory cues that did not reflect the cognitive processes, such as expectation, motivation, and learning, that people typically employ while engaged in natural tasks.
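Two of the quantitative relationships in the list of eye-movement categories above can be made concrete. The sketch below encodes the saccade timing rule quoted from Carpenter (1988) and, as an added assumption not stated in the text, the standard symmetric-fixation geometry relating vergence angle to viewing distance. The interocular distance of 6.3 cm and all function names are hypothetical, illustrative choices.

```python
import math


def saccade_duration_ms(amplitude_deg):
    """Duration of the movement itself: about 20 msec plus 2 msec per degree."""
    return 20.0 + 2.0 * amplitude_deg


def saccade_total_ms(amplitude_deg, planning_ms=150.0):
    """Planning time (about 150 msec on average) plus movement duration."""
    return planning_ms + saccade_duration_ms(amplitude_deg)


def vergence_angle_deg(distance_cm, interocular_cm=6.3):
    """Angle between the two lines of sight for a target straight ahead.

    Assumes symmetric fixation; interocular_cm is an assumed typical value.
    """
    return math.degrees(2.0 * math.atan(interocular_cm / (2.0 * distance_cm)))


for amplitude in (2.0, 10.0, 15.0):
    print(amplitude, saccade_duration_ms(amplitude), saccade_total_ms(amplitude))

for distance in (40.0, 100.0, 600.0):
    print(distance, round(vergence_angle_deg(distance), 2))
```

With these figures a 10° saccade takes about 40 msec to execute and about 190 msec including planning, which falls inside the 150 - 200 msec range given above. The vergence angle shrinks as the target recedes, which is why vergence carries useful depth information mainly at near distances.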
2.3 Visual Attention and Selection
The mechanics of oculomotor behavior do not explain how the selection process is controlled. Questions such as "what is the region of interest?" and "where should the next fixation be?" can best be answered within a framework that considers the purpose of focused attention.
2.3.1 Saliency Maps
The notion of a saliency map was proposed to define the relationship between the components of a scene according to their relative importance to the observer (Mahony and Ullman, 1988). According to this theory, the visual system performs an initial low-frequency parsing of the environment to identify potential regions of interest, and assigns to each region a weight according to its saliency. Corners, high luminance, and bright colors, for example, would be assigned a high saliency weight. This information is recorded in a map of the environment, which is a record of the weight of each region. The map is dynamic in the sense that recent targets are depressed as the individual moves around in the environment to [...]
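The selection rule described for the saliency map (fixate the highest-weighted region, then depress that region's weight) can be illustrated with a toy example. This is a minimal sketch under assumed values, not the model of the cited authors; the region names, weights, and the multiplicative suppression factor are all hypothetical.

```python
def next_fixation(weights):
    """Return the region with the highest current saliency weight."""
    return max(weights, key=weights.get)


def scan(saliency, n_fixations, suppression=0.5):
    """Simulate a fixation sequence over a saliency map.

    After each fixation the chosen region's weight is multiplied by
    `suppression`, so recently visited targets are depressed.
    """
    weights = dict(saliency)  # copy so the caller's map is not modified
    path = []
    for _ in range(n_fixations):
        target = next_fixation(weights)
        path.append(target)
        weights[target] *= suppression
    return path


# Hypothetical regions and weights, chosen only for illustration.
saliency_map = {"corner": 0.9, "bright patch": 0.8, "plain wall": 0.2}
print(scan(saliency_map, 4))
```

In this toy map the first two fixations go to the two most salient regions, and the depressed weights then determine how soon each region is revisited.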
2.3.2 Feature Integration Theory of Attention
What is the purpose of focused attention? According to feature integration theory, elementary features in the environment such as color and shape are processed before objects that require a conjunction of several features, such as a blue box or a gray kitten. Focused attention is necessary to conjoin the separate features, which then enables proper identification of the object (Treisman & Gelade, 1980).
The studies Treisman and Gelade conducted were based on the experimental paradigm known as visual search. In this paradigm, the amount of time it takes to complete a search is plotted as a function of the number of items to be searched. A flat response indicates a fast, parallel process, whereas a response that increases linearly with the number of items indicates a slower sequential process. Since eye movements are inherently sequential, a task that requires eye movements would elicit longer search times for a larger number of items and a linear response.
The experiments were designed to distinguish between features that are elementary, or integral, and features that are separable and require focused attention for integration. They hypothesized that an integral feature would elicit a flat search response and would exhibit "pop-out" in a field of distractors, whereas an object with separable features would elicit a linear search response. Their results showed this to be the case when the elementary features were chosen to be colors or shapes (for example "pink" in a field of "brown" and "purple" distractors, or "O" in a field of "N" and "T" distractors) and the separable features were chosen to be a conjunction of the two elementary features (such as "pink O" in a field of "green O" and "pink N" distractors).
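The distinction between a flat and a linear search response amounts to the slope of search time against the number of items. The sketch below fits that slope by ordinary least squares and labels the result. The search times are invented for illustration and are not data from Treisman and Gelade, and the 5 msec per item cutoff is an assumed, hypothetical threshold.

```python
def search_slope(set_sizes, times_ms):
    """Least-squares slope (msec per item) of search time against set size."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(times_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, times_ms))
    return sxy / sxx


def classify_search(slope, flat_threshold=5.0):
    """Label a slope as flat (parallel) or linear (sequential).

    flat_threshold is an assumed cutoff in msec per item.
    """
    return "parallel" if abs(slope) < flat_threshold else "sequential"


# Invented example data: number of display items and mean search time (msec).
set_sizes = [1, 5, 15, 30]
feature_times = [450, 452, 449, 451]        # e.g. "pink" among "brown"
conjunction_times = [460, 580, 880, 1330]   # e.g. "pink O" among "green O", "pink N"

for label, times in (("feature", feature_times), ("conjunction", conjunction_times)):
    slope = search_slope(set_sizes, times)
    print(label, round(slope, 2), classify_search(slope))
```

A near-zero slope corresponds to the "pop-out" case, while a slope of tens of milliseconds per item corresponds to the item-by-item search expected for conjunctions of features.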
Both the saliency map theory and the feature integration theory describe perception as being the result of low-level and early-vision processes. Oculomotor behavior is a response [...]
2.4 The World as Anchor
2.4.1 Semantic Consistency
When subjects are shown a line drawing of a natural scene that contains either a semantically consistent object (a tea kettle in a kitchen) or a semantically inconsistent object (a microscope in a kitchen), they are quicker to locate the consistent object, when asked to search for it, than they are the inconsistent object (Henderson et al., 1999). Moreover, the initial saccade is equally likely to be to the consistent object as it is to be to the inconsistent object. Since the inconsistent object would seem to have a higher salience than the consistent object, the saliency map framework for early visual processing is either wrong or incomplete. A determination of semantic consistency necessarily takes into account the relevancy of a particular object in its surroundings, and this is not considered as part of the saliency map model.
2.4.2 Change Blindness
Change blindness refers to the phenomenon that occurs when large-scale changes in the visual scene go undetected by the observer as the result of a blink, a saccade, or some other visual transient. This has been explained by suggesting that attention is being prevented from being focused on the change because of the distraction caused by the visual transient. In other words, the change blindness could be due to a masking, or resetting, of the internal representation of the world (Rensink, O'Regan, and Clark, 1995). It has also recently been found that small random changes in the scene, such as those due to a mud-splash on a car windshield, can also result in change blindness (O'Regan, Rensink, and Clark, 1999). Not only are mental images unreliable, but the internal representation is quite sparse and contains only the information about the environment that is of central interest. This [...] encoding visual primitives and binding them together, cognition dictates what is actually preserved in and retrieved from memory. It may be that it is a more efficient strategy to use the world as an external memory source, and only encode the information that currently has meaning.
2.4.3 Exocentric Reference Frames
The notion of "world-as-anchor" can best be summed up by saying that we are perceptually predisposed to seeing the world around us as stable, despite large changes in eye and body position that displace the image on the retina significantly.
When a small afterimage is viewed in darkness except for the glow of a small, stationary reference light, and the eye moves, the afterimage appears to move relative to the reference light. When the afterimage is large (complex scene), it appears to remain stationary when the eye is moved, and the reference light instead appears to move, even though the subject knows the reference light is actually stationary (Pelz and Hayhoe, 1995). When the subjects were instructed to inspect the afterimage and made large saccades (up to 5°), the large afterimage still did not appear to move. This was explained by suggesting that whole-scene afterimages carry more perceptual "weight" than do small, isolated patches of light in a darkened room. The large afterimage creates an external reference frame, or anchor, that allows for visual stability and constancy of visual direction.
2.4.4 Position Constancy During Passive Movement
Position constancy refers to the perception that the environment does not appear to move when the eyes, head, or body moves, even though the image on the retina is displaced.
Irvin Rock (1967) found that external frames of reference are used to maintain position [...] He seated subjects, blindfolded, in a small motorized wagon and started the wagon moving. He disguised the effect of the acceleration of the wagon by telling the subjects to expect a small amount of jostling of the equipment while the experiment was being set up. He then sent the wagon rolling along a darkened hallway and removed the blindfold. The only objects visible to the subject were small, luminous circles, placed along the wall of the hallway so that only one circle was visible to the subject at a time. The subjects were asked to report what they were experiencing. Seventeen of the 20 subjects reported that they were stationary and the circles were moving past them. He then repeated the experiment with different subjects, changing the luminous circles to luminous vertical lines. This time the subjects were able to see all of the lines, which filled the visual field. Twelve of the 20 subjects experienced the lines as stationary and themselves as moving. Rock explained this by saying that the lines provided a frame of reference for the subjects that enabled the correct perception. The results from this study showed that for subjects who are passively moving through their environment, position constancy can be maintained by having an external frame of reference.
2.5 Perceiving the Direction of Heading During Motion
Position constancy is not the only issue relating to the perception of a stable world in the presence of image motion on the retina. Perceiving one's direction of heading while making whole body movements, head movements, and eye movements is critical for survival in the world, and is a natural ability for humans.
2.5.1 Retinal vs. Extra-retinal Information
In the 1980's and early 1990's researchers considered the question of how people [...] In this case the flow field, which results from the changing structure of the ambient optic array as the observer moves around, must be decomposed into both a translational and a rotational component. It was assumed that the rotational component due to an eye or head movement was effectively canceled out prior to the determination of heading. Several hypotheses have been proposed to explain how the rotational component could be canceled out.
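The decomposition of the flow field can be illustrated with the standard pinhole-projection flow equations, which are introduced here as an assumption and are not taken from the studies discussed in this section. In this sketch the image plane is at unit focal length, `t` is the observer's translation, `w` is the eye or head rotation, and all names, sign conventions, and numerical values are hypothetical.

```python
def translational_flow(x, y, depth, t):
    """Image velocity at (x, y) due to translation t = (tx, ty, tz).

    This component scales with 1/depth, so it carries depth information.
    """
    tx, ty, tz = t
    return ((x * tz - tx) / depth, (y * tz - ty) / depth)


def rotational_flow(x, y, w):
    """Image velocity at (x, y) due to rotation w = (wx, wy, wz).

    This component does not depend on depth at all.
    """
    wx, wy, wz = w
    u = x * y * wx - (1.0 + x * x) * wy + y * wz
    v = (1.0 + y * y) * wx - x * y * wy - x * wz
    return (u, v)


def retinal_flow(x, y, depth, t, w):
    """Total flow: the sum of the translational and rotational components."""
    ut, vt = translational_flow(x, y, depth, t)
    ur, vr = rotational_flow(x, y, w)
    return (ut + ur, vt + vr)


def heading_point(t):
    """Image point from which the translational flow radiates (requires tz != 0)."""
    tx, ty, tz = t
    return (tx / tz, ty / tz)


# Hypothetical case: moving forward and slightly to one side while the eye
# rotates about the vertical axis, as in a lateral pursuit movement.
T = (0.1, 0.0, 1.0)
W = (0.0, 0.05, 0.0)
point = (0.3, -0.2, 4.0)  # image x, image y, depth of the scene point

total = retinal_flow(*point, T, W)
rot = rotational_flow(point[0], point[1], W)
# Cancelling the rotational part leaves the purely translational flow,
# which radiates from the heading point.
recovered = (total[0] - rot[0], total[1] - rot[1])
print(recovered, heading_point(T))
```

Under this model, once the rotational component is subtracted, every remaining flow vector points directly away from the heading point. The retinal and extra-retinal hypotheses differ on where the estimate of the rotation comes from.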
The retinal image theory claims that there is enough information in the retinal image alone to accurately predict direction in the presence of head or eye movements (Warren and Hannon, 1988). The extra-retinal theory claims that proprioceptive information, and possibly an efference copy of the eye command, is necessary to make an accurate determination of direction (Royden, Banks, and Crowell, 1992).
Both theories base their claims on the results of an experimental setup that requires test subjects to view a random-dot display of simulated motion on a video screen. There are two parts to the experiment. For the first part, subjects initially fixate a central target, then pursue the target as it moves laterally across the screen. For the second part, the subjects again fixate a central target, and continue to fixate the target as the display changes to simulate a lateral eye movement. The resulting flow on the retina should be the same for both cases.
[Figure 2-5 panels: Real Eye Movement and Simulated Eye Movement, each showing the flow field on screen and the flow field on retina]
Figure 2-5. Experimental conditions for sufficiency of retinal information. In a) the subject was instructed to fixate the cross then make an eye movement in the direction of the arrow. In c) the [...]
The major difference between the two studies was that the retinal image proponents used slow speeds for the real and the simulated eye movements (0.2° - 1.2° per second), whereas the extra-retinal proponents used faster speeds (1° to 5° per second). At the end of each 1250 msec trial, the subjects were asked to state their perceived direction of heading. Warren & Hannon instructed the subjects to indicate their perceived direction of heading by having them state whether they felt as if they were headed to the right or to the left of a vertical target line placed on the horizon of the last frame of the display after the motion had stopped. Royden et al. had the subjects state which one of the seven equally spaced (4° apart) targets was closest to the perceived direction of heading after the motion had stopped.
The retinal image proponents found no difference in perceived direction for the real or simulated eye movement. This suggests that all of the information that is required to perceive direction is present in the retinal image. Interestingly, the extra-retinal proponents discovered that there was a significant difference in perceived direction for the real and simulated eye movements. They found that the subjects could not tell in which direction they were headed without making a real eye movement. When the eye movement was simulated on the display screen (by sweeping the dot pattern laterally across the screen), they felt as if they were moving along a curvilinear path, rather than straight ahead. This is evidence that extra-retinal information is necessary for determining heading. It appears that the speed of the eye movements might be a reason for the discrepancy between the results. [...] from the retinal flow pattern without any extra-retinal input, whereas faster speeds require the extra information.
2.5.2 Differential Motion Parallax
In 1992 James Cutting disputed the hypothesis that moving observers decompose retinal flow into translational and rotational components. He maintained that retinal flow in its entirety is sufficient for determining heading, in the form of differential motion parallax. He argued that the earlier studies did not include the components of bounce and sway that people experience when they move at a pedestrian speed. He reasoned that if observers relied on decomposition and these components were included in the experimental conditions, subjects would find it much more difficult to determine their heading direction because the additional decomposition due to these components would complicate the process of perception. He found that subjects were equally able to determine their heading direction with or without the added components of bounce and sway, and concluded that individuals use retinal information directly in the form of differential motion parallax.
Neither retinal decomposition nor differential motion parallax considers the possibility that subjects may be using the environment as an external frame of reference, in much the same sense as was shown for position constancy and exocentric frames of reference for afterimages.
2.6 The Effects of Freeing the Head
The studies conducted on heading perception discussed in the previous section were conducted in the confines of a laboratory setting. The subjects' eye movements were monitored with a head-mounted limbus eye-tracker as they watched a simulated display of [...] Since the human visual system did not evolve motion perception capabilities in this type of setting, it seems reasonable to conclude that the results may differ when natural movement through the real world is considered.
Traditionally, most studies of oculomotor behavior have relied upon eye-movement recording devices that required the head to be immobilized during the experiment. Usually a chin-rest or bite board was used. A head-restraining mechanism is used because, in order for an accurate measurement of maintained fixation to be made, the device must be able to distinguish between motion of the eye in the head, and motion of the tracker with respect to the head. If the tracker moves while the subject is maintaining fixation, an eye movement will appear to have been made, when in reality the eye may not have moved at all. It is necessary to keep the head secured to eliminate any motion of the tracker relative to the head in order to determine when the eye is rotating.
Early eye movement monitors such as the contact-lens optical lever, the magnetic field sensor coil, and the SRI Dual Purkinje Image Tracker required immobilization of the head (Kowler, 1995). Researchers generally assumed that fixations made with the head immobilized would be the same in terms of stability as fixations made when the head was free to move but did not move. It has subsequently been discovered that this assumption is incorrect (Skavenski et al., 1979). When subjects maintained fixation on a distant target, retinal image stabilization decreased when the head was not supported, as shown below in Figure 2-6 for two subjects.
When the head is free to move but does not, image motion on the retina can be as much as 2 or 3 degrees per second. Visual perception is insensitive to this type of motion, and it has been suggested that vision can actually be impaired when the head is not free to [...] image motion, to make the task of perception less taxing, much in the same way that saccade target position during reading can be very imprecise within a word, yet comprehension does not suffer. Developers of robust robotic vision might well consider the wide tolerances of human vision to be a model for systems that require the synthesis of large amounts of data for the performance of complex tasks.
[Figure 2-6 panels: Subject A and Subject B, each under bite-board and sitting conditions]
Figure 2-6. The effect of freeing the head on fixation stability. The vertical lines represent 1 second intervals. The verti