Opinion
De
fining
Auditory-Visual
Objects:
Behavioral
Tests
and
Physiological
Mechanisms
Jennifer
K.
Bizley,
1,*
,@Ross
K.
Maddox,
2and
Adrian
K.C.
Lee
2,3,*
Crossmodalintegrationisatermapplicabletomanyphenomenainwhichone
sensorymodalityinfluencestaskperformanceorperceptioninanothersensory
modality.Wedistinguishthetermbindingasonethatshouldbereserved
specifi-callyfortheprocessthatunderpinsperceptualobjectformation.To
unambigu-ouslydifferentiatebindingformothertypesofintegration,behavioralandneural
studiesmustinvestigateperceptionofafeatureorthogonaltothefeaturesthat
link theauditory and visualstimuli. We argue that supporting trueperceptual
binding(asopposedtoother processessuch asdecision-making) isonerole
forcross-sensoryinfluences inearlysensory cortex.Theseearlymultisensory
interactions may therefore form a physiological substrate for the bottom-up
groupingofauditoryandvisualstimuliintoauditory-visual(AV)objects.
Introduction:WhatIsanAVObject?
Whatwehearandseetakestrikinglydifferentphysicalforms,andarenecessarilyencodedby different sensory receptor organs, but auditory and visual features are effortlessly bound togethertocreateacoherent percept.Bindingstimulus featuresfromacommonsource is notonlyaproblemacrosssensorysystems–withinsensorysystems,parallelandindependent perceptualfeatureextractionmeanthatstimulusfeatures,suchaspitchandspace,mustalsobe appropriatelycombined intoasingleperceptual object(Box1,Figure1A).The formationof cross-sensoryobjectsisaproblemsynonymouswithfeature-bindinginthevisualsystem,or auditorysceneanalysisintheauditorysystem.
WedefineanAVobjectas‘aperceptualconstructwhichoccurswhenaconstellationofstimulus featuresareboundwithinthebrain’.Participatinginaconversationatacrowdedbarisassisted bypairingyourfriend'sfaceandmouthmovementswithhervoice;pickingoutthemelodyofthe firstviolininastringquartetismadeeasierbywatchingtheplayer'sbowingaction.Ineachcase whatyouseeandhearareboundintoasinglecrossmodalobject.Conversely,tryingtolistento one friend while watching another's face makes listening more difficult. These examples demonstrate two fundamental aspectsof object-basedattention[1–3], namely (i)attending toone feature ofanobject enhances therepresentationofthe otherfeaturesoftheobject (Figure 1B), and (ii) dividing attention between two features across two objects is costly comparedtowhenattendingtotwofeatureswithinthesameobject(Figure1C,D).Thepurpose ofthisOpinionarticleistwofold.First,wewishtodistinguishbinding,whichunderlies cross-modal‘objecthood’,from othermechanisms ofcrossmodalintegration (seeGlossary).In makingthisdistinctionwewillarguethatcrossmodalbindingcanbestbedemonstratedby leveraging competition in tasks based on theories of object-based attention. Second, we proposethatthepurposeofearlycross-sensoryintegrationistosupportbinding.
Trends
Crossmodal integration and binding havebeentreatedassynonymousin theliterature,withnocleardelineation betweenperceptualchangesandother interactionssuchasdecision-making.
Crossmodalbindingisproposedasa distinctformofintegrationleadingto multisensoryobjectformation.
Multisensorystimuliaremostbeneficial innoisysituations,butfewstudiesuse stimuluscompetitiontoinvestigatethe processesunderpinning multisensory integration.
Evidencesuggeststhatbothvisualand auditoryattentionisobject-based–all featureswithinanobjectareenhanced andthereisacosttoattendingfeatures acrossversuswithinobjects.
Multisensory interactions can be observedthroughoutthebrain, includ-ingearlysensorycortex.
Theroleofearlysensorycortexin multi-sensory integration is unknown, but mayunderliecrossmodalbinding.
1
UniversityCollegeLondon(UCL)Ear Institute,332Gray'sInnRoad, London,WC1X8EE,UK
2
InstituteforLearningandBrain Sciences,UniversityofWashington, 1715NEColumbiaRoad,PortageBay Building,Box357988,Seattle,WA 98195,USA
3
DepartmentofSpeechandHearing Sciences,UniversityofWashington, 1417NE42ndStreet,EaglesonHall, Box354875,Seattle,WA98105,USA
*Correspondence:[email protected] (J.K.Bizley)[email protected] (AdrianK.C.Lee).
@
Glossary
Crossmodalbinding:aspecific formofcrossmodalintegration whereinperceptualfeaturesare groupedintoaunifiedcrossmodal object.
Crossmodalintegration:any processinwhichinformationacross sensorymodalitiesiscombinedto makeaperceptualjudgment. McGurkeffect:anillusionwhereby mismatchedvisualmouthmovements andauditoryphonemesresultina mergedpercept.Forexample,a visual/ga/andanauditory/ba/are perceivedasa/da/.
Orthogonalstimulusfeature: perceptualdimensionsthatdonot dependononeanother.For example,thepitchofasoundcan varyindependentlyfromitsperceived location.
Sound-inducedflashillusion (SIFI):aphenomenoninwhichthe numberofvisualflashesreportedis influencedbythenumberofrapidly presentedauditorystimuli. Ventriloquistillusion:anillusion wherebythespatiallocationofa sound-sourceis‘captured’bya visualstimulus.Forexample,we perceiveanactor'svoiceas originatingfromtheimageoftheir mouthonascreenratherthanfrom theloudspeakersaroundthecinema. SectionI.BehavioralAssays
BindingasaSpecialCaseofMultisensoryIntegration
Thereareamultitudeofwaysinwhichstimuliinonesensorymodalitycaninfluenceorperturb the behavioral response to stimuli inanother modality [4,5].We conceptually describetwo stagesofinteraction (Figure2A): first,thefeatures ofanincoming soundareperceived ina mannerrelatedtoitsphysicalvalue(e.g.,aphysicalintensityiscodedasloudnessaccordingto aprobability distribution)and, second,thisperceptis subsequentlyused as thebasis fora judgmentaboutthesound.Thetermcrossmodal(ormultisensory)integrationappliestoany instanceinwhichonesensorymodalityinfluencesthejudgmentofstimuliinanother,andcould thereforeoccurateitherofthesetwostages.Oneexampleofcrossmodalintegrationwouldbe weightinginformationfromdifferentsensorymodalitiesbytheirreliabilitytoreachadecision[6–8]. Binding,weargue, is aspecific conceptthat shouldbe reservedfor crossmodallinking of perceptualfeaturesresultinginaunifiedAVobject(i.e.,thechangehappensinthefirststageof
Figure2A). Bindingisaform ofintegration;however, integratinginformationatthedecision stage, for example, is not a form of binding. We further argue that binding relies upon consistencybetweenthemodalities–inparticular theirtemporalcoherence[9].Bindingcan theoreticallybebuiltonotherconsistencies,suchasagreementinauditoryandvisualspatial location,orphoneme–visemerelationships(e.g.,betweenanauditory/u/vowelandanimageof protrudedlips);however,whensuchfeaturesaredynamicandcoherent,thebindingshouldbe substantiallystrengthened.Itisalsoworthnotingthat,foradynamicstimulus,theonlywayfor features to be consistent as they change in time is for them to be temporally coherent. Furthermore,temporalcoherenceallowsdisparatecross-sensoryfeaturestobebound,such aspitchandcolor(whichhavenopresumednaturalconnection),therebyformingacoherent multisensoryperceptualobject.Weseebindingasalargelyperceptual,ratherthancognitive, process–andthereforedistinctfromtheintegrationofinformationatthedecision-makinglevel. Importantly,asdiscussedinSectionII,theprocessingstagethatisaffectedbyamultisensory stimuluscarriesimplicationsaboutwhereinthebraintheinfluencemaybeoccurring. Distinguishingbindingfromotherformsofintegrationexperimentallyisnon-trivial;mostprevious studies,whiledemonstratingadiverserangeofintegrationeffects,fallshortofunambiguously
Box1.FormingAuditoryandVisualObjects
Auditoryandvisualobjectsshareparticularproperties:bothhavelinkedfeaturesthatchangeovertimeandperceptually grouptogethertheacousticorvisualfeaturesthatcomefromacommonsource[78].Theprocessofformingobjectsis oftendescribedassceneanalysisinaudition[79]andimagesegmentation[80]invision.Inbothinstancesacontinuous sensoryinputisrepresentedacrossanarrayofsensoryreceptorsinthecochleaorretina,andthisinputmustbe appropriatelysegmentedintocomponents[78,80].Segmentingavisualimageiscomputationallydifficultbecause objectscanoccludeoneanother,althougharguablythechallengefacedbytheauditorysystemisgreaterbecausea singlesound-sourcecanelicitacomplex,discontinuouspatternofactivityatthecochleaandmultiplesound-sources mayelicitoverlappingorinterdigitatingpatternsofactivity.
Objectsmustthereforebeinferredfromlow-levelcues:forexample,intheauditorysystem,acousticcuessuchas inter-auraltimingandleveldifferences,usedforlocalizingsoundinthehorizontalplane,andharmonicityleadtoperceptual features(spaceandpitchrespectively)thatdefineanobjectandenablegroupingofsound-elementsfromthemixtureof soundsthatformanacousticscene[79].Similarly,inthevisualdomain,objectsaredefinedbyperceptualfeaturessuch aslocation,color,andshapethatallowthemtobeseparatedfromotherobjectsintheenvironment[81,82].
Forsensoryobjects,theabilitytogroupstimulusfeaturesenablestheseparationoftheobjectfromthefeaturesof competingstimulithatcomprisethewholesensoryrepresentation[83].Inbothvisionandauditiontheformationofa perceptualobjectallowsalevelofabstractionthatfacilitatesperceptualinvarianceandsubsequentobjectrecognition [78,83,84].Forbothauditoryandvisualobjects,observerscandeploytop-downmechanismstoselectivelyattendtoan objectofinterestbyfocusingonaparticularstimulusfeature–suchascolororpitch[3,85].Selectiveattentioncanalsobe engagedbyautomatic,bottom-upmechanismsdrivenbysalientattributesofavisual[2]orauditory[3]scene.Evidence suggeststhatbothvisual[1]andauditory[86]attentionareobjectbasedasdemonstratedbychangesinthecontinuityof atask-irrelevantfeaturehavinganinfluenceonbehavior.
showingbindingduetoachangeinperception.Figure2B–Gdemonstratesseveralwaysin whichbehavioralresultsthatmayseemtoshowbindingcanactuallybetheresultofintegration atalaterstageofprocessing.Forexample,givenacertainpatternofbehavioralresponsesforan auditory-onlystimulus(Figure2B),observingabiasindiscriminationwithasimultaneousvisual stimuluscanresultfromashiftinperception(whichwouldbebinding,Figure2C)orfromashiftin the decision criterion (which is integration, but is not binding, Figure 2D). Importantly, the behavioralreadoutsoftheseareidentical,renderingtheunderlyingprocessingchanges indis-tinguishable.Achangeinsensitivity,seeminglyfreeofbias,doesnotnecessarilyshowbinding either,asdemonstratedinFigure2E–G.Ifavisualstimuluschangesinawaythatisconsistent withchangesintheauditorystimulus,itcanleadtoavariabledecisioncriterionthatbiasestowards thecorrectbehavioralresponseinastimulus-dependentmanner.Theresultmightbeinterpreted asameasuredimprovementinsensitivity,withzerobias,thatbeliestheunderlyingmechanism. Theremayactuallybeaperceptualchangeresultingfrombinding,but,inthesesituations–namely wherethevisualstimuluscouldreasonablyshiftthedecisioncriterionfortheauditorytask–itis notpossibletoknow.Infact,avariabledecisioncriterionisadirectviolationofacentralaxiom ofthedecisionmodelusedinsignaldetectiontheory:thecriterionvaluemustremainconstant throughoutanexperimentforeachperceptualjudgment[10,11].Thus,applyingsignaldetection theoryasinFigure2Gwherethecriterionmayinfactbechanging(Figure2F)isflawed.
Cross-modal within-object
feature enhancement
Divided aenon to
features across
objects
visual (B)
(C)
auditory
(D)
Divided aenon to
features within an object
ampl pitch mbre size lumin color
A2
A2
V2
V2
ampl pitch mbre size lumin colorA2
A2
V2
V2
A1
A1
ampl pitch mbre size lumin colorV1
V1
size
lumin
color
V
(A)
Unimodal within-object
feature enhancement
size lumin colorV
A
ampl
pitch
mbre
A
ampl
pitch
mbre
size
lumin
color
V1
V1
A1
A1
ampl
pitch
mbre
Figure1.BindingofFeaturesintoUni-andCrossmodalObjectsThroughTemporalCoherence.(A)Whenone
attendstopitch(attendedfeaturesshownwithahighlightingbox),alltheotherauditoryfeatureswillbeenhanced(depicted byanincreasedfontsize),butnotfeaturesbelongingtoaseparatevisualobject.(B)Theamplitudeofthesoundisnow comodulatedwiththesizeofthevisualstimulus,andthistemporalcoherence(highlightedinyellow)enablesthesetwo sensorystimulitobindintooneobject.Wehypothesizethatinthissituationwhenoneattendspitch,theabilitytomakecolor judgmentwouldalsobeenhanced.Thisenhancementshouldalsobebidirectional:whenonemakesacolorjudgment, thereisenhancementinapitch-relatedtask[32].(C)Inadividedattentiontask,thecostofattendingtwofeaturesspanning acrosstwoobjectsshouldbemorethanwhenthetwofeaturesbelongtothesamecrossmodalobject.(D)Theperceptual costcanbemeasuredbehaviorallyasadecrementinsensitivityofthesefeaturesoranincreaseinreactiontimecompared towhenbothattendedfeaturesarewithinthesamecrossmodalobject.Thefluctuatinglinesextendingfrom“ampl”and “size”showthosedynamicfeatures’timeenvelopes.Abbreviations:ampl,amplitude;lumin,luminance.
.84 .16 .16 .84 Smulus Response 1 2 1 2 d' = 2 Visual bias Visual bias .96 .04 Smulus Response 1 1 2 .04 .96 Smulus Response 2 1 2 .96 .04 .04 .96 Smulus Response 1 2 1 2 d' = 3
Auditory perceptual units .84 .16 .16 .84 Smulus Response 1 2 1 2 d' = 2 No visual smulus No visual smulus 1 2 (B) (C) (D) .62 .04 .38 .96 Smulus Response 1 2 1 2 Vis shi d' = 2 Vis shi
Auditory perceptual units .62 .04 .38 .96 Smulus Response 1 2 1 2 Visual bias d' = 2
Input aud smulus
Percepon Judgment
Behavioral output
Aud perceptual units Aud perceptual units
(A)
(E)
(F)
(G)
Binding Integraon
Effect of vis smulus:
Binding: visual smulus alters
auditory percepon
Integraon: visual smulus
shis decision criterion
Integraon of congruent visual smulus yielding visual smulus-dependent criterion shi
Spurious improvement in auditory perceptual sensivity resulng from visual
smulus-dependent criterion shi
Figure2.InterpretingBehavioralResultsasBindingorIntegration.(A)Theprocessingunderlyingapsychophysical
taskinvolvesperceptionoftheauditory(aud)stimulusthenjudgment.Integrationcouldhappenatanystage.Bindingimplies aperceptualchange.(B)Anauditorytaskinvolvingdiscriminatingbetweenauditorystimuli1/2.Blue/greencurvesrepresent theconditionalprobabilitydensityfunction(PDF)fortheperceptionofeachstimulus.Perceptionbelow/abovethedecision criterion(verticalline)determinesstimulusjudgmentas1/2(coloredshadings).Thetableshowstheprobabilityofreporting stimulus1/2given1/2waspresented,yieldingd0of2.0.(C,D)Differentinfluencesofavisual(vis)stimulusyieldingidentically biasedbehavioralresponses,makingtheunderlyingmechanismsindistinguishable.In(C),visualstimulusbindswith auditorystimulus,shiftingauditoryperceptiontowards2(PDFsmoverightwards),yieldingreportingbiasdespitenocriterion shift.In(D)thereisnobinding.Thevisualstimuluseffectsaleftwardscriterionshift,againyieldingbiastowardsreporting2. Both(C)and(D)representintegration,butonly(C)representsbinding.(E–G)Stimulus-dependentdecisionbias(i.e., integration,butnotbinding)causingspuriousmeasurementsofimprovedperceptualsensitivity,(E)showsthetaskinthe absenceofavisualstimulusforreference.In(F)thevisualstimulusineachtrialisconsistentwitheachauditorystimulus. Thus,forstimulus1(left),thevisualstimuluspushesthecriteriontotheright,biasingresponsestowardsstimulus1(correct response),andsimilarlyforstimulus2(right).Thereisnochange(neithershiftnorexpansion/contraction)ineitherPDF(G). Theseresponsesyieldd0of3.0andtheerroneousconclusionthatperceptualsensitivityhasincreased.However,nobinding hasoccurred.
BehavioralTestsforIdentifyingBinding:TheNecessityofAssessingStimulusFeatures
OrthogonaltoThoseLeadingtoBinding
Thequestionthenbecomes:howdowedemonstratebindingexperimentally?Webelievethat thereisanessentialexperimentalelementtoachieving thisempirically:behavioral measures mustbemadeonastimulusfeatureorthogonaltothefeaturesthatcreatethebinding.Inother words,noneofthefeaturesthatareintendedtobindshouldbethedimensiononwhichsubjects reportsometypeofperceptualjudgment.Acrossmodalfeatureorthogonaltothatbeingtested will not influence the decision criterion. Thus, any measured changes in behavior can be assumedtoresultfromchangesinperception.Thisapproachhasbeensuccessfullyusedin unimodalstudies,forexampletoobjectivelymeasureauditoryobjectformation[12].
Howcouldanorthogonalstimulusfeatureaffectperception?Ifstimulusfeaturesarebeing boundtoformaperceptualobjectthen,consistentwithobject-basedtheoriesofattention[1–
3,13],allofthefeaturesofthatobjectshouldsubsequentlybeenhanced.Theseassumptionsare easilyextendabletothecaseofcrossmodalobjects.Furthermore,bymakingajudgmentona featureorthogonaltothefeaturesthatleadtobinding,weremovethepossibilityofcrossmodal influencethroughsimpledecisionbiasing.
Awidevarietyofexperimentalparadigmshavebeenusedtoinvestigatecrossmodal interac-tions.Ofthese,illusionshavegainedparticular tractionbecausetheyprovideinsightintothe obligatorymechanisms for resolving conflicting multisensoryinformation. Reports of illusory perceptshavebeenheldupasdemonstratingmultisensoryintegrationandcrossmodalbinding, butwithoutconsiderationoftheirdifferences.Weusethreecommonlyinvestigatedillusionsto showhowthebulkofthesestudiesdemonstratemultisensoryintegrationandareconsistent with–butdonotconclusivelyindicate–bindingoffeaturesintoAVobjectsattheperceptual level.Theseexamplesalsohighlighttheneedfortestsbasedonorthogonalfeatures. Thefirstexampleistheventriloquistillusion[7,14].Herethelocationofasound-sourcecanbe ‘captured’byavisualstimulus,forexample,thevoiceoftheventriloquistappearstocomefrom her puppet's moving mouth rather than from her own (stationary) mouth. This illusion is compelling;intuitively,wehavetheimpressionthatthevoiceandmouthhavebeen‘combined’ toformanobject.However,doesthisillusionnecessarilytellusthatthebrainhasboundthe auditoryandvisual signals?Observershavebeenshownto weightheir estimateoflocation accordingtothereliabilityofthesignalsineachmodalityinamannerconsistentwithBayesian decision-making [7]. This finding suggests that independent estimates are made for each modalityatalaterdecision-makingstagethatcombinesinformationacrosssensorymodalities, andwouldthereforebeconsistentwith ourdefinitionofcrossmodalintegration, ratherthan crossmodalbinding. If observers are askedto localize both the auditoryandvisual source separately,thenthelocationofthesoundismuchlessbiasedthanwhenobserverstreatthetwo signalsasasinglesource,suggestingindependentperceptualestimatesaremaintained[15– 17].Moreovertop-downfactorssuchasemotionalvalenceorrewardexpectationcanalterthe magnitudeoftheventriloquismeffectobserved[16,17].Suchtop-downmediationoftheeffect suggeststhatitis,atleastinpart,attributabletochangesinjudgmentratherthanperception. AsecondcommonlyusedillusionistheMcGurkeffect,inwhichavideoofamouthmovement affectstheauditoryphonemethatlistenersreporthearing[18],wherebytheperceptisneitherof theveridicalunisensoryperceptsbutisinsteadathirdone.Thisillusionisalsoinfluenced by higher-levelcontextualeffects[19,20]aswellasvisualattention[21].Eveninthesecasesthe illusoryperceptcouldrepresentabiasintheconsonantperceptioncontinuum,influencedby bothauditoryandvisualstimuli.Suchprocesses,inwhichstimuliineachmodalityarecoded independently,andthenajudgmentismadebyconsideringtheinformationprovidedbyeach, areconsistentwiththeresultsdiscussedabovefortheventriloquistillusion[7],aswellasother
illusory [22] and non-illusory multisensory behaviors [6]. Furthermore, uncertainty in visual speechsignalscan alsopotentially explainsomeoftheobservations inMcGurk paradigms
[23],makingitunclearatwhatstagetheintegrationisoccurring.Onestudythatdoestestan orthogonalfeature[temporal(a)synchronyasopposedtophonemeidentity]findsthatsubjects aremore sensitiveto asynchronies inillusoryaudiovisual syllables thanin congruent ones, suggestingthattheformerarenotintegratedasstrongly,furthercastingdoubtonbindingasthe soleexplanationfortheillusion[24].
Athirdcommonlyexploredillusionisthesound-inducedflashillusion(SIFI)[25].TheSIFIisan illusioninwhichbriefvisualandauditorystimuliarepresentedrapidly,andthenumberofauditory stimuliinfluences thereportednumberofvisual stimuli.Signal-detectiontheoryanalysishas demonstratedthatillusoryflashesareaccompaniedbymeasuredchangesinsensitivity(andnot onlybias)[26,27],suggestingthatthisillusionisdueinparttoachangeinperception.However, illusoryflashesarenotperceivedinthesamewayasrealflashes:whenofferedathird‘not-one, not-two’optionmanysubjectschooseit[28].WhilemostexperimentsutilizingtheSIFIdonot fulfill our proposed criteria of testing for AV object formation, by using a stimulus feature orthogonaltotheone beingbound,afew do.Onerecentstudyaskedsubjectstonotonly countthenumberofflashesbutalsodescribetheircolor(anorthogonalfeaturedimension)[29], andasecondtestedcontrastperceptioninadditiontothenumberofevents[27]andfoundthat theeffectislikelyexplainedbybothaperceptualchangeaswellasacriterionshift.
Moststudiesofillusionsequatechangesinstimulusjudgmentswith trulyalteredperception resultingfrombinding.However,withafewexceptions[24,27,29],allthreeillusionparadigms sufferfromtheproblemsindicatedinFigure2–thedimensionthatlinkstheauditoryandvisual stimuliisnotorthogonaltothestimulusdimensionbeingjudged,makingitimpossibletotell bindingapartfrom otherforms ofmultisensoryintegration.Intheventriloquist illusion,visual spaceinfluencesauditoryspatialjudgments;intheMcGurkeffect,avisualspeechcueinfluences thephonemereported;intheSIFI,thenumberofauditoryeventsinfluencesthenumberofvisual eventsreported.Theambiguityinherentininterpretingsuchresultsunderscorestheneedfor testingfeaturesorthogonaltothecrossmodalinfluenceifonewishestoconclusively demon-stratecrossmodalbinding.
BehavioralTestsforIdentifyingBinding:TheCaseforStimulusCompetition
Inmostlaboratorysituations,experimentersdonotimposecompetitionintheirexperimental design.Wearguethat,byintroducinganelementofstimuluscompetition,wecandrawupon theobject-basedattentionliteratureto generatespecificandtestable predictionsaboutthe processingadvantagesconferredupon thefeaturesassociatedwith asingleAV object.We furtherargue that introducingstimulus competition provides a morenaturalistic andtaxing situationthatincreasestheperceptualbenefitofferedbycrossmodalbinding,makingbinding easiertodetect.Competitionhasbeenproposedassimilarlyimportantforstudyingmultisensory attentionalprocessing[8].
Theimportanceofstimuluscompetitioncanbedemonstratedbycomparingtheoutcomesof studiestestingtheinfluenceofspaceontheSIFI.Thosestudiesshowedthattheprobabilityof anillusoryperceptwasuninfluencedbythedegreeofspatialseparationbetweenauditoryand visual stimuli [30], and that visual sensitivity was the key determinant of the probability of perceiving a SIFI, and not audiovisual spatial proximity [26]. However, when two spatially separatedstimulusstreamswereplacedincompetition,andsubjectswereinstructedtodirect theirspatialattentiontoonlyoneofthesestreams,spatialfactorsbecameapparent[31]. A crossmodalobject shouldbemoresalient thana unimodalone, which should provide a processingbenefit, especiallyinthecaseofcompetingstimuli.Forexample,arecentstudy
engagedobserversinaselectiveattentiontaskwhichrequiredthattheyreportbriefpitchor timbredeviantsinoneoftwoongoingindependentlyamplitude-modulatedsound-streams[32]. Observersalsoattendedaradius-modulateddiskthatchangedcoherentlywiththeamplitudeof either the target stream orthe masker stream, and were asked to report occasional brief changesincolor.Performancewasbetterwhenthevisualstimuluswastemporallycoherentwith thetargetauditorystream thanwhenitwascoherentwith themaskerstream.Becausethe modulationsofthevisualstimulusofferednoinformationastothetimingofthetargetauditory deviants,theauthorssuggestedaconceptualmodelproposingthatthetemporally-coherent auditory and visual streams formed an AV object whose properties were subsequently enhanced.Thuswhentheauditorytargetstreamandthevisualstimuluswereboundintoa singleobject,performancewasimprovedbecauseobserverswerenolongerrequiredtodivide theirattentionacrosstwosensoryobjects.Drawingontheoriesofobject-basedattention[1,8], thismodel[32]leadstotestablebehavioralpredictions.Forexample,whilethisstudyshowed visualenhancementofauditoryperception,weexpectanequivalentauditoryenhancementof visualperception[32].
Inthissectionwehavedelineatedthedifferencebetweenbindingleadingtocrossmodalobject formationfrombroaderintegration,andsuggestwaystodetermineunambiguouslythe exis-tenceofbinding.Wenowdiscussthedifferingneuralprocessingthatunderliesthesedistinct ideasandofferguidelinesforneuralexperimentstocomplementthebehavioralapproaches discussedabove.
SectionII.NeuralBasis
Whatmechanismmightallowtemporally-coherentauditoryandvisualinformationtobebound? InFigure2 wedrewadistinctionbetweenmultisensoryintegrationatthelevelofperception (decision-makingremainsunchanged,butacrossmodalinputaltersthestimuluscodingand consequentlyperception) andatthe level ofdecision-making (the stimuluscoding within a modality is unchanged but information in anothermodality changes the way in which the stimulusisinterpreted).Specifically,wehighlightedbindingasaperceptualeffectratherthan onethatwascausedbyinteractionsatthelevelofdecision-making(Figure2A).Multisensory interactionsarefoundinamultitudeofcorticalandsubcorticallocations[4],anditseemslikely that different anatomical loci might perform different types of multisensory integration. We proposeherethatmultisensoryintegrationinearlysensorycortexprovidesthe neurophysiologi-calsubstrateforcrossmodalbinding.Weextendtheconceptualmodeldescribedabovetoa neurophysiologicaloneinwhichavisualstimuluscanmodulatetheactivityofauditorycortical neuronsviadirectfeedforwardorlateralvisualinputsintoauditorycortex,providinga mecha-nismthroughwhichauditoryandvisualstimulicanbeboundtogether,enhancingtheir repre-sentations(Figure3).Becauseneuralactivityinearlyauditorycortexiscloselytiedtoperception
[33,34],integratingvisualinformationearlywouldfacilitategenuineperceptualshifts.Weargue thatvisualinputstoauditorycortexmightacttoenhancetheactivityofneuronsthatrepresenta sound-sourcethatiscoherentwithavisualstimulus,andthatthisenhancedneural subpopula-tioncould formthesubstrate onwhichtop-down connectionsmediatingselectiveattention furtherhoneneuralprocessing.
Asinthebehavioralstudiesmentionedabove,wearguethattheuseofacompetingstimulus streamwillbeinstrumental inelucidatingthe neuralmechanisms underlyingbindingandAV objectformation. Moreover,the use ofstimulus competition offersthepotential to testthe hypothesisthatvisual stimulus-inducedneuralcorrelates ofbinding occur independently of selectiveattention–somethingthatisimpossibletotestwithanyformofbehavioralparadigm.It hasbeenobservedthatasecondsound-sourcecanradicallychangetheresponseproperties ofneuronsinauditorycortexevenintheabsenceofselectiveattention[35,36].Whereassingle neuronsdisplaybroadsensitivitytosound-sourcespresentedinisolation,whentwosourcesare
presentedincompetitionneuralselectivityisgreatlyenhanced,resultingindiscretepopulations ofneuronseachrepresentingone ofthetwo competingsources[35,36].Wepredictthata visual stimulus would selectively enhance (or suppress) neural populations based on their temporal(in)coherence.
Consistentwiththevarietyofforms thatmultisensoryintegrationcantake,crossmodal inter-actionscanbeobservedatmanyprocessingstagesinthebrainfromsensorythalamus[37,38]
toprefrontalcortex[39–41].However,weproposethatcross-sensoryinputstoearlysensory cortexplayakeyroleinbindingandthusAVobjectformation,whilemultisensoryprocessing atlatersitespredominantlysupportsotherformsofmultisensoryintegrationsuchas decision-making.
EarlyCross-SensoryInteractions:AMechanismforBindingAuditoryandVisualInformation?
Wesuggestthatthepatternofanatomicalinnervationandphysiologicalresponsepropertiesin earlysensorycortexmakethemideallysuitedtosupportingthebindingofauditoryandvisual
Auditory inputs
Visual input
Early auditory cortex
Higher corces
No visual input or aenon Coherent visual input Coherent visual input + selecve aenon
(A) (B) (C)
Aenon
Figure3.TheEffectofaCoherentVisualStimulusandSelectiveAttentionontheNeuralRepresentationof
CompetingAuditoryStreams.(A)Startingatthebottomofthefigure,twoamplitude-modulatedtonesaretheauditory
inputs.Theseinputsareprocessedbyneuronsarrangedinatonotopicarraysuchasthosefoundinearlyauditorycortex (withwhitesinusoidsdenotingbestfrequencies),generatingtwoperistimulustimehistograms(PSTH)ateachofthe respectiveoutputs.Thetwostreamsareequallysalient(i.e.,equaloverallspikerates,butdifferenttemporalpatterns).These twostreamsarecombinedathigherlevelsofprocessingwithequalweighting.ThecolorofthePSTHinthehighercortices reflectsitssimilaritytoeachofthetwoinputwaveforms(blueandyellow).(B)Thesameas(A)butwithamodulatedvisual stimulusthatiscoherentwiththehigher-frequencyauditoryinput(blue).Theinfluenceofthebluemodulationtime-course (whichcouldbeachievedthroughspikingorsubthresholdvisualinputs)enhancestheresponsetotheblue(higher-frequency) waveformandreducestheresponsetotheyellow(lower-frequency)waveform.Thisresultsintheresponseinhighercortices moreresemblingthebluewaveformthantheyellow.(C)Thesameas(B)butwiththeaddedeffectofselectiveattentionto thehigher-frequencybandresultinginthefinalneuralreadoutresemblingevenmoretheoriginalblueinput.
stimulusfeatures.Theincidenceofearlymultisensoryinteractionshasbeendemonstratedin rodents[42–44],carnivores[45],non-humanprimates[46–48],andhumans[49–52]. Anatomi-calstudiesrevealaplethoraofpotentialbottom-upandlateralvisualinputs,includingprimary visualcortex,secondaryvisualcortex,andmultisensorythalamus[43,45,47,53,54],and func-tionalevidenceisconsistentwitharoleforvisualinputsinfeedforwardprocessing[45,55,56]. Directconnectionsfromauditorytovisualcortexhavebeenshowntomodulatebothspiking activityinvisualcortexandvisualperception,demonstratingthatdirectcortico-cortical con-nectionscanmediatemultisensoryintegration[44].Despitethisevidence,theroleofsuchearly cross-sensoryprojectionsremains elusive –likelyinpartbecause oftherelativescarcity of invasiveneurophysiologicalstudiesinbehavinganimalsperformingmultisensorytasks. AVinteractionswithinauditorycortexcanbefacilitativeorsuppressive,andmaybevisibleasa spikingresponse[45,57–59],anevokedlocalfieldpotentialresponse[58,60],orasan entrain-mentofrhythmicactivityacrosscorticalareas[61,62].Therelativetimingofauditoryandvisual signalscaninfluencethenatureofmultisensoryinteractionssuchthatthesameneuroncan showfacilitationorinhibitiondependingontheAV(a)synchrony[45,59,63,64].Spatial coinci-dencealsodeterminesthenatureofmultisensoryinteractionsinbothsubcorticalandcortical structures[57,63,65].Thusthepropertiesofmultisensoryinteractionsinearlysensorycortexare knowntobedependentonthetemporalcoherencenecessaryforbinding[58,60].Mostinvasive studiesofmultisensoryintegrationinsensorycortexhaverelieduponartificialstimulipresented asbrieftransientbursts,whileparameterssuchastheir onsettimingorspatiallocationhave beenmanipulated.Studieswithmorenaturalisticstimulisuggestthattemporally-dynamicvisual stimulicanmodulatethelocalfieldpotentialandspikingactivityinamannerthatimprovesthe reliabilityofauditorycorticalresponses[66].Stimulus-evokeddeflectionsofthefieldpotential couldpotentiallymodulatethefiringratesofcorticalneuronsbecauseactivityfromcoincident soundarrivingathigh-excitabilityphasesofthefieldpotential[48,58,66]willbeenhanced,while activityarrivingatnon-coincidenttimesislikelytofallinlow-excitabilityphases,thusdecreasing theoverallresponsetotheacousticstimulus.Spikinginputs(morecommoninsecondaryareas) couldmodulateactivityinthesameway,butprovidemorerobustmodulation(Figure3).Fora recentreviewofthemechanismsthatmightsupporttheseeffectssee[67].
Evidenceconsistentwiththeideathatinteractionsbetweensensorycorticesplayakeyrolein bindingauditoryandvisualstimulusfeaturesisprovidedbyhumanneuroimagingstudies.The SIFIisthoughtto bemediatedvia earlysensorycortex[68].ModulationoftheevokedEEG responsetoMcGurkstimulicorrelateswithperceptionoftheillusorysyllables[69]and,although thereisnodirectmappingofEEGtopographytounderlyingneuralsources,modulationofthe earliest components is consistentwith visual stimuli eliciting changesoccurring in auditory cortex. A concurrent visual stimulus can enhance the representation ofa sound-source in humanauditorycortexbothforsinglesound-sources[41]andwhenlistenersarefacedwithtwo competingtalkers[70].Inthefirstinstancesuchaneffectcouldresultsimplyfromamultisensory stimulusbeingmoresalient,andwouldnotnecessarilyindicatebindingnorconferany percep-tualadvantages.Inthesecondcase,however,therepresentationofthecoherentsound-source isenhancedoverthatofthecompetingstream;thisfindingcannotbeexplainedbyageneral effectofarousalandisthereforelikelytobeindicativeofbinding.
Ifearlysensoryareasmediatebindingthenwhataretherolesofmultisensoryresponsesinother brainregions?Multisensoryinteractionsarenot exclusivetoearly sensorycortexandoccur throughoutthebraininvariousforms.Forexample,neuronsinprefrontalcortexaresensitiveto multimodalmismatch[40,71,72].Tasksrequiringalevelofsemanticprocessinghighlightareas, includingthesuperiortemporal sulcus(STS)andintraparietalsulcus(IPS),wherethesizeof multisensoryenhancementobservedpredictsthebehavioraladvantageanindividualshowsfor multisensoryoverunisensoryobjectclassification[41,73].Feedforwardandlateralconnections
inearlysensorycortexarealsocomplementedbyfeedbackconnectionsfromhigherbrainareas suchastheSTSandIPS[60,74–76],andonerolefortheseconnectionsisproposedtobein mediatingpredictivecoding[77]orobjectrecognition[76].
PhysiologicalTestsforBinding
Wehaveidentifiedseveralfeaturesthatareconsistentwith thehypothesisthatmultisensory integrationinearlyauditorycortexplaysakeyroleinbindingandhenceAVobjectformation.We suggest that the following observations would demonstrate a role for early cross-sensory interactionsintheformationofanAV object.(i)In thepresence oftwocompeting auditory stimuliavisualstimulusthatistemporallycoherentwithonestreamshouldboostthe represen-tationofthesubsetofneurons thataredriven bythatauditorystream.This couldoccur by facilitatingthe responses of theneurons responding to the coherent sound-stream and/or suppressingtheresponsesofneurons respondingtothetemporally-incoherentstream,and wouldrequireinvasiveneurophysiologicalrecordingstodisambiguatethesetwopossibilities.(ii) Thesephysiologicaleffectsshouldbeobservedintheabsenceofattention,butwepredictthat selectiveattentiontothecoherentAVsourcewouldfurtheramplifythem.(iii)Acrucialtestfora neuralcorrelateofbindingisanenhancementofallofthefeaturesthataneuronrepresentsand notsimplythosethataretemporallycoherentwiththevisualstimulus.Forexample,aneuronin whichvisualandauditoryinformationareboundbycoherentchangesinsoundintensityand visualluminanceshouldbebetterabletorepresentachangeinotherfeaturesofthe sound-sourcethanwhentheluminanceandintensityvaryindependently.(iv)Finally,elicitinga behav-ioraldeficitinataskthatrequirescrossmodalbindingbyselectivelysilencinginputsfromvisual cortextoauditorycortexwouldprovideacausaldemonstrationoftheimportanceofinteractions betweensensorycorticesinformingAVobjects.Whilesomestudieslookingatmacro-scale signalswithhumanimagingmethodshaveprovidedempiricalevidenceinfavorofthefirstof theseproposals(e.g.,[70],addressingpointiiabove)theremainingpointsremainunaddressed. ConcludingRemarks
WehaveprovidedafunctionaldefinitionofanAVobjectasaperceptualconstructwhichoccurs whena constellation ofstimulus featuresare boundwithin the brain.Further, we identified bindingasaspecificcaseofmultisensoryintegrationinwhichstimulusfeaturesareperceptually groupedacrossmodalities,leadingtotheformationofcrossmodalobjects.Wearguethatthe formationofanAVobjectcanbedeterminedbydemonstratingacrossmodalinfluenceona featuredimensionorthogonaltotheonethatpromotesbinding.Wefurtherbelievethatstimulus competitionisimportantbecausebindingismostbeneficialundernaturalisticlisteningsituations withmultiplesound-sources,andsuchsituationsprovideawayofinvestigatingbindingbased ontheories ofobject-basedattention.Finally,weproposethatfeedforward orlateral cross-sensoryconnectionsinearlysensorycortexfacilitatebinding.
Acknowledgments
ThisworkwasfundedbygrantstoJ.K.B.(WellcomeTrust/RoyalSocietyWT098418MA),R.K.M(NationalInstitutesof HealthK99DC014288andHearingHealthFoundationEmergingResearchGrant),andA.K.C.L.(NIHR01DC013260),and alsobyanInternationalExchangesSchemeawardfromtheRoyalSocietytoJ.K.BandA.K.C.L.
References
1. Blaser,E.etal.(2000)Trackinganobjectthroughfeaturespace. Nature408,196–199
2. Desimone,R.andDuncan,J.(1995)Neuralmechanismsof selec-tivevisualattention.Annu.Rev.Neurosci.18,193–222
3. Shinn-Cunningham,B.G.(2008)Object-basedauditoryandvisual attention.TrendsCogn.Sci.12,182–186
4. Alais,D.etal.(2010)Multisensoryprocessinginreview:from physiologytobehaviour.SeeingPerceiving23,3–38
5. Raposo,D.etal.(2012)Multisensorydecision-makinginratsand humans.J.Neurosci.32,3726–3735
6. Sheppard,J.P.etal.(2013)Dynamicweightingofmultisensory stimulishapesdecision-makinginratsandhumans.J.Vis.13,4
7. Alais,D.andBurr,D.(2004)Theventriloquisteffectresultsfrom near-optimalbimodalintegration.Curr.Biol.14,257–262
8. Talsma,D.etal.(2010)Themultifacetedinterplaybetween atten-tionandmultisensoryintegration.TrendsCogn.Sci.14,400–410
9. Shamma,S.A.etal.(2011)Temporalcoherenceandattentionin auditorysceneanalysis.TrendsNeurosci.34,114–123
10.Durlach,N.I.andBraida,L.D.(1969)Intensityperception.I. Prelimi-narytheoryofintensityresolution.J.Acoust.Soc.Am.46,372–383
OutstandingQuestions
Bidirectionality.Wehavefocusedon visual influences on auditory scene analysis and within auditory cortex. Ourbehavioralpredictionsshouldbe bidirectional acrossmodalities. Does attending to a sound also enhance visualjudgmentsinabound crossmo-dalobject?
Competing Visual Stimuli. We have focusedontheroleofasinglevisual stimulusonanauditorymixture–but whataboutmultiplecompetingvisual stimuli?Dothesamerulesapplyand how dotheytrade off againsteach other?
Interaction with Attention. As has recentlybeenhighlightedfor multisen-soryattentionalprocessing[8], stimu-luscompetitionislikelytobecrucialto revealing such interactions: in the instance of single auditory or visual eventstreams, eachishighly salient andtheyarelikelytobeautomatically bound.Howdoesbottom-upsensory integration interact with top-down attentionbothbehaviorallyandatthe cellularlevel?
ActiveSensing.Whatroledoeyeand headmovementsplayin formingAV objects in a realistic multisensory environment?
Neural Underpinnings of Audiovisual Binding. Which (anatomical) inputs mightshapeneuronalresponse mod-ulation in auditorycortex? How are diverse latencies compensated for? Doeffects‘buildup’?
11.Green,D.M.andSwets,J.A.(1966)SignalDetectionTheoryand Psychophysics,Wiley
12.Micheyl,C.andOxenham,A.J.(2010)Objectiveandsubjective psychophysicalmeasuresofauditorystreamintegrationand seg-regation.J.Assoc.Res.Otolaryngol.11,709–724
13.Slutsky,D.A.andRecanzone,G.H.(2001)Temporalandspatial dependencyoftheventriloquismeffect.Neuroreport12,7–10
14.Radeau,M.andBertelson,P.(1974)Theafter-effectsof ventrilo-quism.Q.J.Exp.Psychol.26,63–71
15.Kording,K.P.etal.(2007)Causalinferenceinmultisensory per-ception.PLoSONE2,e943
16.Bruns,P.etal.(2014)Rewardexpectationinfluencesaudiovisual spatialintegration.Atten.Percept.Psychophys.76,1815–1827
17.Maiworm,M.etal.(2012)Whenemotionalvalencemodulates audiovisualintegration.Atten.Percept.Psychophys.74,1302– 1311
18.McGurk,H.andMacDonald,J.(1976)Hearinglipsandseeing voices.Nature264,746–748
19.Nahorna,O.etal.(2015)Audio-visualspeechsceneanalysis: characterizationofthedynamicsofunbindingandrebindingthe McGurkeffect.J.Acoust.Soc.Am.137,362–377
20.Nahorna,O.etal.(2012)Bindingandunbindingtheauditoryand visualstreamsintheMcGurkeffect.J.Acoust.Soc.Am.132, 1061–1077
21.Tiippana,K.etal.(2004)Visualattentionmodulatesaudiovisual speechperception.Eur.J.Cogn.Psychol.16,457–472
22.Shams,L.etal.(2005)Sound-inducedflashillusionasanoptimal percept.Neuroreport16,1923–1927
23.Tiippana,K.(2014)WhatistheMcGurkeffect?Front.Psychol.5, 725
24.vanWassenhove,V.etal.(2007)Temporalwindowofintegration inauditory-visualspeechperception.Neuropsychologia45, 598–607
25.Shams,L.etal.(2000)Illusions.Whatyouseeiswhatyouhear. Nature408,788
26.Kumpik,D.P.etal.(2014)Visualsensitivityisastronger determi-nantofillusoryprocessesthanauditorycueparametersinthe sound-inducedflashillusion.J.Vis.14,12
27.McCormick,D.andMamassian,P.(2008)Whatdoesthe illusory-flashlooklike?VisionRes.48,63–69
28.vanErp,J.B.etal.(2013)Observerscanreliablyidentifyillusory flashesintheillusoryflashparadigm.Exp.BrainRes.226,73–79
29.Mishra,J.etal.(2013)Auditioninfluencescolorprocessinginthe sound-inducedvisualflashillusion.Vis.Res.93,74–79
30.Innes-Brown,H.andCrewther,D.(2009)Theimpactofspatial incongruenceonanauditory-visualillusion.PLoSONE4,e6450
31.Bizley,J.K.etal.(2012)Nothingisirrelevantinanoisyworld: sensoryillusionsrevealobligatorywithin-andacross-modality inte-gration.J.Neurosci.32,13402–13410
32.Maddox,R.K.etal.(2015)Auditoryselectiveattentionisenhanced byatask-irrelevanttemporallycoherentvisualstimulusinhuman listeners.Elife4,e04995
33.Bizley,J.K.etal.(2013)Auditorycortexrepresentsbothpitch judgmentsandthecorrespondingacousticcues.Curr.Biol.23, 620–625
34.Niwa,M.etal.(2012)Activityrelatedtoperceptualjudgmentand actioninprimaryauditorycortex.J.Neurosci.32,3193–3210
35.Maddox,R.K.etal.(2012)Competingsoundsourcesreveal spatialeffectsincorticalprocessing.PLoSBiol.10,e1001319
36.Middlebrooks,J.C.andBremen,P.(2013)Spatialstream segre-gationbyauditorycorticalneurons.J.Neurosci.33,10986–11001
37.Komura,Y.etal.(2005)Auditorythalamusintegratesvisualinputs intobehavioralgains.Nat.Neurosci.8,1203–1209
38.Noesselt,T.etal.(2010)Sound-inducedenhancementof low-intensityvision:multisensoryinfluencesonhuman sensory-spe-cificcorticesandthalamicbodiesrelatetoperceptual enhance-mentofvisualdetectionsensitivity.J.Neurosci.30,13609–13623
39.Noppeney,U.etal.(2010)Perceptualdecisionsformedby accu-mulationofaudiovisualevidenceinprefrontalcortex.J.Neurosci. 30,7434–7446
40.Diehl,M.M.andRomanski,L.M.(2014)Responsesofprefrontal multisensoryneuronstomismatchingfacesandvocalizations. J.Neurosci.34,11233–11243
41.Werner, S. and Noppeney, U. (2010) Distinct functional contributions of primary sensoryand association areas to audiovisualintegrationin objectcategorization.J.Neurosci. 30,2662–2675
42.Wallace,M.T.etal.(2004)Arevisedviewofsensorycortical parcellation.Proc.Natl.Acad.Sci.U.S.A.101,2167–2172
43.Budinger,E.etal.(2006)Multisensoryprocessingviaearlycortical stages:Connectionsoftheprimaryauditorycorticalfieldwithother sensorysystems.Neuroscience143,1065–1083
44.Iurilli,G.etal.(2012)Sound-drivensynapticinhibitioninprimary visualcortex.Neuron73,814–828
45.Bizley,J.K.etal.(2007)Physiologicalandanatomicalevidencefor multisensoryinteractionsinauditorycortex.Cereb.Cortex17, 2172–2189
46.Fu,K.M.et al.(2003)Auditorycortical neuronsrespondto somatosensorystimulation.J.Neurosci.23,7510–7515
47.Hackett,T.A.etal.(2007)Multisensoryconvergenceinauditory cortex.II.Thalamocorticalconnectionsofthecaudalsuperior temporalplane.J.Comp.Neurol.502,924–952
48.Lakatos,P.etal.(2007)Neuronaloscillationsandmultisensory interactioninprimaryauditorycortex.Neuron53,279–292
49.Calvert,G.A.etal.(1999)Responseamplificationin sensory-specific corticesduringcrossmodalbinding. Neuroreport10, 2619–2623
50.Foxe,J.J.etal.(2000)Multisensoryauditory–somatosensory interactionsinearlycorticalprocessingrevealedbyhigh-density electricalmapping.BrainRes.Cogn.BrainRes.10,77–83
51.Martuzzi,R.etal.(2007)Multisensoryinteractionswithinhuman primarycorticesrevealedbyBOLDdynamics.Cereb.Cortex17, 1672–1679
52.Foxe,J.J.etal.(2002)Auditory–somatosensorymultisensory processinginauditoryassociationcortex:anfMRIstudy.J. Neuro-physiol.88,540–543
53.Budinger,E.etal.(2007)Non-sensorycorticalandsubcortical connectionsoftheprimaryauditorycortexinMongoliangerbils: bottom-upandtop-downprocessingofneuronalinformationvia fieldAI.BrainRes.1220,2–32
54.Smiley,J.F.etal.(2007)Multisensoryconvergenceinauditory cortex.I.Corticalconnectionsofthecaudalsuperiortemporal planeinmacaquemonkeys.J.Comp.Neurol.502,894–923
55.Molholm,S.etal.(2002)Multisensoryauditory–visualinteractions duringearlysensoryprocessinginhumans:ahigh-density electri-calmappingstudy.BrainRes.Cogn.BrainRes.14,115–128
56.Foxe,J.J.andSchroeder,C.E.(2005)Thecaseforfeedforward multisensoryconvergenceduringearlycorticalprocessing. Neuro-report16,419–423
57.Bizley,J.K.andKing,A.J.(2008)Visual-auditoryspatial process-inginauditorycorticalneurons.BrainRes.1242,24–36
58.Kayser,C.etal.(2008)Visualmodulationofneuronsinauditory cortex.Cereb.Cortex18,1560–1574
59.Chandrasekaran,C.etal.(2013)Dynamicfacesspeedupthe onsetofauditorycorticalspikingresponsesduringvocal detec-tion.Proc.Natl.Acad.Sci.U.S.A.110,4668–4677
60.Ghazanfar,A.A.etal.(2005)Multisensoryintegrationofdynamic facesandvoicesinrhesusmonkeyauditorycortex.J.Neurosci. 25,5004–5012
61.Maier,J.X.etal.(2008)Integrationofbimodalloomingsignals throughneuronalcoherenceinthetemporallobe.Curr.Biol.18, 963–968
62.Mercier,M.R.etal.(2015)Neuro-oscillatoryphase alignment drivesspeededmultisensoryresponsetimes:an electro-cortico-graphicinvestigation.J.Neurosci.35,8546–8557
63.Stein,B.E.etal.(1988)Neuronsandbehavior:thesamerulesof multisensoryintegrationapply.BrainRes.448,355–358
64.Perrodin,C.etal.(2015)Naturalasynchroniesin audiovisual communicationsignalsregulateneuronalmultisensory interac-tionsin voice-sensitivecortex. Proc.Natl. Acad.Sci. U.S.A. 112,273–278
65.Meredith,M.A.andStein,B.E.(1986)Spatialfactorsdeterminethe activityofmultisensoryneuronsincatsuperiorcolliculus.Brain Res.365,350–354
66.Kayser,C.etal.(2010)Visualenhancementoftheinformation representationinauditorycortex.Curr.Biol.20,19–24
67.vanAtteveldt,N.etal.(2014)Multisensoryintegration:flexibleuse ofgeneraloperations.Neuron81,1240–1253
68.Mishra,J.etal.(2007)Earlycross-modalinteractionsinauditory andvisualcortexunderlieasound-inducedvisualillusion.J. Neu-rosci.27,4120–4131
69.RoaRomero,Y.etal.(2015)Earlyandlatebeta-bandpower reflectaudiovisualperceptionintheMcGurkillusion.J. Neuro-physiol.113,2342–2350
70.ZionGolumbic,E.etal.(2013)Visualinputenhancesselective speechenvelopetrackinginauditorycortexata‘cocktailparty’.J. Neurosci.33,1417–1426
71.Gau,R.andNoppeney,U.(2015)Howpriorexpectationsshape multisensoryperception.Neuroimage124,876–886
72.Hein,G.etal.(2007)Objectfamiliarityandsemanticcongruency modulateresponsesincorticalaudiovisualintegrationareas.J. Neurosci.27,7881–7887
73.Werner,S.andNoppeney,U.(2010)Superadditiveresponsesin superiortemporalsulcuspredictaudiovisualbenefitsinobject categorization.Cereb.Cortex20,1829–1842
74.Noesselt,T.etal.(2007)Audiovisualtemporalcorrespondence modulateshumanmultisensorysuperiortemporalsulcusplus primarysensorycortices.J.Neurosci.27,11431–11441
75.Falchier,A.etal.(2002)Anatomicalevidenceofmultimodal inte-grationinprimatestriatecortex.J.Neurosci.22,5749–5759
76.Raij,T.etal.(2000)Audiovisualintegrationoflettersinthehuman brain.Neuron28,617–625
77.Lee,H.andNoppeney,U.(2014)Temporalpredictionerrorsin visualandauditorycortices.Curr.Biol.24,R309–R310
78.Bizley,J.K.andCohen,Y.E.(2013)Thewhat,whereandhow ofauditory-objectperception.Nat.Rev.Neurosci.14,693–707
79.Bregman,A.S.(1990)AuditorySceneAnalysis,MITPress
80.Pasupathy,A.(2015)Theneuralbasisofimagesegmentationin theprimatebrain.Neuroscience296,101–109
81.Wagemans,J.etal.(2012)AcenturyofGestaltpsychologyin visualperception.I.Perceptualgroupingandfigure-ground orga-nization.Psychol.Bull.138,1172–1217
82.DiCarlo,J.J.etal.(2012)Howdoesthebrainsolvevisualobject recognition?Neuron73,415–434
83.Griffiths,T.D.andWarren,J.D.(2004)Whatisanauditoryobject? Nat.Rev.Neurosci.5,887–892
84.Kourtzi,Z.andConnor,C.E.(2011)Neuralrepresentationsfor object perception:structure, category,and adaptive coding. Annu.Rev.Neurosci.34,45–67
85.Duncan,J.(2006)EPSMid-CareerAward2004:brain mecha-nismsofattention.Q.J.Exp.Psychol.59,2–27
86.Maddox,R.K.andShinn-Cunningham,B.G.(2012)Influenceof task-relevantandtask-irrelevantfeaturecontinuityonselective auditoryattention.J.Assoc.Res.Otolaryngol.13,119–129