Optimal feature selection for classifying a large set of chemicals using metal oxide sensors


Contents lists available at ScienceDirect

Sensors and Actuators B: Chemical

journal homepage: www.elsevier.com/locate/snb


Thomas Nowotny a,b,∗, Amalia Z. Berna b, Russell Binions c, Stephen Trowell b

a CCNR, School of Engineering and Informatics, University of Sussex, Falmer, Brighton BN1 9QJ, UK
b CSIRO Ecosystem Sciences and Food Futures Flagship, GPO Box 1700, Canberra, ACT 2601, Australia
c School of Engineering and Materials Science, Queen Mary University of London, London E1 4NS, UK

Article info

Article history: Available online 13 February 2013

Keywords: Feature selection; Metal oxide sensors; Classification; Support vector machines; Electronic nose

Abstract

Using linear support vector machines, we investigated the feature selection problem for the application of all-against-all classification of a set of 20 chemicals using two types of sensors, classical doped tin oxide and zeolite-coated chromium titanium oxide sensors. We defined a simple set of possible features, namely the identity of the sensors and the sampling times, and tested all possible combinations of such features in a wrapper approach. We confirmed that performance is improved, relative to previous results using this data set, by exhaustive comparison of these feature sets. Using the maximal number of different sensors and all available data points for each sensor does not necessarily yield the best results, even for the large number of classes in this problem. We contrast this analysis, using exhaustive screening of simple feature sets, with a number of more complex feature choices and find that subsampled sets of simple features can perform better. Analysis of potential predictors of classification performance revealed some relevance of clustering properties of the data and of correlations among sensor responses but failed to identify a single measure to predict classification success, reinforcing the relevance of the wrapper approach used. Comparison of the two sensor technologies showed that, in isolation, the doped tin oxide sensors performed better than the zeolite-coated chromium titanium oxide sensors but that mixed arrays, combining both technologies, performed best.

1. Introduction

Feature selection is one of the more important issues in the field of machine learning and bioinformatics. In general, the goals of feature selection are to reduce data dimensionality and to build robust classification models. The method of feature selection can affect the results of both classification and clustering. A good feature subset should strongly support classification and clustering. Filter and wrapper methods are two well-known feature selection techniques for high dimensional data sets. In the filter method, features are selected on the basis of feature separability of samples, which is independent of the learning algorithm. The separability only takes into account the relations between the features, so the selected features may not be optimal. Wrapper methods search for critical features based on the learning algorithm to be employed, and often lead to better results than filter methods [1].

Abbreviations: SVM, support vector machine; FS, feature selection; MOS, metal oxide sensors.

∗ Corresponding author at: School of Engineering and Informatics, University of Sussex, Falmer, Brighton BN1 9QJ, UK. Tel.: +44 1273 678593; fax: +44 1273 877873. E-mail addresses: [email protected] (T. Nowotny), [email protected] (A.Z. Berna), [email protected] (R. Binions), [email protected] (S. Trowell).

In the field of chemical sensing, when using sensor arrays, an important consideration is the type and the number of sensors to use. Further choices apply to how to sample data from the sensors and how to pre-process the collected raw data (see [2] for a recent review). It is well known in machine learning that this process of feature selection is very important for the eventual success of the overall classification (recognition) system. From the perspective of maximizing information it may seem that using more sensors can only improve performance, as long as the sensors are not fully redundant (identical) or fully uncorrelated with the problem (equivalent to noise). Moreover, from a practical point of view, solid-state chemistry using combinatorial synthetic approaches [3] or biosensor design, based on natural genetic diversity [4], are now capable of generating an almost limitless repertoire of potential chemical sensors. Identifying optimal or at least efficient subsets of these sensors and their secondary features for incorporation into chemical sensor arrays is becoming increasingly important.

Here, we systematically investigate the feature definition (extraction) and selection problem for fully classifying a set of 20 chemicals using metal oxide sensors (MOS) [5] and linear support vector machines (SVMs) [6], as in [7,8], in a wrapper approach [9,10], similar to the approach in [11] but using an exhaustive search as in [12,13].

0925-4005 Crown Copyright © 2013 Published by Elsevier B.V.
http://dx.doi.org/10.1016/j.snb.2013.01.088
Open access under CC BY-NC-ND license.

In the development of electronic noses, classical feature choices have been the maximum response and the area under the response curve. More recently, the area under a phase-space embedding of the sensor response [14,15] and exponential moving averages of the derivative of the sensor response have been proposed as suitable feature sets [16]. Other authors have put forward methods ranging from subspace projection methods with biomimetic inspiration [17], to spectral methods such as Fast Fourier Transform [18] and Discrete Wavelet Transform [19–22], classical statistical methods such as linear discriminant analysis and principal component analysis [23], and parametric methods such as curve fitting [24–27].

Here, we take a step back and investigate the use of very simple potential features: a subset of the measured data points (resistances at different measurement times). We contrast this analysis with some of the more popular feature choices discussed above. In contrast to many other works on enoses, we evaluate our methods on a quite large set of 20 analytes that are classified all-against-all. With this approach we complement the related works on smaller classification problems with only a few classes.

2. Materials and methods

2.1. Electronic nose measurements

The measurements were performed with a FOX 3000 Enose (Alpha M.O.S., Toulouse, France). Originally, the instrument was equipped with a modified array of 12 semiconducting sensors comprising an array of six standard, doped tin dioxide (SnO₂) and six chromium titanium oxide (CTO) and tungsten oxide (WO₃) sensors. We removed the CTO and WO₃ sensors and replaced them with an array of six novel CTO based sensors, five of these being zeolite-coated [28] and one an uncoated CTO sensor.

The basic concept behind the modified CTO sensors is the addition of a transformation layer over the porous chromium titanium oxide sensing element. A transformation element is designed to modify or restrict the composition of gases that contact the sensing element. In this case, the transformation layer comprises the acid (or sodium) forms of zeolites A, ZSM-5, ZSM-22 and MCM-41. Zeolites are ideal for this purpose due to their porous nature, having pore and channel structures of molecular dimensions (see Table 1). They are able to restrict the size and shape of gas phase molecules reaching the sensor through pore size control and selective permeability [29,30]. They also act as selective cracking and partial oxidation catalysts [31] with molecular size- and shape-specificity. Furthermore, the zeolite's catalytic behaviour can be modified, and thus tuned, by insertion of metal ions or various nanoparticles. Modification may be by either ion-exchange or lattice substitution on either their internal or external surfaces.

Fig. 1. Example of responses from the FOX Enose fitted with the twelve-sensor array. Responses of SnO₂ sensors are drawn downwards and responses of CTO sensors upwards. The vertical lines mark the chosen available sampling times. Note that the SnO₂ sensors are more sensitive overall than the CTO sensors.

Due to the physical characteristics of the sensors, the two arrays were housed in different chambers. The set of chemical compounds analysed consisted of five chemicals each from four chemical groups: alcohols, aldehydes, esters and ketones (Table 2). Chemicals were chosen from a larger set of chemicals used in a comparison of metal oxide with biological sensors [32]. Each chemical compound was diluted using paraffin oil to give a final concentration in the range 1.22–8.03 × 10⁻⁵ M (Table 2). In total, ten replicates of each sample were prepared. Standard concentrations were chosen for each chemical class, such that the absolute values of the responses for most chemicals of that class were towards the higher end of the scale for either of the two sensor types, i.e. the ratio of the maximal resistance change to the baseline (R/R₀) was between ±0.8 and ±1 (Fig. 1). Samples of 1 ml were presented in a 20 ml glass vial using the static headspace method. The instrument was equipped with an autosampler (HS50, CTC Analytics, Switzerland), which allows reproducible injections. The injection port of the Fox was set to 30 °C and the headspace volume taken for analysis was 500 µl. Samples were analysed in groups based on chemical family, with the analysis of the 200 samples being completed over four days. Dry zero grade air (flow rate 150 ml min⁻¹) was used to sweep the sample through the two sensor chambers. The sensor response was recorded for a total of 300 s at 2 Hz, see Fig. 1 for a typical measurement. A 240 s delay was imposed between samples to allow the sensors to return to baseline; this cleaning procedure was performed at a flow rate of 150 ml min⁻¹ using dry zero grade air. Data were captured and pre-analysed using AlphaSoft v.8 (Toulouse, France).

Table 1
Overview of the sensors in the electronic nose, together with a brief description of each sensor.

#    Sensor         Description
1    CTO            Chromium–titanium oxide sensor without coating; general VOC sensor.
2    CTO+HZSM-5     Chromium–titanium oxide sensor with H-ZSM-5 zeolite overlayer, pore size 5.1–5.5 Å.
3    CTO+NaZSM-5    Chromium–titanium oxide sensor with Na-ZSM-5 zeolite overlayer, pore size 5.1–5.5 Å.
4    CTO+HLTA       Chromium–titanium oxide sensor with H-LTA zeolite overlayer, pore size 3.5 Å.
5    CTO+MCM-41     Chromium–titanium oxide sensor with MCM-41 overlayer, pore size 30–100 Å.
6    CTO+HZSM-22    Chromium–titanium oxide sensor with H-ZSM-22 zeolite overlayer, pore size 4.6–5.7 Å.
7    T30/1          SnO₂ sensor for detecting solvents. (a)
8    P10/1          SnO₂ sensor for detecting hydrocarbons and methane. (a)
9    P10/2          SnO₂ sensor for detecting methane, propane and aliphatic nonpolar molecules. (a)
10   P40/1          SnO₂ sensor for detecting chlorinated and fluorinated compounds. (a)
11   T70/2          SnO₂ sensor for detecting alcohol vapours and aromatic compounds. (a)
12   PA/2           SnO₂ sensor for detecting low concentration of hydrogen, ammonia, amines. (a)

Table 2
Concentrations of analytes used according to their chemical classes.

Alcohols (1.22 × 10⁻⁵ M)   Aldehydes (8.03 × 10⁻⁵ M)   Esters (3.70 × 10⁻⁵ M)   Ketones (3.79 × 10⁻⁵ M)
1-Pentanol                 Acetaldehyde                Ethyl hexanoate          Acetone
1-Hexanol                  Butanal                     Ethyl acetate            2-Butanone
Z2-Hexen-1-ol              Hexanal                     Isopentyl acetate        2-Pentanone
1-Octen-3-ol               E2-Hexenal                  Methyl acetate           2-Heptanone
3-Methylbutanol            Furfural                    Ethyl butyrate           2,3-Butanedione

2.2. Feature sets

To define a reasonably sized set of possible features, for each sensor, we extracted six candidate time points, namely 20, 40, 60, 80, 100 and 120 s (Fig. 1), from the 600 data points available (2 Hz for 300 s). The global population of candidate feature sets comprised all combinations of 1–6 of these time points and, for each of the time point combinations, all possible combinations of 1–12 of the available sensors. Note that, in order to reduce the complexity of the problem and make it computationally manageable, we did not include feature sets where the selected time point(s) varied among sensors.
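The size of this search space can be checked with a short enumeration. The following is a minimal sketch, not the authors' code: it assumes a feature set is a non-empty subset of the six time points paired with a non-empty subset of the 12 sensors, and the names `feature_sets` and `family_size` are hypothetical.

```python
from itertools import combinations
from math import comb

TIME_POINTS = (20, 40, 60, 80, 100, 120)  # sampling times in seconds, from the text
SENSORS = tuple(range(1, 13))              # sensor numbers 1..12 as in Table 1


def feature_sets(n_times, n_sensors):
    """Yield every (times, sensors) pair for one size constraint."""
    for times in combinations(TIME_POINTS, n_times):
        for sensors in combinations(SENSORS, n_sensors):
            yield times, sensors


def family_size(n_times, n_sensors):
    """Number of feature sets with n_times time points and n_sensors sensors."""
    return comb(len(TIME_POINTS), n_times) * comb(len(SENSORS), n_sensors)


# Sum over all size constraints (1..6 time points, 1..12 sensors).
total = sum(family_size(t, s) for t in range(1, 7) for s in range(1, 13))
print(total)              # 257985
print(family_size(3, 6))  # 18480
```

The total equals (2⁶ − 1)(2¹² − 1) = 63 × 4095, which is small enough for an exhaustive wrapper search with a linear classifier.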

In a second set of numerical experiments we considered six features that are commonly used in the enose literature instead:

1. The absolute maximal response, $|R_{\max} - R_0|$.

2. The area under the full response curve, $\int_0^T R(t)\,dt$, where $T$ is our total measurement time, $T = 300$ s.

3. The phase space area [14], $\int_0^{R_{\max}} (dR(t)/dt)\,dR = \int_0^{T_{\max}} (dR(t)/dt)^2\,dt$, where we made use of the fact that $R(t)$ is (approximately) smooth and strictly monotonic, hence invertible, to derive the right-hand side version.

4–6. Exponential moving averages [16] of the derivative of the sensor response, $E_\alpha(R) = \max_k \mathrm{ema}_\alpha R(k)$, where the discretely sampled exponential moving average $y(k) = \mathrm{ema}_\alpha R(k)$ is defined recursively as $y(k) = (1-\alpha)\,y(k-1) + \alpha\,(x(k) - x(k-1))$, with $x(k)$ the sampled sensor response and smoothing factors $\alpha = 0.005, 0.05, 0.5$, which are the equivalent values for our sampling frequency of 2 Hz to the values in [16] for sampling at 10 Hz.
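The six derived features listed above can be sketched for a single sampled response as follows. This is an illustration under stated assumptions rather than the implementation used in the study: the baseline R₀ is taken as the first sample, integrals are approximated with the trapezoidal rule and finite differences, T_max is taken as the index of the largest deviation from baseline, and the moving average is started at y(0) = 0 (so a purely falling response yields 0). All function names are hypothetical.

```python
def max_response(r):
    """Largest absolute deviation from the baseline r[0] (assumed R_0)."""
    r0 = r[0]
    return max(abs(v - r0) for v in r)


def area_under_curve(r, dt):
    """Trapezoidal estimate of the integral of R(t) over the recording."""
    return sum(0.5 * (a + b) * dt for a, b in zip(r, r[1:]))


def phase_space_area(r, dt):
    """Finite-difference estimate of the integral of (dR/dt)^2 up to T_max."""
    r0 = r[0]
    k_max = max(range(len(r)), key=lambda k: abs(r[k] - r0))
    steps = zip(r[:k_max], r[1:k_max + 1])
    return sum(((b - a) / dt) ** 2 * dt for a, b in steps)


def ema_feature(r, alpha):
    """Maximum over k of y(k) = (1 - alpha) * y(k-1) + alpha * (r(k) - r(k-1))."""
    y, best = 0.0, 0.0  # y(0) = 0 is an assumption of this sketch
    for prev, cur in zip(r, r[1:]):
        y = (1.0 - alpha) * y + alpha * (cur - prev)
        best = max(best, y)
    return best


# Toy response sampled at 2 Hz (dt = 0.5 s): a ramp that saturates.
response = [0.0, 1.0, 2.0, 3.0, 3.0, 3.0]
features = [
    max_response(response),
    area_under_curve(response, 0.5),
    phase_space_area(response, 0.5),
] + [ema_feature(response, a) for a in (0.005, 0.05, 0.5)]
print(features)
```

Each sensor then contributes up to six numbers, and subsets of these play the role that subsets of time points play for the simple features.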

2.3. Classification algorithm and cross-validation

We used the libsvm [33] library for linear support vector machines [6] to perform the cross-validation experiments. We performed classification using linear C-SVC (SVM classification with cost parameter C) with four C values, C = 1024, 4096, 16,384 and 65,536. We observed consistent performance for all tested C values and report only results for C = 65,536 in the remainder of the paper.

All results reported were obtained from 10-fold balanced cross-validation: the data set was split into a training and test set by randomly choosing one measurement of the ten measurements available for each chemical to form the test set. The remaining 20 × 9 measurements form the training set. This procedure was repeated ten times, excluding all previously chosen test samples from the choice until all measurements have been used in a test set once. We performed ten repetitions of this entire procedure, so that the performance measurements reported below are the average performance of training and testing 100 classifiers.
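The balanced splitting scheme described above can be sketched as follows. This is a minimal illustration, assuming measurements are indexed by a (class, replicate) pair; the function names and the use of Python's `random` module are choices made here, not details taken from the study.

```python
import random


def balanced_folds(n_classes=20, n_replicates=10, rng=None):
    """Return n_replicates folds; each fold holds one test measurement per class."""
    rng = rng or random.Random()
    # Shuffle the replicate order independently for every class, so that each
    # replicate of each class ends up in exactly one test set.
    orders = [rng.sample(range(n_replicates), n_replicates)
              for _ in range(n_classes)]
    return [[(c, orders[c][f]) for c in range(n_classes)]
            for f in range(n_replicates)]


def train_test_split(fold, n_classes=20, n_replicates=10):
    """Split all (class, replicate) pairs into training and test indices."""
    test = set(fold)
    train = [(c, k) for c in range(n_classes) for k in range(n_replicates)
             if (c, k) not in test]
    return train, sorted(test)


folds = balanced_folds(rng=random.Random(0))
train, test = train_test_split(folds[0])
print(len(folds), len(train), len(test))  # 10 180 20
```

Calling `balanced_folds` ten times with different random states would give the ten repetitions, and hence the 100 train/test pairs, mentioned in the text.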

2.4. Clustering quality and Mahalanobis distance

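For the purpose of comparing the classification results with the structure of the data, given each particular feature set, we defined the quality of clustering as the quotient $d_{\mathrm{inter}}/d_{\mathrm{intra}}$ of the average Euclidean distance between average class vectors,

$$d_{\mathrm{inter}} = \left\langle \left\lVert \bar{x}_i - \bar{x}_j \right\rVert \right\rangle_{i,j} \tag{1}$$

and the average Euclidean distance of vectors within a class to the average class vector,

$$d_{\mathrm{intra}} = \left\langle \left\lVert x_{i,k} - \bar{x}_i \right\rVert \right\rangle_{k}. \tag{2}$$

Here, $x_{i,k}$ denotes the $k$th measurement of chemical (class) $i$, $\bar{x}_i$ the average class vector of class $i$, $\langle \cdot \rangle$ denotes taking an average and $\lVert \cdot \rVert$ denotes the Euclidean norm in the space defined by the feature set of interest.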
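The clustering quality defined by Eqs. (1) and (2) can be computed as in the following sketch. Two choices here are assumptions, since the text does not spell them out: the inter-class average runs over all distinct class pairs, and the intra-class average runs over all measurements of all classes. The function names are hypothetical.

```python
from itertools import combinations
from math import dist


def class_mean(vectors):
    """Component-wise mean of a list of equally long feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def clustering_quality(data):
    """Return d_inter / d_intra for data given as {class label: [vectors]}."""
    means = {label: class_mean(vs) for label, vs in data.items()}
    # Eq. (1): mean distance between the average vectors of distinct classes.
    inter = [dist(means[a], means[b]) for a, b in combinations(means, 2)]
    # Eq. (2): mean distance of each measurement to its own class average.
    intra = [dist(v, means[label]) for label, vs in data.items() for v in vs]
    return (sum(inter) / len(inter)) / (sum(intra) / len(intra))


# Two toy classes in a two-dimensional feature space.
toy = {"a": [[0.0, 0.0], [0.0, 2.0]], "b": [[4.0, 0.0], [4.0, 2.0]]}
print(clustering_quality(toy))  # 4.0
```

Larger values mean that classes are far apart relative to their internal spread in the chosen feature space.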

As a second measure of data set structure we employed the average pairwise Mahalanobis distance suggested by Muezzinoglu et al. [34]. The squared Mahalanobis distance of two classes was estimated by

$$D^2(i,j) = \left( \bar{x}_i - \bar{x}_j \right)^{T} \hat{S}_{ij}^{-1} \left( \bar{x}_i - \bar{x}_j \right) \tag{3}$$

where $\hat{S}_{ij}^{-1}$ denotes the inverse of the weighted average of the estimated covariance matrices $\hat{S}_i$ and $\hat{S}_j$ of the vectors in classes $i$ and $j$, and $\bar{x}_i$ and $\bar{x}_j$ the average class vectors for class $i$ and $j$, respectively. For assessing the overall structure of the data set using the Mahalanobis distance, we calculated the mean pairwise Mahalanobis distance MD of all pairs of classes, defined by

$$\mathrm{MD}^2 = \sum_{i \neq j}^{N} D^2(i,j). \tag{4}$$
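The weights used to average the two covariance estimates are not stated above. One common choice, assumed here purely for illustration, is the pooled estimate weighted by degrees of freedom, where the symbols $n_i$ and $n_j$ (introduced only for this sketch) are the numbers of measurements in classes $i$ and $j$:

```latex
\hat{S}_{ij} = \frac{(n_i - 1)\,\hat{S}_i + (n_j - 1)\,\hat{S}_j}{n_i + n_j - 2},
\qquad
n_i = n_j \;\Rightarrow\; \hat{S}_{ij} = \tfrac{1}{2}\left(\hat{S}_i + \hat{S}_j\right)
```

With ten replicates per chemical the class sizes are equal, so any weighting by sample size reduces to the plain arithmetic mean of the two covariance matrices.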

2.5. Mean absolute pairwise correlation of features

To calculate the mean absolute pairwise correlation of features we used for the pairwise correlation of two features $x$ and $y$ the estimator

$$r_{x,y} = \frac{\frac{1}{n-1} \sum_{k} (x_k - \bar{x})(y_k - \bar{y})}{s_x s_y} \tag{5}$$

Here $x_k$ and $y_k$ are the values the two features take across the full data set, $k = 1, \ldots, n$, where $n = 200$, and $s_x$ and $s_y$ denote the standard estimators for the standard deviations of $x$ and $y$. For example, $x$ may be the values of sensor $i$ at time $r$ and $y$ the values of sensor $j$ at time $s$, where $i, j \in \{1, \ldots, 12\}$ and $r, s \in \{20, 40, \ldots, 120\}$. To obtain the mean absolute pairwise correlation of a feature set $S$, we average over the absolute value of all so estimated pairwise correlations of features $x$ and $y$ in $S$:

$$r(S) = \frac{2}{N(N+1)} \sum_{x \le y \in S} \left| r_{x,y} \right| \tag{6}$$

where $N$ is the number of features in $S$ and the sum is over all features in the feature set $S$, but counting pairs $(x,y)$ and $(y,x)$ only once (as the correlation is symmetric). Note that we included $x = y$ to obtain a meaningful measure for feature sets that have only one feature. We take the absolute value as we are interested in how linearly dependent features are, no matter whether they are correlated or anti-correlated.
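Eqs. (5) and (6) translate directly into a few lines of code. The sketch below is an illustration with hypothetical function names; it assumes each feature is given as a list of its values across all measurements and that no feature is constant (a constant feature would give a zero standard deviation).

```python
from itertools import combinations_with_replacement
from statistics import mean, stdev


def pearson(x, y):
    """Eq. (5): sample covariance divided by the two sample standard deviations."""
    n = len(x)
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return cov / (stdev(x) * stdev(y))


def mean_abs_correlation(features):
    """Eq. (6): mean of |r| over unordered feature pairs, including x = y."""
    pairs = list(combinations_with_replacement(range(len(features)), 2))
    total = sum(abs(pearson(features[i], features[j])) for i, j in pairs)
    return total / len(pairs)  # len(pairs) equals N (N + 1) / 2


x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, -1.0, -1.0, 1.0]            # uncorrelated with x
print(mean_abs_correlation([x]))      # single feature: maximally correlated
print(mean_abs_correlation([x, y]))   # two self-pairs and one zero cross term
```

Because the self-pairs are included, a single-feature set always scores 1, and a set of two uncorrelated features scores 2/3 rather than 0.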

3. Results

3.1. Classification performance of feature sets

We first investigated the overall performance of all potential feature sets for classifying the 20 chemicals against each other. We measure and report performance as the fraction (or percentage) of correctly classified measurements in the test sets (see Section 2), ranging from 0 (no examples classified correctly) to 1 (all examples classified correctly). Fig. 2A shows the observed performance for the feature sets formed of six candidate time points and all 12 sensors. The performance is reported for feature sets of different size constraint, e.g. (2,3) denotes a feature set with two time points each from three sensors. We note that the best feature sets lead to much better classification performance than previously reported classifiers based on this set of measurements [35]. There are several feature sets that lead to 100%, i.e. error-free, classification in the ten repetitions of 10-fold cross-validation. There are quite a few feature sets that enable this optimal performance (3368) but it is a small fraction of all sets tested (1.3% of the tested 257,985 sets). The best performance is not achieved with the naïvely expected maximal sensor- and data-use (12 sensors, six time points, top line of Fig. 2). Neither is the classical approach of using just one time point for all sensors (e.g. the time when the maximum signal is observed) particularly successful (second line of Fig. 2), confirming earlier findings in other enose applications [13,36].

Fig. 2. (A) Fractional prediction accuracy of 10-fold cross-validation using linear SVMs for all allowed combinations of six time points and the 12 available sensors. The box plots show the median and 25% and 75% quantiles, the estimated overall range (whiskers) and identified outliers (red crosses) of the observed performances for each given sub-family, constrained to have x time points and y sensors (first and second numerical columns). The third numerical column indicates the resulting number of combinations that was tested. The coloured columns indicate the best observed performance, the worst performance, the performance of the “top10” group of feature choices (see Sections 2 and 3.1) and the performance of this group in a repeat 10-fold cross-validation. The prevalence of excellent performance at the bottom of the graph indicates that sub-families with many different choices typically contain choices that lead to excellent performance. However, there are also examples of excellent performance for other situations, e.g. the (3,12), (6,6) and (2,10) groups. Note the highly non-linear, logarithmic colour code for the classification performance in which high performances close to 1 are resolved in many colour gradations from dark red to cyan and weak performances closer to 0 are compressed into a few blue colours. (B) The same analysis applied to the same array of 12 sensors and set of 20 chemicals using six different popular feature sets from the enose literature: absolute maximal response, the area under the response, the area of the phase-space curve of the response [14], and exponential moving averages of the derivative of the response curve [16]. (For interpretation of the references to colour in this figure caption, the reader is referred to the web version of the article.)


To control for selection biases, we defined a group of well-performing feature sets (column “top10” in Fig. 2) and repeated cross-validation for this group (column “top10 rerun” in Fig. 2). As expected, the performance in the rerun is typically a bit worse than in the original run, as we had selected the most successful feature sets on this first run and hence might have collected those where we just “got lucky” with the partitions of the ten times 10-fold cross-validation. The rerun demonstrates that this selection bias is noticeable but that it is a minor effect compared to the overall variation in performance due to feature selection: the results in the rerun, albeit slightly less good than in the original run, are consistently better for feature sets that were optimal in the original run than for feature sets that were not. There are even rare examples where the classifier performs better in the rerun (for the (5,12) feature set) than in the original selection run. We particularly note that the superior performance of many of the smaller feature sets over the full (6,12) choice remains intact. Also note, however, that for smaller feature sets, classification success depends critically on the choice of the used feature sets from the pool of all potential sets of a given size. This is illustrated by the much lower worst performance (see column “worst” and the low outliers in the left column of Fig. 2).

The data is presented ordered by the number of possible feature sets for each given size constraint, ranging from a single (6,12) feature set on the top to 18,480 possible (3,6) feature sets at the bottom. The prevalence of excellent “best”, “top10” and “top10 rerun” performance at the bottom of the graph illustrates that size constraints with many different feature set choices are more likely to have well-performing sets, even though this is not an absolute rule, see e.g. the 2970 (1,4) feature sets that are much less successful than the 20 (3,12) sets.

Fig. 2(B) illustrates the same analysis for an alternative feature set comprising six features suggested in the enose literature: absolute maximal response, the area under the response, the area of the phase-space curve of the response [14], and three different exponential moving averages of the derivative of the response curve [16] (see Section 2 for details). In the remainder of this section we will refer to the latter set as “derived features” whereas we will refer to the six representative data points that are the focus of this study as “simple features”. The best-performing feature sets based on subsets of derived features perform consistently worse than the best sets using subsets of simple features. Note, however, that the performance of the derived feature set was still quite high and comparable, if not superior, to the performance based on EMA features, previously reported for this data set [35].

It is also noteworthy that the size constraints (i.e. the number of data points and number of sensors used) that lead to the better or the rather poorly performing feature sets are the same for both scenarios: the pattern of better-performing and worse-performing feature set sizes, in terms of best performance, is identical in Fig. 2(A) (simple features) and (B) (derived features), cf. column “Best”. We quantified this observation by calculating the correlation between the best performance of each size constraint in the two scenarios and obtained a correlation coefficient of 0.975. Furthermore, the full distributions of performances are also similar, as illustrated by the box plots in Fig. 2(A) and (B), which share many properties. For example, the (1,1) size constraint has the same very broad and very low performance distribution in both cases and the performance distribution for feature sets of five data points and nine sensors is particularly narrow and high. Interestingly, the derived features used in Fig. 2(B) lead to a slightly improved worst performance indicated by the shorter tail of outliers in the box plots, in particular for the lower half of the plot that contains size constraints that allow large numbers of feature choices.

3.2. Relationship to clustering

To identify possible explanations for the improved performance of some feature sets over others we compared the classification performance for each set with the quality of clustering given this feature choice. For this purpose we defined the quality of clustering as the quotient $d_{\mathrm{inter}}/d_{\mathrm{intra}}$ of the average Euclidean distance between average class vectors and the average Euclidean distance of vectors within a class (see Section 2). The results are illustrated in Fig. 3. We notice a positive correlation between clustering performance and classification performance, in particular in the form of the absence of examples of good performance with very low clustering quality (upper left corner in Fig. 3) and of (very) low performance with high clustering quality (lower right corner in Fig. 3). The clustering quality, however, clearly does not fully explain the classification results. The overall correlation coefficient between classification performance and clustering quality as defined here is positive but only 0.205. The most relevant cases are the best-performing feature sets, for which clustering performance is above average but not necessarily maximal, and vice versa, the very best feature sets in terms of clustering quality, which do not necessarily perform optimally (Fig. 3, inset). One possible explanation for the failure to predict classification performance for the best performances is that in these cases it is the worst case of clustering quality between classes that is important and not necessarily the mean. If we analyse the minimal $d_{\mathrm{inter}}/d_{\mathrm{intra}}$ ratio for all pairs of classes instead of the mean, the relationship becomes clearer (Supplemental Fig. 1) and the overall correlation reaches 0.427. However, it still does not fully predict the classification performance, in particular in terms of the maximal performance that is of most interest.

Fig. 3. Classification performance of linear SVMs for all possible feature choices in tenfold cross-validation plotted against the clustering quality (ratio of inter-class to average intra-class Euclidean distance, see Section 3.2). The displayed colour indicates the number of occurrences of each particular combination of clustering quality and performance (white represents 0). A clear correlation is noticeable, in particular the absence of points with low clustering quality and high performance (upper left corner) or low performance and high clustering quality (lower right corner). The inset shows the region marked by the red rectangle on top. While the correlation overall is minor (Pearson correlation coefficient 0.205) it is noticeable that the overwhelming majority of the very top performers (green arrowhead) have above-average clustering quality. The reverse, however, is not true: the feature sets that show the very highest clustering quality do not necessarily lead to the highest performance (red arrowhead). The horizontal stripes visible in the enlarged plot are not an artefact but reflect that if the classifiers fail for one specific measurement, then they tend to do so consistently in all 10 repetitions of cross-validation. The distance between stripes (0.5% = 1/200) is consistent with this interpretation. (For interpretation of the references to colour in this figure caption, the reader is referred to the web version of the article.)


Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.snb.2013.01.088.

We then performed the same analysis using the mean pairwise Mahalanobis distance as suggested by Muezzinoglu et al. [34] with similar results (Supplemental Fig. 2). The correlation coefficient of mean pairwise Mahalanobis distance (MD) and classification performance is 0.189, or, if we use the logarithm of MD due to its wide range of values, the correlation coefficient is 0.392. As with the more basic clustering quality measure above, the Mahalanobis distance of classes is related to classification performance but cannot fully predict it. When considering the minimal Mahalanobis distance between mean class vectors rather than the mean, a similar picture emerges (Supplemental Fig. 3) and the correlation coefficients between minimal Mahalanobis distance and performance and logarithm of minimal Mahalanobis distance are 0.295 and 0.548, respectively.

3.3. Sensor correlation

Another commonly held belief is that the degree of correlations of the responses of different sensors is strongly related to, if not an explicit predictor of (lack of) performance in classification. Previous work has shown that MOS sensor responses can be highly correlated in spite of different doping of the individual sensor types [32]. This has been used as an explanatory construct to justify why biological chemical receptors, which appear to be much less correlated in their responses, may outperform MOS sensor arrays. With the full assay of the performance of feature sets in SVM classification in this work we can directly test this hypothesis by comparing the classification performance of feature sets to the mean absolute pairwise correlation of the features in the sets. We calculated the average Pearson correlation coefficient of each feature in a set with each other feature in the same set across all pairs of inputs of two distinct chemicals and plotted these values against the observed performance in 10-fold cross-validation (Fig. 4). Overall there is a weak trend where less correlated features in a feature group are correlated with better performance, i.e. there is a weakly negative correlation between mean pairwise correlation of features and the classification performance of the feature set (Pearson correlation coefficient −0.284). As when comparing to the clustering quality (Section 3.2), the best performing feature sets typically have a comparably low mean pairwise correlation of their features (Fig. 4, inset), but minimal mean pairwise correlation of features does not necessarily predict optimal performance. Also, due to the nature of measuring pairwise correlations, feature sets with a single feature are maximally correlated but still can lead to performances that are clearly above chance levels. The absolute values of the mean pairwise correlation of sensors are all above 0.5, which is fairly high, in particular when compared to biological chemical sensors [32].

Fig. 4. Average pairwise correlation plotted against the performance of the linear SVM in tenfold cross-validation. The colour represents the number of occurrences, white representing 0. The inset shows an expanded plot of the area in the red rectangle on the top. There is a small negative correlation between the mean pairwise correlation of sensor responses in a feature set and the resulting classification performance overall (Pearson correlation −0.284) but a low pairwise correlation of features is by no means a direct predictor of good performance. When inspecting the top performers, we notice as for the clustering quality in Fig. 3 that top performing feature sets in the majority have a (in this case low) typical mean pairwise correlation of features (green arrow) but the feature sets with the lowest mean pairwise correlation of features do not necessarily lead to top classification performance. Note the highly non-linear colour scale in this figure. The stripes in the inset have the same origin as the ones in Fig. 3 (see Fig. 3 caption). (For interpretation of the references to colour in this figure caption, the reader is referred to the web version of the article.)

Fig. 5. Frequency of appearance of individual sensors in optimal feature sets. For each type of feature set, in terms of the number of sensors used (y-axis), the colour indicates the percentage of feature sets in the top10 groups that contain the sensor indicated on the x-axis, for the whole array (A) and for the separate 6-sensor arrays of SnO₂ sensors and CTO sensors (B). In the full array, sensor 9 is used in almost every feature set, except those with very few sensors (red arrowhead). The second most used sensor is sensor 11 (blue arrowhead). Some other sensors are almost never used, in particular sensor 7 (black arrowhead). For feature sets of only 1 or 2 sensors the SnO₂ sensors 8 or 10 (white stars) appear often; for optimal sets of two sensors they are often found in combination with the CTO sensor 1. For the separate arrays of SnO₂ sensors and CTO sensors (B), the picture is less clear. While sensor 9 seems to be used more often than average, sensor 7, which was under-represented in the full array, is also used quite frequently. By design, when all sensors are used, they must be used equally often (i.e. always, green arrows). (For interpretation of the references to colour in this figure caption, the reader is referred to the web version of the article.)

3.4. Composition of and relationships between optimal feature sets

An important question for understanding the nature of our results is what relationship the optimal feature sets have to each other, what they have in common and where they differ. First we asked whether individual sensors appear preferentially in these best performing groups. Fig. 5A illustrates the number of occurrences of each sensor (x-axis) in the “top10” sensor groups of different size constraints (y-axis). While all sensors appear to some degree in the feature sets, sensor 9 appears in almost all “top10” feature sets of size three and larger, while sensors 8 (P10/1) and 10 (P40/1) are preferentially used in the very small feature sets of two or only one sensor. Sensor 7 (T30/1), on the other hand, is almost never used.
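The occurrence statistic behind this kind of analysis is a simple tally. The sketch below assumes each top-performing feature set is stored as a (times, sensors) pair of tuples; the function name and the four example sets are invented for illustration and are not results from the study.

```python
from collections import Counter


def sensor_usage(top_sets):
    """Percentage of the given feature sets that contain each sensor."""
    top_sets = list(top_sets)
    counts = Counter(s for _, sensors in top_sets for s in sensors)
    return {s: 100.0 * counts[s] / len(top_sets) for s in sorted(counts)}


# Hypothetical "top10"-style group of two-sensor feature sets (made-up data).
example_group = [
    ((20, 40), (8, 9)),
    ((20, 60), (9, 11)),
    ((40, 60), (9, 11)),
    ((20, 40), (1, 9)),
]
print(sensor_usage(example_group))  # {1: 25.0, 8: 25.0, 9: 100.0, 11: 50.0}
```

Applying the tally separately to each sensor-count family gives one row of a Fig. 5-style map per family.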

When performing the same analysis for the individual sensor technologies separately (Fig. 5B), the picture is less clear even though there is preferential use of sensor 9 (P10/2) for the SnO₂ sensors. Interestingly, when only using SnO₂ sensors, sensor 7 (T30/1) is quite well used, being the most-used sensor for feature sets with a single sensor. One could think that this discrepancy arises because sensor 7 (T30/1) may have response properties that are unusually redundant with the responses of CTO sensors. However, when calculating the average absolute correlation of sensor 7 (T30/1) with all CTO features we find the same level of correlations (0.121) as when using sensor 9 (0.120). Also the profile of correlations with individual CTO sensors is similar (data not shown).

Fig. 6. Classification performance based on data from individual sensors, using all six data points. Classification performance is reported in a colour code, separately for the four chemical groups (rows 1–4) and overall (row 5). In the overall performance, sensor 8 (P10/1) performs best (black arrowhead), and the most successful classification for a single chemical group occurs for alcohols using either sensor 8 (P10/1) or sensor 10 (P40/1) (red arrowheads). Overall, the most challenging chemical group for classification are the ketones (green horizontal arrowhead) and the worst performing sensor, when used on its own, is sensor 4 (CTO+HLTA). The likely reason for this poor performance is the very small pore size of the H-LTA zeolite (∼3.5 Å), which is smaller than the width of most of the molecules investigated here. Hence, most analytes will not reach the sensor surface and responses are small and unspecific [30]. (For interpretation of the references to colour in this figure caption, the reader is referred to the web version of the article.)

3.5. Performance of individual sensors

Fig. 7. Classification performance of feature sets that use only one of the two available technologies. The plots are as in Fig. 2. (A) Performance if subsets of the 6 SnO2 sensors are used. Perfect 100% classification is still possible, notably most robustly for the (4,6) size constraint. Overall, however, the optimal performance is achieved less frequently and is less robust within the “top 10” groups. (B) Performance if subsets of the 6 CTO sensors are used. Here 100% performance is not achieved and performance levels are measurably lower for all available feature choices than for the SnO2 sensors. The SnO2 sensors also perform much better in terms of the distribution of performance across all feature sets: the box plots in (A) are much tighter than those in (B).

Not surprisingly, individually, sensors do not perform particularly well in classifying the 20 chemicals. Even so, some individual sensors can partially discriminate certain chemicals and there are significant differences in how they perform for the whole set and for each chemical class. Fig. 6 illustrates the performance of individual sensors, using all six data points, resolved for the four chemical classes of analytes. The performance reported for each class of analytes is the fraction of correct recognitions of the members of this class when compared to all analytes from all chemical classes. Sensors 8 (P10/1) and 10 (P40/1) are the best sensors individually and work particularly well for alcohols (red arrowheads). It was interesting to note that sensor 11 (T70/2), which is more sensitive to alcohols, was not the best sensor to recognize them. Closer inspection of the average ratio of inter-class and intra-class distances of responses in individual sensors for alcohols revealed that indeed T70/2 has the best signal-to-noise ratio for the alcohols in this sense (Supplemental Fig. S4A). When inspecting the matrix of recognition success and failure (Supplemental Fig. S4B) and a PCA plot of sensor responses using the 6 data points of T70/2 (Supplemental Fig. S4C), we find that the problem is not distinguishing the alcohols from each other, which is done flawlessly, but distinguishing 1-pentanol (input 1) from hexanal (input 8) and E2-hexenal (input 9), see arrowheads in Supplemental Fig. S4C and the corresponding clusters in Fig. 8D for the better-performing sensor 8 (P10/1).
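The ratio of inter-class to intra-class distances mentioned above can be sketched as follows. The definition used here (mean Euclidean distance between samples of different classes divided by the mean distance between samples of the same class) is an assumption chosen for illustration; the precise definition behind Supplemental Fig. S4A may differ, and the sample values are hypothetical.

```python
from itertools import combinations
from math import dist


def inter_intra_ratio(samples, labels):
    """Mean between-class distance divided by mean within-class distance.

    samples: list of equal-length numeric tuples (one per measurement).
    labels:  class label of each sample.
    Assumes at least one same-class pair and one different-class pair.
    """
    intra, inter = [], []
    for (xa, la), (xb, lb) in combinations(zip(samples, labels), 2):
        (intra if la == lb else inter).append(dist(xa, xb))
    return (sum(inter) / len(inter)) / (sum(intra) / len(intra))


# Hypothetical toy responses: two tight clusters that are far apart.
samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = ["a", "a", "b", "b"]
ratio = inter_intra_ratio(samples, labels)
```

Larger ratios correspond to classes that are well separated relative to their internal spread. As the T70/2 example shows, a high ratio within one chemical group does not by itself guarantee correct recognition against analytes from other groups.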

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.snb.2013.01.088.

Returning to the results in Fig. 6, ketones are by far the hardest of the chemical classes to classify (row marked by the green arrowhead) and sensor 4 (CTO+HLTA) works particularly poorly for all analytes (column marked by the green arrowhead). The likely reason for the latter is the very small pore size of the H-LTA zeolite (~3.5 Å), which is too small to allow passage of most of the analytes used in this study, hence the responses of the sensor are of low amplitude and quite unspecific [30].

The overall best performing sensor is number 8 (P10/1, black arrowhead).
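The per-group performance measure reported in Fig. 6 (the fraction of correct recognitions of the members of a chemical class when all analytes from all classes are possible answers) can be sketched as below. The function name, the group labels and the toy predictions are hypothetical; only the input numbers 1, 7, 8, 9 and 20 follow the numbering quoted in the text and in the caption of Fig. 8.

```python
from collections import defaultdict


def group_performance(true_ids, predicted_ids, group_of):
    """Fraction of correctly identified measurements per chemical group.

    A measurement counts as correct only if the exact analyte is recognized,
    so confusing it with an analyte from any group is an error.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(true_ids, predicted_ids):
        g = group_of[t]
        totals[g] += 1
        hits[g] += int(t == p)
    perf = {g: hits[g] / totals[g] for g in totals}
    perf["overall"] = sum(hits.values()) / sum(totals.values())
    return perf


# Hypothetical grouping and predictions, for illustration only.
group_of = {1: "alcohols", 8: "aldehydes", 9: "aldehydes", 20: "ketones"}
true_ids = [1, 1, 8, 9, 20, 20]
predicted_ids = [1, 8, 8, 9, 20, 7]  # one alcohol and one ketone misidentified
perf = group_performance(true_ids, predicted_ids, group_of)
```

In this toy case the alcohol group scores 0.5 although no alcohol is confused with another alcohol, which mirrors the kind of cross-group error described for sensor 11 (T70/2).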

3.6. Comparison of the two sensor technologies

One of the novelties of the data analysed in this work is the use of the family of recently introduced zeolite-coated CTO sensors [28]. The best feature sets typically contain sensors of both the standard SnO2 and zeolite-coated CTO types. When examining the sensor combinations that allowed 100% performance for the whole sensor array, SnO2 sensors are always used but 99.55% of the optimal feature sets use sensors from both technologies. In this subsection we compare the performance of each of the sensor technologies when used on their own. Fig. 7 shows the performance profile of feature sets that contain either only SnO2 sensors (Fig. 7A) or exclusively CTO sensors (Fig. 7B). As expected, the overall classification performance levels are lower than for feature sets taken from the combined array combining both sensor technologies, in correspondence with earlier findings [12,37]. Comparing the performances of the two technologies, the SnO2 sensors appear to have higher performance, achieving 100% success for a few feature sets whereas the CTO sensors never reach 100% performance. The poorer performance of the CTO sensors may be a reflection of the lower amplitude and the resulting smaller signal-to-noise ratio in these sensors (Fig. 1).

4. Discussion

In designing a chemical sensor array, such as an artificial nose or tongue, choices concerning the number and properties of the sensors and the number and identity of data points are very important practical considerations. Engineering limitations and computational demand preclude the use of all available sensors and data points and selection of the minimal optimal set would be an efficient strategy. As Marco and Gutiérrez-Gálvez have recently pointed out [2] there is a disconnection between practitioners of machine olfaction and those developing the computational tools to process data from chemical sensor arrays. The results illustrated in Fig. 2 suggest that it may be beneficial to design a sensor array specifically for each envisioned application domain and, if doing so, that a few well-chosen sensors and data sampling times may outperform simply using the maximal array and many data points. However, it is worth noting that choosing the correct sensors and data-sampling times is critical. For example, the median performance of (three data points, six sensors) feature sets (0.985) is actually worse than the performance of the single comprehensive choice (6 data points, 12 sensors). This implies that just taking an arbitrary feature set (three data points, six sensors) would likely not improve overall success.
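To give a sense of how many candidates stand behind such a median, the short calculation below counts feature sets under the assumption that a feature set is any non-empty choice of sampling times combined with any non-empty choice of sensors, with the same sampling times applied to every chosen sensor. This reading follows the description of the simple feature sets in the abstract; the helper name is introduced here for illustration only.

```python
from math import comb

N_TIMES, N_SENSORS = 6, 12  # data points per sensor and sensors in the array


def n_feature_sets(n_times, n_sensors):
    """Number of feature sets with a given (data points, sensors) size."""
    return comb(N_TIMES, n_times) * comb(N_SENSORS, n_sensors)


three_by_six = n_feature_sets(3, 6)    # 20 * 924 = 18480 candidate sets
four_by_six = n_feature_sets(4, 6)     # 15 * 924 = 13860 candidate sets
full_array = n_feature_sets(6, 12)     # exactly one comprehensive set
total = sum(
    n_feature_sets(t, s)
    for t in range(1, N_TIMES + 1)
    for s in range(1, N_SENSORS + 1)
)                                      # (2**6 - 1) * (2**12 - 1) = 257985
```

Under this assumption the (three data points, six sensors) constraint alone admits 18480 feature sets, so picking one at random is far more likely to land near the median than among the few optimal sets.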

We notice that a large number of classification results are almost optimal and some of the differences we base our conclusions on amount to discrepancies of a single error in classifying 200 measurements of chemicals. This indicates that the array we used is well capable of this quite challenging classification problem. In future work we intend to extend our analysis to even more challenging applications, including lower or multiple concentrations, and measurements taken over an extended period of time, where sensor drift may become limiting.

As pointed out above, from the perspective of maximizing information we would have expected that classification performance can only increase when additional features are added. In the worst case one would have expected unchanged performance if the data provided by the additional features was not useful. Here, however, we saw that adding additional features can decrease the accuracy of classification. The likely explanation of this phenomenon is over-fitting of relatively noisy data. The additional data may provide additional information for the training data, but this can lead to overly specific classifiers that may not generalize as well to new testing data as “less informed” ones. This trade-off between optimal classification on the training data and optimal ability to generalize to new test data, the so-called over-fitting problem, is a classic topic in machine learning. Future work will focus on unravelling what the optimal solutions are for given practical problems and the degree to which these are generalizable within or beyond a given problem set.

The work reported here was conducted with a specific classification method, i.e. a linear support vector machine. One could argue that the observed phenomenon of better classification with smaller feature sets may be specific to this particular method. While we cannot fully exclude this possibility, the effects of over-fitting are known to affect all approaches to classification. While the details may differ for other classification methods, the principal results are likely to apply to a variety of such methods and similar results have been observed for other applications [13,36].

Fig. 8. PCA plots for four exemplary feature choices. (A) PCA plot of the inputs obtained with one of the optimal feature sets of four data points and six sensors (100% performance). (B) PCA plot of the inputs obtained when using the slightly less well-performing best feature set of six data points and eleven sensors (99.5% performance). (C) PCA plot of inputs when all available six data points and twelve sensors are used (99% performance). (D) PCA plot of inputs when using the best sensor selection of six data points and a single sensor (96.4% performance). The arrowheads mark inputs that lead to classification errors, e.g. in (B) the marked input of class 20 (2,3-butanedione) is misidentified as class 7 (butanal). The percentages at the axes give the fraction of the variance that is explained by the corresponding principal component. While the less well-performing feature set used for panel (D) is low dimensional and its classes are harder to separate, the best-performing feature set in (A) is not the one with highest dimensionality (in the sense of being the least well-captured by the first three principal components) among the 4 examples shown here (the sets in panels (B) and (C) have less of their overall variance explained by their first three principal components). Dimensionality in this sense is hence not a direct predictor of the classification success.

Finally, the wrapper approach to feature selection used here led in many cases to error-free classification, in contrast to earlier work where features were chosen based on different criteria [35]. Our analysis of the relationship between the performance in classification and the clustering quality of the data and the correlation of the sensor responses, in Sections 3.2 and 3.3 respectively, demonstrates that there is a relationship between these properties of the data and feature choice and the eventual classification success based on the chosen features. However, the relationships are not particularly strong, suggesting that a filter approach to feature selection based on either sensor correlation or data clustering would likely be less successful. This observation is reinforced by close inspection of the structure of the data after dimensionality reduction, using PCA, for particular exemplary feature choices (Fig. 8). While there is a clear difference in the quality of clustering in more successful feature sets and it is straightforward to identify the inputs that lead to classification errors in the less successful cases (arrowheads in Fig. 8), there are also many points where errors could have occurred but did not. This figure illustrates once more that there are no simple correlates to predict classification performance, re-emphasizing the relevance of the wrapper approach.
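The logic of an exhaustive wrapper search can be sketched in a few lines. The example below is a simplified stand-in and not the procedure of this study: it swaps the linear support vector machine for a nearest-centroid classifier, uses leave-one-out cross-validation, and runs on a hypothetical toy data set with one informative feature, one weakly informative feature and one noisy feature. All names and numbers are assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on a feature subset."""
    correct = 0
    for i in range(len(X)):
        # Train on everything except sample i.
        train = [(x, c) for j, (x, c) in enumerate(zip(X, y)) if j != i]
        cents = {}
        for c in sorted(set(y)):
            rows = [[x[f] for f in features] for x, cls in train if cls == c]
            cents[c] = centroid(rows)
        probe = [X[i][f] for f in features]
        pred = min(cents, key=lambda c: dist(probe, cents[c]))
        correct += int(pred == y[i])
    return correct / len(X)


def wrapper_search(X, y):
    """Score every non-empty feature subset with the classifier itself."""
    n = len(X[0])
    results = {}
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            results[subset] = loo_accuracy(X, y, subset)
    best = max(results, key=results.get)  # first subset with the top score
    return best, results


# Hypothetical toy data: columns are (informative, weakly informative, noisy).
X = [
    (0.00, 0.2, 1.5), (0.10, 0.4, -1.5), (-0.10, 0.3, 0.5), (0.05, 0.5, -0.5),
    (1.00, 0.6, -1.5), (0.90, 0.4, 1.5), (1.10, 0.7, -0.5), (0.95, 0.5, 0.5),
]
y = ["A", "A", "A", "A", "B", "B", "B", "B"]
best, results = wrapper_search(X, y)
```

In this toy case the single informative feature is classified without error, while subsets that also include the noisy feature score lower on held-out samples. This is the same qualitative effect as discussed above: more features do not necessarily mean better generalization, and the effect is only detected because each subset is evaluated by the classifier rather than by a proxy measure.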

5. Conclusions

We set out systematically to assess the question of feature selection for arrays of metal oxide sensors in a classification task, using standard machine learning methods. We found that feature selection can improve classification performance and that the best-performing feature sets are not necessarily the naively expected ones.

In future work we plan to analyse in depth why particular combinations of sensors are very successful, whether this information can be translated to future novel measurements with these sensors and whether our results translate to classification methods other than linear support vector machines.

Acknowledgement

This work was partially supported by an OCE Distinguished Scientist Award of CSIRO to TN.

References

[1] Q.J. Liu, Z.M. Zhao, Y.X. Li, Y.Y. Li, Feature selection based on sensitivity analysis of fuzzy ISODATA, Neurocomputing 85 (2012) 29–37.

[2] S. Marco, A. Gutierrez-Galvez, Signal and data processing for machine olfaction and chemical sensing: a review, Sensors Journal, IEEE 12 (2012) 3189–3214.

[3] H. Koinuma, I. Takeuchi, Combinatorial solid-state chemistry of inorganic materials, Nature Materials 3 (2004) 429–438.

[4] H. Dacres, J. Wang, V. Leitch, I. Horne, A.R. Anderson, S.C. Trowell, Greatly enhanced detection of a volatile ligand at femtomolar levels using bioluminescence resonance energy transfer (BRET), Biosensors and Bioelectronics 29 (2011) 119–124.

[5] A. Berna, Metal oxide sensors for electronic noses and their application to food analysis, Sensors 10 (2010) 3882–3910.

[6] C. Cortes, V. Vapnik, Support-vector networks, Machine Learning 20 (1995) 273–297.

[7] M. Pardo, G. Sberveglieri, Classification of electronic nose data with support vector machines, Sensors and Actuators B: Chemical 107 (2005) 730–737.

[8] C. Distante, N. Ancona, P. Siciliano, Support vector machines for olfactory signals recognition, Sensors and Actuators B: Chemical 88 (2003) 30–39.

[9] R. Kohavi, G.H. John, Wrappers for feature subset selection, Artificial Intelligence 97 (1997) 273–324.

[10] R.K.G. John, K. Pfleger, Irrelevant features and the subset selection problem, in: Fifth International Conference on Machine Learning, New Brunswick, NJ, Morgan Kaufmann, Los Altos, 1994, pp. 121–129.

[11] E. Phaisangittisagul, H.T. Nagle, Sensor selection for machine olfaction based on transient feature extraction, IEEE Transactions on Instrumentation and Measurement 57 (2008) 369–378.

[12] M. Pardo, L.G. Kwong, G. Sberveglieri, K. Brubaker, J.F. Schneider, W.R. Penrose, J.R. Stetter, Data analysis for a hybrid sensor array, Sensors and Actuators B: Chemical 106 (2005) 136–143.

[13] M. Pardo, G. Sberveglieri, Comparing the performance of different features in sensor arrays, Sensors and Actuators B: Chemical 123 (2007) 437–443.

[14] E. Martinelli, C. Falconi, A. D'Amico, C. Di Natale, Feature extraction of chemical sensors in phase space, Sensors and Actuators B: Chemical 95 (2003) 132–139.

[15] M. Falasconi, M. Pardo, G. Sberveglieri, I. Ricco, A. Bresciani, The novel EOS835 electronic nose and data analysis for evaluating coffee ripening, Sensors and Actuators B: Chemical 110 (2005) 73–80.

[16] M.K. Muezzinoglu, A. Vergara, R. Huerta, N. Rulkov, M.I. Rabinovich, A. Selverston, H.D.I. Abarbanel, Acceleration of chemo-sensory information processing using transient features, Sensors and Actuators B: Chemical 137 (2009) 507–512.

[17] A. Perera, T. Yamanaka, A. Gutierrez-Galvez, B. Raman, R. Gutierrez-Osuna, A dimensionality-reduction technique inspired by receptor convergence in the olfactory system, Sensors and Actuators B: Chemical 116 (2006) 17–22.

[18] A. Heilig, N. Barsan, U. Weimar, M. Schweizer-Berberich, J.W. Gardner, W. Gopel, Gas identification by modulating temperatures of SnO2-based thick film sensors, Sensors and Actuators B: Chemical 43 (1997) 45–51.

[19] C. Distante, M. Leo, P. Siciliano, K.C. Persaud, On the study of feature extraction methods for an electronic nose, Sensors and Actuators B: Chemical 87 (2002) 274–288.

[20] R. Ionescu, E. Llobet, Wavelet transform-based fast feature extraction from temperature modulated semiconductor gas sensors, Sensors and Actuators B: Chemical 81 (2002) 289–295.

[21] X.J. Huang, Y.K. Choi, K.S. Yun, E. Yoon, Oscillating behaviour of hazardous gas on tin oxide gas sensor: Fourier and wavelet transform analysis, Sensors and Actuators B: Chemical 115 (2006) 357–364.

[22] K.E. Kramer, S.L. Rose-Pehrsson, M.H. Hammond, D. Tillett, H.H. Streckert, Detection and classification of gaseous sulfur compounds by solid electrolyte cyclic voltammetry of cermet sensor array, Analytica Chimica Acta 584 (2007) 78–88.

[23] F.J. Acevedo, S. Maldonado, E. Dominguez, A. Narvaez, F. Lopez, Probabilistic support vector machines for multi-class alcohol identification, Sensors and Actuators B: Chemical 122 (2007) 227–235.

[24] J. Samitier, J.M. Lopezvillegas, S. Marco, L. Camara, A. Pardo, O. Ruiz, J.R. Morante, A new method to analyze signal transients in chemical sensors, Sensors and Actuators B: Chemical 18 (1994) 308–312.

[25] R. Gutierrez-Osuna, H.T. Nagle, S.S. Schiffman, Transient response analysis of an electronic nose using multi-exponential models, Sensors and Actuators B: Chemical 61 (1999) 170–182.

[26] S. Wlodek, K. Colbow, F. Consadori, Signal-shape analysis of a thermally cycled tin-oxide gas sensor, Sensors and Actuators B: Chemical 3 (1991) 63–68.

[27] L. Carmel, S. Levy, D. Lancet, D. Harel, A feature extraction method for chemical sensors in electronic noses, Sensors and Actuators B: Chemical 93 (2003) 67–76.

[28] R. Binions, H. Davies, A. Afonja, S. Dungey, D. Lewis, D.E. Williams, I.P. Parkin, Zeolite-modified discriminating gas sensors, Journal of the Electrochemical Society 156 (2009) J46–J51.

[29] P. Varsani, A. Afonja, D.E. Williams, I.P. Parkin, R. Binions, Zeolite-modified WO3 gas sensors – enhanced detection of NO2, Sensors and Actuators B: Chemical 160 (2011) 475–482.

[30] R. Binions, A. Afonja, S. Dungey, D.W. Lewis, I.P. Parkin, D.E. Williams, Discrimination effects in zeolite modified metal oxide semiconductor gas sensors, IEEE Sensors Journal 11 (2011) 1145–1151.

[31] N.Y. Chen, T.F. Degnan, C. Morris Smith, Molecular Transport and Reaction in Zeolites: Design of Shape Selective Catalysts, VCH, New York, 1994.

[32] A.Z. Berna, A.R. Anderson, S.C. Trowell, Bio-benchmarking of electronic nose sensors, PLoS One 4 (2009).

[33] C.C. Chang, C.J. Lin, LIBSVM: a library for support vector machines in http://www.csie.ntu.edu.tw/~cjlin/libsvm, 2001.

[34] M.K. Muezzinoglu, A. Vergara, R. Huerta, M.I. Rabinovich, A sensor conditioning principle for odor identification, Sensors and Actuators B: Chemical 146 (2010) 472–476.

[35] A.Z. Berna, A. Vergara, M. Trincavelli, R. Huerta, A. Afonja, I.P. Parkin, R. Binions, S. Trowell, Evaluating zeolite-modified sensors: towards a faster set of chemical sensors, in: P. Gouma (Ed.), AIP Conf. Proc, Amer Inst Physics, Melville, 2011, pp. 50–52.

[36] A. Pardo, S. Marco, C. Calaza, A. Ortega, A. Perera, T. Sundic, J. Samitier, Methods for sensors selection in pattern recognition, in: J.W. Gardner, K.C. Persaud (Eds.), Electronic Noses and Olfaction, IoP Publishing, 2000, pp. 83–88.

[37] H. Ulmer, J. Mitrovics, U. Weimar, W. Gopel, Sensor arrays with only one or several transducer principles? The advantage of hybrid modular systems, Sensors and Actuators B: Chemical 65 (2000) 79–81.

Biographies

Thomas Nowotny received his PhD in theoretical physics from the University of Leipzig, Germany and was a postdoctoral researcher and assistant research scientist at UCSD. He is now a Reader in Informatics at the University of Sussex. He has authored over 30 peer-reviewed publications in international journals. His broad research interests include problems in computational neuroscience, biological and artificial olfaction and cognition and learning in brains and machines.

Amalia Z. Berna received her engineering degree in food technology from the Agrarian National University, Peru and PhD in bioscience engineering from Catholic University of Leuven, Belgium. She is currently working at CSIRO as Team Leader of the Chemometric team and Research Scientist. She has experience in sensor characterization and the application of multivariate statistical methods to sensors and sensor arrays. She currently leads CSIRO’s research into diagnosis of disease in expired breath volatiles and detection of microbial contaminants in the food chain. Amalia has 21 peer-reviewed papers in international journals and 16 refereed conference papers.

Russell Binions holds a degree in chemistry from the University of Durham and a PhD in Chemistry from University College London. He is currently a Lecturer in Functional Materials in the School of Engineering and Materials at Queen Mary, University of London and an Honorary Senior Research Associate at UCL. He is the author of over 50 peer reviewed journal papers, 4 book chapters and 1 book. His research interests encompass new chemical vapour deposition techniques, metal oxide semiconductor materials, gas sensors, photocatalysis, chromogenic materials, nanocomposite films and energy efficient building materials.

Stephen Trowell holds a natural sciences degree from Cambridge University, majoring in biochemistry, and a PhD in visual neuroscience from the Australian National University. Stephen has 100 publications, 40 of them refereed international journals and book chapters, and is inventor on 12 patent families. Stephen’s practical achievements include leading the team that developed The LepTon™ Test Kit, an immunodiagnostic kit used to manage insecticide resistance. He is a Senior Principal Research Scientist at CSIRO and he currently leads a team developing a bioelectronic nose to detect explosives and other analytes. He finds the multidisciplinary nature of this challenge particularly rewarding.