Availableonlineatwww.sciencedirect.com
ScienceDirect
j ou rn a l h o m e p a g e :w w w . e l s e v i e r . c o m / p i s c
Video
object
extraction
and
its
tracking
using
background
subtraction
in
complex
environments
夽
Satrughan
Kumar
∗,
Jigyendra
Sen
Yadav
DepartmentofElectronicsandCommunication,MANIT,Bhopal462003,India
Received5February2016;accepted12April2016 Availableonline 26April2016
KEYWORDS
Background subtraction; Entropy;
Initialmotionfield; Objecttracking
Summary Backgroundsubtractionisanefficientwaytolocalizeandobtainthecentroidof the connectedpixelsmovingonthe foregrounddespitethe priorinformationofthescene. Itissuitable underfixedcamera arrangement,whichincorporates many vision applications suchasobjecttracking,humanmonitoring,etc.However,themovingobjectextractiontask becomessophisticatedandchallengingduetosome annoyingfactorssuchaslocalmotionin background(wavingtree,ripplingwater,etc.),camouflageregion,sleepingobject,whichin turn degrades thetracking performance. In orderto alleviatethese problems,an efficient backgroundsubtractionalgorithmisproposedtosupporttheobject-trackingtaskunderstatic anddynamicbackgroundconditions.Theworkisfocustorealizetherelevantmovingblobson foregroundbyaidingtheproperinitializationandupdatingofthebackgroundmoduleinorder toimprove thetrackingaccuracy. Itgeneratesaninitial motionfieldusingspatial-temporal filteringontheconsecutivevideoframes.Theblock-wiseentropyisevaluatedaboveacertain rangeofthepixelsofthedifferenceimageinordertoextracttherelevantmovingpixelsfrom theinitialmotionfield.Asuitablethresholdvalueisestimatedtoassignanappropriatelabelto themovingblobsontheforegroundmask.Finally,anadaptingKalmanfilterisintegratedtothe objectextractionmoduleinordertotracktheobjectontheforeground.Extensivequantitative experimentsprovethattheproposedmethodcompetentlyhandlestheobjectextraction,which inturnimprovesthetrackingtaskunderstaticanddynamicbackgroundconditions.
©2016PublishedbyElsevierGmbH.ThisisanopenaccessarticleundertheCCBY-NC-NDlicense (http://creativecommons.org/licenses/by-nc-nd/4.0/).
夽 ThisarticlebelongstothespecialissueonEngineeringandMaterialSciences.
∗Correspondingauthor.Tel.:+919893599980;fax:+917554051514.
E-mailaddress:[email protected](S.Kumar).
http://dx.doi.org/10.1016/j.pisc.2016.04.064
2213-0209/©2016PublishedbyElsevierGmbH.ThisisanopenaccessarticleundertheCCBY-NC-NDlicense(http://creativecommons.org/
Introduction
Manycomputervision applicationsrequires realtime seg-mentationofmovingobject,whichisavitalstepintarget localization,trafficmonitoringandintruderactivityanalysis (Bouwmans,2014).Thealgorithmbasedonbackground sub-tractionisoneofthemostadmiredmethodtorealizemoving objectsontheforeground(Vostersetal.,2012;Nikolovand Kostov, 2014; Xue et al., 2012). The background subtrac-tiontaskinobjecttrackingoffersseveraladvantagessuch asitprovideshigheraccuracyatpixel,frameaswellasin regionallevel operation.Inaddition,itis computationally inexpensiveand does not require priorinformation about thescenetolocalizethemovingobjectontheforeground mask. Moreover, the background initialization and recon-structionare proving tobe effective understatic camera arrangementinordertogettheinitialestimateofmotion. Theintentionbehindextractingtheblobofmovingobjectis toprovidethecomplementarystatisticalinformationabout theregions. As seen, theKalman filter requires an initial estimateofpositionofanobject(blob)inordertogetthe improvedtrajectoryduringtracking(Ha,2012;Wengetal., 2006).Inthiscase,thebackgroundsubtractionprovidesthe sufficientsizeoftrackingsamplesorblobregion,whichis notobtainedthroughcornersandedgedetectors.
Inthiswork,abackgroundmodelandobjectextraction moduleareproposedtoenhancethe objectsegmentation process that also provide a reliable input to the track-ingmechanism.Simultaneously,itprovidesthepotentialto obtaintheforegroundblobs,whicharefreefromany arti-ficialtrails,ghost,apertureandholesdistortion(Cucchiara etal.,2003).Theremainingsectionsofthispaperare orga-nizedasfollow:secondsection presentssomebackground subtractionschemesusedintracking.Inthird section,we present ourbackground model andobject extraction pro-cedures.Experimentalresultsandperformanceanalysisare showninfourthsection.Theconcludingremarksaregiven infinalsection.
Related
works
Backgroundsubtractionplaysavitalroleinordertodetect theinterestingmovingobjectsandtrackingofsuchobjects intheconsecutivesframeofvideosequence.InBouwmans (2014),some background modelsarereviewed toobserve theircharacteristics. Thereviewprovidesthecluetotake thebenefitsofthecombinedapproachofbasicand statis-ticalmodels.Since,theintegralapproachcanberobustto noiseandoperateatpixelleveltosolvemanyproblems.An adaptive thresholding scheme is combined withtemporal averaging method is used to solve the problem of tradi-tionalframe differencemethod (FD) (Nikolov and Kostov, 2014).However,itcouldnotdetectthe completeblob on theforeground. The same problemis found in (Ha,2012; MazinanandAmir-Latifi,2012).InRahmanetal.(2013),the authorsusedthesecond orderfilterinthegradient direc-tion to get relevant edges of moving object by applying thetraditional background-updating scheme. The method couldnotsupportduringrealtimeapplicationsandinnoise suppression.In Zhang andDing (2012), an adaptive back-groundmodelusingmeanfilterisproposedtoeliminatethe
impulsive noise.As seen, the meanfilteris less robustto hollowspaceandnotadaptingtothevarianceofpixels.
In Xue et al. (2012), author proposed a phased based backgroundmodel.Itextractstheforegroundpixelusing dis-tancetransform.Althoughthemethodisrobusttofindthe foregroundpixelsinchangingilluminationandbootstrapping condition,yetitsuffersfromtimecomplexity.
InStaufferandGrimson(2000),authorsproposeda multi-valuedbackground-modellingschemeinwhicheachpixelof thebackground modelis modelledusingGaussianmixture model(GMM).Themethodisfoundtobeeffectiveagainst dynamicbackground,yetitdoesnotdiscriminatethe back-ground and foreground pixel near camouflage region and in changing illuminationcondition. The statistical method employedin (Oral and Deniz, 2007),does notupdate the background model thatmakesit lessuseful undervarying illuminationconditions.InVostersetal.(2012),theauthors proposedthecombinedapproachofEigen backgroundand statisticalilluminationmodeltosolvethedisruptiondueto rapidillumination.However,itcouldnotsolvetheproblem of localmotionbecauseEigenspace ofbackground model wasnotincorporatingthenextpositionofobject.
We can recapitulate here that some effective models have higher computational complexity, while the sim-pler background models are not effective under complex conditions.Itisnotedthattheeffectivebackground recon-structionandsufficientobjectsamplesizearenecessaryto extractandtracktheobjectontheforeground(Mandellos et al., 2011). In this work, we have included the spatial andtemporalconstrainttogettheobjectidentificationand removedunnecessarybackgroundpixelsbyintegratingthe regionleveloperation(YaoandLing,2014).
Proposed
method
Thissectionillustratestheproposedmethodintotwostages. The first stage is about the background modelling and objects extraction, while the second stage is related to objecttrackingusingKalmanfiltering(Wengetal.,2006). Itis notedthattheworkis focussedongreyscale videos, takenunderstaticcameraarrangement.
Backgroundmodellingandobjectextraction
The steps of the proposed method are followed accord-ing to Algorithm 1 that deals the background modelling andobjectextractionphase.Initially,averageofsome ini-tialframesistakenasreferencebackground. Asseen,the temporalprocessingcreatesholesandavoidsspatial correla-tionamongstthemovingpixels.Therefore,anapproximate motionfieldisderivedusingthebackgroundsubtractionand temporaldifferencemechanism.Thediversitystateofeach pixel isexaminedthroughspatio-temporalfilteringandan approximateinitialmotionfieldisestimatedby threshold-ing outthebackground pixel. Abackground model should adapttothedynamicchanges,whetheritislocal(swaying tree,ripplingwater,andfountain) or global(resolutionor illuminationchange).Inthiswork,theproposedbackground modeladaptstemporalchangesthatefficientlyavoidedthe commonunwantedvariablesandextractedthe complemen-taryregioninthescene.
Algorithm1. Objectextractionmodule
1. Average‘K’initialframes{I1,I2,I3,...,IK}togenerate
referencebackgroundBR
t(x,y),where0<K<11
11. fori←1tonumberofrowsincrementedby ‘r’
2. forF←1toNdo //N=totalnumberofframes 12. forj←1tonumberofcolumns incrementedby‘r’
3. X=abs(It(x,y)−BtR(x,y)) 13. M=BIBt(i→i+r,j→j+r)
FD=abs(It(x,y)−It−1(x,y)) 14. Discardthepixelwithintensitysmaller
than‘3. //It(x,y)andIt−1(x,y)arecurrentandpreviousframe
respectively.
15. Calculate‘pdf’usingthepixelsofM
4. Initializethresholds1and2 //pdf→probabilitydensityfunction
1=mean(X )+.∗(std(X )), 16. Calculateentropy
2=mean(FD)+.∗(std(FD)) Et=− Rmax R=Rmin
pdf(R)log(pdf(R))
5. forx←1torowsofframedo RminandRmaxaretheminimumandmaximum
levelofthegreylevelsR.
6. fory←1tocolumnsofframedo 17. ifEt(i,j)>3then //3=3;
7. ifX(x,y)>1then M(i,j)=BIBt(i,j);
BBS
t (x,y)=1; else
else M(i,j)=0;
BBS
t (x,y)=0; end
end 18. ifM(i,j)>4then
8. ifabs(It(x,y)−It−1(x,y))>2then Dt(i,j)=1;
BFD t (x,y)=1; else else Dt(i,j)=0 BFD t (x,y)=0; end end 19. Tocalculate4 9. if(BFD t ∪B BD
t )truethen -Initialize4bytakingthemeanof‘X’.
BIB
t(x,y)=X (x,y); -Update4usingthegivenequation: else 4=X(x,y)+(˛×M−ˇY(x,y))
BIB
t(x,y)=0; whereMismeanof‘M’and‘Y’isthecurrent
averageofupdatedbackgroundframe.
end 20. Updatecurrentbackgroundusing
end Bt(i,j)=Bt−1(i,j)+signum(It(i,j)−It−1(i,j))
end end
//Calculateentropyofinitialmotionfield end
10. Defineawindowofsizer×r,wherer=8. end
Incomplexenvironment,thestateofpixelsinBIB t(x,y)is affectedduetoheavylocalor globalchangesandcausing theappearanceofirrelevantpixelsontheforeground.Inthis regard,wemonitor theregionalentropytogettheactual movingpixelsin theapproximateinitial motionfield.The pixelsbelongingtotheactualmovingobjecthaslowentropy ascomparedtothosefalsenegativespixelsthatarisedueto eitherdynamicsorilluminationchanges.Duringthe proba-bilitydensityfunctionevaluation,wehavediscardedlower greylevelhavingtherangebelow‘3’.Basedonthe block-wiseentropy,aninitialmotionfieldisestimatedusingproper threshold.Finally,thelargestareaofmovingblobislabelled after amorphological open operation followed bya close operationonDt(i,j).
ObjecttrackingusingKalmanfiltering
AnadaptiveKalmanfilteralgorithmproposedinWengetal. (2006), has been adopted to estimate the centre of the largestmovingblobinsidetherectangle.Despitethe pres-ence of error due to unconstrained measurement, local motion in background, the Kalman filterhas an ability to
estimatetrackingpositionsusingtheminimuminformation about the blob. Initially, the state ˆS(t) and measurement m(t)modelisspecifiedtoestimate(predict)thenext posi-tion.Themodelmatricesaredefinedas:
ˆS(t)=AˆS(t−1)+g(t) (1)
m(t)=H(t)ˆs(t)+v(t) (2)
whereAandH(t)refertostatetransitionmatrixand mea-surementmatrixrespectively.ThewhiteGaussiannoiseg(t) and v(t) with zero mean may produce in the model due tounconstrainedmeasurement.Therefore,thecovariance matrixQandRareestimatedusingg(t)andv(t)in conjunc-tionwithkorneckordeltafunctiontodefinenoiseatcertain instant.
Thepredictionofthenextstate ˆS+(t)isaccomplishedby incorporatingtheactual measurementwithpriorestimate ofstate ˆS−(t).
ThevariableK(t)referstotheKalmangainandiswritten as:
K+(t)= ˆP−(t)H(t)T(H(t)ˆP−(t)H(t)T+R(t))−1 (4) TheKalmangainincludesapriorerrorcovariancematrix ˆ
P−(t) that is derived using covariance of ˆe−(t)= ˆS(t)− ˆS−(t).Inasimilarmanneraposteriorerrorcovariancematrix ˆ
P+(t)isderivedusingthecovarianceof ˆe+(t)= ˆS(t)− ˆS+(t). This ˆP−(t)alongwithK+(t)andcurrent inputisutilizedto findthenextstateandprocessisrepeatedrecursively.
Fornextmeasurement,apriorstateandcovarianceerror ismeasuredrecursivelyasfollows.
ˆS−(t)=AˆS+(t−1) (5)
ˆ
P−(t)=AˆP+(t−1)AT+Q(t−1) (6) ThemaingoalistoestimatecorrectstateusingEq.(3)by correctingtheKalmangainthroughEq.(4).Asseen,more wouldbetheKalmangainlesserwouldbethemeasurement error.Thefinalstepistoobtainaposteriorerrorcovariance matrixusingEq.(7).Thepreviousposteriorestimateisused tocreateanewpriorestimatetoimprovethemeasurement. ˆ
P+(t)=(I−K(t)H(t))ˆP−(t) (7)
Itisassumedthatobjectcoversequaldistanceinevery interval,whichisrepresentedas:
f(t)=f(t−1)+(f(t−1)−f(t−2)) (8) wheref(t−1)andf(t−2)isthedistancecoveredbyobject inframet−1andt−2,respectively.
The state matrix ˆS(t) and ˆS(t−1) of Eq. (1) can be derivedusingthegivenequation.
ˆS(t)= 2 −1 1 0 f(t) f(t−1) + g(t) 0 (9)
Experimental
analysis
Thissectionpresentsthequalitativeandquantitative anal-ysis on different video sequences using the proposed algorithm and its performance parameters are compared with other existing methods. The ‘IR’, ‘WS’ and ‘Office’ experimentalsequencesdepictgreatdiversityinits succes-siveframesduetotheilluminationvariation, localmotion inbackground,apertureproblem.The‘WS’sequenceshas camouflageregionbelowthekneeofpersonanddepictthe localmotioninthebackground duetoripplingwater.The movingobjectchangesitsdimensionalityinthesuccessive frameof‘IR’,whilethe‘Office’sequencesdepicttheslow andstationarybehaviouroftheobjectforalongtime.
Figs. 1and 2show thequalitative performance ofthis proposedmethod.ThefirstrowofFigs.1and2showsthe sampled frames with tracking results, while the last row shows themotion mask obtained throughthismethod. As expected,theproposedmethodperformsbetter classifica-tionof foreground andbackground underboth staticand dynamic background conditions. The qualitative compari-sononsomesampledframesshowninFig.3illustratesthe efficacyofthisproposedmethodincomplexsituations.
The quantitativeresultsareevaluatedthrough Similar-ity,F1 andDetectionratemetrics(Rahmanetal.,2013). Theseevaluation metricsdepend ontp(true positive),tn
Figure1 Detectedmotionmaskof‘WS’sequence.
Figure2 Detectedmotionmaskof‘IR’sequence.
(truenegative),fp(falsepositive)andfn(falsenegative). The ‘tp’ and ‘tn’ are correctly detected foreground and backgroundpixelsrespectively.
The‘fp’and‘fn’aretheincorrectlydetectedforeground andbackgroundpixelsrespectively.Theparameters Detec-tionRate,SimilarityandF1aregivenas:
DetectionRate= tp
tp+fn (10)
Similarity=tp/(tp+fp+fn) (11)
F 1= 2×Precision×Recall
Precision+Recall (12)
whereRecallandPrecisionaretherelevantandirrelevant truepositivepixelsrespectively.Fig.4presentsthe detec-tionrateonsampledframesofeachvideosequencethrough thismethodalongwithotherexistingmethods.Here,it is seenthatinitiallysomemethodssuchasGMM(Staufferand Grimson,2000)andmethod(Rahmanetal.,2013)havegood detectionrate,buttheirperformanceisdegradedon sub-sequentframesduetoobjecteitherbecomesstationaryor suffersfromghost onforeground. However,the detection rate throughourmethodis superior toother methodsfor eachvideosequence.Theaveragedetectionrateobtained through this method is up to 40% and 35% greater than GMM and method (Rahman et al., 2013) respectively for
Figure3 Visualcomparisononforegroundmotionmasks.
(a)
(b)
(c)
5 10 15 20 25 -0.2 0 0.2 0.4 0.6 0.8 1Number of Frames(WS Frame Sequence)
Det ec ti on Ra te Proposed GMM Method[9] FD 5 10 15 20 25 -0.2 0 0.2 0.4 0.6 0.8 1
Number of Frames(Office Frame Sequence)
Det e ct ion Rat e Proposed GMM Method[9] FD 5 10 15 20 25 -0.2 0 0.2 0.4 0.6 0.8 1
Number of Frames (IR Frame sequence)
De te c tio n R at e Proposed GMM Method[9] FD
Figure4 Detectionrateof(a)WS,(b)Officeand(c)IRvideo.
Table1 Quantitativeperformancecomparisononforegroundmotionmask.
Sequences Evaluation Proposedmethod Method(Rahmanetal.,2013) GMM FD WS Similarity 0.8745 0.7000 0.3476 0.1542 F1 0.9325 0.8444 0.5112 0.2677 IR Similarity 0.7791 0.4035 0.5116 0.2064 F1 0.8983 0.5750 0.6769 0.3132 Office Similarity 0.8189 0.4864 0.2729 0.2451 F1 0.9129 0.6545 0.4275 0.3938
all videos. The detection rate obtained through this pro-posed scheme is also better than traditional FD (Nikolov and Kostov,2014)as itdoes notcreate holes inside mov-ingentity.Table1liststheaverageSimilarityandF1score of themovingmask extracted bymethod (Rahmanetal., 2013).GMM,FDandproposedapproachfor theeachvideo sequence.Here,itis seenthat theaverageSimilarity and
F1 score securedthrough this method are upto 35% and 33% greater than those attained by GMM for ‘WS’ video. Moreover,itsignificantlyoutperformstheFDmethodin pro-vidingapromisingmotionmaskonforeground.Asseen,only thismethodattainshighersimilarityandF1scoreexceeding 80%for theOffice sequencein which theobject becomes stationaryforlongtime.
Conclusion
The proposed algorithmlocalizes the entire target region withhigh similarity and accuracy. The system effectively combinesthespatio-temporalprocessingwithregionallevel operation in ordertoreduce thebackground clutter. The proposedschemeisfoundtobeeffectivetoeliminateghost and aperture distortion. It providesthe sufficient sample sizeof an objectto Kalman filterin order topredict the targetpositioninthesuccessiveframes.Simulationresults provethatthemethodcanbeimplementedinbetter local-izationofmovingorstationaryobjectthanotherbackground subtractionschemesusedfortracking.
References
Bouwmans,T.,2014. Traditionaland recent approachesin
back-ground modeling for foreground detection: an overview. Comput.Sci.Rev.11,31—66.
Cucchiara,R.,Grana,C.,Piccardi,M.,Prati,A.,2003. Detecting
moving objects, ghosts, and shadows in video streams. IEEE Trans.PatternAnal.Mach.Intell.25(10),1337—1342.
Ha,F.,2012.CentroidweightedKalmanfilterforvisualobject
track-ing.J.Meas.45(4),650—655.
Mandellos, N.A., Keramitsoglou, I., Kiranoudis, C.T., 2011. A
backgroundsubtraction algorithm for detecting and tracking vehicles.ExpertSyst.Appl.38(3),1619—1631.
Mazinan,A.H.,Amir-Latifi,A.,2012.Applyingmeanshift,motion
informationandKalmanfilteringapproachestoobjecttracking. ISATrans.51(3),485—497.
Nikolov,B.,Kostov,N.,2014.Motiondetectionusingadaptive
tem-poralaveragingmethod.Radioengineering23(2),652—658.
Oral,M.,Deniz,U.,2007.Centerofmassmodel:anovelapproach
to backgroundmodelingfor segmentation ofmoving objects. ImageVis.Comput.25(8),1365—1376.
Rahman,A.Y.F.,Hussain,A.,Zaki,W.,Zaman,B.,Tahir,M.,2013.
Enhancementofbackgroundsubtractiontechniquesusinga sec-ondderivativeingradientdirectionfilter.J.Electric.Comput. Eng.2013(21),1—12.
Stauffer,C.,Grimson,W.E.L.,2000. Learningpatternsofactivity
usingreal-timetracking.IEEETrans.PatternAnal.Mach.Intell. 22(9),747—757.
Vosters,L.,Shan,C.,Gritti,T.,2012.Real-timerobustbackground
subtraction under rapidly changing illumination conditions. ImageVis.Comput.30(12),1004—1015.
Weng,S.K.,Kuo,C.M.,Tu,S.K.,2006.Videoobjecttrackingusing
adaptiveKalmanfilter.J.Vis.Commun.ImageRepresent.17(6), 1190—1208.
Xue,G.,Sun,J.,Song,L.,2012.Backgroundsubtractionbasedon
phasefeatureanddistancetransform.PatternRecognit.Lett. 33(12),1601—1613.
Yao,Li, Ling,M., 2014. Animprovedmixture-of-Gaussians
back-groundmodelwithframedifferenceandblobtrackinginvideo stream.Sci.WorldJ.(2014),1—9.
Zhang,R.,Ding,J.,2012.Objecttrackinganddetectingbasedon