Video object extraction and its tracking using background subtraction in complex environments


Available online at www.sciencedirect.com

ScienceDirect

journal homepage: www.elsevier.com/pisc


Satrughan Kumar∗, Jigyendra Sen Yadav

Department of Electronics and Communication, MANIT, Bhopal 462003, India

Received 5 February 2016; accepted 12 April 2016; available online 26 April 2016

KEYWORDS

Background subtraction; Entropy; Initial motion field; Object tracking

Summary Background subtraction is an efficient way to localize the connected moving pixels in the foreground and to obtain their centroid without prior information about the scene. It is suitable for fixed-camera arrangements, which cover many vision applications such as object tracking and human monitoring. However, moving object extraction becomes challenging due to factors such as local motion in the background (waving trees, rippling water, etc.), camouflage regions and sleeping objects, which in turn degrade tracking performance. To alleviate these problems, an efficient background subtraction algorithm is proposed to support the object-tracking task under static and dynamic background conditions. The work focuses on recovering the relevant moving blobs in the foreground through proper initialization and updating of the background model in order to improve tracking accuracy. It generates an initial motion field using spatio-temporal filtering on consecutive video frames. The block-wise entropy is evaluated above a certain range of the pixels of the difference image in order to extract the relevant moving pixels from the initial motion field. A suitable threshold value is estimated to assign an appropriate label to the moving blobs on the foreground mask. Finally, an adaptive Kalman filter is integrated with the object extraction module in order to track the object in the foreground. Extensive quantitative experiments show that the proposed method competently handles object extraction, which in turn improves tracking under static and dynamic background conditions.

This article belongs to the special issue on Engineering and Material Sciences.

∗ Corresponding author. Tel.: +91 9893599980; fax: +91 7554051514.

E-mail address: [email protected] (S. Kumar).

http://dx.doi.org/10.1016/j.pisc.2016.04.064


Introduction

Many computer vision applications require real-time segmentation of moving objects, which is a vital step in target localization, traffic monitoring and intruder activity analysis (Bouwmans, 2014). Background subtraction is one of the most widely used methods for detecting moving objects in the foreground (Vosters et al., 2012; Nikolov and Kostov, 2014; Xue et al., 2012). In object tracking, background subtraction offers several advantages: it provides high accuracy at the pixel, frame and region levels. In addition, it is computationally inexpensive and does not require prior information about the scene to localize the moving object on the foreground mask. Moreover, background initialization and reconstruction have proved effective under a static camera arrangement for obtaining the initial estimate of motion. The intention behind extracting the blob of the moving object is to provide complementary statistical information about the regions. The Kalman filter requires an initial estimate of the position of an object (blob) in order to obtain an improved trajectory during tracking (Ha, 2012; Weng et al., 2006). In this case, background subtraction provides a sufficiently large set of tracking samples, or blob region, which is not obtained through corner and edge detectors.

In this work, a background model and an object extraction module are proposed to enhance the object segmentation process and to provide a reliable input to the tracking mechanism. The approach also yields foreground blobs that are free from artificial trails, ghosts, aperture and hole distortions (Cucchiara et al., 2003). The remaining sections of this paper are organized as follows: the second section presents some background subtraction schemes used in tracking. In the third section, we present our background model and object extraction procedures. Experimental results and performance analysis are shown in the fourth section. Concluding remarks are given in the final section.

Related works

Background subtraction plays a vital role in detecting moving objects of interest and tracking them across the consecutive frames of a video sequence. In Bouwmans (2014), some background models are reviewed to observe their characteristics. The review suggests taking advantage of a combined approach of basic and statistical models, since the integrated approach can be robust to noise and can operate at the pixel level to solve many problems. An adaptive thresholding scheme combined with a temporal averaging method is used to solve the problems of the traditional frame difference (FD) method (Nikolov and Kostov, 2014). However, it could not detect the complete blob in the foreground. The same problem is found in (Ha, 2012; Mazinan and Amir-Latifi, 2012). In Rahman et al. (2013), the authors used a second-order filter in the gradient direction to obtain the relevant edges of the moving object by applying the traditional background-updating scheme. The method could not support real-time applications or noise suppression. In Zhang and Ding (2012), an adaptive background model using a mean filter is proposed to eliminate impulsive noise. However, the mean filter is less robust to hollow spaces and does not adapt to the variance of pixels.

In Xue et al. (2012), the authors proposed a phase-based background model. It extracts the foreground pixels using a distance transform. Although the method is robust in finding foreground pixels under changing illumination and bootstrapping conditions, it suffers from high time complexity.

In Stauffer and Grimson (2000), the authors proposed a multi-valued background-modelling scheme in which each pixel of the background model is modelled using a Gaussian mixture model (GMM). The method is found to be effective against dynamic backgrounds, yet it does not discriminate between background and foreground pixels near camouflage regions and under changing illumination conditions. The statistical method employed in (Oral and Deniz, 2007) does not update the background model, which makes it less useful under varying illumination conditions. In Vosters et al. (2012), the authors proposed a combined approach of Eigen background and a statistical illumination model to handle the disruption due to rapid illumination changes. However, it could not solve the problem of local motion because the Eigenspace of the background model did not incorporate the next position of the object.

To recapitulate, some effective models have higher computational complexity, while the simpler background models are not effective under complex conditions. Effective background reconstruction and a sufficient object sample size are necessary to extract and track the object in the foreground (Mandellos et al., 2011). In this work, we include spatial and temporal constraints to identify the object, and remove unnecessary background pixels by integrating a region-level operation (Yao and Ling, 2014).

Proposed method

This section presents the proposed method in two stages. The first stage covers background modelling and object extraction, while the second stage covers object tracking using Kalman filtering (Weng et al., 2006). The work focuses on greyscale videos taken under a static camera arrangement.

Background modelling and object extraction

The steps of the proposed method follow Algorithm 1, which covers the background modelling and object extraction phase. Initially, the average of some initial frames is taken as the reference background. Temporal processing alone creates holes and ignores the spatial correlation amongst the moving pixels. Therefore, an approximate motion field is derived using both background subtraction and temporal differencing. The state of each pixel is examined through spatio-temporal filtering, and an approximate initial motion field is estimated by thresholding out the background pixels. A background model should adapt to dynamic changes, whether local (swaying trees, rippling water, fountains) or global (resolution or illumination change). In this work, the proposed background model adapts to temporal changes, which efficiently avoids the common unwanted variations and extracts the complementary region in the scene.
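The reference background and the initial motion field described above (steps 1 to 9 of Algorithm 1) might be sketched in plain Python as follows. This is only an illustration under stated assumptions: frames are nested lists of grey levels, the function names are hypothetical, and the scaling coefficient `coeff` applied to the standard deviation is an assumed value because the source does not preserve it.

```python
from statistics import mean, pstdev


def average_frames(frames):
    """Pixel-wise mean of K equally sized grey-scale frames (step 1)."""
    k = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[x][y] for f in frames) / k for y in range(cols)]
            for x in range(rows)]


def abs_diff(a, b):
    """Pixel-wise absolute difference of two frames (step 3)."""
    return [[abs(p - q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def global_threshold(img, coeff):
    """mean + coeff * std over all pixels (step 4); coeff is an assumption."""
    flat = [p for row in img for p in row]
    return mean(flat) + coeff * pstdev(flat)


def initial_motion_field(current, previous, background, coeff=1.0):
    """Steps 3-9: keep X where the background-subtraction mask or the
    frame-difference mask is set, and write 0 elsewhere."""
    x_img = abs_diff(current, background)   # X
    fd_img = abs_diff(current, previous)    # FD
    tau1 = global_threshold(x_img, coeff)
    tau2 = global_threshold(fd_img, coeff)
    return [[xv if (xv > tau1 or fv > tau2) else 0
             for xv, fv in zip(x_row, fd_row)]
            for x_row, fd_row in zip(x_img, fd_img)]
```

Using the union of the two masks, as in step 9, lets the frame difference contribute pixels that the background model misses, while the stored value is the background-subtraction magnitude X rather than a binary flag, so the later entropy stage still has grey levels to work with.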


Algorithm 1. Object extraction module

1. Average K initial frames {I_1, I_2, I_3, ..., I_K} to generate the reference background B^R_t(x,y), where 0 < K < 11.
2. for F ← 1 to N do    // N = total number of frames
3.   X = abs(I_t(x,y) − B^R_t(x,y))
     FD = abs(I_t(x,y) − I_{t−1}(x,y))    // I_t(x,y) and I_{t−1}(x,y) are the current and previous frame respectively
4.   Initialize thresholds τ1 and τ2:
     τ1 = mean(X) + […] .* (std(X)),  τ2 = mean(FD) + […] .* (std(FD))
5.   for x ← 1 to rows of frame do
6.     for y ← 1 to columns of frame do
7.       if X(x,y) > τ1 then B^BS_t(x,y) = 1; else B^BS_t(x,y) = 0; end
8.       if abs(I_t(x,y) − I_{t−1}(x,y)) > τ2 then B^FD_t(x,y) = 1; else B^FD_t(x,y) = 0; end
9.       if (B^FD_t ∪ B^BS_t) is true then B^IB_t(x,y) = X(x,y); else B^IB_t(x,y) = 0; end
       end
     end
     // Calculate entropy of initial motion field
10.  Define a window of size r × r, where r = 8.
11.  for i ← 1 to number of rows, incremented by r
12.    for j ← 1 to number of columns, incremented by r
13.      M = B^IB_t(i → i+r, j → j+r)
14.      Discard the pixels with intensity smaller than 3.
15.      Calculate pdf using the pixels of M    // pdf → probability density function
16.      Calculate entropy E_t = − Σ_{R=Rmin}^{Rmax} pdf(R) log(pdf(R)), where Rmin and Rmax are the minimum and maximum of the grey levels R.
17.      if E_t(i,j) > τ3 then M(i,j) = B^IB_t(i,j); else M(i,j) = 0; end    // τ3 = 3
18.      if M(i,j) > τ4 then D_t(i,j) = 1; else D_t(i,j) = 0; end
19.      To calculate τ4:
         - Initialize τ4 by taking the mean of X.
         - Update τ4 using the given equation: τ4 = X(x,y) + (α × M̄ − β Y(x,y)), where M̄ is the mean of M and Y is the current average of the updated background frame.
20.      Update the current background using B_t(i,j) = B_{t−1}(i,j) + signum(I_t(i,j) − I_{t−1}(i,j))
       end
     end
   end
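The entropy stage of Algorithm 1 (steps 10 to 17) can be illustrated with a short plain-Python sketch. Several details are assumptions rather than the authors' choices: the logarithm base (base 2 here, since the source does not state it), the use of a normalized grey-level histogram as the pdf, and the hypothetical function names. The comparison direction follows step 17 as written, keeping blocks whose entropy exceeds τ3.

```python
import math
from collections import Counter


def block_entropy(block, min_level=3):
    """Shannon entropy of the grey levels in a block, ignoring pixels whose
    intensity is below min_level (step 14). Base-2 log is an assumption."""
    values = [int(p) for row in block for p in row if p >= min_level]
    if not values:
        return 0.0
    n = len(values)
    # Normalized histogram used as the pdf (step 15), then step 16.
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def entropy_filter(field, r=8, tau3=3.0, min_level=3):
    """Keep r x r blocks of the initial motion field whose entropy exceeds
    tau3 (step 17) and zero out every other block."""
    rows, cols = len(field), len(field[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(0, rows, r):
        for j in range(0, cols, r):
            block = [row[j:j + r] for row in field[i:i + r]]
            if block_entropy(block, min_level) > tau3:
                for di, row in enumerate(block):
                    out[i + di][j:j + len(row)] = row
    return out
```

With r = 8 a block holds at most 64 pixels, so a base-2 entropy can never exceed 6; under that assumption the threshold τ3 = 3 sits at the middle of the achievable range.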

In a complex environment, the state of the pixels in B^IB_t(x,y) is affected by heavy local or global changes, causing irrelevant pixels to appear in the foreground. We therefore monitor the regional entropy to recover the actual moving pixels in the approximate initial motion field. The pixels belonging to the actual moving object have low entropy compared with the falsely detected pixels that arise due to either dynamics or illumination changes. During the probability density function evaluation, we discard the lower grey levels in the range below '3'. Based on the block-wise entropy, an initial motion field is estimated using a proper threshold. Finally, the largest moving blob is labelled after a morphological open operation followed by a close operation on D_t(i,j).

Object tracking using Kalman filtering
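Selecting the largest moving blob and its centre, which later serves as the measurement for tracking, could look like the sketch below. It assumes a binary mask given as nested lists and 4-connectivity (the source does not state the connectivity), uses a hypothetical function name, and omits the morphological open and close operations for brevity.

```python
from collections import deque


def largest_blob_centroid(mask):
    """Return ((row, col) centroid, area) of the largest 4-connected blob of
    non-zero pixels in a binary mask, or (None, 0) if the mask is empty."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for sx in range(rows):
        for sy in range(cols):
            if not mask[sx][sy] or seen[sx][sy]:
                continue
            # Breadth-first flood fill collects one connected blob.
            blob, queue = [], deque([(sx, sy)])
            seen[sx][sy] = True
            while queue:
                x, y = queue.popleft()
                blob.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < rows and 0 <= ny < cols
                            and mask[nx][ny] and not seen[nx][ny]):
                        seen[nx][ny] = True
                        queue.append((nx, ny))
            if len(blob) > len(best):
                best = blob
    if not best:
        return None, 0
    area = len(best)
    centroid = (sum(p[0] for p in best) / area, sum(p[1] for p in best) / area)
    return centroid, area
```

Keeping only the largest component is a simple way to suppress small residual blobs that survive the entropy and threshold stages.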

An adaptive Kalman filter algorithm proposed in Weng et al. (2006) has been adopted to estimate the centre of the largest moving blob inside the rectangle. Despite errors due to unconstrained measurement and local motion in the background, the Kalman filter is able to estimate tracking positions using minimal information about the blob. Initially, the state $\hat{S}(t)$ and measurement $m(t)$ models are specified to estimate (predict) the next position. The models are defined as:

$$\hat{S}(t) = A\hat{S}(t-1) + g(t) \tag{1}$$

$$m(t) = H(t)\hat{S}(t) + v(t) \tag{2}$$

where $A$ and $H(t)$ refer to the state transition matrix and the measurement matrix respectively. White Gaussian noise terms $g(t)$ and $v(t)$ with zero mean may arise in the model due to unconstrained measurement. Therefore, the covariance matrices $Q$ and $R$ are estimated using $g(t)$ and $v(t)$ in conjunction with the Kronecker delta function to define the noise at a given instant.

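The estimate of the next state $\hat{S}^+(t)$ is obtained by combining the actual measurement with the prior estimate of the state $\hat{S}^-(t)$:

$$\hat{S}^+(t) = \hat{S}^-(t) + K(t)\bigl(m(t) - H(t)\hat{S}^-(t)\bigr) \tag{3}$$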

The variable $K(t)$ refers to the Kalman gain and is written as:

$$K(t) = \hat{P}^-(t)H(t)^T\bigl(H(t)\hat{P}^-(t)H(t)^T + R(t)\bigr)^{-1} \tag{4}$$

The Kalman gain includes the prior error covariance matrix $\hat{P}^-(t)$, which is derived using the covariance of $\hat{e}^-(t) = \hat{S}(t) - \hat{S}^-(t)$. In a similar manner, the posterior error covariance matrix $\hat{P}^+(t)$ is derived using the covariance of $\hat{e}^+(t) = \hat{S}(t) - \hat{S}^+(t)$. This $\hat{P}^-(t)$, along with $K(t)$ and the current input, is used to find the next state, and the process is repeated recursively.

For the next measurement, the prior state and error covariance are computed recursively as follows.

$$\hat{S}^-(t) = A\hat{S}^+(t-1) \tag{5}$$

$$\hat{P}^-(t) = A\hat{P}^+(t-1)A^T + Q(t-1) \tag{6}$$

The main goal is to estimate the correct state using Eq. (3) by correcting the Kalman gain through Eq. (4). The larger the Kalman gain, the smaller the measurement error. The final step is to obtain the posterior error covariance matrix using Eq. (7). The previous posterior estimate is used to create a new prior estimate to improve the measurement.

$$\hat{P}^+(t) = (I - K(t)H(t))\hat{P}^-(t) \tag{7}$$
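One predict and update cycle built from Eqs. (4) to (7), together with the state correction referred to as Eq. (3), can be sketched in plain Python. The sketch assumes a two-element state holding the current and previous position along one axis, a scalar measurement with H = [1, 0], and fixed Q and R; these choices and the helper names are illustrative and not taken from the adaptive scheme of Weng et al. (2006).

```python
def mat_mul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


A = [[2.0, -1.0], [1.0, 0.0]]  # constant-velocity transition matrix
H = [[1.0, 0.0]]               # assumed: only the current position is measured


def kalman_step(s_post, p_post, m, q, r):
    """Return the posterior state (2x1) and covariance (2x2) after one
    scalar measurement m, given process noise q (2x2) and noise variance r."""
    # Prediction of the prior state and covariance, Eqs. (5) and (6).
    s_prior = mat_mul(A, s_post)
    p_prior = mat_add(mat_mul(mat_mul(A, p_post), transpose(A)), q)
    # Kalman gain, Eq. (4); the bracketed term is 1x1, so inversion is a division.
    ph_t = mat_mul(p_prior, transpose(H))
    innovation_var = mat_mul(H, ph_t)[0][0] + r
    gain = [[v[0] / innovation_var] for v in ph_t]
    # State correction with the measurement residual.
    residual = m - mat_mul(H, s_prior)[0][0]
    s_new = [[s[0] + g[0] * residual] for s, g in zip(s_prior, gain)]
    # Posterior covariance, Eq. (7).
    kh = mat_mul(gain, H)
    i_minus_kh = [[(1.0 if i == j else 0.0) - kh[i][j] for j in range(2)]
                  for i in range(2)]
    p_new = mat_mul(i_minus_kh, p_prior)
    return s_new, p_new
```

For a two-dimensional blob centre, one such filter per image axis would be a simple option; a joint four-element state is an equally valid design.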

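It is assumed that the object covers an equal distance in every interval, which is represented as:

$$f(t) = f(t-1) + (f(t-1) - f(t-2)) \tag{8}$$

where $f(t-1)$ and $f(t-2)$ are the distances covered by the object in frames $t-1$ and $t-2$, respectively.

The state vectors $\hat{S}(t)$ and $\hat{S}(t-1)$ of Eq. (1) can be derived using the following equation.

$$\hat{S}(t) = \begin{bmatrix} f(t) \\ f(t-1) \end{bmatrix} = \begin{bmatrix} 2 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} f(t-1) \\ f(t-2) \end{bmatrix} + \begin{bmatrix} g(t) \\ 0 \end{bmatrix} \tag{9}$$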
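As a quick consistency check, the matrix form with transition matrix rows (2, −1) and (1, 0) acting on the previous state $[f(t-1),\ f(t-2)]^T$ can be expanded row by row; it reduces to Eq. (8) plus the process noise. The numeric values in the last line are purely illustrative.

```latex
\begin{aligned}
\text{Row 1:}\quad f(t) &= 2f(t-1) - f(t-2) + g(t) \\
                        &= f(t-1) + \bigl(f(t-1) - f(t-2)\bigr) + g(t), \\
\text{Row 2:}\quad f(t-1) &= f(t-1), \\
\text{Example:}\quad f(t) &= 2 \cdot 7 - 4 = 10
  \quad\text{for } f(t-2) = 4,\ f(t-1) = 7,\ g(t) = 0.
\end{aligned}
```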

Experimental analysis

This section presents a qualitative and quantitative analysis on different video sequences using the proposed algorithm, and its performance parameters are compared with those of other existing methods. The 'IR', 'WS' and 'Office' experimental sequences show great diversity across successive frames due to illumination variation, local motion in the background and the aperture problem. The 'WS' sequence has a camouflage region below the knee of the person and exhibits local motion in the background due to rippling water. The moving object changes its dimensions in successive frames of 'IR', while the 'Office' sequence shows an object that moves slowly and remains stationary for a long time.

Figs. 1 and 2 show the qualitative performance of the proposed method. The first row of Figs. 1 and 2 shows the sampled frames with tracking results, while the last row shows the motion mask obtained through this method. The proposed method classifies foreground and background well under both static and dynamic background conditions. The qualitative comparison on some sampled frames shown in Fig. 3 illustrates the efficacy of the proposed method in complex situations.

Figure 1 Detected motion mask of 'WS' sequence.

Figure 2 Detected motion mask of 'IR' sequence.

The quantitative results are evaluated through the Similarity, F1 and Detection Rate metrics (Rahman et al., 2013). These evaluation metrics depend on tp (true positive), tn (true negative), fp (false positive) and fn (false negative). The 'tp' and 'tn' are the correctly detected foreground and background pixels respectively. The 'fp' and 'fn' are the incorrectly detected foreground and background pixels respectively. The parameters Detection Rate, Similarity and F1 are given as:

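$$\text{Detection Rate} = \frac{tp}{tp + fn} \tag{10}$$

$$\text{Similarity} = \frac{tp}{tp + fp + fn} \tag{11}$$

$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{12}$$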

where Recall $= tp/(tp + fn)$ is the fraction of true foreground pixels that are detected and Precision $= tp/(tp + fp)$ is the fraction of detected foreground pixels that are correct. Fig. 4 presents the detection rate on sampled frames of each video sequence for this method along with other existing methods. Initially, some methods such as GMM (Stauffer and Grimson, 2000) and the method of Rahman et al. (2013) have a good detection rate, but their performance degrades on subsequent frames because the object either becomes stationary or suffers from ghosts in the foreground. However, the detection rate of our method is superior to that of the other methods for each video sequence. The average detection rate obtained through this method is up to 40% and 35% greater than GMM and the method of Rahman et al. (2013) respectively for all videos. The detection rate obtained through the proposed scheme is also better than that of traditional FD (Nikolov and Kostov, 2014), as it does not create holes inside the moving entity. Table 1 lists the average Similarity and F1 score of the moving mask extracted by the method of Rahman et al. (2013), GMM, FD and the proposed approach for each video sequence. The average Similarity and F1 score secured through this method are up to 35% and 33% greater than those attained by GMM for the 'WS' video. Moreover, it significantly outperforms the FD method in providing a promising motion mask in the foreground. Only this method attains Similarity and F1 scores exceeding 80% for the Office sequence, in which the object becomes stationary for a long time.

Figure 3 Visual comparison on foreground motion masks.

Figure 4 Detection rate of (a) WS, (b) Office and (c) IR video. Each panel plots Detection Rate against Number of Frames for the Proposed method, GMM, Method [9] and FD.

Table 1 Quantitative performance comparison on foreground motion mask.

Sequence  Evaluation  Proposed method  Method (Rahman et al., 2013)  GMM     FD
WS        Similarity  0.8745           0.7000                        0.3476  0.1542
WS        F1          0.9325           0.8444                        0.5112  0.2677
IR        Similarity  0.7791           0.4035                        0.5116  0.2064
IR        F1          0.8983           0.5750                        0.6769  0.3132
Office    Similarity  0.8189           0.4864                        0.2729  0.2451
Office    F1          0.9129           0.6545                        0.4275  0.3938
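The metrics of Eqs. (10) to (12) can be computed from a detected mask and a ground-truth mask as in the following sketch. The function names are hypothetical, masks are assumed to be nested lists of 0/1 values, and returning 0.0 when a denominator is zero is a convention chosen here, not one stated in the paper.

```python
def confusion_counts(pred, truth):
    """Count tp, fp, fn and tn between two binary masks of equal size."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def detection_scores(tp, fp, fn):
    """Detection Rate (Eq. 10), Similarity (Eq. 11) and F1 (Eq. 12)."""
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    similarity = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"detection_rate": recall, "similarity": similarity, "f1": f1}
```

None of the three scores uses tn, so they are not inflated by the large number of background pixels that are easy to classify correctly.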


Conclusion

The proposed algorithm localizes the entire target region with high similarity and accuracy. The system effectively combines spatio-temporal processing with a region-level operation in order to reduce background clutter. The proposed scheme is found to be effective in eliminating ghost and aperture distortion. It provides a sufficient sample size of the object to the Kalman filter in order to predict the target position in successive frames. Simulation results show that the method localizes moving or stationary objects better than other background subtraction schemes used for tracking.

References

Bouwmans, T., 2014. Traditional and recent approaches in background modeling for foreground detection: an overview. Comput. Sci. Rev. 11, 31-66.

Cucchiara, R., Grana, C., Piccardi, M., Prati, A., 2003. Detecting moving objects, ghosts, and shadows in video streams. IEEE Trans. Pattern Anal. Mach. Intell. 25 (10), 1337-1342.

Ha, F., 2012. Centroid weighted Kalman filter for visual object tracking. J. Meas. 45 (4), 650-655.

Mandellos, N.A., Keramitsoglou, I., Kiranoudis, C.T., 2011. A background subtraction algorithm for detecting and tracking vehicles. Expert Syst. Appl. 38 (3), 1619-1631.

Mazinan, A.H., Amir-Latifi, A., 2012. Applying mean shift, motion information and Kalman filtering approaches to object tracking. ISA Trans. 51 (3), 485-497.

Nikolov, B., Kostov, N., 2014. Motion detection using adaptive temporal averaging method. Radioengineering 23 (2), 652-658.

Oral, M., Deniz, U., 2007. Center of mass model: a novel approach to background modeling for segmentation of moving objects. Image Vis. Comput. 25 (8), 1365-1376.

Rahman, A.Y.F., Hussain, A., Zaki, W., Zaman, B., Tahir, M., 2013. Enhancement of background subtraction techniques using a second derivative in gradient direction filter. J. Electric. Comput. Eng. 2013 (21), 1-12.

Stauffer, C., Grimson, W.E.L., 2000. Learning patterns of activity using real-time tracking. IEEE Trans. Pattern Anal. Mach. Intell. 22 (9), 747-757.

Vosters, L., Shan, C., Gritti, T., 2012. Real-time robust background subtraction under rapidly changing illumination conditions. Image Vis. Comput. 30 (12), 1004-1015.

Weng, S.K., Kuo, C.M., Tu, S.K., 2006. Video object tracking using adaptive Kalman filter. J. Vis. Commun. Image Represent. 17 (6), 1190-1208.

Xue, G., Sun, J., Song, L., 2012. Background subtraction based on phase feature and distance transform. Pattern Recognit. Lett. 33 (12), 1601-1613.

Yao, Li, Ling, M., 2014. An improved mixture-of-Gaussians background model with frame difference and blob tracking in video stream. Sci. World J. (2014), 1-9.

Zhang, R., Ding, J., 2012. Object tracking and detecting based on