4.2 Variational inference
4.3.2 Results and discussion
In this section, we implement the proposed HPYPGGM algorithm on the challenging Change Detection dataset [70] which consists of 31 videos categorized into 6 differ- ent categories (baseline, shadows, dynamic background, intermittent object motion, camera jitter, and thermal). To evaluate the efficiency of the proposed model, we consider six videos of the Change Detection dataset which are described as follows:
• Pedestrians: This video sequence shows pedestrians walking in a park. • Office: This video sequence shows a person walking around in an office. • Library: This thermal video sequence shows a person walking in the library. • Corridor: This thermal video sequence shows a person walking in the corridor. • Canoe: This video sequence shows a moving canoe in a dynamic background. • Badminton: This video sequence shows players playing badminton.
Sample images from the videos can be found in Fig. 11. In our experiments, we initialize threshold Υ = [0.55, 0.75] for different videos. Our results for each of the
(a) Pedestrians. (b) Office. (c) Library.
(d) Corridor. (e) Canoe. (f) Badminton.
Figure 11: Sample frames of the video sequences from Change Detection dataset.
video sequences can be observed in confusion matrix form in Fig. 12. We evaluate the classification measure by accuracy, recall and precision which are defined as accuracy = (TP + TN)/Total no of observations, recall = TP/(TP + FN) and precision = TP/(TP + FP) where TP, TN, FP, and FN represent the total number of true positives, true negatives, false positives, and false negatives respectively. The reported results of precision and recall are based on the macro averages of the overall frames.
(a) Pedestrians. (b) Office. (c) Library.
(d) Corridor. (e) Caneo. (f) Badminton.
We compare our results with four other approaches from the literature; namely, K-means, GMM, variational GMM (VGMM) and Dirichlet process GMM (DPGMM). We set the number of components to 6 for K-means, GMM, VGMM and DPGMM. The threshold Υ is set to the same value as HPYPGGM for a fair comparison between the models. A visual comparison of the results from all the video sequences can be observed in Fig. 13 and Table 4 shows the comparison of the proposed HPYPGGM against K-means, GMM, VGMM and DPGMM. We can observe in the Pedestrians video sequence, that our model performed better in classifying between the back- ground and foreground pixels while the others misclassified most of the background with foreground pixels. In the Office video sequence, K-means, GMM, VGMM and DPGMM were not able to precisely distinguish between the background and fore- ground pixels. This may be due to the close color intensity of the person’s jeans with the color intensity of the box next to him. Nonetheless, HPYPGGM was able to segment a better foreground compared with the other models. All the models performed better in the Library video sequence where the background pixels are dark with perfect illumination, thereby resulting in a high accuracy with a supporting high precision and recall for all the models. In the Canoe video sequence all the models misclassified background water with foreground. However, HPYPGMM was able to give a better classification between the background and the foreground pixels. This can be observed clearly in Fig. 13. Similar results were obtained in the Badminton video sequence. Table 4 shows that the proposed HPYPGGM model outperformed K-means, GMM, VGMM and DPGMM in most cases in terms of precision, recall and accuracy.
Table 4: The macro average results of background subtraction with the Change De- tection dataset.
Model Precision (%) Recall (%) Accuracy (%) Pedestrians K-means 67.30 73.11 96.48 GMM 60.12 73.42 94.02 VGMM 57.21 71.60 92.31 DPGMM 67.48 65.33 93.33 HPYPGGM 77.31 69.10 97.64 Office K-means 67.51 65.23 90.52 GMM 56.10 60.40 81.30 VGMM 58.92 65.11 81.51 DPGMM 89.48 87.12 87.21 HPYPGGM 93.31 65.20 94.05 Library K-means 98.97 96.02 98.04 GMM 98.56 96.01 98.06 VGMM 98.61 96.30 98.09 DPGMM 98.48 98.13 98.09 HPYPGGM 99.01 96.31 98.72 Corridor K-means 92.31 62.01 84.22 GMM 94.12 76.89 90.25 VGMM 95.23 78.12 83.50 DPGMM 94.48 93.21 93.01 HPYPGGM 96.01 83.51 93.43 Caneo K-means 80.10 78.26 93.22 GMM 76.82 77.12 92.59 VGMM 71.13 77.23 90.64 DPGMM 93.10 74.42 93.46 HPYPGGM 97.01 74.51 95.66 Badminton K-means 61.12 72.51 95.22 GMM 57.51 74.12 93.52 VGMM 57.12 74.71 93.50 DPGMM 66.48 65.47 94.73 HPYPGGM 66.01 66.52 97.40
Figure 13: The foreground mask results for each of the original images (Pedestrians, Office, Library, Corridor, Caneo and Badminton from top to bottom respectively) obtained by K-means, GMM, VGMM, DPGMM and HPYPGMM algorithms are shown in columns 1 to 5 respectively.
Chapter 5
Conclusion
Clustering algorithms have been broadly applied in many research areas such as com- puter vision, signal processing, and pattern recognition. A mixture model, one of the most predominant statistical techniques, clusters data into a collection of homoge- neous groups. Gaussian distribution has been widely used and studied with success for many applications involving computer vision, machine learning, image processing and statistical analysis. However, in many real applications, Gaussian fails to fit different shapes of data.
In this thesis, first we have presented a variational inference approach for general- ized Gaussian Distribution. The algorithm was based on treating the shape parameter as a variable. Using the single-step update of Newton’s method, the shape param- eter is updated in the VM-step. Experimental results on medical, astrological, and image segmentation applications have shown the effectiveness of the algorithm when compared with the traditional models.
Second, we extended the variational inference approach to the infinite case using Dirichlet process and applied feature selection. Also, by extending the model to infinity with simultaneous feature selection, we were able to detect the number of mixture components and relevant features without the need to specify the number of mixture components a priori thereby resulting in an overall better accuracy. The variational learning approach aided in approximating the posteriors and experimental results have shown that the proposed variational IGGM with feature selection has favorable results compared to standard models.
hierarchical Pitman-Yor process prior for our model. The proposed model tackles the estimation of parameters via variational learning. We inspected the benefits of our approach on video background subtraction using the challenging Change Detection dataset. The experimental results compared with the traditional models as well as the Dirichlet process of Gaussian distributions shows that our nonparametric Bayesian framework performed better over other models resulting in significant outcomes.
In conclusion, when compared with existing models and techniques which are mostly based on Gaussian assumption, our approach’s can model non-Gaussian data with an efficient approximation of the model parameters resulting in better accuracy. Our models were also able to automatically determine the better number of mixture components resulting in overall better performance.
Future work could be dedicated to investigating online variational techniques for our proposed approaches and also extending the proposed models with component splitting. Another potential future work related to video background subtraction can be to modify the pixel based approach into segmenting each pixel of all the frames might lead to better over all performance in terms of classifying foreground and background pixels.
Bibliography
[1] Chris Stauffer and W Eric L Grimson. Adaptive background mixture models for real-time tracking. In Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149), volume 2, pages 246–252. IEEE, 1999.
[2] Chi Liu, Heng-Chao Li, Kun Fu, Fan Zhang, Mihai Datcu, and William J Emery. Bayesian estimation of generalized gamma mixture model based on variational em algorithm. Pattern Recognition, 87:269–284, 2019.
[3] Giancarlo Calvagno, Cristiano Ghirardi, Gian Antonio Mian, and Roberto Ri- naldo. Modeling of subband image data for buffer control. IEEE transactions on circuits and systems for video technology, 7(2):402–408, 1997.
[4] Minh N Do and Martin Vetterli. Wavelet-based texture retrieval using general- ized gaussian density and kullback-leibler distance. IEEE transactions on image processing, 11(2):146–158, 2002.
[5] J-F Aujol, Gilles Aubert, and Laure Blanc-F´eraud. Wavelet-based level set evo- lution for classification of textured images. IEEE Transactions on Image Pro- cessing, 12(12):1634–1641, 2003.
[6] Siu-Kai Choy and Chong-Sze Tong. Supervised texture classification using char- acteristic generalized gaussian density. Journal of Mathematical Imaging and Vision, 29(1):35–47, 2007.
[7] Mohand Said Allili, Nizar Bouguila, and Djemel Ziou. A robust video fore- ground segmentation by using generalized gaussian mixture modeling. In Fourth Canadian Conference on Computer and Robot Vision (CRV’07), pages 503–509. IEEE, 2007.
[8] Mohand Sa¨ıd Allili, Nizar Bouguila, and Djemel Ziou. Finite general gaussian mixture modeling and application to image and video foreground segmentation. Journal of Electronic Imaging, 17(1):013005, 2008.
[9] Shu-Kai S Fan and Yen Lin. A fast estimation method for the generalized gaus- sian mixture distribution on complex images. Computer Vision and Image Un- derstanding, 113(7):839–853, 2009.
[10] Thumpudi Naveen and John W Woods. Motion compensated multiresolution transmission of high definition video. IEEE Transactions on Circuits and Systems for Video Technology, 4(1):29–41, 1994.
[11] Gabriele Moser, Josiane Zerubia, and Sebastiano B Serpico. Sar amplitude prob- ability density function estimation based on a generalized gaussian model. IEEE Transactions on Image Processing, 15(6):1429–1442, 2006.
[12] Karnran Sharifi and Alberto Leon-Garcia. Estimation of shape parameter for generalized gaussian distributions in subband decompositions of video. IEEE Transactions on Circuits and Systems for Video Technology, 5(1):52–56, 1995. [13] Pierre Moulin and Juan Liu. Analysis of multiresolution image denoising schemes
using generalized gaussian and complexity priors. IEEE Transactions on Infor- mation Theory, 45(3):909–919, 1999.
[14] Thomas Fischer. A pyramid vector quantizer. IEEE transactions on information theory, 32(4):568–583, 1986.
[15] Keith A Birney and Thomas R Fischer. On the modeling of dct and subband image data for compression. IEEE transactions on Image Processing, 4(2):186– 193, 1995.
[16] Charles Bouman and Ken Sauer. A generalized gaussian image model for edge- preserving map estimation. ECE Technical Reports, page 277, 1992.
[17] Yakoub Bazi, Lorenzo Bruzzone, and Farid Melgani. Image thresholding based on the em algorithm and the generalized gaussian distribution. Pattern Recognition, 40(2):619–634, 2007.
[18] Shu-Kai S Fan, Yen Lin, and Chia-Chan Wu. Image thresholding using a novel estimation method in generalized gaussian distribution mixture modeling. Neu- rocomputing, 72(1-3):500–512, 2008.
[19] Stephane G Mallat. A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis & Machine In- telligence, (7):674–693, 1989.
[20] Rajan L Joshi, Valerie J Crump, and Thomas R Fischer. Image subband coding using arithmetic coded trellis coded quantization. IEEE Transactions on Circuits and Systems for Video Technology, 5(6):515–523, 1995.
[21] Saeed Gazor and Wei Zhang. Speech probability distribution. IEEE Signal Processing Letters, 10(7):204–207, 2003.
[22] Kostas Kokkinakis and Asoke K Nandi. Exponent parameter estimation for generalized gaussian probability density functions with application to speech modeling. Signal Processing, 85(9):1852–1858, 2005.
[23] Demetrios Cantzos, Athanasios Mouchtaris, and Chris Kyriakakis. Multichannel audio resynthesis based on a generalized gaussian mixture model and cepstral smoothing. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005., pages 215–218. IEEE, 2005.
[24] Stephane G Mallat. A theory for multiresolution signal decomposition: the wavelet representation. IEEE transactions on pattern analysis and machine in- telligence, 11(7):674–693, 1989.
[25] S Grace Chang, Bin Yu, and Martin Vetterli. Adaptive wavelet thresholding for image denoising and compression. IEEE transactions on image processing, 9(9):1532–1546, 2000.
[26] Pierre Moulin and Juan Liu. Analysis of multiresolution image denoising schemes using generalized gaussian and complexity priors. IEEE transactions on Infor- mation Theory, 45(3):909–919, 1999.
[27] Yakoub Bazi, Lorenzo Bruzzone, and Farid Melgani. Image thresholding based on the em algorithm and the generalized gaussian distribution. Pattern Recognition, 40(2):619–634, 2007.
[28] Minh N Do and Martin Vetterli. Wavelet-based texture retrieval using general- ized gaussian density and kullback-leibler distance. IEEE transactions on image processing, 11(2):146–158, 2002.
[29] Jacob Scharcanski. A wavelet-based approach for analyzing industrial stochas- tic textures with applications. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 37(1):10–22, 2006.
[30] Bruno Aiazzi, Luciano Alparone, and Stefano Baronti. Estimation based on entropy matching for generalized gaussian pdf modeling. IEEE Signal Processing Letters, 6(6):138–140, 1999.
[31] Mahesh K Varanasi and Behnaam Aazhang. Parametric generalized gaussian density estimation. The Journal of the Acoustical Society of America, 86(4):1404– 1415, 1989.
[32] Minghong Pi. Improve maximum likelihood estimation for subband ggd param- eters. Pattern Recognition Letters, 27(14):1710–1713, 2006.
[33] F M¨uller. Distribution shape of two-dimensional dct coefficients of natural im- ages. Electronics Letters, 29(22):1935–1936, 1993.
[34] Sylvain Meignen and Hubert Meignen. On the modeling of small sample distri- butions with generalized gaussian density in a maximum likelihood framework. IEEE Transactions on Image Processing, 15(6):1647–1652, 2006.
[35] David Barber. Bayesian reasoning and machine learning. Cambridge University Press, 2012.
[36] Christophe Andrieu, Nando De Freitas, Arnaud Doucet, and Michael I Jordan. An introduction to mcmc for machine learning. Machine learning, 50(1-2):5–43, 2003.
[37] Ziyang Song, Samr Ali, and Nizar Bouguila. Bayesian learning of infinite asym- metric gaussian mixture models for background subtraction. In International Conference on Image Analysis and Recognition, pages 264–274. Springer, 2019.
[38] Martin J Wainwright, Michael I Jordan, et al. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learn-R
ing, 1(1–2):1–305, 2008.
[39] Eren Gultepe and Masoud Makrehchi. Improving clustering performance using independent component analysis and unsupervised feature learning. Human- centric Computing and Information Sciences, 2018.
[40] Nizar Bouguila and Djemel Ziou. A dirichlet process mixture of dirichlet dis- tributions for classification and prediction. In 2008 IEEE workshop on machine learning for signal processing, pages 297–302. IEEE, 2008.
[41] Michael Harrison and Arie Shirom. Organizational diagnosis and assessment: Bridging theory and practice. Sage Publications, 1998.
[42] Martin HC Law, Mario AT Figueiredo, and Anil K Jain. Simultaneous feature selection and clustering using mixture models. IEEE transactions on pattern analysis and machine intelligence, 26(9):1154–1166, 2004.
[43] Constantinos Constantinopoulos, Michalis K Titsias, and Aristidis Likas. Bayesian feature and model selection for gaussian mixture models. IEEE Trans- actions on Pattern Analysis and Machine Intelligence, 28(6):1013–1018, 2006. [44] Yee Whye Teh and Michael I Jordan. Hierarchical bayesian nonparametric mod-
els with applications. Bayesian nonparametrics, 1:158–207, 2010.
[45] Yee W Teh, Michael I Jordan, Matthew J Beal, and David M Blei. Sharing clusters among related groups: Hierarchical dirichlet processes. In Advances in neural information processing systems, pages 1385–1392, 2005.
[46] Jim Pitman and Marc Yor. The two-parameter poisson-dirichlet distribution derived from a stable subordinator. The Annals of Probability, pages 855–900, 1997.
[47] Srikanth Amudala, Samr Ali, Fatma Najar, and Nizar Bouguila. Variational inference of finite generalized gaussian mixture models. In 2019 IEEE Symposium Series on Computational Intelligence (SSCI), pages 2433–2439. IEEE, 2019.
[48] Srikanth Amudala, Samr Ali, and Nizar Bouguila. Variational inference of infi- nite generalized gaussian mixture models with feature selection. In 2020 IEEE International Conference on Systems, Man and Cybernetics (SMC) Organizing Committee. IEEE, 2020.
[49] Srikanth Amudala, Samr Ali, and Nizar Bouguila. Background subtraction with a hierarchical pitman-yor process mixture model of generalized gaussian distribu- tion. In International Conference on Information Reuse and Integration. IEEE, 2020.
[50] Tarek Elguebaly and Nizar Bouguila. Bayesian learning of finite generalized gaussian mixture models on images. Signal Processing, 91(4):801–820, 2011. [51] Geoffrey McLachlan and Thriyambakam Krishnan. The EM algorithm and ex-
tensions, volume 382. John Wiley & Sons, 2007.
[52] Dimitris G Tzikas, Aristidis C Likas, and Nikolaos P Galatsanos. The varia- tional approximation for bayesian inference. IEEE Signal Processing Magazine, 25(6):131–146, 2008.
[53] Christopher M Bishop. Pattern recognition and machine learning. Springer, 2006.
[54] Stephen Boyd and Lieven Vandenberghe. Convex optimization. Cambridge Uni- versity Press, 2004.
[55] Linda Shapiro and C George. Stockman g: computer vision. In Prentice Hall. 2002.
[56] Lauren Barghout and Lawrence Lee. Perceptual information processing system, March 25 2004, US Patent App. 10/618,543.
[57] David M Blei, Michael I Jordan, et al. Variational inference for dirichlet process mixtures. Bayesian analysis, 1(1):121–143, 2006.
[58] Thomas S Ferguson. A bayesian analysis of some nonparametric problems. The annals of statistics, pages 209–230, 1973.
[59] Kenichi Kurihara, Max Welling, and Yee Whye Teh. Collapsed variational dirich- let process mixture models. In IJCAI, volume 7, pages 2796–2801, 2007.
[60] Jayaram Sethuraman. A constructive definition of dirichlet priors. Statistica sinica, pages 639–650, 1994.
[61] Kamal Maanicshah, Nizar Bouguila, and Wentao Fan. Variational learning for finite generalized inverted dirichlet mixture models with a component splitting approach. In 2019 IEEE 28th International Symposium on Industrial Electronics (ISIE), pages 1453–1458. IEEE, 2019.
[62] Li Fei-Fei, Rob Fergus, and Pietro Perona. Learning generative visual models from few training examples: An incremental bayesian approach tested on 101 object categories. In 2004 conference on computer vision and pattern recognition workshop, pages 178–178. IEEE, 2004.
[63] Teng Li, Tao Mei, In-So Kweon, and Xian-Sheng Hua. Contextual bag-of-words for visual categorization. IEEE Transactions on Circuits and Systems for Video Technology, 21(4):381–392, 2010.
[64] Gabriella Csurka, Christopher Dance, Lixin Fan, Jutta Willamowski, and C´edric Bray. Visual categorization with bags of keypoints. In Workshop on statistical learning in computer vision, ECCV, volume 1, pages 1–2. Prague, 2004.
[65] David G Lowe. Distinctive image features from scale-invariant keypoints. Inter- national journal of computer vision, 60(2):91–110, 2004.
[66] Herbert Bay, Andreas Ess, Tinne Tuytelaars, and Luc Van Gool. Speeded-up robust features (surf). Computer vision and image understanding, 110(3):346– 359, 2008.
[67] Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. In 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR’05), volume 1, pages 886–893. IEEE, 2005.
[68] Deyang Wang, Weixin Xie, Jihong Pei, and Zongqing Lu. Moving area detection based on estimation of static background. J Inform Comput Sci, 2(1):129–134, 2005.
[69] Wentao Fan and Nizar Bouguila. Dynamic textures clustering using a hierarchical pitman-yor process mixture of dirichlet distributions. In 2015 IEEE International Conference on Image Processing (ICIP), pages 296–300. IEEE, 2015.
[70] Yi Wang, Pierre-Marc Jodoin, Fatih Porikli, Janusz Konrad, Yannick Benezeth, and Prakash Ishwar. Cdnet 2014: An expanded change detection benchmark dataset. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pages 387–394, 2014.