• Dictionary size. The dictionary size of the BoW representation is predefined and determined empirically in this thesis. A small, compact dictionary has limited discriminative ability, while a large dictionary is likely to introduce noise. How to adaptively set the size of the dictionary so that it is compact yet discriminative is still an open question. One option is to define criteria for merging dictionary entries to construct an adaptive codebook. For instance, the method in [66] uses the Maximization of Mutual Information (MMI) principle to estimate the optimal dictionary size: two entries of a codebook are merged by maximizing the mutual information in an unsupervised way. Creating a dictionary with adaptive size will be investigated in our future work.
• Number of topics. The number of topics in the probabilistic topic models is predefined as the number of time series categories, based on the assumption that each topic corresponds to a category. However, automatically deciding the number of underlying topics (categories) is still an open problem in data clustering. A system that can discover the underlying patterns in a collection of biomedical time series with no prior knowledge would be more useful in real-world applications. Bayesian nonparametric models such as the Hierarchical Dirichlet Process (HDP) [107], which can automatically determine the number of clusters in a group of data, are one option for time series clustering based on the BoW representation.
• Label information. This thesis constructs the dictionary of the BoW representation using either k-means clustering or non-negative sparse coding, both of which are unsupervised methods. Unsupervised methods are useful for analysing biomedical time series whose label information is missing or difficult to obtain. However, when labels are available for the training data, they are an important source of information. How to incorporate the label information of the training data so that the BoW representation becomes more discriminative and compact deserves to be investigated. One example is the method in [59], which incorporates a supervised logistic regression model into the GMM used to generate the dictionary. The logistic regression model makes use of the label information to modify the parameters of the GMM. A similar method in [28] also uses label information to improve the discriminative ability of the BoW representation. This dictionary learning method simultaneously maximizes the likelihood of a set of labelled and unlabelled training data and the purity of the clusters.
• Temporal order of local segments. One limitation of the BoW representation is that it ignores the temporal order of local segments. One promising way to make use of this information is to introduce the temporal order of local segments into the probabilistic topic models. This is similar to the spatial topic models [27] [13] in computer vision, which introduce the spatial information of words into the original topic models. Another example that considers temporal order is the structural pLSA model proposed in [137], which introduces the temporal dependence of words into the pLSA model. How to better utilize the temporal order of local segments in the BoW representation and the probabilistic topic models needs further investigation.
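To make the limitation in the last item concrete, the following minimal sketch (an illustration only, not part of the methods used in this thesis) compares two toy codeword sequences that contain the same codewords in a different order. The function names bow_histogram and bigram_histogram, the dictionary size K, and the two sequences are hypothetical. Counting adjacent codeword pairs is used here only as the simplest order-aware statistic; it is not the structural pLSA model of [137], nor a temporal topic model.

```python
from collections import Counter


def bow_histogram(codewords, dictionary_size):
    """Count how often each codeword index occurs; temporal order is discarded."""
    counts = Counter(codewords)
    return [counts.get(k, 0) for k in range(dictionary_size)]


def bigram_histogram(codewords, dictionary_size):
    """Count ordered pairs of adjacent codewords (first-order temporal context)."""
    counts = Counter(zip(codewords, codewords[1:]))
    return [
        [counts.get((i, j), 0) for j in range(dictionary_size)]
        for i in range(dictionary_size)
    ]


# Hypothetical codeword indices assigned to consecutive local segments
# of two time series (K = 3 dictionary entries).
K = 3
series_a = [0, 0, 1, 1, 2, 2]
series_b = [2, 1, 0, 2, 1, 0]

# The plain BoW histograms are identical, so BoW cannot tell the series apart.
same_bow = bow_histogram(series_a, K) == bow_histogram(series_b, K)

# The adjacent-pair counts differ because they retain some temporal order.
same_bigrams = bigram_histogram(series_a, K) == bigram_histogram(series_b, K)

print(same_bow, same_bigrams)  # prints: True False
```

The pair counts grow quadratically with the dictionary size, which is one reason why order-aware extensions are usually formulated inside the topic model rather than by enlarging the histogram.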
Bibliography
[1] Text classification in python [Online]. Available: http://www.python-course.eu/text_classification_python.php.
[2] U. R. Acharya, F. Molinari, S. V. Sree, S. Chattopadhyay, K.-H. Ng, and J. S. Suri. Automated diagnosis of epileptic EEG using entropies. Biomed. Signal Process. Control, 7(4):401 – 408, 2012.
[3] M. Aharon, M. Elad, and A. Bruckstein. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Signal Process., 54(11):4311–4322, 2006.
[4] R. G. Andrzejak, K. Lehnertz, F. Mormann, C. Rieke, P. David, and C. E. Elger. Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state. Phys. Rev. E, 64(6 Pt 1), Dec. 2001.
[5] T. Bassani and J. Nievola. Brain-computer interface using wavelet transformation and naïve Bayes classifier. In A. Hussain, I. Aleksander, L. S. Smith, A. K. Barros, R. Chrisley, and V. Cutsuridis, editors, Brain Inspired Cognitive Systems 2008, volume 657 of Advances in Experimental Medicine and Biology, pages 147–165. Springer New York, 2010.
[6] L. Biel, O. Pettersson, L. Philipson, and P. Wide. ECG analysis: a new approach in human identification. IEEE Trans. Instrum. Meas., 50(3):808 –812, 2001.
[7] A. Bissacco, M. H. Yang, and S. Soatto. Detecting humans with their pose. In B. Schölkopf, J. C. Platt, and T. Hoffman, editors, Advances in Neural Information Processing Systems, pages 169–176. MIT Press, Cambridge, MA, December 2007.
[8] D. Blei and J. McAuliffe. Supervised topic models. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems, pages 121–128. MIT Press, Cambridge, MA, 2008.
[9] D. Blei, A. Ng, and M. Jordan. Latent Dirichlet Allocation. J. Mach. Learn. Res., 3:993–1022, 2003.
[10] D. M. Blei. Probabilistic topic models. Commun. ACM, 55(4):77–84, Apr. 2012.
[11] D. M. Blei and J. D. Lafferty. Dynamic topic models. In Proceedings of the 23rd international conference on Machine learning, pages 113–120, 2006.
[12] E. Candes and T. Tao. Decoding by linear programming. IEEE Trans. Inf. Theory, 51(12):4203–4215, Dec. 2005.
[13] L. Cao and L. Fei-Fei. Spatially coherent latent topic model for concurrent object segmentation and classification. In Proceedings of IEEE Intern. Conf. in Computer Vision (ICCV), 2007.
[14] Y. Cha and J. Cho. Social-network analysis using topic models. In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’12, pages 565–574, 2012.
[15] K. Chakrabarti, E. Keogh, S. Mehrotra, and M. Pazzani. Locally adaptive dimensionality reduction for indexing large time series databases. ACM Trans. Database Syst., 27:188–228, June 2002.
[16] A. Chan, M. Hamdy, A. Badre, and V. Badee. Wavelet distance measure for person identification using electrocardiograms. IEEE Trans. Instrum. Meas., 57(2):248 –253, 2008.
[17] S. Chandaka, A. Chatterjee, and S. Munshi. Cross-correlation aided support vector machine classifier for classification of EEG signals. Expert Systems with Applications, 36(2, Part 1):1329 – 1336, 2009.
[18] W.-Y. Chen, Y. Song, H. Bai, C.-J. Lin, and E. Chang. Parallel spectral clustering in distributed systems. IEEE Trans. Pattern Anal. Mach. Intell., 33(3):568–586, 2011.
[19] T.-C. Fu. A review on time series data mining. Eng. Appl. Artif. Intel., 24(1):164–181, 2011.
[20] A. Daamouche, L. Hamami, N. Alajlan, and F. Melgani. A wavelet optimization approach for ECG signal classification. Biomed. Signal Proces., 7(4):342 – 349, 2012.
[21] E. D. Übeyli. Least squares support vector machine employing model-based methods coefficients for analysis of EEG signals. Expert Systems with Applications, 37(1):233–239, 2010.
[22] D. L. Donoho. For most large underdetermined systems of linear equations the minimal 1-norm solution is also the sparsest solution. Comm. Pure Appl. Math, 59:797–829, 2004.
[23] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. Ann. Stat., 32:407–499, 2004.
[24] D. Endres and J. Schindelin. A new metric for probability distributions. IEEE Trans. Inf. Theory, 49(7):1858–1860, July 2003.
[25] S.-C. Fang and H.-L. Chan. Human identification by quantifying similarity and dissimilarity in electrocardiogram phase space. Pattern Recogn., 42(9):1824 – 1831, 2009.
[26] L. Fei-Fei and P. Perona. A Bayesian hierarchical model for learning natural scene categories. In Proc. IEEE Int’l Conf. Computer Vision and Pattern Recognition, volume 2, pages 524–531, 2005.
[27] R. Fergus, L. Fei-Fei, P. Perona, and A. Zisserman. Learning object categories from Google’s image search. In Tenth IEEE International Conference on Computer Vision, volume 2, pages 1816–1823, 2005.
[28] B. Fernando, E. Fromont, D. Muselet, and M. Sebban. Supervised learning of gaussian mixture models for visual vocabulary generation. Pattern Recogn., 45(2):897 – 907, 2012.
[29] C. Fowlkes, S. Belongie, F. Chung, and J. Malik. Spectral grouping using the Nyström method. IEEE Trans. Pattern Anal. Mach. Intell., 26(2):214–225, 2004.
[30] A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. Ch. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation, 101(23):215–220, 2000.
[31] K. Grauman and T. Darrell. The pyramid match kernel: Efficient learning with sets of features. J. Mach. Learn. Res., 8:725–760, May 2007.
[32] T. L. Griffiths and M. Steyvers. Finding scientific topics. Proceedings of the National Academy of Sciences, 101(Suppl.1):5228–5235, 2004.
[33] L. Guo, D. Rivero, J. Dorado, J. R. Rabuñal, and A. Pazos. Automatic epileptic seizure detection in EEGs based on line length feature and artificial neural networks. J. Neurosci. Meth., 191(1):101–109, 2010.
[34] L. Guo, D. Rivero, and A. Pazos. Epileptic seizure detection using multiwavelet transform based approximate entropy and artificial neural networks. Journal of Neuroscience Methods, 193(1):156 – 163, 2010.
[35] L. Guo, D. Rivero, J. A. Seoane, and A. Pazos. Classification of EEG signals using relative wavelet energy and artificial neural networks. In Proceedings of the first ACM/SIGEVO Summit on Genetic and Evolutionary Computation, GEC ’09, pages 177–184, 2009.
[36] T. Hospedales, S. Gong, and T. Xiang. A Markov clustering topic model for mining behaviour in video. In 2009 IEEE 12th International Conference on Computer Vision, pages 1165–1172, Sept. 29 – Oct. 2, 2009.
[37] T. Hospedales, S. Gong, and T. Xiang. Video behaviour mining using a dynamic topic model. International Journal of Computer Vision, 98(3):303–323, 2012.
[38] T. Hospedales, J. Li, S. Gong, and T. Xiang. Identifying rare and subtle behaviors: A weakly supervised joint topic model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(12):2451–2464, 2011.
[39] P. Hoyer. Non-negative sparse coding. In Proc. IEEE Workshop on Neural Networks for Signal Processing, pages 557 – 565, 2002.
[40] D. J. Hu and L. K. Saul. A probabilistic topic model for music analysis. In NIPS-09, Applications for Topic Models Workshop, Whistler, Canada, 2009.
[41] İ. Güler and E. D. Übeyli. ECG beat classifier designed by combined neural network model. Pattern Recogn., 38(2):199–208, 2005.
[42] İ. Güler and E. D. Übeyli. Multiclass support vector machines for EEG-signals classification. IEEE Trans. Inf. Technol. Biomed., 11(2):117–126, March 2007.
[43] T. Ince, S. Kiranyaz, and M. Gabbouj. A generic and robust system for automated patient-specific classification of ECG signals. IEEE Trans. Biomed. Eng., 56(5):1415–1426, May 2009.
[44] J. M. Irvine, S. A. Israel, W. T. Scruggs, and W. J. Worek. eigenPulse: Robust human identification from cardiovascular function. Pattern Recognition, 41(11):3427–3435, 2008.
[45] S. A. Israel, J. M. Irvine, A. Cheng, M. D. Wiederhold, and B. K. Wiederhold. ECG to identify individuals. Pattern Recogn., 38(1):133 – 142, 2005.
[46] P. Jahankhani, V. Kodogiannis, and K. Revett. EEG signal classification using wavelet feature extraction and neural networks. In IEEE John Vincent Atanasoff 2006 International Symposium on Modern Computing, 2006. JVA ’06, pages 120–124, 2006.
[47] T. Jebara, Y. Song, and K. Thadani. Spectral clustering and embedding with hidden Markov models. In J. Kok, J. Koronacki, R. Mantaras, S. Matwin, D. Mladenič, and A. Skowron, editors, Machine Learning: ECML 2007, volume 4701 of Lecture Notes in Computer Science, pages 164–175. Springer Berlin Heidelberg, 2007.
[48] F. Jurie and B. Triggs. Creating efficient codebooks for visual recognition. In Proc. IEEE Int’l Conf. Computer Vision, volume 1, pages 604–610, Oct. 2005.
[49] A. Kampouraki, G. Manis, and C. Nikou. Heartbeat time series classification with support vector machines. IEEE Trans. Inf. Technol. Biomed., 13(4):512–518, July 2009.
[50] N. Kannathal, M. L. Choo, U. R. Acharya, and P. Sadasivan. Entropies for detection of epilepsy in EEG. Comput. Meth. Prog. Bio., 80(3):187–194, 2005.
[51] W. Karlen, C. Mattiussi, and D. Floreano. Sleep and wake classification with ECG and respiratory effort signals. IEEE Trans. Biomed. Circuits Syst., 3(2):71 –78, 2009.
[52] J. Kim and E. Andre. Emotion recognition based on physiological changes in music listening. IEEE Trans. Pattern Anal. Mach. Intell., 30(12):2067 –2083, 2008.
[53] Z. Koles, J. Lind, and A. Soong. Spatio-temporal decomposition of the EEG: a general approach to the isolation and localization of sources. Electroencephalography and Clinical Neurophysiology, 95(4):219–230, 1995.
[54] G. Lebanon, Y. Mao, and J. Dillon. The locally weighted bag of words framework for document representation. J. Mach. Learn. Res., 8:2405–2441, December 2007.
[55] H. Lee, A. Battle, R. Raina, and A. Y. Ng. Efficient sparse coding algorithms. In Proc. Conf. Neural Information Processing Systems, pages 801–808. 2007.
[56] C. Li and G. Biswas. A Bayesian approach to temporal data clustering using hidden Markov models. In Proceedings of the Seventeenth International Conference on Machine Learning, pages 543–550, 2000.
[57] J. Li, S. Gong, and T. Xiang. Global behaviour inference using probabilistic latent semantic analysis. In Proceedings of British Machine Vision Conference (BMVC), 2008.
[58] J. Li, L. Zhang, D. Tao, H. Sun, and Q. Zhao. A prior neurophysiologic knowledge free tensor-based scheme for single trial EEG classification. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 17(2):107–115, 2009.
[59] X.-C. Lian, Z. Li, C. Wang, B.-L. Lu, and L. Zhang. Probabilistic models for supervised dictionary learning. In 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2305–2312, 2010.
[60] T. Liao. Clustering of time series data–a survey. Pattern Recognition, 38(11):1857–1874, 2005.
[61] C. A. Lima and A. L. Coelho. Kernel machines for epilepsy diagnosis via EEG signal classification: A comparative study. Artificial Intelligence in Medicine, 53(2):83 – 95, 2011.
[62] J. Lin, E. Keogh, L. Wei, and S. Lonardi. Experiencing SAX: a novel symbolic representation of time series. Data Mining and Knowledge Discovery, 15(2):107–144, 2007.
[63] J. Lin, R. Khade, and Y. Li. Rotation-invariant similarity in time series using bag-of-patterns representation. J. Intell. Inf. Syst, pages 1–29, 2012.
[64] J. Lin and Y. Li. Finding structural similarity in time series data using bag-of-patterns representation. In Proceedings of the 21st International Conference on Scientific and Statistical Database Management, pages 461–477, 2009.
[65] Y.-P. Lin, C.-H. Wang, T.-P. Jung, T.-L. Wu, S.-K. Jeng, J.-R. Duann, and J.-H. Chen. EEG-based emotion recognition in music listening. IEEE Trans. Biomed. Eng., 57(7):1798–1806, July 2010.
[66] J. Liu and M. Shah. Learning human actions via information maximization. In Proc. IEEE Int’l Conf. Computer Vision and Pattern Recognition, pages 1–8,
[67] M. Llamedo and J. P. Martínez. Heartbeat classification using feature selection driven by database generalization criteria. IEEE Trans. Biomed. Eng., 58(3):616–625, 2011.
[68] H. Lu, H.-L. Eng, C. Guan, K. Plataniotis, and A. Venetsanopoulos. Regularized common spatial pattern with aggregation for EEG classification in small-sample setting. IEEE Transactions on Biomedical Engineering, 57(12):2936–2946, 2010.
[69] S. K. Lukins, N. A. Kraft, and L. H. Etzkorn. Bug localization using latent dirichlet allocation. Information and Software Technology, 52(9):972 – 990, 2010.
[70] U. Luxburg. A tutorial on spectral clustering. Stat. Comput, 17(4):395–416, 2007.
[71] J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online learning for matrix factorization and sparse coding. J. Mach. Learn. Res., 11:19–60, 2010.
[72] T. Mar, S. Zaunseder, J. P. Martínez, M. Llamedo, and R. Poll. Optimization of ECG classification by means of feature selection. IEEE Trans. Biomed. Eng., 58(8):2168–2177, Aug. 2011.
[73] R. J. Martis, C. Chakraborty, and A. K. Ray. A two-stage mechanism for registration and classification of ECG using gaussian mixture model. Pattern Recognition, 42(11):2979 – 2988, 2009.
[74] F. Melgani and Y. Bazi. Classification of electrocardiogram signals with support vector machines and particle swarm optimization. IEEE Transactions on Information Technology in Biomedicine, 12(5):667–677, 2008.
[75] K. Mikolajczyk and C. Schmid. A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10):1615–1630, 2005.
[76] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. Int. J. Comput. Vision, 65(1-2):43–72, Nov. 2005.
[77] T. Minka and J. Lafferty. Expectation-propagation for the generative aspect model. In Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, 2002.
[78] A. R. Naghsh-Nilchi and M. Aghashahi. Epilepsy seizure detection using eigen-system spectral estimation and multiple layer perceptron neural network. Biomed. Signal Process. Control, 5(2):147 – 157, 2010.
[79] R. M. Nallapati, A. Ahmed, E. P. Xing, and W. W. Cohen. Joint latent topic models for text and citations. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’08, pages 542–550, 2008.
[80] J. C. Niebles, H. Wang, and L. Fei-Fei. Unsupervised learning of human action categories using spatial-temporal words. Int. J. Comput. Vision, 79(3):299–318, 2008.
[81] T. Oates, L. Firoiu, and P. Cohen. Using dynamic time warping to bootstrap hmm-based clustering of time series. In R. Sun and C. Giles, editors, Sequence Learning, volume 1828 of Lecture Notes in Computer Science, pages 35–52. Springer Berlin Heidelberg, 2001.
[82] B. Obermaier, C. Guger, C. Neuper, and G. Pfurtscheller. Hidden markov models for online classification of single trial EEG data. Pattern Recognition Letters, 22(12):1299 – 1309, 2001.
[83] H. Ocak. Automatic detection of epileptic seizures in EEG using discrete wavelet transform and approximate entropy. Expert Syst. Appl., 36(2, Part 1):2027 – 2036, 2009.
[84] U. Orhan, M. Hekim, and M. Ozer. EEG signals classification using the k-means clustering and a multilayer perceptron neural network model. Expert Systems with Applications, 38(10):13475 – 13481, 2011.
[85] S. Pal and M. Mitra. Increasing the accuracy of ECG based biometric analysis by data modelling. Measurement, 45(7):1927 – 1932, 2012.
[86] C. H. Papadimitriou and K. Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Dover publications, 1998.
[87] E. Pasolli and F. Melgani. Active learning methods for electrocardiographic signal classification. IEEE Transactions on Information Technology in Biomedicine, 14(6):1405–1416, 2010.
[88] A. Perina, M. Cristani, and V. Murino. 2lda: Segmentation for recognition. In 20th International Conference on Pattern Recognition (ICPR), pages 995–998, 2010.
[89] J. Philbin, J. Sivic, and A. Zisserman. Geometric latent dirichlet allocation on a matching graph for large-scale image datasets. International Journal of Computer Vision, 95(2):138–153, 2011.
[90] K. Plataniotis, D. Hatzinakos, and J. Lee. ECG biometric recognition without fiducial detection. In Proc. IEEE BCC, pages 1–6, 2006.
[91] K. Polat and S. Günes. Classification of epileptiform EEG using a hybrid system based on decision tree classifier and fast Fourier transform. Appl. Math. Comput., 187(2):1017–1026, 2007.
[92] D. Putthividhya, H. Attias, and S. Nagarajan. Supervised topic model for automatic image annotation. In 2010 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), pages 1894–1897, 2010.
[93] L. Rabiner, C. Lee, B.-H. Juang, and J. Wilpon. HMM clustering for connected word recognition. In 1989 International Conference on Acoustics, Speech, and Signal Processing, ICASSP-89, pages 405–408, 1989.
[94] D. Rafiei and A. Mendelzon. Querying time series data based on similarity. IEEE Trans. Knowl. Data Eng., 12(5):675 –693, 2000.
[95] P. Rodrigues, J. Gama, and J. Pedroso. Hierarchical clustering of time-series data streams. IEEE Transactions on Knowledge and Data Engineering, 20(5):615–
[96] M. Rosen-Zvi, T. Griffiths, M. Steyvers, and P. Smyth. The author-topic model for authors and documents. In Proceedings of the 20th conference on Uncertainty in artificial intelligence, UAI’04, pages 487–494, 2004.
[97] K.-Q. Shen, C.-J. Ong, X.-P. Li, Z. Hui, and E. Wilder-Smith. A feature selection method for multilevel mental fatigue EEG classification. IEEE Trans. Biomed. Eng., 54(7):1231–1237, July 2007.
[98] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 22(8):888–905, 2000.
[99] Y. N. Singh and P. Gupta. Correlation-based classification of heartbeats for individual identification. Soft Comput., 15(3):449–460, Mar. 2011.
[100] Siuly, Y. Li, and P. P. Wen. Clustering technique-based least square support vector machine for EEG signal classification. Computer Methods and Programs in Biomedicine, 104(3):358 – 372, 2011.
[101] J. Sivic, B. Russell, A. Efros, A. Zisserman, and W. Freeman. Discovering objects and their location in images. In Tenth IEEE International Conference on Computer Vision, volume 1, pages 370–377, 2005.
[102] P. Smyth. Clustering sequences with hidden markov models. In Advances in Neural Information Processing Systems (NIPS), pages 648–654. MIT Press,