Statistical physics of learning vector quantization

University of Groningen

Witoelar, Aree Widya

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version

Publisher's PDF, also known as Version of record

Publication date:

2010

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA):

Witoelar, A. W. (2010). Statistical physics of learning vector quantization. Groningen: s.n.

Copyright

Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

Bibliography

Ahr, M., Biehl, M. and Schloesser, E.: 1999, Weight decay induced phase transitions in multi-layer neural networks, Journal of Physics A: Mathematical and General 32, 5003–5008.

Ahr, M., Biehl, M. and Urbanczik, R.: 1999, Statistical physics and practical training of soft-committee machines, The European Physical Journal B 10, 583–588.

Barber, D. and Sollich, P.: 1998, Online learning from finite training sets, Europhysics Letters 38, 279–302.

Barkai, N., Seung, H. and Sompolinsky, H.: 1993, Scaling laws in learning of classification tasks, Phys. Rev. Lett. 70, 3167–3170.

Baum, E. and Haussler, D.: 1989, What size net gives valid generalization?, Neural Computation 1, 151–160.

Bengio, Y.: 2000, Gradient-based optimization of hyperparameters, Neural Computation 12(8), 1889–1900.

Bermejo, S. and Cabestany, J.: 2000, A batch learning vector quantization algorithm for nearest neighbour classification, Neural Processing Letters 11(3), 173–184.

Bezdek, J.: 1981, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York.

Biehl, M.: 1994, An exactly solvable model of unsupervised learning, Europhysics Letters 25(5), 391–396.

Biehl, M. and Caticha, N.: 2003, The statistical mechanics of on-line learning and generalization, The handbook of brain theory and neural networks pp. 1095–1098.

Biehl, M., Freking, A., Ghosh, A. and Reents, G.: 2004, A theoretical framework for analysing the dynamics of LVQ: A statistical physics approach, Technical Report 2004-9-02, Mathematics and Computing Science, University of Groningen. Available on-line: http://www.cs.rug.nl/~biehl.

Biehl, M., Ghosh, A. and Hammer, B.: 2006, Learning vector quantization: The dynamics of winner-takes-all algorithms, Neurocomputing 69, 660–670.

Biehl, M., Ghosh, A. and Hammer, B.: 2007, Dynamics and generalization ability of LVQ algorithms, Journal of Machine Learning Research 8, 323–360.

Biehl, M. and Mietzner, A.: 1994, Statistical mechanics of unsupervised structure recognition, J. Phys. A 27, 1885–1897.

Biehl, M., Schlösser, E. and Ahr, M.: 1998, Phase transitions in soft-committee machines, Europhys. Lett. 44(2), 261–267.

Bojer, T., Hammer, B. and Koers, C.: 2003, Monitoring technical systems with prototype based clustering, in M. Verleysen (ed.), European Symposium on Artificial Neural Networks (ESANN), d-side, Evere, Belgium, pp. 433–439.

Bojer, T., Hammer, B., Schunk, D. and von Toschanowitz, K. T.: 2001, Relevance determination in learning vector quantization, in M. Verleysen (ed.), European Symposium on Artificial Neural Networks (ESANN), d-side, Evere, Belgium, pp. 271–276.

Bottou, L.: 1991, Stochastic gradient learning in neural networks, Proceedings of Neuro-Nîmes 91, EC2, Nimes, France.

Bottou, L. and Bengio, Y.: 1995, Convergence properties of the k-means, NIPS 1994, pp. 585–592.

Buhmann, J. M.: 1998, Stochastic algorithms for exploratory data analysis: Data clustering and data visualization, In Learning in Graphical Models, Kluwer, pp. 405–420.

Buhot, A., Gordon, M. and Nadal, J.: 2002, Rigorous bounds to retarded learning, Phys. Rev. Lett. 88(9), 099801.

Carnevali, P. and Patarnello, S.: 1987, Exhaustive thermodynamic analysis of boolean learning networks, Europhys. Lett. pp. 1199–1204.

Cortes, C. and Vapnik, V.: 1995, Support-vector networks, Machine Learning 20(3), 273–297.

Cottrell, M., Hammer, B., Hasenfuß, A. and Villmann, T.: 2006, Batch and median neural gas, Neural Networks 19(6), 762–771.

Crammer, K., Gilad-Bachrach, R., Navot, A. and Tishby, N.: 2002, Margin analysis of the LVQ algorithm, Advances in Neural Information Processing Systems 2002, MIT Press, pp. 462–469.

del Giudice, P., Franz, S. and Virasoro, M. A.: 1989, Perceptron beyond the limit of capacity, Journal de Physique 50(2), 121–134.

Duda, R., Hart, P. and Stork, D.: 2000, Pattern Classification, Wiley, New York.

Edwards, S. and Anderson, P. W.: 1975, Theory of spin glasses, Journal of Physics F: Metal Physics 5(5), 965–974.

Engel, A. and van den Broeck, C.: 2001, The Statistical Mechanics of Learning, Cambridge University Press, Cambridge, UK.

Gersho, A. and Gray, R. M.: 1991, Vector quantization and signal compression, Kluwer Academic Publishers, Norwell, MA, USA.

Ghosh, A., Biehl, M. and Hammer, B.: 2006, Performance analysis of LVQ algorithms: a statistical physics approach, Neural Networks 19, 817–829.

Hammer, B., Hasenfuss, A., Schleif, F. and Villmann, T.: 2006, Supervised batch neural gas, Artificial Neural Networks in Pattern Recognition, Vol. 4087, Springer, pp. 33–45.

Hammer, B., Strickert, M. and Villmann, T.: 2003, On the generalization ability of GRLVQ networks, Neural Processing Letters, p. 10.

Hammer, B., Strickert, M. and Villmann, T.: 2004, Relevance LVQ versus SVM, Artificial Intelligence and Softcomputing, volume 3070 of Springer Lecture Notes in Artificial Intelligence, Springer, pp. 592–597.

Hammer, B. and Villmann, T.: 2002, Generalized relevance learning vector quantization, Neural Networks 15(8-9), 1059–1068.

Han, J. and Kamber, M.: 2005, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers Inc.

Hansel, D. and Sompolinsky, H.: 1990, Learning from examples in a single-layer neural network, Europhys. Lett. 11, 687.

Herschkowitz, D. and Opper, M.: 2001, Retarded learning: Rigorous results from statistical mechanics, Phys. Rev. Lett. 86(10), 2174–2177.

Huang, K.: 1987, Statistical mechanics, 2nd edn, Wiley, New York.

Jain, A. K., Murty, M. N. and Flynn, P. J.: 1999, Data clustering: a review, ACM Comput. Surv. 31(3), 264–323.

Kinzel, W.: 1997, Phase transitions of neural networks, Phil. Mag. B 77, 1455–1477.

Kohonen, T.: 1990, Improved versions of learning vector quantization, 1990 IJCNN International Joint Conference on Neural Networks, Vol. 1, pp. 545–550.

Kohonen, T.: 1997, Self Organising Maps, 2nd edn, Springer, Berlin.

Kuncheva, L.: 2004, Classifier ensembles for changing environments, in F. Roli, J. Kittler and T. Windeatt (eds), Multiple Classifier Ensembles: 5th International Workshop, MCS 2004, Cagliari, Italy, Vol. 3077 of Lecture Notes in Computer Science, Springer, Berlin, pp. 1–15.

Levin, E., Tishby, N. and Solla, S.: 1990, A statistical approach to learning and generalization in layered neural networks, Proceedings of the IEEE, Vol. 78, pp. 1568–1574.

Lootens, E. and van den Broeck, C.: 1995, Analysing cluster formation by replica method, Europhys. Lett. 30, 381–387.

Lyman, P. and Varian, H. R.: 2003, How much information. Retrieved from http://www.sims.berkeley.edu/how-much-info-2003 on 19 Jan 2010.

Martinetz, T., Berkovich, S. and Schulten, K.: 1993, 'Neural gas' network for vector quantization and its application to time series prediction, IEEE TNN 4(4), 558–569.

Meir, R.: 1995, Empirical risk minimization versus maximum-likelihood estimation: a case study, Neural Computation 7, 144–157.

Mezard, M., Parisi, G. and Virasoro, M.: 1987, Spin Glass Theory and Beyond, Singapore: World Scientific.

Neural Networks Research Centre, Helsinki: 2002, Bibliography on the self-organizing maps (SOM) and learning vector quantization (LVQ), Otaniemi: Helsinki Univ. of Technology. Available on-line: http://liinwww.ira.uka.de/bibliography/Neural/SOM.LVQ.html.

Opper, M.: 1994, Learning and generalization in a two-layer neural network: The role of the Vapnik-Chervonenkis dimension, Phys. Rev. Lett. 72(13), 2113–2116.

Pregenzer, M., Pfurtscheller, G. and Flotzinger, D.: 1996, Automated feature selection with a distinction sensitive learning vector quantizer, Neurocomputing 11(1), 19–29.

Rae, H., Sollich, P. and Coolen, A.: 1999, On-line learning with restricted training sets: An exactly solvable case, Journal of Physics A: Mathematical and General 32(18), 3321–3339.

Reents, G. and Urbanczik, R.: 1998, Self averaging and on-line learning, Phys. Rev. Lett. 80, 5445–5448.

Ripley, B.: 1996, Pattern Recognition and Neural Networks, Cambridge University Press.

Saad, D. (ed.): 1999, Online learning in neural networks, Cambridge University Press, Cambridge, UK.

Saad, D. and Rattray, M.: 1997, Globally optimal parameters for on-line learning in multilayer neural networks, Phys. Rev. Lett. 79(13), 2578–2581.

Saad, D. and Solla, S.: 1995, On-line learning in soft committee machines, Phys. Rev. E 52(4), 4225–4243.

Sato, A. and Yamada, K.: 1995, Generalized learning vector quantization, NIPS, pp. 423–429.

Schleif, F., Villmann, T. and Hammer, B.: 2006, Local metric adaptation for soft nearest prototype classification to classify proteomic data, International Workshop on Fuzzy Logic and Applications 3849, 290–296.

Schneider, P., Biehl, M. and Hammer, B.: 2009, Adaptive relevance matrices in learning vector quantization, Neural Computation pp. 1–30. PMID: 19764875.

Schottky, B.: 1995, Phase transitions in the generalization behaviour of multilayer neural networks, Journal of Physics A: Mathematical and General 28(16), 4515.

Seo, S. and Obermayer, K.: 2003, Soft learning vector quantization, Neural Computation 15, 1589–1604.

Seo, S. and Obermayer, K.: 2006, Dynamic hyperparameter scaling method for LVQ algorithms, International Joint Conference on Neural Networks, pp. 3196–3203.

Seung, H., Sompolinsky, H. and Tishby, N.: 1992, Statistical mechanics of learning from examples, Physical Review A 45(8), 6056–6091.

Solla, S. A. and Levin, E.: 1992, Learning in linear neural networks: The validity of the annealed approximation, Physical Review A 46, 2124–2130.

Sompolinsky, H. and Tishby, N.: 1990, Learning in a two-layer neural network of edge detectors, Europhys. Lett. 13(6), 567–572.

Strickert, M., Seiffert, U., Sreenivasulu, N., Weschke, W., Villmann, T. and Hammer, B.: 2006, Generalized relevance LVQ (GRLVQ) with correlation measures for gene expression analysis, Neurocomputing 69(7-9), 651–659. New Issues in Neurocomputing: 13th European Symposium on Artificial Neural Networks.

Sutton, R. and Barto, A.: 1998, Reinforcement learning, an introduction, MIT Press.

Tishby, N., Levin, E. and Solla, S.: 1989, Consistent inference of probabilities in layered networks: predictions and generalizations, IJCNN International Joint Conference on Neural Networks, Vol. 2, pp. 403–409.

Valiant, L. G.: 1984, A theory of the learnable, Commun. ACM 27(11), 1134–1142.

Vapnik, V.: 1995, The nature of statistical learning theory, Springer-Verlag New York, Inc., New York, NY, USA.

Villmann, T., Merenyi, E. and Hammer, B.: 2003, Neural maps in remote sensing image analysis, Neural Networks 16(3-4), 389–403.

Watkin, T. and Nadal, J.: 1994, Optimal unsupervised learning, J. Phys. A 27, 1899–1915.

Watkin, T., Rau, A. and Biehl, M.: 1993, The statistical mechanics of learning a rule, Reviews of Modern Physics 65(2), 499–556.

Witoelar, A. and Biehl, M.: 2008, Equilibrium physics approach in vector quantization, Technical Report, Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen. Available on-line: http://www.cs.rug.nl/~aree.

Witoelar, A. and Biehl, M.: 2009, Phase transitions in vector quantization and neural gas, Neurocomputing 72, 1390–1397.

Witoelar, A., Biehl, M., Ghosh, A. and Hammer, B.: 2008, Learning dynamics and robustness of vector quantization and neural gas, Neurocomputing 71, 1210–1219.
