Abstract
We propose using divergences as an alternative to the squared Euclidean distance in gradient descent learning for supervised and unsupervised vector quantization. The approach is based on determining the Fréchet derivatives of the divergences, which can be plugged directly into the online learning rules. We provide the mathematical foundation of this framework, which covers the usual gradient descent learning of prototypes as well as parameter optimization and relevance learning to improve performance.
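To make the idea of plugging a divergence derivative into an online learning rule concrete, the following is a minimal sketch and not the authors' implementation. It assumes the generalized Kullback-Leibler divergence D(p || w) = sum(p log(p/w) - p + w) on strictly positive, discretized vectors, whose derivative with respect to the prototype w is 1 - p/w component-wise. The winner-takes-all scheme, the learning rate, the positivity clipping, and all function names are illustrative assumptions.

```python
import math
import random


def gkl_divergence(p, w):
    """Generalized Kullback-Leibler divergence D(p || w).

    Assumes p and w are sequences of strictly positive numbers of equal length.
    """
    return sum(pi * math.log(pi / wi) - pi + wi for pi, wi in zip(p, w))


def gkl_gradient(p, w):
    """Derivative of D(p || w) with respect to w: component-wise 1 - p/w."""
    return [1.0 - pi / wi for pi, wi in zip(p, w)]


def online_vq_step(p, prototypes, lr=0.05, eps=1e-8):
    """One winner-takes-all update (an assumed, simple learning scheme).

    The prototype with the smallest divergence to p is moved one step
    along the negative gradient; components are clipped at eps so the
    prototype stays strictly positive.
    """
    winner = min(range(len(prototypes)),
                 key=lambda k: gkl_divergence(p, prototypes[k]))
    grad = gkl_gradient(p, prototypes[winner])
    prototypes[winner] = [max(wi - lr * gi, eps)
                          for wi, gi in zip(prototypes[winner], grad)]
    return winner


def train(data, prototypes, epochs=50, lr=0.05, seed=0):
    """Present the samples in random order for a fixed number of epochs."""
    rng = random.Random(seed)
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for p in data:
            online_vq_step(p, prototypes, lr)
    return prototypes
```

Replacing `gkl_divergence` and `gkl_gradient` with another divergence and its derivative leaves the update step unchanged, which is the sense in which the derivative is plugged into the learning rule.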
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Villmann, T., Haase, S., Schleif, F.-M., Hammer, B., Biehl, M. (2010). The Mathematics of Divergence Based Online Learning in Vector Quantization. In: Schwenker, F., El Gayar, N. (eds) Artificial Neural Networks in Pattern Recognition. ANNPR 2010. Lecture Notes in Computer Science, vol 5998. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12159-3_10
DOI: https://doi.org/10.1007/978-3-642-12159-3_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12158-6
Online ISBN: 978-3-642-12159-3
eBook Packages: Computer Science, Computer Science (R0)