The Mathematics of Divergence Based Online Learning in Vector Quantization

Villmann, Thomas; Haase, Sven; Schleif, Frank-Michael; Hammer, Barbara; Biehl, Michael

doi:10.1007/978-3-642-12159-3_10

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5998)

Abstract

We propose using divergences in gradient-descent learning for supervised and unsupervised vector quantization as an alternative to the squared Euclidean distance. The approach is based on determining the Fréchet derivatives of the divergences, which can be plugged directly into the online learning rules. We provide the mathematical foundation of this framework. The framework covers the usual gradient-descent learning of prototypes as well as parameter optimization and relevance learning to improve performance.
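
As an illustration of the idea in the abstract, the following minimal sketch replaces the squared Euclidean distance in a winner-takes-all online update with a divergence and its derivative. It is not taken from the chapter: the choice of the generalized Kullback-Leibler divergence, the discrete (vector) setting in which the Fréchet derivative reduces to ordinary partial derivatives, and all names (gkl_divergence, gkl_gradient_wrt_w, online_vq_step, lr, floor) are assumptions made for this example.

```python
import numpy as np


def gkl_divergence(p, w):
    """Generalized Kullback-Leibler divergence D(p || w) for positive vectors."""
    return float(np.sum(p * np.log(p / w) - p + w))


def gkl_gradient_wrt_w(p, w):
    """Partial derivatives of D(p || w) with respect to the prototype w.

    In the discrete case this plays the role of the Frechet derivative.
    """
    return 1.0 - p / w


def online_vq_step(p, prototypes, lr=0.05, floor=1e-12):
    """One winner-takes-all online update (hypothetical helper).

    The closest prototype under the divergence is moved along the negative
    derivative of the divergence. Returns the index of the winner.
    """
    dists = [gkl_divergence(p, w) for w in prototypes]
    s = int(np.argmin(dists))
    prototypes[s] -= lr * gkl_gradient_wrt_w(p, prototypes[s])
    # Assumption of this sketch: clip so the prototype stays strictly positive.
    np.maximum(prototypes[s], floor, out=prototypes[s])
    return s


# Usage: one data vector and two prototypes, all with positive entries.
p = np.array([0.2, 0.3, 0.5])
prototypes = np.array([[0.3, 0.3, 0.4],
                       [0.6, 0.2, 0.2]])
winner = online_vq_step(p, prototypes)
```

With a sufficiently small learning rate, the step lowers the divergence between the data vector and the winning prototype. Other divergences would slot in by exchanging the two divergence functions.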

Author information

Authors and Affiliations

Department of Mathematics/Natural Sciences/Informatics, University of Applied Sciences Mittweida, 09648, Mittweida, Germany
Thomas Villmann & Sven Haase
Institute of Computer Science, Clausthal University of Technology, Clausthal-Zellerfeld, Germany
Frank-Michael Schleif & Barbara Hammer
Johann Bernoulli Inst. for Mathematics and Computer Science, Rijksuniversity Groningen, The Netherlands
Michael Biehl

Editor information

Editors and Affiliations

Department of Neural Information Processing, Oberer Eselsberg, University of Ulm, 89069, Ulm, Germany
Friedhelm Schwenker
Center for Informatics Science, Nile University, 12677, Giza, Egypt
Neamat El Gayar

About this paper

Cite this paper

Villmann, T., Haase, S., Schleif, F.-M., Hammer, B., Biehl, M. (2010). The Mathematics of Divergence Based Online Learning in Vector Quantization. In: Schwenker, F., El Gayar, N. (eds) Artificial Neural Networks in Pattern Recognition. ANNPR 2010. Lecture Notes in Computer Science, vol 5998. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12159-3_10

DOI: https://doi.org/10.1007/978-3-642-12159-3_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12158-6
Online ISBN: 978-3-642-12159-3
eBook Packages: Computer Science (R0)
