Efficient Clustering of Structured Documents Using Graph Self-Organizing Maps

Hagenbuchner, Markus; Tsoi, Ah Chung; Sperduti, Alessandro; Kc, Milly

doi:10.1007/978-3-540-85902-4_19

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4862)

Abstract

Graph Self-Organizing Maps (GraphSOMs) are a new concept in the processing of structured objects using machine learning methods. The GraphSOM is a generalization of the Self-Organizing Maps for Structured Domains (SOM-SD), which has been shown to be a capable unsupervised machine learning method for some types of graph-structured information. The SOM-SD was previously applied to the clustering of XML-formatted documents in an international competition, the Initiative for the Evaluation of XML Retrieval (INEX), and won the competition in both 2005 and 2006. This paper applies the GraphSOM to the clustering of a larger dataset in the INEX 2007 competition and compares the results with those obtained using the more traditional SOM-SD approach. Experimental results show that (1) the GraphSOM is computationally more efficient than the SOM-SD, (2) the performance of both approaches on the larger INEX 2007 dataset is not competitive with that obtained by other participants of the competition using other approaches, and (3) different structural representations of the same dataset can influence the performance of the proposed GraphSOM technique.
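For readers unfamiliar with the underlying technique, the following is a minimal sketch of a classic (Kohonen) self-organizing map on plain vectors. It is not the GraphSOM or the SOM-SD, and it is not the authors' implementation: the function names (`best_matching_unit`, `train_step`, `train`), the linear decay schedules, the map size, and the toy data are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the neuron whose codebook vector is closest to x."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_dist:
                best, best_dist = (r, c), d
    return best


def train_step(weights, x, lr, sigma):
    """One classic SOM update: pull the winner and its grid neighbours towards x."""
    br, bc = best_matching_unit(weights, x)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            # Gaussian neighbourhood, measured on the map grid (not in input space).
            grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_dist2 / (2 * sigma ** 2))
            for i in range(len(w)):
                w[i] += lr * h * (x[i] - w[i])
    return br, bc


def train(data, rows, cols, epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Train a rows x cols map; the decay schedules are illustrative assumptions."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # shrinks linearly from 1.0 towards 0
        lr = lr0 * frac
        sigma = max(sigma0 * frac, 0.3)      # keep the neighbourhood from collapsing
        for x in data:
            train_step(weights, x, lr, sigma)
    return weights


# Toy data (hypothetical): two loose groups of 2-D points.
data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
weights = train(data, rows=3, cols=3)
clusters = [best_matching_unit(weights, x) for x in data]
```

After training, `clusters` holds the map coordinates assigned to each input, and inputs that share a winning neuron can be read as one cluster. The sketch handles fixed-length vectors only; the structured extensions named in the abstract are not reproduced here.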

Author information

Authors and Affiliations

University of Wollongong, Wollongong, Australia
Markus Hagenbuchner & Milly Kc
Hong Kong Baptist University, Hong Kong
Ah Chung Tsoi
University of Padova, Padova, Italy
Alessandro Sperduti

Editor information

Norbert Fuhr, Jaap Kamps, Mounia Lalmas, Andrew Trotman

Cite this paper

Hagenbuchner, M., Tsoi, A.C., Sperduti, A., Kc, M. (2008). Efficient Clustering of Structured Documents Using Graph Self-Organizing Maps. In: Fuhr, N., Kamps, J., Lalmas, M., Trotman, A. (eds) Focused Access to XML Documents. INEX 2007. Lecture Notes in Computer Science, vol 4862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85902-4_19

DOI: https://doi.org/10.1007/978-3-540-85902-4_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85901-7
Online ISBN: 978-3-540-85902-4
eBook Packages: Computer Science, Computer Science (R0)
