Abstract
Data mining on Web documents is one of the most challenging tasks in machine learning because of the large number of documents on the Web, their underlying structure (one document may refer to another), and the fact that the data are commonly unlabeled (the class to which a document belongs is not known a priori). This paper considers the latest developments in Self-Organizing Maps (SOM), a machine learning approach, as one way of classifying documents on the Web. The most recent development, the Probability Mapping Graph Self-Organizing Map (PMGraphSOM), extends the earlier GraphSOM approach and encodes undirected and cyclic graphs in a scalable fashion. This paper illustrates empirically the advantages of the PMGraphSOM over the original GraphSOM model in a data mining application involving graph-structured information. It is shown that the performance achieved can exceed that of current state-of-the-art techniques on a given benchmark problem.
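For readers unfamiliar with the base technique, the following is a minimal sketch of a standard Kohonen-style SOM trained on fixed-size vectors. It is not the GraphSOM or PMGraphSOM described in this paper, which additionally encode graph structure; the function names, the linear decay schedules, and all parameter defaults are assumptions chosen for illustration only.

```python
import math
import random


def best_matching_unit(codebook, x):
    """Return the grid position of the neuron whose weights are closest to x."""
    best, best_dist = None, float("inf")
    for pos, w in codebook.items():
        dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if dist < best_dist:
            best, best_dist = pos, dist
    return best


def train_som(data, rows, cols, iterations=500, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols map on a list of equal-length vectors (illustrative defaults)."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # One weight vector per grid position, randomly initialised in [0, 1).
    codebook = {(r, c): [rng.random() for _ in range(dim)]
                for r in range(rows) for c in range(cols)}
    for t in range(iterations):
        frac = t / iterations
        lr = lr0 * (1.0 - frac)               # assumed: linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3  # assumed: linearly shrinking neighbourhood
        x = rng.choice(data)
        br, bc = best_matching_unit(codebook, x)
        for (r, c), w in codebook.items():
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood on the grid
            for i in range(dim):
                w[i] += lr * h * (x[i] - w[i])           # pull weights towards the input
    return codebook


if __name__ == "__main__":
    points = [[0.1, 0.1], [0.15, 0.05], [0.9, 0.95], [0.85, 0.9]]
    som = train_som(points, rows=3, cols=3)
    for p in points:
        print(p, "->", best_matching_unit(som, p))
```

After training, unlabeled inputs can be grouped by the grid position of their best matching unit, which is the sense in which a SOM serves as a clustering tool.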
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, S., Hagenbuchner, M., Tsoi, A.C., Sperduti, A. (2009). Self Organizing Maps for the Clustering of Large Sets of Labeled Graphs. In: Geva, S., Kamps, J., Trotman, A. (eds) Advances in Focused Retrieval. INEX 2008. Lecture Notes in Computer Science, vol 5631. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03761-0_49
DOI: https://doi.org/10.1007/978-3-642-03761-0_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03760-3
Online ISBN: 978-3-642-03761-0
eBook Packages: Computer Science, Computer Science (R0)