Topic Extraction with AGAPE

Velcin, Julien; Ganascia, Jean-Gabriel

doi:10.1007/978-3-540-73871-8_35


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4632)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications


Abstract

This paper uses an optimization approach to address the problem of conceptual clustering. The aim of AGAPE, which is based on the tabu-search meta-heuristic using split, merge and a special “k-means” move, is to extract concepts by optimizing a global quality function. It is deterministic and uses no a priori knowledge about the number of clusters. Experiments carried out in topic extraction show very promising results on both artificial and real datasets.
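
The abstract does not give AGAPE's quality function or the exact definition of its moves, so the following is only a minimal sketch of the general idea: a deterministic tabu search over partitions of categorical objects, with merge, split and a k-means-like reassignment move, starting from a single cluster so that the number of clusters is not fixed in advance. Every name here (quality, penalty, prototype, neighbours, tabu_search, tenure) and the scoring rule itself are assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from itertools import combinations

def prototype(cluster, data):
    """Most frequent value per attribute; ties broken alphabetically."""
    proto = []
    for a in range(len(data[0])):
        counts = Counter(data[i][a] for i in cluster)
        proto.append(min(counts, key=lambda v: (-counts[v], v)))
    return tuple(proto)

def matches(obj, proto):
    """Number of attributes on which an object agrees with a prototype."""
    return sum(x == p for x, p in zip(obj, proto))

def quality(partition, data, penalty=1.5):
    """Assumed global score: prototype matches minus a cost per cluster."""
    score = 0.0
    for cluster in partition:
        proto = prototype(cluster, data)
        score += sum(matches(data[i], proto) for i in cluster)
    return score - penalty * len(partition)

def canon(clusters):
    """Canonical, hashable form of a partition (empty clusters dropped)."""
    return tuple(sorted(tuple(sorted(c)) for c in clusters if c))

def neighbours(partition, data):
    parts = [set(c) for c in partition]
    # Merge move: join any two clusters.
    for i, j in combinations(range(len(parts)), 2):
        rest = [c for k, c in enumerate(parts) if k not in (i, j)]
        yield canon(rest + [parts[i] | parts[j]])
    # Split move: separate objects that disagree with the prototype
    # on one attribute.
    for k, c in enumerate(parts):
        proto = prototype(c, data)
        rest = [d for m, d in enumerate(parts) if m != k]
        for a in range(len(proto)):
            same = {i for i in c if data[i][a] == proto[a]}
            if same and same != c:
                yield canon(rest + [same, c - same])
    # k-means-like move: reassign every object to its best prototype.
    protos = [prototype(c, data) for c in parts]
    new = [set() for _ in parts]
    for i, obj in enumerate(data):
        best = max(range(len(protos)),
                   key=lambda k: (matches(obj, protos[k]), -k))
        new[best].add(i)
    yield canon(new)

def tabu_search(data, iterations=30, tenure=5):
    """Start from one cluster; always take the best non-tabu neighbour."""
    current = canon([set(range(len(data)))])
    best, best_q = current, quality(current, data)
    tabu = [current]
    for _ in range(iterations):
        candidates = sorted({n for n in neighbours(current, data)
                             if n not in tabu})
        if not candidates:
            break
        # max() over a sorted list keeps the choice deterministic on ties.
        current = max(candidates, key=lambda p: quality(p, data))
        tabu = (tabu + [current])[-tenure:]
        q = quality(current, data)
        if q > best_q:
            best, best_q = current, q
    return best, best_q

# Toy "documents" described by three categorical attributes.
docs = [
    ("sport", "ball", "team"), ("sport", "ball", "goal"),
    ("sport", "ball", "team"), ("vote", "law", "party"),
    ("vote", "law", "senate"), ("vote", "law", "party"),
]
best_partition, best_score = tabu_search(docs)
print(best_partition, best_score)
```

On this toy input the search separates the three "sport" objects from the three "vote" objects. The per-cluster penalty is what keeps the score from rewarding one cluster per object; a different quality function would change which partition wins.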



Author information

Authors and Affiliations

Université de Paris 6 – LIP6, 104 avenue du Président Kennedy, 75016 Paris, France
Julien Velcin & Jean-Gabriel Ganascia


Editor information

Editors and Affiliations

Computer Science Department, University of Calgary, Calgary, AB, Canada
Reda Alhajj
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
Hong Gao
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
Jianzhong Li
School of Information Technology and Electronic Engineering, The University of Queensland, Queensland, Australia
Xue Li
Department of Computing Science, University of Alberta, Edmonton, AB, Canada
Osmar R. Zaïane


About this paper

Cite this paper

Velcin, J., Ganascia, JG. (2007). Topic Extraction with AGAPE. In: Alhajj, R., Gao, H., Li, J., Li, X., Zaïane, O.R. (eds) Advanced Data Mining and Applications. ADMA 2007. Lecture Notes in Computer Science, vol 4632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73871-8_35


DOI: https://doi.org/10.1007/978-3-540-73871-8_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73870-1
Online ISBN: 978-3-540-73871-8
eBook Packages: Computer Science, Computer Science (R0)
