Fuzzy Clustering for Documents Based on Optimization of Classifier Using the Genetic Algorithm

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3481)

Abstract

Established document categorization methods reflect semantic relations inaccurately in the feature representation of documents. To address this problem, we propose a method that combines a genetic algorithm with the fuzzy c-means clustering algorithm to choose an appropriate set of fuzzy clusters for document classification problems. The aim of the proposed method is to find a minimum set of fuzzy clusters that correctly classifies all training documents. The number of fuzzy pseudo-partitions and the shapes of the fuzzy membership functions, which we use as the classification criteria, are determined by the genetic algorithm. The classifier then assigns documents using the fuzzy c-means clustering algorithm. A solution obtained by the genetic algorithm is a set of fuzzy clusters, and its fitness is determined by the fuzzy membership function.
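For readers unfamiliar with fuzzy c-means, the clustering step named in the abstract can be illustrated with the textbook form of the algorithm. The sketch below is not the authors' implementation: the function name `fuzzy_c_means`, the parameters `m`, `iters`, `tol` and `seed`, and the toy term-count vectors are all assumptions made for illustration, and the genetic-algorithm search over the number of clusters and the membership-function shapes is not covered.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means on a list of equal-length numeric vectors.

    Hypothetical helper for illustration; returns (centers, memberships).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update: u_ij is proportional to d_ij^(-2 / (m - 1)),
        # written here in terms of squared Euclidean distances.
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - centers[j][d]) ** 2 for d in range(dim))
                  for j in range(c)]
            if min(d2) == 0.0:
                # The point sits exactly on one or more centres.
                zeros = [j for j in range(c) if d2[j] == 0.0]
                new = [1.0 / len(zeros) if j in zeros else 0.0
                       for j in range(c)]
            else:
                inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
                total = sum(inv)
                new = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        if shift < tol:  # memberships have stopped changing
            break
    return centers, u


# Toy term-count vectors for four documents (made-up data).
docs = [[5, 0, 1], [4, 1, 0], [0, 5, 4], [1, 4, 5]]
centers, memberships = fuzzy_c_means(docs, c=2)
for doc, row in zip(docs, memberships):
    print(doc, [round(v, 3) for v in row])
```

Each document receives a degree of membership in every cluster rather than a single hard label. In a setup like the one the abstract describes, an outer search such as a genetic algorithm could vary `c` and score each candidate partition, but the scoring function used in the paper is not given here.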




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Youn, JI., Eun, HJ., Kim, YS. (2005). Fuzzy Clustering for Documents Based on Optimization of Classifier Using the Genetic Algorithm. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2005. ICCSA 2005. Lecture Notes in Computer Science, vol 3481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11424826_2

  • DOI: https://doi.org/10.1007/11424826_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25861-2

  • Online ISBN: 978-3-540-32044-9

  • eBook Packages: Computer Science, Computer Science (R0)
