Efficient Algorithms for Concept Space Construction

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2001)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2035)

Abstract

The vocabulary problem in information retrieval arises because authors and indexers often use different terms for the same concept. A thesaurus, which defines mappings between different but related terms, is widely used in modern information retrieval systems to address this problem. Chen et al. proposed the concept space approach to automatic thesaurus construction. A concept space contains the associations between every pair of terms. Previous studies show that a concept space helps information searchers revise their queries to get better results from information retrieval systems. Constructing a concept space, however, is very computationally intensive. In this paper, we propose and evaluate efficient algorithms for constructing concept spaces that include only strong associations. Because weak associations are not useful in thesaurus construction, our algorithms use various pruning techniques to avoid computing them, thereby gaining efficiency.
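The idea of keeping only strong associations can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the association formula or the pruning rules, so the sketch assumes a simplified asymmetric weight, weight(a → b) = (documents containing both a and b) / (documents containing a), and a single illustrative pruning bound. The names `build_concept_space`, `docs`, and `threshold` are hypothetical.

```python
from itertools import permutations


def build_concept_space(docs, threshold):
    """Return {(a, b): weight} for associations a -> b with weight >= threshold.

    Assumption (not from the paper): weight(a -> b) is the fraction of
    documents containing term a that also contain term b.
    """
    # Inverted index: term -> set of ids of the documents that contain it.
    postings = {}
    for doc_id, terms in enumerate(docs):
        for term in set(terms):
            postings.setdefault(term, set()).add(doc_id)

    space = {}
    for a, b in permutations(sorted(postings), 2):
        df_a, df_b = len(postings[a]), len(postings[b])
        # Pruning: the co-occurrence count cannot exceed min(df_a, df_b),
        # so the weight is bounded by min(df_a, df_b) / df_a. If even the
        # bound is below the threshold, skip the set intersection entirely.
        if min(df_a, df_b) / df_a < threshold:
            continue
        weight = len(postings[a] & postings[b]) / df_a
        if weight >= threshold:
            space[(a, b)] = weight
    return space
```

Under this simplified weight, a frequent term paired with a rare one is discarded from document frequencies alone, without intersecting posting sets. The pruning techniques evaluated in the paper may differ.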

References

  1. The SMART retrieval system. ftp://ftp.cs.cornell.edu/pub/smart/med.

  2. R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. Addison Wesley, 1999.

  3. Hsinchun Chen, Joanne Martinez, Tobun D. Ng, and Bruce R. Schatz. A concept space approach to addressing the vocabulary problem in scientific information retrieval: an experiment on the worm community system. Journal of the American Society for Information Science, 48(1):17–31, 1997.

  4. W.B. Frakes and R. Baeza-Yates. Information Retrieval: Data Structures and Algorithms. Prentice Hall, 1992.

  5. G.W. Furnas et al. The vocabulary problem in human-system communication. Comm. ACM, 30(11):964–971, 1987.

  6. H. Chen and K.J. Lynch. Automatic construction of networks of concepts characterizing document databases. IEEE Transactions on Systems, Man, and Cybernetics, 22(5):885–902, Sep/Oct 1992.

  7. B.R. Schatz, E. Johnson, P. Cochrane, and H. Chen. Interactive term suggestion for users of digital libraries: using subject thesauri and co-occurrence lists for information retrieval. In Digital Library 96, Bethesda MD, 1996.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ng, C.Y., Lee, J., Cheung, F., Kao, B., Cheung, D. (2001). Efficient Algorithms for Concept Space Construction. In: Cheung, D., Williams, G.J., Li, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2001. Lecture Notes in Computer Science, vol 2035. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45357-1_12

  • DOI: https://doi.org/10.1007/3-540-45357-1_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41910-5

  • Online ISBN: 978-3-540-45357-4

  • eBook Packages: Springer Book Archive
