Information Processing & Management
Volume 42, Issue 1, January 2006, Pages 56-73
Formal Methods for Information Retrieval
 
doi:10.1016/j.ipm.2004.11.007
Copyright © 2004 Elsevier Ltd All rights reserved.

A framework for understanding Latent Semantic Indexing (LSI) performance

April Kontostathis (a) (corresponding author) and William M. Pottenger (b)

(a) Ursinus College, PO Box 1000, 601 Main Street, Collegeville, PA 19426, United States
(b) Lehigh University, 19 Memorial Drive West, Bethlehem, PA 18015, United States

Accepted 12 November 2004. 
Available online 21 January 2005.


Abstract

In this paper we present a theoretical model for understanding the performance of the Latent Semantic Indexing (LSI) search and retrieval application. Many models for understanding LSI have been proposed; ours is the first to study the values produced by LSI in the term-by-dimension vectors. The framework presented here is based on term co-occurrence data. We show a strong correlation between second-order term co-occurrence and the values produced by the Singular Value Decomposition (SVD) algorithm that forms the foundation for LSI. We also present a mathematical proof that the SVD algorithm encapsulates term co-occurrence information.
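
As an illustration of the co-occurrence notion the abstract relies on, the following is a minimal sketch and not the authors' method. It assumes a hypothetical three-document toy corpus and treats two terms as second-order co-occurring when they never share a document but both co-occur with a common intermediate term. Excluding pairs that also co-occur directly is a simplifying choice of this sketch. The function names `first_order` and `second_order` are invented for the example, and no SVD is computed.

```python
from itertools import combinations

# Hypothetical toy corpus (an assumption of this sketch): each document is a set of terms.
docs = [
    {"human", "interface"},
    {"interface", "system"},
    {"graph", "tree"},
]


def first_order(docs):
    """Return unordered term pairs that appear together in at least one document."""
    pairs = set()
    for doc in docs:
        # Sorting gives each pair a canonical (alphabetical) orientation.
        for a, b in combinations(sorted(doc), 2):
            pairs.add((a, b))
    return pairs


def second_order(docs):
    """Return term pairs that never share a document but share a co-occurring neighbor."""
    direct = first_order(docs)

    # Build the first-order co-occurrence graph as an adjacency map.
    neighbors = {}
    for a, b in direct:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    pairs = set()
    for linked in neighbors.values():
        # Any two neighbors of the same intermediate term are connected by a path of length two.
        for a, b in combinations(sorted(linked), 2):
            if (a, b) not in direct:
                pairs.add((a, b))
    return pairs


first = first_order(docs)
second = second_order(docs)
print("first-order pairs: ", sorted(first))
print("second-order pairs:", sorted(second))
```

In this toy corpus "human" and "system" never appear in the same document, yet both co-occur with "interface", so they form the only second-order pair. The abstract's claim is that values produced by the SVD correlate strongly with relationships of this kind.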

Keywords: Latent Semantic Indexing; Term co-occurrence; Singular value decomposition; Information retrieval theory

Article Outline

1. Introduction
2. Overview of Latent Semantic Indexing
2.1. Latent Semantic Indexing algorithm
2.2. Co-occurrence in LSI—an example
3. Higher-order co-occurrence in LSI
3.1. Data sets
3.2. Methodology
3.3. Results
4. Analysis of the LSI values
4.1. Data sets
4.2. Methodology
4.3. Results
4.4. Discussion
5. Transitivity and the SVD
6. Conclusions and future work
Acknowledgements
References





