A Ranking-Based Document Retrieval Model: The TEXPROS Approach

Journal of Systems Integration

Abstract

This paper proposes a ranking-based document retrieval model that is integrated with the browsing process. In TEXPROS (TEXt PROcessing System), the interactive browsing process lets the system and a user interact to form a strategy for retrieving documents and information from the document base. As users gather information, they can reformulate their queries dynamically. During browsing sessions, a predicate-augmented infrastructure (called the Operation Network) presents information about the document types, the synopses of the documents, and where the documents are deposited. Concept-based retrieval of documents from partial descriptions can return a large volume of documents. The ranking model orders the returned documents by their degree of relevance to the query. Based on TEXPROS's dual models, an approach to creating representatives of documents and queries is described as the basis for the proposed ranking model. Integrating the ranking model with the browsing system creates an open retrieval environment that can be customized for different application domains.
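
The abstract does not state the ranking formula, so the following is only a minimal sketch of what ordering returned documents by degree of relevance to a query can look like. It assumes a generic cosine similarity over raw term frequencies, which is a standard vector-space technique and not the TEXPROS ranking model. The function names, document identifiers, and sample texts are all hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a raw term-frequency vector (a stand-in for a document representative)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (doc_id, score) pairs sorted by descending similarity to the query."""
    q = term_vector(query)
    scored = [(doc_id, cosine(q, term_vector(text)))
              for doc_id, text in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical returned documents for a partially described request.
documents = {
    "memo-1": "budget meeting memo for the computer science department",
    "letter-2": "letter about a conference on document retrieval",
    "memo-3": "memo about document retrieval budget",
}
ranking = rank("document retrieval memo", documents)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In this sketch the document sharing the most query terms relative to its length is listed first. A system built on the paper's approach would replace `term_vector` and `cosine` with the document and query representatives derived from the dual models.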

About this article

Cite this article

Wang, CY., Ng, P.A. A Ranking-Based Document Retrieval Model: The TEXPROS Approach. Journal of Systems Integration 8, 379–404 (1998). https://doi.org/10.1023/A:1008421404974

  • DOI: https://doi.org/10.1023/A:1008421404974