Regular article
Term-relevance computations and perfect retrieval performance

https://doi.org/10.1016/0306-4573(95)00011-5

Abstract

Computing formulas for binary independent (BI) term relevance weights are evaluated as a function of query representations and retrieval expectations in the CF database. Query representations consist of the limited set of terms appearing in each query statement and the complete set of terms appearing in the database. Retrieval expectations include comprehensive searches, for which many relevant documents are sought, and specific searches, for which only a few documents have merit. Conventional computing equations, which are known to overestimate term relevance weights, are shown to produce mediocre results for all combinations of query representations and retrieval expectations. Modified computing equations, which do not overestimate relevance weights, produce essentially perfect retrieval results for both comprehensive and specific searches when the query representation is complete. Probabilistic retrieval, based on BI assumptions and applied to simple subject descriptions of documents and queries, can retrieve all relevant documents and only relevant documents when term relevance weights are computed accurately.
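
The abstract does not state the computing equations themselves. As a rough illustration only, the sketch below uses the textbook binary independence weight, w = log[p(1 - u) / (u(1 - p))], with the common 0.5 smoothing constant. The function names, the parameter names (r, n, R, N), and the smoothing choice are assumptions made for this example; they are not necessarily the "conventional" or "modified" equations evaluated in the article.

```python
import math


def term_relevance_weight(r, n, R, N, k=0.5):
    """Textbook BI term relevance weight (illustrative assumption, not the article's equations).

    r: number of relevant documents containing the term
    n: number of documents containing the term
    R: number of relevant documents for the query
    N: number of documents in the collection
    k: smoothing constant (0.5 is a common choice; assumed here)
    """
    # Estimated probability that the term occurs in a relevant document.
    p = (r + k) / (R + 2 * k)
    # Estimated probability that the term occurs in a non-relevant document.
    u = (n - r + k) / (N - R + 2 * k)
    # Log odds ratio of the two probabilities.
    return math.log(p * (1 - u) / (u * (1 - p)))


def document_score(doc_terms, weights):
    """Sum the weights of the query terms that occur in the document."""
    return sum(w for term, w in weights.items() if term in doc_terms)
```

Under these assumptions, a term concentrated in relevant documents receives a positive weight, a term concentrated in non-relevant documents receives a negative weight, and documents are ranked by the sum of the weights of the terms they contain.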


Cited by (31)

  • Social media analysis by innovative hybrid algorithms with label propagation

    2022, Expert Systems with Applications
    Citation Excerpt:

    Here, pk denotes the probability that the word k appears in a relevant text, uk shows the probability of word k appearing in a non-relevant text, and wk presents the relevance weight of term k. According to discussions and analysis in the literature (Shaw Jr, 1995), if term k occurs frequently in relevant texts while it rarely occurs in non-relevant texts, then this means that term k has the capability of discriminating relevant texts from non-relevant texts, which is called the distinguishing characteristic of relevance computation. A positive value of wk means that k appears in relevant texts, while a negative value of wk means that k appears in non-relevant documents.

  • Achieving efficient and privacy-preserving multi-feature search for mobile sensing

    2015, Computer Communications
    Citation Excerpt:

    But it is usually difficult for a search entity to express its information need precisely; thus the value defined by itself may not be accurate. To overcome this impreciseness, the technique of relevance feedback is used [9–11]. It is the process of automatically adjusting an existing query using information feedback by the search entity about the preference of previously retrieved documents.
