ABSTRACT
In this poster we describe an investigation of topic similarity measures. We elicit assessments of the similarity of 10 pairs of topics from 76 subjects and use these as a benchmark to assess how well each measure performs. The measures have the potential to form the basis of a predictive technique for adaptive search systems. The results of our evaluation show that measures based on the level of correlation between topics concord most closely with general subject perceptions of search topic similarity.
Index Terms
- A study of topic similarity measures