research-article

Seed-driven Document Ranking for Systematic Reviews in Evidence-Based Medicine

Authors:
Grace E. Lee, Nanyang Technological University, Singapore, Singapore
Aixin Sun, Nanyang Technological University, Singapore, Singapore

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, June 2018, Pages 455–464. https://doi.org/10.1145/3209978.3209994

Published: 27 June 2018

ABSTRACT

A systematic review (SR) in evidence-based medicine is a literature review that provides a conclusion to a specific clinical question. To assure credible and reproducible conclusions, SRs are conducted following well-defined steps. One of the key steps, the screening step, is to identify relevant documents from a pool of candidate documents. Typically, about 2000 candidate documents are retrieved from databases using keyword queries for an SR, from which about 20 relevant documents are manually identified by SR experts, based on detailed relevance conditions or eligibility criteria. Recent studies show that document ranking, or screening prioritization, is a promising way to improve the manual screening process. In this paper, we propose a seed-driven document ranking (SDR) model for effective screening, with the assumption that one relevant document is known, i.e., the seed document. Based on a detailed analysis of the characteristics of relevant documents, SDR represents documents using a bag of clinical terms, rather than the commonly used bag of words. More importantly, we propose a method to estimate the importance of the clinical terms based on their distribution in candidate documents. On the benchmark dataset released by CLEF'17 eHealth Task 2, we show that the proposed SDR outperforms state-of-the-art solutions. Interestingly, we also observe that ranking based on a word embedding representation of documents complements SDR well. The best ranking is achieved by combining the relevance estimated by SDR with that estimated by word embedding. Additionally, we report results of simulating the manual screening process with SDR.
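To make the idea of seed-driven ranking concrete, the following is a minimal sketch and not the SDR model itself: the abstract does not give the paper's term-weighting formula or its similarity function. The sketch assumes each document has already been reduced to a list of clinical terms, uses a smoothed inverse document frequency over the candidate pool as a stand-in for "importance estimated from the distribution in candidate documents", and scores candidates by cosine similarity to the seed. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def idf_weights(candidates):
    """Assumed stand-in weighting: smoothed IDF computed over the candidate pool."""
    n = len(candidates)
    df = Counter(term for doc in candidates for term in set(doc))
    return {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in df.items()}

def weighted_vector(doc, weights):
    """Bag-of-terms vector: term frequency times pool-derived weight."""
    tf = Counter(doc)
    return {t: f * weights.get(t, 0.0) for t, f in tf.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_by_seed(seed, candidates):
    """Return candidate indices ordered from most to least similar to the seed."""
    weights = idf_weights(candidates)
    seed_vec = weighted_vector(seed, weights)
    scored = [
        (cosine(seed_vec, weighted_vector(doc, weights)), i)
        for i, doc in enumerate(candidates)
    ]
    # Sort by descending score; break ties by original position.
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]

# Toy data (hypothetical): each document is a list of extracted clinical terms.
candidates = [
    ["varices", "endoscopy", "liver"],
    ["fracture", "back", "pain"],
    ["capsule", "endoscopy", "varices", "liver"],
]
seed = ["capsule", "endoscopy", "varices"]
print(rank_by_seed(seed, candidates))  # [2, 0, 1]
```

In a screening setting, reviewers would read candidates in the returned order, so that documents resembling the known relevant seed are seen first.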

Index Terms

Seed-driven Document Ranking for Systematic Reviews in Evidence-Based Medicine
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Similarity measures

Published in
SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval
June 2018
1509 pages
ISBN: 9781450356572
DOI: 10.1145/3209978
General Chairs: Kevyn Collins-Thompson (University of Michigan, United States), Qiaozhu Mei (University of Michigan, United States)
Program Chairs: Brian Davison (Lehigh University, United States), Yiqun Liu (Tsinghua University, China), Emine Yilmaz (University College London, United Kingdom)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
document ranking
seed document
systematic reviews
Acceptance Rates
SIGIR '18 Paper Acceptance Rate: 86 of 409 submissions, 21%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%