Directing scientists away from potentially biased publications: the role of systematic reviews in health care
Introduction
Consideration of prior research, as indicated by the citation of relevant scientific publications, is essential to the conduct of new research (e.g., Dasgupta and David, 1994, Merton, 1973). Building new research cumulatively on previous work is made difficult by the dilution of high-quality research within the larger body of existing articles, which includes less-than-perfect studies. As reviewed in a recent Special Issue in this journal, the quality of research can be compromised at various points in the production and dissemination of research results (Biagioli et al., 2019). Cases of deliberate fraud and data manipulation are often the most visible, yet more elusive or debatable practices have been proliferating. Although these ‘liminal’ forms of misrepresentation have not been classified as misconduct, they can be as problematic as the most severe practices (e.g., Hall and Martin, 2019, Biagioli et al., 2019).
Recent studies have drawn attention to systematic errors introduced specifically in the publication phase, and provided evidence that the dissemination of research in scientific journals is often incomplete (e.g., Fanelli, 2011, Fanelli et al., 2017, Franco et al., 2014). The occurrence of unscientific publication practices – including, but not limited to, the misreporting of true effect sizes (‘p-hacking’) and hypothesizing after results are known (‘HARKing’) – has been lamented across a number of fields, e.g., psychology, management, economics and innovation studies (Necker, 2014, Fanelli, 2009, Bergh et al., 2017, Murphy and Aguinis, 2019, Head et al., 2015, Bettis, 2012, Harrison et al., 2017, Bruns et al., 2019, Halevi, 2020, Craig et al., 2020). Among these practices is selective reporting, which consists of including in the final publication only part of the findings originally recorded during a research study, on the basis of their direction or significance (Higgins and Green, 2011, Higgins et al., 2019). In clinical research, estimates suggest that at least half of studies are incompletely reported, in turn becoming unusable for follow-on science and clinical guidelines (Chalmers and Glasziou, 2009). Since reporting rules are often vague, selective reporting may be regarded more as an ‘inappropriate’ practice than a ‘questionable’ one (Hall and Martin, 2019). Yet, scientists rate it as the top factor contributing to irreproducible research (Baker, 2016), and members of the public see selective reporting as being immoral and deserving of punishment (Pickett and Roche, 2018).
In addition to obstructing scientists’ ability to replicate past research findings (Allison et al., 2016, Collaboration, 2015, Fanelli, 2018), poor reporting is believed to seriously distort science (Young et al., 2008) and generate research waste (Glasziou et al., 2014, Chalmers and Glasziou, 2009). The risks carried by incomplete reporting are particularly high in the biomedical field, for example, in terms of the potential harm to human health.
While prior research has thoroughly documented the proliferation of selective reporting and other poor publication practices, little is known about how scientists can go about detecting them. Work on knowledge governance systems, notably retractions, suggests that the scientific community can detect signals and redirect research efforts away from inadequate or biased publications. For example, retracted papers are cited less than their counterparts following retraction notices (Furman et al., 2012, Lu et al., 2013, Azoulay et al., 2015b). However, retractions are not always fit-for-purpose, particularly in those cases where errors are not necessarily the outcome of deliberate misconduct, or where mistakes are not serious enough to lead to the invalidation of an entire article (Fang et al., 2012, Neale et al., 2010). Therefore, it is useful to investigate other systems that can provide signals about the quality of published research.
We focus here on the role of systematic reviews in health care, which summarise prior medical knowledge, such as randomised controlled trials of medical drugs, with the primary aim of informing clinical and policy decisions (e.g., Cook et al., 1997). Systematic reviews represent a particularly interesting case because most studies on publication and reporting bias have been conducted in the biomedical sciences, reflecting a high awareness of bias in this field (Bekelman et al., 2003, Lexchin et al., 2003, Dwan et al., 2008, Lee et al., 2008, Dwan et al., 2013, Ross et al., 2008, Fleming et al., 2015). In addition, although the synthesis of research findings – for example, in literature reviews and meta-analyses – has enjoyed growing interest in several fields (e.g., Aguinis et al., 2011a), systematic reviews in health care are unique because they routinely incorporate systems to appraise the quality of the included studies. An example of such systems is the assessment of a study's risk of bias due to selective reporting.
These features allow us to explore whether systematic reviews in health care can play a secondary role – over and above their established role in summarizing existing medical evidence – by providing a signal to detect biases, and flag them to the scientific community. We explore whether publications deemed at high risk of bias from selective reporting attract less relative attention (as measured by follow-on citations), compared to their low risk of bias counterparts, after potential biases are flagged in systematic reviews. We also examine whether the key features of this signal, or rating in this case, shape its effect on scientists’ attention.
To tackle this question, we leverage evidence ratings presented in the reviews compiled by Cochrane, the most authoritative and comprehensive source of systematic reviews in health care (e.g., Jadad et al., 1998). We consider the publication of Cochrane reviews, which include a risk of bias assessment for all appraised studies. The risk of bias for a given study is judged as high, low, or unclear, and is summarised using a traffic-light system – i.e., a red, green, or amber flag.
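The three-level judgement and its traffic-light presentation can be expressed as a simple encoding. This is a minimal illustration only; the class and function names below are ours, not Cochrane's data format:

```python
from enum import Enum

class RiskOfBias(Enum):
    """Cochrane's three-level risk-of-bias judgement, mapped to the
    traffic-light colours used in review figures (illustrative mapping)."""
    LOW = "green"
    UNCLEAR = "amber"
    HIGH = "red"

def flag(judgement: str) -> str:
    # Look up the traffic-light colour for a judgement such as "high".
    return RiskOfBias[judgement.upper()].value
```

For example, a study judged at high risk of bias due to selective reporting would be flagged `flag("high")`, i.e. red.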
In line with prior literature on retractions, we employ a matched-sample control group, pairing articles deemed at high risk of bias due to selective reporting with similar papers deemed at low risk of bias, and quantify the impact of risk of bias ratings by comparing citation patterns for articles at high risk of bias with those of the matched publications. Our results indicate that systematic reviews provide key signals to follow-on researchers: in the main model specification, high risk of bias articles receive on average 7.9% fewer annual citations relative to their low risk of bias counterparts, following the publication of a Cochrane review. The investigation of citation dynamics after the treatment indicates that the citation effect is strongest in years 3 and 4 after the publication of a review. While our main sample compares the two furthest categories – that is, high vs. low risk of bias ratings – in additional analyses we also consider the cases flagged as being at unclear risk of bias. We observe no significant citation effect for high risk of bias publications when compared to those at unclear risk. Low risk of bias publications, instead, receive on average 4.1% more citations relative to their unclear risk of bias counterparts.
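The matched-sample design described above can be sketched in code. This is not the authors' implementation: the greedy nearest-neighbour matching, the field names, and the example data are all our simplifying assumptions for illustration.

```python
def match_controls(treated, controls, features):
    """Greedy nearest-neighbour matching: pair each treated (high risk
    of bias) paper with the most similar available control (low risk
    of bias) paper, by squared distance on pre-review characteristics."""
    pairs = []
    available = list(controls)
    for t in treated:
        best = min(available,
                   key=lambda c: sum((t[f] - c[f]) ** 2 for f in features))
        available.remove(best)  # match without replacement
        pairs.append((t, best))
    return pairs

def mean_relative_citation_gap(pairs):
    """Average relative difference in post-review annual citations
    between treated papers and their matched controls."""
    gaps = [(t["cites_post"] - c["cites_post"]) / c["cites_post"]
            for t, c in pairs]
    return sum(gaps) / len(gaps)
```

In this sketch, a value of roughly -0.079 for `mean_relative_citation_gap` would correspond to the 7.9% relative citation penalty reported in the main specification; the paper's actual estimates come from a regression framework, not this simple average.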
We also examine whether the effect of this quality signal is shaped by the modes of presentation of the risk of bias rating within a review, and by contextual factors such as the timing and subject area of the publication. We find that the effect is stronger when the risk of bias for the focal paper is deemed to be high along other bias domains considered by Cochrane, when the bias judgment is accompanied by a long explanatory comment, and when the review appraises a low number of articles. The effect is also concentrated among the most recent papers, and for papers reviewed within subject areas in which selective reporting bias is prevalent. These results suggest that the form and nature of the rating system's signal shape scientists’ response to it.
By exploring the role of systematic reviews in health care in influencing the way scientists place new research in the context of prior publications, this study builds on prior research examining other knowledge governance systems, such as the system of retractions (Furman et al., 2012, Lu et al., 2013, Azoulay et al., 2015b). By investigating practical ways to help scientists detect publication errors and build upon prior work that is, at a minimum, well reported, our findings inform the conversation on the detection of, and possible remedies for, academic misconduct, misrepresentation and gaming, particularly for the less easily defined, yet alarming, practices (Biagioli et al., 2019). Speaking to the ongoing debate regarding the quality of published research, these results also have repercussions for important matters such as research waste (Glasziou et al., 2014, Chalmers and Glasziou, 2009), and the crisis of replication of research across various science fields (Allison et al., 2016, Aguinis et al., 2017).
Section snippets
Systems to govern the validity of published research
The proliferation of academic misconduct, misrepresentation and gaming, and the resulting publication errors, are posing growing threats to the reliability of the scientific literature (e.g., Biagioli et al., 2019). These issues raise the question of how scientists should detect biases and build new studies upon robust evidence, while navigating an overwhelming volume of scientific publications (e.g., Bornmann and Mutz, 2015).
From the standpoint of prevention, various initiatives…
Systematic reviews in health care and bias assessment tools
Given that the search costs for scientists are invariably high, signals about the reliability and credibility of research may play an important role in shaping scientific efforts. Signalling theory suggests that signals can lower the uncertainty associated with selection where there is incomplete and asymmetric information about quality (Spence, 1973, Spence, 2002). In science, there are often information asymmetries between what scientists need to know, and what information is available to…
Empirical strategy
We leverage a unique dataset of clinical research publications matched to expert-driven assessments of bias derived from the Cochrane Database of Systematic Reviews. Cochrane is a global independent network of researchers, and the leading provider of systematic reviews in health care. Cochrane reviews, which appraise the extant research on international public health priority themes, are recognised as the highest standard in evidence-based health care (e.g., Grimshaw, 2004), and are influential…
Results
The final sample for our main set (‘HIGH vs LOW’) consists of 8,726 unique papers prior to the matching exercise. Of these, 2,675 (30.7%) were deemed at a high risk of bias due to selective reporting. In Table 1, we report descriptive statistics for the key variables for this sample of publications. It is worth noting that, before the matching, the treated and the control samples are statistically significantly different along several characteristics. Among others, treated papers have on…
Discussion and conclusion
The proliferation of a range of poor research practices is widely seen to be seriously affecting the scientific literature, leading to research waste across many disciplines. Despite these issues, the current systems intended to counter biases are subject to several important limitations. Against this backdrop, we explored the role of systematic reviews in health care in signalling bias from selective reporting. In our main sample, we found a 7.9% relative decrease in annual citations for…
CRediT authorship contribution statement
Rossella Salandra: Conceptualization, Methodology, Data curation, Formal analysis, Writing - original draft. Paola Criscuolo: Conceptualization, Methodology, Data curation, Formal analysis, Writing - original draft. Ammon Salter: Conceptualization, Methodology, Writing - original draft.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
Early versions of this paper were presented at DRUID18, at the 12th Workshop on the Organization, Economics and Policy of Scientific Research, and at Evidence Live 2018. We are thankful for comments received at these events, and grateful to the Cochrane Editorial and Methods Department for providing access to the data. We are also indebted to Stefano Baruffaldi, Ruxandra Luca and John Walsh. R. Salandra gratefully acknowledges the financial support received by the UK Engineering and Physical…
References (100)
- et al. (2019). Academic misconduct, misrepresentation and gaming: A reassessment. Research Policy.
- et al. (2019). Reporting errors and biases in published empirical findings: Evidence from innovation research. Research Policy.
- et al. (2009). Avoidable waste in the production and reporting of research evidence. The Lancet.
- et al. (2010). Putting research into context—revisited. The Lancet.
- et al. (2010). Clinical trials should begin and end with systematic reviews of relevant evidence: 12 years and waiting. The Lancet.
- et al. (2020). Using retracted journal articles in psychology to understand research misconduct in the social sciences: What is to be done. Research Policy.
- et al. (2011). Reviews assessing the quality or the reporting of randomized controlled trials are increasing over time but raised questions about how quality is assessed. Journal of Clinical Epidemiology.
- et al. (2020). The Matthew effect of a journal’s ranking. Research Policy.
- et al. (2017). Scientific citations favor positive results: a systematic review and meta-analysis. Journal of Clinical Epidemiology.
- et al. (2012). Governing knowledge in the scientific community: Exploring the role of retractions in biomedicine. Research Policy.
- Evaluating solutions to the problem of false positives. Research Policy.
- Reducing waste from incomplete or unusable reports of biomedical research. The Lancet.
- Towards a taxonomy of research misconduct: The case of business school research. Research Policy.
- Guilt by association: How scientific misconduct harms prior collaborators. Research Policy.
- The effect of negative online consumer reviews on product attitude: An information processing view. Electronic Commerce Research and Applications.
- Scientific misbehavior in economics. Research Policy.
- Exploring the adoption and processing of online holiday reviews: A grounded theory approach. Tourism Management.
- pybliometrics: Scriptable bibliometrics using a Python interface to Scopus. SoftwareX.
- Visual complexity of websites: Effects on users’ experience, physiology, performance, and memory. International Journal of Human-Computer Studies.
- Meta-analytic choices and judgment calls: Implications for theory building and testing, obtained effect sizes, and scholarly impact. Journal of Management.
- Debunking myths and urban legends about meta-analysis. Organizational Research Methods.
- What You See is What You Get? Enhancing Methodological Transparency in Management Research. Academy of Management Annals.
- Reproducibility: A tragedy of errors. Nature.
- Open access is tiring out peer reviewers. Nature News.
- Systems for grading the quality of evidence and the strength of recommendations I: Critical appraisal of existing approaches. The GRADE Working Group. BMC Health Services Research.
- The career effects of scandal: Evidence from scientific retractions. National Bureau of Economic Research.
- Retractions. Review of Economics and Statistics.
- Reproducibility crisis. Nature.
- Can systematic reviews contribute to regulatory decisions? European Journal of Clinical Pharmacology.
- Free-riding on power laws: Questioning the validity of the impact factor as a measure of research quality in organization studies. Organization.
- Scope and impact of financial conflicts of interest in biomedical research. JAMA.
- Is there a credibility crisis in strategic management research? Evidence on the reproducibility of study findings. Strategic Organization.
- The search for asterisks: Compromised statistical tests and flawed theories. Strategic Management Journal.
- Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references. Journal of the Association for Information Science and Technology.
- Commentary: Perverse incentives or rotten apples? Accountability in Research.
- The impact of Cochrane Systematic Reviews: A mixed method evaluation of outputs from Cochrane Review Groups supported by the UK National Institute for Health Research. Systematic Reviews.
- The impact of Cochrane Reviews: A mixed-methods evaluation of outputs from Cochrane Review Groups supported by the National Institute for Health Research. Health Technology Assessment.
- The incidence and role of negative citations in science. Proceedings of the National Academy of Sciences.
- Identifying outcome reporting bias in randomised trials on PubMed: Review of publications and survey of authors. BMJ.
- Empirical evidence for selective reporting of outcomes in randomized trials: Comparison of protocols to published articles. JAMA.
- The effect of word of mouth on sales: Online book reviews. Journal of Marketing Research.
- Correcting the medical literature: “To err is human, to correct divine”. JAMA.
- Estimating the reproducibility of psychological science. Science.
- Signaling theory: A review and assessment. Journal of Management.
- Systematic reviews: Synthesis of best evidence for clinical decisions. Annals of Internal Medicine.
- Toward a new economics of science. Research Policy.
- Evolution of poor reporting and inadequate methods over time in 20 920 randomised controlled trials included in Cochrane reviews: Research on research study. BMJ.
- Systematic review of the empirical evidence of study publication bias and outcome reporting bias. PLOS One.
- Systematic review of the empirical evidence of study publication bias and outcome reporting bias—an updated review. PLOS One.
- How many scientists fabricate and falsify research? A systematic review and meta-analysis of survey data. PLOS One.