Guidelines for including grey literature and conducting multivocal literature reviews in software engineering
Introduction
Systematic Literature Reviews (SLR) and Systematic Mapping (SM) studies were adopted from the medical sciences in the mid-2000s [1], and since then numerous SLR studies have been published in software engineering (SE) [2], [3]. SLRs are valuable because they help practitioners and researchers by indexing the evidence and gaps of a particular research area, which may consist of several hundred papers [4], [5], [6], [7], [8], [9]. Unfortunately, SLRs fall short of providing full benefits, since they typically review only the formally published literature and exclude the large body of “grey” literature (GL) that is constantly produced by SE practitioners outside of academic forums [10]. As SE is a practitioner-oriented and application-oriented field [11], the role of GL should be formally recognized, as has been done, for example, in educational research [12], [13], health sciences [14], [15], [16], and management [17]. We think that GL can enable a rigorous identification of emerging research topics in SE, as many research topics already stem from the software industry.
SLRs that include both the academic literature and the GL were termed Multivocal Literature Reviews (MLR) in educational research [12], [13] in the early 1990s. The main difference between an MLR and an SLR is that, while SLRs use only academic peer-reviewed papers as input, MLRs also use sources from the GL, e.g., blogs, videos, white papers and web pages [18]. MLRs recognize the need for “multiple” voices rather than constructing evidence from only the knowledge rigorously reported in academic settings (formal literature). The MLR definition from [12] elaborates this: “Multivocal literatures are comprised of all accessible writings on a common, often contemporary topic. The writings embody the views or voices of diverse sets of authors (academics, practitioners, journalists, policy centers, state offices of education, local school districts, independent research and development firms, and others). The writings appear in a variety of forms. They reflect different purposes, perspectives, and information bases. They address different aspects of the topic and incorporate different research or non-research logics”.
Many SLR recommendations and guidelines, e.g., Cochrane [19], do not prevent including GL in SLR studies; on the contrary, they recommend considering the GL as long as GL sources meet the inclusion/exclusion criteria [20]. Yet nearly all SLR papers in the SE domain exclude GL, a situation that hurts both academia and industry in our field. To facilitate adoption of the guidelines, we integrate boxes throughout the paper that state concrete guidelines, each summarizing the more detailed discussion of a specific issue in the respective section.
The purpose of this paper is therefore to promote the role of GL in SE and to provide specific guidelines for including GL and conducting multivocal literature reviews. We aim to complement the existing guidelines for SLR studies [3], [21], [22] in SE by addressing the peculiarities of including the GL in our field. Without proper guidelines, MLRs conducted by different teams of researchers may result in review papers with different styles and depth. We support the idea that “more specific guidelines for scholars on including grey literature in reviews are important as the practice of systematic review in our field continues to mature”, which originates from the field of management sciences [17]. Although multiple MLR guidelines have appeared in areas outside SE, e.g. [19], [20], we think they are not directly applicable for two reasons. First, the specific nature of GL in SE needs to be considered (the types of blogs, question-and-answer sites, and other GL sources in SE). Second, the guidelines are scattered across different disciplines and offer conflicting suggestions. Thus, in this paper we integrate them and draw on our prior MLR expertise to present a single “synthesized” guideline.
This paper is structured similarly to the SLR [22] and SM [3] guidelines in SE and considers three phases: (1) planning the review, (2) conducting the review, and (3) reporting the review results. The remainder of this guidelines paper is structured as follows. Section 2 provides a background on the concepts of GL and MLRs. Section 3 explains how we developed the guidelines. Section 4 presents guidelines on planning an MLR, Section 5 on conducting an MLR, and Section 6 on reporting an MLR. Finally, in Section 8, we draw conclusions and suggest areas for further work.
Section snippets
Background
We review the concept of GL in Section 2.1. We then discuss different types of secondary studies (of which an MLR is one) in Section 2.2. Section 2.3 reviews the emergence of and need for MLRs in SE. We then motivate the need for a set of guidelines for conducting MLR studies in Section 2.4.
An overview of the guidelines and their development
In Section 3.1, we explain how we developed the guidelines and Section 3.4 provides an overview of the guidelines.
Planning an MLR
As shown in Fig. 7, the MLR planning phase consists of the following two steps: (1) establishing the need for an MLR on a given topic, and (2) defining the MLR's goal and raising its research questions (RQs). In this section, these two steps are discussed.
Conducting the review
Once an MLR is planned, it shall be conducted. This section is structured according to five phases of conducting an MLR:
- Search process (Section 5.1)
- Source selection (Section 5.2)
- Study quality assessment (Section 5.3)
- Data extraction (Section 5.4)
- Data synthesis (Section 5.5)
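The five phases listed above can be pictured as a simple pipeline. The following is only a minimal sketch under stated assumptions, not the procedure prescribed by the guidelines: the Source record, its fields, the keyword search, the source-kind criterion, and the numeric quality threshold are all hypothetical names and simplifications introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """One candidate source; all field names are illustrative assumptions."""
    title: str
    kind: str              # e.g. "paper", "blog", "white paper", "video"
    peer_reviewed: bool
    quality_score: float = 0.0
    extracted: dict = field(default_factory=dict)


def search(pool, keyword):
    """Search process: keep sources whose title mentions the keyword."""
    return [s for s in pool if keyword.lower() in s.title.lower()]


def select(sources, allowed_kinds):
    """Source selection: apply a hypothetical inclusion criterion on source kind."""
    return [s for s in sources if s.kind in allowed_kinds]


def assess_quality(sources, threshold):
    """Study quality assessment: drop sources scoring below the threshold."""
    return [s for s in sources if s.quality_score >= threshold]


def extract(sources):
    """Data extraction: record the attributes the synthesis step needs."""
    for s in sources:
        s.extracted = {"kind": s.kind, "formal": s.peer_reviewed}
    return sources


def synthesize(sources):
    """Data synthesis: count formal versus grey sources."""
    summary = {"formal": 0, "grey": 0}
    for s in sources:
        summary["formal" if s.extracted["formal"] else "grey"] += 1
    return summary


def run_review(pool, keyword, allowed_kinds, threshold):
    """Chain the five phases in the order of the list."""
    found = search(pool, keyword)
    chosen = select(found, allowed_kinds)
    kept = assess_quality(chosen, threshold)
    return synthesize(extract(kept))
```

In a real MLR each phase involves human judgment and documented criteria; the functions here only mark where those decisions would plug in, and keeping the phases as separate steps makes it easier to report how many sources survived each one.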
Reporting the review
As shown in the MLR process (see Fig. 7), the last phase is reporting the review. Typical issues of the reporting phase of an MLR are similar to those covered in the SLR guidelines of Kitchenham and Charters [22]. In our experience with past SLRs and MLRs, we have seen two important additional issues that we discuss next: (1) reporting style for different audience types, and (2) ensuring usefulness to the target audience.
MLR needs to provide benefits for both researchers and practitioners since it contains a
Conclusions and future works
We think that software engineering research can improve its relevance by accepting and analyzing input from practitioner literature. Currently, books and consultancy reports are considered valid evidence while relevant input found in blogs and in social media discussions is often ignored. Furthermore, practitioner interviews done and reported by researchers have, for long, been considered as academic evidence in empirical software engineering, while grey literature produced by the very same
Acknowledgments
The third author has been partially supported by the Academy of Finland Grant no 298020 (Auto-Time) and by TEKES Grant no 3192/31/2017 (ITEA3: 16032 TESTOMAT project).
References (119)
- et al.
Systematic literature reviews in software engineering–a systematic literature review
Inf. Softw. Technol.
(2009)
- et al.
Guidelines for conducting systematic mapping studies in software engineering: an update
Inf. Softw. Technol.
(2015)
- et al.
Including systematic reviews in PhD programmes and candidatures in nursing – ‘Hobson's choice’?
Nurse Educ. Pract.
(2014)
- et al.
An exploration of technical debt
J. Syst. Softw.
(2013)
- et al.
Does the inclusion of grey literature influence estimates of intervention effectiveness reported in meta-analyses?
The Lancet
(2000)
Global epidemiology of injecting drug use and HIV among people who inject drugs: a systematic review
The Lancet
(2008)
- et al.
A systematic literature review of literature reviews in software testing
Inf. Softw. Technol.
(2016)
- et al.
Citations, research topics and active countries in software engineering: a bibliometrics study
Elsevier Comput. Sci. Rev.
(2016)
- et al.
A systematic mapping study of web application testing
Elsevier J. Inf. Softw. Technol.
(2013)
- et al.
Graphical user interface (GUI) testing: systematic mapping and repository
Inf. Softw. Technol.
(2013)
Why children are not vaccinated: a review of the grey literature
Int. Health
Software test maturity assessment and test process improvement: a multivocal literature review
Inf. Softw. Technol.
When and what to automate in software testing? A multivocal literature review
Inf. Softw. Technol.
Using argumentation theory to analyse software practitioners’ defeasible evidence, inference and belief
Inf. Softw. Technol.
A multivocal literature review on serious games for software process standards education
Comput. Stand. Inter.
Smells in software test code: a survey of knowledge in industry and academia
J. Syst. Softw.
Web application testing: a systematic literature review
J. Syst. Softw.
Evidence-based software engineering
Using systematic review methods within a Ph.D. dissertation in political science: challenges and lessons learned from practice
Int. J. Soc. Res. Methodol.
Software test-code engineering: a systematic mapping
J. Inf. Softw. Technol.
Systematic reviews and Meta-analysis: understanding the best evidence in primary healthcare
J. Family Med. Prim. Care
The role of systematic reviews in evidence-based practice, research, and development
FOCUS
What we know about testing embedded software
IEEE Softw.
Software Creativity 2.0
Experimentation in Software Engineering
Towards rigor in reviews of multivocal literatures: applying the exploratory case study method
Rev. Educ. Res.
Towards utility in reviews of multivocal literatures
Rev. Educ. Res.
The use of grey literature in health sciences: a preliminary survey
Bull. Med. Libr. Assoc.
Grey literature searching for Health Sciences Systematic Reviews: a prospective study of time spent and resources utilized
Evid. Based Libr. Inf. Pract.
Grey literature in meta-analyses of randomized trials of health care interventions
Cochrane Database Syst. Rev.
Shades of grey: guidelines for working with the grey literature in systematic reviews for management and organizational studies
Int. J. Manag. Rev.
Systematic mapping studies in software engineering
Grey literature
Searching for studies
Finding the hard to finds: searching for grey (Gray) literature
UBC
Evidence-Based Software Engineering and Systematic Reviews
Investigating bias in the search phase of Software Engineering secondary studies
A bibliometric analysis of the Turkish software engineering research community
Springer J. Scientometr.
Quantity versus impact of software engineering papers: a quantitative study
Scientometrics
What we know about testing embedded software
IEEE Softw.
UML-Driven Software Performance Engineering: a Systematic Mapping and Trend Analysis
A grey literature review of special events for promoting cancer screenings
BMC Cancer
A grey literature review of the cultural Olympiad
Cultural Trends
Reconceptualizing cultural participation in Europe: grey literature review
Cultural Trends
The need for multivocal literature reviews in software engineering: complementing systematic literature reviews with grey literature
Choosing the right test automation tool: a grey literature review
Comment on "Towards rigor in reviews of multivocal Literatures: applying the exploratory case study method"
Rev. Educ. Res.
Introducing automated GUI testing and observing its benefits: an industrial case study in the context of law-practice management software