Review
A systematic examination of knowledge loss in open source software projects

https://doi.org/10.1016/j.ijinfomgt.2018.11.015

Abstract

Context

Open Source Software (OSS) development is a knowledge-focused activity which relies heavily on contributors, who can be volunteers or paid workers and are geographically distributed. While working on OSS projects, contributors acquire project-related individualistic knowledge and gain experience and skills, which often remain unshared with others and are usually lost once contributors leave a project. All software development organisations face the problem of knowledge loss as employees leave, but this situation is exacerbated in OSS projects, where most contributors are volunteers with largely unpredictable engagement durations. Contributor turnover is inevitable due to the transient nature of OSS project workforces, causing knowledge loss, which threatens the overall sustainability of OSS projects and impacts negatively on software quality and contributor productivity.

Objective

The objective of this work is to deeply and systematically investigate the phenomenon of knowledge loss due to contributor turnover in OSS projects as presented in the state-of-the-art literature and to synthesise the information presented on the topic. Furthermore, based on the learning arising from our investigation it is our intention to identify mechanisms to reduce the overall effects of knowledge loss in OSS projects.

Methodology

We use the snowballing methodology to identify the relevant literature on knowledge loss due to contributor turnover in OSS projects. This robust methodology for a literature review includes a research question, a search strategy, inclusion, exclusion and quality criteria, and data synthesis. The search strategy and the inclusion, exclusion and quality criteria are applied as part of the snowballing procedure.

Snowballing is considered an efficient and reliable way to conduct a systematic literature review, providing a robust alternative to mechanically searching individual databases for given topics.
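As an illustration only, the iterative nature of snowballing can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure used in this review: the function name `snowball`, the dictionaries `references` (backward links) and `cited_by` (forward links), and the `include` predicate standing in for the inclusion, exclusion and quality criteria are all hypothetical.

```python
from collections import deque


def snowball(start_set, references, cited_by, include):
    """Hypothetical sketch of backward and forward snowballing.

    start_set  -- identifiers of the papers in the baseline (start) set
    references -- dict: paper -> papers it cites (backward snowballing)
    cited_by   -- dict: paper -> papers citing it (forward snowballing)
    include    -- predicate standing in for the inclusion, exclusion
                  and quality criteria (an assumption for illustration)
    """
    included = set(start_set)
    examined = set(start_set)
    queue = deque(sorted(start_set))
    while queue:
        paper = queue.popleft()
        # Candidates come from the reference list and from the citations.
        candidates = set(references.get(paper, ())) | set(cited_by.get(paper, ()))
        for candidate in sorted(candidates):
            if candidate in examined:
                continue  # each paper is screened only once
            examined.add(candidate)
            if include(candidate):
                included.add(candidate)
                queue.append(candidate)  # newly included papers are snowballed in turn
    return included


# Toy citation data, invented purely for demonstration.
references = {"A": ["B", "C"], "B": ["D"]}
cited_by = {"A": ["E"], "D": ["F"]}
result = snowball({"A"}, references, cited_by, include=lambda p: p != "C")
```

The loop stops when an iteration yields no new included papers, which mirrors the idea that snowballing continues until the set of relevant papers stabilises.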

Result

Knowledge sharing in OSS projects is abundant, but there is no evidence of a formal strategy or practice to manage knowledge. Due to the dynamic and diverse nature of OSS projects, knowledge management is considered a challenging task, and there is a need for a proactive mechanism to share knowledge in the OSS community so that it can be reused in the future by OSS project contributors. From the collection of papers found using snowballing, we consolidated various themes on knowledge loss due to contributor turnover in OSS projects and identified 11 impacts of knowledge loss in OSS projects, and 10 mitigations to manage knowledge loss in OSS projects.

Conclusion

In this paper, we propose future research directions to investigate the integration of proactive knowledge retention practices with existing OSS practices to reduce the current knowledge loss problem. We suggest that there is insufficient attention paid to KM in general in OSS; in particular, there would appear to be an absence of proactive measures to reduce the potential impact of knowledge loss. We also propose the need for a KM evaluation metric in OSS projects, similar to the ones that evaluate the health of online communities, which should help to inform potential consumers of the OSS of the KM status of a project, something that does not exist today.

Introduction

Software development is a knowledge-intensive activity, which involves intense complexity (Clarke, O’Connor, & Leavy, 2016). Open Source Software (OSS) has had a profound impact on the way in which software is developed and consequently on the perception of software development (Hagan, Watson, & Barron, 2007). Open Source Software is one of the representative examples of open collaboration (Lee, Baek, & Jahng, 2017). Many researchers have pointed out that the open source movement is an interesting phenomenon that is difficult to explain with conventional economic theories (Andersen-Gott, Ghinea, & Bygstad, 2012). In OSS projects, contributors can be volunteers or paid workers who participate in software development activities. While working on OSS projects, contributors acquire project-related knowledge and gain experience and skills. Examples of knowledge that is required to accomplish software development tasks on projects include the application domain, the system’s architecture, the use of particular algorithms to code, insights into requirements, the programming language and the development environment (Anquetil, de Oliveira, de Sousa, & Batista Dias, 2007). Valuable individualistic knowledge, which remains unshared with others, is lost once contributors leave the project. Organisations constantly face the problem of knowledge loss as employees leave (De Long & Davenport, 2003; Jennex & Durcikova, 2013; Viana, Conte, Marczak, Ferreira, & Souza, 2015), a situation which is perhaps exacerbated in OSS projects (Donadelli, 2015; Izquierdo-Cortazar, Robles, Ortega, & Gonzalez-Barahona, 2009; Rigby, Zhu, Donadelli, & Mockus, 2016) where most (if not all) contributors are volunteers with largely unpredictable engagement durations (Robles, Gonzalez-Barahona, & Michlmayr, 2005). The phenomenon of volunteers joining and leaving at their discretion is more common in OSS projects than with hired employees in Closed Source Software (CSS) (Robles et al., 2005). Such contributor attrition leads to knowledge loss on OSS projects.

The importance of OSS in our daily lives can be realised from the fact that there are thousands of OSS projects operating worldwide such as the Linux operating system, Apache Web Server, Mozilla Firefox, OpenOffice and many more. There has been an exponential rise in OSS products and their use, as indicated by 430,000 projects hosted in 2014 on the SourceForge portal (Silic & Back, 2017). The projects are of varying sizes and also involve commercial firms who are heavily dependent on OSS systems (Crowston, Wei, Howison, & Wiggins, 2012). During 2014, Google’s mobile Android operating system had about one billion users across all devices (Conn, 2014). A survey conducted in 2015 reported that almost 78% of companies run operations on Open Source Software and 66% of companies have incorporated Open Source Software in creating software for customers (BlackDuck, 2015). The use of Apache and NGINX is calculated to be 54% of all webservers used worldwide (Silic & Back, 2017).

Many technology firms traditionally known for closed organisational structures and proprietary software development, such as IBM, Microsoft, and Facebook, have embraced the strategic opportunities that open source development models offer (Daniel, Midha, Bhattacherhjee, & Singh, 2018). Currently it is reported that 40% of Fortune 50 companies are developing software systems using Open Source Software, with firms such as Microsoft and Facebook being the most active open source development communities on GitHub. For example, Microsoft has the highest number of contributors and more than 258 software projects under development (Octoverse, 2016).

The phenomenon where contributors working in project teams join, leave or change their role is referred to as ‘turnover’ (Foucault, Palyart, Blanc, Murphy, & Falleri, 2015). The turnover can be catastrophic for the project if a contributor who is knowledgeable on major parts of the system leaves and this reduces the spread of the knowledge (Donadelli, 2015). Contributor turnover is rated as being very high both in the software industry (Zhou, 2009) and in OSS projects (Foucault et al., 2015; Otte, Moreton, & Knoell, 2008; Rigby et al., 2016; Robles & Gonzalez-Barahona, 2006) and mitigating its effects is considered a significant problem (Fronza, Janes, Sillitti, Succi, & Trebeschi, 2013). Turnover is inevitable due to the transient nature of OSS project workforces (Michlmayr, 2007; Yu, Benlian, & Hess, 2012), causing knowledge loss (Izquierdo-Cortazar et al., 2009; Rigby et al., 2016). Knowledge loss in this work refers to the loss of experience and expertise in OSS projects that can result in the decline of evolution in OSS systems (Joblin, Apel, & Mauerer, 2017). Knowledge loss not only impacts software quality (Foucault et al., 2015; Mockus, 2010) and contributor productivity (Izquierdo-Cortazar et al., 2009; Schilling, Laumer, & Weitzel, 2011), but also threatens the overall sustainability of OSS projects.

OSS projects are constantly evolving, as indicated in the adapted staged model for OSS systems (Capiluppi, Stol, & Boldyreff, 2012), and maintenance plays a significant role in project evolution (Lin, Robles, & Serebrenik, 2017; Rigby et al., 2016). As Michlmayr (2007) asserts, “it is harder to separate out maintenance and development since they tend to occur together”. A system that stops evolving is an indication that it may become a legacy system in the near future (Capiluppi et al., 2012). Taking the view of software development and maintenance being part of the broader phenomenon of software evolution, it is argued that the adoption of knowledge management practices in software engineering would improve both software construction and, more particularly, software maintenance (de Vasconcelos, Kimble, Carreteiro, & Rocha, 2017).

Knowledge Management (KM) processes play a significant role in the implementation of various Information Systems (IS). Researchers have introduced different KM processes, each of which contributes to the efficient use of ISs (Al-Emran, Mezhuyev, Kamaludin, & Shaalan, 2018). In terms of IS type, most of the analysed studies focused on investigating the impact of KM processes on E-business systems, knowledge management systems and IS outsourcing respectively (Al-Emran et al., 2018). Effective knowledge management practices in organisations are focused on knowledge creation and knowledge transfer activities (Barão, de Vasconcelos, Rocha, & Pereira, 2017). KM processes are the essential elements for improving the capabilities of a particular technology, and the successful implementation of such technology increasingly depends on the efficient use of these processes (Lee, Lee, & Lin, 2007). KM processes are considered the fundamental processes for the successful adoption and implementation of a new IS (Chong, Chan, Goh, & Tiwari, 2013). For instance, information and knowledge can be seen as key resources for improving the internationalisation processes of small and medium-sized enterprises (SMEs) including collaboration, which is considered an important facilitator of these processes, particularly by nurturing information and knowledge sharing (Costa, Soares, & de Sousa, 2016).

The objective of this work is to deeply and systematically investigate the phenomenon of knowledge loss due to contributor turnover in OSS projects as presented in the state-of-the-art literature and to synthesise the information presented on the topic. Furthermore, based on the learning arising from our investigation it is our intention to identify mechanisms to reduce the overall effects of knowledge loss in OSS projects.

To gain an insight into the phenomenon of knowledge loss in OSS projects due to turnover, we conducted a literature review. The research question is structured using the PICOC (Population, Intervention, Comparison, Outcome and Context) criteria for framing research questions in a Systematic Literature Review (SLR) (Petticrew & Roberts, 2006), with only population, intervention, and outcome being relevant for our research question:

  • Population: Open Source Software and associated synonyms

  • Intervention: Knowledge loss and turnover

  • Outcome: Existing themes and patterns related to knowledge loss and turnover in OSS projects

Our central research question is phrased as “What is the existing state-of-the-art literature on knowledge loss due to turnover in OSS projects?”

The remainder of the paper is structured as follows: Section 2 provides an insight into the setting up of OSS projects, work structure and knowledge relevant concepts. Section 3 presents the literature review methodology and Section 4 presents the application of the literature review methodology. Section 5 elaborates on problem identification and details the impact of knowledge loss in OSS projects. Section 6 comprises practices to reduce knowledge loss in OSS projects. Section 7 discusses limitations to this work and the final Section 8 marks the conclusion of this work with future directions.

Section snippets

Open source software

Open Source Software (OSS) is listed as one of the four core components of Open Science (OS), which relates to the movement that provides free accessibility to scientific research data and its dissemination to all levels of inquiring society (Pontika, Knoth, Cancellieri, & Pearce, 2015). OSS is a term used to embrace software developed and released under an “open source” license that complies with the Open Source Definition (OSD). The OSD uses either the shorter version

Literature review methodology

In order to find the relevant literature on the topic of knowledge loss in OSS, we conducted our literature review using the snowballing approach (SB) (Badampudi, Wohlin, & Petersen, 2015; Jalali & Wohlin, 2012; Wohlin, 2014). Systematic literature studies have become common in software engineering, and hence it is important to conduct them efficiently and reliably (Wohlin, 2014). In software engineering, the main recommended first step is using search strings in a number of databases, while in

Identifying the baseline set

The queries in Table 1 were executed on Google Scholar using Zotero (a free tool for managing bibliographies) to retain the results generated by the queries and to remove duplicates. Zotero was also used to download the citation details of the papers, including electronic versions. Placing quotes around the term knowledge loss makes Google Scholar search for it as a single phrase.
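The two mechanical steps described above, quoting multi-word terms and removing duplicate results, can be sketched as follows. This is a hedged illustration only: it does not reproduce the queries in Table 1 or Zotero's own duplicate detection, and the helper names `quote_phrase`, `normalise_title` and `deduplicate`, as well as the sample records, are hypothetical.

```python
import re


def quote_phrase(term):
    """Wrap multi-word terms in double quotes so they are searched as one phrase."""
    return f'"{term}"' if " " in term else term


def normalise_title(title):
    """Lower-case a title and collapse punctuation and whitespace (assumed matching rule)."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record for each normalised title, preserving order."""
    seen = set()
    unique = []
    for record in records:
        key = normalise_title(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Invented sample records, for demonstration only.
records = [
    {"title": "Knowledge Loss in OSS", "source": "query 1"},
    {"title": "knowledge loss in OSS.", "source": "query 2"},
    {"title": "Turnover in open source projects", "source": "query 2"},
]
unique_records = deduplicate(records)
```

Matching on a normalised title is a deliberately simple design choice; a real pipeline might also compare DOIs or author lists before treating two records as the same paper.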

During the execution of the search

Discussion

The emerging themes identified from the papers in this work are subjective in nature and a rigorous process was followed to collect papers in a systematic literature review. In this section, we examine the problem of knowledge loss in OSS projects and the subsequent section lists the impact of knowledge loss in OSS projects.

Reducing knowledge loss in OSS projects

The knowledge loss phenomenon and its impact in OSS projects were discussed in Section 5. In this section, our focus is on the literature that discusses the reduction of knowledge loss in OSS projects due to contributor turnover. We first draw our attention to Knowledge Retention (KR) in traditional organisations, which mainly comes into focus when an employee is leaving (Lindvall & Rus, 2003). The need for a KR mechanism in an organisation is assessed based on the following (Lindvall & Rus,

Limitations of the study

In this study, we used a snowballing search to find relevant papers instead of a traditional database search. It could be argued that the snowballing method is weak and does not provide enough coverage of the relevant literature in this study. However, the snowballing method is shown in the literature to provide coverage similar to database searches (Wohlin, 2014). The snowballing search strategy can therefore be effectively employed in place of the database search to find relevant papers. In

Conclusion

The objective of this study is to understand the phenomenon of knowledge loss due to contributor turnover in OSS projects. To do so, we conducted a literature review using snowballing as a search strategy. We identified 38 papers after filtering from a large number of papers (more than 2000) in a comprehensive review. The papers span the period from 2000 to 2017. The majority of the papers employed empirical methods as

References (148)

  • D. Hagan et al.

    Ascending into order: A reflective analysis from a small open source development team

    International Journal of Information Management

    (2007)
  • C.-L. Hsu et al.

    Acceptance of blog usage: The roles of technology acceptance, social influence and knowledge sharing motivation

    Information & Management

    (2008)
  • B. Kitchenham et al.

    A systematic review of systematic review process research in software engineering

    Information and Software Technology

    (2013)
  • J. Koh et al.

    Knowledge sharing in virtual communities: An e-business perspective

    Expert Systems With Applications

    (2004)
  • K.R. Lakhani et al.

    How open source software works:“free” user-to-user assistance

    Research Policy

    (2003)
  • S. Lee et al.

    Governance strategies for open collaboration: Focusing on resource allocation in open source software development organizations

    International Journal of Information Management

    (2017)
  • S.A. Licorish et al.

    Understanding the attitudes, knowledge sharing behaviors and task performance of core developers: A longitudinal study

    Information and Software Technology

    (2014)
  • S. Nidhra et al.

    Knowledge transfer challenges and mitigation strategies in global software development—A systematic literature review and industrial validation

    International Journal of Information Management

    (2013)
  • I. Nonaka et al.

    SECI, Ba and leadership: A unified model of dynamic knowledge creation

    Long Range Planning

    (2000)
  • L.G. Pee et al.

    Intrinsically motivating employees’ online knowledge sharing: Understanding the effects of job design

    International Journal of Information Management

    (2015)
  • N. Pennington

    Stimulus structures and mental representations in expert comprehension of computer programs

    Cognitive Psychology

    (1987)
  • A. Qumer et al.

    A framework to support the evaluation, adoption and improvement of agile methods in practice

    The Journal of Systems and Software

    (2008)
  • P.J. Adams et al.

    Coordination and productivity issues in free software: The role of Brooks’ law

    Software Maintenance, 2009. ICSM 2009. IEEE International Conference on, IEEE

    (2009)
  • L. Aggestam et al.

    Seven types of knowledge loss in the knowledge capture process

    (2010)
  • R. Ayushi et al.

    What Community contribution pattern says about stability of software project?

    Software Engineering Conference (APSEC), 2014 21st Asia-Pacific

    (2014)
  • D. Badampudi et al.

    Experiences from using snowballing and database searches in systematic literature studies

    Proceedings of the 19th International Conference on Evaluation and Assessment in Software Engineering

    (2015)
  • L. Bao et al.

    Who will leave the Company?: A large-scale industry study of developer turnover by mining monthly work report

    2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR)

    (2017)
  • V.R. Basili

    Viewing maintenance as reuse-oriented software development

    IEEE Software

    (1990)
  • S. BlackDuck

    Seventy-eight percent of companies run on open source, yet many lack formal policies to manage legal, operational, and security risk

    (2015)
  • A. Bosu et al.

    Impact of developer reputation on code review outcomes in OSS projects: An empirical investigation

    Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement

    (2014)
  • R. Britto et al.

    Software architects in large-scale distributed projects: An ericsson case study

    IEEE Software

    (2016)
  • A. Capiluppi et al.

    From the cathedral to the bazaar: An empirical study of the lifecycle of volunteer Community projects

  • A. Capiluppi et al.

    Adapting the "staged model for software evolution" to free/libre/open source software

    Ninth International Workshop on Principles of Software Evolution: in Conjunction With the 6th ESEC/FSE Joint Meeting

    (2007)
  • A. Capiluppi et al.

    Exploring the role of commercial stakeholders in open source software evolution

    IFIP International Conference on Open Source Systems

    (2012)
  • A.Y.-L. Chong et al.

    Do interorganisational relationships and knowledge-management practices enhance collaborative commerce adoption?

    International Journal of Production Research

    (2013)
  • C.U. Ciborra et al.

    Sharing knowledge across boundaries

    Journal of Information Technology

    (2001)
  • P. Clarke et al.

    A complexity theory viewpoint on the software development process and situational context

    Software and System Processes (ICSSP), 2016 IEEE/ACM International Conference on, IEEE

    (2016)
  • J. Colazo et al.

    Impact of license choice on open source software development activity

    Journal of the American Society for Information Science and Technology

    (2009)
  • S. Conn

    Gartner says worldwide traditional PC, tablet, ultramobile and mobile phone shipments on pace to grow 7.6 percent in 2014 [press release]

    (2014)
  • E. Constantinou et al.

    An empirical comparison of developer retention in the RubyGems and npm software ecosystems

    Innovations in Systems and Software Engineering

    (2017)
  • K. Crowston

    Lessons from volunteering and free/libre open source software development for the future of work

    Researching the Future in Information Systems

    (2011)
  • K. Crowston et al.

    The social structure of free and open source software development

    First Monday

    (2005)
  • K. Crowston et al.

    Effective work practices for software engineering: free/libre open source software development

    Proceedings of the 2004 ACM workshop on Interdisciplinary software engineering research

    (2004)
  • K. Crowston et al.

    Information systems success in free and open source software development: Theory and measures

    Software Process Improvement and Practice

    (2006)
  • K. Crowston et al.

    Free/Libre open-source software development: What we know and what we do not know

    ACM Computing Surveys

    (2012)
  • G.N. Dafermos

    Management and virtual decentralised networks: The Linux project

    First Monday (originally published in Volume 6, Number 11, November 2001)

    (2005)
  • A. Daghfous et al.

    Understanding and managing knowledge loss

    Journal of Knowledge Management

    (2013)
  • S. Daniel et al.

    Sourcing knowledge in open source software projects: The impacts of internal and external social capital on project success

    The Journal of Strategic Information Systems

    (2018)
  • T.H. Davenport et al.

    Working knowledge: How organizations manage what they know

    Harvard Business Press.

    (1998)
  • D.W. De Long et al.

    Better practices for retaining organizational knowledge: Lessons from the leading edge

    Employment Relations Today

    (2003)