Disclosure or secrecy? The dynamics of Open Science☆
Introduction
At least since the development of scientific societies and related research institutions in the seventeenth century, the centrality of cumulativeness in scientific and technical advance has been recognized, most famously by Newton, who observed that scientific progress depends on “standing on the shoulders of giants.” While economic theory has focused on deriving the implications of cumulativeness for related economic variables such as the equilibrium growth rate (Romer, 1990, Grossman and Helpman, 1991, Jones, 1995, Jones, forthcoming) or the incentives for commercial innovation (Scotchmer, 1991, Gallini and Scotchmer, 2002, Scotchmer, 2004), relatively little research has focused on the microeconomic conditions that support a cumulative research environment.
The fact that knowledge is produced does not guarantee that follow-on researchers will be able to exploit that knowledge (Polanyi, 1967). Effective diffusion of knowledge across researchers and over time requires that individuals are aware of the extant knowledge and that they pay the costs of accessing that knowledge. The ability of a society to stand on the shoulders of giants depends not only on the amount of knowledge it generates but also on the quality of mechanisms for storing knowledge, the trustworthiness of that knowledge, and the cost to future generations of accessing that knowledge (Mokyr, 2002, Furman and Stern, 2008).
Open Science is perhaps the best-known system for achieving these objectives. Open Science is characterized by a distinctive set of incentives for cumulative knowledge production, including norms that facilitate disclosure and knowledge diffusion (Merton, 1973, Dasgupta and David, 1994). This system includes the recognition of scientific priority by future scientific generations, the importance of demonstrating experimental replicability, and a system of public (or coordinated) expenditures to reward those who contribute to cumulative knowledge production over the long term. By conditioning career rewards (such as tenure) on disclosure through publication, Open Science promotes cumulative discovery. However, the logic underlying Open Science as an economic institution is more subtle. The ability to sustain disclosure over time depends not simply on the willingness of scientists to invest in research per se but also on their willingness to (1) invest in drawing upon the knowledge provided by prior researchers and (2) disclose their own discoveries in a way that can be accessed and exploited by future researchers.
The ability to maintain Open Science may be challenged when discoveries are not only of scientific interest but also have significant commercial application. When a single discovery has dual applications – it can serve as an input to future scientific research and be exploited directly for commercial gain – a trade-off arises between the incentives to disclose through the scientific literature and the incentives to maximize direct commercial exploitation (Rosenberg, 1990, Stokes, 1997). Consider the Oncomouse (Murray, 2006). In the early 1980s, Professor Phil Leder at the Harvard Medical School developed the first genetically engineered mouse; it was called the Oncomouse. Leder and his colleague had used newly emerging transgenic techniques to insert an oncogene into a mouse embryo; the result was a mouse that was highly susceptible to cancer. Using the mice to examine the importance of oncogenes in the onset of cancer, Leder came to recognize that “it could serve a variety of different purposes, some purely scientific others highly practical” (Kevles, 2002, p. 83). This research was published in Cell in 1984, and, in 1988, a broad patent for the Oncomouse was granted by the USPTO. Harvard's licensee DuPont aggressively enforced these rights, including demands for “reach-through” rights and review of publications that used the Oncomouse in further scientific research. Over the next decade, a number of controversies surrounded the access to and credit for discoveries based on the Oncomouse. The conflict over the Oncomouse centered on the ability of the broader scientific community to exploit the Oncomouse (and to provide informal recognition to Leder and his coauthors) versus the incentives of DuPont to limit the diffusion of the Oncomouse in order to maximize its commercial advantage (Murray, 2006).
Although traditional models of science and innovation have often assumed a sharp delineation between purely scientific research and commercial applications, qualitative studies of scientific research have increasingly emphasized the importance of dual-use research (Rosenberg, 1974, Stokes, 1997, Murray, 2002, Murray and Stern, 2007). Stokes, in particular, suggested that a significant share of all scientific research combines scientific and commercial motives and results in knowledge production in “Pasteur's Quadrant.” Pasteur's fundamental insights into microbiology simultaneously had practical applications for cholera and rabies while also serving as the foundation for the germ theory of disease (Geison, 1995, Stokes, 1997).
This paper analyzes the feasibility of Open Science when research is conducted in Pasteur's Quadrant (i.e., has both scientific and commercial importance). We consider how incentives for access to prior knowledge, investment in knowledge, and the disclosure of discoveries depend on the disclosure and investment decisions of prior researchers and the access decisions of future researchers. A critical ingredient of our analysis is the fact that the incentives of any one researcher to participate in Open Science depend crucially on the choices of other researchers — i.e., the incentives to publish research in an academic journal depend on future researchers building on that discovery and providing appropriate citations to it in their own research. We model scientific disclosure as an endogenous economic outcome of the microeconomic environment, with the potential for Open Science depending on strategic interaction among researchers in their access, investment, and disclosure decisions.
Our model highlights two features of Open Science: (1) the ability to draw upon prior (disclosed) research and (2) the fact that the incentives to produce and disclose abstract knowledge depend on receiving credit from follow-on researchers. In contrast, the incentives for commercially motivated knowledge production are premised on the ability to limit the use of knowledge by others; we call this approach “Secrecy.” Of course, the private returns to scientific research crucially depend on several exogenous factors such as the institutional and legal environment of the time. In our model, this is achieved through trade secrecy. (We consider the role of formal intellectual property rights (IPR) in an extension.) We embed the choice between secrecy versus disclosure into an overlapping generations framework in which each generation is composed of a single researcher who lives for two periods. During his first period of life, each researcher produces a knowledge output by choosing (1) whether to draw upon knowledge (if available) produced by the previous generation, (2) the level of investment in his own research, and (3) whether to disclose the produced knowledge for follow-on researchers in the next period. Each researcher faces a fixed cost of drawing upon prior knowledge, and a constant marginal cost of investment in his own research. The benefits to each researcher are composed of (1) the benefits from citations to his research by the next generation (if he chooses to disclose, and the next generation chooses to build on that research) and (2) private rents from proprietary exploitation of his knowledge. Researchers face a trade-off between maximizing the benefits from private exploitation (through secrecy) and earning a lower benefit from private exploitation but earning additional benefits from disclosure through the institutions of Open Science.
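To make the structure of this decision problem concrete, the following Python sketch sets up one researcher's choice of access, investment, and disclosure. It is only an illustration: the square-root knowledge technology, the linear rents and citation benefits, and every parameter value are assumptions introduced here, not the paper's specification. The sketch optimizes investment on a grid for each combination of the two binary choices and then picks the best triplet.

```python
from dataclasses import dataclass
from itertools import product
import math


@dataclass(frozen=True)
class Params:
    # All values and functional forms are hypothetical, chosen only for illustration.
    access_cost: float = 0.2     # fixed cost of drawing on prior knowledge
    marginal_cost: float = 1.0   # constant marginal cost of own research investment
    citation_value: float = 0.8  # credit per unit of knowledge if the next generation builds on it
    rent_secret: float = 1.0     # private rent per unit of knowledge under secrecy
    rent_disclosed: float = 0.6  # lower private rent per unit of knowledge under disclosure
    spillover: float = 0.5       # productivity gain from accessing disclosed prior knowledge


def payoff(a, x, d, prior, next_builds, p):
    """Payoff to one researcher from access a, investment x, and disclosure d."""
    # Assumed concave technology; prior is 0.0 when the previous generation kept its work secret.
    knowledge = (1.0 + p.spillover * a * prior) * 2.0 * math.sqrt(x)
    rent = (p.rent_disclosed if d else p.rent_secret) * knowledge
    citations = p.citation_value * knowledge if (d and next_builds) else 0.0
    return rent + citations - p.access_cost * a - p.marginal_cost * x


def best_response(prior, next_builds, p=Params()):
    """Optimize x on a grid for each (a, d) pair, then return the best (a, x, d)."""
    grid = [i / 100 for i in range(1001)]  # investment levels 0.00, 0.01, ..., 10.00
    candidates = ((a, x, d) for a, d in product((0, 1), repeat=2) for x in grid)
    return max(candidates, key=lambda c: payoff(*c, prior, next_builds, p))


print(best_response(prior=1.0, next_builds=True))
print(best_response(prior=1.0, next_builds=False))
print(best_response(prior=0.0, next_builds=False))
```

Under these assumed numbers, the researcher discloses only when the next generation is expected to build on the work, which illustrates how the disclosure incentive can hinge on the behavior of follow-on researchers.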
We draw out the equilibrium implications of this choice between secrecy and disclosure and focus on three potential outcomes: (1) “Open Science,” in which each generation invests in access to prior knowledge, chooses a constant level of investment, and discloses knowledge to the next generation; (2) “Secrecy,” in which each generation does not build on the knowledge produced by the prior generation, chooses a constant level of investment, and chooses not to disclose the knowledge produced to the subsequent generation; and (3) a k-period “cycle” equilibrium, in which a single period of “Secrecy” is followed by k − 1 periods of “Open Science.”
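The timing pattern of the k-period cycle can be sketched in a few lines of Python. The function name, the choice to start the cycle with the Secrecy period at t = 0, and the restriction to k of at least 2 are assumptions made here for illustration.

```python
def cycle_regimes(k, periods):
    """Regime in each period of a k-period cycle: one Secrecy period, then k - 1 Open Science periods."""
    if k < 2:
        raise ValueError("this sketch assumes k >= 2")
    # Assumption: the cycle is aligned so that period 0 is the Secrecy period.
    return ["Secrecy" if t % k == 0 else "Open Science" for t in range(periods)]


print(cycle_regimes(3, 6))
```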
At least one of these three types of equilibria must exist for any set of parameter values that describes the microeconomic environment. With that said, the feasibility of a given equilibrium depends crucially on the parameters of the economic environment. For example, the viability of Open Science is decreasing in both the cost of accessing knowledge produced by prior generations and in the relative benefits to private exploitation under secrecy versus disclosure. We also examine the role of factors such as the effectiveness of scientific institutions in promoting the effective transfer of knowledge across generations and the marginal cost of research investment. Rather than being grounded in differences in the type of knowledge produced, the model suggests that the feasibility of Open Science depends on the institutional and microeconomic environment in which that knowledge is produced; these parameters are themselves functions of the policy environment.
The model also highlights the potential for multiple equilibria for a given set of parameters, so that the choice between “Open Science” and “Secrecy” is endogenous to the strategic interaction among researchers. When multiple equilibria exist, we are able to rank welfare. Open Science, whenever viable, generates more surplus than any regime involving Secrecy. Moreover, among the set of Open Science equilibria, welfare increases as a function of the level of research investment. Finally, we consider a number of extensions and implications of the model: (1) the potential for knowledge spillovers across multiple generations (relaxing our assumption in the baseline model that spillovers only occur across immediately adjacent research generations), (2) the potential for hysteresis (is it more difficult to establish Open Science as an equilibrium than to maintain that equilibrium once it is established?), and (3) the role of formal intellectual property rights such as patents. The contribution of this paper is to isolate the equilibrium implications of the trade-off that arises for each research generation between secrecy and disclosure and to assess the implications of this equilibrium for welfare. By doing so, we contribute to two rapidly emerging literatures. First, building on Dasgupta and David (1994), several recent papers focus on the microeconomic conditions supporting “Open Science” as an economic institution (among others, Stern, 2004, Aghion et al., 2005, Lacetera, 2008; and Gans et al., 2008). At the same time, an emerging literature focuses on the incentives for knowledge disclosure by firms and on the interaction between trade secrecy and other mechanisms for earning returns from research investments (Horstman et al., 1985, Arora, 1995, Anton and Yao, 2004, Lerner and Tirole, 2005, Kultti et al., 2007).
This paper complements these contributions by focusing on the strategic impact of disclosure when the returns from knowledge production accrue both from citations from follow-on researchers and from traditional commercial returns.
The remainder of the paper is organized as follows. The next section introduces the basic model structure. Section 3 derives the equilibrium of the model and discusses how the set of equilibria may change in different economic environments. Section 4 considers several short extensions to the baseline model. A final section concludes. All proofs are in Appendix A.
The model
We consider an overlapping generations framework. In each generation, a single researcher is born, and each generation lives for two periods. There is an infinite sequence of researchers; letting the $t$th generation be denoted as $G_t$ ($t = 0, \pm 1, \pm 2, \ldots$), the generations $G_{t-1}$ (currently in their second period of life) and $G_t$ (currently in their first period of life) therefore coexist.
At the beginning of the first period of life, researcher $G_t$ makes three choices: (1) whether to invest in accessing
The game
As stated above, the economy is composed of overlapping generations $G_t$ ($t = 0, \pm 1, \pm 2, \ldots$). Each generation seeks to maximize its individual payoff by choosing a triplet $(a_t, x_t, d_t)$ conditional on the observed value of $z_t$ and the strategy of $G_{t+1}$. Since $d_t$ and $a_t$ are binary, it is useful to (1) evaluate the optimal value of $x_t$, conditional on each potential combination of $d_t$ and $a_t$, and (2) choose the triplet that yields the highest
Extensions
Our analysis attempts to identify some of the key trade-offs associated with maintaining Open Science as an equilibrium over multiple research generations. In particular, the model focuses attention on the interdependence between the decision to build on knowledge from prior generations, the incentives for research investment, and the decision to disclose knowledge upon which subsequent generations might themselves build. To focus on these properties, we adopt several simplifying assumptions.
Conclusion
This paper is motivated by a simple yet important feature of cumulative knowledge production: to build upon prior discoveries, the knowledge underlying those discoveries must be disclosed and accessible. Whether knowledge gets disclosed to serve as an input into future research is not simply a function of the type of knowledge produced but depends on incentives. Researchers will endogenously choose whether to invest in prior knowledge, how much to invest in knowledge production, and whether to
References (41)
Innovation as overlapping scientific and technological trajectories: exploring tissue engineering. Research Policy (2002).
et al., Do formal intellectual property rights hinder the free flow of scientific knowledge? An empirical test of the anti-commons hypothesis. Journal of Economic Behavior and Organization (2007).
Why do firms do basic research (with their own money). Research Policy (1990).
et al., Little patents and big secrets: managing intellectual property. RAND Journal of Economics (2004).
et al., A model of growth through creative destruction. Econometrica (1992).
et al., Academic Freedom, Private-Sector Focus, and the Process of Innovation.
Licensing tacit knowledge: intellectual property rights and the market for know-how. Economics of Innovation and New Technology (1995).
Economics of welfare and the allocation of resources for invention.
et al., Withholding research results in academic life science: evidence from a National Survey of Faculty. Journal of the American Medical Association (1997).
et al., Towards a new economics of science. Research Policy (1994).
Common agency contracting and the emergence of Open Science institutions. AEA Papers and Proceedings.
Patronage, reputation, and common agency contracting in the scientific revolution: from keeping ‘nature's secrets’ to the institutionalization of ‘Open Science’.
The economics of scientific research coalitions: collaborative network formation in the presence of multiple funding agencies.
Climbing atop the shoulders of giants: the impact of institutions on cumulative research.
The Private Science of Louis Pasteur.
Overlapping generation games with mixed strategies. Mathematics of Operations Research.
Innovation and Growth in the Global Economy.
Patents as information transfer mechanisms: to patent or (maybe) not to patent. Journal of Political Economy.
Cited by (39)
Papers with code or without code? Impact of GitHub repository usability on the diffusion of machine learning research
2023, Information Processing and Management
Why do firms publish? A systematic literature review and a conceptual framework
2022, Research Policy
Incentive or disincentive for research data disclosure? A large-scale empirical analysis and implications for open science policy
2021, International Journal of Information Management
Citation Excerpt: However, another finding that, in the long term, data-disclosing research receives fewer citations than non-data-disclosing research implies that the short-term benefit disappears over time and can even turn into a net disbenefit in the long term. Regarding these findings, the theoretical model of Mukherjee and Stern (2009) implies that the contemporary environment for science might provide a short-term academic incentive for researchers to disclose their research data, but it may not function as desired when it comes to the provision of incentives for data disclosure from a long-term perspective. Our investigation of the systematic difference in the occurrence timing of the credit and competition effects contributes to the scholarly efforts of examining researchers’ data sharing behavior in the information management and policy field.
Open Science now: A systematic literature review for an integrated definition
2018, Journal of Business Research
Citation Excerpt: The research team concludes that Open Science is conceptualised as: Open Science as knowledge: Bisol, Anagnostou, Capocasa, et al. (2014); Bond-Lamberty, Smith, and Bailey (2016); Brown (2009); Caulfield, Harmon, and Joly (2012); Cho and Choi (2013); Cook-Deegan (2007); Czarnitzki, Grimpe, and Pellens (2015); Czarnitzki, Grimpe, and Toole (2015); David (1998, 2004a); Davis, Larsen, and Lotz (2011); Deng (2011); De Roure, Goble, Aleksejevs, et al. (2010); European Commission (2014, 2015b, 2016); European Council (2016); Friesike, Widenmayer, Gassmann, and Schildhauer (2015); Fry, Schroeder, and den Besten (2009); Gorgolewski and Poldrack (2016); Grand, Wilkinson, Bultitude, and Winfield (2016); Grand (2015); Hampton, Anderson, Bagby, et al. (2015); Jamali, Nicholas, and Herman (2016); Jong and Slavova (2014); Langlois and Garzarelli (2008); Lasthiotakis, Kretz, and Sá (2015); Leonelli, Spichtinger, and Prainsack (2015); MacLean, Aleksic, Alexa, et al. (2015); McKiernan, Bourne, Brown, et al. (2016); Morzy (2015); Mukherjee and Stern (2009); Nelson (2003); OECD (2014, 2015); Peters (2010a, 2010b); Powell (2016); Rinaldi (2014); Robertson, Ylioja, Williamson, et al. (2014); Schmidt et al. (2016); Shibayama (2015); Stodden (2010); Szkuta and Osimo (2016); Thanos (2014); West (2008); Wolkovich, Regetz, and O'Connor (2012). Open Science as transparent knowledge: European Commission (2015b); European Council (2016); Hampton et al. (2015); Kraker et al. (2011); Leonelli et al. (2015); Lyon (2016); Rentier (2016); Ramjoué (2015); Scheliga and Friesike (2014).
Open access to research data: Strategic delay and the ambiguous welfare effects of mandatory data disclosure
2018, Information Economics and Policy
☆ We thank Joshua Gans, Maria Goltsman, Tapas Kundu, Fiona Murray, Marcin Peski, Luis Vasconcelos, Michael Whinston, the Editor, two anonymous referees, and especially Julien Jamison for extremely helpful comments and suggestions. The first author gratefully acknowledges financial support from Bates White Research Enhancement Grant. The second author gratefully acknowledges funding from NSF Science of Science Policy Grant 0738394. All errors that may remain are ours.