Published November 28, 2023 | Version v62
Dataset Open

Reliance on Science

Creators

  • 1. Cornell University

Description

This dataset contains patent-to-paper citations through 2022 as well as patent-paper pairs (through 2021).  If you use the data, please cite these two articles:

1. M. Marx & A. Fuegi, "Reliance on Science by Inventors: Hybrid Extraction of In-text Patent-to-Article Citations."  forthcoming in Journal of Economics and Management Strategy. (http://doi.org/10.1111/jems.12455)

2. M. Marx & A. Fuegi, "Reliance on Science: Worldwide Front-Page Patent Citations to Scientific Articles" (2020), Strategic Management Journal 41(9):1572-1594. (https://onlinelibrary.wiley.com/doi/full/10.1002/smj.3145)

The datafile containing the citations is _pcs_oa.csv. Each citation has the applicant/examiner flag, a confidence score (1-10), a field recording whether the reference appeared a) only on the front page, b) only in the body text, or c) in both, and an indicator for a self-citation (i.e., one of the authors is an inventor on the patent). There are two "shorthand" files, _pcs_countsbypatent.csv and _pcs_countsbypaper.csv, which collapse these citations to the patent level and the paper level, respectively, by citation type.
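As a rough illustration of how the citation file relates to the two shorthand files, the sketch below filters citations by confidence score and collapses them to per-patent counts by citation location. The column names (`patent`, `confscore`, `wherefound`), the location codes, and the sample rows are assumptions made for illustration only; check the header of _pcs_oa.csv and the PDF documentation for the actual names and codings.

```python
import csv
import io
from collections import Counter


def counts_by_patent(rows, min_conf=1, patent_col="patent",
                     conf_col="confscore", where_col="wherefound"):
    """Count citations per (patent, location) at or above a confidence threshold.

    The default column names are hypothetical; pass the real ones
    taken from the header of the downloaded file.
    """
    counts = Counter()
    for row in rows:
        if int(row[conf_col]) >= min_conf:
            counts[(row[patent_col], row[where_col])] += 1
    return counts


# Tiny made-up sample standing in for _pcs_oa.csv (not real data).
sample = io.StringIO(
    "patent,paper,confscore,wherefound\n"
    "P1,W1,10,frontonly\n"
    "P1,W2,4,bodyonly\n"
    "P1,W3,9,frontonly\n"
    "P2,W1,8,both\n"
)
result = counts_by_patent(csv.DictReader(sample), min_conf=8)
```

For the real file, the same function could be fed `csv.DictReader(open("_pcs_oa.csv", newline=""))`, which streams rows one at a time rather than loading the whole file into memory.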

The datafile containing the patent-paper pairs (PPPs) is _patent_paper_pairs.tsv. These are USPTO only, through 2021. Each PPP has a confidence score and the count of days between the publication of the paper and the filing of the patent. (If the patent is a continuation of another patent, the filing date of the original patent is used.) Also, when a paper is paired with multiple patents, an indicator variable reports whether those patents are continuations or otherwise identical. 
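The day count described above can be sketched as a simple date difference. The function name, argument names, and sign convention (positive when the filing follows the paper) are assumptions for illustration; the PDF documentation is the authority on how the field is actually computed.

```python
from datetime import date


def days_paper_to_filing(paper_published, patent_filed, original_filed=None):
    """Days from paper publication to patent filing.

    If the patent is a continuation, the filing date of the original
    patent (original_filed) is used instead of the patent's own date.
    Positive values mean the filing came after the paper (assumed convention).
    """
    effective = original_filed if original_filed is not None else patent_filed
    return (effective - paper_published).days


# Made-up dates, not taken from the dataset.
lag = days_paper_to_filing(date(2019, 1, 15), date(2019, 3, 1))
lag_continuation = days_paper_to_filing(
    date(2019, 1, 15), date(2021, 6, 1), original_filed=date(2018, 12, 1)
)
```

In the second call the continuation's own filing date is ignored, so the result reflects the original patent, which in this made-up case was filed before the paper appeared.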

(The redistribution of OpenAlex is temporarily removed, but we hope to re-add it soon.)

The above is documented in greater detail in __relianceonscience.pdf.

These data are provided under a Creative Commons Attribution Non-Commercial license. Please contact us regarding commercial use.  Questions & feedback to support@relianceonscience.org.

This work is sponsored by the Alfred P. Sloan Foundation grant #G-2021-16822.

Files

__relianceonscience.pdf

Files (3.7 GB)

Checksum                              Size
md5:e4b606d64bbe02a44fc2e4736c410afe  182.8 kB
md5:4bd74952b31f2c99542c3e51ec193d4d  423.4 MB
md5:da82d6df695e0980753206595d371c53  631.9 MB
md5:af49e972d39b2773b53bb3feedc0d64a  22.1 MB
md5:ee63d694cebfb5539dbe7ed6767f2f70  52.7 MB
md5:8946c21613d92f7342ae917844d0759a  251.7 MB
md5:a0fa762574525d74c9de8fc3563723f6  266.0 MB
md5:c76b3e096253dce1fd356c2a59be30a1  2.0 GB
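Because several of these files are large, it may be worth checking a download against its listed checksum. The following is a minimal sketch using Python's standard `hashlib`; the helper names are hypothetical, and the self-check at the end runs on a throwaway temporary file rather than on any file from this record.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:e4b6...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Self-check on a throwaway file (not one of the dataset files).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_ok = matches_listing(tmp.name, "md5:" + hashlib.md5(b"hello").hexdigest())
demo_bad = matches_listing(tmp.name, "md5:" + "0" * 32)
os.remove(tmp.name)
```

For a real download, `matches_listing(local_path, "md5:...")` would be called with the checksum string copied from the listing above for the corresponding file.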

Additional details

References

  • Marx, Matt and Aaron Fuegi, "Reliance on Science in Patenting: USPTO Front-Page Citations to Scientific Articles" (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3331686)
  • Sinha, Arnab, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang. 2015. An Overview of Microsoft Academic Service (MAS) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW '15 Companion). ACM, New York, NY, USA, 243-246