Early identification of emerging technologies: A machine learning approach using multiple patent indicators

doi:10.1016/j.techfore.2017.10.002

Technological Forecasting and Social Change

Volume 127, February 2018, Pages 291-303

Highlights

• Proposing a machine learning approach to identifying emerging technologies at early stages
• Defining 18 input and 3 output variables from the United States Patent and Trademark Office database
• Employing feed-forward multilayer neural networks to capture nonlinear relationships between input and output variables
• Developing two quantitative indicators to identify trends of a technology's emergingness

Abstract

Patent citation analysis is considered a useful tool for identifying emerging technologies. However, the outcomes of previous methods are likely to reveal no more than current key technologies, since they can only be performed at later stages of technology development due to the time required for patents to be cited (or fail to be cited). This study proposes a machine learning approach to identifying emerging technologies at early stages using multiple patent indicators that can be defined immediately after the relevant patents are issued. First, a total of 18 input and 3 output indicators are extracted from the United States Patent and Trademark Office database. Second, a feed-forward multilayer neural network is employed to capture the complex nonlinear relationships between input and output indicators in a time period of interest. Finally, two quantitative indicators are developed to identify trends of a technology's emergingness over time. Based on this, we also provide practical guidelines for implementing the proposed approach. The case of pharmaceutical technology shows that our approach can facilitate responsive technology forecasting and planning.

Introduction

Emerging technologies are of great interest to a wide range of stakeholders in both industry and government who aim to set up investment-related strategies (Rotolo et al., 2015). The existing literature has shown that patent citation information is useful for measuring the economic value of a technology (Lerner, 1994, Narin et al., 1987). In this respect, many methods – such as cluster analysis, association rule mining, and conjoint analysis – have been employed to identify emerging technologies using patent citation information. However, the outcomes of previous studies are not forward-looking because most have been limited to ex post evaluation which measures past performance, impacts, or consequences (Lee et al., 2016). The value of predictive analysis for identifying emerging technologies has seldom been addressed.

Arguably, the most scientific approaches to identifying emerging technologies use curve fitting techniques (Daim et al., 2006, Shin et al., 2013) and stochastic models (Jang et al., 2017, Lee et al., 2011, Lee et al., 2012, Lee et al., 2016, Lee et al., 2017) to show future-projected trends of a technology by estimating the future citation counts of the relevant patents as a quantitative proxy. Curve fitting techniques using least squares estimation or least absolute deviation fit growth curves to time-series patent citation data and extrapolate those curves beyond the range of the data, whereas stochastic models estimate probability distributions of patent citations in the future by analysing fluctuations observed in historical data. However, the outcomes of these methods are likely to reveal no more than current key technologies, since they can only be performed at later stages of technology development due to the time required for patents to be cited (or fail to be cited) (Haupt et al., 2007). It should be noted that the time lag between citing and cited patents is found to be between 4 and 5 years on average (Verspagen and De Loo, 1999), and the latest patents have naturally less chance to be cited by other patents (Karki, 1997). Moreover, these methods have been criticised due to their reliance on making assumptions about pre-determined growth curves and probability distributions (Jang et al., 2017, Lee et al., 2011, Lee et al., 2012, Lee et al., 2016, Lee et al., 2017, Shin et al., 2013), which are difficult to identify at early stages of technology development and are heterogeneous across technologies. Hence, curve fitting techniques and stochastic models are of little practical assistance in identifying emerging technologies, especially when a technology is at its early stages and there is no historical data (Jang et al., 2017).
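As a rough illustration of the curve fitting idea described above, the following sketch fits a logistic growth curve to cumulative citation counts by ordinary least squares on the logit-transformed data and then extrapolates beyond the observed years. The function names, the assumption that the saturation level (`ceiling`) is known in advance, and the synthetic data are all illustrative choices, not taken from the cited studies.

```python
import math

def fit_logistic(years, cumulative, ceiling):
    """Least-squares fit of y(t) = K / (1 + exp(-(a + b*t))).

    The logit transform z = ln(y / (K - y)) turns the curve into the
    straight line z = a + b*t, which is fitted by ordinary least squares.
    Assumes the ceiling K is known and 0 < y < K for every observation.
    """
    z = [math.log(y / (ceiling - y)) for y in cumulative]
    n = len(years)
    t_mean = sum(years) / n
    z_mean = sum(z) / n
    sxx = sum((t - t_mean) ** 2 for t in years)
    sxz = sum((t - t_mean) * (zi - z_mean) for t, zi in zip(years, z))
    b = sxz / sxx
    a = z_mean - b * t_mean
    return a, b

def predict(t, a, b, ceiling):
    """Evaluate the fitted logistic curve at time t."""
    return ceiling / (1.0 + math.exp(-(a + b * t)))

# Synthetic cumulative citation counts generated from a known curve
# (hypothetical values, used only to show the mechanics).
K = 100.0
years = [0, 1, 2, 3, 4, 5]
observed = [predict(t, -3.0, 0.8, K) for t in years]

a_hat, b_hat = fit_logistic(years, observed, K)
forecast_year_8 = predict(8, a_hat, b_hat, K)  # extrapolation beyond the data
```

The sketch also makes the limitation visible: the fit needs several years of citation history and a pre-determined curve shape, neither of which is available for newly issued patents.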

As a remedy, we propose a machine learning approach to identifying emerging technologies at early stages using multiple patent indicators that can be defined immediately after the relevant patents are issued. Economic and innovation literature has presented a wide range of patent indicators – such as patent family and originality – that may be indicative of the future citation count of patents and, further, of the relevant technology's economic value (Lerner, 1994, Narin et al., 1987). The tenet of this research is that analysis of those patent indicators can provide evidence for a patent's value and, further, for the relevant technology's value in the future. For this, first, a total of 18 input and 3 output indicators are extracted from the United States Patent and Trademark Office (USPTO) database. Second, a feed-forward multilayer neural network, a supervised machine learning technique inspired by attempts to model the neuro-physical structure of the human brain, is employed to capture the complex nonlinear relationships between input and output indicators in a time period of interest. The primary advantage of this method for identifying emerging technologies is its ability to infer a function from observations (Buscema et al., 2017). It should be noted that there is no theoretical understanding of the relationships between those patent indicators, and moreover, the complexity and nonlinearity associated with innovation processes make the design of a certain function impractical (Chen et al., 2012). Finally, two quantitative indicators are developed to identify trends of a technology's emergingness over time. Based on this, we also provide practical guidelines for the implementation of our approach in terms of the choice of machine learning models and model update.
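To make the shape of such a model concrete, here is a minimal sketch of the forward pass of a feed-forward network with 18 inputs and 3 outputs. Only the input and output sizes come from the text; the single hidden layer of 8 units, the tanh and sigmoid activations, the random weights, and all names are assumptions for illustration. The network is untrained, and fitting the weights (for example by backpropagation) is omitted.

```python
import math
import random

# 18 inputs and 3 outputs follow the text; the hidden size is an assumption.
N_INPUTS, N_HIDDEN, N_OUTPUTS = 18, 8, 3

def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(x, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def forward(x, hidden_layer, output_layer):
    """Map 18 patent indicators to 3 output scores in (0, 1)."""
    h = dense(x, hidden_layer[0], hidden_layer[1], math.tanh)
    return dense(h, output_layer[0], output_layer[1], sigmoid)

rng = random.Random(0)
hidden_layer = init_layer(N_INPUTS, N_HIDDEN, rng)
output_layer = init_layer(N_HIDDEN, N_OUTPUTS, rng)

# A hypothetical patent described by 18 indicator values scaled to [0, 1).
patent = [rng.random() for _ in range(N_INPUTS)]
scores = forward(patent, hidden_layer, output_layer)
```

The nonlinear hidden layer is what lets the model represent relationships between indicators that no one has specified in advance, which is the motivation given above.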

We applied the proposed approach to support Korean small and medium-sized high-tech companies in technology forecasting at the request of the Korea Institute of Science and Technology Information (KISTI). We adopted the USPTO database for this research, since it contains the most representative data for analysing international technology (Lee et al., 2013). Our experience showed that the proposed approach can find emerging technologies at early stages, using the limited patent indicators that can be defined and extracted immediately after the relevant patents are issued. Our method also enabled us to perform systematic and continuous monitoring of emerging technologies, yielding high potential benefits at relatively low cost. Moreover, the results of our case study enabled us to identify a way to improve the proposed approach, which we expect to be a useful complementary tool to support experts' decision making in emerging technologies, especially for small and medium-sized high-tech companies. We believe that the systematic process and quantitative outcomes our approach offers can facilitate responsive technology forecasting and planning.

This paper is organised as follows. Section 2 presents the background to our research and Section 3 explains the research framework and methodology, which are then illustrated by a case study on pharmaceutical technology in Section 4. Section 5 provides the guidelines for implementation of our approach. Finally, Section 6 offers our conclusions.

Section snippets

Definitions and characteristics of emerging technologies

Although emerging technologies have been the subject of many previous studies, there is no consensus as to what qualifies a technology to be emergent (Rotolo et al., 2015). As Table 1 reports, the definitions and concepts of emerging technologies presented by a number of studies overlap, but at the same time, point to different characteristics. For instance, Day and Schoemaker (2000) defined an emerging technology as a science-based innovation that has the potential to create a new industry or

Methodology

Fig. 1 shows the overall process of the proposed approach. Given the complexities involved, the proposed approach is designed to be executed in four discrete steps: data collection and pre-processing; defining and extracting patent indicators; assessing the value of patents; and identifying trends of a technology's emergingness.

Overview

We conducted a case study of pharmaceutical technology for three reasons. First, a patent normally equals a product in the pharmaceutical industry so that the technological value of a patent is directly related to its commercial value (Chen and Chang, 2010). Second, patent management activities such as valuation and protection are especially important in this industry compared to those of other industries since the manufacturing process is relatively easy to replicate and can be copied with a

Guidelines for implementation of our approach

Newly developed methods should be carefully deployed in practice. There are many issues to be considered for practical implementation. First, we employed classification models to assess the value of patents since predicting the exact future citation count of a patent is not the focus of our analysis. However, the value of patents can also be assessed by using regression models with such performance metrics as mean absolute error (MAE) and mean absolute percentage error (MAPE). Second, although
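For readers who want to try the regression alternative mentioned above, the two error metrics can be computed as in the following sketch. The function names and the citation counts are hypothetical and serve only to show the arithmetic.

```python
def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, which matters for patents
    that have received no citations.
    """
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical future citation counts and model predictions.
actual_citations = [4, 10, 2, 8]
predicted_citations = [5, 7, 2, 10]

error_mae = mae(actual_citations, predicted_citations)    # 1.5 citations
error_mape = mape(actual_citations, predicted_citations)  # about 20 percent
```

MAE is expressed in citation counts, whereas MAPE is relative to each patent's actual count, so the two can rank models differently when citation counts vary widely.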

Conclusions

This study has proposed a machine learning approach for identifying emerging technologies at early stages using multiple patent indicators that can be defined immediately after the relevant patents are issued. The central tenet of the proposed approach is that patent indicators – such as patent family and originality – can provide evidence for a patent's value and, further, for the relevant technology's value in the future. To this end, a total of 18 input and 3 output indicators were extracted from

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grants funded by the Korea government (MSIP) (No. 2017R1C1B2011434) and supported by the Future Strategic Fund (No. 1.140010.01) of Ulsan National Institute of Science and Technology (UNIST).

Changyong Lee is an associate professor of the School of Management Engineering at Ulsan National Institute of Science and Technology (UNIST). He holds a BS in computer science and industrial engineering from Korea Advanced Institute of Science and Technology, and a PhD in industrial engineering from Seoul National University. His research interests lie in the areas of applied data mining and machine learning techniques, future-oriented technology analysis, robust technology planning,

References (70)

B.S. Aharonson et al.
Mapping the technological landscape: measuring technology distance, technological footprints, and technology evolution
Res. Policy
(2016)
J. Alcacer et al.
Applicant and examiner citations in US patents: an overview and analysis
Res. Policy
(2009)
J. Bessen
The value of US patents by owner and patent characteristics
Res. Policy
(2008)
J. Bode
Decision support with neural networks in the management of research and development: concepts and application to cost estimation
Inf. Manag.
(1998)
A. Breitzman et al.
The emerging clusters model: a tool for identifying emerging technologies across multiple patent systems
Res. Policy
(2015)
M. Buscema et al.
What kind of ‘world order’? An artificial neural networks approach to intensive data mining
Technol. Forecast. Soc. Chang.
(2017)
M. Callon
The state and technical innovation: a case study of the electrical vehicle in France
Res. Policy
(1980)
C.T. Chen et al.
A feedforward neural network with function shape autotuning
Neural Netw.
(1996)
Y.S. Chen et al.
The relationship between a firm's patent quality and its market value—the case of US pharmaceutical industry
Technol. Forecast. Soc. Chang.
(2010)
Y.S. Chen et al.
Utilizing patent analysis to explore the cooperative competition relationship of the two LED companies: Nichia and Osram
Technol. Forecast. Soc. Chang.
(2011)
Y.S. Chen et al.
Nonlinear influence on R&D project performance
Technol. Forecast. Soc. Chang.
(2012)
P. Criscuolo et al.
Does it matter where patent citations come from? Inventor vs. examiner citations in European patents
Res. Policy
(2008)
T.U. Daim et al.
Forecasting emerging technologies: use of bibliometrics and patent analysis
Technol. Forecast. Soc. Chang.
(2006)
H. Ernst
Patent information for strategic technology management
World Patent Inf.
(2003)
R. Genuer et al.
Variable selection using random forests
Pattern Recogn. Lett.
(2010)
M. Gevrey et al.
Review and comparison of methods to study the contribution of variables in artificial neural network models
Ecol. Model.
(2003)
D. Guellec et al.
Applications, grants and the value of patent
Econ. Lett.
(2000)
D. Harhoff et al.
Citations, family size, opposition and the value of patent rights
Res. Policy
(2003)
R. Haupt et al.
Patent indicators for the technology life cycle development
Res. Policy
(2007)
H.J. Jang et al.
Hawkes process-based technology impact analysis
J. Inf. Secur.
(2017)
G.H. Jeong et al.
A qualitative cross-impact approach to find the key technology
Technol. Forecast. Soc. Chang.
(1997)
Y. Jeong et al.
Forecasting technology substitution based on hazard function
Technol. Forecast. Soc. Chang.
(2016)
J. Joung et al.
Monitoring emerging technologies for technology planning using technical keyword based analysis from patent data
Technol. Forecast. Soc. Chang.
(2017)
Y. Ju et al.
Patent-based QFD framework development for identification of emerging technologies and related business models: a case of robot technology in Korea
Technol. Forecast. Soc. Chang.
(2015)
M.M.S. Karki
Patent citation analysis: a policy analysis tool
World Patent Inf.
(1997)
H. Kim et al.
Concentric diversification based on technological capabilities: link analysis of products and technologies
Technol. Forecast. Soc. Chang.
(2017)
S. Lee et al.
Business planning based on technological capabilities: patent analysis for technology-driven roadmapping
Technol. Forecast. Soc. Chang.
(2009)
H. Lee et al.
Technology clustering based on evolutionary patterns: the case of information and communications technologies
Technol. Forecast. Soc. Chang.
(2011)
C. Lee et al.
A stochastic patent citation analysis approach to assessing future technological impacts
Technol. Forecast. Soc. Chang.
(2012)
C. Lee et al.
Novelty-focused patent mapping for technology opportunity analysis
Technol. Forecast. Soc. Chang.
(2015)
C. Lee et al.
Stochastic technology life cycle analysis using multiple patent indicators
Technol. Forecast. Soc. Chang.
(2016)
Z. Ma et al.
Patent application and technological collaboration in inventive activities: 1980–2005
Technovation
(2008)
M. Meyer
Are patenting scientists the better scholars?: an exploratory comparison of inventor-authors with their non-inventing peers in nano-science and technology
Res. Policy
(2006)
F. Narin et al.
Patents as indicators of corporate technological strength
Res. Policy
(1987)
J.D. Olden et al.
An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data
Ecol. Model.
(2004)

Ohjin Kwon is a director of Centre for Future Information R&D at Korea Institute of Science and Technology Information. He obtained a BS and MS in computer science at Kwangwoon University, and a PhD in computer science at University of Seoul. His research areas are information systems, technology intelligence, and patent analysis.

Myeongjung Kim is a PhD student of School of Management Engineering at UNIST. He holds a BS in business administration from UNIST. His research interests include applied data mining and machine learning, technology intelligence, and intellectual property management.

Daeil Kwon is an assistant professor of system design and control engineering at UNIST. He received his PhD in mechanical engineering from the University of Maryland, and his BS in mechanical engineering from Pohang University of Science and Technology. His research interests include prognostics and health management of electronics, reliability modelling, and use condition characterisation.
