A knowledge graph approach for recommending patents to companies

Electronic Commerce Research

Abstract

Online platforms have emerged to facilitate patent transfer between academia and industry, but a recommendation method that matches patents with company needs is missing in the literature. Previous patent recommendation methods were designed mainly for query-driven patent search contexts, where user needs are given. However, company needs are implicit in the patent transfer context. The problem of profiling the needs and recommending patents accordingly remains unsolved. This research proposes a knowledge graph approach to address the problem. The proposed approach defines and constructs a patent knowledge graph to capture the semantic information between keywords in the patent domain. Then, it profiles patents and companies as weighted graphs based on the patent knowledge graph. Finally, it generates recommendations by comparing the weighted graphs based on the graph edit distance measure. During the recommendation process, three recommendation strategies (i.e., supplementary, complementary, and hybrid recommendation strategies) are proposed to profile different company needs and make recommendations accordingly. The proposed approach has been implemented and tested on a knowledge transfer platform in Jiangxi province, P.R. China. A pretest experiment shows that the proposed approach outperforms several baseline methods in terms of precision, recall, F-score, and mean average precision. User feedback from an online experiment further demonstrates the usability and the effectiveness of the proposed approach for recommending patents to companies.

Notes

  1. http://cgjy.jxstc.gov.cn/onlineweb/innovation/patentinfozs?menuId=40000

References

  1. Lee, J.-S., Park, J.-H., & Bae, Z.-T. (2017). The effects of licensing-in on innovative performance in different technological regimes. Research Policy, 46, 485–496. https://doi.org/10.1016/j.respol.2016.12.002

  2. Parker, D. D., & Zilberman, D. (1993). University technology transfers: Impacts on local and US economies. Contemporary Policy Issues, 11, 87. https://doi.org/10.1111/j.1465-7287.1993.tb00382.x

  3. McDevitt, V. L., Mendez-Hinds, J., Winwood, D., Nijhawan, V., Sherer, T., Ritter, J. F., & Sanberg, P. R. (2014). More than money: The exponential impact of academic technology transfer. Technology & Innovation, 16, 75–84. https://doi.org/10.3727/194982414X13971392823479

  4. Roessner, D., Bond, J., Okubo, S., & Planting, M. (2013). The economic impact of licensed commercialized inventions originating in university research. Research Policy, 42, 23–34. https://doi.org/10.1016/j.respol.2012.04.015

  5. Gambardella, A., Giuri, P., & Luzzi, A. (2007). The market for patents in Europe. Research Policy, 36, 1163–1183. https://doi.org/10.1016/j.respol.2007.07.006

  6. Caviggioli, F., & Ughetto, E. (2013). The drivers of patent transactions: Corporate views on the market for patents. R&D Management, 43, 318–332. https://doi.org/10.1111/radm.12016

  7. Kani, M., & Motohashi, K. (2012). Understanding the technology market for patents: New insights from a licensing survey of Japanese firms. Research Policy, 41, 226–235. https://doi.org/10.1016/j.respol.2011.08.002

  8. Muscio, A. (2010). What drives the university use of technology transfer offices? Evidence from Italy. The Journal of Technology Transfer, 35, 181–202. https://doi.org/10.1007/s10961-009-9121-7

  9. Trappey, A. J. C., Trappey, C. V., Wu, C.-Y., Fan, C. Y., & Lin, Y.-L. (2013). Intelligent patent recommendation system for innovative design collaboration. Journal of Network and Computer Applications, 36, 1441–1450. https://doi.org/10.1016/j.jnca.2013.02.035

  10. Lu, J., Wu, D., Mao, M., Wang, W., & Zhang, G. (2015). Recommender system application developments: A survey. Decision Support Systems, 74, 12–32. https://doi.org/10.1016/j.dss.2015.03.008

  11. Ji, X., Gu, X., Dai, F., Chen, J., & Le, C. (2011). Patent collaborative filtering recommendation approach based on patent similarity. In 2011 eighth international conference on fuzzy systems and knowledge discovery (FSKD), pp. 1699–1703. https://doi.org/10.1109/FSKD.2011.6019821

  12. Krestel, R., & Smyth, P. (2013). Recommending patents based on latent topics. In Proceedings of the 7th ACM conference on recommender systems, ACM, New York, NY, USA, pp. 395–398. https://doi.org/10.1145/2507157.2507232

  13. Mahdabi, P., & Crestani, F. (2014). Query-driven mining of citation networks for patent citation retrieval and recommendation. In Proceedings of the 23rd ACM international conference on information and knowledge management, ACM, New York, NY, USA, pp. 1659–1668. https://doi.org/10.1145/2661829.2661899

  14. Oh, S., Lei, Z., Lee, W., & Yen, J. (2014). Recommending missing citations for newly granted patents. In 2014 international conference on data science and advanced analytics (DSAA), pp. 442–448. https://doi.org/10.1109/DSAA.2014.7058110

  15. Oh, S., Lei, Z., Lee, W.-C., Mitra, P., & Yen, J. (2013). CV-PCR: A context-guided value-driven framework for patent citation recommendation. In Proceedings of the 22nd ACM international conference on information and knowledge management, ACM, New York, NY, USA, pp. 2291–2296. https://doi.org/10.1145/2505515.2505659

  16. Friesl, M. (2012). Knowledge acquisition strategies and company performance in young high technology companies. British Journal of Management, 23, 325–343. https://doi.org/10.1111/j.1467-8551.2011.00742.x

  17. Knudsen, M. P. (2007). The relative importance of interfirm relationships and knowledge transfer for new product development success. Journal of Product Innovation Management, 24, 117–138. https://doi.org/10.1111/j.1540-5885.2007.00238.x

  18. Yang, M. C., Su, F., Chang, Y.-H., Lai, K. K., Lin, C. Y., & Chang, H. Y. (2016). “Expand/offense” and “deepen/defense” strategy of patent acquisition for leader and follower: Evidence from drug-eluting stent. In 2016 Portland international conference on management of engineering and technology (PICMET), pp. 1560–1566. https://doi.org/10.1109/PICMET.2016.7806630

  19. Chang, P., Chang, Y., Su, F., Chen, S., & Lai, K. K. (2014). The study on patent acquisition from complementarity and supplementarity: Evidence from Smartphones of Apple and Samsung. In Proceedings of PICMET ’14 conference: Portland international center for management of engineering and technology; Infrastructure and Service Integration, pp. 2996–3003.

  20. Buckley, P. J., Glaister, K. W., Klijn, E., & Tan, H. (2009). Knowledge accession and knowledge acquisition in strategic alliances: The impact of supplementary and complementary dimensions. British Journal of Management, 20, 598–609. https://doi.org/10.1111/j.1467-8551.2008.00607.x

  21. Makri, M., Hitt, M. A., & Lane, P. J. (2010). Complementary technologies, knowledge relatedness, and invention outcomes in high technology mergers and acquisitions. Strategic Management Journal, 31, 602–628.

  22. Wang, Q., Yu, J., & Deng, W. (2019). An adjustable re-ranking approach for improving the individual and aggregate diversities of product recommendations. Electronic Commerce Research, 19, 59–79. https://doi.org/10.1007/s10660-018-09325-4

  23. Jing, N., Jiang, T., Du, J., & Sugumaran, V. (2018). Personalized recommendation based on customer preference mining and sentiment assessment from a Chinese e-commerce website. Electronic Commerce Research, 18, 159–179. https://doi.org/10.1007/s10660-017-9275-6

  24. Wang, Q., Ma, J., Liao, X., & Du, W. (2017). A context-aware researcher recommendation system for university-industry collaboration on R&D projects. Decision Support Systems, 103, 46–57. https://doi.org/10.1016/j.dss.2017.09.001

  25. Xu, Y., Zhou, D., & Ma, J. (2019). Scholar-friend recommendation in online academic communities: An approach based on heterogeneous network. Decision Support Systems, 119, 1–13. https://doi.org/10.1016/j.dss.2019.01.004

  26. Jeong, H. J., & Kim, M. H. (2019). HGGC: A hybrid group recommendation model considering group cohesion. Expert Systems with Applications, 136, 73–82. https://doi.org/10.1016/j.eswa.2019.05.054

  27. Feng, S., Zhang, H., Wang, L., Liu, L., & Xu, Y. (2019). Detecting the latent associations hidden in multi-source information for better group recommendation. Knowledge-Based Systems, 171, 56–68. https://doi.org/10.1016/j.knosys.2019.02.002

  28. Trappey, A. J. C., Trappey, C. V., Wu, C., Fan, C. Y., & Lin, Y. (2012). Intelligent recommendation methodology and system for patent search. In Proceedings of the 2012 IEEE 16th international conference on computer supported cooperative work in design (CSCWD), pp. 172–178. https://doi.org/10.1109/CSCWD.2012.6221815

  29. Wang, Q., Du, W., Ma, J., & Liao, X. (2019). Recommendation mechanism for patent trading empowered by heterogeneous information networks. International Journal of Electronic Commerce, 23, 147–178. https://doi.org/10.1080/10864415.2018.1564549

  30. He, Q., Spangler, W. S., He, B., Chen, Y., & Kato, L. (2012). Prospective client driven technology recommendation. In 2012 annual SRII global conference, pp. 110–119. https://doi.org/10.1109/SRII.2012.23

  31. Fu, T., Lei, Z., & Lee, W. (2015). Patent citation recommendation for examiners. In 2015 IEEE international conference on data mining, pp. 751–756. https://doi.org/10.1109/ICDM.2015.151

  32. Deng, N., Chen, X., & Li, D. (2017). Intelligent recommendation of Chinese traditional medicine patents supporting new medicine’s R&D. Journal of Computational and Theoretical Nanoscience, 13(9), 5907–5913. https://doi.org/10.1166/jctn.2016.5505

  33. Chen, Y.-L., & Chiu, Y.-T. (2011). An IPC-based vector space model for patent retrieval. Information Processing & Management, 47, 309–322. https://doi.org/10.1016/j.ipm.2010.06.001

  34. Helmers, L., Horn, F., Biegler, F., Oppermann, T., & Müller, K.-R. (2019). Automating the search for a patent’s prior art with a full text similarity search. PLoS ONE, 14, e0212103. https://doi.org/10.1371/journal.pone.0212103

  35. Liu, S.-H., Liao, H.-L., Pi, S.-M., & Hu, J.-W. (2011). Development of a patent retrieval and analysis platform—A hybrid approach. Expert Systems with Applications, 38, 7864–7868. https://doi.org/10.1016/j.eswa.2010.12.114

  36. Kim, D., Seo, D., Cho, S., & Kang, P. (2019). Multi-co-training for document classification using various document representations: TF–IDF, LDA, and Doc2Vec. Information Sciences, 477, 15–29. https://doi.org/10.1016/j.ins.2018.10.006

  37. Schuhmacher, M., & Ponzetto, S. P. (2014). Knowledge-based graph document modeling. In Proceedings of the 7th ACM international conference on web search and data mining, Association for Computing Machinery, New York, NY, USA, pp. 543–552. https://doi.org/10.1145/2556195.2556250

  38. Manrique, R., & Mariño, O. (2018). Knowledge graph-based weighting strategies for a scholarly paper recommendation scenario. In KaRS@RecSys.

  39. Watford, S. M., Grashow, R. G., De La Rosa, V. Y., Rudel, R. A., Friedman, K. P., & Martin, M. T. (2018). Novel application of normalized pointwise mutual information (NPMI) to mine biomedical literature for gene sets associated with disease: Use case in breast carcinogenesis. Computational Toxicology, 7, 46–57. https://doi.org/10.1016/j.comtox.2018.06.003

  40. Manrique, R., Cueto-Ramirez, F., & Mariño, O. (2018). Comparing graph similarity measures for semantic representations of documents. https://doi.org/10.1007/978-3-319-98998-3_13

  41. Lin, C.-J. (2007). Projected gradient methods for nonnegative matrix factorization. Neural Computation, 19, 2756–2779. https://doi.org/10.1162/neco.2007.19.10.2756

  42. Wang, D., Liang, Y., Xu, D., Feng, X., & Guan, R. (2018). A content-based recommender system for computer science publications. Knowledge-Based Systems, 157, 1–9. https://doi.org/10.1016/j.knosys.2018.05.001

  43. Zheng, N., Song, S., & Bao, H. (2015). A temporal-topic model for friend recommendations in Chinese microblogging systems. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 45, 1245–1253. https://doi.org/10.1109/TSMC.2015.2391262

  44. Kong, X., Mao, M., Wang, W., Liu, J., & Xu, B. (2019). VOPRec: Vector representation learning of papers with text information and structural identity for recommendation. IEEE Transactions on Emerging Topics in Computing. https://doi.org/10.1109/TETC.2018.2830698

  45. Shani, G., & Gunawardana, A. (2011). Evaluating recommendation systems. In F. Ricci, L. Rokach, B. Shapira, & P. B. Kantor (Eds.), Recommender systems handbook (pp. 257–297). Boston, MA: Springer. https://doi.org/10.1007/978-0-387-85820-3_8

  46. Satta, G., Parola, F., Penco, L., & de Falco, S. E. (2016). Insights to technological alliances and financial resources as antecedents of high-tech firms’ innovative performance. R&D Management, 46, 127–144. https://doi.org/10.1111/radm.12117

  47. Caviggioli, F., De Marco, A., Scellato, G., & Ughetto, E. (2017). Corporate strategies for technology acquisition: Evidence from patent transactions. Management Decision, 55, 1163–1181. https://doi.org/10.1108/MD-04-2016-0220

  48. Liu, D., Lian, J., Wang, S., Qiao, Y., Chen, J.-H., Sun, G., & Xie, X. (2020). KRED: Knowledge-aware document representation for news recommendations. In Fourteenth ACM conference on recommender systems, Association for Computing Machinery, New York, NY, USA, pp. 200–209. https://doi.org/10.1145/3383313.3412237

  49. Gao, W., Peng, M., Wang, H., Zhang, Y., Xie, Q., & Tian, G. (2019). Incorporating word embeddings into topic modeling of short text. Knowledge and Information Systems, 61, 1123–1145. https://doi.org/10.1007/s10115-018-1314-7

  50. Dong, C., Jia, H., & Wang, C. (2018). Unsupervised learning for semantic representation of short text. In 2018 5th IEEE international conference on cloud computing and intelligence systems (CCIS), pp. 475–478. https://doi.org/10.1109/CCIS.2018.8691363

  51. Jipeng, Q., Zhenyu, Q., Yun, L., Yunhao, Y., & Xindong, W. (2019). Short text topic modeling techniques, applications, and performance: A survey. arXiv:1904.07695 [cs]. http://arxiv.org/abs/1904.07695. Accessed October 9, 2019.

  52. Huang, L., Wang, C., Chao, H., Lai, J., & Yu, P. S. (2019). A score prediction approach for optional course recommendation via cross-user-domain collaborative filtering. IEEE Access, 7, 19550–19563. https://doi.org/10.1109/ACCESS.2019.2897979

  53. Maksai, A., Garcin, F., & Faltings, B. (2015). Predicting online performance of news recommender systems through richer evaluation metrics. In Proceedings of the 9th ACM conference on recommender systems, ACM, New York, NY, USA, pp. 179–186. https://doi.org/10.1145/2792838.2800184

  54. Sauro, J., & Lewis, J. R. (2016). Chapter 9—Six enduring controversies in measurement and statistics. In J. Sauro & J. R. Lewis (Eds.), Quantifying the user experience (2nd edn) (pp. 249–276). Boston: Morgan Kaufmann. https://doi.org/10.1016/B978-0-12-802308-2.00009-6

  55. Nonparametric statistics: An introduction. (2009). In Nonparametric statistics for non-statisticians (pp. 1–11). Wiley. https://doi.org/10.1002/9781118165881.ch1

  56. Porteous, I., Newman, D., Ihler, A., Asuncion, A., Smyth, P., & Welling, M. (2008). Fast collapsed Gibbs sampling for latent Dirichlet allocation. In Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, USA, pp. 569–577. https://doi.org/10.1145/1401890.1401960

  57. Yuan, H., Lau, R. Y. K., & Xu, W. (2016). The determinants of crowdfunding success: A semantic text analytics approach. Decision Support Systems, 91, 67–76. https://doi.org/10.1016/j.dss.2016.08.001


Acknowledgments

This work is partially supported by the China Postdoctoral Science Foundation (Project No. 2020M682757) and the Guangzhou Science and Technology Plan Project (Project No. 202002030384).

Author information

Corresponding author

Correspondence to Weiwei Deng.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1. Algorithm for computing the graph edit distance

figure a

Appendix 2. Algorithm for constructing the complementary profile

figure b

Appendix 3. Algorithm for learning the preference weights

figure c

Appendix 4. Introduction of the VSM-based recommendation method

The VSM-based recommendation method uses TF-IDF to extract keywords and their weights from documents. TF-IDF assumes that a document is represented by a collection of words and a word is important for the document if it repeatedly appears in the document but rarely appears in other documents. Therefore, TF-IDF is defined as follows given a document corpus:

$$TF{-}IDF_{ij} = tf_{ij} \times \log \left( {\frac{N}{{df_{i} + 1}}} \right)$$
(19)

where \(tf_{ij}\) is the number of times word \(i\) appears in document \(j\), \(df_{i}\) is the number of documents that contain word \(i\), and \(N\) is the number of documents in the document corpus. Word \(i\) is considered to be important for document \(j\) only when \(tf_{ij}\) is large and \(df_{i}\) is small.
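Equation (19) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy corpus is hypothetical, documents are assumed to be already tokenized, and the natural logarithm is used because the base of the logarithm is not specified above.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Return one {word: weight} dict per document, following Eq. (19)."""
    n_docs = len(corpus)
    # df[w]: number of documents that contain word w at least once
    df = Counter(word for doc in corpus for word in set(doc))
    weights = []
    for doc in corpus:
        tf = Counter(doc)  # raw term frequency tf_ij
        # Natural log is an assumption; the base is not stated in the text
        weights.append({w: tf[w] * math.log(n_docs / (df[w] + 1)) for w in tf})
    return weights


# Hypothetical toy corpus of tokenized documents
corpus = [
    ["graph", "patent", "graph"],
    ["patent", "company"],
    ["patent", "transfer"],
    ["company", "platform"],
]
weights = tf_idf(corpus)
```

In this toy corpus "graph" is frequent in the first document and absent elsewhere, so it receives the largest weight there, while "patent", which occurs in three of the four documents, receives a weight of zero. Because of the \(+1\) in the denominator, a word that occurs in every document would receive a slightly negative weight.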

After profiling patents and companies as vectors of keyword weights, the cosine similarity between a target company \(C\) and a candidate patent \(P\) is calculated as below:

$$Sim_{VSM} \left( {C,P} \right) = \frac{{\mathop \sum \nolimits_{k = 1}^{n} TF{-}IDF_{k,C} \times TF{-}IDF_{k,P} }}{{\sqrt {\mathop \sum \nolimits_{k = 1}^{n} TF{-}IDF_{k,C}^{2} } \times \sqrt {\mathop \sum \nolimits_{k = 1}^{n} TF{-}IDF_{k,P}^{2} } }}$$
(20)

where \(n\) is the number of keywords extracted from the corpus and \(Sim_{VSM} \left( {C,P} \right)\) is the similarity between \(C\) and \(P\) based on the VSM-based recommendation method.
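A minimal sketch of Eq. (20) follows, assuming that each profile is stored as a sparse dictionary mapping keywords to TF-IDF weights (keywords missing from a profile have weight zero). The profiles and keyword names are hypothetical, and returning 0 for an empty profile is a convention chosen here, not something stated above.

```python
import math


def cosine_similarity(profile_c, profile_p):
    """Cosine similarity between two sparse {keyword: weight} profiles (Eq. 20)."""
    # Only keywords present in both profiles contribute to the dot product
    shared = profile_c.keys() & profile_p.keys()
    dot = sum(profile_c[k] * profile_p[k] for k in shared)
    norm_c = math.sqrt(sum(v * v for v in profile_c.values()))
    norm_p = math.sqrt(sum(v * v for v in profile_p.values()))
    if norm_c == 0.0 or norm_p == 0.0:
        return 0.0  # convention chosen here for an empty profile
    return dot / (norm_c * norm_p)


# Hypothetical keyword-weight profiles
company = {"battery": 3.0, "electrode": 4.0}
patent_a = {"battery": 3.0, "electrode": 4.0}
patent_b = {"antenna": 2.0}
patent_c = {"battery": 1.0, "antenna": 1.0}

sim_a = cosine_similarity(company, patent_a)
sim_b = cosine_similarity(company, patent_b)
sim_c = cosine_similarity(company, patent_c)
```

A patent with the same keyword weights as the company scores 1, a patent sharing no keyword scores 0, and a partial overlap falls in between.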

Appendix 5. Introduction of the LDA-based recommendation method

The basic assumption of LDA is that a document is generated by picking a set of topics and each topic is generated by picking a set of words. Therefore, a document is represented by the distribution of topics, and a topic is described by the distribution of words in LDA. The structure of the LDA model is presented in Fig. 8.

Fig. 8 The structure of the LDA model

Given a corpus consisting of \(M\) documents each of length \(N_{d}\), the topic distribution per document and the word distribution per topic are estimated according to the following generation process: First, choose a topic distribution \(\theta_{d} \sim Dir\left( \alpha \right)\) for each document, where \(d \in \left\{ {1, \ldots ,M} \right\}\) and \(Dir\left( \alpha \right)\) is a Dirichlet distribution with the hyperparameter \(\alpha\). Second, choose a word distribution \(\varphi_{k} \sim Dir\left( \beta \right)\) for each topic, where \(k \in \left\{ {1, \ldots ,K} \right\}\) is the \(k\)-th topic and \(Dir\left( \beta \right)\) is a Dirichlet distribution with the hyperparameter \(\beta\). Third, for each word position in the corpus, choose a topic \(z_{d,l} \sim Multinomial\left( {\theta_{d} } \right)\) and a word \(w_{d,l} \sim Multinomial\left( {\varphi_{{z_{d,l} }} } \right)\), where \(d \in \left\{ {1, \ldots ,M} \right\}\), \(l \in \left\{ {1, \ldots ,N_{d} } \right\}\), \(Multinomial\left( {\theta_{d} } \right)\) is a multinomial distribution with the parameter \(\theta_{d}\), and \(Multinomial\left( {\varphi_{{z_{d,l} }} } \right)\) is a multinomial distribution with the parameter \(\varphi_{{z_{d,l} }}\).
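The three-step generation process can be simulated directly. The sketch below only generates a synthetic corpus from the model; it does not estimate \(\theta_{d}\) or \(\varphi_{k}\). It assumes symmetric Dirichlet priors and integer word ids, and the function names and parameter values are hypothetical.

```python
import random


def sample_dirichlet(rng, concentration, size):
    """Draw from a symmetric Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [x / total for x in draws]


def generate_corpus(doc_lengths, n_topics, vocab_size, alpha, beta, seed=0):
    rng = random.Random(seed)
    # Second step in the text: one word distribution phi_k per topic
    phi = [sample_dirichlet(rng, beta, vocab_size) for _ in range(n_topics)]
    theta, docs = [], []
    for n_d in doc_lengths:
        # First step in the text: one topic distribution theta_d per document
        theta_d = sample_dirichlet(rng, alpha, n_topics)
        words = []
        for _ in range(n_d):
            # Third step: draw a topic z_{d,l}, then a word w_{d,l} from that topic
            z = rng.choices(range(n_topics), weights=theta_d)[0]
            w = rng.choices(range(vocab_size), weights=phi[z])[0]
            words.append(w)
        theta.append(theta_d)
        docs.append(words)
    return theta, phi, docs


# Hypothetical settings: M = 3 documents, K = 2 topics, 6 distinct words
theta, phi, docs = generate_corpus(
    [5, 8, 3], n_topics=2, vocab_size=6, alpha=0.5, beta=0.5
)
```

Inference runs in the opposite direction: given only the observed words, it recovers plausible values of \(\theta_{d}\) and \(\varphi_{k}\).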

The parameters \(\theta_{d}\) and \(\varphi_{k}\) can be obtained by collapsed Gibbs sampling [56] and the number of latent topics \(K\) is experimentally determined based on the perplexity-based approach [57]. The perplexity measure is defined as \(perp\left( D \right) = exp\left( { - \frac{{\mathop \sum \nolimits_{d \in D} \ln P\left( {d{|}\theta ,\varphi } \right)}}{{\mathop \sum \nolimits_{d \in D} \left| d \right|}}} \right)\), where \(D\) is a corpus, \(d\) is a document in the corpus, and \(P\left( {d{|}\theta ,\varphi } \right) = \mathop \prod \limits_{{w_{l} \in d}} \mathop \sum \limits_{k = 1}^{K} P\left( {w_{l} {|}z_{k} } \right) \cdot P\left( {z_{k} {|}d} \right)\). The smaller the perplexity value is, the better the model is trained.
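The perplexity measure can be computed as in the following sketch. It assumes that \(P\left( {w_{l} {|}z_{k} } \right)\) is read from \(\varphi_{k}\) and \(P\left( {z_{k} {|}d} \right)\) from \(\theta_{d}\), and that documents are lists of integer word ids; the toy distributions are hypothetical.

```python
import math


def perplexity(docs, theta, phi):
    """perp(D) = exp(-sum_d ln P(d | theta, phi) / sum_d |d|)."""
    log_likelihood = 0.0
    n_tokens = 0
    for d, words in enumerate(docs):
        for w in words:
            # P(w | d) = sum_k P(w | z_k) * P(z_k | d)
            p_w = sum(phi[k][w] * theta[d][k] for k in range(len(phi)))
            log_likelihood += math.log(p_w)
        n_tokens += len(words)
    return math.exp(-log_likelihood / n_tokens)


# A model that spreads probability uniformly over a 4-word vocabulary
uniform_phi = [[0.25] * 4, [0.25] * 4]
perp_uniform = perplexity([[0, 1, 2], [3, 3]], [[0.5, 0.5], [0.9, 0.1]], uniform_phi)

# A model whose topics concentrate on the words that actually occur
peaked_phi = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]
perp_peaked = perplexity([[0, 0, 0], [3, 3]], [[1.0, 0.0], [0.0, 1.0]], peaked_phi)
```

The uniform model has a perplexity equal to the vocabulary size, while the model that assigns high probability to the observed words has a lower value. When \(K\) is selected this way, perplexity is commonly evaluated on held-out documents for several candidate values of \(K\).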

After characterizing patents and companies as topic distributions, the similarity between a target company \(C\) and a candidate patent \(P\) is calculated based on the Jensen-Shannon divergence measure and is defined as follows:

$$Sim_{LDA} \left( {C,P} \right) = 1 - JSD\left( {\theta_{C} \parallel \theta_{P} } \right)$$
(21)
$$JSD\left( {\theta_{C} \parallel \theta_{P} } \right) = \frac{1}{2}\left[ {\mathop \sum \limits_{k = 1}^{K} \theta_{C,k} \log \frac{{\theta_{C,k} }}{{\theta_{C,P,k} }} + \mathop \sum \limits_{k = 1}^{K} \theta_{P,k} \log \frac{{\theta_{P,k} }}{{\theta_{C,P,k} }}} \right]$$
(22)

where \(Sim_{LDA} \left( {C,P} \right)\) is the similarity between \(C\) and \(P\) based on the LDA-based recommendation method, \(\theta_{C}\) is the topic distribution of \(C\), \(\theta_{P}\) is the topic distribution of \(P\), \(JSD\left( {\theta_{C} \parallel \theta_{P} } \right)\) denotes the Jensen-Shannon divergence between \(\theta_{C}\) and \(\theta_{P}\), \(\theta_{C,P} = \frac{1}{2}\left( {\theta_{C} + \theta_{P} } \right)\) is the average of the two probability distributions, \(K\) is the number of latent topics, and \(\theta_{C,k}\) is the probability of the \(k\)-th topic in \(C\).
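Equations (21) and (22) can be sketched as follows. The base of the logarithm is not stated above; base 2 is assumed here because it bounds the divergence, and therefore the similarity, to the interval from 0 to 1. Terms with zero probability are skipped, following the usual convention that \(0 \log 0 = 0\). The example distributions are hypothetical.

```python
import math


def jsd(theta_c, theta_p, base=2.0):
    """Jensen-Shannon divergence between two topic distributions (Eq. 22)."""
    # theta_{C,P}: element-wise average of the two distributions
    mix = [(c + p) / 2 for c, p in zip(theta_c, theta_p)]

    def kl(dist, ref):
        # Zero-probability terms contribute nothing (limit of x log x as x -> 0)
        return sum(x * math.log(x / y, base) for x, y in zip(dist, ref) if x > 0)

    return 0.5 * (kl(theta_c, mix) + kl(theta_p, mix))


def sim_lda(theta_c, theta_p):
    """Similarity of Eq. (21): one minus the Jensen-Shannon divergence."""
    return 1.0 - jsd(theta_c, theta_p)


same = sim_lda([0.2, 0.8], [0.2, 0.8])
disjoint = sim_lda([1.0, 0.0], [0.0, 1.0])
partial = sim_lda([0.7, 0.3], [0.4, 0.6])
```

Identical topic distributions give a similarity of 1 and distributions with no topic in common give 0. The measure is symmetric in its two arguments.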

Appendix 6. Introduction of the doc2vec-based recommendation method

Doc2vec is an unsupervised method that learns an embedding model from a corpus, which can then be used to generate embedding vectors for new documents. It is an extension of word2vec, a word embedding technique that predicts a word given the other words in its context. Given a sequence of words \(word_{1} ,word_{2} , \ldots ,word_{T}\), word2vec learns a word embedding model by maximizing the average log probability \(\frac{1}{T}\mathop \sum \limits_{t = k + 1}^{T - k} \log p\left( {word_{t} {|}word_{t - k} , \ldots ,word_{t + k} } \right)\), where \(k\) is the number of surrounding words on each side of a target word. The softmax function \(p\left( {word_{t} {|}word_{t - k} , \ldots ,word_{t + k} } \right) = \frac{{e^{{y_{{word_{t} }} }} }}{{\mathop \sum \nolimits_{i} e^{{y_{i} }} }}\) is used to conduct the prediction task. In the softmax function, \(y_{i}\) is the \(i\)-th output value of a feed-forward neural network and is calculated as follows: \(y = b + Uh\left( {word_{t - k} , \ldots ,word_{t + k} ;WE} \right)\), where \(U\) and \(b\) are the softmax parameters, \(h\) is the concatenation or average of the context word vectors, and \(WE\) is the word embedding matrix.

Doc2vec extends word2vec by treating the document as an additional word in the context when predicting words. The only difference between doc2vec and word2vec is the addition of a document embedding matrix \(DE\) in \(y\): \(y = b + Uh\left( {word_{t - k} , \ldots ,word_{t + k} ;WE,DE} \right)\). Therefore, both documents and words are embedded in the same latent space and both embedding vectors contribute to the prediction task. The number of surrounding words and the dimension of the latent space are determined experimentally.
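The prediction step \(y = b + Uh\left( { \cdot ;WE,DE} \right)\) followed by the softmax can be illustrated with a small forward pass. This sketch is not a training procedure: the parameters are random placeholders, \(h\) is taken to be the average (one of the two options named above), and all sizes and ids are hypothetical.

```python
import math
import random


def softmax(scores):
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_word(context_ids, doc_id, WE, DE, U, b):
    """Return p(word | context, document) over the vocabulary, with h = average."""
    # The document vector joins the context as if it were one more word
    vectors = [WE[i] for i in context_ids] + [DE[doc_id]]
    dim = len(vectors[0])
    h = [sum(v[j] for v in vectors) / len(vectors) for j in range(dim)]
    # y = b + U h, one score per vocabulary word
    y = [b_i + sum(u * x for u, x in zip(row, h)) for row, b_i in zip(U, b)]
    return softmax(y)


# Random placeholder parameters (hypothetical sizes): 5 words, 2 documents, 3 dimensions
rng = random.Random(0)
vocab_size, n_docs, dim = 5, 2, 3
WE = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]
DE = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_docs)]
U = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]
b = [0.0] * vocab_size

probs = predict_word(context_ids=[0, 1, 3, 4], doc_id=0, WE=WE, DE=DE, U=U, b=b)
```

Training would adjust \(WE\), \(DE\), \(U\), and \(b\) so that the probability assigned to the true target word increases. For a new document, a fresh document vector is typically fitted while the other parameters are held fixed.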

After training, the doc2vec model can be used to infer the embedding vectors for companies and patents. The similarity between a target company \(C\) and a candidate patent \(P\) is then calculated based on the cosine similarity shown below:

$$Sim_{doc2vec} \left( {C,P} \right) = \frac{{DE_{C} \cdot DE_{P} }}{{\|{DE_{C}} \|}\cdot{\| {DE_{P} }\|}}$$
(23)

where \(DE_{C}\) is the embedding vector of \(C\), \(DE_{P}\) is the embedding vector of \(P\), and \(Sim_{doc2vec} \left( {C,P} \right)\) is the similarity between \(C\) and \(P\) based on the doc2vec-based recommendation method.
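Once the embedding vectors are inferred, Eq. (23) can be used to rank candidate patents for a company. The sketch below assumes dense, nonzero embedding vectors; the patent ids, vectors, and the `top_n` parameter are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two dense, nonzero vectors (Eq. 23)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_patents(company_vec, patent_vecs, top_n=2):
    """Return the top_n (patent id, similarity) pairs, most similar first."""
    scored = [(pid, cosine(company_vec, vec)) for pid, vec in patent_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical two-dimensional embeddings
company_vec = [1.0, 0.0]
patent_vecs = {"P1": [2.0, 0.0], "P2": [0.0, 3.0], "P3": [1.0, 1.0]}
ranking = rank_patents(company_vec, patent_vecs)
```

Because cosine similarity ignores vector length, "P1" ranks first even though its vector is twice as long as the company vector.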

Cite this article

Deng, W., Ma, J. A knowledge graph approach for recommending patents to companies. Electron Commer Res 22, 1435–1466 (2022). https://doi.org/10.1007/s10660-021-09471-2

  • DOI: https://doi.org/10.1007/s10660-021-09471-2
