Concept Extraction Based on Semantic Models Using Big Amount of Patents and Scientific Publications Data

  • Conference paper

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 635)

Abstract

The formalisation of heuristic methods that support the conceptual design stage of product and technology development has advanced considerably in industry over the last half century and is now gradually receiving more formal treatment in academia. Owing to considerable interest from both industry and academia, heuristic approaches such as TRIZ have been strongly developed over the past decades. TRIZ has thus evolved from a set of empirical inventive principles into a considerably more formal approach that includes techniques for modeling technical problems and then overcoming them with formal methods. Moreover, TRIZ has been extensively digitized in recent decades: several generations of software that facilitate the use of inventive methods have appeared (Goldfire, Invention Machine). Given the trend towards digitalisation and the success of machine-driven processes, it can be assumed that the future of invention methods and of formal algorithms for overcoming non-trivial problems lies in Machine Learning and Artificial Intelligence approaches. The authors' position is that the idea of automating invention is extremely attractive, although in the near future digital approaches will complement the intelligence of engineers and scientists rather than replace it. As a preparatory step towards AI-driven invention, we present a semantic model that can form the basis of future approaches while already offering sufficient functionality to support the heuristic stage of technology development. As part of this work, over 8 million patents and scientific publications were analyzed to extract semantic concepts. The model was built using Machine Learning methods and Natural Language Processing techniques, and is followed by a discussion and application examples.

Acknowledgement

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 722176.

Author information

Corresponding author

Correspondence to Vasilii Kaliteevskii.

Copyright information

© 2021 IFIP International Federation for Information Processing

About this paper

Cite this paper

Kaliteevskii, V., Deder, A., Peric, N., Chechurin, L. (2021). Concept Extraction Based on Semantic Models Using Big Amount of Patents and Scientific Publications Data. In: Borgianni, Y., Brad, S., Cavallucci, D., Livotov, P. (eds) Creative Solutions for a Sustainable Development. TFC 2021. IFIP Advances in Information and Communication Technology, vol 635. Springer, Cham. https://doi.org/10.1007/978-3-030-86614-3_11

  • DOI: https://doi.org/10.1007/978-3-030-86614-3_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86613-6

  • Online ISBN: 978-3-030-86614-3

  • eBook Packages: Computer Science, Computer Science (R0)
