Issue 5, 2023

Using GPT-4 in parameter selection of polymer informatics: improving predictive accuracy amidst data scarcity and ‘Ugly Duckling’ dilemma

Abstract

Materials informatics and cheminformatics struggle with data scarcity, which hinders the extraction of significant relationships between structures and properties. The “Ugly Duckling” theorem, which implies that data cannot be processed meaningfully without assumptions or prior knowledge, exacerbates this problem. Current methodologies do not entirely bypass this theorem and may lose accuracy on unfamiliar data. We propose using OpenAI's generative pretrained transformer 4 (GPT-4) language model for explanatory variable selection, leveraging its extensive knowledge and logical reasoning capabilities to embed domain knowledge in tasks that predict structure–property correlations, such as the refractive index of polymers. This approach can partially alleviate the challenges posed by the “Ugly Duckling” theorem and limited data availability.

Article information

Article type: Paper
Submitted: 28 Jul 2023
Accepted: 11 Sep 2023
First published: 12 Sep 2023
This article is Open Access
Creative Commons BY license

Digital Discovery, 2023, 2, 1548–1557

K. Hatakeyama-Sato, S. Watanabe, N. Yamane, Y. Igarashi and K. Oyaizu, Digital Discovery, 2023, 2, 1548 DOI: 10.1039/D3DD00138E

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
