Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jan 5, 2023
Open Peer Review Period: Jan 5, 2023 - Mar 2, 2023
Date Accepted: Mar 19, 2023

The final, peer-reviewed published version of this preprint can be found here:

Understanding Views Around the Creation of a Consented, Donated Databank of Clinical Free Text to Develop and Train Natural Language Processing Models for Research: Focus Group Interviews With Stakeholders

Fitzpatrick NK, Dobson R, Roberts A, Jones K, Shah A, Nenadic G, Ford E

Understanding Views Around the Creation of a Consented, Donated Databank of Clinical Free Text to Develop and Train Natural Language Processing Models for Research: Focus Group Interviews With Stakeholders

JMIR Med Inform 2023;11:e45534

DOI: 10.2196/45534

PMID: 37133927

PMCID: 10193205

Understanding stakeholder views around the creation of a consented donated databank of clinical free text to develop and train natural language processing models for research: an exploratory study

  • Natalie Karit Fitzpatrick
  • Richard Dobson
  • Angus Roberts
  • Kerina Jones
  • Anoop Shah
  • Goran Nenadic
  • Elizabeth Ford

ABSTRACT

Background:

Information stored within electronic health records is often recorded as unstructured text. Specialised natural language processing (NLP) tools are needed to process this text; however, complex governance arrangements make such data in the NHS hard to access, and therefore difficult to use for research on improving NLP methods. The creation of a donated databank of clinical free text could provide an important opportunity for researchers to develop NLP methods and tools, and may circumvent delays in accessing the data needed to train the models. However, to date, there has been little or no engagement with stakeholders on the acceptability and design considerations of establishing a free text databank for this purpose.

Objective:

To ascertain stakeholder views around the creation of a consented, donated databank of clinical free text to help create, train and evaluate NLP tools for clinical research, and to inform potential next steps for adopting a partner-led approach to establishing a national, funded databank of free text for use by the research community.

Methods:

Online in-depth focus group interviews were carried out with four stakeholder groups (patients and public, clinicians, information governance leads and research ethics members, and NLP researchers).

Results:

All stakeholder groups were strongly in favour of the databank and saw great value in creating an environment where NLP tools can be tested and trained to improve their accuracy. Participants highlighted a range of complex issues for consideration as the databank is developed, including how to communicate its intended purpose, how to access and safeguard the data, who should have access, and how to fund the databank. Participants recommended that a small-scale, gradual approach be adopted to start gathering donations, and encouraged further engagement with stakeholders to develop a roadmap and set of standards for the databank.

Conclusions:

These findings provide a clear mandate to begin developing the databank, and a framework of stakeholder expectations that we would aim to meet in delivering it.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.
