Methods Inf Med 2009; 48(02): 149-154
DOI: 10.3414/ME9213
Original Articles
Schattauer GmbH

Automatic Acquisition of Synonym Resources and Assessment of their Impact on the Enhanced Search in EHRs

N. Grabar (1, 2, 3), P.-C. Varoutas (4), P. Rizand (4), A. Livartowski (4), T. Hamon (5)

1   Centre de Recherche des Cordeliers, Université Paris Descartes, Paris, France
2   INSERM, U872, Paris, France
3   HEGP AP-HP, Paris, France
4   Institut Curie, Département d’Information Médicale, Paris, France
5   LIPN-UMR 7030, Université Paris-Nord – CNRS, Villetaneuse, France

Publication History

18 February 2009

Publication Date: 17 January 2018 (online)

Summary

Objective: Natural language processing (NLP) approaches are not yet commonly used to improve search and exploration of electronic health records (EHRs) within healthcare information systems. One reason is the lack of suitable lexical resources: supporting such tasks requires collecting or acquiring several types of resources (i.e., morphological, orthographic, and synonymy resources).

Methods: We propose a novel method for the acquisition of synonymy resources. The method is language-independent and relies on the existence of structured terminologies. It makes it possible to uncover hidden synonymy relations between simple words and terms on the basis of their syntactic analysis and the exploitation of their compositionality.
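
The compositionality idea can be illustrated with a minimal sketch. The summary does not give the inference rules, so the rule used here is an assumption: if two terms are known synonyms and share one syntactic component (head or expansion), their differing components are taken as synonym candidates. The `ParsedTerm` type, the `infer_word_synonyms` function, and the English term pairs are all hypothetical and invented for illustration; syntactic analysis is assumed to have been done beforehand.

```python
from typing import NamedTuple


class ParsedTerm(NamedTuple):
    """A two-component term, assumed to be already syntactically analyzed."""
    head: str       # syntactic head of the term
    expansion: str  # modifier attached to the head


def infer_word_synonyms(synonym_terms):
    """Infer synonym candidates between components of synonymous terms.

    Assumed rule (not taken from the paper): when two synonymous terms
    share one component, their remaining components are proposed as synonyms.
    """
    inferred = set()
    for left, right in synonym_terms:
        if left.head == right.head and left.expansion != right.expansion:
            inferred.add(frozenset((left.expansion, right.expansion)))
        elif left.expansion == right.expansion and left.head != right.head:
            inferred.add(frozenset((left.head, right.head)))
        # Pairs that differ in both components yield nothing in this sketch.
    return inferred


# Invented pairs of synonymous terms, standing in for terminology entries.
pairs = [
    (ParsedTerm("tumor", "kidney"), ParsedTerm("tumor", "renal")),
    (ParsedTerm("tumor", "renal"), ParsedTerm("neoplasm", "renal")),
    (ParsedTerm("disease", "heart"), ParsedTerm("disorder", "cardiac")),
]

result = infer_word_synonyms(pairs)
print(sorted(sorted(pair) for pair in result))
# [['kidney', 'renal'], ['neoplasm', 'tumor']]
```

The third pair differs in both head and expansion, so this simplified rule infers nothing from it; a fuller approach would need additional evidence to handle such cases.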

Results: Applied to series of synonymous terms from the French subset of the UMLS, the method shows 99% precision. The overlap between the inferred synonyms and the existing, sparse synonym resources is very low. To better integrate these resources into an EHR search system, we analyzed a sample of clinical queries submitted by healthcare professionals.

Conclusions: Observation of clinical queries shows that healthcare professionals make very little use of the query expansion function and that, when they do, synonymy relations are rarely involved.

 