The final, formatted version of the article will be published soon.
METHODS article
Front. Artif. Intell.
Sec. Medicine and Public Health
Volume 7 - 2024
doi: 10.3389/frai.2024.1450477
This article is part of the Research Topic Data Science and Digital Health Technologies for Personalized Healthcare
Clinical Entity Aware Domain Adaptation in Low Resource Setting for Inflammatory Bowel Disease
Provisionally accepted
- 1 LIIR Lab, Department of Computer Science, KU Leuven, Leuven 3001, Belgium
- 2 Dynamical Systems, Signal Processing and Data Analytics (ESAT-STADIUS), KU Leuven, Leuven 3001, Belgium
The digitization of healthcare records has revolutionized medical research and patient care, with electronic health records (EHRs) containing a wealth of structured and unstructured data. Extracting valuable information from unstructured clinical text presents a significant challenge, necessitating automated tools for efficient data mining. Natural language processing (NLP) methods have been pivotal in this endeavor, aiming to extract crucial clinical concepts embedded within free-form text. Our research addresses the need for robust biomedical entity extraction, focusing specifically on inflammatory bowel disease (IBD). Leveraging novel domain-specific pre-training and entity-aware masking strategies with contrastive learning, we fine-tune a general language model so that it is better adapted to IBD-related information extraction scenarios. Our named entity recognition (NER) tool streamlines the retrieval process, supporting annotation, correction, and visualization functionalities. In summary, we developed a comprehensive pipeline for clinical Dutch NER encompassing an efficient domain adaptation strategy with domain-aware masking and model fine-tuning enhancements, and an end-to-end entity extraction tool, significantly advancing medical record curation and clinical workflows.
Keywords: Entity Aware Pre-training, named entity recognition, Clinical NER Tool, Contrastive learning, inflammatory bowel disease, language modeling, Natural Language Processing, Information Extraction
Received: 17 Jun 2024; Accepted: 26 Dec 2024.
Copyright: © 2024 Francis, Garcia, Uma, Mestdagh, De Moor and Moens. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Sumam Francis, LIIR Lab, Department of Computer Science, KU Leuven, Leuven 3001, Belgium
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.