Multifaceted Natural Language Processing Task-Based Evaluation of Bidirectional Encoder Representations From Transformers Models for Bilingual (Korean and English) Clinical Notes: Algorithm Development and Validation.

JMIR Medical Informatics · Impact Factor 3.1 · CAS Region 3 (Medicine) · Q2 (Medical Informatics) · Pub Date: 2024-10-30 · DOI: 10.2196/52897
Kyungmo Kim, Seongkeun Park, Jeongwon Min, Sumin Park, Ju Yeon Kim, Jinsu Eun, Kyuha Jung, Yoobin Elyson Park, Esther Kim, Eun Young Lee, Joonhwan Lee, Jinwook Choi
{"title":"Multifaceted Natural Language Processing Task-Based Evaluation of Bidirectional Encoder Representations From Transformers Models for Bilingual (Korean and English) Clinical Notes: Algorithm Development and Validation.","authors":"Kyungmo Kim, Seongkeun Park, Jeongwon Min, Sumin Park, Ju Yeon Kim, Jinsu Eun, Kyuha Jung, Yoobin Elyson Park, Esther Kim, Eun Young Lee, Joonhwan Lee, Jinwook Choi","doi":"10.2196/52897","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>The bidirectional encoder representations from transformers (BERT) model has attracted considerable attention in clinical applications, such as patient classification and disease prediction. However, current studies have typically progressed to application development without a thorough assessment of the model's comprehension of clinical context. Furthermore, limited comparative studies have been conducted on BERT models using medical documents from non-English-speaking countries. Therefore, the applicability of BERT models trained on English clinical notes to non-English contexts is yet to be confirmed. To address these gaps in literature, this study focused on identifying the most effective BERT model for non-English clinical notes.</p><p><strong>Objective: </strong>In this study, we evaluated the contextual understanding abilities of various BERT models applied to mixed Korean and English clinical notes. The objective of this study was to identify the BERT model that excels in understanding the context of such documents.</p><p><strong>Methods: </strong>Using data from 164,460 patients in a South Korean tertiary hospital, we pretrained BERT-base, BERT for Biomedical Text Mining (BioBERT), Korean BERT (KoBERT), and Multilingual BERT (M-BERT) to improve their contextual comprehension capabilities and subsequently compared their performances in 7 fine-tuning tasks.</p><p><strong>Results: </strong>The model performance varied based on the task and token usage. First, BERT-base and BioBERT excelled in tasks using classification ([CLS]) token embeddings, such as document classification. BioBERT achieved the highest F1-score of 89.32. Both BERT-base and BioBERT demonstrated their effectiveness in document pattern recognition, even with limited Korean tokens in the dictionary. Second, M-BERT exhibited a superior performance in reading comprehension tasks, achieving an F1-score of 93.77. Better results were obtained when fewer words were replaced with unknown ([UNK]) tokens. Third, M-BERT excelled in the knowledge inference task in which correct disease names were inferred from 63 candidate disease names in a document with disease names replaced with [MASK] tokens. M-BERT achieved the highest hit@10 score of 95.41.</p><p><strong>Conclusions: </strong>This study highlighted the effectiveness of various BERT models in a multilingual clinical domain. 
The findings can be used as a reference in clinical and language-based applications.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e52897"},"PeriodicalIF":3.1000,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11539635/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"JMIR Medical Informatics","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.2196/52897","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MEDICAL INFORMATICS","Score":null,"Total":0}
Citations: 0

Abstract

Background: The bidirectional encoder representations from transformers (BERT) model has attracted considerable attention in clinical applications, such as patient classification and disease prediction. However, current studies have typically progressed to application development without a thorough assessment of the model's comprehension of clinical context. Furthermore, few comparative studies have been conducted on BERT models using medical documents from non-English-speaking countries. Therefore, the applicability of BERT models trained on English clinical notes to non-English contexts is yet to be confirmed. To address these gaps in the literature, this study focused on identifying the most effective BERT model for non-English clinical notes.

Objective: We evaluated the contextual understanding abilities of various BERT models applied to mixed Korean and English clinical notes, with the objective of identifying the model that best understands the context of such documents.

Methods: Using data from 164,460 patients in a South Korean tertiary hospital, we pretrained BERT-base, BERT for Biomedical Text Mining (BioBERT), Korean BERT (KoBERT), and Multilingual BERT (M-BERT) to improve their contextual comprehension and subsequently compared their performance on 7 fine-tuning tasks.
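The paper does not reproduce its training code here. As a rough illustration only, the following minimal sketch shows how such a checkpoint comparison could be set up with the Hugging Face transformers library after continued pretraining on the hospital corpus; the model identifiers, label count, and hyperparameters are our assumptions, not the authors' exact configuration.

```python
# Minimal sketch (not the authors' released code): fine-tune each public
# checkpoint on one downstream classification task and compare the results.
# Model IDs and hyperparameters below are illustrative assumptions.
from transformers import (AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

CHECKPOINTS = {
    "BERT-base": "bert-base-uncased",
    "BioBERT": "dmis-lab/biobert-v1.1",
    "KoBERT": "skt/kobert-base-v1",  # KoBERT ships its own tokenizer package
    "M-BERT": "bert-base-multilingual-cased",
}

def finetune(name, train_ds, eval_ds, num_labels):
    """Fine-tune one checkpoint on a pre-tokenized classification dataset."""
    model = AutoModelForSequenceClassification.from_pretrained(
        CHECKPOINTS[name], num_labels=num_labels)
    args = TrainingArguments(output_dir=f"out/{name}",
                             per_device_train_batch_size=16,
                             num_train_epochs=3)
    trainer = Trainer(model=model, args=args,
                      train_dataset=train_ds, eval_dataset=eval_ds)
    trainer.train()
    return trainer.evaluate()  # eval loss/metrics for this checkpoint
```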

Results: Model performance varied by task and token usage. First, BERT-base and BioBERT excelled in tasks that use the classification ([CLS]) token embedding, such as document classification; BioBERT achieved the highest F1-score of 89.32. Both BERT-base and BioBERT proved effective at document pattern recognition even with few Korean tokens in their vocabularies. Second, M-BERT performed best on reading comprehension tasks, achieving an F1-score of 93.77; results improved when fewer words were replaced with unknown ([UNK]) tokens. Third, M-BERT excelled in the knowledge inference task, in which the correct disease name had to be inferred from 63 candidate disease names in a document whose disease names were replaced with [MASK] tokens; M-BERT achieved the highest hit@10 score of 95.41.
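To make two of the reported effects concrete: the [UNK] finding reflects how much of a mixed Korean-English note a tokenizer discards before the model ever sees it, and hit@10 counts how often the true disease name ranks among a model's top 10 guesses for a [MASK]ed mention. The sketch below is ours, not the authors' evaluation code; `score_candidate` is a hypothetical scorer (for example, the summed log-probability of a candidate's tokens at the [MASK] positions).

```python
# Illustrative sketches of the two effects (not the authors' evaluation code).

def unk_rate(tokenizer, texts):
    """Fraction of tokens a tokenizer maps to [UNK]; a lower rate on mixed
    Korean-English notes is the effect credited for M-BERT's advantage."""
    unk_id = tokenizer.unk_token_id
    unk = total = 0
    for text in texts:
        ids = tokenizer(text, add_special_tokens=False)["input_ids"]
        unk += sum(i == unk_id for i in ids)
        total += len(ids)
    return unk / max(total, 1)

def hit_at_k(examples, score_candidate, k=10):
    """hit@k for the knowledge-inference task: rank the candidate disease
    names for each [MASK]ed document and count a hit when the true name
    appears in the top k. `score_candidate(doc, name)` is hypothetical."""
    hits = sum(
        true_name in sorted(candidates,
                            key=lambda c: score_candidate(doc, c),
                            reverse=True)[:k]
        for doc, true_name, candidates in examples)
    return 100.0 * hits / len(examples)
```

Under this definition, M-BERT's reported hit@10 of 95.41 means the true disease name ranked among the top 10 of the 63 candidates in roughly 95% of documents.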

Conclusions: This study highlighted the effectiveness of various BERT models in a multilingual clinical domain. The findings can be used as a reference in clinical and language-based applications.

Source journal: JMIR Medical Informatics (Medicine - Health Informatics)
CiteScore: 7.90
Self-citation rate: 3.10%
Articles per year: 173
Review time: 12 weeks
Journal description: JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal which focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, ehealth infrastructures and implementation. It has a focus on applied, translational research, with a broad readership including clinicians, CIOs, engineers, industry and health informatics professionals. Published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (Impact Factor 2016: 5.175), JMIR Med Inform has a slightly different scope (emphasizing more on applications for clinicians and health professionals rather than consumers/citizens, which is the focus of JMIR), publishes even faster, and also allows papers which are more technical or more formative than what would be published in the Journal of Medical Internet Research.
Latest articles in this journal:
A Multivariable Prediction Model for Mild Cognitive Impairment and Dementia: Algorithm Development and Validation.
Using Machine Learning to Predict the Duration of Atrial Fibrillation: Model Development and Validation.
Factors Contributing to Successful Information System Implementation and Employee Well-Being in Health Care and Social Welfare Professionals: Comparative Cross-Sectional Study.
Bidirectional Long Short-Term Memory-Based Detection of Adverse Drug Reaction Posts Using Korean Social Networking Services Data: Deep Learning Approaches.
Correlation between Diagnosis-related Group Weights and Nursing Time in the Cardiology Department: A Cross-sectional Study.