Modeling of joint extraction of entity relationships in clinical electronic medical records

IF 7.0. Zone 2 (Medicine). JCR Q1 (Biology). Computers in Biology and Medicine. Pub Date: 2024-09-18. DOI: 10.1016/j.compbiomed.2024.109161
Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0010482524012460
Citations: 0

Abstract

The advancement of medical informatization requires extracting entities and their relationships from electronic medical records. Current research on electronic medical records concentrates mainly on single-entity relationship extraction. Clinical electronic medical records, however, frequently contain complex, overlapping entity relationships, which makes information extraction harder. To address the lack of a clinical medical relationship extraction dataset, this study uses the electronic medical records of 584 patients from one hospital to build a compact dataset. Pipelined relationship extraction models overlook the one-to-many correspondence between entities and relationships; to address this limitation, the paper introduces a cascading relationship extraction model. The model combines the MacBERT pre-trained model, a gated recurrent network, and a multi-head self-attention mechanism to strengthen text feature extraction, and it incorporates adversarial learning to improve robustness. One-to-many relationships between entities are handled as a two-phase task: the main entity is predicted first, followed by the associated objects and their correspondences. This cascade structure lets the model handle intricate entity relationships flexibly, which improves extraction accuracy. In experiments, the model achieves F1-scores of 82.8%, 76.8%, and 88.2% on the DuIE, CHIP-CDEE, and private datasets, respectively, improving on the benchmark model. The findings indicate that the model is applicable in practical domains, particularly in tasks such as biomedical information extraction.
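
The two-phase cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the encoder (MacBERT, gated recurrent network, self-attention) and the adversarial training are omitted, and the start/end probability lists, the 0.5 threshold, the relation names, and the helper `toy_object_scorer` are all hypothetical stand-ins chosen for illustration. The sketch only shows how predicting the main entity first, then the objects per relation, yields several triples for one entity.

```python
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]          # inclusive token indices (start, end)
Triple = Tuple[str, str, str]   # (main entity, relation, object)
Scores = Tuple[List[float], List[float]]  # (start probabilities, end probabilities)


def decode_spans(start_probs: List[float], end_probs: List[float],
                 threshold: float = 0.5) -> List[Span]:
    """Pair every start above the threshold with the nearest end at or after it."""
    spans = []
    for i, p_start in enumerate(start_probs):
        if p_start < threshold:
            continue
        for j in range(i, len(end_probs)):
            if end_probs[j] >= threshold:
                spans.append((i, j))
                break
    return spans


def cascade_extract(tokens: List[str],
                    subject_scores: Scores,
                    object_scorer: Callable[[Span], Dict[str, Scores]],
                    threshold: float = 0.5) -> List[Triple]:
    """Phase 1: decode main entities. Phase 2: decode objects per relation."""
    triples = []
    for s_start, s_end in decode_spans(*subject_scores, threshold):
        subject = " ".join(tokens[s_start:s_end + 1])
        # The second phase is conditioned on the entity found in the first phase.
        for relation, (obj_start, obj_end) in object_scorer((s_start, s_end)).items():
            for o_start, o_end in decode_spans(obj_start, obj_end, threshold):
                obj = " ".join(tokens[o_start:o_end + 1])
                triples.append((subject, relation, obj))
    return triples


def toy_object_scorer(subject_span: Span) -> Dict[str, Scores]:
    """Hypothetical stand-in for a trained second-phase tagger (fixed scores)."""
    if subject_span != (0, 0):
        return {}
    symptom = [0.0, 0.0, 0.0, 0.9, 0.0, 0.8]
    nothing = [0.0] * 6
    return {"has_symptom": (symptom, symptom), "treated_by": (nothing, nothing)}


tokens = ["pneumonia", "presents", "with", "fever", "and", "cough"]
subject_scores = ([0.9, 0.0, 0.0, 0.0, 0.0, 0.0], [0.9, 0.0, 0.0, 0.0, 0.0, 0.0])
triples = cascade_extract(tokens, subject_scores, toy_object_scorer)
print(triples)
# [('pneumonia', 'has_symptom', 'fever'), ('pneumonia', 'has_symptom', 'cough')]
```

Because objects are decoded separately for each relation and each main entity, one entity can take part in several triples, which is the one-to-many case that a pipeline assigning a single relation per entity pair would miss.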

Source journal
Computers in Biology and Medicine
Category: Engineering and Technology, Engineering: Biomedical
CiteScore: 11.70
Self-citation rate: 10.40%
Articles published: 1086
Review time: 74 days
Journal description: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.