OLR-Net: Object Label Retrieval Network for principal diagnosis extraction

Computers in Biology and Medicine (IF 7.0, Tier 2 in Medicine, JCR Q1 Biology) Pub Date: 2024-09-16 DOI: 10.1016/j.compbiomed.2024.109130
Publication type: Journal Article
Full text: https://www.sciencedirect.com/science/article/pii/S0010482524012150
Citations: 0


OLR-Net: Object Label Retrieval Network for principal diagnosis extraction

Background:

Extracting the principal diagnosis from patient discharge summaries is an essential task for the meaningful use of medical data. The extraction process, usually performed by medical staff, is laborious and time-consuming. Although automatic models have been proposed to retrieve principal diagnoses from medical records, the large number of rare diagnoses and the small amount of training data per rare diagnosis pose significant statistical and computational challenges.

Objective:

In this study, we aimed to extract principal diagnoses with limited available data.

Methods:

We proposed OLR-Net (Object Label Retrieval Network) to extract principal diagnoses from discharge summaries. Our approach comprised four stages: semantic extraction, label localization, label retrieval, and recommendation. First, the semantic information of each discharge summary was mapped into the diagnosis set. Then, one-dimensional convolutional neural networks slid across the bottom-up region to localize diagnoses and enrich rare diagnoses. Finally, OLR-Net detected the principal diagnosis in the localized region. The evaluation metrics were the hit ratio (HR), the mean reciprocal rank (MRR), and the area under the receiver operating characteristic curve (AUROC).
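To make the two ranking metrics concrete, here is a minimal sketch of how the hit ratio at k and the mean reciprocal rank are commonly computed for a retrieval task with one true label per record. It is not the authors' evaluation code: the function names, the toy labels (`dx_a`, `dx_b`, ...), and the convention that a missing true label contributes 0 to the MRR are all assumptions made for illustration.

```python
from typing import List, Sequence


def hit_ratio_at_k(ranked: Sequence[Sequence[str]], truth: Sequence[str], k: int) -> float:
    """Fraction of records whose true label appears among the top-k ranked labels."""
    hits = sum(1 for labels, t in zip(ranked, truth) if t in list(labels)[:k])
    return hits / len(truth)


def mean_reciprocal_rank(ranked: Sequence[Sequence[str]], truth: Sequence[str]) -> float:
    """Average of 1/rank of the true label; a missing label contributes 0 (assumed convention)."""
    total = 0.0
    for labels, t in zip(ranked, truth):
        labels = list(labels)
        if t in labels:
            total += 1.0 / (labels.index(t) + 1)
    return total / len(truth)


# Hypothetical toy data: three records, each with a ranked list of candidate diagnoses.
example_ranked: List[List[str]] = [
    ["dx_a", "dx_b", "dx_c"],  # true label ranked 1st
    ["dx_b", "dx_a", "dx_c"],  # true label ranked 2nd
    ["dx_c", "dx_d", "dx_e"],  # true label ranked 3rd
]
example_truth: List[str] = ["dx_a", "dx_a", "dx_e"]

hr_at_1 = hit_ratio_at_k(example_ranked, example_truth, k=1)
hr_at_2 = hit_ratio_at_k(example_ranked, example_truth, k=2)
mrr = mean_reciprocal_rank(example_ranked, example_truth)
```

On this toy data the true label is ranked first, second, and third, so HR@1 is 1/3, HR@2 is 2/3, and the MRR is (1 + 1/2 + 1/3) / 3 = 11/18.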

Results:

A total of 12,788 desensitized (de-identified) discharge summary records were collected from the oncology department at Hainan Hospital of the Chinese People’s Liberation Army General Hospital. We designed five distinct settings based on the number of training records per diagnosis: the full dataset, the top-50 dataset, the few-shot dataset, the one-shot dataset, and the zero-shot dataset. Our model achieved the highest HR@5 (0.8778) and macro-AUROC (0.9851). In the limited-data settings (few-shot and one-shot), the macro-AUROC was 0.9833 and 0.9485, respectively.
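The macro-AUROC reported above is usually obtained by computing a one-vs-rest AUROC per diagnosis and averaging with equal weight per class, which is why it is informative when rare diagnoses matter. The sketch below shows that computation in plain Python under stated assumptions: the abstract does not describe the exact averaging procedure, so the one-vs-rest scheme, the skipping of classes that lack positives or negatives, and all names and toy scores here are hypothetical.

```python
from typing import List, Optional, Sequence


def binary_auroc(scores: Sequence[float], labels: Sequence[int]) -> Optional[float]:
    """AUROC as the probability that a random positive outscores a random negative.

    Ties count as 0.5. Returns None when one of the two groups is empty.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auroc(score_rows: Sequence[Sequence[float]],
                true_labels: Sequence[str],
                classes: Sequence[str]) -> float:
    """Unweighted mean of one-vs-rest AUROCs; undefined classes are skipped (assumption)."""
    aucs: List[float] = []
    for j, cls in enumerate(classes):
        column = [row[j] for row in score_rows]
        onehot = [1 if t == cls else 0 for t in true_labels]
        auc = binary_auroc(column, onehot)
        if auc is not None:
            aucs.append(auc)
    return sum(aucs) / len(aucs)


# Hypothetical toy data: four records scored against three diagnoses.
example_classes = ["dx_a", "dx_b", "dx_c"]
example_scores = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
example_true = ["dx_a", "dx_b", "dx_c", "dx_b"]

example_macro = macro_auroc(example_scores, example_true, example_classes)
```

In this toy case the per-class AUROCs are 1.0, 0.875, and 1.0, so the macro average is 2.875 / 3. Because each class contributes equally, a poorly ranked rare diagnosis lowers the score as much as a poorly ranked common one.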

Conclusions:

OLR-Net shows great potential for extracting principal diagnoses from limited available data through label localization and retrieval.

Source journal
Computers in Biology and Medicine (Engineering and Technology: Biomedical Engineering)
CiteScore: 11.70
Self-citation rate: 10.40%
Articles published: 1086
Review time: 74 days
Journal description: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.