Enhanced ICD-10 code assignment of clinical texts: A summarization-based approach

IF 6.1 | Zone 2 (Medicine) | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Artificial Intelligence in Medicine | Pub Date: 2024-08-20 | DOI: 10.1016/j.artmed.2024.102967
Yaoqian Sun, Lei Sang, Dan Wu, Shilin He, Yani Chen, Huilong Duan, Han Chen, Xudong Lu
Artificial Intelligence in Medicine, Volume 156, Article 102967. Journal Article. https://www.sciencedirect.com/science/article/pii/S0933365724002094
Citation count: 0

Abstract

Background

Assigning International Classification of Diseases (ICD) codes to clinical texts is a common and crucial practice in patient classification, hospital management, and subsequent statistical analysis. Current auto-coding methods mainly treat this task as a multi-label classification problem. Such solutions suffer from a high-dimensional mapping space and excessive redundant information in long clinical texts. To alleviate these problems, we introduce text summarization methods to ICD coding and apply text matching to select ICD codes.

Method

We focus on the tenth revision of the ICD (ICD-10) and design a novel summarization-based approach (SuM) with an end-to-end strategy to efficiently assign ICD-10 codes to clinical texts. In this approach, a knowledge-guided pointer network is proposed to distill and summarize key information in clinical texts precisely. A matching model with a matching-aggregation architecture then aligns the summary with candidate codes, turning the one-vs-all scenario into one-vs-one matching so that the large-label-space obstacle inherent in classification approaches is avoided.
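
The one-vs-one matching idea can be illustrated with a minimal sketch. It is not the authors' model: the bag-of-words cosine similarity below is a toy stand-in for the learned matching-aggregation network, the function names `cosine` and `rank_codes` are hypothetical, and the code descriptions are simplified examples chosen for illustration. The point is only that each candidate code is scored against the summary independently and the codes are then ranked, rather than predicted jointly by a single classifier over the whole label space.

```python
from collections import Counter
from math import sqrt


def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (toy stand-in for a learned matcher)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def rank_codes(summary: str, code_descriptions: dict[str, str], k: int = 10) -> list[tuple[str, float]]:
    """Score the summary against each code description separately, then rank."""
    scored = [(code, cosine(summary, desc)) for code, desc in code_descriptions.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Illustrative candidate codes with simplified descriptions (assumed, not from the paper).
codes = {
    "I10": "essential primary hypertension",
    "E11": "type 2 diabetes mellitus",
    "J18": "pneumonia unspecified organism",
}

# Hypothetical summary text standing in for the pointer network's output.
ranked = rank_codes("type 2 diabetes mellitus with poor glycemic control", codes, k=2)
print(ranked[0][0])
```

Because every code is scored on its own, adding new codes only lengthens the candidate list; it does not change the output dimension of a classifier.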

Result

A total of 12,788 ICD-10-coded discharge summaries from a Chinese hospital were collected to evaluate the proposed approach. Compared with existing methods, the proposed model achieves the best coding results on the TOP-50 dataset, with a Micro AUC of 0.9548, MRR@10 of 0.7977, Precision@10 of 0.0944, and Recall@10 of 0.9439. Results on the FULL dataset remain consistent. In addition, the proposed knowledge encoder and the end-to-end strategy are shown to help the whole model select the most suitable code.
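
For readers unfamiliar with the ranking metrics, the sketch below uses their common textbook definitions; the abstract does not spell out the exact formulas used in the paper, so these are assumptions, and the function names and the example ranking are hypothetical. Under these definitions, a document with a single gold code can reach a Precision@10 of at most 0.1, so a Precision@10 near one tenth of Recall@10 would be expected if most summaries carry one code. That is one possible reading of the reported figures, not something the abstract states.

```python
def precision_at_k(ranked: list[str], gold: set[str], k: int = 10) -> float:
    """Fraction of the top-k predicted codes that are gold codes."""
    return sum(1 for code in ranked[:k] if code in gold) / k


def recall_at_k(ranked: list[str], gold: set[str], k: int = 10) -> float:
    """Fraction of gold codes that appear among the top-k predictions."""
    if not gold:
        return 0.0
    return sum(1 for code in ranked[:k] if code in gold) / len(gold)


def reciprocal_rank_at_k(ranked: list[str], gold: set[str], k: int = 10) -> float:
    """1 / rank of the first gold code within the top k, or 0 if none is found.

    MRR@k is the mean of this value over all documents.
    """
    for rank, code in enumerate(ranked[:k], start=1):
        if code in gold:
            return 1.0 / rank
    return 0.0


# Illustrative ranking of ten candidate codes for one document (assumed data).
ranked = ["E11", "I10", "J18", "K21", "N18", "I25", "J44", "E78", "I50", "A41"]
gold = {"I10"}  # a single gold code, found at rank 2

p10 = precision_at_k(ranked, gold)
r10 = recall_at_k(ranked, gold)
rr10 = reciprocal_rank_at_k(ranked, gold)
print(p10, r10, rr10)
```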

Conclusion

The proposed automatic ICD-10 code assignment approach via text summarization can effectively capture critical information in long clinical texts and improve the performance of ICD-10 coding.

Source journal
Artificial Intelligence in Medicine (Engineering & Technology - Engineering: Biomedical)
CiteScore: 15.00
Self-citation rate: 2.70%
Articles published: 143
Review time: 6.3 months
Journal description: Artificial Intelligence in Medicine publishes original articles from a wide variety of interdisciplinary perspectives concerning the theory and practice of artificial intelligence (AI) in medicine, medically-oriented human biology, and health care. Artificial intelligence in medicine may be characterized as the scientific discipline pertaining to research studies, projects, and applications that aim at supporting decision-based medical tasks through knowledge- and/or data-intensive computer-based solutions that ultimately support and improve the performance of a human care provider.