Yaoqian Sun , Lei Sang , Dan Wu , Shilin He , Yani Chen , Huilong Duan , Han Chen , Xudong Lu
{"title":"增强临床文本的 ICD-10 代码分配:基于总结的方法","authors":"Yaoqian Sun , Lei Sang , Dan Wu , Shilin He , Yani Chen , Huilong Duan , Han Chen , Xudong Lu","doi":"10.1016/j.artmed.2024.102967","DOIUrl":null,"url":null,"abstract":"<div><h3>Background</h3><p>Assigning International Classification of Diseases (ICD) codes to clinical texts is a common and crucial practice in patient classification, hospital management, and further statistics analysis. Current auto-coding methods mainly transfer this task to a multi-label classification problem. Such solutions are suffering from high-dimensional mapping space and excessive redundant information in long clinical texts. To alleviate such a situation, we introduce text summarization methods to the ICD coding regime and apply text matching to select ICD codes.</p></div><div><h3>Method</h3><p>We focus on the tenth revision of the ICD (ICD-10) coding and design a novel summarization-based approach (SuM) with an end-to-end strategy to efficiently assign ICD-10 code to clinical texts. In this approach, a knowledge-guided pointer network is purposed to distill and summarize key information in clinical texts precisely. Then a matching model with matching-aggregation architecture follows to align the summary result with code, tuning the one-vs-all scenario to one-vs-one matching so that the large-label-space obstacle laid in classification approaches would be avoided.</p></div><div><h3>Result</h3><p>The 12,788 ICD-10 coded discharge summaries from a Chinese hospital were collected to evaluate the proposed approach. Compared with existing methods, the purposed model achieves the greatest coding results with Micro AUC of 0.9548, MRR@10 of 0.7977, Precision@10 of 0.0944, and Recall@10 of 0.9439 for the TOP-50 Dataset. Results on the FULL-Dataset remain consistent. Also, the proposed knowledge encoder and applied end-to-end strategy are proven to facilitate the whole model to gain efficacy in selecting the most suitable code.</p></div><div><h3>Conclusion</h3><p>The proposed automatic ICD-10 code assignment approach via text summarization can effectively capture critical messages in long clinical texts and improve the performance of ICD-10 coding of clinical texts.</p></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"156 ","pages":"Article 102967"},"PeriodicalIF":6.1000,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhanced ICD-10 code assignment of clinical texts: A summarization-based approach\",\"authors\":\"Yaoqian Sun , Lei Sang , Dan Wu , Shilin He , Yani Chen , Huilong Duan , Han Chen , Xudong Lu\",\"doi\":\"10.1016/j.artmed.2024.102967\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><h3>Background</h3><p>Assigning International Classification of Diseases (ICD) codes to clinical texts is a common and crucial practice in patient classification, hospital management, and further statistics analysis. Current auto-coding methods mainly transfer this task to a multi-label classification problem. Such solutions are suffering from high-dimensional mapping space and excessive redundant information in long clinical texts. To alleviate such a situation, we introduce text summarization methods to the ICD coding regime and apply text matching to select ICD codes.</p></div><div><h3>Method</h3><p>We focus on the tenth revision of the ICD (ICD-10) coding and design a novel summarization-based approach (SuM) with an end-to-end strategy to efficiently assign ICD-10 code to clinical texts. In this approach, a knowledge-guided pointer network is purposed to distill and summarize key information in clinical texts precisely. Then a matching model with matching-aggregation architecture follows to align the summary result with code, tuning the one-vs-all scenario to one-vs-one matching so that the large-label-space obstacle laid in classification approaches would be avoided.</p></div><div><h3>Result</h3><p>The 12,788 ICD-10 coded discharge summaries from a Chinese hospital were collected to evaluate the proposed approach. Compared with existing methods, the purposed model achieves the greatest coding results with Micro AUC of 0.9548, MRR@10 of 0.7977, Precision@10 of 0.0944, and Recall@10 of 0.9439 for the TOP-50 Dataset. Results on the FULL-Dataset remain consistent. Also, the proposed knowledge encoder and applied end-to-end strategy are proven to facilitate the whole model to gain efficacy in selecting the most suitable code.</p></div><div><h3>Conclusion</h3><p>The proposed automatic ICD-10 code assignment approach via text summarization can effectively capture critical messages in long clinical texts and improve the performance of ICD-10 coding of clinical texts.</p></div>\",\"PeriodicalId\":55458,\"journal\":{\"name\":\"Artificial Intelligence in Medicine\",\"volume\":\"156 \",\"pages\":\"Article 102967\"},\"PeriodicalIF\":6.1000,\"publicationDate\":\"2024-08-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Artificial Intelligence in Medicine\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0933365724002094\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence in Medicine","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0933365724002094","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Enhanced ICD-10 code assignment of clinical texts: A summarization-based approach
Background
Assigning International Classification of Diseases (ICD) codes to clinical texts is a common and crucial practice in patient classification, hospital management, and further statistics analysis. Current auto-coding methods mainly transfer this task to a multi-label classification problem. Such solutions are suffering from high-dimensional mapping space and excessive redundant information in long clinical texts. To alleviate such a situation, we introduce text summarization methods to the ICD coding regime and apply text matching to select ICD codes.
Method
We focus on the tenth revision of the ICD (ICD-10) coding and design a novel summarization-based approach (SuM) with an end-to-end strategy to efficiently assign ICD-10 code to clinical texts. In this approach, a knowledge-guided pointer network is purposed to distill and summarize key information in clinical texts precisely. Then a matching model with matching-aggregation architecture follows to align the summary result with code, tuning the one-vs-all scenario to one-vs-one matching so that the large-label-space obstacle laid in classification approaches would be avoided.
Result
The 12,788 ICD-10 coded discharge summaries from a Chinese hospital were collected to evaluate the proposed approach. Compared with existing methods, the purposed model achieves the greatest coding results with Micro AUC of 0.9548, MRR@10 of 0.7977, Precision@10 of 0.0944, and Recall@10 of 0.9439 for the TOP-50 Dataset. Results on the FULL-Dataset remain consistent. Also, the proposed knowledge encoder and applied end-to-end strategy are proven to facilitate the whole model to gain efficacy in selecting the most suitable code.
Conclusion
The proposed automatic ICD-10 code assignment approach via text summarization can effectively capture critical messages in long clinical texts and improve the performance of ICD-10 coding of clinical texts.
期刊介绍:
Artificial Intelligence in Medicine publishes original articles from a wide variety of interdisciplinary perspectives concerning the theory and practice of artificial intelligence (AI) in medicine, medically-oriented human biology, and health care.
Artificial intelligence in medicine may be characterized as the scientific discipline pertaining to research studies, projects, and applications that aim at supporting decision-based medical tasks through knowledge- and/or data-intensive computer-based solutions that ultimately support and improve the performance of a human care provider.