Yumeng Yang , Hongfei Lin , Zhihao Yang , Yijia Zhang , Di Zhao , Ling Luo
{"title":"LCDL:基于疾病标签共现依赖和LongFormer与医学知识的ICD代码分类。","authors":"Yumeng Yang , Hongfei Lin , Zhihao Yang , Yijia Zhang , Di Zhao , Ling Luo","doi":"10.1016/j.artmed.2024.103041","DOIUrl":null,"url":null,"abstract":"<div><div>Medical coding involves assigning codes to clinical free-text documents, specifically medical records that average over 3,000 markers, in order to track patient diagnoses and treatments. This is typically accomplished through manual assignments by healthcare professionals. To improve efficiency and accuracy while reducing the workload on these professionals, researchers have employed a multi-label classification approach. Since the long-tail phenomenon impacts tens of thousands of ICD codes, whereby only a few codes (representative of common diseases) are frequently assigned, while the majority of codes (representative of rare diseases) are infrequently assigned, this paper presents an LCDL model that addresses the challenge at hand by examining the LongFormer pre-trained language model and the disease label co-occurrence map. To enhance the performance of automated medical coding in the biomedical domain, hierarchies with medical knowledge, synonyms and abbreviations are introduced, improving the medical knowledge representation. Test evaluations are extensively conducted on the benchmark dataset MIMIC-III, and obtained the competitive performance compared to the previous state-of-the-art methods.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"160 ","pages":"Article 103041"},"PeriodicalIF":6.1000,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"LCDL: Classification of ICD codes based on disease label co-occurrence dependency and LongFormer with medical knowledge\",\"authors\":\"Yumeng Yang , Hongfei Lin , Zhihao Yang , Yijia Zhang , Di Zhao , Ling Luo\",\"doi\":\"10.1016/j.artmed.2024.103041\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Medical coding involves assigning codes to clinical free-text documents, specifically medical records that average over 3,000 markers, in order to track patient diagnoses and treatments. This is typically accomplished through manual assignments by healthcare professionals. To improve efficiency and accuracy while reducing the workload on these professionals, researchers have employed a multi-label classification approach. Since the long-tail phenomenon impacts tens of thousands of ICD codes, whereby only a few codes (representative of common diseases) are frequently assigned, while the majority of codes (representative of rare diseases) are infrequently assigned, this paper presents an LCDL model that addresses the challenge at hand by examining the LongFormer pre-trained language model and the disease label co-occurrence map. To enhance the performance of automated medical coding in the biomedical domain, hierarchies with medical knowledge, synonyms and abbreviations are introduced, improving the medical knowledge representation. Test evaluations are extensively conducted on the benchmark dataset MIMIC-III, and obtained the competitive performance compared to the previous state-of-the-art methods.</div></div>\",\"PeriodicalId\":55458,\"journal\":{\"name\":\"Artificial Intelligence in Medicine\",\"volume\":\"160 \",\"pages\":\"Article 103041\"},\"PeriodicalIF\":6.1000,\"publicationDate\":\"2025-02-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Artificial Intelligence in Medicine\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0933365724002835\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence in Medicine","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0933365724002835","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
LCDL: Classification of ICD codes based on disease label co-occurrence dependency and LongFormer with medical knowledge
Medical coding involves assigning codes to clinical free-text documents, specifically medical records that average over 3,000 markers, in order to track patient diagnoses and treatments. This is typically accomplished through manual assignments by healthcare professionals. To improve efficiency and accuracy while reducing the workload on these professionals, researchers have employed a multi-label classification approach. Since the long-tail phenomenon impacts tens of thousands of ICD codes, whereby only a few codes (representative of common diseases) are frequently assigned, while the majority of codes (representative of rare diseases) are infrequently assigned, this paper presents an LCDL model that addresses the challenge at hand by examining the LongFormer pre-trained language model and the disease label co-occurrence map. To enhance the performance of automated medical coding in the biomedical domain, hierarchies with medical knowledge, synonyms and abbreviations are introduced, improving the medical knowledge representation. Test evaluations are extensively conducted on the benchmark dataset MIMIC-III, and obtained the competitive performance compared to the previous state-of-the-art methods.
期刊介绍:
Artificial Intelligence in Medicine publishes original articles from a wide variety of interdisciplinary perspectives concerning the theory and practice of artificial intelligence (AI) in medicine, medically-oriented human biology, and health care.
Artificial intelligence in medicine may be characterized as the scientific discipline pertaining to research studies, projects, and applications that aim at supporting decision-based medical tasks through knowledge- and/or data-intensive computer-based solutions that ultimately support and improve the performance of a human care provider.