Aosheng Cheng, Yan Zhang, Zhiqiang Qian, Xueli Yuan, Sumei Yao, Wenqing Ni, Yijin Zheng, Hongmin Zhang, Quan Lu, Zhiguang Zhao
International Journal of Medical Informatics. Published 2024-07-25. DOI: 10.1016/j.ijmedinf.2024.105567. https://www.sciencedirect.com/science/article/pii/S1386505624002302
Integrating multi-task and cost-sensitive learning for predicting mortality risk of chronic diseases in the elderly using real-world data
Background and Objective
Real-world data encompass population diversity, enabling insights into chronic disease mortality risk among the elderly. Deep learning excels on large datasets, offering promise for real-world data. However, current models focus on single diseases, neglecting comorbidities prevalent in patients. Moreover, mortality is infrequent compared to illness, causing extreme class imbalance that impedes reliable prediction. We aim to develop a deep learning framework that accurately forecasts mortality risk from real-world data by addressing comorbidities and class imbalance.
Methods
We integrated multi-task and cost-sensitive learning, developing an enhanced deep neural network architecture that extends multi-task learning to predict mortality risk across multiple chronic diseases. Each patient cohort with a chronic disease was assigned to a separate task; lower-level parameters were shared across tasks to capture inter-disease complexities, and each task had its own distinct top-level network. Cost-sensitive functions were incorporated so that each task learns the characteristics of the positive class, enabling accurate prediction of the risk of death from multiple chronic diseases.
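The combination of shared lower layers, per-disease top-level networks, and a cost-sensitive objective can be illustrated with a minimal forward-pass sketch. This is not the authors' MTL-CSDNN implementation: the class name SharedBottomNet, the layer sizes, the random initialisation, and the use of a positively weighted binary cross-entropy as the cost-sensitive function are all assumptions chosen for illustration.

```python
import math
import random


def relu(v):
    return [max(0.0, x) for x in v]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear(weights, bias, v):
    """Dense layer: `weights` holds one row of input weights per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def init_layer(rng, n_in, n_out):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class SharedBottomNet:
    """Hypothetical shared-bottom multi-task network (forward pass only)."""

    def __init__(self, n_features, n_hidden, n_tasks, seed=0):
        rng = random.Random(seed)
        # Lower-level parameters shared by every disease task.
        self.shared = init_layer(rng, n_features, n_hidden)
        # One distinct top-level head per disease task.
        self.heads = [init_layer(rng, n_hidden, 1) for _ in range(n_tasks)]

    def predict(self, x, task):
        hidden = relu(linear(*self.shared, x))
        logit = linear(*self.heads[task], hidden)[0]
        return sigmoid(logit)


def cost_sensitive_bce(y_true, p_pred, pos_weight):
    """Binary cross-entropy in which errors on positives cost `pos_weight` times more."""
    eps = 1e-12
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Illustrative values only: 4 made-up features, 3 disease tasks.
net = SharedBottomNet(n_features=4, n_hidden=8, n_tasks=3)
patient = [0.2, -1.0, 0.5, 1.3]
risks = [net.predict(patient, task) for task in range(3)]
```

Each task reads the same shared hidden representation but applies its own head, so one patient record yields one risk estimate per disease task. Raising pos_weight increases the penalty for a missed death while leaving the loss on negative examples unchanged.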
Results
Our study covers 15 prevalent chronic diseases and was evaluated on real-world data from 482,145 patients (including 9,516 deaths) in Shenzhen, China. The proposed model, MTL-CSDNN, was compared with six baselines: three machine learning models (logistic regression, XGBoost, and CatBoost) and three state-of-the-art deep learning models (1D-CNN, TabNet, and SAINT). On the test set, MTL-CSDNN achieved better prediction results than all six baselines (ACC = 0.99, REC = 0.99, PRAUC = 0.97, MCC = 0.98, G-mean = 0.98).
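To see why recall, MCC, and G-mean are reported alongside accuracy, the sketch below computes these metrics from confusion-matrix counts. The function name confusion_metrics is hypothetical, G-mean is assumed to be the geometric mean of sensitivity and specificity, and the full-cohort counts are used purely for illustration since the test-set split is not given here. PRAUC is omitted because it requires predicted scores rather than counts.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Return ACC, REC, MCC and G-mean from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total
    rec = tp / (tp + fn) if (tp + fn) else 0.0    # sensitivity
    spec = tn / (tn + fp) if (tn + fp) else 0.0   # specificity
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    g_mean = math.sqrt(rec * spec)
    return {"ACC": acc, "REC": rec, "MCC": mcc, "G-mean": g_mean}


# A trivial classifier that predicts "survives" for every patient in the cohort.
patients, deaths = 482_145, 9_516
always_survive = confusion_metrics(tp=0, fp=0, tn=patients - deaths, fn=deaths)
```

With only about 2% of patients in the positive class, this trivial classifier reaches an accuracy of about 0.98 while its recall, MCC, and G-mean are all 0, which is why imbalance-aware metrics matter for this task.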
Conclusions
Our method provides valuable insights into leveraging real-world data for precise multi-disease mortality risk prediction, offering potential applications in optimizing chronic disease management, enhancing well-being, and reducing healthcare costs for the elderly population.
About the journal:
International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings.
The scope of the journal covers:
Information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration, etc.;
Computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.;
Educational computer-based programs pertaining to medical informatics or medicine in general;
Organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.