Shuzhi Su, Jisheng Gao, Jingjing Dong, Qi Guo, Hualin Ma, Shaodong Luan, Xuejia Zheng, Huihui Tao, Lingling Zhou, Yong Dai
Prediction of mortality in hemodialysis patients based on autoencoders
International Journal of Medical Informatics, Volume 195, Article 105744. Published 2024-12-01. DOI: 10.1016/j.ijmedinf.2024.105744
https://www.sciencedirect.com/science/article/pii/S1386505624004076
Abstract
Background
Patients with end-stage renal disease (ESRD) undergoing hemodialysis (HD) exhibit a high mortality risk, particularly at the onset of treatment. Conventional risk assessment models, dependent on extensive temporal data accumulation, frequently encounter issues of data incompleteness and lengthy collection periods.
Objective
This study addresses the imbalance in short-term HD data and the issue of missing data features, achieving a robust assessment of mortality risk for HD patients over the subsequent 30 to 450 days.
Methods
An autoencoder-based mortality prediction model for HD patients is proposed. Leveraging the manifold structure of the non-missing features and the intrinsic relationship between the features in the high-dimensional data space, the model infers the values of the missing features. Noise and redundant information typically distort the manifold structure, impacting the accuracy of inferences about missing features. Consequently, we generate feature dropping masks to simulate the missing data distribution in the deep learning framework and design an autoencoder, forming an adaptive feature extraction module. The module utilizes readily available short-term data for unsupervised learning, enabling the encoder to reconstruct missing features and derive latent representations. Finally, a classifier based on the latent representations achieves the mortality prediction.
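The feature-dropping step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`make_drop_mask`, `apply_mask`, `masked_mse`), the zero fill value, and the drop rate are all assumptions chosen for illustration. The idea shown is that some observed features are hidden at random so a reconstruction model has to infer them, and that the reconstruction error is scored only on features that were actually measured.

```python
import random


def make_drop_mask(observed, drop_rate, rng):
    """Hide a random subset of the observed features.

    observed[i] is True when feature i was actually measured.
    The returned mask[i] is True when feature i is kept as model input.
    Features that were never observed are always hidden.
    """
    return [obs and rng.random() >= drop_rate for obs in observed]


def apply_mask(values, mask, fill=0.0):
    """Replace every hidden feature with a fill value (assumed 0.0 here)."""
    return [v if keep else fill for v, keep in zip(values, mask)]


def masked_mse(reconstruction, target, observed):
    """Mean squared error computed only over truly observed features."""
    errors = [
        (r - t) ** 2
        for r, t, obs in zip(reconstruction, target, observed)
        if obs
    ]
    return sum(errors) / len(errors) if errors else 0.0


# Hypothetical patient record: the second feature is missing (None).
rng = random.Random(0)
values = [0.8, None, 1.5, 0.2]
observed = [v is not None for v in values]

mask = make_drop_mask(observed, drop_rate=0.3, rng=rng)
model_input = apply_mask(values, mask)

# A trained autoencoder would map model_input to a reconstruction;
# here the input itself stands in for that output.
loss = masked_mse(model_input, values, observed)
print(mask, model_input, loss)
```

In a full pipeline, the autoencoder would be trained to minimize this loss over many randomly masked copies of each record, and the encoder output would then feed the classifier.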
Results
Using a 30-day observation window, the model outperformed the comparison models on mortality prediction across all prediction windows. Feature importance analysis showed that creatinine and age are consistently the most critical features across all prediction windows. Glucose (fasting) and platelet count also remain significant, and their correlation with mortality risk increases over time. Serum albumin, international normalized ratio (INR), and phosphate are particularly important for short-term predictions, while conjugated bilirubin and prothrombin time gain prominence in mid- and long-term predictions.
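To make the relationship between the observation window and the prediction windows concrete, here is a small sketch of how binary labels could be built for several horizons. The abstract does not state how labels were defined, so the function `label_for_horizon`, the patient records, and the choice to treat patients without a recorded death as negatives (ignoring censoring) are assumptions. Only the two endpoint horizons named in the abstract, 30 and 450 days, are used.

```python
def label_for_horizon(days_to_death, horizon_days):
    """Return 1 if death occurs within horizon_days after the
    observation window ends, otherwise 0.

    days_to_death is None for patients with no recorded death.
    Censoring is ignored in this simplified sketch.
    """
    if days_to_death is None:
        return 0
    return 1 if days_to_death <= horizon_days else 0


# Endpoint horizons taken from the abstract; intermediate ones are not given.
horizons = [30, 450]

# Hypothetical patients: days from end of observation window to death.
patients = {"A": 12, "B": 200, "C": None}

labels = {
    pid: [label_for_horizon(days, h) for h in horizons]
    for pid, days in patients.items()
}
print(labels)
```

A separate classifier, or a separate evaluation, can then be run per horizon using the same features from the 30-day observation window.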
Conclusion
The proposed model uses short-term HD data to provide precise mortality risk evaluations in HD patients and is particularly effective in the short term. Its application holds considerable value for clinical decision-making and risk management in this patient population.
About the journal:
International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings.
The scope of the journal covers:
Information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration, etc.;
Computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.;
Educational computer-based programs pertaining to medical informatics or medicine in general;
Organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.