Applying stacking ensemble method to predict chronic kidney disease progression in Chinese population based on laboratory information system: a retrospective study.
PeerJ 12:e18436 · IF 2.4 · Zone 3 (Biology) · Q2 (Multidisciplinary Sciences) · Pub Date: 2024-11-01 · eCollection Date: 2024-01-01 · DOI: 10.7717/peerj.18436
Jialin Du, Jie Gao, Jie Guan, Bo Jin, Nan Duan, Lu Pang, Haiming Huang, Qian Ma, Chenwei Huang, Haixia Li
Abstract
Background and objective: Chronic kidney disease (CKD) is a major public health issue, and accurate prediction of progression to kidney failure is critical for clinical decision-making and can help improve patient outcomes. As such, we aimed to develop and externally validate a machine-learned model to predict the progression of CKD using common laboratory variables, demographic characteristics, and an electronic health records database.
Methods: We developed a predictive model using longitudinal clinical data from a single center for Chinese CKD patients. The cohort included 987 patients who were followed up for more than 24 months. Fifty-three laboratory features were considered for inclusion in the model. The primary outcome in our study was an estimated glomerular filtration rate ≤15 mL/min/1.73 m² or kidney failure. Machine learning algorithms were applied to the modeling dataset (n = 296), and an external dataset (n = 71) was used for model validation. We assessed model discrimination via area under the curve (AUC) values, accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score.
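The evaluation measures named in the Methods can be computed from predicted risk scores as in the following minimal sketch. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration only.

```python
# Illustrative only: toy labels and scores, not data from the study.

def auc(y_true, y_score):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(y_true, y_score, threshold=0.5):
    """Threshold the scores (assumed cut-off 0.5) and derive the confusion-matrix metrics."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# 1 = reached kidney failure, 0 = did not (hypothetical patients).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

metrics = classification_metrics(y_true, y_score)
toy_auc = auc(y_true, y_score)
print(metrics, toy_auc)
```

AUC depends only on how the scores rank patients, whereas the other six measures change with the chosen threshold, which is one reason to report them together.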
Results: Over a median follow-up period of 3.75 years, 148 patients experienced kidney failure. The optimal model was based on stacking different classifier algorithms with six laboratory features, including 24-h urine protein, potassium, glucose, urea, prealbumin and total protein. The model had considerable predictive power, with AUC values of 0.896 and 0.771 in the validation and external datasets, respectively. This model also accurately predicted the progression of renal function in patients over different follow-up periods after their initial assessment.
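A stacking ensemble of the kind described in the Results can be sketched with scikit-learn as below. The abstract does not say which base classifiers, meta-learner, or hyperparameters were used, so the random forest, support vector machine, and logistic regression here are assumptions, as are the feature column names. The data are synthetic, and the random split only mimics the reported sample sizes (296 and 71); it is not a true external validation.

```python
# A hedged sketch of stacking, not the authors' model. Data are synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical column names for the six laboratory features in the abstract.
FEATURES = ["urine_protein_24h", "potassium", "glucose",
            "urea", "prealbumin", "total_protein"]

# Synthetic stand-in for a table with one column per feature.
X, y = make_classification(n_samples=367, n_features=len(FEATURES),
                           n_informative=4, n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=71, stratify=y, random_state=0)

# Level-0 learners (assumed choices); the SVM is scaled inside its own pipeline.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svc", make_pipeline(StandardScaler(),
                          SVC(probability=True, random_state=0))),
]

# The level-1 learner is trained on out-of-fold probabilities from the base
# learners (cv=5), so it never sees predictions made on a model's own training rows.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(),
                           cv=5, stack_method="predict_proba")
stack.fit(X_train, y_train)

risk = stack.predict_proba(X_test)[:, 1]   # predicted probability of the outcome
held_out_auc = roc_auc_score(y_test, risk)
print(f"AUC on held-out synthetic data: {held_out_auc:.3f}")
```

Feeding the meta-learner cross-validated predictions rather than in-sample ones is the design choice that keeps a stacked model from simply trusting whichever base learner overfits most.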
Conclusions: A prediction model that leverages routinely collected laboratory features in the Chinese population can accurately identify patients with CKD at high risk of progressing to kidney failure. An online version of the model can be easily and quickly applied in clinical management and treatment.