Predicting Hospitalization in Older Adults Using Machine Learning
Raymundo Buenrostro-Mariscal, Osval A Montesinos-López, Cesar Gonzalez-Gonzalez
Geriatrics, volume 10, issue 1. Published 2025-01-04. Journal Article.
DOI: https://doi.org/10.3390/geriatrics10010006
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11755630/pdf/
Journal impact factor: 2.1 (JCR Q3, Geriatrics & Gerontology)
Abstract
Background/Objectives: Hospitalization among older adults is a growing challenge in Mexico because of the high prevalence of chronic diseases and limited public healthcare resources. This study develops a predictive model for hospitalization by applying the random forest (RF) algorithm to longitudinal data from the Mexican Health and Aging Study (MHAS).

Methods: An RF-based machine learning model was designed and evaluated under different data partition strategies (ST), with and without variable interaction. Variable importance was assessed using both the mean decrease in impurity and permutation importance, to clarify which variables predict hospitalization. Robustness was assessed through a modified nested cross-validation, with sensitivity, specificity, and the kappa coefficient as evaluation metrics.

Results: The model with ST2, incorporating interaction and a 20% test proportion, achieved the best balance between sensitivity (0.7215, standard error ± 0.0038) and specificity (0.4935, standard error ± 0.0039). Variable importance analysis identified functional limitations (e.g., abvd3, 31.1% importance), age (12.75%), and history of cerebrovascular accidents (12.4%) as the strongest predictors. Socioeconomic factors, including education level (12.08%), also emerged as critical predictors, highlighting the model's ability to capture complex interactions between health and socioeconomic variables.

Conclusions: Integrating variable importance analysis enhances the interpretability of the RF model and provides novel insights into the predictors of hospitalization in older adults. These findings underscore the potential for clinical applications, including anticipating hospital demand and optimizing resource allocation. Future research will focus on integrating subgroup analyses for comorbidities and advanced techniques for handling missing data to further improve predictive accuracy.
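The evaluation metrics named in the abstract (sensitivity, specificity, and the kappa coefficient, each reported with a standard error) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy labels, and the per-fold values are hypothetical, labels are assumed to be coded 1 = hospitalized and 0 = not hospitalized, and the standard error is assumed to be the sample standard deviation across repetitions divided by the square root of their number, which the abstract does not specify.

```python
from math import sqrt
from statistics import mean, stdev


def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels, assuming 1 = hospitalized."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fn, fp, tn


def sensitivity(tp, fn):
    """Share of truly hospitalized cases that the model flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of non-hospitalized cases that the model correctly clears."""
    return tn / (tn + fp)


def cohen_kappa(tp, fn, fp, tn):
    """Agreement between predictions and truth, corrected for chance."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column totals of the 2x2 table.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    return (observed - expected) / (1 - expected)


def mean_and_standard_error(values):
    """Mean and standard error of a metric over folds (assumed definition)."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical toy data, not taken from MHAS.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
print("kappa:", cohen_kappa(tp, fn, fp, tn))

# Hypothetical per-fold sensitivities, summarized as mean and standard error.
print(mean_and_standard_error([0.70, 0.72, 0.74]))
```

Reporting sensitivity and specificity side by side, as the abstract does, makes the trade-off visible: a model can raise one at the expense of the other, and kappa summarizes how far overall agreement exceeds what chance alone would produce.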
Journal scope:
• Geriatric biology
• Geriatric health services research
• Geriatric medicine research
• Geriatric neurology, stroke, cognition and oncology
• Geriatric surgery
• Geriatric physical functioning, physical health and activity
• Geriatric psychiatry and psychology
• Geriatric nutrition
• Geriatric epidemiology
• Geriatric rehabilitation