Pub Date: 2025-06-01; Epub Date: 2025-04-08; DOI: 10.1016/j.health.2025.100393
Detcharat Sumrit
Evaluating organizational big data analytics capabilities (BDAC) is crucial for strengthening resilience in healthcare supply chains (HSCs). This study employs an integrated multi-criteria decision-making (MCDM) approach, combining the Decision-Making Trial and Evaluation Laboratory-based Analytic Network Process (DANP) and Multi-Attributive Border Approximation Area Comparison (MABAC) methods in a fuzzy environment. The goal is to assess the interdependence of BDAC components and their impact on resilience within the HSC. The research draws on the organizational information processing (OIP) and knowledge-based view (KBV) theoretical lenses to identify relevant BDAC components. Using a case study in a public hospital, the study yields context-specific insights into the role of big data analytics in fortifying the HSC. The findings contribute to the understanding of supply chain resilience, emphasizing the pivotal role of BDAC in organizational preparedness. This knowledge can guide healthcare sector managers in making informed decisions to enhance overall resilience, allowing organizations to navigate uncertainties and challenges proactively. Ultimately, leveraging insights from this study can foster a more adaptive and resilient HSC, benefiting both patients and stakeholders.
Title: An investigation of the impact of organizational big data analytics capabilities on healthcare supply chain resiliency. Journal: Healthcare analytics (New York, N.Y.), vol. 7, Article 100393.
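The MABAC ranking step named in the abstract can be sketched in a few lines. This is a minimal crisp (non-fuzzy) version, assuming the criterion weights have already been obtained from an earlier stage such as DANP; the function name `mabac_scores` and its arguments are hypothetical and are not taken from the study.

```python
import math

def mabac_scores(matrix, weights, benefit):
    """Crisp MABAC sketch: one score per alternative (row); higher is better.

    matrix[i][j] is the rating of alternative i on criterion j,
    weights[j] is an assumed, externally supplied criterion weight,
    benefit[j] is True for benefit criteria and False for cost criteria.
    """
    n_alt, n_crit = len(matrix), len(weights)
    cols = list(zip(*matrix))
    weighted = []
    for row in matrix:
        w_row = []
        for j, x in enumerate(row):
            lo, hi = min(cols[j]), max(cols[j])
            if hi == lo:
                norm = 0.0  # criterion does not discriminate between alternatives
            elif benefit[j]:
                norm = (x - lo) / (hi - lo)
            else:
                norm = (x - hi) / (lo - hi)
            # Weighted matrix element: w_j * (n_ij + 1)
            w_row.append(weights[j] * (norm + 1.0))
        weighted.append(w_row)
    # Border approximation area: geometric mean of each weighted column
    border = [
        math.prod(weighted[i][j] for i in range(n_alt)) ** (1.0 / n_alt)
        for j in range(n_crit)
    ]
    # Score: summed distance of each alternative from the border area
    return [sum(v - g for v, g in zip(row, border)) for row in weighted]
```

Alternatives with a positive score lie in the upper approximation area and those with a negative score in the lower one. The fuzzy variant used in the study would replace the crisp ratings with fuzzy numbers.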
Pub Date: 2025-06-01; Epub Date: 2025-05-20; DOI: 10.1016/j.health.2025.100398
Gayathri Hegde M , P Deepa Shenoy , Venugopal KR , Arvind Canchi
Chronic Kidney Disease (CKD) has become more prevalent, leading to a gradual decline in kidney function and, ultimately, renal failure. Timely detection of the CKD stage is essential for enhancing healthcare services and decreasing morbidity and mortality. Hence, this study proposes a Metaheuristic-Hybrid Metaheuristic eXplainable Artificial Intelligence (MHMXAI) driven Feature Selection (FS) approach and Deep Learning (DL) models for CKD stage prediction. The MHMXAI approach selects the features that receive the highest scores from a metaheuristic algorithm (Eagle Search Strategy), a hybrid metaheuristic algorithm (Great Salmon Run-Thermal Exchange Optimization), and eXplainable AI (XAI) tools such as Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP). To evaluate the proposed method, eight DL models (Feedforward Neural Network, Recurrent Neural Network, Deep Neural Network, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Bidirectional LSTM, Gated Recurrent Unit (GRU), and Bidirectional GRU) were trained on the features selected by the different FS methods, as well as on the complete dataset. The models were assessed using accuracy, precision, recall, F1-score, loss, validation loss, and computation time. The CNN model outperformed the others, achieving an accuracy between 98% and 99.5% for all FS methods. Statistical tests, including the Friedman test and the Nemenyi post-hoc test, identified the CNN model trained with MHMXAI-selected features as the most robust choice for CKD stage prediction. These findings demonstrate that the proposed MHMXAI method effectively integrates metaheuristic algorithms and XAI tools, improving CKD stage prediction accuracy and clinical interpretability.
Title: A Deep Learning Framework for Chronic Kidney Disease stage classification. Journal: Healthcare analytics (New York, N.Y.), vol. 7, Article 100398.
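The Friedman test mentioned in the abstract compares several models across repeated blocks (for example, one block per feature-selection method) by ranking them within each block. The sketch below computes the standard chi-square form of the statistic without a tie correction; the function names are hypothetical, and the study's own scores are not reproduced here.

```python
def average_ranks(values):
    """Rank values so the largest gets rank 1; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def friedman_statistic(scores):
    """scores[i][j] is the metric of model j in block i.

    Returns (chi2, mean_ranks). No tie correction is applied.
    """
    n = len(scores)        # number of blocks
    k = len(scores[0])     # number of models
    rank_sums = [0.0] * k
    for row in scores:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return chi2, [r / n for r in rank_sums]
```

Under the null hypothesis the statistic is approximately chi-square distributed with k - 1 degrees of freedom. A post-hoc procedure such as Nemenyi then compares the mean ranks pairwise.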
Pub Date: 2025-06-01; Epub Date: 2025-01-21; DOI: 10.1016/j.health.2024.100381
J.E. Camacho-Cogollo , Cristhian Felipe Patiño Zambrano , Christian Lochmuller , Claudia C. Colmenares-Mejia , Nicolas Rozo , Mario A. Isaza-Ruget , Paul Rodriguez , Andrés García
The therapeutic goal for diabetes mellitus is to maintain normal blood glucose levels, but in some cases, hypoglycemia may occur as a consequence of treatment. Identifying patients with hypoglycemia is critical to preventing adverse events and mortality. However, hypoglycemic events are often not accurately documented in electronic health records (EHRs). This study presents a retrospective analysis of the EHRs of patients with diabetes mellitus. We hypothesize that text analytics and machine learning can identify possible hypoglycemic incidents from unstructured physician notes in electronic health records. Our analysis applies these techniques using the Python programming language as a tool. It also considers words that describe symptoms related to hypoglycemia. The analysis involves searching physicians' notes for keywords and applying supervised classification methods to 146,542 records. Natural language processing (NLP) and machine learning algorithms are used to identify possible hypoglycemic events and related symptoms in physicians’ notes. A multi-layer perceptron (MLP) model produces the best classification performance among all the models tested in this study, with an obtained accuracy of 0.87. We show that the NLP approach can effectively identify and automate the text-based detection process of potential hypoglycemic events, and can subsequently be used to make informed decisions about potential patient risks.
Title: An application of natural language processing for hypoglycemic event identification in patients with diabetes mellitus. Journal: Healthcare analytics (New York, N.Y.), vol. 7, Article 100381.
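The keyword-search step described in the abstract could look roughly like the following. The term list `HYPO_TERMS` is a made-up English example, not the vocabulary used in the study, and the sketch does not handle negation ("denies tremor") or misspellings, which a supervised classifier would be expected to help with.

```python
import re

# Hypothetical keyword list for illustration only
HYPO_TERMS = ["hypoglycemia", "hypoglycemic", "low blood sugar", "diaphoresis", "tremor"]

# One case-insensitive pattern that matches any term on word boundaries
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in HYPO_TERMS) + r")\b",
    re.IGNORECASE,
)

def flag_note(text):
    """Return the sorted, lower-cased set of terms found in one physician note."""
    return sorted({m.group(1).lower() for m in PATTERN.finditer(text)})

def flag_records(notes):
    """Map each note index to its matched terms, keeping only notes with a hit."""
    hits = {}
    for idx, text in enumerate(notes):
        terms = flag_note(text)
        if terms:
            hits[idx] = terms
    return hits
```

Notes flagged this way can serve as candidate positives or as features for the downstream classifiers.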
Pub Date: 2025-06-01; Epub Date: 2025-02-22; DOI: 10.1016/j.health.2025.100386
Mahmudul Bari Hridoy, Angela Peace
Malaria continues to be a significant global health challenge, particularly in tropical regions. Resistance to key antimalarial drugs is spreading, complicating treatment efforts. While progress toward eradication has been slow, the development and introduction of novel malaria vaccines offer hope for reducing the disease burden in endemic areas. To address these challenges, we develop an extended Susceptible–Exposed–Infected–Recovered (SEIR) age-structured model incorporating malaria vaccination for children, drug-sensitive and drug-resistant strains, and interactions between human hosts and mosquitoes. Our research evaluates how malaria vaccination coverage influences disease prevalence and transmission dynamics. We derive both strains’ basic, intervention, and invasion reproduction numbers and conduct sensitivity analysis to identify key parameters affecting infection prevalence. Our findings reveal that model outcomes are primarily influenced by scale factors that reduce transmission and natural recovery rates for the resistant strain, as well as by drug treatment and vaccination efficacies and mosquito death rates. Numerical simulations indicate that while treatment reduces the malaria disease burden, it also increases the proportion of drug-resistant cases. Conversely, higher vaccination efficacy correlates with lower infection cases for both strains. These results suggest that a synergistic approach involving vaccination and treatment could effectively decrease the overall proportion of the infected population.
Title: An exploration of the interplay between treatment and vaccination in an Age-Structured Malaria Model using non-linear ordinary differential equations. Journal: Healthcare analytics (New York, N.Y.), vol. 7, Article 100386.
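For readers unfamiliar with compartmental models, the basic SEIR structure that the study extends can be integrated numerically as below. This is only the textbook single-population SEIR system solved with a forward Euler step; it has no age structure, mosquito compartments, vaccination, or separate drug-sensitive and drug-resistant strains, and all parameter names and values are assumptions for illustration.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of a basic SEIR model.

    beta:  transmission rate, sigma: rate of leaving the exposed class,
    gamma: recovery rate. Returns a list of (S, E, I, R) tuples per step.
    """
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    history = [(s, e, i, r)]
    for _ in range(int(round(days / dt))):
        n = s + e + i + r
        new_exposed = beta * s * i / n   # frequency-dependent force of infection
        ds = -new_exposed
        de = new_exposed - sigma * e
        di = sigma * e - gamma * i
        dr = gamma * i
        s, e, i, r = s + dt * ds, e + dt * de, i + dt * di, r + dt * dr
        history.append((s, e, i, r))
    return history

def basic_reproduction_number(beta, gamma):
    """R0 for this simple SEIR model without births or deaths."""
    return beta / gamma
```

The reproduction numbers derived in the study are more involved because they account for two strains, vaccination, and the vector population.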
Pub Date: 2025-06-01; Epub Date: 2025-05-14; DOI: 10.1016/j.health.2025.100400
Hong-Yan Li , Jing Guo, Chuang-Hao Yang
The rational allocation of healthcare resources is vital for establishing a healthcare system that aligns with the levels of economic and social development. As a comprehensive discipline integrating geography, cartography, remote sensing, and computer science, Geographic Information System (GIS) can visualize and analyze spatial information through mapping. By utilizing GIS's statistical analysis and data visualization functions, this study provides a more efficient and intuitive analysis of Shanghai's spatial healthcare resource allocation and a more comprehensive assessment of its current allocation status. To examine spatial correlation and spatial proximity, we apply the Global Moran Index (Moran's I), the Local Indicators of Spatial Association (LISA) test, and Hot Spot Analysis (Getis-Ord Gi*). Furthermore, by utilizing the Lorenz curve and Gini coefficient, this study expands the measurement dimensions for assessing healthcare resource allocation in Shanghai. The results show that, from the global spatial correlation perspective, the allocation of healthcare resources in Shanghai exhibits spatial clustering. From the local spatial correlation perspective, healthcare resources in Shanghai show significant regional disparities, with resources concentrated in central urban areas. From a multidimensional perspective, the allocation of healthcare resources in Shanghai in 2022 was more equitable when measured by population (Gini coefficient 0.298 ± 0.063) and economy (0.292 ± 0.027) than by geographic area (0.612 ± 0.100) and green space (0.590 ± 0.110). These findings offer valuable insights for promoting the structural optimization and spatial distribution of healthcare resources in Shanghai.
Title: An analytical approach to assessing the spatial equity and allocation of healthcare resources in Shanghai. Journal: Healthcare analytics (New York, N.Y.), vol. 7, Article 100400.
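Two of the measures named in the abstract, the Gini coefficient derived from the Lorenz curve and the global Moran's I, can be written compactly. The sketch below assumes district-level inputs: `resources` (for example, beds per district), `weights` (the denominator being compared, such as population or area), and a symmetric spatial weights matrix `w`. These names are hypothetical and no Shanghai data are reproduced.

```python
def gini(resources, weights):
    """Gini coefficient from the Lorenz curve of resources against weights.

    Units are ordered by resources per unit of weight, then the area under
    the Lorenz curve is approximated with trapezoids.
    """
    units = sorted(zip(resources, weights), key=lambda rw: rw[0] / rw[1])
    total_r = sum(r for r, _ in units)
    total_w = sum(w for _, w in units)
    cum_r = cum_w = 0.0
    twice_area = 0.0
    for r, w in units:
        prev_r, prev_w = cum_r, cum_w
        cum_r += r / total_r
        cum_w += w / total_w
        twice_area += (cum_w - prev_w) * (cum_r + prev_r)
    return 1.0 - twice_area

def morans_i(values, w):
    """Global Moran's I for values observed on n units with weights w[i][j]."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in w)
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den
```

A Gini value near 0 indicates an even allocation relative to the chosen denominator, which is why the lower population-based and economy-based values in the abstract correspond to higher equity. Positive Moran's I indicates that similar values cluster in space.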
Pub Date: 2025-06-01; Epub Date: 2025-03-11; DOI: 10.1016/j.health.2025.100390
Keona Pang
Over the years, numerous machine learning models have been developed, leading to successful applications across various fields. This study uses a large dataset related to type 2 diabetes prediction from the Centers for Disease Control and Prevention (CDC) in the United States. The dataset has 70,692 samples, 21 input features, and one binary output (non-diabetes or diabetes). In addition to health indicators such as Body Mass Index (BMI), blood pressure, and cholesterol level, the features include socioeconomic factors (e.g., income, education) and lifestyle factors such as diet and physical activity. This paper aims to study how these features influence diabetes risk. 80% of the dataset is used for training and 20% for testing. Six machine learning models, as well as the Multivariate Adaptive Regression Splines (MARS) model, were used in the investigation, and a detailed comparison of their performance is given. Shapley values, visualized with color-coded plots, are used to explain the behavior of the models and to assess their reliability. The paper shows how Shapley values can improve the explainability and interpretability of these models for diabetes prediction. We leverage the SHapley Additive exPlanations (SHAP) scores to provide information about the relative importance of each predictive feature, and these results shed light on the relationship between the features and the risk of developing type 2 diabetes.
Title: A comparative study of explainable machine learning models with Shapley values for diabetes prediction. Journal: Healthcare analytics (New York, N.Y.), vol. 7, Article 100390.
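To make the idea of a Shapley value concrete, the sketch below computes exact Shapley attributions for one prediction by enumerating every feature coalition, with "absent" features replaced by a baseline value. This brute-force approach is only feasible for a handful of features; tools such as SHAP rely on approximations or model-specific algorithms instead. The function and argument names are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) relative to predict(baseline).

    predict:  callable taking a list of feature values
    x:        the instance to explain
    baseline: reference values used for features outside a coalition
    """
    n = len(x)

    def value(subset):
        # Features in the coalition take their real value, the rest the baseline
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi
```

By construction the attributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the property that makes per-feature importance plots additive.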
Pub Date: 2025-06-01; Epub Date: 2025-01-31; DOI: 10.1016/j.health.2025.100384
Madhusree Kuanr, Puspanjali Mohapatra
This study proposes a health recommender system for health risk analysis and disease prediction that identifies the most responsible disease-causing factors using a hybrid Genetic–Harris Hawk optimization multi-objective feature selection approach. The proposed recommender system uses the Tree-based Pipeline Optimization Tool (TPOT) automated machine learning model to recommend the most suitable machine learning prediction model with the best classifier in terms of classification accuracy for a disease with the selected features. It also recommends the top three disease-causing features for a particular disease that can be utilized to analyze a person’s health risk. The proposed system has also been compared with competing prediction approaches using Principal Component Analysis (PCA), Singular Value Decomposition (SVD), and Autoencoders. We show that the proposed system outperforms competing approaches in terms of classification accuracy.
Title: A recommender system with multi-objective hybrid Harris Hawk optimization for feature selection and disease diagnosis. Journal: Healthcare analytics (New York, N.Y.), vol. 7, Article 100384.
Pub Date: 2025-06-01; Epub Date: 2025-04-01; DOI: 10.1016/j.health.2025.100392
Xinyi Yang, Juan Li
The increasing prevalence of diabetes necessitates innovative glucose prediction methods that prioritize patient privacy. While edge artificial intelligence (AI) offers potential, its limitations in resource-constrained devices can be mitigated through federated learning (FL). However, challenges remain in accounting for patient variability and optimizing FL for glucose prediction. This research introduces a novel personalized clustering-based federated deep learning (Clu-FDL) model to address these challenges. We develop tailored models that enhance prediction accuracy by clustering patients based on carbohydrate (CHO) intake patterns. Utilizing Simple Recurrent Neural Network (SimpleRNN) and Gated Recurrent Unit (GRU) methods, the study evaluates the performance of local patients who contribute to training the cluster and global (non-cluster) models. The results show that the Clu-FDL approach achieves high precision (0.93), recall (0.96), and F1 scores (0.95), along with low Root Mean Square Error (RMSE) values (11.08 ± 1.77 mg/dL). Additionally, for new patients with different data durations, analysis based on 0.25–3 days of data indicates that Clu-FDL models exhibit greater stability, with smaller RMSE and higher precision, recall, and F1 scores compared to non-clustering models. The study identifies that SimpleRNN and GRU models are most effective for new patients with 9 and 6 days of data. This privacy-preserving, clustering-based personalized approach empowers patients to manage their diabetes effectively.
Title: A clustering-based federated deep learning approach for enhancing diabetes management with privacy-preserving edge artificial intelligence. Journal: Healthcare analytics (New York, N.Y.), vol. 7, Article 100392.
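The aggregation idea behind a clustering-based federated setup can be illustrated with plain federated averaging applied separately within each patient cluster. The sketch assumes each client (patient device) reports a flat list of model weights, its number of training samples, and a cluster label assigned beforehand (for example, from carbohydrate intake patterns). The dictionary keys and function names are hypothetical, and this is not the Clu-FDL implementation itself.

```python
import math

def fed_avg(client_weights, client_sizes):
    """Sample-size-weighted average of client weight vectors (FedAvg-style)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dim)
    ]

def cluster_fed_avg(clients):
    """Aggregate one model per cluster.

    clients: list of dicts with keys 'cluster', 'weights', 'n_samples'.
    Returns {cluster_id: averaged_weights}.
    """
    grouped = {}
    for c in clients:
        grouped.setdefault(c["cluster"], []).append(c)
    return {
        cid: fed_avg([c["weights"] for c in members],
                     [c["n_samples"] for c in members])
        for cid, members in grouped.items()
    }

def rmse(y_true, y_pred):
    """Root mean square error, e.g. in mg/dL for glucose predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```

Only weight vectors and sample counts leave each device in this scheme, which is the privacy argument for federated learning. A non-clustered global model corresponds to calling `fed_avg` over all clients at once.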
Pub Date: 2025-06-01 Epub Date: 2025-04-22 DOI: 10.1016/j.health.2025.100395
Parama Sridevi, Zawad Arefin, Sheikh Iqbal Ahamed
Frequent measurement of creatinine levels is vital for patients with chronic kidney disease. Traditional creatinine measurement requires an invasive blood test, which has several disadvantages, such as discomfort, anxiety, panic, pain, and risk of infection. To address this issue, we propose a noninvasive machine learning (ML) model-based method to estimate creatinine level from the photoplethysmography (PPG) signal. We obtained the PPG signals and gold-standard serum creatinine levels of 404 patients from the Medical Information Mart for Intensive Care III (MIMIC-III) database. In data preprocessing, we analyzed the PPG signal in several steps and created a PPG feature set. We used multiple feature engineering methods to identify the most important features. We integrated Optuna, a hyperparameter optimization framework, with every ML model to obtain the optimal hyperparameters. We developed five ML models and compared their performance with and without Optuna. We found that Optuna significantly improved every model's performance. With Optuna, extreme gradient boosting (XGBoost) performed best among the five models. This XGBoost model had an accuracy of 85.2%, an average k-fold cross-validation score (k = 10) of 0.70, and a receiver operating characteristic area under the curve (ROC-AUC) score of 0.80. With the high performance exhibited by our developed model, the study can play a crucial role in the field of noninvasive creatinine estimation and diagnosis of chronic kidney disease.
{"title":"An integrated machine learning and hyperparameter optimization framework for noninvasive creatinine estimation using photoplethysmography signals","authors":"Parama Sridevi, Zawad Arefin, Sheikh Iqbal Ahamed","doi":"10.1016/j.health.2025.100395","DOIUrl":"10.1016/j.health.2025.100395","url":null,"abstract":"<div><div>Frequent measurement of creatinine levels is vital for patients with chronic kidney disease. Traditional creatinine measurement requires an invasive blood test, which has several disadvantages, such as discomfort, anxiety, panic, pain, and risk of infection. To address this issue, we propose a noninvasive machine learning (ML) model-based method to estimate creatinine level from the photoplethysmography (PPG) signal. We obtained the PPG signals and gold-standard serum creatinine levels of 404 patients from the Medical Information Mart for Intensive Care III (MIMIC-III) database. In data preprocessing, we analyzed the PPG signal in several steps and created a PPG feature set. We used multiple feature engineering methods to identify the most important features. We integrated Optuna, a hyperparameter optimization framework, with every ML model to obtain the optimal hyperparameters. We developed five ML models and compared their performance with and without Optuna. We found that Optuna significantly improved every model's performance. With Optuna, extreme gradient boosting (XGBoost) performed best among the five models. This XGBoost model had an accuracy of 85.2%, an average k-fold cross-validation score (k = 10) of 0.70, and a “receiver operating characteristic area under the curve” (ROC-AUC) score of 0.80. 
With the high performance exhibited by our developed model, the study can play a crucial role in the field of noninvasive creatinine estimation and diagnosis of chronic kidney disease.</div></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"7 ","pages":"Article 100395"},"PeriodicalIF":0.0,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143887787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
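The evaluation metrics quoted in the abstract above (accuracy, k-fold cross-validation, ROC-AUC) can be made concrete with a small standard-library sketch. It assumes a binary labelling of creatinine level and a model that outputs a score per patient; the abstract does not state how classes were defined, so the labels, scores, and 0.5 threshold below are hypothetical. Neither Optuna nor XGBoost is used here; the sketch only shows how such scores are computed.

```python
def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Hypothetical labels (1 = elevated creatinine) and model scores.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]

auc = roc_auc(labels, scores)
acc = accuracy(labels, scores)
folds = list(kfold_indices(10, 3))  # the abstract reports k = 10; 3 keeps the demo short
```

A hyperparameter optimizer such as Optuna would call an evaluation like this repeatedly, once per candidate configuration, and keep the configuration with the best cross-validated score.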
Pub Date: 2025-06-01 Epub Date: 2025-05-21 DOI: 10.1016/j.health.2025.100401
Sara Al-Naabi, Noura Al Nasiri, Talal Al-Awadhi, Meshal Abdullah, Ammar Abulibdeh
Healthcare services have a significant impact on socioeconomic and health development globally. In Oman, rapid development since the 1970s has led to a focus on the equitable distribution of public services. This research aims to evaluate the spatial accessibility and distribution of pharmacies in Muscat Governorate, Oman, using Geographical Information Systems (GIS) and spatial analysis techniques. The primary objective is to measure equity in the spatial distribution of pharmacies within Muscat Governorate. The study utilizes spatial datasets, including administrative areas, pharmacy locations, settlement locations, and transportation networks, as well as non-spatial datasets such as demographic data. The methodology involves spatial distribution analysis using Average Nearest Neighbor (ANN), Moran's I for spatial autocorrelation, Kernel Density Analysis (KDA), Thiessen polygons for catchment areas, and network analysis for determining service areas and accessibility by walking and driving distances. Findings indicate a clustered distribution of pharmacies, with higher concentrations in densely populated northern Wilayats such as Muttrah, AS Seeb, and Bawshar. Muttrah exhibits the highest accessibility, with 99% coverage within a 2.5 km radius, whereas Muscat Wilaya lacks pharmacy services entirely. These findings highlight significant disparities in the spatial distribution of pharmacies, underscoring the need for policy interventions to ensure equitable access. Policymakers should consider geographic and demographic factors in health service planning to ensure fair distribution and accessibility across the governorate. Implementing these recommendations can help improve healthcare access and equity in Muscat, contributing to overall social and health development.
{"title":"An equity-based spatial analytics framework for evaluating pharmacy accessibility using geographical information systems","authors":"Sara Al-Naabi , Noura Al Nasiri , Talal Al-Awadhi , Meshal Abdullah , Ammar Abulibdeh","doi":"10.1016/j.health.2025.100401","DOIUrl":"10.1016/j.health.2025.100401","url":null,"abstract":"<div><div>Healthcare services have a significant impact on socioeconomic and health development globally. In Oman, rapid development since the 1970s has led to a focus on the equitable distribution of public services. This research aims to evaluate the spatial accessibility and distribution of pharmacies in Muscat Governorate, Oman, using Geographical Information Systems (GIS) and spatial analysis techniques. The primary objective is to measure the equity in the spatial distribution of pharmacies within Muscat Governorate. The study utilizes spatial datasets, including administrative areas, pharmacy locations, settlement locations, transportation networks, and non-spatial datasets such as demographic data. The methodology involves spatial distribution analysis using Average Nearest Neighbor (ANN), Moran's I for spatial autocorrelation, Kernel Density Analysis (KDA), Thiessen polygons for catchment areas, and Network analysis for determining service areas and accessibility by walking and driving distances. Findings indicate a clustered distribution of pharmacies, with higher concentrations in densely populated northern Wilayats like Muttrah, AS Seeb, and Bawshar. Muttrah exhibits the highest accessibility, with 99 % coverage within a 2.5 km radius, whereas Muscat Wilaya lacks pharmacy services entirely. These findings highlight significant disparities in the spatial distribution of pharmacies, underscoring the need for policy interventions to ensure equitable access. Policymakers should consider geographic and demographic factors in health service planning to ensure fair distribution and accessibility across the governorate. 
Implementing these recommendations can help improve healthcare access and equity in Muscat, contributing to overall social and health development.</div></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"7 ","pages":"Article 100401"},"PeriodicalIF":0.0,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144147169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
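Two of the spatial statistics named in the abstract above, the Average Nearest Neighbor (ANN) ratio and Moran's I, can be sketched in a few lines of standard-library Python. The formulas are the textbook definitions; the coordinates, study area, values, and binary adjacency weights are hypothetical toy inputs, not the Muscat data, and a GIS package would normally handle edge corrections and significance tests that this sketch omits.

```python
import math

def ann_ratio(points, area):
    """Observed mean nearest-neighbor distance over the value expected
    under complete spatial randomness, 0.5 / sqrt(n / area).
    Ratios below 1 suggest clustering; above 1, dispersion."""
    n = len(points)
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected

def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij:
    I = (n / sum(w)) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

# Toy point patterns inside a hypothetical 2 x 2 study area (area = 4).
grid_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
tight_points = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1)]
ratio_grid = ann_ratio(grid_points, area=4.0)
ratio_tight = ann_ratio(tight_points, area=4.0)

# Four zones along a line with binary neighbor weights and a rising value.
zone_values = [1.0, 2.0, 3.0, 4.0]
zone_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
moran = morans_i(zone_values, zone_weights)
```

In this toy data the tightly packed points give a ratio below 1, the pattern the abstract reports for pharmacies, and the positive Moran's I reflects neighboring zones holding similar values.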