Deciphering the climate-malaria nexus: A machine learning approach in rural southeastern Tanzania.
Authors: Jin-Xin Zheng, Shen-Ning Lu, Qin Li, Yue-Jin Li, Jin-Bo Xue, Tegemeo Gavana, Prosper Chaki, Ning Xiao, Yeromin Mlacha, Duo-Quan Wang, Xiao-Nong Zhou
Journal: Public Health, volume 238, pages 124-130 (impact factor 3.9; JCR Q1, Public, Environmental & Occupational Health)
Publication type: Journal Article
Published: 2024-12-06
DOI: 10.1016/j.puhe.2024.11.013 (https://doi.org/10.1016/j.puhe.2024.11.013)
Citation count: 0
Abstract
Objectives: Malaria remains a critical public health challenge, especially in regions like southeastern Tanzania. Understanding the intricate relationship between environmental factors and malaria incidence is essential for effective control and elimination strategies.
Study design: Cohort study.
Methods: This cohort study, conducted between January 2016 and October 2021 across three districts in southeastern Tanzania, used the Extreme Gradient Boosting (XGBoost) machine learning model to examine the impact of climate factors on malaria incidence. SHapley Additive exPlanations (SHAP) values were applied to interpret model predictions, highlighting the roles of the normalized difference vegetation index (NDVI), temperature, and rainfall in shaping malaria transmission dynamics.
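To make the idea behind SHAP values concrete, the following minimal sketch computes exact Shapley values for a single prediction by averaging each feature's marginal contribution over all feature subsets. It is an illustration only, not the study's pipeline: `toy_model`, its coefficients, and the feature and baseline values are hypothetical, and the study itself used a fitted XGBoost model, for which a tree-specific SHAP algorithm would normally be used instead of brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Features absent from a coalition take their value from `baseline`.
    Cost grows exponentially with the number of features, so this is
    only practical for a handful of features.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Inputs with the coalition present and `name` absent.
                without = {m: (x[m] if m in subset else baseline[m]) for m in names}
                # Same inputs, now with `name` switched to its actual value.
                with_feature = dict(without)
                with_feature[name] = x[name]
                total += weight * (predict(with_feature) - predict(without))
        phi[name] = total
    return phi


def toy_model(f):
    """Hypothetical stand-in for a fitted model (NOT the study's XGBoost model)."""
    return (
        2.0 * f["temperature"]
        + 0.5 * f["rainfall"]
        + 10.0 * f["ndvi"]
        + 0.01 * f["temperature"] * f["rainfall"]  # interaction term
    )


# Illustrative values only; not data from the study.
x = {"temperature": 28.0, "rainfall": 120.0, "ndvi": 0.6}
baseline = {"temperature": 25.0, "rainfall": 80.0, "ndvi": 0.5}

phi = shapley_values(toy_model, x, baseline)
for feature, value in phi.items():
    print(f"{feature}: {value:.2f}")
```

The contributions sum to the difference between the prediction for `x` and the prediction for `baseline`, which is the property that lets SHAP split a single forecast into per-feature shares.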
Results: Analysis revealed considerable heterogeneity in malaria incidence across southeastern Tanzania, with Kibiti experiencing the highest number of cases (15,308) over the study period. Seasonal peaks corresponded with rainy periods, though incidence rates varied by district. Incorporating lagged climate variables and seasonal trends significantly improved forecast accuracy, with the one-month lag model achieving the lowest mean absolute error (MAE = 175.46) and root mean squared error (RMSE = 228.24). SHAP analysis identified seasonality (mean SHAP 29.6), followed by lagged temperature (13.8), rainfall (12.4), and NDVI (5.96), as the most influential factors, reflecting the biological underpinnings of malaria transmission.
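The quantities reported above can be sketched in a few lines of Python: building a one-month lagged climate feature, scoring forecasts with MAE and RMSE, and ranking features by mean absolute SHAP value. This is a hedged illustration under stated assumptions: the helper names (`add_lag`, `mae`, `rmse`, `mean_abs_shap`) and all numbers are hypothetical, and it assumes the reported "mean SHAP" figures are mean absolute values, which is the common convention for global importance but is not stated in the abstract.

```python
from math import sqrt


def add_lag(rows, column, lag):
    """Attach `column` from `lag` months earlier to each monthly row.

    The first `lag` rows have no earlier observation and are dropped.
    """
    lagged = []
    for i in range(lag, len(rows)):
        row = dict(rows[i])
        row[f"{column}_lag{lag}"] = rows[i - lag][column]
        lagged.append(row)
    return lagged


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mean_abs_shap(shap_rows):
    """Rank features by mean absolute SHAP value, largest first."""
    features = shap_rows[0].keys()
    means = {f: sum(abs(r[f]) for r in shap_rows) / len(shap_rows) for f in features}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Illustrative monthly records; not data from the study.
months = [
    {"month": "2016-01", "rainfall": 90.0, "cases": 210},
    {"month": "2016-02", "rainfall": 140.0, "cases": 260},
    {"month": "2016-03", "rainfall": 200.0, "cases": 340},
]
lagged = add_lag(months, "rainfall", 1)

# Illustrative observed vs. forecast case counts.
actual = [100, 200, 300]
predicted = [110, 190, 330]
print(f"MAE={mae(actual, predicted):.2f}  RMSE={rmse(actual, predicted):.2f}")

# Illustrative per-observation SHAP values for two features.
shap_rows = [
    {"seasonality": 30.0, "temperature_lag1": -14.0},
    {"seasonality": -28.0, "temperature_lag1": 12.0},
]
ranking = mean_abs_shap(shap_rows)
print(ranking)
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE, which is why the two metrics are usually reported together when comparing lag structures.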
Conclusions: This study demonstrates the utility of machine learning and SHAP-based model explanation in malaria epidemiology, providing a data-driven framework to guide targeted, climate-informed malaria control strategies. By capturing seasonal and climate-linked risks, these methods hold promise for enhancing public health planning and adaptive response in malaria-endemic regions.
About the journal:
Public Health is an international, multidisciplinary peer-reviewed journal. It publishes original papers, reviews and short reports on all aspects of the science, philosophy, and practice of public health.