Amy Marie Campbell, Jordi Manuel Cabrera-Gumbau, Joaquin Trinanes, Craig Baker-Austin, Jaime Martinez-Urtaza
Machine Learning Potential for Identifying and Forecasting Complex Environmental Drivers of Vibrio vulnificus Infections in the United States
Environmental Health Perspectives 133(1): 17006. Published 2025-01-01 (Epub 2025-01-23). DOI: 10.1289/EHP15593. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11756857/pdf/
Abstract
Background: Environmental change in coastal areas can drive marine bacteria and the resulting infections, such as those caused by Vibrio vulnificus, which has both foodborne and nonfoodborne exposure routes and high mortality. Although the ecological drivers of V. vulnificus in the environment have been well characterized, fewer models have been able to apply this knowledge to human infection risk because surveillance is limited.
Objectives: The Cholera and Other Vibrio Illness Surveillance (COVIS) system database has reported V. vulnificus infections in the United States since 1988, offering a unique opportunity to both explore the forecasting capabilities machine learning could provide and to characterize complex environmental drivers of V. vulnificus infections.
Methods: Random forest classification models were trained and refined using epidemiological data from 2008 to 2018, six environmental variables (sea surface temperature, salinity, chlorophyll a concentration, sea level, land surface temperature, and runoff rate), and categorical encoders, to assess their potential to forecast V. vulnificus infections from environmental data.
Results: The highest-performing model, which used balanced classes, had an Area Under the Curve score of 0.984 and a sensitivity of 0.971, highlighting the potential of machine learning to anticipate areas and periods of V. vulnificus risk. A higher false positive rate was found when the model was applied to real-world imbalanced surveillance data, which is pertinent amid modeled underreporting and misdiagnosis ratios of V. vulnificus infections. Further models were developed to explore multilevel spatial resolution, finding that state-specific models can improve specificity and that using lagged environmental data exclusively can improve early warning system potential.
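For readers unfamiliar with the metrics reported in the Results, the following is a minimal Python sketch of how sensitivity, specificity, and Area Under the Curve can be computed for a binary classifier. It is not the authors' pipeline: the labels, scores, the 0.5 decision threshold, and all function names are hypothetical and chosen only for illustration, and the AUC is computed with the pairwise (Mann-Whitney) definition rather than any particular library routine.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def sensitivity(y_true, y_pred):
    """Fraction of true positives that the classifier flags."""
    tp, fn, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true, y_pred):
    """Fraction of true negatives that the classifier leaves unflagged."""
    _, _, fp, tn = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (pairwise Mann-Whitney definition of AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = infection reported, 0 = no infection reported.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]  # made-up model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 threshold

print("sensitivity:", sensitivity(y_true, y_pred))
print("specificity:", specificity(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of figure are usually reported together.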
Discussion: The machine learning approach was able to characterize nonlinear and interacting environmental associations driving V. vulnificus infections. This study accentuates the potential of machine learning and robust surveillance for forecasting environmentally associated marine infections, providing future directions for improvements, further application, and operationalization. https://doi.org/10.1289/EHP15593.
About the journal:
Environmental Health Perspectives (EHP) is a monthly peer-reviewed journal supported by the National Institute of Environmental Health Sciences, part of the National Institutes of Health under the U.S. Department of Health and Human Services. Its mission is to facilitate discussions on the connections between the environment and human health by publishing top-notch research and news. EHP ranks third in Public, Environmental, and Occupational Health, fourth in Toxicology, and fifth in Environmental Sciences.