Osvalda De Giglio, Fabrizio Fasano, Giusy Diella, Valentina Spagnuolo, Francesco Triggiano, Marco Lopuzzo, Francesca Apollonio, Carla Maria Leone, Maria Teresa Montagna
Machine learning vs. regression models to predict the risk of Legionella contamination in a hospital water network.
Annali di igiene: medicina preventiva e di comunita, pp. 128-140. DOI: 10.7416/ai.2024.2644 (https://doi.org/10.7416/ai.2024.2644). Epub 2024-07-11; publication date 2025-01-01.
Abstract
Introduction: The periodic monitoring of Legionella in hospital water networks allows preventive measures to be taken to avoid the risk of legionellosis to patients and healthcare workers.
Study design: The aim of the study is to standardize a method for predicting the risk of Legionella contamination in the water supply of a hospital facility, by comparing Machine Learning, conventional and combined models.
Methods: From July 2021 to October 2022, water sampling for Legionella detection was performed in the rooms of an Italian hospital pavilion (89.9% of the total number of rooms). Fifty-eight parameters describing the structural and environmental characteristics of the water network were collected. Models were built on 70% of the dataset and tested on the remaining 30% to evaluate accuracy, sensitivity, and specificity.
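The 70/30 evaluation scheme described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `split_70_30` and `evaluate`, the random seed, and the use of a simple shuffled split (rather than, say, a stratified one) are all assumptions, since the abstract does not say how the partition was drawn.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle a copy of `rows` and return (train, test) in a 70/30 ratio.

    Assumption: a plain random split; the abstract does not state
    whether the authors stratified by Legionella positivity.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def evaluate(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of truly positive samples that were flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of truly negative samples that were cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical usage with sample indices standing in for the 1,053 samples.
train, test = split_70_30(range(1053))
metrics = evaluate([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
```

With only about 5% of samples positive, sensitivity is computed from very few test cases, which is why it is reported separately from accuracy.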
Results: A total of 1,053 water samples were analyzed, of which 57 (5.4%) were positive for Legionella. Of the Machine Learning models tested, the best-performing one had an input layer of 56 neurons, a hidden layer of 30 neurons, and an output layer of two neurons; its accuracy was 93.4%, sensitivity 43.8%, and specificity 96%. The regression model had an accuracy of 82.9%, sensitivity of 20.3%, and specificity of 97.3%. The combination of the two models achieved an accuracy of 82.3%, sensitivity of 22.4%, and specificity of 98.4%. The parameters that most influenced the model results were the type of water network (hot/cold), the replacement of filter valves, and atmospheric temperature. Among the models tested, Machine Learning obtained the best results in terms of accuracy and sensitivity.
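To make the reported 56-30-2 architecture concrete, the sketch below builds a network with those layer sizes and runs one forward pass. Only the layer sizes come from the abstract. The ReLU hidden activation, the softmax output, the uniform weight initialization, and every function name here are assumptions chosen for illustration, and the weights are untrained.

```python
import math
import random

# Layer sizes as reported in the abstract.
N_INPUT, N_HIDDEN, N_OUTPUT = 56, 30, 2


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer; small random weights (assumed)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases):
    """Affine map: one weighted sum plus bias per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    """Numerically stable softmax over the output neurons."""
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, params):
    """Input -> hidden (ReLU, assumed) -> output (softmax, assumed)."""
    (w1, b1), (w2, b2) = params
    return softmax(dense(relu(dense(x, w1, b1)), w2, b2))


def count_parameters(params):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)


rng = random.Random(0)
params = [init_layer(N_INPUT, N_HIDDEN, rng),
          init_layer(N_HIDDEN, N_OUTPUT, rng)]
# Hypothetical input: 56 already-encoded feature values for one water sample.
probs = forward([0.5] * N_INPUT, params)
n_params = count_parameters(params)
```

Under these assumptions the network has 56 × 30 + 30 + 30 × 2 + 2 = 1,772 trainable parameters, and the two outputs can be read as the estimated probabilities of a negative and a positive sample.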
Conclusions: Future studies are required to improve these predictive models by expanding the dataset using other parameters and other pavilions of the same hospital.