Title: Performance evaluation of different machine learning algorithms for prediction of nitrate in groundwater in Thiruvannamalai District
Authors: Christina Jacob, Uma Shankar Masilamani
Journal: Clean-soil Air Water (impact factor 1.5; JCR Q4, Environmental Sciences)
Published: 2024-07-18
DOI: 10.1002/clen.202400060 (https://doi.org/10.1002/clen.202400060)
Citation count: 0
Abstract
Nitrates (NO₃⁻) from the extensive application of fertilizers and other anthropogenic sources pollute groundwater. Machine learning (ML) techniques are increasingly deployed to predict water quality with high precision. This study assesses the efficacy of nine ML algorithms, namely linear regression, polynomial regression, decision tree, random forest (RF), support vector machine, multilayer perceptron regressor, eXtreme gradient boosting (XGB), light gradient boosting (LGB), and K‐nearest neighbors, in predicting nitrate concentration in the groundwater of Thiruvannamalai District, Tamil Nadu. In total, 360 water samples were collected over 1 year, and 14 groundwater variables were determined and used to predict nitrate. The performance metrics root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²) were evaluated for the pre‐monsoon, monsoon, and post‐monsoon seasons. In all three seasons, RF predicted the nitrate concentration with the lowest RMSE and MAE and the highest R². The reported values for RF are RMSE: 0.49, MAE: 1.30, and R²: 0.94, making it the best-performing model, followed by LGB and XGB; this ranking holds for all seasons. The results of the study will aid policymakers in planning a remediation strategy.
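The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the conventional definitions (MAE taken as mean absolute error, R² as the coefficient of determination), and the function names, the `metrics_by_season` helper, and the nitrate values are all hypothetical; none of this is the authors' code or data.

```python
import math
from collections import defaultdict


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error (assumed meaning of MAE)."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def metrics_by_season(records):
    """Group (season, observed, predicted) records and score each season."""
    grouped = defaultdict(lambda: ([], []))
    for season, obs, pred in records:
        grouped[season][0].append(obs)
        grouped[season][1].append(pred)
    return {
        season: {"rmse": rmse(o, p), "mae": mae(o, p), "r2": r2(o, p)}
        for season, (o, p) in grouped.items()
    }


# Made-up nitrate values (mg/L) for illustration only; not data from the study.
records = [
    ("pre-monsoon", 10.0, 11.0),
    ("pre-monsoon", 20.0, 19.0),
    ("pre-monsoon", 30.0, 31.0),
    ("monsoon", 12.0, 12.0),
    ("monsoon", 18.0, 20.0),
    ("monsoon", 24.0, 22.0),
    ("post-monsoon", 15.0, 14.0),
    ("post-monsoon", 25.0, 27.0),
    ("post-monsoon", 35.0, 34.0),
]

for season, scores in metrics_by_season(records).items():
    print(season, {name: round(value, 3) for name, value in scores.items()})
```

Scoring each season separately mirrors the pre‐monsoon, monsoon, and post‐monsoon breakdown described above. In a comparison of several models, the same helper would be run once per model's predictions.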
About the journal:
CLEAN covers all aspects of Sustainability and Environmental Safety. The journal focuses on organ/human–environment interactions, giving interdisciplinary insights on a broad range of topics including air pollution, waste management, the water cycle, and environmental conservation. With a 2019 Journal Impact Factor of 1.603 (Journal Citation Reports, Clarivate Analytics, 2020), the journal publishes an attractive mixture of peer-reviewed scientific reviews, research papers, and short communications.
Papers dealing with environmental sustainability issues from such fields as agriculture, biological sciences, energy, food sciences, geography, geology, meteorology, nutrition, soil and water sciences, etc., are welcome.