Application of machine learning approaches in predicting estuarine dissolved oxygen (DO) under a limited data environment

Water Quality Research Journal | Pub Date: 2022-08-11 | DOI: 10.2166/wqrj.2022.002 | IF 2.4 | Region 4: Environmental Science and Ecology | JCR Q2: Water Resources
M. A. Z. Siddik
Citations: 2

Abstract

The application of machine learning (ML) approaches to predict estuarine dissolved oxygen (DO) from a set of environmental covariates that includes nutrients remains unexplored because nutrient data are often unavailable. Using data from 12 water quality stations in southwest coastal Florida, this study examined how well four ML models (support vector machine (SVM), random forest (RF), decision tree, and Wang–Mendel) predict DO under a limited nutrient data environment. Monthly water temperature, pH, salinity, total nitrogen (TN), and total phosphorus (TP) data were used for model development. A multiple linear regression model was trained as a benchmark against which the ML models were compared. The site-specific RF and SVM models showed superior model efficiency (Nash–Sutcliffe Efficiency > 0.80) when all predictor variables were used for model development, whereas models trained without nutrients showed reduced prediction accuracy. Modeling that pooled data from all sites under TN-limited, TP-limited, and TN- & TP-co-limited regimes showed that RF performed best. Overall, the study reached two conclusions that could complement existing approaches to estimating total daily loads for environmental management: (1) nutrients are a necessary predictor of estuarine DO dynamics, and (2) RF performs better than the other ML methods under a limited data environment.
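The abstract reports Nash–Sutcliffe Efficiency (NSE) values above 0.80 without defining the metric. As a minimal sketch, assuming the standard definition NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))², the calculation might look like the following. The function name and the monthly DO values are hypothetical and invented for illustration; they are not taken from the study.

```python
from statistics import fmean


def nash_sutcliffe_efficiency(observed, simulated):
    """Return NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    NSE is 1 for a perfect fit and 0 when the model does no better
    than always predicting the mean of the observations.
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    if len(observed) < 2:
        raise ValueError("need at least two observations")
    mean_obs = fmean(observed)
    residual_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    total_ss = sum((o - mean_obs) ** 2 for o in observed)
    if total_ss == 0:
        raise ValueError("NSE is undefined when observations have zero variance")
    return 1.0 - residual_ss / total_ss


# Hypothetical monthly DO values in mg/L (assumption, not study data).
observed_do = [6.0, 5.0, 4.0, 5.0, 7.0, 8.0]
simulated_do = [5.5, 5.0, 4.5, 5.5, 6.5, 8.0]

nse = nash_sutcliffe_efficiency(observed_do, simulated_do)
print(f"NSE = {nse:.3f}")
```

Under this definition, a model that always predicts the observed mean scores 0, so the 0.80 threshold quoted in the abstract corresponds to a model that explains most of the variance in observed DO.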