Comparative analysis of different machine learning algorithms for predicting trace metal concentrations in soils under intensive paddy cultivation

IF 7.7 | Zone 1, Agricultural and Forestry Sciences | Q1 AGRICULTURE, MULTIDISCIPLINARY | Computers and Electronics in Agriculture | Pub Date: 2024-03-02 | DOI: 10.1016/j.compag.2024.108772
Mehmet Taşan , Yusuf Demir , Sevda Taşan , Elif Öztürk
Volume 219, Article 108772. Source: https://www.sciencedirect.com/science/article/pii/S0168169924001637

Abstract

Contamination of agricultural soils with trace metals is a concern because it poses potential long-term threats to water resources, aquatic species, and human health. Fast, accurate, and reliable methods are therefore needed to monitor the trace metal content of agricultural soils. This study compared the performance of five machine learning models (artificial neural network, ANN; deep neural network, DNN; random forest, RF; k-nearest neighbors, KNN; and adaptive boosting, AB) in estimating the heavy metal (Cu, Fe, Mn, and Zn) contents of soils under long-term intensive paddy farming. Model stability was also investigated. Based on correlation analysis, selected soil physicochemical parameters (EC, pH, Na, K, N) and soil depth were used as covariates to improve estimation accuracy. Model performance was assessed with the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE). Scatter plots, box plots, and Taylor diagrams were used for graphical comparison. With the highest R² and lowest RMSE values, the RF model gave the most accurate estimates for Cu (RMSE = 1.11 ppm, R² = 0.90), Fe (RMSE = 25.40 ppm, R² = 0.67), and Mn (RMSE = 9.05 ppm, R² = 0.59), and the ANN model for Zn (RMSE = 0.35 ppm, R² = 0.49). The AB model yielded the most stable estimates for Cu, and the ANN model for the other metals: the smallest change in RMSE between the training and testing datasets was 2.5 % (AB) for Cu, 10.38 % (ANN) for Fe, 21.35 % (ANN) for Mn, and 6.79 % (ANN) for Zn. Overfitting was observed in the RF model. Sensitivity analysis of the best and most stable models showed that EC, pH, and N in particular had a significant impact on Zn, Cu, Mn, and Fe accumulation in soils. The better performance of the ANN models resulted from their better modeling of the complex nonlinear relationships between soil heavy metal contents and the covariates. It was concluded that artificial intelligence-based methods can be used reliably and successfully to predict the trace metal content of paddy fields.
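The three performance metrics and the train/test stability figure named in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not give the formulas, so R² is taken here as 1 - SS_res/SS_tot (some studies use the squared correlation coefficient instead), and the "change in RMSE" is assumed to be the absolute difference between test and train RMSE relative to the train RMSE. The function names and the sample values are hypothetical and are not data from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target (e.g. ppm)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse_change_percent(rmse_train, rmse_test):
    """Assumed stability measure: relative train-to-test change in RMSE, in percent."""
    return abs(rmse_test - rmse_train) / rmse_train * 100.0

# Made-up observed and predicted concentrations, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]

print("RMSE:", rmse(observed, predicted))
print("MAE:", mae(observed, predicted))
print("R2:", r2(observed, predicted))
print("RMSE change (%):", rmse_change_percent(0.40, 0.41))
```

Under this reading, a model whose RMSE barely moves between the training and testing datasets counts as stable, while a large increase from training to testing is the usual signature of overfitting, as reported for the RF model.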
