[Soil Cadmium Prediction and Health Risk Assessment of an Oasis on the Eastern Edge of the Tarim Basin Based on Feature Optimization and Machine Learning].

Huanjing Kexue/Environmental Science (Q2, Environmental Science). Pub Date: 2024-08-08. DOI: 10.13227/j.hjkx.202308010
Jing-Yu Liu, Ruo-Yi Li, Yong-Chun Liang, Lei Liu, Fang Yin, Su Tang, Lin-Sen He, Yi Zhang

Abstract

Soil heavy metal pollution poses a serious threat to food security, human health, and soil ecosystems. Using 644 soil samples collected from a typical oasis on the eastern margin of the Tarim Basin, five models were built to predict soil heavy metal content: multiple linear regression (LR), back-propagation neural network (BP), random forest (RF), support vector machine (SVM), and radial basis function (RBF). The best prediction was then used to analyze the spatial distribution of heavy metal contamination and the associated health risks. The results showed that: ① The average Cd content in the study area was 0.14 mg·kg⁻¹, 1.17 times the soil background value of Xinjiang, making Cd the primary factor of soil heavy metal contamination in the area. The carcinogenic risk coefficients of Cd for both adults and children were below 10⁻⁴, indicating no significant long-term health risk to humans in the area. ② Among the five inversion models, the RF model had the highest validation-set R² (0.7637) and the smallest RMSE, MAE, and MBE, so its predictions were the most consistent with the measured soil Cd content. The soil Cd distribution map predicted by the RF model also coincided best with the interpolation map. ③ The RF model outperformed the other four models in predicting the health risks of soil Cd for both adults and children. By comparison, the LR predictions on the validation set varied greatly, making its results unreliable. Overall, RF was the best model for predicting soil Cd content and evaluating health risks in the study area, owing to its superior generalization capability and resistance to overfitting.
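
The abstract ranks the five models by R², RMSE, MAE, and MBE on the validation set but does not give the formulas. The sketch below uses the conventional definitions of these metrics and is not the authors' code. The function name `regression_metrics`, the sign convention for MBE (predicted minus measured), and the sample values are all assumptions made for illustration.

```python
import math


def regression_metrics(measured, predicted):
    """Return R2, RMSE, MAE and MBE for paired measured/predicted values.

    Conventional definitions; MBE is taken as mean(predicted - measured),
    which is an assumed sign convention.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]
    mean_measured = sum(measured) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all measured values are equal")
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mbe": sum(errors) / n,
    }


# Hypothetical Cd contents in mg/kg, invented for illustration only;
# these are not data from the study.
measured = [0.10, 0.12, 0.14, 0.16, 0.18]
predicted = [0.11, 0.12, 0.13, 0.17, 0.17]

metrics = regression_metrics(measured, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because MBE is signed, positive and negative errors cancel, so a value near zero indicates low systematic bias rather than low overall error. When models are compared on MBE, it is usually the magnitude that is ranked.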
