Prediction of the fluoride contents of different crop species via the random forest algorithm
Authors: Yuqi Zhang, Jie Luo, Siyao Feng, Xinying Ke, Heran Jia, Qiaohui Zhu
Journal: Environmental Geochemistry and Health (impact factor 3.2; JCR Q3, Engineering, Environmental)
Publication date: 2024-09-09
Publication type: Journal Article
DOI: 10.1007/s10653-024-02206-w (https://doi.org/10.1007/s10653-024-02206-w)
Citations: 0
Abstract
Fluoride (F) is a trace element that is essential to the human body and occurs naturally in the environment. However, a deficiency or excess of F in the environment can lead to human health issues. The pseudototal amount of F in soil often does not correlate directly with the F content in plants; instead, plant F content tends to correlate more strongly with the bioavailable F in soils. Large-scale soil surveys typically measure only the pseudototal elemental content of soils, which may not be a reliable basis for developing agricultural zoning plans. Plants also differ significantly in their ability to accumulate F from soil, and because elemental absorption mechanisms vary among plant species, an area where multiple crops are grown typically requires a separate study of each crop's absorption mechanisms. To address these issues, we examined the factors influencing F bioaccumulation coefficients in different crops based on 1:50,000 soil geochemical survey data. Using the random forest algorithm, four indicators (bioavailable P, bioavailable Zn, leachable Pb, and Sr) were selected from among 29 parameters to predict the F content within crops, replacing bioavailable F in the soil as the predictor. Compared with the multivariate linear regression (MLR) model, the random forest (RF) model provided more accurate and reliable predictions of the fluoride content in crops, with prediction accuracy improving by approximately 95.23%. The partial least squares regression (PLSR) model also offered improved accuracy over MLR, but the RF model still outperformed PLSR in terms of prediction accuracy and robustness. The RF approach also maximized the utilization of existing geochemical survey data, enabling cross-species studies for the first time and avoiding redundant evaluations of different types of agricultural products in the same region. We selected the Xining-Ledu region of Qinghai Province, China, as the study area and employed a random forest model to predict crop F content from soil data, providing a new methodological framework for crop production that effectively enhances agricultural quality and efficiency.
About the journal:
Environmental Geochemistry and Health publishes original research papers and review papers across the broad field of environmental geochemistry. Environmental geochemistry and health establishes and explains links between the natural or disturbed chemical composition of the earth’s surface and the health of plants, animals and people.
Beneficial elements regulate or promote enzymatic and hormonal activity whereas other elements may be toxic. Bedrock geochemistry controls the composition of soil and hence that of water and vegetation. Environmental issues, such as pollution, arising from the extraction and use of mineral resources, are discussed. The effects of contaminants introduced into the earth’s geochemical systems are examined. Geochemical surveys of soil, water and plants show how major and trace elements are distributed geographically. Associated epidemiological studies reveal the possibility of causal links between the natural or disturbed geochemical environment and disease. Experimental research illuminates the nature or consequences of natural or disturbed geochemical processes.
The journal particularly welcomes novel research linking environmental geochemistry and health issues on such topics as: heavy metals (including mercury), persistent organic pollutants (POPs), and mixed chemicals emitted through human activities, such as uncontrolled recycling of electronic-waste; waste recycling; surface-atmospheric interaction processes (natural and anthropogenic emissions, vertical transport, deposition, and physical-chemical interaction) of gases and aerosols; phytoremediation/restoration of contaminated sites; food contamination and safety; environmental effects of medicines; effects and toxicity of mixed pollutants; speciation of heavy metals/metalloids; effects of mining; disturbed geochemistry from human behavior, natural or man-made hazards; particle and nanoparticle toxicology; risk and the vulnerability of populations, etc.