Weichun Yang, Jiaxin Li, Kai Nie, Pengwei Zhao, Hui Xia, Qi Li, Qi Liao, Qingzhu Li, Chunhua Dong, Zhihui Yang, Mengying Si
Environmental Geochemistry and Health, volume 47, issue 1, page 2. Published 2024-11-28. Journal Article. Impact factor 3.2; JCR Q3 (Engineering, Environmental). DOI: https://doi.org/10.1007/s10653-024-02312-9
Machine learning-based identification of critical factors for cadmium accumulation in rice grains.
The accumulation of cadmium (Cd) in rice grains is a significant threat to human health. The complexity of the soil-rice system, with its numerous influencing parameters, highlights the need to identify the crucial factors responsible for Cd accumulation. This study uses machine learning (ML) modeling to predict Cd accumulation in rice grains and identify the influencing factors. A total of 474 data points from 77 published works were analyzed, and eight ML models were established using different algorithms. The input variables were total soil Cd concentration (TS Cd) and extractable Cd concentration (Ex-Cd), while rice Cd concentration (Cd_rice) was the output variable. Among the models, the Extremely Randomized Trees (ERT) model performed best (TS Cd: R² = 0.825; Ex-Cd: R² = 0.792), followed by Random Forest (TS Cd: R² = 0.721; Ex-Cd: R² = 0.719). The ERT feature importance ranking revealed that, with total soil Cd as the input variable, the essential factors responsible for Cd accumulation are cation exchange capacity (CEC), TS Cd, Water Management Model (WMM), and pH. With extractable Cd as the input variable, the vital factors are CEC, Ex-Cd, pH, and WMM. The study highlights the importance of the Water Management Model and its impact on Cd concentration in rice grains, which has been overlooked in previous research.
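The model comparison in the abstract rests on R² scores. The abstract does not say how these were computed (for example, on a held-out test set or by cross-validation), so the following is only a minimal sketch of the standard coefficient of determination, 1 - SS_res / SS_tot. The function name `r_squared` and the sample values are hypothetical and are not taken from the study's data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the observed mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical grain Cd concentrations, invented for illustration only.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]
print(round(r_squared(observed, predicted), 3))
```

A score of 1 means the predictions match the observations exactly, and a score of 0 means the model does no better than always predicting the mean. Read this way, an R² of 0.825 would indicate that the model accounts for roughly 82.5% of the variance in the evaluated grain Cd values.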
About the journal:
Environmental Geochemistry and Health publishes original research papers and review papers across the broad field of environmental geochemistry. Environmental geochemistry and health establishes and explains links between the natural or disturbed chemical composition of the earth’s surface and the health of plants, animals and people.
Beneficial elements regulate or promote enzymatic and hormonal activity whereas other elements may be toxic. Bedrock geochemistry controls the composition of soil and hence that of water and vegetation. Environmental issues, such as pollution, arising from the extraction and use of mineral resources, are discussed. The effects of contaminants introduced into the earth’s geochemical systems are examined. Geochemical surveys of soil, water and plants show how major and trace elements are distributed geographically. Associated epidemiological studies reveal the possibility of causal links between the natural or disturbed geochemical environment and disease. Experimental research illuminates the nature or consequences of natural or disturbed geochemical processes.
The journal particularly welcomes novel research linking environmental geochemistry and health issues on such topics as: heavy metals (including mercury), persistent organic pollutants (POPs), and mixed chemicals emitted through human activities, such as uncontrolled recycling of electronic-waste; waste recycling; surface-atmospheric interaction processes (natural and anthropogenic emissions, vertical transport, deposition, and physical-chemical interaction) of gases and aerosols; phytoremediation/restoration of contaminated sites; food contamination and safety; environmental effects of medicines; effects and toxicity of mixed pollutants; speciation of heavy metals/metalloids; effects of mining; disturbed geochemistry from human behavior, natural or man-made hazards; particle and nanoparticle toxicology; risk and the vulnerability of populations, etc.