Mapping surface soil organic carbon density of cultivated land using machine learning in Zhengzhou
Authors: Hengliang Guo, Jinyang Wang, Dujuan Zhang, Jian Cui, Yonghao Yuan, Haoming Bao, Mengjiao Yang, Jiahui Guo, Feng Chen, Wenge Zhou, Gang Wu, Yang Guo, Haitao Wei, Baojin Qiao, Shan Zhao
Journal: Environmental Geochemistry and Health, 47(1), 1 (journal article, published 2024-11-28)
DOI: https://doi.org/10.1007/s10653-024-02313-8
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11604695/pdf/
Impact factor: 3.2; JCR: Q3 (Engineering, Environmental); category: Environmental Science and Ecology, Region 3
Citation count: 0
Abstract
Research on soil organic carbon (SOC) is crucial for improving soil carbon sinks and achieving the "double-carbon" goal. This study introduces ten auxiliary variables based on data from a 2021 land quality survey in Zhengzhou and a multi-objective regional geochemical survey. It uses geostatistical ordinary kriging (OK) interpolation and two classical machine learning (ML) models, random forest (RF) and support vector machine (SVM), to map soil organic carbon density (SOCD) in the topsoil layer (0–20 cm) of cultivated land. To assess the generalization capability of the ML models, the sampling data are partitioned: Zhongmu County is designated as an independent test set (dataset2) and the remaining data form the training set (dataset1). The three models are trained on dataset1, and the trained ML models are applied directly to dataset2 to evaluate and compare their generalization performance. The distribution of SOCD and soil organic carbon stock (SOCS) across soil types and textures is analyzed using the optimal interpolation method. The results indicate that: (1) The average SOC densities predicted by OK interpolation, RF, and SVM are 3.70, 3.74, and 3.63 kg/m², with test-set precisions (R²) of 0.34 for OK, 0.81 for RF, and 0.60 for SVM. (2) ML achieves significantly higher predictive precision than traditional OK interpolation. The RF model's R² is 0.21 higher than that of the SVM model, and RF is more precise in estimating carbon stock. (3) When applied to dataset2, the RF model exhibits better generalization (R² = 0.52, MSE = 0.32) than the SVM model (R² = 0.32, MSE = 0.45). (4) Surface SOCD in the study area decreases from west to east and from south to north. The total carbon stock in the study area is estimated at approximately 10.76 × 10⁶ t. (5) Integrating soil attribute variables, climatic variables, remote sensing data, and machine learning techniques holds significant promise for high-precision, high-quality mapping of SOCD in agricultural soils.
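The region-based holdout described in the abstract (one county withheld as dataset2, models scored by R² and MSE) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the sample values are made up, the region labels `county_A` and `county_B` are placeholders, and a training-set mean stands in for the RF and SVM predictors, which would be fitted on `train` in practice.

```python
def split_by_region(samples, test_region):
    """Hold out every sample from one region as the independent test set."""
    train = [s for s in samples if s["region"] != test_region]
    test = [s for s in samples if s["region"] == test_region]
    return train, test


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical SOCD samples in kg/m^2 (illustrative values only).
samples = [
    {"region": "Zhongmu", "socd": 3.0},
    {"region": "Zhongmu", "socd": 4.0},
    {"region": "county_A", "socd": 3.5},
    {"region": "county_B", "socd": 4.5},
]

# dataset1 = train, dataset2 = test (the withheld county).
train, test = split_by_region(samples, test_region="Zhongmu")

# Stand-in model (assumption): predict the training-set mean everywhere.
baseline = sum(s["socd"] for s in train) / len(train)

y_true = [s["socd"] for s in test]
y_pred = [baseline] * len(test)

print("MSE:", mse(y_true, y_pred))
print("R2:", r2(y_true, y_pred))
```

Because the test region is never seen during training, R² on it can fall well below the random-split value and can even be negative, which is why a region-level holdout is a stricter check on generalization than a random split.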
About the journal:
Environmental Geochemistry and Health publishes original research papers and review papers across the broad field of environmental geochemistry. Environmental geochemistry and health establishes and explains links between the natural or disturbed chemical composition of the earth’s surface and the health of plants, animals and people.
Beneficial elements regulate or promote enzymatic and hormonal activity whereas other elements may be toxic. Bedrock geochemistry controls the composition of soil and hence that of water and vegetation. Environmental issues, such as pollution, arising from the extraction and use of mineral resources, are discussed. The effects of contaminants introduced into the earth’s geochemical systems are examined. Geochemical surveys of soil, water and plants show how major and trace elements are distributed geographically. Associated epidemiological studies reveal the possibility of causal links between the natural or disturbed geochemical environment and disease. Experimental research illuminates the nature or consequences of natural or disturbed geochemical processes.
The journal particularly welcomes novel research linking environmental geochemistry and health issues on such topics as: heavy metals (including mercury), persistent organic pollutants (POPs), and mixed chemicals emitted through human activities, such as uncontrolled recycling of electronic waste; waste recycling; surface-atmospheric interaction processes (natural and anthropogenic emissions, vertical transport, deposition, and physical-chemical interaction) of gases and aerosols; phytoremediation/restoration of contaminated sites; food contamination and safety; environmental effects of medicines; effects and toxicity of mixed pollutants; speciation of heavy metals/metalloids; effects of mining; disturbed geochemistry from human behavior, natural or man-made hazards; particle and nanoparticle toxicology; and risk and the vulnerability of populations.