A long-term global Mollisols SOC content prediction framework: Integrating prior knowledge, geographical partitioning, and deep learning models with spatio-temporal validation
Xiangtian Meng, Yilin Bao, Xinle Zhang, Chong Luo, Huanjun Liu
Remote Sensing of Environment (Impact Factor 11.1, JCR Q1, Environmental Sciences). Published 2024-12-31. DOI: 10.1016/j.rse.2024.114592 (https://doi.org/10.1016/j.rse.2024.114592)
Citations: 0
Abstract
Soil organic carbon (SOC) content has recently declined across the global Mollisols region due to erosion, intensive agriculture, and other factors, weakening the soil's capacity to buffer climate change and making monitoring of SOC dynamics urgent. Large-scale SOC content monitoring with remote sensing faces two challenges: extracting advanced features from remote sensing data, and mitigating the negative impact of high spatial heterogeneity in SOC content on prediction accuracy. To address these challenges, we collected 8984 samples, 956,423 Landsat TM/OLI images, Shuttle Radar Topography Mission digital elevation model (SRTM DEM) data, and meteorological data. We developed a Geographic Knowledge Dataset (GEKD) incorporating prior knowledge of soil formation and erosion processes, and input the GEKD into a Probability Hybrid Model (PHM). In the PHM, we applied a fuzzy Gaussian mixture model to cluster the global Mollisols region and calculate the corresponding membership probabilities. We then built a high-accuracy SOC content prediction model by integrating an attention mechanism, Convolutional Neural Networks, and Convolutional Long Short-Term Memory Networks (A-CNN-ConvLSTM). Finally, we generated spatial maps of SOC content at 30 m resolution for 8 periods since 1984 and verified the accuracy of their spatial distribution and temporal variation patterns. The results showed that (1) the highest SOC content prediction accuracy (RMSE = 7.17 g/kg, R² = 0.72, and RPIQ = 1.92) was achieved when GEKD was input into PHM using the A-CNN-ConvLSTM algorithm. (2) PHM effectively reduces the negative impact of high SOC spatial heterogeneity on prediction accuracy, resulting in a smoother spatial distribution at cluster boundaries. Compared to the global model, PHM reduced RMSE by 1.66 g/kg and improved R² and RPIQ by 0.06 and 0.15, respectively. (3) Compared to the commonly used random forest algorithm, A-CNN-ConvLSTM reduced RMSE by 1.50 g/kg and improved R² and RPIQ by 0.13 and 0.47, respectively. The spatial context features extracted by the CNN structure in the A-CNN-ConvLSTM algorithm are the most effective in improving SOC content prediction accuracy. (4) Currently, the SOC content across continents in the global Mollisols region is ranked as follows: Siberia (27.21 g/kg) > Europe (26.78 g/kg) > Asia (20.48 g/kg) > North America (20.43 g/kg) > South America (16.49 g/kg). Since 1984, SOC content has shown a decreasing trend, with the global Mollisols region losing 1.91 g/kg overall. The Asian Mollisols region experienced the largest decline (2.93 g/kg), while Siberia saw the smallest decrease (1.45 g/kg).
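The abstract does not spell out how the PHM combines cluster probabilities with model outputs, nor how the metrics are computed, so the following is only an illustrative sketch under stated assumptions. It uses a plain one-dimensional Gaussian mixture (not the authors' fuzzy variant), assumes the final prediction is a probability-weighted average of per-cluster predictions, and uses the usual definition of RPIQ as the interquartile range of the observations divided by RMSE. All function names and numeric values are hypothetical.

```python
import math
import statistics


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def memberships(x, components):
    """Posterior probability that x belongs to each mixture component.

    components: list of (weight, mean, std) tuples (hypothetical values).
    Assumes x is close enough to some component that the total density
    does not underflow to zero.
    """
    dens = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    total = sum(dens)
    return [d / total for d in dens]


def blend(probs, cluster_preds):
    """Probability-weighted average of per-cluster predictions (assumed rule)."""
    return sum(p * y for p, y in zip(probs, cluster_preds))


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def rpiq(obs, pred):
    """Interquartile range of the observations divided by RMSE."""
    q1, _, q3 = statistics.quantiles(obs, n=4)
    return (q3 - q1) / rmse(obs, pred)


# A site halfway between two equally weighted clusters gets equal memberships,
# so its blended prediction is the mean of the two cluster-model outputs.
components = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]
probs = memberships(2.0, components)
blended = blend(probs, [20.0, 30.0])  # 25.0

# Toy observed vs. predicted SOC values in g/kg (made up for illustration).
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 39.0]
scores = (rmse(obs, pred), r2(obs, pred), rpiq(obs, pred))
```

Under this assumed blending rule, a location near a cluster boundary draws on both neighbouring cluster models in proportion to its memberships, which is one way a soft partition could yield the smoother boundary behaviour reported in result (2). Note that `statistics.quantiles` uses its default "exclusive" method here; other quartile conventions give slightly different RPIQ values.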
About the journal:
Remote Sensing of Environment (RSE) serves the Earth observation community by disseminating results on the theory, science, applications, and technology that contribute to advancing the field of remote sensing. With a thoroughly interdisciplinary approach, RSE encompasses terrestrial, oceanic, and atmospheric sensing.
The journal emphasizes biophysical and quantitative approaches to remote sensing at local to global scales, covering a diverse range of applications and techniques.
RSE serves as a vital platform for the exchange of knowledge and advancements in the dynamic field of remote sensing.