Shuaishuai Shi, Nan Wang, Songchao Chen, Bifeng Hu, Jie Peng, Zhou Shi
Ecological Informatics, vol. 85, Article 102982. Published 2025-03-01 (Epub 2024-12-27). Journal Article; not open access. Impact factor 7.3 (JCR Q1, Ecology).
DOI: 10.1016/j.ecoinf.2024.102982
https://www.sciencedirect.com/science/article/pii/S1574954124005247
Citations: 0
Digital mapping of soil salinity with time-windows features optimization and ensemble learning model
Abstract
Soil salinization poses considerable global environmental and ecological risks. Remote-sensing time-series data enable more accurate monitoring and prediction of soil salinity levels, offering a refined approach to soil salinization assessment. However, current limitations of time-series data analysis, particularly in timely and effective information extraction, hinder high-precision soil salinity assessments. This study proposes a data-mining approach using Sentinel-1 time-series data that integrates time-series decomposition and feature selection to capture the seasonal and trend components correlated with soil salinity and to determine optimal time windows and effective time spans. An advanced feature-selection algorithm was then applied to refine the model-relevant features, and the transferability of the method across different regions was validated through empirical testing. The results revealed a 12-month periodicity in the correlation between Sentinel-1 time-series features and soil salinity, with an annual decay rate of 0.0019. In the study area, the optimal time window was from July to September, with the maximum effective years ranging from 19 to 21. Recursive feature elimination showed that the importance of SAR features increased gradually from single-temporal to multi-temporal to time-series data. Time-series analysis combined with feature selection not only significantly reduced data volumes but also improved the prediction accuracy of the model: compared with single-temporal data, the R² of the prediction set increased by 0.11 and the root mean square error decreased by 3.08 g kg⁻¹. Furthermore, the results demonstrate that the ensemble model outperforms the individual models in inversion accuracy, and that the time-series mining method generalizes across diverse study areas and metrics. Combining the time-series mining method with the ensemble model helps achieve higher accuracy in digital soil mapping.
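The idea of searching for an optimal time window can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a per-site mean monthly profile as a crude stand-in for the seasonal component of a decomposition, and it assumes absolute Pearson correlation with measured salinity as the window-selection criterion. All function names, the three-month window width, and the synthetic data are hypothetical; the data are built so that only July to September carries a salinity signal, purely to make the example checkable.

```python
import math
from statistics import mean

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def seasonal_profile(series, period=12):
    """Mean of each calendar month over all years (crude seasonal component)."""
    return [mean(series[m::period]) for m in range(period)]

def best_window(site_series, salinity, width=3, period=12):
    """Return (start_month_index, |r|) for the window of `width` months
    whose mean profile value correlates most strongly with salinity."""
    profiles = [seasonal_profile(s, period) for s in site_series]
    best = None
    for start in range(period - width + 1):
        feature = [mean(p[start:start + width]) for p in profiles]
        r = abs(pearson(feature, salinity))
        if best is None or r > best[1]:
            best = (start, r)
    return best

def synthetic_series(site, s, years=3):
    """Hypothetical monthly backscatter: a shared seasonal cycle, plus a
    salinity-dependent term in Jul-Sep and deterministic noise elsewhere."""
    values = []
    for t in range(12 * years):
        m = t % 12
        base = -15.0 + 3.0 * math.cos(2 * math.pi * (m - 7) / 12)
        if 6 <= m <= 8:
            values.append(base + 0.2 * s)
        else:
            values.append(base + ((7 * site + 3 * t) % 5) - 2)
    return values

# Hypothetical salinity measurements (g/kg) at six sampling sites.
salinity = [2.0, 5.0, 9.0, 14.0, 20.0, 27.0]
site_series = [synthetic_series(i, s) for i, s in enumerate(salinity)]

start, r = best_window(site_series, salinity)
print(f"best window: {MONTHS[start]}-{MONTHS[start + 2]}, |r| = {r:.3f}")
```

On real Sentinel-1 data the monthly profile would more plausibly come from a proper seasonal-trend decomposition, and the selected window features would then feed a feature-selection step and the prediction model; both are outside the scope of this sketch.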
About the journal:
The journal Ecological Informatics is devoted to the publication of high quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data as well as the critical need for informing sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.