Digital mapping of soil salinity with time-windows features optimization and ensemble learning model
Shuaishuai Shi, Nan Wang, Songchao Chen, Bifeng Hu, Jie Peng, Zhou Shi
Ecological Informatics, volume 85, Article 102982. Published 2024-12-27. DOI: 10.1016/j.ecoinf.2024.102982
https://www.sciencedirect.com/science/article/pii/S1574954124005247
Citations: 0
Abstract
Soil salinization poses considerable global environmental and ecological risks. Remote-sensing time-series data enable more accurate monitoring and prediction of soil salinity levels, offering a refined approach to soil salinization assessment. However, current limitations of time-series data analysis, particularly in timely and effective information extraction, hinder high-precision soil salinity assessments. This study proposes a data-mining approach using Sentinel-1 time-series data that integrates time-series decomposition and feature selection to capture the seasonal and trend components correlated with soil salinity and to determine optimal time windows and effective time spans. An advanced feature-selection algorithm was then applied to refine the model-relevant features, and the transferability of the method across different regions was validated through empirical testing. The results revealed a 12-month periodicity in the correlation between Sentinel-1 time-series features and soil salinity, with an annual decay rate of 0.0019. In the study area, the optimal time window was from July to September, with the maximum effective years ranging from 19 to 21. Recursive feature elimination showed a gradually increasing trend in the importance of SAR features from single-temporal to multi-temporal to time-series data. Compared with single-temporal data, the time-series analysis combined with feature selection not only significantly reduced data volumes but also improved the prediction accuracy of the model: the R² of the prediction set increased by 0.11, with a reduced root mean square error of 3.08 g kg⁻¹. Furthermore, the results demonstrate that the ensemble model outperforms the individual models in terms of inversion accuracy, and that the time-series mining method generalizes across diverse study areas and metrics. The combination of the time-series mining method with the ensemble model helps achieve higher accuracy in digital soil mapping.
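The time-window idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a classical additive decomposition (centred moving-average trend plus monthly seasonal means), a plain Pearson correlation scan over consecutive three-month windows, and fully synthetic backscatter data in which salinity is made to affect only July to September. All function names, parameters, and numeric values below are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def seasonal_component(series, period=12):
    """Classical additive decomposition; returns one seasonal value per month.

    The trend is a centred 2 x `period` moving average. The seasonal
    component is the mean detrended value at each position in the cycle,
    re-centred so that it sums to zero over one period.
    """
    half = period // 2
    detrended = {}
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        # End points get half weight so the window spans exactly one period.
        trend = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
        detrended.setdefault(t % period, []).append(series[t] - trend)
    means = [sum(detrended[m]) / len(detrended[m]) for m in range(period)]
    centre = sum(means) / period
    return [m - centre for m in means]


def best_window(seasonal_by_site, salinity, width=3):
    """Scan consecutive `width`-month windows (no wrap-around across years).

    Returns (start_month_index, r) for the window whose mean seasonal value
    has the largest absolute correlation with salinity across sites.
    """
    best = None
    for start in range(12 - width + 1):
        feature = [sum(s[start:start + width]) / width for s in seasonal_by_site]
        r = pearson(feature, salinity)
        if best is None or abs(r) > abs(best[1]):
            best = (start, r)
    return best


def make_synthetic(n_sites=200, n_years=5, seed=0):
    """Synthetic monthly backscatter (dB) per site; all values are made up."""
    rng = random.Random(seed)
    salinity, seasonal_by_site = [], []
    for _ in range(n_sites):
        s = rng.uniform(0.0, 30.0)  # hypothetical salinity, g/kg
        series = []
        for t in range(12 * n_years):
            month = t % 12
            value = -15.0 + 0.01 * t + 2.0 * math.cos(2 * math.pi * month / 12)
            if month in (6, 7, 8):  # July to September, 0 = January
                value += 0.1 * s  # salinity signal injected by assumption
            value += rng.gauss(0.0, 0.3)
            series.append(value)
        salinity.append(s)
        seasonal_by_site.append(seasonal_component(series))
    return seasonal_by_site, salinity


start, r = best_window(*make_synthetic())
print(f"best window starts at month index {start} (0 = January), r = {r:.3f}")
```

Scanning windows on the decomposed seasonal component rather than on raw observations keeps the slow trend from leaking into the correlation. A real workflow would replace the synthetic generator with Sentinel-1 backscatter extracted at sampling sites.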
About the journal:
The journal Ecological Informatics is devoted to the publication of high quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data as well as the critical need for informing sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.