Li Wang;Yong Zhou;Zehan Zhou;Shangrong Wu;Lang Xia;Yan Zha;Peng Yang
A Spectral Hierarchical Machine Learning for Predicting Arsenic Concentration in Farmland Soil Using Sentinel-2 Imagery
IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-14, published 2025-01-22. DOI: 10.1109/TGRS.2025.3532678. https://ieeexplore.ieee.org/document/10849777/
Citation count: 0
Abstract
Accurately predicting arsenic (As) concentration in farmland soil at large scale is essential for preventing and managing soil pollution in agricultural areas and thereby safeguarding food security. Multispectral imaging offers a cost-effective and efficient means of monitoring As concentration across extensive farmland regions. However, the processes and mechanisms underlying the relationship between As concentration in farmland soil and spectral data remain uncertain. The primary aim of this study was to evaluate whether a hierarchical strategy (based on soil organic matter (SOM) and pH) predicts As concentration in farmland soil more accurately than nonhierarchical (global) models. Among the global models, the convolutional neural network (CNN) gave the best prediction of As concentration (validated ratio of performance to interquartile distance (RPIQ) = 2.50), followed by the Cubist model (validated RPIQ = 2.19) and the extreme learning machine (ELM) model (validated RPIQ = 2.15). After SOM-based hierarchization, the Cubist model exhibited the highest prediction accuracy (validated coefficient of determination ($R^{2}$) = 0.73), a 0.02 improvement in $R^{2}$ over the global CNN model. The clay mineral ratio (CMR) was identified as the most important variable for predicting As concentration. Notably, the identification of high As concentration in the central old town areas underscores the importance of early soil contamination risk warnings for arable land. These findings indicate that SOM-hierarchical machine learning models could be an effective way to account for the influence of soil environmental complexity on spectral prediction of As concentration in farmland soil. Implementing the proposed method could significantly improve soil environmental monitoring.
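To make the two reported metrics and the hierarchical idea concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes the common definitions RPIQ = (Q3 - Q1) / RMSE and $R^{2}$ = 1 - SS_res / SS_tot. It also swaps in an ordinary least-squares line for the Cubist, CNN, and ELM models, and uses a synthetic feature with a made-up two-class SOM label. All function names, the data, and the class labels are hypothetical.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def rpiq(y_true, y_pred):
    """Interquartile range of the observations divided by the RMSE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    q1, q3 = np.percentile(y_true, [25, 75])
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return (q3 - q1) / rmse


def fit_linear(X, y):
    """Least-squares fit with an intercept (stand-in for Cubist/CNN/ELM)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef


def fit_hierarchical(X, y, strata):
    """Fit one model per stratum (e.g. per SOM class) instead of one global model."""
    return {s: fit_linear(X[strata == s], y[strata == s]) for s in np.unique(strata)}


def predict_hierarchical(models, X, strata):
    """Route each sample to the model of its own stratum.

    Assumes every stratum in `strata` was seen during fitting.
    """
    y_pred = np.empty(len(X))
    for s, coef in models.items():
        mask = strata == s
        y_pred[mask] = predict_linear(coef, X[mask])
    return y_pred


# Synthetic illustration: the feature-to-As relationship differs by SOM class.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(40, 1))   # hypothetical spectral feature
som_class = np.arange(40) % 2              # hypothetical low/high SOM label
y = np.where(som_class == 0, 2.0 * x[:, 0] + 1.0, -3.0 * x[:, 0] + 5.0)

y_global = predict_linear(fit_linear(x, y), x)
y_hier = predict_hierarchical(fit_hierarchical(x, y, som_class), x, som_class)

print("global       R2:", r2_score(y, y_global), "RPIQ:", rpiq(y, y_global))
print("hierarchical R2:", r2_score(y, y_hier), "RPIQ:", rpiq(y, y_hier))
```

The comparison here is in-sample on noise-free synthetic data, so it only illustrates why stratifying can help when the spectral response depends on a soil property. The study itself reports validated metrics, which would require held-out samples.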
About the journal:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.