Francisco Mena, Deepak Pathak, Hiba Najjar, Cristhian Sanchez, Patrick Helber, Benjamin Bischke, Peter Habelitz, Miro Miranda, Jayanth Siddamsetty, Marlon Nuske, Marcela Charfuelan, Diego Arenas, Michaela Vollmer, Andreas Dengel
Remote Sensing of Environment, Volume 318, Article 114547. Published 2024-12-12. DOI: 10.1016/j.rse.2024.114547
Adaptive fusion of multi-modal remote sensing data for optimal sub-field crop yield prediction
Accurate crop yield prediction is of utmost importance for informed decision-making in agriculture, aiding farmers, industry stakeholders, and policymakers in optimizing agricultural practices. However, this task is complex and depends on multiple factors, such as environmental conditions, soil properties, and management practices. Remote Sensing (RS) technologies enable the collection of multi-modal data from diverse global data sources, enhancing predictive model accuracy. However, combining heterogeneous RS data poses a fusion challenge, such as identifying the specific contribution of each modality to the predictive task. In this paper, we present a novel multi-modal learning approach to predict crop yield for different crops (soybean, wheat, rapeseed) and regions (Argentina, Uruguay, and Germany). Our multi-modal input data includes multi-spectral optical images from Sentinel-2 satellites and weather data as dynamic features during the crop growing season, complemented by static features like soil properties and topographic information. To effectively fuse the multi-modal data, we introduce a Multi-modal Gated Fusion (MMGF) model, comprising dedicated modality-encoders and a Gated Unit (GU) module. The modality-encoders handle the heterogeneity of data sources with varying temporal resolutions by learning a modality-specific representation. These representations are adaptively fused via a weighted sum. The fusion weights are computed for each sample by the GU using a concatenation of the multi-modal representations. The MMGF model is trained at sub-field level with 10 m resolution pixels. Our evaluations show that the MMGF outperforms conventional models on the same task, achieving the best results by incorporating all the data sources, unlike the usual fusion results in the literature.
For Argentina, the MMGF model achieves an R² value of 0.68 for sub-field yield prediction, while at the field-level evaluation (comparing field averages) it reaches around 0.80 across different countries. The GU module learned different weights based on the country and crop type, aligning with the variable significance of each data source to the prediction task. This novel method has proven its effectiveness in enhancing the accuracy of the challenging sub-field crop yield prediction. Our investigation indicates that the gated fusion approach promises a significant advancement in the field of agriculture and precision farming.
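The gated fusion described in the abstract — per-sample weights computed by a Gated Unit from concatenated modality representations, then a weighted sum — can be sketched as follows. This is a minimal NumPy illustration of the idea, not the paper's implementation: the embedding size, the single linear gate layer, the softmax normalization, and the random stand-ins for the learned modality-encoders are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes: 3 modalities (e.g. optical, weather, static), d-dim embeddings.
n_modalities, d, batch = 3, 16, 4

# Outputs of the modality-specific encoders (random stand-ins here).
reps = [rng.normal(size=(batch, d)) for _ in range(n_modalities)]

# Gated Unit: a linear map over the concatenated representations yields one
# weight per modality per sample; softmax makes them a convex combination.
W_gate = rng.normal(size=(n_modalities * d, n_modalities))
concat = np.concatenate(reps, axis=1)              # (batch, 3*d)
alpha = softmax(concat @ W_gate, axis=1)           # (batch, 3) fusion weights

# Adaptive fusion: per-sample weighted sum of the modality representations.
stacked = np.stack(reps, axis=1)                   # (batch, 3, d)
fused = (alpha[:, :, None] * stacked).sum(axis=1)  # (batch, d)
```

Because the weights are produced per sample, the same trained gate can emphasize weather for one pixel and optical imagery for another, which matches the abstract's observation that the learned weights vary by country and crop type.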
Journal introduction:
Remote Sensing of Environment (RSE) serves the Earth observation community by disseminating results on the theory, science, applications, and technology that contribute to advancing the field of remote sensing. With a thoroughly interdisciplinary approach, RSE encompasses terrestrial, oceanic, and atmospheric sensing.
The journal emphasizes biophysical and quantitative approaches to remote sensing at local to global scales, covering a diverse range of applications and techniques.
RSE serves as a vital platform for the exchange of knowledge and advancements in the dynamic field of remote sensing.