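The impact of spatiotemporal variability of environmental conditions on wheat yield forecasting using remote sensing data and machine learning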

Keltoum Khechba, Mariana Belgiu, Ahmed Laamrani, Alfred Stein, Abdelhakim Amazirh, Abdelghani Chehbouni
Journal: International journal of applied earth observation and geoinformation : ITC journal, volume 136, Article 104367
Publication date: 2025-02-01
DOI: 10.1016/j.jag.2025.104367
Impact factor: 7.6 (JCR Q1, Remote Sensing)
URL: https://www.sciencedirect.com/science/article/pii/S1569843225000147

Abstract

Climate change poses significant challenges to food security, especially in semi-arid agricultural areas. Effective monitoring of crop yield is important for establishing food emergency responses and developing long-term sustainable strategies. In Morocco, where cereals are the predominant crops, yield forecasting is important for addressing the yield gap because it enables farmers to take preventive action before the harvesting period. This study assesses the impact of spatial and temporal heterogeneity of environmental conditions on wheat yield forecasting using machine learning models. It compares the 2019–2020 and 2020–2021 agricultural seasons using three sets of variables: (1) spectral indices; (2) weather data; and (3) a combination of both spectral indices and weather data. Weather data, including cumulative monthly precipitation from ERA5 data and average monthly temperature from PERSIANN data, were extracted for the wheat growing season (November to June). Spectral indices, including the Normalized Difference Vegetation Index, Moisture Stress Index, and Terrestrial Chlorophyll Index, were calculated from Sentinel-2 imagery for the same period and processed using Google Earth Engine. The study area was divided into homogeneous zones based on an existing landform classification, and XGBoost and Random Forest (RF) models were used for yield forecasting in each zone separately. The two models performed equally well across both the zones and the whole study area (SA) when using weather data as the input variable. For instance, across SA, they achieved average R² values of 0.60 and 0.81 for all months during the 2019–2020 and 2020–2021 agricultural seasons, respectively. However, when using spectral indices or combining these indices with weather data, RF consistently outperformed XGBoost. For example, in SA during the 2019–2020 season, RF achieved an average R² of 0.48 across the growing season, compared to XGBoost's R² of 0.43. Similarly, in the 2020–2021 season, RF achieved an R² of 0.35 and an RMSE of 1083 kg ha⁻¹, while XGBoost performed slightly worse, with an R² of 0.29 and an RMSE of 1137 kg ha⁻¹. Comparing prediction accuracy between the seasons for each set of variables, the RF model performed better with spectral indices during the relatively dry 2019–2020 season than during the wet 2020–2021 season. Incorporating weather data improved the model's performance for the 2020–2021 season. April showed the highest prediction performance overall, with R² values of 0.6 for SA using weather data alone in the 2019–2020 season, and 0.8 for SA using a combination of weather data and spectral indices in the 2020–2021 season. The 2019–2020 season showed strong fluctuations in accuracy throughout the growing season, whereas accuracy improved consistently over time in the 2020–2021 season. These variations in accuracy are due to differing environmental conditions, which should be taken into account to make better and more reliable yield predictions.
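The abstract compares models by R² and RMSE (in kg ha⁻¹). As a reminder of how these two scores are usually computed, here is a minimal sketch in plain Python. The function names and the yield values are hypothetical and chosen only for illustration; they are not data from the study, and the authors' actual evaluation code is not shown in the source.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Sum of squared residuals between observations and predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares around the observed mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error, in the same unit as the inputs."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical wheat yields in kg/ha (illustrative values, not from the paper).
observed = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1500.0, 1500.0, 3500.0, 3500.0]

print(f"R2   = {r_squared(observed, predicted):.2f}")
print(f"RMSE = {rmse(observed, predicted):.0f} kg/ha")
```

Because RMSE keeps the unit of the target, values such as 1083 versus 1137 kg ha⁻¹ can be read directly as typical yield errors, whereas R² is unitless and depends on how much the observed yields vary within the zone and season being evaluated.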
Source journal
International journal of applied earth observation and geoinformation : ITC journal