Comparative evaluation of statistical and machine learning models for weather-driven wheat yield forecasting across different districts of Punjab
Authors: Kulwinder Kaur Gill, Kavita Bhatt, Akansha, Parul Setiya, Sandeep Singh Sandhu, Baljeet Kaur
Journal: Arabian Journal of Geosciences, volume 17 (10), published 2024-09-21
Impact factor: 1.827 (JCR Q2, Earth and Planetary Sciences)
DOI: 10.1007/s12517-024-12077-1
URL: https://link.springer.com/article/10.1007/s12517-024-12077-1
Citation count: 0
Abstract
Predicting crop yields before harvest is important for making and carrying out policies on food safety, transportation costs, import and export, storage, and the sale of agricultural goods. Weather is a key factor in crop growth and development, so models that include meteorological variables can produce reliable forecasts of crop output; however, selecting an appropriate model for agricultural production forecasting can be challenging. This study develops wheat yield prediction models using several multivariate analysis techniques and weather indices derived from meteorological data collected over 22 years in Punjab, India. Five modeling approaches were compared for their effectiveness in predicting wheat yield: stepwise multiple linear regression (SMLR), LASSO, elastic net (ELNET), artificial neural network (ANN), and ridge regression. The models were calibrated using data from 17 years (2000–01 to 2016–17) and validated using data from the subsequent 5 years (2017–18 to 2021–22). Model performance was assessed using R², root mean square error (RMSE), normalized root mean square error (NRMSE), mean bias error (MBE), and modeling efficiency (EF). The results indicate varying degrees of performance across districts and modeling techniques. ANN demonstrated the highest performance during both the calibration and validation periods, followed closely by LASSO and ELNET. However, model fit differed between districts, with the best-performing model depending on the specific district. Overall, ANN emerged as the most reliable approach for wheat yield prediction in Punjab, followed by ELNET and LASSO, offering valuable insights for agricultural planning and management. This analysis contributes to the field of crop yield prediction by improving understanding of the complex interactions between weather variables and agricultural outcomes.
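The abstract names five evaluation metrics without giving their formulas. As an illustration only, the sketch below computes them with definitions that are common in crop modeling: R² as the squared Pearson correlation, NRMSE as RMSE expressed as a percentage of the observed mean, MBE as the mean of predicted minus observed, and EF as the Nash-Sutcliffe form. These definitions, the function name `evaluate_forecast`, and the sample yields are assumptions, not taken from the paper.

```python
import math


def evaluate_forecast(observed, predicted):
    """Return R2, RMSE, NRMSE (%), MBE and EF for paired yield values.

    The formulas are common textbook definitions and are assumed here;
    the paper may define some metrics differently.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    # Error is taken as predicted minus observed, so a positive MBE
    # means the model overestimates on average.
    errors = [p - o for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in errors)

    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if ss_obs == 0 or ss_pred == 0:
        raise ValueError("R2 and EF are undefined for a constant series")

    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    rmse = math.sqrt(sse / n)

    return {
        "R2": cov * cov / (ss_obs * ss_pred),  # squared Pearson correlation
        "RMSE": rmse,
        "NRMSE": 100.0 * rmse / mean_obs,      # percent of observed mean
        "MBE": sum(errors) / n,
        "EF": 1.0 - sse / ss_obs,              # 1 is a perfect fit
    }


if __name__ == "__main__":
    # Hypothetical yields (kg/ha) for five validation seasons.
    observed = [4500, 4700, 5000, 4800, 5100]
    predicted = [4600, 4650, 4900, 4900, 5000]
    for name, value in evaluate_forecast(observed, predicted).items():
        print(f"{name}: {value:.3f}")
```

Reporting EF alongside R² is useful because R² ignores systematic bias: a model that overpredicts every season by a constant amount still has R² = 1, while its EF and MBE reveal the offset.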
About the journal:
The Arabian Journal of Geosciences is the official journal of the Saudi Society for Geosciences and publishes peer-reviewed original and review articles on the entire range of Earth Science themes, focused on, but not limited to, those that have regional significance to the Middle East and the Euro-Mediterranean Zone.
Key topics therefore include geology, hydrogeology, earth system science, petroleum sciences, geophysics, seismology and crustal structures, tectonics, sedimentology, palaeontology, metamorphic and igneous petrology, natural hazards, environmental sciences and sustainable development, geoarchaeology, geomorphology, paleo-environment studies, oceanography, atmospheric sciences, GIS and remote sensing, geodesy, mineralogy, volcanology, geochemistry, and metallogenesis.