Ahmad Alsaber, Parul Setiya, Anurag Satpathi, Abrar Aljamaan, Jiazhu Pan
Advancing pearl millet yield forecasting: Comparative analysis of individual and ensemble machine learning approaches over Rajasthan, India
PLoS ONE, volume 20, issue 3, e0317602. Published 2025-03-11. DOI: 10.1371/journal.pone.0317602 (https://doi.org/10.1371/journal.pone.0317602)
Abstract
Pearl millet (Pennisetum glaucum L.) is a resilient crop known for its ability to thrive in arid and semi-arid regions, making it a crucial staple in drought-prone areas. Rajasthan, a state in India, has emerged as the top producer of pearl millet. This study enhances yield forecasting for pearl millet using machine learning models across nine districts of Rajasthan, India: Jaipur, Ajmer, Jodhpur, Bikaner, Bharatpur, Alwar, Sikar, Jhunjhunu and Nagaur. Data from 1997-2019 (23 years), comprising yield data from the Directorate of Economics and Statistics and weather data from the NASA POWER web portal, were analysed. The study employed individual machine learning methods (GLM, ELNET, XGB, SVR and RF) and ensemble combinations (GLM, ELNET, Cubist and RF). Identifying a single best-performing model across all locations proved challenging. For instance, ensemble models performed poorly in Barmer and Nagaur, but their performance ranged from satisfactory to commendable in other locations. To identify the best model, all models were ranked by their R² and nRMSE (%) values. Combined average ranks over training and testing gave the ordering I-XGB (3.83) > I-GLM (4.28) > E-ELNET (4.32) > I-RF (4.67) > E-GLM (4.88) > I-SVR (4.90) > I-ELNET (4.94) > E-RF (6.03) > E-Cubist (7.15), where I denotes an individual model and E an ensemble model. Notably, the individual GLM and XGB models performed best during calibration but worse during validation, potentially indicating overfitting. Hence, the ensemble ELNET approach is recommended for accurate prediction of pearl millet yield, followed by the individual RF model. These results underscore the importance of tailoring model selection to specific geographic and environmental conditions.
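The ranking step described in the abstract (scoring each model by R² and nRMSE, then averaging the ranks) can be illustrated with a minimal sketch. This is not the authors' code. It assumes nRMSE is the RMSE divided by the mean observed yield, expressed in percent, and that the combined rank is the plain mean of the two metric ranks; the abstract states neither definition. The model names and yield values below are hypothetical toy numbers, not data from the study.

```python
import math


def r2_score(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def nrmse_percent(obs, pred):
    """RMSE normalised by the mean observed value, in percent (assumed definition)."""
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))
    return 100.0 * rmse / (sum(obs) / len(obs))


def rank_values(values, higher_is_better):
    """Return 1-based ranks (1 = best); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def average_ranks(observed, predictions):
    """Map each model name to the mean of its R2 rank and its nRMSE rank."""
    names = list(predictions)
    r2_ranks = rank_values([r2_score(observed, predictions[n]) for n in names], True)
    nrmse_ranks = rank_values([nrmse_percent(observed, predictions[n]) for n in names], False)
    return {n: (a + b) / 2 for n, a, b in zip(names, r2_ranks, nrmse_ranks)}


# Hypothetical toy data: observed yields and three made-up models.
observed = [1000.0, 1200.0, 900.0, 1100.0]
predictions = {
    "model_a": [1010.0, 1190.0, 910.0, 1090.0],
    "model_b": [1100.0, 1100.0, 1000.0, 1000.0],
    "model_c": [1000.0, 1200.0, 900.0, 1100.0],
}
print(average_ranks(observed, predictions))
```

A lower average rank means a better model under this scheme. Producing a single combined figure per model, as the abstract reports, would require repeating the calculation for the training and testing splits (and for each district) and averaging the resulting ranks.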
About the journal:
PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open-access—freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage