Sayantan Sarkar, Javier M Osorio Leyton, Efrain Noa-Yarasca, Kabindra Adhikari, Chad B Hajda, Douglas R Smith
Integrating Remote Sensing and Soil Features for Enhanced Machine Learning-Based Corn Yield Prediction in the Southern US
Sensors, vol. 25, no. 2. Published 2025-01-18. DOI: 10.3390/s25020543 (https://doi.org/10.3390/s25020543). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11769266/pdf/
Citations: 0
Abstract
Efficient and reliable corn (Zea mays L.) yield prediction is important for varietal selection by plant breeders and management decision-making by growers. Unlike prior studies that focus mainly on county-level or controlled laboratory-scale areas, this study targets a production-scale area, which better represents real-world agricultural conditions and offers more practical relevance for farmers. The objectives of our study were to determine the best combination of vegetation indices and abiotic factors for predicting corn yield in a rain-fed, production-scale area, to identify the most suitable corn growth stage for yield estimation using machine learning, and to identify the most effective machine learning model for corn yield estimation. Our study used high-resolution (6 cm) aerial multispectral imagery. Sixty-two predictors, including soil properties (sand, silt, and clay percentages), slope, spectral bands (red, green, blue, red-edge, NIR), vegetation indices (GNDRE, NDRE, TGI), color-space indices, and wavelengths, were derived from the multispectral data collected at seven growth stages of corn (V4, V5, V6, V7, V9, V12, and V14/VT). Four regression and machine learning algorithms were evaluated for yield prediction: linear regression, random forest, extreme gradient boosting, and gradient boosting regressor. A total of 6865 yield values were used for model training and 1716 for validation. Results show that, using the random forest method, the V14/VT stage gave the best yield predictions (RMSE of 0.52 Mg/ha for a mean yield of 10.19 Mg/ha), and yield estimation at the V6 stage was still feasible. We concluded that integrating abiotic factors, such as slope and soil properties, significantly improved model accuracy. Among vegetation indices, TGI, HUE, and GNDRE performed better than the others. Results from this study can help farmers and crop consultants plan logistics in advance through improved early-season yield predictions, and can support farm profitability and sustainability.
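To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' pipeline. It computes two of the named vegetation indices and a hue value for a single pixel, then expresses the reported RMSE relative to the mean yield. The formulas are assumptions, since the abstract does not give them: NDRE is taken in its usual normalized-difference form, TGI in a commonly used simplified broadband form, and HUE as the hue channel of the HSV color space. All function names and the sample reflectance values are hypothetical. GNDRE is omitted because its definition is not stated in the abstract.

```python
import colorsys
import math


def ndre(nir: float, red_edge: float) -> float:
    """Normalized difference red-edge index: (NIR - RE) / (NIR + RE)."""
    denom = nir + red_edge
    return (nir - red_edge) / denom if denom else 0.0


def tgi(red: float, green: float, blue: float) -> float:
    """Triangular greenness index, simplified broadband form (assumed here)."""
    return green - 0.39 * red - 0.61 * blue


def hue_degrees(red: float, green: float, blue: float) -> float:
    """Hue angle in degrees from the HSV color space (assumed meaning of HUE).

    Inputs are expected as reflectances scaled to the range [0, 1].
    """
    h, _, _ = colorsys.rgb_to_hsv(red, green, blue)
    return h * 360.0


def rmse(observed: list[float], predicted: list[float]) -> float:
    """Root mean squared error between observed and predicted yields."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmse_percent(rmse_value: float, mean_yield: float) -> float:
    """RMSE expressed as a percentage of the mean observed yield."""
    return 100.0 * rmse_value / mean_yield


if __name__ == "__main__":
    # Hypothetical reflectances for one healthy-canopy pixel.
    red, green, blue, red_edge, nir = 0.05, 0.10, 0.04, 0.20, 0.45
    print("NDRE:", round(ndre(nir, red_edge), 4))
    print("TGI:", round(tgi(red, green, blue), 4))
    print("Hue (deg):", round(hue_degrees(red, green, blue), 1))

    # Figures reported in the abstract: RMSE 0.52 Mg/ha, mean yield 10.19 Mg/ha.
    print("Relative RMSE (%):", round(relative_rmse_percent(0.52, 10.19), 1))

    # Training share of the 6865 + 1716 yield values.
    print("Training fraction:", round(6865 / (6865 + 1716), 3))
```

With the figures from the abstract, the RMSE of 0.52 Mg/ha is about 5.1% of the 10.19 Mg/ha mean yield, and the 6865/1716 partition corresponds to roughly an 80/20 split between training and validation.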
About the journal:
Sensors (ISSN 1424-8220) provides an advanced forum for the science and technology of sensors and biosensors. It publishes reviews (including comprehensive reviews of complete sensor products), regular research papers, and short notes. Its aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of papers. Full experimental details must be provided so that the results can be reproduced.