Random Forest Regression in Maize Yield Prediction

Miriam Sitienei, A. Anapapa, A. Otieno
Asian Journal of Probability and Statistics. Journal article, published 2023-08-09. DOI: https://doi.org/10.9734/ajpas/2023/v23i4511

Abstract

Artificial intelligence is the discipline of building computer systems that behave intelligently. Machine learning is a subset of artificial intelligence that enables machines to learn from previous data without being explicitly programmed. In agriculture, machine learning is used to increase crop yield and quality. Its adoption is driven by the emergence of big data technologies and high-performance computing, which provide new opportunities to unravel, quantify, and understand data-intensive agricultural processes. Random Forest is an ensemble technique that reduces overfitting and is used primarily for prediction. It builds a forest of many decision trees, and its accuracy generally improves as the number of trees increases before levelling off. During training, the algorithm draws subsets of the training data by random sampling with replacement and trains a decision tree on each subset; combining the predictions of many trees reduces the risk of overfitting. Maize is a staple food in Kenya, and an adequate national supply supports farmers' food security and economic stability. This study predicted maize yield in Uasin Gishu County, Kenya, using Random Forest regression. The study employed a mixed-methods research design. The survey used well-structured questionnaires containing quantitative and qualitative variables, administered directly to representative farmers in 30 clustered wards. The questionnaire covered 30 maize production-related variables and was completed by 900 randomly selected maize producers across the 30 wards. The model identified the important variables in the dataset and predicted maize yield.
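The resampling idea described above (draw training subsets with replacement, fit one tree per subset, average the trees) can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: each tree is a one-split stump, the per-split feature subsampling that a full Random Forest adds is omitted, the data are made up, and the function names (`fit_stump`, `fit_forest`, `predict_forest`) are hypothetical.

```python
import random
from statistics import mean


def fit_stump(rows):
    """Fit a one-split regression tree (stump) by minimising squared error.

    rows is a list of (features, target) pairs; features is a list of floats.
    Returns (feature_index, threshold, left_mean, right_mean).
    """
    best = None
    n_features = len(rows[0][0])
    for j in range(n_features):
        for threshold in sorted({x[j] for x, _ in rows}):
            left = [y for x, y in rows if x[j] <= threshold]
            right = [y for x, y in rows if x[j] > threshold]
            if not left or not right:
                continue  # a split must leave data on both sides
            left_mean, right_mean = mean(left), mean(right)
            sse = sum((y - left_mean) ** 2 for y in left)
            sse += sum((y - right_mean) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, j, threshold, left_mean, right_mean)
    if best is None:
        # No valid split exists: predict the sample mean everywhere.
        overall = mean(y for _, y in rows)
        return (0, float("inf"), overall, overall)
    return best[1:]


def predict_stump(stump, x):
    j, threshold, left_mean, right_mean = stump
    return left_mean if x[j] <= threshold else right_mean


def fit_forest(rows, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample of rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: same size as the data, drawn with replacement.
        sample = [rng.choice(rows) for _ in rows]
        forest.append(fit_stump(sample))
    return forest


def predict_forest(forest, x):
    """Regression forests average the predictions of their trees."""
    return mean(predict_stump(stump, x) for stump in forest)


# Hypothetical toy data (not from the study): the target rises with feature 0.
data = [([float(i), float(i % 3)], 1.0 + 0.5 * i) for i in range(20)]
forest = fit_forest(data, n_trees=25, seed=0)
low = predict_forest(forest, [2.0, 0.0])
high = predict_forest(forest, [17.0, 0.0])
print(f"prediction at x0=2: {low:.2f}, at x0=17: {high:.2f}")
```

Averaging over trees trained on different bootstrap samples is what smooths out the quirks of any single tree. In practice a library implementation with deeper trees and feature subsampling would be used instead of this sketch.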
Predictions were evaluated with standard regression metrics: root mean squared error (RMSE) = 0.52199, mean squared error (MSE) = 0.27248, and mean absolute error (MAE) = 0.471722. The model predicted maize yield and indicated the contribution of each variable to the overall prediction.
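As a reading aid, the three metrics can be computed as follows. The sketch below also checks that the reported RMSE is the square root of the reported MSE, which it is to within rounding. The `observed` and `predicted` lists are made-up values for illustration only; the study's data and units are not given here.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Consistency check on the values reported in the abstract.
reported_mse = 0.27248
reported_rmse = 0.52199
rmse_from_mse = math.sqrt(reported_mse)
print(f"sqrt(reported MSE) = {rmse_from_mse:.5f}, reported RMSE = {reported_rmse}")

# Hypothetical observed and predicted yields (illustration only).
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.0, 6.0]
print(mse(observed, predicted), rmse(observed, predicted), mae(observed, predicted))
```

Because RMSE squares residuals before averaging, it penalises large errors more heavily than MAE, so RMSE is never smaller than MAE on the same predictions.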