Authors: Miriam Sitienei, A. Anapapa, A. Otieno
Journal: Asian Journal of Probability and Statistics
Publication date: 2023-08-09
DOI: https://doi.org/10.9734/ajpas/2023/v23i4511
Citations: 0
Random Forest Regression in Maize Yield Prediction
Artificial intelligence is the discipline of building computer systems that perform tasks normally requiring human intelligence. Machine learning is a subset of artificial intelligence in which models learn patterns from existing data rather than following explicitly programmed rules. In agriculture, machine learning is used to help increase crop yield and quality. Its adoption is driven by the emergence of big data technologies and high-performance computing, which provide new opportunities to unravel, quantify, and understand data-intensive agricultural processes. Random Forest is an ensemble technique that reduces overfitting by combining many decision trees, and it is widely used for prediction. During training it draws bootstrap subsets of the training samples (random sampling with replacement) and fits one decision tree to each subset; for regression, the forest's prediction is the average of the individual tree predictions. Accuracy generally improves as trees are added, although the gain levels off once the forest is large enough. Maize is a staple food in Kenya, and an adequate national supply supports farmers' food security and economic stability. This study predicted maize yield in Uasin Gishu County, Kenya, using Random Forest regression. The study used a mixed-methods research design. Structured questionnaires containing quantitative and qualitative variables were administered directly to representative farmers in 30 clustered wards, covering 30 maize production-related variables from 900 randomly selected maize producers. The model identified the important variables in the dataset and predicted maize yield.
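The bagging procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names (`build_tree`, `fit_forest`, `predict_forest`, `permutation_importance`), the depth limit, the choice of one third of the features per split, and the use of permutation importance as the measure of variable contribution are all assumptions made for the example. The abstract does not state which importance measure or hyperparameters the study used.

```python
import random
from statistics import mean

def build_tree(rows, targets, depth, max_depth, n_feats, rng):
    """Grow a small regression tree; a leaf is the mean target of its rows."""
    if depth >= max_depth or len(rows) < 4:
        return mean(targets)
    best = None  # (sum of squared errors, feature index, threshold)
    for f in rng.sample(range(len(rows[0])), n_feats):  # random feature subset
        for t in sorted({r[f] for r in rows})[1:]:
            left = [y for r, y in zip(rows, targets) if r[f] < t]
            right = [y for r, y in zip(rows, targets) if r[f] >= t]
            ml, mr = mean(left), mean(right)
            sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, t)
    if best is None:  # no usable split: the sampled features are constant here
        return mean(targets)
    _, f, t = best
    lo = [(r, y) for r, y in zip(rows, targets) if r[f] < t]
    hi = [(r, y) for r, y in zip(rows, targets) if r[f] >= t]
    return (f, t,
            build_tree([r for r, _ in lo], [y for _, y in lo], depth + 1, max_depth, n_feats, rng),
            build_tree([r for r, _ in hi], [y for _, y in hi], depth + 1, max_depth, n_feats, rng))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] < t else right
    return node

def fit_forest(rows, targets, n_trees=25, max_depth=4, seed=0):
    """Fit each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    n_feats = max(1, len(rows[0]) // 3)  # assumed heuristic, not from the study
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx],
                                 0, max_depth, n_feats, rng))
    return forest

def predict_forest(forest, row):
    """Regression forests average the predictions of the individual trees."""
    return mean(predict_tree(tree, row) for tree in forest)

def permutation_importance(forest, rows, targets, feature, seed=0):
    """Increase in MSE after shuffling one feature column (assumed measure)."""
    rng = random.Random(seed)
    def mse(data):
        return mean((predict_forest(forest, r) - y) ** 2 for r, y in zip(data, targets))
    col = [r[feature] for r in rows]
    rng.shuffle(col)
    shuffled = [r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, col)]
    return mse(shuffled) - mse(rows)

if __name__ == "__main__":
    # Synthetic data only: three made-up features, the third is pure noise.
    data_rng = random.Random(1)
    X = [[data_rng.uniform(0, 1) for _ in range(3)] for _ in range(200)]
    y = [2.0 * r[0] + r[1] + data_rng.gauss(0, 0.05) for r in X]
    forest = fit_forest(X, y)
    print("prediction:", predict_forest(forest, [0.9, 0.9, 0.5]))
    for j in range(3):
        print("feature", j, "importance:", permutation_importance(forest, X, y, j))
```

In practice a library implementation such as scikit-learn's `RandomForestRegressor` would normally be used instead; the hand-written version is shown only to make the bootstrap, per-tree fitting, and averaging steps visible.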
Predictions were evaluated with standard regression metrics: root mean squared error (RMSE) = 0.52199, mean squared error (MSE) = 0.27248, and mean absolute error (MAE) = 0.471722. The model also reported the contribution of each variable to the overall prediction.
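For readers unfamiliar with these metrics, the short sketch below shows how they are computed and checks that the reported figures are mutually consistent. The yield values in `y_true` and `y_pred` are made up for illustration and are not data from the study; only the three reported metric values come from the abstract.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean absolute error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up values purely for illustration; not data from the study.
y_true = [3.0, 2.5, 4.0, 1.5]
y_pred = [2.5, 3.0, 4.0, 2.5]
print(mse(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))

# Consistency check on the values reported in the abstract.
reported_rmse, reported_mse, reported_mae = 0.52199, 0.27248, 0.471722
print(abs(math.sqrt(reported_mse) - reported_rmse) < 1e-4)  # RMSE equals sqrt(MSE)
print(reported_mae <= reported_rmse)  # MAE can never exceed RMSE
```

Both checks print `True`: the reported RMSE is the square root of the reported MSE, and the MAE is below the RMSE, as it must be for any set of residuals. The abstract does not state the units of the yield variable, so the magnitudes cannot be interpreted further from this text alone.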