Use of multivariate analysis and machine learning methods to characterize traits contributing to wheat yield diversity

IF 0.8 | Zone 4, Agricultural and Forestry Sciences | Q3, AGRICULTURE, MULTIDISCIPLINARY | Spanish Journal of Agricultural Research | Pub Date: 2023-02-01 | DOI: 10.5424/sjar/2023211-19835
A. Behpouri, S. Farokhzadeh, Z. Zinati, Zobeir Khosravi

Abstract

Aim of study: Wheat is the third largest staple food crop in the world, so determining the factors that affect its yield is of great importance. This study aimed to identify useful subsets of agronomic traits and to rank the traits by their importance to grain yield.

Area of study: Fars province, Iran.

Material and methods: Data on 22 agronomic traits were collected from 90 farms in six regions (Darab, Kavar, Marvdasht, Fasa, Lar, and Khonj) of Fars province, Iran, chosen as the most important wheat-growing regions. Multivariate statistical analyses (correlation, stepwise regression, and principal component analysis (PCA)) and machine learning modeling approaches, such as partial least squares regression (PLSR) and support vector regression (SVR), were applied to the agronomic traits.

Main results: The integrated results of correlation, stepwise regression, and PCA indicated that number of spikes m⁻², grain number spike⁻¹, and thousand-grain weight had a major impact on yield, followed by awn length, spike length, narrow-leaf herbicide, broadleaf herbicide, time to plant maturity (month), and soil salinity. In addition, PLSR with the nine selected traits as inputs showed better prediction capability (R² = 85%, RMSE = 0.32, MSE = 0.10, and BIAS = -0.05) than PLSR with all twenty-two traits as inputs.

Research highlights: Integrating multivariate statistical analyses with machine learning regression methods could be a powerful tool for identifying traits that have a significant impact on yield. These findings can be considered in future breeding programs.
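The four fit statistics quoted in the abstract (R², RMSE, MSE, and BIAS) can be illustrated with a minimal sketch. The abstract does not define them, so the formulas here are assumptions: bias is taken as the mean of predicted minus observed, and R² as one minus the ratio of residual to total sum of squares. The function name `regression_metrics` and the yield values are made up for illustration and are not data from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, bias (mean error), and R^2 for paired values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    # Assumed sign convention: error = predicted - observed
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    bias = sum(errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # R^2 is undefined when the observations have no variance
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "bias": bias, "r2": r2}


# Hypothetical yields, for illustration only (not from the paper)
observed = [4.0, 5.0, 6.0, 7.0]
predicted = [4.5, 4.5, 6.5, 6.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Because RMSE is the square root of MSE under these definitions, the reported pair is internally consistent: 0.32 squared is about 0.10.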
Source journal
Spanish Journal of Agricultural Research (Agricultural and Forestry Sciences: Agriculture, Multidisciplinary)
CiteScore: 2.00
Self-citation rate: 0.00%
Publication volume: 60
Review time: 6 months
About the journal: The Spanish Journal of Agricultural Research (SJAR) is a quarterly international journal that accepts research articles, reviews and short communications of content related to agriculture. Research articles and short communications must report original work not previously published in any language and not under consideration for publication elsewhere. The main aim of SJAR is to publish papers that report research findings on the following topics: agricultural economics; agricultural engineering; agricultural environment and ecology; animal breeding, genetics and reproduction; animal health and welfare; animal production; plant breeding, genetics and genetic resources; plant physiology; plant production (field and horticultural crops); plant protection; soil science; and water management.