Ridge regression and deep learning models for genome-wide selection of complex traits in New Mexican Chile peppers.

BMC Genomic Data (IF 1.9, Q3, Genetics & Heredity) | Pub Date: 2023-12-18 | DOI: 10.1186/s12863-023-01179-6
Dennis N Lozada, Karansher Singh Sandhu, Madhav Bhatta
BMC Genomic Data 24(1): 80. Journal Article. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10726521/pdf/

Abstract

Background: Genomewide prediction estimates the genomic breeding values of selection candidates, which can be used for population improvement and cultivar development. Ridge regression and deep learning-based selection models were implemented for yield and agronomic traits of 204 chile pepper genotypes evaluated in multi-environment trials in New Mexico, USA.
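
For readers unfamiliar with the approach, ridge regression BLUP is commonly written as the mixed model below. The notation is the usual textbook one and is an assumption here, not taken from the paper: y is the vector of phenotypes, Z the column-centered marker matrix, u the vector of marker effects, and λ the shrinkage parameter.

```latex
\begin{aligned}
\mathbf{y} &= \mathbf{1}\mu + \mathbf{Z}\mathbf{u} + \boldsymbol{\varepsilon},
\qquad \mathbf{u} \sim N(\mathbf{0}, \mathbf{I}\sigma_u^2),\quad
\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2) \\
\hat{\mathbf{u}} &= \left(\mathbf{Z}^\top\mathbf{Z} + \lambda\mathbf{I}\right)^{-1}
\mathbf{Z}^\top\left(\mathbf{y} - \mathbf{1}\hat{\mu}\right),
\qquad \lambda = \sigma_e^2 / \sigma_u^2 \\
\widehat{\mathrm{GEBV}} &= \mathbf{Z}\hat{\mathbf{u}}
\end{aligned}
```

The genomic estimated breeding value of a candidate is then the sum of its marker genotypes weighted by the shrunken effect estimates.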

Results: Prediction accuracy differed across models under ten-fold cross-validation, and was high for highly heritable traits such as plant height and plant width. No single model was superior across all traits when 14,922 SNP markers were used for genomewide selection. Bayesian ridge regression had the highest average accuracy for first pod date (0.77) and total yield per plant (0.33). Multilayer perceptron (MLP) performed best for flowering time (0.76) and plant height (0.73), whereas the genomic BLUP model had the highest accuracy for plant width (0.62). Using a subset of 7,690 SNP loci, obtained by grouping markers based on linkage disequilibrium coefficients, improved accuracy for first pod date, ten pod weight, and total yield per plant, even under a relatively small training population size for MLP and random forest models. Genomic and ridge regression BLUP models were sufficient to reach optimal prediction accuracy with a small training population. Combining phenotypic selection and genomewide selection improved the selection response for yield-related traits, indicating that integrated approaches can increase the gains achieved through selection.
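
The ten-fold cross-validation scheme described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are simulated, the function names (`ridge_fit_predict`, `cross_validated_accuracy`, `simulate`) are hypothetical, the marker count (300) and heritability (0.8) are arbitrary choices for the demonstration, and accuracy is assumed to be the Pearson correlation between observed and predicted values. Only the number of genotypes (204) and the number of folds (10) come from the abstract.

```python
import numpy as np


def ridge_fit_predict(Z_train, y_train, Z_test, lam):
    """Fit ridge regression on centered training data and predict the test set."""
    mu = y_train.mean()
    col_means = Z_train.mean(axis=0)
    Zc = Z_train - col_means
    # Dual form: solve an n x n system, which is cheaper when markers >> genotypes.
    K = Zc @ Zc.T
    alpha = np.linalg.solve(K + lam * np.eye(K.shape[0]), y_train - mu)
    effects = Zc.T @ alpha  # estimated marker effects
    return mu + (Z_test - col_means) @ effects


def cross_validated_accuracy(Z, y, lam, k=10, seed=0):
    """Mean Pearson correlation between observed and predicted values over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        y_hat = ridge_fit_predict(Z[train_mask], y[train_mask], Z[test_idx], lam)
        accuracies.append(np.corrcoef(y[test_idx], y_hat)[0, 1])
    return float(np.mean(accuracies))


def simulate(n=204, p=300, h2=0.8, seed=1):
    """Simulate 0/1/2 marker genotypes and a trait with heritability h2 (assumed values)."""
    rng = np.random.default_rng(seed)
    Z = rng.binomial(2, 0.5, size=(n, p)).astype(float)
    g = Z @ rng.normal(size=p)           # true genetic values
    g = (g - g.mean()) / g.std()         # scale genetic variance to 1
    e = rng.normal(scale=np.sqrt((1 - h2) / h2), size=n)
    return Z, g + e


Z, y = simulate()
# Shrinkage set from the simulation settings; in practice it would be estimated (e.g. by REML).
lam = Z.shape[1] * Z.var(axis=0).mean() * (1 - 0.8) / 0.8
print(round(cross_validated_accuracy(Z, y, lam), 2))
```

Each genotype is predicted exactly once from a model trained on the other nine folds, so the reported accuracy reflects prediction of unobserved individuals rather than goodness of fit.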

Conclusions: Accuracy values for ridge regression and deep learning prediction models demonstrate the potential of implementing genomewide selection for genetic improvement in chile pepper breeding programs. Ultimately, a large training dataset is important for improving the genomic selection accuracy of the deep learning models.
