GxENet: Novel fully connected neural network based approaches to incorporate GxE for predicting wheat yield

Artificial Intelligence in Agriculture (IF 8.2; Q1, Agriculture, Multidisciplinary) | Pub Date: 2023-06-01 | DOI: 10.1016/j.aiia.2023.05.001
Sheikh Jubair, Olivier Tremblay-Savard, Mike Domaratzki
Volume 8, pages 60-76. Full text: https://www.sciencedirect.com/science/article/pii/S2589721723000168

Abstract

The expression of a crop line's quantitative traits depends on its genetics, the environment in which it is sown, and the interaction between the two, known as GxE. To maximize food production, new varieties are therefore developed by selecting superior lines suited to a specific environment. Genomic selection is a computational technique for variety development that uses genome-wide molecular markers to identify the top lines of a crop. Many statistical and machine learning models are designed for single-environment trials, where the environment is assumed to have no effect on quantitative traits. This strong assumption can cause the top lines for a given environment to be missed, so it is essential to consider both genomic and environmental data when developing a new variety. Here we devise three novel deep learning frameworks that incorporate GxE within the model and predict line-specific yield for an environment. In the process, we also develop a new technique for identifying environment-specific markers, which can be useful in many applications of environment-specific genomic selection. The results demonstrate that, depending on the test scenario, our best framework obtains correlation coefficients 1.75 to 1.95 times higher than those of other deep learning models that incorporate environmental data. Furthermore, feature importance analysis shows that environmental information, followed by genomic information, is the driving factor in predicting environment-specific yield for a line. We also demonstrate a way to extend our framework to new data types, such as text or soil data. The extended model also shows potential to be useful in genomic selection.
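
The headline comparison in the abstract is a ratio of correlation coefficients between predicted and observed yield. The abstract does not say which correlation coefficient is used, so the sketch below assumes Pearson's r; the function names `pearson` and `improvement_ratio` and all numbers are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def improvement_ratio(observed, pred_a, pred_b):
    """How many times higher model A's correlation is than baseline model B's."""
    r_a = pearson(observed, pred_a)
    r_b = pearson(observed, pred_b)
    if r_b <= 0:
        # A ratio of correlations is only interpretable for a positive baseline.
        raise ValueError("baseline correlation must be positive")
    return r_a / r_b


# Made-up yields for five lines in one environment (illustration only).
observed = [3.1, 4.0, 2.5, 5.2, 4.4]
model_a = [3.0, 4.2, 2.7, 5.0, 4.1]  # hypothetical GxE-aware model
model_b = [3.9, 3.5, 3.6, 4.3, 3.4]  # hypothetical baseline model

print("r (model A):", round(pearson(observed, model_a), 3))
print("r (model B):", round(pearson(observed, model_b), 3))
print("ratio A/B:", round(improvement_ratio(observed, model_a, model_b), 2))
```

Read this way, a ratio of 1.75 would mean the better model's correlation with observed yield is 1.75 times the baseline's in that test scenario; the ratio says nothing about absolute accuracy when the baseline correlation is small.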

Source journal: Artificial Intelligence in Agriculture (Engineering: Engineering (miscellaneous))
CiteScore: 21.60
Self-citation rate: 0.00%
Articles published: 18
Review time: 12 weeks