Exploring a Bayesian sparse factor model-based strategy for the genetic analysis of thousands of MIR-spectra traits for animal breeding.

Journal of Dairy Science (IF 3.7; CAS Zone 1, Agricultural and Forestry Sciences; JCR Q1, Agriculture, Dairy & Animal Science) Pub Date: 2024-07-03 DOI: 10.3168/jds.2023-24319
Yansen Chen, Hadi Atashi, Jiayi Qu, Pauline Delhez, Daniel Runcie, Hélène Soyeurt, Nicolas Gengler

Abstract

With the rapid development of animal phenomics and deep phenotyping, thousands of traditional and molecular phenotypes can now be recorded per individual. However, how to handle this volume of data in the context of animal breeding remains largely unexplored, and the challenge is likely to become more common. This study aimed to (1) explore the use of the Mega-scale linear mixed model (MegaLMM), a factor model-based approach able to simultaneously estimate (co)variance components and genetic parameters for thousands of milk traits, in what are hereafter called thousand-trait (TT) models; (2) compare predictions of phenotypic values and genomic breeding values (u) for focal traits (i.e., traits targeted for prediction, as opposed to secondary traits, which support their evaluation) from single-trait (ST) and TT models; and (3) propose a new approximate method for predicting estimated genomic breeding values (U) with TT models and MegaLMM. The data comprised 3,421 milk mid-infrared (MIR) spectra wavepoints (called secondary traits) and 3 focal traits [average fat percent (Fat), average methane (CH4), and average somatic cell score (SCS)] collected on 3,302 first-parity Holstein cows. The 3,421 milk MIR wavepoint traits consisted of 311 wavepoints in each of 11 classes (months in lactation). Genotypes for 564,439 SNP were available for all animals and were used to calculate the genomic relationship matrix. MegaLMM was implemented in the framework of the Bayesian sparse factor model and solved through Gibbs sampling (Markov chain Monte Carlo). Across lactation, the heritabilities of the 3,421 milk MIR wavepoint traits, considered in blocks of 311 wavepoints, gradually increased and then decreased. The genetic and phenotypic correlations between the first 311 wavepoints and the other 3,110 wavepoints were low.
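The abstract does not say how the genomic relationship matrix was built from the SNP genotypes. As a minimal sketch, assuming the widely used VanRaden method 1 and genotypes coded as 0/1/2 allele counts, it could be computed as follows. The function name `vanraden_grm` and the toy data are illustrative only and are not taken from the paper.

```python
import numpy as np


def vanraden_grm(genotypes: np.ndarray) -> np.ndarray:
    """Genomic relationship matrix, assuming VanRaden method 1.

    genotypes: (n_animals, n_snps) array of allele counts coded 0, 1, or 2.
    Returns an (n_animals, n_animals) symmetric matrix.
    """
    m = np.asarray(genotypes, dtype=float)
    # Allele frequency of the counted allele at each SNP.
    p = m.mean(axis=0) / 2.0
    # Center each SNP column by twice its allele frequency.
    z = m - 2.0 * p
    # Scale so that G is comparable to a pedigree relationship matrix.
    denom = 2.0 * np.sum(p * (1.0 - p))
    if denom == 0.0:
        raise ValueError("all SNPs are monomorphic; G is undefined")
    return z @ z.T / denom


# Toy example with random genotypes (hypothetical data, 6 animals x 50 SNPs).
rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(6, 50))
G = vanraden_grm(toy_genotypes)
```

With the dimensions reported in the abstract, the input would be a 3,302 × 564,439 matrix and the result a 3,302 × 3,302 matrix. In practice, quality control (e.g., filtering on minor allele frequency) would usually precede this step.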
The accuracies of phenotype predictions from the ST model were lower than those from the TT model for Fat (0.51 vs. 0.93), CH4 (0.30 vs. 0.86), and SCS (0.14 vs. 0.33). The same trend was observed for the accuracies of u predictions: Fat (0.59 vs. 0.86), CH4 (0.47 vs. 0.78), and SCS (0.39 vs. 0.59). The average correlation between U predicted from the TT model and U predicted by the new approximate method was 0.90. The new approximate method for estimating U will make MegaLMM more suitable for applications in animal breeding. This study is an initial investigation into the use of thousands of traits in animal breeding and shows that the TT model is beneficial for the prediction of focal traits (phenotypes and breeding values), especially for difficult-to-measure traits (e.g., CH4).
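The abstract reports prediction accuracies without defining them. A common convention, assumed here, is the Pearson correlation between predicted and observed (or reference) values in a validation set. The sketch below shows that calculation. The function name `prediction_accuracy` and all numbers are hypothetical and do not reproduce the paper's data or results.

```python
import numpy as np


def prediction_accuracy(predicted, observed) -> float:
    """Pearson correlation between predicted and observed values.

    This is an assumed definition of "accuracy"; the abstract does not
    state which definition the authors used.
    """
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    if predicted.shape != observed.shape:
        raise ValueError("predicted and observed must have the same shape")
    return float(np.corrcoef(predicted, observed)[0, 1])


# Made-up validation values for one focal trait (not from the paper).
observed = np.array([3.9, 4.1, 4.4, 3.7, 4.0, 4.3])
pred_single_trait = np.array([4.0, 4.0, 4.1, 4.0, 4.1, 4.0])
pred_thousand_trait = np.array([3.8, 4.1, 4.3, 3.8, 4.0, 4.2])

acc_st = prediction_accuracy(pred_single_trait, observed)
acc_tt = prediction_accuracy(pred_thousand_trait, observed)
print(f"ST accuracy: {acc_st:.2f}, TT accuracy: {acc_tt:.2f}")
```

Under this definition, the comparison in the abstract (e.g., 0.30 vs. 0.86 for CH4) would correspond to computing the same statistic for both models on the same validation animals.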
