Exploring a Bayesian sparse factor model-based strategy for the genetic analysis of thousands of mid-infrared spectra traits for animal breeding
Yansen Chen, Hadi Atashi, Jiayi Qu, Pauline Delhez, Daniel Runcie, Hélène Soyeurt, Nicolas Gengler
Journal of Dairy Science, volume 107, issue 11, pages 9615-9627. Published 2024-11-01.
DOI: 10.3168/jds.2023-24319
URL: https://www.sciencedirect.com/science/article/pii/S0022030224009755
Abstract
With the rapid development of animal phenomics and deep phenotyping, we can obtain thousands of traditional (but also molecular) phenotypes per individual. However, there is still a lack of exploration regarding how to handle this huge amount of data in the context of animal breeding, presenting a challenge that we are likely to encounter more and more in the future. This study aimed to (1) explore the use of the mega-scale linear mixed model (MegaLMM), a factor model-based approach that is able to simultaneously estimate (co)variance components and genetic parameters in the context of thousands of milk traits, hereafter called thousand-trait (TT) models; (2) compare the phenotype values and genomic breeding value (u) predictions for focal traits (i.e., traits that are targeted for prediction, compared with secondary traits that are helping to evaluate), from single-trait (ST) and TT models, respectively; (3) propose a new approximate method of GEBV (U) prediction with TT models and MegaLMM. We used a total of 3,421 milk mid-infrared (MIR) spectra wavepoints (called secondary traits) and 3 focal traits (average fat percentage [AFP], average methane production [ACH4], and average SCS [ASCS]) collected on 3,302 first-parity Holstein cows. The 3,421 milk MIR wavepoint traits were composed of 311 wavepoints in 11 classes (months in lactation). Genotyping information of 564,439 SNPs was available for all animals and was used to calculate the genomic relationship matrix. The MegaLMM was implemented in the framework of the Bayesian sparse factor model and solved through Gibbs sampling (Markov chain Monte Carlo). The heritabilities of the studied 3,421 milk MIR wavepoints gradually increased and then decreased in units of 311 wavepoints throughout the lactation. The genetic and phenotypic correlations between the first 311 wavepoints and the other 3,110 wavepoints were low. 
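The abstract states that the SNP genotypes were used to calculate the genomic relationship matrix but does not say which formula was applied. As a rough illustration only, the sketch below assumes the widely used VanRaden (method 1) construction and toy 0/1/2 genotype codes; the function name `vanraden_grm` and the simulated data are hypothetical and are not taken from the study.

```python
import numpy as np


def vanraden_grm(genotypes):
    """Genomic relationship matrix, assuming VanRaden's method 1.

    genotypes: (n_animals, n_snps) array of allele counts coded 0/1/2.
    This formula is an assumption; the abstract does not name the method.
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0              # observed allele frequency per SNP
    z = m - 2.0 * p                       # center each SNP column by 2p
    denom = 2.0 * np.sum(p * (1.0 - p))   # scales G to the pedigree analogue
    return z @ z.T / denom


# Toy data: 6 animals and 50 SNPs (the study had 3,302 cows and 564,439 SNPs).
rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(6, 50))
G = vanraden_grm(toy_genotypes)
```

Because each SNP column is centered on the observed allele frequencies, the rows of `G` sum to zero in this toy setting. At the scale reported in the abstract, `G` would be a 3,302 by 3,302 matrix.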
The accuracies of phenotype predictions from the ST model were lower than those from the TT model for AFP (0.51 vs. 0.93), ACH4 (0.30 vs. 0.86), and ASCS (0.14 vs. 0.33). The same trend was observed for the accuracies of u predictions for AFP (0.59 vs. 0.86), ACH4 (0.47 vs. 0.78), and ASCS (0.39 vs. 0.59). The average correlation between U predicted from the TT model and the new approximate method was 0.90. The new approximate method used for estimating U in MegaLMM will enhance the suitability of MegaLMM for applications in animal breeding. This study conducted an initial investigation into the application of thousands of traits in animal breeding and showed that the TT model is beneficial for the prediction of focal traits (phenotype and breeding values), especially for difficult-to-measure traits (e.g., ACH4).
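The abstract reports prediction accuracies without defining them. A common convention, assumed here and not confirmed by the abstract, is the Pearson correlation between predicted and observed values in a validation set. The sketch below uses simulated numbers only; `prediction_accuracy`, `pred_st`, and `pred_tt` are hypothetical names, and the toy noise levels are not calibrated to the study.

```python
import numpy as np


def prediction_accuracy(observed, predicted):
    """Pearson correlation between observed and predicted values.

    Assumed definition of accuracy; the abstract does not specify one.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.corrcoef(observed, predicted)[0, 1])


# Simulated validation set: a noisy predictor stands in for the ST model
# and a less noisy one stands in for the TT model.
rng = np.random.default_rng(1)
y = rng.normal(size=200)
pred_st = 0.3 * y + rng.normal(size=200)
pred_tt = y + 0.3 * rng.normal(size=200)

acc_st = prediction_accuracy(y, pred_st)
acc_tt = prediction_accuracy(y, pred_tt)
```

Under this definition, the ST versus TT comparisons in the abstract (for example 0.30 vs. 0.86 for ACH4) would correspond to `acc_st` versus `acc_tt` computed on the real validation records.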
About the journal:
The official journal of the American Dairy Science Association®, Journal of Dairy Science® (JDS) is the leading peer-reviewed general dairy research journal in the world. JDS readers represent education, industry, and government agencies in more than 70 countries with interests in biochemistry, breeding, economics, engineering, environment, food science, genetics, microbiology, nutrition, pathology, physiology, processing, public health, quality assurance, and sanitation.