Genomic prediction based on selective linkage disequilibrium pruning of low-coverage whole-genome sequence variants in a pure Duroc population.

IF 3.6 | Zone 1, Agricultural and Forestry Sciences | Q1 AGRICULTURE, DAIRY & ANIMAL SCIENCE | Genetics Selection Evolution | Pub Date: 2023-10-18 | DOI: 10.1186/s12711-023-00843-w
Di Zhu, Yiqiang Zhao, Ran Zhang, Hanyu Wu, Gengyuan Cai, Zhenfang Wu, Yuzhe Wang, Xiaoxiang Hu
Genetics Selection Evolution, volume 55 (1), page 72. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10583454/pdf/

Abstract

Background: Although the accumulation of whole-genome sequencing (WGS) data has accelerated the identification of mutations underlying complex traits, its impact on the accuracy of genomic predictions is limited. Reliable genotyping data and pre-selected beneficial loci can be used to improve prediction accuracy. Previously, we reported a low-coverage sequencing genotyping method that yielded 11.3 million highly accurate single-nucleotide polymorphisms (SNPs) in pigs. Here, we introduce a method termed selective linkage disequilibrium pruning (SLDP), which refines the set of SNPs that show a large gain during prediction of complex traits using whole-genome SNP data.

Results: We used the SLDP method to identify and select markers among millions of SNPs based on genome-wide association study (GWAS) prior information. We evaluated the performance of SLDP with respect to three real traits and six simulated traits with varying genetic architectures using two representative models (genomic best linear unbiased prediction and BayesR) on samples from 3579 Duroc boars. SLDP was tuned by testing 180 combinations of two core parameters (the GWAS P-value threshold and the linkage disequilibrium r² threshold). The parameters for each trait were optimized in the training population by five-fold cross-validation and then tested in the validation population. Similar to previous GWAS prior-based methods, the performance of SLDP was mainly affected by the genetic architecture of the traits analyzed. Specifically, SLDP performed better for traits controlled by major quantitative trait loci (QTL) or a small number of quantitative trait nucleotides (QTN). Compared with two commercial SNP chips, genotyping-by-sequencing data, and an unselected whole-genome SNP panel, the SLDP strategy led to significant improvements in prediction accuracy, which ranged from 0.84 to 3.22% for real traits controlled by major or moderate QTL and from 1.23 to 11.47% for simulated traits controlled by a small number of QTN.
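
The abstract names the two tuning parameters (a GWAS P-value threshold and an LD r² threshold) but does not spell out the selection procedure. The sketch below is one plausible reading only, not the authors' implementation: SNPs that pass the P-value threshold are visited from most to least significant, and a SNP is dropped if its squared genotype correlation with an already retained SNP exceeds the r² threshold. The function name `selective_ld_prune`, its arguments, and the toy data are all hypothetical, and all SNPs are assumed to be polymorphic.

```python
import numpy as np


def selective_ld_prune(genotypes, p_values, p_threshold, r2_threshold):
    """Hypothetical sketch of GWAS-prioritised LD pruning (an assumption,
    not the published SLDP procedure).

    genotypes: (n_animals, n_snps) array of 0/1/2 allele counts.
    p_values:  (n_snps,) array of GWAS P-values.
    Returns the sorted column indices of the retained SNPs.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    p_values = np.asarray(p_values, dtype=float)

    # Candidates pass the P-value threshold; visit the most significant first.
    candidates = np.flatnonzero(p_values <= p_threshold)
    candidates = candidates[np.argsort(p_values[candidates], kind="stable")]

    kept = []
    for snp in candidates:
        in_ld = False
        for k in kept:
            # Squared Pearson correlation of allele counts as the LD measure.
            r = np.corrcoef(genotypes[:, snp], genotypes[:, k])[0, 1]
            if r * r > r2_threshold:
                in_ld = True
                break
        if not in_ld:
            kept.append(int(snp))
    return sorted(kept)


# Toy data: SNP 1 duplicates SNP 0, SNP 2 is uncorrelated with SNP 0,
# and SNP 3 fails the P-value threshold.
toy_genotypes = np.array([
    [0, 0, 0, 2],
    [1, 1, 0, 1],
    [2, 2, 0, 0],
    [0, 0, 2, 2],
    [1, 1, 2, 1],
    [2, 2, 2, 0],
])
toy_p_values = np.array([1e-8, 1e-6, 1e-5, 0.5])
print(selective_ld_prune(toy_genotypes, toy_p_values, 1e-4, 0.5))  # [0, 2]
```

In a tuning loop of the kind the abstract describes, this function would be called once per (P-value threshold, r² threshold) pair inside each cross-validation fold, with the GWAS run on the training folds only so that the validation animals do not inform marker selection. The pairwise loop is quadratic in the number of retained SNPs, so a sequence-scale panel would need a windowed or per-chromosome variant.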

Conclusions: The SLDP marker selection method can be incorporated into mainstream prediction models to yield accuracy improvements for traits with a relatively simple genetic architecture; however, it has no significant advantage for traits not controlled by major QTL. The main factors that affect its performance are the genetic architecture of traits and the reliability of GWAS prior information. Our findings can facilitate the application of WGS-based genomic selection.

Source journal: Genetics Selection Evolution (Biology: Dairy and Animal Science)
CiteScore: 6.50
Self-citation rate: 9.80%
Articles published: 74
Review time: 1 month
Journal description: Genetics Selection Evolution invites basic, applied and methodological content that will aid the current understanding and the utilization of genetic variability in domestic animal species. Although the focus is on domestic animal species, research on other species is invited if it contributes to the understanding of the use of genetic variability in domestic animals. Genetics Selection Evolution publishes results from all levels of study, from the gene to the quantitative trait, from the individual to the population, the breed or the species. Contributions concerning both the biological approach, from molecular genetics to quantitative genetics, and the mathematical approach, from population genetics to statistics, are welcome. Specific areas of interest include but are not limited to: gene and QTL identification, mapping and characterization, analysis of new phenotypes, high-throughput SNP data analysis, functional genomics, cytogenetics, genetic diversity of populations and breeds, genetic evaluation, applied and experimental selection, genomic selection, selection efficiency, and statistical methodology for the genetic analysis of phenotypes with quantitative and mixed inheritance.