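Imputation strategies for low-coverage whole-genome sequencing data and their effects on genomic prediction and genome-wide association studies in pigs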

Animal (IF 4.0; Zone 2, Agricultural and Forestry Sciences; JCR Q1, AGRICULTURE, DAIRY & ANIMAL SCIENCE); Pub Date: 2024-07-25; DOI: 10.1016/j.animal.2024.101258
X.Q. Wang, L.G. Wang, L.Y. Shi, J.J. Tian, M.Y. Li, L.X. Wang, F.P. Zhao
Animal, volume 18, issue 9, Article 101258. Full text: https://www.sciencedirect.com/science/article/pii/S1751731124001897
Citations: 0

Abstract

The uncertainty resulting from missing genotypes in low-coverage whole-genome sequencing (LCWGS) data complicates genotype imputation. This study aimed to identify an optimal strategy for accurately imputing LCWGS data and to assess its effectiveness for genomic prediction (GP) and genome-wide association studies (GWAS) of economically important traits in Large White pigs. The LCWGS data of 1 423 Large White pigs were imputed using three strategies: (1) using the high-coverage whole-genome sequencing (HCWGS) data of 30 key progenitors as the reference panel (Ref_LG); (2) mixing the HCWGS data of the key progenitors with the LCWGS data (Mix_HLG); and (3) self-imputation within the LCWGS data (Within_LG). For comparison with LCWGS imputation, SNP chip data of 1 423 Large White pigs were also imputed to the whole-genome sequence level using the reference panel of key progenitors (Ref_SNP). To evaluate the imputed sequence data, we compared GP accuracy and GWAS statistical power for four reproductive traits across three data sets: the chip data, the sequence data imputed from chip data, and the LCWGS data imputed with the optimal strategy. The average imputation accuracies of Within_LG, Ref_LG and Mix_HLG were 0.9893, 0.9899 and 0.9875, respectively, all higher than that of Ref_SNP (0.8522). With the sequence data imputed from LCWGS using the Ref_LG strategy, GP accuracies for the four traits improved by approximately 0.31–1.04% compared with the chip data and by 0.7–1.05% compared with the sequence data imputed from chip data. Furthermore, using the sequence data imputed from LCWGS with Ref_LG, 18 candidate genes were identified as associated with the four reproductive traits of interest in Large White pigs: total number of piglets born (EPC2, MBD5, ORC4 and ACVR2A); number of piglets born healthy (IKBKE); total litter weight of piglets born alive (HSPA13 and CPA1); and gestation length (GTF2H5, ITGAV, NFE2L2, CALCRL, ITGA4, STAT1, HOXD10, MSTN, COL5A2 and STAT4). With the exception of EPC2, ORC4, ACVR2A and MSTN, these genes represent novel candidates. Our findings can provide a reference for the application of LCWGS data in livestock and poultry.
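
The abstract reports imputation accuracies but does not state which metric was used. As a rough illustration only, the sketch below computes two measures that are commonly used for this purpose: genotype concordance and the Pearson correlation between true genotypes and imputed dosages. The function names and the toy genotype vectors are hypothetical and are not taken from the paper.

```python
from math import sqrt


def concordance(true_gt, imputed_gt):
    """Fraction of genotype calls (allele counts 0/1/2) that match exactly."""
    if len(true_gt) != len(imputed_gt) or not true_gt:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_gt, imputed_gt))
    return matches / len(true_gt)


def pearson_r(true_gt, imputed_dosage):
    """Pearson correlation between true genotypes and imputed dosages."""
    n = len(true_gt)
    if n != len(imputed_dosage) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 values")
    mean_t = sum(true_gt) / n
    mean_d = sum(imputed_dosage) / n
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(true_gt, imputed_dosage))
    var_t = sum((t - mean_t) ** 2 for t in true_gt)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosage)
    if var_t == 0 or var_d == 0:
        # Correlation is undefined for a monomorphic site.
        return float("nan")
    return cov / sqrt(var_t * var_d)


if __name__ == "__main__":
    # Toy data, invented for illustration: one SNP across six animals.
    true_genotypes = [0, 1, 2, 1, 0, 2]
    imputed_calls = [0, 1, 2, 2, 0, 2]
    imputed_dosages = [0.1, 0.9, 1.8, 1.6, 0.0, 2.0]
    print("concordance:", concordance(true_genotypes, imputed_calls))
    print("dosage correlation:", pearson_r(true_genotypes, imputed_dosages))
```

In practice such metrics are usually computed per SNP or per animal against held-out high-coverage genotypes and then averaged; whether the study did so is not stated in the abstract.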

Source journal
Animal (Agricultural and Forestry Sciences: Dairy & Animal Science)
CiteScore: 7.50
Self-citation rate: 2.80%
Articles published: 246
Review time: 3 months
Journal description: animal attracts the best research in animal biology and animal systems from across the spectrum of the agricultural, biomedical, and environmental sciences. It is the central element in an exciting collaboration between the British Society of Animal Science (BSAS), Institut National de la Recherche Agronomique (INRA) and the European Federation of Animal Science (EAAP) and represents a merging of three scientific journals: Animal Science; Animal Research; Reproduction, Nutrition, Development. animal publishes original cutting-edge research, "hot" topics and horizon-scanning reviews on animal-related aspects of the life sciences at the molecular, cellular, organ, whole animal and production system levels. The main subject areas include: breeding and genetics; nutrition; physiology and functional biology of systems; behaviour, health and welfare; farming systems, environmental impact and climate change; product quality, human health and well-being. Animal models and papers dealing with the integration of research between these topics and their impact on the environment and people are particularly welcome.