Jigme Dorji, Amanda J. Chamberlain, Coralie M. Reich, Christy J. VanderJagt, Tuan V. Nguyen, Hans D. Daetwyler, Iona M. MacLeod
Journal: Genetics Selection Evolution (Impact Factor 3.6; JCR Q1, Agriculture, Dairy & Animal Science)
Publication date: 2024-09-12
DOI: https://doi.org/10.1186/s12711-024-00931-5
Citations: 0
Abstract
Mitochondrial sequence variants: testing imputation accuracy and their association with dairy cattle milk traits
Mitochondrial genomes differ from the nuclear genome, and in humans mitochondrial variants are known to contribute to genetic disorders. Before the genomics era, some livestock studies assessed the role of the mitochondrial genome, but these were limited and inconclusive. Modern genome sequencing provides an opportunity to re-evaluate the potential impact of mitochondrial variation on livestock traits. This study first evaluated the empirical accuracy of mitochondrial sequence imputation and then used real and imputed mitochondrial sequence genotypes to study the association of mitochondrial variants with milk production traits in dairy cattle.

The empirical accuracy of imputation from single nucleotide polymorphism (SNP) panel genotypes to mitochondrial sequence genotypes was assessed in 516 test animals of the Holstein, Jersey and Red breeds, using Beagle software and a sequence reference of 1883 animals. The overall accuracy, estimated as the squared Pearson correlation (R²) between all imputed and real genotypes across all animals, was 0.454. The low accuracy was attributed partly to the majority of variants having a low minor allele frequency (MAF < 0.005) and partly to poor imputation accuracy for variants in the hypervariable D-loop region. Beagle provides an internal estimate of imputation accuracy (DR2), and about 10% of the 1927 imputed positions had DR2 > 0.9 (N = 201). There were 151 sites with empirical R² > 0.9 (of the 954 variants segregating in the test animals), and 138 of these overlapped the sites with DR2 > 0.9. This suggests that the DR2 statistic is a reasonable proxy for selecting sites that are imputed with higher accuracy for downstream analyses. Accordingly, in the second part of the study, mitochondrial sequence variants were imputed from the real mitochondrial SNP panel genotypes of 9515 Australian Holstein, Jersey and Red dairy cattle. Then, using only imputed sites with DR2 > 0.9 together with the real genotypes, we undertook a genome-wide association study (GWAS) for milk, fat and protein yields. The mitochondrial SNP effects in the GWAS were not significant.

The accuracy of imputing mitochondrial genotypes from the SNP panel to sequence was generally low. The Beagle DR2 statistic enabled the selection of sites imputed with higher empirical accuracy. We recommend building larger reference populations with mitochondrial sequence to improve the accuracy of imputing less common variants, and ensuring that SNP panels include common variants in the D-loop region.
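The two accuracy measures compared in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the genotype matrices and DR2 values are invented toy data, the function names `per_site_r2`, `overall_r2` and `sites_above` are hypothetical, and haploid mitochondrial genotypes are assumed to be coded 0/1. It computes the squared Pearson correlation between real and imputed genotypes (per site and pooled over all genotypes), then compares the sites passing an empirical R² > 0.9 filter with those passing a DR2 > 0.9 filter.

```python
import numpy as np


def per_site_r2(real, imputed):
    """Squared Pearson correlation per site (rows = animals, columns = sites).

    Sites with no variation in the real or the imputed genotypes have an
    undefined correlation and are returned as NaN.
    """
    real = np.asarray(real, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    real_c = real - real.mean(axis=0)
    imp_c = imputed - imputed.mean(axis=0)
    cov = (real_c * imp_c).sum(axis=0)
    denom = (real_c ** 2).sum(axis=0) * (imp_c ** 2).sum(axis=0)
    r2 = np.full(real.shape[1], np.nan)
    ok = denom > 0
    r2[ok] = cov[ok] ** 2 / denom[ok]
    return r2


def overall_r2(real, imputed):
    """Squared Pearson correlation over all genotypes pooled across animals and sites."""
    r = np.corrcoef(np.ravel(real), np.ravel(imputed))[0, 1]
    return float(r ** 2)


def sites_above(values, threshold=0.9):
    """Indices of sites whose statistic exceeds the threshold (NaN never passes)."""
    values = np.nan_to_num(np.asarray(values, dtype=float), nan=0.0)
    return set(np.flatnonzero(values > threshold).tolist())


# Toy data (assumption): 6 animals x 4 sites, haploid genotypes coded 0/1.
real = np.array([
    [0, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
])
imputed = np.array([
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 0],
])
# Hypothetical per-site DR2 values, standing in for what Beagle would report.
dr2 = np.array([0.97, 0.55, 0.93, 0.10])

site_r2 = per_site_r2(real, imputed)
high_empirical = sites_above(site_r2)   # sites with empirical R2 > 0.9
high_dr2 = sites_above(dr2)             # sites with DR2 > 0.9
overlap = high_empirical & high_dr2     # sites passing both filters

print("per-site R2:", site_r2)
print("overall R2:", overall_r2(real, imputed))
print("overlap:", sorted(overlap))
```

In the toy data, one site passes the DR2 filter despite a low empirical R², which is why the overlap between the two site sets is worth checking. In the study's own numbers, 138 of the 151 sites with empirical R² > 0.9 (about 91%) were also among the 201 sites with DR2 > 0.9.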
About the journal:
Genetics Selection Evolution invites basic, applied and methodological content that will aid the current understanding and the utilization of genetic variability in domestic animal species. Although the focus is on domestic animal species, research on other species is invited if it contributes to the understanding of the use of genetic variability in domestic animals. Genetics Selection Evolution publishes results from all levels of study, from the gene to the quantitative trait, from the individual to the population, the breed or the species. Contributions concerning both the biological approach, from molecular genetics to quantitative genetics, and the mathematical approach, from population genetics to statistics, are welcome. Specific areas of interest include but are not limited to: gene and QTL identification, mapping and characterization, analysis of new phenotypes, high-throughput SNP data analysis, functional genomics, cytogenetics, genetic diversity of populations and breeds, genetic evaluation, applied and experimental selection, genomic selection, selection efficiency, and statistical methodology for the genetic analysis of phenotypes with quantitative and mixed inheritance.