Munish Gangwar, Sheikh Firdous Ahmad, Abdul Basit Ali, Amit Kumar, Amod Kumar, Gyanendra Kumar Gaur, Triveni Dutt
BMC Genomics. Published 2024-11-05. DOI: 10.1186/s12864-024-10924-9 (https://doi.org/10.1186/s12864-024-10924-9). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11539683/pdf/
Identifying low-density, ancestry-informative SNP markers through whole genome resequencing in Indian, Chinese, and wild yak.
The current investigation was undertaken to elucidate population-stratifying and ancestry-informative markers in Indian, Chinese, and wild yak populations using whole genome resequencing (WGS) analysis, employing three selection strategies (Delta, pairwise Wright's fixation index (FST), and Informativeness of Assignment) and five marker densities (5,000 to 25,000 SNPs). The study used WGS data on 105 individuals from three separate yak cohorts: Indian yak (n = 29), Chinese yak (n = 61), and wild yak (n = 15). Variant calling in the GATK program with strict quality control yielded 1,002,970 high-quality, independent (LD-pruned) SNP markers across the yak autosomes. The analysis was carried out in the Toolbox for Ranking and Evaluation of SNPs (TRES) program, in which the three criteria (Delta, pairwise FST, and Informativeness of Assignment) were used to identify population-stratifying and ancestry-informative markers across the datasets. The top-ranked 5,000 (5K), 10,000 (10K), 15,000 (15K), 20,000 (20K), and 25,000 (25K) SNPs were identified from each dataset, and their composition and performance were assessed using different criteria. With the full-density dataset (105 individuals, 1,002,970 markers), the average genomic breed clustering of the Indian, Chinese, and wild yak cohorts was 81.74%, 80.02%, and 83.62%, respectively. The Informativeness of Assignment criterion at 10K density emerged as the best combination for the three yak cohorts, with 86.94%, 96.46%, and 98.20% clustering for Indian, Chinese, and wild yak, respectively. There was an average increase of 7.56%, 22.72%, and 30.35% in the genomic breed clustering scores of the Indian, Chinese, and wild yak cohorts over the estimates from the original dataset. The selected markers overlapped multiple protein-coding genes within a 10 kb window, including ADGRB3, ANK1, CACNG7, CALN1, CHCHD2, CREBBP, GLI3, KHDRBS2, and OSBPL10.
This is the first report on low-density SNP marker sets with population-stratifying and ancestry-informative properties in three yak groups using WGS data. The results are significant for the application of genomic selection using cost-effective low-density SNP panels in yak populations worldwide.
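The three ranking criteria named in the abstract can be illustrated with a minimal Python sketch for biallelic SNPs. This is not the TRES implementation: the function names, the equal weighting of populations, the heterozygosity-based form of FST, and the example allele frequencies are all assumptions made for illustration. Delta is taken as the absolute allele-frequency difference between two populations, and Informativeness of Assignment follows the standard I_n definition (entropy of the mean allele frequencies minus the mean within-population entropy).

```python
import math

def delta(p1, p2):
    """Absolute allele-frequency difference between two populations."""
    return abs(p1 - p2)

def fst(p1, p2):
    """Pairwise F_ST as (H_T - H_S) / H_T, assuming equal population weights."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0                          # monomorphic SNP carries no signal
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_t - h_s) / h_t

def informativeness(freqs):
    """I_n for one biallelic SNP; freqs holds one allele frequency per population."""
    k = len(freqs)
    total = 0.0
    # Loop over both alleles: the given one and its complement.
    for allele in (freqs, [1.0 - p for p in freqs]):
        mean_p = sum(allele) / k
        if mean_p > 0:
            total -= mean_p * math.log(mean_p)
        total += sum(p * math.log(p) for p in allele if p > 0) / k
    return total

def top_k(scores, k):
    """Return the k SNP identifiers with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical allele frequencies in three cohorts (illustrative only).
example_freqs = {
    "snp_a": [0.9, 0.1, 0.5],
    "snp_b": [0.5, 0.5, 0.5],
    "snp_c": [1.0, 0.0, 0.0],
}
scores = {snp: informativeness(f) for snp, f in example_freqs.items()}
panel = top_k(scores, 2)
```

In this toy input, the SNP that is fixed for different alleles across cohorts scores highest and the SNP with identical frequencies everywhere scores zero, which is the behaviour a low-density ancestry panel relies on when only the top-ranked markers are retained.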
About the journal:
BMC Genomics is an open access, peer-reviewed journal that considers articles on all aspects of genome-scale analysis, functional genomics, and proteomics.
BMC Genomics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.