Estimating current effective sizes of large populations from a single sample of genomic marker data: A comparison of estimators by simulations

IF 1 | Zone 4 (Environmental Science and Ecology) | Q4 ECOLOGY | Population Ecology | Pub Date: 2023-09-19 | DOI: 10.1002/1438-390x.12167

Jinliang Wang



Abstract

Genome-wide single nucleotide polymorphism (SNP) data are increasingly used to estimate current effective population size (Ne), informing the conservation of endangered species and guiding the management of exploited species. Previous assessments of the sibship frequency (SF) and linkage disequilibrium (LD) estimators of Ne focused on small populations, where genetic drift is strong and thus Ne is easy to estimate. Genomic SNP data provide ample information and hold the potential for applying these estimators to large populations, where genetic drift is rather weak and thus Ne is difficult to estimate. In this study, I simulated very large populations and sampled a widely variable number of individuals, n (each genotyped at 10,000 SNPs), for estimating Ne by both the SF and LD methods. I also considered the more realistic situations where a population experiences a bottleneck and where marker data suffer from genotyping errors. The simulations show that both the SF and LD methods can yield accurate Ne estimates for very large populations when sampled individuals are sufficiently numerous. When n is much smaller than Ne, however, Ne estimates follow a bimodal distribution, with a substantial proportion of the estimates being infinitely large. For a population with a bottleneck, the LD estimator overestimates the Ne of the parental population from samples taken at the bottleneck and underestimates it from samples taken after the bottleneck. The LD estimator also substantially overestimates Ne when applied to data suffering from allelic dropouts and false alleles. In contrast, the SF estimator is unbiased and accurate when populations are changing in size or markers suffer from genotyping errors.
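The abstract does not give the estimators' formulas, so the following is only a minimal sketch of why estimates can become infinite when the sample is much smaller than Ne. It assumes two textbook simplifications rather than the paper's actual implementations: for the LD method, the approximation E[r^2] ~ 1/(3*Ne) + 1/n for unlinked loci; for the SF method, the random-mating form 1/Ne = (Q1 + Q2 + 2*Q3)/4, where Q1, Q2, and Q3 are the frequencies of paternal half-sib, maternal half-sib, and full-sib pairs in the sample. The function names `ld_ne_estimate` and `sf_ne_estimate` are hypothetical and chosen for illustration.

```python
import math


def ld_ne_estimate(mean_r2: float, n: int) -> float:
    """Point estimate of Ne from the mean r^2 between unlinked loci.

    Assumes E[r^2] ~ 1/(3*Ne) + 1/n, a simplified approximation that is
    not necessarily the LD estimator evaluated in the paper.
    """
    drift_signal = mean_r2 - 1.0 / n  # subtract the sampling-noise term
    if drift_signal <= 0:
        # Sampling noise swamps the drift signal: the estimate is infinite.
        return math.inf
    return 1.0 / (3.0 * drift_signal)


def sf_ne_estimate(q_pat_half: float, q_mat_half: float, q_full: float) -> float:
    """Point estimate of Ne from sibship frequencies among sampled pairs.

    Assumes random mating, so that 1/Ne = (Q1 + Q2 + 2*Q3) / 4. This is a
    simplified form, not necessarily the SF estimator used in the paper.
    """
    inv_ne = (q_pat_half + q_mat_half + 2.0 * q_full) / 4.0
    if inv_ne <= 0:
        # No sibling pairs found in the sample: the estimate is infinite.
        return math.inf
    return 1.0 / inv_ne


if __name__ == "__main__":
    true_ne, n = 10_000, 100  # illustrative values, not taken from the paper

    # With n << Ne, the drift term 1/(3*Ne) is tiny next to 1/n.
    expected_r2 = 1.0 / (3.0 * true_ne) + 1.0 / n
    print(ld_ne_estimate(expected_r2, n))          # close to 10000

    # A 0.5% downward error in mean r^2 is enough to lose the signal.
    print(ld_ne_estimate(expected_r2 * 0.995, n))  # inf

    # A small sample from a large population may contain no siblings at all.
    print(sf_ne_estimate(0.0, 0.0, 0.0))           # inf
```

Under these assumptions, the finite and infinite outcomes correspond to the two modes of the bimodal distribution described above: a replicate either retains a small positive signal and gives a finite estimate, or loses it entirely and gives an infinite one.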




Source journal

Population Ecology (Environmental Sciences: Ecology)

CiteScore: 3.90

Self-citation rate: 11.80%

Articles published: (no value given)

Review time: 18-36 weeks

Journal description: Population Ecology, formerly known as Researches on Population Ecology (launched in December 1952), is the official journal of the Society of Population Ecology. Population Ecology publishes original research articles and reviews (including invited reviews) on various aspects of population ecology, from the individual to the community level. Among the specific fields included are population dynamics and distribution, evolutionary ecology, ecological genetics, theoretical models, conservation biology, agroecosystem studies, and bioresource management. Manuscripts should contain new results of empirical and/or theoretical investigations concerning facts, patterns, processes, mechanisms, or concepts of population ecology; purely descriptive manuscripts are not suitable for this journal. All manuscripts are reviewed anonymously by two or more referees, and the final editorial decision is made by the Chief Editor or an Associate Editor based on the referees' evaluations.