Estimating current effective sizes of large populations from a single sample of genomic marker data: A comparison of estimators by simulations
Author: Jinliang Wang
DOI: 10.1002/1438-390x.12167 (https://doi.org/10.1002/1438-390x.12167)
Publication date: 2023-09-19
Publication type: Journal Article
Abstract
Genome-wide single nucleotide polymorphism (SNP) data are increasingly used to estimate current effective population size (Ne), both to inform the conservation of endangered species and to guide the management of exploited species. Previous assessments of the sibship frequency (SF) and linkage disequilibrium (LD) estimators of Ne focused on small populations, where genetic drift is strong and Ne is therefore easy to estimate. Genomic SNP data provide ample information and hold the potential for applying these estimators to large populations, where genetic drift is weak and Ne is therefore difficult to estimate. In this study, I simulated very large populations and sampled a widely variable number of individuals (genotyped at 10,000 SNPs) for estimating Ne by both the SF and LD methods. I also considered the more realistic situations where a population experiences a bottleneck and where marker data suffer from genotyping errors. The simulations show that both the SF and LD methods can yield accurate Ne estimates for very large populations when the sampled individuals are sufficiently numerous. When the sample size n is much smaller than Ne, however, the Ne estimates follow a bimodal distribution, with a substantial proportion of the estimates being infinitely large. For a population with a bottleneck, the LD estimator overestimates the Ne of the parental population from samples taken at the bottleneck and underestimates it from samples taken after the bottleneck. The LD estimator also overestimates Ne substantially when applied to data suffering from allelic dropouts and false alleles. In contrast, the SF estimator is unbiased and accurate when populations are changing in size or markers suffer from genotyping errors.
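The infinite estimates reported for n much smaller than Ne can be illustrated with a rough sketch. It assumes the classic textbook approximation for unlinked loci under random mating, E[r^2] ≈ 1/(3Ne) + 1/n, where r^2 is the squared correlation of alleles between loci. This is a simplification chosen for illustration, not the LD estimator as implemented in the study, and the function names are hypothetical.

```python
import math


def expected_r2(ne: float, n: int) -> float:
    """Approximate expected r^2 for unlinked loci (assumed simplification).

    The first term is the drift signal, the second is sampling noise
    from genotyping only n individuals.
    """
    return 1.0 / (3.0 * ne) + 1.0 / n


def ld_ne_estimate(mean_r2: float, n: int) -> float:
    """Invert the approximation above to get an Ne estimate.

    If the observed mean r^2 does not exceed the sampling term 1/n,
    no drift signal is detectable and the estimate is infinite.
    """
    drift_signal = mean_r2 - 1.0 / n
    if drift_signal <= 0.0:
        return math.inf
    return 1.0 / (3.0 * drift_signal)


def drift_to_sampling_ratio(ne: float, n: int) -> float:
    """Ratio of the drift term to the sampling term, equal to n / (3 * Ne)."""
    return (1.0 / (3.0 * ne)) / (1.0 / n)


if __name__ == "__main__":
    # With Ne = 10,000 and n = 50 the drift term is a tiny fraction of 1/n,
    # so small fluctuations in the observed r^2 can push the estimate to infinity.
    print(drift_to_sampling_ratio(10_000, 50))
    print(ld_ne_estimate(expected_r2(10_000, 50), 50))
    print(ld_ne_estimate(1.0 / 50, 50))
```

Under this approximation the drift term shrinks relative to the sampling term as n / (3 Ne) falls, which is consistent with the abstract's observation that estimates become unreliable, and often infinite, when few individuals are sampled from a very large population.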