Modeling of African population history using f-statistics is biased when applying all previously proposed SNP ascertainment schemes.

IF 4.5 | Zone 2, Biology | Q1 Agricultural and Biological Sciences | PLoS Genetics | Pub Date: 2023-09-07 | eCollection Date: 2023-09-01 | DOI: 10.1371/journal.pgen.1010931
Pavel Flegontov, Ulaş Işıldak, Robert Maier, Eren Yüncü, Piya Changmai, David Reich
PLoS Genetics 19(9): e1010931. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10508636/pdf/

Abstract

f-statistics have emerged as a first line of analysis for making inferences about demographic history from genome-wide data. They are guaranteed to allow robust tests of how well proposed models of population history fit the data when analyzing full genome sequencing data (that is, all single nucleotide polymorphisms (SNPs) in the individuals being analyzed), and they are also guaranteed to allow robust tests of models for SNPs ascertained as polymorphic in a population that is, in a phylogenetic sense, an outgroup to all groups being analyzed. True "outgroup ascertainment" is in practice impossible in humans because our species arose from a substructured ancestral population that does not trace back to a homogeneous ancestral population even many hundreds of thousands of years into the past. However, initial studies suggested that non-outgroup ascertainment schemes might produce sufficiently robust results using f-statistics, and this motivated widespread fitting of models to data using non-outgroup-ascertained SNP panels such as the "Affymetrix Human Origins array", which has been genotyped on thousands of modern individuals from hundreds of populations, or the "1240k" in-solution enrichment reagent, which has been the source of about 70% of published genome-wide data for ancient humans. In this study, we show that analyses of population history using such panels work well for studies of relationships among non-African populations and one African outgroup. When co-modeling more than one sub-Saharan African and/or archaic human group (Neanderthals and Denisovans), however, fitting of f-statistics to such SNP sets is expected to frequently lead to false rejection of true demographic histories and to failure to reject incorrect models. Analyzing panels of SNPs polymorphic in archaic humans, which has been suggested as a solution for the ascertainment problem, has limited statistical power and retains important biases.
However, by carrying out simulations of diverse demographic histories, we show that bias in inferences based on f-statistics can be minimized by ascertaining on variants common in a union of diverse African groups; such ascertainment retains high statistical power while allowing co-analysis of archaic and modern groups.
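
As a rough illustration of the quantities discussed in the abstract, the sketch below computes an f4-statistic from per-population allele frequencies and builds a SNP mask that keeps only variants common in a pooled union of groups. It is a minimal sketch and not the authors' pipeline: the function names `f4` and `ascertain_common`, the unweighted pooling of group frequencies, and the `min_maf=0.05` threshold are all assumptions made for this example, and the frequencies are made-up toy values. It also omits the block-jackknife standard errors and sample-size corrections that a real analysis would need.

```python
import numpy as np


def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): mean over SNPs of (p_A - p_B) * (p_C - p_D)."""
    p_a, p_b, p_c, p_d = (
        np.asarray(p, dtype=float) for p in (p_a, p_b, p_c, p_d)
    )
    return float(np.mean((p_a - p_b) * (p_c - p_d)))


def ascertain_common(freqs_by_group, min_maf=0.05):
    """Return a boolean mask of SNPs that are common in a union of groups.

    freqs_by_group has shape (n_groups, n_snps). Pooling by an unweighted
    mean and the default threshold are illustrative assumptions only.
    """
    pooled = np.mean(np.asarray(freqs_by_group, dtype=float), axis=0)
    maf = np.minimum(pooled, 1.0 - pooled)  # minor allele frequency
    return maf >= min_maf


# Toy allele frequencies for four populations at four SNPs (hypothetical).
p_a = np.array([0.1, 0.5, 0.9, 0.00])
p_b = np.array([0.2, 0.5, 0.7, 0.00])
p_c = np.array([0.3, 0.4, 0.8, 0.02])
p_d = np.array([0.1, 0.6, 0.6, 0.00])

# f4 on all SNPs, then on SNPs ascertained as common in the union of C and D.
f4_all = f4(p_a, p_b, p_c, p_d)
mask = ascertain_common([p_c, p_d])
f4_ascertained = f4(p_a[mask], p_b[mask], p_c[mask], p_d[mask])
print(f4_all, f4_ascertained, mask)
```

The two f4 values differ because the mask drops the rare fourth SNP; this is the general mechanism by which the choice of ascertainment panel can shift f-statistics, although the toy numbers say nothing about the size or direction of the bias in real data.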
