Influence of the tree prior and sampling scale on Bayesian phylogenetic estimates of the origin times of language families

IF 2.1 | JCR 0 | LANGUAGE & LINGUISTICS | Journal of Language Evolution | Pub Date: 2019-07-01 | DOI: 10.1093/JOLE/LZZ005

Andrew M. Ritchie, S. Ho


Citations: 9

Abstract


Bayesian phylogenetic methods derived from evolutionary biology can be used to reconstruct the history of human languages using databases of cognate words. These analyses have produced exciting results regarding the origins and dispersal of linguistic and cultural groups through prehistory. Bayesian lexical dating requires the specification of priors on all model parameters. This includes the use of a prior on divergence times, often combined with a prior on tree topology and referred to as a tree prior. Violation of the underlying assumptions of the tree prior can lead to an erroneous estimate of the timescale of language evolution. To investigate these impacts, we tested the sensitivity of Bayesian dating to the tree prior in analyses of four lexical data sets. Our results show that estimates of the origin times of language families are robust to the choice of tree prior for lexical data, though less so than when Bayesian phylogenetic methods are used to analyse genetic data sets. We also used the relative fit of speciation and coalescent tree priors to determine the ability of speciation models to describe language diversification at four different taxonomic levels. We found that speciation priors were preferred over a constant-size coalescent prior regardless of taxonomic scale. However, data sets with narrower taxonomic and geographic sampling exhibited a poorer fit to ideal birth–death model expectations. Our results encourage further investigation into the nature of language diversification at different sampling scales.
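The comparison of "relative fit" between tree priors described in the abstract can be illustrated with a minimal sketch. It assumes that fit is summarised by log marginal likelihoods and compared through Bayes factors; the abstract does not say which estimator the authors used. The numbers below are invented placeholders, not results from the paper, and the function names are hypothetical. The verbal labels follow the commonly cited Kass and Raftery scale for 2 ln(BF).

```python
def log_bayes_factor(log_ml_a: float, log_ml_b: float) -> float:
    """Natural-log Bayes factor of model A over model B."""
    return log_ml_a - log_ml_b


def support_label(log_bf: float) -> str:
    """Verbal label for 2*ln(BF) on the Kass and Raftery scale."""
    two_ln_bf = 2.0 * log_bf
    if two_ln_bf < 0:
        return "favours the other model"
    if two_ln_bf <= 2:
        return "barely worth mentioning"
    if two_ln_bf <= 6:
        return "positive"
    if two_ln_bf <= 10:
        return "strong"
    return "very strong"


# ASSUMPTION: placeholder log marginal likelihoods for one data set,
# made up for illustration only (not taken from the paper).
log_ml = {
    "yule": -1000.0,
    "birth_death": -998.5,
    "constant_coalescent": -1012.0,
}

# Compare each speciation prior against the constant-size coalescent prior.
baseline = log_ml["constant_coalescent"]
for prior in ("yule", "birth_death"):
    log_bf = log_bayes_factor(log_ml[prior], baseline)
    print(f"{prior} vs constant_coalescent: ln BF = {log_bf:.1f} ({support_label(log_bf)})")
```

In practice the log marginal likelihoods would come from the phylogenetic software used for the analysis; the sketch covers only the comparison step.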


Source journal

Journal of Language Evolution Social Sciences-Linguistics and Language

CiteScore

4.50

Self-citation rate

7.70%

Articles published