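An empirical test of the relationship between the bootstrap and likelihood ratio support in maximum likelihood phylogenetic analysis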

Cladistics · IF 3.9 · Zone 2 (Biology) · JCR Q1 (Evolutionary Biology) · Pub Date: 2021-12-21 · DOI: 10.1111/cla.12496
Denis Jacob Machado, Fernando Portella de Luna Marques, Larry Jiménez-Ferbans, Taran Grant
Cladistics 38(3): 392-401. https://onlinelibrary.wiley.com/doi/10.1111/cla.12496

Abstract

In maximum likelihood (ML), the support for a clade can be calculated directly as the likelihood ratio (LR) or log-likelihood difference (S, LLD) of the best trees with and without the clade of interest. However, bootstrap (BS) clade frequencies are more pervasive in ML phylogenetics and are almost universally interpreted as measuring support. In addition to theoretical arguments against that interpretation, BS has several undesirable attributes for a support measure. For example, it does not vary in proportion to optimality or identify clades that are rejected by the evidence and can be overestimated due to missing data. Nevertheless, if BS is a reliable predictor of S, then it might be an efficient indirect method of measuring support—an attractive possibility, given the speed of many BS implementations. To assess the relationship between S and BS, we analyzed 106 empirical datasets retrieved from TreeBASE. Also, to evaluate the degree to which S and BS are affected by the number of replicates during suboptimal tree searches for S and pseudoreplicates during BS estimation, we randomly selected 5 of the 106 datasets and analyzed them using variable numbers of replicates and pseudoreplicates, respectively. The correlation between S and BS was extremely weak in the datasets we analyzed. Increasing the number of replicates during tree search decreased the estimated values of S for most clades, but the magnitude of change was small. In contrast, although increasing pseudoreplicates affected BS values for only approximately 40% of clades, values both increased and decreased, and they did so at much greater magnitudes. Increasing replicates/pseudoreplicates affected the rank order of clades in each tree for both S and BS. Our findings show decisively that BS is not an efficient indirect method of measuring support and suggest that even quite superficial searches to calculate S provide better estimates of support.
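
The two quantities compared in the abstract can be illustrated with a minimal Python sketch. It assumes that S is the log-likelihood of the best tree with the clade minus that of the best tree without it, that LR = exp(S), and that a BS value is the percentage of pseudoreplicate trees containing the clade. The function names, the representation of a tree as a set of clades (each clade a frozenset of taxon names), and the use of a Spearman rank correlation are assumptions made for illustration; the abstract does not state which correlation statistic or software the authors used.

```python
import math
from collections import Counter


def support_s(lnl_with_clade, lnl_without_clade):
    """Log-likelihood difference S: best tree with the clade minus best tree without it."""
    return lnl_with_clade - lnl_without_clade


def likelihood_ratio(s):
    """Likelihood ratio implied by a log-likelihood difference S."""
    return math.exp(s)


def bootstrap_frequencies(trees):
    """Percentage of pseudoreplicate trees that contain each clade.

    Each tree is a set of clades; each clade is a frozenset of taxon names
    (a hypothetical representation chosen for this sketch).
    """
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: 100.0 * n / len(trees) for clade, n in counts.items()}


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented numbers, for illustration only; not data from the paper.
    s_values = [support_s(-1000.0, -1003.5), support_s(-1000.0, -1000.2)]
    bs_values = [62.0, 97.0]
    print(s_values, [likelihood_ratio(s) for s in s_values])
    print(spearman_rho(s_values, bs_values))
```

Under these assumptions, a clade that is rejected by the evidence gets a negative S, whereas its bootstrap frequency can only fall toward zero, which is one way to read the abstract's remark that BS does not identify rejected clades.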

Source journal: Cladistics (Biology, Evolutionary Biology)
CiteScore: 8.60
Self-citation rate: 5.60%
Articles published: 34
About the journal: Cladistics publishes high quality research papers on systematics, encouraging debate on all aspects of the field, from philosophy, theory and methodology to empirical studies and applications in biogeography, coevolution, conservation biology, ontogeny, genomics and paleontology. Cladistics is read by scientists working in the research fields of evolution, systematics and integrative biology and enjoys a consistently high position in the ISI® rankings for evolutionary biology.