DNA Sequences Are as Useful as Protein Sequences for Inferring Deep Phylogenies.

IF 6.1 1区 生物学 Q1 EVOLUTIONARY BIOLOGY Systematic Biology Pub Date : 2023-11-01 DOI:10.1093/sysbio/syad036
Paschalia Kapli, Ioanna Kotari, Maximilian J Telford, Nick Goldman, Ziheng Yang
{"title":"DNA Sequences Are as Useful as Protein Sequences for Inferring Deep Phylogenies.","authors":"Paschalia Kapli, Ioanna Kotari, Maximilian J Telford, Nick Goldman, Ziheng Yang","doi":"10.1093/sysbio/syad036","DOIUrl":null,"url":null,"abstract":"<p><p>Inference of deep phylogenies has almost exclusively used protein rather than DNA sequences based on the perception that protein sequences are less prone to homoplasy and saturation or to issues of compositional heterogeneity than DNA sequences. Here, we analyze a model of codon evolution under an idealized genetic code and demonstrate that those perceptions may be misconceptions. We conduct a simulation study to assess the utility of protein versus DNA sequences for inferring deep phylogenies, with protein-coding data generated under models of heterogeneous substitution processes across sites in the sequence and among lineages on the tree, and then analyzed using nucleotide, amino acid, and codon models. Analysis of DNA sequences under nucleotide-substitution models (possibly with the third codon positions excluded) recovered the correct tree at least as often as analysis of the corresponding protein sequences under modern amino acid models. We also applied the different data-analysis strategies to an empirical dataset to infer the metazoan phylogeny. Our results from both simulated and real data suggest that DNA sequences may be as useful as proteins for inferring deep phylogenies and should not be excluded from such analyses. Analysis of DNA data under nucleotide models has a major computational advantage over protein-data analysis, potentially making it feasible to use advanced models that account for among-site and among-lineage heterogeneity in the nucleotide-substitution process in inference of deep phylogenies.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":null,"pages":null},"PeriodicalIF":6.1000,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10627555/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Systematic Biology","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/sysbio/syad036","RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EVOLUTIONARY BIOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

Inference of deep phylogenies has almost exclusively used protein rather than DNA sequences based on the perception that protein sequences are less prone to homoplasy and saturation or to issues of compositional heterogeneity than DNA sequences. Here, we analyze a model of codon evolution under an idealized genetic code and demonstrate that those perceptions may be misconceptions. We conduct a simulation study to assess the utility of protein versus DNA sequences for inferring deep phylogenies, with protein-coding data generated under models of heterogeneous substitution processes across sites in the sequence and among lineages on the tree, and then analyzed using nucleotide, amino acid, and codon models. Analysis of DNA sequences under nucleotide-substitution models (possibly with the third codon positions excluded) recovered the correct tree at least as often as analysis of the corresponding protein sequences under modern amino acid models. We also applied the different data-analysis strategies to an empirical dataset to infer the metazoan phylogeny. Our results from both simulated and real data suggest that DNA sequences may be as useful as proteins for inferring deep phylogenies and should not be excluded from such analyses. Analysis of DNA data under nucleotide models has a major computational advantage over protein-data analysis, potentially making it feasible to use advanced models that account for among-site and among-lineage heterogeneity in the nucleotide-substitution process in inference of deep phylogenies.

Abstract Image

Abstract Image

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
DNA序列与蛋白质序列在推断深层系统发育方面同样有用。
基于蛋白质序列比DNA序列更不容易同源性和饱和或成分异质性问题的看法,对深层系统发育的推断几乎只使用蛋白质而不是DNA序列。在这里,我们分析了一个理想化遗传密码下密码子进化的模型,并证明这些认知可能是误解。我们进行了一项模拟研究,以评估蛋白质与DNA序列在推断深层系统发育中的效用,在序列中各位点和树上各谱系之间的异质取代过程模型下生成蛋白质编码数据,然后使用核苷酸、氨基酸和密码子模型进行分析。在核苷酸取代模型(可能排除第三密码子位置)下对DNA序列的分析至少与在现代氨基酸模型下对相应蛋白质序列的分析一样频繁地恢复了正确的树。我们还将不同的数据分析策略应用于经验数据集,以推断后生动物的系统发育。我们从模拟和真实数据中得出的结果表明,DNA序列可能与蛋白质一样有助于推断深层系统发育,不应被排除在此类分析之外。核苷酸模型下的DNA数据分析比蛋白质数据分析具有主要的计算优势,这可能使在推断深层系统发育中使用先进的模型来解释核苷酸替代过程中位点间和谱系间的异质性成为可能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Systematic Biology
Systematic Biology 生物-进化生物学
CiteScore
13.00
自引率
7.70%
发文量
70
审稿时长
6-12 weeks
期刊介绍: Systematic Biology is the bimonthly journal of the Society of Systematic Biologists. Papers for the journal are original contributions to the theory, principles, and methods of systematics as well as phylogeny, evolution, morphology, biogeography, paleontology, genetics, and the classification of all living things. A Points of View section offers a forum for discussion, while book reviews and announcements of general interest are also featured.
期刊最新文献
The limits of the metapopulation: Lineage fragmentation in a widespread terrestrial salamander (Plethodon cinereus) Dating in the Dark: Elevated Substitution Rates in Cave Cockroaches (Blattodea: Nocticolidae) Have Negative Impacts on Molecular Date Estimates. Clockor2: Inferring Global and Local Strict Molecular Clocks Using Root-to-Tip Regression. Phylogenomics of Neogastropoda: The Backbone Hidden in the Bush. Distinguishing Cophylogenetic Signal from Phylogenetic Congruence Clarifies the Interplay Between Evolutionary History and Species Interactions.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1