Evolutionarily consistent families in SCOP: sequence, structure and function

IF 2.222 Q3 Biochemistry, Genetics and Molecular Biology BMC Structural Biology Pub Date : 2012-10-18 DOI:10.1186/1472-6807-12-27
Ralph B Pethica, Michael Levitt, Julian Gough
Citations: 16

Abstract

Evolutionarily consistent families in SCOP: sequence, structure and function

SCOP is a hierarchical domain classification system for proteins of known structure. The superfamily level has a clear definition: protein domains belong to the same superfamily if there is structural, functional and sequence evidence for a common evolutionary ancestor. Superfamilies are sub-classified into families; however, the family-level groupings have no such clear basis. Do SCOP families group together domains with similar sequence, with similar structure, or with a common function? We address these questions and, most importantly, ask whether each family represents a distinct phylogenetic group within its superfamily.

Several phylogenetic trees were generated for each superfamily: one derived from a multiple sequence alignment, one based on structural distances, and the final two from presence/absence of GO terms or EC numbers assigned to domains. The topologies of the resulting trees and confidence values were compared to the SCOP family classification.
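
The comparison of tree topologies against the family classification can be illustrated with a minimal sketch. It assumes a rooted tree written as nested tuples and a hypothetical `family_of` mapping from domain identifiers to family labels; neither the representation nor the monophyly criterion is taken from the paper, which does not describe its procedure at this level of detail. A family is counted as consistent here if its members are exactly the leaves under some node of the tree.

```python
from typing import Dict, FrozenSet, List, Set, Tuple, Union

# Assumed representation: a leaf is a domain id (str),
# an internal node is a tuple of child subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def collect_clades(tree: Tree, out: List[FrozenSet[str]]) -> FrozenSet[str]:
    """Append the leaf set under every node to `out`; return this node's leaf set."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
    else:
        leaves = frozenset().union(*(collect_clades(child, out) for child in tree))
    out.append(leaves)
    return leaves


def monophyletic_families(tree: Tree, family_of: Dict[str, str]) -> Dict[str, bool]:
    """For each family, report whether its members form exactly one clade of the tree."""
    all_clades: List[FrozenSet[str]] = []
    collect_clades(tree, all_clades)
    clade_set = set(all_clades)

    # Group domain ids by their (hypothetical) family label.
    members: Dict[str, Set[str]] = {}
    for domain, family in family_of.items():
        members.setdefault(family, set()).add(domain)

    return {fam: frozenset(doms) in clade_set for fam, doms in members.items()}


# Illustrative data only: domain ids and family names are made up.
family_of = {"d1": "famA", "d2": "famA", "d3": "famA", "d4": "famB", "d5": "famB"}
consistent_tree = ((("d1", "d2"), "d3"), ("d4", "d5"))
mixed_tree = ((("d1", "d4"), "d3"), ("d2", "d5"))

print(monophyletic_families(consistent_tree, family_of))
print(monophyletic_families(mixed_tree, family_of))
```

In the first tree both families occupy their own subtree, so both are reported as consistent; in the second the families are interleaved and neither is. For unrooted trees the same idea would have to be applied to bipartitions rather than rooted clades, and a real analysis would also weigh the confidence values attached to each branch.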

We show that SCOP family groupings are evolutionarily consistent to a very high degree with respect to classical sequence phylogenetics. The trees built from (automatically generated) structural distances correlate well, but are not always consistent with SCOP (hand annotated) groupings. Trees derived from functional data are less consistent with the family level than those from structure or sequence, though the majority still agree. Much of GO and EC annotation applies directly to one family or subset of the family; relatively few terms apply at the superfamily level. Maximum sequence diversity within a family is on average 22% but close to zero for superfamilies.

Source journal: BMC Structural Biology (category: Biology, Biophysics)
CiteScore: 3.60
Self-citation rate: 0.00%
Publication count: 0
About the journal: BMC Structural Biology is an open access, peer-reviewed journal that considers articles on investigations into the structure of biological macromolecules, including solving structures, structural and functional analyses, and computational modeling.