Predicting diverse M-best protein contact maps

S. Sun, Jianzhu Ma, Sheng Wang, Jinbo Xu
2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Published 2015-11-09. DOI: 10.1109/BIBM.2015.7359865 (https://doi.org/10.1109/BIBM.2015.7359865)

Abstract

Protein contacts contain important information for protein structure and function studies, but predicting contacts from sequence information remains very challenging. Recently, evolutionary coupling (EC) analysis, which predicts contacts by detecting co-evolved residues (or columns) in a multiple sequence alignment (MSA), has made good progress thanks to better statistical assessment techniques and high-throughput sequencing. Existing EC analysis methods predict only a single contact map for a given protein, which may have low accuracy, especially when the protein under prediction does not have a large number of sequence homologs. Analogous to ab initio folding, which usually predicts a few possible 3D models for a given protein sequence, this paper presents a novel structure learning method that predicts a set of diverse contact maps for a given protein sequence, in which the best solution usually has much better accuracy than the first one. Our experimental tests show that for many test proteins, the best of the 5 solutions generated by our method has accuracy at least 0.1 higher than the first one when the top L/5 or L/10 (L is the sequence length) predicted long-range contacts are evaluated, especially for protein families with a small number of sequence homologs. Our best solutions also have better quality than those generated by the two popular EC methods EVfold and PSICOV.
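
The evaluation described in the abstract (accuracy of the top L/5 or L/10 predicted long-range contacts, compared between the first and the best of M solutions) can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes the common convention that "long-range" means a sequence separation of at least 24 residues, which the abstract does not state, and the function names and the score-matrix input format are hypothetical.

```python
from typing import List, Sequence, Set, Tuple

Pair = Tuple[int, int]


def top_k_long_range_accuracy(
    scores: Sequence[Sequence[float]],
    native: Set[Pair],
    divisor: int = 5,
    min_sep: int = 24,  # assumed long-range threshold, not given in the abstract
) -> float:
    """Fraction of the top L/divisor long-range predictions that are native contacts.

    scores[i][j] is the predicted contact score for residues i < j;
    native holds the true contacts as (i, j) pairs with i < j.
    """
    L = len(scores)
    # Candidate pairs (i, j) with i < j and sequence separation >= min_sep.
    candidates: List[Pair] = [
        (i, j) for i in range(L) for j in range(i + min_sep, L)
    ]
    # Rank candidates by predicted score, highest first.
    candidates.sort(key=lambda p: scores[p[0]][p[1]], reverse=True)
    k = max(1, L // divisor)
    top = candidates[:k]
    if not top:
        return 0.0
    return sum(1 for p in top if p in native) / len(top)


def first_and_best(
    maps: Sequence[Sequence[Sequence[float]]],
    native: Set[Pair],
    divisor: int = 5,
    min_sep: int = 24,
) -> Tuple[float, float]:
    """Accuracy of the first solution and of the best among all M solutions."""
    accs = [top_k_long_range_accuracy(m, native, divisor, min_sep) for m in maps]
    return accs[0], max(accs)
```

Calling `first_and_best(maps, native, divisor=10)` on M = 5 predicted maps returns the pair of accuracies whose difference the abstract reports as at least 0.1 for many test proteins. Note that choosing the best solution requires the native contacts, so this measures the potential of the solution set rather than a blind prediction.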