ProteinRPN: Towards Accurate Protein Function Prediction with Graph-Based Region Proposals

Shania Mitra, Lei Huang, Manolis Kellis
Source: arXiv - QuanBio - Quantitative Methods
Publication date: 2024-09-01
DOI: arxiv-2409.00610 (https://doi.org/arxiv-2409.00610)
Citation count: 0

Abstract

Protein function prediction is a crucial task in bioinformatics, with significant implications for understanding biological processes and disease mechanisms. While the relationship between sequence and function has been extensively explored, translating protein structure to function continues to present substantial challenges. Various models, particularly CNN and graph-based deep learning approaches that integrate structural and functional data, have been proposed to address these challenges. However, these methods often fall short in elucidating the functional significance of key residues essential for protein functionality, as they predominantly adopt a retrospective perspective, leading to suboptimal performance. Inspired by region proposal networks in computer vision, we introduce the Protein Region Proposal Network (ProteinRPN) for accurate protein function prediction. Specifically, the region proposal module of ProteinRPN identifies potential functional regions (anchors), which are refined through a hierarchy-aware node drop pooling layer that favors nodes with defined secondary structures and spatial proximity. The representations of the predicted functional nodes are enriched using attention mechanisms and subsequently fed into a Graph Multiset Transformer, which is trained with supervised contrastive (SupCon) and InfoNCE losses on perturbed protein structures. Our model demonstrates significant improvements in predicting Gene Ontology (GO) terms, effectively localizing functional residues within protein structures. The proposed framework provides a robust, scalable solution for protein function annotation, advancing the understanding of protein structure-function relationships in computational biology.
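Two of the building blocks named in the abstract, node drop pooling and the InfoNCE loss, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring function, the hierarchy-aware criteria, or any hyperparameters. The function names `topk_node_drop` and `info_nce`, the plain top-k selection rule, the use of cosine similarity, and the temperature of 0.1 are all assumptions made for illustration.

```python
import math


def topk_node_drop(scores, edges, ratio=0.5):
    """Generic score-based node drop pooling (illustrative, not hierarchy-aware).

    Keeps the ceil(ratio * n) highest-scoring nodes and the edges whose two
    endpoints both survive, re-indexing the surviving nodes from 0.
    """
    k = max(1, math.ceil(ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])
    remap = {old: new for new, old in enumerate(keep)}
    kept_edges = [(remap[u], remap[v]) for u, v in edges
                  if u in remap and v in remap]
    return keep, kept_edges


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of embeddings.

    positives[i] is the positive view of anchors[i]; every other
    positives[j] in the batch serves as a negative for anchors[i].
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Toy residue graph: four nodes in a chain with hypothetical scores.
    keep, kept_edges = topk_node_drop([0.1, 0.9, 0.8, 0.2],
                                      [(0, 1), (1, 2), (2, 3)])
    print("kept nodes:", keep, "kept edges:", kept_edges)

    views = [[1.0, 0.0], [0.0, 1.0]]
    print("aligned loss:", info_nce(views, views))
    print("swapped loss:", info_nce(views, views[::-1]))
```

In this sketch the loss is small when each embedding is closest to its own perturbed view and large when the views are mismatched, which is the behaviour a contrastive objective on perturbed structures relies on. A pooling layer of the kind the abstract describes would replace the raw score ranking with learned, structure-aware scores.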