GraphPI: Efficient Protein Inference with Graph Neural Networks

IF 4.3; Zone 3 (Materials Science); Q1 ENGINEERING, ELECTRICAL & ELECTRONIC; ACS Applied Electronic Materials; Pub Date: 2024-10-13; DOI: 10.1021/acs.jproteome.3c00845
Zheng Ma, Jiazhen Chen, Lei Xin and Ali Ghodsi*
Citations: 0

Abstract

The integration of deep learning approaches in biomedical research has been transformative, enabling breakthroughs in various applications. Despite these strides, their application to protein inference is impeded by the scarcity of extensively labeled data sets, a challenge compounded by the high cost and complexity of accurate protein annotation. In this study, we introduce GraphPI, a novel framework that treats protein inference as a node classification problem. Proteins are modeled as interconnected nodes within a protein–peptide–PSM (peptide-spectrum match) graph, and a graph neural network-based architecture learns their interrelations. To address label scarcity, we train the model on a set of unlabeled public protein data sets using pseudolabels derived from an existing protein inference algorithm, and apply self-training to iteratively refine those labels based on confidence scores. In contrast to prevalent methods that require data set-specific training, GraphPI, owing to the well-normalized nature of Percolator features, can be applied across data sets without data set-specific fine-tuning, which both mitigates the risk of overfitting and improves computational efficiency. Our experiments show notable performance on various test data sets and significantly reduced computation times compared to common protein inference algorithms.
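
To make the graph formulation concrete, the following is a minimal sketch, not the authors' implementation. It builds a toy protein–peptide–PSM graph and applies one confidence-thresholded pseudolabel update. Everything here is an assumption made for illustration: the input tuples, the function names `build_graph`, `protein_confidence`, and `refine_pseudolabels`, the thresholds, and above all the scorer, which is a fixed, unweighted mean aggregation standing in for a trained graph neural network.

```python
from collections import defaultdict

# Hypothetical toy input: (psm_id, peptide, candidate proteins, PSM score in [0, 1]).
PSMS = [
    ("psm1", "PEPTIDEA", ["P1"], 0.95),
    ("psm2", "PEPTIDEA", ["P1"], 0.90),
    ("psm3", "PEPTIDEB", ["P1", "P2"], 0.80),
    ("psm4", "PEPTIDEC", ["P3"], 0.10),
]


def build_graph(psms):
    """Return protein -> peptides and peptide -> PSM scores adjacency maps."""
    protein_to_peptides = defaultdict(set)
    peptide_to_scores = defaultdict(list)
    for _psm_id, peptide, proteins, score in psms:
        peptide_to_scores[peptide].append(score)
        for protein in proteins:
            protein_to_peptides[protein].add(peptide)
    return protein_to_peptides, peptide_to_scores


def protein_confidence(protein_to_peptides, peptide_to_scores):
    """Two mean-aggregation steps (PSM -> peptide -> protein).

    Stand-in for a trained GNN: there are no learned weights here.
    """
    peptide_feat = {
        pep: sum(scores) / len(scores) for pep, scores in peptide_to_scores.items()
    }
    return {
        prot: sum(peptide_feat[pep] for pep in peps) / len(peps)
        for prot, peps in protein_to_peptides.items()
    }


def refine_pseudolabels(labels, confidence, hi=0.85, lo=0.15):
    """One self-training style update: relabel only confidently scored proteins."""
    refined = dict(labels)
    for prot, conf in confidence.items():
        if conf >= hi:
            refined[prot] = 1
        elif conf <= lo:
            refined[prot] = 0
    return refined


# Hypothetical pseudolabels, as if produced by an existing inference algorithm.
initial_labels = {"P1": 0, "P2": 1, "P3": 1}
prot_map, pep_map = build_graph(PSMS)
confidence = protein_confidence(prot_map, pep_map)
refined_labels = refine_pseudolabels(initial_labels, confidence)
```

In a full self-training loop, the model would be retrained on the refined labels and the update repeated. The sketch stops after one update so the thresholding logic stays visible.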
