SEGT-GO: a graph transformer method based on PPI serialization and explanatory artificial intelligence for protein function prediction.

IF 3.3 3区 生物学 Q2 BIOCHEMICAL RESEARCH METHODS BMC Bioinformatics Pub Date : 2025-02-10 DOI:10.1186/s12859-025-06059-7
Yansong Wang, Yundong Sun, Baohui Lin, Haotian Zhang, Xiaoling Luo, Yumeng Liu, Xiaopeng Jin, Dongjie Zhu
{"title":"SEGT-GO: a graph transformer method based on PPI serialization and explanatory artificial intelligence for protein function prediction.","authors":"Yansong Wang, Yundong Sun, Baohui Lin, Haotian Zhang, Xiaoling Luo, Yumeng Liu, Xiaopeng Jin, Dongjie Zhu","doi":"10.1186/s12859-025-06059-7","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>A massive amount of protein sequences have been obtained, but their functions remain challenging to discern. In recent research on protein function prediction, Protein-Protein Interaction (PPI) Networks have played a crucial role. Uncovering potential function relationships between distant proteins within PPI networks is essential for improving the accuracy of protein function prediction. Most current studies attempt to capture these distant relationships by stacking graph network layers, but performance gains diminish as the number of layers increases.</p><p><strong>Results: </strong>To further explore the potential functional relationships between multi-hop proteins in PPI networks, this paper proposes SEGT-GO, a Graph Transformer method based on PPI multi-hop neighborhood Serialization and Explainable artificial intelligence for large-scale multispecies protein function prediction. The multi-hop neighborhood serialization maps multi-hop information in the PPI Network into serialized feature embeddings, enabling the Graph Transformer to learn deeper functional features within the PPI Network. Based on game theory, the SHAP eXplainable Artificial Intelligence (XAI) framework optimizes model input and filters out feature noise, enhancing model performance.</p><p><strong>Conclusions: </strong>Compared to the advanced network method DeepGraphGO, SEGT-GO achieves more competitive results in standard large-scale datasets and superior results on small ones, validating its ability to extract functional information from deep proteins. Furthermore, SEGT-GO achieves superior results in cross-species learning and prediction of the functions of unseen proteins, further proving the method's strong generalization.</p>","PeriodicalId":8958,"journal":{"name":"BMC Bioinformatics","volume":"26 1","pages":"46"},"PeriodicalIF":3.3000,"publicationDate":"2025-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11808960/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s12859-025-06059-7","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0

Abstract

Background: A massive amount of protein sequences have been obtained, but their functions remain challenging to discern. In recent research on protein function prediction, Protein-Protein Interaction (PPI) Networks have played a crucial role. Uncovering potential function relationships between distant proteins within PPI networks is essential for improving the accuracy of protein function prediction. Most current studies attempt to capture these distant relationships by stacking graph network layers, but performance gains diminish as the number of layers increases.

Results: To further explore the potential functional relationships between multi-hop proteins in PPI networks, this paper proposes SEGT-GO, a Graph Transformer method based on PPI multi-hop neighborhood Serialization and Explainable artificial intelligence for large-scale multispecies protein function prediction. The multi-hop neighborhood serialization maps multi-hop information in the PPI Network into serialized feature embeddings, enabling the Graph Transformer to learn deeper functional features within the PPI Network. Based on game theory, the SHAP eXplainable Artificial Intelligence (XAI) framework optimizes model input and filters out feature noise, enhancing model performance.

Conclusions: Compared to the advanced network method DeepGraphGO, SEGT-GO achieves more competitive results in standard large-scale datasets and superior results on small ones, validating its ability to extract functional information from deep proteins. Furthermore, SEGT-GO achieves superior results in cross-species learning and prediction of the functions of unseen proteins, further proving the method's strong generalization.

Abstract Image

Abstract Image

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
SEGT-GO:一种基于PPI序列化和解释性人工智能的蛋白质功能预测图转换方法。
背景:已经获得了大量的蛋白质序列,但它们的功能仍然具有挑战性。在蛋白质功能预测的最新研究中,蛋白质-蛋白质相互作用网络(protein - protein Interaction, PPI)起着至关重要的作用。揭示PPI网络中远处蛋白之间的潜在功能关系对于提高蛋白质功能预测的准确性至关重要。目前的大多数研究都试图通过堆叠图网络层来捕获这些遥远的关系,但是随着层数的增加,性能的提高会减少。结果:为了进一步探索PPI网络中多跳蛋白之间潜在的功能关系,本文提出了一种基于PPI多跳邻域序列化和可解释人工智能的Graph Transformer方法SEGT-GO,用于大规模多物种蛋白功能预测。多跳邻居序列化将PPI网络中的多跳信息映射到序列化的特征嵌入中,使Graph Transformer能够在PPI网络中学习更深层次的功能特征。基于博弈论,SHAP可解释人工智能(XAI)框架优化模型输入,滤除特征噪声,提高模型性能。结论:与先进的网络方法DeepGraphGO相比,SEGT-GO在标准的大规模数据集上取得了更有竞争力的结果,在小型数据集上取得了更优的结果,验证了其从深层蛋白质中提取功能信息的能力。此外,SEGT-GO在跨物种学习和未知蛋白质功能预测方面取得了优异的成绩,进一步证明了该方法的强泛化性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
BMC Bioinformatics
BMC Bioinformatics 生物-生化研究方法
CiteScore
5.70
自引率
3.30%
发文量
506
审稿时长
4.3 months
期刊介绍: BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology. BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.
期刊最新文献
MicroSSNet: an R package for microbial network construction and analysis at the single-sample and aggregated levels. Efficient and interpretable DNA/RNA representation using Komlós-Hadamard transforms. GGAR: gradient guided adaptive regularization enhances deep learning classification of brassica species using codon usage bias. QStrain: an interactive platform for viral genome analysis and nucleic acid therapeutic design. Biochef: a client-side WebAssembly-based workflow builder for genomic data analysis.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1