Identification of Multi-functional Therapeutic Peptides Based on Prototypical Supervised Contrastive Learning.

IF 3.9, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Interdisciplinary Sciences: Computational Life Sciences, Pub Date: 2024-12-23, DOI: 10.1007/s12539-024-00674-3
Sitong Niu, Henghui Fan, Fei Wang, Xiaomei Yang, Junfeng Xia

Abstract

High-throughput sequencing has driven exponential growth in the number of known peptide sequences, creating a need for computational methods that identify multi-functional therapeutic peptides (MFTP) directly from sequence. Existing computational methods, however, struggle with class imbalance, particularly when learning effective sequence representations. To address this, we propose PSCFA, a prototypical supervised contrastive learning method with feature augmentation for MFTP prediction. We train the feature extractor and the classifier in two separate stages, on the principle that better feature representations improve classification accuracy. In the first stage, a prototypical supervised contrastive learning strategy makes the distribution of the feature space more uniform, so that features of samples in the same category cluster tightly while those of different categories are more dispersed. In the second stage, a feature augmentation strategy focused on infrequent labels (tail labels) refines classifier training: a prototype-based variational autoencoder captures the semantic links between common labels (head labels) and their prototypes, and this knowledge is then transferred to tail labels to generate enhanced features for training the classifier. Experiments show that PSCFA significantly outperforms existing methods for MFTP prediction, advancing therapeutic peptide identification.
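
To make the first-stage idea concrete, the following is a minimal sketch of a prototype-based contrastive loss for multi-label data. It is not the loss used in PSCFA, whose exact form the abstract does not give. It assumes one prototype vector per label, cosine similarity, and a temperature hyperparameter; the function names, the `temperature` value, and the toy vectors are all hypothetical. Each sample is pulled toward the prototypes of its own labels and pushed away from the rest.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def prototype_contrastive_loss(embeddings, label_sets, prototypes, temperature=0.1):
    """Illustrative loss: for each sample, average the negative log-softmax
    probability of its positive prototypes, then average over samples.
    Assumes every sample has at least one label."""
    protos = [l2_normalize(p) for p in prototypes]
    total = 0.0
    for z, labels in zip(embeddings, label_sets):
        z = l2_normalize(z)
        # Cosine similarity to every label prototype, scaled by the temperature.
        logits = [sum(a * b for a, b in zip(z, p)) / temperature for p in protos]
        # Numerically stable log-sum-exp over all prototypes.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += -sum(logits[c] - log_denom for c in labels) / len(labels)
    return total / len(embeddings)

# Toy check: two labels with orthogonal prototypes (hypothetical values).
prototypes = [[1.0, 0.0], [0.0, 1.0]]
aligned = prototype_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [{0}, {1}], prototypes)
misaligned = prototype_contrastive_loss([[0.0, 1.0], [1.0, 0.0]], [{0}, {1}], prototypes)
print(aligned < misaligned)
```

Embeddings that sit on their own label's prototype give a loss near zero, while embeddings that sit on the wrong prototype are penalized heavily, which is the clustering behavior the first stage aims for.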

Source journal
Interdisciplinary Sciences: Computational Life Sciences (MATHEMATICAL & COMPUTATIONAL BIOLOGY)
CiteScore: 8.60
Self-citation rate: 4.20%
Articles published: 55
About the journal: Interdisciplinary Sciences: Computational Life Sciences aims to cover the most recent and outstanding developments in interdisciplinary areas of science, with a particular focus on computational life sciences, an area developing rapidly at the forefront of scientific research and technology. The journal publishes original papers of significant general interest covering recent research and developments. Articles are published rapidly by taking full advantage of internet technology for online submission and peer review of manuscripts, and then by publishing Online First™ through SpringerLink even before the issue is built or sent to the printer. The editorial board consists of many leading scientists with international reputations, including Luc Montagnier (UNESCO, France), Dennis Salahub (University of Calgary, Canada), and Weitao Yang (Duke University, USA). Prof. Dongqing Wei of Shanghai Jiao Tong University is appointed as the editor-in-chief; he has made important contributions in bioinformatics and computational physics and is best known for his ground-breaking work on the theory of ferroelectric liquids. With the help of a team of associate editors and the editorial board, the aim is to build an international journal with a sound reputation.