PLPTP: A Motif-based Interpretable Deep Learning Framework Based on Protein Language Models for Peptide Toxicity Prediction

Journal of Molecular Biology (IF 4.5; Zone 2, Biology; JCR Q1, Biochemistry & Molecular Biology) Pub Date: 2025-06-15 Epub Date: 2025-03-28 DOI: 10.1016/j.jmb.2025.169115
Shun Gao, Yanna Jia, Feifei Cui, Junlin Xu, Yajie Meng, Leyi Wei, Qingchen Zhang, Quan Zou, Zilong Zhang
Volume 437, Issue 12, Article 169115. Publisher page: https://www.sciencedirect.com/science/article/pii/S0022283625001810

Abstract

Peptide toxicity prediction is important in drug development and biotechnology, because accurately identifying toxic peptide sequences is crucial for designing safer peptide-based drugs. This study proposes a deep learning model for peptide toxicity prediction that integrates Evolutionary Scale Modeling (ESM2), a Bidirectional Long Short-Term Memory (BiLSTM) network, and a Deep Neural Network (DNN). ESM2 captures evolutionary information from peptide sequences, providing rich context for each sequence; the BiLSTM extracts contextual dependencies, including long-range dependencies within the sequence; and the DNN classifies the extracted features to produce the final toxicity prediction. To enhance the reliability and transparency of the model, we also conducted motif analysis to identify key patterns in the data, which helps to explain the model’s attention mechanism and its classification performance. To address class imbalance in the dataset, we employed Focal Loss as the loss function, which improves the model’s ability to identify minority-class samples by reducing the contribution of easily classified samples. Experimental results show that the proposed model performs well across multiple evaluation metrics, particularly on imbalanced data, and improves on traditional methods. These results highlight the model’s potential to improve the accuracy of peptide toxicity prediction and its value for drug development and biotechnology research. The PLPTP web server is available at https://www.bioai-lab.com/PLPTP.
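
As an illustration of how Focal Loss reduces the contribution of easily classified samples, the following is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). It is not the PLPTP implementation: the function names, the default values gamma = 2.0 and alpha = 0.25, and the example probabilities are assumptions chosen for illustration, since the abstract does not report the settings used.

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Plain binary cross-entropy for one sample.

    p: predicted probability of the positive (toxic) class.
    y: true label, 0 or 1.
    """
    p_t = p if y == 1 else 1.0 - p
    return -math.log(max(p_t, eps))


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one sample (illustrative defaults, not from the paper).

    gamma: focusing parameter; larger values down-weight easy samples more.
    alpha: weight for the positive class (negatives get 1 - alpha);
           pass None to disable class weighting.
    """
    p_t = p if y == 1 else 1.0 - p            # probability assigned to the true class
    p_t = min(max(p_t, eps), 1.0)             # guard against log(0)
    if alpha is None:
        alpha_t = 1.0
    else:
        alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # A confidently correct ("easy") positive and a poorly predicted ("hard") positive.
    for name, p in [("easy", 0.95), ("hard", 0.30)]:
        ce = cross_entropy(p, 1)
        fl = focal_loss(p, 1)
        print(f"{name}: cross-entropy={ce:.4f} focal={fl:.6f}")
```

With gamma set to 0 and alpha set to None, the function reduces to ordinary cross-entropy, which makes the focusing term easy to isolate when comparing the two losses.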

Source journal
Journal of Molecular Biology (Biology: Biochemistry & Molecular Biology)
CiteScore: 11.30
Self-citation rate: 1.80%
Articles published: 412
Review time: 28 days
Journal description: Journal of Molecular Biology (JMB) provides high quality, comprehensive and broad coverage in all areas of molecular biology. The journal publishes original scientific research papers that provide mechanistic and functional insights and report a significant advance to the field. The journal encourages the submission of multidisciplinary studies that use complementary experimental and computational approaches to address challenging biological questions. Research areas include but are not limited to: Biomolecular interactions, signaling networks, systems biology; Cell cycle, cell growth, cell differentiation; Cell death, autophagy; Cell signaling and regulation; Chemical biology; Computational biology, in combination with experimental studies; DNA replication, repair, and recombination; Development, regenerative biology, mechanistic and functional studies of stem cells; Epigenetics, chromatin structure and function; Gene expression; Membrane processes, cell surface proteins and cell-cell interactions; Methodological advances, both experimental and theoretical, including databases; Microbiology, virology, and interactions with the host or environment; Microbiota mechanistic and functional studies; Nuclear organization; Post-translational modifications, proteomics; Processing and function of biologically important macromolecules and complexes; Molecular basis of disease; RNA processing, structure and functions of non-coding RNAs, transcription; Sorting, spatiotemporal organization, trafficking; Structural biology; Synthetic biology; Translation, protein folding, chaperones, protein degradation and quality control.