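MCNN-AAPT: accurate classification and functional prediction of amino acid and peptide transporters in secondary active transporters using protein language models and multi-window deep learning.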

IF 2.7 | Zone 3, Biology | Q3 Biochemistry & Molecular Biology | Journal of Biomolecular Structure & Dynamics | Pub Date: 2024-11-22 | DOI: 10.1080/07391102.2024.2431664
Muhammad Shahid Malik, Van The Le, Syed Muazzam Ali Shah, Yu-Yen Ou
Citations: 0

Abstract

MCNN-AAPT: accurate classification and functional prediction of amino acid and peptide transporters in secondary active transporters using protein language models and multi-window deep learning.

Secondary active transporters play a crucial role in cellular physiology by facilitating the movement of molecules across cell membranes. Identifying the functional classes of these transporters, particularly amino acid and peptide transporters, is essential for understanding their involvement in various physiological processes and disease pathways, including cancer. This study aims to develop a robust computational framework that integrates pre-trained protein language models and deep learning techniques to classify amino acid and peptide transporters within the secondary active transporter (SAT) family and predict their functional association with solute carrier (SLC) proteins. The study leverages a comprehensive dataset of 448 secondary active transporters, including 36 solute carrier proteins, obtained from UniProt and the Transporter Classification Database (TCDB). Three state-of-the-art protein language models, ProtTrans, ESM-1b, and ESM-2, are evaluated within a deep learning neural network architecture that employs a multi-window scanning technique to capture local and global sequence patterns. The ProtTrans-based feature set demonstrates exceptional performance, achieving a classification accuracy of 98.21% with 87.32% sensitivity and 99.76% specificity for distinguishing amino acid and peptide transporters from other SATs. Furthermore, the model maintains strong predictive ability for SLC proteins, with an overall accuracy of 88.89% and a Matthews Correlation Coefficient (MCC) of 0.7750. This study showcases the power of integrating pre-trained protein language models and deep learning techniques for the functional classification of secondary active transporters and the prediction of associated solute carrier proteins. The findings have significant implications for drug development, disease research, and the broader understanding of cellular transport mechanisms.
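
The abstract reports accuracy, sensitivity, specificity, and the Matthews Correlation Coefficient (MCC). As a reading aid, the sketch below shows how these four metrics are conventionally derived from a binary confusion matrix, with amino acid and peptide transporters as the positive class. It is a minimal illustration, not the authors' evaluation code: the names `ConfusionCounts` and `binary_metrics` are hypothetical, the counts in the usage example are made up for illustration and are not taken from the paper, and returning 0.0 when a denominator is zero is an assumed convention.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical helper type)."""
    tp: int  # positives predicted as positive
    fp: int  # negatives predicted as positive
    tn: int  # negatives predicted as negative
    fn: int  # positives predicted as negative


def _ratio(numerator: float, denominator: float) -> float:
    # Assumed convention: an undefined ratio is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def binary_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, sensitivity, specificity, and MCC for the counts."""
    total = c.tp + c.fp + c.tn + c.fn
    mcc_denominator = math.sqrt(
        (c.tp + c.fp) * (c.tp + c.fn) * (c.tn + c.fp) * (c.tn + c.fn)
    )
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),  # true negative rate
        "mcc": _ratio(c.tp * c.tn - c.fp * c.fn, mcc_denominator),
    }


if __name__ == "__main__":
    # Illustrative counts only; they do not reproduce the paper's results.
    example = ConfusionCounts(tp=8, fp=5, tn=85, fn=2)
    for name, value in binary_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the positive class is small, as it is for a transporter subfamily within a larger set.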

Source journal
Journal of Biomolecular Structure & Dynamics (Biology: Biochemistry & Molecular Biology)
CiteScore: 8.90
Self-citation rate: 9.10%
Articles published: 597
Review time: 2 months
About the journal: The Journal of Biomolecular Structure and Dynamics welcomes manuscripts on biological structure, dynamics, interactions and expression. The Journal is one of the leading publications in high-end computational science, atomic structural biology, bioinformatics, virtual drug design, genomics and biological networks.