A Deep Learning Model Implementation of TabNet for Predicting Peptide-Protein Interaction in Cancer

Indian Journal of Computer Science and Engineering (Q4, Engineering) | Pub Date: 2024-04-20 | DOI: 10.21817/indjcse/2024/v15i1/241501032

Hanif Aditya Pradana, Ahmad Ardra Damarjati, Isman Kurniawan, W. Kusuma


Citations: 0

Abstract

Cancer has become one of the deadliest diseases in the world, caused mainly by the accumulation of somatic and inherited mutations. This phenomenon can be traced back to the molecular level, specifically to proteins. Proteins are molecules that carry out various bioprocesses in the human body through their interactions with other molecules. Abnormalities in these interactions can lead to various undesirable outcomes, including diseases such as cancer. Peptides are promising candidates for targeting protein interactions to treat cancer. However, identifying peptides that correspond to target proteins in the laboratory is time-consuming and expensive, so computational methods are needed to aid identification. This study uses TabNet, a deep learning-based computational method. For comparison, we selected two ensemble learning techniques, Random Forest and Extreme Gradient Boosting, and two deep learning methods, Convolutional Neural Network and Stacked Autoencoder-Deep Neural Network. Predictions are performed on a multi-feature peptide-protein interaction dataset whose features include position-specific scoring matrices, intrinsic disorder, amino acid sequence, and physicochemical properties. Among our selected metrics, TabNet achieved a higher AUC (0.7) and fewer false negatives than the other models.
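The two metrics named in the abstract, AUC and the number of false negatives, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `roc_auc` and `false_negatives`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the paper.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def false_negatives(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> int:
    """Count true interactions (label 1) that are scored below the threshold."""
    return sum(1 for t, s in zip(y_true, y_score) if t == 1 and s < threshold)


# Toy labels and scores, invented for illustration only (not data from the paper).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2]

auc = roc_auc(labels, scores)
fn = false_negatives(labels, scores)
print(f"AUC={auc:.3f}, false negatives={fn}")
```

The pairwise definition is quadratic in the number of samples, which is acceptable for a toy example; a rank-based computation would be the usual choice for a full dataset. Counting false negatives separately matters in this setting because a missed interacting peptide is a candidate that never reaches the laboratory.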


Source journal

Indian Journal of Computer Science and Engineering (Engineering: Engineering (miscellaneous))

Self-citation rate

0.00%

Publication count

146