TC-DTA: predicting drug-target binding affinity with transformer and convolutional neural networks.

IF 3.7 | Zone 4, Biology | Q1, Biochemical Research Methods | IEEE Transactions on NanoBioscience | Pub Date: 2024-08-12 | DOI: 10.1109/TNB.2024.3441590
Xiwei Tang, Yiqiang Zhou, Mengyun Yang, Wenjun Li
Citations: 0

Abstract

Bioinformatics is a rapidly growing field that applies computational methods to the analysis and interpretation of biological data. One of its important tasks is the identification of novel drug-target interactions (DTIs), which is also an important part of the drug discovery process. Most computational methods treat DTI prediction as a binary classification task: predicting whether a drug-target pair interacts. With the increasing amount of drug-target binding affinity data in recent years, this binary classification task can be recast as a regression task on drug-target affinity (DTA). DTA reflects the degree of drug-target binding and provides more detailed and specific information than DTI, which makes it an important tool for virtual screening in drug discovery. Effectively predicting how compounds interact with targets can help speed up the drug discovery process. In this study, we propose TC-DTA, a deep learning model for DTA prediction that uses convolutional neural networks (CNNs) and the encoder module of the Transformer architecture. First, the raw drug SMILES strings and protein amino acid sequences are extracted from the dataset and represented using different encoding methods. We then use a CNN and the Transformer's encoder module to extract feature information from the drug SMILES strings and the protein amino acid sequences, respectively. Finally, the extracted features are concatenated and fed into a multi-layer perceptron that predicts the binding affinity score. We evaluated our model on two benchmark DTA datasets, Davis and KIBA, against methods including KronRLS, SimBoost and DeepDTA. On evaluation metrics such as mean squared error (MSE), concordance index (CI) and the $r_m^2$ index, TC-DTA outperforms these baseline methods. These results demonstrate the effectiveness of the Transformer's encoder and the CNN in extracting meaningful representations from sequences, thereby improving the accuracy of DTA prediction. A deep learning model for DTA prediction can accelerate drug discovery by identifying drug candidates with high binding affinity to specific targets. Compared with traditional methods, machine learning allows for a more effective and efficient drug discovery process.
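
The abstract says that SMILES strings and amino acid sequences are "represented using different encoding methods" but does not say which ones. As a minimal sketch of what such a step could look like, the Python below uses character-level integer (label) encoding with fixed-length padding. The names `build_vocab`, `encode`, `PAD`, `UNK` and the `max_len` values are hypothetical, and the toy strings are not taken from Davis or KIBA; this is an illustration under those assumptions, not the encoding used by TC-DTA.

```python
# Hypothetical illustration: the abstract does not specify TC-DTA's encodings.
from typing import Dict, List

PAD = 0  # index reserved for padding (assumption)
UNK = 1  # index reserved for characters missing from the vocabulary (assumption)


def build_vocab(sequences: List[str]) -> Dict[str, int]:
    """Map every character seen in `sequences` to an integer index >= 2."""
    chars = sorted({ch for seq in sequences for ch in seq})
    return {ch: i + 2 for i, ch in enumerate(chars)}


def encode(seq: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Label-encode `seq`, truncate it to `max_len`, and right-pad with PAD."""
    ids = [vocab.get(ch, UNK) for ch in seq[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


# Toy inputs, invented for this example.
smiles = ["CCO", "c1ccccc1"]
proteins = ["MKV", "MKTAYIAK"]

# Separate vocabularies, because drugs and proteins use different alphabets.
# Note: character-level splitting treats a two-letter atom such as "Cl"
# as two tokens, which is a simplification.
smiles_vocab = build_vocab(smiles)
protein_vocab = build_vocab(proteins)

drug_ids = encode("CCO", smiles_vocab, max_len=6)
protein_ids = encode("MKV", protein_vocab, max_len=5)

print(drug_ids)     # [3, 3, 4, 0, 0, 0]
print(protein_ids)  # [5, 4, 7, 0, 0]
```

Fixed-length integer sequences of this kind are the usual input to an embedding layer, which is where a CNN or a Transformer encoder such as the ones named in the abstract would take over.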

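The three evaluation metrics named in the abstract can be sketched in plain Python. The abstract gives no formulas, so the definitions here are assumptions based on common usage in DTA benchmarks: CI counts a tied prediction as half concordant, and the $r_m^2$ index is taken as $r^2 \left(1 - \sqrt{|r^2 - r_0^2|}\right)$, where $r^2$ is the squared Pearson correlation and $r_0^2$ is the coefficient of determination of a regression through the origin. The function names and toy values are hypothetical.

```python
# Sketch of MSE, concordance index (CI) and the r_m^2 index.
# The formulas are assumed from common usage, not taken from the paper.
from math import sqrt
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between measured and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of comparable pairs (y_i > y_j) whose predictions are ordered
    the same way; a tied prediction counts as 0.5."""
    concordant, comparable = 0.0, 0
    for i in range(len(y_true)):
        for j in range(len(y_true)):
            if y_true[i] > y_true[j]:
                comparable += 1
                if y_pred[i] > y_pred[j]:
                    concordant += 1.0
                elif y_pred[i] == y_pred[j]:
                    concordant += 0.5
    return concordant / comparable


def rm2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """r_m^2 = r^2 * (1 - sqrt(|r^2 - r_0^2|)) (assumed definition)."""
    n = len(y_true)
    mean_t, mean_p = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    r2 = cov * cov / (var_t * var_p)  # squared Pearson correlation
    # Slope of the least-squares line through the origin, y_true ~ k * y_pred.
    k = sum(t * p for t, p in zip(y_true, y_pred)) / sum(p * p for p in y_pred)
    r02 = 1.0 - sum((t - k * p) ** 2 for t, p in zip(y_true, y_pred)) / var_t
    return r2 * (1.0 - sqrt(abs(r2 - r02)))


# Toy affinities, invented for this example.
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.5, 5.0, 7.5, 8.5]

print(mse(y_true, y_pred))                # 0.4375
print(concordance_index(y_true, y_pred))  # 5 of 6 comparable pairs are concordant
print(rm2(y_true, y_pred))
```

Lower MSE is better, while CI and $r_m^2$ are better when closer to 1. The pairwise loop in `concordance_index` is quadratic in the number of samples, which is fine for an illustration but slow on full benchmark datasets.
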
Source journal
IEEE Transactions on NanoBioscience (Engineering & Technology: Nanoscience & Nanotechnology)
CiteScore: 7.00
Self-citation rate: 5.10%
Articles published: 197
Review time: >12 weeks
Journal description: The IEEE Transactions on NanoBioscience reports on original, innovative and interdisciplinary work on all aspects of molecular systems, cellular systems, and tissues (including molecular electronics). Topics covered in the journal span a broad spectrum of aspects, both foundations and applications. Specifically, the journal covers methods and techniques, experimental aspects, design and implementation, instrumentation and laboratory equipment, clinical aspects, hardware and software data acquisition and analysis, and computer-based modelling (based on traditional or high-performance computing, parallel computers or computer networks).