Improving generalizability of drug-target binding prediction by pre-trained multi-view molecular representations.

Bioinformatics (Oxford, England). Pub Date: 2024-12-26. DOI: 10.1093/bioinformatics/btaf002

Xike Ouyang, Yannuo Feng, Chen Cui, Yunhe Li, Li Zhang, Han Wang



Abstract

Motivation: Most drugs begin to act in the body by binding to their target proteins, which is why considerable effort has been devoted to predicting drug-target binding during drug development. However, the inherent diversity of molecular properties, coupled with limited training data, challenges both the accuracy of these methods and their generalizability beyond the training domain.

Results: In this work, we propose a neural network architecture for highly accurate and generalizable drug-target binding prediction, named Pre-trained Multi-view Molecular Representations (PMMR). The method uses pre-trained models to transfer representations of target proteins and drugs to the domain of drug-target binding prediction, mitigating the poor generalizability that stems from limited data. Two typical representations of drug molecules, graphs and SMILES strings, are then learned by a Graph Neural Network and a Transformer, respectively, to achieve complementarity between local and global features. PMMR was evaluated on drug-target affinity and interaction benchmark datasets, where it outperformed peer methods, especially in generalizability under cold-start scenarios. Furthermore, a case study of cyclin-dependent kinase 2 (CDK2) indicated that our state-of-the-art method has potential for drug discovery.
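The cold-start evaluation mentioned in the Results can be illustrated with a small sketch. The abstract does not specify how PMMR's splits were built, so the code below is only one common way to set up a "cold-drug" split, in which every drug in the test set is absent from training. The `Pair` record, the `cold_drug_split` function, the `test_fraction` and `seed` parameters, and the toy identifiers and affinity values are all hypothetical and are not taken from the paper or its repository.

```python
import random
from collections import namedtuple

# Hypothetical record type for one measured drug-target pair.
Pair = namedtuple("Pair", ["drug", "target", "affinity"])


def cold_drug_split(pairs, test_fraction=0.2, seed=0):
    """Split pairs so that no drug in the test set appears in training.

    This is an illustrative assumption about what a cold-start split
    looks like, not the protocol used by the PMMR authors.
    """
    # Work on the set of unique drugs, not on individual pairs.
    drugs = sorted({p.drug for p in pairs})
    rng = random.Random(seed)
    rng.shuffle(drugs)

    # Hold out a fraction of the drugs (at least one) for testing.
    n_test = max(1, int(len(drugs) * test_fraction))
    test_drugs = set(drugs[:n_test])

    # Every pair follows its drug into either train or test.
    train = [p for p in pairs if p.drug not in test_drugs]
    test = [p for p in pairs if p.drug in test_drugs]
    return train, test


# Toy data: placeholder identifiers and made-up affinity values.
pairs = [
    Pair(f"drug_{d}", f"target_{t}", float(d + t))
    for d in range(10)
    for t in range(3)
]
train, test = cold_drug_split(pairs)
```

Splitting on drugs rather than on pairs is what makes the scenario "cold": a random split over pairs would let the same drug appear on both sides, so test performance would not measure generalization to unseen molecules. The same idea applies to held-out targets by grouping on `target` instead.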

Availability and implementation: https://github.com/NENUBioCompute/PMMR.


Supplementary information: Supplementary data are available at Bioinformatics online.
