GPT4Kinase: High-accuracy prediction of inhibitor-kinase binding affinity utilizing large language model

IF 7.7 | Zone 1, Chemistry | Q1, Biochemistry & Molecular Biology | International Journal of Biological Macromolecules | Pub Date: 2024-10-31 | DOI: 10.1016/j.ijbiomac.2024.137069
Kaifeng Liu, Xiangyu Yu, Huizi Cui, Wannan Li, Weiwei Han
International Journal of Biological Macromolecules, volume 282, Article 137069. https://www.sciencedirect.com/science/article/pii/S0141813024078784
Citations: 0

Abstract

Accurate prediction of inhibitor-kinase binding affinity is crucial in biological research and medical applications. Kinases play a pivotal role in numerous cellular processes and are essential enzymes in the Mitogen-Activated Protein Kinase (MAPK) signaling pathway. This study uses Large Language Models (LLMs), specifically GPT-4, to predict the binding affinity between inhibitors and kinases within the MAPK pathway, including Raf protein kinase (RAF), Mitogen-activated protein kinase kinase (MEK) and Extracellular Signal-Regulated Kinase (ERK). GPT-4 achieved 87.31% accuracy in predicting RAF binding affinity and 77.00% accuracy in comprehensive prediction tasks, substantially outperforming existing mainstream methods such as AutoDock Vina (21.21%), BatchDTA (52.00%) and KIPP (59.60%). GPT-4 was also used to delineate the features of high-affinity and low-affinity molecules, as well as their contributing functional groups; these groups were subsequently validated through molecular docking. To test the generalizability of the method, we applied it to six other kinases and achieved a maximum accuracy of 83.78%. On a dataset comprising over 200 kinases, the method reached an accuracy of 66.20%. The study showcases the transformative impact of LLMs on molecular binding affinity prediction, with major implications for biological sciences and therapeutic development.
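The abstract reports its results as accuracy percentages, which suggests a classification-style evaluation. As a minimal sketch of how such a figure is computed, assuming binary "high"/"low" affinity labels (the labels, the function name `accuracy`, and the toy data below are illustrative assumptions, not taken from the paper):

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Return the fraction of predictions that match the reference labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one label is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels for illustration only; this is NOT data from the study.
actual = ["high", "low", "high", "low"]
predicted = ["high", "low", "low", "low"]

print(f"accuracy = {accuracy(predicted, actual):.2%}")  # accuracy = 75.00%
```

Under this reading, a reported accuracy of 87.31% would mean that roughly 87 of every 100 inhibitor-kinase pairs received the correct affinity class; how the paper actually defines the classes is not stated in the abstract.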
Source journal
International Journal of Biological Macromolecules (Biology: Biochemistry & Molecular Biology)
CiteScore: 13.70
Self-citation rate: 9.80%
Articles published: 2728
Review time: 64 days
Journal description: The International Journal of Biological Macromolecules is a well-established international journal dedicated to research on the chemical and biological aspects of natural macromolecules. Focusing on proteins, macromolecular carbohydrates, glycoproteins, proteoglycans, lignins, biological poly-acids, and nucleic acids, the journal presents the latest findings in molecular structure, properties, biological activities, interactions, modifications, and functional properties. Papers must offer novel insights, encompassing related model systems, structural conformational studies, theoretical developments, and analytical techniques. Each paper is required to focus primarily on at least one named biological macromolecule, reflected in the title, abstract, and text.