TransGEM:基于转换器和基因表达数据的分子生成模型。

IF 4.4 3区 生物学 Q1 BIOCHEMICAL RESEARCH METHODS Bioinformatics Pub Date : 2024-04-17 DOI:10.1093/bioinformatics/btae189
Yanguang Liu, Hailong Yu, Xinya Duan, Xiaomin Zhang, Ting Cheng, Feng Jiang, Hao Tang, Yao Ruan, Miao Zhang, Hongyu Zhang, Qingye Zhang
{"title":"TransGEM:基于转换器和基因表达数据的分子生成模型。","authors":"Yanguang Liu, Hailong Yu, Xinya Duan, Xiaomin Zhang, Ting Cheng, Feng Jiang, Hao Tang, Yao Ruan, Miao Zhang, Hongyu Zhang, Qingye Zhang","doi":"10.1093/bioinformatics/btae189","DOIUrl":null,"url":null,"abstract":"MOTIVATION\nIt is difficult to generate new molecules with desirable bioactivity through ligand-based de novo drug design, and receptor-based de novo drug design is constrained by disease target information availability. The combination of artificial intelligence and phenotype-based de novo drug design can generate new bioactive molecules, independent from disease target information. Gene expression profiles can be used to characterize biological phenotypes. The Transformer model can be utilized to capture the associations between gene expression profiles and molecular structures due to its remarkable ability in processing contextual information.\n\n\nRESULTS\nWe propose TransGEM (Transformer-based model from gene expression to molecules), which is a phenotype-based de novo drug design model. A specialized gene expression encoder is employed to embed gene expression difference values between diseased cell lines and their corresponding normal tissue cells into TransGEM model. The results demonstrate that the TransGEM model can generate molecules with desirable evaluation metrics and property distributions. Case studies illustrate that TransGEM model can generate structurally novel molecules with good binding affinity to disease target proteins. The majority of genes with high attention scores obtained from TransGEM model are associated with the onset of the disease, indicating the potential of these genes as disease targets. Therefore, this study provides a new paradigm for de novo drug design, and it will promote phenotype-based drug discovery.\n\n\nAVAILABILITY\nThe code is available at https://github.com/hzauzqy/TransGEM.\n\n\nSUPPLEMENTARY INFORMATION\nSupplementary data are available at Bioinformatics online.","PeriodicalId":8903,"journal":{"name":"Bioinformatics","volume":null,"pages":null},"PeriodicalIF":4.4000,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"TransGEM: a molecule generation model based on transformer with gene expression data.\",\"authors\":\"Yanguang Liu, Hailong Yu, Xinya Duan, Xiaomin Zhang, Ting Cheng, Feng Jiang, Hao Tang, Yao Ruan, Miao Zhang, Hongyu Zhang, Qingye Zhang\",\"doi\":\"10.1093/bioinformatics/btae189\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"MOTIVATION\\nIt is difficult to generate new molecules with desirable bioactivity through ligand-based de novo drug design, and receptor-based de novo drug design is constrained by disease target information availability. The combination of artificial intelligence and phenotype-based de novo drug design can generate new bioactive molecules, independent from disease target information. Gene expression profiles can be used to characterize biological phenotypes. The Transformer model can be utilized to capture the associations between gene expression profiles and molecular structures due to its remarkable ability in processing contextual information.\\n\\n\\nRESULTS\\nWe propose TransGEM (Transformer-based model from gene expression to molecules), which is a phenotype-based de novo drug design model. A specialized gene expression encoder is employed to embed gene expression difference values between diseased cell lines and their corresponding normal tissue cells into TransGEM model. The results demonstrate that the TransGEM model can generate molecules with desirable evaluation metrics and property distributions. Case studies illustrate that TransGEM model can generate structurally novel molecules with good binding affinity to disease target proteins. The majority of genes with high attention scores obtained from TransGEM model are associated with the onset of the disease, indicating the potential of these genes as disease targets. Therefore, this study provides a new paradigm for de novo drug design, and it will promote phenotype-based drug discovery.\\n\\n\\nAVAILABILITY\\nThe code is available at https://github.com/hzauzqy/TransGEM.\\n\\n\\nSUPPLEMENTARY INFORMATION\\nSupplementary data are available at Bioinformatics online.\",\"PeriodicalId\":8903,\"journal\":{\"name\":\"Bioinformatics\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.4000,\"publicationDate\":\"2024-04-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Bioinformatics\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1093/bioinformatics/btae189\",\"RegionNum\":3,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"BIOCHEMICAL RESEARCH METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/bioinformatics/btae189","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0

摘要

动机 通过基于配体的从头药物设计很难产生具有理想生物活性的新分子,而基于受体的从头药物设计又受到疾病靶点信息的限制。人工智能与基于表型的从头药物设计相结合,可以产生新的生物活性分子,而不受疾病靶点信息的影响。基因表达谱可用来描述生物表型。由于 Transformer 模型在处理上下文信息方面的卓越能力,它可以用来捕捉基因表达谱和分子结构之间的关联。结果我们提出了 TransGEM(基于 Transformer 的基因表达到分子模型),这是一种基于表型的新药设计模型。我们采用专门的基因表达编码器将病变细胞系与相应正常组织细胞的基因表达差异值嵌入 TransGEM 模型。结果表明,TransGEM 模型可以生成具有理想评价指标和属性分布的分子。案例研究表明,TransGEM 模型可以生成结构新颖、与疾病靶蛋白结合亲和力良好的分子。从 TransGEM 模型中获得的高关注度基因大多与疾病的发病有关,这表明这些基因有可能成为疾病靶点。因此,这项研究为从头开始的药物设计提供了一个新的范例,它将促进基于表型的药物发现。AVAILABILITY代码可在https://github.com/hzauzqy/TransGEM.SUPPLEMENTARY INFORMATIONSupplementary data are available at Bioinformatics online.
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
TransGEM: a molecule generation model based on transformer with gene expression data.
MOTIVATION It is difficult to generate new molecules with desirable bioactivity through ligand-based de novo drug design, and receptor-based de novo drug design is constrained by disease target information availability. The combination of artificial intelligence and phenotype-based de novo drug design can generate new bioactive molecules, independent from disease target information. Gene expression profiles can be used to characterize biological phenotypes. The Transformer model can be utilized to capture the associations between gene expression profiles and molecular structures due to its remarkable ability in processing contextual information. RESULTS We propose TransGEM (Transformer-based model from gene expression to molecules), which is a phenotype-based de novo drug design model. A specialized gene expression encoder is employed to embed gene expression difference values between diseased cell lines and their corresponding normal tissue cells into TransGEM model. The results demonstrate that the TransGEM model can generate molecules with desirable evaluation metrics and property distributions. Case studies illustrate that TransGEM model can generate structurally novel molecules with good binding affinity to disease target proteins. The majority of genes with high attention scores obtained from TransGEM model are associated with the onset of the disease, indicating the potential of these genes as disease targets. Therefore, this study provides a new paradigm for de novo drug design, and it will promote phenotype-based drug discovery. AVAILABILITY The code is available at https://github.com/hzauzqy/TransGEM. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Bioinformatics
Bioinformatics 生物-生化研究方法
CiteScore
11.20
自引率
5.20%
发文量
753
审稿时长
2.1 months
期刊介绍: The leading journal in its field, Bioinformatics publishes the highest quality scientific papers and review articles of interest to academic and industrial researchers. Its main focus is on new developments in genome bioinformatics and computational biology. Two distinct sections within the journal - Discovery Notes and Application Notes- focus on shorter papers; the former reporting biologically interesting discoveries using computational methods, the latter exploring the applications used for experiments.
期刊最新文献
PQSDC: a parallel lossless compressor for quality scores data via sequences partition and Run-Length prediction mapping. MUSE-XAE: MUtational Signature Extraction with eXplainable AutoEncoder enhances tumour types classification. CopyVAE: a variational autoencoder-based approach for copy number variation inference using single-cell transcriptomics CORDAX web server: An online platform for the prediction and 3D visualization of aggregation motifs in protein sequences. LMCrot: An enhanced protein crotonylation site predictor by leveraging an interpretable window-level embedding from a transformer-based protein language model.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1