SSL-VQ: vector-quantized variational autoencoders for semi-supervised prediction of therapeutic targets across diverse diseases

Satoko Namba, Chen Li, Noriko Yuyama Otani, Yoshihiro Yamanishi
Bioinformatics (Oxford, England). Published 2025-02-04. DOI: 10.1093/bioinformatics/btaf039. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11842052/pdf/

Abstract

Motivation: Identifying effective therapeutic targets poses a challenge in drug discovery, especially for uncharacterized diseases without known therapeutic targets (e.g. rare diseases, intractable diseases).

Results: This study presents a novel machine learning approach using multimodal vector-quantized variational autoencoders (VQ-VAEs) for predicting therapeutic target molecules across diseases. To address the lack of known therapeutic target-disease associations, we incorporate the information on uncharacterized diseases without known targets or uncharacterized proteins without known indications (applicable diseases) in the semi-supervised learning (SSL) framework. The method integrates disease-specific and protein perturbation profiles with genetic perturbations (e.g. gene knockdowns and gene overexpressions) at the transcriptome level. Cross-cell representation learning, facilitated by VQ-VAEs, was performed to extract informative features from protein perturbation profiles across diverse human cell types. Concurrently, cross-disease representation learning was performed, leveraging VQ-VAE, to extract informative features reflecting disease states from disease-specific profiles. The model's applicability to uncharacterized diseases or proteins is enhanced by considering the consistency between disease-specific and patient-specific signatures. The efficacy of the method is demonstrated across three practical scenarios for 79 diseases: target repositioning for target-disease pairs, new target prediction for uncharacterized diseases, and new indication prediction for uncharacterized proteins. This method is expected to be valuable for identifying therapeutic targets across various diseases.
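
As a rough illustration of the vector-quantization step that gives VQ-VAEs their name, the sketch below maps each continuous latent vector to its nearest entry in a codebook. It is a generic NumPy example under stated assumptions, not the SSL-VQ implementation: the function name `quantize`, the toy codebook, and the toy latent vectors are all hypothetical, and a real VQ-VAE would additionally train the encoder, decoder, and codebook.

```python
import numpy as np


def quantize(latents: np.ndarray, codebook: np.ndarray):
    """Map each latent vector to its nearest codebook entry (Euclidean distance).

    latents:  (n, d) array of continuous encoder outputs.
    codebook: (k, d) array of code vectors.
    Returns (indices, quantized) with shapes (n,) and (n, d).
    """
    # Pairwise differences between every latent and every code vector: (n, k, d).
    diffs = latents[:, None, :] - codebook[None, :, :]
    # Squared Euclidean distances: (n, k).
    sq_dists = np.sum(diffs ** 2, axis=-1)
    # Index of the closest code vector for each latent.
    indices = np.argmin(sq_dists, axis=1)
    return indices, codebook[indices]


# Hypothetical toy data: three 2-dimensional code vectors and three latents.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]])
latents = np.array([[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]])

indices, quantized = quantize(latents, codebook)
print(indices)    # discrete code assigned to each latent
print(quantized)  # the code vectors that replace the latents
```

In this toy setting each profile would be summarized by a discrete code index, which is the sense in which a quantized representation can act as a compact feature for downstream prediction.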

Availability and implementation: Code: github.com/YamanishiLab/SSL-VQ and Data: 10.5281/zenodo.14644837.

Supplementary information: Supplementary data are available at Bioinformatics online.
