Fast polypharmacy side effect prediction using tensor factorization.

Oliver Lloyd, Yi Liu, Tom R Gaunt
Journal: Bioinformatics (Oxford, England). Published: 2024-11-28. DOI: 10.1093/bioinformatics/btae706. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11646082/pdf/

Abstract

Motivation: Adverse reactions from drug combinations are increasingly common, making their accurate prediction a crucial challenge in modern medicine. Laboratory-based identification of these reactions is insufficient due to the combinatorial nature of the problem. While many computational approaches have been proposed, tensor factorization (TF) models have shown mixed results, necessitating a thorough investigation of their capabilities when properly optimized.

Results: We demonstrate that TF models can achieve state-of-the-art performance on polypharmacy side effect prediction, with our best model (SimplE) achieving median scores of 0.978 for area under the receiver-operating characteristic curve (AUROC), 0.971 for area under the precision-recall curve (AUPRC), and 1.000 for AP@50 across 963 side effects. Notably, this model reaches 98.3% of its maximum performance after just two epochs of training (approximately 4 min), making it substantially faster than existing approaches while maintaining comparable accuracy. We also find that incorporating monopharmacy data as self-looping edges in the graph performs marginally better than using it to initialize embeddings.
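
To make the scoring idea concrete, the following is a minimal sketch of the SimplE scoring function as it is commonly formulated: each entity has a head vector and a tail vector, each relation has a forward and an inverse vector, and the score is the average of two trilinear products. This is not the authors' code. The drug names, the side effect name, the embedding dimension, and the random initialization are all illustrative assumptions, and treating a single-drug side effect as a triple whose head and tail are the same drug is only one plausible reading of "self-looping edges".

```python
import random


def trilinear(a, b, c):
    """Return <a, b, c>: the sum of element-wise triple products."""
    return sum(x * y * z for x, y, z in zip(a, b, c))


def simple_score(head, rel, tail, ent_h, ent_t, rel_fwd, rel_inv):
    """SimplE-style score for the triple (head, rel, tail).

    Averages the forward term <h_head, v_rel, t_tail> and the
    inverse term <h_tail, v_rel_inv, t_head>.
    """
    forward = trilinear(ent_h[head], rel_fwd[rel], ent_t[tail])
    inverse = trilinear(ent_h[tail], rel_inv[rel], ent_t[head])
    return 0.5 * (forward + inverse)


def random_vec(rng, dim):
    """Draw an untrained embedding vector (illustrative initialization)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


rng = random.Random(0)
DIM = 4  # assumption: tiny dimension, chosen only for readability

drugs = ["drug_a", "drug_b"]   # hypothetical drug identifiers
side_effects = ["nausea"]      # hypothetical side effect (relation)

ent_h = {d: random_vec(rng, DIM) for d in drugs}          # head-role embeddings
ent_t = {d: random_vec(rng, DIM) for d in drugs}          # tail-role embeddings
rel_fwd = {s: random_vec(rng, DIM) for s in side_effects}  # forward relation vectors
rel_inv = {s: random_vec(rng, DIM) for s in side_effects}  # inverse relation vectors

# Polypharmacy triple: does the pair (drug_a, drug_b) cause nausea?
pair_score = simple_score("drug_a", "nausea", "drug_b",
                          ent_h, ent_t, rel_fwd, rel_inv)

# Monopharmacy triple encoded as a self-loop (assumed representation).
self_loop_score = simple_score("drug_a", "nausea", "drug_a",
                               ent_h, ent_t, rel_fwd, rel_inv)

print(pair_score, self_loop_score)
```

With untrained embeddings the printed scores are meaningless. In a trained model, higher scores would indicate triples the model considers more likely, and ranking metrics such as AUROC would be computed over these scores for each side effect.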

Availability and implementation: All code used in the experiments is available in our GitHub repository (https://doi.org/10.5281/zenodo.10684402). The implementation was carried out using Python 3.8.12 with PyTorch 1.7.1, accelerated with CUDA 11.4 on NVIDIA GeForce RTX 2080 Ti GPUs.

Supplementary information: Supplementary data, including precision-recall curves and F1 curves for the best-performing models, are available at Bioinformatics online.