Fast polypharmacy side effect prediction using tensor factorization.

Bioinformatics (Oxford, England), impact factor 5.4. Published: 2024-11-28. DOI: 10.1093/bioinformatics/btae706

Oliver Lloyd, Yi Liu, Tom R Gaunt

Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11646082/pdf/

Abstract

Motivation: Adverse reactions from drug combinations are increasingly common, making their accurate prediction a crucial challenge in modern medicine. Laboratory-based identification of these reactions is insufficient due to the combinatorial nature of the problem. While many computational approaches have been proposed, tensor factorization (TF) models have shown mixed results, necessitating a thorough investigation of their capabilities when properly optimized.

Results: We demonstrate that TF models can achieve state-of-the-art performance on polypharmacy side effect prediction, with our best model (SimplE) achieving median scores of 0.978 area under receiver-operating characteristic curve, 0.971 area under precision-recall curve, and 1.000 AP@50 across 963 side effects. Notably, this model reaches 98.3% of its maximum performance after just two epochs of training (approximately 4 min), making it substantially faster than existing approaches while maintaining comparable accuracy. We also find that incorporating monopharmacy data as self-looping edges in the graph performs marginally better than using it to initialize embeddings.
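As a rough illustration of the kind of model the Results paragraph refers to, the sketch below implements the standard SimplE scoring function in plain Python. It is not the authors' code: the class name, the embedding size, the random initialization, and the indexing of drugs and side effects as integers are all assumptions made for this example. Each drug is given a head and a tail embedding, each side effect a forward and an inverse relation embedding, and a triple (drug i, side effect r, drug j) is scored as the average of two trilinear products. A monopharmacy fact can then be represented as a self-looping triple (drug i, side effect r, drug i), which is one plausible reading of the self-loop strategy mentioned above.

```python
import random


def trilinear(a, b, c):
    """Sum over dimensions of the element-wise product of three vectors."""
    return sum(x * y * z for x, y, z in zip(a, b, c))


class SimplE:
    """Minimal SimplE scorer (illustrative only; no training loop)."""

    def __init__(self, n_drugs, n_side_effects, dim, seed=0):
        rng = random.Random(seed)

        def init(n):
            # Small random vectors; the initialization scheme is an assumption.
            return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n)]

        self.head = init(n_drugs)            # drug embedding when in head position
        self.tail = init(n_drugs)            # drug embedding when in tail position
        self.rel = init(n_side_effects)      # side-effect (relation) embedding
        self.rel_inv = init(n_side_effects)  # inverse-relation embedding

    def score(self, i, r, j):
        """Score the triple (drug i, side effect r, drug j); higher is more plausible."""
        forward = trilinear(self.head[i], self.rel[r], self.tail[j])
        backward = trilinear(self.head[j], self.rel_inv[r], self.tail[i])
        return 0.5 * (forward + backward)


# Hypothetical usage: 3 drugs, 2 side effects, 4-dimensional embeddings.
model = SimplE(n_drugs=3, n_side_effects=2, dim=4)
pair_score = model.score(0, 1, 2)  # polypharmacy triple (drug 0, effect 1, drug 2)
self_loop = model.score(0, 1, 0)   # monopharmacy fact encoded as a self-loop
```

In a trained model these raw scores would typically be passed through a sigmoid and ranked per side effect to compute metrics such as AUROC and AUPRC; that step is omitted here.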

Availability and implementation: All code used in the experiments is available in our GitHub repository (https://doi.org/10.5281/zenodo.10684402). The implementation was carried out using Python 3.8.12 with PyTorch 1.7.1, accelerated with CUDA 11.4 on NVIDIA GeForce RTX 2080 Ti GPUs.

Supplementary information: Supplementary data, including precision-recall curves and F1 curves for the best-performing model, are available at Bioinformatics online.
