SubGE-DDI: A new prediction model for drug-drug interaction established through biomedical texts and drug-pairs knowledge subgraph enhancement.

IF 3.6 | Zone 2 (Biology) | PLoS Computational Biology | Pub Date: 2024-04-16 | DOI: 10.1371/journal.pcbi.1011989

Yiyang Shi, Mingxiu He, Junheng Chen, Fangfang Han, Yongming Cai


Citations: 0

Abstract

Biomedical texts are an important data source for investigating drug-drug interactions (DDIs) in pharmacovigilance. Although researchers have attempted to investigate DDIs in biomedical texts and to predict unknown DDIs, the lack of accurate manual annotations significantly hinders the performance of machine learning algorithms. In this study, a new DDI prediction framework, the Subgraph Enhance model for DDI (SubGE-DDI), was developed to improve that performance. The model uses drug-pair knowledge subgraph information to make predictions on large-scale plain text without requiring many annotations. It treats DDI prediction as a multi-class classification problem and predicts the specific DDI type for each drug pair (e.g., Mechanism, Effect, Advise, Interact, or Negative). The drug-pair knowledge subgraph was derived from a large drug knowledge graph built from various public datasets, such as DrugBank, TwoSIDES, OffSIDES, DrugCentral, Entrez Gene, SMPDB (The Small Molecule Pathway Database), CTD (The Comparative Toxicogenomics Database), and SIDER. SubGE-DDI was evaluated on a public dataset (the SemEval-2013 Task 9 dataset) and compared with other state-of-the-art baselines. It achieved a micro F1 score of 83.91% and a macro F1 score of 84.75% on the test set, outperforming those baselines. These findings show that the proposed drug-pair knowledge subgraph-assisted model can effectively improve the performance of DDI prediction from biomedical texts.
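The abstract reports both a micro and a macro F1 score for a multi-class task. The two can differ noticeably when classes are imbalanced. The sketch below is a minimal, illustrative computation of both averages on made-up labels; it is not the authors' evaluation script. The function names, the toy data, and the choice to average only over the four positive DDI types (treating Negative as the background class) are assumptions, not details confirmed by the abstract.

```python
from collections import Counter

# Positive DDI types named in the abstract. Excluding "Negative" from the
# averages is an assumption made for this sketch.
POSITIVE_LABELS = ["Mechanism", "Effect", "Advise", "Interact"]


def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred, labels=POSITIVE_LABELS):
    """Return (micro_f1, macro_f1) over the given labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            if g in labels:
                tp[g] += 1
        else:
            if p in labels:
                fp[p] += 1  # predicted a positive type that was wrong
            if g in labels:
                fn[g] += 1  # missed a gold positive type
    # Micro: pool the counts over all labels, then compute a single F1.
    micro = f1_from_counts(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1_from_counts(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro


# Hypothetical toy data, for illustration only.
gold = ["Mechanism", "Mechanism", "Effect", "Advise",
        "Interact", "Negative", "Negative", "Effect"]
pred = ["Mechanism", "Effect", "Effect", "Advise",
        "Negative", "Negative", "Mechanism", "Effect"]

micro, macro = micro_macro_f1(gold, pred)
print(f"micro F1 = {micro:.4f}, macro F1 = {macro:.4f}")
```

Micro F1 pools true positives, false positives, and false negatives across classes, so frequent classes dominate it. Macro F1 weights every class equally, so performance on a rare class moves it as much as performance on a common one.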




Source journal

PLoS Computational Biology (Biology: Biochemical Research Methods)

CiteScore: 7.10

Self-citation rate: 4.70%

Articles published: 820

Journal description: PLOS Computational Biology features works of exceptional significance that further our understanding of living systems at all scales, from molecules and cells to patient populations and ecosystems, through the application of computational methods. Readers include life and computational scientists, who can take the important findings presented here to the next level of discovery. Research articles must be declared as belonging to a relevant section. More information about the sections can be found in the submission guidelines. Research articles should model aspects of biological systems, demonstrate both methodological and scientific novelty, and provide profound new biological insights. Generally, the reliability and significance of biological discovery through computation should be validated and enriched by experimental studies. Inclusion of experimental validation is not required for publication, but should be referenced where possible. Inclusion of experimental validation of a modest biological discovery through computation does not render a manuscript suitable for PLOS Computational Biology. Research articles specifically designated as Methods papers should describe outstanding methods of exceptional importance that have been shown, or have the promise, to provide new biological insights. The method must already be widely adopted, or have the promise of wide adoption by a broad community of users. Enhancements to existing published methods will only be considered if those enhancements bring exceptional new capabilities.