Drug side effect prediction through linear neighborhoods and multiple data source integration

2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Pub Date: 2016-12-01. DOI: 10.1109/BIBM.2016.7822555

Wen Zhang, Yanlin Chen, Shikui Tu, Feng Liu, Qianlong Qu


Citations: 53

Abstract

Predicting drug side effects is a critical task in drug discovery and attracts great attention in both academia and industry. Although many machine learning methods have been proposed, great challenges arise with the boom of precision medicine. On the one hand, many methods assume that similar drugs may share the same side effects, but measuring drug-drug similarity appropriately is challenging. On the other hand, multi-source data provide diverse information for the analysis of side effects and should be integrated for high-accuracy prediction. In this paper, we tackle the side effect prediction problem through linear neighborhoods and multi-source data integration. In the feature space, linear neighborhoods are constructed to extract a drug-drug similarity, which we call the “linear neighborhood similarity”. By transferring this similarity into the side effect space, known side effect information is propagated through the similarity-based graph. Thus, we propose the linear neighborhood similarity method (LNSM), which uses single-source data for side effect prediction. Further, we extend LNSM to multi-source data and propose two data integration methods, the similarity matrix integration method (LNSM-SMI) and the cost minimization integration method (LNSM-CMI), which integrate drug substructure, target, transporter, enzyme, pathway and indication data to improve prediction accuracy. The proposed methods are evaluated on benchmark datasets. LNSM produces satisfactory results on single-source data. The data integration methods (LNSM-SMI and LNSM-CMI) effectively integrate multi-source data and outperform other state-of-the-art side effect prediction methods in cross validation and in an independent test. The proposed methods are promising for drug side effect prediction.
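The abstract gives no formulas, so the following is only a rough sketch of the two ideas it names: reconstructing each drug from its nearest neighbors to obtain similarity weights, then propagating known side effects over the resulting graph. Everything specific here is an assumption rather than the authors' implementation: the function names, the neighbor count `k`, the regularizer `reg`, the propagation rate `alpha`, the clip-and-renormalize shortcut used in place of a constrained solver, and the plain weighted average used as a stand-in for similarity matrix integration.

```python
import numpy as np

def linear_neighborhood_similarity(X, k=3, reg=1e-3):
    """Row i holds non-negative weights (summing to 1) that approximately
    reconstruct drug i from its k nearest neighbors in feature space."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared Euclidean
    np.fill_diagonal(dist, np.inf)                    # a drug is not its own neighbor
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dist[i])[:k]
        Z = X[nbrs] - X[i]                            # neighbors centered on drug i
        G = Z @ Z.T                                   # local Gram matrix, shape (k, k)
        G += reg * (np.trace(G) + 1.0) * np.eye(k)    # assumed ridge term for stability
        w = np.linalg.solve(G, np.ones(k))
        w = np.clip(w, 0.0, None)                     # shortcut: clip, then renormalize
        if w.sum() == 0.0:
            w = np.ones(k)
        W[i, nbrs] = w / w.sum()
    return W

def propagate_side_effects(W, Y, alpha=0.5):
    """Fixed point of F = alpha * W @ F + (1 - alpha) * Y, solved in closed form."""
    n = W.shape[0]
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * W, Y)

def combine_similarities(W_list, weights=None):
    """Hypothetical stand-in for integration: a convex combination of matrices."""
    if weights is None:
        weights = np.full(len(W_list), 1.0 / len(W_list))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(wt * W for wt, W in zip(weights, W_list))

# Toy data: 8 drugs, two binary feature sources, 5 side effects (all synthetic).
rng = np.random.default_rng(0)
substructures = rng.integers(0, 2, size=(8, 12))
targets = rng.integers(0, 2, size=(8, 6))
known = rng.integers(0, 2, size=(8, 5)).astype(float)

W_sub = linear_neighborhood_similarity(substructures, k=3)
W_tar = linear_neighborhood_similarity(targets, k=3)
scores_single = propagate_side_effects(W_sub, known)
scores_multi = propagate_side_effects(combine_similarities([W_sub, W_tar]), known)
```

Because each row of the similarity matrix is non-negative and sums to 1, the propagated scores are convex combinations of the known labels and stay in [0, 1], so they can be read as rankings of candidate side effects per drug.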



