
2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM): latest papers

The role of high performance, grid and cloud computing in high-throughput sequencing
Pub Date : 2016-12-15 DOI: 10.1109/BIBM.2016.7822643
G. Lightbody, Fiona Browne, Huiru Zheng, Valeriia Haberland, J. Blayney
We have reached the era of full genome sequencing, with high-throughput sequencing technologies pouring out gigabases of reads in a day. To fully benefit from such a profusion of data, high-performance tools and systems are needed to extract the information lying within the sequences. This paper provides an overview of the evolution of high-throughput sequencing and of the tools, infrastructure and data management developing in this space to support a key area in personalized medicine. The paper concludes with an outlook on the future of such technologies and their applications, and on how they might shape clinical governance.
Citations: 3
Disease-specific protein complex detection in the human protein interaction network with a supervised learning method
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822705
Ziwei Zhou, Yingyi Gui, Zhihao Yang, Xiaoxia Liu, Lei Wang, Yin Zhang, Hongfei Lin, Jian Wang
High-throughput experimental techniques have produced a large number of human protein-protein interactions (PPIs), making it possible to construct a large-scale human PPI network and to detect human protein complexes from the network with computational approaches. However, most current complex detection methods are based on graph theory and cannot use information about known complexes. In this paper, we present a supervised learning method to detect protein complexes in a human PPI network. In this method, biological characteristics and properties of the network are taken into consideration to construct a rich feature set to train a regression model for protein complex detection. In addition, disease-related PPIs are extracted from the biomedical literature and then integrated into the original PPI network to detect disease-specific protein complexes more effectively. Experimental results show that our method outperforms existing state-of-the-art methods. Furthermore, analysis of the breast cancer-specific complexes detected with our method provides further biological insights into breast cancer (e.g., some candidate susceptibility genes).
Citations: 3
Mining protein complexes based on topology potential from weighted dynamic PPI network
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822611
Xiu-juan Lei, Yuchen Zhang, Fang-Xiang Wu, A. Zhang
Identification of protein complexes is very important for investigating the characteristics of biological processes. Most existing protein complex clustering algorithms run only on a static protein-protein interaction (PPI) network and ignore the dynamic characteristics of interactions. To solve this problem, a new clustering algorithm (TP-WDPIN) is proposed. It uses the concept of topological potential to measure the importance of proteins when detecting seed proteins, and then mines protein complexes from a weighted dynamic PPI network. The algorithm uses the core-attachment structure of complexes and splits low-density cores to improve core density and achieve better clustering results. Experimental results show that the proposed TP-WDPIN algorithm performs better than other algorithms on two PPI databases.
Citations: 0
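The TP-WDPIN abstract above ranks proteins by topological potential but does not give the formula. As a rough sketch only, the code below assumes the commonly used Gaussian form, in which each node receives the sum of mass * exp(-(distance / sigma)^2) over all other reachable nodes. The unweighted hop distance, the unit masses, and the names `hop_distances` and `topological_potential` are illustrative assumptions, not the authors' weighted dynamic formulation.

```python
import math
from collections import deque


def hop_distances(graph, source):
    """Breadth-first search: number of edges from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def topological_potential(graph, sigma=1.0, mass=None):
    """Per node, sum the Gaussian-decayed masses of all other reachable nodes.

    Assumed form: phi(v) = sum over u != v of mass[u] * exp(-(d(v, u) / sigma) ** 2).
    """
    mass = mass or {node: 1.0 for node in graph}
    potential = {}
    for node in graph:
        distances = hop_distances(graph, node)
        potential[node] = sum(
            mass[other] * math.exp(-((d / sigma) ** 2))
            for other, d in distances.items()
            if other != node
        )
    return potential


# Hypothetical three-protein path a - b - c: the middle protein scores highest.
toy_network = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(topological_potential(toy_network))
```

Under this assumed form, a high-potential protein is one with many close neighbors, which is why such a score could serve to pick seed proteins.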
Experiences on quantitative cardiac PET analysis
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822682
P. Vizza, P. Guzzi, P. Veltri, Annalisa Papa, G. Cascini, G. Sesti, E. Succurro
Quantitative analysis of PET images is a useful and essential practice for objectively measuring a physiological process. It makes it possible to study diseases, evaluate treatment response and compare patient data by quantifying images. The analysis consists of estimating the quantity of radionuclide tracer taken up by tissues. We focus on quantitative analysis of dynamic PET studies to evaluate coronary artery disease and myocardial perfusion. We report experiences with quantitative cardiac PET analysis using a widely used commercial software package to evaluate viable myocardium through the Patlak method. We also report results obtained on PET images provided by clinical departments of the Magna Graecia University Medical School of Catanzaro.
Citations: 13
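The cardiac PET abstract above names the Patlak method without describing it. In the standard Patlak graphical analysis, tissue activity divided by plasma activity is plotted against the running integral of plasma activity divided by plasma activity. After an equilibration time the points fall on a line whose slope is the net uptake rate Ki. The sketch below is a minimal illustration of that textbook formulation, not the commercial software the authors used. The function names, the trapezoidal integration and the `t_star` cut-off parameter are assumptions made for the example.

```python
def cumulative_trapezoid(times, values):
    """Running integral of values over times using the trapezoidal rule."""
    integral = [0.0]
    for k in range(1, len(times)):
        step = 0.5 * (values[k] + values[k - 1]) * (times[k] - times[k - 1])
        integral.append(integral[-1] + step)
    return integral


def patlak_fit(times, plasma, tissue, t_star):
    """Return (Ki, V0): slope and intercept of the Patlak plot for t >= t_star."""
    plasma_integral = cumulative_trapezoid(times, plasma)
    xs, ys = [], []
    for t, cp, ct, icp in zip(times, plasma, tissue, plasma_integral):
        if t >= t_star and cp > 0:
            xs.append(icp / cp)  # integral of plasma activity over plasma activity
            ys.append(ct / cp)   # tissue activity over plasma activity
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # ordinary least-squares slope, interpreted as Ki
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic curves (hypothetical values) built so that Ki = 0.05 and V0 = 0.3.
times = [float(t) for t in range(11)]
plasma = [10.0 / (1.0 + t) for t in times]
plasma_integral = cumulative_trapezoid(times, plasma)
tissue = [0.05 * i + 0.3 * p for i, p in zip(plasma_integral, plasma)]
print(patlak_fit(times, plasma, tissue, t_star=3.0))
```

Real studies would take the plasma curve from blood sampling or an image-derived input function, and would choose `t_star` by inspecting where the plot becomes linear.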
Epistasis detection using a permutation-based Gradient Boosting Machine
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822697
Kai Che, Xiaoyan Liu, Maozu Guo, Junwei Zhang, Lei Wang, Yin Zhang
Detecting single nucleotide polymorphism (SNP) epistasis helps to understand disease susceptibility and to discover the pathogenesis underlying complex diseases. In this paper, we propose an approach called permutation-based Gradient Boosting Machine (pGBM) to detect pure epistasis by estimating the power of a GBM classifier, which is influenced by permuting SNP pairs. pGBM is based on two permutation strategies and a gradient boosting machine model. To extend pGBM so that it detects pure epistasis well on unbalanced datasets, the average AUC difference is chosen as the metric that quantifies the intensity of SNP interactions. The experimental results demonstrate that our method has a high success rate on both balanced and unbalanced simulated data and on a real dataset. In addition, pGBM shows great potential to detect pure SNP epistasis and so uncover more complex disease pathogenesis.
Citations: 6
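The pGBM abstract above measures interaction strength as an average AUC difference under permutation. The sketch below illustrates only that general idea: score the samples, shuffle one SNP column, and average the loss in AUC. It is heavily simplified. A hand-written scoring function (`xor_model`) stands in for a trained GBM, a single column is shuffled rather than a SNP pair, and the abstract does not specify the two permutation strategies, so none of this should be read as the authors' procedure. All names are hypothetical.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ordered case/control pairs."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ordering
    return wins / (len(positives) * len(negatives))


def mean_auc_drop(score_fn, genotypes, labels, snp_index, repeats=20, seed=0):
    """Average AUC lost when one SNP column is shuffled across samples."""
    rng = random.Random(seed)
    baseline = auc([score_fn(row) for row in genotypes], labels)
    total_drop = 0.0
    for _ in range(repeats):
        column = [row[snp_index] for row in genotypes]
        rng.shuffle(column)
        permuted = [row[:snp_index] + [value] + row[snp_index + 1:]
                    for row, value in zip(genotypes, column)]
        total_drop += baseline - auc([score_fn(row) for row in permuted], labels)
    return total_drop / repeats


# Toy data: disease status is the XOR of SNP 0 and SNP 1; SNP 2 is irrelevant.
genotypes = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [a ^ b for a, b, c in genotypes]


def xor_model(row):
    """Stand-in for a fitted classifier that has learned the XOR interaction."""
    return float(row[0] ^ row[1])


print(mean_auc_drop(xor_model, genotypes, labels, snp_index=0))  # interacting SNP
print(mean_auc_drop(xor_model, genotypes, labels, snp_index=2))  # irrelevant SNP
```

Shuffling a SNP that takes part in the interaction degrades the scores, while shuffling the irrelevant SNP leaves them unchanged, which is the contrast an AUC-difference metric relies on.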
Network based study for the anti-rheumatic mechanism of Tibetan medicated-bath therapy
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822802
Jian Yang, Tianhong Wang, Xiaona Shen, Xing Chen, Kehui Zhao, Jing Wang, Yi Zhang, Jing Zhao, Yang Ga
In clinical practice in the Tibetan areas of China, the Traditional Tibetan Medicine formula Wuwei-Ganlu-Yaoyu-Keli (WGYK) is widely added to the warm water used in bath therapy to treat rheumatoid arthritis (RA). However, its mechanism of action is not yet well understood. In this study, based on gene expression data from microarray experiments, we apply network pharmacology approaches to further reveal the mechanism by which WGYK acts on RA from the perspective of protein-protein interactions and pathways. This study may facilitate our understanding of the anti-RA effect of WGYK from the perspective of network pharmacology.
Citations: 0
Towards constructing “Super Gene Sets” regulatory networks
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822534
J. Chen, Z. Yue, Michael T. Neylon, Thanh Nguyen, Nafisa Bulsara, Itika Arora, Timothy Ratliff
In this article, we describe a new computational framework to construct regulatory (r-type) PAG-PAG relationships among “Super Gene Sets”: Pathways, Annotated lists, and Gene signatures (PAGs). To construct PAGs, we aggregate singleton PAGs (sPAGs) upstream/downstream of a common shared multi-gene PAG (mPAG). Then, we iteratively remove a member gene to calculate its Cohesion Coefficient (CoCo), which helps assess the degree of biological relevance beyond random chance, until the CoCo score achieves the maximal value at a specific level. The new relationship between the aggregated mPAG (m'PAG) and the shared mPAG will, therefore, be a distinct m'PAG-mPAG relationship. Our results suggest the following. First, the new m'PAGs have sufficiently high CoCo scores, suggesting high biological relevance, and gene ontology (GO) annotations distinct from those of their regulated PAG targets; however, there are significant enrichments of shared GO annotations between each pair of identified m'PAG-mPAG relationships. Second, new m'PAGs are relatively robust against data noise based on noise characteristic simulations. Third, by applying our framework to real cancer microarray analysis data, we demonstrated that our new framework is effective in helping build multi-scale biomolecular systems models that are easy for biologists to interpret.
Citations: 1
Drug side effect prediction through linear neighborhoods and multiple data source integration
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822555
Wen Zhang, Yanlin Chen, Shikui Tu, Feng Liu, Qianlong Qu
Predicting drug side effects is a critical task in drug discovery and attracts great attention in both academia and industry. Although many machine learning methods have been proposed, great challenges arise with the boom in precision medicine. On the one hand, many methods are based on the assumption that similar drugs may share the same side effects, but measuring drug-drug similarity appropriately is challenging. On the other hand, multi-source data provide diverse information for the analysis of side effects and should be integrated for high-accuracy prediction. In this paper, we tackle the side effect prediction problem through linear neighborhoods and multi-source data integration. In the feature space, linear neighborhoods are constructed to extract the drug-drug similarity, namely the “linear neighborhood similarity”. By transferring the similarity into the side effect space, known side effect information is propagated through the similarity-based graph. Thus, we propose the linear neighborhood similarity method (LNSM), which uses single-source data for side effect prediction. Further, we extend LNSM to deal with multi-source data and propose two data integration methods: the similarity matrix integration method (LNSM-SMI) and the cost minimization integration method (LNSM-CMI), which integrate drug substructure, drug target, drug transporter, drug enzyme, drug pathway and drug indication data to improve prediction accuracy. The proposed methods are evaluated on benchmark datasets. LNSM produces satisfactory results on single-source data. The data integration methods (LNSM-SMI and LNSM-CMI) effectively integrate multi-source data and outperform other state-of-the-art side effect prediction methods in cross validation and independent tests. The proposed methods are promising for drug side effect prediction.
Citations: 53
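The LNSM abstract above says that known side effect information is propagated through a similarity-based graph. The sketch below shows a generic label propagation step of that kind, assuming the usual iteration F <- alpha * W * F + (1 - alpha) * Y with a row-normalized similarity matrix W. In the paper the similarity weights come from linear neighborhood reconstruction. Here they are simply written by hand, and the parameter `alpha`, the function name and the toy data are all hypothetical.

```python
def propagate_labels(similarity, labels, alpha=0.8, iterations=200):
    """Iterate F <- alpha * W * F + (1 - alpha) * Y and return the final scores.

    similarity: n x n row-normalized weights (row i says how drug i is
                reconstructed from the other drugs).
    labels:     n x m matrix of known drug/side-effect associations (0 or 1).
    """
    n = len(similarity)
    m = len(labels[0])
    scores = [row[:] for row in labels]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            row = []
            for k in range(m):
                neighbor_term = sum(similarity[i][j] * scores[j][k] for j in range(n))
                row.append(alpha * neighbor_term + (1 - alpha) * labels[i][k])
            updated.append(row)
        scores = updated
    return scores


# Hypothetical example: drug 2 has no known side effects and resembles drug 0
# far more than drug 1, so it should inherit drug 0's side effect more strongly.
W = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.9, 0.1, 0.0],
]
Y = [
    [1.0, 0.0],  # drug 0: side effect A
    [0.0, 1.0],  # drug 1: side effect B
    [0.0, 0.0],  # drug 2: unknown
]
for drug, row in enumerate(propagate_labels(W, Y)):
    print(drug, [round(value, 3) for value in row])
```

Because `alpha` is below 1 and every row of W sums to 1, the iteration contracts toward a fixed point, so the number of iterations mainly controls precision rather than the ranking of candidate side effects.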
Extreme Large Margin Distribution Machine and its applications for biomedical datasets
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822751
Zhiyong Yang, Jingcheng Lu, Taohong Zhang
Classification methods have become increasingly popular for biomedical and bioinformatics data analysis. However, because data acquisition is difficult, sometimes only small-scale datasets can be obtained, which may lead to poor generalization performance. For SVM-like algorithms, large margin theory offers a way out of this dilemma. Recent studies on large margin theory show that, besides maximizing the minimum margin of a given training dataset, it is also necessary to optimize the margin distribution to boost overall generalization ability. Correspondingly, a novel SVM-like algorithm called the Large Margin Distribution Machine (LDM) realizes this idea by simultaneously maximizing the margin mean and minimizing the margin variance, and a series of applications has been reported since. Another well-known machine learning algorithm, the Extreme Learning Machine (ELM), shares a similar framework with SVM. This paper argues that ELM could also benefit from margin distribution optimization. With this in mind, a novel algorithm called the Extreme Large Margin Distribution Machine (ELDM) is proposed that combines the advantages of ELM and LDM. An efficient extension of ELDM to multi-class classification under the One vs. All scheme is then proposed. Finally, experimental results on both benchmark datasets and biomedical classification datasets show the effectiveness of the proposed algorithm.
Citations: 6
Development of a computer-aided system for an effective brain connectivity network
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822774
Yaoxin Nie, Linlin Zhu, Yipeng Su, Xudong Li, Zhendong Niu
Dynamic causal modeling (DCM) is currently one of the most widely used models of effective brain connectivity networks, but it has some disadvantages (e.g., researchers' selection of cerebral regions of interest [ROIs] is subjective, and computation takes substantial time). Statistical Parametric Mapping (SPM) is the most popular statistical data analysis software for brain function, but its settings are cumbersome, especially in the data preprocessing section. In response to these disadvantages of DCM and SPM, we designed and built a computer-aided system for effective brain connectivity networks, modularized the data preprocessing section of SPM, and explored the cerebral ROIs and a possible co-activation network based on our proposed approach. The co-activation network serves as a prior interconnection relationship and is used to assist in the selection of ROIs in similar cognitive experiments; thus, the DCM is prevented from testing meaningless noise connection modes, the number of DCM models is decreased, and the accuracy of the conclusions and the computational efficiency of the DCM are improved.
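A rough sketch of why a co-activation prior shrinks the model space follows. It is a deliberate simplification and not the authors' system: it assumes each candidate model is just a subset of directed ROI-to-ROI connections, and the ROI labels, the co-activated pairs, and all function names are hypothetical.

```python
from itertools import permutations


def directed_edges(rois):
    """Every ordered pair of distinct ROIs, i.e. every possible directed connection."""
    return list(permutations(rois, 2))


def prune_by_coactivation(edges, coactivated_pairs):
    """Keep only connections between ROIs that the co-activation prior links."""
    allowed = {frozenset(pair) for pair in coactivated_pairs}
    return [edge for edge in edges if frozenset(edge) in allowed]


def candidate_model_count(edges):
    """Each connection is either present or absent, giving 2**len(edges) models."""
    return 2 ** len(edges)


rois = ["ROI_A", "ROI_B", "ROI_C"]                      # hypothetical labels
coactivated = [("ROI_A", "ROI_B"), ("ROI_B", "ROI_C")]  # hypothetical prior

all_edges = directed_edges(rois)
kept_edges = prune_by_coactivation(all_edges, coactivated)

full_count = candidate_model_count(all_edges)
pruned_count = candidate_model_count(kept_edges)
print(full_count, pruned_count)
```

Under these assumptions, dropping the one ROI pair that the prior does not link removes two directed connections, so the number of candidate models falls from 64 to 16.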
Citations: 0