Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822728
Jiamin Yuan, Jiachang Chen, Li Huang, Fuping Xu, Mary Yang, Shixing Yan, Guozheng Li, Zhimin Yang
To find the change laws of the human meridian system and to verify their consistency with Traditional Chinese Medicine theory, conductance series data from 72 acupoints of 10 volunteers were collected over 2 years. This paper uses a visualized analysis method to find the laws, as it is a good way to explore change patterns before there is a definite research target. Because collecting data for two years is difficult, the data are incomplete and contain missing values. Traditionally, researchers have had to remove the incomplete samples. In this article, we put forward a novel method that first estimates the missing values in the meridian dataset with the Bayesian principal component analysis (BPCA) algorithm and then visualizes these values. With the proposed method, some useful characteristics of meridian conductance data were found.
Title: Visualized analysis of incomplete TCM meridian conductance data
Venue: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
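The contrast drawn in this abstract, between discarding incomplete samples and estimating their missing values, can be sketched in a few lines. The example below is not BPCA: it substitutes per-column mean imputation, a much simpler baseline, only to show why imputation keeps samples that deletion would lose. The function names and the readings are hypothetical.

```python
from statistics import mean


def drop_incomplete(rows):
    """Traditional approach: keep only samples with no missing value."""
    return [row for row in rows if None not in row]


def impute_column_means(rows):
    """Stand-in for BPCA (assumption): fill each gap with its column mean.

    Raises statistics.StatisticsError if a column has no observed value.
    """
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        col_means.append(mean(observed))
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical conductance readings: rows are samples, columns are acupoints.
readings = [
    [10.0, 20.0, 30.0],
    [12.0, None, 33.0],
    [None, 24.0, 36.0],
    [14.0, 28.0, 39.0],
]

complete_only = drop_incomplete(readings)  # two of the four samples survive
imputed = impute_column_means(readings)    # all four samples are kept
```

Deletion halves this toy dataset, while imputation retains every sample for the later visualization step. A model-based estimator such as BPCA would replace `impute_column_means` in practice.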
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822818
Donghuo Zeng, Chengjie Sun, Lei Lin, Bingquan Liu
Drug Entity Recognition (DER) is a crucial task for information extraction from biomedical text. Much previous work on DER uses known drugs to build features; however, known drug resources are limited. In this paper, we propose a semi-supervised learning method to extend an existing drug dictionary. With the extended dictionary, the features for DER can be enriched. Using a Conditional Random Fields (CRF) model with the enriched features, an F-measure of 89.26% is achieved on the DDIExtraction 2013 challenge data set, which outperforms the best system of the DDIExtraction 2013 challenge.
Title: Enlarging drug dictionary with semi-supervised learning for Drug Entity Recognition
Venue: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
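How a drug dictionary can enrich the features of a sequence labeller may be easier to see with a small sketch. The feature names, the dictionary entries and the example sentence below are all hypothetical, and the abstract does not specify the actual feature set. The code only builds per-token feature dictionaries of the kind a CRF toolkit typically consumes; it does not train a model.

```python
def token_features(tokens, i, drug_dictionary):
    """Build illustrative features for token i (hypothetical feature set)."""
    token = tokens[i]
    return {
        "lower": token.lower(),
        "is_capitalized": token[:1].isupper(),
        "suffix3": token[-3:].lower(),
        # Dictionary lookup feature: depends on dictionary coverage.
        "in_drug_dictionary": token.lower() in drug_dictionary,
    }


tokens = ["Aspirin", "was", "given", "with", "ibuprofen", "."]

# Hypothetical base dictionary, and the same dictionary after extension.
base_dictionary = {"aspirin"}
extended_dictionary = base_dictionary | {"ibuprofen"}

base_features = [token_features(tokens, i, base_dictionary) for i in range(len(tokens))]
extended_features = [
    token_features(tokens, i, extended_dictionary) for i in range(len(tokens))
]
```

With the base dictionary only the first token fires the lookup feature. After the extension the fifth token does too, which is the sense in which a larger dictionary enriches the features.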
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822507
Natasha Pavlovikj, Kevin Begcy, S. Behera, Malachy T. Campbell, H. Walia, J. Deogun
With advances in next-generation sequencing technologies, transcriptome sequencing has emerged as a powerful tool for transcriptome analysis of various organisms. Obtaining a draft transcriptome of an organism requires a complex multi-stage pipeline with several steps such as data cleaning, error correction and assembly. Based on the analysis performed in this paper, we conclude that the best assembly is produced when the error correction method is used with Velvet Oases and the “multi-k” strategy that combines the 5 k-mer assemblies with the highest N50. Our results provide valuable insight for designing a good de novo transcriptome assembly pipeline for a given application.
Title: Analysis of transcriptome assembly pipelines for wheat
Venue: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
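The N50 statistic used to rank the k-mer assemblies can be computed as in the sketch below. Sort the contig lengths from longest to shortest and accumulate them: N50 is the length of the contig at which the running total first reaches half of the total assembly length. The k values and contig lengths are hypothetical, and only the selection step is shown. Merging the chosen assemblies is outside this sketch.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def top_assemblies_by_n50(assemblies, how_many):
    """Pick the k values whose assemblies have the highest N50."""
    ranked = sorted(assemblies, key=lambda k: n50(assemblies[k]), reverse=True)
    return ranked[:how_many]


# Hypothetical assemblies: k-mer size -> contig lengths.
assemblies = {
    21: [100, 200, 300],
    25: [400, 100],
    31: [250, 250, 50],
}

best_two = top_assemblies_by_n50(assemblies, 2)
```

The abstract's strategy keeps the five best assemblies. Two are kept here only because the toy input has three.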
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822505
J. S. Kim, G. Chirikjian
Assessing preferred relative rigid-body position and orientation is important in the description of biomolecular structures (such as proteins) and their interactions. For that purpose, techniques from the kinematics community are often used. In this paper, we review parameterization methods that are widely used to describe relative rigid-body motions (in particular, orientations). We then present an extended and updated review of a ‘symmetrical parameterization’ that was recently introduced in the kinematics community. This parameterization is useful for describing relative biomolecular rigid-body motions, where the parameters are symmetrical in the sense that the subunits of a complex biomolecular structure are described in the same way for a motion and for its inverse. Properties of this new parameterization, including singularity analysis and inverse kinematics, are also investigated in more detail. Finally, the parameterization is applied to real biomolecular structures to show its efficacy in the field of computational structural biology.
Title: Symmetrical rigid body parameterization for biomolecular structures
Venue: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
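For background, the sketch below uses the conventional description of a rigid-body motion as a rotation matrix R and a translation t, not the symmetrical parameterization reviewed in the paper. It shows that the inverse of (R, t) is (R^T, -R^T t), so in this standard form the inverse translation is generally not just -t. That kind of asymmetry between a motion and its inverse is what a symmetrical description aims to avoid. The helper names and the numerical values are hypothetical.

```python
import math


def rot_z(theta):
    """Rotation by theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def transpose(m):
    return [list(col) for col in zip(*m)]


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def apply(motion, point):
    """Apply the motion (R, t) to a point: R p + t."""
    r, t = motion
    return [a + b for a, b in zip(mat_vec(r, point), t)]


def inverse(motion):
    """Inverse rigid-body motion: (R^T, -R^T t)."""
    r, t = motion
    r_inv = transpose(r)
    return (r_inv, [-x for x in mat_vec(r_inv, t)])


g = (rot_z(math.pi / 3), [1.0, -2.0, 0.5])
p = [0.3, 0.7, -1.1]

# Moving a point and then applying the inverse motion returns the point.
round_trip = apply(inverse(g), apply(g, p))
```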
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822669
Jingshan Huang, D. Dou, Jun She, A. Limper, Yanan Yang, Ping Yang
Chronic obstructive pulmonary disease (COPD) and lung cancer (LC) are two serious diseases that present a major health problem worldwide. However, the genetic contribution to both diseases remains unclear, including the various regulation mechanisms at the genetic level that result in the progression from COPD to LC. In this paper, we describe our comprehensive methodologies, which integrate both biological (conducted in “wet labs”) and computational (based on domain ontologies and semantic technologies) approaches, to investigate the role of microRNA::mRNA regulation in COPD and LC. We discovered two genes, RGS6 and PARK2, that are strongly associated with the risk of developing COPD, LC, or both; additionally, we identified two sets of microRNAs that are computationally predicted to regulate RGS6 and PARK2, respectively. These microRNAs can be further biologically verified in the future and may serve as novel biomarkers in COPD and/or LC.
Title: A comprehensive (biological and computational) investigation on the role of microRNA::mRNA regulations performed in chronic obstructive pulmonary disease and lung cancer
Venue: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822529
Morteza Zihayat, Heydar Davoudi, Aijun An
Sequential pattern mining has been used in bioinformatics to discover frequent gene regulation sequential patterns based on time-course microarray datasets. While mining frequent sequences is important in biological studies for disease treatment, to date most approaches do not consider the importance of the genes with respect to the disease being studied when identifying gene regulation sequential patterns. In addition, they focus on the more general up/down effects of genes in a microarray dataset and do not take into account the various degrees of expression during the mining process. As a result, current techniques return too many sequences, which may not be informative enough for biologists to explore relationships between the disease and the underlying causes encoded in gene regulation sequences. In this paper, we propose a utility model that considers both the importance of genes with respect to a disease and their degrees of expression under a biological investigation. Then, we design a new method, called TU-SEQ, for identifying top-k high utility gene regulation sequential patterns from a time-course microarray dataset. The evaluation results show that our approach can effectively and efficiently discover key patterns representing meaningful gene regulation sequential patterns in a time-course microarray dataset.
Title: Top-k utility-based gene regulation sequential pattern discovery
Venue: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822598
Peng Xiao, Soumitra Pal, S. Rajasekaran
Discovering patterns in biological sequences is very important for extracting useful information from them. Motifs are crucial patterns that have numerous applications, including the identification of transcription factors and their binding sites, composite regulatory patterns, similarity between families of proteins, etc. Several models of motifs have been proposed in the literature. The (l, d)-motif model is one of these that has been studied widely. The (l, d)-motif search problem is also known as Planted Motif Search (PMS). The general problem of PMS has been proven to be NP-hard. In this paper, we present an elegant as well as efficient randomized algorithm, named qPMS10, to solve PMS. Currently, the best known algorithm for solving PMS is qPMS9, which can solve challenging (l, d)-motif instances up to (28, 12) and (30, 13). qPMS9 is a deterministic algorithm. We provide a performance comparison of qPMS10 with qPMS9 on standard benchmark datasets. Both theoretical and empirical analyses demonstrate that our randomized algorithm outperforms the existing algorithms for solving PMS. In addition, the random sampling techniques we employ in our algorithm can be extended to solve other motif search problems, including Simple Motif Search (SMS) and Edit-distance based Motif Search (EMS). Furthermore, our algorithm can be parallelized efficiently and has the potential to yield great speedups on multi-core machines.
Title: qPMS10: A randomized algorithm for efficiently solving quorum Planted Motif Search problem
Venue: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
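To make the (l, d)-motif model concrete: a string M of length l is an (l, d)-motif of a set of sequences if every sequence contains some substring of length l that differs from M in at most d positions. The sketch below solves tiny instances by exhaustive enumeration of all candidate strings. It is neither qPMS9 nor qPMS10, it takes time exponential in l, and it requires the motif in every sequence rather than in a quorum. The sequences are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, sequence, d):
    """True if some l-mer of sequence is within Hamming distance d of motif."""
    l = len(motif)
    return any(
        hamming(motif, sequence[i:i + l]) <= d
        for i in range(len(sequence) - l + 1)
    )


def planted_motifs(sequences, l, d, alphabet="ACGT"):
    """Brute force: test every string of length l over the alphabet."""
    found = []
    for letters in product(alphabet, repeat=l):
        motif = "".join(letters)
        if all(occurs_within(motif, s, d) for s in sequences):
            found.append(motif)
    return found


# Hypothetical input: ACGT planted exactly once and twice with one mismatch.
sequences = ["TTACGTTT", "GGACCTGG", "CCAGGTCC"]

motifs_d1 = planted_motifs(sequences, l=4, d=1)
motifs_d0 = planted_motifs(sequences, l=4, d=0)
```

With d = 1 the planted string ACGT is among the reported motifs. With d = 0 nothing is reported, because the three sequences share no exact 4-mer.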
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822517
G. Politano, F. Logrand, M. Brancaccio, S. Carlo
Cardiovascular diseases are one of the leading causes of death in most developed countries, and aging is a dominant risk factor for their development. Among the different factors, miRNAs have been identified as relevant players in the development of cardiac pathologies, and their ability to influence gene networks makes them potential therapeutic targets or diagnostic markers. This paper presents a computational study that applies data fusion techniques coupled with network analysis theory to identify a regulatory model able to represent the relationship between key genes and miRNAs involved in cardiac senescence processes. The model has been validated through an extensive literature analysis that was able to connect 94% of the identified genes and miRNAs with studies related to cardiac senescence.
Title: A computationally inferred regulatory heart aging model including post-transcriptional regulations
Venue: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822526
Xinran Yu, H. Zhang, T. Lilburn, Hong Cai, Jianying Gu, T. Korkmaz, Yufeng Wang
Malaria remains one of the most important public health concerns worldwide. It causes nearly half a million deaths every year, and about 40% of the world's population lives in the endemic regions of malaria. A major hurdle in antimalarial development is our limited understanding of the dynamic cellular networks in the malaria parasite. In this study, by coupling RNA-Seq analysis and network mining using a PageRank-based algorithm, we investigated the temporal-specific expression of parasite genes during the 48-hour red blood cycle, and identified genes that may play influential roles in parasite development and invasion. The just-in-time mechanism for gene expression may contribute to a dynamic yet effective adaptive strategy of the malaria parasite.
Title: Just-in-time expression of influential genes in the cellular networks of the malaria parasite Plasmodium falciparum during the red blood cycle
Venue: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
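The abstract does not describe the PageRank-based algorithm in detail, so the sketch below shows only plain PageRank by power iteration on a small directed network, as an illustration of how a node's influence can be scored from the links pointing to it. The gene names, the edges, the damping factor of 0.85 and the iteration count are all assumptions made for the example.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Plain PageRank by power iteration over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Every node receives the same baseline share (the "teleport" term).
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for source in nodes:
            # A node without outgoing links spreads its rank over all nodes.
            targets = out_links[source] or nodes
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical network: an edge points from a gene to a gene it is linked to.
edges = [("geneA", "geneC"), ("geneB", "geneC"), ("geneC", "geneA")]

ranks = pagerank(edges)
most_influential = max(ranks, key=ranks.get)
```

In this toy network geneC receives links from both other genes and ends up with the highest score, while geneB, which nothing points to, keeps only the baseline share.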
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822597
Dong-Qin Wang, C. Zheng, Ying-Lian Gao, Jin-Xing Liu, Sha-Sha Wu, J. Shang
Pathway-based drug discovery overcomes the disadvantages of the “one drug-one target” method, which aims to find effective drugs that act on single targets. The current method “iPaD” identifies drug-pathway association pairs by imposing a lasso-type penalty on the drug-pathway association matrix. To enhance the robustness of the method and to find novel drug-pathway association pairs more effectively, we introduce a new method named “L2,1-iPaD”. In contrast to the iPaD method, we impose an L2,1-norm constraint on the drug-pathway association coefficient matrix. By applying our method to a widely used real dataset (the CCLE dataset), we demonstrate that our method is superior to the iPaD method. Our method obtains smaller P-values than the iPaD method when a permutation test is performed to assess the significance of the identified drug-pathway association pairs. More importantly, compared with the iPaD method, our method can identify a larger number of validated drug-pathway association pairs. The experimental results on the real dataset demonstrate the effectiveness of our method.
Title: L21-iPaD: An efficient method for drug-pathway association pairs inference
Venue: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
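The difference between a lasso-type penalty and an L2,1-norm penalty can be seen on a small coefficient matrix. The sketch below assumes the common row-wise convention, in which the L2,1 norm is the sum of the Euclidean norms of the rows. The abstract does not state which convention or matrix orientation the method uses, and the matrix values are hypothetical.

```python
import math


def l1_norm(matrix):
    """Lasso-type penalty: sum of absolute values of all entries."""
    return sum(abs(value) for row in matrix for value in row)


def l21_norm(matrix):
    """L2,1 norm (row-wise convention, an assumption): sum of row L2 norms."""
    return sum(math.sqrt(sum(value * value for value in row)) for row in matrix)


# Hypothetical association coefficients; the middle row is entirely zero.
coefficients = [
    [3.0, 4.0],
    [0.0, 0.0],
    [0.0, -5.0],
]

lasso_penalty = l1_norm(coefficients)  # 3 + 4 + 5
l21_penalty = l21_norm(coefficients)   # 5 + 0 + 5
```

The L1 penalty treats every entry independently. The L2,1 penalty groups entries by row, so it tends to drive whole rows to zero together instead of scattering zeros across the matrix.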