Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822797
Guimin Qin, Yi-Bo Hou, Bao-Guo Yu, Xi-Yang Liu
Protein phosphorylation modifications are important to protein activity and function, and it is widely recognized that dysfunctional phosphorylation is related to cancer. Specifically, some single amino acid variations can disrupt existing kinase-substrate relationships and create novel ones. In addition, numerous network-based methods have been proposed to identify meaningful disease modules, which are typically defined as locally dense subnetworks. In this work, we propose a new network clustering method that uncovers disease modules correlated with a specific disease, based on the significance of connections rather than local density. Specifically, we build a weighted tumor network of lung adenocarcinoma from kinase-substrate relationships, a tissue-specific gene regulatory network, pairwise gene expression data, and mutation data. With parameters chosen by a machine learning method, our method identified 9 disease modules. We found that these modules could effectively discriminate tumor samples from normal samples. Some significantly important genes in these modules have recently been identified as drug target genes. Our results provide insights into the underlying disease mechanism and may help identify more drug target genes in the era of precision medicine.
{"title":"A disease module detection algorithm for lung adenocarcinoma tumor network with significance of connections and network controllability methodology","authors":"Guimin Qin, Yi-Bo Hou, Bao-Guo Yu, Xi-Yang Liu","doi":"10.1109/BIBM.2016.7822797","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822797","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"102 1","pages":"0"},"publicationDate":"2016-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"125086921"}
Background: Post-stroke fatigue (PSF) is a frequently reported complication of stroke, and current drugs have a limited effect on it. Bo's abdominal acupuncture (BAA) has been used for decades to treat stroke in China; however, few studies have used Western clinical evaluation approaches to verify its efficacy. Objective: This study aimed to investigate the safety and effectiveness of BAA for PSF. Methods: Seventy stroke patients with fatigue were randomly allocated to the BAA group (n=35) or the control group (n=35). Patients in the control group received conventional rehabilitation treatment, while patients in the BAA group received an additional 30 minutes of BAA treatment each day. Fatigue was evaluated with the Fatigue Severity Scale (FSS) and the energy domain of the Stroke Specific Quality of Life scale (SS-QOL-E). Activities of daily living were assessed with the Barthel Index (BI). All adverse events were recorded throughout the trial. Results: All 70 patients with PSF completed the study. The mean age was 60.7 years, and 47 (67%) were male. At baseline, there was no significant difference between the two groups in FSS, SS-QOL-E, or BI. After 2 weeks of treatment, both groups showed increased SS-QOL-E and BI scores and decreased FSS scores. SS-QOL-E scores increased more in the BAA group than in the control group (p < 0.05), but the changes in FSS and BI scores did not differ significantly between the groups (p > 0.05). No serious adverse events were reported. Conclusion: This study suggests that an integrative program of BAA and conventional rehabilitation treatment may be more effective in promoting recovery from PSF. Further studies on the treatment of PSF are needed.
{"title":"Effects of Bo's abdominal acupuncture on post-stroke fatigue: A pilot study","authors":"Zhen Huang, Jie Zhan, Ruihuan Pan, Youhua Guo, Mingfeng He, Hongxia Chen, Lechang Zhan","doi":"10.1109/BIBM.2016.7822717","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822717","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"51 1","pages":"0"},"publicationDate":"2016-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"122421908"}
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822524
L. Xing, Maozu Guo, Xiaoyan Liu, Chunyu Wang, Lei Wang, Yin Zhang
Reconstructing gene regulatory networks (GRNs) is a major challenge in systems biology and bioinformatics, and methods based on Bayesian networks (BNs) attract the most attention because of their inherently probabilistic nature. Because BN structure learning is NP-hard, most BN methods adopt heuristic search, but they remain time-consuming for biological networks with a large number of nodes. To address this problem, this paper presents a Candidate Auto Selection (CAS) algorithm based on mutual information and breakpoint detection, which limits the search space in order to accelerate learning. The algorithm automatically restricts the neighbors of each node to a small set of candidates before structure learning. Building on CAS, we propose a globally optimal greedy search method (CAS+G), which focuses on finding a high-scoring network structure, and a local learning method (CAS+L), which focuses on learning the structure faster with a small loss of quality. Results show that the proposed CAS algorithm can effectively identify the neighbor nodes of each node. In the experiments, the CAS+G method outperforms the state-of-the-art method on simulated data for inferring GRNs, and the CAS+L method is significantly faster than the state-of-the-art method with little loss of accuracy. Hence, the CAS-based algorithms are more suitable for GRN inference.
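The candidate-selection idea can be illustrated with a minimal sketch. The abstract does not give the exact breakpoint rule, so the code below assumes one plausible reading: rank the other genes by empirical mutual information with the target gene and cut the ranking at the largest drop between consecutive scores. The function names and the discretized expression values are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def select_candidates(data, target):
    """Keep the genes ranked above the largest gap in mutual information.

    Assumption: the 'breakpoint' is the largest drop between consecutive
    scores in the sorted ranking; the paper may use a different detector.
    """
    scores = sorted(
        ((mutual_information(data[target], col), name)
         for name, col in data.items() if name != target),
        reverse=True,
    )
    if len(scores) < 2:
        return [name for _, name in scores]
    gaps = [scores[i][0] - scores[i + 1][0] for i in range(len(scores) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return [name for _, name in scores[:cut]]
```

Restricting each node to such a candidate set is what shrinks the search space before any Bayesian network scoring is done; the structure search itself is not shown here.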
{"title":"Reconstructing gene regulatory network based on candidate auto selection method","authors":"L. Xing, Maozu Guo, Xiaoyan Liu, Chunyu Wang, Lei Wang, Yin Zhang","doi":"10.1109/BIBM.2016.7822524","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822524","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"20 1","pages":"0"},"publicationDate":"2016-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"122729916"}
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822561
Moumita Bhattacharya, C. Jurkovitz, H. Shatkay
Multiple adverse health conditions co-occurring in a patient are typically associated with poor prognosis and increased office or hospital visits. Developing methods to identify patterns of co-occurring conditions can assist in diagnosis. Thus, identifying patterns of association among co-occurring conditions is of growing interest. In this paper, we report preliminary results from a data-driven study, in which we apply a machine learning method, namely, topic modeling, to Electronic Medical Records (EMRs), aiming to identify patterns of associated conditions. Specifically, we use the well-established Latent Dirichlet Allocation (LDA), a method based on the idea that documents can be modeled as a mixture of latent topics, where each topic is a distribution over words. In our study, we adapt the LDA model to identify latent topics in patients' EMRs. We evaluate the performance of our method both qualitatively and quantitatively, and show that the obtained topics indeed align well with distinct medical phenomena characterized by co-occurring conditions.
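The mixture-of-topics idea behind LDA can be made concrete with a small generative sketch. It only shows how a record would be generated under the model (the forward direction), not how topics are inferred from real EMRs. The two topics, their condition names, and the probabilities are invented for illustration and do not come from the study.

```python
import random

# Hypothetical topics: each is a distribution over condition codes.
TOPICS = [
    {"diabetes": 0.5, "hypertension": 0.3, "kidney disease": 0.2},
    {"asthma": 0.6, "allergic rhinitis": 0.3, "hypertension": 0.1},
]


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalizing gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_record(topics, alpha, n_codes, rng):
    """Generate one patient record under the LDA generative story."""
    # Per-patient topic proportions (the "mixture of latent topics").
    theta = sample_dirichlet(alpha, rng)
    record = []
    for _ in range(n_codes):
        # Pick a topic for this slot, then a condition from that topic.
        k = rng.choices(range(len(topics)), weights=theta)[0]
        codes, probs = zip(*topics[k].items())
        record.append(rng.choices(codes, weights=probs)[0])
    return theta, record
```

Fitting LDA runs this story in reverse: given only the records, it estimates the topics and each patient's proportions, which is where co-occurring conditions end up grouped together.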
{"title":"Identifying patterns of associated-conditions through topic models of Electronic Medical Records","authors":"Moumita Bhattacharya, C. Jurkovitz, H. Shatkay","doi":"10.1109/BIBM.2016.7822561","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822561","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"36 1","pages":"0"},"publicationDate":"2016-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"127669516"}
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822544
Qi Li, Y. Gong
Drosophila embryonic images provide valuable spatial and temporal information about gene expression. Extracting the contour of a target embryo in an embryonic image is a fundamental step in a computational system for studying gene-gene interactions in Drosophila. In this paper, we propose a shape model for contour extraction of Drosophila embryos. The shape model is built on connected components of edge pixels and approximates each connected component by a polygon that can be either convex or concave. The main contribution of the proposed shape model is its ability to segment embryos that touch each other. Moreover, the model is adaptable to a wide range of contour extraction applications.
{"title":"A shape model for contour extraction of Drosophila embryos","authors":"Qi Li, Y. Gong","doi":"10.1109/BIBM.2016.7822544","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822544","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"42 1","pages":"0"},"publicationDate":"2016-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"127674586"}
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822659
Mai Omura, N. Sonehara, T. Okumura
Disease similarity is a useful measure for improving clinical decision support systems, where it allows similar diseases to be presented continuously. In a previous study, we demonstrated that etiological and symptomatic information about diseases provides a reasonable approximation of disease similarity. This study extends the previously proposed approach by incorporating locational information about diseases, which may improve performance over the baseline achieved with etiological and symptomatic features alone.
{"title":"Practical approach for disease similarity calculation based on disease phenotype, etiology, and locational clues in disease names","authors":"Mai Omura, N. Sonehara, T. Okumura","doi":"10.1109/BIBM.2016.7822659","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822659","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"8 1","pages":"0"},"publicationDate":"2016-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"127682761"}
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822720
Haiqing Li, Guozheng Li, William Yang, Ying Chen, Xiaoxin Zhu, Mary Yang
Efficacy prediction is an inseparable part of traditional Chinese medicine (TCM). We first analyze the correlation between indicators and efficacy, and choose the maximum blood-drug concentration (Cmax) as the target that reflects drug efficacy. We then apply linear regression (LR), support vector regression (SVR), and artificial neural networks (ANNs) to predict the efficacy of Wuji Pills. Leave-one-out results show that SVR performs better than the other methods for the Cmax label and appears to be a good method for this task. To explore the relationships among the components of Wuji Pills, several visualization methods are adopted. The prediction web server is publicly available at http://data.jindengtai.cn/#/case/drug.
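The leave-one-out evaluation mentioned in the abstract can be sketched as follows. For brevity the sketch uses ordinary least squares on a single feature as the regressor; the study's actual features, data, and SVR or ANN settings are not given in the abstract, so the function names and inputs here are placeholders.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept on one feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_mse(xs, ys):
    """Hold out each sample once, train on the rest, average squared error."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        prediction = slope * xs[i] + intercept
        errors.append((ys[i] - prediction) ** 2)
    return sum(errors) / len(errors)
```

Leave-one-out is a natural choice when samples are scarce, because every sample serves as test data exactly once; comparing regressors then amounts to comparing their values of this error.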
{"title":"Prediction of the efficacy of Wuji Pills by machine learning methods","authors":"Haiqing Li, Guozheng Li, William Yang, Ying Chen, Xiaoxin Zhu, Mary Yang","doi":"10.1109/BIBM.2016.7822720","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822720","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"240 1","pages":"0"},"publicationDate":"2016-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"132565257"}
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822771
Tianyi Zhao, Ningyi Zhang, Jun Ren, Peigang Xu, Zhiyan Liu, Liang Cheng, Yang Hu
More than one third of human genes are regulated by microRNAs (miRNAs). Identifying miRNAs is a precondition for discovering their regulatory mechanisms and developing cures for genetic diseases. The traditional identification method is biological experiment, which has a long cycle and high cost, and misses miRNAs that exist only in a specific period or at low expression levels. To overcome these drawbacks, machine learning methods are applied to identify miRNAs. In this study, to distinguish real from pseudo miRNAs and to classify different species, we extracted 98-dimensional features based on primary and secondary structure. We then propose the BP-Adaboost method, which addresses the overfitting of the BP neural network by constructing multiple BP neural network classifiers and assigning weights to them. The proposed method improves accuracy and stability, and experiments verify its effectiveness and superiority over other methods.
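How boosting assigns weights to a set of classifiers can be sketched with standard discrete AdaBoost. Two assumptions are made loudly here: the abstract does not state which AdaBoost variant is used, and one-dimensional threshold rules stand in for the BP neural networks so that the example stays short. All names are illustrative.

```python
import math


def train_stump(xs, ys, weights):
    """Pick the threshold rule with the lowest weighted error.

    Stand-in weak learner: the paper trains BP neural networks instead.
    A rule predicts `sign` when x >= thr and `-sign` otherwise.
    """
    best = None
    for thr in sorted(set(xs)):
        for sign in (1, -1):
            preds = [sign if x >= thr else -sign for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best


def adaboost(xs, ys, rounds):
    """Standard discrete AdaBoost; labels must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, thr, sign = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # classifier weight
        ensemble.append((alpha, thr, sign))
        preds = [sign if x >= thr else -sign for x in xs]
        # Raise the weight of misclassified samples, lower the rest.
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all weak classifiers."""
    score = sum(alpha * (sign if x >= thr else -sign)
                for alpha, thr, sign in ensemble)
    return 1 if score >= 0 else -1
```

Each classifier's vote grows as its weighted error shrinks, and later classifiers concentrate on the samples earlier ones got wrong, so the combined decision depends less on any single model.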
{"title":"A novel method to identify pre-microRNA in various species knowledge base","authors":"Tianyi Zhao, Ningyi Zhang, Jun Ren, Peigang Xu, Zhiyan Liu, Liang Cheng, Yang Hu","doi":"10.1109/BIBM.2016.7822771","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822771","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"2 1","pages":"0"},"publicationDate":"2016-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"133417593"}
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822512
Jin Zhao, Haodi Feng
Genes control biological traits through various proteins, which are produced from isoforms. Through alternative splicing, a gene can express multiple isoforms. Next-generation high-throughput RNA sequencing has made it easier to quantify isoform expression levels. Extensive efforts have been made to estimate isoform abundance from RNA-Seq data, but accuracy still needs to be improved. In this article, we propose a statistical method combined with Particle Swarm Optimization to estimate isoform abundance from RNA-Seq data. After a series of statistical analyses and experiments, we decided on the forms and values of the coefficients in the Particle Swarm Optimization model. We analyzed the performance of our approach on both simulated and real datasets. Experimental results show that, compared with Cufflinks, our approach achieves an acceptable improvement in accuracy and is more sensitive to condition changes in most cases.
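A minimal Particle Swarm Optimization loop is sketched below on a toy abundance problem. The abstract does not describe the statistical model or the coefficient values, so everything specific is an assumption: a squared-error objective replaces the paper's model, the inertia and acceleration coefficients (0.7, 1.5, 1.5) are common textbook defaults, and the exon layout and read counts are made up.

```python
import random

# Toy setup (hypothetical): isoform 1 covers exons 1-2, isoform 2 covers
# exons 2-3, and COUNTS holds the reads observed on each exon.
COMPAT = [(1, 0), (1, 1), (0, 1)]
COUNTS = [30.0, 100.0, 70.0]


def squared_error(theta):
    """Mismatch between observed exon counts and those implied by theta."""
    return sum((c - sum(a * t for a, t in zip(row, theta))) ** 2
               for row, c in zip(COMPAT, COUNTS))


def pso_minimize(loss, dim, lower, upper, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO with positions clamped to [lower, upper]."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward own best + pull toward swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(upper, max(lower, pos[i][d] + vel[i][d]))
            val = loss(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

Calling `pso_minimize(squared_error, 2, 0.0, 100.0)` searches for non-negative abundances of the two toy isoforms. The lower bound of zero keeps the estimates physically meaningful, and the three coefficients are the kind of values the authors report tuning.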
{"title":"Estimating isoform abundance by Particle Swarm Optimization","authors":"Jin Zhao, Haodi Feng","doi":"10.1109/BIBM.2016.7822512","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822512","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"29 1","pages":"0"},"publicationDate":"2016-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"133669141"}
Pub Date: 2016-12-01 DOI: 10.1109/BIBM.2016.7822676
Mohamed El-Dirany, Forrest Wang, J. Furst, J. Rogers, D. Raicu
Distance-based methods for constructing phylogenetic trees have long been considered inconsistent and inferior to the more dominant statistical methods. However, compression methods specific to DNA could prove valuable in improving the effectiveness of distance-based methods. To demonstrate the validity of distance-based methods that use current DNA compression algorithms, such as MFCompress, we applied such a method to datasets of closely related fish species from the suborder Labroidei and to strains of Ebola. In both cases, we produced trees that are either very similar or identical to published trees produced using statistical methods. This suggests that distance-based methods can perform comparably to statistical methods without requiring as much pre-processing of the original DNA sequences or as many system resources. The results also stress the importance of using an accurate method of calculating species distance: one DNA-specific compression algorithm, MFCompress, consistently and convincingly outperformed other popular, general-purpose compression algorithms.
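A compression-based distance can be sketched with the normalized compression distance (NCD), a commonly used formulation. The abstract does not state which distance formula the authors use, so NCD is an assumption, and `zlib` from the Python standard library stands in for MFCompress, which is exactly the kind of general-purpose compressor the abstract reports as less accurate. The sequences are synthetic.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (stand-in compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 if similar, near 1 if not."""
    cx, cy, cxy = compressed_size(x), compressed_size(y), compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Synthetic sequences: `close` differs from `base` at every 100th position,
# while `far` is an unrelated random sequence.
rng = random.Random(0)
base = "".join(rng.choice("ACGT") for _ in range(2000))
close = "".join(
    ("A" if b != "A" else "C") if i % 100 == 0 else b
    for i, b in enumerate(base)
)
far = "".join(rng.choice("ACGT") for _ in range(2000))

dist_close = ncd(base.encode(), close.encode())
dist_far = ncd(base.encode(), far.encode())
```

Here `dist_close` comes out smaller than `dist_far`, because concatenating two similar sequences compresses almost as well as one of them alone. A matrix of such pairwise distances is what a tree-building method such as neighbor joining would then take as input.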
{"title":"Compression-based distance methods as an alternative to statistical methods for constructing phylogenetic trees","authors":"Mohamed El-Dirany, Forrest Wang, J. Furst, J. Rogers, D. Raicu","doi":"10.1109/BIBM.2016.7822676","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822676","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"207 ","pages":"0"},"publicationDate":"2016-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"133719883"}