Feature selection based on functional group structure for microRNA expression data analysis
Venue: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Pub Date: 2016-12-01, DOI: 10.1109/BIBM.2016.7822525
Yang Yang, Tianyu Cao, Wei Kong
Feature selection methods have been widely used in gene expression analysis to identify differentially expressed genes and explore potential biomarkers for complex diseases. Many studies have shown that incorporating feature structure information can greatly enhance the performance of feature selection algorithms, and genes naturally fall into groups with regard to common function and co-regulation, yet only a few gene expression studies have utilized these structural properties. As far as we know, there has been no such study on microRNA (miRNA) expression analysis, owing to the lack of available functional annotation for miRNAs. In this study, we focus on miRNA expression analysis because of its importance in diagnosis, prognosis prediction, and the detection of new therapeutic targets for complex diseases. MiRNAs tend to work in groups to perform their regulatory roles, so miRNA expression data also have group structure. We use GO-based semantic similarity to infer miRNA functional groups, and propose a new feature selection method that takes group structure into consideration, called MiRFFS (MiRNA Functional group-based Feature Selection). We also apply the group information to the sparse group Lasso method, and compare MiRFFS with the sparse group Lasso as well as some existing feature selection methods. The results on three miRNA microarray profiles of breast cancer show that MiRFFS can achieve a compact feature subset with high classification accuracy.
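The abstract compares MiRFFS with the sparse group Lasso but does not spell out the penalty. As a minimal sketch, assuming the commonly used formulation (a group term weighted by the square root of each group size plus an L1 term), the penalty for a coefficient vector with known group labels could be computed as below. The function name, the `lam` and `alpha` parameters, and the example groups are illustrative assumptions, not taken from the paper, and this is not the MiRFFS method itself.

```python
import math

def sparse_group_lasso_penalty(beta, groups, lam=1.0, alpha=0.5):
    """Return lam * ((1 - alpha) * sum_g sqrt(p_g) * ||beta_g||_2 + alpha * ||beta||_1).

    beta   : list of coefficients, one per feature (e.g. per miRNA)
    groups : list of group labels, same length as beta (hypothetical functional groups)
    """
    # Collect the coefficients belonging to each group.
    by_group = {}
    for coef, label in zip(beta, groups):
        by_group.setdefault(label, []).append(coef)

    # Group term: Euclidean norm of each group's coefficients, scaled by sqrt(group size).
    group_term = sum(
        math.sqrt(len(coefs)) * math.sqrt(sum(c * c for c in coefs))
        for coefs in by_group.values()
    )
    # Element-wise L1 term, which encourages sparsity inside the selected groups.
    l1_term = sum(abs(c) for c in beta)
    return lam * ((1.0 - alpha) * group_term + alpha * l1_term)

penalty = sparse_group_lasso_penalty([3.0, 4.0, 0.0, 0.0], [0, 0, 1, 1])
```

A group whose coefficients are all zero contributes nothing to either term, which is why this kind of penalty can drop whole functional groups while still thinning out the features inside the groups it keeps.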
Identifying protein complexes via multi-network clustering
Pub Date: 2016-12-01, DOI: 10.1109/BIBM.2016.7822594
Ou-Yang Le, Hong Yan, Xiao-Fei Zhang
The detection of protein complexes from protein-protein interaction (PPI) networks is an important step toward understanding the functional organization within cells. A great number of graph clustering algorithms have been proposed for this task. Since PPI data collected by high-throughput technologies are quite noisy, simply applying graph clustering algorithms to PPI data is generally not adequate to achieve reliable predictions. Behind protein interactions, there are protein domains that interact with each other. Jointly exploiting protein-protein interactions and domain-domain interactions (DDIs) has the potential to increase the accuracy of protein complex detection. However, traditional graph clustering algorithms focus on clustering proteins within a single PPI network, and cannot make use of information inherent in other heterogeneous networks. In this paper, we propose a novel generative model to perform multi-network clustering. Unlike previous protein complex detection algorithms, which can only use the information within a single PPI network, our model is a flexible framework that can take into account PPIs, DDIs, and domain-protein associations to achieve more consistent and reliable clustering results. Experimental results on real data demonstrate that our method performs much better than state-of-the-art protein complex detection techniques.
Identifying amino acids sensitive to mutations using high-throughput rigidity analysis
Pub Date: 2016-12-01, DOI: 10.1109/BIBM.2016.7822779
Michael Siderius, F. Jagodzinski
Understanding how an amino acid substitution affects a protein's stability can aid in the design of pharmaceutical drugs that aim to counter the deleterious effects caused by protein mutants. Unfortunately, performing mutation experiments on the physical protein is both time- and cost-prohibitive, so an exhaustive analysis that systematically mutates all amino acids in the physical protein is infeasible. Computational methods have been developed over the years to predict the effects of mutations, but many of them are computationally intensive or depend on homology or experimental data that may not be available for the protein being studied. In this work we motivate and present a computational pipeline whose only input is a Protein Data Bank file containing the 3D coordinates of the atoms of a biomolecule. Our high-throughput approach uses our rMutant algorithm to exhaustively generate in silico mutants with amino acid substitutions to Glycine, Alanine, and Serine for all residues in a protein. We exploit the speed of a fast rigidity analysis approach to analyze our protein variants, and develop a Mutation Sensitivity (MuSe) Map to identify residues that are most sensitive to mutations. We present three case studies and show the degree to which a MuSe Map is able to identify the amino acids that are susceptible to the effects of mutations.
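The exhaustive substitution step described above can be illustrated with a short enumeration. This is only a sketch under stated assumptions: it works on a one-letter amino acid sequence rather than a Protein Data Bank file, it skips substitutions that would leave the residue unchanged (the abstract does not say how rMutant handles that case), and `enumerate_substitutions` is a hypothetical name, not part of rMutant. It does not build mutant structures or perform rigidity analysis.

```python
# One-letter codes for the three substitution targets named in the abstract.
TARGETS = {"G": "Glycine", "A": "Alanine", "S": "Serine"}

def enumerate_substitutions(sequence, targets="GAS"):
    """Yield (position, wild_type, mutant) for every single-residue substitution.

    Positions are 1-based. Substitutions to the residue's own type are skipped
    (an assumption made for this sketch).
    """
    for position, wild_type in enumerate(sequence, start=1):
        for mutant in targets:
            if mutant != wild_type:
                yield position, wild_type, mutant

# A three-residue toy sequence: Met, Gly, Ala.
mutants = list(enumerate_substitutions("MGA"))
```

For a protein with n residues this yields at most 3n variants, which is the kind of workload a fast per-variant analysis is needed for.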
DNA mapping using Processor-in-Memory architecture
Pub Date: 2016-12-01, DOI: 10.1109/BIBM.2016.7822732
D. Lavenier, Jean-François Roy, David Furodet
This paper presents the implementation of a mapping algorithm on a new Processing-in-Memory (PIM) architecture developed by the company UPMEM. UPMEM's solution consists of adding processing units to the DRAM, to minimize data access time and maximize bandwidth, in order to drastically accelerate data-intensive algorithms. The technology developed by UPMEM makes it possible to combine 256 cores with 16 GBytes of DRAM on a standard DIMM module. A DNA mapping experiment on a human genome dataset shows that a speed-up of 25 times can be obtained with UPMEM technology compared to fast mapping software such as BWA, Bowtie2, or NextGenMap running on 16 Intel threads. The experiments also highlight that data transfer from the storage device limits the performance of the implementation. The use of SSD drives can raise the speed-up to 80 times.
ERDS-pe: A paired hidden Markov model for copy number variant detection from whole-exome sequencing data
Pub Date: 2016-12-01, DOI: 10.1109/BIBM.2016.7822508
Renjie Tan, Jixuan Wang, Xiaoliang Wu, Guoqiang Wan, Rongjie Wang, Rui Ma, Zhijie Han, Wenyang Zhou, Shuilin Jin, Qinghua Jiang, Yadong Wang
Detecting copy number variants (CNVs) is an essential part of the variant calling process. Here, we describe a novel method, ERDS-pe, to detect CNVs from whole-exome sequencing (WES) data. ERDS-pe first employs principal component analysis to normalize WES data. Then, ERDS-pe combines the read depth signal and single-nucleotide variation information into a hybrid signal, which is fed into a paired hidden Markov model to infer CNVs from WES data. Experimental results on real human WES data show that ERDS-pe achieves higher sensitivity and comparable or even better specificity than other tools. ERDS-pe is publicly available at: https://github.com/microtan0902/erds-pe.
Comparisons of linkage disequilibrium blocks of different populations at the sites of natural selection
Pub Date: 2016-12-01, DOI: 10.1109/BIBM.2016.7822784
Sun-Ah Kim, Suh-Ryung Kim, Y. J. Yoo
Linkage disequilibrium (LD) structure is the main source of information for the study of population genetics and disease-gene association. In particular, analyzing extended long haplotypes carrying a derived allele and examining LD block patterns can provide evidence for positive selection. We investigated the LD block structure of East Asian, European, and African populations at previously reported sites of positive selection by comparing LD block construction results based on 1000 Genomes Project data. We confirmed that the differences in LD block size in the EDAR, LCT, PCDH15, and LARGE regions among populations are consistent with previous reports. We found new evidence for positive selection in SLC30A19, PDE11A, and BCAS3 in East Asian and European populations based on the LD block patterns.
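LD block construction generally starts from pairwise LD statistics between markers. The abstract does not say which block construction algorithm or statistic was used, so the following is only a sketch of the standard pairwise quantities D, |D'|, and r² for two biallelic loci, computed from the four haplotype counts. The function name and the example counts are illustrative assumptions.

```python
def ld_statistics(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, |D'|, r^2) for two biallelic loci from haplotype counts.

    n_AB, n_Ab, n_aB, n_ab are the counts of the four possible haplotypes.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at the first locus
    p_B = (n_AB + n_aB) / total   # frequency of allele B at the second locus

    # D: deviation of the observed haplotype frequency from independence.
    d = p_AB - p_A * p_B

    # D' rescales D by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two loci.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2

complete_ld = ld_statistics(50, 0, 0, 50)     # only two haplotypes observed
no_ld = ld_statistics(25, 25, 25, 25)         # all four haplotypes equally common
```

Long runs of marker pairs with high values of these statistics are what show up as large LD blocks, which is why block size differences between populations can point to selection.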
ILSES: Identification lysine succinylation-sites with ensemble classification
Pub Date: 2016-12-01, DOI: 10.1109/BIBM.2016.7822530
Wenzheng Bao, Lin Zhu, De-shuang Huang
Lysine succinylation is one of the most important types of protein post-translational modification, and it is involved in many cellular processes and serious diseases. However, recognizing such sites with traditional experimental methods is time-consuming and laborious, and those methods can hardly meet the need to identify large numbers of succinylated sites quickly. In this work, several features of succinylated sites were extracted, such as the physicochemical properties of the amino acids. A flexible neural tree was employed as the classification model to integrate these features, yielding a novel lysine succinylation prediction framework named ILSES (identification lysine succinylation-sites with ensemble features classification). The method can combine diverse features to predict lysine succinylation with high accuracy and in real time.
Multiple sequence alignment and reconstructing phylogenetic trees with Hadoop
Pub Date: 2016-12-01, DOI: 10.1109/BIBM.2016.7822735
Q. Zou
Multiple sequence alignment (MSA) is the “Holy Grail” problem in computational biology, but bottlenecks arise in the massive MSA of homologous sequences. Most of the available state-of-the-art software tools either cannot handle large-scale datasets or run rather slowly. The similarity of homologous DNA sequences is often ignored, and the lack of parallelization is still a challenge for MSA research. Building phylogenetic trees for ultra-large sets of sequences is also time-consuming, and MSA is a prerequisite step for phylogenetic reconstruction. Taking advantage of developments in parallel computation, we employed the Hadoop platform to solve these two computationally intensive problems. Trie trees and suffix trees were used to accelerate the alignment of multiple similar DNA sequences, reducing the expected time complexity from quadratic to linear. For phylogenetic tree reconstruction, clustering and multiple sequence alignment were executed in parallel, and the basic phylogenetic trees were built using the neighbour-joining model. Experiments on two large datasets, both larger than 1 GB, show that our software tool can outperform other common phylogenetic reconstruction tools. Furthermore, the data, software code, and web servers are all openly available at http://lab.malab.cn/soft/halign/ and http://lab.malab.cn/soft/HPtree/
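The abstract names the neighbour-joining model for building the basic trees. As a minimal sketch of the core selection step of textbook neighbour joining (not the Hadoop implementation, whose details are not given here), the pair of taxa to join is the one minimizing Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k). The function name and the five-taxon distance matrix below are illustrative assumptions.

```python
def nj_pair_to_join(dist):
    """Return (q, i, j): the minimum Q value and the pair of taxa that attains it.

    dist is a symmetric n x n distance matrix (list of lists) with a zero diagonal.
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            # Q criterion: pairwise distance corrected by each taxon's total divergence.
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    return best

# Hypothetical distances between five taxa, indexed 0..4.
distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
q_value, first, second = nj_pair_to_join(distances)
```

Note that taxa 3 and 4 have the smallest raw distance, yet taxa 0 and 1 are chosen; the correction by row sums is what distinguishes neighbour joining from simply merging the closest pair. A full reconstruction repeats this step, replacing the joined pair with a new node, until the tree is resolved.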
Characteristic gene selection via L2,1-norm Sparse Principal Component Analysis
Pub Date: 2016-12-01, DOI: 10.1109/BIBM.2016.7822796
Yao Lu, Ying-Lian Gao, Jin-Xing Liu, Chang-Gang Wen, Yaxuan Wang, Jiguo Yu
Sparse Principal Component Analysis (SPCA) is a method that obtains sparse loadings for the principal components (PCs) by formulating PCA as a regression-type optimization problem with the elastic net. However, the selected features differ from one PC to another and are generally independent. A new method named L2,1SPCA has been proposed to remove this defect by replacing the elastic net with an L2,1-norm penalty. How the method performs on gene expression data is still unknown, so we test it in this paper. First, the method is applied to simulated data to obtain an optimal parameter. Second, the L2,1SPCA method is applied to gene expression data, namely the head and neck squamous carcinoma (HNSC) data. Third, the characteristic genes are selected according to the PCs. The results show very low P-values and very high hit counts, which indicates that L2,1SPCA can obtain higher recognition accuracy and higher relevance to the genes. Finally, the experimental results demonstrate that L2,1SPCA works well and performs well on gene expression data.
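The L2,1-norm of a matrix is the sum of the Euclidean norms of its rows, so penalizing it pushes whole rows toward zero at once. A minimal sketch follows, assuming (as an illustration, not a detail from the paper) that rows of the loading matrix correspond to genes and columns to PCs; the function names and the toy matrix are hypothetical, and this does not implement the L2,1SPCA optimization itself.

```python
import math

def row_norms(matrix):
    """Euclidean norm of each row of a matrix given as a list of lists."""
    return [math.sqrt(sum(v * v for v in row)) for row in matrix]

def l21_norm(matrix):
    """L2,1-norm: sum over rows of the row's Euclidean norm."""
    return sum(row_norms(matrix))

def top_rows_by_norm(matrix, k):
    """Indices of the k rows with the largest Euclidean norm, largest first."""
    norms = row_norms(matrix)
    order = sorted(range(len(matrix)), key=lambda i: norms[i], reverse=True)
    return order[:k]

# Toy loading matrix: 3 genes (rows) x 2 PCs (columns).
loadings = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
total = l21_norm(loadings)
selected = top_rows_by_norm(loadings, 2)
```

Because a gene's row is either kept or zeroed across all PCs together, ranking rows by norm gives one shared gene list instead of a different list per PC, which is the defect of elastic-net SPCA that the abstract describes.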
FPGA implementation of the coupled filtering method
Pub Date: 2016-12-01, DOI: 10.1109/BIBM.2016.7822556
C. Zhang, Tianzhu Liang, P. Mok, Weichuan Yu
In ultrasound image analysis, speckle tracking methods are widely applied to study the elasticity of body tissue. However, “feature-motion decorrelation” remains a challenge for speckle tracking methods. Recently, a coupled filtering method was proposed to accurately estimate strain values when the tissue deformation is large. The major drawback of the new method is its high computational complexity: even the GPU-based program requires a few hours to finish the analysis. In this paper, we propose an FPGA-based implementation for further acceleration. We discuss the capability of FPGAs in handling the different image processing components of this method, and reformulate the algorithm to build a highly efficient pipeline on the FPGA. The final implementation on a Xilinx Virtex-7 FPGA is 15 times faster than the GPU implementation on two NVIDIA graphics cards (GeForce GTX 580).