Pub Date: 2012-10-01; DOI: 10.1109/BIBMW.2012.6470224
Po-Yen Wu, John H Phan, May D Wang
Next-generation sequencing (NGS) has brought human genomic research to an unprecedented era. RNA-Seq is a branch of NGS that can be used to quantify gene expression and depends on accurate annotation of the human genome (i.e., the definition of genes and all of their variants or isoforms). Multiple annotations of the human genome exist with varying complexity. However, it is not clear how the choice of genome annotation influences RNA-Seq gene expression quantification. We assess the effect of different genome annotations in terms of (1) mapping quality, (2) quantification variation, (3) quantification accuracy (i.e., by comparing to qRT-PCR data), and (4) the concordance of detecting differentially expressed genes. External validation with qRT-PCR suggests that more complex genome annotations result in higher quantification variation.
Title: The Effect of Human Genome Annotation Complexity on RNA-Seq Gene Expression Quantification. Published in: IEEE International Conference on Bioinformatics and Biomedicine workshops, vol. 2012, pp. 712-717 (2012-10-01). DOI: https://doi.org/10.1109/BIBMW.2012.6470224
Pub Date: 2012-10-01; DOI: 10.1109/BIBMW.2012.6470225
Shuai Yuan, Zhaohui Qin
Next-generation sequencing (NGS) technologies have been applied extensively in many areas of genetics and genomics research. A fundamental problem when it comes to analyzing NGS data is mapping short sequencing reads back to the reference genome. Most existing software packages rely on a single uniform reference genome and do not automatically take genetic variants into consideration. However, large proportions of incorrectly mapped reads affect the correct interpretation of NGS experimental results. As an example, Degner et al. showed that detecting allele-specific expression from RNA sequencing data was biased toward the reference allele. In this study, we developed a method that utilizes the parallel computing power of a DirectX 11 enabled graphics processing unit (GPU) to produce a personalized diploid reference genome based on all known genetic variants of a particular individual. We show that using such a personalized diploid reference genome can improve mapping accuracy and significantly reduce the bias toward the reference allele in allele-specific expression analysis. Our method can be applied to any individual whose genotype information has been obtained from either array-based genotyping or resequencing. Apart from the reference genome, no changes to the alignment algorithm are needed for read mapping; therefore, one can use any existing read mapping tool and still obtain the improved mapping result. C++ and GPU compute shader source code of the software program is available at: http://code.google.com/p/diploid-mapping/downloads/list.
Title: Read-mapping using personalized diploid reference genome for RNA sequencing data reduced bias for detecting allele-specific expression. Published in: IEEE International Conference on Bioinformatics and Biomedicine workshops, vol. 2012, pp. 718-724 (2012-10-01). DOI: https://doi.org/10.1109/BIBMW.2012.6470225
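The idea of a personalized diploid reference described in the abstract above can be illustrated with a minimal CPU-only sketch. It is not the authors' GPU compute shader implementation: it assumes SNPs only, 0-based positions, and a hypothetical `genotypes` mapping that already assigns one allele to each of the two haplotype copies.

```python
from typing import Dict, Tuple


def build_diploid_reference(
    reference: str, genotypes: Dict[int, Tuple[str, str]]
) -> Tuple[str, str]:
    """Return two haplotype sequences with the individual's SNP alleles applied.

    `genotypes` maps a 0-based position to (allele on copy A, allele on copy B).
    This input format is an assumption made for illustration only.
    """
    hap_a = list(reference)
    hap_b = list(reference)
    for pos, (allele_a, allele_b) in genotypes.items():
        if not 0 <= pos < len(reference):
            raise ValueError(f"position {pos} is outside the reference")
        hap_a[pos] = allele_a
        hap_b[pos] = allele_b
    return "".join(hap_a), "".join(hap_b)


# Toy example: a heterozygous site at position 2 and a homozygous
# non-reference site at position 5.
reference = "ACGTACGT"
genotypes = {2: ("G", "A"), 5: ("T", "T")}
hap_a, hap_b = build_diploid_reference(reference, genotypes)
```

Reads would then be mapped against both haplotype sequences with an unmodified aligner, which is the property the abstract emphasizes.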
Maciej Sykulski, T. Gambin, M. Bartnik, K. Derwinska, B. Wiśniowiecka-Kowalnik, P. Stankiewicz, A. Gambin
We propose a novel multiple-sample aCGH analysis methodology aimed at detecting rare copy-number variations (CNVs). Our method is tested on exon-targeted aCGH array data from 366 patients affected with developmental delay/intellectual disability, epilepsy, or autism. The proposed algorithms can be applied as a post-processing filter to any given segmentation method. Thanks to the additional information obtained from multiple samples, we could efficiently detect significant segments corresponding to rare CNVs responsible for pathogenic changes. A more detailed description of the method is available in the Supplementary Materials at: http://bioputer.mimuw.edu.pl/acgh.
Title: Efficient Multiple Samples aCGH Analysis for Rare CNVs Detection. Published in: IEEE International Conference on Bioinformatics and Biomedicine workshops, pp. 406-409 (2011-11-12). DOI: https://doi.org/10.1109/BIBM.2011.38
Pub Date: 2011-11-01; DOI: 10.1109/BIBMW.2011.6112376
Koon Yin Kong, Adam I Marcus, Paraskevi Giaanakakou, May D Wang
We propose a web interface that allows researchers to quantify and analyze microtubule confocal images online. Most analyses of microtubule confocal images are performed manually using very simple software or tools. Analysis results are stored locally by each collaborator in different styles and formats, which has limited the sharing of data and results among collaborating research parties. A web interface provides a simple way for users to process data online. It also allows easy sharing of both data and results among participating groups. The analysis workflow of the interface is made similar to existing manual protocols. We demonstrate the integration of an image processing algorithm into the current workflow to aid the analysis. Our design also allows integration of novel automated analysis algorithms and modules to re-evaluate existing data. This interface can provide a validation platform for new automated algorithms and allow collaboration on microtubule image analysis from different locations.
Title: A Web Interface for the Quantification of Microtubule Dynamics. Published in: IEEE International Conference on Bioinformatics and Biomedicine workshops, vol. 2011, pp. 209-214 (2011-11-01). DOI: https://doi.org/10.1109/BIBMW.2011.6112376
Pub Date: 2011-11-01; DOI: 10.1109/BIBMW.2011.6112354
Po-Yen Wu, John H Phan, Fengfeng Zhou, May D Wang
Statistical inferences on RNA-Seq data, e.g., detecting differential gene expression, are meaningful only after proper normalization. However, there is no consensus on how to choose a normalization procedure from among the many existing procedures. We evaluated several RNA-Seq normalization procedures by (1) correlating estimated RNA-Seq expression values to those of microarrays, (2) examining the concordance of stable and differential gene detection between the platforms, and (3) applying the procedures to simulated RNA-Seq data. Results suggested that RNA-Seq normalization procedures have little effect on either inter-platform gene expression correlation or inter-platform concordance of genes detected as stably or differentially expressed. However, the results of the simulated analysis suggested that some normalization procedures are more robust to changes in the distribution of differentially expressed genes. These results may provide guidance for selecting RNA-Seq normalization procedures.
Title: Evaluation of Normalization Methods for RNA-Seq Gene Expression Estimation. Published in: IEEE International Conference on Bioinformatics and Biomedicine workshops, vol. 2011, pp. 50-57 (2011-11-01). DOI: https://doi.org/10.1109/BIBMW.2011.6112354
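To make the notion of an RNA-Seq normalization procedure concrete, the following is a minimal sketch of total-count scaling to counts per million (CPM). The abstract does not list which procedures were evaluated, so CPM is used here only as a familiar example; the gene names and counts are made up.

```python
from typing import Dict


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw read counts so that the sample's counts sum to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}


# Hypothetical raw counts for one sample.
sample = {"geneA": 500, "geneB": 1500, "geneC": 8000}
cpm = counts_per_million(sample)
```

Because every gene is divided by the same library total, a few very highly expressed genes can shift the scaled values of all others, which is one reason the robustness of different procedures is worth comparing.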
Delroy Cameron, Ramakanth Kavuluru, Olivier Bodenreider, Pablo N Mendes, Amit P Sheth, Krishnaprasad Thirunarayan
Many complex information needs that arise in biomedical disciplines require exploring multiple documents in order to obtain information. While traditional information retrieval techniques that return a single ranked list of documents are quite common for such tasks, they may not always be adequate. The main issue is that ranked lists typically impose a significant burden on users to filter out irrelevant documents. Additionally, users must intuitively reformulate their search query when relevant documents have not been highly ranked. Furthermore, even after interesting documents have been selected, very few mechanisms exist that enable document-to-document transitions. In this paper, we demonstrate the utility of assertions extracted from biomedical text (called semantic predications) to facilitate retrieving relevant documents for complex information needs. Our approach offers an alternative to query reformulation by establishing a framework for transitioning from one document to another. We evaluate this novel knowledge-driven approach using precision and recall metrics on the 2006 TREC Genomics Track.
Title: Semantic Predications for Complex Information Needs in Biomedical Literature. Published in: IEEE International Conference on Bioinformatics and Biomedicine workshops, vol. 2011, pp. 512-519 (2011-01-01). DOI: 10.1109/BIBM.2011.23. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4330970/pdf/nihms654702.pdf
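The precision and recall metrics mentioned in the abstract above follow the standard set-based definitions, sketched below. The document identifiers are hypothetical and do not come from the TREC Genomics Track data.

```python
from typing import Iterable, Tuple


def precision_recall(
    retrieved: Iterable[str], relevant: Iterable[str]
) -> Tuple[float, float]:
    """Compute set-based precision and recall for one query."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Precision: fraction of retrieved documents that are relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Four documents retrieved, three relevant overall, two in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
```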
Pub Date: 2010-12-01; Epub Date: 2011-01-28; DOI: 10.1109/BIBMW.2010.5703928
Jiayu Chen, Jingyu Liu, Vince D Calhoun
Copy number variation (CNV) detection using SNP array data is challenging due to the low signal-to-noise ratio. In this study, we propose a principal component analysis (PCA) based correction to eliminate variance in CNV data induced by potential confounding factors. Simulations show a substantial improvement in CNV detection accuracy after correction. We also observe a significant improvement in data quality in real SNP array data after correction.
Title: Correction of Copy Number Variation Data Using Principal Component Analysis. Published in: IEEE International Conference on Bioinformatics and Biomedicine workshops, vol. 2010, pp. 827-828 (2010-12-01). DOI: 10.1109/BIBMW.2010.5703928. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6353609/pdf/nihms-1007295.pdf
Pub Date: 2010-12-01; DOI: 10.1109/BIBMW.2010.5703848
Chihwen Cheng, Todd H Stokes, Sovandy Hang, May D Wang
Doctors need fast and convenient access to medical data. This motivates the use of mobile devices for knowledge retrieval and sharing. We have developed TissueWikiMobile on the Apple iPhone and iPad to seamlessly access TissueWiki, an enormous repository of medical histology images. TissueWiki is a three-terabyte database of antibody information and histology images from the Human Protein Atlas (HPA). Using TissueWikiMobile, users can extract knowledge from protein expression, add annotations to highlight regions of interest on images, and share their professional insight. Through an intuitive human-computer interface, users can efficiently operate TissueWikiMobile to access important biomedical data without losing mobility. TissueWikiMobile offers the health community a ubiquitous way to collaborate and share expert opinions not only on the performance of various antibody stains but also on histology image annotation.
Title: TissueWikiMobile: an Integrative Protein Expression Image Browser for Pathological Knowledge Sharing and Annotation on a Mobile Device. Published in: IEEE International Conference on Bioinformatics and Biomedicine workshops, vol. 2010, pp. 473-480 (2010-12-01). DOI: https://doi.org/10.1109/BIBMW.2010.5703848
Pub Date: 2010-12-01; DOI: 10.1109/BIBMW.2010.5703865
Jaydeep K Srimani, Po-Yen Wu, John H Phan, May D Wang
We developed a scalable distributed computing system using the Berkeley Open Infrastructure for Network Computing (BOINC) to align next-generation sequencing (NGS) data quickly and accurately. NGS technology is emerging as a promising platform for gene expression analysis due to its high sensitivity compared to traditional genomic microarray technology. However, despite the benefits, NGS datasets can be prohibitively large, requiring significant computing resources to obtain sequence alignment results. Moreover, as the data and alignment algorithms become more prevalent, it will become necessary to examine the effect of the multitude of alignment parameters on various NGS systems. We validate the distributed software system by (1) computing simple timing results to show the speed-up gained by using multiple computers, (2) optimizing alignment parameters using simulated NGS data, and (3) computing NGS expression levels for a single biological sample using optimal parameters and comparing these expression levels to those of a microarray sample. Results indicate that the distributed alignment system achieves an approximately linear speed-up and correctly distributes sequence data to, and gathers alignment results from, multiple compute clients.
Title: A distributed system for fast alignment of next-generation sequencing data. Published in: IEEE International Conference on Bioinformatics and Biomedicine workshops, vol. 2010, pp. 579-584 (2010-12-01). DOI: https://doi.org/10.1109/BIBMW.2010.5703865
Pub Date: 2009-11-01; DOI: 10.1109/BIBMW.2009.5332104
Li An, Zoran Obradovic, Desmond Smith, Olivier Bodenreider, Vasileios Megalooikonomou
Association rule mining methods have recently been applied to gene expression data analysis to reveal relationships between genes and different conditions and features. However, little effort has focused on detecting the relation between gene expression maps and related gene functions. Here we describe such an approach to mine association rules among gene functions in clusters of similar gene expression maps of the mouse brain. The experimental results show that the detected association rules make sense biologically. By inspecting the obtained clusters and the genes having the gene functions of frequent itemsets, interesting clues were discovered that provide valuable insight to biological scientists. Moreover, discovered association rules can potentially be used to predict gene functions based on similarity of gene expression maps.
Title: Mining Association Rules among Gene Functions in Clusters of Similar Gene Expression Maps. Published in: IEEE International Conference on Bioinformatics and Biomedicine workshops, vol. 2009, pp. 254-259 (2009-11-01). DOI: 10.1109/BIBMW.2009.5332104. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4307020/pdf/nihms654700.pdf
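As a rough illustration of what an association rule among gene functions measures, the sketch below computes the standard support and confidence of a rule "antecedent implies consequent". It assumes each gene in a cluster is represented by the set of its function annotations; the function labels are hypothetical, and the abstract does not state which mining algorithm or thresholds the authors used.

```python
from typing import FrozenSet, Sequence, Tuple


def rule_metrics(
    transactions: Sequence[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> consequent."""
    n = len(transactions)
    if n == 0:
        raise ValueError("no transactions given")
    both = antecedent | consequent
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if both <= t)
    support = n_both / n
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# One set of function annotations per gene in a hypothetical cluster.
genes = [
    frozenset({"transport", "binding"}),
    frozenset({"transport", "binding", "signaling"}),
    frozenset({"transport"}),
    frozenset({"binding", "signaling"}),
]
support, confidence = rule_metrics(
    genes, frozenset({"transport"}), frozenset({"binding"})
)
```

Here the rule holds for two of the four genes (support 0.5) and for two of the three genes annotated with "transport" (confidence about 0.67).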