Metagenomics is the study of genetic material recovered from environmental samples. Because few tools exist for metagenomic analysis, a natural step has been to use the popular homology tool BLAST to search for sequence similarity between DNA reads and an administered database. Most biologists use this method today without knowing BLAST’s accuracy, especially when a particular taxonomic class is under-represented in the database. The aim of this paper is to benchmark the performance of BLAST for taxonomic classification of metagenomic datasets in a supervised setting, meaning that the database contains microbes of the same class as the ‘unknown’ query DNA reads. We examine well- and under-represented genera and phyla to study how representation affects the accuracy of BLAST. We investigate the degradation in BLAST accuracy when genome coverage is reduced in the training database, as well as the performance when errors are introduced into the query DNA reads. We conclude that for fine-resolution classes, such as genera, the accuracy of BLAST does not degrade much with under-representation, but for a highly variant class, such as phyla, performance degrades significantly when whole genomes are used in the training database. BLAST accuracy is affected more at the genus level than at the phylum level when coverage in the training database is reduced or when 1% sequence error is introduced into the query DNA reads. Our analysis includes five-fold cross-validation to substantiate our findings.
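The two experimental manipulations named in the abstract, introducing sequence error into query reads and five-fold cross-validation, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe the error model or how folds were formed, so the uniform substitution model, the function names `add_substitution_errors` and `k_fold_splits`, and the example genome labels are all hypothetical.

```python
import random

BASES = "ACGT"

def add_substitution_errors(read, error_rate=0.01, rng=None):
    """Return a copy of `read` where each base is substituted with probability `error_rate`.

    Assumption: errors are independent, uniform substitutions (no indels).
    """
    if rng is None:
        rng = random.Random()
    out = []
    for base in read:
        if base in BASES and rng.random() < error_rate:
            # Replace with one of the three other bases, chosen uniformly.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def k_fold_splits(items, k=5, seed=0):
    """Yield (train, test) lists so that every item is in the test set exactly once."""
    items = list(items)
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

# Hypothetical usage: corrupt one read and split ten placeholder genome labels.
noisy = add_substitution_errors("ACGTACGTACGT" * 10, error_rate=0.01, rng=random.Random(1))
genomes = [f"genome_{i}" for i in range(10)]
for train, test in k_fold_splits(genomes):
    print(len(train), len(test))
```

In a benchmark of this kind, the training folds would be used to build the search database and the held-out fold would supply the query reads; that step is omitted here because it depends on the BLAST installation and data at hand.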
{"title":"The Effect of Sequence Error and Partial Training Data on BLAST Accuracy","authors":"S. Essinger, G. Rosen","doi":"10.1109/BIBE.2010.49","DOIUrl":"https://doi.org/10.1109/BIBE.2010.49","journal":{"name":"2010 IEEE International Conference on BioInformatics and BioEngineering"},"publicationDate":"2010-05-31"}
J. D. Scheff, I. Androulakis, S. Calvano, S. Lowry
Due to the inherent complexities involved in understanding a redundant, non-linear process like inflammation, systems biology modeling approaches have been proposed. While these models have produced encouraging results, they do not take into account the circadian nature of many of their components. Circadian rhythms are daily variations in a wide range of biological processes, including many of the hormones and cytokines responsible for the regulation of inflammation. Thus, to work towards a more complete and useful model of inflammation, we extend previous modeling efforts by incorporating important interactions that give rise to circadian variability in inflammation. This is predicated on the assumption that hormones regulated by the central circadian pacemaker ultimately govern diurnal variations in the inflammatory response. The model is then used to simulate a normal resting state, a healthy self-limited inflammatory response, and an unresolved inflammatory response. By analyzing how these responses change throughout the day, we gain clinically relevant insight.
{"title":"Modeling Circadian Rhythms in Inflammation","authors":"J. D. Scheff, I. Androulakis, S. Calvano, S. Lowry","doi":"10.1109/BIBE.2010.39","DOIUrl":"https://doi.org/10.1109/BIBE.2010.39","journal":{"name":"2010 IEEE International Conference on BioInformatics and BioEngineering"},"publicationDate":"2010-05-31"}
This manuscript presents the most rigorous benchmarking of gene annotation algorithms for metagenomic datasets to date. We compare three programs: GeneMark, MetaGeneAnnotator (MGA) and Orphelia. The comparisons are based on their performance on simulated fragments from one hundred species of diverse lineages. We define three types of fragments: one from the intra-coding region and the other two from the gene edges. In general, the performance of all three programs improves as fragment length increases. In addition, intra-coding fragments show a lower annotation error than gene-edge fragments in all of the programs.
{"title":"Comparison of Gene Prediction Programs for Metagenomic Data","authors":"Non Yok, G. Rosen","doi":"10.1109/BIBE.2010.58","DOIUrl":"https://doi.org/10.1109/BIBE.2010.58","journal":{"name":"2010 IEEE International Conference on BioInformatics and BioEngineering"},"publicationDate":"2010-05-31"}
Experimental construction of biochemical reaction systems that model cellular behavior often leads to kinetic parameter values that do not satisfy important thermodynamic constraints, thus resulting in models that are not physically realizable. In this paper, we propose a method that takes thermodynamically infeasible published kinetic parameter values and recomputes a new set of thermodynamically consistent values. The method is based on formulating and implementing an appropriate constrained optimization problem by assuming that the molecular dynamics produced by the published values are “noisy” versions of the corresponding dynamics produced by the thermodynamically consistent “true” values.
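One familiar example of a thermodynamic constraint on kinetic parameters is Wegscheider's condition: around any closed cycle of reversible mass-action reactions, the product of forward rate constants must equal the product of reverse rate constants. The sketch below only checks that textbook condition for a single cycle. It is not the constrained optimization proposed in the paper, whose constraints and objective are not given in the abstract, and the function names and rate values are hypothetical.

```python
import math

def cycle_log_imbalance(forward, reverse):
    """Sum of ln(k_f / k_r) around one reaction cycle.

    The sum is zero exactly when the product of forward constants
    equals the product of reverse constants (Wegscheider's condition).
    """
    return sum(math.log(kf / kr) for kf, kr in zip(forward, reverse))

def satisfies_wegscheider(forward, reverse, tol=1e-9):
    """True if the cycle's rate constants are consistent up to `tol`."""
    return abs(cycle_log_imbalance(forward, reverse)) < tol

# Hypothetical cycle A <-> B <-> C <-> A.
consistent = satisfies_wegscheider([2.0, 3.0, 0.5], [1.0, 1.5, 2.0])    # ratios 2, 2, 1/4
inconsistent = satisfies_wegscheider([2.0, 3.0, 1.0], [1.0, 1.5, 2.0])  # ratios 2, 2, 1/2
print(consistent, inconsistent)
```

A parameter set that fails this check implies a cycle that would run forever at equilibrium, which is the kind of physically unrealizable behavior the abstract refers to.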
{"title":"On Constructing Thermodynamically Consistent Parametrizations of Kinetic Models","authors":"W. G. Jenkinson, J. Goutsias","doi":"10.1109/BIBE.2010.42","DOIUrl":"https://doi.org/10.1109/BIBE.2010.42","journal":{"name":"2010 IEEE International Conference on BioInformatics and BioEngineering"},"publicationDate":"2010-05-31"}
Yang Pu, Saangho Lee, D. Samuels, L. Watson, Yang Cao
Insulin secreted by pancreatic islet beta-cells is the principal regulating hormone of glucose metabolism. Disruption of insulin secretion may cause glucose to accumulate in the blood and result in diabetes mellitus. Although deterministic models of the insulin secretion pathway are available, the stochastic aspect of the biological pathway has not been explored. As a first step in this direction, we present a hybrid model of the insulin secretion pathway in which the delayed rectifying K+ channels are treated as stochastic events. Simulation results demonstrate that our hybrid model can not only reproduce the bursts of electrical activity, as the deterministic model does, but can also be used to predict the magnitude of the total number of delayed rectifying K+ channels per cell needed to prevent the function of this pathway from being disrupted by stochastic effects. The coupling of multiple cells is also studied with the hybrid model, which shows synchronization behavior among the cells.
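The link between channel count and stochastic disruption can be illustrated with a toy simulation of independent two-state channels. This is a minimal sketch under strong assumptions, not the hybrid model of the paper: the opening and closing rates `alpha` and `beta` are hypothetical constants (in a real beta-cell model they would depend on membrane potential), and no deterministic equations are coupled to the channels.

```python
import random
import statistics

def simulate_open_fraction(n_channels, alpha=0.5, beta=0.5, dt=0.1, n_steps=500, seed=0):
    """Simulate n_channels independent closed <-> open channels.

    Returns the fraction of open channels after each time step.
    alpha and beta are hypothetical opening and closing rates.
    """
    rng = random.Random(seed)
    p_open = alpha * dt   # per-step probability that a closed channel opens
    p_close = beta * dt   # per-step probability that an open channel closes
    n_open = n_channels // 2
    fractions = []
    for _ in range(n_steps):
        n_closed = n_channels - n_open
        # Each channel flips independently (a binomial draw written as a sum).
        opened = sum(rng.random() < p_open for _ in range(n_closed))
        closed = sum(rng.random() < p_close for _ in range(n_open))
        n_open += opened - closed
        fractions.append(n_open / n_channels)
    return fractions

# Fluctuations in the open fraction shrink as the channel count grows.
for n in (10, 100, 1000):
    spread = statistics.stdev(simulate_open_fraction(n))
    print(n, round(spread, 4))
```

For independent channels the relative fluctuation scales roughly as one over the square root of the channel count, which is why a minimum number of channels per cell can be expected to keep noise from disturbing the pathway.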
{"title":"Hybrid Modeling and Simulation of Insulin Secretion Pathway in Pancreatic Islets","authors":"Yang Pu, Saangho Lee, D. Samuels, L. Watson, Yang Cao","doi":"10.1109/BIBE.2010.34","DOIUrl":"https://doi.org/10.1109/BIBE.2010.34","journal":{"name":"2010 IEEE International Conference on BioInformatics and BioEngineering"},"publicationDate":"2010-05-31"}
The identification of significant disease-related genes and networks is an important issue in understanding the underlying mechanisms of cells. We integrate phenotype networks and protein networks, and make efficient use of gene expression data, to identify human disease networks. We use prostate cancer data as our test domain. In comparison with statistical methods such as the t-test and the Wilcoxon test, our method identifies more prostate cancer-related genes reported in published databases and the literature. The canonical pathways for interleukin-type growth factors, Ras-related oncogenes and cytokine interactions are found to be significantly related to prostate cancer.
{"title":"Identifying Prostate Cancer-Related Networks from Microarray Data Based on Genotype-Phenotype Networks Using Markov Blanket Search","authors":"Hsiang-Yuan Yeh, Yi-Yu Liu, Cheng-Yu Yeh, V. Soo","doi":"10.1109/BIBE.2010.64","DOIUrl":"https://doi.org/10.1109/BIBE.2010.64","journal":{"name":"2010 IEEE International Conference on BioInformatics and BioEngineering"},"publicationDate":"2010-05-31"}
Wireless Capsule Endoscopy (WCE) is an endoscopy technology that allows medical personnel to view the digestive tract non-invasively. Physicians can detect conditions such as blood-based abnormalities, polyps, ulcers and Crohn’s disease. In previous papers we proposed methodologies that deal with such abnormalities. In the current paper we propose a novel approach to visualizing the digestive tract surface in three-dimensional (3D) space using the 2D images of WCE videos. Current capsule technology has a low video frame rate (3 frames/second), and therefore a 3D reconstruction of the digestive tract lacks detailed coherence. Since such a 3D reconstruction is infeasible, we focused our research on extracting and representing in 3D the texture of the surface of the digestive tract, not the digestive tract itself. Our collaborating physicians have appreciated this technique because it lets them view WCE videos in a new way. To our knowledge, no similar research has been performed on WCE videos. Illustrative results from the methodology are also given in this paper.
{"title":"3D Representation of the Digestive Tract Surface in Wireless Capsule Endoscopy Videos","authors":"A. Karargyris, O. Karargyris, N. Bourbakis","doi":"10.1109/BIBE.2010.9","DOIUrl":"https://doi.org/10.1109/BIBE.2010.9","journal":{"name":"2010 IEEE International Conference on BioInformatics and BioEngineering"},"publicationDate":"2010-05-31"}
Partial AUC (pAUC) is the area under the ROC curve over a restricted range of specificity (e.g. a low false positive rate). It may identify important regionally differentiated genes that are missed by full-range analysis. Unlike the popular t-test, which is based on the mean difference and the standard deviation between the disease and healthy groups, the pAUC-based test statistic relies on the rank of a gene in different samples. It can effectively detect genes that are not significant in a t-test and are differentiated in only a subset of the disease group. Our experiments with real gene expression data show that the proposed pAUC statistic is appealing in terms of both detection power and the biological relevance of the results.
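As a rough illustration of the idea, the sketch below computes the area under an empirical ROC curve restricted to false positive rates up to `max_fpr`. It is a generic trapezoidal pAUC, not necessarily the test statistic proposed in the paper, and the function names and the expression values in the example are hypothetical.

```python
def roc_points(scores, labels):
    """Empirical ROC points (FPR, TPR); labels are 1 for disease, 0 for healthy."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Sort by descending score so that each step lowers the decision threshold.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        # Handle tied scores together so that ties form a single segment.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            fp += 1 - pairs[j][1]
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points

def partial_auc(scores, labels, max_fpr=0.1):
    """Area under the empirical ROC curve for FPR in [0, max_fpr]."""
    pts = roc_points(scores, labels)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the final segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical gene that is over-expressed in only two of five disease samples.
scores = [9.0, 8.0, 1.0, 1.5, 0.5, 2.0, 3.0, 2.5, 3.5, 4.0]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(partial_auc(scores, labels, max_fpr=0.2))  # about 0.08, versus 0.02 for a random score
print(partial_auc(scores, labels, max_fpr=1.0))  # full AUC, about 0.4
```

In this example the full AUC falls below 0.5, so a full-range analysis would not flag the gene, while the area at low false positive rates is about four times what a random score would give. This is the situation the abstract describes, where only a subset of the disease group is differentiated.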
{"title":"Partial AUC for Differentiated Gene Detection","authors":"Zhenqiu Liu, T. Hyslop","doi":"10.1109/BIBE.2010.68","DOIUrl":"https://doi.org/10.1109/BIBE.2010.68","journal":{"name":"2010 IEEE International Conference on BioInformatics and BioEngineering"},"publicationDate":"2010-05-31"}
Bones are firm and solid organs that form the skeleton of the body. Bone consists of living tissue in which remodeling occurs asynchronously at various sites; remodeling involves resorption by osteoclasts, followed by formation of new bone by osteoblasts. Although bone is a simple composite of a mineral structure, its structure contributes greatly to its strength. In this paper, we develop a computational framework for modeling microstructural bone dynamics that is capable of quantitative assessment of bone mineral density and bone micro-architecture. Our paper focuses on bone microstructure and remodeling dynamics based on a bone network model. First, we generate a bone network based on our bone model reflecting bone microstructure. Next, we introduce a mathematical model of autocrine and paracrine interactions between osteoblasts and osteoclasts, which allows us to calculate cell population dynamics and changes in bone mass at multiple sites of bone remodeling. Last, we analyze two bone networks, representing healthy bone and bone with osteoporosis, using our evaluation measures. Our study provides an initial framework for bone remodeling simulation that can explain experimental observations in bone biology and can be used to explore bone diseases such as osteoporosis.
{"title":"Computational Framework for Microstructural Bone Dynamics Model and Its Evaluation","authors":"Taehyong Kim, W. Hwang, A. Zhang, M. Ramanathan","doi":"10.1109/BIBE.2010.23","DOIUrl":"https://doi.org/10.1109/BIBE.2010.23","journal":{"name":"2010 IEEE International Conference on BioInformatics and BioEngineering"},"publicationDate":"2010-05-31"}
We developed a method for analyzing the dynamics of gene regulatory networks in a purely qualitative fashion. In our method, constraints on the possible behaviors of a network and a biological property of interest are described as Linear Temporal Logic formulas, which are then analyzed automatically by satisfiability checking. In this way, we can investigate whether there exists some behavior that satisfies a specified property, or whether all behaviors satisfy it; both questions are difficult to answer with quantitative analysis.
{"title":"Qualitative Analysis of Gene Regulatory Networks by Satisfiability Checking of Linear Temporal Logic","authors":"S. Ito, N. Izumi, Shigeki Hagihara, N. Yonezaki","doi":"10.1109/BIBE.2010.45","DOIUrl":"https://doi.org/10.1109/BIBE.2010.45","journal":{"name":"2010 IEEE International Conference on BioInformatics and BioEngineering"},"publicationDate":"2010-05-31"}