
Microbial Informatics and Experimentation: Latest Articles

Mycobacterium tuberculosis and Clostridium difficille interactomes: demonstration of rapid development of computational system for bacterial interactome prediction.
Pub Date : 2012-03-21 DOI: 10.1186/2042-5783-2-4
Seshan Ananthasubramanian, Rahul Metri, Ankur Khetan, Aman Gupta, Adam Handen, Nagasuma Chandra, Madhavi Ganapathiraju

Background: Protein-protein interaction (PPI) networks (interactomes) of most organisms, except for some model organisms, are largely unknown. Experimental methods, including high-throughput techniques, are highly resource intensive. Computational discovery of PPIs can therefore accelerate biological discovery by presenting the "most-promising" pairs of proteins that are likely to interact. For many bacteria, the genome sequence, and thereby the genomic context of the proteome, is readily available; for some of these proteomes, localization and functional annotations are also available, but interactomes are not. We present here a method for rapid development of a computational system to predict the interactome of a bacterial proteome. While other studies have presented methods to transfer interologs across species, we propose transferring computational models to benefit from cross-species annotations, thereby predicting many more novel interactions even in the absence of interologs. Mycobacterium tuberculosis (Mtb) and Clostridium difficile (CD) are used to demonstrate the work.

Results: We developed a random forest classifier over features derived from Gene Ontology annotations and genetic context scores provided by the STRING database to predict Mtb and CD interactions independently. The Mtb classifier gave a precision of 94% and a recall of 23% on a held-out test set. The Mtb model was then run on all 8 million protein pairs of the Mtb proteome, yielding 708 new interactions at 94% expected precision, or 1,595 new interactions at 80% expected precision. The CD classifier gave a precision of 90% and a recall of 16% on a held-out test set. The CD model was run on all 8 million protein pairs of the CD proteome, yielding 143 new interactions at 90% expected precision, or 580 new interactions at 80% expected precision. We also compared the overlap of our predictions with STRING database interactions for CD and Mtb, and with interactions recently identified by a bacterial 2-hybrid system for Mtb. To demonstrate the utility of transferring computational models, we applied the Mtb model to predict interactions among CD protein pairs. The resulting cross-species model yielded a precision of 88% at a recall of 8%. To demonstrate transfer of features from other organisms in the absence of feature-based and interaction-based information, we transferred missing feature values from Mtb orthologs into the CD data. By transferring these data from orthologs (not interologs), we showed that a large number of interactions can be predicted.
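
The Results above report predictions "at 94% expected precision" or "at 80% expected precision", which implies choosing a score cut-off on held-out data and then applying it to every protein pair. The following is a minimal sketch of that idea, not the authors' code: the function names, the toy scores and labels, and the proteome size of 4,000 proteins are all assumptions made for illustration (4,000 was picked only because it gives roughly the 8 million pairs quoted).

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when pairs scoring >= threshold are called interacting."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def lowest_threshold_for_precision(scores, labels, target):
    """Smallest observed score cut-off whose held-out precision reaches `target`."""
    for t in sorted(set(scores)):
        p, _ = precision_recall(scores, labels, t)
        if p >= target:
            return t
    return None  # no cut-off reaches the target precision


def count_pairs(n_proteins):
    """Number of unordered protein pairs in a proteome of n_proteins."""
    return n_proteins * (n_proteins - 1) // 2


# Toy held-out set (hypothetical classifier scores; 1 = known interaction).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

cutoff = lowest_threshold_for_precision(scores, labels, target=0.75)
print(cutoff, precision_recall(scores, labels, cutoff))
print(count_pairs(4000))  # hypothetical proteome size
```

Lowering the target precision admits a lower cut-off and therefore more predicted pairs, which is the trade-off behind the two prediction counts reported for each organism.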

Conclusions: Rapid discovery of a (partial) bacterial interactome can be made by using the existing set of GO and STRING features associated with the organisms. We can make use of cross-species interactome development when there are not even sufficient known interactions to develop a computational prediction system. Computational models of well-studied organisms can be used to make initial interaction predictions for the target organism. We have also demonstrated that annotations can be transferred from orthologs in well-studied organisms, enabling accurate predictions for organisms with no annotations. These approaches can serve as building blocks to address the challenges of feature coverage and missing interactions, towards rapid interactome discovery for bacterial organisms.

Availability: Predictions for all Mtb and CD proteins are available for browsing and download at http://severus.dbmi.pitt.edu/TB and http://severus.dbmi.pitt.edu/CD, respectively.
Citations: 10
Metagenomics - a guide from sampling to data analysis.
Pub Date : 2012-02-09 DOI: 10.1186/2042-5783-2-3
Torsten Thomas, Jack Gilbert, Folker Meyer

Metagenomics applies a suite of genomic technologies and bioinformatics tools to directly access the genetic content of entire communities of organisms. The field of metagenomics has been responsible for substantial advances in microbial ecology, evolution, and diversity over the past 5 to 10 years, and many research laboratories are actively engaged in it now. With the growing numbers of activities also comes a plethora of methodological knowledge and expertise that should guide future developments in the field. This review summarizes the current opinions in metagenomics, and provides practical guidance and advice on sample processing, sequencing technology, assembly, binning, annotation, experimental design, statistical analysis, data storage, and data sharing. As more metagenomic datasets are generated, the availability of standardized procedures and shared data storage and analysis becomes increasingly important to ensure that output of individual projects can be assessed and compared.

Citations: 0
Linear normalised hash function for clustering gene sequences and identifying reference sequences from multiple sequence alignments.
Pub Date : 2012-01-26 DOI: 10.1186/2042-5783-2-2
Manal Helal, Fanrong Kong, Sharon Ca Chen, Fei Zhou, Dominic E Dwyer, John Potter, Vitali Sintchenko

Background: Comparative genomics has put additional demands on the assessment of similarity between sequences and on their clustering as a means of classification. However, defining the optimal number of clusters, cluster density and boundaries for sets of potentially related gene sequences with variable degrees of polymorphism remains a significant challenge. The aim of this study was to develop a method that identifies the cluster centroids and the optimal number of clusters for a given sensitivity level and works equally well for different sequence datasets.

Results: A novel method that combines the linear mapping hash function and multiple sequence alignment (MSA) was developed. This method takes advantage of the sequences already sorted by similarity in the MSA output, and identifies the optimal number of clusters, cluster cut-offs, and cluster centroids that can represent reference gene vouchers for the different species. The linear mapping hash function maps a distance matrix already ordered by similarity to indices, revealing gaps in the values around which the optimal cut-offs of the different clusters can be identified. The method was evaluated using sets of closely related (16S rRNA gene sequences of Nocardia species) and highly variable (VP1 genomic region of Enterovirus 71) sequences, and outperformed existing unsupervised machine learning clustering methods and dimensionality reduction methods. This method does not require prior knowledge of the number of clusters or the distance between clusters, handles clusters of different sizes and shapes, and scales linearly with the dataset.
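
To make the idea of "mapping ordered distances to indices to reveal gaps" concrete, here is a small sketch under stated assumptions. It is a simplified illustration, not the published algorithm: the function names, the bucket count, and the toy distances are hypothetical, and the input is assumed to be a one-dimensional list of distances already sorted in ascending order (as one row of an ordered distance matrix might be).

```python
def linear_hash(value, lo, hi, n_buckets):
    """Map a value in [lo, hi] linearly onto a bucket index 0..n_buckets-1."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_buckets)
    return min(idx, n_buckets - 1)  # the maximum value falls in the last bucket


def cluster_by_gaps(sorted_distances, n_buckets):
    """Split an ascending list of distances wherever at least one bucket is skipped."""
    if not sorted_distances:
        return []
    lo, hi = sorted_distances[0], sorted_distances[-1]
    clusters = [[sorted_distances[0]]]
    prev = linear_hash(sorted_distances[0], lo, hi, n_buckets)
    for d in sorted_distances[1:]:
        cur = linear_hash(d, lo, hi, n_buckets)
        if cur - prev > 1:  # an empty bucket lies between neighbours: cut here
            clusters.append([])
        clusters[-1].append(d)
        prev = cur
    return clusters


# Toy distances from a reference sequence (hypothetical values).
distances = [0.00, 0.01, 0.02, 0.30, 0.31, 0.33, 0.90, 0.92]
print(cluster_by_gaps(distances, n_buckets=10))
```

Because the input is already ordered, a single pass over the values is enough, which is consistent with the linear scaling mentioned in the abstract; the number of clusters falls out of the gaps rather than being supplied in advance.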

Conclusions: The combination of MSA with the linear mapping hash function is a computationally efficient way of gene sequence clustering and can be a valuable tool for the assessment of similarity, clustering of different microbial genomes, identifying reference sequences, and for the study of evolution of bacteria and viruses.

Citations: 2
Use of the University of Minnesota Biocatalysis/Biodegradation Database for study of microbial degradation.
Pub Date : 2012-01-04 DOI: 10.1186/2042-5783-2-1
Lynda Bm Ellis, Lawrence P Wackett

Microorganisms are ubiquitous on earth and have diverse metabolic transformative capabilities that are important for the environmental biodegradation of chemicals, which helps maintain ecosystem and human health. Microbial biodegradative metabolism is the main focus of the University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD). UM-BBD data has also been used to develop a computational metabolic pathway prediction system that can be applied to chemicals for which biodegradation data is currently lacking. The UM-Pathway Prediction System (UM-PPS) relies on metabolic rules that are based on organic functional groups and predicts plausible biodegradative metabolism. The predictions are useful to environmental chemists looking for metabolic intermediates, regulators looking for potentially toxic products, microbiologists seeking to understand microbial biodegradation, and others with a wide range of interests.

Citations: 40
Stringent response of Escherichia coli: revisiting the bibliome using literature mining.
Pub Date : 2011-12-30 DOI: 10.1186/2042-5783-1-14
Sónia Carneiro, Anália Lourenço, Eugénio C Ferreira, Isabel Rocha

Background: Understanding the mechanisms responsible for cellular responses depends on the systematic collection and analysis of information on the main biological concepts involved. Indeed, the identification of biologically relevant concepts in free text, namely genes, tRNAs, mRNAs, gene products and small molecules, is crucial to capture the structure and functioning of different responses.

Results: In this work, we review literature reports on the study of the stringent response in Escherichia coli. Rather than developing a highly specialised literature mining approach, we investigate the suitability of concept recognition and statistical analysis of concept occurrence as means to highlight the concepts that are most likely to be biologically engaged during this response. The co-occurrence of the core concepts of the stringent response, i.e. the (p)ppGpp nucleotides, with gene products was also inspected, and suggests that besides the enzymes RelA and SpoT that control the basal levels of (p)ppGpp nucleotides, many other proteins have a key role in this response. Functional enrichment analysis revealed that basic cellular processes such as metabolism and transcriptional and translational regulation are central, but that other stress-associated responses might be elicited during the stringent response. In addition, the identification of less annotated concepts revealed that some (p)ppGpp-induced functional activities are still overlooked in most reviews.
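
The co-occurrence analysis mentioned above can be pictured as counting, per document, which recognised concepts appear together with a core concept. The sketch below assumes concept recognition has already been done and each document is reduced to a list of concept names; the function name, the toy documents, and the placeholder concept "proteinX" are hypothetical and not taken from the study.

```python
from collections import Counter


def cooccurrence_with(core, documents):
    """Count, for each concept, the number of documents it shares with `core`."""
    counts = Counter()
    for concepts in documents:
        concept_set = set(concepts)  # count each concept once per document
        if core in concept_set:
            counts.update(concept_set - {core})
    return counts


# Toy corpus: each inner list holds the concepts recognised in one document.
documents = [
    ["(p)ppGpp", "RelA", "SpoT"],
    ["(p)ppGpp", "RelA", "proteinX"],
    ["RelA", "SpoT"],
    ["(p)ppGpp", "RelA", "RelA"],
]

counts = cooccurrence_with("(p)ppGpp", documents)
print(counts.most_common())
```

Concepts that rank high in such a table are candidates for closer inspection; a real analysis would also need a statistical test against each concept's background frequency, which this sketch omits.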

Conclusions: In this paper we applied a literature mining approach that offers a more comprehensive analysis of the stringent response in E. coli. Compiling the biological entities relevant to this stress response and assessing their functional roles provided a more systematic understanding of this cellular response. Overlooked regulatory entities, such as transcriptional regulators, were found to play a role in this stress response. Moreover, the involvement of other stress-associated concepts demonstrates the complexity of this cellular response.

Citations: 10
PerPlot & PerScan: tools for analysis of DNA curvature-related periodicity in genomic nucleotide sequences.
Pub Date : 2011-11-28 DOI: 10.1186/2042-5783-1-13
Jan Mrázek, Tejas Chaudhari, Aryabrata Basu

Background: Periodic spacing of short adenine or thymine runs phased with DNA helical period of ~10.5 bp is associated with intrinsic DNA curvature and deformability, which play important roles in DNA-protein interactions and in the organization of chromosomes in both eukaryotes and prokaryotes. Local differences in DNA sequence periodicity have been linked to differences in gene expression in some organisms. Despite the significance of these periodic patterns, there are virtually no publicly accessible tools for their analysis.

Results: We present novel tools suitable for assessments of DNA curvature-related sequence periodicity in nucleotide sequences at the genome scale. Utility of the present software is demonstrated on a comparison of sequence periodicities in the genomes of Haemophilus influenzae, Methanocaldococcus jannaschii, Saccharomyces cerevisiae, and Arabidopsis thaliana. The software can be accessed through a web interface and the programs are also available for download.

Conclusions: The present software is suitable for comparing DNA curvature-related sequence periodicity among different genomes as well as for analysis of intrachromosomal heterogeneity of the sequence periodicity. It provides a quick and convenient way to detect anomalous regions of chromosomes that could have unusual structural and functional properties and/or distinct evolutionary history.
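
As a rough illustration of what "sequence periodicity" of short A or T runs means, one can histogram the distances between occurrences of AA/TT dinucleotides and look for excess counts near multiples of the helical period (~10.5 bp, per the Background). This is a simplified sketch and not the PerPlot or PerScan algorithm; the function names and the synthetic sequence are assumptions made for the example.

```python
from collections import Counter


def dinucleotide_positions(seq, dinucs=("AA", "TT")):
    """Start positions of the given dinucleotides in a sequence."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] in dinucs]


def spacing_histogram(positions, max_dist):
    """Count pairs of positions separated by each distance up to max_dist."""
    hist = Counter()
    for i, p in enumerate(positions):
        for q in positions[i + 1:]:
            d = q - p
            if d > max_dist:
                break  # positions are ascending, so later ones are farther away
            hist[d] += 1
    return hist


# Synthetic sequence with an AA run every 10 bp (hypothetical data).
seq = "AACGCGCGCG" * 5
positions = dinucleotide_positions(seq)
hist = spacing_histogram(positions, max_dist=30)
print(sorted(hist.items()))
```

On a real genome the histogram would be noisy, and the periodic signal would need to be compared against a background expectation; sliding the same calculation along a chromosome is one way to think about the intrachromosomal heterogeneity the abstract refers to.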

Citations: 6
Prediction of a novel RNA binding domain in crocodilepox Zimbabwe Gene 157.
Pub Date : 2011-11-21 DOI: 10.1186/2042-5783-1-12
Nicole S Little, Taylor Quon, Chris Upton

Background: Although the crocodilepox virus (CRV) is currently unclassified, phylogenetic analyses suggest that its closest known relatives are molluscum contagiosum virus (MCV) and the avipox viruses. The CRV genome is approximately 190 kb and contains a large number of unique genes in addition to the set of conserved Chordopoxvirus genes found in all such viruses. Upon sequencing the viral genome, others noted that this virus was also unusual because of the lack of a series of common immuno-suppressive genes. However, the genome contains multiple genes of unknown function that are likely to function in reducing the anti-viral response of the host.

Results: Using sensitive database similarity searches, we observed that gene 157 of CRV strain Zimbabwe (CRV-ZWE) encodes a protein with a domain that is predicted to bind dsRNA. Domain characterization supported this prediction; we therefore tested the ability of the Robetta protein structure prediction server to model the amino acid sequence of this protein on a well-characterized RNA binding domain. The model generated by Robetta suggests that CRV-ZWE-157 does indeed contain an RNA binding domain; the model could be overlaid on the template protein structure with high confidence.

Conclusion: We hypothesize that CRV-ZWE-157 encodes a novel poxvirus RNA binding protein and suggest that as a non-core gene it may play a role in host-range determination or function to dampen host anti-viral responses. Potential targets for this CRV protein include the host interferon response and miRNA pathways.

Citations: 1
Natural genetic engineering: intelligence & design in evolution?
Pub Date : 2011-10-31 DOI: 10.1186/2042-5783-1-11
D. Ussery
Citations: 0
Translational web robots for pathogen genome analysis.
Pub Date : 2011-10-31 DOI: 10.1186/2042-5783-1-10
Vitali Sintchenko, Enrico W Coiera
Citations: 1
Quantifying the effect of environment stability on the transcription factor repertoire of marine microbes.
Pub Date : 2011-09-07 DOI: 10.1186/2042-5783-1-9
Ivaylo Kostadinov, Renzo Kottmann, Alban Ramette, Jost Waldmann, Pier Luigi Buttigieg, Frank Oliver Glöckner

Background: DNA-binding transcription factors (TFs) regulate cellular functions in prokaryotes, often in response to environmental stimuli. Thus, the environment exerts constant selective pressure on the TF gene content of microbial communities. A recent study of marine Synechococcus strains detected differences in their genomic TF content related to environmental adaptation, but the effect of environmental parameters on the TF content of bacterial communities has not yet been systematically investigated.

Results: We quantified the effect of environment stability on the transcription factor repertoire of marine pelagic microbes from the Global Ocean Sampling (GOS) metagenome using interpolated physico-chemical parameters and multivariate statistics. Thirty-five percent of the difference in relative TF abundances between samples could be explained by environment stability. Six percent was attributable to spatial distance but none to a combination of both spatial distance and stability. Some individual TFs showed a stronger relationship to environment stability and space than the total TF pool.

Conclusions: Environmental stability appears to have a clearly detectable effect on TF gene content in bacterioplanktonic communities described by the GOS metagenome. Interpolated environmental parameters were shown to compare well to in situ measurements and were essential for quantifying the effect of the environment on the TF content. It is demonstrated that comprehensive and well-structured contextual data will strongly enhance our ability to interpret the functional potential of microbes from metagenomic data.
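
The abstract does not name the specific multivariate procedure behind the 35% and 6% figures. One common way to split explained variation between two sets of predictors is variation partitioning, and the following is a minimal sketch of that idea under stated assumptions: it uses ordinary least squares and unadjusted R² (published analyses usually report adjusted values), and the function names `r_squared` and `partition_variation` are hypothetical, not taken from the paper.

```python
import numpy as np

def r_squared(Y, X):
    """Fraction of the total variance in Y explained by a linear fit on X.

    Y: samples x response variables (e.g. relative TF abundances).
    X: samples x predictors (1-D or 2-D).
    """
    Y = np.asarray(Y, dtype=float)
    Yc = Y - Y.mean(axis=0)                      # centre the responses
    Xd = np.column_stack([np.ones(len(Yc)), X])  # add an intercept column
    coef, *_ = np.linalg.lstsq(Xd, Yc, rcond=None)
    resid = Yc - Xd @ coef
    return 1.0 - (resid ** 2).sum() / (Yc ** 2).sum()

def partition_variation(Y, env, space):
    """Split explained variation into env-only, space-only and shared parts."""
    r_env = r_squared(Y, env)
    r_space = r_squared(Y, space)
    r_both = r_squared(Y, np.column_stack([env, space]))
    return {
        "env_only": r_both - r_space,          # explained by env after space
        "space_only": r_both - r_env,          # explained by space after env
        "shared": r_env + r_space - r_both,    # jointly explained
        "unexplained": 1.0 - r_both,
    }

if __name__ == "__main__":
    # Toy data (hypothetical): two uncorrelated predictors, one response.
    env = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
    space = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
    Y = (2 * env + space)[:, None]
    for name, value in partition_variation(Y, env, space).items():
        print(f"{name}: {value:.3f}")
```

In this scheme a shared fraction near zero, as reported for the combination of spatial distance and stability, means the two predictor sets explain non-overlapping parts of the variation.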

Citations: 3