Microbial Informatics and Experimentation: Latest Articles

A systematic search for discriminating sites in the 16S ribosomal RNA gene.
Pub Date : 2014-01-27 DOI: 10.1186/2042-5783-4-2
Hilde Vinje, Trygve Almøy, Kristian Hovde Liland, Lars Snipen

Background: The 16S rRNA is by far the most common genomic marker used for prokaryotic classification, and has been used extensively in metagenomic studies over recent years. Along the 16S gene there are regions with more or less variation across the kingdom of bacteria. Nine variable regions have been identified, flanked by more conserved parts of the sequence. It has been stated that the discriminatory power of the 16S marker lies in these variable regions. In the present study we wanted to examine this more closely, and used a supervised learning method to search systematically for sites that contribute to correct classification at either the phylum or genus level.

Results: When classifying phyla the site selection algorithm located 50 discriminative sites. These were scattered over most of the alignments and only around half of them were located in the variable regions. The selected sites did, however, have an entropy significantly larger than expected, meaning they are sites of large variation. We found that the discriminative sites typically have a large entropy compared to their closest neighbours along the alignments. When classifying genera the site selection algorithm needed around 80% of the sites in the 16S gene before the classification error reached a minimum. This means that all variation, in both variable and conserved regions, is needed in order to separate genera.

Conclusions: Our findings do not support the statement that the discriminative power of the 16S gene is located only in the variable regions. Variable regions are important, but just as many discriminative sites are found in the more conserved parts. The discriminative power is typically found in sites of large variation located inside shorter regions of higher conservation.
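The entropy comparison mentioned in the Results can be illustrated with a minimal sketch. It assumes Shannon entropy in bits per alignment column and compares each site with the mean of its neighbours inside a small window. The function names, the window size, and the toy alignment are illustrative assumptions, not the authors' implementation, and the abstract does not specify how the expected entropy was computed.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (bits) of one alignment column; a gap counts as a symbol."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def site_entropies(alignment):
    """Entropy of every site in a list of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*alignment)]

def local_contrast(entropies, window=2):
    """Entropy of each site minus the mean entropy of its neighbours.

    The window size is a hypothetical choice made for this sketch.
    """
    contrasts = []
    for i, h in enumerate(entropies):
        lo, hi = max(0, i - window), min(len(entropies), i + window + 1)
        neighbours = entropies[lo:i] + entropies[i + 1:hi]
        mean = sum(neighbours) / len(neighbours) if neighbours else 0.0
        contrasts.append(h - mean)
    return contrasts

# Toy alignment (invented): only the third site varies.
toy_alignment = ["ACGTA", "ACGTA", "ACCTA", "ACTTA"]
entropies = site_entropies(toy_alignment)
contrast = local_contrast(entropies)
```

In this toy case the third site has high entropy while its neighbours are fully conserved, which is the kind of pattern the abstract describes for discriminative sites.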

Citations: 21
Deep sequencing of evolving pathogen populations: applications, errors, and bioinformatic solutions.
Pub Date : 2014-01-15 DOI: 10.1186/2042-5783-4-1
Kerensa McElroy, Torsten Thomas, Fabio Luciani

Deep sequencing harnesses the high throughput nature of next generation sequencing technologies to generate population samples, treating information contained in individual reads as meaningful. Here, we review applications of deep sequencing to pathogen evolution. Pioneering deep sequencing studies from the virology literature are discussed, such as whole genome Roche-454 sequencing analyses of the dynamics of the rapidly mutating pathogens hepatitis C virus and HIV. Extension of the deep sequencing approach to bacterial populations is then discussed, including the impacts of emerging sequencing technologies. While it is clear that deep sequencing has unprecedented potential for assessing the genetic structure and evolutionary history of pathogen populations, bioinformatic challenges remain. We summarise current approaches to overcoming these challenges, in particular methods for detecting low frequency variants in the context of sequencing error and reconstructing individual haplotypes from short reads.
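As a rough illustration of why low frequency variants are hard to separate from sequencing error, the sketch below asks how likely it is to see at least a given number of variant reads at one position if every one of them were an error. It assumes independent errors at a fixed per-base rate and a simple binomial model. The error rate, the significance threshold, and the function names are assumptions made for this example, and the methods covered by the review are generally more elaborate.

```python
import math

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space; requires 0 < p < 1."""
    total = 0.0
    for i in range(k, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return total

def is_likely_variant(variant_reads, depth, error_rate=0.01, alpha=1e-3):
    """True if the variant read count is unlikely to arise from errors alone.

    error_rate and alpha are hypothetical defaults; no correction for testing
    many positions is applied here.
    """
    return binomial_tail(variant_reads, depth, error_rate) < alpha

# At 1000x depth and a 1% error rate, about 10 erroneous reads are expected.
call_high = is_likely_variant(30, 1000)  # 30 reads: well above expectation
call_low = is_likely_variant(12, 1000)   # 12 reads: compatible with error
```

Under these assumptions a variant at 3% frequency is called while one at 1.2% is not, even though both are above the nominal error rate.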

Citations: 82
Sialic acid utilization by Cronobacter sakazakii.
Pub Date : 2013-05-24 DOI: 10.1186/2042-5783-3-3
Susan Joseph, Sumyya Hariri, Naqash Masood, Stephen Forsythe

Background: The Cronobacter genus is composed of seven species, and can cause infections in all age groups. Of particular concern is C. sakazakii, as this species is strongly associated with severe and often fatal cases of necrotizing enterocolitis and meningitis in neonates and infants. Whole genome sequencing has revealed that the nanAKT gene cluster (ESA_03609-13), required for the utilisation of exogenous sialic acid, is unique to the C. sakazakii species. Sialic acid is found in breast milk, infant formula, intestinal mucin, and gangliosides in the brain, so its metabolism by C. sakazakii is of particular interest and could be an important virulence factor. To date, no laboratory studies demonstrating the growth of C. sakazakii on sialic acid have been published, nor have there been reports of sialidase activity. Phylogenetic analysis of the nan genes is of interest to determine whether the genes have been acquired by horizontal gene transfer.

Results: Phylogenetic analysis of 19 Cronobacter strains from 7 recognised species revealed that the nanAKTR genes formed a unique cluster, separate from other Enterobacteriaceae such as E. coli K1 and Citrobacter koseri, which are also associated with neonatal meningitis. The gene organisation was similar to that of Edwardsiella tarda in that the nanE gene (N-acetylmannosamine-6-phosphate 2-epimerase) was not located within the nanATK cluster. Laboratory studies confirmed that only C. sakazakii, and not the other six Cronobacter species, was able to use sialic acid as a carbon source for growth. Although the ganglioside GM1 was also used as a carbon source, no candidate sialidase genes were found in the genome; instead, the substrate degradation is probably due to β-galactosidase activity.

Conclusions: Given the relatively recent evolution of both C. sakazakii (15-23 million years ago) and sialic acid synthesis in vertebrates, sialic acid utilization may be an example of co-evolution by one species of the Cronobacter genus with the mammalian host. This has possibly resulted in additional virulence factors contributing to severe life-threatening infections in neonates due to the utilization of sialic acid from breast milk, infant formula, milk (oligosaccharides), mucins lining the intestinal wall, and even gangliosides in the brain after passing through the blood-brain barrier.

Citations: 40
Beginner's guide to comparative bacterial genome analysis using next-generation sequence data.
Pub Date : 2013-04-10 DOI: 10.1186/2042-5783-3-2
David J Edwards, Kathryn E Holt

High throughput sequencing is now fast and cheap enough to be considered part of the toolbox for investigating bacteria, and there are thousands of bacterial genome sequences available for comparison in the public domain. Bacterial genome analysis is increasingly being performed by diverse groups in research, clinical and public health labs alike, who are interested in a wide array of topics related to bacterial genetics and evolution. Examples include outbreak analysis and the study of pathogenicity and antimicrobial resistance. In this beginner's guide, we aim to provide an entry point for individuals with a biology background who want to perform their own bioinformatics analysis of bacterial genome data, to enable them to answer their own research questions. We assume readers will be familiar with genetics and the basic nature of sequence data, but do not assume any computer programming skills. The main topics covered are assembly, ordering of contigs, annotation, genome comparison and extracting common typing information. Each section includes worked examples using publicly available E. coli data and free software tools, all of which can be performed on a desktop computer.

Citations: 124
An efficient rRNA removal method for RNA sequencing in GC-rich bacteria.
Pub Date : 2013-01-07 DOI: 10.1186/2042-5783-3-1
Clelia Peano, Alessandro Pietrelli, Clarissa Consolandi, Elio Rossi, Luca Petiti, Letizia Tagliabue, Gianluca De Bellis, Paolo Landini

Background: Next generation sequencing (NGS) technologies have revolutionized gene expression studies and functional genomics analysis. However, further improvement of RNA sequencing protocols is still desirable, in order to reduce NGS costs and to increase its accuracy. In bacteria, a major problem in RNA sequencing is the abundance of ribosomal RNA (rRNA), which accounts for 95-98% of total RNA and can therefore hinder sufficient coverage of mRNA, the main focus of transcriptomic studies. Thus, efficient removal of rRNA is necessary to achieve optimal coverage, good detection sensitivity and reliable results. An additional challenge is presented by microorganisms with GC-rich genomes, in which rRNA removal is less efficient.

Results: In this work, we tested two commercial kits for rRNA removal, either alone or in combination, on Burkholderia thailandensis. This bacterium, chosen as representative of the important Burkholderia genus, which includes both pathogenic and environmental bacteria, has a rather large (6.72 Mb) and GC-rich (67.7%) genome. Each enriched mRNA sample was sequenced in duplicate on paired-end Illumina GAIIx runs, yielding between 10 and 40 million reads. We show that combined treatment with both kits allows an mRNA enrichment of more than 238-fold, enabling the sequencing of almost all (more than 90%) B. thailandensis transcripts from fewer than 10 million reads, without introducing any bias in mRNA relative abundance, thus preserving the differential expression profile.

Conclusions: The mRNA enrichment protocol presented in this work leads to an increase in detection sensitivity of up to 770% compared to total RNA; such increased sensitivity allows for a corresponding reduction in the number of sequencing reads necessary for the complete analysis of whole transcriptome expression profiling. Thus we can conclude that the MICROBExpress/Ovation combined rRNA removal method could be suitable for RNA sequencing of whole transcriptomes of microorganisms with high GC content and complex genomes, while also allowing a substantial reduction in sequencing costs.

Citations: 55
Bioinformatic identification of Mycobacterium tuberculosis proteins likely to target host cell mitochondria: virulence factors?
Pub Date : 2012-12-22 DOI: 10.1186/2042-5783-2-9
María Maximina Bertha Moreno-Altamirano, Iris Selene Paredes-González, Clara Espitia, Mauricio Santiago-Maldonado, Rogelio Hernández-Pando, Francisco Javier Sánchez-García

Background: M. tuberculosis infection either induces or inhibits host cell death, depending on the bacterial strain and the cell microenvironment. There is evidence suggesting a role for mitochondria in these processes. On the other hand, it has been shown that several bacterial proteins are able to target mitochondria, playing a critical role in bacterial pathogenesis and modulation of cell death. However, mycobacteria-derived proteins able to target host cell mitochondria are less studied.

Results: A bioinformatic analysis, based on available genomic sequences of the common laboratory virulent reference strain Mycobacterium tuberculosis H37Rv, the avirulent strain H37Ra, the clinical isolate CDC1551, and M. bovis BCG Pasteur strain 1173P2, and on suitable bioinformatic tools (MitoProt II, PSORT II, and SignalP), was used to search in silico for proteins likely to be secreted by mycobacteria that could target host cell mitochondria. It showed that at least 19 M. tuberculosis proteins could possibly target host cell mitochondria. We experimentally tested this bioinformatic prediction on four M. tuberculosis recombinant proteins chosen from this list of 19 proteins (p27, PE_PGRS1, PE_PGRS33, and MT_1866). Confocal microscopy analyses showed that the p27 and PE_PGRS33 proteins colocalize with mitochondria.

Conclusions: Based on the bioinformatic analysis of whole M. tuberculosis genome sequences, we propose that at least 19 out of 4,246 M. tuberculosis predicted proteins would be able to target host cell mitochondria and, in turn, control mitochondrial physiology. Interestingly, this list of 19 proteins includes five members of a mycobacteria-specific family of proteins (PE/PE_PGRS) thought to be virulence factors, and p27, a well-known virulence factor. The p27 and PE_PGRS33 proteins were shown experimentally to target mitochondria in J774 cells. Our results suggest a link between mitochondrial targeting of M. tuberculosis proteins and virulence.

Citations: 21
Analysis of evolutionary patterns of genes in Campylobacter jejuni and C. coli.
Pub Date : 2012-08-28 DOI: 10.1186/2042-5783-2-8
Lars Snipen, Trudy M Wassenaar, Eric Altermann, Jonathan Olson, Sophia Kathariou, Karin Lagesen, Monica Takamiya, Susanne Knøchel, David W Ussery, Richard J Meinersmann

Background: The thermophilic Campylobacter jejuni and Campylobacter coli are considered weakly clonal populations in which incongruences between genetic markers are assumed to be due to random horizontal transfer of genomic DNA. To investigate the population genetic structure, we extracted a set of 1180 core gene families (CGFs) from 27 sequenced genomes of C. jejuni and C. coli. We applied principal component analysis (PCA) to the normalized evolutionary distances in order to reveal any patterns in the evolutionary signals contained within the various CGFs.

Results: The analysis indicates that the conserved genes in Campylobacter show at least two, possibly five, distinct patterns of evolutionary signals, seen as clusters in the score space of our PCA. The dominant underlying factor separating the core genes is the ability to distinguish C. jejuni from C. coli. The genes in the clusters outside the main gene group tend strongly to be chromosomal neighbors, which is natural if they share a common evolutionary history. Also, the most distinct cluster outside the main group is enriched with genes under positive selection and displays larger-than-average recombination rates.

Conclusions: The Campylobacter genomes investigated here show that subsets of conserved genes differ from each other in a more systematic way than expected from random horizontal transfer, which is consistent with differences in the selection pressure acting on different genes. These findings indicate a population of bacteria characterized by genomes with a mixture of evolutionary patterns.
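
The PCA step mentioned in the Background can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how the distances were normalized, so the row-wise scaling to unit mean in `normalize_rows`, the function names, and the toy distance matrix are all assumptions made for illustration. Each row stands for one gene family and each column for one pair of genomes.

```python
import numpy as np

def normalize_rows(D):
    """Scale each gene family's distance vector to unit mean (assumed normalization)."""
    D = np.asarray(D, dtype=float)
    return D / D.mean(axis=1, keepdims=True)

def pca_scores(X, n_components=2):
    """Return PCA scores for the rows of X and the explained-variance ratios."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # centre each column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

# Hypothetical toy data: 6 gene families x 3 genome pairs, two distance patterns.
D = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.1, 2.1, 2.9],
     [3.0, 2.0, 1.0], [6.0, 4.0, 2.0], [2.9, 2.1, 1.1]]
scores, explained = pca_scores(normalize_rows(D))
print(scores[:, 0])   # the two patterns fall on opposite sides of the first component
print(explained)
```

Gene families that share an evolutionary signal end up close together in the score space, which is how clusters such as those described in the Results would appear.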

Citations: 10
Computational genomics-proteomics and Phylogeny analysis of twenty one mycobacterial genomes (Tuberculosis & non Tuberculosis strains).
Pub Date : 2012-08-28 DOI: 10.1186/2042-5783-2-7
Fathiah Zakham, Othmane Aouane, David Ussery, Abdelaziz Benjouad, Moulay Mustapha Ennaji

Background: The genus Mycobacterium comprises different species, among them some of the most contagious and infectious bacteria. The members of the Mycobacterium tuberculosis complex are the most virulent of these microorganisms and have killed humans and other mammals for millennia. Additionally, with the many different mycobacterial sequences available, there is a crucial need to visualize and simplify their data. In the present study, we aim to present a comparative genome, proteome and phylogeny analysis of twenty-one mycobacterial (tuberculosis and non-tuberculosis) strains using a set of computational and bioinformatics tools (pan- and core-genome plotting, BLAST matrix and phylogeny analysis).

Results: The pan- and core-genome plots showed that fewer than 1250 Mycobacterium gene families are conserved across all species, out of a total of about 20,000 gene families in the pan-genome of the twenty-one mycobacterial genomes. The BLAST matrix showed high similarity among the species of the Mycobacterium tuberculosis complex and less conservation with other slow-growing pathogenic mycobacteria. Phylogenetic analysis based on both protein conservation and rRNA clearly resolves known relationships between slow-growing mycobacteria.

Conclusion: Mycobacteria include important pathogenic species for humans and animals, and the Mycobacterium tuberculosis complex is a leading cause of death in humans. Comparative genome analysis could provide new insight for better control and prevention of these diseases.
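
The pan- and core-genome figures quoted in the Results come from a simple counting idea, sketched below. This is an illustration only: the function name `pan_core_curve` and the toy family IDs are hypothetical, and the clustering of proteins into gene families (BLAST-based in the study) is assumed to have been done beforehand.

```python
def pan_core_curve(genomes):
    """Track pan- and core-genome sizes as genomes are added one by one.

    genomes: list of sets of gene-family IDs, in the order they are added.
    Returns a list of (pan_size, core_size) tuples, one per genome.
    """
    pan = set()
    core = None
    curve = []
    for families in genomes:
        pan |= families                      # pan-genome: union of all families seen
        core = set(families) if core is None else core & families  # core: intersection
        curve.append((len(pan), len(core)))
    return curve

# Hypothetical toy input: three genomes described by their gene families.
toy = [{"fam1", "fam2", "fam3"},
       {"fam1", "fam2", "fam4"},
       {"fam1", "fam5"}]
print(pan_core_curve(toy))  # [(3, 3), (4, 2), (5, 1)]
```

The pan-genome can only grow and the core genome can only shrink as genomes are added, which matches the pattern of roughly 20,000 families in total against fewer than 1250 shared by all twenty-one genomes.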

Citations: 28
Bacterial phylogenetic tree construction based on genomic translation stop signals.
Pub Date : 2012-05-31 DOI: 10.1186/2042-5783-2-6
Lijing Xu, Jimmy Kuo, Jong-Kang Liu, Tit-Yee Wong

Background: The efficiencies of the stop codons TAA, TAG, and TGA in protein synthesis termination are not the same. These variations could allow many genes to be regulated. Many similar nucleotide trimers are found on the second and third reading frames of a gene. They are called premature stop codons (PSC). Like stop codons, the PSC in bacterial genomes are also highly biased in terms of their quantities and qualities on the genes. Phylogenetically related species often share a similar PSC profile. We want to know whether the selective forces that influence the stop codons and the PSC usage biases in a genome are related. We also wish to know how strongly these trimers in a genome are related to the natural history of the bacterium. Knowing these relations may provide better knowledge of the phylogeny of bacteria.

Results: A 16S rRNA alignment tree of 19 well-studied α-, β- and γ-Proteobacteria type species is used as the standard reference for bacterial phylogeny. The genomes of sixty-one bacteria, belonging to the α-, β- and γ-Proteobacteria subphyla, are used for this study. The stop codons and PSC are collectively termed "Translation Stop Signals" (TSS). A gene is represented by nine scalars corresponding to the counts of TAA, TAG, and TGA on each of the three reading frames of that gene. "Translation Stop Signals Ratio" (TSSR) is the ratio between the TSS counts. Four types of TSSR are investigated. The TSSR-1, TSSR-2 and TSSR-3 are each a 3-scalar series corresponding respectively to the average ratio of TAA : TAG : TGA on the first, second, and third reading frames of all genes in a genome. The Genomic-TSSR is a 9-scalar series representing the ratio of distribution of all TSS on the three reading frames of all genes in a genome. Results show that bacteria grouped by their similarities based on TSSR-1, TSSR-2, or TSSR-3 values could only partially resolve the phylogeny of the species. However, grouping bacteria based on their Genomic-TSSR values resulted in clusters of bacteria identical to the bacterial clusters of the reference tree. Unlike the 16S rRNA method, the Genomic-TSSR tree is also able to separate closely related species and strains at high resolution. Species and strains separated by the Genomic-TSSR grouping method are often in good agreement with those classified by other taxonomic methods. Correspondence analysis of individual genes shows that most genes in a bacterial genome share a similar TSSR value. However, within a chromosome, the Genic-TSSR values of genes near the replication origin region (Ori) are more similar to each other than those of genes near the terminus region (Ter).

Conclusion: The translation stop signals on the three reading frames of the genes on a bacterial genome are interrelated, possibly due to frequent off-frame recombination facilitated by translational-associated recombination (TSR). However, TSR may not occur randomly in a bacterial chromosome. Genes near the Ori region are often highly expressed, and a bacterium always maintains multiple copies of Ori. Frequent collisions between DNA polymerase and RNA polymerase would create many DNA strand breaks on the genes, and homologous recombination induced by DNA strand breaks is more likely to occur between genes with similar sequences. Localized recombination could therefore explain why the TSSR values of genes near the Ori region are more similar to each other. The quantity and quality of these TSS in a genome strongly reflect the natural history of a bacterium. We propose that the Genomic-TSSR can be used as a subjective biomarker to represent the phyletic status of a bacterium.
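
The nine-scalar gene representation described in the Results can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the toy sequence is invented, and the pooled proportions in `genomic_tssr` are only one plausible reading of "ratio of distribution", since the abstract does not give the exact formula.

```python
STOP_TRIMERS = ("TAA", "TAG", "TGA")

def tss_counts(gene):
    """Count TAA, TAG and TGA on each of the three reading frames of one gene.

    Returns counts[frame][k], where k indexes STOP_TRIMERS.
    """
    gene = gene.upper()
    counts = [[0, 0, 0] for _ in range(3)]
    for frame in range(3):
        # Step through non-overlapping trimers starting at this frame offset.
        for i in range(frame, len(gene) - 2, 3):
            trimer = gene[i:i + 3]
            if trimer in STOP_TRIMERS:
                counts[frame][STOP_TRIMERS.index(trimer)] += 1
    return counts

def genomic_tssr(genes):
    """Pool the nine counts over all genes and scale them to proportions (assumed definition)."""
    totals = [0] * 9
    for gene in genes:
        c = tss_counts(gene)
        for frame in range(3):
            for k in range(3):
                totals[3 * frame + k] += c[frame][k]
    grand = sum(totals)
    return [t / grand for t in totals] if grand else totals

# Invented toy gene: in-frame TAA at the end, TGA and TAG on the second frame.
toy_gene = "ATGCTGATAGCTTAA"
print(tss_counts(toy_gene))  # [[1, 0, 0], [0, 1, 1], [0, 0, 0]]
```

Genomes could then be compared by a distance between their 9-scalar vectors and clustered into a tree; the choice of distance and clustering method is not stated in the abstract, so it is left out of this sketch.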
Citations: 12
Growth comparison of several Escherichia coli strains exposed to various concentrations of lactoferrin using linear spline regression.
Pub Date : 2012-04-16 DOI: 10.1186/2042-5783-2-5
Camilla Sekse, Jon Bohlin, Eystein Skjerve, Gerd E Vegarud

Background: We wanted to compare growth differences between 13 Escherichia coli strains exposed to various concentrations of the growth inhibitor lactoferrin in two different types of broth (Syncase and Luria-Bertani (LB)). To carry this out, we present a simple statistical procedure that separates microbial growth curves whose differences are due to natural random perturbations from growth curves whose differences are more likely caused by biological differences. Bacterial growth was determined using optical density (OD) data recorded in triplicate at 620 nm for 18 hours for each strain. Each resulting growth curve was divided into three equally spaced intervals. We propose a procedure using linear spline regression with two knots to compute the slope of each interval in the bacterial growth curves. These slopes are subsequently used to estimate a 95% confidence interval based on an appropriate statistical distribution. Slopes outside the confidence interval were considered significantly different from slopes within it. We also demonstrate the use of related but more advanced methods, known collectively as generalized additive models (GAMs), to model growth. In addition to impressive curve-fitting capabilities with corresponding confidence intervals, GAMs allow for the computation of derivatives, i.e. growth rate estimation, with respect to each time point.

Results: The results from our proposed procedure agreed well with the observed data. The results indicated that there were substantial growth differences between the E. coli strains. Most strains exhibited improved growth in the nutrient-rich LB broth compared to Syncase. The inhibiting effect of lactoferrin varied between the different strains. The atypical enteropathogenic aEPEC-2 grew, on average, faster in both broths than the other strains tested, while the enteroinvasive strains EIEC-6 and EIEC-7 grew more slowly. The enterotoxigenic ETEC-5 strain exhibited exceptional growth in Syncase broth but slower growth in LB broth.

Conclusions: Our results do not indicate clear growth differences between pathogroups or between pathogenic and non-pathogenic E. coli.
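
The linear spline regression with two knots described in the Background can be sketched as an ordinary least-squares fit with hinge terms. The sketch rests on assumptions: placing the knots at 6 and 12 hours is inferred from "three equally spaced intervals" over 18 hours, the function name `spline_slopes` is hypothetical, the OD values are synthetic, and the confidence-interval step is not reproduced.

```python
import numpy as np

def spline_slopes(t, y, knots):
    """Fit a continuous piecewise-linear curve and return the slope of each segment."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept, t, and one hinge term max(t - k, 0) per knot.
    columns = [np.ones_like(t), t] + [np.clip(t - k, 0.0, None) for k in knots]
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    # The slope in segment j is the linear coefficient plus the first j hinge coefficients.
    return np.cumsum(beta[1:])

# Synthetic OD curve over 18 hours: slow start, fast growth, then levelling off.
t = np.arange(0, 19)
y = 0.05 + 0.01 * t + 0.09 * np.clip(t - 6, 0, None) - 0.08 * np.clip(t - 12, 0, None)
slopes = spline_slopes(t, y, knots=(6.0, 12.0))
print(np.round(slopes, 3))
```

Fitting every strain and replicate this way yields three slopes per curve, which could then be compared against a reference interval as the abstract describes.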

Citations: 22