
Computers & chemistry: latest articles

Grand metaphors of biology in the genome era
Pub Date : 2002-07-01 DOI: 10.1016/S0097-8485(02)00024-4
Andrzej K Konopka
Computers & chemistry 26(5), pp. 397-401.
Citations: 29
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins
Pub Date : 2002-07-01 DOI: 10.1016/S0097-8485(02)00027-X
Jean-Loup Risler
Computers & chemistry 26(5), pp. 549-551.
Citations: 0
This Is Biology: The Science of the Living World
Pub Date : 2002-07-01 DOI: 10.1016/S0097-8485(02)00025-6
Andrzej K Konopka
Computers & chemistry 26(5), pp. 543-545.
Citations: 6
Applications of recursive segmentation to the analysis of DNA sequences
Pub Date : 2002-07-01 DOI: 10.1016/S0097-8485(02)00010-4
Wentian Li, Pedro Bernaola-Galván, Fatameh Haghighi, Ivo Grosse

Recursive segmentation is a procedure that partitions a DNA sequence into domains with a homogeneous composition of the four nucleotides A, C, G and T. This procedure can also be applied to any sequence converted from a DNA sequence, such as to a binary strong(G+C)/weak(A+T) sequence, to a binary sequence indicating the presence or absence of the dinucleotide CpG, or to a sequence indicating both the base and the codon position information. We apply various conversion schemes in order to address the following five DNA sequence analysis problems: isochore mapping, CpG island detection, locating the origin and terminus of replication in bacterial genomes, finding complex repeats in telomere sequences, and delineating coding and noncoding regions. We find that the recursive segmentation procedure can successfully detect isochore borders, CpG islands, and the origin and terminus of replication, but it needs improvement for detecting complex repeats as well as borders between coding and noncoding regions.
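The abstract does not say which homogeneity criterion drives the cuts. As a minimal sketch, assuming the cut point is chosen to maximize the Jensen-Shannon divergence between the left and right compositions, and that recursion stops below a fixed threshold, the procedure could look like this. The function names, the 0.3-bit threshold and `min_len` are illustrative assumptions, not values taken from the paper.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy (bits) of a symbol-count table covering n symbols."""
    return -sum(c / n * math.log2(c / n) for c in counts.values() if c)


def best_split(seq):
    """Return (position, divergence) of the cut with the largest
    Jensen-Shannon divergence between the left and right parts."""
    n = len(seq)
    total = Counter(seq)
    h_total = entropy(total, n)
    left = Counter()
    best_pos, best_d = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        right = total - left  # Counter subtraction keeps positive counts only
        d = (h_total
             - (i / n) * entropy(left, i)
             - ((n - i) / n) * entropy(right, n - i))
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d


def segment(seq, start=0, threshold=0.3, min_len=4):
    """Recursively cut seq and return the sorted absolute cut positions.

    threshold (bits) and min_len are assumed stopping rules: a piece shorter
    than 2 * min_len, or whose best cut scores below threshold, is left whole.
    """
    if len(seq) < 2 * min_len:
        return []
    pos, d = best_split(seq)
    if pos is None or d < threshold:
        return []
    return (segment(seq[:pos], start, threshold, min_len)
            + [start + pos]
            + segment(seq[pos:], start + pos, threshold, min_len))


def to_strong_weak(dna):
    """Convert DNA to the binary strong (G/C) / weak (A/T) alphabet."""
    return "".join("S" if base in "GC" else "W" for base in dna.upper())


if __name__ == "__main__":
    dna = "ATATATATATAT" + "GCGCGCGCGCGC"
    print(segment(to_strong_weak(dna)))  # [12]
```

Swapping `to_strong_weak` for another conversion (for example a CpG presence/absence encoding) changes which kind of domain border the same recursion picks up.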

Computers & chemistry 26(5), pp. 491-510.
Citations: 105
Deciphering Arabidopsis thaliana gene neighborhoods through bibliographic co-citations
Pub Date : 2002-07-01 DOI: 10.1016/S0097-8485(02)00011-6
A. Louis, H. Chiapello, C. Fabry, E. Ollivier, A. Hénaut

In the framework of genome annotation, the scientific literature is the major source of biological knowledge. The aim of the work described in this paper is to exploit this source of data for the model plant Arabidopsis thaliana. The first step consisted of building a relevant dataset of bibliographic references for plant genomic research. Gene co-citations were then systematically annotated in this reference dataset, starting from the simple idea that genes cited in the same publication probably share some related functional properties. To deal with the problem of synonymous gene names, a gene name reference list was built starting from A. thaliana SwissProt entries. This list was used to build clusters of co-cited genes by a single linkage procedure such that any gene in a given cluster has at least one co-cited partner in the same cluster. Analysis of the clusters demonstrates the biological consistency of this approach, with only very few fortuitous links. As an example, a cluster including genes related to flowering time is described in more detail in the paper. Finally, a graphical representation of each cluster was produced, which provides a convenient way to retrieve the genes (the nodes of the graphs) and the references in which they were co-cited (the edges of the graphs). All the results can be accessed at the URL http://chlora.Igi.infobiogen.fr:1234/bib_arath/.
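Single linkage over co-citations amounts to finding the connected components of a graph whose nodes are genes and whose edges join genes cited in the same reference. The sketch below illustrates that idea with a union-find structure; the function name and the toy gene and reference identifiers are hypothetical, and synonym resolution is assumed to have been done beforehand.

```python
from collections import defaultdict


def cocitation_clusters(papers):
    """Group genes into single-linkage clusters of co-cited genes.

    papers: dict mapping a reference id to the set of (already normalized)
    gene names cited in that reference. Returns a sorted list of sorted
    clusters; genes never co-cited with another gene are left out.
    """
    parent = {}

    def find(gene):
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for genes in papers.values():
        genes = sorted(genes)
        for gene in genes:
            find(gene)  # register every gene, even singletons
        for other in genes[1:]:
            union(genes[0], other)  # one reference links all its genes

    clusters = defaultdict(set)
    for gene in list(parent):
        clusters[find(gene)].add(gene)
    return sorted(sorted(c) for c in clusters.values() if len(c) > 1)


if __name__ == "__main__":
    toy = {  # made-up identifiers, for illustration only
        "ref1": {"geneA", "geneB"},
        "ref2": {"geneB", "geneC"},
        "ref3": {"geneD", "geneE"},
        "ref4": {"geneF"},
    }
    print(cocitation_clusters(toy))
    # [['geneA', 'geneB', 'geneC'], ['geneD', 'geneE']]
```

Here geneA and geneC land in the same cluster although no reference cites both, which is exactly the chaining behavior single linkage allows.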

Computers & chemistry 26(5), pp. 511-519.
Citations: 3
Local weighting schemes for protein multiple sequence alignment
Pub Date : 2002-07-01 DOI: 10.1016/S0097-8485(02)00008-6
Jaap Heringa

This paper describes three weighting schemes for improving the accuracy of progressive multiple sequence alignment methods: (1) global profile pre-processing, to capture for each sequence information about other sequences in a profile before the actual multiple alignment takes place; (2) local pre-processing, which incorporates a new protocol to only use non-overlapping local sequence regions to construct the pre-processed profiles; and (3) local–global alignment, a weighting scheme based on the double dynamic programming (DDP) technique to softly bias global alignment to local sequence motifs. The first two schemes allow the compilation of residue-specific multiple alignment reliability indices, which can be used in an iterative fashion. The schemes have been implemented with associated iterative modes in the PRALINE multiple sequence alignment method, and have been evaluated using the BAliBASE benchmark alignment database. These tests indicate that PRALINE is a toolbox able to build alignments of very high quality. We found that local profile pre-processing raises the alignment quality by 5.5% compared to PRALINE alignments generated under default conditions. Iteration enhances the quality by a further percentage point. The implications of multiple alignment scoring functions and iteration in relation to alignment quality and benchmarking are discussed.

Computers & chemistry 26(5), pp. 459-477.
Citations: 49
Strategies for the identification, the assembly and the classification of integrated biological systems in completely sequenced genomes
Pub Date : 2002-07-01 DOI: 10.1016/S0097-8485(02)00007-4
Yves Quentin, Julie Chabalier, Gwennaele Fichant

The proteins involved in a single biological process may form a stable supra-molecular assembly or interact only transiently. Although the first annotation steps of a complete genome may allow the identification of the different partners, their assembly into a functional system, referred to as an integrated system, is a domain where methodological effort is still needed. Indeed, the knowledge required to assemble the partners of such systems should be explicitly included in annotation software. The availability of a complete genome, and therefore of all the proteins encoded by that genome, motivated the development of automated approaches through the coordinated combination of different bioinformatic methods, allowing the identification of the different partners, their assembly and the classification of the reconstructed systems into functional categories. In this data flow, the identification of the sequence partners represents the principal bottleneck. Here, we describe and compare the results obtained with different classes of methods (blastp2, psi-blast, mast and Meta-meme) applied to the identification in complete genomes of a given family of integrated systems: the ABC transporters. psi-blast appears to significantly outperform motif-based methods, and the results are discussed according to the nature of the proteins and the structure of the sub-families.

Computers & chemistry 26(5), pp. 447-457.
Citations: 11
An incremental algorithm for Z-value computations
Pub Date : 2002-07-01 DOI: 10.1016/S0097-8485(02)00003-7
J.-C. Aude, A. Louis

The Z-value (Comput. Chem. 23 (1999) 333) is an extension of the Z-score that is classically used to compare sets of biological sequences. The Z-value has been successfully used to handle complete genome studies as well as to analyze large sets of proteins. The Z-value computation is based on a Monte Carlo approach to estimate the statistical significance of a Smith & Waterman alignment score. Comet et al. (Comput. Chem. 23 (1999) 333) have shown that, in contrast to the alignment score, the Z-value largely reduces the bias due to the lengths and compositions of the sequences. They also described an estimator of the deviation of Z-values, which we extend in this paper in order to optimize Z-value computation. The incremental algorithm described here provides two characteristics which are usually incompatible: (i) it improves the accuracy of Z-value calculation; (ii) it reduces the time complexity (this algorithm has been named incremental because it iteratively adds random sequences to the Monte Carlo process when needed). Results are presented, originating from the all-by-all comparison of the proteins from Saccharomyces cerevisiae and Escherichia coli.
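To make the Monte Carlo idea concrete, here is a rough sketch of a Z-score style estimate: the observed Smith-Waterman score is compared with the mean and standard deviation of scores obtained against shuffled copies of one sequence, and shuffles are added in batches until the estimate stops moving. The shuffling scheme, the scoring parameters, the batch size and the stopping rule (`tol`) are all assumptions made for illustration; the estimator actually used in the paper is not given in the abstract.

```python
import random
import statistics


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            s = max(0,
                    prev[j - 1] + (match if x == y else mismatch),
                    prev[j] + gap,
                    cur[j - 1] + gap)
            cur.append(s)
            best = max(best, s)
        prev = cur
    return best


def z_value(a, b, batch=20, max_shuffles=200, tol=0.05, seed=0):
    """Return (z, number_of_shuffles_used).

    Shuffles of b are added batch by batch; sampling stops once the estimate
    changes by at most tol (relative) between batches, or at max_shuffles.
    """
    rng = random.Random(seed)
    observed = sw_score(a, b)
    scores = []
    z = None
    while len(scores) < max_shuffles:
        for _ in range(batch):
            shuffled = list(b)
            rng.shuffle(shuffled)  # keeps the length and composition of b
            scores.append(sw_score(a, "".join(shuffled)))
        mu = statistics.fmean(scores)
        sigma = statistics.stdev(scores)
        if sigma == 0:
            # Degenerate case: every shuffle scored the same.
            return (float("inf") if observed > mu else 0.0), len(scores)
        z_new = (observed - mu) / sigma
        if z is not None and abs(z_new - z) <= tol * max(1.0, abs(z_new)):
            return z_new, len(scores)
        z = z_new
    return z, len(scores)


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # arbitrary toy peptide
    z, used = z_value(seq, seq)
    print(f"Z = {z:.1f} after {used} shuffles")
```

Because the shuffled sequences keep the length and composition of the original, the comparison factors those two effects out of the raw score, which is the property the abstract attributes to the Z-value.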

Computers & chemistry 26(5), pp. 402-410.
Citations: 12
Mathematics of Genome Analysis
Pub Date : 2002-07-01 DOI: 10.1016/S0097-8485(02)00026-8
Andrzej K Konopka
Computers & chemistry 26(5), pp. 547-548.
Citations: 1
Automatic RNA secondary structure prediction with a comparative approach
Pub Date : 2002-07-01 DOI: 10.1016/S0097-8485(02)00012-8
Fariza Tahi, Manolo Gouy, Mireille Régnier

This paper presents an algorithm, DCFold, that automatically predicts the common secondary structure of a set of aligned homologous RNA sequences. It is based on the comparative approach. Helices are searched for in one of the sequences, called the ‘target sequence’, and compared to the helices in the other sequences, called the ‘test sequences’. Our algorithm searches the target sequence for palindromes that have a high probability of defining helices that are conserved in the test sequences. This selection of significant palindromes is based on criteria that take into account their length and their mutation rate. A recursive search for helices, starting from these likely ones, is implemented using the ‘divide and conquer’ approach. Indeed, as pseudo-knots are not searched for by DCFold, a selected palindrome (p, p′) makes it possible to divide the initial sequence into two sequences, the internal one and the one resulting from the concatenation of the two external ones. New palindromes can be searched for independently in these subsequences. This algorithm was run on ribosomal RNA sequences and recovered their common secondary structures very efficiently.
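The divide-and-conquer step can be illustrated on a single sequence. The sketch below is a deliberately simplified stand-in, not DCFold itself: it has no test sequences and no mutation-rate criterion, and simply takes the longest stem as the "significant palindrome". After selecting a stem it recurses on the internal region and on the concatenation of the two external regions, which excludes pseudo-knots by construction. All names, the wobble-pair rule, `min_len` and `min_loop` are assumptions.

```python
# Watson-Crick pairs plus G-U wobble pairs (an assumed pairing rule).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def longest_helix(items, min_len=3, min_loop=3):
    """Find the longest stem in items, a list of (original_index, base).

    Returns (i, j, k): items[i + t] pairs with items[j - t] for t in 0..k-1,
    or None if no stem reaches min_len. The innermost pair must enclose at
    least min_loop positions of the original sequence.
    """
    n = len(items)
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            k = 0
            while (i + k < j - k
                   and items[j - k][0] - items[i + k][0] > min_loop
                   and (items[i + k][1], items[j - k][1]) in PAIRS):
                k += 1
            if k >= min_len and (best is None or k > best[2]):
                best = (i, j, k)
    return best


def fold(items, min_len=3, min_loop=3):
    """Return base pairs as (original_i, original_j) tuples, without pseudo-knots."""
    hit = longest_helix(items, min_len, min_loop)
    if hit is None:
        return []
    i, j, k = hit
    pairs = [(items[i + t][0], items[j - t][0]) for t in range(k)]
    internal = items[i + k:j - k + 1]      # region enclosed by the stem
    external = items[:i] + items[j + 1:]   # the two flanks, concatenated
    return (pairs
            + fold(internal, min_len, min_loop)
            + fold(external, min_len, min_loop))


def predict(seq):
    """Convenience wrapper: fold an RNA string given in 5'->3' order."""
    return fold(list(enumerate(seq.upper())))


if __name__ == "__main__":
    print(predict("GGGAAAACCC"))  # [(0, 9), (1, 8), (2, 7)]
```

Carrying the original index alongside each base is what lets the concatenated external region be searched as one sequence while still reporting pairs in the coordinates of the input.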

Computers & chemistry 26(5), pp. 521-530.
Citations: 26