Latest articles from Computer applications in the biosciences : CABIOS

An algorithm for the detection of surface-active alpha helices with the potential to anchor proteins at the membrane interface.
Pub Date : 1997-02-01 DOI: 10.1093/bioinformatics/13.1.99
M G Roberts, D A Phoenix, A R Pewsey

Motivation: Surface-active peptides are amphiphilic in nature and have been shown to have the potential to interact at the membrane interface, possibly by lying parallel to the membrane surface. Present methodology for the identification of these helices uses a fixed window size, is based on a two-dimensional sum of hydrophobicity vectors and gives no measure of the statistical significance for any region identified as amphiphilic. Identification of weakly surface-active structures is difficult and here we have attempted to remedy this by introducing an algorithm which considers three-dimensional geometries and variable window size.

Results: A new measure of membrane-interactive potential is proposed, called the depth-weighted inserted hydrophobicity (DWIH), which is based on the sequestration of hydrophobic residues within a hydrophobic compartment, such as that produced by a membrane bilayer. A statistical significance for this measure has been derived using Monte Carlo techniques. The algorithm is applied to a set of proteins which are thought to anchor to the membrane via C-terminal amphiphilic alpha helices. The DWIH measure appears to allow the identification of this category of membrane-interactive helices which lie near the boundary of the hydrophobic moment plot and which have previously been hard to classify.
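
The abstract does not define DWIH, so the following is only a minimal sketch of the general idea of attaching a Monte Carlo significance to an amphiphilicity score. It swaps in the classic two-dimensional hydrophobic moment as the score, and `TOY_SCALE` is a made-up hydrophobicity scale for illustration, not a published one. The names `hydrophobic_moment` and `monte_carlo_p_value` are hypothetical.

```python
import math
import random

# Toy hydrophobicity scale: an assumption for illustration only, not a published scale.
TOY_SCALE = {aa: 1.0 for aa in "AILMFVWC"}
TOY_SCALE.update({aa: -1.0 for aa in "DEKRNQH"})
TOY_SCALE.update({aa: 0.0 for aa in "GPSTY"})


def hydrophobic_moment(segment, angle_deg=100.0):
    """Per-residue magnitude of the 2-D vector sum of hydrophobicities.

    Residues are placed around a helical wheel at 100 degrees per residue,
    the usual value for an alpha helix. This stands in for DWIH, which the
    abstract does not specify.
    """
    x = y = 0.0
    for i, aa in enumerate(segment):
        theta = math.radians(angle_deg * i)
        x += TOY_SCALE[aa] * math.cos(theta)
        y += TOY_SCALE[aa] * math.sin(theta)
    return math.hypot(x, y) / len(segment)


def monte_carlo_p_value(segment, n_shuffles=2000, seed=0):
    """Estimate how often a shuffled segment scores at least as high as the original."""
    rng = random.Random(seed)
    observed = hydrophobic_moment(segment)
    residues = list(segment)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # keeps composition, destroys residue order
        if hydrophobic_moment(residues) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    print(monte_carlo_p_value("LKKLLKLLKKLLKL"))
```

Shuffling within the window keeps amino acid composition fixed, so the p-value reflects the arrangement of residues rather than their overall hydrophobicity; whether the authors used this null model is not stated in the abstract.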

Citations: 16
EMBL sequence submission system--an object-oriented approach to developing interactive data collection systems through the WWW.
Pub Date : 1997-02-01 DOI: 10.1093/bioinformatics/13.1.55
B Shomer

Motivation: To enable a new way of submitting sequence information to the EMBL nucleotide database through the WWW. This process of data submission is long and complex, and calls for efficient and user-friendly mechanisms for collection and validation of information.

Results: Described here is a generic, object-oriented data-submission system that is being used for the EMBL database, but can easily be tailored to serve several data-submission schemes with a relatively short development and implementation time. The program provides the user with a friendly interface that breaks the complex task into smaller, more manageable tasks and, at the same time, acts as a pre-filter, scanning for errors online.

Citations: 2
Software for the analysis of DNA sequence elements of transcription.
Pub Date : 1997-02-01 DOI: 10.1093/bioinformatics/13.1.89
K Frech, K Quandt, T Werner

The detection of transcription control elements in DNA sequences became both more important and more complicated with the completion of the first full genome sequencing projects. Rapid evaluation of potential regulatory elements in large amounts of sequence data requires specific methods, preferably available as user-friendly computer programs. However, many more algorithms and methods have been published than programs are available, creating problems for scientists who try to select an appropriate method for their needs from the literature. The Internet provides worldwide and relatively easy access to computer software if the user knows where to look. One of the major remaining problems is how to find the appropriate software. We have compiled a guide detailing where software is available and what is to be expected in terms of interface and data compatibility with other programs. We also show results obtained with each program for several examples. The summarized features of each program should allow scientists to quickly select the method of their choice and inform them where to download the software.

Citations: 47
PromFD 1.0: a computer program that predicts eukaryotic pol II promoters using strings and IMD matrices.
Pub Date : 1997-02-01 DOI: 10.1093/bioinformatics/13.1.29
Q K Chen, G Z Hertz, G D Stormo

Motivation: A large number of new DNA sequences with virtually unknown functions are generated as the Human Genome Project progresses. Therefore, it is essential to develop computer algorithms that can predict the functionality of DNA segments according to their primary sequences, including algorithms that can predict promoters. Although several promoter-predicting algorithms are available, they have high false-positive rates, and the rate of promoter detection needs further improvement.

Results: PromFD, a computer program to recognize vertebrate RNA polymerase II promoters, has been developed. Both vertebrate promoter and non-promoter sequences are used in the analysis. The promoters are obtained from the Eukaryotic Promoter Database and the non-promoter sequences from the GenBank sequence databank; each collection is divided into a training set and a test set. The first step is to search, among all possible permutations, for patterns of strings 5-10 bp long that are significantly over-represented in the promoter set. The program also searches for IMD (Information Matrix Database) matrices that have a significantly higher presence in the promoter set. The results of the searches are stored in the PromFD database, and PromFD scores input DNA sequences according to their content of the database entries. PromFD predicts promoters: their locations and, if found, the locations of potential TATA boxes. The program can detect 71% of promoters in the training set with a false-positive rate of under 1 in every 13,000 bp, and 47% of promoters in the test set with a false-positive rate of under 1 in every 9,800 bp. PromFD uses a new approach, and its false-positive rate is lower than that of other available promoter recognition algorithms. The source code for PromFD is written in C++.
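
The string-search step described above can be sketched roughly as follows. This is an illustration only, not PromFD's actual procedure: it uses a simple frequency ratio with pseudocounts in place of the significance test mentioned in the abstract, and the function names, thresholds and toy sequences are all assumptions.

```python
from collections import Counter


def kmer_presence(seqs, k):
    """For each k-mer, count the sequences in which it occurs at least once."""
    counts = Counter()
    for s in seqs:
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def over_represented(promoters, background, k=5, min_ratio=2.0, pseudo=1.0):
    """Return k-mers whose presence frequency in promoters exceeds the background.

    A ratio threshold with pseudocounts is used here for simplicity; it is a
    stand-in for a proper statistical test.
    """
    in_promoters = kmer_presence(promoters, k)
    in_background = kmer_presence(background, k)
    selected = {}
    for kmer, n in in_promoters.items():
        freq_p = (n + pseudo) / (len(promoters) + 2 * pseudo)
        freq_b = (in_background.get(kmer, 0) + pseudo) / (len(background) + 2 * pseudo)
        ratio = freq_p / freq_b
        if ratio >= min_ratio:
            selected[kmer] = ratio
    return selected


if __name__ == "__main__":
    # Toy sequences, invented for the example.
    promoters = ["GGTATAAAGG", "CCTATAAACC", "ATATAAAGT"]
    background = ["GGGCGCGGGC", "CCGCGGCCGG", "ACGTACGTAC"]
    hits = over_represented(promoters, background, k=5)
    print(sorted(hits, key=hits.get, reverse=True)[:3])
```

Running the same function for k from 5 to 10 would cover the string lengths named in the abstract; scoring a new sequence would then amount to summing the contributions of the selected patterns it contains.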

Citations: 40
Recognition of 3'-processing sites of human mRNA precursors.
Pub Date : 1997-02-01 DOI: 10.1093/bioinformatics/13.1.23
A A Salamov, V V Solovyev

We have developed a computer program, POLYAH, and an algorithm for the identification of 3'-processing sites of human mRNA precursors. The algorithm is based on a linear discriminant function (LDF) trained to discriminate real poly(A) signal regions from other regions of human genes that contain an AATAAA sequence which is most likely non-functional. Various significant contextual characteristics of the sequences surrounding AATAAA signals were used as the parameters of the LDF. The accuracy of the method was estimated on a set of 131 poly(A) regions and 1466 regions of human genes having the AATAAA sequence. When the threshold was set to predict 86% of poly(A) regions correctly, a specificity of 51% and a correlation coefficient of 0.62 were achieved. The precision of this approach is better than that of other methods, and it has been tested on a larger data set. POLYAH can be used through the World Wide Web (at the Gene-Finder home page: http://dot.imgen.bcm.tmc.edu:9331/gene-finder/gf.html) or by sending files with uncharacterized human sequences to the University of Houston or Weizmann Institute of Science e-mail servers.
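
The reported figures can be cross-checked with a short calculation. The sketch below assumes that "specificity" means TP / (TP + FP), the convention common in gene-prediction papers of that period (what is now usually called precision), and that the correlation coefficient is the Matthews correlation coefficient; neither definition is stated in the abstract. The helper names are hypothetical.

```python
import math


def confusion_from_rates(n_pos, n_neg, sensitivity, specificity):
    """Rebuild an approximate confusion matrix from the reported rates.

    Assumes specificity = TP / (TP + FP); this is an interpretation, not
    something the abstract spells out.
    """
    tp = round(sensitivity * n_pos)
    fn = n_pos - tp
    fp = round(tp * (1 - specificity) / specificity)
    tn = n_neg - fp
    return tp, fp, tn, fn


def matthews_cc(tp, fp, tn, fn):
    """Matthews correlation coefficient of a 2x2 confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    tp, fp, tn, fn = confusion_from_rates(131, 1466, 0.86, 0.51)
    print(tp, fp, tn, fn, round(matthews_cc(tp, fp, tn, fn), 3))
```

Under these assumptions the coefficient comes out a little above 0.62, which is consistent with the value quoted in the abstract once rounding of the percentages is taken into account.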

Citations: 59
Regional assignment of genetic markers using a somatic cell hybrid panel: a WWW interactive program available for the pig genome.
Pub Date : 1997-02-01 DOI: 10.1093/bioinformatics/13.1.69
C Chevalet, J Gouzy, M SanCristobal-Gaudy

Motivation: Quick and easy gene mapping by the use of a panel of cytogenetically characterized somatic cell hybrids is possible, even if some discordant experimental results arise.

Results: An interactive program is proposed and is made available on a WWW site to users of a somatic cell hybrid panel. Assignments to chromosomes and subchromosomal regions are based on likelihood calculations and Bayes' theorem, and a confidence level is provided. The method is illustrated in the case of the pig genome.
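
As a rough illustration of how likelihood calculations and Bayes' theorem can tolerate discordant results, the sketch below computes a posterior probability for each candidate region. It assumes a uniform prior, independent hybrids and a single symmetric error rate; the panel layout, the region names and the `error_rate` value are invented for the example and are not taken from the paper.

```python
def region_posteriors(panel, observed, error_rate=0.05):
    """Posterior probability of each region given marker results on a hybrid panel.

    panel:    {region: [True if the hybrid retains the region, ...]}
    observed: [True if the marker was detected in the hybrid, ...]
    A uniform prior over regions is assumed.
    """
    likelihoods = {}
    for region, retained in panel.items():
        like = 1.0
        for has_region, positive in zip(retained, observed):
            # A concordant result is likely; a discordant one has probability error_rate.
            like *= (1 - error_rate) if has_region == positive else error_rate
        likelihoods[region] = like
    total = sum(likelihoods.values())
    return {region: like / total for region, like in likelihoods.items()}


if __name__ == "__main__":
    # Hypothetical panel of five hybrids and three candidate regions.
    panel = {
        "1p": [True, False, True, False, True],
        "1q": [True, True, False, False, False],
        "2p": [False, False, True, True, False],
    }
    observed = [True, False, True, False, False]  # one result discordant with 1p
    posteriors = region_posteriors(panel, observed)
    best = max(posteriors, key=posteriors.get)
    print(best, round(posteriors[best], 3))
```

The posterior of the best region plays the role of the confidence level mentioned in the abstract: a single discordant hybrid lowers it but does not prevent an assignment.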

Citations: 65
A polynomial-time algorithm for a class of protein threading problems.
Pub Date : 1996-12-01 DOI: 10.1093/bioinformatics/12.6.511
Y Xu, E C Uberbacher

This paper presents an algorithm for constructing an optimal alignment between a three-dimensional protein structure template and an amino acid sequence. A protein structure template is given as a sequence of amino acid residue positions in three-dimensional space, along with an array of physical properties attached to each position; these residue positions are sequentially grouped into a series of core secondary structures (central helices and beta sheets). In addition to match scores and gap penalties, as in a traditional sequence-sequence alignment problem, the quality of a structure-sequence alignment is also determined by interaction preferences among amino acids aligned with structure positions that are spatially close (we call these 'long-range interactions'). Although it is known that constructing such a structure-sequence alignment in the most general form is NP-hard, our algorithm runs in polynomial time when restricted to structures with a 'modest' number of long-range amino acid interactions. In the current work, long-range interactions are limited to interactions between amino acids from different core secondary structures. Dividing the series of core secondary structures into two subseries creates a cut set of long-range interactions. If we use N, M and C to represent the size of an amino acid sequence, the size of a structure template, and the maximum cut size of long-range interactions, respectively, the algorithm finds an optimal structure-sequence alignment in O(21^C NM) time, a polynomial function of N and M when C = O(log(N + M)). When running on structure-sequence alignment problems without long-range interactions, i.e. C = 0, the algorithm achieves the same asymptotic computational complexity as the Smith-Waterman sequence-sequence alignment algorithm.
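
For reference, the C = 0 baseline named in the abstract, Smith-Waterman local alignment, fills an N by M table with constant work per cell. The sketch below is a plain textbook version with a linear gap penalty and arbitrary scoring values chosen for illustration; it is not the threading algorithm itself, which additionally tracks residue assignments across each cut of long-range interactions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Runs in O(len(a) * len(b)) time and keeps only two rows in memory.
    The scoring values are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
```

In the bound quoted above, setting C = 0 makes the exponential factor equal to 1, which leaves exactly this table-filling cost.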

Citations: 19
Image and volume data rotation with 1- and 3-pass algorithms.
Pub Date : 1996-12-01 DOI: 10.1093/bioinformatics/12.6.549
L Tosoni, S Lanzavecchia, P L Bellon

Three different implementations of the 3-pass algorithm of image and volume data rotation are illustrated and discussed. The three protocols use interpolation in real domain, with a peculiar implementation of the Shannon reconstruction, or phase shifts in Fourier domain. Accuracy and speed of the three methods are compared with corresponding values obtained with a 1-pass method. The results indicate that for low or moderate accuracy, 1-pass is more convenient than 3-pass rotation for both accuracy and speed. Very accurate rotations can be obtained in reasonable time if all steps of 3-pass rotation are performed in the Fourier domain.
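
The abstract does not spell out the decomposition behind the 3-pass method. A common basis for such methods, assumed here, is the factorisation of a 2-D rotation into three shears, each of which moves data along a single axis and can therefore be applied by 1-D interpolation or by a phase shift in the Fourier domain. The sketch below only verifies that matrix identity numerically; it does not resample any image, and the function names are hypothetical.

```python
import math


def matmul2(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def three_shear_rotation(theta):
    """Compose x-shear, y-shear, x-shear; the product equals a rotation by theta."""
    a = -math.tan(theta / 2)  # horizontal shear factor, used twice
    b = math.sin(theta)       # vertical shear factor
    shear_x = [[1.0, a], [0.0, 1.0]]
    shear_y = [[1.0, 0.0], [b, 1.0]]
    return matmul2(matmul2(shear_x, shear_y), shear_x)


def rotation(theta):
    """Standard counter-clockwise rotation matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


if __name__ == "__main__":
    theta = math.radians(30)
    print(three_shear_rotation(theta))
    print(rotation(theta))
```

Because tan(theta / 2) grows without bound as theta approaches 180 degrees, implementations of this kind usually handle large angles by first applying an exact rotation by a multiple of 90 degrees.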

Citations: 6
LALNVIEW: a graphical viewer for pairwise sequence alignments.
Pub Date : 1996-12-01 DOI: 10.1093/bioinformatics/12.6.507
L Duret, E Gasteiger, G Perrière

LALNVIEW is a graphical program for visualising local alignments between two sequences (protein or nucleic acids). Sequences are represented by coloured rectangles to give an overall picture of their similarities. LALNVIEW can display sequence features (exon, intron, active site, domain, propeptide, etc.) along with the alignment. When using LALNVIEW through our Web servers, sequence features are automatically extracted from database annotations (SWISS-PROT, GenBank, EMBL or HOVERGEN) and displayed with the alignment. LALNVIEW is a useful tool for analysing pairwise sequence alignments and for making the link between sequence homology and what is known about the structure or function of sequences. LALNVIEW executables for UNIX, Macintosh and PC computers are freely available from our server (http://expasy.hcuge.ch/sprot/lalnview.html).

Citations: 63
Selection of amino acid parameters for Fourier transform-based analysis of proteins.
Pub Date : 1996-12-01 DOI: 10.1093/bioinformatics/12.6.553
J Lazović

Fourier analysis of the parametric profile of a sequence for the detection and localization of the structural motifs that are characteristic for biologically related proteins has been proposed. In order to select parameters that are most appropriate for this analysis, the informational capacity of 226 physicochemical, thermodynamic, structural and statistical amino acid parameters was analyzed. Based on the results, obtained for the four functionally unrelated protein model groups (lysozyme c, HIV-1 gp120, tubulin and tau proteins, and steroid hormone receptors), the electron-ion interaction potential has been selected as the unique amino acid property that can be used in Fourier transform-based analysis of proteins, independently of their biological function.
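
As a rough illustration of what Fourier analysis of a parametric profile involves, the sketch below maps each residue to a numeric value, removes the mean, and computes a discrete Fourier power spectrum. It is a minimal sketch and not the author's implementation: the function names, the toy sequence and the `PLACEHOLDER_SCALE` values are all assumptions made for illustration, and the scale is not the published electron-ion interaction potential.

```python
import cmath

# Hypothetical placeholder values for illustration only; these are NOT the
# published electron-ion interaction potential (EIIP) values.
PLACEHOLDER_SCALE = {"A": 0.04, "G": 0.01, "L": 0.00, "K": 0.04, "D": 0.13}


def numeric_profile(sequence, scale):
    """Map each residue of the sequence to its parameter value."""
    return [scale[residue] for residue in sequence]


def power_spectrum(profile):
    """Return |X(k)|^2 for k = 1 .. N//2 of the mean-centred profile."""
    n = len(profile)
    mean = sum(profile) / n
    centred = [value - mean for value in profile]
    spectrum = []
    for k in range(1, n // 2 + 1):
        # Direct evaluation of the discrete Fourier transform at index k.
        coefficient = sum(
            value * cmath.exp(-2j * cmath.pi * k * index / n)
            for index, value in enumerate(centred)
        )
        spectrum.append(abs(coefficient) ** 2)
    return spectrum


def dominant_frequency(sequence, scale):
    """Frequency (cycles per residue) of the largest spectral peak."""
    spectrum = power_spectrum(numeric_profile(sequence, scale))
    peak_k = max(range(len(spectrum)), key=spectrum.__getitem__) + 1
    return peak_k / len(sequence)


if __name__ == "__main__":
    toy_sequence = "DADADADADADA"  # made-up sequence with period 2
    print(dominant_frequency(toy_sequence, PLACEHOLDER_SCALE))
```

For a profile that alternates between two values, as in the toy sequence, the spectrum has a single peak at a frequency of 0.5 cycles per residue. Comparing such peaks across related proteins would require a real parameter scale in place of the placeholder.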

Citations: 40