Pub Date: 1997-12-01 DOI: 10.1093/bioinformatics/13.6.565
C Zhang, A K Wong
Motivation: Multiple molecular sequence alignment is among the most important and most challenging tasks in computational biology. The currently used alignment techniques are characterized by great computational complexity, which prevents their wider use. This research is aimed at developing a new technique for efficient multiple sequence alignment.
Approach: The new method is based on genetic algorithms. Genetic algorithms are stochastic approaches for efficient and robust searching. By converting biomolecular sequence alignment into a problem of searching for optimal or near-optimal points in an 'alignment space', a genetic algorithm can be used to find good alignments very efficiently.
Results: Experiments on real data sets have shown that the average computing time of this technique may be two or three orders of magnitude lower than that of a technique based on pairwise dynamic programming, while the alignment qualities are very similar.
Availability: A C program on UNIX has been written to implement the technique. It is available on request from the authors.
Title: A genetic algorithm for multiple molecular sequence alignment.
Journal: Computer applications in the biosciences : CABIOS, 13(6), 565-81.
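The idea of searching an 'alignment space' with a genetic algorithm can be illustrated with a small Python sketch. This is not the authors' C program: the encoding (one set of gap columns per sequence), the sum-of-pairs scoring values, the truncation selection and the names `ga_align`, `mutate` and `crossover` are all assumptions chosen for brevity.

```python
import random

def build_row(seq, gap_cols, width):
    """Lay the residues of seq across `width` columns, writing '-' in gap_cols."""
    residues = iter(seq)
    return "".join("-" if c in gap_cols else next(residues) for c in range(width))

def sum_of_pairs(rows, match=1, mismatch=-1, gap=-2):
    """Score an alignment column by column over all pairs of rows (assumed scores)."""
    score = 0
    for col in zip(*rows):
        for i in range(len(col)):
            for j in range(i + 1, len(col)):
                a, b = col[i], col[j]
                if a == "-" and b == "-":
                    continue  # gap against gap costs nothing
                score += gap if "-" in (a, b) else (match if a == b else mismatch)
    return score

def random_individual(seqs, width, rng):
    """One point in alignment space: a set of gap columns for every sequence."""
    return [frozenset(rng.sample(range(width), width - len(s))) for s in seqs]

def mutate(ind, width, rng):
    """Move one gap of one randomly chosen sequence to a currently free column."""
    child = list(ind)
    k = rng.randrange(len(child))
    gaps = set(child[k])
    free = [c for c in range(width) if c not in gaps]
    if gaps and free:
        gaps.remove(rng.choice(sorted(gaps)))
        gaps.add(rng.choice(free))
    child[k] = frozenset(gaps)
    return child

def crossover(a, b, rng):
    """Uniform crossover: inherit each sequence's gap set from one of two parents."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def ga_align(seqs, extra=2, pop_size=40, generations=200, seed=0):
    """Evolve gap placements; return the best rows found and their score."""
    rng = random.Random(seed)
    width = max(len(s) for s in seqs) + extra

    def fitness(ind):
        return sum_of_pairs([build_row(s, g, width) for s, g in zip(seqs, ind)])

    pop = [random_individual(seqs, width, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # keep the better half unchanged
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite), rng), width, rng)
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    best = max(pop, key=fitness)
    return [build_row(s, g, width) for s, g in zip(seqs, best)], fitness(best)
```

Calling `ga_align(["ACGT", "ACT", "AGT"])` returns equal-length gapped rows and their sum-of-pairs score. The result is a good alignment found by stochastic search, not a guaranteed optimum.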
Pub Date: 1997-12-01 DOI: 10.1093/bioinformatics/13.6.601
J Gouzy, P Eugéne, E A Greene, D Kahn, F Corpet
Motivation: To extract the maximum possible information from a set of protein sequences, its modular organization must be known and clearly displayed. This is important both for structural and functional analysis.
Results: This paper presents an algorithm and a graphical interface called XDOM which performs a systematic analysis of the modular organization of any set of protein sequences. The algorithm is an automatic method to identify putative domains from sequence comparisons. The graphical tool displays each protein as a set of linked boxes corresponding to its domains. The method has been tested on a family of bacterial proteins and on whole genomes. It is currently applied to the complete SWISS-PROT database to build the PRODOM database.
Availability: XDOM is available free of charge by anonymous ftp: ftp://ftp.toulouse.inra.fr/pub/xdom. The ProDom database can be consulted at http://protein.toulouse.inra.fr/prodom.html.
Title: XDOM, a graphical tool to analyse domain arrangements in any set of protein sequences.
Journal: Computer applications in the biosciences : CABIOS, 13(6), 601-8.
Pub Date: 1997-12-01 DOI: 10.1093/bioinformatics/13.6.625
J Stoye, V Moulton, A W Dress
Motivation: DCA is a new computer program for multiple sequence alignment which utilizes a 'divide-and-conquer' type of heuristic approach.
Availability: The algorithm is freely available from http://bibiserv.TechFak.Uni-Bielefeld.DE/dca/.
Title: DCA: an efficient implementation of the divide-and-conquer approach to simultaneous multiple sequence alignment.
Journal: Computer applications in the biosciences : CABIOS, 13(6), 625-6.
Pub Date: 1997-12-01 DOI: 10.1093/bioinformatics/13.6.583
J Gorodkin, L J Heyer, S Brunak, G D Stormo
Motivation: We extend the standard 'Sequence Logo' method of Schneider and Stephens (Nucleic Acids Res., 18, 6097-6100, 1990) to incorporate prior frequencies on the bases, allow for gaps in the alignments, and indicate the mutual information of base-paired regions in RNA.
Results: Given an alignment of RNA sequences with the base pairings indicated, the program will calculate the information at each position, including the mutual information of the base pairs, and display the results in a 'Structure Logo'. Alignments without base pairing can also be displayed in a 'Sequence Logo', while still allowing gaps and incorporating prior frequencies if desired.
Availability: The code is available from, and an Internet server can be used to run the program at, http://www.cbs.dtu.dk/gorodkin/appl/slogo.html.
Title: Displaying the information contents of structural RNA alignments: the structure logos.
Journal: Computer applications in the biosciences : CABIOS, 13(6), 583-6.
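The two quantities named in the abstract, information at a position relative to prior base frequencies and mutual information between paired positions, can be illustrated with a short Python sketch. It uses the common relative-entropy and mutual-information formulas and is not the authors' program. The function names are hypothetical, and gap characters are simply skipped, which may differ from how the paper treats gaps.

```python
from collections import Counter
from math import log2

def column_information(column, prior):
    """Relative entropy in bits of observed base frequencies against priors.

    `column` is one alignment column as a string; `prior` maps base -> frequency.
    Gaps ('-') are ignored here, which is a simplifying assumption.
    """
    bases = [b for b in column if b != "-"]
    n = len(bases)
    if n == 0:
        return 0.0
    counts = Counter(bases)
    return sum((c / n) * log2((c / n) / prior[b]) for b, c in counts.items())

def mutual_information(col_i, col_j):
    """Mutual information in bits between two columns, e.g. a base-paired site."""
    pairs = [(a, b) for a, b in zip(col_i, col_j) if a != "-" and b != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    # p(a,b) / (p(a) p(b)) simplifies to c * n / (left[a] * right[b])
    return sum((c / n) * log2(c * n / (left[a] * right[b]))
               for (a, b), c in joint.items())

uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25}
print(column_information("AAAA", uniform))   # fully conserved column: 2.0 bits
print(mutual_information("GCAU", "CGUA"))    # perfectly covarying pair: 2.0 bits
```

A fully conserved base pair (for example all G opposite all C) carries high information in each column but zero mutual information, which is why the two measures are reported separately.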
Pub Date: 1997-12-01 DOI: 10.1093/bioinformatics/13.6.623
A S Law, D W Burt
Motivation: MacBOB (Macintosh BLAST Output Browser) and MacBOB Filter are two Macintosh-based applications that greatly simplify the viewing of BLAST and FASTA search results files.
Availability: The programs can be obtained via anonymous ftp from ftp.ri.bbsrc.ac.uk in the directory /pub/software/MacBOB, or via WWW from the Roslin Institute Home Page (http://www.ri.bbsrc.ac.uk/).
Title: Two applications to facilitate the viewing of database search result files on the Macintosh.
Journal: Computer applications in the biosciences : CABIOS, 13(6), 623-4.
Pub Date: 1997-12-01 DOI: 10.1093/bioinformatics/13.6.621
Thomas Sicheritz-Pontén
The Tcl/Tk (Ousterhout, 1994) scripting language has proved to be a powerful tool for building programs for the analysis of molecular sequence data. However, typical 'biological' operations, like the translation of a nucleotide sequence to the corresponding amino acid sequence, or the calculation of the G + C content in different codon positions in a 50 kbp cosmid sequence, are performed far too slowly with the standard Tcl commands. To circumvent this problem, we have constructed a library that extends the Tcl/Tk language by adding primitive operators suited for sequence analysis, implemented in C. Additional commands related to molecular biology, written in Tcl, are included. Because it is built as a shared library, BioWish is easy to use and does not require modification of the Tcl/Tk source code. BioWish can be obtained from the WWW site http://evolution.bmc.uu.se/~thomas/moLlinux. The distribution consists of a single C source file which should compile without modification on all Unix systems capable of dynamic loading. It requires Tcl 7.5/Tk 4.1 or higher. On systems where dynamic loading is not available, BioWish can be compiled as a standalone Tk interpreter.
Title: BioWish: a molecular biology command extension to Tcl/Tk
Journal: Computer applications in the biosciences : CABIOS, 13(6), 621-2.
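The two 'biological' operations used as examples in the abstract, translation and G + C content by codon position, are sketched below in Python to show what such primitives compute. These are not BioWish commands (the abstract does not list them); the function names are hypothetical and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def gc_by_codon_position(dna):
    """Return the G + C fraction at codon positions 1, 2 and 3 (frame 1)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    fractions = []
    for pos in range(3):
        bases = dna[pos:usable:3]
        fractions.append(sum(b in "GC" for b in bases) / len(bases) if bases else 0.0)
    return fractions

print(translate("ATGGCCTAA"))             # MA*
print(gc_by_codon_position("ATGGCCTAA"))  # one third, one third, two thirds
```

Both functions make a single pass over the sequence. Per-base loops of this kind are the sort of work the abstract describes as too slow in interpreted Tcl and therefore moved into C.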
Pub Date: 1997-12-01 DOI: 10.1093/bioinformatics/13.6.619
H Hegyi, J M Lai, P Bork
A Sequence Alerting Server with a WWW interface is described. It notifies users who have submitted query sequences for database searches when new entries related to their query appear in protein databases.
Availability: The server address is http://www.bork.embl-heidelberg.de/alerting/.
Title: The Sequence Alerting Server--a new WEB server.
Journal: Computer applications in the biosciences : CABIOS, 13(6), 619-20.
Pub Date: 1997-12-01 DOI: 10.1093/bioinformatics/13.6.593
H Kato, Y Takahashi
This paper discusses the implementation of a three-dimensional (3D) structure motif search of proteins. Each protein structure is represented by a set of secondary structure elements (SSEs), namely alpha-helix segments and beta-strand segments. Every SSE is further reduced to a two-node graph that consists of the starting amino acid residue, the ending residue and a pseudo-bond between them. The searching algorithm is based on a graph-theoretical clique-finding algorithm that has been used for 3D substructure searching in small organic molecules. The program SS3D-P2 was validated using proteins that have well-known 3D motifs, and it correctly found the Greek key motif, which consists of four anti-parallel beta strands, within the eye lens protein crystallin. The program was also successfully applied to searching for a more complex 3D motif, the TIM-type beta-barrel motif, in a protein structure database from the Protein Data Bank.
Title: SS3D-P2: a three dimensional substructure search program for protein motifs based on secondary structure elements.
Journal: Computer applications in the biosciences : CABIOS, 13(6), 593-600.
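Clique-based substructure search over SSEs can be illustrated with a small Python sketch. It is not SS3D-P2 itself: the abstract does not give the geometric constraints used, so the sketch assumes that two SSE pairs correspond when the distances between their midpoints agree within a tolerance. The tuple layout, function names and tolerance are hypothetical, and a plain Bron-Kerbosch search stands in for whatever clique algorithm the program uses.

```python
from itertools import combinations
from math import dist

# An SSE is (kind, start_xyz, end_xyz); kind is "H" (helix) or "E" (strand).
# The two endpoints play the role of the two-node graph with a pseudo-bond.

def midpoint(sse):
    _, a, b = sse
    return tuple((x + y) / 2 for x, y in zip(a, b))

def compatible(q1, q2, t1, t2, tol):
    """Assumed criterion: inter-SSE midpoint distances agree within tol."""
    return abs(dist(midpoint(q1), midpoint(q2)) - dist(midpoint(t1), midpoint(t2))) <= tol

def correspondence_graph(query, target, tol=1.0):
    """Nodes pair a query SSE with a target SSE of the same kind."""
    nodes = [(i, j) for i, q in enumerate(query)
             for j, t in enumerate(target) if q[0] == t[0]]
    adj = {n: set() for n in nodes}
    for (i1, j1), (i2, j2) in combinations(nodes, 2):
        if i1 != i2 and j1 != j2 and compatible(query[i1], query[i2],
                                                target[j1], target[j2], tol):
            adj[(i1, j1)].add((i2, j2))
            adj[(i2, j2)].add((i1, j1))
    return adj

def max_clique(adj):
    """Bron-Kerbosch without pivoting; returns one largest clique."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        for v in list(p):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return best

def find_motif(query, target, tol=1.0):
    """Return sorted (query_index, target_index) pairs if the whole motif matches."""
    clique = max_clique(correspondence_graph(query, target, tol))
    return sorted(clique) if len(clique) == len(query) else None
```

A clique whose size equals the number of query SSEs is a motif hit. Because only distances are compared, this sketch cannot distinguish a motif from its mirror image, and a practical search would need additional angular or ordering constraints.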