减少空间序列对齐。

Computer applications in the biosciences : CABIOS Pub Date : 1997-02-01 DOI:10.1093/bioinformatics/13.1.45

J A Grice, R Hughey, D Speck

{"title":"减少空间序列对齐。","authors":"J A Grice, R Hughey, D Speck","doi":"10.1093/bioinformatics/13.1.45","DOIUrl":null,"url":null,"abstract":"MOTIVATION Sequence alignment is the problem of finding the optimal character-by-character correspondence between two sequences. It can be readily solved in O(n2) time and O(n2) space on a serial machine, or in O(n) time with O(n) space per O(n) processing elements on a parallel machine. Hirschberg's divide-and-conquer approach for finding the single best path reduces space use by a factor of n while inducing only a small constant slowdown to the serial version. RESULTS This paper presents a family of methods for computing sequence alignments with reduced memory that are well suited to serial or parallel implementation. Unlike the divide-and-conquer approach, they can be used in the forward-backward (Baum-Welch) training of linear hidden Markov models, and they avoid data-dependent repartitioning, making them easier to parallelize. The algorithms feature, for an arbitrary integer L, a factor proportional to L slowdown in exchange for reducing space requirement from O(n2) to O(n1 square root of n). A single best path member of this algorithm family matches the quadratic time and linear space of the divide-and-conquer algorithm. Experimentally, the O(n1.5)-space member of the family is 15-40% faster than the O(n)-space divide-and-conquer algorithm.","PeriodicalId":77081,"journal":{"name":"Computer applications in the biosciences : CABIOS","volume":"13 1","pages":"45-53"},"PeriodicalIF":0.0000,"publicationDate":"1997-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/bioinformatics/13.1.45","citationCount":"57","resultStr":"{\"title\":\"Reduced space sequence alignment.\",\"authors\":\"J A Grice, R Hughey, D Speck\",\"doi\":\"10.1093/bioinformatics/13.1.45\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"MOTIVATION Sequence alignment is the problem of finding the optimal character-by-character correspondence between two sequences. It can be readily solved in O(n2) time and O(n2) space on a serial machine, or in O(n) time with O(n) space per O(n) processing elements on a parallel machine. Hirschberg's divide-and-conquer approach for finding the single best path reduces space use by a factor of n while inducing only a small constant slowdown to the serial version. RESULTS This paper presents a family of methods for computing sequence alignments with reduced memory that are well suited to serial or parallel implementation. Unlike the divide-and-conquer approach, they can be used in the forward-backward (Baum-Welch) training of linear hidden Markov models, and they avoid data-dependent repartitioning, making them easier to parallelize. The algorithms feature, for an arbitrary integer L, a factor proportional to L slowdown in exchange for reducing space requirement from O(n2) to O(n1 square root of n). A single best path member of this algorithm family matches the quadratic time and linear space of the divide-and-conquer algorithm. Experimentally, the O(n1.5)-space member of the family is 15-40% faster than the O(n)-space divide-and-conquer algorithm.\",\"PeriodicalId\":77081,\"journal\":{\"name\":\"Computer applications in the biosciences : CABIOS\",\"volume\":\"13 1\",\"pages\":\"45-53\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1997-02-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1093/bioinformatics/13.1.45\",\"citationCount\":\"57\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer applications in the biosciences : CABIOS\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/bioinformatics/13.1.45\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer applications in the biosciences : CABIOS","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/bioinformatics/13.1.45","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 57

摘要

动机:序列对齐是在两个序列之间找到最佳的逐个字符对应的问题。在串行机器上，它可以在O(n2)时间和O(n2)空间中轻松解决，或者在并行机器上，在O(n)时间和O(n)空间中求解每O(n)个处理元素。Hirschberg寻找单个最佳路径的分而治之方法减少了n倍的空间使用，同时只对串行版本产生很小的恒定减速。结果:本文提出了一个家族的方法来计算序列对齐与减少内存，非常适合串行或并行实现。与分治方法不同，它们可以用于线性隐马尔可夫模型的前向后(Baum-Welch)训练，并且它们避免了依赖数据的重新划分，使它们更容易并行化。该算法的特点是，对于任意整数L，一个与L速度成正比的因子，以换取空间需求从O(n2)减少到O(n1根号n)。该算法族的单个最佳路径成员匹配分治算法的二次时间和线性空间。在实验中，该家族的O(n1.5)空间成员比O(n)空间分治算法快15-40%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Reduced space sequence alignment.

MOTIVATION Sequence alignment is the problem of finding the optimal character-by-character correspondence between two sequences. It can be readily solved in O(n2) time and O(n2) space on a serial machine, or in O(n) time with O(n) space per O(n) processing elements on a parallel machine. Hirschberg's divide-and-conquer approach for finding the single best path reduces space use by a factor of n while inducing only a small constant slowdown to the serial version. RESULTS This paper presents a family of methods for computing sequence alignments with reduced memory that are well suited to serial or parallel implementation. Unlike the divide-and-conquer approach, they can be used in the forward-backward (Baum-Welch) training of linear hidden Markov models, and they avoid data-dependent repartitioning, making them easier to parallelize. The algorithms feature, for an arbitrary integer L, a factor proportional to L slowdown in exchange for reducing space requirement from O(n2) to O(n1 square root of n). A single best path member of this algorithm family matches the quadratic time and linear space of the divide-and-conquer algorithm. Experimentally, the O(n1.5)-space member of the family is 15-40% faster than the O(n)-space divide-and-conquer algorithm.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Computer applications in the biosciences : CABIOS

自引率

0.00%

发文量