减少空间序列对齐。

J A Grice, R Hughey, D Speck
{"title":"减少空间序列对齐。","authors":"J A Grice, R Hughey, D Speck","doi":"10.1093/bioinformatics/13.1.45","DOIUrl":null,"url":null,"abstract":"MOTIVATION Sequence alignment is the problem of finding the optimal character-by-character correspondence between two sequences. It can be readily solved in O(n2) time and O(n2) space on a serial machine, or in O(n) time with O(n) space per O(n) processing elements on a parallel machine. Hirschberg's divide-and-conquer approach for finding the single best path reduces space use by a factor of n while inducing only a small constant slowdown to the serial version. RESULTS This paper presents a family of methods for computing sequence alignments with reduced memory that are well suited to serial or parallel implementation. Unlike the divide-and-conquer approach, they can be used in the forward-backward (Baum-Welch) training of linear hidden Markov models, and they avoid data-dependent repartitioning, making them easier to parallelize. The algorithms feature, for an arbitrary integer L, a factor proportional to L slowdown in exchange for reducing space requirement from O(n2) to O(n1 square root of n). A single best path member of this algorithm family matches the quadratic time and linear space of the divide-and-conquer algorithm. Experimentally, the O(n1.5)-space member of the family is 15-40% faster than the O(n)-space divide-and-conquer algorithm.","PeriodicalId":77081,"journal":{"name":"Computer applications in the biosciences : CABIOS","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"1997-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/bioinformatics/13.1.45","citationCount":"57","resultStr":"{\"title\":\"Reduced space sequence alignment.\",\"authors\":\"J A Grice, R Hughey, D Speck\",\"doi\":\"10.1093/bioinformatics/13.1.45\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"MOTIVATION Sequence alignment is the problem of finding the optimal character-by-character correspondence between two sequences. It can be readily solved in O(n2) time and O(n2) space on a serial machine, or in O(n) time with O(n) space per O(n) processing elements on a parallel machine. Hirschberg's divide-and-conquer approach for finding the single best path reduces space use by a factor of n while inducing only a small constant slowdown to the serial version. RESULTS This paper presents a family of methods for computing sequence alignments with reduced memory that are well suited to serial or parallel implementation. Unlike the divide-and-conquer approach, they can be used in the forward-backward (Baum-Welch) training of linear hidden Markov models, and they avoid data-dependent repartitioning, making them easier to parallelize. The algorithms feature, for an arbitrary integer L, a factor proportional to L slowdown in exchange for reducing space requirement from O(n2) to O(n1 square root of n). A single best path member of this algorithm family matches the quadratic time and linear space of the divide-and-conquer algorithm. Experimentally, the O(n1.5)-space member of the family is 15-40% faster than the O(n)-space divide-and-conquer algorithm.\",\"PeriodicalId\":77081,\"journal\":{\"name\":\"Computer applications in the biosciences : CABIOS\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1997-02-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1093/bioinformatics/13.1.45\",\"citationCount\":\"57\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer applications in the biosciences : CABIOS\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/bioinformatics/13.1.45\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer applications in the biosciences : CABIOS","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/bioinformatics/13.1.45","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 57

摘要

动机:序列对齐是在两个序列之间找到最佳的逐个字符对应的问题。在串行机器上,它可以在O(n2)时间和O(n2)空间中轻松解决,或者在并行机器上,在O(n)时间和O(n)空间中求解每O(n)个处理元素。Hirschberg寻找单个最佳路径的分而治之方法减少了n倍的空间使用,同时只对串行版本产生很小的恒定减速。结果:本文提出了一个家族的方法来计算序列对齐与减少内存,非常适合串行或并行实现。与分治方法不同,它们可以用于线性隐马尔可夫模型的前向后(Baum-Welch)训练,并且它们避免了依赖数据的重新划分,使它们更容易并行化。该算法的特点是,对于任意整数L,一个与L速度成正比的因子,以换取空间需求从O(n2)减少到O(n1根号n)。该算法族的单个最佳路径成员匹配分治算法的二次时间和线性空间。在实验中,该家族的O(n1.5)空间成员比O(n)空间分治算法快15-40%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Reduced space sequence alignment.
MOTIVATION Sequence alignment is the problem of finding the optimal character-by-character correspondence between two sequences. It can be readily solved in O(n2) time and O(n2) space on a serial machine, or in O(n) time with O(n) space per O(n) processing elements on a parallel machine. Hirschberg's divide-and-conquer approach for finding the single best path reduces space use by a factor of n while inducing only a small constant slowdown to the serial version. RESULTS This paper presents a family of methods for computing sequence alignments with reduced memory that are well suited to serial or parallel implementation. Unlike the divide-and-conquer approach, they can be used in the forward-backward (Baum-Welch) training of linear hidden Markov models, and they avoid data-dependent repartitioning, making them easier to parallelize. The algorithms feature, for an arbitrary integer L, a factor proportional to L slowdown in exchange for reducing space requirement from O(n2) to O(n1 square root of n). A single best path member of this algorithm family matches the quadratic time and linear space of the divide-and-conquer algorithm. Experimentally, the O(n1.5)-space member of the family is 15-40% faster than the O(n)-space divide-and-conquer algorithm.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
A genetic algorithm for multiple molecular sequence alignment. Displaying the information contents of structural RNA alignments: the structure logos. Q-RT-PCR: data analysis software for measurement of gene expression by competitive RT-PCR. SS3D-P2: a three dimensional substructure search program for protein motifs based on secondary structure elements. XDOM, a graphical tool to analyse domain arrangements in any set of protein sequences.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1