Microparallelism and High-Performance Protein Matching

Proceedings of the IEEE/ACM SC95 Conference. Pub Date: 1995-12-08. DOI: 10.1145/224170.224222

B. Alpern, L. Carter, K. Gatlin


Citations: 52

Abstract



The Smith-Waterman algorithm is a computationally intensive string-matching operation that is fundamental to the analysis of proteins and genes. In this paper, we explore standard and novel techniques for improving its performance. We begin by tuning the algorithm using conventional techniques. These yield modest performance improvements through more efficient cache usage and inner-loop code. One novel technique uses the z-buffer operations of the Intel i860 architecture to perform 4 independent computations in parallel. This achieves a five-fold speedup over the optimized code (six-fold over the original). We also describe a related technique that could be used by processors that have 64-bit integer operations but no z-buffer. Another new technique uses floating-point multiplies and adds in place of the standard algorithm's integer additions and maximum operations. This gains more than a three-fold speedup on the IBM POWER2 processor. This method does not give answers identical to those of the original program, but experimental evidence shows that the inaccuracies are small and do not affect which strings the algorithm chooses as good matches.
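For orientation, the kind of inner loop the abstract refers to can be sketched as follows. This is a minimal textbook form of the Smith-Waterman score computation, not the authors' tuned code. The function name `smith_waterman_score`, the linear gap penalty, and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen for illustration; protein matching in practice typically uses a substitution matrix instead of a simple match/mismatch score.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score of strings a and b.

    Illustrative assumptions: linear gap penalty and a simple
    match/mismatch score (not the scoring used in the paper).
    """
    prev = [0] * (len(b) + 1)  # previous row of the score matrix
    best = 0
    for ca in a:
        curr = [0]  # column 0 of every row is 0
        for j, cb in enumerate(b, start=1):
            sub = match if ca == cb else mismatch
            # Each cell needs only integer additions and a maximum.
            cell = max(
                0,                  # start a new local alignment here
                prev[j - 1] + sub,  # align ca with cb
                prev[j] + gap,      # ca aligned against a gap
                curr[j - 1] + gap,  # cb aligned against a gap
            )
            curr.append(cell)
            best = max(best, cell)
        prev = curr  # keep one row at a time to limit memory traffic
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "ACT"))
```

Every cell update consists of a few integer additions followed by a maximum, which is the pair of operations the abstract says the z-buffer and floating-point techniques are designed to speed up. Keeping only two rows in memory is one simple way to reduce the working set; whether it matches the cache tuning described in the paper is not stated in the abstract.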
