Implementation of the Smith-Waterman algorithm on a reconfigurable supercomputing platform

High Performance Computing Technology (高性能计算技术) · Pub Date: 2007-11-11 · DOI: 10.1145/1328554.1328565

Peiheng Zhang, Guangming Tan, G. Gao


Citations: 123

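Abstract

XD1000 is an innovative reconfigurable supercomputing platform developed by XtremeData Inc. to exploit the rapid progress of FPGA technology and the high performance of the HyperTransport interconnect. In this paper, we present implementations of the Smith-Waterman algorithm for both DNA and protein sequences on this platform. The main features are: (1) a multistage PE (processing element) design that significantly reduces FPGA resource usage and hence allows more parallelism to be exploited; (2) a pipelined control mechanism with uneven stage latencies, which is key to minimizing the overall PE pipeline cycle time; (3) a compressed substitution matrix storage structure that substantially decreases on-chip SRAM usage. Finally, we implement a 384-PE systolic array running at 66.7 MHz, which achieves a peak performance of 25.6 GCUPS. Compared with the 2.2 GHz AMD Opteron host processor, the FPGA coprocessor achieves speedups of 185X and 250X for the DNA and protein implementations, respectively.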
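For readers unfamiliar with the terms, the following is a minimal software sketch of the Smith-Waterman cell update that each PE would evaluate, together with a check of the quoted peak throughput. It is not the authors' hardware design: the function names, the scoring parameters (`match`, `mismatch`, `gap`), and the linear gap penalty are assumptions chosen for illustration, and the peak figure assumes every PE completes one cell update per clock cycle, which is consistent with the numbers in the abstract (384 PEs at 66.7 MHz).

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-1):
    """Best local alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not the paper's.
    """
    prev = [0] * (len(target) + 1)  # previous row of the score matrix
    best = 0
    for q in query:
        curr = [0]  # column 0 is always zero for local alignment
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] + gap        # gap in the target
            left = curr[j - 1] + gap  # gap in the query
            cell = max(0, diag, up, left)  # clamp at zero: local alignment
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def peak_gcups(num_pes, clock_mhz):
    """Peak giga cell updates per second, assuming one cell per PE per clock."""
    return num_pes * clock_mhz * 1e6 / 1e9


if __name__ == "__main__":
    print(smith_waterman_score("GTTAC", "GTTGAC"))
    print(round(peak_gcups(384, 66.7), 1))
```

In a systolic array, the cells on one anti-diagonal of the score matrix have no dependencies on each other, so each PE can hold one query character and update its cell as target characters stream past; the row-by-row loop above computes the same scores sequentially.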



