Accelerating massive short reads mapping for next generation sequencing (abstract only)

Chunming Zhang, Wen Tang, Guangming Tan
{"title":"为下一代测序加速大规模短读段映射(仅摘要)","authors":"Chunming Zhang, Wen Tang, Guangming Tan","doi":"10.1145/2554688.2554707","DOIUrl":null,"url":null,"abstract":"Due to the explosion of gene sequencing data with over one billion reads per run, the data-intensive computations of Next Generation Sequencing (NGS) applications pose great challenges to current computing capability. In this paper we investigate both algorithmic and architectural accelerating strategies to a typical NGS analysis algorithm -- short reads mapping -- on a commodity multicore and customizable FPGA coprocessor architecture, respectively. First, we propose a hash buckets reorder algorithm that increases shared cache parallelism during the course of searching hash index. The algorithmic strategy achieves 122Gbp/day throughput by exploiting shared-cache parallelism, that leads to performance improvement of 2 times on an 8-core Intel Xeon processor. Second, we develop a FPGA coprocessor that leverages both bit-level and word-level parallelism with scatter-gather memory mechanism to speedup inherent irregular memory access operations by increasing effective memory bandwidth. Our customized FPGA coprocessor achieves 947Gbp per day throughput, that is 189 times higher than current mapping tools on single CPU core, and above 2 times higher than a 64-core multi-processor system. The coprocessor's power efficiency is 29 times higher than a conventional 64-core multi-processor. The results indicate that the customized FPGA coprocessor architecture, that is configured with scatter-gather memory's word-level access, appeals to data intensive applications.","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Accelerating massive short reads mapping for next generation sequencing (abstract only)\",\"authors\":\"Chunming Zhang, Wen Tang, Guangming Tan\",\"doi\":\"10.1145/2554688.2554707\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Due to the explosion of gene sequencing data with over one billion reads per run, the data-intensive computations of Next Generation Sequencing (NGS) applications pose great challenges to current computing capability. In this paper we investigate both algorithmic and architectural accelerating strategies to a typical NGS analysis algorithm -- short reads mapping -- on a commodity multicore and customizable FPGA coprocessor architecture, respectively. First, we propose a hash buckets reorder algorithm that increases shared cache parallelism during the course of searching hash index. The algorithmic strategy achieves 122Gbp/day throughput by exploiting shared-cache parallelism, that leads to performance improvement of 2 times on an 8-core Intel Xeon processor. Second, we develop a FPGA coprocessor that leverages both bit-level and word-level parallelism with scatter-gather memory mechanism to speedup inherent irregular memory access operations by increasing effective memory bandwidth. Our customized FPGA coprocessor achieves 947Gbp per day throughput, that is 189 times higher than current mapping tools on single CPU core, and above 2 times higher than a 64-core multi-processor system. The coprocessor's power efficiency is 29 times higher than a conventional 64-core multi-processor. 
The results indicate that the customized FPGA coprocessor architecture, that is configured with scatter-gather memory's word-level access, appeals to data intensive applications.\",\"PeriodicalId\":390562,\"journal\":{\"name\":\"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-02-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2554688.2554707\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2554688.2554707","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Due to the explosion of gene sequencing data, with over one billion reads per run, the data-intensive computations of Next Generation Sequencing (NGS) applications pose great challenges to current computing capability. In this paper we investigate algorithmic and architectural acceleration strategies for a typical NGS analysis algorithm, short-read mapping, on a commodity multicore processor and a customizable FPGA coprocessor architecture, respectively. First, we propose a hash-bucket reordering algorithm that increases shared-cache parallelism while searching the hash index. This algorithmic strategy achieves a throughput of 122 Gbp/day by exploiting shared-cache parallelism, a twofold performance improvement on an 8-core Intel Xeon processor. Second, we develop an FPGA coprocessor that exploits both bit-level and word-level parallelism with a scatter-gather memory mechanism, speeding up the inherently irregular memory accesses by increasing effective memory bandwidth. Our customized FPGA coprocessor achieves a throughput of 947 Gbp/day, which is 189 times higher than current mapping tools on a single CPU core and more than twice that of a 64-core multi-processor system. The coprocessor's power efficiency is 29 times higher than that of a conventional 64-core multi-processor. The results indicate that the customized FPGA coprocessor architecture, configured with word-level scatter-gather memory access, is well suited to data-intensive applications.
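The abstract names, but does not detail, the hash-bucket reordering step. Below is a minimal C++ sketch of the general idea as we read it: buffer pending seed lookups, sort them by bucket, and then probe the shared hash index in bucket order so that consecutive probes reuse the same cache lines. This is an illustration of the generic cache-friendly reordering pattern under our own assumptions, not the authors' implementation; the data structures and names (`HashIndex`, `Probe`, `reorder_and_probe`) are hypothetical.

```cpp
// Sketch: reorder pending hash-index probes by bucket id so that probes
// touching the same bucket (and hence the same cache lines) are issued
// back to back. Illustrative only; not the paper's algorithm.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>

struct Hit { uint32_t read_id; uint32_t ref_pos; };

// Toy hash index: buckets of (seed, reference position) pairs.
struct HashIndex {
    struct Entry { uint64_t seed; uint32_t ref_pos; };
    std::vector<std::vector<Entry>> buckets;

    explicit HashIndex(size_t n_buckets) : buckets(n_buckets) {}

    size_t bucket_of(uint64_t seed) const { return seed % buckets.size(); }

    void insert(uint64_t seed, uint32_t ref_pos) {
        buckets[bucket_of(seed)].push_back({seed, ref_pos});
    }
};

// A pending lookup: which read it came from and which bucket it maps to.
struct Probe { uint64_t seed; uint32_t read_id; uint32_t bucket; };

// Sort probes by bucket id, then scan each bucket for the probes that map
// to it. Consecutive probes now share cache lines of the index.
std::vector<Hit> reorder_and_probe(const HashIndex& index,
                                   std::vector<Probe> probes) {
    std::sort(probes.begin(), probes.end(),
              [](const Probe& a, const Probe& b) { return a.bucket < b.bucket; });

    std::vector<Hit> hits;
    for (const Probe& p : probes) {
        for (const auto& e : index.buckets[p.bucket]) {
            if (e.seed == p.seed) hits.push_back({p.read_id, e.ref_pos});
        }
    }
    return hits;
}

int main() {
    HashIndex index(1 << 10);
    // Index a few fake seeds (in practice these would be k-mers of the reference).
    for (uint32_t pos = 0; pos < 100; ++pos) index.insert(pos * 7919ULL, pos);

    // Build a batch of probes in read-arrival order, then reorder by bucket.
    std::vector<Probe> probes;
    for (uint32_t read = 0; read < 50; ++read) {
        uint64_t seed = (read * 3) * 7919ULL;  // some of these seeds hit the index
        probes.push_back({seed, read,
                          static_cast<uint32_t>(index.bucket_of(seed))});
    }

    std::vector<Hit> hits = reorder_and_probe(index, probes);
    std::cout << "hits: " << hits.size() << "\n";
    return 0;
}
```

Presumably, in the actual system each core, or each scatter-gather unit on the FPGA, is handed a contiguous range of the reordered buckets; that is what would turn the sorting step into shared-cache parallelism on the CPU and into higher effective memory bandwidth on the coprocessor.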