GPU-RMAP: Accelerating Short-Read Mapping on Graphics Processors

Ashwin M. Aji, Liqing Zhang, Wu-chun Feng
2010 13th IEEE International Conference on Computational Science and Engineering. Published 2010-12-11. DOI: https://doi.org/10.1109/CSE.2010.29
Citations: 24

Abstract

Next-generation, high-throughput sequencers can now produce hundreds of billions of short sequences (reads) in a single day. Accurately mapping these reads back to a reference genome is particularly important because it underpins several other biological applications, e.g., genome re-sequencing, DNA methylation, and ChIP sequencing. On a personal computer (PC), this computationally intensive short-read mapping task currently takes several hours on very large sets of reads and genomes, so accelerating it requires parallel computing. Among current parallel computing platforms, the graphics processing unit (GPU) offers massively parallel computation that promises to accelerate scientific applications at low cost. In this paper, we propose GPU-RMAP, a massively parallel version of the RMAP short-read mapping tool that is highly optimized for the NVIDIA family of GPUs. We evaluate GPU-RMAP by mapping millions of synthetic and real reads of varying widths to the mosquito (Aedes aegypti) and human genomes, and we discuss how input parameters such as read width, number of reads, and chromosome size affect its performance. We then show that, despite using the conventionally “slower” but GPU-compatible binary search algorithm, GPU-RMAP outperforms the sequential RMAP implementation, which uses the “faster” hashing technique on a PC. Our data-parallel GPU implementation achieves speedups of up to 14.5-times for the mapping kernel and up to 9.6-times for overall program execution time over the sequential RMAP implementation on a traditional PC.
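The abstract contrasts hash-based lookup with a binary search that suits the GPU. The following is a minimal sketch of the binary-search idea only, not the GPU-RMAP implementation: it assumes exact matching (no mismatches), reads of equal width, and a genome containing only A, C, G, and T. The names `encode` and `map_reads_exact` are hypothetical and do not come from the paper. Each genome window is looked up independently in a sorted array of packed read keys, which is the kind of data-parallel work that could be assigned to one GPU thread per window.

```python
from bisect import bisect_left

# 2-bit code per base (assumption: the input contains only A, C, G, T)
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(seq):
    """Pack a DNA string into an integer, 2 bits per base."""
    key = 0
    for base in seq:
        key = (key << 2) | BASE_CODE[base]
    return key


def map_reads_exact(genome, reads):
    """Return {read: [positions]} for exact matches of equal-width reads."""
    width = len(reads[0])
    assert all(len(r) == width for r in reads), "reads must share one width"

    # Sort the unique read keys once so that every genome window
    # can be looked up with a binary search instead of a hash table.
    keys = sorted({encode(r) for r in reads})
    hits = {k: [] for k in keys}

    # Each iteration is independent of the others; in a GPU version,
    # one thread could handle one window.
    for pos in range(len(genome) - width + 1):
        window = encode(genome[pos:pos + width])
        i = bisect_left(keys, window)
        if i < len(keys) and keys[i] == window:
            hits[window].append(pos)

    return {r: hits[encode(r)] for r in reads}
```

For example, `map_reads_exact("ACGTACGTTT", ["ACGT", "GTTT"])` reports `ACGT` at positions 0 and 4 and `GTTT` at position 6. Each lookup costs O(log n) comparisons over n reads, compared with an expected O(1) hash probe. In exchange, the sorted array is a flat, pointer-free structure.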