GPU-RMAP: Accelerating Short-Read Mapping on Graphics Processors

2010 13th IEEE International Conference on Computational Science and Engineering. Publication date: 2010-12-11. DOI: 10.1109/CSE.2010.29

Ashwin M. Aji, Liqing Zhang, Wu-chun Feng


Citations: 24

Abstract

Next-generation, high-throughput sequencers can now produce hundreds of billions of short sequences (reads) in a single day. Accurately mapping these reads back to a reference genome is particularly important because it underlies several other biological applications, e.g., genome re-sequencing, DNA methylation, and ChIP sequencing. On a personal computer (PC), the computationally intensive short-read mapping task currently takes several hours for very large sets of reads and genomes. Accelerating this task requires parallel computing. Among current parallel computing platforms, the graphics processing unit (GPU) provides massively parallel computational power that promises to accelerate scientific applications at low cost. In this paper, we propose GPU-RMAP, a massively parallel version of the RMAP short-read mapping tool that is highly optimized for the NVIDIA family of GPUs. We evaluate GPU-RMAP by mapping millions of synthetic and real reads of varying widths onto the mosquito (Aedes aegypti) and human genomes. We also discuss how input parameters such as read width, number of reads, and chromosome size affect the performance of GPU-RMAP. We then show that, despite using the conventionally "slower" but GPU-compatible binary search algorithm, GPU-RMAP outperforms the sequential RMAP implementation, which uses the "faster" hashing technique on a PC. Our data-parallel GPU implementation achieves speedups of up to 14.5-fold for the mapping kernel and up to 9.6-fold for the overall program execution time over the sequential RMAP implementation on a traditional PC.
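The trade-off between hashing and binary search mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `encode` and `map_reads`, the 2-bit base encoding, and the choice to sort the reads and look up genome windows are all assumptions made for illustration. It handles exact matches only, whereas a practical mapper would also need to tolerate mismatches. The point is that each window lookup is an independent binary search over a flat sorted array, which is the kind of work that can be assigned to one GPU thread per window.

```python
from bisect import bisect_left

# Hypothetical 2-bit encoding of the four bases (an assumption for this sketch).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(seq):
    """Pack a DNA string into an integer, 2 bits per base."""
    key = 0
    for base in seq:
        key = (key << 2) | CODE[base]
    return key


def map_reads(genome, reads):
    """Return {read: [positions]} for exact matches of equal-width reads."""
    width = len(reads[0])
    # Sort the encoded reads once; every lookup below is a binary search
    # over this flat array instead of a hash-table probe.
    table = sorted((encode(r), r) for r in set(reads))
    keys = [k for k, _ in table]
    hits = {r: [] for r in reads}
    # Each iteration is independent of the others, so on a GPU one thread
    # could handle one genome window. Here the loop simply runs serially.
    for pos in range(len(genome) - width + 1):
        window = genome[pos:pos + width]
        if any(b not in CODE for b in window):
            continue  # skip windows containing ambiguous bases such as N
        k = encode(window)
        i = bisect_left(keys, k)
        if i < len(keys) and keys[i] == k:
            hits[table[i][1]].append(pos)
    return hits


if __name__ == "__main__":
    print(map_reads("ACGTACGTTTGA", ["ACGT", "TTGA", "GGGG"]))
```

A hash table would answer each lookup in expected constant time rather than logarithmic time, but a sorted array needs no pointer chasing or collision handling, which is one plausible reason a binary search maps more naturally onto GPU hardware.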



