Rapid GPU-Based Pangenome Graph Layout
Jiajie Li, Jan-Niklas Schmelzle, Yixiao Du, Simon Heumos, Andrea Guarracino, Giulia Guidi, Pjotr Prins, Erik Garrison, Zhiru Zhang
arXiv - CS - Distributed, Parallel, and Cluster Computing, 2024-09-02, doi: arxiv-2409.00876

Computational pangenomics is an emerging field that studies genetic variation
using a graph structure encompassing multiple genomes. Visualizing pangenome
graphs is vital for understanding genome diversity. Yet, handling large graphs
can be challenging due to the high computational demands of the graph layout
process. In this work, we conduct a thorough performance characterization of a
state-of-the-art pangenome graph layout algorithm, revealing significant
data-level parallelism, which makes GPUs a promising option for compute
acceleration. However, irregular data access and the algorithm's memory-bound
nature present significant hurdles. To overcome these challenges, we develop a
solution implementing three key optimizations: a cache-friendly data layout,
coalesced random states, and warp merging. Additionally, we propose a
quantitative metric for scalable evaluation of pangenome layout quality. Evaluated on 24 human whole-chromosome pangenomes, our GPU-based solution
achieves a 57.3x speedup over the state-of-the-art multithreaded CPU baseline
without layout quality loss, reducing execution time from hours to minutes.
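
The abstract names "coalesced random states" as one of the three optimizations but does not spell out the layout. The following is a minimal sketch of the general idea behind coalescing per-thread state, not the authors' implementation: if each thread keeps a small random-number-generator state, storing word k of every thread contiguously (struct of arrays) lets one warp read that word from far fewer memory segments than storing each thread's whole state contiguously (array of structs). The state size of 6 words, the 128-byte segment size, and all function names are assumptions chosen for illustration.

```python
# Illustrative model only: counts how many memory segments one warp touches
# when all 32 lanes load the same word of their per-thread PRNG state.

WARP_SIZE = 32        # threads per warp on NVIDIA GPUs
WORD_BYTES = 4        # assumption: each state word is a 32-bit integer
STATE_WORDS = 6       # hypothetical number of words in one thread's state
SEGMENT_BYTES = 128   # assumption: size of one memory transaction


def aos_address(thread_id, word):
    """Byte offset when each thread's state is stored contiguously."""
    return (thread_id * STATE_WORDS + word) * WORD_BYTES


def soa_address(thread_id, word, num_threads):
    """Byte offset when word `word` of every thread is stored contiguously."""
    return (word * num_threads + thread_id) * WORD_BYTES


def segments_touched(addresses):
    """Number of distinct SEGMENT_BYTES-sized segments covered by the loads."""
    return len({addr // SEGMENT_BYTES for addr in addresses})


def warp_transactions(layout, word=0, num_threads=1024):
    """Segments touched when lanes 0..31 each load `word` of their own state."""
    lanes = range(WARP_SIZE)
    if layout == "aos":
        addresses = [aos_address(t, word) for t in lanes]
    elif layout == "soa":
        addresses = [soa_address(t, word, num_threads) for t in lanes]
    else:
        raise ValueError(f"unknown layout: {layout}")
    return segments_touched(addresses)


if __name__ == "__main__":
    print("array of structs:", warp_transactions("aos"))   # prints 6
    print("struct of arrays:", warp_transactions("soa"))   # prints 1
```

Under these assumptions the strided layout spreads one warp-wide load over six segments, while the coalesced layout fits it in one. For a memory-bound kernel, reducing the number of segments per load in this way is the kind of saving a coalesced layout aims for.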