{"title":"A sparsity-aware distributed-memory algorithm for sparse-sparse matrix multiplication","authors":"Yuxi Hong, Aydin Buluc","doi":"arxiv-2408.14558","DOIUrl":null,"url":null,"abstract":"Multiplying two sparse matrices (SpGEMM) is a common computational primitive\nused in many areas including graph algorithms, bioinformatics, algebraic\nmultigrid solvers, and randomized sketching. Distributed-memory parallel\nalgorithms for SpGEMM have mainly focused on sparsity-oblivious approaches that\nuse 2D and 3D partitioning. Sparsity-aware 1D algorithms can theoretically\nreduce communication by not fetching nonzeros of the sparse matrices that do\nnot participate in the multiplication. Here, we present a distributed-memory 1D SpGEMM algorithm and implementation.\nIt uses MPI RDMA operations to mitigate the cost of packing/unpacking\nsubmatrices for communication, and it uses a block fetching strategy to avoid\nexcessive fine-grained messaging. Our results show that our 1D implementation\noutperforms state-of-the-art 2D and 3D implementations within CombBLAS for many\nconfigurations, inputs, and use cases, while remaining conceptually simpler.","PeriodicalId":501422,"journal":{"name":"arXiv - CS - Distributed, Parallel, and Cluster Computing","volume":"64 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Distributed, Parallel, and Cluster Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.14558","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Multiplying two sparse matrices (SpGEMM) is a common computational primitive used in many areas including graph algorithms, bioinformatics, algebraic multigrid solvers, and randomized sketching. Distributed-memory parallel algorithms for SpGEMM have mainly focused on sparsity-oblivious approaches that use 2D and 3D partitioning. Sparsity-aware 1D algorithms can theoretically reduce communication by not fetching nonzeros of the sparse matrices that do not participate in the multiplication.

Here, we present a distributed-memory 1D SpGEMM algorithm and implementation. It uses MPI RDMA operations to mitigate the cost of packing/unpacking submatrices for communication, and it uses a block fetching strategy to avoid excessive fine-grained messaging. Our results show that our 1D implementation outperforms state-of-the-art 2D and 3D implementations within CombBLAS for many configurations, inputs, and use cases, while remaining conceptually simpler.
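The abstract mentions MPI RDMA (one-sided) communication combined with block fetching. The sketch below is not the paper's implementation; it is a minimal illustration, under assumed data layouts, of how a rank can expose its locally owned nonzero values through an MPI window and how another rank can fetch a contiguous block of them with a single one-sided MPI_Get, rather than exchanging many small messages or explicitly packing and unpacking submatrices. The array sizes, block length, and target-rank choice are hypothetical placeholders.

```cpp
// Minimal sketch of one-sided (RDMA-style) block fetching with MPI.
// Assumption: a 1D row partition where each rank owns a slice of the sparse
// matrix's value array; indices/offsets would be exposed the same way.
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    // Hypothetical locally owned nonzero values (e.g., the CSR value array
    // of this rank's row block).
    std::vector<double> local_vals(1000, static_cast<double>(rank));

    // Expose the local values through an RDMA-capable window so other ranks
    // can read them without a matching receive on this side.
    MPI_Win win;
    MPI_Win_create(local_vals.data(),
                   static_cast<MPI_Aint>(local_vals.size() * sizeof(double)),
                   sizeof(double), MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    // Fetch one contiguous block of 100 values from a neighboring rank with a
    // single MPI_Get (block fetching), instead of many fine-grained messages.
    const int block_len = 100;                 // hypothetical block size
    std::vector<double> remote_block(block_len);
    int target = (rank + 1) % nprocs;          // hypothetical target rank

    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
    MPI_Get(remote_block.data(), block_len, MPI_DOUBLE,
            target, /*target_disp=*/0, block_len, MPI_DOUBLE, win);
    MPI_Win_unlock(target, win);               // completes the transfer

    // ... the local SpGEMM computation on the fetched block would go here ...

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```

In a real sparsity-aware 1D algorithm, the requesting rank would first determine which remote column blocks actually participate in the multiplication and issue one MPI_Get per needed block; the code above only demonstrates the communication primitive itself.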