Scalable parallelization for the solution of phonon Boltzmann Transport Equation

H. Tran, Siddharth Saurav, P. Sadayappan, S. Mazumder, H. Sundar
DOI: 10.1145/3577193.3593723 (https://doi.org/10.1145/3577193.3593723)
Published in: Proceedings of the 37th International Conference on Supercomputing
Publication date: 2023-06-21
Citations: 0

Abstract

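The Boltzmann Transport Equation (BTE) for phonons is often used to predict thermal transport at submicron scales in semiconductors. The BTE is a seven-dimensional nonlinear integro-differential equation, which makes it difficult to solve even after linearization under the single relaxation time approximation. Furthermore, parallelization and load balancing are challenging, given the high dimensionality and the variability of the linear systems' conditioning. This work presents a 'synthetic' scalable parallelization method for solving the BTE on large-scale systems. The method includes cell-based parallelization, combined band+cell-based parallelization, and a batching technique. The essential computational ingredient of cell-based parallelization is a sparse matrix-vector product (SpMV) that can be integrated with an existing linear algebra library such as PETSc. The combined approach enhances the cell-based method by further parallelizing the band dimension to take advantage of low inter-band communication costs. For the batched approach, the authors developed a batched SpMV that enables multiple linear systems to be solved simultaneously, merging many MPI messages to reduce communication costs and thus maintaining scalability when the grain size becomes very small. Numerical experiments demonstrate the method's speedups and scalability up to 16384 cores for a problem with 12.6 billion unknowns.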
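To make the batched SpMV idea more concrete, here is a minimal single-process sketch in plain Python. It is not the authors' implementation and involves no MPI or PETSc. It assumes, purely for illustration, that every linear system in the batch shares one CSR sparsity pattern and differs only in its coefficient values. The function name `batched_spmv` and the sample matrices are hypothetical.

```python
def batched_spmv(indptr, indices, values, xs):
    """Compute ys[b] = A[b] @ xs[b] for a batch of CSR matrices.

    Assumption (for illustration only): all matrices share the CSR
    structure (indptr, indices) and differ only in their stored values.
    values[b][k] is the k-th stored nonzero of matrix b.
    """
    n_rows = len(indptr) - 1
    n_batch = len(values)
    ys = [[0.0] * n_rows for _ in range(n_batch)]
    for row in range(n_rows):
        for k in range(indptr[row], indptr[row + 1]):
            col = indices[k]
            # The column index is looked up once and reused by every
            # system in the batch.
            for b in range(n_batch):
                ys[b][row] += values[b][k] * xs[b][col]
    return ys


# Two 3x3 tridiagonal systems with the same pattern, different coefficients.
indptr = [0, 2, 5, 7]
indices = [0, 1, 0, 1, 2, 1, 2]
values = [
    [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0],
    [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0],
]
xs = [[1.0, 1.0, 1.0], [1.0, 2.0, 3.0]]
ys = batched_spmv(indptr, indices, values, xs)
print(ys)
```

In a distributed setting, the same grouping would presumably let the off-process vector entries for all systems in a batch travel in one message per neighbor rather than one message per system, which is the communication saving the abstract describes.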