Refactoring BZIP2 on the new-generation Sunway supercomputer

IF 2 | JCR Q3 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS) | Engineering reports : open access | Pub Date: 2023-11-03 | DOI: 10.1002/eng2.12806
Xiaohui Liu, Zekun Yin, Haodong Tian, Wubing Wan, Mengyuan Hua, Wenlai Zhao, Zhenchun Huang, Ping Gao, Fangjin Zhu, Hua Wang, Xiaohui Duan
Citation count: 0

Abstract

High-performance computing plays an increasingly fundamental role in advancing scientific research and engineering. However, the ever-expanding scale of scientific simulations poses challenges for efficient data I/O and storage. Data compression has attracted significant attention as a way to reduce data transmission and storage costs while improving performance. In particular, the BZIP2 lossless compression algorithm is widely used because of its high compression ratio, moderate compression speed, high reliability, and open-source nature. This paper focuses on the design and implementation of a parallelized BZIP2 algorithm tailored to the New-Generation Sunway supercomputing platform. By leveraging the unique cache patterns of the New-Generation Sunway processor, we propose highly tuned multi-threading and multi-node implementations of the BZIP2 applications for different scenarios. We also propose efficient BZIP2 libraries based on the management processing element and the computing processing element, which support the commonly used high-level (de)compression interfaces. Test results indicate that our multi-threading implementation achieves a maximum speedup of 23.09× in decompression and 8.57× in compression compared to the sequential implementation. Furthermore, when scaling up to 2048 processes, the multi-node implementation achieves 50.81% parallel efficiency and a peak performance of 16.6 GB/s for compression, and 26.35% parallel efficiency and 52.8 GB/s for decompression.
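As a rough illustration of the general idea behind a parallelized BZIP2, the sketch below compresses independent chunks of the input concurrently and concatenates the resulting streams. It is not the paper's Sunway implementation and does not model the management or computing processing elements. It uses Python's standard `bz2` module, and the names `compress_parallel` and `CHUNK_SIZE`, the chunk size, and the worker count are all hypothetical choices made for this example.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

# Hypothetical chunk size for this sketch: 900 KiB per independently compressed chunk.
CHUNK_SIZE = 900 * 1024


def compress_parallel(data: bytes, workers: int = 4, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Compress fixed-size chunks concurrently and concatenate the bzip2 streams.

    Each chunk becomes a complete, self-contained bzip2 stream, so the
    chunks can be compressed in any order by any worker.
    """
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map preserves input order, so the streams are joined in order.
        streams = list(pool.map(bz2.compress, chunks))
    return b"".join(streams)


if __name__ == "__main__":
    payload = bytes(range(256)) * 10_000  # about 2.5 MB of sample data
    packed = compress_parallel(payload)
    # bz2.decompress accepts multi-stream input (Python 3.3 and later).
    assert bz2.decompress(packed) == payload
    print(f"{len(payload)} bytes -> {len(packed)} bytes")
```

Because every chunk is a separate stream, the output can be decompressed by any tool that accepts concatenated bzip2 streams, at the cost of a slightly lower compression ratio than a single stream over the same data.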
