Performance models for evaluation and automatic tuning of symmetric sparse matrix-vector multiply

Benjamin C. Lee, R. Vuduc, J. Demmel, K. Yelick
International Conference on Parallel Processing, 2004 (ICPP 2004). Published 2004-08-15. DOI: 10.1109/ICPP.2004.1327917 (https://doi.org/10.1109/ICPP.2004.1327917)
Citations: 69

Abstract

We present optimizations for sparse matrix-vector multiply (SpMV) and its generalization to multiple vectors (SpMM) when the matrix is symmetric: (1) symmetric storage, (2) register blocking, and (3) vector blocking. Combined with register blocking, symmetry saves more than 50% in matrix storage. We also show performance speedups of 2.1× for SpMV and 2.6× for SpMM when compared to the best nonsymmetric register-blocked implementation. We present an approach for selecting tuning parameters, based on empirical modeling and search, that consists of three steps: (1) off-line benchmark, (2) runtime search, and (3) heuristic performance model. This approach generally selects parameters that achieve performance within 85% of that achieved with exhaustive search. We evaluate our implementations with respect to upper bounds on performance. Our model bounds performance by considering only the cost of memory operations and using lower bounds on the number of cache misses. Our optimized codes are within 68% of the upper bounds.
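The symmetric-storage idea in the abstract can be illustrated with a minimal sketch. It assumes the matrix is kept in compressed sparse row (CSR) form with only the upper triangle (including the diagonal) stored; the function name `symmetric_spmv` and its argument layout are hypothetical and not taken from the paper, and register blocking and vector blocking are omitted for brevity. Each stored off-diagonal entry is loaded once and used twice, which is where the storage and memory-traffic savings come from.

```python
def symmetric_spmv(n, row_ptr, col_idx, values, x):
    """Compute y = A * x for a symmetric n-by-n matrix A.

    Assumption (illustrative, not the paper's code): only the upper
    triangle of A, including the diagonal, is stored in CSR arrays
    row_ptr, col_idx, values.
    """
    y = [0.0] * n
    for i in range(n):
        xi = x[i]
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            a = values[k]
            # Contribution of the stored entry A[i][j] to row i.
            acc += a * x[j]
            # Contribution of the implicit mirror entry A[j][i] to row j.
            # Skipped on the diagonal so it is not counted twice.
            if j != i:
                y[j] += a * xi
        y[i] += acc
    return y


# A = [[4, 1, 0],
#      [1, 3, 2],
#      [0, 2, 5]]  stored as its upper triangle only.
row_ptr = [0, 2, 4, 5]
col_idx = [0, 1, 1, 2, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
y = symmetric_spmv(3, row_ptr, col_idx, values, [1.0, 2.0, 3.0])
print(y)  # [6.0, 13.0, 19.0]
```

The example stores five values instead of the seven nonzeros of the full matrix. A tuned implementation along the lines the abstract describes would additionally group entries into small dense blocks and process several vectors at once.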