Sparse Matrix Format Selection with Multiclass SVM for SpMV on GPU

2016 45th International Conference on Parallel Processing (ICPP). Pub date: 2016-08-01. DOI: 10.1109/ICPP.2016.64

Akrem Benatia, Weixing Ji, Yizhuo Wang, Feng Shi


Citations: 53


Sparse Matrix Format Selection with Multiclass SVM for SpMV on GPU

The Sparse Matrix-Vector Multiplication (SpMV) kernel dominates the computing cost in numerous scientific applications. Many GPU implementations of this kernel, based on different sparse formats, have been proposed recently. Because the performance of these formats varies significantly with the sparsity characteristics of the input matrix and with the hardware specifications, none of them can be considered the best choice for every sparse matrix. In this paper, we address the problem of selecting the best representation for a given sparse matrix on GPU by using a machine learning approach. First, we present a set of easy-to-compute features for characterizing sparse matrices on GPU. Second, we use a multiclass Support Vector Machine (SVM) classifier to select the best format for each input matrix. We consider four popular formats in this paper (COO, CSR, ELL, and HYB), but our work can be extended to support more sparse representations. Experimental results on two different GPUs (Fermi GTX 580 and Maxwell GTX 980 Ti) show that we achieve more than 98% of the performance possible with a perfect selection.
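To make the format trade-off concrete, the following is a minimal CPU-side sketch in plain Python, not the authors' GPU implementation. It stores one small matrix in COO, CSR, and ELL, runs SpMV on each, and computes a few row-length statistics. The function names, the example matrix, and the chosen statistics (including `ell_padding`) are illustrative assumptions; the abstract does not list the paper's actual features. HYB is omitted here; it is commonly described as an ELL part plus a COO part for the overflow entries of long rows.

```python
from statistics import mean, pstdev

def dense_to_coo(dense):
    """COO: one (row, col, value) triple per nonzero."""
    return [(i, j, v) for i, row in enumerate(dense)
            for j, v in enumerate(row) if v != 0]

def dense_to_csr(dense):
    """CSR: nonzeros stored row by row, plus one row pointer per row boundary."""
    values, cols, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                cols.append(j)
        row_ptr.append(len(values))
    return values, cols, row_ptr

def dense_to_ell(dense):
    """ELL: every row is padded with zeros to the length of the longest row."""
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    width = max((len(r) for r in rows), default=0)
    cols = [[j for j, _ in r] + [0] * (width - len(r)) for r in rows]
    vals = [[v for _, v in r] + [0.0] * (width - len(r)) for r in rows]
    return vals, cols, width

def spmv_coo(coo, x, n_rows):
    y = [0.0] * n_rows
    for i, j, v in coo:
        y[i] += v * x[j]
    return y

def spmv_csr(csr, x):
    values, cols, row_ptr = csr
    return [sum(values[k] * x[cols[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
            for i in range(len(row_ptr) - 1)]

def spmv_ell(ell, x):
    vals, cols, width = ell
    # Padded slots hold value 0.0, so they contribute nothing to the sum.
    return [sum(vals[i][k] * x[cols[i][k]] for k in range(width))
            for i in range(len(vals))]

def row_features(dense):
    """Hypothetical feature set: statistics of the nonzeros-per-row distribution."""
    lengths = [sum(1 for v in row if v != 0) for row in dense]
    nnz, width = sum(lengths), max(lengths)
    padding = 1 - nnz / (width * len(lengths)) if width else 0.0
    return {"nnz": nnz, "mean_row": mean(lengths), "std_row": pstdev(lengths),
            "max_row": width, "ell_padding": padding}

# Example matrix (an assumption for illustration): one long row among short ones.
A = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 2.0, 0.0, 0.0],
     [3.0, 4.0, 5.0, 6.0],
     [0.0, 0.0, 7.0, 0.0]]
x = [1.0, 2.0, 3.0, 4.0]

print(spmv_coo(dense_to_coo(A), x, len(A)))
print(spmv_csr(dense_to_csr(A), x))
print(spmv_ell(dense_to_ell(A), x))
print(row_features(A))
```

All three formats yield the same product, but the single long row forces ELL to pad every other row, which is the kind of sparsity characteristic a format selector has to detect. In a learned selector, a feature vector like this one would be the classifier input and the fastest measured format the class label.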
