Performance Issues of SYRK Implementations in Shared Memory Environments for Edge Cases
Md Mosharaf Hossain, Thomas M. Hines, S. Ghafoor, Ryan J. Marshall, Muzakhir S. Amanzholov, R. Kannan
2018 21st International Conference of Computer and Information Technology (ICCIT), published 2018-12-01
DOI: 10.1109/ICCITECHN.2018.8631936
Abstract
The symmetric rank-k update (SYRK) is a level-3 BLAS routine commonly used by many Data Mining/Machine Learning (DM/ML) algorithms, such as regression, dimensionality reduction algorithms like PCA, matrix factorization, and k-means clustering. This paper presents a comprehensive analysis of the SYRK routine in popular dense linear algebra libraries such as OpenBLAS, Intel MKL, and BLIS, focusing in particular on edge cases of dense matrices (thin or fat shapes) that are common in DM/ML applications. Our work identifies several performance issues of the SYRK routine in multi-threaded shared memory environments for these edge cases and discusses matrix-dependent modifications for performance improvement.
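For readers unfamiliar with the operation, the following is a minimal pure-Python sketch of what SYRK computes, namely C := alpha * A * A^T + beta * C with only one triangle of the symmetric result written. It is a reference illustration only, not the implementation used by OpenBLAS, Intel MKL, or BLIS, and it says nothing about their performance. The function name `syrk_lower`, the helper `zeros`, and the reading of "thin" as many more rows than columns (and "fat" as the reverse) are assumptions made for this example.

```python
def syrk_lower(alpha, A, beta, C):
    """Reference SYRK (hypothetical helper): C := alpha * A * A^T + beta * C.

    A is an n x k matrix given as a list of rows, C is n x n.
    Only the lower triangle (j <= i) is updated; the upper triangle
    of the returned matrix is copied from C unchanged.
    """
    n = len(A)
    k = len(A[0]) if n else 0
    out = [row[:] for row in C]  # do not modify the caller's C
    for i in range(n):
        for j in range(i + 1):
            # Entry (i, j) is the dot product of rows i and j of A.
            dot = sum(A[i][p] * A[j][p] for p in range(k))
            out[i][j] = alpha * dot + beta * C[i][j]
    return out


def zeros(n):
    """Return an n x n matrix of zeros (hypothetical helper)."""
    return [[0.0] * n for _ in range(n)]


# Assumed "thin" shape: more rows than columns (4 x 1), giving a 4 x 4 result.
thin = [[1.0], [2.0], [3.0], [4.0]]
thin_result = syrk_lower(1.0, thin, 0.0, zeros(4))

# Assumed "fat" shape: more columns than rows (2 x 4), giving a 2 x 2 result.
fat = [[1.0, 2.0, 3.0, 4.0],
       [0.0, 1.0, 0.0, 1.0]]
fat_result = syrk_lower(1.0, fat, 0.0, zeros(2))

print(thin_result)
print(fat_result)
```

Under these assumed definitions, the two shapes stress the routine differently: a thin input produces a large output from very short dot products, while a fat input produces a small output from long dot products.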