Md Mosharaf Hossain, Thomas M. Hines, S. Ghafoor, Ryan J. Marshall, Muzakhir S. Amanzholov, R. Kannan
2018 21st International Conference of Computer and Information Technology (ICCIT), published 2018-12-01. DOI: 10.1109/ICCITECHN.2018.8631936 (https://doi.org/10.1109/ICCITECHN.2018.8631936)
Performance Issues of SYRK Implementations in Shared Memory Environments for Edge Cases
The symmetric rank-k update (SYRK) is a level-3 BLAS routine used by many Data Mining/Machine Learning (DM/ML) algorithms, such as regression, dimensionality reduction algorithms like PCA, matrix factorization, and k-means clustering. This paper presents a comprehensive analysis of the SYRK routine in popular dense linear algebra libraries such as OpenBLAS, Intel MKL, and BLIS, focusing in particular on edge cases of dense matrices (thin or fat shapes) that are common in DM/ML applications. Our work identifies some performance issues of the SYRK routine in multi-threaded shared memory environments for these edge cases and discusses matrix-dependent modifications for performance improvement.
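For readers unfamiliar with the routine, the operation SYRK performs can be sketched in a few lines. This is a plain-Python reference definition of C := alpha * A * A^T + beta * C restricted to the upper triangle, not the implementation used by OpenBLAS, Intel MKL, or BLIS, and not the modifications the paper discusses. The helper names `syrk_upper` and `zeros` are hypothetical, and which shape counts as "thin" or "fat" is an assumption made here for illustration.

```python
def syrk_upper(alpha, A, beta, C):
    """Reference SYRK: C := alpha * A * A^T + beta * C, upper triangle only.

    A is n x k (a list of n rows of length k); C is n x n and is updated in place.
    Entries below the diagonal are left untouched, since C is symmetric and
    only one triangle needs to be stored.
    """
    n = len(A)
    for i in range(n):
        for j in range(i, n):  # j >= i: upper triangle, including the diagonal
            dot = sum(a * b for a, b in zip(A[i], A[j]))
            C[i][j] = alpha * dot + beta * C[i][j]
    return C


def zeros(n):
    """Hypothetical helper: an n x n matrix of zeros as nested lists."""
    return [[0.0] * n for _ in range(n)]


# Assumed "fat" input: n = 2 rows, k = 3 columns, so C is 2 x 2.
fat = [[1.0, 2.0, 3.0],
       [4.0, 5.0, 6.0]]
c_fat = syrk_upper(1.0, fat, 0.0, zeros(2))

# Assumed "thin" input: n = 3 rows, k = 2 columns, so C is 3 x 3.
thin = [[1.0, 4.0],
        [2.0, 5.0],
        [3.0, 6.0]]
c_thin = syrk_upper(1.0, thin, 0.0, zeros(3))
```

The two calls show why shape matters: for the same number of input entries, the output is either much smaller or much larger than the input, which changes how much independent work there is in the triangle of C to divide among threads.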