{"title":"Common Subexpression-Based Compression and Multiplication of Sparse Constant Matrices","authors":"Emre Bilgili;Arda Yurdakul","doi":"10.1109/LES.2023.3323635","DOIUrl":null,"url":null,"abstract":"In deep learning inference, model parameters are pruned and quantized to reduce the model size. Compression methods and common subexpression (CSE) elimination algorithms are applied on sparse constant matrices to deploy the models on low-cost embedded devices. However, the state-of-the-art CSE elimination methods do not scale well for handling large matrices. They reach hours for extracting CSEs in a \n<inline-formula> <tex-math>$200 \\times 200$ </tex-math></inline-formula>\n matrix while their matrix multiplication algorithms execute longer than the conventional matrix multiplication methods. Besides, there exist no compression methods for matrices utilizing CSEs. As a remedy to this problem, a random search-based algorithm is proposed in this letter to extract CSEs in the column pairs of a constant matrix. It produces an adder tree for a \n<inline-formula> <tex-math>$1000 \\times 1000$ </tex-math></inline-formula>\n matrix in a minute. To compress the adder tree, this letter presents a compression format by extending the compressed sparse row (CSR) to include CSEs. While compression rates of more than 50% can be achieved compared to the original CSR format, simulations for a single-core embedded system show that the matrix multiplication execution time can be reduced by 20%.","PeriodicalId":56143,"journal":{"name":"IEEE Embedded Systems Letters","volume":"16 2","pages":"82-85"},"PeriodicalIF":1.7000,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Embedded Systems Letters","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10285457/","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0
Abstract
In deep learning inference, model parameters are pruned and quantized to reduce the model size. Compression methods and common subexpression (CSE) elimination algorithms are applied to sparse constant matrices to deploy the models on low-cost embedded devices. However, state-of-the-art CSE elimination methods do not scale well to large matrices: they take hours to extract CSEs from a $200 \times 200$ matrix, and their matrix multiplication algorithms run longer than conventional matrix multiplication. Moreover, no existing compression format for matrices exploits CSEs. As a remedy, this letter proposes a random search-based algorithm that extracts CSEs from the column pairs of a constant matrix and produces an adder tree for a $1000 \times 1000$ matrix within a minute. To compress the adder tree, the letter presents a compression format that extends compressed sparse row (CSR) to include CSEs. Compression rates of more than 50% can be achieved relative to the original CSR format, and simulations on a single-core embedded system show that matrix multiplication execution time can be reduced by 20%.
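The core idea is to find subexpressions that recur across the columns of a fixed matrix: if two columns j and k both hold the same coefficient in several rows, the sum x[j] + x[k] can be computed once and reused in every such row. Below is a minimal Python sketch of a random-search extraction pass under that framing, assuming a pruned, binary-quantized (0/1) matrix; the function name, trial budget, and acceptance rule (a pair must recur in at least two rows) are illustrative assumptions, not the letter's exact algorithm.

```python
# Minimal sketch: random search for common subexpressions (CSEs) over
# column pairs of a 0/1 constant matrix. An assumed reading of the
# method, not the letter's exact algorithm or cost model.
import random

def find_column_pair_cses(A, trials=200, seed=0):
    """Sample column pairs at random; keep a pair if both columns are
    nonzero in at least two common rows, so x[j] + x[k] is reusable."""
    rng = random.Random(seed)
    n_rows, n_cols = len(A), len(A[0])
    cses = {}  # (j, k) -> rows that can share the sum x[j] + x[k]
    for _ in range(trials):
        j, k = sorted(rng.sample(range(n_cols), 2))
        rows = [i for i in range(n_rows) if A[i][j] == 1 and A[i][k] == 1]
        if len(rows) >= 2:  # reused at least once, so it saves additions
            cses[(j, k)] = rows
    return cses

A = [[1, 1, 0, 1],
     [1, 1, 1, 0],
     [0, 1, 0, 1],
     [1, 1, 0, 0]]
for (j, k), rows in find_column_pair_cses(A).items():
    print(f"x[{j}] + x[{k}] is shared by rows {rows}")
```

A CSE-extended CSR layout can then store each shared sum once and let rows reference it. The sketch below evaluates all CSE slots first and reuses them during a standard CSR traversal; the convention that a negative column index -m selects CSE slot t[m-1] is an assumption made for this illustration, not the letter's published encoding.

```python
def cse_csr_matvec(row_ptr, col_idx, vals, cse_pairs, x):
    """y = A @ x for A stored in a (hypothetical) CSE-extended CSR layout."""
    # Stage 1: evaluate every common subexpression exactly once.
    t = [x[j] + x[k] for (j, k) in cse_pairs]
    # Stage 2: usual CSR sweep; negative index -m reads CSE slot t[m-1].
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0
        for p in range(row_ptr[r], row_ptr[r + 1]):
            c = col_idx[p]
            acc += vals[p] * (t[-c - 1] if c < 0 else x[c])
        y.append(acc)
    return y

# The 4x4 example matrix above with t0 = x[0] + x[1] factored out:
row_ptr = [0, 2, 4, 6, 7]
col_idx = [-1, 3, -1, 2, 1, 3, -1]   # -1 refers to CSE slot t0
vals    = [1, 1, 1, 1, 1, 1, 1]
print(cse_csr_matvec(row_ptr, col_idx, vals, [(0, 1)], [1, 2, 3, 4]))
# -> [7, 6, 6, 3], matching the dense product
```

In this toy case, plain CSR would store nine nonzero entries while the CSE-extended version stores seven entries plus one column pair; savings of this kind, applied at scale, are what the reported compression rates build on.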
Journal Description:
The IEEE Embedded Systems Letters (ESL) provides a forum for rapid dissemination of the latest technical advances in embedded systems and related areas of embedded software. The emphasis is on models, methods, and tools that ensure secure, correct, efficient, and robust design of embedded systems and their applications.