Efficient Identification of Butterfly Sparse Matrix Factorizations

SIAM Journal on Mathematics of Data Science (IF 2.6, Q1, Mathematics, Applied) | Pub Date: 2021-10-04 | DOI: 10.1137/22m1488727

Léon Zheng, E. Riccietti, R. Gribonval


Citations: 4

Abstract

Fast transforms correspond to factorizations of the form $\mathbf{Z} = \mathbf{X}^{(1)} \ldots \mathbf{X}^{(J)}$, where each factor $\mathbf{X}^{(\ell)}$ is sparse and possibly structured. This paper investigates essential uniqueness of such factorizations, i.e., uniqueness up to unavoidable scaling ambiguities. Our main contribution is to prove that any $N \times N$ matrix having the so-called butterfly structure admits an essentially unique factorization into $J$ butterfly factors (where $N = 2^{J}$), and that the factors can be recovered by a hierarchical factorization method, which recursively factorizes the matrix under consideration into two factors. This hierarchical identifiability property relies on a simple identifiability condition in the two-layer, fixed-support setting. The approach contrasts with existing ones that fit the product of butterfly factors to a given matrix via gradient descent. In particular, the proposed method can retrieve the factorization of the Hadamard or the discrete Fourier transform matrices of size $N = 2^{J}$. Computing such factorizations costs $\mathcal{O}(N^{2})$, which is of the same order as a dense matrix-vector multiplication, while the resulting factorizations enable fast $\mathcal{O}(N \log N)$ matrix-vector multiplications and have the potential to be applied to compress deep neural networks.
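To make the structure described in the abstract concrete, here is a minimal Python sketch for the Hadamard case. It assumes the standard Kronecker-product form of the factors, $\mathbf{X}^{(\ell)} = \mathbf{I}_{2^{\ell-1}} \otimes \mathbf{H}_2 \otimes \mathbf{I}_{2^{J-\ell}}$, which is a well-known identity rather than something stated in the abstract. All helper names (`hadamard_butterfly_factors`, `fast_apply`, and so on) are hypothetical and chosen for illustration. The sketch only builds known factors and checks their product; it does not implement the hierarchical recovery method of the paper.

```python
from typing import List, Tuple

Matrix = List[List[int]]
SparseRows = List[List[Tuple[int, int]]]  # per row: (column, value) pairs

H2: Matrix = [[1, 1], [1, -1]]  # 2 x 2 Hadamard matrix


def identity(n: int) -> Matrix:
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def kron(a: Matrix, b: Matrix) -> Matrix:
    """Kronecker product of two dense matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def hadamard_butterfly_factors(J: int) -> List[Matrix]:
    """Assumed form: X^(l) = I_{2^(l-1)} (x) H2 (x) I_{2^(J-l)}, l = 1..J."""
    return [
        kron(kron(identity(2 ** (ell - 1)), H2), identity(2 ** (J - ell)))
        for ell in range(1, J + 1)
    ]


def dense_hadamard(J: int) -> Matrix:
    """Dense 2^J x 2^J Hadamard matrix via the Sylvester construction."""
    h: Matrix = [[1]]
    for _ in range(J):
        h = kron(H2, h)
    return h


def to_sparse_rows(m: Matrix) -> SparseRows:
    """Keep only the nonzero entries of each row."""
    return [[(j, v) for j, v in enumerate(row) if v != 0] for row in m]


def fast_apply(sparse_factors: List[SparseRows], v: List[int]) -> List[int]:
    """Compute X^(1) ... X^(J) v, applying the rightmost factor first."""
    for rows in reversed(sparse_factors):
        v = [sum(val * v[j] for j, val in row) for row in rows]
    return v


if __name__ == "__main__":
    J = 3
    factors = hadamard_butterfly_factors(J)
    product = factors[0]
    for f in factors[1:]:
        product = matmul(product, f)
    # Check that the J sparse factors multiply back to the dense matrix.
    print(product == dense_hadamard(J))
```

Each factor built this way has two nonzero entries per row, so applying all $J = \log_2 N$ factors in sequence touches about $2N \log_2 N$ entries, compared with $N^2$ for the dense matrix. This matches the $\mathcal{O}(N \log N)$ versus $\mathcal{O}(N^{2})$ comparison in the abstract.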


