DASS: Differentiable Architecture Search for Sparse Neural Networks

IF 2.8 3区计算机科学 Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE ACM Transactions on Embedded Computing Systems Pub Date : 2023-09-09 DOI:10.1145/3609385

Hamid Mousavi, Mohammad Loni, Mina Alibeigi, Masoud Daneshtalab

{"title":"DASS: Differentiable Architecture Search for Sparse Neural Networks","authors":"Hamid Mousavi, Mohammad Loni, Mina Alibeigi, Masoud Daneshtalab","doi":"10.1145/3609385","DOIUrl":null,"url":null,"abstract":"The deployment of Deep Neural Networks (DNNs) on edge devices is hindered by the substantial gap between performance requirements and available computational power. While recent research has made significant strides in developing pruning methods to build a sparse network for reducing the computing overhead of DNNs, there remains considerable accuracy loss, especially at high pruning ratios. We find that the architectures designed for dense networks by differentiable architecture search methods are ineffective when pruning mechanisms are applied to them. The main reason is that the current methods do not support sparse architectures in their search space and use a search objective that is made for dense networks and does not focus on sparsity. This paper proposes a new method to search for sparsity-friendly neural architectures. It is done by adding two new sparse operations to the search space and modifying the search objective. We propose two novel parametric SparseConv and SparseLinear operations in order to expand the search space to include sparse operations. In particular, these operations make a flexible search space due to using sparse parametric versions of linear and convolution operations. The proposed search objective lets us train the architecture based on the sparsity of the search space operations. Quantitative analyses demonstrate that architectures found through DASS outperform those used in the state-of-the-art sparse networks on the CIFAR-10 and ImageNet datasets. In terms of performance and hardware effectiveness, DASS increases the accuracy of the sparse version of MobileNet-v2 from 73.44% to 81.35% (+7.91% improvement) with a 3.87× faster inference time.","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":"27 1","pages":"0"},"PeriodicalIF":2.8000,"publicationDate":"2023-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Embedded Computing Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3609385","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}

引用次数: 1

Abstract

The deployment of Deep Neural Networks (DNNs) on edge devices is hindered by the substantial gap between performance requirements and available computational power. While recent research has made significant strides in developing pruning methods to build a sparse network for reducing the computing overhead of DNNs, there remains considerable accuracy loss, especially at high pruning ratios. We find that the architectures designed for dense networks by differentiable architecture search methods are ineffective when pruning mechanisms are applied to them. The main reason is that the current methods do not support sparse architectures in their search space and use a search objective that is made for dense networks and does not focus on sparsity. This paper proposes a new method to search for sparsity-friendly neural architectures. It is done by adding two new sparse operations to the search space and modifying the search objective. We propose two novel parametric SparseConv and SparseLinear operations in order to expand the search space to include sparse operations. In particular, these operations make a flexible search space due to using sparse parametric versions of linear and convolution operations. The proposed search objective lets us train the architecture based on the sparsity of the search space operations. Quantitative analyses demonstrate that architectures found through DASS outperform those used in the state-of-the-art sparse networks on the CIFAR-10 and ImageNet datasets. In terms of performance and hardware effectiveness, DASS increases the accuracy of the sparse version of MobileNet-v2 from 73.44% to 81.35% (+7.91% improvement) with a 3.87× faster inference time.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

稀疏神经网络的可微结构搜索

深度神经网络(dnn)在边缘设备上的部署受到性能要求和可用计算能力之间巨大差距的阻碍。虽然最近的研究在开发修剪方法以构建稀疏网络以减少dnn的计算开销方面取得了重大进展，但仍然存在相当大的准确性损失，特别是在高修剪比率时。我们发现用可微体系结构搜索方法为密集网络设计的体系结构在应用剪枝机制时是无效的。主要原因是当前的方法在其搜索空间中不支持稀疏架构，并且使用为密集网络设计的搜索目标，而不关注稀疏性。本文提出了一种搜索稀疏友好神经结构的新方法。它通过在搜索空间中增加两个新的稀疏操作和修改搜索目标来实现。我们提出了两种新的参数化的SparseConv和SparseLinear操作，以扩展搜索空间以包含稀疏操作。特别是，由于使用线性和卷积操作的稀疏参数版本，这些操作形成了一个灵活的搜索空间。提出的搜索目标使我们能够基于搜索空间操作的稀疏性来训练体系结构。定量分析表明，通过DASS找到的架构优于CIFAR-10和ImageNet数据集上最先进的稀疏网络中使用的架构。在性能和硬件效率方面，DASS将MobileNet-v2的稀疏版本的准确率从73.44%提高到81.35%(提高7.91%)，推理时间提高了3.87倍。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

ACM Transactions on Embedded Computing Systems 工程技术-计算机：软件工程

CiteScore

3.70

自引率

0.00%

发文量

138

审稿时长

6 months

期刊介绍： The design of embedded computing systems, both the software and hardware, increasingly relies on sophisticated algorithms, analytical models, and methodologies. ACM Transactions on Embedded Computing Systems (TECS) aims to present the leading work relating to the analysis, design, behavior, and experience with embedded computing systems.