{"title":"利用置换特征压缩深度神经网络的全连通层","authors":"Dara Nagaraju, Nitin Chandrachoodan","doi":"10.1049/cdt2.12060","DOIUrl":null,"url":null,"abstract":"<p>Modern deep neural networks typically have some fully connected layers at the final classification stages. These stages have large memory requirements that can be expensive on resource-constrained embedded devices and also consume significant energy just to read the parameters from external memory into the processing chip. The authors show that the weights in such layers can be modelled as permutations of a common sequence with minimal impact on recognition accuracy. This allows the storage requirements of FC layer(s) to be significantly reduced, which reflects in the reduction of total network parameters from 1.3× to 36× with a median of 4.45× on several benchmark networks. The authors compare the results with existing pruning, bitwidth reduction, and deep compression techniques and show the superior compression that can be achieved with this method. The authors also showed 7× reduction of parameters on VGG16 architecture with ImageNet dataset. The authors also showed that the proposed method can be used in the classification stage of the transfer learning networks.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"17 3-4","pages":"149-161"},"PeriodicalIF":1.1000,"publicationDate":"2023-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12060","citationCount":"0","resultStr":"{\"title\":\"Compressing fully connected layers of deep neural networks using permuted features\",\"authors\":\"Dara Nagaraju, Nitin Chandrachoodan\",\"doi\":\"10.1049/cdt2.12060\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Modern deep neural networks typically have some fully connected layers at the final classification stages. These stages have large memory requirements that can be expensive on resource-constrained embedded devices and also consume significant energy just to read the parameters from external memory into the processing chip. The authors show that the weights in such layers can be modelled as permutations of a common sequence with minimal impact on recognition accuracy. This allows the storage requirements of FC layer(s) to be significantly reduced, which reflects in the reduction of total network parameters from 1.3× to 36× with a median of 4.45× on several benchmark networks. The authors compare the results with existing pruning, bitwidth reduction, and deep compression techniques and show the superior compression that can be achieved with this method. The authors also showed 7× reduction of parameters on VGG16 architecture with ImageNet dataset. 
The authors also showed that the proposed method can be used in the classification stage of the transfer learning networks.</p>\",\"PeriodicalId\":50383,\"journal\":{\"name\":\"IET Computers and Digital Techniques\",\"volume\":\"17 3-4\",\"pages\":\"149-161\"},\"PeriodicalIF\":1.1000,\"publicationDate\":\"2023-06-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12060\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IET Computers and Digital Techniques\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1049/cdt2.12060\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IET Computers and Digital Techniques","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1049/cdt2.12060","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Compressing fully connected layers of deep neural networks using permuted features
Modern deep neural networks typically have some fully connected layers at the final classification stages. These stages have large memory requirements that can be expensive on resource-constrained embedded devices and also consume significant energy just to read the parameters from external memory into the processing chip. The authors show that the weights in such layers can be modelled as permutations of a common sequence with minimal impact on recognition accuracy. This allows the storage requirements of FC layer(s) to be significantly reduced, which reflects in the reduction of total network parameters from 1.3× to 36× with a median of 4.45× on several benchmark networks. The authors compare the results with existing pruning, bitwidth reduction, and deep compression techniques and show the superior compression that can be achieved with this method. The authors also showed 7× reduction of parameters on VGG16 architecture with ImageNet dataset. The authors also showed that the proposed method can be used in the classification stage of the transfer learning networks.
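To make the idea concrete, below is a minimal Python/NumPy sketch of one way a permutation-based FC compression could be realised: each row of a weight matrix is approximated by a single shared ("common") sequence of values reordered by a per-row permutation, so that only the common sequence plus small integer indices need to be stored. The function names and the choice of the common sequence (the element-wise average of the row-wise sorted weights) are illustrative assumptions for this sketch, not the authors' exact algorithm.

import numpy as np

def compress_fc_weights(W):
    """Approximate each row of an FC weight matrix W (out_features x in_features)
    as a permutation of one shared sequence. Illustrative sketch only."""
    # Common sequence: element-wise average of the row-wise sorted weights.
    common = np.sort(W, axis=1).mean(axis=0)            # shape (in_features,)
    # Rank of every weight within its row; this is the permutation that
    # reorders the sorted common sequence to match the row's value ordering.
    perms = np.argsort(np.argsort(W, axis=1), axis=1)   # shape (out_features, in_features)
    return common.astype(np.float32), perms.astype(np.uint16)

def decompress_fc_weights(common, perms):
    """Rebuild the approximate weight matrix from the shared sequence and permutations."""
    return common[perms]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((256, 1024)).astype(np.float32)   # a toy FC layer
    common, perms = compress_fc_weights(W)
    W_hat = decompress_fc_weights(common, perms)
    print("reconstruction RMSE:", np.sqrt(np.mean((W - W_hat) ** 2)))

In this toy accounting, a layer stores one float per column of the common sequence plus one small index per weight (16 bits here, roughly ⌈log2(in_features)⌉ bits in general) instead of a 32-bit float per weight; that replacement is the source of the memory saving the abstract refers to.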
Journal introduction:
IET Computers & Digital Techniques publishes technical papers describing recent research and development work in all aspects of digital system-on-chip design and test of electronic and embedded systems, including the development of design automation tools (methodologies, algorithms and architectures). Papers based on the problems associated with the scaling down of CMOS technology are particularly welcome. It is aimed at researchers, engineers and educators in the fields of computer and digital systems design and test.
The key subject areas of interest are:
Design Methods and Tools: CAD/EDA tools, hardware description languages, high-level and architectural synthesis, hardware/software co-design, platform-based design, 3D stacking and circuit design, system on-chip architectures and IP cores, embedded systems, logic synthesis, low-power design and power optimisation.
Simulation, Test and Validation: electrical and timing simulation, simulation based verification, hardware/software co-simulation and validation, mixed-domain technology modelling and simulation, post-silicon validation, power analysis and estimation, interconnect modelling and signal integrity analysis, hardware trust and security, design-for-testability, embedded core testing, system-on-chip testing, on-line testing, automatic test generation and delay testing, low-power testing, reliability, fault modelling and fault tolerance.
Processor and System Architectures: many-core systems, general-purpose and application specific processors, computational arithmetic for DSP applications, arithmetic and logic units, cache memories, memory management, co-processors and accelerators, systems and networks on chip, embedded cores, platforms, multiprocessors, distributed systems, communication protocols and low-power issues.
Configurable Computing: embedded cores, FPGAs, rapid prototyping, adaptive computing, evolvable and statically and dynamically reconfigurable and reprogrammable systems, reconfigurable hardware.
Design for variability, power and aging: design methods for variability, power and aging aware design, memories, FPGAs, IP components, 3D stacking, energy harvesting.
Case Studies: emerging applications, applications in industrial designs, and design frameworks.