QD-Compressor: a Quantization-based Delta Compression Framework for Deep Neural Networks

Shuyu Zhang, Donglei Wu, Haoyu Jin, Xiangyu Zou, Wen Xia, Xiaojia Huang
{"title":"QD-Compressor: a Quantization-based Delta Compression Framework for Deep Neural Networks","authors":"Shuyu Zhang, Donglei Wu, Haoyu Jin, Xiangyu Zou, Wen Xia, Xiaojia Huang","doi":"10.1109/ICCD53106.2021.00088","DOIUrl":null,"url":null,"abstract":"Deep neural networks (DNNs) have achieved remarkable success in many fields. Large-scale DNNs also bring storage challenges when storing snapshots for preventing clusters’ frequent failures, and bring massive internet traffic when dispatching or updating DNNs for resource-constrained devices (e.g., IoT devices, mobile phones). Several approaches are aiming to compress DNNs. The Recent work, Delta-DNN, notices high similarity existed in DNNs and thus calculates differences between them for improving the compression ratio.However, we observe that Delta-DNN, applying traditional global lossy quantization technique in calculating differences of two neighboring versions of the DNNs, can not fully exploit the data similarity between them for delta compression. This is because the parameters’ value ranges (and also the delta data in Delta-DNN) are varying among layers in DNNs, which inspires us to propose a local-sensitive quantization scheme: the quantizers are adaptive to parameters’ local value ranges in layers. Moreover, instead of quantizing differences of DNNs in Delta-DNN, our approach quantizes DNNs before calculating differences to make the differences more compressible. Besides, we also propose an error feedback mechanism to reduce DNNs’ accuracy loss caused by the lossy quantization.Therefore, we design a novel quantization-based delta compressor called QD-Compressor, which calculates the lossy differences between epochs of DNNs for saving storage cost of backing up DNNs’ snapshots and internet traffic of dispatching DNNs for resource-constrained devices. Experiments on several popular DNNs and datasets show that QD-Compressor obtains a compression ratio of 2.4× ~ 31.5× higher than the state-of-the-art approaches while well maintaining the model’s test accuracy.","PeriodicalId":154014,"journal":{"name":"2021 IEEE 39th International Conference on Computer Design (ICCD)","volume":"268 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 39th International Conference on Computer Design (ICCD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCD53106.2021.00088","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Deep neural networks (DNNs) have achieved remarkable success in many fields. Large-scale DNNs also bring storage challenges when snapshots are stored to guard against frequent cluster failures, and generate massive internet traffic when DNNs are dispatched or updated for resource-constrained devices (e.g., IoT devices, mobile phones). Several approaches aim to compress DNNs. A recent work, Delta-DNN, observes high similarity between versions of a DNN and therefore stores the differences between them to improve the compression ratio. However, we observe that Delta-DNN, which applies a traditional global lossy quantization technique when calculating the differences between two neighboring versions of a DNN, cannot fully exploit the data similarity between them for delta compression. This is because the parameters' value ranges (and thus the delta data in Delta-DNN) vary across layers in DNNs, which inspires us to propose a local-sensitive quantization scheme: the quantizers adapt to the parameters' local value ranges in each layer. Moreover, instead of quantizing the differences between DNNs as Delta-DNN does, our approach quantizes the DNNs before calculating differences, making the differences more compressible. In addition, we propose an error feedback mechanism to reduce the accuracy loss caused by lossy quantization. Based on these ideas, we design a novel quantization-based delta compressor called QD-Compressor, which computes lossy differences between epochs of a DNN to save the storage cost of backing up DNN snapshots and the internet traffic of dispatching DNNs to resource-constrained devices. Experiments on several popular DNNs and datasets show that QD-Compressor achieves a compression ratio 2.4× ~ 31.5× higher than state-of-the-art approaches while well maintaining the model's test accuracy.
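The sketch below illustrates the three ideas from the abstract (per-layer local-sensitive quantization, quantize-before-diff, and error feedback) at a high level. It is not the paper's actual implementation: the function names, the uniform binning with 256 levels, and the dict-of-arrays model representation are illustrative assumptions.

```python
import numpy as np

def quantize_layer(w, num_bins=256):
    # Local-sensitive quantization: each layer uses its OWN value range,
    # so the step size adapts to that layer's parameter distribution.
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / (num_bins - 1) if hi > lo else 1.0
    codes = np.rint((w - lo) / scale).astype(np.int16)
    return codes, lo, scale

def dequantize_layer(codes, lo, scale):
    return codes.astype(np.float32) * scale + lo

def encode_snapshot(curr, prev_codes, feedback, num_bins=256):
    """Quantize the current snapshot layer by layer, fold the previous
    quantization error back in (error feedback), then emit the integer
    delta against the previous snapshot's codes.  The delta stream is
    dominated by zeros/small values and compresses well with a
    general-purpose compressor (e.g., zstd or lzma)."""
    deltas, meta, codes_out, fb_out = {}, {}, {}, {}
    for name, w in curr.items():
        w_adj = w + feedback.get(name, 0.0)        # error feedback
        codes, lo, scale = quantize_layer(w_adj, num_bins)
        prev = prev_codes.get(name, np.zeros_like(codes))
        deltas[name] = codes - prev                # quantize first, then diff
        meta[name] = (lo, scale)
        codes_out[name] = codes
        # Remember what this epoch's quantization lost, to compensate next time.
        fb_out[name] = w_adj - dequantize_layer(codes, lo, scale)
    return deltas, meta, codes_out, fb_out
```

In this sketch, the returned deltas and per-layer (lo, scale) metadata would be serialized and fed to a generic compressor, while codes_out and fb_out are carried forward to encode the next epoch; decoding simply adds the deltas to the previous codes and dequantizes with the stored metadata.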