BottleFit: Learning Compressed Representations in Deep Neural Networks for Effective and Efficient Split Computing

Yoshitomo Matsubara, Davide Callegaro, Sameer Singh, M. Levorato, Francesco Restuccia
DOI: 10.1109/WoWMoM54355.2022.00032
Venue: 2022 IEEE 23rd International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM)
Publication date: 2022-01-07
Citations: 16

Abstract

Although mission-critical applications require the use of deep neural networks (DNNs), their continuous execution on mobile devices results in a significant increase in energy consumption. While edge offloading can decrease energy consumption, erratic patterns in channel quality, network load, and edge server load can lead to severe disruption of the system's key operations. An alternative approach, called split computing, generates compressed representations within the model (called "bottlenecks") to reduce bandwidth usage and energy consumption. Prior work has proposed approaches that introduce additional layers, to the detriment of energy consumption and latency. For this reason, we propose a new framework called BottleFit, which, in addition to targeted DNN architecture modifications, includes a novel training strategy to achieve high accuracy even with strong compression rates. We apply BottleFit to cutting-edge DNN models in image classification, and show that BottleFit achieves 77.1% data compression with at most 0.6% accuracy loss on the ImageNet dataset, while state-of-the-art approaches such as SPINN lose up to 6% in accuracy. We experimentally measure the power consumption and latency of an image classification application running on an NVIDIA Jetson Nano board (GPU-based) and a Raspberry Pi board (GPU-less). We show that BottleFit decreases power consumption and latency respectively by up to 49% and 89% with respect to (w.r.t.) local computing, and by 37% and 55% w.r.t. edge offloading. We also compare BottleFit with state-of-the-art autoencoder-based approaches, and show that (i) BottleFit reduces power consumption and execution time respectively by up to 54% and 44% on the Jetson and 40% and 62% on the Raspberry Pi; (ii) the size of the head model executed on the mobile device is 83 times smaller. We publish the code repository for reproducibility of the results in this study.
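The split-computing idea described in the abstract can be sketched numerically: a "head" model ending in a narrow bottleneck runs on the mobile device, only the compressed bottleneck activation crosses the network, and a "tail" model on the edge server produces the final prediction. The following is a minimal NumPy sketch of that data flow; all layer sizes, weight initializations, and function names are illustrative assumptions, not the architecture or code from the paper.

```python
import numpy as np

# Hypothetical layer sizes; BottleFit inserts bottlenecks into specific
# DNN architectures (e.g., ImageNet classifiers), not this toy MLP.
INPUT_DIM, HIDDEN_DIM, BOTTLENECK_DIM, NUM_CLASSES = 3072, 512, 64, 10

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# "Head" model: runs on the mobile device and ends in a narrow
# bottleneck, so only BOTTLENECK_DIM values must be transmitted.
W_head = rng.standard_normal((INPUT_DIM, HIDDEN_DIM)).astype(np.float32) * 0.01
W_bneck = rng.standard_normal((HIDDEN_DIM, BOTTLENECK_DIM)).astype(np.float32) * 0.01

def head(x):
    return relu(relu(x @ W_head) @ W_bneck)

# "Tail" model: runs on the edge server, taking the compressed
# representation and producing the classification output.
W_tail = rng.standard_normal((BOTTLENECK_DIM, NUM_CLASSES)).astype(np.float32) * 0.01

def tail(z):
    return z @ W_tail

x = rng.standard_normal(INPUT_DIM).astype(np.float32)
z = head(x)                                 # computed on-device
payload = z.astype(np.float16)              # quantize before transmission
logits = tail(payload.astype(np.float32))   # computed at the edge

compression = 1 - payload.nbytes / x.nbytes
print(f"transmitted {payload.nbytes} B instead of {x.nbytes} B "
      f"({compression:.1%} compression)")
```

Under these toy dimensions, transmitting the float16 bottleneck instead of the raw input already yields a large reduction in bytes sent; the paper's reported 77.1% compression figure comes from its actual bottleneck placement and training strategy, which this sketch does not reproduce.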