Cloud-Edge Inference under Communication Constraints: Data Quantization and Early Exit

Yushu Gao, Wen Wang, Dezhi Wang, Huiqiong Wang, Zhaoyang Zhang
Published in: 2022 International Symposium on Wireless Communication Systems (ISWCS), 2022-10-19
DOI: 10.1109/ISWCS56560.2022.9940360 (https://doi.org/10.1109/ISWCS56560.2022.9940360)
Citations: 1

Abstract

The inference delay of deep neural networks (DNNs) cannot always meet application requirements because of data transmission to the cloud; cloud-edge collaboration via DNN partitioning can effectively alleviate this delay. However, the communication capability between cloud and edge is usually limited. In this paper, we propose a threshold-based data quantization and exit (TDQE) framework, in which classification thresholds divide the data into different parts and determine whether to quantize the data for transmission under the communication constraints or to exit the DNN early at the partition point. To find the optimal thresholds, we formulate an accuracy optimization problem under communication constraints and solve it through linear programming. To reduce the impact of quantization on accuracy caused by the different parts of the data, we further adjust the quantization range for each part of the data to refine the quantization performance. Based on the optimization results, the TDQE algorithm is proposed to construct the DNN partitioning with classified data processing. Finally, we evaluate the proposed method by comparing it with two traditional DNN partitioning algorithms in simulation. The results show that the proposed algorithm outperforms both baselines in accuracy and meets the real-time requirements.
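
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of threshold-based routing at the partition point, not the authors' TDQE algorithm. It assumes that the edge produces a confidence score for each sample, that ascending thresholds split the confidence axis into parts, and that each part is either an early-exit part or is uniformly quantized with its own bit width and range. The names `route`, `uniform_quantize`, `EXAMPLE_THRESHOLDS`, and `EXAMPLE_PARTS`, and all numeric values, are hypothetical.

```python
from bisect import bisect_right


def uniform_quantize(values, bits, lo, hi):
    """Map each value to the nearest of 2**bits evenly spaced levels in [lo, hi]."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    levels = (1 << bits) - 1          # number of steps between lo and hi
    step = (hi - lo) / levels
    quantized = []
    for v in values:
        clipped = min(max(v, lo), hi)  # values outside the range saturate
        quantized.append(lo + round((clipped - lo) / step) * step)
    return quantized


def route(confidence, features, thresholds, parts):
    """Decide what the edge does with one sample at the partition point.

    thresholds: ascending confidence cut points (assumed, not from the paper).
    parts: one entry per part; None marks an early-exit part, otherwise a
           (bits, lo, hi) tuple giving that part's bit width and range.
    Returns ("exit", None) or ("transmit", quantized_features).
    """
    part = bisect_right(thresholds, confidence)
    setting = parts[part]
    if setting is None:
        return "exit", None            # answer locally, send nothing
    bits, lo, hi = setting
    return "transmit", uniform_quantize(features, bits, lo, hi)


# Hypothetical configuration: low-confidence samples get more bits,
# mid-confidence samples get fewer, high-confidence samples exit early.
EXAMPLE_THRESHOLDS = [0.5, 0.8]
EXAMPLE_PARTS = [(8, 0.0, 1.0), (2, 0.0, 1.0), None]

if __name__ == "__main__":
    print(route(0.95, [0.2, 0.7], EXAMPLE_THRESHOLDS, EXAMPLE_PARTS))
    print(route(0.60, [0.0, 0.4, 1.0], EXAMPLE_THRESHOLDS, EXAMPLE_PARTS))
```

In this sketch the thresholds and per-part settings are fixed by hand. In the paper they are the output of the linear program and the range adjustment described above, whose exact formulation is not stated in the abstract.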