Cloud-Edge Inference under Communication Constraints: Data Quantization and Early Exit

2022 International Symposium on Wireless Communication Systems (ISWCS). Pub Date: 2022-10-19. DOI: 10.1109/ISWCS56560.2022.9940360

Yushu Gao, Wen Wang, Dezhi Wang, Huiqiong Wang, Zhaoyang Zhang

Citations: 1

Abstract

The inference delay of deep neural networks (DNNs) cannot always meet application requirements because of data transmission to the cloud. Cloud-edge collaboration via DNN partitioning can effectively alleviate this delay, but the communication capability between cloud and edge is usually limited. In this paper, we propose a threshold-based data quantization and exit (TDQE) framework, in which classification thresholds divide the data into different parts and determine whether each part is quantized for transmission under the communication constraints or exits the DNN early at the partition point. To find the optimal thresholds, we formulate an accuracy optimization problem under communication constraints and solve it through linear programming. To reduce the impact of quantization on accuracy that arises from differences between the parts of the data, we further adjust the quantization range for each part to refine the quantization performance. Based on the optimization results, we propose the TDQE algorithm, which constructs the DNN partitioning with classified data processing. Finally, we evaluate the proposed method by comparing it with two traditional DNN partitioning algorithms in simulation. The results show that the proposed algorithm outperforms both baselines in accuracy and meets the real-time requirements.
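
The routing idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' TDQE algorithm. It assumes that the "classification thresholds" act on the softmax confidence of an edge-side exit classifier, that each part of the data gets its own bit width and quantization range, and that the thresholds are already known (the abstract obtains them by linear programming, which is not reproduced here). The names `route_sample`, `quantize`, `exit_threshold`, and `parts` are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def quantize(values, bits, lo, hi):
    """Clip values to [lo, hi], snap them to 2**bits uniform levels,
    and return the dequantized values the cloud side would see."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    return [lo + round((min(max(v, lo), hi) - lo) / step) * step
            for v in values]

def route_sample(exit_logits, features, exit_threshold, parts):
    """Decide between early exit and quantized transmission (assumed rule).

    exit_logits    -- outputs of a hypothetical exit classifier at the partition point
    features       -- intermediate features that would be sent to the cloud
    exit_threshold -- confidence at or above which the sample exits at the edge
    parts          -- list of (min_confidence, bits, (lo, hi)), sorted by
                      min_confidence in descending order, last entry at 0.0
    """
    probs = softmax(exit_logits)
    confidence = max(probs)
    if confidence >= exit_threshold:
        # Confident enough: answer at the edge, nothing is transmitted.
        return {"action": "exit", "label": probs.index(confidence), "bits": 0}
    for min_conf, bits, (lo, hi) in parts:
        if confidence >= min_conf:
            # Each part uses its own bit width and quantization range.
            payload = quantize(features, bits, lo, hi)
            return {"action": "transmit", "payload": payload,
                    "bits": bits * len(features)}
    raise ValueError("parts must cover every confidence below exit_threshold")
```

In this sketch, less confident samples fall into later entries of `parts`, which can be given more bits or a wider range. How the paper actually assigns bit widths and ranges to each part is not stated in the abstract.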
