Novel Adaptive DNN Partitioning Method Based on Image-Stream Pipeline Inference between the Edge and Cloud

Chenchen Ji, Yanjun Wu, Pengpeng Hou, Yang Tai, Jiageng Yu
{"title":"Novel Adaptive DNN Partitioning Method Based on Image-Stream Pipeline Inference between the Edge and Cloud","authors":"Chenchen Ji, Yanjun Wu, Pengpeng Hou, Yang Tai, Jiageng Yu","doi":"10.1109/cniot55862.2022.00021","DOIUrl":null,"url":null,"abstract":"The cloud-only and edge-computing approaches have recently been proposed to satisfy the requirements of complex neural networks. However, the cloud-only approach generates a latency challenge because of the high data volumes that must be sent to a centralized location in the cloud. Less-powerful edge computing resources require a compression model for computation reduction, which degrades the model trading accuracy. To address this challenge, deep neural network (DNN) partitioning has become a recent trend, with DNN models being sliced into head and tail portions executed at the mobile edge devices and cloud server, respectively. We propose Edgepipe, a novel partitioning method based on pipeline inference with an image stream to automatically partition DNN computation between the edge device and cloud server, thereby reducing the global latency and enhancing the system-wide real-time performance. This method adapts to various DNN architectures, hardware platforms, and networks. Here, when evaluated on a suite of five DNN applications, Edgepipe achieves average latency speedups of 1.241× and 1.154× over the cloud-only approach and the state-of-the-art approach known as “Neurosurgeon”, respectively.","PeriodicalId":251734,"journal":{"name":"2022 3rd International Conference on Computing, Networks and Internet of Things (CNIOT)","volume":"134 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 3rd International Conference on Computing, Networks and Internet of Things (CNIOT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/cniot55862.2022.00021","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The cloud-only and edge-computing approaches have recently been proposed to satisfy the requirements of complex neural networks. However, the cloud-only approach creates a latency challenge because of the high data volumes that must be sent to a centralized location in the cloud, while less-powerful edge computing resources require a compressed model to reduce computation, which trades away model accuracy. To address this challenge, deep neural network (DNN) partitioning has become a recent trend, with DNN models being sliced into head and tail portions executed on the mobile edge device and cloud server, respectively. We propose Edgepipe, a novel partitioning method based on pipeline inference over an image stream that automatically partitions DNN computation between the edge device and cloud server, thereby reducing global latency and improving system-wide real-time performance. The method adapts to various DNN architectures, hardware platforms, and networks. When evaluated on a suite of five DNN applications, Edgepipe achieves average latency speedups of 1.241× and 1.154× over the cloud-only approach and the state-of-the-art approach known as "Neurosurgeon", respectively.
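The abstract describes splitting a DNN into a head executed on the edge device and a tail executed in the cloud, with an image stream keeping both stages and the transfer link busy in a pipeline. The sketch below illustrates that idea under stated assumptions; it is not Edgepipe's actual algorithm, and the `LayerProfile` type, per-layer timings, and bandwidth figure are all hypothetical. It picks the split point that minimizes the pipeline's bottleneck stage, which bounds the steady-state time per image.

```python
# A minimal sketch of latency-aware DNN split-point selection, in the spirit
# of the head/tail partitioning the abstract describes. All profiles and
# bandwidth figures below are hypothetical placeholders, not values from
# the paper.

from dataclasses import dataclass

@dataclass
class LayerProfile:
    name: str
    edge_ms: float    # measured execution time of this layer on the edge device
    cloud_ms: float   # measured execution time of this layer on the cloud server
    out_bytes: int    # size of this layer's output activation

def best_split(layers: list[LayerProfile], bandwidth_bps: float) -> tuple[int, float]:
    """Return (split index, steady-state ms per image).

    Splitting after index k runs layers [0..k] on the edge (the head) and
    layers [k+1..] in the cloud (the tail). With an image stream, the three
    stages (edge compute, activation transfer, cloud compute) overlap, so
    steady-state throughput is set by the slowest stage.
    """
    best_k, best_cost = -1, float("inf")
    for k in range(len(layers)):
        head_ms = sum(l.edge_ms for l in layers[: k + 1])
        tail_ms = sum(l.cloud_ms for l in layers[k + 1 :])
        xfer_ms = layers[k].out_bytes * 8 / bandwidth_bps * 1000
        cost = max(head_ms, xfer_ms, tail_ms)  # pipeline bottleneck stage
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost

if __name__ == "__main__":
    # Hypothetical profile of a small CNN; numbers are illustrative only.
    net = [
        LayerProfile("conv1", edge_ms=4.0,  cloud_ms=0.8, out_bytes=800_000),
        LayerProfile("conv2", edge_ms=6.0,  cloud_ms=1.2, out_bytes=400_000),
        LayerProfile("pool3", edge_ms=1.0,  cloud_ms=0.2, out_bytes=20_000),
        LayerProfile("fc4",   edge_ms=30.0, cloud_ms=1.5, out_bytes=4_000),
    ]
    k, ms = best_split(net, bandwidth_bps=10e6)  # assume a 10 Mbps uplink
    print(f"split after layer {k} ({net[k].name}): ~{ms:.1f} ms/image steady state")
```

With these illustrative numbers the search settles on splitting after the pooling layer, since shipping its small activation costs less than running the heavy fully connected layer on the edge; under a slower uplink the same search shifts the split later. Re-running the search as bandwidth and device load change is one plausible reading of the adaptivity the title refers to.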