Novel Adaptive DNN Partitioning Method Based on Image-Stream Pipeline Inference between the Edge and Cloud

Chenchen Ji, Yanjun Wu, Pengpeng Hou, Yang Tai, Jiageng Yu
{"title":"Novel Adaptive DNN Partitioning Method Based on Image-Stream Pipeline Inference between the Edge and Cloud","authors":"Chenchen Ji, Yanjun Wu, Pengpeng Hou, Yang Tai, Jiageng Yu","doi":"10.1109/cniot55862.2022.00021","DOIUrl":null,"url":null,"abstract":"The cloud-only and edge-computing approaches have recently been proposed to satisfy the requirements of complex neural networks. However, the cloud-only approach generates a latency challenge because of the high data volumes that must be sent to a centralized location in the cloud. Less-powerful edge computing resources require a compression model for computation reduction, which degrades the model trading accuracy. To address this challenge, deep neural network (DNN) partitioning has become a recent trend, with DNN models being sliced into head and tail portions executed at the mobile edge devices and cloud server, respectively. We propose Edgepipe, a novel partitioning method based on pipeline inference with an image stream to automatically partition DNN computation between the edge device and cloud server, thereby reducing the global latency and enhancing the system-wide real-time performance. This method adapts to various DNN architectures, hardware platforms, and networks. Here, when evaluated on a suite of five DNN applications, Edgepipe achieves average latency speedups of 1.241× and 1.154× over the cloud-only approach and the state-of-the-art approach known as “Neurosurgeon”, respectively.","PeriodicalId":251734,"journal":{"name":"2022 3rd International Conference on Computing, Networks and Internet of Things (CNIOT)","volume":"134 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 3rd International Conference on Computing, Networks and Internet of Things (CNIOT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/cniot55862.2022.00021","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The cloud-only and edge-computing approaches have recently been proposed to satisfy the requirements of complex neural networks. However, the cloud-only approach creates a latency challenge because of the high data volumes that must be sent to a centralized location in the cloud, while less-powerful edge computing resources require a compressed model to reduce computation, which trades away model accuracy. To address this challenge, deep neural network (DNN) partitioning has become a recent trend, with DNN models being sliced into head and tail portions executed on the mobile edge device and cloud server, respectively. We propose Edgepipe, a novel partitioning method based on pipeline inference over an image stream that automatically partitions DNN computation between the edge device and cloud server, thereby reducing global latency and improving system-wide real-time performance. The method adapts to various DNN architectures, hardware platforms, and networks. When evaluated on a suite of five DNN applications, Edgepipe achieves average latency speedups of 1.241× and 1.154× over the cloud-only approach and the state-of-the-art approach known as "Neurosurgeon", respectively.
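The abstract describes splitting a DNN into a head executed on the edge device and a tail executed in the cloud, with an image stream keeping both stages and the transfer link busy in a pipeline. The sketch below illustrates that idea under stated assumptions; it is not Edgepipe's actual algorithm, and the `LayerProfile` type, per-layer timings, and bandwidth figure are all hypothetical. It picks the split point that minimizes the pipeline's bottleneck stage, which bounds the steady-state time per image.

```python
# A minimal sketch of latency-aware DNN split-point selection, in the spirit
# of the head/tail partitioning the abstract describes. All profiles and
# bandwidth figures below are hypothetical placeholders, not values from
# the paper.

from dataclasses import dataclass

@dataclass
class LayerProfile:
    name: str
    edge_ms: float    # measured execution time of this layer on the edge device
    cloud_ms: float   # measured execution time of this layer on the cloud server
    out_bytes: int    # size of this layer's output activation

def best_split(layers: list[LayerProfile], bandwidth_bps: float) -> tuple[int, float]:
    """Return (split index, steady-state ms per image).

    Splitting after index k runs layers [0..k] on the edge (the head) and
    layers [k+1..] in the cloud (the tail). With an image stream, the three
    stages (edge compute, activation transfer, cloud compute) overlap, so
    steady-state throughput is set by the slowest stage.
    """
    best_k, best_cost = -1, float("inf")
    for k in range(len(layers)):
        head_ms = sum(l.edge_ms for l in layers[: k + 1])
        tail_ms = sum(l.cloud_ms for l in layers[k + 1 :])
        xfer_ms = layers[k].out_bytes * 8 / bandwidth_bps * 1000
        cost = max(head_ms, xfer_ms, tail_ms)  # pipeline bottleneck stage
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost

if __name__ == "__main__":
    # Hypothetical profile of a small CNN; numbers are illustrative only.
    net = [
        LayerProfile("conv1", edge_ms=4.0,  cloud_ms=0.8, out_bytes=800_000),
        LayerProfile("conv2", edge_ms=6.0,  cloud_ms=1.2, out_bytes=400_000),
        LayerProfile("pool3", edge_ms=1.0,  cloud_ms=0.2, out_bytes=20_000),
        LayerProfile("fc4",   edge_ms=30.0, cloud_ms=1.5, out_bytes=4_000),
    ]
    k, ms = best_split(net, bandwidth_bps=10e6)  # assume a 10 Mbps uplink
    print(f"split after layer {k} ({net[k].name}): ~{ms:.1f} ms/image steady state")
```

With these illustrative numbers the search settles on splitting after the pooling layer, since shipping its small activation costs less than running the heavy fully connected layer on the edge; under a slower uplink the same search shifts the split later. Re-running the search as bandwidth and device load change is one plausible reading of the adaptivity the title refers to.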