Enabling Low Latency Edge Intelligence based on Multi-exit DNNs in the Wild

Zhaowu Huang, Fang Dong, Dian Shen, Junxue Zhang, Huitian Wang, Guangxing Cai, Qiang He
2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS), July 2021.
DOI: 10.1109/ICDCS51616.2021.00075
Citations: 10

Abstract

In recent years, deep neural networks (DNNs) have powered a boom in artificial-intelligence Internet of Things applications with stringent demands for both high accuracy and low latency. A widely adopted solution is to process such computation-intensive DNN inference tasks with edge computing. Nevertheless, existing edge-based DNN processing methods still cannot achieve acceptable performance, owing to heavy data transmission and unnecessary computation. To address these limitations, we take advantage of multi-exit DNNs (ME-DNNs), which allow tasks to exit early at different depths of the DNN during inference, based on input complexity. However, naively deploying ME-DNNs at the edge still fails to deliver fast and consistent inference in the wild. Specifically, 1) at the model level, unsuitable exit settings introduce additional computational overhead and lead to excessive queuing delay; 2) at the computation level, it is hard to sustain high performance consistently in a dynamic edge computing environment. In this paper, we present a Low Latency Edge Intelligence Scheme based on Multi-Exit DNNs (LEIME) to tackle these problems. At the model level, we propose an exit-setting algorithm that automatically builds optimal ME-DNNs with lower time complexity; at the computation level, we present a distributed offloading mechanism that fine-tunes task dispatching at runtime to sustain high performance in dynamic environments, with a close-to-optimal performance guarantee. Finally, we implement a prototype system and extensively evaluate it through testbed and large-scale simulation experiments. Experimental results demonstrate that LEIME significantly improves application performance, achieving 1.1–18.7× speedups in different situations.
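The core ME-DNN idea the abstract describes — letting an input exit at the first depth where the attached classifier is already confident — can be sketched in a few lines. This is a minimal, framework-free illustration with hypothetical stand-in callables for the backbone stages and exit heads; it is not the paper's actual model or exit-setting algorithm, only the standard confidence-threshold early-exit pattern.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def multi_exit_infer(stages, exits, x, threshold=0.9):
    """Multi-exit inference: after each backbone stage, evaluate the
    attached exit classifier and stop as soon as its top-class
    confidence clears `threshold`; the final exit always answers.

    `stages` and `exits` are parallel lists of callables (hypothetical
    stand-ins for real backbone blocks and exit heads).
    Returns (predicted_class, exit_index)."""
    h = x
    for i, (stage, head) in enumerate(zip(stages, exits)):
        h = stage(h)              # run one more backbone stage
        probs = softmax(head(h))  # query this depth's exit head
        conf = max(probs)
        if conf >= threshold or i == len(stages) - 1:
            return probs.index(conf), i
```

"Easy" inputs (confident early head) pay only for the first stage, while "hard" inputs fall through to deeper exits — which is exactly why unsuitable exit placement, as the abstract notes, wastes computation and inflates queuing delay.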
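At the computation level, the abstract's distributed offloading mechanism fine-tunes task dispatching at runtime. The paper's own mechanism carries a close-to-optimal guarantee; as a hedged illustration only, a greedy baseline would estimate each node's completion latency (queuing + transmission + compute) and pick the minimum. All names and the `(queue_s, bandwidth_mbps, compute_s)` tuple format below are hypothetical.

```python
def dispatch(task_size_mb, servers):
    """Greedy runtime dispatch: choose the node with the lowest
    estimated completion latency. `servers` maps a node name to
    (queue_s, bandwidth_mbps, compute_s); transmission time is the
    task size in megabits divided by the link bandwidth."""
    def est_latency(name):
        queue_s, bw_mbps, compute_s = servers[name]
        transmit_s = task_size_mb * 8.0 / bw_mbps
        return queue_s + transmit_s + compute_s
    return min(servers, key=est_latency)
```

Note how the trade-off flips with task size: a small task favors a fast remote node despite the network hop, while a large task stays local once transmission dominates — the kind of dynamic the runtime mechanism must track.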