ATO-EDGE: Adaptive Task Offloading for Deep Learning in Resource-Constrained Edge Computing Systems

Yihao Wang, Ling Gao, J. Ren, Rui Cao, Hai Wang, Jie Zheng, Quanli Gao
{"title":"ATO-EDGE: Adaptive Task Offloading for Deep Learning in Resource-Constrained Edge Computing Systems","authors":"Yihao Wang, Ling Gao, J. Ren, Rui Cao, Hai Wang, Jie Zheng, Quanli Gao","doi":"10.1109/ICPADS53394.2021.00025","DOIUrl":null,"url":null,"abstract":"On-device deep learning enables mobile devices to perform complex tasks, such as object detection and voice translation, regardless of the network condition. The advanced deep learning model gives an excellent performance, also leads to a heavy burden on resource-limited devices (i.e., mobile devices). To speed up the on-device deep learning. Prior studies focus on developing lightweight network architecture for real-time inference by sacrificing model accuracy. This paper presents ATO-EDGE: adaptive task offloading for deep learning based on edge computing. Considering three optimization goals, energy consumption, accuracy, and latency, ATO-EDGE leverages an offline pre-trained model to select a suitable deep learning model on a specific device to process the given task. We apply our approach to object detection and evaluate it on Jetson TX2, Xilinx ZYNQ 7020, and Raspberry 3B+. The deep learning model candidates contain ten typical object detection models trained on Microsoft COCO 2017 dataset. We obtain, on average, 28.25%, 35.44%, and 0.9 improvements respectively for latency, energy consumption, and mAP (mean average precision) when compared to the SOTA DETR model on the Raspberry Pi.","PeriodicalId":309508,"journal":{"name":"2021 IEEE 27th International Conference on Parallel and Distributed Systems (ICPADS)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 27th International Conference on Parallel and Distributed Systems (ICPADS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICPADS53394.2021.00025","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

On-device deep learning enables mobile devices to perform complex tasks, such as object detection and voice translation, regardless of network conditions. Advanced deep learning models deliver excellent performance but also place a heavy burden on resource-limited devices (i.e., mobile devices). To speed up on-device deep learning, prior studies focus on developing lightweight network architectures for real-time inference at the cost of model accuracy. This paper presents ATO-EDGE, an adaptive task-offloading framework for deep learning based on edge computing. Considering three optimization goals, energy consumption, accuracy, and latency, ATO-EDGE leverages an offline pre-trained model to select a suitable deep learning model on a specific device to process the given task. We apply our approach to object detection and evaluate it on the Jetson TX2, Xilinx ZYNQ 7020, and Raspberry Pi 3B+. The candidate pool contains ten typical object detection models trained on the Microsoft COCO 2017 dataset. Compared with the state-of-the-art DETR model on the Raspberry Pi, we obtain average improvements of 28.25% in latency, 35.44% in energy consumption, and 0.9 in mAP (mean average precision).
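The abstract does not detail how the offline pre-trained selector weighs energy consumption, accuracy, and latency against one another. As a rough illustration only, the sketch below shows one way a device-aware model selector of this kind could be structured, assuming per-device latency and energy estimates for each candidate detector and a simple weighted scoring of the three objectives. All names, fields, and numbers here are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch of a multi-objective model selector in the spirit of
# ATO-EDGE. The candidate list, cost estimates, and weights are illustrative
# only; the paper uses an offline pre-trained model for this decision.
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str          # e.g., one of the ten COCO-trained detectors
    map_score: float   # accuracy on COCO 2017 (mAP), measured offline
    latency_ms: float  # estimated inference latency on the target device
    energy_mj: float   # estimated energy per inference on the target device


def select_model(candidates, w_acc=1.0, w_lat=1.0, w_energy=1.0):
    """Pick the candidate with the best weighted accuracy/latency/energy trade-off.

    Latency and energy are normalized to [0, 1] and subtracted from the score,
    so lower cost and higher accuracy both raise it. This linear scoring is a
    stand-in for the learned selector described in the paper.
    """
    max_lat = max(c.latency_ms for c in candidates)
    max_eng = max(c.energy_mj for c in candidates)

    def score(c):
        return (w_acc * c.map_score
                - w_lat * c.latency_ms / max_lat
                - w_energy * c.energy_mj / max_eng)

    return max(candidates, key=score)


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the paper.
    pool = [
        Candidate("tiny-detector",  map_score=0.28, latency_ms=45,  energy_mj=120),
        Candidate("mid-detector",   map_score=0.38, latency_ms=110, energy_mj=300),
        Candidate("large-detector", map_score=0.44, latency_ms=420, energy_mj=950),
    ]
    best = select_model(pool, w_acc=2.0, w_lat=1.0, w_energy=1.0)
    print(f"Selected model for this device/task: {best.name}")
```

Raising `w_lat` or `w_energy` in this sketch would push the choice toward smaller detectors on constrained boards such as the Raspberry Pi 3B+, which mirrors the trade-off the paper's selector is designed to navigate per device and per task.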