Title: Elastic DNN Inference With Unpredictable Exit in Edge Computing
Authors: Jiaming Huang; Yi Gao; Wei Dong
Journal: IEEE Transactions on Mobile Computing (JCR Q1, Computer Science, Information Systems; impact factor 7.7)
Publication date: 2024-08-12
Publication type: Journal Article
DOI: 10.1109/TMC.2024.3441946
URL: https://ieeexplore.ieee.org/document/10633848/
Citation count: 0
Abstract
Multi-exit neural networks have gained popularity in edge computing because they can leverage the computing power of diverse devices. However, real-time tasks in edge applications often face frequent unpredictable exits caused by power outages or high-priority preemptions, a problem that multi-exit models have largely overlooked. To address this challenge, it is crucial to determine the appropriate exit points in the multi-exit model so that a desirable result is available whenever an unpredictable exit occurs. In this paper, we propose EINet, a sample-wise planner for real-time multi-exit deep neural networks. EINet enables efficient Elastic Inference with unpredictable exits while ensuring best-effort accuracy on various edge platforms. Our approach partitions a trained deep neural network into multiple blocks, each with its own exit. EINet uses block-wise model profiles, which record the accuracy and inference time of each block. By leveraging these profiles, EINet dynamically determines the optimal exit plan for each sample during inference. We introduce Confidence Score Predictors to adapt to the unique characteristics of input samples and employ a Search Engine to efficiently find near-optimal plans for elastic inference. Extensive evaluations of EINet using multiple deep neural networks and datasets with unpredictable exits demonstrate its superior performance. EINet exhibits significant accuracy improvements: 0.13%–16.5% compared to static plans, 0.79%–4.1% compared to other dynamic plans, and over 50% compared to predictable inference in typical scenarios.
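The planning idea in the abstract can be illustrated with a small sketch. It is not EINet's actual algorithm: it assumes the interruption time is uniformly distributed over a fixed horizon, that the available result is always the output of the most recently completed exit, and that a brute-force enumeration stands in for the Search Engine. The names `expected_accuracy`, `best_plan`, `block_time`, `exit_time`, `exit_acc`, and `horizon` are hypothetical, and the profile numbers are made up for illustration.

```python
from itertools import product


def expected_accuracy(plan, block_time, exit_time, exit_acc, horizon):
    """Expected accuracy when the interruption time is uniform on [0, horizon].

    plan[i] is True when the exit branch after block i is executed.
    Assumption: the result returned at interruption is the output of the
    most recently completed exit (accuracy 0 if no exit has completed).
    """
    elapsed = 0.0
    current_acc = 0.0  # accuracy of the result currently available
    expected = 0.0
    for run_exit, t_block, t_exit, acc in zip(plan, block_time, exit_time, exit_acc):
        step = t_block + (t_exit if run_exit else 0.0)
        end = min(elapsed + step, horizon)
        # While this step runs, an interruption yields the previous result.
        expected += current_acc * (end - elapsed) / horizon
        elapsed = end
        if elapsed >= horizon:
            return expected
        if run_exit:
            current_acc = acc
    # After the last block, the final available result holds until the horizon.
    expected += current_acc * (horizon - elapsed) / horizon
    return expected


def best_plan(block_time, exit_time, exit_acc, horizon):
    """Enumerate every exit plan (feasible only for a handful of blocks)."""
    candidates = product([False, True], repeat=len(block_time))
    return max(
        ((plan, expected_accuracy(plan, block_time, exit_time, exit_acc, horizon))
         for plan in candidates),
        key=lambda item: item[1],
    )


if __name__ == "__main__":
    # Hypothetical block-wise profile: times in ms, accuracies in [0, 1].
    plan, score = best_plan(
        block_time=[1.0, 1.0, 1.0],
        exit_time=[0.5, 0.5, 0.5],
        exit_acc=[0.6, 0.8, 0.9],
        horizon=4.5,
    )
    print(plan, round(score, 4))
```

The sketch shows the trade-off a planner has to balance: executing an exit branch costs time that delays deeper blocks, but skipping it leaves no fresh result if the task is interrupted. Enumeration grows as 2^N in the number of blocks, which is one reason a dedicated search procedure is needed for larger models.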
About the journal:
IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing. This includes (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering aspects like mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.