{"title":"作为不精确计算的实时深度学习服务调度","authors":"Shuochao Yao, Yifan Hao, Yiran Zhao, Huajie Shao, Dongxin Liu, Shengzhong Liu, Tianshi Wang, Jinyang Li, T. Abdelzaher","doi":"10.1109/RTCSA50079.2020.9203676","DOIUrl":null,"url":null,"abstract":"The paper presents a real-time computing framework for intelligent real-time edge services, on behalf of local embedded devices that are themselves unable to support extensive computations. The work contributes to a new direction in realtime computing that develops scheduling algorithms for machine intelligence tasks that enable anytime prediction. We show that deep neural network workflows can be cast as imprecise computations, each with a mandatory part and (several) optional parts whose execution utility depends on input data. With our design, deep neural networks can be preempted before their completion and support anytime inference. The goal of the realtime scheduler is to maximize the average accuracy of deep neural network outputs while meeting task deadlines, thanks to opportunistic shedding of the least necessary optional parts. The work is motivated by the proliferation of increasingly ubiquitous but resource-constrained embedded devices (for applications ranging from autonomous cars to the Internet of Things) and the desire to develop services that endow them with intelligence. Experiments on recent GPU hardware and a state of the art deep neural network for machine vision illustrate that our scheme can increase the overall accuracy by 10% ∼ 20% while incurring (nearly) no deadline misses.","PeriodicalId":38446,"journal":{"name":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","volume":"83 1","pages":"1-10"},"PeriodicalIF":0.5000,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":"{\"title\":\"Scheduling Real-time Deep Learning Services as Imprecise Computations\",\"authors\":\"Shuochao Yao, Yifan Hao, Yiran Zhao, Huajie Shao, Dongxin Liu, Shengzhong Liu, Tianshi Wang, Jinyang Li, T. Abdelzaher\",\"doi\":\"10.1109/RTCSA50079.2020.9203676\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The paper presents a real-time computing framework for intelligent real-time edge services, on behalf of local embedded devices that are themselves unable to support extensive computations. The work contributes to a new direction in realtime computing that develops scheduling algorithms for machine intelligence tasks that enable anytime prediction. We show that deep neural network workflows can be cast as imprecise computations, each with a mandatory part and (several) optional parts whose execution utility depends on input data. With our design, deep neural networks can be preempted before their completion and support anytime inference. The goal of the realtime scheduler is to maximize the average accuracy of deep neural network outputs while meeting task deadlines, thanks to opportunistic shedding of the least necessary optional parts. The work is motivated by the proliferation of increasingly ubiquitous but resource-constrained embedded devices (for applications ranging from autonomous cars to the Internet of Things) and the desire to develop services that endow them with intelligence. 
Experiments on recent GPU hardware and a state of the art deep neural network for machine vision illustrate that our scheme can increase the overall accuracy by 10% ∼ 20% while incurring (nearly) no deadline misses.\",\"PeriodicalId\":38446,\"journal\":{\"name\":\"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)\",\"volume\":\"83 1\",\"pages\":\"1-10\"},\"PeriodicalIF\":0.5000,\"publicationDate\":\"2020-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"25\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/RTCSA50079.2020.9203676\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/RTCSA50079.2020.9203676","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Scheduling Real-time Deep Learning Services as Imprecise Computations
The paper presents a real-time computing framework for intelligent real-time edge services, serving local embedded devices that cannot themselves support extensive computation. The work contributes to a new direction in real-time computing that develops scheduling algorithms for machine intelligence tasks that enable anytime prediction. We show that deep neural network workflows can be cast as imprecise computations, each with a mandatory part and several optional parts whose execution utility depends on the input data. With our design, deep neural networks can be preempted before completion and support anytime inference. The goal of the real-time scheduler is to maximize the average accuracy of deep neural network outputs while meeting task deadlines, by opportunistically shedding the least necessary optional parts. The work is motivated by the proliferation of increasingly ubiquitous but resource-constrained embedded devices (for applications ranging from autonomous cars to the Internet of Things) and the desire to develop services that endow them with intelligence. Experiments on recent GPU hardware and a state-of-the-art deep neural network for machine vision show that our scheme can increase overall accuracy by 10%–20% while incurring (nearly) no deadline misses.
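To make the imprecise-computation framing concrete, below is a minimal, hypothetical Python sketch (not the paper's actual scheduler): a task consists of a mandatory stage plus optional refinement stages, and optional work is shed whenever the remaining time budget would push the task past its deadline. The ImpreciseTask class, the run_anytime function, the stage_budget parameter, and the toy accuracy numbers are all illustrative assumptions introduced here.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List

# Illustrative sketch of the imprecise-computation idea described in the
# abstract: a mandatory part that must always run, and optional parts that
# refine the result but may be shed under deadline pressure.
# This is NOT the paper's algorithm; all names and values are assumptions.

@dataclass
class ImpreciseTask:
    deadline: float                       # absolute deadline (monotonic-clock seconds)
    mandatory: Callable[[], float]        # must run; returns a baseline "accuracy"
    optional: List[Callable[[float], float]] = field(default_factory=list)
    # each optional stage refines the current estimate and returns a new one


def run_anytime(task: ImpreciseTask, stage_budget: float) -> float:
    """Run the mandatory part, then execute optional parts only while they
    are expected to finish before the deadline (otherwise shed them)."""
    result = task.mandatory()
    for stage in task.optional:
        # Shed the remaining optional work if this stage cannot finish in time.
        if time.monotonic() + stage_budget > task.deadline:
            break
        result = stage(result)
    return result


if __name__ == "__main__":
    # Toy stages standing in for DNN blocks: each optional refinement adds accuracy.
    task = ImpreciseTask(
        deadline=time.monotonic() + 0.05,            # hypothetical 50 ms deadline
        mandatory=lambda: 0.60,                      # baseline accuracy after mandatory part
        optional=[lambda acc: min(1.0, acc + 0.1) for _ in range(4)],
    )
    print(f"anytime result: {run_anytime(task, stage_budget=0.01):.2f}")
```

In this sketch the "accuracy" returned by each stage is a placeholder for the utility of running additional network layers; the paper's scheduler additionally maximizes average accuracy across a set of tasks, which the per-task loop above does not attempt to model.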