Scheduling Real-time Deep Learning Services as Imprecise Computations

Shuochao Yao, Yifan Hao, Yiran Zhao, Huajie Shao, Dongxin Liu, Shengzhong Liu, Tianshi Wang, Jinyang Li, T. Abdelzaher
{"title":"Scheduling Real-time Deep Learning Services as Imprecise Computations","authors":"Shuochao Yao, Yifan Hao, Yiran Zhao, Huajie Shao, Dongxin Liu, Shengzhong Liu, Tianshi Wang, Jinyang Li, T. Abdelzaher","doi":"10.1109/RTCSA50079.2020.9203676","DOIUrl":null,"url":null,"abstract":"The paper presents a real-time computing framework for intelligent real-time edge services, on behalf of local embedded devices that are themselves unable to support extensive computations. The work contributes to a new direction in realtime computing that develops scheduling algorithms for machine intelligence tasks that enable anytime prediction. We show that deep neural network workflows can be cast as imprecise computations, each with a mandatory part and (several) optional parts whose execution utility depends on input data. With our design, deep neural networks can be preempted before their completion and support anytime inference. The goal of the realtime scheduler is to maximize the average accuracy of deep neural network outputs while meeting task deadlines, thanks to opportunistic shedding of the least necessary optional parts. The work is motivated by the proliferation of increasingly ubiquitous but resource-constrained embedded devices (for applications ranging from autonomous cars to the Internet of Things) and the desire to develop services that endow them with intelligence. Experiments on recent GPU hardware and a state of the art deep neural network for machine vision illustrate that our scheme can increase the overall accuracy by 10% ∼ 20% while incurring (nearly) no deadline misses.","PeriodicalId":38446,"journal":{"name":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","volume":"83 1","pages":"1-10"},"PeriodicalIF":0.5000,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/RTCSA50079.2020.9203676","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 25

Abstract

The paper presents a real-time computing framework for intelligent real-time edge services, on behalf of local embedded devices that are themselves unable to support extensive computations. The work contributes to a new direction in realtime computing that develops scheduling algorithms for machine intelligence tasks that enable anytime prediction. We show that deep neural network workflows can be cast as imprecise computations, each with a mandatory part and (several) optional parts whose execution utility depends on input data. With our design, deep neural networks can be preempted before their completion and support anytime inference. The goal of the realtime scheduler is to maximize the average accuracy of deep neural network outputs while meeting task deadlines, thanks to opportunistic shedding of the least necessary optional parts. The work is motivated by the proliferation of increasingly ubiquitous but resource-constrained embedded devices (for applications ranging from autonomous cars to the Internet of Things) and the desire to develop services that endow them with intelligence. Experiments on recent GPU hardware and a state of the art deep neural network for machine vision illustrate that our scheme can increase the overall accuracy by 10% ∼ 20% while incurring (nearly) no deadline misses.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
作为不精确计算的实时深度学习服务调度
本文针对本地嵌入式设备本身无法支持大量计算的情况,提出了一种用于智能实时边缘服务的实时计算框架。这项工作为实时计算的新方向做出了贡献,为机器智能任务开发了能够随时预测的调度算法。我们表明,深度神经网络工作流可以被转换为不精确的计算,每个计算都有一个强制部分和(几个)可选部分,其执行效用取决于输入数据。通过我们的设计,深度神经网络可以在完成之前被抢占,并支持随时推理。实时调度程序的目标是最大限度地提高深度神经网络输出的平均精度,同时满足任务期限,这要归功于机会主义地放弃最不必要的可选部分。这项工作的动机是越来越普遍但资源有限的嵌入式设备(用于从自动驾驶汽车到物联网的应用)的激增,以及开发赋予它们智能的服务的愿望。在最近的GPU硬件和最先进的机器视觉深度神经网络上进行的实验表明,我们的方案可以将整体精度提高10% ~ 20%,同时(几乎)不会错过截止日期。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
CiteScore
1.70
自引率
14.30%
发文量
17
期刊最新文献
Agnostic Hardware-Accelerated Operating System for Low-End IoT Controlling High-Performance Platform Uncertainties with Timing Diversity The Role of Causality in a Formal Definition of Timing Anomalies Analyzing Fixed Task Priority Based Memory Centric Scheduler for the 3-Phase Task Model On the Trade-offs between Generalization and Specialization in Real-Time Systems
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1