Joint Architecture Design and Workload Partitioning for DNN Inference on Industrial IoT Clusters

ACM Transactions on Internet Technology | Pub Date: 2023-02-23 | DOI: https://dl.acm.org/doi/10.1145/3551638 | IF 3.9 | Zone 3 (Computer Science) | JCR Q2 (Computer Science, Information Systems)
Weiwei Fang, Wenyuan Xu, Chongchong Yu, Neal N. Xiong

Abstract

The advent of Deep Neural Networks (DNNs) has empowered numerous computer-vision applications. Because DNN models are computationally intensive and Industrial Internet-of-Things (IIoT) devices are resource-constrained, it is generally very challenging to deploy and execute DNNs efficiently in industrial scenarios. Substantial research has focused on model compression or edge-cloud offloading, which trade off accuracy for efficiency or depend on high-quality infrastructure support, respectively. In this article, we present EdgeDI, a framework for executing DNN inference in a partitioned, distributed manner on a cluster of IIoT devices. To improve inference performance, EdgeDI exploits two key optimization knobs: (1) model compression based on deep architecture design, which transforms the target DNN model into a compact one that reduces the resource requirements for IIoT devices without sacrificing accuracy; and (2) distributed inference based on adaptive workload partitioning, which achieves high parallelism by adaptively balancing the workload distribution among IIoT devices under heterogeneous resource conditions. We have implemented EdgeDI on top of PyTorch and evaluated its performance with the NEU-CLS defect classification task and two typical DNN models (i.e., VGG and ResNet) on a cluster of heterogeneous Raspberry Pi devices. The results indicate that the two proposed optimization approaches significantly outperform the existing solutions in their specific domains. When they are well combined, EdgeDI can provide scalable DNN inference speedups that are very close to or even much higher than the theoretical speedup bounds, while still maintaining the desired accuracy.
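
To make the idea of workload partitioning under heterogeneous resources more concrete, the following is a minimal sketch that splits the rows of an input feature map among devices in proportion to a per-device capability score. It is an illustrative assumption, not EdgeDI's actual algorithm, which the abstract does not specify: the function names `partition_rows` and `estimated_makespan`, the capability scores, and the choice of row-wise splitting are all hypothetical.

```python
import math
from typing import List


def partition_rows(total_rows: int, capacities: List[float]) -> List[int]:
    """Split total_rows among devices in proportion to their capacities.

    Hypothetical helper: capacities are relative speed scores (rows per
    unit time). Rounding uses the largest-remainder method so that the
    shares always sum to total_rows.
    """
    if total_rows < 0 or not capacities or any(c <= 0 for c in capacities):
        raise ValueError("need total_rows >= 0 and positive capacities")
    total_capacity = sum(capacities)
    exact = [total_rows * c / total_capacity for c in capacities]
    shares = [math.floor(x) for x in exact]
    leftover = total_rows - sum(shares)
    # Give the remaining rows to the devices with the largest fractional parts.
    by_fraction = sorted(
        range(len(capacities)),
        key=lambda i: exact[i] - shares[i],
        reverse=True,
    )
    for i in by_fraction[:max(leftover, 0)]:
        shares[i] += 1
    return shares


def estimated_makespan(shares: List[int], capacities: List[float]) -> float:
    """Time until the slowest device finishes, ignoring communication cost."""
    return max(rows / cap for rows, cap in zip(shares, capacities))


if __name__ == "__main__":
    # Assumed example: three devices, the third twice as fast as the others.
    caps = [1.0, 1.0, 2.0]
    rows = partition_rows(224, caps)
    print(rows, estimated_makespan(rows, caps))
```

Under these assumptions, a proportional split lets all devices finish at roughly the same time, whereas an equal split would leave the faster device idle while the slower ones finish. A real system would also have to account for communication cost and for the overlapping border rows that convolution layers need.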

Source journal
ACM Transactions on Internet Technology (Engineering & Technology - Computer Science: Software Engineering)
CiteScore: 10.30
Self-citation rate: 1.90%
Articles published: 137
Review time: >12 weeks
About the journal: ACM Transactions on Internet Technology (TOIT) brings together many computing disciplines, including computer software engineering, computer programming languages, middleware, database management, security, knowledge discovery and data mining, networking and distributed systems, communications, and performance and scalability. TOIT will cover the results and roles of the individual disciplines and the relationships among them.