Self-aware collaborative edge inference with embedded devices for IIoT

IF 6.2 · CAS Region 2 (Computer Science) · Q1 in COMPUTER SCIENCE, THEORY & METHODS
Future Generation Computer Systems: The International Journal of eScience
Publication date: 2024-09-23 · DOI: 10.1016/j.future.2024.107535
Yifan Chen, Zhuoquan Yu, Yi Jin, Christine Mwase, Xin Hu, Li Da Xu, Zhuo Zou, Lirong Zheng
{"title":"Self-aware collaborative edge inference with embedded devices for IIoT","authors":"Yifan Chen ,&nbsp;Zhuoquan Yu ,&nbsp;Yi Jin ,&nbsp;Christine Mwase ,&nbsp;Xin Hu ,&nbsp;Li Da Xu ,&nbsp;Zhuo Zou ,&nbsp;Lirong Zheng","doi":"10.1016/j.future.2024.107535","DOIUrl":null,"url":null,"abstract":"<div><div>Edge inference and other compute-intensive industrial Internet of Things (IIoT) applications suffer from a bad quality of experience due to the limited and heterogeneous computing and communication resources of embedded devices. To tackle these issues, we propose a model partitioning-based self-aware collaborative edge inference framework. Specifically, the device can adaptively adjust the local model inference scheme by sensing the available computing and communication resources of surrounding devices. When the inference latency requirement cannot be met by local computation, the model should be partitioned for collaborative computation on other devices to improve the inference efficiency. Furthermore, for two typical IIoT scenarios, i.e., bursting and stacking tasks, the latency-aware and throughput-aware collaborative inference algorithms are designed, respectively. Via jointly optimizing the partition layer and collaborative device selection, the optimal inference efficiency, characterized by minimum inference latency and maximum inference throughput, can be obtained. Finally, the performance of our proposal is validated through extensive simulations and tests conducted on 10 Raspberry Pi 4Bs using popular models. Specifically, in the case of two collaborative devices, our platform reaches up to 92.59% latency reduction for bursting tasks and 16.19<span><math><mo>×</mo></math></span> throughput growth for stacking tasks. In addition, the divergence between simulations and tests ranges from 1.64% to 9.56% for bursting tasks and from 3.24% to 11.24% for stacking tasks, which indicates that the theoretical performance analyses are solid. For the general case where the data privacy is not considered and the number of collaborative devices is optimally determined, up to 14.76<span><math><mo>×</mo></math></span> throughput speed up and 84.04% latency reduction can be obtained.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2000,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X24004990","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Citations: 0

Abstract

Edge inference and other compute-intensive industrial Internet of Things (IIoT) applications suffer from poor quality of experience due to the limited and heterogeneous computing and communication resources of embedded devices. To tackle these issues, we propose a model partitioning-based self-aware collaborative edge inference framework. Specifically, a device can adaptively adjust its local model inference scheme by sensing the available computing and communication resources of surrounding devices. When the inference latency requirement cannot be met by local computation alone, the model is partitioned for collaborative computation on other devices to improve inference efficiency. Furthermore, for two typical IIoT scenarios, i.e., bursting tasks and stacking tasks, latency-aware and throughput-aware collaborative inference algorithms are designed, respectively. By jointly optimizing the partition layer and the selection of collaborative devices, the optimal inference efficiency, characterized by minimum inference latency and maximum inference throughput, can be obtained. Finally, the performance of our proposal is validated through extensive simulations and tests conducted on 10 Raspberry Pi 4Bs using popular models. Specifically, with two collaborative devices, our platform achieves up to a 92.59% latency reduction for bursting tasks and a 16.19× throughput gain for stacking tasks. In addition, the divergence between simulations and tests ranges from 1.64% to 9.56% for bursting tasks and from 3.24% to 11.24% for stacking tasks, indicating that the theoretical performance analyses are sound. For the general case where data privacy is not considered and the number of collaborative devices is optimally determined, up to a 14.76× throughput speedup and an 84.04% latency reduction can be obtained.
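The joint optimization the abstract describes can be illustrated with a small sketch. The Python below is a hypothetical, simplified model rather than the paper's algorithm: it assumes per-layer compute costs and activation sizes are known from offline profiling, models a single helper device, and exhaustively searches the cut layer that minimizes end-to-end latency (the bursting-task objective) or maximizes steady-state pipeline throughput (the stacking-task objective). All names, signatures, and numbers are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class LayerProfile:
    flops: float      # compute cost of the layer (MFLOPs); assumed profiled offline
    out_bytes: float  # size of the layer's output activation (bytes)

def latency(layers, k, input_bytes, local_speed, remote_speed, bandwidth):
    """End-to-end latency when layers [0, k) run locally and [k, n) run on
    the helper device; k == len(layers) means fully local inference."""
    local = sum(l.flops for l in layers[:k]) / local_speed
    if k == len(layers):
        return local  # nothing is offloaded, so nothing is transferred
    sent = input_bytes if k == 0 else layers[k - 1].out_bytes
    remote = sum(l.flops for l in layers[k:]) / remote_speed
    return local + sent / bandwidth + remote

def throughput(layers, k, input_bytes, local_speed, remote_speed, bandwidth):
    """Steady-state rate for a stream of stacked tasks: the slowest pipeline
    stage (local compute, transfer, or remote compute) bounds the rate."""
    local = sum(l.flops for l in layers[:k]) / local_speed
    if k == len(layers):
        return 1.0 / local
    sent = input_bytes if k == 0 else layers[k - 1].out_bytes
    remote = sum(l.flops for l in layers[k:]) / remote_speed
    return 1.0 / max(local, sent / bandwidth, remote)

def best_cut(layers, objective):
    """Exhaustive search over all cut points; objective maps a cut index k
    to a score where smaller is better (latency, or negated throughput)."""
    return min(range(len(layers) + 1), key=objective)

# Illustrative profile of a tiny 4-layer model on two unequal devices.
layers = [LayerProfile(80, 400_000), LayerProfile(120, 200_000),
          LayerProfile(150, 50_000), LayerProfile(30, 4_000)]
args = dict(input_bytes=600_000,
            local_speed=500.0, remote_speed=2000.0,  # MFLOP/s per device
            bandwidth=1_000_000.0)                   # link rate, bytes/s

k_lat = best_cut(layers, lambda k: latency(layers, k, **args))
k_thr = best_cut(layers, lambda k: -throughput(layers, k, **args))
print(f"latency-optimal cut: {k_lat}, throughput-optimal cut: {k_thr}")
```

Note that the two objectives can pick different cut layers: latency sums the stage times, while pipeline throughput is limited only by the bottleneck stage, which is why the paper treats bursting and stacking tasks with separate algorithms.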
Journal metrics
CiteScore: 19.90
Self-citation rate: 2.70%
Annual publications: 376
Average review time: 10.6 months
About the journal: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.
Latest articles in this journal
SWIM: Sliding-Window Model contrast for federated learning
Heterogeneous system list scheduling algorithm based on improved optimistic cost matrix
The Fast Inertial ADMM optimization framework for distributed machine learning
Review of deep learning-based pathological image classification: From task-specific models to foundation models
Learning protein language contrastive models with multi-knowledge representation