Optimizing CNN inference speed over big social data through efficient model parallelism for sustainable web of things

IF 3.4 3区 计算机科学 Q1 COMPUTER SCIENCE, THEORY & METHODS Journal of Parallel and Distributed Computing Pub Date : 2024-05-31 DOI:10.1016/j.jpdc.2024.104927
Yuhao Hu , Xiaolong Xu , Muhammad Bilal , Weiyi Zhong , Yuwen Liu , Huaizhen Kou , Lingzhen Kong
{"title":"Optimizing CNN inference speed over big social data through efficient model parallelism for sustainable web of things","authors":"Yuhao Hu ,&nbsp;Xiaolong Xu ,&nbsp;Muhammad Bilal ,&nbsp;Weiyi Zhong ,&nbsp;Yuwen Liu ,&nbsp;Huaizhen Kou ,&nbsp;Lingzhen Kong","doi":"10.1016/j.jpdc.2024.104927","DOIUrl":null,"url":null,"abstract":"<div><p>The rapid development of artificial intelligence and networking technologies has catalyzed the popularity of intelligent services based on deep learning in recent years, which in turn fosters the advancement of Web of Things (WoT). Big social data (BSD) plays an important role during the processing of intelligent services in WoT. However, intelligent BSD services are computationally intensive and require ultra-low latency. End or edge devices with limited computing power cannot realize the extremely low response latency of those services. Distributed inference of deep neural networks (DNNs) on various devices is considered a feasible solution by allocating the computing load of a DNN to several devices. In this work, an efficient model parallelism method that couples convolution layer (Conv) split with resource allocation is proposed. First, given a random computing resource allocation strategy, the Conv split decision is made through a mathematical analysis method to realize the parallel inference of convolutional neural networks (CNNs). Next, Deep Reinforcement Learning is used to get the optimal computing resource allocation strategy to maximize the resource utilization rate and minimize the CNN inference latency. Finally, simulation results show that our approach performs better than the baselines and is applicable for BSD services in WoT with a high workload.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":"192 ","pages":"Article 104927"},"PeriodicalIF":3.4000,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Parallel and Distributed Computing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0743731524000911","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0

Abstract

The rapid development of artificial intelligence and networking technologies has catalyzed the popularity of intelligent services based on deep learning in recent years, which in turn fosters the advancement of Web of Things (WoT). Big social data (BSD) plays an important role during the processing of intelligent services in WoT. However, intelligent BSD services are computationally intensive and require ultra-low latency. End or edge devices with limited computing power cannot realize the extremely low response latency of those services. Distributed inference of deep neural networks (DNNs) on various devices is considered a feasible solution by allocating the computing load of a DNN to several devices. In this work, an efficient model parallelism method that couples convolution layer (Conv) split with resource allocation is proposed. First, given a random computing resource allocation strategy, the Conv split decision is made through a mathematical analysis method to realize the parallel inference of convolutional neural networks (CNNs). Next, Deep Reinforcement Learning is used to get the optimal computing resource allocation strategy to maximize the resource utilization rate and minimize the CNN inference latency. Finally, simulation results show that our approach performs better than the baselines and is applicable for BSD services in WoT with a high workload.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
通过高效模型并行化优化 CNN 对社交大数据的推理速度,实现可持续物联网
近年来,人工智能和网络技术的快速发展推动了基于深度学习的智能服务的普及,进而促进了物联网(WoT)的发展。大社会数据(BSD)在物联网智能服务的处理过程中发挥着重要作用。然而,智能 BSD 服务是计算密集型的,需要超低延迟。计算能力有限的终端或边缘设备无法实现这些服务的超低响应延迟。通过将深度神经网络(DNN)的计算负载分配给多个设备,在不同设备上进行分布式推理被认为是一种可行的解决方案。在这项工作中,提出了一种将卷积层(Conv)拆分与资源分配相结合的高效模型并行方法。首先,给定随机计算资源分配策略,通过数学分析方法做出 Conv 分割决策,实现卷积神经网络(CNN)的并行推理。接着,利用深度强化学习(Deep Reinforcement Learning)获得最优计算资源分配策略,从而最大化资源利用率,最小化 CNN 推理延迟。最后,仿真结果表明,我们的方法比基线方法性能更好,适用于高工作量的 WoT 中的 BSD 服务。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Journal of Parallel and Distributed Computing
Journal of Parallel and Distributed Computing 工程技术-计算机:理论方法
CiteScore
10.30
自引率
2.60%
发文量
172
审稿时长
12 months
期刊介绍: This international journal is directed to researchers, engineers, educators, managers, programmers, and users of computers who have particular interests in parallel processing and/or distributed computing. The Journal of Parallel and Distributed Computing publishes original research papers and timely review articles on the theory, design, evaluation, and use of parallel and/or distributed computing systems. The journal also features special issues on these topics; again covering the full range from the design to the use of our targeted systems.
期刊最新文献
Enabling semi-supervised learning in intrusion detection systems Fault-tolerance in biswapped multiprocessor interconnection networks Editorial Board Front Matter 1 - Full Title Page (regular issues)/Special Issue Title page (special issues) Design and experimental evaluation of algorithms for optimizing the throughput of dispersed computing
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1