A Reality Check on Inference at Mobile Networks Edge

Alejandro Cartas, M. Kocour, Aravindh Raman, Ilias Leontiadis, J. Luque, Nishanth R. Sastry, José Núñez-Martínez, Diego Perino, C. Segura
{"title":"移动网络边缘推断的现实检验","authors":"Alejandro Cartas, M. Kocour, Aravindh Raman, Ilias Leontiadis, J. Luque, Nishanth R. Sastry, José Núñez-Martínez, Diego Perino, C. Segura","doi":"10.1145/3301418.3313946","DOIUrl":null,"url":null,"abstract":"Edge computing is considered a key enabler to deploy Artificial Intelligence platforms to provide real-time applications such as AR/VR or cognitive assistance. Previous works show computing capabilities deployed very close to the user can actually reduce the end-to-end latency of such interactive applications. Nonetheless, the main performance bottleneck remains in the machine learning inference operation. In this paper, we question some assumptions of these works, as the network location where edge computing is deployed, and considered software architectures within the framework of a couple of popular machine learning tasks. Our experimental evaluation shows that after performance tuning that leverages recent advances in deep learning algorithms and hardware, network latency is now the main bottleneck on end-to-end application performance. We also report that deploying computing capabilities at the first network node still provides latency reduction but, overall, it is not required by all applications. Based on our findings, we overview the requirements and sketch the design of an adaptive architecture for general machine learning inference across edge locations.","PeriodicalId":131097,"journal":{"name":"Proceedings of the 2nd International Workshop on Edge Systems, Analytics and Networking","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"28","resultStr":"{\"title\":\"A Reality Check on Inference at Mobile Networks Edge\",\"authors\":\"Alejandro Cartas, M. Kocour, Aravindh Raman, Ilias Leontiadis, J. Luque, Nishanth R. Sastry, José Núñez-Martínez, Diego Perino, C. Segura\",\"doi\":\"10.1145/3301418.3313946\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Edge computing is considered a key enabler to deploy Artificial Intelligence platforms to provide real-time applications such as AR/VR or cognitive assistance. Previous works show computing capabilities deployed very close to the user can actually reduce the end-to-end latency of such interactive applications. Nonetheless, the main performance bottleneck remains in the machine learning inference operation. In this paper, we question some assumptions of these works, as the network location where edge computing is deployed, and considered software architectures within the framework of a couple of popular machine learning tasks. Our experimental evaluation shows that after performance tuning that leverages recent advances in deep learning algorithms and hardware, network latency is now the main bottleneck on end-to-end application performance. We also report that deploying computing capabilities at the first network node still provides latency reduction but, overall, it is not required by all applications. 
Based on our findings, we overview the requirements and sketch the design of an adaptive architecture for general machine learning inference across edge locations.\",\"PeriodicalId\":131097,\"journal\":{\"name\":\"Proceedings of the 2nd International Workshop on Edge Systems, Analytics and Networking\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-03-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"28\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2nd International Workshop on Edge Systems, Analytics and Networking\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3301418.3313946\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2nd International Workshop on Edge Systems, Analytics and Networking","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3301418.3313946","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 28

Abstract

Edge computing is considered a key enabler for deploying Artificial Intelligence platforms that provide real-time applications such as AR/VR or cognitive assistance. Previous work shows that computing capabilities deployed very close to the user can indeed reduce the end-to-end latency of such interactive applications. Nonetheless, the main performance bottleneck remains the machine learning inference operation. In this paper, we question some assumptions of these works, such as the network location where edge computing is deployed and the software architectures considered, within the framework of a couple of popular machine learning tasks. Our experimental evaluation shows that after performance tuning that leverages recent advances in deep learning algorithms and hardware, network latency is now the main bottleneck for end-to-end application performance. We also report that deploying computing capabilities at the first network node still reduces latency but, overall, is not required by all applications. Based on our findings, we outline the requirements and sketch the design of an adaptive architecture for general machine learning inference across edge locations.
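The trade-off the abstract describes can be made concrete with a small sketch. The following Python is not from the paper: EdgeSite, pick_site, and all latency numbers are illustrative assumptions. It encodes the abstract's two observations, namely that end-to-end latency is roughly network round-trip time plus inference time, and that an application can be placed at the deepest tier whose total latency still fits its budget, falling back to the first network node only when nothing else qualifies.

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EdgeSite:
    """A candidate inference location; all names and numbers are illustrative."""
    name: str            # e.g. "first-hop node", "metro edge", "central cloud"
    rtt_ms: float        # measured network round-trip time from the client
    inference_ms: float  # measured model inference time on this site's hardware


def pick_site(sites: List[EdgeSite], budget_ms: float) -> Optional[EdgeSite]:
    """Pick the farthest site that still meets the end-to-end latency budget.

    Rationale, following the paper's observation: once inference is fast,
    the network dominates, so an application only needs the first network
    node when its latency budget is too tight for deeper sites.
    """
    feasible = [s for s in sites if s.rtt_ms + s.inference_ms <= budget_ms]
    # Prefer the feasible site with the largest RTT: capacity farther from
    # the user is typically cheaper and more plentiful.
    return max(feasible, key=lambda s: s.rtt_ms) if feasible else None


if __name__ == "__main__":
    # Hypothetical measurements for three edge tiers.
    sites = [
        EdgeSite("first-hop node", rtt_ms=5.0, inference_ms=30.0),
        EdgeSite("metro edge", rtt_ms=15.0, inference_ms=30.0),
        EdgeSite("central cloud", rtt_ms=60.0, inference_ms=20.0),
    ]
    for budget_ms in (100.0, 40.0):
        chosen = pick_site(sites, budget_ms)
        print(f"{budget_ms:.0f} ms budget -> "
              f"{chosen.name if chosen else 'no site qualifies'}")

Under these made-up numbers, a 100 ms budget is met even from the central cloud (60 + 20 = 80 ms), while a 40 ms budget is only met at the first-hop node (5 + 30 = 35 ms), matching the paper's conclusion that first-node deployment still helps but is not required by every application.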