A Reality Check on Inference at Mobile Networks Edge

Alejandro Cartas, M. Kocour, Aravindh Raman, Ilias Leontiadis, J. Luque, Nishanth R. Sastry, José Núñez-Martínez, Diego Perino, C. Segura
{"title":"移动网络边缘推断的现实检验","authors":"Alejandro Cartas, M. Kocour, Aravindh Raman, Ilias Leontiadis, J. Luque, Nishanth R. Sastry, José Núñez-Martínez, Diego Perino, C. Segura","doi":"10.1145/3301418.3313946","DOIUrl":null,"url":null,"abstract":"Edge computing is considered a key enabler to deploy Artificial Intelligence platforms to provide real-time applications such as AR/VR or cognitive assistance. Previous works show computing capabilities deployed very close to the user can actually reduce the end-to-end latency of such interactive applications. Nonetheless, the main performance bottleneck remains in the machine learning inference operation. In this paper, we question some assumptions of these works, as the network location where edge computing is deployed, and considered software architectures within the framework of a couple of popular machine learning tasks. Our experimental evaluation shows that after performance tuning that leverages recent advances in deep learning algorithms and hardware, network latency is now the main bottleneck on end-to-end application performance. We also report that deploying computing capabilities at the first network node still provides latency reduction but, overall, it is not required by all applications. Based on our findings, we overview the requirements and sketch the design of an adaptive architecture for general machine learning inference across edge locations.","PeriodicalId":131097,"journal":{"name":"Proceedings of the 2nd International Workshop on Edge Systems, Analytics and Networking","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"28","resultStr":"{\"title\":\"A Reality Check on Inference at Mobile Networks Edge\",\"authors\":\"Alejandro Cartas, M. Kocour, Aravindh Raman, Ilias Leontiadis, J. Luque, Nishanth R. Sastry, José Núñez-Martínez, Diego Perino, C. Segura\",\"doi\":\"10.1145/3301418.3313946\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Edge computing is considered a key enabler to deploy Artificial Intelligence platforms to provide real-time applications such as AR/VR or cognitive assistance. Previous works show computing capabilities deployed very close to the user can actually reduce the end-to-end latency of such interactive applications. Nonetheless, the main performance bottleneck remains in the machine learning inference operation. In this paper, we question some assumptions of these works, as the network location where edge computing is deployed, and considered software architectures within the framework of a couple of popular machine learning tasks. Our experimental evaluation shows that after performance tuning that leverages recent advances in deep learning algorithms and hardware, network latency is now the main bottleneck on end-to-end application performance. We also report that deploying computing capabilities at the first network node still provides latency reduction but, overall, it is not required by all applications. 
Based on our findings, we overview the requirements and sketch the design of an adaptive architecture for general machine learning inference across edge locations.\",\"PeriodicalId\":131097,\"journal\":{\"name\":\"Proceedings of the 2nd International Workshop on Edge Systems, Analytics and Networking\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-03-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"28\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2nd International Workshop on Edge Systems, Analytics and Networking\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3301418.3313946\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2nd International Workshop on Edge Systems, Analytics and Networking","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3301418.3313946","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 28

Abstract

Edge computing is considered a key enabler for deploying Artificial Intelligence platforms that provide real-time applications such as AR/VR or cognitive assistance. Previous work shows that computing capabilities deployed very close to the user can indeed reduce the end-to-end latency of such interactive applications. Nonetheless, the main performance bottleneck remains the machine learning inference operation. In this paper, we question some assumptions of these works, such as the network location where edge computing is deployed and the software architectures considered, within the framework of a couple of popular machine learning tasks. Our experimental evaluation shows that after performance tuning that leverages recent advances in deep learning algorithms and hardware, network latency is now the main bottleneck for end-to-end application performance. We also report that deploying computing capabilities at the first network node still reduces latency but, overall, is not required by all applications. Based on our findings, we outline the requirements and sketch the design of an adaptive architecture for general machine learning inference across edge locations.
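The trade-off the abstract describes can be made concrete with a small sketch. The following Python is not from the paper: EdgeSite, pick_site, and all latency numbers are illustrative assumptions. It encodes the abstract's two observations, namely that end-to-end latency is roughly network round-trip time plus inference time, and that an application can be placed at the deepest tier whose total latency still fits its budget, falling back to the first network node only when nothing else qualifies.

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EdgeSite:
    """A candidate inference location; all names and numbers are illustrative."""
    name: str            # e.g. "first-hop node", "metro edge", "central cloud"
    rtt_ms: float        # measured network round-trip time from the client
    inference_ms: float  # measured model inference time on this site's hardware


def pick_site(sites: List[EdgeSite], budget_ms: float) -> Optional[EdgeSite]:
    """Pick the farthest site that still meets the end-to-end latency budget.

    Rationale, following the paper's observation: once inference is fast,
    the network dominates, so an application only needs the first network
    node when its latency budget is too tight for deeper sites.
    """
    feasible = [s for s in sites if s.rtt_ms + s.inference_ms <= budget_ms]
    # Prefer the feasible site with the largest RTT: capacity farther from
    # the user is typically cheaper and more plentiful.
    return max(feasible, key=lambda s: s.rtt_ms) if feasible else None


if __name__ == "__main__":
    # Hypothetical measurements for three edge tiers.
    sites = [
        EdgeSite("first-hop node", rtt_ms=5.0, inference_ms=30.0),
        EdgeSite("metro edge", rtt_ms=15.0, inference_ms=30.0),
        EdgeSite("central cloud", rtt_ms=60.0, inference_ms=20.0),
    ]
    for budget_ms in (100.0, 40.0):
        chosen = pick_site(sites, budget_ms)
        print(f"{budget_ms:.0f} ms budget -> "
              f"{chosen.name if chosen else 'no site qualifies'}")

Under these made-up numbers, a 100 ms budget is met even from the central cloud (60 + 20 = 80 ms), while a 40 ms budget is only met at the first-hop node (5 + 30 = 35 ms), matching the paper's conclusion that first-node deployment still helps but is not required by every application.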