Serving distributed inference deep learning models in serverless computing

Q1 Computer Science | IEEE Cloud Computing | Pub Date: 2022-07-01 | DOI: 10.1109/CLOUD55607.2022.00029
K. Mahajan, Rumit Desai
Citations: 1

Abstract

Serverless computing (SC) is an attractive win-win paradigm for cloud providers and customers: it gives providers greater flexibility and control over resource utilization, while customers benefit from pay-per-use pricing and freedom from capacity management. While SC has proven effective for event-triggered web applications, its use for deep learning (DL) applications remains limited, because DL applications are latency-sensitive while SC platforms are stateless. In this paper, we focus on two key problems affecting the deployment of distributed inference (DI) models on SC: resource allocation and cold-start latency. To address both problems, we propose a hybrid scheduler that identifies an optimal server resource allocation policy: it selects a container allocation from candidate allocations produced by a greedy strategy and by a deep-reinforcement-learning-based allocation model.
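The abstract describes a hybrid scheduler that picks a container allocation from candidates produced by a greedy strategy and a learned (deep-RL) policy. The abstract gives no implementation details, so the following is a minimal illustrative sketch under assumed interfaces: the stage sizes, the toy cost model, and the `rl_policy` stub are all hypothetical, not the authors' actual design.

```python
# Hypothetical sketch of a hybrid scheduler: generate one greedy candidate
# allocation and one from a learned policy, then keep the cheaper candidate.
# Names, the cost model, and the RL stub are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Server:
    name: str
    free_mem_mb: int

def greedy_allocate(stages: Dict[str, int], servers: List[Server]) -> Dict[str, str]:
    """First-fit-decreasing: place the largest model stage on the server
    with the most free memory."""
    alloc, free = {}, {s.name: s.free_mem_mb for s in servers}
    for stage, need in sorted(stages.items(), key=lambda kv: -kv[1]):
        target = max(free, key=free.get)          # server with most free memory
        if free[target] < need:
            raise RuntimeError(f"no capacity for {stage}")
        alloc[stage] = target
        free[target] -= need
    return alloc

def estimated_cost(alloc: Dict[str, str]) -> int:
    """Toy cost model: number of distinct servers used, a rough proxy for
    cross-container communication and extra cold starts."""
    return len(set(alloc.values()))

def hybrid_schedule(stages, servers, rl_policy: Callable) -> Dict[str, str]:
    """Evaluate both candidate allocations and return the cheaper one."""
    candidates = [greedy_allocate(stages, servers), rl_policy(stages, servers)]
    return min(candidates, key=estimated_cost)

# Usage: an RL-policy stub that simply packs every stage onto server "s1".
stages = {"layer0-3": 512, "layer4-7": 512, "head": 256}
servers = [Server("s1", 2048), Server("s2", 1024)]
one_server = lambda st, sv: {k: "s1" for k in st}
print(hybrid_schedule(stages, servers, one_server))
```

A real scheduler would replace `estimated_cost` with measured latency/cost estimates and `one_server` with a trained DRL policy, but the candidate-then-select structure matches what the abstract describes.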
Source journal
IEEE Cloud Computing
IEEE Cloud Computing — Computer Science, Computer Networks and Communications
CiteScore: 11.20
Self-citation rate: 0.00%
Articles published per year: 0
Journal description: Cessation. IEEE Cloud Computing is committed to the timely publication of peer-reviewed articles that provide innovative research ideas, application results, and case studies in all areas of cloud computing. Topics relating to novel theory, algorithms, performance analyses, and applications of techniques are covered. More specifically: cloud software, cloud security, trade-offs between privacy and utility of the cloud, cloud in the business environment, cloud economics, cloud governance, migrating to the cloud, cloud standards, development tools, backup and recovery, interoperability, applications management, data analytics, communications protocols, mobile cloud, private clouds, liability issues for data loss on clouds, data integration, big data, cloud education, cloud skill sets, cloud energy consumption, the architecture of cloud computing, applications in commerce, education, and industry, Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS), and Business Process as a Service (BPaaS).
Latest articles from this journal
Different in different ways: A network-analysis approach to voice and prosody in Autism Spectrum Disorder.
Layered Contention Mitigation for Cloud Storage
Towards More Effective and Explainable Fault Management Using Cross-Layer Service Topology
Bypass Container Overlay Networks with Transparent BPF-driven Socket Replacement
Event-Driven Approach for Monitoring and Orchestration of Cloud and Edge-Enabled IoT Systems