{"title":"无服务器计算中分布式推理深度学习模型的服务","authors":"K. Mahajan, Rumit Desai","doi":"10.1109/CLOUD55607.2022.00029","DOIUrl":null,"url":null,"abstract":"Serverless computing (SC) in an attractive win-win paradigm for cloud providers and customers, simultaneously providing greater flexibility and control over resource utilization for cloud providers while reducing costs through pay-per-use model and no capacity management for customers. While SC has been shown effective for event-triggered web applications, the use of deep learning (DL) applications on SC is limited due to latency-sensitive DL applications and stateless SC. In this paper, we focus on two key problems impacting deployment of distributed inference (DI) models on SC: resource allocation and cold start latency. To address the two problems, we propose a hybrid scheduler for identifying the optimal server resource allocation policy. The hybrid scheduler identifies container allocation based on candidate allocations from greedy strategy as well as deep reinforcement learning based allocation model.","PeriodicalId":54281,"journal":{"name":"IEEE Cloud Computing","volume":"1 1","pages":"109-111"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Serving distributed inference deep learning models in serverless computing\",\"authors\":\"K. Mahajan, Rumit Desai\",\"doi\":\"10.1109/CLOUD55607.2022.00029\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Serverless computing (SC) in an attractive win-win paradigm for cloud providers and customers, simultaneously providing greater flexibility and control over resource utilization for cloud providers while reducing costs through pay-per-use model and no capacity management for customers. While SC has been shown effective for event-triggered web applications, the use of deep learning (DL) applications on SC is limited due to latency-sensitive DL applications and stateless SC. In this paper, we focus on two key problems impacting deployment of distributed inference (DI) models on SC: resource allocation and cold start latency. To address the two problems, we propose a hybrid scheduler for identifying the optimal server resource allocation policy. 
The hybrid scheduler identifies container allocation based on candidate allocations from greedy strategy as well as deep reinforcement learning based allocation model.\",\"PeriodicalId\":54281,\"journal\":{\"name\":\"IEEE Cloud Computing\",\"volume\":\"1 1\",\"pages\":\"109-111\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Cloud Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CLOUD55607.2022.00029\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Cloud Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CLOUD55607.2022.00029","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"Computer Science","Score":null,"Total":0}
Serverless computing (SC) is an attractive win-win paradigm for cloud providers and customers: it gives cloud providers greater flexibility and control over resource utilization, while customers benefit from a pay-per-use cost model and freedom from capacity management. While SC has proven effective for event-triggered web applications, its use for deep learning (DL) applications is limited by the latency sensitivity of DL workloads and the stateless nature of SC. In this paper, we focus on two key problems impacting the deployment of distributed inference (DI) models on SC: resource allocation and cold start latency. To address both problems, we propose a hybrid scheduler that identifies the optimal server resource allocation policy. The hybrid scheduler selects a container allocation from candidate allocations produced by a greedy strategy and by a deep reinforcement learning based allocation model.
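To make the hybrid idea concrete, here is a minimal Python sketch of such a scheduler under simplifying assumptions. This is an illustration, not the authors' implementation: the Request/Server structures, the greedy candidate generator, the stubbed RL policy (a random feasible choice standing in for a trained model), and the toy cold-start cost model with its constants are all hypothetical.

```python
"""Hypothetical sketch of a hybrid container-allocation scheduler:
score a greedy candidate and an RL candidate under a cold-start-aware
cost model, then commit to the cheaper placement."""

import random
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """An inference request for one partition of a distributed model."""
    model_id: str
    partition: int
    cpu_demand: float  # fraction of one vCPU


@dataclass
class Server:
    server_id: int
    cpu_free: float
    warm_containers: set  # (model_id, partition) pairs with a warm container


def greedy_allocate(req: Request, servers: List[Server]) -> int:
    """Greedy candidate: prefer a server holding a warm container for this
    partition (avoiding a cold start), else the server with the most free CPU.
    Assumes at least one feasible server exists."""
    warm = [s for s in servers
            if (req.model_id, req.partition) in s.warm_containers
            and s.cpu_free >= req.cpu_demand]
    if warm:
        return max(warm, key=lambda s: s.cpu_free).server_id
    feasible = [s for s in servers if s.cpu_free >= req.cpu_demand]
    return max(feasible, key=lambda s: s.cpu_free).server_id


def rl_allocate(req: Request, servers: List[Server]) -> int:
    """Stand-in for the learned policy: a trained deep RL model would map a
    state (request features + cluster utilization) to a server. A random
    feasible choice keeps the sketch self-contained."""
    feasible = [s for s in servers if s.cpu_free >= req.cpu_demand]
    return random.choice(feasible).server_id


def estimated_latency(req: Request, server: Server) -> float:
    """Toy cost model: a cold start adds a fixed penalty
    (hypothetical numbers, for illustration only)."""
    cold = (req.model_id, req.partition) not in server.warm_containers
    return 0.05 + (2.0 if cold else 0.0)  # seconds


def hybrid_schedule(req: Request, servers: List[Server]) -> int:
    """Hybrid step: evaluate both candidates and commit to the cheaper one,
    updating the chosen server's state."""
    by_id = {s.server_id: s for s in servers}
    candidates = {greedy_allocate(req, servers), rl_allocate(req, servers)}
    best = min(candidates, key=lambda sid: estimated_latency(req, by_id[sid]))
    by_id[best].cpu_free -= req.cpu_demand
    by_id[best].warm_containers.add((req.model_id, req.partition))
    return best


if __name__ == "__main__":
    cluster = [Server(0, 2.0, {("resnet", 0)}), Server(1, 4.0, set())]
    req = Request("resnet", 0, 0.5)
    print("placed on server", hybrid_schedule(req, cluster))
```

In this sketch the cold-start penalty dominates the cost model, so the warm server wins whenever it is feasible; a real scheduler would learn when the RL candidate's longer-horizon placement outweighs that immediate penalty.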
Journal Introduction:
This journal has ceased publication.
IEEE Cloud Computing is committed to the timely publication of peer-reviewed articles that provide innovative research ideas, application results, and case studies in all areas of cloud computing. Topics relating to novel theory, algorithms, performance analyses, and applications of techniques are covered. More specifically: Cloud software, Cloud security, Trade-offs between privacy and utility of cloud, Cloud in the business environment, Cloud economics, Cloud governance, Migrating to the cloud, Cloud standards, Development tools, Backup and recovery, Interoperability, Applications management, Data analytics, Communications protocols, Mobile cloud, Private clouds, Liability issues for data loss on clouds, Data integration, Big data, Cloud education, Cloud skill sets, Cloud energy consumption, The architecture of cloud computing, Applications in commerce, education, and industry, Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS), Business Process as a Service (BPaaS)