SPACE4AI-D: A Design-Time Tool for AI Applications Resource Selection in Computing Continua
Authors: Hamta Sedghani; Federica Filippini; Danilo Ardagna
Journal: IEEE Transactions on Services Computing, vol. 17, no. 6, pp. 4324-4339
Published: 2024-10-14
DOI: 10.1109/TSC.2024.3479935
URL: https://ieeexplore.ieee.org/document/10715700/
Journal impact factor: 5.5 (JCR Q1, Computer Science, Information Systems)
Artificial Intelligence (AI) applications are becoming increasingly popular in a wide range of industries, mainly thanks to Deep Neural Networks (DNNs), which need powerful resources. Cloud computing is a promising way to serve AI applications thanks to its high processing power, but long-distance communication sometimes results in unacceptable latency. Conversely, edge computing is close to where data are generated and is therefore becoming crucial for their timely, flexible, and secure management. Given the more distributed nature of the edge and the heterogeneity of its resources, efficient component placement and resource allocation become critical in orchestrating application execution. In this paper, we formulate the resource selection and component placement problem for AI applications in a computing continuum as a Mixed Integer Non-Linear Problem (MINLP), and we propose a design-time tool to solve it efficiently. We first propose a Random Greedy algorithm that minimizes the cost of the placement while guaranteeing response time constraints. Then, we develop heuristic methods, namely Local Search, Tabu Search, Simulated Annealing, and Genetic Algorithms, to improve the initial solutions provided by the Random Greedy algorithm. To evaluate the proposed approach, we designed an extensive experimental campaign, comparing the heuristic methods with one another and then the best heuristic against the Best Cost Performance Constraint (BCPC) algorithm, a state-of-the-art approach. The results demonstrate that, in large-scale systems and under the same time limit, our approach finds solutions that cost 27.6% less on average than those of BCPC. Finally, in a validation on a real edge system that includes Function-as-a-Service (FaaS) resources, our approach finds the globally optimal solution, with a deviation of around 12% between actual and predicted costs.
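The two-stage idea in the abstract (a randomized construction of a feasible placement, followed by a heuristic that improves it) can be illustrated with a toy sketch. This is not the authors' Random Greedy or Local Search implementation: the component names, resource names, costs, execution times, the additive cost and response-time models, and the function names `random_start` and `local_search` are all hypothetical and chosen only for illustration.

```python
import random
from typing import Dict, Optional

Placement = Dict[str, str]  # component name -> resource name

# Hypothetical toy data (not from the paper): cost units per resource and
# execution time in milliseconds of each component on each resource.
COST = {"edge": 1, "vm": 3, "faas": 5}
DEMAND = {
    "detect": {"edge": 900, "vm": 400, "faas": 200},
    "classify": {"edge": 1200, "vm": 500, "faas": 300},
    "report": {"edge": 300, "vm": 200, "faas": 100},
}


def cost(placement: Placement) -> int:
    """Assumed cost model: sum of the cost of the resource chosen per component."""
    return sum(COST[res] for res in placement.values())


def response_time(placement: Placement) -> int:
    """Assumed performance model: components run in sequence, so times add up."""
    return sum(DEMAND[comp][res] for comp, res in placement.items())


def is_feasible(placement: Placement, max_time: int) -> bool:
    """A placement is feasible when it meets the response-time limit."""
    return response_time(placement) <= max_time


def random_start(max_time: int, iterations: int = 50, seed: int = 0) -> Optional[Placement]:
    """Draw random placements and keep the cheapest feasible one (None if none found)."""
    rng = random.Random(seed)
    resources = list(COST)
    best: Optional[Placement] = None
    for _ in range(iterations):
        candidate = {comp: rng.choice(resources) for comp in DEMAND}
        if not is_feasible(candidate, max_time):
            continue
        if best is None or cost(candidate) < cost(best):
            best = candidate
    return best


def local_search(placement: Placement, max_time: int) -> Placement:
    """Move one component at a time to another resource while cost strictly drops."""
    current = dict(placement)
    improved = True
    while improved:
        improved = False
        for comp in DEMAND:
            for res in COST:
                candidate = {**current, comp: res}
                if is_feasible(candidate, max_time) and cost(candidate) < cost(current):
                    current = candidate
                    improved = True
    return current


if __name__ == "__main__":
    start = random_start(max_time=1500)
    if start is not None:
        final = local_search(start, max_time=1500)
        print(final, cost(final), response_time(final))
```

In this toy instance, `local_search` started from the all-`faas` placement with a 1500 ms limit stops at a placement of cost 9, although placements of cost 7 satisfy the same limit. Single-move improvement can stall in a local optimum like this, which is one reason to consider methods such as Tabu Search, Simulated Annealing, or Genetic Algorithms.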
About the journal:
IEEE Transactions on Services Computing encompasses the computing and software aspects of the science and technology of services innovation research and development. It emphasizes algorithmic, mathematical, statistical, and computational methods central to services computing. Topics covered include Service Oriented Architecture, Web Services, Business Process Integration, Solution Performance Management, and Services Operations and Management. The journal addresses mathematical foundations, security, privacy, agreement, contract, discovery, negotiation, collaboration, and quality of service for web services. It also covers composite web service creation, business and scientific applications, standards, utility models, business process modeling, integration, and collaboration within services computing.