Pub Date : 2025-10-22 DOI: 10.1109/TCC.2025.3624031
Jian Jiang;Qianmu Li;Pengchuan Wang;Yunhuai Liu
In the rapidly evolving landscape of cloud-edge computing, efficient resource scheduling across Kubernetes clusters is essential for optimizing microservice deployment. Traditional scheduling methods, e.g., heuristic and meta-heuristic algorithms, often struggle with the dynamic and heterogeneous nature of cloud-edge environments, as they rely on fixed parameters and lack adaptability. We propose and implement DRKC, a novel deep reinforcement learning-based approach that addresses these challenges by improving resource utilization and balancing workloads. We model the scheduling problem as a Markov decision process, enabling DRKC to learn optimal scheduling policies automatically from real-time system data without relying on predefined heuristics. DRKC synthesizes state information from multiple clusters, using multidimensional resource awareness to respond effectively to changing conditions. We evaluate DRKC in three Kubernetes clusters with thirteen nodes, using ninety-six test applications with different resource requirements. Experimental results validate the effectiveness of DRKC in enhancing overall resource efficiency and achieving superior load balancing across cloud-edge environments.
{"title":"DRKC: Deep Reinforcement Learning Enhanced Microservice Scheduling on Kubernetes Clusters in Cloud-Edge Environment","authors":"Jian Jiang;Qianmu Li;Pengchuan Wang;Yunhuai Liu","doi":"10.1109/TCC.2025.3624031","DOIUrl":"https://doi.org/10.1109/TCC.2025.3624031","url":null,"abstract":"In the rapidly evolving landscape of cloud-edge computing, efficient resource scheduling across Kubernetes clusters is essential for optimizing microservice deployment. Traditional scheduling methods, e.g., heuristic and meta-heuristic algorithms, often struggle with the dynamic and heterogeneous nature of cloud-edge environments, relying on fixed parameters and lacking adaptability. We propose and implement DRKC, a novel deep reinforcement learning-based approach that addresses these challenges by improving resource utilization and balancing workloads. We model the scheduling problem as a Markov decision process, enabling DRKC to automatically learn optimal scheduling policies from real-time system data without relying on predefined heuristics. The work synthesizes state information from multiple clusters, using multidimensional resource awareness to effectively respond to changing conditions. We evaluate our performance in three Kubernetes clusters with thirteen nodes and ninety-six test applications with different resource requirements. Experimental results validate the effectiveness of DRKC in enhancing overall resource efficiency and achieving superior load balancing across cloud-edge environments.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 4","pages":"1472-1486"},"PeriodicalIF":5.0,"publicationDate":"2025-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-10-14 DOI: 10.1109/TCC.2025.3621432
Xi Liu;Jun Liu;Weidong Li
We consider the edge-vehicle computing system (EVCS), which combines the respective advantages of edge computing and vehicle computing to provide various services. We address the problem of computation offloading in EVCS, where computing tasks and sensing tasks with limited budgets are offloaded to edge servers and vehicles. We propose a resource-sharing model in which the sensing resources of one vehicle are shared by multiple tasks. We also consider a vehicle hierarchy, where vehicles with different equipment accuracy are classified into different hierarchies; a sensing task has different values and different demands for different hierarchies. We propose a budget-feasible mechanism based on the clock auction and show that it is strategy-proof and group strategy-proof, which drives the system into an equilibrium. In addition, the proposed mechanism achieves individual rationality, budget balance, and consumer sovereignty. The mechanism consists of two algorithms, based on the ideas of dominant resources and iteration, that improve resource utilization and reduce costs. Furthermore, we analyze the approximation ratios of the two allocation algorithms. Experimental results demonstrate that the proposed mechanism achieves near-optimal value and brings higher utility for participants.
{"title":"Budget-Feasible Clock Mechanism for Hierarchical Computation Offloading in Edge-Vehicle Collaborative Computing","authors":"Xi Liu;Jun Liu;Weidong Li","doi":"10.1109/TCC.2025.3621432","DOIUrl":"https://doi.org/10.1109/TCC.2025.3621432","url":null,"abstract":"We consider the edge-vehicle computing system (EVCS), where the combination of edge computing and vehicle computing takes respective advantages to provide various services. We address the problem of computation offloading in EVSC, where the computing tasks and the sensing tasks with limited budgets are offloaded to edge servers and vehicles. The resource-sharing model is proposed, where sensing resources of one vehicle are shared by multiple tasks. We consider the vehicle hierarchy, where vehicles with different equipment accuracy are classified into different hierarchies. A sensing task has different values and different demands for different hierarchies. A budget-feasible mechanism based on the clock auction is proposed. We show our proposed mechanism is strategy-proof and group strategy-proof, this drives the system into an equilibrium. In addition, the proposed mechanism achieves individual rationality, budget balance, and consumer sovereignty. The proposed mechanism consists of two algorithms that are based on the idea of dominant resource and iteration to improve resource utilization and reduce costs. Furthermore, the approximate ratios of the two allocation algorithms are analyzed. Experimental results demonstrate that the proposed mechanism achieves the near-optimal value and brings higher utility for participants.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 4","pages":"1458-1471"},"PeriodicalIF":5.0,"publicationDate":"2025-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-09-19 DOI: 10.1109/TCC.2025.3612092
Xiaodong Shen;Jianchang Lai;Jinguang Han;Liquan Chen
As a crucial component of intelligent transportation systems, VANETs are essential for enhancing road safety and enabling efficient traffic management. To ensure secure communication, vehicles often use pseudonyms to protect their identity privacy. However, unconditional anonymity can hinder accountability, making it necessary to provide conditional privacy protection for vehicles. Conditional privacy-preserving technology not only protects the identity privacy of legitimate vehicles but also allows the real identity of a malicious vehicle to be traced. Some existing schemes lack conditional privacy protection or incur large computation and communication costs, which makes them unsuitable for resource-constrained VANET environments. Hence, we improve the existing Schnorr-based aggregate signature by eliminating bilinear pairing operations and optimizing the aggregation procedure for batch verification, and we propose a lightweight certificateless aggregate signature scheme (ECPP-CLAS) for VANETs. In our scheme, aggregation enables multiple signatures to be compressed into a single aggregated signature and verified simultaneously, thereby reducing communication overhead, and a trusted entity generates the pseudonym for the corresponding vehicle through a special construction to meet the conditional privacy-preserving requirement. The security analysis and performance evaluation show that our proposed scheme meets the expected security objectives and lightweight requirements.
{"title":"Lightweight Conditional Privacy-Preserving Scheme for VANET Communications","authors":"Xiaodong Shen;Jianchang Lai;Jinguang Han;Liquan Chen","doi":"10.1109/TCC.2025.3612092","DOIUrl":"https://doi.org/10.1109/TCC.2025.3612092","url":null,"abstract":"As a crucial component of intelligent transportation systems, VANETs are essential for enhancing road safety and enabling efficient traffic management. To ensure secure communication, vehicles often use pseudonyms to protect their identity privacy. However, unconditional anonymity can hinder accountability, making it very necessary to provide conditional privacy protection for vehicles. The conditional privacy-preserving technology not only protects the identity privacy of legitimate vehicles, but also can trace the real identity of malicious vehicles. Some existing schemes lack conditional privacy protection or have large computation and communication costs, which makes them unsuitable for resource-constrained VANETs environments. Hence, we improve the current schnorr-based aggregate signature by eliminating bilinear pairing operations, optimizing the aggregation procedure for batch verification and propose a lightweight certificateless-based aggregate signature scheme (ECPP-CLAS) for VANETs. In our scheme, the aggregation enables multiple signatures to be compressed into an aggregated signature and verified simultaneously, thereby reducing communication consumption, trusted entity generates the pseudonym for the corresponding vehicle through special construction to meet the conditional privacy-preserving requirement. The security analysis and performance evaluation show that our proposed scheme can meet the expected security objectives and lightweight requirements.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 4","pages":"1487-1497"},"PeriodicalIF":5.0,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-09-01 DOI: 10.1109/TCC.2025.3604552
Keita Emura
As a variant of PEKS (Public key Encryption with Keyword Search), Zhang et al. (IEEE Transactions on Cloud Computing 2021) introduced a secure and efficient PEKS scheme called SEPSE, where servers issue a servers-derived keyword to a sender or a receiver. In this article, we show that keyword information is revealed from the trapdoor when an adversary is allowed to issue servers-derived keyword queries twice.
{"title":"Comments on “Blockchain-Assisted Public-Key Encryption With Keyword Search Against Keyword Guessing Attacks for Cloud Storage”","authors":"Keita Emura","doi":"10.1109/TCC.2025.3604552","DOIUrl":"https://doi.org/10.1109/TCC.2025.3604552","url":null,"abstract":"As a variant of PEKS (Public key Encryption with Keyword Search), Zhang et al. (IEEE Transactions on Cloud Computing 2021) introduced a secure and efficient PEKS scheme called SEPSE, where servers issue a servers-derived keyword to a sender or a receiver. In this article, we show that information of keyword is revealed from trapdoor when an adversary is allowed to issue servers-derived keyword queries twice.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 4","pages":"1498-1499"},"PeriodicalIF":5.0,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-08-18 DOI: 10.1109/TCC.2025.3599412
Weixiao Wang;Qing Fan;Chuan Zhang;Cong Zuo;Liehuang Zhu
The rapid development of cloud computing and the increasing adoption of unstructured data impose higher requirements on cloud servers to deliver advanced query capabilities tailored for protected complex data. To preserve the privacy of outsourced graphs while supporting the shortest path query, a cornerstone of graph computing, various graph searchable encryption (GSE) schemes have been proposed. However, existing GSE schemes support only the single-user setting and rarely maintain forward security, limiting data sharing and value extraction. Therefore, we propose a forward-secure GSE scheme that lets multiple users query exact shortest paths. Specifically, our encryption structure seamlessly incorporates the randomizable distributed key-homomorphic pseudorandom function (RDPRF) for multi-user authentication and reduces database update costs. We then build a dual-server architecture with a secure equality test protocol for queries. To our knowledge, our GSE scheme is the first to guarantee forward security without a trusted proxy while supporting multi-user exact shortest path queries. We formalize the leakage functions, model the dynamic multi-user GSE scheme, and provide a formal security proof under reasonable leakage. Finally, we conduct experiments on ten real-world graph datasets of different scales and demonstrate the feasibility of our scheme.
{"title":"Forward-Secure Multi-User Graph Searchable Encryption for Exact Shortest Path Queries","authors":"Weixiao Wang;Qing Fan;Chuan Zhang;Cong Zuo;Liehuang Zhu","doi":"10.1109/TCC.2025.3599412","DOIUrl":"https://doi.org/10.1109/TCC.2025.3599412","url":null,"abstract":"The rapid development of cloud computing and increasing adoption of unstructured data impose higher requirements on cloud servers to deliver advanced query capabilities tailored for protected complex data. To provide outsourced graph privacy and support the shortest path query, a cornerstone of graph computing, various graph searchable encryption (GSE) schemes have been proposed. However, those GSE schemes are only for single-user setting and barely keep forward security, limiting data sharing and value extraction. Therefore, we propose a forward-secure GSE scheme for multi-user querying the exact shortest path. Specifically, our designed encryption structure seamlessly combines the randomizable distributed key-homomorphic pseudorandom function (RDPRF) for multi-user authentication and reduces database update. We then build a dual-server architecture with secure equality test protocol for query. To our knowledge, our GSE scheme is the first to guarantee forward security without a trusted proxy and support multi-user querying the exact shortest path. We formalize leakage functions and model the dynamic multi-user GSE scheme. Formal security proof is offered under reasonable leakage. Finally, we conduct experiments on ten real-world graph datasets with different scales and exemplify the feasibility of our scheme.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 4","pages":"1446-1457"},"PeriodicalIF":5.0,"publicationDate":"2025-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-08-04 DOI: 10.1109/TCC.2025.3595172
Yuchen Zhang;Shuai Jin;Zhenyu Wen;Shibo He;Qingzheng Hou;Yang Song;Zhigang Zong;Xiaomin Wu;Bengbeng Xue;Chenghao Sun;Ku Li;Xing Li;Biao Lyu;Rong Wen;Jiming Chen;Shunmin Zhu
Load balancers (LBs) are crucial in cloud environments, ensuring workload scalability. They route packets destined for a service (identified by a virtual IP address, or VIP) to a group of servers designated to deliver that service, each with its own direct IP address (DIP). Consequently, LBs significantly impact the performance of cloud services and the experience of tenants. Many academic studies focus on specific issues such as designing new load balancing algorithms and developing hardware load balancing devices to enhance the LB’s performance, reliability, and scalability. However, we believe this approach is not ideal for cloud data centers for the following reasons: (i) the increasing demands of users and the variety of cloud service types turn the LB into a bottleneck; and (ii) continually adding machines or upgrading hardware devices can incur substantial costs. In this paper, we propose the Next Generation Load Balancer (NGLB), designed to remove the TCP connection datapath from the LB, thereby eliminating the latency overheads and scalability bottlenecks of traditional cloud LBs. The LB participates only in the TCP connection establishment phase. The three key features of our design are: (i) the introduction of an active address learning model to redirect traffic and bypass the LB, (ii) a multi-tenant isolation mechanism for deployment within multi-tenant Virtual Private Cloud networks, and (iii) a distributed flow control method, known as the hierarchical connection cleaner, designed to ensure the availability of backend resources. The evaluation results demonstrate that NGLB reduces latency by 16% and increases throughput nearly 3×. With the same LB resources, NGLB achieves a 10× higher rate of new connection establishment. More importantly, five years of operational experience has proven NGLB’s stability for high-bandwidth services.
{"title":"Cloud Load Balancers Need to Stay Off the Data Path","authors":"Yuchen Zhang;Shuai Jin;Zhenyu Wen;Shibo He;Qingzheng Hou;Yang Song;Zhigang Zong;Xiaomin Wu;Bengbeng Xue;Chenghao Sun;Ku Li;Xing Li;Biao Lyu;Rong Wen;Jiming Chen;Shunmin Zhu","doi":"10.1109/TCC.2025.3595172","DOIUrl":"https://doi.org/10.1109/TCC.2025.3595172","url":null,"abstract":"Load balancers (LBs) are crucial in cloud environments, ensuring workload scalability. They route packets destined for a service (identified by a virtual IP address, or VIP) to a group of servers designated to deliver that service, each with its direct IP address (DIP). Consequently, LBs significantly impact the performance of cloud services and the experience of tenants. Many academic studies focus on specific issues such as designing new load balancing algorithms and developing hardware load balancing devices to enhance the LB’s performance, reliability, and scalability. However, we believe this approach is not ideal for cloud data centers for the following reasons: (i) the increasing demands of users and the variety of cloud service types turn the LB into a bottleneck; and (ii) continually adding machines or upgrading hardware devices can incur substantial costs. In this paper, we propose the Next Generation Load Balancer (NGLB), designed to bypass the TCP connection datapath from the LB, thereby eliminating latency overheads and scalability bottlenecks of traditional cloud LBs. The LB only participates in the TCP connection establishment phase. The three key features of our design are: (i) the introduction of an <italic>active address learning</i> model to redirect traffic and bypass the LB, (ii) a <italic>multi-tenant isolation</i> mechanism for deployment within multi-tenant Virtual Private Cloud networks, and (iii) a distributed flow control method, known as <italic>hierarchical connection cleaner</i>, designed to ensure the availability of backend resources. The evaluation results demonstrate that NGLB reduces latency by 16% and increases nearly 3× throughput. With the same LB resources, NGLB improves 10× rate of new connection establishment. More importantly, five years of operational experience has proven NGLB’s stability for high-bandwidth services.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 3","pages":"1078-1090"},"PeriodicalIF":5.0,"publicationDate":"2025-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144998282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-07-31 DOI: 10.1109/TCC.2025.3594575
Haiyong Bao;Lu Xing;Honglin Wu;Menghong Guan;Na Ruan;Cheng Huang;Hong-Ning Dai
With the explosion of Big Data in cloud environments, data owners tend to delegate storage and computation to cloud servers. Since cloud servers are generally untrustworthy, data owners often encrypt data before outsourcing it to the cloud. Numerous privacy-preserving schemes for the multi-keyword ranked query have been proposed, but most of these schemes do not support ciphertext access control, which can easily lead to malicious access by unauthorized users, causing serious damage to personal privacy and commercial secrets. To address these challenges, we propose an efficient and privacy-preserving multi-keyword ranked query scheme (MKAC) that supports ciphertext access control. Specifically, to enhance the efficiency of the multi-keyword ranked query, we employ a vantage point (VP) tree to organize the keyword index. Additionally, we develop a VP tree-based multi-keyword ranked query algorithm, which uses a pruning strategy to minimize the number of nodes searched. Next, we propose a privacy-preserving multi-keyword ranked query scheme that combines asymmetric scalar-product-preserving encryption with the VP tree. Furthermore, an attribute-based encryption mechanism is used to generate the decryption key based on the query user’s attributes; this key is then employed to decrypt the query results and to trace any malicious query user who leaks the secret key. Finally, a rigorous analysis of the security of MKAC is conducted. The extensive experimental evaluation shows that the proposed scheme is efficient and practical.
{"title":"MKAC: Efficient and Privacy-Preserving Multi- Keyword Ranked Query With Ciphertext Access Control in Cloud Environments","authors":"Haiyong Bao;Lu Xing;Honglin Wu;Menghong Guan;Na Ruan;Cheng Huang;Hong-Ning Dai","doi":"10.1109/TCC.2025.3594575","DOIUrl":"https://doi.org/10.1109/TCC.2025.3594575","url":null,"abstract":"With the explosion of Big Data in cloud environments, data owners tend to delegate the storage and computation to cloud servers. Since cloud servers are generally untrustworthy, data owners often encrypt data before outsourcing it to the cloud. Numerous privacy-preserving schemes for the multi-keyword ranked query have been proposed, but most of these schemes do not support ciphertext access control, which can easily lead to malicious access by unauthorized users, causing serious damage to personal privacy and commercial secrets. To address the above challenges, we propose an efficient and privacy-preserving multi-keyword ranked query scheme (MKAC) that supports ciphertext access control. Specifically, in order to enhance the efficiency of the multi-keyword ranked query, we employ a vantage point (VP) tree to organize the keyword index. Additionally, we develop a VP tree-based multi-keyword ranked query algorithm, which utilizes the pruning strategy to minimize the number of nodes to search. Next, we propose a privacy-preserving multi-keyword ranked query scheme that combines asymmetric scalar-product-preserving encryption with the VP tree. Furthermore, attribute-based encryption mechanism is used to generate the decryption key based on the query user’s attributes, which is then employed to decrypt the query results and trace any malicious query user who may leak the secret key. Finally, a rigorous analysis of the security of MKAC is conducted. The extensive experimental evaluation shows that the proposed scheme is efficient and practical.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 3","pages":"1065-1077"},"PeriodicalIF":5.0,"publicationDate":"2025-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144996104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-07-22 DOI: 10.1109/TCC.2025.3591482
Hongmin Geng;Yuepeng Li;Sheng Wang;Lin Gu;Deze Zeng
The booming development of artificial intelligence (AI) applications has greatly promoted edge intelligence technology. To support latency-sensitive Deep Neural Network (DNN) based applications, integrating the serverless inference paradigm into edge intelligence has become a widely recognized solution. However, the long DNN model downloading time from central clouds to edge servers hinders inference performance and calls for establishing a model repository within the edge cloud. This paper first identifies the inherent layer redundancy in DNN models, which can potentially improve the storage efficiency of the model repository in the edge cloud. However, how to exploit the layer redundancy feature and allocate the DNN layers across different edge servers with capacitated storage resources to reduce the model downloading time remains challenging. To address this issue, we first formulate the problem in Quadratic Integer Programming (QIP) form, based on which we propose a randomized-rounding, layer-redundancy-aware DNN model storage planning strategy. Our approach significantly reduces model downloading time by up to 63% compared to state-of-the-art methods, as demonstrated through extensive trace-driven experiments.
{"title":"Layer Redundancy Aware DNN Model Repository Planning for Fast Model Download in Edge Cloud","authors":"Hongmin Geng;Yuepeng Li;Sheng Wang;Lin Gu;Deze Zeng","doi":"10.1109/TCC.2025.3591482","DOIUrl":"https://doi.org/10.1109/TCC.2025.3591482","url":null,"abstract":"The booming development of artificial intelligence (AI) applications has greatly promoted edge intelligence technology. To support latency-sensitive Deep Neural Network (DNN) based applications, the integration of serverless inference paradigm into edge intelligence has become a widely recognized solution. However, the long DNN model downloading time from central clouds to edge servers hinders inference performance, and asks for establishing model repository within the edge cloud. This paper first identifies the inherent layer redundancy in DNN models, which is potentially beneficial to improve the storage efficiency of the model repository in the edge cloud. However, how to exploit the layer redundancy feature and allocate the DNN layers across different edge servers with capacitated storage resources to reduce the model downloading time remains challenging. To address this issue, we first formulate this problem in Quadratic Integer Programming (QIP) form, based on which a randomized rounding layer redundancy aware DNN model storage planning strategy is proposed. Our approach significantly reduces model downloading time by up to 63% compared to state-of-the-art methods, as demonstrated through extensive trace-driven experiments.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 3","pages":"1038-1049"},"PeriodicalIF":5.0,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144997130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-07-22 DOI: 10.1109/TCC.2025.3591549
Genxin Chen;Jin Qi;Xingjian Zhu;Jialin Hua;Zhenjiang Dong;Yanfei Sun
The surge in the development of artificial intelligence has increased the complexity of computational tasks and the resource demands in cloud computing scenarios. Intelligent scheduling methods have therefore become a crucial research area. Solving complex scheduling problems requires observing as many problem features and long decision sequences as possible. To address the workflow scheduling problem under the limited capabilities of models, this article first formulates the workflow reduction and cross-view workflow scheduling problems, describing the optimization objectives and constraints of each. Second, a cross-view intelligent scheduling method implemented via cloud computing workflow reduction (CSCR) is proposed, including a workflow reduction sorting algorithm (Task-priority ranker), an intelligent reduction algorithm (Workflow view-transformer), and a cross-view intelligent scheduling algorithm (Joint-scheduler). We also propose an intelligent scheduling architecture under the workflow reduction paradigm. By reducing the workflow, we provide multiple views that support the decision-making of deep reinforcement learning-based scheduling models and coordinate the workflow views before and after reduction to achieve cross-view joint scheduling. Experimental results show that CSCR outperforms four other algorithms by at least 42.1%, 43.2%, and 33.3% on three workflow reduction indicators, significantly improving the effectiveness of the employed scheduling model.
{"title":"CSCR: A Cross-View Intelligent Scheduling Method Implemented via Cloud Computing Workflow Reduction","authors":"Genxin Chen;Jin Qi;Xingjian Zhu;Jialin Hua;Zhenjiang Dong;Yanfei Sun","doi":"10.1109/TCC.2025.3591549","DOIUrl":"https://doi.org/10.1109/TCC.2025.3591549","url":null,"abstract":"The surge in the development of artificial intelligence has led to increases in the complexity of computational tasks and the resource demands within cloud computing scenarios. Therefore, intelligent scheduling methods have formed a crucial research area. Solving complex scheduling problems requires many problem feature and long-sequence decision-making observations as possible. To address the workflow scheduling problem under the limited capabilities of models, workflow reduction and cross-view workflow scheduling problems are first proposed in this article, with the optimization objectives and constraints of each problem described. Second, a cross-view intelligent scheduling method implemented via cloud computing workflow reduction (CSCR), including a workflow reduction sorting algorithm (Task-priority ranker), an intelligent reduction algorithm (Workflow view-transformer), and a cross-view intelligent scheduling algorithm (Joint-scheduler), is proposed. We also propose an intelligent scheduling architecture under the workflow reduction paradigm. By reducing the workflow, we provide multiple views that support the decision-making processes of deep reinforcement learning-based scheduling models and coordinate workflow views before and after the reduction step to achieve cross-view joint scheduling. Experimental results show that CSCR achieves minimum advantages of 42.1%, 43.2%, and 33.3% in terms of three workflow reduction indicators over four other algorithms, significantly optimizing the effect of the employed scheduling model.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 3","pages":"1050-1064"},"PeriodicalIF":5.0,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144998011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-07-14 DOI: 10.1109/TCC.2025.3588681
Cinthya Celina Tamayo Gonzalez;Ijaz Ahmad;Simone Soderi;Erkki Harjula
The rapid proliferation of the Internet of Things (IoT) and edge computing devices calls for solutions that deliver low latency, energy efficiency, and robust security—often challenging goals to balance simultaneously. This paper introduces a novel nanoservice-based framework that dynamically adapts to changing demands while achieving sustainable and secure edge operations. By breaking down functionalities into specialized and narrowly scoped nanoservices that are requested only as needed and eliminated when idle, the approach significantly reduces latency and energy usage compared to conventional, more static methods. Moreover, integrating a Zero-Trust Architecture (ZTA) ensures that every component—computational or security-related—is continuously verified and restricted through strict access controls and micro-segmentation. This framework’s adaptability extends uniformly to all nanoservices, including those providing security features, thereby maintaining strong protective measures even as workloads and network conditions evolve. Experimental evaluations on IoT devices under varying workloads demonstrate that the proposed approach significantly reduces energy consumption and latency while maintaining security and scalability. These results underscore the potential for an integrated, flexible model that simultaneously addresses energy efficiency, performance, and security—an essential trifecta in future edge computing environments.
{"title":"Securing and Sustaining IoT Edge-Computing Architectures Through Nanoservice Integration","authors":"Cinthya Celina Tamayo Gonzalez;Ijaz Ahmad;Simone Soderi;Erkki Harjula","doi":"10.1109/TCC.2025.3588681","DOIUrl":"https://doi.org/10.1109/TCC.2025.3588681","url":null,"abstract":"The rapid proliferation of the Internet of Things (IoT) and edge computing devices calls for solutions that deliver low latency, energy efficiency, and robust security—often challenging goals to balance simultaneously. This paper introduces a novel nanoservice-based framework that dynamically adapts to changing demands while achieving sustainable and secure edge operations. By breaking down functionalities into specialized and narrowly scoped nanoservices that are requested only as needed and eliminated when idle, the approach significantly reduces latency and energy usage compared to conventional, more static methods. Moreover, integrating a Zero-Trust Architecture (ZTA) ensures that every component—computational or security-related—is continuously verified and restricted through strict access controls and micro-segmentation. This framework’s adaptability extends uniformly to all nanoservices, including those providing security features, thereby maintaining strong protective measures even as workloads and network conditions evolve. Experimental evaluations on IoT devices under varying workloads demonstrate that the proposed approach significantly reduces energy consumption and latency while maintaining security and scalability. These results underscore the potential for an integrated, flexible model that simultaneously addresses energy efficiency, performance, and security—an essential trifecta in future edge computing environments.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 3","pages":"1026-1037"},"PeriodicalIF":5.0,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144998169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}