Pub Date: 2025-03-06 DOI: 10.1109/TCC.2025.3548570
Qian Liu;Yu Zhan;Baocang Wang
Secure Multi-party Computation (MPC) is a highly active research field, and Private Set Intersection (PSI) is one of its classic subtopics. However, plain intersection computation is insufficient for many real-world scenarios, which has led to the development of various PSI variants. In this context, we propose a cloud-based multi-party private set intersection with union protocol, denoted MPSI-U. The protocol securely computes the intersection of the designated party's set with the union of the sets of all other parties, and it can be applied to scenarios such as contact tracing. MPSI-U leverages cloud servers to relieve users of the computational burden, while using the threshold BGN cryptosystem to guarantee privacy and security for all involved parties. Furthermore, we conduct a formal security analysis of the protocol under the semi-honest model to prove its resilience against potential security threats. Our performance analysis shows that MPSI-U has favorable communication and computation overhead, which makes it applicable across a range of domains and scenarios.
{"title":"Secure and Efficient Cloud-Based Multi-Party Private Set Intersection With Union Protocol","authors":"Qian Liu;Yu Zhan;Baocang Wang","doi":"10.1109/TCC.2025.3548570","DOIUrl":"https://doi.org/10.1109/TCC.2025.3548570","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 2","pages":"578-589"},"PeriodicalIF":5.3,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"144229456","RegionNum":2,"RegionCategory":"Computer Science"}
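The MPSI-U abstract above describes the output as the designated party's set intersected with the union of all other parties' sets. As a plaintext reference only, the sketch below computes that target function on unencrypted sets; it does not implement the threshold BGN protocol or any privacy protection, and the function name and the contact-tracing identifiers are hypothetical.

```python
from typing import Iterable, Set


def intersection_with_union(designated: Set[str], others: Iterable[Set[str]]) -> Set[str]:
    """Plaintext reference: designated ∩ (others[0] ∪ others[1] ∪ ...)."""
    union_of_others: Set[str] = set()
    for party_set in others:
        union_of_others |= party_set
    return designated & union_of_others


# Hypothetical contact-tracing example; all identifiers are made up.
my_contacts = {"id17", "id42", "id99"}
diagnosed_lists = [{"id03", "id42"}, {"id77"}, {"id99", "id100"}]
exposed = intersection_with_union(my_contacts, diagnosed_lists)
print(sorted(exposed))  # ['id42', 'id99']
```

A secure protocol would have to produce this same result without revealing which of the other parties contributed a matching element.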
Pub Date: 2025-03-06 DOI: 10.1109/TCC.2025.3548604
Yuchen Zhang;Long Luo;Gang Sun;Hongfang Yu;Bo Li
The explosive growth in training data and model sizes has spurred the adoption of distributed deep learning (DL) in heterogeneous computing clusters. Efficiently scheduling distributed training jobs in such heterogeneous environments while ensuring they meet user-specified deadlines remains a critical challenge. Most existing works focus on reducing job completion time in homogeneous clusters and pay little attention to meeting job deadlines in heterogeneous clusters. To address this issue, we propose Dancer (Deadline-Aware dyNamiC GPU allocation approach for Efficient Resource utilization), a novel framework that dynamically adjusts not only the number but also the type of GPUs assigned to each job throughout its training lifecycle. Dancer aims to maximize the number of jobs meeting their deadlines in heterogeneous GPU clusters. It decouples job placement from resource allocation and formulates the problem of maximizing the number of deadline-meeting jobs as an Integer Linear Programming (ILP) problem. To solve this ILP problem in real time, we propose an online algorithm with a competitive ratio guarantee, leveraging primal-dual and dynamic programming techniques. Extensive trace-driven simulations based on real-world DL workloads demonstrate that Dancer significantly outperforms state-of-the-art approaches, improving the deadline satisfaction ratio by up to 58.9%–74.2%.
{"title":"Deadline-Aware Online Job Scheduling for Distributed Training in Heterogeneous Clusters","authors":"Yuchen Zhang;Long Luo;Gang Sun;Hongfang Yu;Bo Li","doi":"10.1109/TCC.2025.3548604","DOIUrl":"https://doi.org/10.1109/TCC.2025.3548604","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 2","pages":"590-604"},"PeriodicalIF":5.3,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"144232135","RegionNum":2,"RegionCategory":"Computer Science"}
The Industrial Internet of Things provides an opportunity for flexible and collaborative manufacturing, but introduces more risk and more communication overhead from the Internet to the industrial field. To avoid attacks from unreliable service providers and requesters, an Industrial Demilitarized Zone (IDMZ) is introduced in conjunction with firewalls to provide new communication modes between edge servers and industrial devices. As the number of tasks offloaded to the edge side increases, finding an optimal task offloading that balances risk and communication overhead under a limited demilitarized buffer size becomes a challenge. Therefore, this paper establishes a mathematical model for secure task offloading in the Industrial Internet of Things that considers dense communication with different communication modes. Then, a Parallel Gbest-centric differential evolution (P-G-DE) algorithm is designed to solve this task offloading problem with a heuristic-embedded initialization strategy, a modified Gbest-centric differential evolutionary operator, and a circular-rotated parallelization scheme. The experimental results verify that the proposed method is capable of providing a high-quality solution with a lower risk and a shorter execution time in seconds, compared to six state-of-the-art evolutionary algorithms.
{"title":"Communication Intensive Task Offloading With IDMZ for Secure Industrial Edge Computing","authors":"Yuanjun Laili;Jiabei Gong;Yusheng Kong;Fei Wang;Lei Ren;Lin Zhang","doi":"10.1109/TCC.2025.3548043","DOIUrl":"https://doi.org/10.1109/TCC.2025.3548043","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 2","pages":"560-577"},"PeriodicalIF":5.3,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"144232042","RegionNum":2,"RegionCategory":"Computer Science"}
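The task offloading abstract above builds on differential evolution guided by the global best individual. The sketch below shows only the textbook DE/best/1/bin scheme, as a way to see what mutating around the global best means. It is not the paper's modified Gbest-centric operator, its heuristic initialization, or its parallelization scheme. The function name, parameters, and the toy sphere cost are assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Tuple


def de_best_1_bin(
    cost: Callable[[List[float]], float],
    dim: int,
    bounds: Tuple[float, float],
    pop_size: int = 20,
    generations: int = 100,
    f: float = 0.5,   # differential weight
    cr: float = 0.9,  # crossover rate
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Standard DE/best/1/bin minimizer (requires pop_size >= 3)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    fit = [cost(x) for x in pop]
    for _ in range(generations):
        # Global best of the current generation: the base vector for mutation.
        best = pop[min(range(pop_size), key=fit.__getitem__)]
        for i in range(pop_size):
            # Two distinct donors other than the target individual i.
            r1, r2 = rng.sample([k for k in range(pop_size) if k != i], 2)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = best[j] + f * (pop[r1][j] - pop[r2][j])
                    v = min(hi, max(lo, v))  # clip to the search bounds
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_fit = cost(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_fit <= fit[i]:
                pop[i], fit[i] = trial, trial_fit
    b = min(range(pop_size), key=fit.__getitem__)
    return pop[b], fit[b]


def sphere(x: List[float]) -> float:
    """Toy cost; a real model would score risk and communication overhead."""
    return sum(v * v for v in x)


solution, value = de_best_1_bin(sphere, dim=3, bounds=(-5.0, 5.0))
print(solution, value)
```

Because selection is greedy, the best cost in the population never increases from one generation to the next. That property makes the scheme easy to sanity-check on a toy cost before swapping in a real offloading model.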
The growth of cloud computing has led to the widespread use of location-based services, such as spatial keyword queries, which return the spatial data points within a given range whose keyword sets are most similar to the user's. As the volume of spatial data increases, providers commonly outsource data to powerful cloud servers. Because cloud servers are untrustworthy, privacy-preserving keyword query schemes have been proposed. However, existing schemes consider only location queries or exact keyword matching. To address these issues, we propose the Privacy-Preserving Spatial Keyword Similarity Query Scheme (PPSKSQ), designed to search for spatial data points with the highest similarity while protecting the privacy of outsourced data, query requests, and results. First, we design two sub-protocols based on improved symmetric homomorphic encryption (iSHE): iSHE-SC for secure size comparison and iSHE-SIP for secure inner product computation. Then, we encode range information and integrate it with a quadtree to construct a novel index structure. Additionally, we use the Jaccard similarity to measure similarity in conjunction with the iSHE-SC protocol, transforming similarity comparison into a matrix trace operation. Finally, rigorous security analysis and extensive simulation experiments confirm the flexibility, efficiency, and scalability of our scheme.
{"title":"PPSKSQ: Towards Efficient and Privacy-Preserving Spatial Keyword Similarity Query in Cloud","authors":"Changrui Wang;Lei Wu;Lijuan Xu;Haojie Yuan;Hao Wang;Wenying Zhang;Weizhi Meng","doi":"10.1109/TCC.2025.3547563","DOIUrl":"https://doi.org/10.1109/TCC.2025.3547563","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 2","pages":"544-559"},"PeriodicalIF":5.3,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"144230589","RegionNum":2,"RegionCategory":"Computer Science"}
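The PPSKSQ abstract above relies on Jaccard similarity between keyword sets, a size comparison, and an inner product. The plaintext sketch below shows how those three pieces relate on unencrypted data. It is not the iSHE-SC or iSHE-SIP protocol, and it does not reproduce the paper's matrix trace formulation. The function names, vocabulary, and keywords are hypothetical, and the query set is assumed to be non-empty.

```python
from typing import List, Sequence, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b| (defined here as 1.0 for two empty sets)."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def more_similar(a: Set[str], b: Set[str], query: Set[str]) -> bool:
    """True if J(a, query) > J(b, query), using integer arithmetic only.

    Cross-multiplying avoids division, so the test reduces to one size
    comparison. Assumes a non-empty query, which keeps both unions non-empty.
    """
    return len(a & query) * len(b | query) > len(b & query) * len(a | query)


def indicator(keywords: Set[str], vocab: Sequence[str]) -> List[int]:
    """Binary vector marking which vocabulary words appear in the set."""
    return [1 if w in keywords else 0 for w in vocab]


def inner_product(x: Sequence[int], y: Sequence[int]) -> int:
    return sum(p * q for p, q in zip(x, y))


# Hypothetical vocabulary and keyword sets.
vocab = ["bar", "coffee", "music", "quiet", "wifi"]
query = {"coffee", "wifi", "quiet"}
point_1 = {"coffee", "wifi"}
point_2 = {"coffee", "bar", "music"}

# The intersection size equals the inner product of the indicator vectors.
shared = inner_product(indicator(point_1, vocab), indicator(query, vocab))
print(shared, jaccard(point_1, query), more_similar(point_1, point_2, query))
```

Writing the comparison without division is what lets a ranking decision be expressed as an inner product followed by a single comparison, which are the two primitives the abstract names.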
Pub Date: 2025-02-26 DOI: 10.1109/TCC.2025.3546528
Jiajie Shen;Bochun Wu;Maoyi Wang;Sai Zou;Laizhong Cui;Wei Ni
Cloud-of-clouds storage systems are widely used in online applications, where user data are encrypted, encoded, and stored in multiple clouds. When some cloud nodes fail, the storage system can reconstruct the lost data and store it in substitute nodes. Reducing the latency of data recovery to ensure data reliability is a challenge. In this paper, we adopt a Reinforcement Learning-based Data Recovery (RLDR) approach to reduce the regeneration time. By employing the Monte-Carlo method, our approach constructs a tree-topology-based regeneration process, a.k.a. a regeneration tree, to effectively reduce the regeneration time. Through rigorous analysis, we apply the information flow graph to optimize the inter-cloud traffic for a given regeneration tree. To verify the merit of RLDR, we conduct extensive experiments on real-world traces. The experiments demonstrate that RLDR can significantly accelerate the regeneration process. Specifically, RLDR can reduce the regeneration time by up to 92% and increase the throughput by up to twelve-fold, compared to the prior art.
{"title":"RLDR: Reinforcement Learning-Based Fast Data Recovery in Cloud-of-Clouds Storage Systems","authors":"Jiajie Shen;Bochun Wu;Maoyi Wang;Sai Zou;Laizhong Cui;Wei Ni","doi":"10.1109/TCC.2025.3546528","DOIUrl":"https://doi.org/10.1109/TCC.2025.3546528","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 2","pages":"526-543"},"PeriodicalIF":5.3,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"144232038","RegionNum":2,"RegionCategory":"Computer Science"}
Pub Date: 2025-02-21 DOI: 10.1109/TCC.2025.3544628
Hongjun Li;Debiao He;Qi Feng;Xiaolin Yang;Qingcai Luo
The development of cloud computing requires continuous improvement of privacy-preserving techniques for users' confidential data. Multi-user join query, as an important method of data sharing, allows multiple legitimate data users to perform join queries over the data owner's encrypted database. However, existing join query protocols face challenges in practical application, such as practicality, security, and efficiency. In this article, we put forward a dynamic and secure join query protocol for the multi-user environment. Compared with existing protocols, the proposed protocol has the following advantages. On the one hand, we utilize the dynamic oblivious cross tags structure to realize an efficient join query with forward and backward security. On the other hand, we combine randomizable distributed key-homomorphic pseudo-random functions with join query to support multiple data users, which provides resilience against a single user's key leakage and resists collusion attacks between the cloud server and a subset of data users. We formally define and prove the security of the proposed protocol. In addition, we give a detailed analysis of computation and communication overheads to demonstrate the efficiency of the proposed protocol. Finally, we carry out experimental evaluations to further demonstrate its superiority in functionality and efficiency.
{"title":"A Dynamic and Secure Join Query Protocol for Multi-User Environment in Cloud Computing","authors":"Hongjun Li;Debiao He;Qi Feng;Xiaolin Yang;Qingcai Luo","doi":"10.1109/TCC.2025.3544628","DOIUrl":"https://doi.org/10.1109/TCC.2025.3544628","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 2","pages":"512-525"},"PeriodicalIF":5.3,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"144229477","RegionNum":2,"RegionCategory":"Computer Science"}
Pub Date: 2025-02-18 DOI: 10.1109/TCC.2025.3543477
Ziyuan Liu;Zhixiong Niu;Ran Shu;Wenxue Cheng;Lihua Yuan;Jacob Nelson;Dan R. K. Ports;Peng Cheng;Yongqiang Xiong
In cloud datacenter operations, telemetry and logs are indispensable, enabling essential services such as network diagnostics, auditing, and knowledge discovery. The escalating scale of data centers, coupled with increased bandwidth and finer-grained telemetry, results in an overwhelming volume of data. This proliferation poses significant storage challenges for telemetry systems. In this article, we introduce HyperDrive, an innovative system designed to efficiently store large volumes of telemetry and logs in data centers using programmable switches. This in-network approach effectively mitigates bandwidth bottlenecks commonly associated with traditional endpoint-based methods. To our knowledge, we are the first to use a programmable switch to directly control storage, bypassing the CPU to achieve the best performance. With merely 21% of a switch’s resources, our HyperDrive implementation showcases remarkable scalability and efficiency. Through rigorous evaluation, it has demonstrated linear scaling capabilities, efficiently managing 12 SSDs on a single server with minimal host overhead. In an eight-server testbed, HyperDrive achieved an impressive throughput of approximately 730 Gbps, underscoring its potential to transform data center telemetry and logging practices.
{"title":"HyperDrive: Direct Network Telemetry Storage via Programmable Switches","authors":"Ziyuan Liu;Zhixiong Niu;Ran Shu;Wenxue Cheng;Lihua Yuan;Jacob Nelson;Dan R. K. Ports;Peng Cheng;Yongqiang Xiong","doi":"10.1109/TCC.2025.3543477","DOIUrl":"https://doi.org/10.1109/TCC.2025.3543477","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 2","pages":"498-511"},"PeriodicalIF":5.3,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"144232040","RegionNum":2,"RegionCategory":"Computer Science"}
Call detail records (CDRs) provide valuable insights into user behavior, which are instrumental for telecom companies in optimizing network coverage and service quality. However, while cloud computing facilitates clustering analysis on a vast scale of CDR data, it introduces privacy risks. The challenge lies in striking a balance between efficiency, security, and cost-effectiveness in privacy-preserving algorithms. To tackle this issue, we propose a privacy-preserving and cost-effective incremental density peak clustering scheme. Our approach leverages homomorphic encryption and order-preserving encryption to enable direct computations and clustering on encrypted data. Moreover, it employs reaching definition analysis to optimize the execution flow of static tasks, pinpointing the optimal junctures for transitioning between the two types of encryption to reduce communication overhead. Furthermore, our scheme utilizes a game theory-based verification strategy to ascertain the accuracy of the results. This methodology can be effectively deployed on the Ethereum blockchain via smart contracts. A comprehensive security analysis confirms that our scheme upholds both privacy and data integrity. Experimental evaluations substantiate the clustering accuracy, communication load, and computational efficiency of our scheme, thereby validating its viability in real-world applications.
{"title":"PPEC: A Privacy-Preserving, Cost-Effective Incremental Density Peak Clustering Analysis on Encrypted Outsourced Data","authors":"Haomiao Yang;ZiKang Ding;Ruiheng Lu;Kunlan Xiang;Hongwei Li;Dakui Wu","doi":"10.1109/TCC.2025.3541749","DOIUrl":"https://doi.org/10.1109/TCC.2025.3541749","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 2","pages":"485-497"},"PeriodicalIF":5.3,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"144232044","RegionNum":2,"RegionCategory":"Computer Science"}
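The PPEC abstract above builds on density peak clustering. The sketch below computes the two quantities that the standard, non-incremental, plaintext version of that algorithm assigns to each point: a local density `rho` and a distance `delta` to the nearest point of higher density. It involves no encryption and no incremental updates, so it illustrates the underlying technique rather than the paper's scheme. The function name, cutoff value, and sample points are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def density_peaks(
    points: Sequence[Tuple[float, float]], d_c: float
) -> Tuple[List[int], List[float]]:
    """Return (rho, delta) for each point, using a hard distance cutoff d_c."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points strictly closer than the cutoff.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # Points with the highest density get the largest distance instead.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Two hypothetical groups of points, each with one central point.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.5, 0.5),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (10.5, 10.5),
]
rho, delta = density_peaks(points, d_c=1.0)
for p, r, d in zip(points, rho, delta):
    print(p, r, round(d, 3))
```

Cluster centers are the points where both `rho` and `delta` are large. In this toy data those are the two central points, and every other point lies close to a denser neighbor.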
Pub Date: 2025-02-11 DOI: 10.1109/TCC.2025.3540023
Kaiwei Mo;Wei Lin;Jiaxun Lu;Chun Jason Xue;Yunfeng Shao;Hong Xu
Federated learning (FL) is increasingly adopted to combine knowledge from clients in training without revealing their private data. To improve the performance of different participants, personalized FL has recently been proposed. However, given the non-independent and identically distributed (non-IID) data and limited bandwidth at clients, model performance can be compromised. In reality, clients near each other often tend to have similar data distributions. In this work, we train a personalized edge-based model in client-edge-server FL. While accounting for the differences in data distribution, we fully utilize the limited bandwidth resources. To make training efficient and accurate at the same time, an intuitive idea is to learn as much useful knowledge as possible from other edges and reduce the accuracy loss incurred by non-IID data. Therefore, we devise Grouping Hierarchical Personalized Federated Learning (GHPFL). In this framework, each edge establishes physical connections with multiple clients, while the server physically connects with edges. It clusters edges into groups and establishes client-edge logical connections for synchronization. This is based on data similarities that the nodes actively identify, as well as the underlying physical topology. We perform a large-scale evaluation to demonstrate GHPFL's benefits over other schemes.
{"title":"GHPFL: Advancing Personalized Edge-Based Learning Through Optimized Bandwidth Utilization","authors":"Kaiwei Mo;Wei Lin;Jiaxun Lu;Chun Jason Xue;Yunfeng Shao;Hong Xu","doi":"10.1109/TCC.2025.3540023","DOIUrl":"https://doi.org/10.1109/TCC.2025.3540023","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 2","pages":"473-484"},"PeriodicalIF":5.3,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"144232043","RegionNum":2,"RegionCategory":"Computer Science"}
Pub Date : 2025-02-04 DOI: 10.1109/TCC.2025.3538158
Ayoub Ben-Ameur;Andrea Araldo;Tijani Chahed;György Dán
We consider a Network Operator (NO) that owns Edge Computing (EC) resources, virtualizes them and lets third-party Service Providers (SPs) run their services, using the allocated slice of resources. We focus on one specific resource, i.e., cache space, and on the problem of how to allocate it among several SPs in order to minimize the backhaul traffic. Due to confidentiality guarantees, the NO cannot observe the nature of the traffic of SPs, which is encrypted. Allocation decisions are thus challenging, since they must be taken solely based on observed monitoring information. Another challenge is that not all the traffic is cacheable. We propose a data-driven cache allocation strategy, based on Reinforcement Learning (RL). Unlike most RL applications, in which the decision policy is learned offline on a simulator, we assume no previous knowledge is available to build such a simulator. We thus apply RL in an online fashion, i.e., the model and the policy are learned by directly perturbing and monitoring the actual system. Since perturbations generate spurious traffic, we need to limit them. This requires learning to be extremely efficient. To this aim, we devise a strategy that learns an approximation of the cost function, while interacting with the system. We then use such an approximation in a Model-Based RL (MB-RL) to speed up convergence. We prove analytically that our strategy brings cache allocation boundedly close to the optimum and stably remains in such an allocation. We show in simulations that such convergence is obtained within a few minutes. We also study its fairness and its sensitivity to several scenario characteristics, and compare it with a state-of-the-art method.
{"title":"Cache Allocation in Multi-Tenant Edge Computing: An Online Model-Based Reinforcement Learning Approach","authors":"Ayoub Ben-Ameur;Andrea Araldo;Tijani Chahed;György Dán","doi":"10.1109/TCC.2025.3538158","DOIUrl":"https://doi.org/10.1109/TCC.2025.3538158","url":null,"abstract":"We consider a Network Operator (NO) that owns Edge Computing (EC) resources, virtualizes them and lets third-party Service Providers (SPs) run their services, using the allocated slice of resources. We focus on one specific resource, i.e., cache space, and on the problem of how to allocate it among several SPs in order to minimize the backhaul traffic. Due to confidentiality guarantees, the NO cannot observe the nature of the traffic of SPs, which is encrypted. Allocation decisions are thus challenging, since they must be taken solely based on observed monitoring information. Another challenge is that not all the traffic is cacheable. We propose a data-driven cache allocation strategy, based on Reinforcement Learning (RL). Unlike most RL applications, in which the decision policy is learned offline on a simulator, we assume no previous knowledge is available to build such a simulator. We thus apply RL in an <italic>online</italic> fashion, i.e., the model and the policy are learned by directly perturbing and monitoring the actual system. Since perturbations generate spurious traffic, we need to limit them. This requires learning to be extremely efficient. To this aim, we devise a strategy that learns an approximation of the cost function, while interacting with the system. We then use such an approximation in a Model-Based RL (MB-RL) to speed up convergence. We prove analytically that our strategy brings cache allocation boundedly close to the optimum and stably remains in such an allocation. We show in simulations that such convergence is obtained within a few minutes. 
We also study its fairness and its sensitivity to several scenario characteristics, and compare it with a state-of-the-art method.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 2","pages":"459-472"},"PeriodicalIF":5.3,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144230590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}