Adaptive device sampling and deadline determination for cloud-based heterogeneous federated learning
Deyu Zhang, Wang Sun, Zi-Ang Zheng, Wenxin Chen, Shiwen He
Journal of Cloud Computing: Advances, Systems and Applications, published 2023-11-03. DOI: https://doi.org/10.1186/s13677-023-00515-6
Abstract
As a new approach to machine learning, federated learning enables distributed training on edge devices and aggregates the resulting local models into a global model. The edge devices that participate in federated learning are highly heterogeneous in terms of computing power, device state, and data distribution, which makes efficient model convergence challenging. In this paper, we propose FedState, an adaptive device sampling and deadline determination technique for cloud-based heterogeneous federated learning. Specifically, we consider the cloud as a central server that orchestrates federated learning over a large pool of edge devices. To improve the efficiency of model convergence in heterogeneous federated learning, our approach adaptively samples the devices that join each round of training and determines the deadline for result submission based on device state. We analyze existing device usage traces to build device state models for different scenarios and design a dynamic importance measurement mechanism based on device availability, data utility, and computing power. We also propose a deadline determination module that dynamically sets the deadline according to the availability of all sampled devices, local training time, and communication time, enabling more clients to submit their local models in time. Because device state varies over time, we design an experience-driven algorithm based on Deep Reinforcement Learning (DRL) that dynamically adjusts our sampling and deadline policies according to the current environment state. We demonstrate the effectiveness of our approach through a series of experiments on the FMNIST dataset and show that it outperforms current state-of-the-art approaches in both model accuracy and convergence speed.
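To make the sampling and deadline ideas concrete, below is a minimal, illustrative sketch in Python. The field names, the weighted importance score, and the slack factor are assumptions made for exposition only; they are not the paper's actual FedState definitions, and the DRL-based policy adjustment is omitted.

```python
# Hypothetical sketch of importance-based device sampling and deadline
# determination, loosely following the abstract. All weights, fields,
# and formulas here are illustrative assumptions, not the paper's method.
from dataclasses import dataclass

@dataclass
class DeviceState:
    availability: float   # probability the device stays online this round (0..1)
    data_utility: float   # e.g., a normalized local-data usefulness score (0..1)
    compute_power: float  # normalized processing speed (0..1)
    train_time: float     # estimated local training time (seconds)
    comm_time: float      # estimated model upload time (seconds)

def importance(d: DeviceState, w=(0.4, 0.4, 0.2)) -> float:
    """Assumed importance score: weighted mix of availability,
    data utility, and computing power (weights are illustrative)."""
    return w[0] * d.availability + w[1] * d.data_utility + w[2] * d.compute_power

def sample_devices(pool: list[DeviceState], k: int) -> list[DeviceState]:
    """Select the k highest-importance devices for this round."""
    return sorted(pool, key=importance, reverse=True)[:k]

def round_deadline(sampled: list[DeviceState], slack: float = 1.2) -> float:
    """Set the deadline so the slowest sampled device can finish training
    and uploading, padded by an assumed slack factor."""
    return slack * max(d.train_time + d.comm_time for d in sampled)

# Usage: sample 3 of 5 devices, then derive the round deadline.
pool = [DeviceState(0.9, 0.5, 0.7, 30, 5), DeviceState(0.6, 0.9, 0.4, 60, 8),
        DeviceState(0.8, 0.3, 0.9, 20, 4), DeviceState(0.4, 0.7, 0.5, 45, 6),
        DeviceState(0.95, 0.6, 0.6, 25, 5)]
chosen = sample_devices(pool, k=3)
print(f"deadline: {round_deadline(chosen):.1f}s")
```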
Journal introduction:
The Journal of Cloud Computing: Advances, Systems and Applications (JoCCASA) publishes research articles on all aspects of Cloud Computing. Principally, articles address topics that are core to Cloud Computing, focusing on Cloud applications, Cloud systems, and the advances that will lead to the Clouds of the future. Comprehensive review and survey articles that offer new insights and lay the foundations for further exploratory and experimental work are also relevant.