Addressing Heterogeneity in Federated Learning with Client Selection via Submodular Optimization
Authors: Jinghui Zhang, Jiawei Wang, Yaning Li, Fan Xin, Fang Dong, Junzhou Luo, Zhihua Wu
Journal: ACM Transactions on Sensor Networks (impact factor 3.9; JCR Q2, Computer Science, Information Systems)
Publication date: 2023-12-22
Publication type: Journal Article
DOI: 10.1145/3638052 (https://doi.org/10.1145/3638052)
Citation count: 0
Abstract
Federated learning (FL) has been proposed as a privacy-preserving distributed learning paradigm. It differs from traditional distributed learning in two main respects: systems heterogeneity, meaning that clients participating in training differ significantly in systems performance, including CPU frequency, dataset size, and transmission power; and statistical heterogeneity, meaning that the data across clients are not independent and identically distributed (Non-IID). As a result, selecting clients at random significantly reduces the training efficiency of FL. In this paper, we propose a client selection mechanism that accounts for both systems and statistical heterogeneity and aims to improve time-to-accuracy performance by trading off the effects of systems performance differences and data distribution differences among clients on training efficiency. First, we formulate client selection as a combinatorial optimization problem that jointly optimizes systems and statistical performance. We then generalize it to a submodular maximization problem with a knapsack constraint and propose the Iterative Greedy with Partial Enumeration (IGPE) algorithm to greedily select suitable clients. We also analyze the approximation ratio of IGPE theoretically. Extensive experiments verify that IGPE achieves better time-to-accuracy performance than the compared algorithms in a variety of heterogeneous environments.
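To make the idea of greedy selection with partial enumeration under a knapsack constraint concrete, the following is a minimal Python sketch of the generic textbook technique, not the paper's IGPE algorithm, whose objective and details are not given in the abstract. Everything here is an assumption for illustration: the utility is a simple label-coverage function (a standard submodular example), `cost` stands in for a per-client time or resource cost, and the names `coverage`, `greedy_extend`, `select_clients`, and `seed_size` are hypothetical.

```python
from itertools import combinations


def coverage(selected, labels):
    """Hypothetical submodular utility: number of distinct labels covered."""
    covered = set()
    for client in selected:
        covered |= labels[client]
    return len(covered)


def greedy_extend(seed, labels, cost, budget):
    """Grow a seed set by repeatedly adding the affordable client with the
    largest marginal gain per unit cost (costs are assumed positive)."""
    chosen = list(seed)
    spent = sum(cost[c] for c in chosen)
    remaining = set(labels) - set(chosen)
    while remaining:
        base = coverage(chosen, labels)
        best, best_ratio = None, 0.0
        for c in sorted(remaining):  # sorted for deterministic tie-breaking
            if spent + cost[c] > budget:
                continue  # would violate the knapsack constraint
            ratio = (coverage(chosen + [c], labels) - base) / cost[c]
            if ratio > best_ratio:
                best, best_ratio = c, ratio
        if best is None:  # nothing affordable improves the utility
            break
        chosen.append(best)
        spent += cost[best]
        remaining.discard(best)
    return chosen


def select_clients(labels, cost, budget, seed_size=2):
    """Partial enumeration: try every feasible seed of at most `seed_size`
    clients, extend each greedily, and keep the best resulting set."""
    best_set, best_val = [], 0
    clients = sorted(labels)
    for k in range(seed_size + 1):
        for seed in combinations(clients, k):
            if sum(cost[c] for c in seed) > budget:
                continue
            candidate = greedy_extend(seed, labels, cost, budget)
            value = coverage(candidate, labels)
            if value > best_val:
                best_set, best_val = candidate, value
    return sorted(best_set), best_val


# Toy instance (made-up numbers): a cheap client holding one label and an
# expensive client holding four labels, with a budget that fits only one.
labels = {"fast": {0}, "slow": {1, 2, 3, 4}}
cost = {"fast": 1, "slow": 5}
print(greedy_extend((), labels, cost, budget=5))  # plain ratio greedy
print(select_clients(labels, cost, budget=5))     # with partial enumeration
```

On this toy instance, the plain ratio greedy commits to the cheap client and can no longer afford the more valuable one, while the enumerated seed containing the expensive client recovers the better set. This is the usual motivation for combining enumeration with greedy selection under a budget; how the paper itself balances systems and statistical heterogeneity in its objective is not something this sketch attempts to reproduce.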
About the journal:
ACM Transactions on Sensor Networks (TOSN) is a central publication by the ACM in the interdisciplinary area of sensor networks, spanning a broad range of disciplines from signal processing, networking and protocols, embedded systems, and information management to distributed algorithms. It covers research contributions that introduce new concepts, techniques, analyses, or architectures, as well as applied contributions that report on the development of new tools and systems or on experiences and experiments with high-impact, innovative applications. The Transactions pays special attention to contributions on systemic approaches to sensor networks as well as to fundamental contributions.