{"title":"Self-adaptive asynchronous federated optimizer with adversarial sharpness-aware minimization","authors":"","doi":"10.1016/j.future.2024.07.045","DOIUrl":null,"url":null,"abstract":"<div><p>The past years have witnessed the success of a distributed learning system called Federated Learning (FL). Recently, asynchronous FL (AFL) has demonstrated its potential in concurrency compared to mainstream synchronous FL. However, the inherent systematic and statistical heterogeneity has presented several impediments to AFL: On the client side, the discrepancies in trips and local model drift impede global performance enhancement; On the server side, dynamic communication leads to significant fluctuations in gradient arrival time, while asynchronous arrival gradients with ambiguous value are not fully leveraged. In this paper, we propose an adaptive AFL framework, ARDAGH, which systematically addresses the aforementioned challenges: Firstly, to address the discrepancies in client trips, ARDAGH ensures their convergence by incorporating only 1-bit feedback information into the downlink. Secondly, to counter the drift of clients, ARDAGH generalizes the local models by employing our novel adversarial sharpness-aware minimization, which does not necessitate reliance on additional global variables. Thirdly, in the face of gradient latency issues, ARDAGH employs a communication-aware dropout strategy to adaptively compress gradients to ensure similar transmission times. Finally, to fully unleash the potential of each gradient, we establish a consistent optimal direction by conceptualizing the aggregation as an optimizer with successive momentum. In light of the comprehensive solution provided by ARDAGH, an algorithm named FedAMO is derived, and its superiority is confirmed by experimental results obtained under challenging prototype and simulation settings. Particularly in typical sentiment analysis tasks, FedAMO demonstrates an improvement of up to 5.351% with a 20.056-fold acceleration compared to conventional asynchronous methods.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2000,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X24004175","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Citations: 0
Abstract
The past years have witnessed the success of a distributed learning paradigm called Federated Learning (FL). Recently, asynchronous FL (AFL) has demonstrated greater potential for concurrency than mainstream synchronous FL. However, inherent system-level and statistical heterogeneity presents several impediments to AFL. On the client side, discrepancies in client trips and local model drift impede global performance improvement; on the server side, dynamic communication causes large fluctuations in gradient arrival times, while asynchronously arriving gradients of ambiguous value are not fully leveraged. In this paper, we propose an adaptive AFL framework, ARDAGH, which systematically addresses these challenges. First, to address the discrepancies in client trips, ARDAGH ensures their convergence by incorporating only 1 bit of feedback information into the downlink. Second, to counter client drift, ARDAGH generalizes the local models through our novel adversarial sharpness-aware minimization, which does not rely on additional global variables. Third, to handle gradient latency, ARDAGH employs a communication-aware dropout strategy that adaptively compresses gradients so that transmission times remain similar. Finally, to fully unleash the potential of each gradient, we establish a consistent optimization direction by conceptualizing aggregation as an optimizer with successive momentum. From the comprehensive solution provided by ARDAGH, an algorithm named FedAMO is derived, and its superiority is confirmed by experimental results obtained under challenging prototype and simulation settings. In typical sentiment analysis tasks in particular, FedAMO demonstrates an improvement of up to 5.351% with a 20.056-fold acceleration compared to conventional asynchronous methods.
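Of the four mechanisms the abstract names, two are concrete enough to sketch from general knowledge. The first is the sharpness-aware local update: below is a minimal sketch of a generic SAM-style step of the kind such an update builds on, written in PyTorch for concreteness. The perturbation radius rho, the learning rate, and the model/loss objects are illustrative placeholders; the paper's adversarial variant may compute the perturbation differently.

    import torch

    def sam_local_step(model, loss_fn, batch, rho=0.05, lr=0.01):
        # One sharpness-aware update: ascend to a nearby high-loss point,
        # then descend using the gradient evaluated there.
        x, y = batch
        loss_fn(model(x), y).backward()
        grads = [p.grad.detach().clone() for p in model.parameters()]
        grad_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads)).item() + 1e-12
        with torch.no_grad():
            for p, g in zip(model.parameters(), grads):
                p.add_(g, alpha=rho / grad_norm)   # adversarial ascent step
        model.zero_grad()
        loss_fn(model(x), y).backward()            # gradient at perturbed weights
        with torch.no_grad():
            for p, g in zip(model.parameters(), grads):
                p.sub_(g, alpha=rho / grad_norm)   # undo the perturbation
                p.sub_(p.grad, alpha=lr)           # sharpness-aware descent
        model.zero_grad()

The second is the server side: conceptualizing aggregation as an optimizer with successive momentum can be read as keeping a persistent momentum buffer that every arriving client update feeds into. A minimal sketch follows, assuming a simple staleness-based damping factor; that discount rule is our assumption for illustration, not a detail given in the abstract.

    def momentum_aggregate(weights, momentum, update, staleness,
                           beta=0.9, eta=1.0):
        # Treat aggregation as one optimizer step with a momentum buffer
        # shared across asynchronously arriving client updates.
        alpha = 1.0 / (1.0 + staleness)  # damp stale updates (illustrative choice)
        momentum = [beta * m + alpha * u for m, u in zip(momentum, update)]
        weights = [w - eta * m for w, m in zip(weights, momentum)]
        return weights, momentum

In this reading, stale updates still contribute direction through the shared momentum buffer, but their immediate influence is damped, which matches the stated goal of extracting value from asynchronously arriving gradients.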
Journal Introduction
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.