FedREAS: A Robust Efficient Aggregation and Selection Framework for Federated Learning
Authors: Shuming Fan, Chenpei Wang, Xinyu Ruan, Hongjian Shi, Ruhui Ma, Haibing Guan
Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Publication type: Journal Article
Publication date: 2024-06-04
DOI: 10.1145/3670689 (https://doi.org/10.1145/3670689)
Impact factor: 1.8; JCR: Q3 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
In Natural Language Processing (NLP), Deep Learning (DL) and Neural Network (NN) technologies have been widely applied to machine translation and sentiment analysis, where they have demonstrated outstanding performance. In recent years, NLP applications have also incorporated multimodal data, such as visual and audio data, further improving language processing performance. At the same time, Neural Network models keep growing in size, and many cannot be deployed on devices with limited computing resources, so deploying models on cloud platforms has become a trend. Cloud deployment overcomes these computational limitations but introduces new privacy risks for endpoint data. Federated Learning (FL) methods protect local data by keeping it on the client side and sending only local updates to the central server. However, the FL architecture still has problems, such as vulnerability to adversarial attacks and non-IID data distributions. In this work, we propose a Federated Learning aggregation method called FedREAS. In this method, the server trains a global model on a benchmark dataset to obtain benchmark updates. Before aggregating local updates, the server adjusts the local updates using the benchmark updates and then returns the adjusted benchmark updates. Then, based on the similarity between the adjusted local updates and the adjusted benchmark updates, the server aggregates the local updates to obtain a more robust update. FedREAS also improves client selection: at the beginning of each round, it selects suitable clients for training based on specific strategies, the similarity of the previous round's updates, and the submitted data. We conduct experiments on different datasets and compare FedREAS with other Federated Learning methods. The results show that FedREAS outperforms the other methods in both model performance and resistance to attacks.
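The abstract does not specify how updates are adjusted or how similarity is measured, so the following is only a minimal sketch of similarity-weighted aggregation against a server-side benchmark update, not the authors' algorithm. It assumes updates are flat vectors, that "adjusting" a local update means rescaling it to the benchmark update's norm, that the benchmark update itself is used unchanged, and that similarity is cosine similarity with negative scores clipped to zero. All function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def l2_norm(v: Vector) -> float:
    """Euclidean norm of a flat update vector."""
    return math.sqrt(sum(x * x for x in v))


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_a, norm_b = l2_norm(a), l2_norm(b)
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def rescale_to_norm(update: Vector, target_norm: float) -> List[float]:
    """ASSUMED adjustment step: give the local update the benchmark's norm."""
    norm = l2_norm(update)
    if norm == 0.0:
        return [0.0 for _ in update]
    return [x * target_norm / norm for x in update]


def aggregate(local_updates: Sequence[Vector], benchmark_update: Vector) -> List[float]:
    """Weight each adjusted local update by its clipped similarity to the benchmark."""
    bench_norm = l2_norm(benchmark_update)
    adjusted = [rescale_to_norm(u, bench_norm) for u in local_updates]
    # ASSUMED scoring rule: updates pointing away from the benchmark get weight 0.
    weights = [max(0.0, cosine_similarity(u, benchmark_update)) for u in adjusted]
    total = sum(weights)
    if total == 0.0:
        # No local update agrees with the benchmark; fall back to the benchmark.
        return list(benchmark_update)
    dim = len(benchmark_update)
    return [
        sum(w * u[i] for w, u in zip(weights, adjusted)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    benchmark = [1.0, 0.0]
    clients = [[2.0, 0.0], [-3.0, 0.0], [0.0, 4.0]]
    print(aggregate(clients, benchmark))
```

In this toy run the second client's update points opposite to the benchmark and the third is orthogonal to it, so both receive zero weight and only the first client contributes. Rescaling to the benchmark norm also keeps a single client from dominating the aggregate by submitting an update of very large magnitude.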
About the journal:
The ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) publishes high-quality original archival papers and technical notes in the areas of computation and processing of information in Asian languages, low-resource languages of Africa, Australasia, Oceania and the Americas, as well as related disciplines. The subject areas covered by TALLIP include, but are not limited to:
-Computational Linguistics: including computational phonology, computational morphology, computational syntax (e.g. parsing), computational semantics, computational pragmatics, etc.
-Linguistic Resources: including computational lexicography, terminology, electronic dictionaries, cross-lingual dictionaries, electronic thesauri, etc.
-Hardware and software algorithms and tools for Asian or low-resource language processing, e.g., handwritten character recognition.
-Information Understanding: including text understanding, speech understanding, character recognition, discourse processing, dialogue systems, etc.
-Machine Translation involving Asian or low-resource languages.
-Information Retrieval: including natural language processing (NLP) for concept-based indexing, natural language query interfaces, semantic relevance judgments, etc.
-Information Extraction and Filtering: including automatic abstraction, user profiling, etc.
-Speech processing: including text-to-speech synthesis and automatic speech recognition.
-Multimedia Asian Information Processing: including speech, image, video, image/text translation, etc.
-Cross-lingual information processing involving Asian or low-resource languages.
Papers that deal in theory, systems design, evaluation and applications in the aforesaid subjects are appropriate for TALLIP. Emphasis will be placed on the originality and the practical significance of the reported research.