CaBaFL：通过分层缓存和特征平衡进行异步联合学习

IF 2.7 3区计算机科学 Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2024-11-06 DOI:10.1109/TCAD.2024.3446881

Zeke Xia;Ming Hu;Dengke Yan;Xiaofei Xie;Tianlin Li;Anran Li;Junlong Zhou;Mingsong Chen

{"title":"CaBaFL：通过分层缓存和特征平衡进行异步联合学习","authors":"Zeke Xia;Ming Hu;Dengke Yan;Xiaofei Xie;Tianlin Li;Anran Li;Junlong Zhou;Mingsong Chen","doi":"10.1109/TCAD.2024.3446881","DOIUrl":null,"url":null,"abstract":"Federated learning (FL) as a promising distributed machine learning paradigm has been widely adopted in Artificial Intelligence of Things (AIoT) applications. However, the efficiency and inference capability of FL is seriously limited due to the presence of stragglers and data imbalance across massive AIoT devices, respectively. To address the above challenges, we present a novel asynchronous FL approach named CaBaFL, which includes a hierarchical cache-based aggregation mechanism and a feature balance-guided device selection strategy. CaBaFL maintains multiple intermediate models simultaneously for local training. The hierarchical cache-based aggregation mechanism enables each intermediate model to be trained on multiple devices to align the training time and mitigate the straggler issue. In specific, each intermediate model is stored in a low-level cache for local training and when it is trained by sufficient local devices, it will be stored in a high-level cache for aggregation. To address the problem of imbalanced data, the feature balance-guided device selection strategy in CaBaFL adopts the activation distribution as a metric, which enables each intermediate model to be trained across devices with totally balanced data distributions before aggregation. Experimental results show that compared to the state-of-the-art FL methods, CaBaFL achieves up to 9.26X training acceleration and 19.71% accuracy improvements.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"4057-4068"},"PeriodicalIF":2.7000,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"CaBaFL: Asynchronous Federated Learning via Hierarchical Cache and Feature Balance\",\"authors\":\"Zeke Xia;Ming Hu;Dengke Yan;Xiaofei Xie;Tianlin Li;Anran Li;Junlong Zhou;Mingsong Chen\",\"doi\":\"10.1109/TCAD.2024.3446881\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Federated learning (FL) as a promising distributed machine learning paradigm has been widely adopted in Artificial Intelligence of Things (AIoT) applications. However, the efficiency and inference capability of FL is seriously limited due to the presence of stragglers and data imbalance across massive AIoT devices, respectively. To address the above challenges, we present a novel asynchronous FL approach named CaBaFL, which includes a hierarchical cache-based aggregation mechanism and a feature balance-guided device selection strategy. CaBaFL maintains multiple intermediate models simultaneously for local training. The hierarchical cache-based aggregation mechanism enables each intermediate model to be trained on multiple devices to align the training time and mitigate the straggler issue. In specific, each intermediate model is stored in a low-level cache for local training and when it is trained by sufficient local devices, it will be stored in a high-level cache for aggregation. To address the problem of imbalanced data, the feature balance-guided device selection strategy in CaBaFL adopts the activation distribution as a metric, which enables each intermediate model to be trained across devices with totally balanced data distributions before aggregation. Experimental results show that compared to the state-of-the-art FL methods, CaBaFL achieves up to 9.26X training acceleration and 19.71% accuracy improvements.\",\"PeriodicalId\":13251,\"journal\":{\"name\":\"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems\",\"volume\":\"43 11\",\"pages\":\"4057-4068\"},\"PeriodicalIF\":2.7000,\"publicationDate\":\"2024-11-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10745831/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10745831/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}

引用次数: 0

摘要

联合学习（FL）作为一种前景广阔的分布式机器学习范式，已在人工智能物联网（AIoT）应用中得到广泛采用。然而，由于海量 AIoT 设备中存在散兵游勇和数据不平衡，FL 的效率和推理能力受到严重限制。为了应对上述挑战，我们提出了一种名为 CaBaFL 的新型异步 FL 方法，其中包括基于缓存的分层聚合机制和特征平衡指导的设备选择策略。CaBaFL 同时维护多个中间模型，用于本地训练。基于缓存的分层聚合机制使每个中间模型都能在多个设备上进行训练，以调整训练时间并减少滞后问题。具体来说，每个中间模型都存储在一个低级缓存中进行本地训练，当有足够多的本地设备对其进行训练后，它将被存储在一个高级缓存中进行聚合。为了解决数据不平衡的问题，CaBaFL 的特征平衡指导设备选择策略采用了激活分布作为度量标准，这使得每个中间模型在聚合前都能在数据分布完全平衡的设备上进行训练。实验结果表明，与最先进的 FL 方法相比，CaBaFL 的训练速度提高了 9.26 倍，准确率提高了 19.71%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

CaBaFL: Asynchronous Federated Learning via Hierarchical Cache and Feature Balance

Federated learning (FL) as a promising distributed machine learning paradigm has been widely adopted in Artificial Intelligence of Things (AIoT) applications. However, the efficiency and inference capability of FL is seriously limited due to the presence of stragglers and data imbalance across massive AIoT devices, respectively. To address the above challenges, we present a novel asynchronous FL approach named CaBaFL, which includes a hierarchical cache-based aggregation mechanism and a feature balance-guided device selection strategy. CaBaFL maintains multiple intermediate models simultaneously for local training. The hierarchical cache-based aggregation mechanism enables each intermediate model to be trained on multiple devices to align the training time and mitigate the straggler issue. In specific, each intermediate model is stored in a low-level cache for local training and when it is trained by sufficient local devices, it will be stored in a high-level cache for aggregation. To address the problem of imbalanced data, the feature balance-guided device selection strategy in CaBaFL adopts the activation distribution as a metric, which enables each intermediate model to be trained across devices with totally balanced data distributions before aggregation. Experimental results show that compared to the state-of-the-art FL methods, CaBaFL achieves up to 9.26X training acceleration and 19.71% accuracy improvements.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 工程技术-工程：电子与电气

CiteScore

5.60

自引率

13.80%

发文量

500

审稿时长

7 months

期刊介绍： The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.

期刊最新文献

Table of Contents IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems society information IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems publication information Table of Contents IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems publication information