{"title":"用于异构联合学习的双向去耦蒸馏技术","authors":"Wenshuai Song, Mengwei Yan, Xinze Li, Longfei Han","doi":"10.3390/e26090762","DOIUrl":null,"url":null,"abstract":"Federated learning enables multiple devices to collaboratively train a high-performance model on the central server while keeping their data on the devices themselves. However, due to the significant variability in data distribution across devices, the aggregated global model’s optimization direction may differ from that of the local models, making the clients lose their personality. To address this challenge, we propose a Bidirectional Decoupled Distillation For Heterogeneous Federated Learning (BDD-HFL) approach, which incorporates an additional private model within each local client. This design enables mutual knowledge exchange between the private and local models in a bidirectional manner. Specifically, previous one-way federated distillation methods mainly focused on learning features from the target class, which limits their ability to distill features from non-target classes and hinders the convergence of local models. To solve this limitation, we decompose the network output into target and non-target class logits and distill them separately using a joint optimization of cross-entropy and decoupled relative-entropy loss. We evaluate the effectiveness of BDD-HFL through extensive experiments on three benchmarks under IID, Non-IID, and unbalanced data distribution scenarios. Our results show that BDD-HFL outperforms state-of-the-art federated distillation methods across five baselines, achieving at most 3% improvement in average classification accuracy on the CIFAR-10, CIFAR-100, and MNIST datasets. The experiments demonstrate the superiority and generalization capability of BDD-HFL in addressing personalization challenges in federated learning.","PeriodicalId":11694,"journal":{"name":"Entropy","volume":null,"pages":null},"PeriodicalIF":2.1000,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Bidirectional Decoupled Distillation For Heterogeneous Federated Learning\",\"authors\":\"Wenshuai Song, Mengwei Yan, Xinze Li, Longfei Han\",\"doi\":\"10.3390/e26090762\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Federated learning enables multiple devices to collaboratively train a high-performance model on the central server while keeping their data on the devices themselves. However, due to the significant variability in data distribution across devices, the aggregated global model’s optimization direction may differ from that of the local models, making the clients lose their personality. To address this challenge, we propose a Bidirectional Decoupled Distillation For Heterogeneous Federated Learning (BDD-HFL) approach, which incorporates an additional private model within each local client. This design enables mutual knowledge exchange between the private and local models in a bidirectional manner. Specifically, previous one-way federated distillation methods mainly focused on learning features from the target class, which limits their ability to distill features from non-target classes and hinders the convergence of local models. To solve this limitation, we decompose the network output into target and non-target class logits and distill them separately using a joint optimization of cross-entropy and decoupled relative-entropy loss. 
We evaluate the effectiveness of BDD-HFL through extensive experiments on three benchmarks under IID, Non-IID, and unbalanced data distribution scenarios. Our results show that BDD-HFL outperforms state-of-the-art federated distillation methods across five baselines, achieving at most 3% improvement in average classification accuracy on the CIFAR-10, CIFAR-100, and MNIST datasets. The experiments demonstrate the superiority and generalization capability of BDD-HFL in addressing personalization challenges in federated learning.\",\"PeriodicalId\":11694,\"journal\":{\"name\":\"Entropy\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2024-09-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Entropy\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://doi.org/10.3390/e26090762\",\"RegionNum\":3,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"PHYSICS, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Entropy","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.3390/e26090762","RegionNum":3,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PHYSICS, MULTIDISCIPLINARY","Score":null,"Total":0}
Bidirectional Decoupled Distillation For Heterogeneous Federated Learning
Federated learning enables multiple devices to collaboratively train a high-performance model on a central server while keeping their data on the devices themselves. However, due to the significant variability in data distribution across devices, the optimization direction of the aggregated global model may differ from that of the local models, causing clients to lose their personalized characteristics. To address this challenge, we propose a Bidirectional Decoupled Distillation For Heterogeneous Federated Learning (BDD-HFL) approach, which incorporates an additional private model within each local client. This design enables bidirectional knowledge exchange between the private and local models. Specifically, previous one-way federated distillation methods mainly focused on learning features from the target class, which limits their ability to distill features from non-target classes and hinders the convergence of local models. To overcome this limitation, we decompose the network output into target-class and non-target-class logits and distill them separately using a joint optimization of cross-entropy and decoupled relative-entropy losses. We evaluate the effectiveness of BDD-HFL through extensive experiments on three benchmarks under IID, Non-IID, and unbalanced data-distribution scenarios. Our results show that BDD-HFL outperforms state-of-the-art federated distillation methods across five baselines, achieving up to a 3% improvement in average classification accuracy on the CIFAR-10, CIFAR-100, and MNIST datasets. The experiments demonstrate the superiority and generalization capability of BDD-HFL in addressing personalization challenges in federated learning.
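To make the decoupled objective concrete, below is a minimal PyTorch sketch of a distillation loss that splits the output distribution into a target-class term and a non-target-class term, each distilled with its own relative-entropy (KL) penalty alongside a cross-entropy term, in the spirit of the decomposition described in the abstract. The function name, temperature, and alpha/beta weights are illustrative assumptions, not values or APIs from the paper; BDD-HFL would apply such distillation bidirectionally between the private and local models.

    import torch
    import torch.nn.functional as F

    def decoupled_distillation_loss(student_logits, teacher_logits, labels,
                                    temperature=4.0, alpha=1.0, beta=1.0):
        # Sketch of a decoupled relative-entropy (KL) distillation loss.
        # alpha/beta weight the target and non-target terms and are
        # illustrative hyperparameters, not values from the paper.
        t = temperature
        gt_mask = F.one_hot(labels, student_logits.size(1)).bool()

        # Binary distribution (p_target, p_non-target) for student and teacher.
        s_prob = F.softmax(student_logits / t, dim=1)
        t_prob = F.softmax(teacher_logits / t, dim=1)
        s_bin = torch.stack([(s_prob * gt_mask).sum(1),
                             (s_prob * ~gt_mask).sum(1)], dim=1)
        t_bin = torch.stack([(t_prob * gt_mask).sum(1),
                             (t_prob * ~gt_mask).sum(1)], dim=1)
        target_kd = F.kl_div(torch.log(s_bin + 1e-8), t_bin,
                             reduction="batchmean")

        # Distribution over non-target classes only (target logit masked out).
        s_nt = F.log_softmax(student_logits / t - 1000.0 * gt_mask, dim=1)
        t_nt = F.softmax(teacher_logits / t - 1000.0 * gt_mask, dim=1)
        non_target_kd = F.kl_div(s_nt, t_nt, reduction="batchmean")

        ce = F.cross_entropy(student_logits, labels)  # supervised term
        return ce + (alpha * target_kd + beta * non_target_kd) * t * t

A bidirectional use would evaluate this loss twice per batch, e.g. once as decoupled_distillation_loss(local_logits, private_logits.detach(), labels) and once with the roles of the two models swapped, so that the private and local models each distill from the other.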
About the journal:
Entropy (ISSN 1099-4300), an international and interdisciplinary journal of entropy and information studies, publishes reviews, regular research papers, and short notes. Our aim is to encourage scientists to publish their theoretical and experimental work in as much detail as possible. There is no restriction on the length of papers. If the work involves computation or experiments, the details must be provided so that the results can be reproduced.