SCALE: Self-regulated Clustered federAted LEarning in a Homogeneous Environment

Sai Puppala, Ismail Hossain, Md Jahangir Alam, Sajedul Talukder, Zahidur Talukder, Syed Bahauddin
{"title":"SCALE:在同质环境中进行自我调节的聚类联合学习","authors":"Sai Puppala, Ismail Hossain, Md Jahangir Alam, Sajedul Talukder, Zahidur Talukder, Syed Bahauddin","doi":"arxiv-2407.18387","DOIUrl":null,"url":null,"abstract":"Federated Learning (FL) has emerged as a transformative approach for enabling\ndistributed machine learning while preserving user privacy, yet it faces\nchallenges like communication inefficiencies and reliance on centralized\ninfrastructures, leading to increased latency and costs. This paper presents a\nnovel FL methodology that overcomes these limitations by eliminating the\ndependency on edge servers, employing a server-assisted Proximity Evaluation\nfor dynamic cluster formation based on data similarity, performance indices,\nand geographical proximity. Our integrated approach enhances operational\nefficiency and scalability through a Hybrid Decentralized Aggregation Protocol,\nwhich merges local model training with peer-to-peer weight exchange and a\ncentralized final aggregation managed by a dynamically elected driver node,\nsignificantly curtailing global communication overhead. Additionally, the\nmethodology includes Decentralized Driver Selection, Check-pointing to reduce\nnetwork traffic, and a Health Status Verification Mechanism for system\nrobustness. Validated using the breast cancer dataset, our architecture not\nonly demonstrates a nearly tenfold reduction in communication overhead but also\nshows remarkable improvements in reducing training latency and energy\nconsumption while maintaining high learning performance, offering a scalable,\nefficient, and privacy-preserving solution for the future of federated learning\necosystems.","PeriodicalId":501291,"journal":{"name":"arXiv - CS - Performance","volume":"42 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"SCALE: Self-regulated Clustered federAted LEarning in a Homogeneous Environment\",\"authors\":\"Sai Puppala, Ismail Hossain, Md Jahangir Alam, Sajedul Talukder, Zahidur Talukder, Syed Bahauddin\",\"doi\":\"arxiv-2407.18387\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Federated Learning (FL) has emerged as a transformative approach for enabling\\ndistributed machine learning while preserving user privacy, yet it faces\\nchallenges like communication inefficiencies and reliance on centralized\\ninfrastructures, leading to increased latency and costs. This paper presents a\\nnovel FL methodology that overcomes these limitations by eliminating the\\ndependency on edge servers, employing a server-assisted Proximity Evaluation\\nfor dynamic cluster formation based on data similarity, performance indices,\\nand geographical proximity. Our integrated approach enhances operational\\nefficiency and scalability through a Hybrid Decentralized Aggregation Protocol,\\nwhich merges local model training with peer-to-peer weight exchange and a\\ncentralized final aggregation managed by a dynamically elected driver node,\\nsignificantly curtailing global communication overhead. Additionally, the\\nmethodology includes Decentralized Driver Selection, Check-pointing to reduce\\nnetwork traffic, and a Health Status Verification Mechanism for system\\nrobustness. 
Validated using the breast cancer dataset, our architecture not\\nonly demonstrates a nearly tenfold reduction in communication overhead but also\\nshows remarkable improvements in reducing training latency and energy\\nconsumption while maintaining high learning performance, offering a scalable,\\nefficient, and privacy-preserving solution for the future of federated learning\\necosystems.\",\"PeriodicalId\":501291,\"journal\":{\"name\":\"arXiv - CS - Performance\",\"volume\":\"42 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-07-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Performance\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2407.18387\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Performance","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2407.18387","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Federated Learning (FL) has emerged as a transformative approach for enabling distributed machine learning while preserving user privacy, yet it faces challenges such as communication inefficiencies and reliance on centralized infrastructure, which increase latency and cost. This paper presents a novel FL methodology that overcomes these limitations by eliminating the dependency on edge servers, employing a server-assisted Proximity Evaluation for dynamic cluster formation based on data similarity, performance indices, and geographical proximity. Our integrated approach enhances operational efficiency and scalability through a Hybrid Decentralized Aggregation Protocol, which merges local model training with peer-to-peer weight exchange and a centralized final aggregation managed by a dynamically elected driver node, significantly curtailing global communication overhead. The methodology also includes Decentralized Driver Selection, Check-pointing to reduce network traffic, and a Health Status Verification Mechanism for system robustness. Validated on the breast cancer dataset, our architecture not only demonstrates a nearly tenfold reduction in communication overhead but also shows marked reductions in training latency and energy consumption while maintaining high learning performance, offering a scalable, efficient, and privacy-preserving solution for future federated learning ecosystems.
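The abstract describes the Proximity Evaluation only at the level of its three criteria. The following is a minimal sketch of how such a server-assisted affinity score and greedy cluster formation could look; all names (proximity_score, form_clusters, haversine), the weighting scheme, and the threshold are illustrative assumptions, not the paper's actual design.

```python
import math
import numpy as np

def haversine(p, q, r_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r_km * math.asin(math.sqrt(h))

def proximity_score(a, b, w_data=0.5, w_perf=0.3, w_geo=0.2):
    """Weighted affinity between two clients (weights are illustrative).
    Each client is a dict with 'hist' (normalized label histogram as a
    numpy array), 'perf' (performance index in [0, 1]), 'loc' (lat, lon)."""
    data_sim = 1.0 - 0.5 * np.abs(a["hist"] - b["hist"]).sum()  # 1 - total variation
    perf_sim = 1.0 - abs(a["perf"] - b["perf"])                 # similar capability
    geo_sim = 1.0 / (1.0 + haversine(a["loc"], b["loc"]) / 100.0)  # ~1 if co-located
    return w_data * data_sim + w_perf * perf_sim + w_geo * geo_sim

def form_clusters(clients, threshold=0.7):
    """Greedy cluster formation: join the first cluster whose seed client
    is within the affinity threshold, otherwise start a new cluster."""
    clusters = []
    for c in clients:
        for cluster in clusters:
            if proximity_score(c, cluster[0]) >= threshold:
                cluster.append(c)
                break
        else:
            clusters.append([c])
    return clusters
```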
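Similarly, the Hybrid Decentralized Aggregation Protocol is characterized as peer-to-peer weight exchange inside each cluster followed by a single driver-managed upload to the centralized final aggregation. The sketch below shows one plausible round under assumed names (gossip_round, elect_driver, cluster_round); electing the healthiest client with the best performance index is one reading of "dynamically elected" combined with the Health Status Verification Mechanism, not a confirmed detail of the paper.

```python
import numpy as np

def gossip_round(weights, topology):
    """One peer-to-peer exchange: each client replaces its weights with the
    mean of its own and its neighbors' (topology maps client index to
    neighbor indices, e.g. a ring within the cluster)."""
    return [np.mean([weights[i]] + [weights[j] for j in topology[i]], axis=0)
            for i in range(len(weights))]

def elect_driver(health_ok, perf):
    """Hypothetical decentralized driver selection: among clients passing
    the health-status check, pick the one with the best performance index."""
    alive = [i for i, ok in enumerate(health_ok) if ok]
    return max(alive, key=lambda i: perf[i])

def cluster_round(weights, topology, health_ok, perf, p2p_steps=3):
    """After local training, run a few gossip steps so weights mix inside
    the cluster, then let the elected driver report one model upstream
    for the centralized final aggregation."""
    for _ in range(p2p_steps):
        weights = gossip_round(weights, topology)
    driver = elect_driver(health_ok, perf)
    return driver, weights[driver]
```

Under such a scheme only the driver communicates globally each round, so uplink traffic scales with the number of clusters rather than the number of clients, which is consistent with the near-tenfold reduction in communication overhead the abstract reports.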