Efficient and Intelligent Multijob Federated Learning in Wireless Networks

IF 8.9; Zone 1 (Computer Science); JCR Q1 (Computer Science, Information Systems); IEEE Internet of Things Journal; Pub Date: 2024-11-19; DOI: 10.1109/JIOT.2024.3502403

Jiajin Wang;Ne Wang;Ruiting Zhou;Bo Li

IEEE Internet of Things Journal, vol. 12, no. 7, pp. 8685-8698. https://ieeexplore.ieee.org/document/10757319/

Abstract

Federated learning (FL) has emerged as a privacy-preserving paradigm that enables collaborative machine learning (ML) model training across multiple data owners (also known as clients) without access to the clients’ raw data. Most existing FL research focuses on scenarios in which a single job is trained. In practice, multiple FL jobs can be trained simultaneously on a common pool of clients, a scenario known as multijob FL. However, multijob FL training remains an open problem and poses significant challenges: heightened heterogeneity of jobs and clients, complex tradeoffs between training latency and energy consumption, uncertainty in client quality, and a potentially linear switching cost associated with client selection. This work aims to jointly optimize training efficiency in terms of latency, energy consumption, and switching cost for multiple jobs in stochastic and dynamic environments. Specifically, we propose a novel multijob FL framework, named EffI-FL, incorporating three innovative designs: 1) to reduce switching cost, we extend the client selection interval from every round to multiple rounds, called a block, within which client subset switching is prohibited; 2) we employ multiarmed bandit (MAB) methods to measure clients’ latency and energy cost under uncertainty, and we use the virtual queue technique to track clients’ battery usage patterns; by integrating this client-side knowledge, we propose an adaptive client selection policy that balances latency, energy consumption, and battery condition; and 3) given that multiple jobs may compete for the same client, we devise a greedy algorithm to assign each client to a single job. We rigorously prove that the regret of our client selection policy and the cost of our block-wise client subset switching algorithm are both sublinear. Finally, we implement EffI-FL using PyTorch and conduct experiments demonstrating that EffI-FL reduces the weighted sum of latency, energy consumption, and switching cost by up to 52.3% compared to four state-of-the-art FL frameworks.
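
The three designs named in the abstract can be illustrated with a toy simulation. The sketch below is not the EffI-FL algorithm, whose details the abstract does not give; it is a minimal illustration under stated assumptions: a UCB-style lower confidence bound stands in for the MAB estimator, each client's cost is a single scalar (an assumed weighted sum of latency and energy) that also serves as a proxy for energy drawn, and all names (lcb_index, update_queue, greedy_assign, simulate) and parameters (budget, block_len, demand) are hypothetical.

```python
import math
import random


def lcb_index(mean_cost, pulls, t, explore=1.0):
    """Optimistic (lower-confidence-bound) estimate of a cost to be minimized."""
    if pulls == 0:
        return -math.inf  # clients never tried are explored first
    return mean_cost - explore * math.sqrt(2.0 * math.log(max(t, 2)) / pulls)


def update_queue(queue, energy_used, budget):
    """Virtual queue: backlog grows when energy use exceeds the per-round budget."""
    return max(queue + energy_used - budget, 0.0)


def greedy_assign(scores, demand):
    """Give each client to at most one job, taking the cheapest (job, client) pairs first.

    scores[j][c] is the index of client c for job j (lower is better);
    demand[j] is the number of clients job j needs.
    """
    pairs = sorted((s, j, c) for j, row in enumerate(scores) for c, s in enumerate(row))
    assignment = {j: [] for j in range(len(scores))}
    taken = set()
    for _, j, c in pairs:
        if c not in taken and len(assignment[j]) < demand[j]:
            assignment[j].append(c)
            taken.add(c)
    return assignment


def simulate(num_clients=6, demand=(2, 2), blocks=10, block_len=5, budget=0.3, seed=0):
    rng = random.Random(seed)
    num_jobs = len(demand)
    # Hidden mean costs (assumed weighted latency + energy), unknown to the selector.
    true_cost = [[rng.uniform(0.2, 1.0) for _ in range(num_clients)] for _ in range(num_jobs)]
    mean = [[0.0] * num_clients for _ in range(num_jobs)]
    pulls = [[0] * num_clients for _ in range(num_jobs)]
    queue = [0.0] * num_clients
    history, switches = [], 0
    for b in range(1, blocks + 1):
        # Selection happens once per block; subsets stay frozen inside the block.
        scores = [[lcb_index(mean[j][c], pulls[j][c], b) + queue[c]
                   for c in range(num_clients)] for j in range(num_jobs)]
        assignment = greedy_assign(scores, demand)
        if history:
            # Count jobs whose client subset changed relative to the previous block.
            switches += sum(set(assignment[j]) != set(history[-1][j]) for j in assignment)
        history.append(assignment)
        for _ in range(block_len):
            busy = set()
            for j, clients in assignment.items():
                for c in clients:
                    # Noisy observation, clipped to [0, 1.5].
                    cost = min(max(rng.gauss(true_cost[j][c], 0.05), 0.0), 1.5)
                    pulls[j][c] += 1
                    mean[j][c] += (cost - mean[j][c]) / pulls[j][c]  # running average
                    queue[c] = update_queue(queue[c], cost, budget)  # cost as energy proxy
                    busy.add(c)
            for c in range(num_clients):
                if c not in busy:
                    queue[c] = update_queue(queue[c], 0.0, budget)  # idle clients recover
    return history, switches, queue
```

Calling simulate() returns the per-block assignments, the number of subset switches, and the final virtual-queue backlogs. Because selection runs only at block boundaries, the switch count is bounded by the number of blocks times the number of jobs, not by the number of rounds, which is the intuition behind the block design.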

Source journal

IEEE Internet of Things Journal (Computer Science: Information Systems)

CiteScore: 17.60

Self-citation rate: 13.20%

Publication count: 1982

Journal description: The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impacts on sensor technologies, big data management, and future Internet design for applications like smart cities and smart homes. Fields of interest include IoT architecture such as things-centric, data-centric, and service-oriented IoT architecture; IoT enabling technologies and systematic integration such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds such as IoT service middleware, IoT application programming interface (API), IoT application design, and IoT trials/experiments; and IoT standardization activities and technology development in different standard development organizations (SDO) such as IEEE, IETF, ITU, 3GPP, and ETSI.