Deep Reinforcement Learning for Over-the-Air Federated Learning in SWIPT-Enabled IoT Networks

2022 IEEE 96th Vehicular Technology Conference (VTC2022-Fall) | Pub Date: 2022-09-01 | DOI: 10.1109/VTC2022-Fall57202.2022.10012702

Xinran Zhang, Hui Tian, Wanli Ni, Mengying Sun


Citations: 1

Abstract



Deep Reinforcement Learning for Over-the-Air Federated Learning in SWIPT-Enabled IoT Networks

As a distributed machine learning paradigm, federated learning (FL) is regarded as a promising candidate for preserving user privacy in Internet of Things (IoT) networks. Leveraging the waveform superposition property of wireless channels, over-the-air FL (AirFL) achieves fast model aggregation by integrating communication and computation via concurrent analog transmissions. To support sustainable AirFL among energy-constrained IoT devices, we consider a base station (BS) that adopts simultaneous wireless information and power transfer (SWIPT) to distribute the global model and charge local devices in each communication round. To maximize the long-term energy efficiency (EE) of AirFL, we investigate a resource allocation problem that jointly optimizes the time division, transceiver beamforming, and power splitting in SWIPT-enabled IoT networks. Because these continuous variables are closely coupled, we propose a deep reinforcement learning (DRL) algorithm based on the twin delayed deep deterministic policy gradient (TD3) to determine downlink and uplink communication strategies through coordination between the BS and the devices. Simulation results show that the proposed TD3 algorithm achieves about 41% EE improvement compared with a traditional optimization method and other DRL algorithms.
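The three mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' system model: the real-valued scalar channels, the channel-inversion precoding, the linear energy-harvesting efficiency `eta`, and the function names `airfl_aggregate`, `power_split`, and `td3_target` are all assumptions chosen for illustration. The paper itself optimizes transceiver beamforming, which this sketch does not attempt to model.

```python
import numpy as np


def airfl_aggregate(local_updates, channels, noise_std, rng):
    """Average local updates through a simulated superposition channel.

    Assumption: real, positive scalar channels h_k. Each device pre-scales
    its update by 1/h_k, so the concurrent analog signals add up as
    y = sum_k h_k * (w_k / h_k) + n at the receiver.
    """
    tx = local_updates / channels[:, None]               # per-device precoding
    rx = (channels[:, None] * tx).sum(axis=0)            # signals superpose in the air
    rx = rx + rng.normal(0.0, noise_std, size=rx.shape)  # additive receiver noise
    return rx / len(local_updates)                       # scale the sum to an average


def power_split(rx_power, rho, eta):
    """Split received power: a fraction rho goes to information decoding,
    the remaining (1 - rho) is harvested with an assumed linear efficiency eta."""
    info_power = rho * rx_power
    harvested_power = eta * (1.0 - rho) * rx_power
    return info_power, harvested_power


def td3_target(reward, gamma, q1_next, q2_next):
    """TD3 bootstrap target: use the smaller of the twin critic estimates
    to counter value overestimation."""
    return reward + gamma * np.minimum(q1_next, q2_next)


# Usage with made-up numbers: 4 devices, 3 model parameters each.
rng = np.random.default_rng(0)
updates = rng.normal(size=(4, 3))
channels = rng.uniform(0.5, 1.5, size=4)
estimate = airfl_aggregate(updates, channels, noise_std=0.0, rng=rng)
```

With `noise_std=0.0` the estimate matches the plain average of the local updates up to floating-point error. Raising `noise_std` shows how receiver noise perturbs the aggregated model, which is one reason the transmit and receive design matters in AirFL.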


Source journal

2022 IEEE 96th Vehicular Technology Conference (VTC2022-Fall)

Self-citation rate

0.00%

Publication count