
IEEE Transactions on Machine Learning in Communications and Networking: Latest Articles

Optimal Stopping Theory-Based Online Node Selection in IoT Networks for Multi-Parameter Federated Learning
Pub Date : 2025-03-06 DOI: 10.1109/TMLCN.2025.3567370
Seda Dogan-Tusha;Faissal El Bouanani;Marwa Qaraqe
Federated learning (FL) has attracted researchers' interest because it avoids inefficient resource utilization by building a global learning model from local model parameters (LMP). This study introduces a novel optimal stopping theory (OST)-based online node selection scheme for low-complexity, multi-parameter FL in IoT networks. Global model accuracy (GMA) in FL depends on the accuracy of the LMP received by the central entity (CE). It is therefore essential to choose trustworthy nodes to guarantee a certain level of GMA without adding system complexity. For this reason, the proposed technique uses the secretary problem (SP) as the OST formulation and selects nodes by considering both the received signal strength (RSS) and the local model accuracy (LMA) of the available nodes. By leveraging the SP, the technique employs a stopping rule that maximizes the probability of selecting the best-quality node and thereby avoids testing all candidate nodes. To this end, the work provides a mathematical framework for maximizing the probability of selecting the best node among the candidates. Specifically, the framework is used to calculate the weighting coefficients of the RSS and LMA that define node quality. Comprehensive analysis and simulation results show that the proposed OST-based technique outperforms state-of-the-art methods: it beats random node selection in terms of GMA and offline node selection (exhaustive search) in terms of computational complexity.
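The stopping rule described in this abstract can be illustrated with the classic secretary-problem (1/e) rule. The sketch below is only an illustration under stated assumptions: RSS and LMA are assumed to be normalized to [0, 1], the 0.5/0.5 weights are placeholders rather than the coefficients derived in the paper, and the function names are hypothetical.

```python
import math
import random


def node_quality(rss, lma, w_rss=0.5, w_lma=0.5):
    """Weighted node quality. The default weights are placeholders (assumption)."""
    return w_rss * rss + w_lma * lma


def secretary_select(qualities):
    """Classic 1/e rule: observe the first n/e candidates without selecting,
    then pick the first later candidate that beats all observed ones."""
    n = len(qualities)
    if n == 0:
        raise ValueError("no candidate nodes")
    k = int(n / math.e)  # length of the observation phase
    threshold = max(qualities[:k], default=float("-inf"))
    for i in range(k, n):
        if qualities[i] > threshold:
            return i
    return n - 1  # nothing beat the threshold, so the last node is taken


def success_rate(n=50, trials=5000, seed=0):
    """Fraction of random trials in which the rule picks the best node."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        q = [node_quality(rng.random(), rng.random()) for _ in range(n)]
        if q[secretary_select(q)] == max(q):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"empirical probability of picking the best node: {success_rate():.3f}")
```

For large n the classic rule selects the best candidate with probability close to 1/e while inspecting each node only once, which is the property that lets an online scheme avoid an exhaustive search.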
Volume : 3 Pages : 659-676 PDF : https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10988901
Citations: 0
Paths Optimization by Jointing Link Management and Channel Estimation Using Variational Autoencoder With Attention for IRS-MIMO Systems
Pub Date : 2025-03-03 DOI: 10.1109/TMLCN.2025.3547689
Meng-Hsun Wu;Hong-Yunn Chen;Ta-Wei Yang;Chih-Chuan Hsu;Chih-Wei Huang;Cheng-Fu Chou
In massive MIMO systems, achieving optimal end-to-end transmission involves power control, modulation schemes, path selection, and accurate channel estimation. Nonetheless, optimizing resource allocation remains a significant challenge. In path selection, the direct link is the straightforward path between the transmitter and the receiver, whereas the indirect link involves reflection, diffraction, or scattering, often caused by interactions with objects or obstacles. Relying exclusively on one type of link can lead to suboptimal and limited performance. Link management (LM) is emerging as a viable solution, and accurate channel estimation provides the information needed to make informed decisions about transmission parameters. In this paper, we study LM and channel estimation that flexibly adjust the transmission ratio of direct and indirect links to improve generalization, using a denoising variational autoencoder with attention modules (DVAE-ATT) to enhance the sum rate. Our experiments show significant improvements in IRS-assisted millimeter-wave MIMO systems. Incorporating LM increased the sum rate and reduced the MSE by approximately 9%. Variational autoencoders (VAE) outperformed traditional autoencoders in the spatial domain, as confirmed by heatmap analysis. Additionally, our investigation of DVAE-ATT reveals notable differences in the temporal domain with and without attention mechanisms. Finally, we analyze performance across varying numbers of users and ranges. At distances of 5 m, 15 m, 25 m, and 35 m, performance improvements averaged 6%, 11%, 16%, and 22%, respectively.
Volume : 3 Pages : 381-394 PDF : https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10909334
Citations: 0
A Novel Multiple Access Scheme for Heterogeneous Wireless Communications Using Symmetry-Aware Continual Deep Reinforcement Learning
Pub Date : 2025-02-28 DOI: 10.1109/TMLCN.2025.3546183
Hamidreza Mazandarani;Masoud Shokrnezhad;Tarik Taleb
The Metaverse has the potential to revolutionize digital interactions by establishing a highly dynamic and immersive virtual realm over wireless communication systems, offering services such as massive twinning and telepresence. This landscape presents novel challenges, particularly the efficient management of multiple access to the frequency spectrum, for which numerous adaptive Deep Reinforcement Learning (DRL) approaches have been explored. However, challenges persist in adapting agents to heterogeneous and non-stationary wireless environments. In this paper, we present a novel approach that leverages Continual Learning (CL) to enhance intelligent Medium Access Control (MAC) protocols. An intelligent agent coexists with legacy User Equipments (UEs) whose numbers, protocols, and transmission profiles vary and, for the sake of backward compatibility and privacy, remain unknown to the agent. We introduce an adaptive Double and Dueling Deep Q-Learning (D3QL)-based MAC protocol, enriched by a symmetry-aware CL mechanism, which maximizes the intelligent agent's throughput while ensuring fairness. Mathematical analysis validates the efficiency of the proposed scheme, which outperforms conventional DRL-based techniques in throughput, collision rate, and fairness and remains responsive in real time in highly dynamic scenarios.
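The two ingredients behind the "Double and Dueling" name can be written down in a few lines. This is a generic textbook sketch, not the D3QL protocol itself: the function names are hypothetical, and the Q-values are passed in as plain lists instead of coming from neural networks.

```python
def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done=False):
    """Double DQN target: the online network chooses the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    print(dueling_q(1.0, [1.0, 2.0, 3.0]))
    print(double_dqn_target(1.0, 0.9, [0.2, 0.8], [0.5, 0.1]))
```

Separating action selection from action evaluation is the standard remedy for the overestimation bias of plain Q-learning, and the dueling split lets the state value be learned even for actions that are rarely taken.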
Volume : 3 Pages : 353-368 PDF : https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10908203
Citations: 0
UAV-Assisted Unbiased Hierarchical Federated Learning: Performance and Convergence Analysis
Pub Date : 2025-02-26 DOI: 10.1109/TMLCN.2025.3546181
Ruslan Zhagypar;Nour Kouzayha;Hesham ElSawy;Hayssam Dahrouj;Tareq Y. Al-Naffouri
The development of sixth-generation (6G) wireless networks is driving computation toward the network edge, where Hierarchical Federated Learning (HFL) plays a pivotal role in distributing learning across edge devices. In HFL, edge devices train local models and send updates to an edge server for local aggregation, and the aggregated results are then forwarded to a central server for global aggregation. However, the unreliability of communication channels on the edge and backhaul links poses a significant bottleneck for HFL-enabled systems. To address this challenge, this paper proposes an unbiased HFL algorithm for Uncrewed Aerial Vehicle (UAV)-assisted wireless networks. While applicable to terrestrial base stations (BSs), the proposed algorithm relies on UAVs for local model aggregation because they can enhance wireless channels with lower latency and improved coverage. The proposed algorithm adjusts the update weights during local and global aggregation at the UAVs to mitigate the impact of unreliable channels. To quantify channel unreliability in HFL, stochastic geometry tools are employed to assess the success probabilities of local and global model parameter transmissions. Incorporating these metrics aims to mitigate the bias toward devices with better channel conditions in UAV-assisted networks. The paper further examines the theoretical convergence of the proposed unbiased UAV-assisted HFL algorithm under adverse channel conditions and highlights the impact of the UAV's limited battery capacity on the efficiency of the HFL algorithm. Additionally, the algorithm facilitates the optimization of system parameters such as UAV count, altitude, and battery capacity. The simulation results underscore the effectiveness of the proposed unbiased HFL scheme, demonstrating 5.5% higher accuracy and approximately 85% faster convergence than conventional HFL algorithms. We make our code available at the following GitHub repository: UAV-assisted Unbiased HFL Code.
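One common way to remove the bias toward devices with good channels is inverse-probability weighting: each update that arrives is scaled by the reciprocal of its transmission success probability. The sketch below illustrates that idea only. The abstract does not give the paper's exact weights, so the formula, the function names, and the example probabilities are all assumptions.

```python
import random


def unbiased_aggregate(updates, success_probs, received):
    """Average the received updates, scaling each by 1 / p_i so that the
    expectation over channel outcomes equals the plain mean of all updates."""
    n = len(updates)
    dim = len(updates[0])
    agg = [0.0] * dim
    for update, p, ok in zip(updates, success_probs, received):
        if ok:  # only updates that survived the channel contribute
            for j in range(dim):
                agg[j] += update[j] / (p * n)
    return agg


def simulate(updates, success_probs, rounds=20000, seed=0):
    """Monte Carlo average of the aggregate over random channel outcomes."""
    rng = random.Random(seed)
    total = [0.0] * len(updates[0])
    for _ in range(rounds):
        received = [rng.random() < p for p in success_probs]
        agg = unbiased_aggregate(updates, success_probs, received)
        total = [t + a for t, a in zip(total, agg)]
    return [t / rounds for t in total]


if __name__ == "__main__":
    # Device 0 has a good channel (p = 0.9); device 1 a poor one (p = 0.3).
    print(simulate([[1.0], [3.0]], [0.9, 0.3]))  # close to the plain mean, 2.0
```

Without the 1 / p_i factor the long-run average would drift toward the update of the device with the better channel, which is the kind of bias the abstract describes.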
Volume : 3 Pages : 420-447 PDF : https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10904929
Citations: 0
On Traffic Prediction With Knowledge-Driven Spatial–Temporal Graph Convolutional Network Aided by Selected Attention Mechanism
Pub Date : 2025-02-26 DOI: 10.1109/TMLCN.2025.3545777
Yuwen Qian;Tianyang Qiu;Chuan Ma;Yiyang Ni;Long Yuan;Xiangwei Zhou;Jun Li
Intelligent transportation systems face the formidable task of precisely forecasting real-time traffic conditions, where the traffic dynamics are complicated by spatial and temporal dependencies. The urban road network is a complex web of interconnected roads, in which the state of traffic on one road can influence the conditions on others. Moreover, predicting traffic conditions requires considering diverse temporal factors. Notably, the closer a time point is to the present moment, the greater its impact on subsequent states. In this paper, we propose a knowledge-driven graph convolutional network (KGCN) aided by a gated recurrent unit with a selected attention mechanism (GSAM) to predict traffic flow. In particular, the KGCN captures the correlation between the external knowledge factors of the road and the spatial dependencies, and the gated recurrent unit (GRU) handles the temporal dependence. Furthermore, to improve prediction accuracy, we combine the GRU with a selected attention mechanism based on Gumbel-Max to predict traffic along the temporal dimension, where a selector dynamically assigns different weights to the features in different time intervals. Experimental results with real-life data show that the proposed KGCN with GSAM achieves high accuracy in traffic prediction. Compared to traditional traffic prediction methods, it achieves higher efficacy and robustness when capturing global dynamic temporal dependencies, external knowledge factor correlations, and spatial correlations.
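The Gumbel-Max trick mentioned in this abstract draws a sample from a categorical distribution by adding independent Gumbel noise to each logit and taking the argmax. The sketch below shows the trick on its own. How the paper wires it into the attention selector is not stated in the abstract, so the function names and the example logits are purely illustrative.

```python
import math
import random


def gumbel_max_sample(logits, rng):
    """Return an index drawn with probability softmax(logits)[i]."""
    best_idx, best_val = 0, float("-inf")
    for i, logit in enumerate(logits):
        u = max(rng.random(), 1e-12)  # guard against log(0)
        gumbel = -math.log(-math.log(u))  # standard Gumbel(0, 1) noise
        if logit + gumbel > best_val:
            best_idx, best_val = i, logit + gumbel
    return best_idx


def empirical_freqs(logits, n, seed=0):
    """Empirical selection frequencies over n independent samples."""
    rng = random.Random(seed)
    counts = [0] * len(logits)
    for _ in range(n):
        counts[gumbel_max_sample(logits, rng)] += 1
    return [c / n for c in counts]


if __name__ == "__main__":
    # Logits 0 and log(3) correspond to softmax probabilities 0.25 and 0.75.
    print(empirical_freqs([0.0, math.log(3.0)], 20000))
```

In a selector over time intervals, the logits would be the learned scores of the intervals, and the sampled index would mark the interval that receives the attention weight.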
Volume : 3 Pages : 369-380 PDF : https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10904899
Citations: 0
RACH Traffic Prediction in Massive Machine Type Communications
Pub Date : 2025-02-17 DOI: 10.1109/TMLCN.2025.3542760
Hossein Mehri;Hani Mehrpouyan;Hao Chen
Traffic pattern prediction has emerged as a promising approach for efficiently managing and mitigating the impact of event-driven bursty traffic in massive machine-type communication (mMTC) networks. However, accurately predicting bursty traffic remains a non-trivial task because of the inherent randomness of events, and the challenge intensifies in live network environments. Consequently, there is a need for a lightweight and agile framework that can assimilate continuously collected network data and accurately forecast bursty traffic in mMTC networks. This paper addresses these challenges by presenting a machine learning-based framework tailored for forecasting bursty traffic in multi-channel slotted ALOHA networks. The proposed machine learning network comprises long short-term memory (LSTM) and a DenseNet with feed-forward neural network (FFNN) layers, where the residual connections improve the network's ability to learn complicated patterns. Furthermore, we develop a new low-complexity online prediction algorithm that updates the states of the LSTM network by leveraging data frequently collected from the mMTC network. Simulation results and complexity analysis demonstrate the superiority of the proposed algorithm in both accuracy and complexity, making it well suited for time-critical live scenarios. We evaluate the performance of the proposed framework in a network with a single base station and thousands of devices organized into groups with distinct traffic-generating characteristics. Comprehensive evaluations and simulations indicate that the proposed machine learning approach achieves 52% higher accuracy in long-term predictions than traditional methods, without imposing additional processing load on the system.
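To see why bursts matter in a multi-channel slotted ALOHA network, it helps to count how many devices get through in one slot. The sketch below is a toy model under stated assumptions: every active device picks one of M channels uniformly at random, and a transmission succeeds only if no other device picked the same channel. The device and channel counts are arbitrary examples, not values from the paper.

```python
import random
from collections import Counter


def expected_successes(num_devices, num_channels):
    """Expected number of collision-free transmissions in one slot:
    N * (1 - 1/M) ** (N - 1)."""
    return num_devices * (1 - 1 / num_channels) ** (num_devices - 1)


def simulate_slot(num_devices, num_channels, rng):
    """Count the channels chosen by exactly one device in a single slot."""
    picks = Counter(rng.randrange(num_channels) for _ in range(num_devices))
    return sum(1 for count in picks.values() if count == 1)


def average_successes(num_devices, num_channels, slots=5000, seed=0):
    """Monte Carlo estimate of the mean number of successes per slot."""
    rng = random.Random(seed)
    total = sum(simulate_slot(num_devices, num_channels, rng) for _ in range(slots))
    return total / slots


if __name__ == "__main__":
    for n in (8, 30):  # a light load versus a burst
        sim = average_successes(n, 10)
        print(n, round(expected_successes(n, 10), 3), round(sim, 3))
```

In this toy model a burst of 30 devices on 10 channels yields fewer successful transmissions per slot than a light load of 8 devices, which is the kind of collapse that traffic prediction is meant to anticipate.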
Volume : 3 Pages : 315-331 PDF : https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10891603
Citations: 0
Federated Learning-Based Collaborative Wideband Spectrum Sensing and Scheduling for UAVs in UTM Systems
Pub Date : 2025-02-11 DOI: 10.1109/TMLCN.2025.3540747
Sravan Reddy Chintareddy;Keenan Roach;Kenny Cheung;Morteza Hashemi
In this paper, we propose a data-driven framework for collaborative wideband spectrum sensing and scheduling for networked unmanned aerial vehicles (UAVs), which act as secondary users (SUs) to opportunistically utilize detected “spectrum holes”. Our overall framework consists of three main stages. First, in the model training stage, we explore dataset generation in a multi-cell environment and train a machine learning (ML) model using the federated learning (FL) architecture. Unlike existing studies on FL for wireless that presume datasets are readily available for training, we propose an end-to-end architecture that directly integrates wireless dataset generation, which involves capturing I/Q samples from over-the-air signals in a multi-cell environment, into the FL training process. To this end, we formulate a multi-label classification problem for wideband spectrum sensing to detect multiple spectrum holes simultaneously based on the I/Q samples collected locally by the UAVs. In traditional FL that employs federated averaging (FedAvg) as the aggregating method, each UAV is assigned an equal weight during model aggregation. However, due to the differences in wireless channels observed at each UAV in a multi-cell environment, the received signal powers and collected datasets at different UAV locations could be significantly different, which could degrade the FL performance when equal weights are used. To address this issue, we propose a proportional weighted federated averaging method (pwFedAvg) in which the aggregating weights are proportional to the received signal powers at each UAV, thereby integrating the intrinsic properties of wireless channels into the FL algorithm. Second, in the collaborative spectrum inference stage, we propose a collaborative spectrum fusion strategy that is compatible with the unmanned aircraft system traffic management (UTM) ecosystem. In particular, we improve the accuracy of spectrum sensing results by combining the multi-label classification results from the individual UAVs through spectrum fusion at a central server. Finally, in the spectrum scheduling stage, we leverage reinforcement learning (RL) solutions to dynamically allocate the detected spectrum holes to the secondary users. To evaluate the proposed methods, we establish a comprehensive simulation framework that generates a near-realistic synthetic dataset using the MATLAB LTE Toolbox by incorporating base station (BS) locations in a chosen area of interest, performing ray-tracing, and emulating the primary user’s channel usage in terms of I/Q samples. This evaluation methodology provides a flexible framework to generate large spectrum datasets that could be used for developing ML/AI-based spectrum management solutions for aerial devices.
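The pwFedAvg idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pw_fedavg`, the flat-list representation of model parameters, and the use of linear-scale powers (for example in mW rather than dBm) are all assumptions made for illustration. The abstract states only that the aggregation weights are proportional to the received signal power at each UAV.

```python
from typing import List, Tuple


def pw_fedavg(
    local_models: List[List[float]],
    rx_powers: List[float],
) -> Tuple[List[float], List[float]]:
    """Aggregate local models with weights proportional to received power.

    local_models: one flat parameter vector per UAV (hypothetical layout).
    rx_powers: received signal power per UAV, assumed to be in linear
        units (e.g. mW); powers in dBm would need converting first.
    Returns the aggregated parameter vector and the normalized weights.
    """
    if not local_models or len(local_models) != len(rx_powers):
        raise ValueError("need one received power value per local model")
    total_power = sum(rx_powers)
    if total_power <= 0 or any(p < 0 for p in rx_powers):
        raise ValueError("received powers must be non-negative with a positive sum")

    # Normalize so the weights sum to 1 and stay proportional to power.
    weights = [p / total_power for p in rx_powers]

    dim = len(local_models[0])
    aggregated = [0.0] * dim
    for weight, model in zip(weights, local_models):
        if len(model) != dim:
            raise ValueError("all local models must have the same length")
        for i, value in enumerate(model):
            aggregated[i] += weight * value
    return aggregated, weights


if __name__ == "__main__":
    # Two UAVs; the second observes three times the received power.
    models = [[1.0, 2.0], [3.0, 4.0]]
    powers = [1.0, 3.0]
    global_model, w = pw_fedavg(models, powers)
    print(w, global_model)
```

With equal received powers the weights become uniform, so the sketch reduces to the equal-weight averaging that the abstract attributes to traditional FedAvg.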
Citations: 0
Reinforcement Learning With Selective Exploration for Interference Management in mmWave Networks
Pub Date : 2025-02-03 DOI: 10.1109/TMLCN.2025.3537967
Son Dinh-van;van-Linh Nguyen;Berna Bulut Cebecioglu;Antonino Masaracchia;Matthew D. Higgins
The next generation of wireless systems will leverage the millimeter-wave (mmWave) bands to meet the increasing traffic volume and high data rate requirements of emerging applications (e.g., ultra HD streaming, metaverse, and holographic telepresence). In this paper, we address the joint optimization of beamforming, power control, and interference management in multi-cell mmWave networks. We propose novel reinforcement learning algorithms, including a single-agent-based method (BPC-SA) for centralized settings and a multi-agent-based method (BPC-MA) for distributed settings. To tackle the high-variance rewards caused by narrow antenna beamwidths, we introduce a selective exploration method to guide the agent towards more intelligent exploration. Our proposed algorithms are well-suited for scenarios where beamforming vectors require control in either a discrete domain, such as a codebook, or a continuous domain. Furthermore, they do not require channel state information, extensive feedback from user equipment, or any searching methods, thus reducing overhead and enhancing scalability. Numerical results demonstrate that selective exploration improves per-user spectral efficiency by up to 22.5% compared to scenarios without it. Additionally, our algorithms outperform existing methods by 50% in terms of per-user spectral efficiency and achieve 90% of the per-user spectral efficiency of the exhaustive search approach while requiring only 0.1% of its computational runtime.
Citations: 0
Knowledge- and Model-Driven Deep Reinforcement Learning for Efficient Federated Edge Learning: Single- and Multi-Agent Frameworks
Pub Date : 2025-01-27 DOI: 10.1109/TMLCN.2025.3534754
Yangchen Li;Lingzhi Zhao;Tianle Wang;Lianghui Ding;Feng Yang
In this paper, we investigate federated learning (FL) efficiency improvement in practical edge computing systems, where edge workers have non-independent and identically distributed (non-IID) local data, as well as dynamic and heterogeneous computing and communication capabilities. We consider a general FL algorithm with configurable parameters, including the number of local iterations, mini-batch sizes, step sizes, aggregation weights, and quantization parameters, and provide a rigorous convergence analysis. We formulate a joint optimization problem for FL worker selection and algorithm parameter configuration to minimize the final test loss subject to time and energy constraints. The resulting problem is a complicated stochastic sequential decision-making problem with an implicit objective function and unknown transition probabilities. To address these challenges, we propose knowledge/model-driven single-agent and multi-agent deep reinforcement learning (DRL) frameworks. We transform the primal problem into a Markov decision process (MDP) for the single-agent DRL framework and a decentralized partially-observable Markov decision process (Dec-POMDP) for the multi-agent DRL framework. We develop efficient single-agent and multi-agent asynchronous advantage actor-critic (A3C) approaches to solve the MDP and Dec-POMDP, respectively. In both frameworks, we design a knowledge-based reward to facilitate effective DRL and propose a model-based stochastic policy to tackle the mixed discrete-continuous actions and large action spaces. To reduce the computational complexities of policy learning and execution, we introduce a segmented actor-critic architecture for the single-agent DRL and a distributed actor-critic architecture for the multi-agent DRL. Numerical results demonstrate the effectiveness and advantages of the proposed frameworks in enhancing FL efficiency.
Citations: 0
Risk-Aware Reinforcement Learning Framework for User-Centric O-RAN
Pub Date : 2025-01-24 DOI: 10.1109/TMLCN.2025.3534139
Shahrukh Khan Kasi;Fahd Ahmed Khan;Sabit Ekin;Ali Imran
The evolution of Open Radio Access Networks (O-RAN) presents an opportunity to enhance network performance by enabling dynamic orchestration of configuration and optimization parameters (COPs) through online learning methods. However, leveraging this potential requires overcoming the limitations of traditional cell-centric RAN architectures, which lack the necessary flexibility. On the other hand, despite their recent popularity, the practical deployment of online learning frameworks, such as Deep Reinforcement Learning (DRL)-based COP optimization solutions, remains limited due to their risk of deteriorating network performance during the exploration phase. In this article, we propose and analyze a novel risk-aware DRL framework for user-centric RAN (UC-RAN), which offers both architectural flexibility and the COP optimization needed to exploit it. We investigate and identify UC-RAN COPs that can be optimized via a soft actor-critic algorithm implementable as an O-RAN application (rApp) to jointly maximize latency satisfaction, reliability satisfaction, area spectral efficiency, and energy efficiency. We use offline learning on UC-RAN to reliably accelerate DRL training, thus minimizing the risk of DRL deteriorating cellular network performance. Results show that our proposed solution approaches near-optimal performance in just a few hundred iterations, with the risk score decreasing by a factor of ten.
Citations: 0