Radio Map-Based Delivery Sequence Design and Trajectory Optimization in UAV Cargo Delivery Systems
Pub Date : 2025-12-02 | DOI: 10.1109/TMLCN.2025.3639348
Fahui Wu;Zhijie Wang;Jiangling Cao;Shi Peng;Yu Xu;Yunfei Gao;Qinghua Wu;Dingcheng Yang
In this paper, we consider a UAV-assisted cargo delivery system with limited payload capacity. Because the cargo UAV can carry only a limited load, it must make multiple trips to the warehouse to pick up parcels. Meanwhile, because cellular signal strength is unevenly distributed in the air, the cellular-connected UAV must bypass weak-signal regions to deliver logistics information to ground users (GUs) in time. Together, these two factors increase the total cargo delivery time. To reduce the total delivery time while ensuring the UAV's communication quality, we formulate an objective function that is the weighted sum of the delivery time and the communication outage time of the cargo UAV. We propose a limited payload UAV delivery (LP-UAV-D) framework to solve this problem, combining the particle swarm optimization (PSO) algorithm with the dueling double deep Q network (D3QN) algorithm. Two classic algorithms serve as baselines. The numerical results show that, regardless of the UAV's maximum payload or flight speed, the objective value obtained by our proposed LP-UAV-D framework with the help of radio maps is always the smallest. Specifically, on the trade-off between delivery time and communication quality, performance improves by about 10%-20% over the two baseline algorithms.
{"title":"Radio Map-Based Delivery Sequence Design and Trajectory Optimization in UAV Cargo Delivery Systems","authors":"Fahui Wu;Zhijie Wang;Jiangling Cao;Shi Peng;Yu Xu;Yunfei Gao;Qinghua Wu;Dingcheng Yang","doi":"10.1109/TMLCN.2025.3639348","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3639348","url":null,"abstract":"In this paper, we consider a UAV-assisted cargo delivery system with limited payload capacity. Due to the limited load capacity of the cargo UAV, it needs to make multiple trips to the warehouse to pick up the parcels. Meanwhile, due to the uneven distribution of cellular signal strength in the air, to send logistics information to ground users (GUs) in time, the cellular-connected UAV needs to bypass the weak signal area in the air. Therefore, these two factors lead to the increase of the total cargo delivery time. To reduce the total delivery time and ensure the communication quality of the UAV, we formulate an objective function to be optimized, which is the weighted sum of the delivery time and the communication outage time of the cargo UAV. We propose a limited payload UAV delivery (LP-UAV-D) framework to solve this problem. The framework consists of the particle swarm optimization (PSO) algorithm and the dueling double deep Q network (D3QN) algorithm. We used two classic algorithms as control groups. The numerical results show that regardless of the maximum payload or flight speed of the UAV, the objective function value obtained through our proposed LP-UAV-D framework and with the help of radio maps is always the smallest. Specifically, the performance of solving the trade-off problem between delivery time and communication quality is improved by about 10%-20% compared with the two comparison algorithms.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"17-32"},"PeriodicalIF":0.0,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11272178","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust Generalization of Graph Neural Networks for Scheduling Backscatter Communications at Scale
Pub Date : 2025-11-28 | DOI: 10.1109/TMLCN.2025.3638711
Daniel F. Pérez-Ramírez;Nicolas Tsiftes;Carlos Pérez-Penichet;Dejan Kostić;Thiemo Voigt;Magnus Boman
Novel backscatter communication techniques allow battery-free sensor tags to operate with standard IoT devices, thereby augmenting a network’s sensing capabilities. For communicating, sensor tags rely on an unmodulated carrier provided by neighboring IoT devices, with a schedule coordinating this provisioning across the network. Computing schedules to interrogate all sensor tags while minimizing energy, spectrum utilization, and latency—i.e., carrier scheduling—is an NP-hard problem. While recent work introduces learning-based systems for carrier scheduling, we find that their advantage over traditional heuristics progressively decreases for networks with hundreds of IoT nodes. Moreover, we find that their generalization is not consistent: it varies greatly across identically trained models even when the dataset, hyperparameters, and random seeds are fixed. We present RobustGANTT, a Graph Neural Network scheduler for backscatter networks that learns from optimal schedules of small networks (up to 10 nodes). Our scheduler generalizes, without retraining, to networks of up to hundreds of nodes ($100\times$ the training topology sizes), and exhibits consistent generalization across independent training rounds. We evaluate our system on both simulated topologies of up to 1000 nodes and real-life IoT network topologies of up to 300 IoT devices. RobustGANTT not only exhibits better generalization than existing systems, it also computes schedules achieving up to $2\times$ less energy and spectrum utilization. Additionally, its polynomial runtime complexity allows it to react quickly to changing network conditions. Our work facilitates the operation of large-scale IoT networks, and our machine learning findings further advance the capabilities of learning-based network scheduling. We release our code, datasets, and pre-trained models.
{"title":"Robust Generalization of Graph Neural Networks for Scheduling Backscatter Communications at Scale","authors":"Daniel F. Pérez-Ramírez;Nicolas Tsiftes;Carlos Pérez-Penichet;Dejan Kostić;Thiemo Voigt;Magnus Boman","doi":"10.1109/TMLCN.2025.3638711","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3638711","url":null,"abstract":"Novel backscatter communication techniques allow battery-free sensor tags to operate with standard IoT devices, thereby augmenting a network’s sensing capabilities. For communicating, sensor tags rely on an unmodulated carrier provided by neighboring IoT devices, with a schedule coordinating this provisioning across the network. Computing schedules to interrogate all sensor tags while minimizing energy, spectrum utilization, and latency—i.e., carrier scheduling—is an NP-hard problem. While recent work introduces learning-based systems for carrier scheduling, we find that their advantage over traditional heuristics progressively decreases for networks with hundreds of IoT nodes. Moreover, we find that their generalization is not consistent: it greatly varies across identically trained models while fixing the dataset, hyperparameters and random seeds used. We present RobustGANTT, a Graph Neural Network scheduler for backscatter networks that learns from optimal schedules of small networks (up to 10 nodes). Our scheduler generalizes, without the need for retraining, to networks of up to hundreds of nodes (<inline-formula> <tex-math>$mathbf {100}boldsymbol {times }$ </tex-math></inline-formula> training topology sizes), and exhibits consistent generalization across independent training rounds. We evaluate our system on both simulated topologies of up to 1000 nodes and real-life IoT network topologies of up to 300 IoT devices. RobustGANTT not only exhibits better generalization than existing systems, it also computes schedules achieving up to <inline-formula> <tex-math>$mathbf {2}boldsymbol {times }$ </tex-math></inline-formula> less energy and spectrum utilization. Additionally, its polynomial runtime complexity allows it to react fast to changing network conditions. Our work facilitates the operation of large-scale IoT networks, and our machine learning findings further advance the capabilities of learning-based network scheduling. We release our code, datasets and pre-trained models.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"76-97"},"PeriodicalIF":0.0,"publicationDate":"2025-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11271344","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Clustered Federated Learning to Support Context-Dependent CSI Decoding
Pub Date : 2025-11-28 | DOI: 10.1109/TMLCN.2025.3638983
Heasung Kim;Hyeji Kim;Gustavo De Veciana
Neural network-based encoders and decoders have demonstrated significant performance gains over traditional methods for Channel State Information (CSI) feedback in MIMO communications. However, key challenges in deploying these models in real-world scenarios remain underexplored, including: a) the need to efficiently accommodate diverse channel conditions across varying contexts (e.g., environments), and whether to use multiple encoders and decoders; b) the cost of gathering sufficient data to train neural network models across various contexts; and c) the need to protect sensitive data regarding competing providers’ coverage. To address the first challenge, we propose a novel system using context-dependent decoders and a universal encoder. We limit the number of decoders by clustering similar contexts and allowing those within a cluster to share the same decoder. To address the second and third challenges, we introduce a clustered federated learning-based approach that jointly clusters contexts and learns the desired encoder and context cluster-dependent decoders, leveraging distributed data. The clustering is performed efficiently based on the similarity of time-averaged gradients across contexts. To evaluate our approach, a new dataset reflecting the heterogeneous nature of wireless systems was curated and made publicly available. Extensive experimental results demonstrate that our proposed CSI compression framework is highly effective, efficiently determining a correct context clustering and the associated encoder and decoders.
{"title":"Clustered Federated Learning to Support Context-Dependent CSI Decoding","authors":"Heasung Kim;Hyeji Kim;Gustavo De Veciana","doi":"10.1109/TMLCN.2025.3638983","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3638983","url":null,"abstract":"Neural network-based encoders and decoders have demonstrated significant performance gains over traditional methods for Channel State Information (CSI) feedback in MIMO communications. However, key challenges in deploying these models in real-world scenarios remain underexplored, including: a) the need to efficiently accommodate diverse channel conditions across varying contexts, e.g., environments, and whether to use multiple encoders and decoders; b) the cost of gathering sufficient data to train neural network models across various contexts; and c) the need to protect sensitive data regarding competing providers’ coverages. To address the first challenge, we propose a novel system using context-dependent decoders and a universal encoder. We limit the number of decoders by clustering similar contexts and allowing those within a cluster to share the same decoder. To address the second and third challenges, we introduce a clustered federated learning-based approach that jointly clusters contexts and learns the desired encoder and context cluster-dependent decoders, leveraging distributed data. The clustering is performed efficiently based on the similarity of time-averaged gradients across contexts. To evaluate our approach, a new dataset reflecting the heterogeneous nature of the wireless systems was curated and made publicly available. Extensive experimental results demonstrate that our proposed CSI compression framework is highly effective and able to efficiently determine a correct context clustering and associated encoder and decoders.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"211-227"},"PeriodicalIF":0.0,"publicationDate":"2025-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11271400","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PFL-GAN: Client Heterogeneity Meets Generative Models in Personalized Federated Learning
Pub Date : 2025-11-27 | DOI: 10.1109/TMLCN.2025.3637784
Achintha Wijesinghe;Songyang Zhang;Zhi Ding
Recent advances in generative artificial intelligence (AI) have led to rising interest in federated learning (FL) based on generative adversarial network (GAN) models. GAN-based FL shows promise in many communication and network applications, such as edge computing and the Internet of Things. In the context of FL, GANs can capture the underlying client data structure and regenerate samples resembling the original data distribution without compromising data privacy. Although most existing GAN-based FL works focus on training a global model, personalized FL (PFL) can be more desirable in scenarios with client data heterogeneity in terms of distinct data distributions, feature spaces, and labels. To cope with client heterogeneity in GAN-based FL, we propose a novel GAN sharing and aggregation strategy for PFL that efficiently characterizes client heterogeneity in different settings. More specifically, our proposed PFL-GAN first learns the similarities among clients and then performs a weighted collaborative data aggregation. Our empirical results through rigorous experimentation on several well-known datasets demonstrate the effectiveness of PFL-GAN.
{"title":"PFL-GAN: Client Heterogeneity Meets Generative Models in Personalized Federated Learning","authors":"Achintha Wijesinghe;Songyang Zhang;Zhi Ding","doi":"10.1109/TMLCN.2025.3637784","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3637784","url":null,"abstract":"Recent advances in generative artificial intelligence (AI) have led to rising interest in federated learning (FL) based on generative adversarial network (GAN) models. GAN-based FL shows promises in many communication and network applications, such as edge computing and the Internet of Things. In the context of FL, GANs can capture the underlying client data structure, and regenerate samples resembling the original data distribution without compromising data privacy. Although most existing GAN-based FL works focus on training a global model, some scenarios exist where personalized FL (PFL) can be more desirable when incorporating client data heterogeneity in terms of distinct data distributions, feature spaces, and labels. To cope with client heterogeneity in GAN-based FL, we propose a novel GAN sharing and aggregation strategy for PFL that can efficiently characterize client heterogeneity in different settings. More specifically, our proposed PFL-GAN first learns the similarities among clients before implementing a weighted collaborative data aggregation. Our empirical results through rigorous experimentation on several well-known datasets demonstrate the effectiveness of PFL-GAN.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"33-44"},"PeriodicalIF":0.0,"publicationDate":"2025-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11270937","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Personalized Federated Learning With Adaptive Transformer Pruning and Hypernetwork-Driven Personalization in Wireless Networks
Pub Date : 2025-11-25 | DOI: 10.1109/TMLCN.2025.3637083
Moqbel Hamood;Abdullatif Albaseer;Hassan El-Sallabi;Mohamed Abdallah;Ala Al-Fuqaha;Bechir Hamdaoui
Deploying transformer models in Personalized Federated Learning (PFL) at the wireless edge faces critical challenges, including high communication overhead, latency, and energy consumption. Existing compression methods, such as pruning and sparsification, typically degrade performance due to the sensitivity of self-attention layers (SALs) to parameter reduction. Also, standard federated averaging (FedAvg) often diminishes personalization by blending crucial client-specific parameters. To overcome these issues, we propose PFL-TPP (Personalized Federated Learning with Transformer Pruning and Personalization). This dual-strategy framework effectively reduces computational and communication burdens while maintaining high model accuracy and personalization. Our approach employs dynamic, learnable threshold pruning on feed-forward layers (FFLs) to eliminate redundant computations. For SALs, we introduce a novel server-side hypernetwork that generates personalized attention parameters from client-specific embeddings, significantly cutting communication overhead without sacrificing personalization. Extensive experiments demonstrate that PFL-TPP achieves up to 82.73% energy savings, 86% reduction in training time, and improved model accuracy compared to standard baselines. These results demonstrate the effectiveness of our proposed approach in enabling scalable, communication-efficient deployment of transformers in real-world PFL scenarios.
{"title":"Personalized Federated Learning With Adaptive Transformer Pruning and Hypernetwork-Driven Personalization in Wireless Networks","authors":"Moqbel Hamood;Abdullatif Albaseer;Hassan El-Sallabi;Mohamed Abdallah;Ala Al-Fuqaha;Bechir Hamdaoui","doi":"10.1109/TMLCN.2025.3637083","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3637083","url":null,"abstract":"Deploying transformer models in Personalized Federated Learning (PFL) at the wireless edge faces critical challenges, including high communication overhead, latency, and energy consumption. Existing compression methods, such as pruning and sparsification, typically degrade performance due to the sensitivity of self-attention layers (SALs) to parameter reduction. Also, standard federated averaging (FedAvg) often diminishes personalization by blending crucial client-specific parameters. To overcome these issues, we propose PFL-TPP (Personalized Federated Learning with Transformer Pruning and Personalization). This dual-strategy framework effectively reduces computational and communication burdens while maintaining high model accuracy and personalization. Our approach employs dynamic, learnable threshold pruning on feed-forward layers (FFLs) to eliminate redundant computations. For SALs, we introduce a novel server-side hypernetwork that generates personalized attention parameters from client-specific embeddings, significantly cutting communication overhead without sacrificing personalization. Extensive experiments demonstrate that PFL-TPP achieves up to 82.73% energy savings, 86% reduction in training time, and improved model accuracy compared to standard baselines. These results demonstrate the effectiveness of our proposed approach in enabling scalable, communication-efficient deployment of transformers in real-world PFL scenarios.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"1-16"},"PeriodicalIF":0.0,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11268477","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Resource Optimization in Multi-Hop IAB Networks: Balancing Data Freshness and Spectral Efficiency
Pub Date : 2025-11-20 | DOI: 10.1109/TMLCN.2025.3635578
Sarder Fakhrul Abedin;Aamir Mahmood;Zhu Han;Mikael Gidlund
This work proposes a multi-objective resource optimization framework for integrated access and backhaul (IAB) networks, tackling the dual challenges of timely data updates and spectral efficiency under dynamic wireless conditions. Conventional single-objective optimization is often impractical for IAB networks, where objective preferences are unknown or difficult to predefine. Therefore, we formulate a multi-objective problem that minimizes the age of information (AoI) and maximizes spectral efficiency, subject to a risk-aware AoI constraint, access-backhaul throughput fairness, and other contextual requirements. A lightweight proportional fair (PF) scheduling algorithm first handles user association and access resource allocation. Subsequently, a Pareto Q-learning-based reinforcement learning (RL) scheme allocates backhaul resources, with the PF scheduler’s outcomes integrated into the state and constrained action spaces of a Markov decision process (MDP). The reward function balances AoI and spectral efficiency objectives while explicitly capturing fairness, thereby resulting in robust long-term performance without imposing fixed weights. Furthermore, an adaptive value-difference-based exploration technique adjusts exploration rates based on Q-value estimate variances, promoting strategic exploration for optimal trade-offs. Simulations show that the proposed method outperforms baselines, reducing the convexity gap between approximated and optimal Pareto fronts by 68.6% and improving fairness by 16.9%.
{"title":"Resource Optimization in Multi-Hop IAB Networks: Balancing Data Freshness and Spectral Efficiency","authors":"Sarder Fakhrul Abedin;Aamir Mahmood;Zhu Han;Mikael Gidlund","doi":"10.1109/TMLCN.2025.3635578","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3635578","url":null,"abstract":"This work proposes a multi-objective resource optimization framework for integrated access and backhaul (IAB) networks, tackling the dual challenges of timely data updates and spectral efficiency under dynamic wireless conditions. Conventional single-objective optimization is often impractical for IAB networks, where objective preferences are unknown or difficult to predefine. Therefore, we formulate a multi-objective problem that minimizes the age of information (AoI) and maximizes spectral efficiency, subject to a risk-aware AoI constraint, access-backhaul throughput fairness, and other contextual requirements. A lightweight proportional fair (PF) scheduling algorithm first handles user association and access resource allocation. Subsequently, a Pareto Q-learning-based reinforcement learning (RL) scheme allocates backhaul resources, with the PF scheduler’s outcomes integrated into the state and constrained action spaces of a Markov decision process (MDP). The reward function balances AoI and spectral efficiency objectives while explicitly capturing fairness, thereby resulting in robust long-term performance without imposing fixed weights. Furthermore, an adaptive value-difference-based exploration technique adjusts exploration rates based on Q-value estimate variances, promoting strategic exploration for optimal trade-offs. Simulations show that the proposed method outperforms baselines, reducing the convexity gap between approximated and optimal Pareto fronts by 68.6% and improving fairness by 16.9%.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1287-1310"},"PeriodicalIF":0.0,"publicationDate":"2025-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11262194","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Task-Specific Sharpness-Aware O-RAN Resource Management Using Multi-Agent Reinforcement Learning
Pub Date : 2025-11-19 | DOI: 10.1109/TMLCN.2025.3634994
Fatemeh Lotfi;Hossein Rajoli;Fatemeh Afghah
Next-generation networks utilize the Open Radio Access Network (O-RAN) architecture to enable dynamic resource management, facilitated by the RAN Intelligent Controller (RIC). While deep reinforcement learning (DRL) models show promise in optimizing network resources, they often struggle with robustness and generalizability in dynamic environments. This paper introduces a novel resource management approach that enhances the Soft Actor Critic (SAC) algorithm with Sharpness-Aware Minimization (SAM) in a distributed Multi-Agent RL (MARL) framework. Our method introduces an adaptive and selective SAM mechanism, where regularization is explicitly driven by temporal-difference (TD)-error variance, ensuring that only agents facing high environmental complexity are regularized. This targeted strategy reduces unnecessary overhead, improves training stability, and enhances generalization without sacrificing learning efficiency. We further incorporate a dynamic $\rho$ scheduling scheme to refine the exploration-exploitation trade-off across agents. Experimental results show our method significantly outperforms conventional DRL approaches, yielding up to a 22% improvement in resource allocation efficiency and ensuring superior QoS satisfaction across diverse O-RAN slices.
{"title":"Task-Specific Sharpness-Aware O-RAN Resource Management Using Multi-Agent Reinforcement Learning","authors":"Fatemeh Lotfi;Hossein Rajoli;Fatemeh Afghah","doi":"10.1109/TMLCN.2025.3634994","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3634994","url":null,"abstract":"Next-generation networks utilize the Open Radio Access Network (O-RAN) architecture to enable dynamic resource management, facilitated by the RAN Intelligent Controller (RIC). While deep reinforcement learning (DRL) models show promise in optimizing network resources, they often struggle with robustness and generalizability in dynamic environments. This paper introduces a novel resource management approach that enhances the Soft Actor Critic (SAC) algorithm with Sharpness-Aware Minimization (SAM) in a distributed Multi-Agent RL (MARL) framework. Our method introduces an adaptive and selective SAM mechanism, where regularization is explicitly driven by temporal-difference (TD)-error variance, ensuring that only agents facing high environmental complexity are regularized. This targeted strategy reduces unnecessary overhead, improves training stability, and enhances generalization without sacrificing learning efficiency. We further incorporate a dynamic <inline-formula> <tex-math>$rho $ </tex-math></inline-formula> scheduling scheme to refine the exploration-exploitation trade-off across agents. Experimental results show our method significantly outperforms conventional DRL approaches, yielding up to a 22% improvement in resource allocation efficiency and ensuring superior QoS satisfaction across diverse O-RAN slices.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"98-114"},"PeriodicalIF":0.0,"publicationDate":"2025-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11260483","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Communication Efficient Federated Learning With Quantization-Aware Training Design
Pub Date : 2025-11-19 | DOI: 10.1109/TMLCN.2025.3635050
Xiang Fang;Li Chen;Huarui Yin;Xiaohui Chen;Weidong Wang
Model quantization is an effective method for improving communication efficiency in federated learning (FL). Existing FL quantization protocols mostly stop at post-training quantization (PTQ), which incurs large quantization loss, especially at low bit widths. In this work, we propose an FL quantization training strategy that reduces the impact of quantization on model quality. Specifically, we first apply quantization-aware training (QAT) to FL (QAT-FL), which reduces quantization distortion by adding a fake-quantization module to the model so that the model can perceive future quantization during training. The convergence guarantee of the QAT-FL algorithm is established under certain assumptions. Building on the QAT-FL algorithm, we extend the design to non-uniform quantization and an adaptive algorithm, so that the model can adaptively adjust the parameter distribution and the number of quantization bits to reduce the amount of traffic during training. Experimental results on the MNIST, CIFAR-10, and FEMNIST datasets show that QAT-FL has advantages in terms of training loss and model inference accuracy, and that adaptive-bits quantization of QAT-FL also greatly improves communication efficiency.
{"title":"Communication Efficient Federated Learning With Quantization-Aware Training Design","authors":"Xiang Fang;Li Chen;Huarui Yin;Xiaohui Chen;Weidong Wang","doi":"10.1109/TMLCN.2025.3635050","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3635050","url":null,"abstract":"Model quantization is an effective method that can improve communication efficiency in federated learning (FL). The existing FL quantization protocols almost stay at the level of post-training quantization (PTQ), which comes at the cost of large quantization loss, especially in the setting of low-bits quantization. In this work, we propose a FL quantization training strategy to reduce the impact of quantization on model quality. Specifically, we first apply quantization-aware training (QAT) to FL (QAT-FL), which reduces quantization distortion by adding a fake-quantization module to the model so that the model could perceive future quantization during training. The convergence guarantee of the QAT-FL algorithm is established under certain assumptions. On the basis of the QAT-FL algorithm, we extend the discussion of non-uniform quantization and the adaptive algorithm, so that the model can adaptively adjust the parametric distribution and the number of quantization bits to reduce the amount of traffic in training. Experimental results based on MNIST, CIFAR-10 and FEMNIST datasets show that QAT-FL has advantages in terms of training loss and model inference accuracy, and adaptive-bits quantization of QAT-FL also greatly improves communication efficiency.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"45-59"},"PeriodicalIF":0.0,"publicationDate":"2025-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11260453","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Reinforcement Learning Framework for Resource Allocation in Uplink Carrier Aggregation in the Presence of Self Interference
Pub Date : 2025-11-14 | DOI: 10.1109/TMLCN.2025.3633248
Jaswanth Bodempudi;Batta Siva Sairam;Madepalli Haritha;Sandesh Rao Mattu;Ananthanarayanan Chockalingam
To meet the ever-increasing demand for higher data rates in mobile networks across generations, many novel schemes have been proposed in the standards. One such scheme is carrier aggregation (CA), a technique that allows mobile networks to combine multiple carriers to increase data rate and improve network efficiency. On the uplink, for power-constrained users, this translates to the need for an efficient resource allocation scheme in which each user distributes its available power among its assigned uplink carriers. Choosing a good set of carriers and allocating appropriate power on them is of paramount importance for good performance. Another critical factor is how well the degradation caused by the harmonic/intermodulation terms generated by the user’s transmitter non-linearities is handled. For example, if a harmonic of a user’s uplink carrier falls on that user’s downlink frequency, the resulting self-coupling induces a sensitivity degradation of that user’s downlink receiver. Considering these factors, in this paper we model uplink carrier aggregation as an optimal resource allocation problem with the associated constraints of non-linearity-induced self interference (SI). This involves optimization over a discrete variable (which carriers to turn on) and a continuous variable (how much power to allocate on the selected carriers) in dynamic environments, a problem that is hard to solve using traditional methods owing to the mixed nature of the optimization variables and the additional need to consider the SI constraint. We therefore adopt a reinforcement learning (RL) framework involving a compound-action actor-critic (CA2C) algorithm for the uplink carrier aggregation problem, and we propose a novel reward function that is critical for enabling the CA2C algorithm to handle SI efficiently. The CA2C algorithm, together with the proposed reward function, learns to assign and activate suitable carriers in an online fashion. Numerical results demonstrate that the proposed RL-based scheme achieves higher sum throughputs than naive schemes, and that the proposed reward function allows the CA2C algorithm to adapt the optimization both in the presence and absence of SI.
{"title":"A Reinforcement Learning Framework for Resource Allocation in Uplink Carrier Aggregation in the Presence of Self Interference","authors":"Jaswanth Bodempudi;Batta Siva Sairam;Madepalli Haritha;Sandesh Rao Mattu;Ananthanarayanan Chockalingam","doi":"10.1109/TMLCN.2025.3633248","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3633248","url":null,"abstract":"To meet the ever-increasing demand for higher data rates in mobile networks across generations, many novel schemes have been proposed in the standards. One such scheme is carrier aggregation (CA). Simply put, CA is a technique that allows mobile networks to combine multiple carriers to increase data rate and improve network efficiency. On the uplink, for power constrained users, this translates to the need for an efficient resource allocation scheme, where each user distributes its available power among its assigned uplink carriers. Choosing a good set of carriers and allocating appropriate power on the carriers is of paramount importance for good performance. Another factor that is critical to obtaining good performance is how well the degradation caused by the harmonic/intermodulation terms generated by the user’s transmitter non-linearities is handled. Specifically, for example, if the carrier allocation is such that a harmonic of a user’s uplink carrier falls on the downlink frequency of that user, it leads to a self coupling-induced sensitivity degradation of that user’s downlink receiver. Considering these factors, in this paper, we model the uplink carrier aggregation problem as an optimal resource allocation problem with the associated constraints of non-linearities induced self interference (SI). This involves optimization over a discrete variable (which carriers need to be turned on) and a continuous variable (what power needs to be allocated on the selected carriers) in dynamic environments, a problem which is hard to solve using traditional methods owing to the mixed nature of the optimization variables and the additional need to consider the SI constraint in the problem. Therefore, in this paper, we adopt a reinforcement learning (RL) framework involving a compound-action actor-critic (CA2C) algorithm for the uplink carrier aggregation problem. We propose a novel reward function that is critical for enabling the proposed CA2C algorithm to efficiently handle SI. The CA2C algorithm along with the proposed reward function learns to assign and activate suitable carriers in an online fashion. Numerical results demonstrate that the proposed RL based scheme is able to achieve higher sum throughputs compared to naive schemes. The results also demonstrate that the proposed reward function allows the CA2C algorithm to adapt the optimization both in the presence and absence of SI.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1265-1286"},"PeriodicalIF":0.0,"publicationDate":"2025-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11248959","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Toward Autonomous and Efficient Cybersecurity: A Multi-Objective AutoML-Based Intrusion Detection System
Pub Date : 2025-11-11 | DOI: 10.1109/TMLCN.2025.3631379
Li Yang;Abdallah Shami
With increasingly sophisticated cybersecurity threats and rising demand for network automation, autonomous cybersecurity mechanisms are becoming critical for securing modern networks. The rapid expansion of Internet of Things (IoT) systems amplifies these challenges, as resource-constrained IoT devices demand scalable and efficient security solutions. In this work, an innovative Intrusion Detection System (IDS) utilizing Automated Machine Learning (AutoML) and Multi-Objective Optimization (MOO) is proposed for autonomous and optimized cyber-attack detection in modern networking environments. The proposed IDS framework integrates two primary innovative techniques: Optimized Importance and Percentage-based Automated Feature Selection (OIP-AutoFS) and Optimized Performance, Confidence, and Efficiency-based Combined Algorithm Selection and Hyperparameter Optimization (OPCE-CASH). These components optimize the feature selection and model learning processes to strike a balance between intrusion detection effectiveness and computational efficiency. This work presents the first IDS framework that integrates all four AutoML stages and employs multi-objective optimization to jointly optimize detection effectiveness, efficiency, and confidence for deployment in resource-constrained systems. Experimental evaluations on two benchmark cybersecurity datasets demonstrate that the proposed MOO-AutoML IDS outperforms state-of-the-art IDSs, establishing a new benchmark for autonomous, efficient, and optimized network security. Designed to support resource-constrained IoT and edge environments, the proposed framework is applicable to a variety of autonomous cybersecurity applications across diverse networked environments.
{"title":"Toward Autonomous and Efficient Cybersecurity: A Multi-Objective AutoML-Based Intrusion Detection System","authors":"Li Yang;Abdallah Shami","doi":"10.1109/TMLCN.2025.3631379","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3631379","url":null,"abstract":"With increasingly sophisticated cybersecurity threats and rising demand for network automation, autonomous cybersecurity mechanisms are becoming critical for securing modern networks. The rapid expansion of Internet of Things (IoT) systems amplifies these challenges, as resource-constrained IoT devices demand scalable and efficient security solutions. In this work, an innovative Intrusion Detection System (IDS) utilizing Automated Machine Learning (AutoML) and Multi-Objective Optimization (MOO) is proposed for autonomous and optimized cyber-attack detection in modern networking environments. The proposed IDS framework integrates two primary innovative techniques: Optimized Importance and Percentage-based Automated Feature Selection (OIP-AutoFS) and Optimized Performance, Confidence, and Efficiency-based Combined Algorithm Selection and Hyperparameter Optimization (OPCE-CASH). These components optimize feature selection and model learning processes to strike a balance between intrusion detection effectiveness and computational efficiency. This work presents the first IDS framework that integrates all four AutoML stages and employs multi-objective optimization to jointly optimize detection effectiveness, efficiency, and confidence for deployment in resource-constrained systems. Experimental evaluations over two benchmark cybersecurity datasets demonstrate that the proposed MOO-AutoML IDS outperforms state-of-the-art IDSs, establishing a new benchmark for autonomous, efficient, and optimized security for networks. Designed to support IoT and edge environments with resource constraints, the proposed framework is applicable to a variety of autonomous cybersecurity applications across diverse networked environments.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1244-1264"},"PeriodicalIF":0.0,"publicationDate":"2025-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11240569","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145560635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}