Pub Date: 2025-12-16 | DOI: 10.1109/TMLCN.2025.3644333
Ons Aouedi;Flor Ortiz;Thang X. Vu;Alexandre Lefourn;Felix Giese;Guillermo Gutierrez;Symeon Chatzinotas
The growing integration of non-terrestrial networks (NTNs), particularly low Earth orbit (LEO) satellite constellations, has significantly extended the reach of maritime connectivity, supporting critical applications such as vessel monitoring, navigation safety, and maritime surveillance in remote and oceanic regions. Automatic Identification System (AIS) data, increasingly collected through a combination of satellite and terrestrial infrastructures, provide a rich source of spatiotemporal vessel information. However, accurate trajectory prediction in maritime domains remains challenging due to irregular sampling rates, dynamic environmental conditions, and heterogeneous vessel behaviors. This study proposes a velocity-based trajectory prediction framework that leverages AIS data collected from integrated satellite–terrestrial networks. Rather than directly predicting absolute positions (latitude and longitude), our model predicts vessel motion in the form of latitude and longitude velocities. This formulation simplifies the learning task, enhances temporal continuity, and improves scalability, making it well-suited for resource-constrained NTN environments. The predictive architecture is built upon a Long Short-Term Memory network enhanced with attention mechanisms and residual connections (LSTM-RA), enabling it to capture complex temporal dependencies and adapt to noise in real-world AIS data. Extensive experiments on two maritime datasets validate the robustness and accuracy of our framework, demonstrating clear improvements over state-of-the-art baselines.
{"title":"AIS-Based Hybrid Vessel Trajectory Prediction for Enhanced Maritime Navigation","authors":"Ons Aouedi;Flor Ortiz;Thang X. Vu;Alexandre Lefourn;Felix Giese;Guillermo Gutierrez;Symeon Chatzinotas","doi":"10.1109/TMLCN.2025.3644333","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3644333","url":null,"abstract":"The growing integration of non-terrestrial networks (NTNs), particularly low Earth orbit (LEO) satellite constellations, has significantly extended the reach of maritime connectivity, supporting critical applications such as vessel monitoring, navigation safety, and maritime surveillance in remote and oceanic regions. Automatic Identification System (AIS) data, increasingly collected through a combination of satellite and terrestrial infrastructures, provide a rich source of spatiotemporal vessel information. However, accurate trajectory prediction in maritime domains remains challenging due to irregular sampling rates, dynamic environmental conditions, and heterogeneous vessel behaviors. This study proposes a velocity-based trajectory prediction framework that leverages AIS data collected from integrated satellite–terrestrial networks. Rather than directly predicting absolute positions (latitude and longitude), our model predicts vessel motion in the form of latitude and longitude velocities. This formulation simplifies the learning task, enhances temporal continuity, and improves scalability, making it well-suited for resource-constrained NTN environments. The predictive architecture is built upon a Long Short-Term Memory network enhanced with attention mechanisms and residual connections (<monospace>LSTM-RA</monospace>), enabling it to capture complex temporal dependencies and adapt to noise in real-world AIS data. Extensive experiments on two maritime datasets validate the robustness and accuracy of our framework, demonstrating clear improvements over state-of-the-art baselines.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"198-210"},"PeriodicalIF":0.0,"publicationDate":"2025-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11301841","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-12 | DOI: 10.1109/TMLCN.2025.3643409
George P. Kontoudis;Daniel J. Stilwell
In this paper, we propose scalable methods for Gaussian process (GP) prediction in decentralized multi-agent systems. Multiple aggregation techniques for GP prediction are decentralized with the use of iterative and consensus methods. Moreover, we introduce a covariance-based nearest neighbor selection strategy that leverages cross-covariance similarity, enabling subsets of agents to make accurate predictions. The proposed decentralized schemes preserve the consistency properties of their centralized counterparts, while adhering to federated learning principles by restricting raw data exchange between agents. We validate the efficacy of the proposed decentralized algorithms with numerical experiments on real-world sea surface temperature and ground elevation map datasets across multiple fleet sizes.
{"title":"Multi-Agent Federated Learning Using Covariance-Based Nearest Neighbor Gaussian Processes","authors":"George P. Kontoudis;Daniel J. Stilwell","doi":"10.1109/TMLCN.2025.3643409","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3643409","url":null,"abstract":"In this paper, we propose scalable methods for Gaussian process (GP) prediction in decentralized multi-agent systems. Multiple aggregation techniques for GP prediction are decentralized with the use of iterative and consensus methods. Moreover, we introduce a covariance-based nearest neighbor selection strategy that leverages cross-covariance similarity, enabling subsets of agents to make accurate predictions. The proposed decentralized schemes preserve the consistency properties of their centralized counterparts, while adhering to federated learning principles by restricting raw data exchange between agents. We validate the efficacy of the proposed decentralized algorithms with numerical experiments on real-world sea surface temperature and ground elevation map datasets across multiple fleet sizes.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"115-138"},"PeriodicalIF":0.0,"publicationDate":"2025-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11299094","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-09 | DOI: 10.1109/TMLCN.2025.3642211
Basil AsSadhan;Abdulmuneem Bashaiwth;Hamad Binsalleeh
With the increasing prevalence of DDoS attacks, various machine learning-based detection models have been employed to mitigate these malicious behaviors. Understanding how machine learning models function can be quite complex, especially for intricate and nonlinear models like deep learning architectures. Recently, various techniques have been proposed to interpret deep learning models and address issues of ambiguity. In this paper, we present a comprehensive analysis of various explanation methods applied to a Long Short-Term Memory (LSTM) model for detecting Distributed Denial of Service (DDoS) attacks on raw traffic data. While previous studies have focused primarily on improving detection accuracy on feature-based datasets, this paper emphasizes the importance of interpretability of deep learning models on raw-based traffic datasets. By employing explanation techniques such as LIME, SHAP, Anchor, and LORE, we provide insights into the decision-making processes of LSTM models, thereby enhancing trust and understanding in classifying DDoS attacks. The use of raw network traffic revealed crucial packet fields behind the true and false positive predictions of the LSTM model, and identified common network fields among DDoS attacks that explain misclassifications between similar attack types.
"A Deeper Look on Explanation Methods for Deep Learning Models on Raw-Based Traffic of DDoS Attacks," IEEE Transactions on Machine Learning in Communications and Networking, vol. 4, pp. 139–160.
Pub Date: 2025-12-08 | DOI: 10.1109/TMLCN.2025.3638067
{"title":"IEEE Communications Society Board of Governors","authors":"","doi":"10.1109/TMLCN.2025.3638067","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3638067","url":null,"abstract":"","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"C3-C3"},"PeriodicalIF":0.0,"publicationDate":"2025-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11283087","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145698226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-02 | DOI: 10.1109/TMLCN.2025.3639365
Sheikh Islam;Xin Ma;Chunxiao Chigan
Achieving effective self-interference cancellation (SIC) in full-duplex (FD) wireless communication systems under time-varying channel conditions remains a significant challenge. To address this challenge, we propose a novel adaptive SIC solution by leveraging Hyper Neural Networks (HyperNet) and incremental learning (IL). Unlike existing methods that rely on offline training or lack real-time adaptability, our approach enables autonomous learning and fast adaptation to the complex, nonlinear, and time-varying nature of self-interference (SI) channels. It effectively addresses dynamic adaptation challenges, such as catastrophic forgetting, through the use of experience replay (ER). Our experimental results show that traditional model-based methods exhibit limited adaptability under dynamic channel conditions, while conventional data-driven models fail to maintain consistent performance without the adaptive capabilities provided by IL. In contrast, the proposed HyperNet-based IL model reduces training time by 33% and achieves three times faster convergence compared to a standalone HyperNet trained separately for each static condition. Extensive evaluations using simulated datasets that emulate real-world scenarios demonstrate that our approach consistently achieves SI suppression down to the noise floor. It also delivers significantly lower computational complexity and training time. These improvements collectively enhance the efficiency and reliability of FD communication systems operating in dynamic wireless environments.
{"title":"Adaptive Nonlinear Digital Self-Interference Cancellation for Full-Duplex Wireless Systems Using Hypernetwork-Based Incremental Learning","authors":"Sheikh Islam;Xin Ma;Chunxiao Chigan","doi":"10.1109/TMLCN.2025.3639365","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3639365","url":null,"abstract":"Achieving effective self-interference cancellation (SIC) in full-duplex (FD) wireless communication systems under time-varying channel conditions remains a significant challenge. To address this challenge, we propose a novel adaptive SIC solution through leveraging Hyper Neural Networks (HyperNet) and incremental learning (IL). Unlike the existing methods that rely on offline training or lack real-time adaptability, our approach enables autonomous learning and fast adaptation to the complex, nonlinear, and time-varying nature of self-interference (SI) channels. It effectively addresses dynamic adaptation challenges, such as catastrophic forgetting, through the use of experience replay (ER). Our experimental results show that traditional model-based methods exhibit limited adaptability under dynamic channel conditions, while conventional data-driven models fail to maintain consistent performance without the adaptive capabilities provided by IL. In contrast, the proposed HyperNet-based IL model reduces training time by 33% and achieves three times faster convergence compared to a standalone HyperNet trained separately for each static condition. Extensive evaluations using simulated datasets that emulate real-world scenarios demonstrate that our approach consistently achieves SI suppression down to the noise floor. It also delivers significantly lower computational complexity and training time. These improvements collectively enhance the efficiency and reliability of FD communication systems operating in dynamic wireless environments.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"60-75"},"PeriodicalIF":0.0,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11272907","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-02 | DOI: 10.1109/TMLCN.2025.3639348
Fahui Wu;Zhijie Wang;Jiangling Cao;Shi Peng;Yu Xu;Yunfei Gao;Qinghua Wu;Dingcheng Yang
In this paper, we consider a UAV-assisted cargo delivery system with limited payload capacity. Because the cargo UAV can carry only a limited load, it must make multiple trips to the warehouse to pick up parcels. Meanwhile, because cellular signal strength is unevenly distributed in the air, the cellular-connected UAV must bypass weak-signal regions in order to send logistics information to ground users (GUs) in time. Together, these two factors increase the total cargo delivery time. To reduce the total delivery time while ensuring the UAV's communication quality, we formulate an objective function defined as the weighted sum of the delivery time and the communication outage time of the cargo UAV. We propose a limited payload UAV delivery (LP-UAV-D) framework to solve this problem. The framework combines the particle swarm optimization (PSO) algorithm and the dueling double deep Q-network (D3QN) algorithm. We use two classical algorithms as baselines. The numerical results show that, regardless of the UAV's maximum payload or flight speed, the objective value obtained by the proposed LP-UAV-D framework with the aid of radio maps is always the smallest. Specifically, the trade-off between delivery time and communication quality is improved by about 10%-20% compared with the two baseline algorithms.
{"title":"Radio Map-Based Delivery Sequence Design and Trajectory Optimization in UAV Cargo Delivery Systems","authors":"Fahui Wu;Zhijie Wang;Jiangling Cao;Shi Peng;Yu Xu;Yunfei Gao;Qinghua Wu;Dingcheng Yang","doi":"10.1109/TMLCN.2025.3639348","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3639348","url":null,"abstract":"In this paper, we consider a UAV-assisted cargo delivery system with limited payload capacity. Due to the limited load capacity of the cargo UAV, it needs to make multiple trips to the warehouse to pick up the parcels. Meanwhile, due to the uneven distribution of cellular signal strength in the air, to send logistics information to ground users (GUs) in time, the cellular-connected UAV needs to bypass the weak signal area in the air. Therefore, these two factors lead to the increase of the total cargo delivery time. To reduce the total delivery time and ensure the communication quality of the UAV, we formulate an objective function to be optimized, which is the weighted sum of the delivery time and the communication outage time of the cargo UAV. We propose a limited payload UAV delivery (LP-UAV-D) framework to solve this problem. The framework consists of the particle swarm optimization (PSO) algorithm and the dueling double deep Q network (D3QN) algorithm. We used two classic algorithms as control groups. The numerical results show that regardless of the maximum payload or flight speed of the UAV, the objective function value obtained through our proposed LP-UAV-D framework and with the help of radio maps is always the smallest. Specifically, the performance of solving the trade-off problem between delivery time and communication quality is improved by about 10%-20% compared with the two comparison algorithms.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"17-32"},"PeriodicalIF":0.0,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11272178","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-28 | DOI: 10.1109/TMLCN.2025.3638711
Daniel F. Pérez-Ramírez;Nicolas Tsiftes;Carlos Pérez-Penichet;Dejan Kostić;Thiemo Voigt;Magnus Boman
Novel backscatter communication techniques allow battery-free sensor tags to operate with standard IoT devices, thereby augmenting a network’s sensing capabilities. For communicating, sensor tags rely on an unmodulated carrier provided by neighboring IoT devices, with a schedule coordinating this provisioning across the network. Computing schedules to interrogate all sensor tags while minimizing energy, spectrum utilization, and latency—i.e., carrier scheduling—is an NP-hard problem. While recent work introduces learning-based systems for carrier scheduling, we find that their advantage over traditional heuristics progressively decreases for networks with hundreds of IoT nodes. Moreover, we find that their generalization is not consistent: it greatly varies across identically trained models while fixing the dataset, hyperparameters and random seeds used. We present RobustGANTT, a Graph Neural Network scheduler for backscatter networks that learns from optimal schedules of small networks (up to 10 nodes). Our scheduler generalizes, without the need for retraining, to networks of up to hundreds of nodes (100× the training topology size), and exhibits consistent generalization across independent training rounds. We evaluate our system on both simulated topologies of up to 1000 nodes and real-life IoT network topologies of up to 300 IoT devices. RobustGANTT not only exhibits better generalization than existing systems, it also computes schedules achieving up to 2× less energy and spectrum utilization. Additionally, its polynomial runtime complexity allows it to react fast to changing network conditions. Our work facilitates the operation of large-scale IoT networks, and our machine learning findings further advance the capabilities of learning-based network scheduling. We release our code, datasets and pre-trained models.
{"title":"Robust Generalization of Graph Neural Networks for Scheduling Backscatter Communications at Scale","authors":"Daniel F. Pérez-Ramírez;Nicolas Tsiftes;Carlos Pérez-Penichet;Dejan Kostić;Thiemo Voigt;Magnus Boman","doi":"10.1109/TMLCN.2025.3638711","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3638711","url":null,"abstract":"Novel backscatter communication techniques allow battery-free sensor tags to operate with standard IoT devices, thereby augmenting a network’s sensing capabilities. For communicating, sensor tags rely on an unmodulated carrier provided by neighboring IoT devices, with a schedule coordinating this provisioning across the network. Computing schedules to interrogate all sensor tags while minimizing energy, spectrum utilization, and latency—i.e., carrier scheduling—is an NP-hard problem. While recent work introduces learning-based systems for carrier scheduling, we find that their advantage over traditional heuristics progressively decreases for networks with hundreds of IoT nodes. Moreover, we find that their generalization is not consistent: it greatly varies across identically trained models while fixing the dataset, hyperparameters and random seeds used. We present RobustGANTT, a Graph Neural Network scheduler for backscatter networks that learns from optimal schedules of small networks (up to 10 nodes). Our scheduler generalizes, without the need for retraining, to networks of up to hundreds of nodes (<inline-formula> <tex-math>$mathbf {100}boldsymbol {times }$ </tex-math></inline-formula> training topology sizes), and exhibits consistent generalization across independent training rounds. We evaluate our system on both simulated topologies of up to 1000 nodes and real-life IoT network topologies of up to 300 IoT devices. RobustGANTT not only exhibits better generalization than existing systems, it also computes schedules achieving up to <inline-formula> <tex-math>$mathbf {2}boldsymbol {times }$ </tex-math></inline-formula> less energy and spectrum utilization. Additionally, its polynomial runtime complexity allows it to react fast to changing network conditions. Our work facilitates the operation of large-scale IoT networks, and our machine learning findings further advance the capabilities of learning-based network scheduling. We release our code, datasets and pre-trained models.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"76-97"},"PeriodicalIF":0.0,"publicationDate":"2025-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11271344","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-28 | DOI: 10.1109/TMLCN.2025.3638983
Heasung Kim;Hyeji Kim;Gustavo De Veciana
Neural network-based encoders and decoders have demonstrated significant performance gains over traditional methods for Channel State Information (CSI) feedback in MIMO communications. However, key challenges in deploying these models in real-world scenarios remain underexplored, including: a) the need to efficiently accommodate diverse channel conditions across varying contexts, e.g., environments, and whether to use multiple encoders and decoders; b) the cost of gathering sufficient data to train neural network models across various contexts; and c) the need to protect sensitive data regarding competing providers’ coverages. To address the first challenge, we propose a novel system using context-dependent decoders and a universal encoder. We limit the number of decoders by clustering similar contexts and allowing those within a cluster to share the same decoder. To address the second and third challenges, we introduce a clustered federated learning-based approach that jointly clusters contexts and learns the desired encoder and context cluster-dependent decoders, leveraging distributed data. The clustering is performed efficiently based on the similarity of time-averaged gradients across contexts. To evaluate our approach, a new dataset reflecting the heterogeneous nature of the wireless systems was curated and made publicly available. Extensive experimental results demonstrate that our proposed CSI compression framework is highly effective and able to efficiently determine a correct context clustering and associated encoder and decoders.
{"title":"Clustered Federated Learning to Support Context-Dependent CSI Decoding","authors":"Heasung Kim;Hyeji Kim;Gustavo De Veciana","doi":"10.1109/TMLCN.2025.3638983","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3638983","url":null,"abstract":"Neural network-based encoders and decoders have demonstrated significant performance gains over traditional methods for Channel State Information (CSI) feedback in MIMO communications. However, key challenges in deploying these models in real-world scenarios remain underexplored, including: a) the need to efficiently accommodate diverse channel conditions across varying contexts, e.g., environments, and whether to use multiple encoders and decoders; b) the cost of gathering sufficient data to train neural network models across various contexts; and c) the need to protect sensitive data regarding competing providers’ coverages. To address the first challenge, we propose a novel system using context-dependent decoders and a universal encoder. We limit the number of decoders by clustering similar contexts and allowing those within a cluster to share the same decoder. To address the second and third challenges, we introduce a clustered federated learning-based approach that jointly clusters contexts and learns the desired encoder and context cluster-dependent decoders, leveraging distributed data. The clustering is performed efficiently based on the similarity of time-averaged gradients across contexts. To evaluate our approach, a new dataset reflecting the heterogeneous nature of the wireless systems was curated and made publicly available. Extensive experimental results demonstrate that our proposed CSI compression framework is highly effective and able to efficiently determine a correct context clustering and associated encoder and decoders.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"211-227"},"PeriodicalIF":0.0,"publicationDate":"2025-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11271400","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-27 | DOI: 10.1109/TMLCN.2025.3637784
Achintha Wijesinghe;Songyang Zhang;Zhi Ding
Recent advances in generative artificial intelligence (AI) have led to rising interest in federated learning (FL) based on generative adversarial network (GAN) models. GAN-based FL shows promise in many communication and network applications, such as edge computing and the Internet of Things. In the context of FL, GANs can capture the underlying client data structure and regenerate samples resembling the original data distribution without compromising data privacy. Although most existing GAN-based FL works focus on training a global model, in some scenarios personalized FL (PFL) can be more desirable for handling client data heterogeneity in terms of distinct data distributions, feature spaces, and labels. To cope with client heterogeneity in GAN-based FL, we propose a novel GAN sharing and aggregation strategy for PFL that can efficiently characterize client heterogeneity in different settings. More specifically, our proposed PFL-GAN first learns the similarities among clients before implementing a weighted collaborative data aggregation. Our empirical results through rigorous experimentation on several well-known datasets demonstrate the effectiveness of PFL-GAN.
{"title":"PFL-GAN: Client Heterogeneity Meets Generative Models in Personalized Federated Learning","authors":"Achintha Wijesinghe;Songyang Zhang;Zhi Ding","doi":"10.1109/TMLCN.2025.3637784","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3637784","url":null,"abstract":"Recent advances in generative artificial intelligence (AI) have led to rising interest in federated learning (FL) based on generative adversarial network (GAN) models. GAN-based FL shows promises in many communication and network applications, such as edge computing and the Internet of Things. In the context of FL, GANs can capture the underlying client data structure, and regenerate samples resembling the original data distribution without compromising data privacy. Although most existing GAN-based FL works focus on training a global model, some scenarios exist where personalized FL (PFL) can be more desirable when incorporating client data heterogeneity in terms of distinct data distributions, feature spaces, and labels. To cope with client heterogeneity in GAN-based FL, we propose a novel GAN sharing and aggregation strategy for PFL that can efficiently characterize client heterogeneity in different settings. More specifically, our proposed PFL-GAN first learns the similarities among clients before implementing a weighted collaborative data aggregation. Our empirical results through rigorous experimentation on several well-known datasets demonstrate the effectiveness of PFL-GAN.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"33-44"},"PeriodicalIF":0.0,"publicationDate":"2025-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11270937","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-25 | DOI: 10.1109/TMLCN.2025.3637083
Moqbel Hamood;Abdullatif Albaseer;Hassan El-Sallabi;Mohamed Abdallah;Ala Al-Fuqaha;Bechir Hamdaoui
Deploying transformer models in Personalized Federated Learning (PFL) at the wireless edge faces critical challenges, including high communication overhead, latency, and energy consumption. Existing compression methods, such as pruning and sparsification, typically degrade performance due to the sensitivity of self-attention layers (SALs) to parameter reduction. Also, standard federated averaging (FedAvg) often diminishes personalization by blending crucial client-specific parameters. To overcome these issues, we propose PFL-TPP (Personalized Federated Learning with Transformer Pruning and Personalization). This dual-strategy framework effectively reduces computational and communication burdens while maintaining high model accuracy and personalization. Our approach employs dynamic, learnable threshold pruning on feed-forward layers (FFLs) to eliminate redundant computations. For SALs, we introduce a novel server-side hypernetwork that generates personalized attention parameters from client-specific embeddings, significantly cutting communication overhead without sacrificing personalization. Extensive experiments demonstrate that PFL-TPP achieves up to 82.73% energy savings, 86% reduction in training time, and improved model accuracy compared to standard baselines. These results demonstrate the effectiveness of our proposed approach in enabling scalable, communication-efficient deployment of transformers in real-world PFL scenarios.
"Personalized Federated Learning With Adaptive Transformer Pruning and Hypernetwork-Driven Personalization in Wireless Networks," IEEE Transactions on Machine Learning in Communications and Networking, vol. 4, pp. 1–16.