Pub Date : 2026-01-01 DOI: 10.1109/TMLCN.2025.3650440
Hieu Le;Oguz Bedir;Mostafa Ibrahim;Jian Tao;Sabit Ekin
This paper presents a multi-agent reinforcement learning (MARL) approach for controlling adjustable metallic reflector arrays to enhance wireless signal reception in non-line-of-sight (NLOS) scenarios. Unlike conventional reconfigurable intelligent surfaces (RIS) that require complex channel estimation, our system employs a centralized training with decentralized execution (CTDE) paradigm where individual agents corresponding to reflector segments autonomously optimize reflector element orientation in three-dimensional space using spatial intelligence based on user location information. Through extensive ray-tracing simulations with dynamic user mobility, the proposed multi-agent beam-focusing framework demonstrates substantial performance improvements over single-agent reinforcement learning baselines, while maintaining rapid adaptation to user movement within one simulation step. Comprehensive evaluation across varying user densities and reflector configurations validates system scalability and robustness. The results demonstrate the potential of learning-based approaches for adaptive wireless propagation control.
{"title":"Signal Whisperers: Enhancing Wireless Reception Using DRL-Guided Reflector Arrays","authors":"Hieu Le;Oguz Bedir;Mostafa Ibrahim;Jian Tao;Sabit Ekin","doi":"10.1109/TMLCN.2025.3650440","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3650440","url":null,"abstract":"This paper presents a multi-agent reinforcement learning (MARL) approach for controlling adjustable metallic reflector arrays to enhance wireless signal reception in non-line-of-sight (NLOS) scenarios. Unlike conventional reconfigurable intelligent surfaces (RIS) that require complex channel estimation, our system employs a centralized training with decentralized execution (CTDE) paradigm where individual agents corresponding to reflector segments autonomously optimize reflector element orientation in three-dimensional space using spatial intelligence based on user location information. Through extensive ray-tracing simulations with dynamic user mobility, the proposed multi-agent beam-focusing framework demonstrates substantial performance improvements over single-agent reinforcement learning baselines, while maintaining rapid adaptation to user movement within one simulation step. Comprehensive evaluation across varying user densities and reflector configurations validates system scalability and robustness. The results demonstrate the potential of learning-based approaches for adaptive wireless propagation control.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"265-281"},"PeriodicalIF":0.0,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11322690","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145929653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-23 DOI: 10.1109/TMLCN.2025.3647806
Shujun Zhao;Simeng Feng;Chao Dong;Xiaojun Zhu;Qihui Wu
Considering their high mobility and relatively low cost, uncrewed aerial vehicles (UAVs) equipped with mobile base stations are regarded as a promising way to provide flexible communication coverage. However, the dual pressures of limited onboard UAV resources and the demand for high-quality services in dynamic low-altitude applications jointly create a bottleneck for system performance. Although multi-UAV communication networks can deliver higher system performance through coordinated deployment, cooperation and competition among UAVs, together with the resulting more complex optimization problems, significantly increase costs and are difficult to handle. To overcome low coordination efficiency and intense resource competition among multiple UAVs, and to satisfy ground users' (GUs) communication service demands in a timely and efficient manner, this paper conceives a centrally controlled, two-tier-cooperated UAV communication network. The network comprises a central UAV (C-UAV) tier acting as the control center and a marginal UAV (M-UAV) tier serving the GUs. In response to increasingly dynamic and complex scenarios, and to the limited generalization ability of deep reinforcement learning (DRL) algorithms, we propose a clustering-assisted dual-agent soft actor-critic (CDA-SAC) algorithm for trajectory design and resource allocation, aiming to maximize the fair energy efficiency of the system. Specifically, by integrating a clustering-matching method with a dual-agent strategy, the proposed CDA-SAC algorithm achieves significant improvements in generalization and exploration capability. Simulation results demonstrate that the proposed CDA-SAC algorithm can be deployed without retraining in scenarios with different numbers of GUs. Furthermore, the CDA-SAC algorithm outperforms both MADDPG-based multi-UAV baselines and an FDMA scheme in terms of fairness and total energy efficiency.
"Clustering-Assisted Deep Reinforcement Learning for Joint Trajectory Design and Resource Allocation in Two-Tier-Cooperated UAVs Communications," IEEE Transactions on Machine Learning in Communications and Networking, vol. 4, pp. 178-197. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11313631
Pub Date : 2025-12-23 DOI: 10.1109/TMLCN.2025.3647807
Mohamed Benzaghta;Sahar Ammar;David López-Pérez;Basem Shihada;Giovanni Geraci
Mobility management in cellular networks faces increasing complexity due to network densification and heterogeneous user mobility characteristics. Traditional handover (HO) mechanisms, which rely on predefined parameters such as A3-offset and time-to-trigger (TTT), often fail to optimize mobility performance across varying speeds and deployment conditions. Fixed A3-offset and TTT configurations either delay HOs, increasing radio link failures (RLFs), or accelerate them, leading to excessive ping-pong effects. To address these challenges, we propose two distinct data-driven mobility management approaches leveraging high-dimensional Bayesian optimization (HD-BO) and deep reinforcement learning (DRL). While HD-BO optimizes predefined HO parameters such as A3-offset and TTT, DRL provides a parameter-free alternative by allowing an agent to select serving cells based on real-time network conditions. We systematically compare these two approaches in real-world site-specific deployment scenarios (employing Sionna ray tracing for site-specific channel propagation modeling), highlighting their complementary strengths. Results show that both HD-BO and DRL outperform 3GPP set-1 (TTT of 480 ms and A3-offset of 3 dB) and set-5 (TTT of 40 ms and A3-offset of −1 dB) benchmarks. We augment HD-BO with transfer learning so it can generalize across a range of user speeds. Applying the same transfer-learning strategy to the DRL method reduces its training time by a factor of 2.5 while preserving optimal HO performance, showing that it adapts efficiently to the mobility of aerial users such as UAVs. Simulations further reveal that HD-BO remains more sample-efficient than DRL, making it more suitable for scenarios with limited training data.
"Data-Driven Cellular Mobility Management Via Bayesian Optimization and Reinforcement Learning," IEEE Transactions on Machine Learning in Communications and Networking, vol. 4, pp. 228-244. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11313634
Pub Date : 2025-12-23 DOI: 10.1109/TMLCN.2025.3647376
Saad Masrur;Jung-Fu Cheng;Atieh R. Khamesi;İsmail Güvenç
Traditional approaches to indoor localization often achieve poor accuracy in challenging non-line-of-sight (NLOS) environments. Deep learning (DL) has been applied to tackle these challenges; however, many DL approaches overlook computational complexity, especially the number of floating-point operations (FLOPs), making them unsuitable for resource-limited devices. Transformer-based models have achieved remarkable success in natural language processing (NLP) and computer vision (CV) tasks, motivating their use in wireless applications. However, their use in indoor localization remains nascent, and directly applying Transformers to indoor localization can be computationally intensive and limited in accuracy. To address these challenges, in this work we introduce a novel tokenization approach, referred to as Sensor Snapshot Tokenization (SST), which preserves variable-specific representations of the power delay profile (PDP) and enhances attention mechanisms by effectively capturing multivariate correlations. Complementing this, we propose a lightweight Swish-Gated Linear Unit-based Transformer (L-SwiGLU-T) model, designed to reduce computational complexity without compromising localization accuracy. Together, these contributions mitigate the computational burden and dependency on large datasets, making Transformer models more efficient and suitable for resource-constrained scenarios. Experimental results on simulated and real-world datasets demonstrate that SST and L-SwiGLU-T achieve substantial accuracy and efficiency gains, outperforming larger Transformer and CNN baselines by over 40% while using significantly fewer FLOPs and training samples.
{"title":"Transforming Indoor Localization: Advanced Transformer Architecture for NLOS Dominated Wireless Environments With Distributed Sensors","authors":"Saad Masrur;Jung-Fu Cheng;Atieh R. Khamesi;İsmail Güvenç","doi":"10.1109/TMLCN.2025.3647376","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3647376","url":null,"abstract":"Indoor localization in challenging non-line-of-sight (NLOS) environments often leads to poor accuracy with traditional approaches. Deep learning (DL) has been applied to tackle these challenges; however, many DL approaches overlook computational complexity, especially for floating-point operations (FLOPs), making them unsuitable for resource-limited devices. Transformer-based models have achieved remarkable success in natural language processing (NLP) and computer vision (CV) tasks, motivating their use in wireless applications. However, their use in indoor localization remains nascent, and directly applying Transformers for indoor localization can be both computationally intensive and exhibit limitations in accuracy. To address these challenges, in this work, we introduce a novel tokenization approach, referred to as Sensor Snapshot Tokenization (SST), which preserves variable-specific representations of power delay profile (PDP) and enhances attention mechanisms by effectively capturing multi-variate correlation. Complementing this, we propose a lightweight Swish-Gated Linear Unit-based Transformer (L-SwiGLU-T) model, designed to reduce computational complexity without compromising localization accuracy. Together, these contributions mitigate the computational burden and dependency on large datasets, making Transformer models more efficient and suitable for resource-constrained scenarios. Experimental results on simulated and real-world datasets demonstrate that SST and L-SwiGLU-T achieve substantial accuracy and efficiency gains, outperforming larger Transformer and CNN baselines by over 40% while using significantly fewer FLOPs and training samples.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"161-177"},"PeriodicalIF":0.0,"publicationDate":"2025-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11313538","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-18 DOI: 10.1109/TMLCN.2025.3646125
Pedro Enrique Iturria-Rivera;Raimundas Gaigalas;Medhat Elsayed;Majid Bavand;Yigit Ozcan;Melike Erol-Kantarci
Extended Reality (XR) services are set to transform applications over 5th- and 6th-generation wireless networks, delivering immersive experiences. Concurrently, Artificial Intelligence (AI) advancements have expanded AI's role in wireless networks; however, trust and transparency in AI remain to be strengthened. Thus, providing explanations for AI-enabled systems can enhance trust. We introduce Value Function Factorization (VFF)-based Explainable (X) Multi-Agent Reinforcement Learning (MARL) algorithms that explain reward design in XR codec adaptation through reward decomposition. We contribute four enhancements to XMARL algorithms. Firstly, we detail architectural modifications that enable reward decomposition in VFF-based MARL algorithms: Value Decomposition Networks (VDN), Mixture of Q-Values (QMIX), and Q-Transformation (Q-TRAN). Secondly, inspired by multi-task learning, we reduce the overhead of vanilla XMARL algorithms. Thirdly, we propose a new explainability metric, Reward Difference Fluctuation Explanation (RDFX), suitable for problems with adjustable parameters. Lastly, we propose adaptive XMARL, which leverages network gradients and reward decomposition for improved action selection. Simulation results indicate that, in XR codec adaptation, the Packet Delivery Ratio reward is the primary contributor to optimal performance compared to the initial composite reward, which included delay and Data Rate Ratio components. Modifications to VFF-based XMARL algorithms, incorporating multi-headed structures and adaptive loss functions, enable the best-performing algorithm, Multi-Headed Adaptive (MHA)-QMIX, to achieve significant average gains over the Adjust Packet Size baseline of up to 10.7%, 41.4%, 33.3%, and 67.9% in XR index, jitter, delay, and Packet Loss Ratio (PLR), respectively.
{"title":"Explainable Multi-Agent Reinforcement Learning for Extended Reality Codec Adaptation","authors":"Pedro Enrique Iturria-Rivera;Raimundas Gaigalas;Medhat Elsayed;Majid Bavand;Yigit Ozcan;Melike Erol-Kantarci","doi":"10.1109/TMLCN.2025.3646125","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3646125","url":null,"abstract":"Extended Reality (XR) services are set to transform applications over <inline-formula> <tex-math>${mathbf {5}}^{th}$ </tex-math></inline-formula> and <inline-formula> <tex-math>${mathbf {6}}^{th}$ </tex-math></inline-formula> generation wireless networks, delivering immersive experiences. Concurrently, Artificial Intelligence (AI) advancements have expanded their role in wireless networks, however, trust and transparency in AI remain to be strengthened. Thus, providing explanations for AI-enabled systems can enhance trust. We introduce Value Function Factorization (VFF)-based Explainable (X) Multi-Agent Reinforcement Learning (MARL) algorithms, explaining reward design in XR codec adaptation through reward decomposition. We contribute four enhancements to XMARL algorithms. Firstly, we detail architectural modifications to enable reward decomposition in VFF-based MARL algorithms: Value Decomposition Networks (VDN), Mixture of Q-Values (QMIX), and Q-Transformation (Q-TRAN). Secondly, inspired by multi-task learning, we reduce the overhead of vanilla XMARL algorithms. Thirdly, we propose a new explainability metric, Reward Difference Fluctuation Explanation (RDFX), suitable for problems with adjustable parameters. Lastly, we propose adaptive XMARL, leveraging network gradients and reward decomposition for improved action selection. Simulation results indicate that, in XR codec adaptation, the Packet Delivery Ratio reward is the primary contributor to optimal performance compared to the initial composite reward, which included delay and Data Rate Ratio components. Modifications to VFF-based XMARL algorithms, incorporating multi-headed structures and adaptive loss functions, enable the best-performing algorithm, Multi-Headed Adaptive (MHA)-QMIX, to achieve significant average gains over the Adjust Packet Size baseline up to 10.7%, 41.4%, 33.3%, and 67.9% in XR index, jitter, delay, and Packet Loss Ratio (PLR), respectively.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"245-264"},"PeriodicalIF":0.0,"publicationDate":"2025-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11303975","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145929688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-16 DOI: 10.1109/TMLCN.2025.3644333
Ons Aouedi;Flor Ortiz;Thang X. Vu;Alexandre Lefourn;Felix Giese;Guillermo Gutierrez;Symeon Chatzinotas
The growing integration of non-terrestrial networks (NTNs), particularly low Earth orbit (LEO) satellite constellations, has significantly extended the reach of maritime connectivity, supporting critical applications such as vessel monitoring, navigation safety, and maritime surveillance in remote and oceanic regions. Automatic Identification System (AIS) data, increasingly collected through a combination of satellite and terrestrial infrastructures, provide a rich source of spatiotemporal vessel information. However, accurate trajectory prediction in maritime domains remains challenging due to irregular sampling rates, dynamic environmental conditions, and heterogeneous vessel behaviors. This study proposes a velocity-based trajectory prediction framework that leverages AIS data collected from integrated satellite–terrestrial networks. Rather than directly predicting absolute positions (latitude and longitude), our model predicts vessel motion in the form of latitude and longitude velocities. This formulation simplifies the learning task, enhances temporal continuity, and improves scalability, making it well-suited for resource-constrained NTN environments. The predictive architecture is built upon a Long Short-Term Memory network enhanced with attention mechanisms and residual connections (LSTM-RA), enabling it to capture complex temporal dependencies and adapt to noise in real-world AIS data. Extensive experiments on two maritime datasets validate the robustness and accuracy of our framework, demonstrating clear improvements over state-of-the-art baselines.
{"title":"AIS-Based Hybrid Vessel Trajectory Prediction for Enhanced Maritime Navigation","authors":"Ons Aouedi;Flor Ortiz;Thang X. Vu;Alexandre Lefourn;Felix Giese;Guillermo Gutierrez;Symeon Chatzinotas","doi":"10.1109/TMLCN.2025.3644333","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3644333","url":null,"abstract":"The growing integration of non-terrestrial networks (NTNs), particularly low Earth orbit (LEO) satellite constellations, has significantly extended the reach of maritime connectivity, supporting critical applications such as vessel monitoring, navigation safety, and maritime surveillance in remote and oceanic regions. Automatic Identification System (AIS) data, increasingly collected through a combination of satellite and terrestrial infrastructures, provide a rich source of spatiotemporal vessel information. However, accurate trajectory prediction in maritime domains remains challenging due to irregular sampling rates, dynamic environmental conditions, and heterogeneous vessel behaviors. This study proposes a velocity-based trajectory prediction framework that leverages AIS data collected from integrated satellite–terrestrial networks. Rather than directly predicting absolute positions (latitude and longitude), our model predicts vessel motion in the form of latitude and longitude velocities. This formulation simplifies the learning task, enhances temporal continuity, and improves scalability, making it well-suited for resource-constrained NTN environments. The predictive architecture is built upon a Long Short-Term Memory network enhanced with attention mechanisms and residual connections (<monospace>LSTM-RA</monospace>), enabling it to capture complex temporal dependencies and adapt to noise in real-world AIS data. Extensive experiments on two maritime datasets validate the robustness and accuracy of our framework, demonstrating clear improvements over state-of-the-art baselines.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"198-210"},"PeriodicalIF":0.0,"publicationDate":"2025-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11301841","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-12 DOI: 10.1109/TMLCN.2025.3643409
George P. Kontoudis;Daniel J. Stilwell
In this paper, we propose scalable methods for Gaussian process (GP) prediction in decentralized multi-agent systems. Multiple aggregation techniques for GP prediction are decentralized with the use of iterative and consensus methods. Moreover, we introduce a covariance-based nearest neighbor selection strategy that leverages cross-covariance similarity, enabling subsets of agents to make accurate predictions. The proposed decentralized schemes preserve the consistency properties of their centralized counterparts, while adhering to federated learning principles by restricting raw data exchange between agents. We validate the efficacy of the proposed decentralized algorithms with numerical experiments on real-world sea surface temperature and ground elevation map datasets across multiple fleet sizes.
{"title":"Multi-Agent Federated Learning Using Covariance-Based Nearest Neighbor Gaussian Processes","authors":"George P. Kontoudis;Daniel J. Stilwell","doi":"10.1109/TMLCN.2025.3643409","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3643409","url":null,"abstract":"In this paper, we propose scalable methods for Gaussian process (GP) prediction in decentralized multi-agent systems. Multiple aggregation techniques for GP prediction are decentralized with the use of iterative and consensus methods. Moreover, we introduce a covariance-based nearest neighbor selection strategy that leverages cross-covariance similarity, enabling subsets of agents to make accurate predictions. The proposed decentralized schemes preserve the consistency properties of their centralized counterparts, while adhering to federated learning principles by restricting raw data exchange between agents. We validate the efficacy of the proposed decentralized algorithms with numerical experiments on real-world sea surface temperature and ground elevation map datasets across multiple fleet sizes.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"115-138"},"PeriodicalIF":0.0,"publicationDate":"2025-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11299094","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-09 DOI: 10.1109/TMLCN.2025.3642211
Basil AsSadhan;Abdulmuneem Bashaiwth;Hamad Binsalleeh
With the increasing prevalence of DDoS attacks, various machine learning-based detection models have been employed to mitigate these malicious behaviors. Understanding how machine learning models function can be quite complex, especially for intricate and nonlinear models such as deep learning architectures. Recently, various techniques have been advanced to interpret deep learning models and address issues of ambiguity. In this paper, we present a comprehensive analysis of several explanation methods applied to a Long Short-Term Memory (LSTM) model for detecting Distributed Denial of Service (DDoS) attacks on raw traffic data. While previous studies have focused primarily on improving detection accuracy on feature-based datasets, this paper emphasizes the importance of interpretability for deep learning models trained on raw-based traffic datasets. By employing explanation techniques such as LIME, SHAP, Anchor, and LORE, we provide insights into the decision-making processes of LSTM models, thereby enhancing trust and understanding in classifying DDoS attacks. The analysis of raw network traffic reveals the packet fields that drive the LSTM model's true and false positive predictions, and identifies network fields common to several DDoS attack types that explain misclassifications between similar attacks.
"A Deeper Look on Explanation Methods for Deep Learning Models on Raw-Based Traffic of DDoS Attacks," IEEE Transactions on Machine Learning in Communications and Networking, vol. 4, pp. 139-160. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11289572
Pub Date : 2025-12-08 DOI: 10.1109/TMLCN.2025.3638067
{"title":"IEEE Communications Society Board of Governors","authors":"","doi":"10.1109/TMLCN.2025.3638067","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3638067","url":null,"abstract":"","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"C3-C3"},"PeriodicalIF":0.0,"publicationDate":"2025-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11283087","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145698226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-02 DOI: 10.1109/TMLCN.2025.3639365
Sheikh Islam;Xin Ma;Chunxiao Chigan
Achieving effective self-interference cancellation (SIC) in full-duplex (FD) wireless communication systems under time-varying channel conditions remains a significant challenge. To address this challenge, we propose a novel adaptive SIC solution that leverages Hyper Neural Networks (HyperNet) and incremental learning (IL). Unlike existing methods, which rely on offline training or lack real-time adaptability, our approach enables autonomous learning and fast adaptation to the complex, nonlinear, and time-varying nature of self-interference (SI) channels. It effectively addresses dynamic adaptation challenges, such as catastrophic forgetting, through the use of experience replay (ER). Our experimental results show that traditional model-based methods exhibit limited adaptability under dynamic channel conditions, while conventional data-driven models fail to maintain consistent performance without the adaptive capabilities provided by IL. In contrast, the proposed HyperNet-based IL model reduces training time by 33% and achieves three times faster convergence compared to a standalone HyperNet trained separately for each static condition. Extensive evaluations using simulated datasets that emulate real-world scenarios demonstrate that our approach consistently achieves SI suppression down to the noise floor, while delivering significantly lower computational complexity and training time. These improvements collectively enhance the efficiency and reliability of FD communication systems operating in dynamic wireless environments.
{"title":"Adaptive Nonlinear Digital Self-Interference Cancellation for Full-Duplex Wireless Systems Using Hypernetwork-Based Incremental Learning","authors":"Sheikh Islam;Xin Ma;Chunxiao Chigan","doi":"10.1109/TMLCN.2025.3639365","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3639365","url":null,"abstract":"Achieving effective self-interference cancellation (SIC) in full-duplex (FD) wireless communication systems under time-varying channel conditions remains a significant challenge. To address this challenge, we propose a novel adaptive SIC solution through leveraging Hyper Neural Networks (HyperNet) and incremental learning (IL). Unlike the existing methods that rely on offline training or lack real-time adaptability, our approach enables autonomous learning and fast adaptation to the complex, nonlinear, and time-varying nature of self-interference (SI) channels. It effectively addresses dynamic adaptation challenges, such as catastrophic forgetting, through the use of experience replay (ER). Our experimental results show that traditional model-based methods exhibit limited adaptability under dynamic channel conditions, while conventional data-driven models fail to maintain consistent performance without the adaptive capabilities provided by IL. In contrast, the proposed HyperNet-based IL model reduces training time by 33% and achieves three times faster convergence compared to a standalone HyperNet trained separately for each static condition. Extensive evaluations using simulated datasets that emulate real-world scenarios demonstrate that our approach consistently achieves SI suppression down to the noise floor. It also delivers significantly lower computational complexity and training time. These improvements collectively enhance the efficiency and reliability of FD communication systems operating in dynamic wireless environments.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"60-75"},"PeriodicalIF":0.0,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11272907","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}