Pub Date: 2026-01-13 | DOI: 10.1109/TMLCN.2026.3653719
Hind Mukhtar;Raymond Schaub;Melike Erol-Kantarci
Satellite-based communication systems are crucial for providing high-speed data services in aviation, particularly for business aviation operations that demand global connectivity. These systems face challenges from numerous interdependent factors, such as satellite handovers, congestion, flight maneuvers, and seasonal variations, making accurate Quality of Service (QoS) prediction complex. Currently, there is no established methodology for predicting QoS in avionic communication systems. This paper addresses this gap by proposing machine learning-based approaches for pre-flight QoS prediction. Specifically, we leverage transformer models to predict QoS along a given flight path using real-world data. The model takes as input a variety of positional and network-related features, such as aircraft location, satellite information, historical QoS, and handover probabilities, and outputs a predicted performance score for each position along the flight. Our proposed encoder-decoder transformer model achieved an overall prediction accuracy of 65% and an RMSE of 1.91, a significant improvement over traditional baseline methods. While these metrics are notable, the model's key contribution is a substantial improvement in prediction accuracy for underrepresented classes, which were a major limitation of prior approaches. Additionally, the model significantly reduces inference time, producing predictions in 40 seconds compared to 6,353 seconds for a traditional KNN model. This approach allows for proactive decision-making, enabling flight crews to select optimal flight paths before departure and improving overall operational efficiency in business aviation.
{"title":"QoS Prediction for Satellite-Based Avionic Communication Using Transformers","authors":"Hind Mukhtar;Raymond Schaub;Melike Erol-Kantarci","doi":"10.1109/TMLCN.2026.3653719","DOIUrl":"https://doi.org/10.1109/TMLCN.2026.3653719","url":null,"abstract":"Satellite-based communication systems are crucial for providing high-speed data services in aviation, particularly for business aviation operations that demand global connectivity. These systems face challenges from numerous interdependent factors, such as satellite handovers, congestion, flight maneuvers, and seasonal variations, making accurate Quality of Service (QoS) prediction complex. Currently, there is no established methodology for predicting QoS in avionic communication systems. This paper addresses this gap by proposing machine learning-based approaches for pre-flight QoS prediction. Specifically, we leverage transformer models to predict QoS along a given flight path using real-world data. The model takes as input a variety of positional and network-related features, such as aircraft location, satellite information, historical QoS, and handover probabilities, and outputs a predicted performance score for each position along the flight. This approach allows for proactive decision-making, enabling flight crews to select the most optimal flight paths before departure, improving overall operational efficiency in business aviation. Our proposed encoder-decoder transformer model achieved an overall prediction accuracy of 65% and an RMSE of 1.91, representing a significant improvement over traditional baseline methods. While these metrics are notable, our model’s key contribution is a substantial improvement in prediction accuracy for underrepresented classes, which were a major limitation of prior approaches. Additionally, the model significantly reduces inference time, achieving predictions in 40 seconds compared to 6,353 seconds for a traditional KNN model. This approach allows for proactive decision-making, enabling flight crews to select optimal flight paths before departure, improving overall operational efficiency in business aviation.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"300-317"},"PeriodicalIF":0.0,"publicationDate":"2026-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11348973","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-12 | DOI: 10.1109/TMLCN.2026.3653010
Benjamin W. Domae;Danijela Cabric
Large antenna arrays are critical for reliability and high data rates in wireless networks at millimeter-wave and sub-terahertz bands. While traditional initial beam alignment methods for analog phased arrays scale alignment overhead linearly with the array size, compressive sensing (CS) and machine learning (ML) algorithms can scale it logarithmically. CS and ML methods typically utilize pseudo-random or heuristic beam designs as compressive codebooks. However, these codebooks may not be optimal for scenarios with uncertain array impairments or multipath, particularly when measurements are phase-less or power-based. In this work, we propose a novel dictionary learning method to design codebooks for phase-less beam alignment given multipath and unknown impairment statistics. The codebook learning algorithm uses alternating optimization with block coordinate descent to update the codebooks and Monte Carlo trials over multipath and impairments to incorporate a priori knowledge of the hardware and environment. Additionally, we discuss engineering considerations for the codebook design algorithm, including a comparison of three proposed loss functions and three proposed beam alignment algorithms used for codebook learning. As one of the three beam alignment methods, we propose transfer learning for ML-based beam alignment to reduce the training time of both the ML model and codebook learning. We demonstrate that codebook learning and our ML-based beam alignment algorithms can significantly reduce beam alignment overhead in terms of the number of measurements required.
{"title":"Dictionary Learning for Phase-Less Beam Alignment Codebook Design in Multipath Channels","authors":"Benjamin W. Domae;Danijela Cabric","doi":"10.1109/TMLCN.2026.3653010","DOIUrl":"https://doi.org/10.1109/TMLCN.2026.3653010","url":null,"abstract":"Large antenna arrays are critical for reliability and high data rates in wireless networks at millimeter-wave and sub-terahertz bands. While traditional methods for initial beam alignment for analog phased arrays scale beam alignment overhead linearly with the array size, compressive sensing (CS) and machine learning (ML) algorithms can scale logarithmically. CS and ML methods typically utilize pseudo-random or heuristic beam designs as compressive codebooks. However, these codebooks may not be optimal for scenarios with uncertain array impairments or multipath, particularly when measurements are phase-less or power-based. In this work, we propose a novel dictionary learning method to design codebooks for phase-less beam alignment given multipath and unknown impairment statistics. This codebook learning algorithm uses an alternating optimization with block coordinate descent to update the codebooks and Monte Carlo trials over multipath and impairments to incorporate a-priori knowledge of the hardware and environment. Additionally, we discuss engineering considerations for the codebook design algorithm, including a comparison of three proposed loss functions and three proposed beam alignment algorithms used for codebook learning. As one of the three beam alignment methods, we propose transfer learning for ML-based beam alignment to reduce the training time of both the ML model and codebook learning. We demonstrate that codebook learning and our ML-based beam alignment algorithms can significantly reduce the beam alignment overhead in terms of number of measurements required.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"318-336"},"PeriodicalIF":0.0,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11346817","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-05 | DOI: 10.1109/TMLCN.2025.3650730
Jeeva Keshav Sattianarayanin;Anil Kumar Yerrapragada;Radha Krishna Ganti
Accurate decoding of Uplink Control Information (UCI) on the Physical Uplink Control Channel (PUCCH) is essential for enabling 5G wireless links. This paper explores an AI/ML-based receiver design for PUCCH Format 0. Format 0 signaling encodes the UCI content within the phase of a known base waveform and supports multiplexing of up to 12 users within the same time-frequency resources. The proposed neural network classifier, which we term UCINet0, is capable of predicting when no user is transmitting on the PUCCH, as well as decoding the UCI content for any number of multiplexed users (up to 12). Test results with simulated, hardware-captured (lab), and field datasets show that the UCINet0 model outperforms conventional correlation-based decoders across all Signal-to-Noise Ratio (SNR) ranges and multiple fading scenarios.
{"title":"UCINet0: A Machine Learning-Based Receiver for 5G NR PUCCH Format 0","authors":"Jeeva Keshav Sattianarayanin;Anil Kumar Yerrapragada;Radha Krishna Ganti","doi":"10.1109/TMLCN.2025.3650730","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3650730","url":null,"abstract":"Accurate decoding of Uplink Control Information (UCI) on the Physical Uplink Control Channel (PUCCH) is essential for enabling 5G wireless links. This paper explores an AI/ML-based receiver design for PUCCH Format 0. Format 0 signaling encodes the UCI content within the phase of a known base waveform and even supports multiplexing of up to 12 users within the same time-frequency resources. The proposed neural network classifier, which we term UCINet0, is capable of predicting when no user is transmitting on the PUCCH, as well as decoding the UCI content for any number of multiplexed users (up to 12). The test results with simulated, hardware-captured (lab) and field datasets show that the UCINet0 model outperforms conventional correlation-based decoders across all Signal-to-Noise Ratio (SNR) ranges and multiple fading scenarios.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"282-299"},"PeriodicalIF":0.0,"publicationDate":"2026-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11328864","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-01 | DOI: 10.1109/TMLCN.2025.3650440
Hieu Le;Oguz Bedir;Mostafa Ibrahim;Jian Tao;Sabit Ekin
This paper presents a multi-agent reinforcement learning (MARL) approach for controlling adjustable metallic reflector arrays to enhance wireless signal reception in non-line-of-sight (NLOS) scenarios. Unlike conventional reconfigurable intelligent surfaces (RIS) that require complex channel estimation, our system employs a centralized training with decentralized execution (CTDE) paradigm where individual agents corresponding to reflector segments autonomously optimize reflector element orientation in three-dimensional space using spatial intelligence based on user location information. Through extensive ray-tracing simulations with dynamic user mobility, the proposed multi-agent beam-focusing framework demonstrates substantial performance improvements over single-agent reinforcement learning baselines, while maintaining rapid adaptation to user movement within one simulation step. Comprehensive evaluation across varying user densities and reflector configurations validates system scalability and robustness. The results demonstrate the potential of learning-based approaches for adaptive wireless propagation control.
{"title":"Signal Whisperers: Enhancing Wireless Reception Using DRL-Guided Reflector Arrays","authors":"Hieu Le;Oguz Bedir;Mostafa Ibrahim;Jian Tao;Sabit Ekin","doi":"10.1109/TMLCN.2025.3650440","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3650440","url":null,"abstract":"This paper presents a multi-agent reinforcement learning (MARL) approach for controlling adjustable metallic reflector arrays to enhance wireless signal reception in non-line-of-sight (NLOS) scenarios. Unlike conventional reconfigurable intelligent surfaces (RIS) that require complex channel estimation, our system employs a centralized training with decentralized execution (CTDE) paradigm where individual agents corresponding to reflector segments autonomously optimize reflector element orientation in three-dimensional space using spatial intelligence based on user location information. Through extensive ray-tracing simulations with dynamic user mobility, the proposed multi-agent beam-focusing framework demonstrates substantial performance improvements over single-agent reinforcement learning baselines, while maintaining rapid adaptation to user movement within one simulation step. Comprehensive evaluation across varying user densities and reflector configurations validates system scalability and robustness. The results demonstrate the potential of learning-based approaches for adaptive wireless propagation control.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"265-281"},"PeriodicalIF":0.0,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11322690","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145929653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-23 | DOI: 10.1109/TMLCN.2025.3647806
Shujun Zhao;Simeng Feng;Chao Dong;Xiaojun Zhu;Qihui Wu
Owing to their high mobility and relatively low cost, uncrewed aerial vehicles (UAVs) equipped with mobile base stations are regarded as a promising means of providing communication services in dynamic low-altitude applications. However, the dual pressures of limited onboard UAV resources and the demand for high-quality services jointly bottleneck system performance. Although multi-UAV communication networks can deliver higher system performance through coordinated deployment, cooperation and competition among UAVs, together with the resulting more complex optimization problems, significantly increase costs and pose formidable challenges. To overcome low coordination efficiency and intense resource competition among multiple UAVs, and to ensure that ground users' (GUs) communication service demands are met in a timely and efficient manner, this paper conceives a centrally controlled, two-tier-cooperated UAV communication network. The network comprises a central UAV (C-UAV) tier acting as the control center and a marginal UAV (M-UAV) tier serving GUs. To address increasingly dynamic and complex scenarios, along with the limited generalization ability of Deep Reinforcement Learning (DRL) algorithms, we propose a clustering-assisted dual-agent soft actor-critic (CDA-SAC) algorithm for trajectory design and resource allocation, aiming to maximize the fair energy efficiency of the system. Specifically, by integrating a clustering-matching method with a dual-agent strategy, the proposed CDA-SAC algorithm achieves significant improvements in generalization and exploration capability. Simulation results demonstrate that the proposed CDA-SAC algorithm can be deployed without retraining in scenarios with different numbers of GUs. Furthermore, the CDA-SAC algorithm outperforms both the MADDPG-based multi-UAV scheme and the FDMA scheme in terms of fairness and total energy efficiency.
{"title":"Clustering-Assisted Deep Reinforcement Learning for Joint Trajectory Design and Resource Allocation in Two-Tier-Cooperated UAVs Communications","authors":"Shujun Zhao;Simeng Feng;Chao Dong;Xiaojun Zhu;Qihui Wu","doi":"10.1109/TMLCN.2025.3647806","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3647806","url":null,"abstract":"Considering their high mobility and relatively low cost, uncrewed aerial vehicles (UAVs) equipped with mobile base stations are regarded as a potential technological approach. However, the dual pressures of limited onboard resources of UAVs and the demand for high-quality services in dynamic low-altitude applications jointly form a bottleneck for system performance. Although multi-UAVs communication networks can provide higher system performance through coordinated deployment, the challenges of cooperation and competition among UAVs, as well as more complex optimization problems, significantly increase costs and pose formidable challenges. To overcome the challenges of low coordination efficiency and intense resource competition among multiple UAVs, and to ensure the timely and efficient satisfaction of ground users (GUs) communication service demands, this paper conceives a centralized-controlled two-tier-cooperated UAVs communication network. The network comprises a central UAV (C-UAV) tier as control center and a marginal UAV (M-UAV) tier to serve GUs. In response to the increasingly dynamic and complex scenarios, along with the challenge of insufficient generalization ability in Deep Reinforcement Learning (DRL) algorithms, we propose a clustering-assisted dual-agent soft actor critic (CDA-SAC) algorithm for trajectory design and resource allocation, aiming to maximize the fair energy efficiency of the system. Specifically, by integrating a clustering-matching method with a dual-agent strategy, the proposed CDA-SAC algorithm achieves significant improvements in generalization ability and exploration capability. Simulation results demonstrate that the proposed CDA-SAC algorithm can be deployed without retraining in scenarios with different numbers of GUs. Furthermore, the CDA-SAC algorithm outperforms both the multi-UAV scenarios based on the MADDPG algorithm and the FDMA scheme in terms of fairness and total energy efficiency.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"178-197"},"PeriodicalIF":0.0,"publicationDate":"2025-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11313631","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-23 | DOI: 10.1109/TMLCN.2025.3647807
Mohamed Benzaghta;Sahar Ammar;David López-Pérez;Basem Shihada;Giovanni Geraci
Mobility management in cellular networks faces increasing complexity due to network densification and heterogeneous user mobility characteristics. Traditional handover (HO) mechanisms, which rely on predefined parameters such as A3-offset and time-to-trigger (TTT), often fail to optimize mobility performance across varying speeds and deployment conditions. Fixed A3-offset and TTT configurations either delay HOs, increasing radio link failures (RLFs), or accelerate them, leading to excessive ping-pong effects. To address these challenges, we propose two distinct data-driven mobility management approaches leveraging high-dimensional Bayesian optimization (HD-BO) and deep reinforcement learning (DRL). While HD-BO optimizes predefined HO parameters such as A3-offset and TTT, DRL provides a parameter-free alternative by allowing an agent to select serving cells based on real-time network conditions. We systematically compare these two approaches in real-world site-specific deployment scenarios (employing Sionna ray tracing for site-specific channel propagation modeling), highlighting their complementary strengths. Results show that both HD-BO and DRL outperform 3GPP set-1 (TTT of 480 ms and A3-offset of 3 dB) and set-5 (TTT of 40 ms and A3-offset of −1 dB) benchmarks. We augment HD-BO with transfer learning so it can generalize across a range of user speeds. Applying the same transfer-learning strategy to the DRL method reduces its training time by a factor of 2.5 while preserving optimal HO performance, showing that it adapts efficiently to the mobility of aerial users such as UAVs. Simulations further reveal that HD-BO remains more sample-efficient than DRL, making it more suitable for scenarios with limited training data.
{"title":"Data-Driven Cellular Mobility Management Via Bayesian Optimization and Reinforcement Learning","authors":"Mohamed Benzaghta;Sahar Ammar;David López-Pére;Basem Shihada;Giovanni Geraci","doi":"10.1109/TMLCN.2025.3647807","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3647807","url":null,"abstract":"Mobility management in cellular networks faces increasing complexity due to network densification and heterogeneous user mobility characteristics. Traditional handover (HO) mechanisms, which rely on predefined parameters such as A3-offset and time-to-trigger (TTT), often fail to optimize mobility performance across varying speeds and deployment conditions. Fixed A3-offset and TTT configurations either delay HOs, increasing radio link failures (RLFs), or accelerate them, leading to excessive ping-pong effects. To address these challenges, we propose two distinct data-driven mobility management approaches leveraging high-dimensional Bayesian optimization (HD-BO) and deep reinforcement learning (DRL). While HD-BO optimizes predefined HO parameters such as A3-offset and TTT, DRL provides a parameter-free alternative by allowing an agent to select serving cells based on real-time network conditions. We systematically compare these two approaches in real-world site-specific deployment scenarios (employing Sionna ray tracing for site-specific channel propagation modeling), highlighting their complementary strengths. Results show that both HD-BO and DRL outperform 3GPP set-1 (TTT of 480 ms and A3-offset of 3 dB) and set-5 (TTT of 40 ms and A3-offset of −1 dB) benchmarks. We augment HD-BO with transfer learning so it can generalize across a range of user speeds. Applying the same transfer-learning strategy to the DRL method reduces its training time by a factor of 2.5 while preserving optimal HO performance, showing that it adapts efficiently to the mobility of aerial users such as UAVs. Simulations further reveal that HD-BO remains more sample-efficient than DRL, making it more suitable for scenarios with limited training data.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"228-244"},"PeriodicalIF":0.0,"publicationDate":"2025-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11313634","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145929689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-23 | DOI: 10.1109/TMLCN.2025.3647376
Saad Masrur;Jung-Fu Cheng;Atieh R. Khamesi;İsmail Güvenç
Traditional approaches to indoor localization often achieve poor accuracy in challenging non-line-of-sight (NLOS) environments. Deep learning (DL) has been applied to tackle these challenges; however, many DL approaches overlook computational complexity, especially floating-point operations (FLOPs), making them unsuitable for resource-limited devices. Transformer-based models have achieved remarkable success in natural language processing (NLP) and computer vision (CV) tasks, motivating their use in wireless applications. However, their use in indoor localization remains nascent, and directly applying Transformers to indoor localization can be computationally intensive and can exhibit limited accuracy. To address these challenges, we introduce a novel tokenization approach, referred to as Sensor Snapshot Tokenization (SST), which preserves variable-specific representations of the power delay profile (PDP) and enhances attention mechanisms by effectively capturing multi-variate correlations. Complementing this, we propose a lightweight Swish-Gated Linear Unit-based Transformer (L-SwiGLU-T) model, designed to reduce computational complexity without compromising localization accuracy. Together, these contributions mitigate the computational burden and the dependency on large datasets, making Transformer models more efficient and suitable for resource-constrained scenarios. Experimental results on simulated and real-world datasets demonstrate that SST and L-SwiGLU-T achieve substantial accuracy and efficiency gains, outperforming larger Transformer and CNN baselines by over 40% while using significantly fewer FLOPs and training samples.
{"title":"Transforming Indoor Localization: Advanced Transformer Architecture for NLOS Dominated Wireless Environments With Distributed Sensors","authors":"Saad Masrur;Jung-Fu Cheng;Atieh R. Khamesi;İsmail Güvenç","doi":"10.1109/TMLCN.2025.3647376","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3647376","url":null,"abstract":"Indoor localization in challenging non-line-of-sight (NLOS) environments often leads to poor accuracy with traditional approaches. Deep learning (DL) has been applied to tackle these challenges; however, many DL approaches overlook computational complexity, especially for floating-point operations (FLOPs), making them unsuitable for resource-limited devices. Transformer-based models have achieved remarkable success in natural language processing (NLP) and computer vision (CV) tasks, motivating their use in wireless applications. However, their use in indoor localization remains nascent, and directly applying Transformers for indoor localization can be both computationally intensive and exhibit limitations in accuracy. To address these challenges, in this work, we introduce a novel tokenization approach, referred to as Sensor Snapshot Tokenization (SST), which preserves variable-specific representations of power delay profile (PDP) and enhances attention mechanisms by effectively capturing multi-variate correlation. Complementing this, we propose a lightweight Swish-Gated Linear Unit-based Transformer (L-SwiGLU-T) model, designed to reduce computational complexity without compromising localization accuracy. Together, these contributions mitigate the computational burden and dependency on large datasets, making Transformer models more efficient and suitable for resource-constrained scenarios. Experimental results on simulated and real-world datasets demonstrate that SST and L-SwiGLU-T achieve substantial accuracy and efficiency gains, outperforming larger Transformer and CNN baselines by over 40% while using significantly fewer FLOPs and training samples.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"161-177"},"PeriodicalIF":0.0,"publicationDate":"2025-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11313538","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-18 | DOI: 10.1109/TMLCN.2025.3646125
Pedro Enrique Iturria-Rivera;Raimundas Gaigalas;Medhat Elsayed;Majid Bavand;Yigit Ozcan;Melike Erol-Kantarci
Extended Reality (XR) services are set to transform applications over 5th and 6th generation wireless networks, delivering immersive experiences. Concurrently, Artificial Intelligence (AI) advancements have expanded its role in wireless networks; however, trust and transparency in AI still need to be strengthened, and providing explanations for AI-enabled systems can enhance that trust. We introduce Value Function Factorization (VFF)-based Explainable (X) Multi-Agent Reinforcement Learning (MARL) algorithms, explaining reward design in XR codec adaptation through reward decomposition. We contribute four enhancements to XMARL algorithms. Firstly, we detail architectural modifications that enable reward decomposition in VFF-based MARL algorithms: Value Decomposition Networks (VDN), Mixture of Q-Values (QMIX), and Q-Transformation (Q-TRAN). Secondly, inspired by multi-task learning, we reduce the overhead of vanilla XMARL algorithms. Thirdly, we propose a new explainability metric, Reward Difference Fluctuation Explanation (RDFX), suitable for problems with adjustable parameters. Lastly, we propose adaptive XMARL, leveraging network gradients and reward decomposition for improved action selection. Simulation results indicate that, in XR codec adaptation, the Packet Delivery Ratio reward is the primary contributor to optimal performance compared to the initial composite reward, which included delay and Data Rate Ratio components. Modifications to VFF-based XMARL algorithms, incorporating multi-headed structures and adaptive loss functions, enable the best-performing algorithm, Multi-Headed Adaptive (MHA)-QMIX, to achieve significant average gains over the Adjust Packet Size baseline of up to 10.7%, 41.4%, 33.3%, and 67.9% in XR index, jitter, delay, and Packet Loss Ratio (PLR), respectively.
{"title":"Explainable Multi-Agent Reinforcement Learning for Extended Reality Codec Adaptation","authors":"Pedro Enrique Iturria-Rivera;Raimundas Gaigalas;Medhat Elsayed;Majid Bavand;Yigit Ozcan;Melike Erol-Kantarci","doi":"10.1109/TMLCN.2025.3646125","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3646125","url":null,"abstract":"Extended Reality (XR) services are set to transform applications over <inline-formula> <tex-math>${mathbf {5}}^{th}$ </tex-math></inline-formula> and <inline-formula> <tex-math>${mathbf {6}}^{th}$ </tex-math></inline-formula> generation wireless networks, delivering immersive experiences. Concurrently, Artificial Intelligence (AI) advancements have expanded their role in wireless networks, however, trust and transparency in AI remain to be strengthened. Thus, providing explanations for AI-enabled systems can enhance trust. We introduce Value Function Factorization (VFF)-based Explainable (X) Multi-Agent Reinforcement Learning (MARL) algorithms, explaining reward design in XR codec adaptation through reward decomposition. We contribute four enhancements to XMARL algorithms. Firstly, we detail architectural modifications to enable reward decomposition in VFF-based MARL algorithms: Value Decomposition Networks (VDN), Mixture of Q-Values (QMIX), and Q-Transformation (Q-TRAN). Secondly, inspired by multi-task learning, we reduce the overhead of vanilla XMARL algorithms. Thirdly, we propose a new explainability metric, Reward Difference Fluctuation Explanation (RDFX), suitable for problems with adjustable parameters. Lastly, we propose adaptive XMARL, leveraging network gradients and reward decomposition for improved action selection. Simulation results indicate that, in XR codec adaptation, the Packet Delivery Ratio reward is the primary contributor to optimal performance compared to the initial composite reward, which included delay and Data Rate Ratio components. Modifications to VFF-based XMARL algorithms, incorporating multi-headed structures and adaptive loss functions, enable the best-performing algorithm, Multi-Headed Adaptive (MHA)-QMIX, to achieve significant average gains over the Adjust Packet Size baseline up to 10.7%, 41.4%, 33.3%, and 67.9% in XR index, jitter, delay, and Packet Loss Ratio (PLR), respectively.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"245-264"},"PeriodicalIF":0.0,"publicationDate":"2025-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11303975","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145929688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-16 | DOI: 10.1109/TMLCN.2025.3644333
Ons Aouedi;Flor Ortiz;Thang X. Vu;Alexandre Lefourn;Felix Giese;Guillermo Gutierrez;Symeon Chatzinotas
The growing integration of non-terrestrial networks (NTNs), particularly low Earth orbit (LEO) satellite constellations, has significantly extended the reach of maritime connectivity, supporting critical applications such as vessel monitoring, navigation safety, and maritime surveillance in remote and oceanic regions. Automatic Identification System (AIS) data, increasingly collected through a combination of satellite and terrestrial infrastructures, provide a rich source of spatiotemporal vessel information. However, accurate trajectory prediction in maritime domains remains challenging due to irregular sampling rates, dynamic environmental conditions, and heterogeneous vessel behaviors. This study proposes a velocity-based trajectory prediction framework that leverages AIS data collected from integrated satellite–terrestrial networks. Rather than directly predicting absolute positions (latitude and longitude), our model predicts vessel motion in the form of latitude and longitude velocities. This formulation simplifies the learning task, enhances temporal continuity, and improves scalability, making it well-suited for resource-constrained NTN environments. The predictive architecture is built upon a Long Short-Term Memory network enhanced with attention mechanisms and residual connections (LSTM-RA), enabling it to capture complex temporal dependencies and adapt to noise in real-world AIS data. Extensive experiments on two maritime datasets validate the robustness and accuracy of our framework, demonstrating clear improvements over state-of-the-art baselines.
{"title":"AIS-Based Hybrid Vessel Trajectory Prediction for Enhanced Maritime Navigation","authors":"Ons Aouedi;Flor Ortiz;Thang X. Vu;Alexandre Lefourn;Felix Giese;Guillermo Gutierrez;Symeon Chatzinotas","doi":"10.1109/TMLCN.2025.3644333","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3644333","url":null,"abstract":"The growing integration of non-terrestrial networks (NTNs), particularly low Earth orbit (LEO) satellite constellations, has significantly extended the reach of maritime connectivity, supporting critical applications such as vessel monitoring, navigation safety, and maritime surveillance in remote and oceanic regions. Automatic Identification System (AIS) data, increasingly collected through a combination of satellite and terrestrial infrastructures, provide a rich source of spatiotemporal vessel information. However, accurate trajectory prediction in maritime domains remains challenging due to irregular sampling rates, dynamic environmental conditions, and heterogeneous vessel behaviors. This study proposes a velocity-based trajectory prediction framework that leverages AIS data collected from integrated satellite–terrestrial networks. Rather than directly predicting absolute positions (latitude and longitude), our model predicts vessel motion in the form of latitude and longitude velocities. This formulation simplifies the learning task, enhances temporal continuity, and improves scalability, making it well-suited for resource-constrained NTN environments. The predictive architecture is built upon a Long Short-Term Memory network enhanced with attention mechanisms and residual connections (<monospace>LSTM-RA</monospace>), enabling it to capture complex temporal dependencies and adapt to noise in real-world AIS data. Extensive experiments on two maritime datasets validate the robustness and accuracy of our framework, demonstrating clear improvements over state-of-the-art baselines.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"198-210"},"PeriodicalIF":0.0,"publicationDate":"2025-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11301841","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-12 | DOI: 10.1109/TMLCN.2025.3643409
George P. Kontoudis;Daniel J. Stilwell
In this paper, we propose scalable methods for Gaussian process (GP) prediction in decentralized multi-agent systems. Multiple aggregation techniques for GP prediction are decentralized with the use of iterative and consensus methods. Moreover, we introduce a covariance-based nearest neighbor selection strategy that leverages cross-covariance similarity, enabling subsets of agents to make accurate predictions. The proposed decentralized schemes preserve the consistency properties of their centralized counterparts, while adhering to federated learning principles by restricting raw data exchange between agents. We validate the efficacy of the proposed decentralized algorithms with numerical experiments on real-world sea surface temperature and ground elevation map datasets across multiple fleet sizes.
{"title":"Multi-Agent Federated Learning Using Covariance-Based Nearest Neighbor Gaussian Processes","authors":"George P. Kontoudis;Daniel J. Stilwell","doi":"10.1109/TMLCN.2025.3643409","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3643409","url":null,"abstract":"In this paper, we propose scalable methods for Gaussian process (GP) prediction in decentralized multi-agent systems. Multiple aggregation techniques for GP prediction are decentralized with the use of iterative and consensus methods. Moreover, we introduce a covariance-based nearest neighbor selection strategy that leverages cross-covariance similarity, enabling subsets of agents to make accurate predictions. The proposed decentralized schemes preserve the consistency properties of their centralized counterparts, while adhering to federated learning principles by restricting raw data exchange between agents. We validate the efficacy of the proposed decentralized algorithms with numerical experiments on real-world sea surface temperature and ground elevation map datasets across multiple fleet sizes.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"4 ","pages":"115-138"},"PeriodicalIF":0.0,"publicationDate":"2025-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11299094","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}