Pub Date : 2025-10-07 DOI: 10.1109/TMLCN.2025.3618815
Ehsan Ghoreishi;Bahman Abolhassani;Yan Huang;Shiva Acharya;Wenjing Lou;Y. Thomas Hou
Puncturing is a promising technique in 3GPP for multiplexing Enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low Latency Communications (URLLC) traffic on the same 5G New Radio (NR) air interface. The essence of puncturing is to transmit URLLC packets on demand upon their arrival, by preempting radio resources (or subcarriers) already allocated to eMBB traffic. Although it is considered the most bandwidth-efficient approach, puncturing URLLC data onto eMBB can degrade eMBB performance. Most state-of-the-art research addressing this problem employs raw eMBB data throughput as the performance metric. This is inadequate because, after puncturing, eMBB data may or may not be successfully decoded at its receiver. This paper presents Cyrus+, a deep reinforcement learning (DRL)-based puncturing solution that employs goodput (obtained through feedback from the receiver's decoder), rather than estimated raw throughput, in the design of its reward function. Further, Cyrus+ is tailored specifically for the Open RAN (O-RAN) architecture and fully leverages O-RAN's three control loops at different time scales in its DRL design. In the Non-Real-Time (Non-RT) RAN Intelligent Controller (RIC), Cyrus+ initializes the policy network that will be used in the Real-Time (RT) Open Distributed Unit (O-DU). In the Near-RT RIC, Cyrus+ refines the policy based on dynamic network conditions and feedback from the receivers. In the RT O-DU, Cyrus+ generates a puncturing codebook by considering all possible URLLC arrivals. We build a standard-compliant link-level 5G NR simulator to demonstrate the efficacy of Cyrus+. Experimental results show that Cyrus+ outperforms benchmark puncturing algorithms and meets the stringent timing requirement of 5G NR (numerology 3).
{"title":"Cyrus+: A DRL-Based Puncturing Solution to URLLC/eMBB Multiplexing in O-RAN","authors":"Ehsan Ghoreishi;Bahman Abolhassani;Yan Huang;Shiva Acharya;Wenjing Lou;Y. Thomas Hou","doi":"10.1109/TMLCN.2025.3618815","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3618815","url":null,"abstract":"Puncturing is a promising technique in 3GPP to multiplex Enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low Latency Communications (URLLC) traffic on the same 5G New Radio (NR) air interface. The essence of puncturing is to transmit URLLC packets on demand upon their arrival, by preempting the radio resources (or subcarriers) that are already allocated to eMBB traffic. Although it is considered most bandwidth efficient, puncturing URLLC data on eMBB can lead to degradation of eMBB’s performance. Most of the state-of-the-art research addressing this problem employ raw eMBB data throughput as performance metric. This is inadequate as, after puncturing, eMBB data may or may not be successfully decoded at its receiver. This paper presents Cyrus+—a deep reinforcement learning (DRL)-based puncturing solution that employs goodput (through feedback from a receiver’s decoder), rather than estimated raw throughput, in its design of reward function. Further, Cyrus+ is tailored specifically for the Open RAN (O-RAN) architecture and fully leverages O-RAN’s three control loops at different time scales in its design of DRL. In the Non-Real-Time (Non-RT) RAN Intelligent Controller (RIC), Cyrus+ initializes the policy network that will be used in the RT Open Distributed Unit (O-DU). In the Near-RT RIC, Cyrus+ refines the policy based on dynamic network conditions and feedback from the receivers. In the RT O-DU, Cyrus+ generates a puncturing codebook by considering all possible URLLC arrivals. We build a standard-compliant link-level 5G NR simulator to demonstrate the efficacy of Cyrus+. Experimental results show that Cyrus+ outperforms benchmark puncturing algorithms and meets the stringent timing requirement in 5G NR (numerology 3).","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1178-1196"},"PeriodicalIF":0.0,"publicationDate":"2025-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11195824","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145352236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-10-06 DOI: 10.1109/TMLCN.2025.3617883
Guangsheng Yu;Ying He;Eryk Dutkiewicz;Bathiya Senanayake;Manik Attygalle
Federated Learning (FL) faces significant challenges when applied in 6G (sixth-generation wireless technology) Non-Terrestrial Network (NTN) environments, including heterogeneous interference, stringent requirements for real-time model responsiveness, and a limited ability to collect comprehensive datasets due to the absence of a global network view. In this paper, we propose Spark, a novel framework designed to enable a fully decentralized FL process tailored for NTN. By leveraging a directed acyclic graph (DAG)-based architecture, Spark addresses the unique demands of NTN through asynchronous updates, localized learning prioritization, and adaptive aggregation strategies, ensuring robust performance under dynamic and constrained conditions. Extensive experiments demonstrate that Spark outperforms other FL frameworks and effectively addresses the key challenges of NTN-based FL through its asynchronous design: ensuring resilience under communication delays, enhancing responsiveness via timely local updates, and improving coverage through altitude-aware aggregation that leverages diverse, high-altitude knowledge.
{"title":"SPARK: A Scalable Peer-to-Peer Asynchronous Resilient Framework for Federated Learning in Non-Terrestrial Networks","authors":"Guangsheng Yu;Ying He;Eryk Dutkiewicz;Bathiya Senanayake;Manik Attygalle","doi":"10.1109/TMLCN.2025.3617883","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3617883","url":null,"abstract":"Federated Learning (FL) faces significant challenges when applied in 6G (sixth-generation wireless technology) Non-Terrestrial Network (NTN) environments, including heterogeneous interference, stringent requirements for real-time model responsiveness, and limited ability to collect comprehensive datasets due to the absence of a global network view. In this paper, we propose S<sc>park</small>, a novel framework designed to enable a fully decentralized FL process tailored for NTN. By leveraging a Directed acyclic graph (DAG)-based architecture, S<sc>park</small> addresses the unique demands of NTN through asynchronous updates, localized learning prioritization, and adaptive aggregation strategies, ensuring robust performance under dynamic and constrained conditions. Extensive experiments demonstrate that S<sc>park</small> outperforms other FL frameworks and effectively addresses the key challenges of NTN-based FL through its asynchronous design–ensuring resilience under communication delays, enhancing responsiveness via timely local updates, and improving coverage through altitude-aware aggregation that leverages diverse, high-altitude knowledge.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1092-1107"},"PeriodicalIF":0.0,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11193784","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145315527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-09-26 DOI: 10.1109/TMLCN.2025.3615119
Mine Gokce Dogan;Martina Cardone;Christina Fragouli
Millimeter-wave (mmWave) technology is expected to support next-generation wireless networks by expanding the available spectrum and supporting multi-gigabit services. While mmWave communications hold great promise, mmWave links are vulnerable to link blockages, which can severely impact their performance. This paper aims to develop resilient transmission mechanisms to suitably distribute traffic across multiple paths in mmWave networks. The main contributions include: (a) the development of proactive transmission mechanisms that build resilience against link blockages in advance while achieving a high end-to-end packet rate; (b) the design of a heuristic path selection algorithm to efficiently select (in polynomial time in the network size) multiple proactively resilient paths that have high capacity; and (c) the development of a hybrid scheduling algorithm that combines the proposed path selection algorithm with a deep reinforcement learning (DRL)-based online approach for decentralized adaptation to blockages. To achieve resilience against link blockages and to adapt the information flow through the network, the prominent Soft Actor-Critic (SAC) DRL algorithm is investigated. The proposed scheduling algorithm robustly adapts to blockages and channel variations over different topologies, channels, and blockage realizations while outperforming alternative algorithms, including the conventional additive-increase/multiplicative-decrease congestion control algorithm. Specifically, it achieves the desired packet rate in over 99% of the episodes in static networks with blockages (where up to 80% of the paths are blocked), and in 100% of the episodes in time-varying networks with blockages and link capacity variations, compared to the 0.5% to 45% success rates achieved by the baseline methods under the same conditions.
{"title":"A Reinforcement Learning-Based Hybrid Scheduling Mechanism for mmWave Networks","authors":"Mine Gokce Dogan;Martina Cardone;Christina Fragouli","doi":"10.1109/TMLCN.2025.3615119","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3615119","url":null,"abstract":"Millimeter-wave (mmWave) technology is expected to support next-generation wireless networks by expanding the available spectrum and supporting multi-gigabit services. While mmWave communications hold great promise, mmWave links are vulnerable against link blockages, which can severely impact their performance. This paper aims to develop resilient transmission mechanisms to suitably distribute traffic across multiple paths in mmWave networks. The main contributions include: (a) the development of proactive transmission mechanisms to build resilience against link blockages in advance, while achieving a high end-to-end packet rate; (b) the design of a heuristic path selection algorithm to efficiently select (in polynomial time in the network size) multiple proactively resilient paths that have high capacity; and (c) the development of a hybrid scheduling algorithm that combines the proposed path selection algorithm with a deep reinforcement learning (DRL) based online approach for decentralized adaptation to blockages. To achieve resilience against link blockages and to adapt the information flow through the network, a prominent Soft Actor-Critic DRL algorithm is investigated. The proposed scheduling algorithm robustly adapts to blockages and channel variations over different topologies, channels, and blockage realizations while outperforming alternative algorithms which include a conventional congestion control algorithm additive increase multiplicative decrease. Specifically, it achieves the desired packet rate in over 99% of the episodes in static networks with blockages (where up to 80% of the paths are blocked), and in 100% of the episodes in time-varying networks with blockages and link capacity variations – compared to 0.5% to 45% success rates achieved by the baseline methods under the same conditions.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1121-1142"},"PeriodicalIF":0.0,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11181133","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145315526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-09-19 DOI: 10.1109/TMLCN.2025.3611977
Xuefeng Han;Wen Chen;Jun Li;Ming Ding;Qingqing Wu;Kang Wei;Xiumei Deng;Yumeng Shao;Qiong Wu
Multimodal federated learning (MFL) is a distributed framework for training multimodal models without uploading clients' local multimodal data, thereby effectively protecting client privacy. However, multimodal data is commonly heterogeneous across clients, with each client possessing only a subset of all modalities, which renders conventional analysis results and optimization methods from unimodal federated learning inapplicable. In addition, fixed latency demands and limited communication bandwidth pose significant challenges for deploying MFL in wireless scenarios. To optimize wireless MFL performance under modal heterogeneity, this paper proposes a joint client scheduling and bandwidth allocation (JCSBA) algorithm based on a decision-level fusion architecture augmented with unimodal loss functions. Specifically, using the per-modality decision results, the unimodal loss functions are added to both the training objective and the local update loss functions to accelerate multimodal convergence and improve unimodal performance. To characterize MFL performance, we derive a closed-form upper bound related to client and modality scheduling and minimize the derived bound under latency, energy, and bandwidth constraints through JCSBA. Experimental results on multimodal datasets demonstrate that the JCSBA algorithm improves multimodal accuracy and unimodal accuracy by 4.06% and 2.73%, respectively, compared to conventional algorithms.
{"title":"Analysis and Optimization of Wireless Multimodal Federated Learning on Modal Heterogeneity","authors":"Xuefeng Han;Wen Chen;Jun Li;Ming Ding;Qingqing Wu;Kang Wei;Xiumei Deng;Yumeng Shao;Qiong Wu","doi":"10.1109/TMLCN.2025.3611977","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3611977","url":null,"abstract":"Multimodal federated learning (MFL) is a distributed framework for training multimodal models without uploading local multimodal data of clients, thereby effectively protecting client privacy. However, multimodal data is commonly heterogeneous across diverse clients, where each client possesses only a subset of all modalities, renders conventional analysis results and optimization methods in unimodal federated learning inapplicable. In addition, fixed latency demand and limited communication bandwidth pose significant challenges for deploying MFL in wireless scenarios. To optimize the wireless MFL performance on modal heterogeneity, this paper proposes a joint client scheduling and bandwidth allocation (JCSBA) algorithm based on a decision-level fusion architecture with adding a unimodal loss function. Specifically, with the decision results, the unimodal loss functions are added to both the training objective and local update loss functions to accelerate multimodal convergence and improve unimodal performance. To characterize MFL performance, we derive a closed-form upper bound related to client and modality scheduling and minimize the derived bound under the latency, energy, and bandwidth constraints through JCSBA. Experimental results on multimodal datasets demonstrate that the JCSBA algorithm improves the multimodal accuracy and the unimodal accuracy by 4.06% and 2.73%, respectively, compared to conventional algorithms.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1075-1091"},"PeriodicalIF":0.0,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11174013","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145210159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-09-18 DOI: 10.1109/TMLCN.2025.3611398
Md Jueal Mia;M. Hadi Amini
Federated Learning (FL) is a decentralized learning method that enables collaborative model training while preserving data privacy. This makes FL a promising solution for various applications, particularly in cross-silo settings such as healthcare, finance, and transportation. However, FL remains highly vulnerable to adversarial threats, especially backdoor attacks, where malicious clients inject poisoned data to manipulate global model behavior. Existing outlier detection techniques often struggle to effectively isolate such adversarial updates, compromising model integrity. To address this challenge, we propose the Backdoor Attack Resilient Technique for Federated Learning (BART-FL), a novel lightweight defense mechanism that enhances FL security through malicious client filtering. Our method combines Principal Component Analysis (PCA) for dimensionality reduction, cosine similarity for measuring pairwise distances between model updates, and K-means clustering for detecting potentially malicious clients. To reliably identify the benign cluster, we introduce a multi-metric statistical voting mechanism based on the point-level mean, the median absolute deviation (MAD), and the cluster-level mean. This approach strengthens model resilience against adversarial manipulations by identifying and filtering malicious updates before aggregation, thereby preserving the integrity of the global model. Experimental evaluations conducted on the LISA traffic light dataset, CIFAR-10, and CIFAR-100 demonstrate the effectiveness of BART-FL in maintaining model performance across diverse FL settings. Additionally, we perform a comparative analysis against existing backdoor defense techniques, highlighting BART-FL's ability to improve security while ensuring computational efficiency. Our results showcase the potential of BART-FL as a scalable and adversary-resilient defense mechanism for secure training in cross-silo FL applications.
{"title":"BART-FL: A Backdoor Attack-Resilient Federated Aggregation Technique for Cross-Silo Applications","authors":"Md Jueal Mia;M. Hadi Amini","doi":"10.1109/TMLCN.2025.3611398","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3611398","url":null,"abstract":"Federated Learning (FL) is a decentralized learning method that enables collaborative model training while preserving data privacy. This makes FL a promising solution in various applications, particularly in cross-silo settings such as healthcare, finance, and transportation. However, FL remains highly vulnerable to adversarial threats, especially backdoor attacks, where malicious clients inject poisoned data to manipulate global model behavior. Existing outlier detection techniques often struggle to effectively isolate such adversarial updates, compromising model integrity. To address this challenge, we propose Backdoor Attack Resilient Technique for Federated Learning (BART-FL), a novel lightweight defense mechanism that enhances FL security through malicious client filtering. Our method integrates Principal Component Analysis (PCA) for dimensionality reduction with cosine similarity for measuring pairwise distances between model updates and K-means clustering for detecting potentially malicious clients. To reliably identify the benign cluster, we introduce a multi-metric statistical voting mechanism based on point-level mean, median absolute deviation (MAD), and cluster-level mean. This approach strengthens model resilience against adversarial manipulations by identifying and filtering malicious updates before aggregation, thereby preserving the integrity of the global model. Experimental evaluations conducted on the LISA traffic light dataset, CIFAR-10, and CIFAR-100 demonstrate the effectiveness of BART-FL in maintaining model performance across diverse FL settings. Additionally, we perform a comparative analysis against existing backdoor defense techniques, highlighting BART-FL’s ability to improve security while ensuring computational efficiency. Our results showcase the potential of BART-FL as a scalable and adversary-resilient defense mechanism for secure training in cross-silo FL applications.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1311-1325"},"PeriodicalIF":0.0,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11172307","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-09-09 DOI: 10.1109/TMLCN.2025.3607891
Maheshi Lokumarambage;Thushan Sivalingam;Feng Dong;Nandana Rajatheva;Anil Fernando
Semantic communication (SemCom) systems enhance transmission efficiency by conveying semantic information in lieu of raw data. However, challenges arise when designing these systems due to the need for robust semantic source coding that can represent information beyond the training dataset, for channel-agnostic performance, and for robustness to channel and semantic noise. We propose a novel generative artificial intelligence (AI)-based SemCom architecture conditioned on quantized latents. The system reduces communication overhead by mapping each quantized latent vector to the learned codebook vectors and transmitting only the corresponding index over the wireless channel. The learned codebook serves as the shared knowledge base. The encoder is designed with a novel spatial attention mechanism based on image energy, focusing on object edges. The critic assesses the realism of generated data relative to the original distribution using the Wasserstein distance. The model introduces novel contrastive objectives at multiple levels, including pixel, latent, perceptual, and task output, tailored for noisy wireless semantic communication. We validated the proposed model for transmission quality and robustness with low-density parity-check (LDPC) coding; it outperforms the Better Portable Graphics (BPG) baseline, particularly at low signal-to-noise ratio (SNR) levels (below 5 dB). Additionally, it shows results comparable to joint source-channel coding (JSCC) with lower complexity and latency. The model is validated for human-perception and machine-perception-oriented task utility. The model effectively transmits high-resolution images without requiring additional error correction at the receiver. We also propose a novel semantic-based metric to evaluate robustness to noise and task-specific semantic distortion.
{"title":"Generative AI-Based Vector Quantized End-to-End Semantic Communication System for Wireless Image Transmission","authors":"Maheshi Lokumarambage;Thushan Sivalingam;Feng Dong;Nandana Rajatheva;Anil Fernando","doi":"10.1109/TMLCN.2025.3607891","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3607891","url":null,"abstract":"Semantic communication (SemCom) systems enhance transmission efficiency by conveying semantic information in lieu of raw data. However, challenges arise when designing these systems due to the need for robust semantic source coding for information representation extending beyond the training dataset, maintaining channel-agnostic performance, and ensuring robustness to channel and semantic noise. We propose a novel generative artificial intelligence (AI) based SemCom architecture conditioned on quantized latent. The system reduces the communication overhead of the wireless channel by transmitting the index of the quantized latent over the communication channel by mapping the quantized vector to the learned codebook vectors. The learned codebook is the shared knowledge base. The encoder is designed with a novel spatial attention mechanism based on image energy, focusing on object edges. The critic assesses the realism of generated data relative to the original distribution, with the Wasserstein distance. The model introduces novel contrastive objectives at multiple levels, including pixel, latent, perceptual, and task output, tailored for noisy wireless semantic communication. We validated the proposed model for transmission quality and robustness with low-density parity-check (LDPC), which outperforms the baselines of better portable graphics (BPG), specifically at low signal-to-noise ratio (SNR) levels (<inline-formula> <tex-math>$ {lt }~ {5}$ </tex-math></inline-formula> dB). Additionally, it shows comparable results with joint source-channel coding (JSCC) with lower complexity and latency. The model is validated for human perception and machine perception-oriented task utility. The model effectively transmits high-resolution images without requiring additional error correction at the receiver. We propose a novel semantic-based matrix to evaluate the robustness to noise and task-specific semantic distortion.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1050-1074"},"PeriodicalIF":0.0,"publicationDate":"2025-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11154002","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145100350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-09-03 DOI: 10.1109/TMLCN.2025.3605855
Muhammad O. Farooq
Modern cyber environments are becoming increasingly complex and distributed, often organized into multiple interconnected subnets and nodes. Even relatively small-scale networks can exhibit significant security challenges due to their dynamic topologies and the diversity of potential attack vectors. In modern cyber environments, human-led defense alone is insufficient due to delayed response times, cognitive overload, and limited availability of skilled personnel, particularly in remote or resource-constrained settings. These challenges are intensified by the growing diversity of cyber threats, including adaptive and machine learning-based attacks, which demand rapid and intelligent responses. Addressing this, we propose a reinforcement learning (RL)-based framework that integrates eXtreme Gradient Boosting (XGBoost) and transformer architectures to develop robust, generalizable defensive agents. The proposed agents are evaluated against both baseline defenders trained to counter specific adversaries and hierarchical generic agents representing the current state-of-the-art. Experimental results demonstrate that the RL-XGBoost (integration of RL and XGBoost) agent consistently achieves superior performance in terms of defense accuracy and efficiency across varied adversarial strategies and network configurations. Notably, in scenarios involving changes to network topology, both RL-Transformer (RL combined with transformer architectures) and RL-XGBoost agents exhibit strong adaptability and resilience, outperforming specialized blue agents and hierarchical agents in performance consistency. In particular, the RL-Transformer variant (RL-BERT) demonstrates exceptional robustness when attacker entry points are altered, effectively capturing long-range dependencies and temporal patterns through its self-attention mechanism. Overall, these findings highlight the RL-XGBoost model’s potential as a scalable and intelligent solution for multi-adversary defense in dynamic and heterogeneous cyber environments.
{"title":"Robust Defensive Cyber Agent for Multi-Adversary Defense","authors":"Muhammad O. Farooq","doi":"10.1109/TMLCN.2025.3605855","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3605855","url":null,"abstract":"Modern cyber environments are becoming increasingly complex and distributed, often organized into multiple interconnected subnets and nodes. Even relatively small-scale networks can exhibit significant security challenges due to their dynamic topologies and the diversity of potential attack vectors. In modern cyber environments, human-led defense alone is insufficient due to delayed response times, cognitive overload, and limited availability of skilled personnel, particularly in remote or resource-constrained settings. These challenges are intensified by the growing diversity of cyber threats, including adaptive and machine learning-based attacks, which demand rapid and intelligent responses. Addressing this, we propose a reinforcement learning (RL)-based framework that integrates eXtreme Gradient Boosting (XGBoost) and transformer architectures to develop robust, generalizable defensive agents. The proposed agents are evaluated against both baseline defenders trained to counter specific adversaries and hierarchical generic agents representing the current state-of-the-art. Experimental results demonstrate that the RL-XGBoost (integration of RL and XGBoost) agent consistently achieves superior performance in terms of defense accuracy and efficiency across varied adversarial strategies and network configurations. Notably, in scenarios involving changes to network topology, both RL-Transformer (RL combined with transformer architectures) and RL-XGBoost agents exhibit strong adaptability and resilience, outperforming specialized blue agents and hierarchical agents in performance consistency. In particular, the RL-Transformer variant (RL-BERT) demonstrates exceptional robustness when attacker entry points are altered, effectively capturing long-range dependencies and temporal patterns through its self-attention mechanism. Overall, these findings highlight the RL-XGBoost model’s potential as a scalable and intelligent solution for multi-adversary defense in dynamic and heterogeneous cyber environments.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1030-1049"},"PeriodicalIF":0.0,"publicationDate":"2025-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11150430","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145036224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-08-13 DOI: 10.1109/TMLCN.2025.3598739
Kai Wang;Chee Wei Tan
Network routing is a core functionality in computer networks that holds significant potential for integrating newly developed techniques with minimal software effort through the use of Software-Defined Networking (SDN). However, with the continual expansion of the Internet, traditional destination-based IP routing techniques struggle to meet Quality-of-Service (QoS) requirements with SDN alone. To address these challenges, a modern network routing technique called Segment Routing (SR) has been designed to simplify traffic engineering and make networks more flexible and scalable. However, the SR routing algorithms used by major Internet Service Providers (ISPs) are mostly proprietary, and their details remain unknown. This study investigates the inverse problem for a general type of SR and attempts to infer SR policies from expert traffic traces. To this end, we propose MoME, a Mixture-of-Experts (MoE) model built on the Maximum Entropy Inverse Reinforcement Learning (MaxEnt-IRL) framework that can incorporate diverse features (e.g., router, link, and context) and capture complex relationships in the link cost, combined with an Expectation-Maximization (EM)-based iterative algorithm that jointly infers link costs and SR policy classes. Experimental results on real-world ISP topologies and Traffic Matrices (TMs) demonstrate the superior performance of our approach in jointly classifying SR policies and inferring link cost functions. Specifically, our model achieves classification accuracies of 0.90, 0.81, 0.75, and 0.57 on datasets that contain five SR policies over the small-scale Abilene and GÉANT, the medium-scale Exodus, and the large-scale Sprintlink network topologies, respectively.
{"title":"Reverse Engineering Segment Routing Policies and Link Costs With Inverse Reinforcement Learning and EM","authors":"Kai Wang;Chee Wei Tan","doi":"10.1109/TMLCN.2025.3598739","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3598739","url":null,"abstract":"Network routing is a core functionality in computer networks that holds significant potential for integrating newly developed techniques with minimal software effort through the use of Software-Defined Networking (SDN). However, with the ever-expansion of the Internet, traditional destination-based IP routing techniques struggle to meet Quality-of-Service (QoS) requirements with SDN alone. To address these challenges, a modern network routing technique called Segment Routing (SR) has been designed to simplify traffic engineering and make networks more flexible and scalable. However, existing SR routing algorithms used by major Internet Service Providers (ISPs) are mostly proprietary, whose details remain unknown. This study delves into the inverse problem of a general type of SR and attempts to infer the SR policies given expert traffic traces. To this end, we propose MoME, a Mixture-of-Experts (MoE) model using the Maximum Entropy Inverse Reinforcement Learning (MaxEnt-IRL) framework that is capable of incorporating diverse features (e.g., router, link and context) and capturing complex relationships in the link cost, in combination with an Expectation-Maximization (EM) based iterative algorithm that jointly infers link costs and SR policy classes. Experimental results on real-world ISP topologies and Traffic Matrices (TMs) demonstrate the superior performance of our approach in jointly classifying SR policies and inferring link cost functions. Specifically, our model achieves classification accuracies of 0.90, 0.81, 0.75, and 0.57 on datasets that contain five SR policies over the small-scale Abilene and GÉANT, the medium-scale Exodus, and the large-scale Sprintlink network topologies, respectively.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1014-1029"},"PeriodicalIF":0.0,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11124467","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144891104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-08-06 DOI: 10.1109/TMLCN.2025.3596548
Abdul Karim Gizzini;Yahia Medjahdi;Ali J. Ghandour;Laurent Clavier
Support for artificial intelligence (AI)-based decision-making is a key element of future 6G networks. Moreover, AI is widely employed in critical applications such as autonomous driving and medical diagnosis. In such applications, using AI models as black boxes is risky and challenging. Hence, it is crucial to understand and trust the decisions taken by these models. This issue can be tackled by developing explainable AI (XAI) schemes that aim to explain the logic behind a black-box model's behavior and thus ensure its efficient and safe deployment. Highlighting the relevant inputs the black-box model uses to accomplish the desired prediction is essential to ensuring its interpretability. Recently, we proposed a novel perturbation-based feature selection framework, called XAI-CHEST, oriented toward channel estimation in wireless communications. This manuscript provides the detailed theoretical foundations of the XAI-CHEST framework. In particular, we derive the analytical expressions of the XAI-CHEST loss functions and of the noise threshold fine-tuning optimization problem. The resulting XAI-CHEST framework delivers a low-complexity, one-shot input feature selection methodology for high-dimensional model inputs that can further improve overall performance while optimizing the architecture of the employed model. Simulation results show that the XAI-CHEST framework outperforms classical feature-selection XAI schemes such as Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP), mainly in terms of interpretability resolution, while providing a better performance-complexity trade-off.
{"title":"Explainable AI for Enhancing Efficiency of DL-Based Channel Estimation","authors":"Abdul Karim Gizzini;Yahia Medjahdi;Ali J. Ghandour;Laurent Clavier","doi":"10.1109/TMLCN.2025.3596548","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3596548","url":null,"abstract":"The support of artificial intelligence (AI) based decision-making is a key element in future 6G networks. Moreover, AI is widely employed in critical applications such as autonomous driving and medical diagnosis. In such applications, using AI as black-box models is risky and challenging. Hence, it is crucial to understand and trust the decisions taken by these models. Tackling this issue can be achieved by developing explainable AI (XAI) schemes that aim to explain the logic behind the black-box model behavior, and thus, ensure its efficient and safe deployment. Highlighting the relevant inputs the black-box model uses to accomplish the desired prediction is essential towards ensuring its interpretability. Recently, we proposed a novel perturbation-based feature selection framework called XAI-CHEST and oriented toward channel estimation in wireless communications. This manuscript provides the detailed theoretical foundations of the XAI-CHEST framework. In particular, we derive the analytical expressions of the XAI-CHEST loss functions and the noise threshold fine-tuning optimization problem. Hence the designed XAI-CHEST delivers a smart low-complex one-shot input feature selection methodology for high-dimensional model input that can further improve the overall performance while optimizing the architecture of the employed model. Simulation results show that the XAI-CHEST framework outperforms the classical feature selection XAI schemes such as local interpretable model-agnostic explanations (LIME) and shapley additive explanations (SHAP), mainly in terms of interpretability resolution as well as providing better performance-complexity trade-off.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"976-996"},"PeriodicalIF":0.0,"publicationDate":"2025-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11115091","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144858638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-08-05 DOI: 10.1109/TMLCN.2025.3595841
Feifan Zhang;Yuyang Du;Kexin Chen;Yulin Shao;Soung Chang Liew
Semantic communication is a promising technology for next-generation wireless networks. However, the out-of-distribution (OOD) problem, where a pre-trained machine learning (ML) model is applied to unseen tasks that are outside the distribution of its training data, may compromise the integrity of semantic compression. This paper explores the use of multi-modal large language models (MLLMs) to address the OOD issue in image semantic communication. We propose a novel “Plan A - Plan B” framework that leverages the broad knowledge and strong generalization ability of an MLLM to assist a conventional ML model when the latter encounters an OOD input in the semantic encoding process. Furthermore, we propose a Bayesian optimization scheme that reshapes the probability distribution of the MLLM’s inference process based on the contextual information of the image. The optimization scheme significantly enhances the MLLM’s performance in semantic compression by 1) filtering out irrelevant vocabulary in the original MLLM output; and 2) using contextual similarities between prospective answers of the MLLM and the background information as prior knowledge to modify the MLLM’s probability distribution during inference. Further, at the receiver side of the communication system, we put forth a “generate-criticize” framework that utilizes the cooperation of multiple MLLMs to enhance the reliability of image reconstruction.
{"title":"Out-of-Distribution in Image Semantic Communication: A Solution With Multimodal Large Language Models","authors":"Feifan Zhang;Yuyang Du;Kexin Chen;Yulin Shao;Soung Chang Liew","doi":"10.1109/TMLCN.2025.3595841","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3595841","url":null,"abstract":"Semantic communication is a promising technology for next-generation wireless networks. However, the out-of-distribution (OOD) problem, where a pre-trained machine learning (ML) model is applied to unseen tasks that are outside the distribution of its training data, may compromise the integrity of semantic compression. This paper explores the use of multi-modal large language models (MLLMs) to address the OOD issue in image semantic communication. We propose a novel “Plan A - Plan B” framework that leverages the broad knowledge and strong generalization ability of an MLLM to assist a conventional ML model when the latter encounters an OOD input in the semantic encoding process. Furthermore, we propose a Bayesian optimization scheme that reshapes the probability distribution of the MLLM’s inference process based on the contextual information of the image. The optimization scheme significantly enhances the MLLM’s performance in semantic compression by 1) filtering out irrelevant vocabulary in the original MLLM output; and 2) using contextual similarities between prospective answers of the MLLM and the background information as prior knowledge to modify the MLLM’s probability distribution during inference. Further, at the receiver side of the communication system, we put forth a “generate-criticize” framework that utilizes the cooperation of multiple MLLMs to enhance the reliability of image reconstruction.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"997-1013"},"PeriodicalIF":0.0,"publicationDate":"2025-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11113346","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144858657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}