
Latest Publications in IEEE Transactions on Machine Learning in Communications and Networking

Cyrus+: A DRL-Based Puncturing Solution to URLLC/eMBB Multiplexing in O-RAN
Pub Date : 2025-10-07 DOI: 10.1109/TMLCN.2025.3618815
Ehsan Ghoreishi;Bahman Abolhassani;Yan Huang;Shiva Acharya;Wenjing Lou;Y. Thomas Hou
Puncturing is a promising technique in 3GPP to multiplex Enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low Latency Communications (URLLC) traffic on the same 5G New Radio (NR) air interface. The essence of puncturing is to transmit URLLC packets on demand upon their arrival, by preempting the radio resources (or subcarriers) already allocated to eMBB traffic. Although it is considered the most bandwidth-efficient approach, puncturing URLLC data onto eMBB can degrade eMBB’s performance. Most state-of-the-art research addressing this problem employs raw eMBB data throughput as the performance metric. This is inadequate because, after puncturing, eMBB data may or may not be successfully decoded at its receiver. This paper presents Cyrus+, a deep reinforcement learning (DRL)-based puncturing solution that employs goodput (through feedback from a receiver’s decoder), rather than estimated raw throughput, in the design of its reward function. Further, Cyrus+ is tailored specifically for the Open RAN (O-RAN) architecture and fully leverages O-RAN’s three control loops at different time scales in its DRL design. In the Non-Real-Time (Non-RT) RAN Intelligent Controller (RIC), Cyrus+ initializes the policy network that will be used in the Real-Time (RT) Open Distributed Unit (O-DU). In the Near-RT RIC, Cyrus+ refines the policy based on dynamic network conditions and feedback from the receivers. In the RT O-DU, Cyrus+ generates a puncturing codebook by considering all possible URLLC arrivals. We build a standard-compliant link-level 5G NR simulator to demonstrate the efficacy of Cyrus+. Experimental results show that Cyrus+ outperforms benchmark puncturing algorithms and meets the stringent timing requirement of 5G NR (numerology 3).
{"title":"Cyrus+: A DRL-Based Puncturing Solution to URLLC/eMBB Multiplexing in O-RAN","authors":"Ehsan Ghoreishi;Bahman Abolhassani;Yan Huang;Shiva Acharya;Wenjing Lou;Y. Thomas Hou","doi":"10.1109/TMLCN.2025.3618815","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3618815","url":null,"abstract":"Puncturing is a promising technique in 3GPP to multiplex Enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low Latency Communications (URLLC) traffic on the same 5G New Radio (NR) air interface. The essence of puncturing is to transmit URLLC packets on demand upon their arrival, by preempting the radio resources (or subcarriers) that are already allocated to eMBB traffic. Although it is considered most bandwidth efficient, puncturing URLLC data on eMBB can lead to degradation of eMBB’s performance. Most of the state-of-the-art research addressing this problem employ raw eMBB data throughput as performance metric. This is inadequate as, after puncturing, eMBB data may or may not be successfully decoded at its receiver. This paper presents Cyrus+—a deep reinforcement learning (DRL)-based puncturing solution that employs goodput (through feedback from a receiver’s decoder), rather than estimated raw throughput, in its design of reward function. Further, Cyrus+ is tailored specifically for the Open RAN (O-RAN) architecture and fully leverages O-RAN’s three control loops at different time scales in its design of DRL. In the Non-Real-Time (Non-RT) RAN Intelligent Controller (RIC), Cyrus+ initializes the policy network that will be used in the RT Open Distributed Unit (O-DU). In the Near-RT RIC, Cyrus+ refines the policy based on dynamic network conditions and feedback from the receivers. In the RT O-DU, Cyrus+ generates a puncturing codebook by considering all possible URLLC arrivals. We build a standard-compliant link-level 5G NR simulator to demonstrate the efficacy of Cyrus+. Experimental results show that Cyrus+ outperforms benchmark puncturing algorithms and meets the stringent timing requirement in 5G NR (numerology 3).","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1178-1196"},"PeriodicalIF":0.0,"publicationDate":"2025-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11195824","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145352236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
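To make the goodput-based reward described in the abstract concrete, here is a minimal Python sketch of the distinction between raw throughput and decoder-confirmed goodput. The `decode_ok` feedback flag and the per-subcarrier rate array are hypothetical interfaces introduced for illustration, not the paper’s actual reward function.

```python
import numpy as np

def puncturing_reward(embb_rates, punctured_mask, decode_ok):
    """Goodput-based reward for a DRL puncturing agent (illustrative sketch).

    embb_rates:     per-subcarrier raw rates allocated to eMBB (bits/slot)
    punctured_mask: boolean array, True where URLLC preempted a subcarrier
    decode_ok:      receiver-decoder feedback; False means the punctured
                    eMBB transport block failed to decode
    """
    surviving = embb_rates[~punctured_mask].sum()
    # Raw throughput would credit `surviving` regardless of decodability;
    # goodput credits it only if the receiver actually decoded the block.
    return surviving if decode_ok else 0.0

# toy usage: 12 subcarriers, 3 punctured by a URLLC arrival, decode failed
rates = np.full(12, 8.0)
mask = np.zeros(12, dtype=bool)
mask[[2, 5, 9]] = True
print(puncturing_reward(rates, mask, decode_ok=False))  # -> 0.0
```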
SPARK: A Scalable Peer-to-Peer Asynchronous Resilient Framework for Federated Learning in Non-Terrestrial Networks
Pub Date : 2025-10-06 DOI: 10.1109/TMLCN.2025.3617883
Guangsheng Yu;Ying He;Eryk Dutkiewicz;Bathiya Senanayake;Manik Attygalle
Federated Learning (FL) faces significant challenges when applied in 6G (sixth-generation wireless technology) Non-Terrestrial Network (NTN) environments, including heterogeneous interference, stringent requirements for real-time model responsiveness, and a limited ability to collect comprehensive datasets due to the absence of a global network view. In this paper, we propose Spark, a novel framework designed to enable a fully decentralized FL process tailored for NTN. By leveraging a directed acyclic graph (DAG)-based architecture, Spark addresses the unique demands of NTN through asynchronous updates, localized learning prioritization, and adaptive aggregation strategies, ensuring robust performance under dynamic and constrained conditions. Extensive experiments demonstrate that Spark outperforms other FL frameworks and effectively addresses the key challenges of NTN-based FL through its asynchronous design: ensuring resilience under communication delays, enhancing responsiveness via timely local updates, and improving coverage through altitude-aware aggregation that leverages diverse, high-altitude knowledge.
{"title":"SPARK: A Scalable Peer-to-Peer Asynchronous Resilient Framework for Federated Learning in Non-Terrestrial Networks","authors":"Guangsheng Yu;Ying He;Eryk Dutkiewicz;Bathiya Senanayake;Manik Attygalle","doi":"10.1109/TMLCN.2025.3617883","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3617883","url":null,"abstract":"Federated Learning (FL) faces significant challenges when applied in 6G (sixth-generation wireless technology) Non-Terrestrial Network (NTN) environments, including heterogeneous interference, stringent requirements for real-time model responsiveness, and limited ability to collect comprehensive datasets due to the absence of a global network view. In this paper, we propose S<sc>park</small>, a novel framework designed to enable a fully decentralized FL process tailored for NTN. By leveraging a Directed acyclic graph (DAG)-based architecture, S<sc>park</small> addresses the unique demands of NTN through asynchronous updates, localized learning prioritization, and adaptive aggregation strategies, ensuring robust performance under dynamic and constrained conditions. Extensive experiments demonstrate that S<sc>park</small> outperforms other FL frameworks and effectively addresses the key challenges of NTN-based FL through its asynchronous design–ensuring resilience under communication delays, enhancing responsiveness via timely local updates, and improving coverage through altitude-aware aggregation that leverages diverse, high-altitude knowledge.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1092-1107"},"PeriodicalIF":0.0,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11193784","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145315527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
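A minimal sketch of what DAG-based asynchronous aggregation with altitude-aware weighting could look like. The exponential altitude kernel, the `tau` scale, and the 50/50 blend with local weights are illustrative assumptions of our own, not SPARK’s published rule.

```python
import numpy as np

def dag_aggregate(local_w, parent_updates, altitudes, self_alt, tau=5000.0):
    """Asynchronous DAG-style aggregation (illustrative sketch).

    local_w:        this node's current model weights (1-D array)
    parent_updates: list of weight arrays referenced as DAG parents
    altitudes:      altitude (m) of the node that produced each parent update
    self_alt:       this node's own altitude
    Assumed rule: parents at closer altitudes get larger aggregation weights.
    """
    if not parent_updates:
        return local_w
    w = np.exp(-np.abs(np.asarray(altitudes) - self_alt) / tau)
    w = w / w.sum()
    merged = sum(wi * ui for wi, ui in zip(w, parent_updates))
    return 0.5 * local_w + 0.5 * merged  # keep half of the local knowledge

# toy usage: one low-altitude and one high-altitude parent update
print(dag_aggregate(np.zeros(3), [np.ones(3), 2 * np.ones(3)],
                    altitudes=[500.0, 20000.0], self_alt=600.0))
```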
A Reinforcement Learning-Based Hybrid Scheduling Mechanism for mmWave Networks
Pub Date : 2025-09-26 DOI: 10.1109/TMLCN.2025.3615119
Mine Gokce Dogan;Martina Cardone;Christina Fragouli
Millimeter-wave (mmWave) technology is expected to support next-generation wireless networks by expanding the available spectrum and supporting multi-gigabit services. While mmWave communications hold great promise, mmWave links are vulnerable to link blockages, which can severely impact their performance. This paper aims to develop resilient transmission mechanisms that suitably distribute traffic across multiple paths in mmWave networks. The main contributions include: (a) the development of proactive transmission mechanisms that build resilience against link blockages in advance while achieving a high end-to-end packet rate; (b) the design of a heuristic path selection algorithm to efficiently select (in polynomial time in the network size) multiple proactively resilient paths that have high capacity; and (c) the development of a hybrid scheduling algorithm that combines the proposed path selection algorithm with a deep reinforcement learning (DRL)-based online approach for decentralized adaptation to blockages. To achieve resilience against link blockages and to adapt the information flow through the network, a prominent Soft Actor-Critic DRL algorithm is investigated. The proposed scheduling algorithm robustly adapts to blockages and channel variations over different topologies, channels, and blockage realizations, while outperforming alternative algorithms, including a conventional congestion-control algorithm, additive increase multiplicative decrease (AIMD). Specifically, it achieves the desired packet rate in over 99% of the episodes in static networks with blockages (where up to 80% of the paths are blocked), and in 100% of the episodes in time-varying networks with blockages and link-capacity variations, compared to the 0.5% to 45% success rates achieved by the baseline methods under the same conditions.
{"title":"A Reinforcement Learning-Based Hybrid Scheduling Mechanism for mmWave Networks","authors":"Mine Gokce Dogan;Martina Cardone;Christina Fragouli","doi":"10.1109/TMLCN.2025.3615119","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3615119","url":null,"abstract":"Millimeter-wave (mmWave) technology is expected to support next-generation wireless networks by expanding the available spectrum and supporting multi-gigabit services. While mmWave communications hold great promise, mmWave links are vulnerable against link blockages, which can severely impact their performance. This paper aims to develop resilient transmission mechanisms to suitably distribute traffic across multiple paths in mmWave networks. The main contributions include: (a) the development of proactive transmission mechanisms to build resilience against link blockages in advance, while achieving a high end-to-end packet rate; (b) the design of a heuristic path selection algorithm to efficiently select (in polynomial time in the network size) multiple proactively resilient paths that have high capacity; and (c) the development of a hybrid scheduling algorithm that combines the proposed path selection algorithm with a deep reinforcement learning (DRL) based online approach for decentralized adaptation to blockages. To achieve resilience against link blockages and to adapt the information flow through the network, a prominent Soft Actor-Critic DRL algorithm is investigated. The proposed scheduling algorithm robustly adapts to blockages and channel variations over different topologies, channels, and blockage realizations while outperforming alternative algorithms which include a conventional congestion control algorithm additive increase multiplicative decrease. Specifically, it achieves the desired packet rate in over 99% of the episodes in static networks with blockages (where up to 80% of the paths are blocked), and in 100% of the episodes in time-varying networks with blockages and link capacity variations – compared to 0.5% to 45% success rates achieved by the baseline methods under the same conditions.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1121-1142"},"PeriodicalIF":0.0,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11181133","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145315526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
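The heuristic path selection step, polynomial-time selection of multiple resilient, high-capacity paths, can be approximated by repeatedly extracting the widest (maximum-bottleneck) path and removing its links. This is a sketch under our own assumptions (link-disjointness as the resilience proxy), not the paper’s exact algorithm.

```python
import heapq

def widest_path(graph, src, dst):
    """Max-bottleneck (widest) path via a Dijkstra-like search.
    graph: {u: {v: capacity}}; returns (bottleneck, path) or (0, None)."""
    best = {src: float("inf")}
    prev, heap = {}, [(-float("inf"), src)]
    while heap:
        neg_b, u = heapq.heappop(heap)
        if u == dst:  # reconstruct the path back to the source
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return -neg_b, path[::-1]
        for v, cap in graph.get(u, {}).items():
            b = min(-neg_b, cap)
            if b > best.get(v, 0):
                best[v], prev[v] = b, u
                heapq.heappush(heap, (-b, v))
    return 0, None

def select_paths(graph, src, dst, k):
    """Greedy polynomial-time sketch: repeatedly take the widest path, then
    remove its links (mutating `graph`) so the next pick is link-disjoint."""
    paths = []
    for _ in range(k):
        cap, path = widest_path(graph, src, dst)
        if not path:
            break
        paths.append((cap, path))
        for u, v in zip(path, path[1:]):
            graph[u].pop(v, None)
    return paths

g = {"s": {"a": 10, "b": 7}, "a": {"d": 8}, "b": {"d": 9}, "d": {}}
print(select_paths(g, "s", "d", k=2))  # [(8, [s,a,d]), (7, [s,b,d])]
```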
Analysis and Optimization of Wireless Multimodal Federated Learning on Modal Heterogeneity
Pub Date : 2025-09-19 DOI: 10.1109/TMLCN.2025.3611977
Xuefeng Han;Wen Chen;Jun Li;Ming Ding;Qingqing Wu;Kang Wei;Xiumei Deng;Yumeng Shao;Qiong Wu
Multimodal federated learning (MFL) is a distributed framework for training multimodal models without uploading clients’ local multimodal data, thereby effectively protecting client privacy. However, multimodal data is commonly heterogeneous across diverse clients, where each client possesses only a subset of all modalities, which renders conventional analysis results and optimization methods from unimodal federated learning inapplicable. In addition, fixed latency demands and limited communication bandwidth pose significant challenges for deploying MFL in wireless scenarios. To optimize wireless MFL performance under modal heterogeneity, this paper proposes a joint client scheduling and bandwidth allocation (JCSBA) algorithm based on a decision-level fusion architecture augmented with unimodal loss functions. Specifically, using the decision results, the unimodal loss functions are added to both the training objective and the local update loss functions to accelerate multimodal convergence and improve unimodal performance. To characterize MFL performance, we derive a closed-form upper bound related to client and modality scheduling and minimize the derived bound under latency, energy, and bandwidth constraints through JCSBA. Experimental results on multimodal datasets demonstrate that the JCSBA algorithm improves multimodal accuracy and unimodal accuracy by 4.06% and 2.73%, respectively, compared to conventional algorithms.
{"title":"Analysis and Optimization of Wireless Multimodal Federated Learning on Modal Heterogeneity","authors":"Xuefeng Han;Wen Chen;Jun Li;Ming Ding;Qingqing Wu;Kang Wei;Xiumei Deng;Yumeng Shao;Qiong Wu","doi":"10.1109/TMLCN.2025.3611977","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3611977","url":null,"abstract":"Multimodal federated learning (MFL) is a distributed framework for training multimodal models without uploading local multimodal data of clients, thereby effectively protecting client privacy. However, multimodal data is commonly heterogeneous across diverse clients, where each client possesses only a subset of all modalities, renders conventional analysis results and optimization methods in unimodal federated learning inapplicable. In addition, fixed latency demand and limited communication bandwidth pose significant challenges for deploying MFL in wireless scenarios. To optimize the wireless MFL performance on modal heterogeneity, this paper proposes a joint client scheduling and bandwidth allocation (JCSBA) algorithm based on a decision-level fusion architecture with adding a unimodal loss function. Specifically, with the decision results, the unimodal loss functions are added to both the training objective and local update loss functions to accelerate multimodal convergence and improve unimodal performance. To characterize MFL performance, we derive a closed-form upper bound related to client and modality scheduling and minimize the derived bound under the latency, energy, and bandwidth constraints through JCSBA. Experimental results on multimodal datasets demonstrate that the JCSBA algorithm improves the multimodal accuracy and the unimodal accuracy by 4.06% and 2.73%, respectively, compared to conventional algorithms.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1075-1091"},"PeriodicalIF":0.0,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11174013","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145210159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
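A small sketch of the loss construction the abstract describes: a decision-level fusion objective to which per-modality (unimodal) cross-entropy terms are added. The averaging fusion rule and the weighting hyperparameter `lam` are illustrative assumptions, not the paper’s exact formulation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def xent(logits, y):
    """Mean cross-entropy of integer labels y under the given logits."""
    return -np.log(softmax(logits)[np.arange(len(y)), y] + 1e-12).mean()

def jcsba_style_loss(unimodal_logits, y, lam=0.5):
    """Decision-level fusion loss with added unimodal terms (sketch).

    unimodal_logits: dict modality -> logits from that modality's head;
    fusion here is an assumed average over the available heads, and lam
    weights the auxiliary unimodal losses.
    """
    fused = np.mean(list(unimodal_logits.values()), axis=0)
    loss = xent(fused, y)                                   # multimodal term
    loss += lam * sum(xent(l, y) for l in unimodal_logits.values())
    return loss

# toy usage: two modalities, 4 samples, 3 classes
logits = {"audio": np.random.randn(4, 3), "video": np.random.randn(4, 3)}
print(jcsba_style_loss(logits, y=np.array([0, 1, 2, 1])))
```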
BART-FL: A Backdoor Attack-Resilient Federated Aggregation Technique for Cross-Silo Applications
Pub Date : 2025-09-18 DOI: 10.1109/TMLCN.2025.3611398
Md Jueal Mia;M. Hadi Amini
Federated Learning (FL) is a decentralized learning method that enables collaborative model training while preserving data privacy. This makes FL a promising solution in various applications, particularly in cross-silo settings such as healthcare, finance, and transportation. However, FL remains highly vulnerable to adversarial threats, especially backdoor attacks, where malicious clients inject poisoned data to manipulate global model behavior. Existing outlier detection techniques often struggle to effectively isolate such adversarial updates, compromising model integrity. To address this challenge, we propose the Backdoor Attack-Resilient Technique for Federated Learning (BART-FL), a novel lightweight defense mechanism that enhances FL security through malicious-client filtering. Our method integrates Principal Component Analysis (PCA) for dimensionality reduction, cosine similarity for measuring pairwise distances between model updates, and K-means clustering for detecting potentially malicious clients. To reliably identify the benign cluster, we introduce a multi-metric statistical voting mechanism based on the point-level mean, median absolute deviation (MAD), and cluster-level mean. This approach strengthens model resilience against adversarial manipulation by identifying and filtering malicious updates before aggregation, thereby preserving the integrity of the global model. Experimental evaluations conducted on the LISA traffic light dataset, CIFAR-10, and CIFAR-100 demonstrate the effectiveness of BART-FL in maintaining model performance across diverse FL settings. Additionally, we perform a comparative analysis against existing backdoor defense techniques, highlighting BART-FL’s ability to improve security while ensuring computational efficiency. Our results showcase the potential of BART-FL as a scalable and adversary-resilient defense mechanism for secure training in cross-silo FL applications.
{"title":"BART-FL: A Backdoor Attack-Resilient Federated Aggregation Technique for Cross-Silo Applications","authors":"Md Jueal Mia;M. Hadi Amini","doi":"10.1109/TMLCN.2025.3611398","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3611398","url":null,"abstract":"Federated Learning (FL) is a decentralized learning method that enables collaborative model training while preserving data privacy. This makes FL a promising solution in various applications, particularly in cross-silo settings such as healthcare, finance, and transportation. However, FL remains highly vulnerable to adversarial threats, especially backdoor attacks, where malicious clients inject poisoned data to manipulate global model behavior. Existing outlier detection techniques often struggle to effectively isolate such adversarial updates, compromising model integrity. To address this challenge, we propose Backdoor Attack Resilient Technique for Federated Learning (BART-FL), a novel lightweight defense mechanism that enhances FL security through malicious client filtering. Our method integrates Principal Component Analysis (PCA) for dimensionality reduction with cosine similarity for measuring pairwise distances between model updates and K-means clustering for detecting potentially malicious clients. To reliably identify the benign cluster, we introduce a multi-metric statistical voting mechanism based on point-level mean, median absolute deviation (MAD), and cluster-level mean. This approach strengthens model resilience against adversarial manipulations by identifying and filtering malicious updates before aggregation, thereby preserving the integrity of the global model. Experimental evaluations conducted on the LISA traffic light dataset, CIFAR-10, and CIFAR-100 demonstrate the effectiveness of BART-FL in maintaining model performance across diverse FL settings. Additionally, we perform a comparative analysis against existing backdoor defense techniques, highlighting BART-FL’s ability to improve security while ensuring computational efficiency. Our results showcase the potential of BART-FL as a scalable and adversary-resilient defense mechanism for secure training in cross-silo FL applications.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1311-1325"},"PeriodicalIF":0.0,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11172307","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
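Because the pipeline is spelled out in the abstract (PCA, pairwise similarity, K-means, multi-metric voting over point-level mean, MAD, and cluster-level mean), a compact sketch is possible. The scoring directions and tie-breaking below are assumptions, and the sketch uses Euclidean K-means on the PCA features rather than an explicit cosine-distance clustering.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def filter_clients(updates, n_components=5, seed=0):
    """BART-FL-style filtering sketch: PCA -> 2-way KMeans on the reduced
    updates -> keep the cluster voted benign by three statistics."""
    X = np.asarray([u.ravel() for u in updates])
    Z = PCA(n_components=min(n_components, *X.shape)).fit_transform(X)
    labels = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(Z)

    def score(c):  # smaller spread/offset -> more likely benign (assumption)
        pts = Z[labels == c]
        point_mean = np.abs(pts).mean()                    # point-level mean
        mad = np.median(np.abs(pts - np.median(pts, 0)))   # median abs. dev.
        cluster_mean = np.abs(pts.mean(axis=0)).mean()     # cluster-level mean
        return point_mean, mad, cluster_mean

    s0, s1 = score(0), score(1)
    votes = sum(a < b for a, b in zip(s0, s1))   # majority over the 3 metrics
    benign = 0 if votes >= 2 else 1
    return [i for i, l in enumerate(labels) if l == benign]

# toy usage: 8 benign updates near zero, 2 poisoned updates shifted by +3
updates = [np.random.randn(100) * 0.1 for _ in range(8)]
updates += [np.random.randn(100) * 0.1 + 3.0 for _ in range(2)]
print(filter_clients(updates))  # expected: indices 0..7
```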
Generative AI-Based Vector Quantized End-to-End Semantic Communication System for Wireless Image Transmission
Pub Date : 2025-09-09 DOI: 10.1109/TMLCN.2025.3607891
Maheshi Lokumarambage;Thushan Sivalingam;Feng Dong;Nandana Rajatheva;Anil Fernando
Semantic communication (SemCom) systems enhance transmission efficiency by conveying semantic information in lieu of raw data. However, challenges arise when designing these systems due to the need for robust semantic source coding for information representation extending beyond the training dataset, maintaining channel-agnostic performance, and ensuring robustness to channel and semantic noise. We propose a novel generative artificial intelligence (AI)-based SemCom architecture conditioned on quantized latents. The system reduces the communication overhead of the wireless channel by mapping each quantized vector to the learned codebook vectors and transmitting only the index of the quantized latent over the communication channel. The learned codebook is the shared knowledge base. The encoder is designed with a novel spatial attention mechanism based on image energy, focusing on object edges. The critic assesses the realism of generated data relative to the original distribution using the Wasserstein distance. The model introduces novel contrastive objectives at multiple levels, including pixel, latent, perceptual, and task output, tailored for noisy wireless semantic communication. We validated the proposed model for transmission quality and robustness with low-density parity-check (LDPC) coding; it outperforms the better portable graphics (BPG) baselines, specifically at low signal-to-noise ratio (SNR) levels (<5 dB). Additionally, it shows results comparable to joint source-channel coding (JSCC) with lower complexity and latency. The model is validated for human-perception and machine-perception-oriented task utility. The model effectively transmits high-resolution images without requiring additional error correction at the receiver. We propose a novel semantic-based metric to evaluate robustness to noise and task-specific semantic distortion.
{"title":"Generative AI-Based Vector Quantized End-to-End Semantic Communication System for Wireless Image Transmission","authors":"Maheshi Lokumarambage;Thushan Sivalingam;Feng Dong;Nandana Rajatheva;Anil Fernando","doi":"10.1109/TMLCN.2025.3607891","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3607891","url":null,"abstract":"Semantic communication (SemCom) systems enhance transmission efficiency by conveying semantic information in lieu of raw data. However, challenges arise when designing these systems due to the need for robust semantic source coding for information representation extending beyond the training dataset, maintaining channel-agnostic performance, and ensuring robustness to channel and semantic noise. We propose a novel generative artificial intelligence (AI) based SemCom architecture conditioned on quantized latent. The system reduces the communication overhead of the wireless channel by transmitting the index of the quantized latent over the communication channel by mapping the quantized vector to the learned codebook vectors. The learned codebook is the shared knowledge base. The encoder is designed with a novel spatial attention mechanism based on image energy, focusing on object edges. The critic assesses the realism of generated data relative to the original distribution, with the Wasserstein distance. The model introduces novel contrastive objectives at multiple levels, including pixel, latent, perceptual, and task output, tailored for noisy wireless semantic communication. We validated the proposed model for transmission quality and robustness with low-density parity-check (LDPC), which outperforms the baselines of better portable graphics (BPG), specifically at low signal-to-noise ratio (SNR) levels (<inline-formula> <tex-math>$ {lt }~ {5}$ </tex-math></inline-formula> dB). Additionally, it shows comparable results with joint source-channel coding (JSCC) with lower complexity and latency. The model is validated for human perception and machine perception-oriented task utility. The model effectively transmits high-resolution images without requiring additional error correction at the receiver. We propose a novel semantic-based matrix to evaluate the robustness to noise and task-specific semantic distortion.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1050-1074"},"PeriodicalIF":0.0,"publicationDate":"2025-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11154002","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145100350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
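The core transmission idea, sending only codebook indices of quantized latents over the channel, is easy to illustrate. The codebook size, latent dimension, and nearest-neighbor (Euclidean) quantization rule below are illustrative assumptions, not the paper’s trained components.

```python
import numpy as np

def vq_encode(latents, codebook):
    """Map each latent vector to the index of its nearest codebook entry;
    only these integer indices are sent over the channel (sketch)."""
    d = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

def vq_decode(indices, codebook):
    """Receiver looks the indices up in the shared codebook (the shared
    knowledge base) to recover the quantized latents."""
    return codebook[indices]

rng = np.random.default_rng(0)
codebook = rng.normal(size=(256, 16))   # 256 entries -> 8 bits per vector
latents = rng.normal(size=(32, 16))     # e.g., encoder output patches
idx = vq_encode(latents, codebook)      # 32 vectors x 8 bits = 32 bytes sent
print(idx[:8], vq_decode(idx, codebook).shape)
```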
Robust Defensive Cyber Agent for Multi-Adversary Defense
Pub Date : 2025-09-03 DOI: 10.1109/TMLCN.2025.3605855
Muhammad O. Farooq
Modern cyber environments are becoming increasingly complex and distributed, often organized into multiple interconnected subnets and nodes. Even relatively small-scale networks can exhibit significant security challenges due to their dynamic topologies and the diversity of potential attack vectors. In modern cyber environments, human-led defense alone is insufficient due to delayed response times, cognitive overload, and limited availability of skilled personnel, particularly in remote or resource-constrained settings. These challenges are intensified by the growing diversity of cyber threats, including adaptive and machine learning-based attacks, which demand rapid and intelligent responses. Addressing this, we propose a reinforcement learning (RL)-based framework that integrates eXtreme Gradient Boosting (XGBoost) and transformer architectures to develop robust, generalizable defensive agents. The proposed agents are evaluated against both baseline defenders trained to counter specific adversaries and hierarchical generic agents representing the current state-of-the-art. Experimental results demonstrate that the RL-XGBoost (integration of RL and XGBoost) agent consistently achieves superior performance in terms of defense accuracy and efficiency across varied adversarial strategies and network configurations. Notably, in scenarios involving changes to network topology, both RL-Transformer (RL combined with transformer architectures) and RL-XGBoost agents exhibit strong adaptability and resilience, outperforming specialized blue agents and hierarchical agents in performance consistency. In particular, the RL-Transformer variant (RL-BERT) demonstrates exceptional robustness when attacker entry points are altered, effectively capturing long-range dependencies and temporal patterns through its self-attention mechanism. Overall, these findings highlight the RL-XGBoost model’s potential as a scalable and intelligent solution for multi-adversary defense in dynamic and heterogeneous cyber environments.
{"title":"Robust Defensive Cyber Agent for Multi-Adversary Defense","authors":"Muhammad O. Farooq","doi":"10.1109/TMLCN.2025.3605855","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3605855","url":null,"abstract":"Modern cyber environments are becoming increasingly complex and distributed, often organized into multiple interconnected subnets and nodes. Even relatively small-scale networks can exhibit significant security challenges due to their dynamic topologies and the diversity of potential attack vectors. In modern cyber environments, human-led defense alone is insufficient due to delayed response times, cognitive overload, and limited availability of skilled personnel, particularly in remote or resource-constrained settings. These challenges are intensified by the growing diversity of cyber threats, including adaptive and machine learning-based attacks, which demand rapid and intelligent responses. Addressing this, we propose a reinforcement learning (RL)-based framework that integrates eXtreme Gradient Boosting (XGBoost) and transformer architectures to develop robust, generalizable defensive agents. The proposed agents are evaluated against both baseline defenders trained to counter specific adversaries and hierarchical generic agents representing the current state-of-the-art. Experimental results demonstrate that the RL-XGBoost (integration of RL and XGBoost) agent consistently achieves superior performance in terms of defense accuracy and efficiency across varied adversarial strategies and network configurations. Notably, in scenarios involving changes to network topology, both RL-Transformer (RL combined with transformer architectures) and RL-XGBoost agents exhibit strong adaptability and resilience, outperforming specialized blue agents and hierarchical agents in performance consistency. In particular, the RL-Transformer variant (RL-BERT) demonstrates exceptional robustness when attacker entry points are altered, effectively capturing long-range dependencies and temporal patterns through its self-attention mechanism. Overall, these findings highlight the RL-XGBoost model’s potential as a scalable and intelligent solution for multi-adversary defense in dynamic and heterogeneous cyber environments.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1030-1049"},"PeriodicalIF":0.0,"publicationDate":"2025-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11150430","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145036224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
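The abstract does not specify how RL and XGBoost are integrated, so the following is one plausible reading, offered purely as a hedged sketch: fitted Q-iteration with an `XGBRegressor` as the Q-function approximator over (state, action) pairs.

```python
import numpy as np
from xgboost import XGBRegressor

def fitted_q_iteration(transitions, n_actions, iters=5, gamma=0.95):
    """Fitted Q-iteration with a gradient-boosted Q-function (sketch).
    transitions: list of (state_vec, action, reward, next_state_vec)."""
    S = np.array([t[0] for t in transitions])
    A = np.array([t[1] for t in transitions])
    R = np.array([t[2] for t in transitions])
    S2 = np.array([t[3] for t in transitions])
    X = np.column_stack([S, A])          # regress Q on (state, action)
    model, y = None, R.copy()
    for _ in range(iters):
        model = XGBRegressor(n_estimators=50, max_depth=4).fit(X, y)
        # Bellman backup: bootstrap with the greedy next-state value
        q_next = np.stack([model.predict(np.column_stack(
            [S2, np.full(len(S2), a)])) for a in range(n_actions)])
        y = R + gamma * q_next.max(axis=0)
    return model

# toy usage: random 4-dim defender states, 3 defensive actions
rng = np.random.default_rng(1)
trans = [(rng.normal(size=4), rng.integers(3), rng.normal(),
          rng.normal(size=4)) for _ in range(200)]
q = fitted_q_iteration(trans, n_actions=3)  # greedy action = argmax over a
```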
Reverse Engineering Segment Routing Policies and Link Costs With Inverse Reinforcement Learning and EM
Pub Date : 2025-08-13 DOI: 10.1109/TMLCN.2025.3598739
Kai Wang;Chee Wei Tan
Network routing is a core functionality in computer networks that holds significant potential for integrating newly developed techniques with minimal software effort through the use of Software-Defined Networking (SDN). However, with the continual expansion of the Internet, traditional destination-based IP routing techniques struggle to meet Quality-of-Service (QoS) requirements with SDN alone. To address these challenges, a modern network routing technique called Segment Routing (SR) has been designed to simplify traffic engineering and make networks more flexible and scalable. However, the existing SR routing algorithms used by major Internet Service Providers (ISPs) are mostly proprietary, and their details remain unknown. This study delves into the inverse problem for a general type of SR and attempts to infer the SR policies from expert traffic traces. To this end, we propose MoME, a Mixture-of-Experts (MoE) model using the Maximum Entropy Inverse Reinforcement Learning (MaxEnt-IRL) framework that is capable of incorporating diverse features (e.g., router, link, and context) and capturing complex relationships in the link cost, in combination with an Expectation-Maximization (EM)-based iterative algorithm that jointly infers link costs and SR policy classes. Experimental results on real-world ISP topologies and Traffic Matrices (TMs) demonstrate the superior performance of our approach in jointly classifying SR policies and inferring link cost functions. Specifically, our model achieves classification accuracies of 0.90, 0.81, 0.75, and 0.57 on datasets that contain five SR policies over the small-scale Abilene and GÉANT, the medium-scale Exodus, and the large-scale Sprintlink network topologies, respectively.
{"title":"Reverse Engineering Segment Routing Policies and Link Costs With Inverse Reinforcement Learning and EM","authors":"Kai Wang;Chee Wei Tan","doi":"10.1109/TMLCN.2025.3598739","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3598739","url":null,"abstract":"Network routing is a core functionality in computer networks that holds significant potential for integrating newly developed techniques with minimal software effort through the use of Software-Defined Networking (SDN). However, with the ever-expansion of the Internet, traditional destination-based IP routing techniques struggle to meet Quality-of-Service (QoS) requirements with SDN alone. To address these challenges, a modern network routing technique called Segment Routing (SR) has been designed to simplify traffic engineering and make networks more flexible and scalable. However, existing SR routing algorithms used by major Internet Service Providers (ISPs) are mostly proprietary, whose details remain unknown. This study delves into the inverse problem of a general type of SR and attempts to infer the SR policies given expert traffic traces. To this end, we propose MoME, a Mixture-of-Experts (MoE) model using the Maximum Entropy Inverse Reinforcement Learning (MaxEnt-IRL) framework that is capable of incorporating diverse features (e.g., router, link and context) and capturing complex relationships in the link cost, in combination with an Expectation-Maximization (EM) based iterative algorithm that jointly infers link costs and SR policy classes. Experimental results on real-world ISP topologies and Traffic Matrices (TMs) demonstrate the superior performance of our approach in jointly classifying SR policies and inferring link cost functions. Specifically, our model achieves classification accuracies of 0.90, 0.81, 0.75, and 0.57 on datasets that contain five SR policies over the small-scale Abilene and GÉANT, the medium-scale Exodus, and the large-scale Sprintlink network topologies, respectively.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1014-1029"},"PeriodicalIF":0.0,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11124467","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144891104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
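The EM-based joint inference can be skeletonized as follows. In the paper, the per-class trajectory log-likelihoods would come from MaxEnt-IRL under each class’s learned link costs; this sketch treats them as given and shows only the E-step responsibilities and the M-step prior update.

```python
import numpy as np

def em_policy_classes(traj_loglik, iters=50):
    """EM skeleton for mixing SR policy classes (sketch).

    traj_loglik: (N trajectories x K classes) log-likelihood matrix,
    assumed precomputed here. Returns class priors and responsibilities.
    """
    N, K = traj_loglik.shape
    pi = np.full(K, 1.0 / K)
    for _ in range(iters):
        # E-step: responsibility of class k for trajectory n (log-sum-exp safe)
        log_r = traj_loglik + np.log(pi)
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update class priors (the link-cost refit per class would
        # also happen here in the full algorithm)
        pi = r.mean(axis=0)
    return pi, r

ll = np.log(np.random.dirichlet([1, 1, 1], size=20))  # toy log-likelihoods
pi, resp = em_policy_classes(ll)
print(pi.round(3))
```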
Explainable AI for Enhancing Efficiency of DL-Based Channel Estimation
Pub Date : 2025-08-06 DOI: 10.1109/TMLCN.2025.3596548
Abdul Karim Gizzini;Yahia Medjahdi;Ali J. Ghandour;Laurent Clavier
The support of artificial intelligence (AI)-based decision-making is a key element in future 6G networks. Moreover, AI is widely employed in critical applications such as autonomous driving and medical diagnosis. In such applications, using AI as black-box models is risky and challenging. Hence, it is crucial to understand and trust the decisions taken by these models. Tackling this issue can be achieved by developing explainable AI (XAI) schemes that aim to explain the logic behind the black-box model behavior and thus ensure its efficient and safe deployment. Highlighting the relevant inputs the black-box model uses to accomplish the desired prediction is essential towards ensuring its interpretability. Recently, we proposed a novel perturbation-based feature selection framework called XAI-CHEST, oriented toward channel estimation in wireless communications. This manuscript provides the detailed theoretical foundations of the XAI-CHEST framework. In particular, we derive the analytical expressions of the XAI-CHEST loss functions and the noise-threshold fine-tuning optimization problem. Hence, the designed XAI-CHEST framework delivers a smart, low-complexity, one-shot input feature selection methodology for high-dimensional model inputs that can further improve overall performance while optimizing the architecture of the employed model. Simulation results show that the XAI-CHEST framework outperforms classical feature selection XAI schemes such as local interpretable model-agnostic explanations (LIME) and Shapley additive explanations (SHAP), mainly in terms of interpretability resolution as well as providing a better performance-complexity trade-off.
{"title":"Explainable AI for Enhancing Efficiency of DL-Based Channel Estimation","authors":"Abdul Karim Gizzini;Yahia Medjahdi;Ali J. Ghandour;Laurent Clavier","doi":"10.1109/TMLCN.2025.3596548","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3596548","url":null,"abstract":"The support of artificial intelligence (AI) based decision-making is a key element in future 6G networks. Moreover, AI is widely employed in critical applications such as autonomous driving and medical diagnosis. In such applications, using AI as black-box models is risky and challenging. Hence, it is crucial to understand and trust the decisions taken by these models. Tackling this issue can be achieved by developing explainable AI (XAI) schemes that aim to explain the logic behind the black-box model behavior, and thus, ensure its efficient and safe deployment. Highlighting the relevant inputs the black-box model uses to accomplish the desired prediction is essential towards ensuring its interpretability. Recently, we proposed a novel perturbation-based feature selection framework called XAI-CHEST and oriented toward channel estimation in wireless communications. This manuscript provides the detailed theoretical foundations of the XAI-CHEST framework. In particular, we derive the analytical expressions of the XAI-CHEST loss functions and the noise threshold fine-tuning optimization problem. Hence the designed XAI-CHEST delivers a smart low-complex one-shot input feature selection methodology for high-dimensional model input that can further improve the overall performance while optimizing the architecture of the employed model. Simulation results show that the XAI-CHEST framework outperforms the classical feature selection XAI schemes such as local interpretable model-agnostic explanations (LIME) and shapley additive explanations (SHAP), mainly in terms of interpretability resolution as well as providing better performance-complexity trade-off.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"976-996"},"PeriodicalIF":0.0,"publicationDate":"2025-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11115091","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144858638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
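As a generic illustration of perturbation-based feature selection, the family XAI-CHEST belongs to, not its exact loss functions, one can corrupt one input feature at a time and score features by the resulting loss increase. The noise model and scoring rule below are assumptions.

```python
import numpy as np

def perturbation_relevance(model, x, y_true, noise_std=1.0, trials=20, seed=0):
    """Score each input feature by how much corrupting it with noise
    increases the squared error (generic perturbation-based sketch).
    Features whose corruption hurts most are the ones the model relies on."""
    rng = np.random.default_rng(seed)
    base = np.mean((model(x) - y_true) ** 2)
    scores = np.zeros(x.shape[-1])
    for j in range(x.shape[-1]):
        for _ in range(trials):
            xp = x.copy()
            xp[..., j] += rng.normal(0, noise_std, size=xp[..., j].shape)
            scores[j] += np.mean((model(xp) - y_true) ** 2) - base
    return scores / trials

# toy model that only uses features 0 and 2; their scores dominate
model = lambda x: x[..., 0] + 2 * x[..., 2]
x = np.random.default_rng(1).normal(size=(64, 4))
print(perturbation_relevance(model, x, model(x)).round(2))
```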
Out-of-Distribution in Image Semantic Communication: A Solution With Multimodal Large Language Models
Pub Date : 2025-08-05 DOI: 10.1109/TMLCN.2025.3595841
Feifan Zhang;Yuyang Du;Kexin Chen;Yulin Shao;Soung Chang Liew
Semantic communication is a promising technology for next-generation wireless networks. However, the out-of-distribution (OOD) problem, where a pre-trained machine learning (ML) model is applied to unseen tasks that are outside the distribution of its training data, may compromise the integrity of semantic compression. This paper explores the use of multi-modal large language models (MLLMs) to address the OOD issue in image semantic communication. We propose a novel “Plan A - Plan B” framework that leverages the broad knowledge and strong generalization ability of an MLLM to assist a conventional ML model when the latter encounters an OOD input in the semantic encoding process. Furthermore, we propose a Bayesian optimization scheme that reshapes the probability distribution of the MLLM’s inference process based on the contextual information of the image. The optimization scheme significantly enhances the MLLM’s performance in semantic compression by 1) filtering out irrelevant vocabulary in the original MLLM output; and 2) using contextual similarities between prospective answers of the MLLM and the background information as prior knowledge to modify the MLLM’s probability distribution during inference. Further, at the receiver side of the communication system, we put forth a “generate-criticize” framework that utilizes the cooperation of multiple MLLMs to enhance the reliability of image reconstruction.
{"title":"Out-of-Distribution in Image Semantic Communication: A Solution With Multimodal Large Language Models","authors":"Feifan Zhang;Yuyang Du;Kexin Chen;Yulin Shao;Soung Chang Liew","doi":"10.1109/TMLCN.2025.3595841","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3595841","url":null,"abstract":"Semantic communication is a promising technology for next-generation wireless networks. However, the out-of-distribution (OOD) problem, where a pre-trained machine learning (ML) model is applied to unseen tasks that are outside the distribution of its training data, may compromise the integrity of semantic compression. This paper explores the use of multi-modal large language models (MLLMs) to address the OOD issue in image semantic communication. We propose a novel “Plan A - Plan B” framework that leverages the broad knowledge and strong generalization ability of an MLLM to assist a conventional ML model when the latter encounters an OOD input in the semantic encoding process. Furthermore, we propose a Bayesian optimization scheme that reshapes the probability distribution of the MLLM’s inference process based on the contextual information of the image. The optimization scheme significantly enhances the MLLM’s performance in semantic compression by 1) filtering out irrelevant vocabulary in the original MLLM output; and 2) using contextual similarities between prospective answers of the MLLM and the background information as prior knowledge to modify the MLLM’s probability distribution during inference. Further, at the receiver side of the communication system, we put forth a “generate-criticize” framework that utilizes the cooperation of multiple MLLMs to enhance the reliability of image reconstruction.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"997-1013"},"PeriodicalIF":0.0,"publicationDate":"2025-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11113346","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144858657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
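Two pieces of the proposed design lend themselves to a short sketch: the “Plan A - Plan B” fallback and the context-based reshaping of the MLLM’s output distribution. The OOD detector, its threshold, and the log-prior form of the context term are assumptions introduced for illustration, not interfaces from the paper.

```python
import numpy as np

def semantic_encode(x, ml_encoder, mllm_encoder, ood_score, threshold=0.8):
    """'Plan A - Plan B' control flow (sketch): use the conventional ML
    encoder (Plan A) unless an OOD detector flags the input, in which case
    fall back to the MLLM (Plan B)."""
    return mllm_encoder(x) if ood_score(x) > threshold else ml_encoder(x)

def context_reweight(token_logprobs, context_sim, alpha=1.0):
    """Bayesian-style reshaping of MLLM token probabilities (sketch):
    treat context similarity as a log-prior added to the model's
    log-likelihoods, then renormalize. Tokens irrelevant to the image
    context are suppressed.
    """
    post = token_logprobs + alpha * np.log(context_sim + 1e-12)
    post -= post.max()
    p = np.exp(post)
    return p / p.sum()

logp = np.log(np.array([0.4, 0.3, 0.2, 0.1]))  # MLLM proposal over candidates
sim = np.array([0.1, 0.9, 0.05, 0.9])          # context-relevance prior
print(context_reweight(logp, sim).round(3))    # mass shifts to relevant tokens
```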