Pub Date: 2025-08-13. DOI: 10.1109/TMLCN.2025.3598739
Kai Wang;Chee Wei Tan
Network routing is a core functionality in computer networks that holds significant potential for integrating newly developed techniques with minimal software effort through the use of Software-Defined Networking (SDN). However, as the Internet continues to expand, traditional destination-based IP routing techniques struggle to meet Quality-of-Service (QoS) requirements with SDN alone. To address these challenges, a modern network routing technique called Segment Routing (SR) has been designed to simplify traffic engineering and make networks more flexible and scalable. However, existing SR routing algorithms used by major Internet Service Providers (ISPs) are mostly proprietary, and their details remain unknown. This study addresses the inverse problem of a general type of SR and attempts to infer the SR policies given expert traffic traces. To this end, we propose MoME, a Mixture-of-Experts (MoE) model using the Maximum Entropy Inverse Reinforcement Learning (MaxEnt-IRL) framework that is capable of incorporating diverse features (e.g., router, link and context) and capturing complex relationships in the link cost, in combination with an Expectation-Maximization (EM) based iterative algorithm that jointly infers link costs and SR policy classes. Experimental results on real-world ISP topologies and Traffic Matrices (TMs) demonstrate the superior performance of our approach in jointly classifying SR policies and inferring link cost functions. Specifically, our model achieves classification accuracies of 0.90, 0.81, 0.75, and 0.57 on datasets that contain five SR policies over the small-scale Abilene and GÉANT, the medium-scale Exodus, and the large-scale Sprintlink network topologies, respectively.
Title: Reverse Engineering Segment Routing Policies and Link Costs With Inverse Reinforcement Learning and EM | Authors: Kai Wang; Chee Wei Tan | Journal: IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 1014-1029 | Published: 2025-08-13 | DOI: https://doi.org/10.1109/TMLCN.2025.3598739 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11124467
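The EM-based iteration mentioned in the MoME abstract above alternates between assigning traces to SR policy classes and updating the model. The following is only a minimal sketch of that generic EM pattern, not the authors' implementation: the function names are hypothetical, and the per-class trace log-likelihoods, which in a MaxEnt-IRL model would come from the learned link costs, are assumed to be given as plain numbers.

```python
import math

def e_step(log_liks, log_priors):
    """Posterior responsibility of each policy class for one trace.

    log_liks[k]   : log-likelihood of the trace under class k (assumed given)
    log_priors[k] : log of the current mixing weight of class k
    """
    scores = [ll + lp for ll, lp in zip(log_liks, log_priors)]
    m = max(scores)  # subtract the max before exponentiating, for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def m_step_priors(responsibilities):
    """Update mixing weights as the mean responsibility per class."""
    n = len(responsibilities)
    k = len(responsibilities[0])
    return [sum(r[j] for r in responsibilities) / n for j in range(k)]
```

In a full model the M-step would also refit each class's link-cost function using the responsibilities as weights; that part is omitted here because the abstract does not specify it.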
Pub Date: 2025-08-06. DOI: 10.1109/TMLCN.2025.3596548
Abdul Karim Gizzini;Yahia Medjahdi;Ali J. Ghandour;Laurent Clavier
The support of artificial intelligence (AI) based decision-making is a key element in future 6G networks. Moreover, AI is widely employed in critical applications such as autonomous driving and medical diagnosis. In such applications, using AI as black-box models is risky and challenging. Hence, it is crucial to understand and trust the decisions taken by these models. Tackling this issue can be achieved by developing explainable AI (XAI) schemes that aim to explain the logic behind the black-box model behavior, and thus, ensure its efficient and safe deployment. Highlighting the relevant inputs the black-box model uses to accomplish the desired prediction is essential towards ensuring its interpretability. Recently, we proposed a novel perturbation-based feature selection framework called XAI-CHEST, oriented toward channel estimation in wireless communications. This manuscript provides the detailed theoretical foundations of the XAI-CHEST framework. In particular, we derive the analytical expressions of the XAI-CHEST loss functions and the noise threshold fine-tuning optimization problem. Hence, the designed XAI-CHEST delivers a smart, low-complexity, one-shot input feature selection methodology for high-dimensional model input that can further improve the overall performance while optimizing the architecture of the employed model. Simulation results show that the XAI-CHEST framework outperforms the classical feature selection XAI schemes such as local interpretable model-agnostic explanations (LIME) and Shapley additive explanations (SHAP), mainly in terms of interpretability resolution as well as providing a better performance-complexity trade-off.
Title: Explainable AI for Enhancing Efficiency of DL-Based Channel Estimation | Authors: Abdul Karim Gizzini; Yahia Medjahdi; Ali J. Ghandour; Laurent Clavier | Journal: IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 976-996 | Published: 2025-08-06 | DOI: https://doi.org/10.1109/TMLCN.2025.3596548 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11115091
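To make the idea of perturbation-based feature selection in the XAI-CHEST abstract above more concrete, here is a generic sensitivity sketch. It is an assumption-laden stand-in rather than the XAI-CHEST method itself: the abstract describes dedicated loss functions and a fine-tuned noise threshold, whereas this sketch simply adds Gaussian noise to one input at a time and keeps the inputs whose perturbation changes the model output. All names (`perturbation_relevance`, `select_features`, `noise_std`) are hypothetical.

```python
import random

def perturbation_relevance(model, inputs, noise_std=1.0, seed=0):
    """Mean squared output change when each feature is perturbed alone."""
    rng = random.Random(seed)  # seeded so repeated runs give the same scores
    n_features = len(inputs[0])
    baseline = [model(x) for x in inputs]
    scores = []
    for j in range(n_features):
        total = 0.0
        for x, y0 in zip(inputs, baseline):
            noisy = list(x)
            noisy[j] += rng.gauss(0.0, noise_std)  # perturb feature j only
            total += (model(noisy) - y0) ** 2
        scores.append(total / len(inputs))
    return scores

def select_features(scores, threshold):
    """Indices of features whose relevance score exceeds the threshold."""
    return [j for j, s in enumerate(scores) if s > threshold]
```

A feature the model ignores receives a score of exactly zero under this scheme, so any positive threshold removes it.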
Semantic communication is a promising technology for next-generation wireless networks. However, the out-of-distribution (OOD) problem, where a pre-trained machine learning (ML) model is applied to unseen tasks that are outside the distribution of its training data, may compromise the integrity of semantic compression. This paper explores the use of multi-modal large language models (MLLMs) to address the OOD issue in image semantic communication. We propose a novel “Plan A - Plan B” framework that leverages the broad knowledge and strong generalization ability of an MLLM to assist a conventional ML model when the latter encounters an OOD input in the semantic encoding process. Furthermore, we propose a Bayesian optimization scheme that reshapes the probability distribution of the MLLM’s inference process based on the contextual information of the image. The optimization scheme significantly enhances the MLLM’s performance in semantic compression by 1) filtering out irrelevant vocabulary in the original MLLM output; and 2) using contextual similarities between prospective answers of the MLLM and the background information as prior knowledge to modify the MLLM’s probability distribution during inference. Further, at the receiver side of the communication system, we put forth a “generate-criticize” framework that utilizes the cooperation of multiple MLLMs to enhance the reliability of image reconstruction.
Title: Out-of-Distribution in Image Semantic Communication: A Solution With Multimodal Large Language Models | Authors: Feifan Zhang; Yuyang Du; Kexin Chen; Yulin Shao; Soung Chang Liew | Journal: IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 997-1013 | Published: 2025-08-05 | DOI: https://doi.org/10.1109/TMLCN.2025.3595841 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11113346
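The two steps listed in the abstract above (filtering irrelevant vocabulary, then using contextual similarity as prior knowledge) can be illustrated with a plain Bayes-rule reweighting. This is a sketch under stated assumptions, not the paper's scheme: the MLLM's candidate probabilities and the context-derived prior are assumed to be available as dictionaries, and the names `reweight`, `model_probs`, `context_prior`, and `allowed` are hypothetical.

```python
def reweight(model_probs, context_prior, allowed=None):
    """Posterior over candidate words: model probability times context prior.

    model_probs   : {word: probability from the model} (assumed given)
    context_prior : {word: prior weight from contextual similarity}
    allowed       : optional set of words kept after vocabulary filtering
    """
    # Step 1: drop candidates outside the allowed vocabulary.
    candidates = {
        w: p for w, p in model_probs.items()
        if allowed is None or w in allowed
    }
    # Step 2: multiply by the prior and renormalize.
    unnorm = {w: p * context_prior.get(w, 0.0) for w, p in candidates.items()}
    z = sum(unnorm.values())
    if z == 0.0:
        raise ValueError("no candidate has non-zero posterior mass")
    return {w: v / z for w, v in unnorm.items()}
```

Words missing from the prior are given zero weight here; a softer default would be an equally reasonable design choice.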
The growing popularity of cellular networks among users, primarily due to affordable prices and high speeds, has escalated the need for strategic capacity planning to ensure a seamless end-user experience and profitable returns on network investments. Traditional capacity planning methods rely on static analysis of network parameters with the aim of minimizing the CAPEX and the OPEX. However, to address the evolving dynamics of cellular networks, this paper advocates for a data-driven approach that considers user behavioral analysis in the planning process to make it proactive and adaptive. We introduce a Hierarchical Feature-based Time Series Clustering (HFTSC) approach that organizes clustering in a multi-level tree structure. Each level addresses a specific aspect of time series data using focused features, enabling explainable clustering. The proposed approach assigns labels to clusters based on the time series properties targeted at each level, generating annotated clusters while applying unsupervised clustering methods. To evaluate the effectiveness of HFTSC, we conduct a comprehensive case study using real-world data from thousands of network elements. Our evaluation examines the identified clusters from analytical and geographical perspectives, focusing on supporting network planners in data-informed decision-making and analysis. Finally, we perform an extensive comparison with several baseline methods to reflect the practical advantages of our approach in capacity planning and optimization.
Title: A Hierarchical Feature-Based Time Series Clustering Approach for Data-Driven Capacity Planning of Cellular Networks | Authors: Vineeta Jain; Anna Richter; Vladimir Fokow; Mathias Schweigel; Ulf Wetzker; Andreas Frotzscher | Journal: IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 921-947 | Published: 2025-08-04 | DOI: https://doi.org/10.1109/TMLCN.2025.3595125 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11108703
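The multi-level tree structure described in the HFTSC abstract above, where each level targets one property of the time series and contributes a label, might look roughly like the sketch below. It is illustrative only: a median split on a single scalar feature stands in for whatever unsupervised clustering method is used at each level, and the level names and features (`load` via the mean, `variability` via the standard deviation) are assumptions, not taken from the paper.

```python
from statistics import mean, median, pstdev

def hierarchical_cluster(series_by_id, levels):
    """Return {id: tuple of labels, one label per level visited}.

    levels: list of (name, feature_fn); feature_fn maps a series to a scalar.
    """
    def recurse(ids, depth):
        if depth == len(levels) or len(ids) < 2:
            return {i: () for i in ids}
        name, feature = levels[depth]
        values = {i: feature(series_by_id[i]) for i in ids}
        cut = median(values.values())  # stand-in for a real clustering step
        out = {}
        for label, group in (
            (f"low-{name}", [i for i in ids if values[i] <= cut]),
            (f"high-{name}", [i for i in ids if values[i] > cut]),
        ):
            for i, rest in recurse(group, depth + 1).items():
                out[i] = (label,) + rest
        return out

    return recurse(list(series_by_id), 0)

# Usage: two levels, first by average load, then by variability.
cells = {
    "a": [1, 1, 1, 1],
    "b": [1, 2, 1, 2],
    "c": [10, 10, 10, 10],
    "d": [5, 15, 5, 15],
}
labels = hierarchical_cluster(cells, [("load", mean), ("variability", pstdev)])
```

Because every cluster carries the labels of the levels that produced it, the result is annotated without any supervised step, which is the explainability property the abstract emphasizes.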
Pub Date: 2025-07-28. DOI: 10.1109/TMLCN.2025.3593184
Hang Zou;Qiyang Zhao;Yu Tian;Lina Bariah;Faouzi Bader;Thierry Lestable;Merouane Debbah
The emergent field of Large Language Models (LLMs) has significant potential to revolutionize how future telecom networks are designed and operated. However, mainstream LLMs lack the specialized knowledge required to understand and operate within the highly technical telecom domain. In this paper, we introduce TelecomGPT, the first telecom-specific LLM, built through a systematic adaptation pipeline designed to enhance general-purpose LLMs for telecom applications. To achieve this, we curate comprehensive telecom-specific datasets, including pre-training datasets, instruction datasets, and preference datasets. These datasets are leveraged for continual pre-training, instruction tuning, and alignment tuning, respectively. Additionally, due to the lack of widely accepted evaluation benchmarks that are tailored for the telecom domain, we propose three novel LLM-Telecom evaluation benchmarks, namely, Telecom Math Modeling, Telecom Open QnA, and Telecom Code Tasks. These new benchmarks provide a holistic evaluation of the capabilities of LLMs in telecom math modeling, open-ended question answering, code generation, infilling, summarization and analysis. Using the curated datasets, our fine-tuned LLM, TelecomGPT, significantly outperforms general-purpose state-of-the-art (SOTA) LLMs, including GPT-4, Llama-3 and Mistral, particularly in Telecom Math Modeling benchmarks. Additionally, it achieves comparable performance across various evaluation benchmarks, such as TeleQnA, 3GPP technical document classification, telecom code summarization, generation, and infilling. This work establishes a new foundation for integrating LLMs into telecom systems, paving the way for AI-powered advancements in network operations.
Title: TelecomGPT: A Framework to Build Telecom-Specific Large Language Models | Authors: Hang Zou; Qiyang Zhao; Yu Tian; Lina Bariah; Faouzi Bader; Thierry Lestable; Merouane Debbah | Journal: IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 948-975 | Published: 2025-07-28 | DOI: https://doi.org/10.1109/TMLCN.2025.3593184 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11097898
Pub Date: 2025-07-25. DOI: 10.1109/TMLCN.2025.3592658
Pranshav Gajjar;Vijay K. Shah
Despite the transformative impact of Large Language Models (LLMs) across critical domains such as healthcare, customer service, and business marketing, their integration into Open Radio Access Networks (O-RAN) remains limited. This gap is primarily due to the absence of domain-specific foundational models, with existing solutions often relying on general-purpose LLMs that fail to address the unique challenges and technical intricacies of O-RAN. To bridge this gap, we introduce ORANSight-2.0 (O-RAN Insights), a pioneering initiative to develop specialized foundational LLMs tailored for O-RAN. Built on 18 models spanning five open-source LLM frameworks—Mistral, Qwen, Llama, Phi, and Gemma—ORANSight-2.0 fine-tunes models ranging from 1B to 70B parameters, significantly reducing reliance on proprietary, closed-source models while enhancing performance in O-RAN-specific tasks. At the core of ORANSight-2.0 is RANSTRUCT, a novel Retrieval-Augmented Generation (RAG)-based instruction-tuning framework that employs two LLM agents—a Mistral-based Question Generator and a Qwen-based Answer Generator—to create high-quality instruction-tuning datasets. The generated dataset is then used to fine-tune the 18 pre-trained open-source LLMs via QLoRA. To evaluate ORANSight-2.0, we introduce srsRANBench, a novel benchmark designed for code generation and codebase understanding in the context of srsRAN, a widely used 5G O-RAN stack. Additionally, we leverage ORAN-Bench-13K, an existing benchmark for assessing O-RAN-specific knowledge. Our comprehensive evaluations demonstrate that ORANSight-2.0 models outperform general-purpose and closed-source models, such as ChatGPT-4o and Gemini, by 5.421% on ORANBench and 18.465% on srsRANBench, achieving superior performance while maintaining lower computational and energy costs. 
We also experiment with RAG-augmented variants of ORANSight-2.0 models and observe that RAG augmentation improves performance by an average of 6.35% across benchmarks, achieving the best overall cumulative score of 0.854, which is 12.37% better than the leading closed-source alternative. We thoroughly evaluate the energy characteristics of ORANSight-2.0, demonstrating its efficiency in training, inference, and inference with RAG augmentation, ensuring optimal performance while maintaining low computational and energy costs. Additionally, the best ORANSight-2.0 configuration is compared against the available telecom LLMs, where our proposed model outperformed them with an average improvement of 27.96%.
Title: ORANSight-2.0: Foundational LLMs for O-RAN | Authors: Pranshav Gajjar; Vijay K. Shah | Journal: IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 903-920 | Published: 2025-07-25 | DOI: https://doi.org/10.1109/TMLCN.2025.3592658 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11096935
Pub Date: 2025-07-10. DOI: 10.1109/TMLCN.2025.3587205
Amardip Kumar Singh;Kim Khoa Nguyen
The Open Radio Access Network (O-RAN) architecture, enhanced by its AI-enabled RAN Intelligent Controllers (RICs), offers a more flexible and intelligent solution to optimize next-generation networks compared to traditional mobile network architectures. By leveraging its distributed structure, which aligns seamlessly with O-RAN’s disaggregated design, Federated Learning (FL), particularly Hierarchical FL (HFL), facilitates decentralized AI model training, improving network performance, reducing resource costs, and safeguarding user privacy. However, the dynamic nature of mobile networks, particularly the frequent handovers of User Equipment (UE) between base stations, poses significant challenges for FL model training. These challenges include managing continuously changing device sets and mitigating the impact of handover delays on global model convergence. To address these challenges, we propose MHORANFed, a novel optimization algorithm tailored to minimize learning time and resource usage costs while preserving model performance within a mobility-aware hierarchical FL framework for O-RAN. Firstly, MHORANFed simplifies the upper layer of the HFL training at edge aggregate servers, which reduces the model complexity and thereby improves the learning time and the resource usage cost. Secondly, it jointly optimizes bandwidth resource allocation and the participation of handed-over local trainers to mitigate the UE handover delay in each global round. Through a rigorous convergence analysis and extensive simulation results, this work demonstrates its superiority over existing state-of-the-art methods. Furthermore, our findings underscore significant improvements in FL training efficiency, paving the way for advanced applications such as autonomous driving and augmented reality in 5G and next-generation O-RAN networks.
Title: User Handover Aware Hierarchical Federated Learning for Open RAN-Based Next-Generation Mobile Networks | Authors: Amardip Kumar Singh; Kim Khoa Nguyen | Journal: IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 848-863 | Published: 2025-07-10 | DOI: https://doi.org/10.1109/TMLCN.2025.3587205 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11075644
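For readers unfamiliar with hierarchical FL as referenced in the abstract above, the two-tier aggregation pattern can be sketched as sample-weighted averaging first at each edge server and then at the global level. This is a plain FedAvg-style illustration, not MHORANFed: it does not model handover delay, bandwidth allocation, or trainer selection, and the function names and the representation of a model as a flat list of floats are assumptions.

```python
def weighted_average(models, weights):
    """Element-wise weighted mean of equally sized parameter vectors."""
    total = sum(weights)
    dim = len(models[0])
    return [
        sum(w * m[d] for m, w in zip(models, weights)) / total
        for d in range(dim)
    ]

def hierarchical_round(edges):
    """One global round of two-tier aggregation.

    edges: list of edge groups; each group is a list of
           (local_model, n_samples) pairs from the trainers it serves.
    """
    edge_models, edge_sizes = [], []
    for group in edges:
        models = [m for m, _ in group]
        sizes = [n for _, n in group]
        edge_models.append(weighted_average(models, sizes))  # edge tier
        edge_sizes.append(sum(sizes))
    return weighted_average(edge_models, edge_sizes)  # global tier

# Usage: two edge servers, the first serving two trainers.
global_model = hierarchical_round([
    [([1.0, 0.0], 10), ([3.0, 0.0], 30)],
    [([5.0, 2.0], 60)],
])
```

With sample-count weights, the two-tier result equals a flat weighted average over all trainers, so in this idealized setting a trainer moving between edge groups changes only where its update is aggregated, not the aggregate itself.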
Pub Date : 2025-07-03 DOI: 10.1109/TMLCN.2025.3585849
Eli Garlick;Nourhan Hesham;MD. Zoheb Hassan;Imtiaz Ahmed;Anas Chaaban;MD. Jahangir Hossain
Cognitive tactical wireless networks (TWNs) require spectrum awareness to avoid interference and jamming in the communication channel and assure quality-of-service in data transmission. The ability of conventional supervised machine learning (ML) algorithms to provide spectrum awareness is limited by their need for labeled interference signals. Because of the wide variety of interference signals in the frequency bands used by cognitive TWNs, it is non-trivial to acquire manually labeled data sets covering all interference signals. Detecting the presence of an unknown and remote interference source in a frequency band from the transmitter end is also challenging, especially when the received interference power remains at or below the noise floor. To address these issues, this paper proposes an automated interference detection framework, entitled $\textsf{MARSS}$ (Machine Learning Aided Resilient Spectrum Surveillance). $\textsf{MARSS}$ is a fully unsupervised method that first extracts low-dimensional representative features from spectrograms by suppressing noise and background information and employing a convolutional neural network (CNN) with a novel loss function, and then distinguishes signals with and without interference by applying an isolation forest model to the extracted features. The uniqueness of $\textsf{MARSS}$ is its ability to detect hidden and unknown interference signals in multiple frequency bands without using any prior labels, thanks to its superior feature extraction capability. The capability of $\textsf{MARSS}$ is further extended to infer the level of interference by designing a multi-level interference classification framework. Extensive simulations in GNU Radio demonstrate the superiority of $\textsf{MARSS}$ over existing ML methods in detecting interference. The effectiveness of $\textsf{MARSS}$ is also validated by extensive over-the-air (OTA) experiments using software-defined radios.
{"title":"Machine Learning Aided Resilient Spectrum Surveillance for Cognitive Tactical Wireless Networks: Design and Proof-of-Concept","authors":"Eli Garlick;Nourhan Hesham;MD. Zoheb Hassan;Imtiaz Ahmed;Anas Chaaban;MD. Jahangir Hossain","doi":"10.1109/TMLCN.2025.3585849","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3585849","url":null,"abstract":"Cognitive tactical wireless networks (TWNs) require spectrum awareness to avoid interference and jamming in the communication channel and assure quality-of-service in data transmission. The ability of conventional supervised machine learning (ML) algorithms to provide spectrum awareness is limited by their need for labeled interference signals. Because of the wide variety of interference signals in the frequency bands used by cognitive TWNs, it is non-trivial to acquire manually labeled data sets covering all interference signals. Detecting the presence of an unknown and remote interference source in a frequency band from the transmitter end is also challenging, especially when the received interference power remains at or below the noise floor. To address these issues, this paper proposes an automated interference detection framework, entitled $\\textsf{MARSS}$ (Machine Learning Aided Resilient Spectrum Surveillance). $\\textsf{MARSS}$ is a fully unsupervised method that first extracts low-dimensional representative features from spectrograms by suppressing noise and background information and employing a convolutional neural network (CNN) with a novel loss function, and then distinguishes signals with and without interference by applying an isolation forest model to the extracted features. 
The uniqueness of $\\textsf{MARSS}$ is its ability to detect hidden and unknown interference signals in multiple frequency bands without using any prior labels, thanks to its superior feature extraction capability. The capability of $\\textsf{MARSS}$ is further extended to infer the level of interference by designing a multi-level interference classification framework. Extensive simulations in GNU Radio demonstrate the superiority of $\\textsf{MARSS}$ over existing ML methods in detecting interference. The effectiveness of $\\textsf{MARSS}$ is also validated by extensive over-the-air (OTA) experiments using software-defined radios.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"814-834"},"PeriodicalIF":0.0,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11068948","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144646650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
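The second stage described in the MARSS abstract above, an isolation forest applied to extracted features, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the CNN feature extractor is omitted, the 2-D feature vectors are synthetic, and the names `c_factor`, `build_tree`, `path_length`, and `anomaly_scores` as well as the tree count and sample size are assumptions chosen for illustration.

```python
import math
import random


def c_factor(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Isolate points recursively with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split further along this feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(point, node, depth=0):
    """Depth at which `point` lands in a leaf, adjusted for unsplit leaf size."""
    if "dim" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)


def anomaly_scores(features, n_trees=100, sample_size=64, seed=0):
    """Return one score in (0, 1) per feature vector; higher means more anomalous."""
    if len(features) < 2:
        raise ValueError("need at least two feature vectors")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(features))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(features, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    scores = []
    for x in features:
        mean_depth = sum(path_length(x, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores


# Synthetic stand-in for extracted features: a tight cluster of
# "interference-free" vectors plus one far-away vector.
data_rng = random.Random(1)
features = [(data_rng.gauss(0.0, 0.1), data_rng.gauss(0.0, 0.1)) for _ in range(60)]
features.append((5.0, 5.0))

scores = anomaly_scores(features)
flagged = max(range(len(scores)), key=scores.__getitem__)
print("most anomalous index:", flagged)
```

Points that random splits isolate after only a few cuts receive scores close to 1, which is why a feature vector far from the bulk of the data stands out without any labels. In practice a library implementation would normally replace this hand-written forest.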
In this research, we proposed a novel anomaly detection system (ADS) that integrates federated learning (FL) with blockchain for resource-constrained IoT. The proposed system allows IoT devices to exchange machine learning (ML) models through a permissioned blockchain, enabling trustworthy collaborative learning through model sharing. To avoid single-point failure, any device can be a centre of the FL process. To deal with the issue of resource constraints in IoT devices and the model poisoning problem in FL, we introduced a novel method to use commitment coefficients and ML model discrepancies when selecting particular devices to join the FL process. We also proposed an efficient heuristic method to aggregate a federated model from a list of ML models trained locally on the selected devices, which helps to improve the federated model’s anomaly detection ability. The experiment results with the popular N-BaIoT dataset for IoT botnet attack detection show that the proposed system is more effective in detecting anomalies and resisting poisoning attacks than the two baselines (FedProx and FedAvg).
{"title":"A Novel Blockchain-Enabled Federated Learning Scheme for IoT Anomaly Detection","authors":"Van-Doan Nguyen;Abebe Diro;Naveen Chilamkurti;Will Heyne;Khoa Tran Phan","doi":"10.1109/TMLCN.2025.3585842","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3585842","url":null,"abstract":"In this research, we proposed a novel anomaly detection system (ADS) that integrates federated learning (FL) with blockchain for resource-constrained IoT. The proposed system allows IoT devices to exchange machine learning (ML) models through a permissioned blockchain, enabling trustworthy collaborative learning through model sharing. To avoid single-point failure, any device can be a centre of the FL process. To deal with the issue of resource constraints in IoT devices and the model poisoning problem in FL, we introduced a novel method to use commitment coefficients and ML model discrepancies when selecting particular devices to join the FL process. We also proposed an efficient heuristic method to aggregate a federated model from a list of ML models trained locally on the selected devices, which helps to improve the federated model’s anomaly detection ability. 
The experiment results with the popular N-BaIoT dataset for IoT botnet attack detection show that the proposed system is more effective in detecting anomalies and resisting poisoning attacks than the two baselines (FedProx and FedAvg).","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"798-813"},"PeriodicalIF":0.0,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11070312","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144597804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
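The abstract above mentions selecting devices by commitment coefficients and model discrepancies before aggregating their local models, but does not give the rules. The sketch below is therefore only one plausible reading, not the paper's method: discrepancy is assumed to be the Euclidean distance to the coordinate-wise median model, selection uses two hypothetical thresholds (`min_commitment`, `max_discrepancy`), and aggregation is a commitment-weighted average. All function names and numbers are illustrative.

```python
import math
from statistics import median


def coordinate_median(models):
    """Coordinate-wise median of a list of weight vectors."""
    return [median(column) for column in zip(*models)]


def select_devices(models, commitment, min_commitment=0.5, max_discrepancy=1.0):
    """Keep devices that are committed enough and close to the median model.

    Both thresholds are hypothetical; the paper's selection rule may differ.
    """
    reference = coordinate_median(list(models.values()))
    return [
        device
        for device, weights in models.items()
        if commitment[device] >= min_commitment
        and math.dist(weights, reference) <= max_discrepancy
    ]


def aggregate(models, commitment, selected):
    """Commitment-weighted average of the selected devices' weight vectors."""
    if not selected:
        raise ValueError("no device passed selection")
    total = sum(commitment[d] for d in selected)
    dim = len(models[selected[0]])
    return [
        sum(commitment[d] * models[d][i] for d in selected) / total
        for i in range(dim)
    ]


# Toy local models: three honest devices, one poisoned, one barely participating.
models = {
    "a": [1.0, 2.0],
    "b": [1.2, 1.8],
    "c": [0.8, 2.2],
    "poisoned": [9.0, -7.0],
    "idle": [1.0, 2.0],
}
commitment = {"a": 1.0, "b": 0.5, "c": 0.5, "poisoned": 0.9, "idle": 0.1}

selected = select_devices(models, commitment)
global_model = aggregate(models, commitment, selected)
print(selected, global_model)
```

Using the median as the reference keeps a single poisoned model from shifting the point against which discrepancies are measured, so the poisoned device is filtered out even though its commitment value is high.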
Pub Date : 2025-07-03 DOI: 10.1109/TMLCN.2025.3585845
Xuanyu Liu;Shijian Gao;Boxun Liu;Xiang Cheng;Liuqing Yang
The wireless channel is fundamental to communication, encompassing numerous tasks collectively referred to as channel-associated tasks. These tasks can leverage joint learning based on channel characteristics to share representations and enhance system design. To capitalize on this advantage, LLM4WM is proposed—a large language model (LLM) multi-task fine-tuning framework specifically tailored for channel-associated tasks. This framework utilizes a Mixture of Experts with Low-Rank Adaptation (MoE-LoRA) approach for multi-task fine-tuning, enabling the transfer of the pre-trained LLM’s general knowledge to these tasks. Given the unique characteristics of wireless channel data, preprocessing modules, adapter modules, and multi-task output layers are designed to align the channel data with the LLM’s semantic feature space. Experiments on a channel-associated multi-task dataset demonstrate that LLM4WM outperforms existing methodologies in both full-sample and few-shot evaluations, owing to its robust multi-task joint modeling and transfer learning capabilities.
{"title":"LLM4WM: Adapting LLM for Wireless Multi-Tasking","authors":"Xuanyu Liu;Shijian Gao;Boxun Liu;Xiang Cheng;Liuqing Yang","doi":"10.1109/TMLCN.2025.3585845","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3585845","url":null,"abstract":"The wireless channel is fundamental to communication, encompassing numerous tasks collectively referred to as channel-associated tasks. These tasks can leverage joint learning based on channel characteristics to share representations and enhance system design. To capitalize on this advantage, LLM4WM is proposed—a large language model (LLM) multi-task fine-tuning framework specifically tailored for channel-associated tasks. This framework utilizes a Mixture of Experts with Low-Rank Adaptation (MoE-LoRA) approach for multi-task fine-tuning, enabling the transfer of the pre-trained LLM’s general knowledge to these tasks. Given the unique characteristics of wireless channel data, preprocessing modules, adapter modules, and multi-task output layers are designed to align the channel data with the LLM’s semantic feature space. 
Experiments on a channel-associated multi-task dataset demonstrate that LLM4WM outperforms existing methodologies in both full-sample and few-shot evaluations, owing to its robust multi-task joint modeling and transfer learning capabilities.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"835-847"},"PeriodicalIF":0.0,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11071329","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144712086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
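The MoE-LoRA idea named in the LLM4WM abstract above can be sketched for a single linear layer: a frozen weight matrix W is augmented by several low-rank updates B_k A_k, mixed by a softmax gate. The code below is a generic illustration under stated assumptions, not the LLM4WM architecture; the dense softmax router, the absence of a LoRA scaling factor, and every matrix value are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_lora_forward(x, W, experts, router):
    """Compute y = W x + sum_k g_k * B_k (A_k x), with g = softmax(router x).

    W stays frozen during fine-tuning; only the (A_k, B_k) pairs and the
    router would be trained. `experts` is a list of (A, B) matrix pairs.
    """
    gates = softmax(matvec(router, x))
    out = matvec(W, x)
    for gate, (A, B) in zip(gates, experts):
        delta = matvec(B, matvec(A, x))  # rank-r update, r = number of rows in A
        out = [o + gate * d for o, d in zip(out, delta)]
    return out, gates


# Toy layer: 2-D input and output, two rank-1 experts, uniform router.
x = [1.0, 2.0]
W = [[1.0, 0.0], [0.0, 1.0]]
experts = [
    ([[1.0, 0.0]], [[1.0], [0.0]]),  # expert 0 adjusts the first output
    ([[0.0, 1.0]], [[0.0], [1.0]]),  # expert 1 adjusts the second output
]
router = [[0.0, 0.0], [0.0, 0.0]]  # zero logits give equal gates

y, gates = moe_lora_forward(x, W, experts, router)
print(y, gates)
```

Because each expert contributes only a rank-r correction, the number of trainable parameters grows with r times the layer width rather than with the full weight matrix, which is what makes sharing one pre-trained backbone across several tasks affordable.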