The growing popularity of cellular networks among users, primarily due to affordable prices and high speeds, has escalated the need for strategic capacity planning to ensure a seamless end-user experience and profitable returns on network investments. Traditional capacity planning methods rely on static analysis of network parameters with the aim of minimizing the CAPEX and the OPEX. However, to address the evolving dynamics of cellular networks, this paper advocates for a data-driven approach that considers user behavioral analysis in the planning process to make it proactive and adaptive. We introduce a Hierarchical Feature-based Time Series Clustering (HFTSC) approach that organizes clustering in a multi-level tree structure. Each level addresses a specific aspect of time series data using focused features, enabling explainable clustering. The proposed approach assigns labels to clusters based on the time series properties targeted at each level, generating annotated clusters while applying unsupervised clustering methods. To evaluate the effectiveness of HFTSC, we conduct a comprehensive case study using real-world data from thousands of network elements. Our evaluation examines the identified clusters from analytical and geographical perspectives, focusing on supporting network planners in data-informed decision-making and analysis. Finally, we perform an extensive comparison with several baseline methods to reflect the practical advantages of our approach in capacity planning and optimization.
{"title":"A Hierarchical Feature-Based Time Series Clustering Approach for Data-Driven Capacity Planning of Cellular Networks","authors":"Vineeta Jain;Anna Richter;Vladimir Fokow;Mathias Schweigel;Ulf Wetzker;Andreas Frotzscher","doi":"10.1109/TMLCN.2025.3595125","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3595125","url":null,"abstract":"The growing popularity of cellular networks among users, primarily due to affordable prices and high speeds, has escalated the need for strategic capacity planning to ensure a seamless end-user experience and profitable returns on network investments. Traditional capacity planning methods rely on static analysis of network parameters with the aim of minimizing the CAPEX and the OPEX. However, to address the evolving dynamics of cellular networks, this paper advocates for a data-driven approach that considers user behavioral analysis in the planning process to make it proactive and adaptive. We introduce a Hierarchical Feature-based Time Series Clustering (HFTSC) approach that organizes clustering in a multi-level tree structure. Each level addresses a specific aspect of time series data using focused features, enabling explainable clustering. The proposed approach assigns labels to clusters based on the time series properties targeted at each level, generating annotated clusters while applying unsupervised clustering methods. To evaluate the effectiveness of HFTSC, we conduct a comprehensive case study using real-world data from thousands of network elements. Our evaluation examines the identified clusters from analytical and geographical perspectives, focusing on supporting network planners in data-informed decision-making and analysis. Finally, we perform an extensive comparison with several baseline methods to reflect the practical advantages of our approach in capacity planning and optimization.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"921-947"},"PeriodicalIF":0.0,"publicationDate":"2025-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11108703","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144831870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-07-28 | DOI: 10.1109/TMLCN.2025.3593184
Hang Zou; Qiyang Zhao; Yu Tian; Lina Bariah; Faouzi Bader; Thierry Lestable; Merouane Debbah
The emergent field of Large Language Models (LLMs) has significant potential to revolutionize how future telecom networks are designed and operated. However, mainstream LLMs lack the specialized knowledge required to understand and operate within the highly technical telecom domain. In this paper, we introduce TelecomGPT, the first telecom-specific LLM, built through a systematic adaptation pipeline designed to enhance general-purpose LLMs for telecom applications. To achieve this, we curate comprehensive telecom-specific datasets, including pre-training datasets, instruction datasets, and preference datasets. These datasets are leveraged for continual pre-training, instruction tuning, and alignment tuning, respectively. Additionally, due to the lack of widely accepted evaluation benchmarks tailored to the telecom domain, we propose three novel LLM-Telecom evaluation benchmarks, namely Telecom Math Modeling, Telecom Open QnA, and Telecom Code Tasks. These new benchmarks provide a holistic evaluation of the capabilities of LLMs in telecom math modeling, open-ended question answering, code generation, infilling, summarization, and analysis. Using the curated datasets, our fine-tuned LLM, TelecomGPT, significantly outperforms general-purpose state-of-the-art (SOTA) LLMs, including GPT-4, Llama-3, and Mistral, particularly on the Telecom Math Modeling benchmark. Additionally, it achieves comparable performance across various evaluation benchmarks, such as TeleQnA, 3GPP technical document classification, telecom code summarization, generation, and infilling. This work establishes a new foundation for integrating LLMs into telecom systems, paving the way for AI-powered advancements in network operations.
{"title":"TelecomGPT: A Framework to Build Telecom-Specific Large Language Models","authors":"Hang Zou;Qiyang Zhao;Yu Tian;Lina Bariah;Faouzi Bader;Thierry Lestable;Merouane Debbah","doi":"10.1109/TMLCN.2025.3593184","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3593184","url":null,"abstract":"The emergent field of Large Language Models (LLMs) has significant potential to revolutionize how future telecom networks are designed and operated. However, mainstream Large Language Models (LLMs) lack the specialized knowledge required to understand and operate within the highly technical telecom domain. In this paper, we introduce TelecomGPT, the first telecom-specific LLM, built through a systematic adaptation pipeline designed to enhance general-purpose LLMs for telecom applications. To achieve this, we curate comprehensive telecom-specific datasets, including pre-training datasets, instruction datasets, and preference datasets. These datasets are leveraged for continual pre-training, instruction tuning, and alignment tuning, respectively. Additionally, due to the lack of widely accepted evaluation benchmarks that are tailored for the telecom domain, we proposed three novel LLM-Telecom evaluation benchmarks, namely, Telecom Math Modeling, Telecom Open QnA, and Telecom Code Tasks. These new benchmarks provide a holistic evaluation of the capabilities of LLMs in telecom math modeling, open-ended question answering, code generation, infilling, summarization and analysis. Using the curated datasets, our fine-tuned LLM, TelecomGPT, significantly outperforms general-purpose state of the art (SOTA) LLMs, including GPT-4, Llama-3 and Mistral, particularly in Telecom Math Modeling benchmarks. Additionally, it achieves comparable performance across various evaluation benchmarks, such as TeleQnA, 3GPP technical document classification, telecom code summarization, generation, and infilling. This work establishes a new foundation for integrating LLMs into telecom systems, paving the way for AI-powered advancements in network operations.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"948-975"},"PeriodicalIF":0.0,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11097898","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144858656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-07-25 | DOI: 10.1109/TMLCN.2025.3592658
Pranshav Gajjar; Vijay K. Shah
Despite the transformative impact of Large Language Models (LLMs) across critical domains such as healthcare, customer service, and business marketing, their integration into Open Radio Access Networks (O-RAN) remains limited. This gap is primarily due to the absence of domain-specific foundational models, with existing solutions often relying on general-purpose LLMs that fail to address the unique challenges and technical intricacies of O-RAN. To bridge this gap, we introduce ORANSight-2.0 (O-RAN Insights), a pioneering initiative to develop specialized foundational LLMs tailored for O-RAN. Built on 18 models spanning five open-source LLM frameworks—Mistral, Qwen, Llama, Phi, and Gemma—ORANSight-2.0 fine-tunes models ranging from 1B to 70B parameters, significantly reducing reliance on proprietary, closed-source models while enhancing performance in O-RAN-specific tasks. At the core of ORANSight-2.0 is RANSTRUCT, a novel Retrieval-Augmented Generation (RAG)-based instruction-tuning framework that employs two LLM agents—a Mistral-based Question Generator and a Qwen-based Answer Generator—to create high-quality instruction-tuning datasets. The generated dataset is then used to fine-tune the 18 pre-trained open-source LLMs via QLoRA. To evaluate ORANSight-2.0, we introduce srsRANBench, a novel benchmark designed for code generation and codebase understanding in the context of srsRAN, a widely used 5G O-RAN stack. Additionally, we leverage ORAN-Bench-13K, an existing benchmark for assessing O-RAN-specific knowledge. Our comprehensive evaluations demonstrate that ORANSight-2.0 models outperform general-purpose and closed-source models, such as ChatGPT-4o and Gemini, by 5.421% on ORANBench and 18.465% on srsRANBench, achieving superior performance while maintaining lower computational and energy costs. We also experiment with RAG-augmented variants of ORANSight-2.0 models and observe that RAG augmentation improves performance by an average of 6.35% across benchmarks, achieving the best overall cumulative score of 0.854, which is 12.37% better than the leading closed-source alternative. We further evaluate the energy characteristics of ORANSight-2.0 across training, inference, and RAG-augmented inference, confirming that it delivers strong performance at low computational and energy cost. Finally, we compare the best ORANSight-2.0 configuration against available telecom LLMs, which it outperforms by an average of 27.96%.
{"title":"ORANSight-2.0: Foundational LLMs for O-RAN","authors":"Pranshav Gajjar;Vijay K. Shah","doi":"10.1109/TMLCN.2025.3592658","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3592658","url":null,"abstract":"Despite the transformative impact of Large Language Models (LLMs) across critical domains such as healthcare, customer service, and business marketing, their integration into Open Radio Access Networks (O-RAN) remains limited. This gap is primarily due to the absence of domain-specific foundational models, with existing solutions often relying on general-purpose LLMs that fail to address the unique challenges and technical intricacies of O-RAN. To bridge this gap, we introduce ORANSight-2.0 (O-RAN Insights), a pioneering initiative to develop specialized foundational LLMs tailored for O-RAN. Built on 18 models spanning five open-source LLM frameworks—Mistral, Qwen, Llama, Phi, and Gemma—ORANSight-2.0 fine-tunes models ranging from 1B to 70B parameters, significantly reducing reliance on proprietary, closed-source models while enhancing performance in O-RAN-specific tasks. At the core of ORANSight-2.0 is RANSTRUCT, a novel Retrieval-Augmented Generation (RAG)-based instruction-tuning framework that employs two LLM agents—a Mistral-based Question Generator and a Qwen-based Answer Generator—to create high-quality instruction-tuning datasets. The generated dataset is then used to fine-tune the 18 pre-trained open-source LLMs via QLoRA. To evaluate ORANSight-2.0, we introduce srsRANBench, a novel benchmark designed for code generation and codebase understanding in the context of srsRAN, a widely used 5G O-RAN stack. Additionally, we leverage ORAN-Bench-13K, an existing benchmark for assessing O-RAN-specific knowledge. Our comprehensive evaluations demonstrate that ORANSight-2.0 models outperform general-purpose and closed-source models, such as ChatGPT-4o and Gemini, by 5.421% on ORANBench and 18.465% on srsRANBench, achieving superior performance while maintaining lower computational and energy costs. We also experiment with RAG-augmented variants of ORANSight-2.0 models and observe that RAG augmentation improves performance by an average of 6.35% across benchmarks, achieving the best overall cumulative score of 0.854, which is 12.37% better than the leading closed-source alternative. We thoroughly evaluate the energy characteristics of ORANSight-2.0, demonstrating its efficiency in training, inference, and inference with RAG augmentation, ensuring optimal performance while maintaining low computational and energy costs. Additionally, the best ORANSight-2.0 configuration is compared against the available telecom LLMs, where our proposed model outperformed them with an average improvement of 27.96%.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"903-920"},"PeriodicalIF":0.0,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11096935","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144831852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-07-10 | DOI: 10.1109/TMLCN.2025.3587205
Amardip Kumar Singh; Kim Khoa Nguyen
The Open Radio Access Network (O-RAN) architecture, enhanced by its AI-enabled RAN Intelligent Controllers (RICs), offers a more flexible and intelligent solution for optimizing next-generation networks than traditional mobile network architectures. By leveraging its distributed structure, which aligns seamlessly with O-RAN’s disaggregated design, Federated Learning (FL), particularly Hierarchical FL (HFL), facilitates decentralized AI model training, improving network performance, reducing resource costs, and safeguarding user privacy. However, the dynamic nature of mobile networks, particularly the frequent handovers of User Equipment (UE) between base stations, poses significant challenges for FL model training. These challenges include managing continuously changing device sets and mitigating the impact of handover delays on global model convergence. To address these challenges, we propose MHORANFed, a novel optimization algorithm tailored to minimize learning time and resource usage costs while preserving model performance within a mobility-aware hierarchical FL framework for O-RAN. First, MHORANFed simplifies the upper layer of HFL training at the edge aggregation servers, which reduces model complexity and thereby improves learning time and resource usage cost. Second, it jointly optimizes bandwidth allocation and the participation of handed-over local trainers to mitigate UE handover delay in each global round. Through a rigorous convergence analysis and extensive simulation results, this work demonstrates its superiority over existing state-of-the-art methods. Furthermore, our findings underscore significant improvements in FL training efficiency, paving the way for advanced applications such as autonomous driving and augmented reality in 5G and next-generation O-RAN networks.
{"title":"User Handover Aware Hierarchical Federated Learning for Open RAN-Based Next-Generation Mobile Networks","authors":"Amardip Kumar Singh;Kim Khoa Nguyen","doi":"10.1109/TMLCN.2025.3587205","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3587205","url":null,"abstract":"The Open Radio Access Network (O-RAN) architecture, enhanced by its AI-enabled Radio Intelligent Controllers (RIC), offers a more flexible and intelligent solution to optimize next generation networks compared to traditional mobile network architectures. By leveraging its distributed structure, which aligns seamlessly with O-RAN’s disaggregated design, Federated Learning (FL), particularly Hierarchical FL, facilitates decentralized AI model training, improving network performance, reducing resource costs, and safeguarding user privacy. However, the dynamic nature of mobile networks, particularly the frequent handovers of User Equipment (UE) between base stations, poses significant challenges for FL model training. These challenges include managing continuously changing device sets and mitigating the impact of handover delays on global model convergence. To address these challenges, we propose MHORANFed, a novel optimization algorithm tailored to minimize learning time and resource usage costs while preserving model performance within a mobility-aware hierarchical FL framework for O-RAN. Firstly, MHORANFed simplifies the upper layer of the HFL training at edge aggregate servers, which reduces the model complexity and thereby improves the learning time and the resource usage cost. Secondly, it uses jointly optimized bandwidth resource allocation and handed over local trainers’ participation to mitigate the UE handover delay in each global round. Through a rigorous convergence analysis and extensive simulation results, this work demonstrates its superiority over existing state-of-the-art methods. Furthermore, our findings underscore significant improvements in FL training efficiency, paving the way for advanced applications such as autonomous driving and augmented reality in 5G and next-generation O-RAN networks.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"848-863"},"PeriodicalIF":0.0,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11075644","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144739815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-07-03 | DOI: 10.1109/TMLCN.2025.3585849
Eli Garlick; Nourhan Hesham; MD. Zoheb Hassan; Imtiaz Ahmed; Anas Chaaban; MD. Jahangir Hossain
Cognitive tactical wireless networks (TWNs) require spectrum awareness to avoid interference and jamming in the communication channel and to assure quality of service in data transmission. Conventional supervised machine learning (ML) algorithms can provide spectrum awareness only when labeled interference signals are available. Due to the vast variety of interference signals in the frequency bands used by cognitive TWNs, acquiring manually labeled datasets of all interference signals is non-trivial. Detecting the presence of an unknown and remote interference source in a frequency band from the transmitter end is also challenging, especially when the received interference power remains at or below the noise floor. To address these issues, this paper proposes an automated interference detection framework, entitled MARSS (Machine Learning Aided Resilient Spectrum Surveillance). MARSS is a fully unsupervised method that first extracts low-dimensional representative features from spectrograms by suppressing noise and background information and employing a convolutional neural network (CNN) with a novel loss function, and subsequently distinguishes signals with and without interference by applying an isolation forest model to the extracted features. The uniqueness of MARSS is its ability to detect hidden and unknown interference signals in multiple frequency bands without using any prior labels, thanks to its superior feature extraction capability. The capability of MARSS is further extended to infer the level of interference by designing a multi-level interference classification framework. Using extensive simulations in GNURadio, the superiority of MARSS in detecting interference over existing ML methods is demonstrated. The effectiveness of MARSS is also validated by extensive over-the-air (OTA) experiments using software-defined radios.
{"title":"Machine Learning Aided Resilient Spectrum Surveillance for Cognitive Tactical Wireless Networks: Design and Proof-of-Concept","authors":"Eli Garlick;Nourhan Hesham;MD. Zoheb Hassan;Imtiaz Ahmed;Anas Chaaban;MD. Jahangir Hossain","doi":"10.1109/TMLCN.2025.3585849","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3585849","url":null,"abstract":"Cognitive tactical wireless networks (TWNs) require spectrum awareness to avoid interference and jamming in the communication channel and assure quality-of-service in data transmission. Conventional supervised machine learning (ML) algorithm’s capability to provide spectrum awareness is confronted by the requirement of labeled interference signals. Due to the vast nature of interference signals in the frequency bands used by cognitive TWNs, it is non-trivial to acquire manually labeled data sets of all interference signals. Detecting the presence of an unknown and remote interference source in a frequency band from the transmitter end is also challenging, especially when the received interference power remains at or below the noise floor. To address these issues, this paper proposes an automated interference detection framework, entitled <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> (Machine Learning Aided Resilient Spectrum Surveillance). <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> is a fully unsupervised method, which first extracts the low-dimensional representative features from spectrograms by suppressing noise and background information and employing convolutional neural network (CNN) with novel loss function, and subsequently, distinguishes signals with and without interference by applying an isolation forest model on the extracted features. The uniqueness of <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> is its ability to detect hidden and unknown interference signals in multiple frequency bands without using any prior labels, thanks to its superior feature extraction capability. The capability of <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> is further extended to infer the level of interference by designing a multi-level interference classification framework. Using extensive simulations in GNURadio, the superiority of <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> in detecting interference over existing ML methods is demonstrated. The effectiveness <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> is also validated by extensive over-the-air (OTA) experiments using software-defined radios.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"814-834"},"PeriodicalIF":0.0,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11068948","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144646650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this research, we propose a novel anomaly detection system (ADS) that integrates federated learning (FL) with blockchain for resource-constrained IoT. The proposed system allows IoT devices to exchange machine learning (ML) models through a permissioned blockchain, enabling trustworthy collaborative learning through model sharing. To avoid a single point of failure, any device can act as the centre of the FL process. To deal with the resource constraints of IoT devices and the model poisoning problem in FL, we introduce a novel method that uses commitment coefficients and ML model discrepancies to select the devices that join the FL process. We also propose an efficient heuristic method to aggregate a federated model from the ML models trained locally on the selected devices, which helps to improve the federated model’s anomaly detection ability. Experimental results on the popular N-BaIoT dataset for IoT botnet attack detection show that the proposed system is more effective at detecting anomalies and resisting poisoning attacks than the two baselines (FedProx and FedAvg).
{"title":"A Novel Blockchain-Enabled Federated Learning Scheme for IoT Anomaly Detection","authors":"Van-Doan Nguyen;Abebe Diro;Naveen Chilamkurti;Will Heyne;Khoa Tran Phan","doi":"10.1109/TMLCN.2025.3585842","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3585842","url":null,"abstract":"In this research, we proposed a novel anomaly detection system (ADS) that integrates federated learning (FL) with blockchain for resource-constrained IoT. The proposed system allows IoT devices to exchange machine learning (ML) models through a permissioned blockchain, enabling trustworthy collaborative learning through model sharing. To avoid single-point failure, any device can be a centre of the FL process. To deal with the issue of resource constraints in IoT devices and the model poisoning problem in FL, we introduced a novel method to use commitment coefficients and ML model discrepancies when selecting particular devices to join the FL process. We also proposed an efficient heuristic method to aggregate a federated model from a list of ML models trained locally on the selected devices, which helps to improve the federated model’s anomaly detection ability. The experiment results with the popular N-BaIoT dataset for IoT botnet attack detection show that the proposed system is more effective in detecting anomalies and resisting poisoning attacks than the two baselines (FedProx and FedAvg).","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"798-813"},"PeriodicalIF":0.0,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11070312","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144597804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-07-03 | DOI: 10.1109/TMLCN.2025.3585845
Xuanyu Liu; Shijian Gao; Boxun Liu; Xiang Cheng; Liuqing Yang
The wireless channel is fundamental to communication, encompassing numerous tasks collectively referred to as channel-associated tasks. These tasks can leverage joint learning based on channel characteristics to share representations and enhance system design. To capitalize on this advantage, LLM4WM is proposed—a large language model (LLM) multi-task fine-tuning framework specifically tailored for channel-associated tasks. This framework utilizes a Mixture of Experts with Low-Rank Adaptation (MoE-LoRA) approach for multi-task fine-tuning, enabling the transfer of the pre-trained LLM’s general knowledge to these tasks. Given the unique characteristics of wireless channel data, preprocessing modules, adapter modules, and multi-task output layers are designed to align the channel data with the LLM’s semantic feature space. Experiments on a channel-associated multi-task dataset demonstrate that LLM4WM outperforms existing methodologies in both full-sample and few-shot evaluations, owing to its robust multi-task joint modeling and transfer learning capabilities.
{"title":"LLM4WM: Adapting LLM for Wireless Multi-Tasking","authors":"Xuanyu Liu;Shijian Gao;Boxun Liu;Xiang Cheng;Liuqing Yang","doi":"10.1109/TMLCN.2025.3585845","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3585845","url":null,"abstract":"The wireless channel is fundamental to communication, encompassing numerous tasks collectively referred to as channel-associated tasks. These tasks can leverage joint learning based on channel characteristics to share representations and enhance system design. To capitalize on this advantage, LLM4WM is proposed—a large language model (LLM) multi-task fine-tuning framework specifically tailored for channel-associated tasks. This framework utilizes a Mixture of Experts with Low-Rank Adaptation (MoE-LoRA) approach for multi-task fine-tuning, enabling the transfer of the pre-trained LLM’s general knowledge to these tasks. Given the unique characteristics of wireless channel data, preprocessing modules, adapter modules, and multi-task output layers are designed to align the channel data with the LLM’s semantic feature space. Experiments on a channel-associated multi-task dataset demonstrate that LLM4WM outperforms existing methodologies in both full-sample and few-shot evaluations, owing to its robust multi-task joint modeling and transfer learning capabilities.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"835-847"},"PeriodicalIF":0.0,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11071329","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144712086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-06-27 | DOI: 10.1109/TMLCN.2025.3584009
Berend J. D. Gort; Godfrey M. Kibalya; Angelos Antonopoulos
Effective resource management in edge-cloud networks demands precise forecasting of diverse workload resource usage. Due to the fluctuating nature of user demands, prediction models must have strong generalization abilities, ensuring high performance amidst sudden traffic changes or unfamiliar patterns. Existing approaches often struggle with handling long-term dependencies and the diversity of temporal patterns. This paper introduces OmniFORE (Framework for Optimization of Resource forecasts in Edge-cloud networks), which integrates attention-based time-series models with temporal clustering to enhance generalization and predict diverse workloads efficiently in volatile settings. By training on carefully selected subsets from extensive datasets, OmniFORE captures both short-term stability and long-term shifts in resource usage patterns. Experiments show that OmniFORE outperforms state-of-the-art methods in prediction accuracy, inference speed, and generalization to unseen data, particularly in scenarios with dynamic workload changes and varying trace variance. These improvements enable more efficient resource management in the compute continuum.
{"title":"Attention-Driven AI Model Generalization for Workload Forecasting in the Compute Continuum","authors":"Berend J. D. Gort;Godfrey M. Kibalya;Angelos Antonopoulos","doi":"10.1109/TMLCN.2025.3584009","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3584009","url":null,"abstract":"Effective resource management in edge-cloud networks demands precise forecasting of diverse workload resource usage. Due to the fluctuating nature of user demands, prediction models must have strong generalization abilities, ensuring high performance amidst sudden traffic changes or unfamiliar patterns. Existing approaches often struggle with handling long-term dependencies and the diversity of temporal patterns. This paper introduces OmniFORE (Framework for Optimization of Resource forecasts in Edge-cloud networks), which integrates attention-based time-series models with temporal clustering to enhance generalization and predict diverse workloads efficiently in volatile settings. By training on carefully selected subsets from extensive datasets, OmniFORE captures both short-term stability and long-term shifts in resource usage patterns. Experiments show that OmniFORE outperforms state-of-the-art methods in prediction accuracy, inference speed, and generalization to unseen data, particularly in scenarios with dynamic workload changes and varying trace variance. These improvements enable more efficient resource management in the compute continuum.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"779-797"},"PeriodicalIF":0.0,"publicationDate":"2025-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11053768","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144597748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-06-26 | DOI: 10.1109/TMLCN.2025.3583659
Afsaneh Mahmoudi; Ming Xiao; Emil Björnson
Federated Learning (FL) enables clients to share model parameters instead of raw data, reducing communication overhead. Traditional wireless networks, however, suffer from latency issues when supporting FL. Cell-Free Massive MIMO (CFmMIMO) offers a promising alternative, as it can serve multiple clients simultaneously on shared resources, enhancing spectral efficiency and reducing latency in large-scale FL. Still, communication resource constraints at the client side can impede the completion of FL training. To tackle this issue, we propose a low-latency, energy-efficient FL framework with optimized uplink power allocation for efficient uplink communication. Our approach integrates an adaptive quantization strategy that dynamically adjusts bit allocation for local gradient updates, significantly lowering communication cost. We formulate a joint optimization problem involving FL model updates, local iterations, and power allocation. This problem is solved using sequential quadratic programming (SQP) to balance energy consumption and latency. Moreover, for local model training, clients employ the AdaDelta optimizer, which improves convergence compared to standard SGD, Adam, and RMSProp. We also provide a theoretical analysis of FL convergence under AdaDelta. Numerical results demonstrate that, under equal energy and latency budgets, our power allocation strategy improves test accuracy by up to 7% and 19% compared to Dinkelbach and max-sum rate approaches. Furthermore, across all power allocation methods, our quantization scheme outperforms AQUILA and LAQ, increasing test accuracy by up to 36% and 35%, respectively.
{"title":"Accelerating Energy-Efficient Federated Learning in Cell-Free Networks With Adaptive Quantization","authors":"Afsaneh Mahmoudi;Ming Xiao;Emil Björnson","doi":"10.1109/TMLCN.2025.3583659","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3583659","url":null,"abstract":"Federated Learning (FL) enables clients to share model parameters instead of raw data, reducing communication overhead. Traditional wireless networks, however, suffer from latency issues when supporting FL. Cell-Free Massive MIMO (CFmMIMO) offers a promising alternative, as it can serve multiple clients simultaneously on shared resources, enhancing spectral efficiency and reducing latency in large-scale FL. Still, communication resource constraints at the client side can impede the completion of FL training. To tackle this issue, we propose a low-latency, energy-efficient FL framework with optimized uplink power allocation for efficient uplink communication. Our approach integrates an adaptive quantization strategy that dynamically adjusts bit allocation for local gradient updates, significantly lowering communication cost. We formulate a joint optimization problem involving FL model updates, local iterations, and power allocation. This problem is solved using sequential quadratic programming (SQP) to balance energy consumption and latency. Moreover, for local model training, clients employ the AdaDelta optimizer, which improves convergence compared to standard SGD, Adam, and RMSProp. We also provide a theoretical analysis of FL convergence under AdaDelta. Numerical results demonstrate that, under equal energy and latency budgets, our power allocation strategy improves test accuracy by up to 7% and 19% compared to Dinkelbach and max-sum rate approaches. Furthermore, across all power allocation methods, our quantization scheme outperforms AQUILA and LAQ, increasing test accuracy by up to 36% and 35%, respectively.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"761-778"},"PeriodicalIF":0.0,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11052837","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144550467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-06-16 | DOI: 10.1109/TMLCN.2025.3579750
Kamran Sattar Awaisi; Qiang Ye; Srinivas Sampalli
The Industrial Internet of Things (IIoT) has revolutionized the industrial sector by integrating sensors to monitor equipment health and optimize production processes. These sensors collect real-time data and are prone to a variety of faults, such as bias, drift, noise, gain, spike, and constant faults. Such faults can lead to significant operational problems, including false results, incorrect predictions, and misleading maintenance decisions. Therefore, classifying sensor data appropriately is essential for ensuring the reliability and efficiency of IIoT systems. In this paper, we propose the Shared-Encoder Transformer (SET) scheme for multi-sensor, multi-class fault classification in IIoT systems. Leveraging the transformer architecture, SET uses a shared encoder with positional encoding and multi-head self-attention mechanisms to capture complex temporal patterns in sensor data. Consequently, it can accurately detect the health status of sensor data and, if the sensor data is faulty, specifically identify the fault type. Additionally, we introduce a comprehensive fault injection strategy to address the problem of fault data scarcity, enabling the validation of SET's robust performance even with limited fault samples in both ideal and realistic scenarios. In our research, we conducted extensive experiments using the Commercial Modular Aeropropulsion System Simulation (C-MAPSS) and Skoltech Anomaly Benchmark (SKAB) datasets to study the performance of SET. Our experimental results indicate that SET consistently outperforms baseline methods, including Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN)-LSTM, and Multilayer Perceptron (MLP), as well as the proposed comparative variant of SET, the Multi-Encoder Transformer (MET), in terms of accuracy, precision, recall, and F1-score across different fault intensities. The shared-encoder architecture improves fault detection accuracy and ensures parameter efficiency and robustness, making it suitable for deployment in memory-constrained industrial environments.
{"title":"SET: A Shared-Encoder Transformer Scheme for Multi-Sensor, Multi-Class Fault Classification in Industrial IoT","authors":"Kamran Sattar Awaisi;Qiang Ye;Srinivas Sampalli","doi":"10.1109/TMLCN.2025.3579750","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3579750","url":null,"abstract":"The Industrial Internet of Things (IIoT) has revolutionized the industrial sector by integrating sensors to monitor equipment health and optimize production processes. These sensors collect real-time data and are prone to a variety of different faults, such as bias, drift, noise, gain, spike, and constant faults. Such faults can lead to significant operational problems, including false results, incorrect predictions, and misleading maintenance decisions. Therefore, classifying sensor data appropriately is essential for ensuring the reliability and efficiency of IIoT systems. In this paper, we propose the Shared-Encoder Transformer (SET) scheme for multi-sensor, multi-class fault classification in IIoT systems. Leveraging the transformer architecture, the SET uses a shared encoder with positional encoding and multi-head self-attention mechanisms to capture complex temporal patterns in sensor data. Consequently, it can accurately detect the health status of sensor data, and if the sensor data is faulty, it can specifically identify the fault type. Additionally, we introduce a comprehensive fault injection strategy to address the problem of fault data scarcity, enabling the validation of the robust performance of SET even with limited fault samples in both ideal and realistic scenarios. In our research, we conducted extensive experiments using the Commercial Modular Aeropropulsion System Simulation (C-MAPSS) and Skoltech Anomaly Benchmark (SKAB) datasets to study the performance of the SET. Our experimental results indicate that SET consistently outperforms baseline methods, including Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN)-LSTM, and Multilayer Perceptron (MLP), as well as the proposed comparative variant of SET, Multi-Encoder Transformer (MET), in terms of accuracy, precision, recall, and F1-score across different fault intensities. The shared-kmencoder architecture improves fault detection accuracy and ensures parameter efficiency/robustness, making it suitable for deployment in memory-constrained industrial environments.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"744-760"},"PeriodicalIF":0.0,"publicationDate":"2025-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11037229","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144367023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}