
Latest publications from IEEE Transactions on Machine Learning in Communications and Networking

Explainable AI for Enhancing Efficiency of DL-Based Channel Estimation
Pub Date: 2025-08-06 DOI: 10.1109/TMLCN.2025.3596548
Abdul Karim Gizzini;Yahia Medjahdi;Ali J. Ghandour;Laurent Clavier
The support of artificial intelligence (AI)-based decision-making is a key element in future 6G networks. Moreover, AI is widely employed in critical applications such as autonomous driving and medical diagnosis, where using AI models as black boxes is risky and challenging. Hence, it is crucial to understand and trust the decisions taken by these models. This issue can be tackled by developing explainable AI (XAI) schemes that explain the logic behind a black-box model's behavior and thus ensure its efficient and safe deployment. Highlighting the relevant inputs the black-box model uses to accomplish the desired prediction is essential to ensuring its interpretability. Recently, we proposed a novel perturbation-based feature selection framework, called XAI-CHEST, oriented toward channel estimation in wireless communications. This manuscript provides the detailed theoretical foundations of the XAI-CHEST framework. In particular, we derive the analytical expressions of the XAI-CHEST loss functions and the noise-threshold fine-tuning optimization problem. The designed XAI-CHEST framework thus delivers a smart, low-complexity, one-shot input feature selection methodology for high-dimensional model inputs that can further improve overall performance while optimizing the architecture of the employed model. Simulation results show that the XAI-CHEST framework outperforms classical feature selection XAI schemes such as local interpretable model-agnostic explanations (LIME) and Shapley additive explanations (SHAP), mainly in terms of interpretability resolution, while providing a better performance-complexity trade-off.
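The perturbation-based selection idea described in the abstract can be illustrated with a minimal, self-contained sketch: mask each input feature in turn, measure how much the model output changes, and keep only features whose relevance exceeds a threshold. The toy estimator, masking baseline, and threshold below are illustrative assumptions, not the paper's actual model or loss functions.

```python
# Hedged sketch of perturbation-based one-shot input feature selection.

def toy_channel_estimator(x):
    # Stand-in black-box model: a weighted sum of received pilot samples.
    weights = [0.9, 0.05, 0.8, 0.02]
    return sum(w * xi for w, xi in zip(weights, x))

def perturbation_relevance(model, x, baseline=0.0):
    """Score each input feature by how much masking it changes the output."""
    reference = model(x)
    scores = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] = baseline          # mask feature i
        scores.append(abs(model(perturbed) - reference))
    return scores

def select_features(scores, threshold):
    # One-shot selection: keep features whose relevance exceeds the threshold.
    return [i for i, s in enumerate(scores) if s > threshold]

x = [1.0, 1.0, 1.0, 1.0]
scores = perturbation_relevance(toy_channel_estimator, x)
kept = select_features(scores, threshold=0.1)
print(kept)  # features 0 and 2 dominate the toy model's output: [0, 2]
```

Pruning the model input to the kept indices is what allows the employed architecture to shrink, which is the performance-complexity trade-off the abstract refers to.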
IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 976-996. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11115091
Citations: 0
Out-of-Distribution in Image Semantic Communication: A Solution With Multimodal Large Language Models
Pub Date: 2025-08-05 DOI: 10.1109/TMLCN.2025.3595841
Feifan Zhang;Yuyang Du;Kexin Chen;Yulin Shao;Soung Chang Liew
Semantic communication is a promising technology for next-generation wireless networks. However, the out-of-distribution (OOD) problem, where a pre-trained machine learning (ML) model is applied to unseen tasks that are outside the distribution of its training data, may compromise the integrity of semantic compression. This paper explores the use of multi-modal large language models (MLLMs) to address the OOD issue in image semantic communication. We propose a novel “Plan A - Plan B” framework that leverages the broad knowledge and strong generalization ability of an MLLM to assist a conventional ML model when the latter encounters an OOD input in the semantic encoding process. Furthermore, we propose a Bayesian optimization scheme that reshapes the probability distribution of the MLLM’s inference process based on the contextual information of the image. The optimization scheme significantly enhances the MLLM’s performance in semantic compression by 1) filtering out irrelevant vocabulary in the original MLLM output; and 2) using contextual similarities between prospective answers of the MLLM and the background information as prior knowledge to modify the MLLM’s probability distribution during inference. Further, at the receiver side of the communication system, we put forth a “generate-criticize” framework that utilizes the cooperation of multiple MLLMs to enhance the reliability of image reconstruction.
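The "Plan A - Plan B" dispatch described above can be sketched as a simple confidence-gated fallback: run the conventional encoder unless an OOD signal says its output cannot be trusted, then hand the input to the MLLM. The confidence function, threshold, and stub encoders below are hypothetical stand-ins, not the paper's actual components.

```python
def encode_with_fallback(features, plan_a, plan_b, confidence_fn, tau=0.5):
    """Run the conventional encoder (Plan A) unless its confidence signals an
    out-of-distribution input, in which case fall back to the MLLM (Plan B)."""
    if confidence_fn(features) >= tau:
        return plan_a(features), "plan_a"
    return plan_b(features), "plan_b"

# Stub components standing in for the real encoders and OOD score.
plan_a = lambda f: f"ml_tokens({f})"       # conventional ML semantic encoder
plan_b = lambda f: f"mllm_caption({f})"    # MLLM-based semantic encoder
confidence = lambda f: 0.9 if f == "street_scene" else 0.1  # toy OOD score

out_in, route_in = encode_with_fallback("street_scene", plan_a, plan_b, confidence)
out_ood, route_ood = encode_with_fallback("xray_image", plan_a, plan_b, confidence)
print(route_in, route_ood)  # street scene stays on Plan A, unseen input falls back
```

The paper's Bayesian optimization step would then act inside Plan B, reshaping the MLLM's output distribution with contextual priors before the semantic tokens are transmitted.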
IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 997-1013. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11113346
Citations: 0
A Hierarchical Feature-Based Time Series Clustering Approach for Data-Driven Capacity Planning of Cellular Networks
Pub Date: 2025-08-04 DOI: 10.1109/TMLCN.2025.3595125
Vineeta Jain;Anna Richter;Vladimir Fokow;Mathias Schweigel;Ulf Wetzker;Andreas Frotzscher
The growing popularity of cellular networks among users, driven primarily by affordable prices and high speeds, has escalated the need for strategic capacity planning to ensure a seamless end-user experience and profitable returns on network investments. Traditional capacity planning methods rely on static analysis of network parameters, aiming to minimize CAPEX and OPEX. However, to address the evolving dynamics of cellular networks, this paper advocates a data-driven approach that incorporates user behavioral analysis into the planning process, making it proactive and adaptive. We introduce a Hierarchical Feature-based Time Series Clustering (HFTSC) approach that organizes clustering in a multi-level tree structure. Each level addresses a specific aspect of the time series data using focused features, enabling explainable clustering. The proposed approach assigns labels to clusters based on the time series properties targeted at each level, generating annotated clusters while applying unsupervised clustering methods. To evaluate the effectiveness of HFTSC, we conduct a comprehensive case study using real-world data from thousands of network elements. Our evaluation examines the identified clusters from analytical and geographical perspectives, focusing on supporting network planners in data-informed decision-making and analysis. Finally, we perform an extensive comparison with several baseline methods to demonstrate the practical advantages of our approach in capacity planning and optimization.
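The multi-level idea, one focused feature per tree level with a human-readable label attached at each split, can be sketched as follows. The features (trend, volatility), thresholds, and label names are illustrative choices, not the paper's actual feature set or clustering algorithm.

```python
def trend(series):
    # Level-1 feature: mean of the second half minus mean of the first half.
    mid = len(series) // 2
    return sum(series[mid:]) / (len(series) - mid) - sum(series[:mid]) / mid

def volatility(series):
    # Level-2 feature: population variance of the series.
    mean = sum(series) / len(series)
    return sum((s - mean) ** 2 for s in series) / len(series)

def hftsc_sketch(named_series, trend_eps=0.5, vol_eps=1.0):
    """Two-level hierarchical clustering with annotated (explainable) labels."""
    labels = {}
    for name, series in named_series.items():
        t = trend(series)
        level1 = "growing" if t > trend_eps else ("declining" if t < -trend_eps else "stable")
        level2 = "bursty" if volatility(series) > vol_eps else "smooth"
        labels[name] = f"{level1}/{level2}"
    return labels

cells = {
    "cell_a": [1, 1, 1, 5, 5, 5],   # rising, high-variance traffic
    "cell_b": [2, 2, 2, 2, 2, 2],   # flat, quiet traffic
}
labels = hftsc_sketch(cells)
print(labels)  # {'cell_a': 'growing/bursty', 'cell_b': 'stable/smooth'}
```

Because each level inspects a single, named property, every cluster label doubles as an explanation, which is what makes the clustering output actionable for a network planner.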
IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 921-947. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11108703
Citations: 0
TelecomGPT: A Framework to Build Telecom-Specific Large Language Models
Pub Date: 2025-07-28 DOI: 10.1109/TMLCN.2025.3593184
Hang Zou;Qiyang Zhao;Yu Tian;Lina Bariah;Faouzi Bader;Thierry Lestable;Merouane Debbah
The emergent field of Large Language Models (LLMs) has significant potential to revolutionize how future telecom networks are designed and operated. However, mainstream LLMs lack the specialized knowledge required to understand and operate within the highly technical telecom domain. In this paper, we introduce TelecomGPT, the first telecom-specific LLM, built through a systematic adaptation pipeline designed to enhance general-purpose LLMs for telecom applications. To achieve this, we curate comprehensive telecom-specific datasets, including pre-training datasets, instruction datasets, and preference datasets, which are leveraged for continual pre-training, instruction tuning, and alignment tuning, respectively. Additionally, given the lack of widely accepted evaluation benchmarks tailored to the telecom domain, we propose three novel LLM-telecom evaluation benchmarks: Telecom Math Modeling, Telecom Open QnA, and Telecom Code Tasks. These new benchmarks provide a holistic evaluation of the capabilities of LLMs in telecom math modeling, open-ended question answering, code generation, infilling, summarization, and analysis. Using the curated datasets, our fine-tuned LLM, TelecomGPT, significantly outperforms general-purpose state-of-the-art (SOTA) LLMs, including GPT-4, Llama-3, and Mistral, particularly on the Telecom Math Modeling benchmark. Additionally, it achieves comparable performance across various evaluation benchmarks, such as TeleQnA, 3GPP technical document classification, and telecom code summarization, generation, and infilling. This work establishes a new foundation for integrating LLMs into telecom systems, paving the way for AI-powered advancements in network operations.
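The three-stage adaptation pipeline (continual pre-training, then instruction tuning, then alignment tuning, each fed by its matching dataset type) can be sketched as a simple ordered driver. The stage names follow the abstract; the `apply_stage` stub and the model/lineage representation are hypothetical stand-ins for the actual training code.

```python
# Each adaptation stage consumes its matching curated dataset type, in order.
STAGES = [
    ("continual_pretraining", "telecom pre-training corpus"),
    ("instruction_tuning", "telecom instruction dataset"),
    ("alignment_tuning", "telecom preference dataset"),
]

def apply_stage(model, stage, dataset):
    # Stub for the real fine-tuning step; records the adaptation lineage.
    return {"base": model["base"], "lineage": model["lineage"] + [(stage, dataset)]}

def build_telecom_llm(base_name):
    """Drive a general-purpose base model through the staged pipeline."""
    model = {"base": base_name, "lineage": []}
    for stage, dataset in STAGES:
        model = apply_stage(model, stage, dataset)
    return model

model = build_telecom_llm("general-purpose-llm")
print([stage for stage, _ in model["lineage"]])
```

The ordering matters: domain knowledge is injected first via continual pre-training, task behavior via instruction tuning, and preference alignment last, which mirrors the dataset-to-stage mapping stated in the abstract.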
IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 948-975. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11097898
Citations: 0
ORANSight-2.0: Foundational LLMs for O-RAN
Pub Date: 2025-07-25 DOI: 10.1109/TMLCN.2025.3592658
Pranshav Gajjar;Vijay K. Shah
Despite the transformative impact of Large Language Models (LLMs) across critical domains such as healthcare, customer service, and business marketing, their integration into Open Radio Access Networks (O-RAN) remains limited. This gap is primarily due to the absence of domain-specific foundational models, with existing solutions often relying on general-purpose LLMs that fail to address the unique challenges and technical intricacies of O-RAN. To bridge this gap, we introduce ORANSight-2.0 (O-RAN Insights), a pioneering initiative to develop specialized foundational LLMs tailored for O-RAN. Built on 18 models spanning five open-source LLM frameworks—Mistral, Qwen, Llama, Phi, and Gemma—ORANSight-2.0 fine-tunes models ranging from 1B to 70B parameters, significantly reducing reliance on proprietary, closed-source models while enhancing performance in O-RAN-specific tasks. At the core of ORANSight-2.0 is RANSTRUCT, a novel Retrieval-Augmented Generation (RAG)-based instruction-tuning framework that employs two LLM agents—a Mistral-based Question Generator and a Qwen-based Answer Generator—to create high-quality instruction-tuning datasets. The generated dataset is then used to fine-tune the 18 pre-trained open-source LLMs via QLoRA. To evaluate ORANSight-2.0, we introduce srsRANBench, a novel benchmark designed for code generation and codebase understanding in the context of srsRAN, a widely used 5G O-RAN stack. Additionally, we leverage ORAN-Bench-13K, an existing benchmark for assessing O-RAN-specific knowledge. Our comprehensive evaluations demonstrate that ORANSight-2.0 models outperform general-purpose and closed-source models, such as ChatGPT-4o and Gemini, by 5.421% on ORANBench and 18.465% on srsRANBench, achieving superior performance while maintaining lower computational and energy costs. 
We also experiment with RAG-augmented variants of ORANSight-2.0 models and observe that RAG augmentation improves performance by an average of 6.35% across benchmarks, achieving the best overall cumulative score of 0.854, which is 12.37% better than the leading closed-source alternative. We thoroughly evaluate the energy characteristics of ORANSight-2.0, demonstrating its efficiency in training, inference, and inference with RAG augmentation, ensuring optimal performance while maintaining low computational and energy costs. Additionally, the best ORANSight-2.0 configuration is compared against the available telecom LLMs, where our proposed model outperformed them with an average improvement of 27.96%.
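The RANSTRUCT idea, retrieve O-RAN text and then have one agent pose a question while another answers it from the retrieved context, can be sketched with a toy token-overlap retriever and stub agents. The retriever and agent functions below are hypothetical simplifications of the Mistral-based and Qwen-based agents, not the framework's implementation.

```python
def retrieve(query, corpus, k=1):
    # Toy retriever: rank documents by word overlap with the query.
    def overlap(doc):
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(corpus, key=overlap, reverse=True)[:k]

def question_agent(chunk):
    # Stand-in for the Mistral-based question generator.
    return f"What does the following describe: '{chunk}'?"

def answer_agent(question, chunk):
    # Stand-in for the Qwen-based answer generator, grounded in the chunk.
    return chunk

def build_instruction_pairs(queries, corpus):
    """Assemble an instruction-tuning dataset from retrieved domain text."""
    pairs = []
    for q in queries:
        for chunk in retrieve(q, corpus):
            question = question_agent(chunk)
            pairs.append({"instruction": question, "output": answer_agent(question, chunk)})
    return pairs

corpus = [
    "The near-RT RIC hosts xApps for radio resource control.",
    "srsRAN implements the 5G RAN protocol stack.",
]
pairs = build_instruction_pairs(["near-RT RIC xApps"], corpus)
print(pairs[0]["instruction"])
```

A dataset produced this way is what the paper then feeds to QLoRA fine-tuning of the 18 base models; grounding answers in retrieved chunks is what keeps the generated pairs faithful to the O-RAN sources.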
IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 903-920. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11096935
Citations: 0
User Handover Aware Hierarchical Federated Learning for Open RAN-Based Next-Generation Mobile Networks
Pub Date: 2025-07-10 DOI: 10.1109/TMLCN.2025.3587205
Amardip Kumar Singh;Kim Khoa Nguyen
The Open Radio Access Network (O-RAN) architecture, enhanced by its AI-enabled Radio Intelligent Controllers (RIC), offers a more flexible and intelligent solution for optimizing next-generation networks than traditional mobile network architectures. By leveraging a distributed structure that aligns seamlessly with O-RAN's disaggregated design, Federated Learning (FL), particularly Hierarchical FL (HFL), facilitates decentralized AI model training, improving network performance, reducing resource costs, and safeguarding user privacy. However, the dynamic nature of mobile networks, particularly the frequent handovers of User Equipment (UE) between base stations, poses significant challenges for FL model training. These challenges include managing continuously changing device sets and mitigating the impact of handover delays on global model convergence. To address them, we propose MHORANFed, a novel optimization algorithm tailored to minimize learning time and resource usage costs while preserving model performance within a mobility-aware hierarchical FL framework for O-RAN. Firstly, MHORANFed simplifies the upper layer of HFL training at the edge aggregation servers, which reduces model complexity and thereby shortens learning time and lowers resource usage cost. Secondly, it jointly optimizes bandwidth allocation and the participation of handed-over local trainers to mitigate UE handover delay in each global round. Through a rigorous convergence analysis and extensive simulation results, this work demonstrates its superiority over existing state-of-the-art methods.
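The two-level aggregation underlying hierarchical FL can be sketched as weighted averaging first at the edge servers and then at the cloud. Weighting by sample count is the standard FedAvg choice; the handover-aware scheduling itself is the paper's contribution and is not reproduced in this toy sketch.

```python
def fed_avg(models, weights):
    """Weighted average of equally shaped model vectors."""
    total = sum(weights)
    dim = len(models[0])
    return [sum(w * m[i] for w, m in zip(weights, models)) / total
            for i in range(dim)]

def hierarchical_round(edge_groups):
    # Level 1: each edge server averages the models of its attached UEs.
    edge_models, edge_sizes = [], []
    for ues in edge_groups:
        models = [ue["model"] for ue in ues]
        sizes = [ue["n_samples"] for ue in ues]
        edge_models.append(fed_avg(models, sizes))
        edge_sizes.append(sum(sizes))
    # Level 2: the cloud averages edge models, weighted by data volume.
    return fed_avg(edge_models, edge_sizes)

edge_groups = [
    [{"model": [1.0], "n_samples": 1}, {"model": [3.0], "n_samples": 1}],
    [{"model": [6.0], "n_samples": 2}],
]
global_model = hierarchical_round(edge_groups)
print(global_model)  # [4.0]
```

A UE handover mid-round would move one of the dictionaries above from one edge group to another; deciding whether the moved trainer still participates, and with how much bandwidth, is exactly the joint optimization the abstract describes.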
IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 848-863. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11075644
Citations: 0
Machine Learning Aided Resilient Spectrum Surveillance for Cognitive Tactical Wireless Networks: Design and Proof-of-Concept 认知战术无线网络的机器学习辅助弹性频谱监视:设计和概念验证
Pub Date : 2025-07-03 DOI: 10.1109/TMLCN.2025.3585849
Eli Garlick;Nourhan Hesham;MD. Zoheb Hassan;Imtiaz Ahmed;Anas Chaaban;MD. Jahangir Hossain
Cognitive tactical wireless networks (TWNs) require spectrum awareness to avoid interference and jamming in the communication channel and to assure quality-of-service in data transmission. The capability of conventional supervised machine learning (ML) algorithms to provide spectrum awareness is constrained by their requirement for labeled interference signals. Due to the vast variety of interference signals in the frequency bands used by cognitive TWNs, it is non-trivial to acquire manually labeled data sets of all interference signals. Detecting the presence of an unknown and remote interference source in a frequency band from the transmitter end is also challenging, especially when the received interference power remains at or below the noise floor. To address these issues, this paper proposes an automated interference detection framework, entitled MARSS (Machine Learning Aided Resilient Spectrum Surveillance). MARSS is a fully unsupervised method, which first extracts low-dimensional representative features from spectrograms by suppressing noise and background information and employing a convolutional neural network (CNN) with a novel loss function, and subsequently distinguishes signals with and without interference by applying an isolation forest model to the extracted features. The uniqueness of MARSS is its ability to detect hidden and unknown interference signals in multiple frequency bands without using any prior labels, thanks to its superior feature extraction capability. The capability of MARSS is further extended to infer the level of interference by designing a multi-level interference classification framework. Using extensive simulations in GNU Radio, the superiority of MARSS in detecting interference over existing ML methods is demonstrated. The effectiveness of MARSS is also validated by extensive over-the-air (OTA) experiments using software-defined radios.
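The MARSS pipeline (low-dimensional spectrogram features followed by unsupervised outlier scoring) can be sketched in miniature. Here simple per-band energies stand in for the CNN feature extractor, and a distance-to-median score stands in for the isolation forest; both are illustrative simplifications of the paper's method, and all data is synthetic.

```python
import math
import random

def band_energies(spectrogram, n_bands=4):
    """Reduce a spectrogram (rows = time, cols = frequency bins) to a
    low-dimensional feature vector of per-band energies."""
    n_bins = len(spectrogram[0])
    step = n_bins // n_bands
    return [sum(row[i] ** 2 for row in spectrogram
                for i in range(b * step, (b + 1) * step))
            for b in range(n_bands)]

def outlier_scores(features):
    """Score each feature vector by its distance to the per-dimension median,
    a crude unsupervised stand-in for an isolation forest."""
    dims = list(zip(*features))
    medians = [sorted(d)[len(d) // 2] for d in dims]
    return [math.dist(f, medians) for f in features]

random.seed(0)
# 20 clean noise-only spectrograms, plus 2 copies with a strong tone
# injected into the upper half of the band (the "interference").
clean = [[[random.gauss(0, 1) for _ in range(8)] for _ in range(16)]
         for _ in range(20)]
jammed = [[row[:4] + [v + 5 for v in row[4:]] for row in s]
          for s in clean[:2]]
scores = outlier_scores([band_energies(s) for s in clean + jammed])
```

With the fixed seed, the two jammed spectrograms (the last two scores) stand far above all clean ones, illustrating why a label-free detector can still separate interference once the features concentrate the energy differences.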
{"title":"Machine Learning Aided Resilient Spectrum Surveillance for Cognitive Tactical Wireless Networks: Design and Proof-of-Concept","authors":"Eli Garlick;Nourhan Hesham;MD. Zoheb Hassan;Imtiaz Ahmed;Anas Chaaban;MD. Jahangir Hossain","doi":"10.1109/TMLCN.2025.3585849","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3585849","url":null,"abstract":"Cognitive tactical wireless networks (TWNs) require spectrum awareness to avoid interference and jamming in the communication channel and assure quality-of-service in data transmission. Conventional supervised machine learning (ML) algorithm’s capability to provide spectrum awareness is confronted by the requirement of labeled interference signals. Due to the vast nature of interference signals in the frequency bands used by cognitive TWNs, it is non-trivial to acquire manually labeled data sets of all interference signals. Detecting the presence of an unknown and remote interference source in a frequency band from the transmitter end is also challenging, especially when the received interference power remains at or below the noise floor. To address these issues, this paper proposes an automated interference detection framework, entitled <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> (Machine Learning Aided Resilient Spectrum Surveillance). <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> is a fully unsupervised method, which first extracts the low-dimensional representative features from spectrograms by suppressing noise and background information and employing convolutional neural network (CNN) with novel loss function, and subsequently, distinguishes signals with and without interference by applying an isolation forest model on the extracted features. 
The uniqueness of <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> is its ability to detect hidden and unknown interference signals in multiple frequency bands without using any prior labels, thanks to its superior feature extraction capability. The capability of <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> is further extended to infer the level of interference by designing a multi-level interference classification framework. Using extensive simulations in GNURadio, the superiority of <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> in detecting interference over existing ML methods is demonstrated. The effectiveness <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> is also validated by extensive over-the-air (OTA) experiments using software-defined radios.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"814-834"},"PeriodicalIF":0.0,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11068948","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144646650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Novel Blockchain-Enabled Federated Learning Scheme for IoT Anomaly Detection 一种新的基于区块链的物联网异常检测联邦学习方案
Pub Date : 2025-07-03 DOI: 10.1109/TMLCN.2025.3585842
Van-Doan Nguyen;Abebe Diro;Naveen Chilamkurti;Will Heyne;Khoa Tran Phan
In this research, we propose a novel anomaly detection system (ADS) that integrates federated learning (FL) with blockchain for resource-constrained IoT. The proposed system allows IoT devices to exchange machine learning (ML) models through a permissioned blockchain, enabling trustworthy collaborative learning through model sharing. To avoid a single point of failure, any device can act as the centre of the FL process. To deal with resource constraints on IoT devices and the model poisoning problem in FL, we introduce a novel method that uses commitment coefficients and ML model discrepancies when selecting devices to join the FL process. We also propose an efficient heuristic method to aggregate a federated model from a list of ML models trained locally on the selected devices, which helps to improve the federated model's anomaly detection ability. Experimental results on the popular N-BaIoT dataset for IoT botnet attack detection show that the proposed system is more effective in detecting anomalies and resisting poisoning attacks than the two baselines (FedProx and FedAvg).
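As a heavily simplified illustration of screening FL participants before aggregation, the sketch below drops devices with a low commitment coefficient, discards updates far from the coordinate-wise median as suspected poisoning, and averages the survivors. The thresholds, the robust-median rule, and all names are assumptions for illustration, not the paper's exact selection or aggregation algorithm.

```python
from statistics import median

def distance(a, b):
    """Euclidean distance between two parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def select_and_aggregate(updates, commitments, c_min=0.5, d_max=2.0):
    """updates: device -> parameter list; commitments: device -> score in [0,1].
    Drop low-commitment devices, then drop updates whose discrepancy from the
    coordinate-wise median is large (possible poisoning), then average."""
    committed = [m for d, m in updates.items() if commitments[d] >= c_min]
    centroid = [median(m[i] for m in committed)
                for i in range(len(committed[0]))]
    kept = [m for m in committed if distance(m, centroid) <= d_max]
    return [sum(m[i] for m in kept) / len(kept)
            for i in range(len(centroid))]

updates = {"a": [1.0, 1.0], "b": [1.2, 0.8], "c": [9.0, 9.0], "d": [0.9, 1.1]}
commitments = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.2}  # "d" rarely shows up
agg = select_and_aggregate(updates, commitments)        # "c" looks poisoned
```

Here "d" is excluded for low commitment and "c" for its large model discrepancy, so the aggregate is [1.1, 0.9]; a plain FedAvg over all four devices would instead be dragged toward the poisoned update.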
{"title":"A Novel Blockchain-Enabled Federated Learning Scheme for IoT Anomaly Detection","authors":"Van-Doan Nguyen;Abebe Diro;Naveen Chilamkurti;Will Heyne;Khoa Tran Phan","doi":"10.1109/TMLCN.2025.3585842","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3585842","url":null,"abstract":"In this research, we proposed a novel anomaly detection system (ADS) that integrates federated learning (FL) with blockchain for resource-constrained IoT. The proposed system allows IoT devices to exchange machine learning (ML) models through a permissioned blockchain, enabling trustworthy collaborative learning through model sharing. To avoid single-point failure, any device can be a centre of the FL process. To deal with the issue of resource constraints in IoT devices and the model poisoning problem in FL, we introduced a novel method to use commitment coefficients and ML model discrepancies when selecting particular devices to join the FL process. We also proposed an efficient heuristic method to aggregate a federated model from a list of ML models trained locally on the selected devices, which helps to improve the federated model’s anomaly detection ability. 
The experiment results with the popular N-BaIoT dataset for IoT botnet attack detection show that the proposed system is more effective in detecting anomalies and resisting poisoning attacks than the two baselines (FedProx and FedAvg).","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"798-813"},"PeriodicalIF":0.0,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11070312","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144597804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
LLM4WM: Adapting LLM for Wireless Multi-Tasking LLM4WM:适应无线多任务的LLM
Pub Date : 2025-07-03 DOI: 10.1109/TMLCN.2025.3585845
Xuanyu Liu;Shijian Gao;Boxun Liu;Xiang Cheng;Liuqing Yang
The wireless channel is fundamental to communication, encompassing numerous tasks collectively referred to as channel-associated tasks. These tasks can leverage joint learning based on channel characteristics to share representations and enhance system design. To capitalize on this advantage, LLM4WM is proposed—a large language model (LLM) multi-task fine-tuning framework specifically tailored for channel-associated tasks. This framework utilizes a Mixture of Experts with Low-Rank Adaptation (MoE-LoRA) approach for multi-task fine-tuning, enabling the transfer of the pre-trained LLM’s general knowledge to these tasks. Given the unique characteristics of wireless channel data, preprocessing modules, adapter modules, and multi-task output layers are designed to align the channel data with the LLM’s semantic feature space. Experiments on a channel-associated multi-task dataset demonstrate that LLM4WM outperforms existing methodologies in both full-sample and few-shot evaluations, owing to its robust multi-task joint modeling and transfer learning capabilities.
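The MoE-LoRA adaptation described above can be illustrated with plain arithmetic: the frozen base weight W is shared across tasks, and a gate mixes low-rank expert updates B_e A_e that are added to the output. The shapes, gate values, and rank-1 experts below are illustrative assumptions rather than LLM4WM's actual adapter configuration.

```python
def matmul(A, B):
    """Naive matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def moe_lora_forward(x, W, experts, gate, alpha=1.0):
    """x: 1 x d input; W: d x d frozen weight; experts: list of (B, A) pairs
    with B of shape d x r and A of shape r x d; gate: mixing weights.
    Output: x @ W + alpha * sum_e gate[e] * x @ (B_e @ A_e)."""
    y = matmul(x, W)
    for g, (B, A) in zip(gate, experts):
        delta = matmul(x, matmul(B, A))
        y = [[yi + alpha * g * di for yi, di in zip(y[0], delta[0])]]
    return y

W = [[1.0, 0.0], [0.0, 1.0]]                  # frozen base weight (identity)
experts = [([[1.0], [0.0]], [[0.0, 1.0]]),    # expert 0: rank-1 pair (B, A)
           ([[0.0], [1.0]], [[1.0, 0.0]])]    # expert 1
y = moe_lora_forward([[2.0, 3.0]], W, experts, gate=[0.75, 0.25])
```

Only the small B_e and A_e factors (and the gate) would be trained, which is the point of the construction: per-task capacity at a fraction of the parameter count of fine-tuning W itself.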
{"title":"LLM4WM: Adapting LLM for Wireless Multi-Tasking","authors":"Xuanyu Liu;Shijian Gao;Boxun Liu;Xiang Cheng;Liuqing Yang","doi":"10.1109/TMLCN.2025.3585845","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3585845","url":null,"abstract":"The wireless channel is fundamental to communication, encompassing numerous tasks collectively referred to as channel-associated tasks. These tasks can leverage joint learning based on channel characteristics to share representations and enhance system design. To capitalize on this advantage, LLM4WM is proposed—a large language model (LLM) multi-task fine-tuning framework specifically tailored for channel-associated tasks. This framework utilizes a Mixture of Experts with Low-Rank Adaptation (MoE-LoRA) approach for multi-task fine-tuning, enabling the transfer of the pre-trained LLM’s general knowledge to these tasks. Given the unique characteristics of wireless channel data, preprocessing modules, adapter modules, and multi-task output layers are designed to align the channel data with the LLM’s semantic feature space. 
Experiments on a channel-associated multi-task dataset demonstrate that LLM4WM outperforms existing methodologies in both full-sample and few-shot evaluations, owing to its robust multi-task joint modeling and transfer learning capabilities.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"835-847"},"PeriodicalIF":0.0,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11071329","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144712086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Attention-Driven AI Model Generalization for Workload Forecasting in the Compute Continuum 计算连续体中工作负荷预测的注意力驱动人工智能模型泛化
Pub Date : 2025-06-27 DOI: 10.1109/TMLCN.2025.3584009
Berend J. D. Gort;Godfrey M. Kibalya;Angelos Antonopoulos
Effective resource management in edge-cloud networks demands precise forecasting of diverse workload resource usage. Due to the fluctuating nature of user demands, prediction models must have strong generalization abilities, ensuring high performance amidst sudden traffic changes or unfamiliar patterns. Existing approaches often struggle with handling long-term dependencies and the diversity of temporal patterns. This paper introduces OmniFORE (Framework for Optimization of Resource forecasts in Edge-cloud networks), which integrates attention-based time-series models with temporal clustering to enhance generalization and predict diverse workloads efficiently in volatile settings. By training on carefully selected subsets from extensive datasets, OmniFORE captures both short-term stability and long-term shifts in resource usage patterns. Experiments show that OmniFORE outperforms state-of-the-art methods in prediction accuracy, inference speed, and generalization to unseen data, particularly in scenarios with dynamic workload changes and varying trace variance. These improvements enable more efficient resource management in the compute continuum.
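The idea of training on carefully selected subsets can be sketched as clustering traces by simple temporal features and keeping representatives of each cluster, so both stable and bursty patterns are covered. The mean/variance features and the tiny two-means routine below (with deterministic first/last initialization) are illustrative stand-ins for the paper's temporal clustering, not OmniFORE's actual procedure.

```python
def trace_features(trace):
    """Summarize a workload trace by its mean and variance."""
    mean = sum(trace) / len(trace)
    var = sum((v - mean) ** 2 for v in trace) / len(trace)
    return (mean, var)

def two_means(points, iters=10):
    """Tiny 2-cluster k-means over 2-D feature points."""
    centers = [points[0], points[-1]]          # deterministic initialization
    clusters = [[], []]
    for _ in range(iters):
        clusters = [[], []]
        for p in points:
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
            clusters[d.index(min(d))].append(p)
        centers = [tuple(sum(v) / len(c) for v in zip(*c)) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Five steady traces and five bursty traces with periodic spikes.
stable = [[10 + (i % 2) for i in range(20)] for _ in range(5)]
bursty = [[10 + (50 if i % 5 == 0 else 0) for i in range(20)] for _ in range(5)]
feats = [trace_features(t) for t in stable + bursty]
centers, clusters = two_means(feats)
```

The two clusters recover the stable and bursty regimes (low-variance vs. high-variance centers), and a training subset drawn from both would expose a forecaster to short-term stability as well as sudden shifts.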
{"title":"Attention-Driven AI Model Generalization for Workload Forecasting in the Compute Continuum","authors":"Berend J. D. Gort;Godfrey M. Kibalya;Angelos Antonopoulos","doi":"10.1109/TMLCN.2025.3584009","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3584009","url":null,"abstract":"Effective resource management in edge-cloud networks demands precise forecasting of diverse workload resource usage. Due to the fluctuating nature of user demands, prediction models must have strong generalization abilities, ensuring high performance amidst sudden traffic changes or unfamiliar patterns. Existing approaches often struggle with handling long-term dependencies and the diversity of temporal patterns. This paper introduces OmniFORE (Framework for Optimization of Resource forecasts in Edge-cloud networks), which integrates attention-based time-series models with temporal clustering to enhance generalization and predict diverse workloads efficiently in volatile settings. By training on carefully selected subsets from extensive datasets, OmniFORE captures both short-term stability and long-term shifts in resource usage patterns. Experiments show that OmniFORE outperforms state-of-the-art methods in prediction accuracy, inference speed, and generalization to unseen data, particularly in scenarios with dynamic workload changes and varying trace variance. 
These improvements enable more efficient resource management in the compute continuum.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"779-797"},"PeriodicalIF":0.0,"publicationDate":"2025-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11053768","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144597748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0