
Latest Publications: IEEE Transactions on Machine Learning in Communications and Networking

A Hierarchical Feature-Based Time Series Clustering Approach for Data-Driven Capacity Planning of Cellular Networks
Pub Date : 2025-08-04 DOI: 10.1109/TMLCN.2025.3595125
Vineeta Jain;Anna Richter;Vladimir Fokow;Mathias Schweigel;Ulf Wetzker;Andreas Frotzscher
The growing popularity of cellular networks among users, primarily due to affordable prices and high speeds, has escalated the need for strategic capacity planning to ensure a seamless end-user experience and profitable returns on network investments. Traditional capacity planning methods rely on static analysis of network parameters with the aim of minimizing CAPEX and OPEX. However, to address the evolving dynamics of cellular networks, this paper advocates a data-driven approach that considers user behavioral analysis in the planning process to make it proactive and adaptive. We introduce a Hierarchical Feature-based Time Series Clustering (HFTSC) approach that organizes clustering in a multi-level tree structure. Each level addresses a specific aspect of time series data using focused features, enabling explainable clustering. The proposed approach assigns labels to clusters based on the time series properties targeted at each level, generating annotated clusters while applying unsupervised clustering methods. To evaluate the effectiveness of HFTSC, we conduct a comprehensive case study using real-world data from thousands of network elements. Our evaluation examines the identified clusters from analytical and geographical perspectives, focusing on supporting network planners in data-informed decision-making and analysis. Finally, we perform an extensive comparison with several baseline methods to reflect the practical advantages of our approach in capacity planning and optimization.
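The multi-level clustering idea is straightforward to prototype. Below is a minimal sketch, assuming hourly-load series stored as rows of a NumPy array and two illustrative levels (a volume level and a shape level); the feature choices, level names, and cluster counts are placeholders, not the paper's configuration.

```python
# Sketch of hierarchical feature-based clustering: each tree level clusters
# on its own focused features and appends a human-readable label.
import numpy as np
from sklearn.cluster import KMeans

def level_features(series, level):
    if level == 0:   # volume level: overall load magnitude
        return np.c_[series.mean(axis=1), series.max(axis=1)]
    # shape level: strength of daily periodicity via lag-24 autocorrelation
    lag = 24
    ac = [np.corrcoef(s[:-lag], s[lag:])[0, 1] for s in series]
    return np.array(ac).reshape(-1, 1)

def hftsc(series, k_per_level=(3, 2), level_names=("volume", "shape")):
    labels = [""] * len(series)
    groups = {(): np.arange(len(series))}       # tree paths -> member indices
    for lvl, k in enumerate(k_per_level):
        new_groups = {}
        for path, idx in groups.items():
            feats = level_features(series[idx], lvl)
            cl = KMeans(n_clusters=min(k, len(idx)), n_init=10).fit_predict(feats)
            for c in np.unique(cl):
                sub = idx[cl == c]
                for i in sub:
                    labels[i] += f"{level_names[lvl]}={c};"
                new_groups[path + (c,)] = sub
        groups = new_groups
    return labels

# toy usage: 50 hourly-load series, one week long
rng = np.random.default_rng(0)
X = rng.random((50, 168)) + np.sin(np.arange(168) * 2 * np.pi / 24)
print(hftsc(X)[:3])
```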
{"title":"A Hierarchical Feature-Based Time Series Clustering Approach for Data-Driven Capacity Planning of Cellular Networks","authors":"Vineeta Jain;Anna Richter;Vladimir Fokow;Mathias Schweigel;Ulf Wetzker;Andreas Frotzscher","doi":"10.1109/TMLCN.2025.3595125","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3595125","url":null,"abstract":"The growing popularity of cellular networks among users, primarily due to affordable prices and high speeds, has escalated the need for strategic capacity planning to ensure a seamless end-user experience and profitable returns on network investments. Traditional capacity planning methods rely on static analysis of network parameters with the aim of minimizing the CAPEX and the OPEX. However, to address the evolving dynamics of cellular networks, this paper advocates for a data-driven approach that considers user behavioral analysis in the planning process to make it proactive and adaptive. We introduce a Hierarchical Feature-based Time Series Clustering (HFTSC) approach that organizes clustering in a multi-level tree structure. Each level addresses a specific aspect of time series data using focused features, enabling explainable clustering. The proposed approach assigns labels to clusters based on the time series properties targeted at each level, generating annotated clusters while applying unsupervised clustering methods. To evaluate the effectiveness of HFTSC, we conduct a comprehensive case study using real-world data from thousands of network elements. Our evaluation examines the identified clusters from analytical and geographical perspectives, focusing on supporting network planners in data-informed decision-making and analysis. Finally, we perform an extensive comparison with several baseline methods to reflect the practical advantages of our approach in capacity planning and optimization.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"921-947"},"PeriodicalIF":0.0,"publicationDate":"2025-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11108703","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144831870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
TelecomGPT: A Framework to Build Telecom-Specific Large Language Models
Pub Date : 2025-07-28 DOI: 10.1109/TMLCN.2025.3593184
Hang Zou;Qiyang Zhao;Yu Tian;Lina Bariah;Faouzi Bader;Thierry Lestable;Merouane Debbah
The emergent field of Large Language Models (LLMs) has significant potential to revolutionize how future telecom networks are designed and operated. However, mainstream LLMs lack the specialized knowledge required to understand and operate within the highly technical telecom domain. In this paper, we introduce TelecomGPT, the first telecom-specific LLM, built through a systematic adaptation pipeline designed to enhance general-purpose LLMs for telecom applications. To achieve this, we curate comprehensive telecom-specific datasets, including pre-training datasets, instruction datasets, and preference datasets. These datasets are leveraged for continual pre-training, instruction tuning, and alignment tuning, respectively. Additionally, due to the lack of widely accepted evaluation benchmarks tailored for the telecom domain, we propose three novel LLM-Telecom evaluation benchmarks, namely, Telecom Math Modeling, Telecom Open QnA, and Telecom Code Tasks. These new benchmarks provide a holistic evaluation of the capabilities of LLMs in telecom math modeling, open-ended question answering, code generation, infilling, summarization, and analysis. Using the curated datasets, our fine-tuned LLM, TelecomGPT, significantly outperforms general-purpose state-of-the-art (SOTA) LLMs, including GPT-4, Llama-3, and Mistral, particularly in Telecom Math Modeling benchmarks. Additionally, it achieves comparable performance across various evaluation benchmarks, such as TeleQnA, 3GPP technical document classification, telecom code summarization, generation, and infilling. This work establishes a new foundation for integrating LLMs into telecom systems, paving the way for AI-powered advancements in network operations.
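The three-stage pipeline maps onto three training objectives. The sketch below assumes a generic causal LM: a plain next-token loss for continual pre-training, prompt masking for instruction tuning, and a DPO-style preference loss as one concrete instance of alignment tuning. The paper does not specify these exact formulations, so treat the shapes and hyperparameters as illustrative.

```python
import torch
import torch.nn.functional as F

def lm_loss(logits, targets, ignore_index=-100):
    # stages 1 and 2: shifted next-token cross-entropy
    return F.cross_entropy(logits[:, :-1].reshape(-1, logits.size(-1)),
                           targets[:, 1:].reshape(-1),
                           ignore_index=ignore_index)

def instruction_targets(input_ids, prompt_len):
    # stage 2: mask prompt tokens so only the response is learned
    t = input_ids.clone()
    t[:, :prompt_len] = -100
    return t

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    # stage 3 (one possible alignment objective): Direct Preference
    # Optimization on (chosen, rejected) sequence log-probabilities
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -F.logsigmoid(beta * margin).mean()

# toy shapes: batch of 2 sequences, length 8, vocabulary of 100 tokens
logits = torch.randn(2, 8, 100)
ids = torch.randint(0, 100, (2, 8))
print(lm_loss(logits, ids))                          # stage 1 (raw corpus)
print(lm_loss(logits, instruction_targets(ids, 4)))  # stage 2 (instructions)
print(dpo_loss(torch.tensor(-3.0), torch.tensor(-4.0),
               torch.tensor(-3.5), torch.tensor(-3.5)))
```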
{"title":"TelecomGPT: A Framework to Build Telecom-Specific Large Language Models","authors":"Hang Zou;Qiyang Zhao;Yu Tian;Lina Bariah;Faouzi Bader;Thierry Lestable;Merouane Debbah","doi":"10.1109/TMLCN.2025.3593184","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3593184","url":null,"abstract":"The emergent field of Large Language Models (LLMs) has significant potential to revolutionize how future telecom networks are designed and operated. However, mainstream Large Language Models (LLMs) lack the specialized knowledge required to understand and operate within the highly technical telecom domain. In this paper, we introduce TelecomGPT, the first telecom-specific LLM, built through a systematic adaptation pipeline designed to enhance general-purpose LLMs for telecom applications. To achieve this, we curate comprehensive telecom-specific datasets, including pre-training datasets, instruction datasets, and preference datasets. These datasets are leveraged for continual pre-training, instruction tuning, and alignment tuning, respectively. Additionally, due to the lack of widely accepted evaluation benchmarks that are tailored for the telecom domain, we proposed three novel LLM-Telecom evaluation benchmarks, namely, Telecom Math Modeling, Telecom Open QnA, and Telecom Code Tasks. These new benchmarks provide a holistic evaluation of the capabilities of LLMs in telecom math modeling, open-ended question answering, code generation, infilling, summarization and analysis. Using the curated datasets, our fine-tuned LLM, TelecomGPT, significantly outperforms general-purpose state of the art (SOTA) LLMs, including GPT-4, Llama-3 and Mistral, particularly in Telecom Math Modeling benchmarks. Additionally, it achieves comparable performance across various evaluation benchmarks, such as TeleQnA, 3GPP technical document classification, telecom code summarization, generation, and infilling. This work establishes a new foundation for integrating LLMs into telecom systems, paving the way for AI-powered advancements in network operations.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"948-975"},"PeriodicalIF":0.0,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11097898","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144858656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
ORANSight-2.0: Foundational LLMs for O-RAN
Pub Date : 2025-07-25 DOI: 10.1109/TMLCN.2025.3592658
Pranshav Gajjar;Vijay K. Shah
Despite the transformative impact of Large Language Models (LLMs) across critical domains such as healthcare, customer service, and business marketing, their integration into Open Radio Access Networks (O-RAN) remains limited. This gap is primarily due to the absence of domain-specific foundational models, with existing solutions often relying on general-purpose LLMs that fail to address the unique challenges and technical intricacies of O-RAN. To bridge this gap, we introduce ORANSight-2.0 (O-RAN Insights), a pioneering initiative to develop specialized foundational LLMs tailored for O-RAN. Built on 18 models spanning five open-source LLM frameworks—Mistral, Qwen, Llama, Phi, and Gemma—ORANSight-2.0 fine-tunes models ranging from 1B to 70B parameters, significantly reducing reliance on proprietary, closed-source models while enhancing performance in O-RAN-specific tasks. At the core of ORANSight-2.0 is RANSTRUCT, a novel Retrieval-Augmented Generation (RAG)-based instruction-tuning framework that employs two LLM agents—a Mistral-based Question Generator and a Qwen-based Answer Generator—to create high-quality instruction-tuning datasets. The generated dataset is then used to fine-tune the 18 pre-trained open-source LLMs via QLoRA. To evaluate ORANSight-2.0, we introduce srsRANBench, a novel benchmark designed for code generation and codebase understanding in the context of srsRAN, a widely used 5G O-RAN stack. Additionally, we leverage ORAN-Bench-13K, an existing benchmark for assessing O-RAN-specific knowledge. Our comprehensive evaluations demonstrate that ORANSight-2.0 models outperform general-purpose and closed-source models, such as ChatGPT-4o and Gemini, by 5.421% on ORANBench and 18.465% on srsRANBench, achieving superior performance while maintaining lower computational and energy costs. We also experiment with RAG-augmented variants of ORANSight-2.0 models and observe that RAG augmentation improves performance by an average of 6.35% across benchmarks, achieving the best overall cumulative score of 0.854, which is 12.37% better than the leading closed-source alternative. We thoroughly evaluate the energy characteristics of ORANSight-2.0, demonstrating its efficiency in training, inference, and inference with RAG augmentation, ensuring optimal performance while maintaining low computational and energy costs. Additionally, the best ORANSight-2.0 configuration is compared against the available telecom LLMs, where our proposed model outperformed them with an average improvement of 27.96%.
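For the fine-tuning stage, a QLoRA setup along these lines can be assembled with the Hugging Face transformers and peft libraries, as sketched below; it assumes a CUDA machine with bitsandbytes installed, and the base model id, target modules, and rank are chosen for illustration rather than taken from the paper.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# QLoRA = 4-bit NF4 quantized base weights + trainable low-rank adapters
bnb = BitsAndBytesConfig(load_in_4bit=True,
                         bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1",
                                            quantization_config=bnb,
                                            device_map="auto")
lora = LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32,
                  lora_dropout=0.05,
                  target_modules=["q_proj", "k_proj", "v_proj", "o_proj"])
model = get_peft_model(base, lora)
model.print_trainable_parameters()   # only the adapter weights are trainable
```

The instruction-tuning dataset generated by RANSTRUCT's two agents would then be fed to any standard supervised fine-tuning loop over this adapted model.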
{"title":"ORANSight-2.0: Foundational LLMs for O-RAN","authors":"Pranshav Gajjar;Vijay K. Shah","doi":"10.1109/TMLCN.2025.3592658","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3592658","url":null,"abstract":"Despite the transformative impact of Large Language Models (LLMs) across critical domains such as healthcare, customer service, and business marketing, their integration into Open Radio Access Networks (O-RAN) remains limited. This gap is primarily due to the absence of domain-specific foundational models, with existing solutions often relying on general-purpose LLMs that fail to address the unique challenges and technical intricacies of O-RAN. To bridge this gap, we introduce ORANSight-2.0 (O-RAN Insights), a pioneering initiative to develop specialized foundational LLMs tailored for O-RAN. Built on 18 models spanning five open-source LLM frameworks—Mistral, Qwen, Llama, Phi, and Gemma—ORANSight-2.0 fine-tunes models ranging from 1B to 70B parameters, significantly reducing reliance on proprietary, closed-source models while enhancing performance in O-RAN-specific tasks. At the core of ORANSight-2.0 is RANSTRUCT, a novel Retrieval-Augmented Generation (RAG)-based instruction-tuning framework that employs two LLM agents—a Mistral-based Question Generator and a Qwen-based Answer Generator—to create high-quality instruction-tuning datasets. The generated dataset is then used to fine-tune the 18 pre-trained open-source LLMs via QLoRA. To evaluate ORANSight-2.0, we introduce srsRANBench, a novel benchmark designed for code generation and codebase understanding in the context of srsRAN, a widely used 5G O-RAN stack. Additionally, we leverage ORAN-Bench-13K, an existing benchmark for assessing O-RAN-specific knowledge. Our comprehensive evaluations demonstrate that ORANSight-2.0 models outperform general-purpose and closed-source models, such as ChatGPT-4o and Gemini, by 5.421% on ORANBench and 18.465% on srsRANBench, achieving superior performance while maintaining lower computational and energy costs. We also experiment with RAG-augmented variants of ORANSight-2.0 models and observe that RAG augmentation improves performance by an average of 6.35% across benchmarks, achieving the best overall cumulative score of 0.854, which is 12.37% better than the leading closed-source alternative. We thoroughly evaluate the energy characteristics of ORANSight-2.0, demonstrating its efficiency in training, inference, and inference with RAG augmentation, ensuring optimal performance while maintaining low computational and energy costs. Additionally, the best ORANSight-2.0 configuration is compared against the available telecom LLMs, where our proposed model outperformed them with an average improvement of 27.96%.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"903-920"},"PeriodicalIF":0.0,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11096935","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144831852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
User Handover Aware Hierarchical Federated Learning for Open RAN-Based Next-Generation Mobile Networks
Pub Date : 2025-07-10 DOI: 10.1109/TMLCN.2025.3587205
Amardip Kumar Singh;Kim Khoa Nguyen
The Open Radio Access Network (O-RAN) architecture, enhanced by its AI-enabled RAN Intelligent Controllers (RICs), offers a more flexible and intelligent solution for optimizing next-generation networks than traditional mobile network architectures. By leveraging its distributed structure, which aligns seamlessly with O-RAN's disaggregated design, Federated Learning (FL), particularly Hierarchical FL (HFL), facilitates decentralized AI model training, improving network performance, reducing resource costs, and safeguarding user privacy. However, the dynamic nature of mobile networks, particularly the frequent handovers of User Equipment (UE) between base stations, poses significant challenges for FL model training. These challenges include managing continuously changing device sets and mitigating the impact of handover delays on global model convergence. To address these challenges, we propose MHORANFed, a novel optimization algorithm tailored to minimize learning time and resource usage costs while preserving model performance within a mobility-aware hierarchical FL framework for O-RAN. First, MHORANFed simplifies the upper layer of HFL training at edge aggregation servers, which reduces model complexity and thereby lowers both learning time and resource usage cost. Second, it jointly optimizes bandwidth allocation and the participation of handed-over local trainers to mitigate UE handover delay in each global round. Through a rigorous convergence analysis and extensive simulation results, this work demonstrates its superiority over existing state-of-the-art methods. Furthermore, our findings underscore significant improvements in FL training efficiency, paving the way for advanced applications such as autonomous driving and augmented reality in 5G and next-generation O-RAN networks.
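A toy version of the two-level aggregation helps fix ideas: clients average into edge models, edge models average into the global model, and a handover simply reassigns a client between rounds. Everything below (scalar models, the local step, sample counts) is an illustrative stand-in for the paper's optimized scheme, not MHORANFed itself.

```python
import numpy as np

def fedavg(updates, sizes):
    # sample-size-weighted average of model vectors
    return np.average(np.stack(updates), axis=0, weights=np.asarray(sizes, float))

def hierarchical_round(global_w, edges, local_step):
    edge_models, edge_sizes = [], []
    for clients in edges.values():                 # lower level: edge servers
        updates = [local_step(global_w, c) for c in clients]
        edge_models.append(fedavg(updates, [c["n"] for c in clients]))
        edge_sizes.append(sum(c["n"] for c in clients))
    return fedavg(edge_models, edge_sizes)         # upper level: cloud

# usage: 6 clients across 2 edges; a UE hands over from edge A to B mid-training
rng = np.random.default_rng(1)
clients = [{"id": i, "n": 100 + 10 * i, "x": rng.normal(i, 1, 50)} for i in range(6)]
edges = {"A": clients[:3], "B": clients[3:]}
w = np.zeros(1)

def local_step(w, c):       # one gradient step toward the client's data mean
    return w - 0.5 * (w - c["x"].mean())

for rnd in range(3):
    w = hierarchical_round(w, edges, local_step)
    if rnd == 0:            # handover: third client of edge A moves to edge B
        edges["B"].append(edges["A"].pop(2))
print(w)
```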
{"title":"User Handover Aware Hierarchical Federated Learning for Open RAN-Based Next-Generation Mobile Networks","authors":"Amardip Kumar Singh;Kim Khoa Nguyen","doi":"10.1109/TMLCN.2025.3587205","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3587205","url":null,"abstract":"The Open Radio Access Network (O-RAN) architecture, enhanced by its AI-enabled Radio Intelligent Controllers (RIC), offers a more flexible and intelligent solution to optimize next generation networks compared to traditional mobile network architectures. By leveraging its distributed structure, which aligns seamlessly with O-RAN’s disaggregated design, Federated Learning (FL), particularly Hierarchical FL, facilitates decentralized AI model training, improving network performance, reducing resource costs, and safeguarding user privacy. However, the dynamic nature of mobile networks, particularly the frequent handovers of User Equipment (UE) between base stations, poses significant challenges for FL model training. These challenges include managing continuously changing device sets and mitigating the impact of handover delays on global model convergence. To address these challenges, we propose MHORANFed, a novel optimization algorithm tailored to minimize learning time and resource usage costs while preserving model performance within a mobility-aware hierarchical FL framework for O-RAN. Firstly, MHORANFed simplifies the upper layer of the HFL training at edge aggregate servers, which reduces the model complexity and thereby improves the learning time and the resource usage cost. Secondly, it uses jointly optimized bandwidth resource allocation and handed over local trainers’ participation to mitigate the UE handover delay in each global round. Through a rigorous convergence analysis and extensive simulation results, this work demonstrates its superiority over existing state-of-the-art methods. Furthermore, our findings underscore significant improvements in FL training efficiency, paving the way for advanced applications such as autonomous driving and augmented reality in 5G and next-generation O-RAN networks.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"848-863"},"PeriodicalIF":0.0,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11075644","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144739815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Machine Learning Aided Resilient Spectrum Surveillance for Cognitive Tactical Wireless Networks: Design and Proof-of-Concept
Pub Date : 2025-07-03 DOI: 10.1109/TMLCN.2025.3585849
Eli Garlick;Nourhan Hesham;MD. Zoheb Hassan;Imtiaz Ahmed;Anas Chaaban;MD. Jahangir Hossain
Cognitive tactical wireless networks (TWNs) require spectrum awareness to avoid interference and jamming in the communication channel and to assure quality-of-service in data transmission. Conventional supervised machine learning (ML) algorithms' capability to provide spectrum awareness is confronted by the requirement of labeled interference signals. Due to the vast variety of interference signals in the frequency bands used by cognitive TWNs, it is non-trivial to acquire manually labeled data sets of all interference signals. Detecting the presence of an unknown and remote interference source in a frequency band from the transmitter end is also challenging, especially when the received interference power remains at or below the noise floor. To address these issues, this paper proposes an automated interference detection framework, entitled MARSS (Machine Learning Aided Resilient Spectrum Surveillance). MARSS is a fully unsupervised method, which first extracts low-dimensional representative features from spectrograms by suppressing noise and background information and employing a convolutional neural network (CNN) with a novel loss function, and subsequently distinguishes signals with and without interference by applying an isolation forest model to the extracted features. The uniqueness of MARSS is its ability to detect hidden and unknown interference signals in multiple frequency bands without using any prior labels, thanks to its superior feature extraction capability. The capability of MARSS is further extended to infer the level of interference by designing a multi-level interference classification framework. Using extensive simulations in GNU Radio, the superiority of MARSS in detecting interference over existing ML methods is demonstrated. The effectiveness of MARSS is also validated by extensive over-the-air (OTA) experiments using software-defined radios.
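The detection pipeline, CNN feature extraction followed by an isolation forest, can be sketched as below; the untrained toy encoder, input shape, and contamination rate are assumptions for illustration, not the paper's trained design or its novel loss function.

```python
import torch
import torch.nn as nn
from sklearn.ensemble import IsolationForest

class Encoder(nn.Module):
    """Tiny CNN mapping a (1, 64, 64) spectrogram to a 16-d embedding."""
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, dim))

    def forward(self, x):
        return self.net(x)

enc = Encoder().eval()
spectrograms = torch.rand(200, 1, 64, 64)        # placeholder spectrogram batch
with torch.no_grad():
    feats = enc(spectrograms).numpy()

# The isolation forest flags embeddings that are isolated quickly:
# -1 = interference suspected, 1 = clean channel; no labels are used.
iforest = IsolationForest(contamination=0.1, random_state=0).fit(feats)
print(iforest.predict(feats)[:10])
```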
{"title":"Machine Learning Aided Resilient Spectrum Surveillance for Cognitive Tactical Wireless Networks: Design and Proof-of-Concept","authors":"Eli Garlick;Nourhan Hesham;MD. Zoheb Hassan;Imtiaz Ahmed;Anas Chaaban;MD. Jahangir Hossain","doi":"10.1109/TMLCN.2025.3585849","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3585849","url":null,"abstract":"Cognitive tactical wireless networks (TWNs) require spectrum awareness to avoid interference and jamming in the communication channel and assure quality-of-service in data transmission. Conventional supervised machine learning (ML) algorithm’s capability to provide spectrum awareness is confronted by the requirement of labeled interference signals. Due to the vast nature of interference signals in the frequency bands used by cognitive TWNs, it is non-trivial to acquire manually labeled data sets of all interference signals. Detecting the presence of an unknown and remote interference source in a frequency band from the transmitter end is also challenging, especially when the received interference power remains at or below the noise floor. To address these issues, this paper proposes an automated interference detection framework, entitled <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> (Machine Learning Aided Resilient Spectrum Surveillance). <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> is a fully unsupervised method, which first extracts the low-dimensional representative features from spectrograms by suppressing noise and background information and employing convolutional neural network (CNN) with novel loss function, and subsequently, distinguishes signals with and without interference by applying an isolation forest model on the extracted features. The uniqueness of <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> is its ability to detect hidden and unknown interference signals in multiple frequency bands without using any prior labels, thanks to its superior feature extraction capability. The capability of <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> is further extended to infer the level of interference by designing a multi-level interference classification framework. Using extensive simulations in GNURadio, the superiority of <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> in detecting interference over existing ML methods is demonstrated. The effectiveness <inline-formula> <tex-math>$textsf {MARSS}$ </tex-math></inline-formula> is also validated by extensive over-the-air (OTA) experiments using software-defined radios.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"814-834"},"PeriodicalIF":0.0,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11068948","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144646650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Novel Blockchain-Enabled Federated Learning Scheme for IoT Anomaly Detection
Pub Date : 2025-07-03 DOI: 10.1109/TMLCN.2025.3585842
Van-Doan Nguyen;Abebe Diro;Naveen Chilamkurti;Will Heyne;Khoa Tran Phan
In this research, we propose a novel anomaly detection system (ADS) that integrates federated learning (FL) with blockchain for resource-constrained IoT. The proposed system allows IoT devices to exchange machine learning (ML) models through a permissioned blockchain, enabling trustworthy collaborative learning through model sharing. To avoid single-point failure, any device can act as the centre of the FL process. To deal with the issue of resource constraints in IoT devices and the model poisoning problem in FL, we introduce a novel method that uses commitment coefficients and ML model discrepancies when selecting particular devices to join the FL process. We also propose an efficient heuristic method to aggregate a federated model from a list of ML models trained locally on the selected devices, which helps to improve the federated model's anomaly detection ability. The experiment results with the popular N-BaIoT dataset for IoT botnet attack detection show that the proposed system is more effective in detecting anomalies and resisting poisoning attacks than the two baselines (FedProx and FedAvg).
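A minimal sketch of the selection idea follows: score each device by a commitment coefficient minus its model's discrepancy from the median model (a crude poisoning filter), then aggregate only the top-scoring devices. The scoring rule and constants here are hypothetical illustrations of the concept, not the paper's exact method.

```python
import numpy as np

def select_devices(models, commitment, disc_weight=1.0, top_k=5):
    stacked = np.stack(models)
    median = np.median(stacked, axis=0)                 # robust reference model
    discrepancy = np.linalg.norm(stacked - median, axis=1)
    score = np.asarray(commitment) - disc_weight * discrepancy
    return np.argsort(score)[::-1][:top_k]              # highest scores first

rng = np.random.default_rng(0)
models = [rng.normal(0, 1, 10) for _ in range(8)]
models[3] += 25.0                        # a poisoned update, far from the rest
commitment = rng.uniform(0.5, 1.0, 8)    # e.g., past-round participation rate
chosen = select_devices(models, commitment)
aggregated = np.mean([models[i] for i in chosen], axis=0)
print("selected devices:", chosen)       # the poisoned client 3 is excluded
```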
{"title":"A Novel Blockchain-Enabled Federated Learning Scheme for IoT Anomaly Detection","authors":"Van-Doan Nguyen;Abebe Diro;Naveen Chilamkurti;Will Heyne;Khoa Tran Phan","doi":"10.1109/TMLCN.2025.3585842","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3585842","url":null,"abstract":"In this research, we proposed a novel anomaly detection system (ADS) that integrates federated learning (FL) with blockchain for resource-constrained IoT. The proposed system allows IoT devices to exchange machine learning (ML) models through a permissioned blockchain, enabling trustworthy collaborative learning through model sharing. To avoid single-point failure, any device can be a centre of the FL process. To deal with the issue of resource constraints in IoT devices and the model poisoning problem in FL, we introduced a novel method to use commitment coefficients and ML model discrepancies when selecting particular devices to join the FL process. We also proposed an efficient heuristic method to aggregate a federated model from a list of ML models trained locally on the selected devices, which helps to improve the federated model’s anomaly detection ability. The experiment results with the popular N-BaIoT dataset for IoT botnet attack detection show that the proposed system is more effective in detecting anomalies and resisting poisoning attacks than the two baselines (FedProx and FedAvg).","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"798-813"},"PeriodicalIF":0.0,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11070312","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144597804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
LLM4WM: Adapting LLM for Wireless Multi-Tasking
Pub Date : 2025-07-03 DOI: 10.1109/TMLCN.2025.3585845
Xuanyu Liu;Shijian Gao;Boxun Liu;Xiang Cheng;Liuqing Yang
The wireless channel is fundamental to communication, encompassing numerous tasks collectively referred to as channel-associated tasks. These tasks can leverage joint learning based on channel characteristics to share representations and enhance system design. To capitalize on this advantage, LLM4WM is proposed—a large language model (LLM) multi-task fine-tuning framework specifically tailored for channel-associated tasks. This framework utilizes a Mixture of Experts with Low-Rank Adaptation (MoE-LoRA) approach for multi-task fine-tuning, enabling the transfer of the pre-trained LLM’s general knowledge to these tasks. Given the unique characteristics of wireless channel data, preprocessing modules, adapter modules, and multi-task output layers are designed to align the channel data with the LLM’s semantic feature space. Experiments on a channel-associated multi-task dataset demonstrate that LLM4WM outperforms existing methodologies in both full-sample and few-shot evaluations, owing to its robust multi-task joint modeling and transfer learning capabilities.
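One way to realize a MoE-LoRA layer is to attach several low-rank adapters to a frozen base projection and mix their outputs with a learned gate, as in the sketch below; the dimensions, expert count, and rank are illustrative, not the paper's settings.

```python
import torch
import torch.nn as nn

class MoELoRALinear(nn.Module):
    """Frozen base linear map plus K low-rank experts mixed by a gate."""
    def __init__(self, d_in, d_out, n_experts=4, rank=8):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        self.base.weight.requires_grad_(False)   # pre-trained weight stays frozen
        self.A = nn.Parameter(torch.randn(n_experts, rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_experts, d_out, rank))
        self.gate = nn.Linear(d_in, n_experts)

    def forward(self, x):                        # x: (batch, d_in)
        g = torch.softmax(self.gate(x), dim=-1)  # per-sample expert weights
        low = torch.einsum("krd,bd->bkr", self.A, x)     # down-project per expert
        delta = torch.einsum("kor,bkr->bko", self.B, low)  # up-project per expert
        return self.base(x) + (g.unsqueeze(-1) * delta).sum(dim=1)

layer = MoELoRALinear(32, 16)
print(layer(torch.randn(4, 32)).shape)           # torch.Size([4, 16])
```

During multi-task fine-tuning, only the adapters and gate receive gradients, so each channel-associated task can lean on a different mixture of experts while sharing the pre-trained backbone.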
{"title":"LLM4WM: Adapting LLM for Wireless Multi-Tasking","authors":"Xuanyu Liu;Shijian Gao;Boxun Liu;Xiang Cheng;Liuqing Yang","doi":"10.1109/TMLCN.2025.3585845","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3585845","url":null,"abstract":"The wireless channel is fundamental to communication, encompassing numerous tasks collectively referred to as channel-associated tasks. These tasks can leverage joint learning based on channel characteristics to share representations and enhance system design. To capitalize on this advantage, LLM4WM is proposed—a large language model (LLM) multi-task fine-tuning framework specifically tailored for channel-associated tasks. This framework utilizes a Mixture of Experts with Low-Rank Adaptation (MoE-LoRA) approach for multi-task fine-tuning, enabling the transfer of the pre-trained LLM’s general knowledge to these tasks. Given the unique characteristics of wireless channel data, preprocessing modules, adapter modules, and multi-task output layers are designed to align the channel data with the LLM’s semantic feature space. Experiments on a channel-associated multi-task dataset demonstrate that LLM4WM outperforms existing methodologies in both full-sample and few-shot evaluations, owing to its robust multi-task joint modeling and transfer learning capabilities.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"835-847"},"PeriodicalIF":0.0,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11071329","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144712086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Attention-Driven AI Model Generalization for Workload Forecasting in the Compute Continuum
Pub Date : 2025-06-27 DOI: 10.1109/TMLCN.2025.3584009
Berend J. D. Gort;Godfrey M. Kibalya;Angelos Antonopoulos
Effective resource management in edge-cloud networks demands precise forecasting of diverse workload resource usage. Due to the fluctuating nature of user demands, prediction models must have strong generalization abilities, ensuring high performance amidst sudden traffic changes or unfamiliar patterns. Existing approaches often struggle with handling long-term dependencies and the diversity of temporal patterns. This paper introduces OmniFORE (Framework for Optimization of Resource forecasts in Edge-cloud networks), which integrates attention-based time-series models with temporal clustering to enhance generalization and predict diverse workloads efficiently in volatile settings. By training on carefully selected subsets from extensive datasets, OmniFORE captures both short-term stability and long-term shifts in resource usage patterns. Experiments show that OmniFORE outperforms state-of-the-art methods in prediction accuracy, inference speed, and generalization to unseen data, particularly in scenarios with dynamic workload changes and varying trace variance. These improvements enable more efficient resource management in the compute continuum.
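The subset-selection side of this idea can be sketched simply: cluster traces on a few temporal statistics and sample evenly from each cluster so training covers both stable and volatile regimes. The statistics and counts below are assumptions for illustration, not OmniFORE's actual feature set.

```python
import numpy as np
from sklearn.cluster import KMeans

def temporal_stats(traces):
    # level, dispersion, and short-term volatility of each trace
    diffs = np.diff(traces, axis=1)
    return np.c_[traces.mean(1), traces.std(1), np.abs(diffs).mean(1)]

def balanced_subset(traces, n_clusters=4, per_cluster=10, seed=0):
    labels = KMeans(n_clusters, n_init=10, random_state=seed).fit_predict(
        temporal_stats(traces))
    rng = np.random.default_rng(seed)
    picks = []
    for c in range(n_clusters):
        idx = np.flatnonzero(labels == c)
        picks.extend(rng.choice(idx, min(per_cluster, len(idx)), replace=False))
    return np.array(picks)

rng = np.random.default_rng(0)
traces = np.r_[rng.random((60, 200)),                      # stable workloads
               np.cumsum(rng.normal(0, 1, (60, 200)), 1)]  # drifting workloads
print(balanced_subset(traces).shape)    # indices of the balanced training set
```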
{"title":"Attention-Driven AI Model Generalization for Workload Forecasting in the Compute Continuum","authors":"Berend J. D. Gort;Godfrey M. Kibalya;Angelos Antonopoulos","doi":"10.1109/TMLCN.2025.3584009","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3584009","url":null,"abstract":"Effective resource management in edge-cloud networks demands precise forecasting of diverse workload resource usage. Due to the fluctuating nature of user demands, prediction models must have strong generalization abilities, ensuring high performance amidst sudden traffic changes or unfamiliar patterns. Existing approaches often struggle with handling long-term dependencies and the diversity of temporal patterns. This paper introduces OmniFORE (Framework for Optimization of Resource forecasts in Edge-cloud networks), which integrates attention-based time-series models with temporal clustering to enhance generalization and predict diverse workloads efficiently in volatile settings. By training on carefully selected subsets from extensive datasets, OmniFORE captures both short-term stability and long-term shifts in resource usage patterns. Experiments show that OmniFORE outperforms state-of-the-art methods in prediction accuracy, inference speed, and generalization to unseen data, particularly in scenarios with dynamic workload changes and varying trace variance. These improvements enable more efficient resource management in the compute continuum.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"779-797"},"PeriodicalIF":0.0,"publicationDate":"2025-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11053768","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144597748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Accelerating Energy-Efficient Federated Learning in Cell-Free Networks With Adaptive Quantization
Pub Date : 2025-06-26 DOI: 10.1109/TMLCN.2025.3583659
Afsaneh Mahmoudi;Ming Xiao;Emil Björnson
Federated Learning (FL) enables clients to share model parameters instead of raw data, reducing communication overhead. Traditional wireless networks, however, suffer from latency issues when supporting FL. Cell-Free Massive MIMO (CFmMIMO) offers a promising alternative, as it can serve multiple clients simultaneously on shared resources, enhancing spectral efficiency and reducing latency in large-scale FL. Still, communication resource constraints at the client side can impede the completion of FL training. To tackle this issue, we propose a low-latency, energy-efficient FL framework with optimized uplink power allocation for efficient uplink communication. Our approach integrates an adaptive quantization strategy that dynamically adjusts bit allocation for local gradient updates, significantly lowering communication cost. We formulate a joint optimization problem involving FL model updates, local iterations, and power allocation. This problem is solved using sequential quadratic programming (SQP) to balance energy consumption and latency. Moreover, for local model training, clients employ the AdaDelta optimizer, which improves convergence compared to standard SGD, Adam, and RMSProp. We also provide a theoretical analysis of FL convergence under AdaDelta. Numerical results demonstrate that, under equal energy and latency budgets, our power allocation strategy improves test accuracy by up to 7% and 19% compared to Dinkelbach and max-sum rate approaches. Furthermore, across all power allocation methods, our quantization scheme outperforms AQUILA and LAQ, increasing test accuracy by up to 36% and 35%, respectively.
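The core quantizer can be sketched as stochastic uniform quantization whose bit width changes across rounds; the fixed 8/4/2-bit schedule below is a stand-in for the paper's optimized adaptive allocation, and shows how fewer bits trade accuracy for communication cost.

```python
import numpy as np

def quantize(g, bits, rng):
    """Stochastic uniform quantization of a gradient vector to `bits` bits."""
    levels = 2 ** bits - 1
    scale = np.abs(g).max() or 1.0               # guard against all-zero g
    norm = (g / scale + 1) / 2 * levels          # map [-scale, scale] -> [0, levels]
    low = np.floor(norm)
    q = low + (rng.random(g.shape) < (norm - low))   # unbiased stochastic rounding
    return (q / levels * 2 - 1) * scale          # dequantize back

rng = np.random.default_rng(0)
g = rng.normal(0, 1, 1000)
for bits in (8, 4, 2):   # an adaptive scheme would pick bits per round/client
    err = np.linalg.norm(quantize(g, bits, rng) - g) / np.linalg.norm(g)
    print(f"{bits}-bit relative error: {err:.3f}")
```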
{"title":"Accelerating Energy-Efficient Federated Learning in Cell-Free Networks With Adaptive Quantization","authors":"Afsaneh Mahmoudi;Ming Xiao;Emil Björnson","doi":"10.1109/TMLCN.2025.3583659","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3583659","url":null,"abstract":"Federated Learning (FL) enables clients to share model parameters instead of raw data, reducing communication overhead. Traditional wireless networks, however, suffer from latency issues when supporting FL. Cell-Free Massive MIMO (CFmMIMO) offers a promising alternative, as it can serve multiple clients simultaneously on shared resources, enhancing spectral efficiency and reducing latency in large-scale FL. Still, communication resource constraints at the client side can impede the completion of FL training. To tackle this issue, we propose a low-latency, energy-efficient FL framework with optimized uplink power allocation for efficient uplink communication. Our approach integrates an adaptive quantization strategy that dynamically adjusts bit allocation for local gradient updates, significantly lowering communication cost. We formulate a joint optimization problem involving FL model updates, local iterations, and power allocation. This problem is solved using sequential quadratic programming (SQP) to balance energy consumption and latency. Moreover, for local model training, clients employ the AdaDelta optimizer, which improves convergence compared to standard SGD, Adam, and RMSProp. We also provide a theoretical analysis of FL convergence under AdaDelta. Numerical results demonstrate that, under equal energy and latency budgets, our power allocation strategy improves test accuracy by up to 7% and 19% compared to Dinkelbach and max-sum rate approaches. Furthermore, across all power allocation methods, our quantization scheme outperforms AQUILA and LAQ, increasing test accuracy by up to 36% and 35%, respectively.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"761-778"},"PeriodicalIF":0.0,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11052837","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144550467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SET: A Shared-Encoder Transformer Scheme for Multi-Sensor, Multi-Class Fault Classification in Industrial IoT
Pub Date : 2025-06-16 DOI: 10.1109/TMLCN.2025.3579750
Kamran Sattar Awaisi;Qiang Ye;Srinivas Sampalli
The Industrial Internet of Things (IIoT) has revolutionized the industrial sector by integrating sensors to monitor equipment health and optimize production processes. These sensors collect real-time data and are prone to a variety of different faults, such as bias, drift, noise, gain, spike, and constant faults. Such faults can lead to significant operational problems, including false results, incorrect predictions, and misleading maintenance decisions. Therefore, classifying sensor data appropriately is essential for ensuring the reliability and efficiency of IIoT systems. In this paper, we propose the Shared-Encoder Transformer (SET) scheme for multi-sensor, multi-class fault classification in IIoT systems. Leveraging the transformer architecture, SET uses a shared encoder with positional encoding and multi-head self-attention mechanisms to capture complex temporal patterns in sensor data. Consequently, it can accurately detect the health status of sensor data, and if the sensor data is faulty, it can specifically identify the fault type. Additionally, we introduce a comprehensive fault injection strategy to address the problem of fault data scarcity, enabling validation of the robust performance of SET even with limited fault samples in both ideal and realistic scenarios. In our research, we conducted extensive experiments using the Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) and Skoltech Anomaly Benchmark (SKAB) datasets to study the performance of SET. Our experimental results indicate that SET consistently outperforms baseline methods, including Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN)-LSTM, and Multilayer Perceptron (MLP), as well as the proposed comparative variant of SET, the Multi-Encoder Transformer (MET), in terms of accuracy, precision, recall, and F1-score across different fault intensities. The shared-encoder architecture improves fault detection accuracy and ensures parameter efficiency and robustness, making it suitable for deployment in memory-constrained industrial environments.
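The shared-encoder layout is easy to sketch in PyTorch: one transformer encoder serves every sensor, with a lightweight classification head per sensor. The sizes, depth, and the seven-class output (six fault types plus a normal class) are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SharedEncoderClassifier(nn.Module):
    """One shared transformer encoder, one small head per sensor."""
    def __init__(self, n_sensors=3, n_classes=7, d_model=32, seq_len=50):
        super().__init__()
        self.proj = nn.Linear(1, d_model)                 # scalar reading -> d_model
        self.pos = nn.Parameter(torch.randn(1, seq_len, d_model) * 0.02)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # shared part
        self.heads = nn.ModuleList(
            nn.Linear(d_model, n_classes) for _ in range(n_sensors))

    def forward(self, x, sensor_id):                      # x: (batch, seq_len)
        h = self.encoder(self.proj(x.unsqueeze(-1)) + self.pos)
        return self.heads[sensor_id](h.mean(dim=1))       # pooled logits

model = SharedEncoderClassifier()
logits = model(torch.randn(8, 50), sensor_id=0)
print(logits.shape)                                       # torch.Size([8, 7])
```

Because the encoder parameters are shared across all sensors, the per-sensor cost reduces to one small linear head, which is what makes the design attractive for memory-constrained deployments.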
{"title":"SET: A Shared-Encoder Transformer Scheme for Multi-Sensor, Multi-Class Fault Classification in Industrial IoT","authors":"Kamran Sattar Awaisi;Qiang Ye;Srinivas Sampalli","doi":"10.1109/TMLCN.2025.3579750","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3579750","url":null,"abstract":"The Industrial Internet of Things (IIoT) has revolutionized the industrial sector by integrating sensors to monitor equipment health and optimize production processes. These sensors collect real-time data and are prone to a variety of different faults, such as bias, drift, noise, gain, spike, and constant faults. Such faults can lead to significant operational problems, including false results, incorrect predictions, and misleading maintenance decisions. Therefore, classifying sensor data appropriately is essential for ensuring the reliability and efficiency of IIoT systems. In this paper, we propose the Shared-Encoder Transformer (SET) scheme for multi-sensor, multi-class fault classification in IIoT systems. Leveraging the transformer architecture, the SET uses a shared encoder with positional encoding and multi-head self-attention mechanisms to capture complex temporal patterns in sensor data. Consequently, it can accurately detect the health status of sensor data, and if the sensor data is faulty, it can specifically identify the fault type. Additionally, we introduce a comprehensive fault injection strategy to address the problem of fault data scarcity, enabling the validation of the robust performance of SET even with limited fault samples in both ideal and realistic scenarios. In our research, we conducted extensive experiments using the Commercial Modular Aeropropulsion System Simulation (C-MAPSS) and Skoltech Anomaly Benchmark (SKAB) datasets to study the performance of the SET. Our experimental results indicate that SET consistently outperforms baseline methods, including Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN)-LSTM, and Multilayer Perceptron (MLP), as well as the proposed comparative variant of SET, Multi-Encoder Transformer (MET), in terms of accuracy, precision, recall, and F1-score across different fault intensities. The shared-kmencoder architecture improves fault detection accuracy and ensures parameter efficiency/robustness, making it suitable for deployment in memory-constrained industrial environments.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"744-760"},"PeriodicalIF":0.0,"publicationDate":"2025-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11037229","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144367023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0