A service function chain (SFC) consists of multiple ordered network functions (e.g., firewall, load balancer) and plays an important role in improving network security and ensuring network performance. Offloading SFCs onto programmable switches brings significant performance improvements, but it suffers from long reconfiguration delays, making it hard to cope with network workload dynamics in a timely manner. To bridge this gap, this paper presents OptRec, an efficient SFC proactive reconfiguration optimization framework based on deep reinforcement learning (DRL). OptRec takes a proactive approach: it predicts future traffic and places SFCs on programmable switches in advance to ensure timely SFC reconfiguration. However, it is non-trivial to extract effective features from historical traffic information and global network states while ensuring efficient and stable model training. To this end, OptRec introduces a multi-level feature extraction model for the different types of features, and it combines reinforcement learning with autoregressive learning to enhance model efficiency and stability. In-depth simulations based on real-world datasets show that the average prediction error of OptRec is less than 3% and that OptRec can increase system throughput by 69.6%–72.6% compared with alternative approaches.
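The proactive loop the abstract describes can be sketched in a few lines: forecast the next interval's traffic from a short history, then compute placements before the interval starts so the switch reconfiguration delay is hidden. Everything here is a hypothetical stand-in; the exponential smoother and greedy placement are illustrative assumptions, not OptRec's actual DRL model.

```python
# Illustrative sketch of a predict-then-place loop (NOT OptRec's model):
# an exponentially weighted forecast stands in for the traffic predictor,
# and a greedy capacity heuristic stands in for the DRL placement policy.

def predict_next(history, alpha=0.7):
    """One-step exponentially weighted forecast of the next-interval load."""
    forecast = history[0]
    for x in history[1:]:
        forecast = alpha * x + (1 - alpha) * forecast
    return forecast

def place_sfc(predicted_load, switch_capacities):
    """Assign each SFC's predicted load to the switch with the most
    remaining capacity, largest demands first (greedy stand-in)."""
    remaining = dict(switch_capacities)
    placement = {}
    for sfc, load in sorted(predicted_load.items(), key=lambda kv: -kv[1]):
        best = max(remaining, key=remaining.get)
        placement[sfc] = best
        remaining[best] -= load
    return placement

# Hypothetical per-SFC traffic histories (e.g., Gbps per interval).
history = {"sfc_a": [40, 45, 50], "sfc_b": [10, 12, 11]}
predicted = {sfc: predict_next(h) for sfc, h in history.items()}
plan = place_sfc(predicted, {"sw1": 100, "sw2": 60})
```

Because the placement is computed from the forecast, it can be pushed to the switches during the current interval, which is the point of the proactive design.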
Huaqing Tu, Ziqiang Hua, Qi Xu, Jun Zhu, Tao Zou, Hongli Xu, Qiao Xiang, and Zuqing Zhu, "Achieving Efficient SFC Proactive Reconfiguration Through Deep Reinforcement Learning in Programmable Networks," IEEE Transactions on Network and Service Management, vol. 22, no. 5, pp. 4917–4932, published 2025-07-03. DOI: 10.1109/TNSM.2025.3585590
Pub Date: 2025-07-02 | DOI: 10.1109/TNSM.2025.3585148
Xu Liu;Zheng-Yi Chai;Yan-Yang Cheng;Ya-Lun Li;Tao Li
Mobile Edge Computing (MEC) plays a pivotal role in optimizing the Industrial Internet of Things (IIoT), where the Industrial Task Offloading Problem (ITOP) is crucial for ensuring optimal system performance by balancing conflicting objectives such as delay, energy consumption, and cost. However, existing approaches often oversimplify multi-objective optimization by aggregating conflicting goals into a single objective, and they suffer from limited exploration and robustness in uncertain MEC scenarios within the IIoT. To overcome these limitations, we propose EMDRL-ITOP, an Evolutionary Multi-Objective Deep Reinforcement Learning algorithm that synergizes evolutionary algorithms with deep reinforcement learning (DRL). First, we formulate a multi-objective task scheduling model for IIoT-MEC and design a three-dimensional vector reward function within a Multi-Objective Markov Decision Process framework, enabling simultaneous optimization of delay, energy, and cost. Then, EMDRL-ITOP integrates evolutionary mechanisms to enhance exploration and robustness: a dynamic elite selection strategy prioritizes high-quality policies, a distillation crossover operator fuses advantageous traits from elite strategies, and a proximal mutation mechanism maintains population diversity. These components collectively improve learning efficiency and solution quality in dynamic environments. Extensive simulations across six instances demonstrate that EMDRL-ITOP achieves a superior balance among conflicting objectives compared to state-of-the-art methods, while also outperforming existing algorithms on several key performance metrics.
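The core idea of the formulation, keeping delay, energy, and cost as a reward *vector* rather than a weighted scalar, can be sketched as below. The normalization and the Pareto-dominance check that elite selection would use are illustrative assumptions, not the paper's exact definitions.

```python
# Hypothetical sketch of a three-dimensional vector reward: each
# objective is normalized to [0, 1] (higher is better) and the three
# components are returned separately, never summed into one scalar.

def vector_reward(delay, energy, cost, bounds):
    """Return one reward component per objective (delay, energy, cost)."""
    reward = []
    for value, key in ((delay, "delay"), (energy, "energy"), (cost, "cost")):
        lo, hi = bounds[key]
        clipped = min(max(value, lo), hi)
        reward.append(1.0 - (clipped - lo) / (hi - lo))
    return tuple(reward)

def pareto_dominates(r1, r2):
    """r1 dominates r2 if it is no worse in every objective and strictly
    better in at least one -- the comparison elite selection relies on."""
    return all(a >= b for a, b in zip(r1, r2)) and \
           any(a > b for a, b in zip(r1, r2))

# Illustrative bounds (ms, J, $) and two candidate offloading outcomes.
bounds = {"delay": (0, 100), "energy": (0, 50), "cost": (0, 10)}
r_fast = vector_reward(delay=20, energy=30, cost=4, bounds=bounds)
r_slow = vector_reward(delay=80, energy=40, cost=9, bounds=bounds)
```

Keeping the vector intact is what lets the evolutionary layer rank whole policies by Pareto dominance instead of committing to fixed objective weights up front.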
Xu Liu, Zheng-Yi Chai, Yan-Yang Cheng, Ya-Lun Li, and Tao Li, "Evolutionary Multi-Objective Deep Reinforcement Learning for Task Offloading in Industrial Internet of Things," IEEE Transactions on Network and Service Management, vol. 22, no. 5, pp. 5074–5089. DOI: 10.1109/TNSM.2025.3585148
Pub Date: 2025-07-02 | DOI: 10.1109/TNSM.2025.3583898
Huanlin Liu;Bing Ma;Jianjian Zhang;Yong Chen;Bo Liu;Haonan Chen;Di Deng
With the continuous advancement of network virtualization (NV) technology, virtual network embedding (VNE) has come to play a crucial role in solving the network resource allocation problem. However, multi-domain elastic optical networks (MD-EONs) increasingly face privacy and security challenges. Centralized VNE methods incur significant communication overhead due to their excessive reliance on central servers, and network attacks such as eavesdropping pose severe threats to data security. Therefore, we propose a blockchain-assisted virtual network secure embedding (BA-VNSE) framework for MD-EONs. The framework employs quantum key distribution (QKD) technology to ensure data security during transmission and leverages blockchain technology to enhance the transparency and security of the VNE process. Furthermore, we propose a blockchain-assisted minimum-cost virtual network secure embedding (BAMC-VNSE) algorithm. During virtual node mapping (VNM), the multidimensional resources of nodes are comprehensively considered to ensure effective embedding. During virtual link mapping (VLM), the QKD paths are allowed to differ from the encrypted data transmission paths, ultimately yielding the most cost-effective valid embedding scheme. Simulation results demonstrate that BAMC-VNSE effectively reduces the request blocking probability, embedding cost, and average number of messages while improving the key utilization ratio.
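The node-mapping step, ranking substrate nodes by their multidimensional resources and greedily embedding virtual nodes onto the best-scoring ones, can be sketched as follows. The product-form score and the resource field names (cpu, spectrum, key_rate) are assumptions for illustration, not BAMC-VNSE's actual metric.

```python
# Illustrative greedy virtual node mapping over multidimensional
# resources (a toy stand-in for BAMC-VNSE's node embedding stage).

def node_score(attrs):
    """Score a substrate node by the product of its resource headrooms."""
    return attrs["cpu"] * attrs["spectrum"] * attrs["key_rate"]

def embed_nodes(virtual_demands, substrate):
    """Map each virtual node (with a CPU demand) onto the highest-scoring
    substrate node that still fits it; return None if the request blocks."""
    substrate = {n: dict(a) for n, a in substrate.items()}  # work on a copy
    mapping = {}
    for vn, cpu_need in sorted(virtual_demands.items(), key=lambda kv: -kv[1]):
        candidates = [n for n, a in substrate.items()
                      if a["cpu"] >= cpu_need and n not in mapping.values()]
        if not candidates:
            return None  # blocking: no substrate node can host this one
        best = max(candidates, key=lambda n: node_score(substrate[n]))
        mapping[vn] = best
        substrate[best]["cpu"] -= cpu_need
    return mapping

# Hypothetical substrate: CPU units, spectrum slots, QKD key rate.
substrate = {
    "A": {"cpu": 10, "spectrum": 8, "key_rate": 5},
    "B": {"cpu": 6,  "spectrum": 9, "key_rate": 9},
    "C": {"cpu": 4,  "spectrum": 2, "key_rate": 2},
}
mapping = embed_nodes({"v1": 5, "v2": 4}, substrate)
```

A returned `None` corresponds to a blocked request, the quantity the paper's blocking-probability metric counts.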
Huanlin Liu, Bing Ma, Jianjian Zhang, Yong Chen, Bo Liu, Haonan Chen, and Di Deng, "Blockchain-Assisted Secure Embedding of Virtual Networks in Multi-Domain Elastic Optical Network," IEEE Transactions on Network and Service Management, vol. 22, no. 5, pp. 3838–3848. DOI: 10.1109/TNSM.2025.3583898
Pub Date: 2025-06-26 | DOI: 10.1109/TNSM.2025.3583213
Junjun Li;Shi Ying;Tiangang Li;Xiangbo Tian
Microservice systems have become a popular architecture for modern Web applications owing to their scalability, modularity, and maintainability. However, with the increasing complexity and size of these systems, anomaly detection emerges as a critical task. In this paper, we introduce TraceDAE, a trace-based anomaly detection approach for microservice systems. The approach first constructs a Service Trace Graph (STG) to depict service invocation relationships and performance metrics, and then introduces a dual autoencoder framework: the structure autoencoder employs Graph Attention Networks (GAT) to analyze the graph structure, while the attribute autoencoder leverages a Long Short-Term Memory network (LSTM) to process time-series data. This approach can effectively identify the two anomaly types, Service Response Abnormal and Service Invocation Abnormal. Moreover, experimental results on the datasets show that TraceDAE is an efficient anomaly detection approach that outperforms state-of-the-art (SOTA) trace-based anomaly detection methods, achieving F1-scores of 0.970 and 0.925, respectively.
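The scoring step of such a dual-autoencoder detector can be illustrated without the neural networks themselves: a trace is flagged when the combined structure and attribute reconstruction errors exceed a threshold. The mean-squared errors below stand in for the GAT and LSTM reconstruction losses, and the convex weighting is an assumption, not TraceDAE's actual loss.

```python
# Toy sketch of dual-autoencoder anomaly scoring: combine a structure
# reconstruction error with an attribute (latency) reconstruction error.
# The vectors and the 0.5/0.5 weighting are illustrative assumptions.

def mse(original, reconstructed):
    """Mean squared reconstruction error."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)

def anomaly_score(struct_orig, struct_rec, attr_orig, attr_rec, beta=0.5):
    """Convex combination of structure and attribute reconstruction error;
    a trace is anomalous when this exceeds a calibrated threshold."""
    return beta * mse(struct_orig, struct_rec) + (1 - beta) * mse(attr_orig, attr_rec)

# A normal trace reconstructs well; a trace with an abnormally slow
# service response (120 ms vs. ~12 ms) reconstructs poorly.
normal = anomaly_score([1, 0, 1], [0.9, 0.1, 0.95], [10, 12], [10.5, 11.8])
slow_call = anomaly_score([1, 0, 1], [0.9, 0.1, 0.95], [10, 120], [10.5, 11.8])
threshold = 1.0
```

An autoencoder trained only on normal traces reconstructs anomalies badly by construction, which is what makes the reconstruction error usable as a score.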
Junjun Li, Shi Ying, Tiangang Li, and Xiangbo Tian, "TraceDAE: Trace-Based Anomaly Detection in Microservice Systems via Dual Autoencoder," IEEE Transactions on Network and Service Management, vol. 22, no. 5, pp. 4884–4897. DOI: 10.1109/TNSM.2025.3583213
Pub Date: 2025-06-24 | DOI: 10.1109/TNSM.2025.3582223
Bo Pang;Deyun Gao;Xianchao Zhang;Chuan Heng Foh;Hongke Zhang;Victor C. M. Leung
Service Function Chaining (SFC) is widely deployed by telecom operators and cloud service providers, offering traffic QoS guarantees and other additional functions for various applications. The network state at the time of SFC deployment can differ significantly from runtime conditions, leading to excessive resource allocation and consequent energy waste. Existing SFC reconfiguration methods struggle to meet the latency requirements of delay-sensitive applications while achieving significant energy savings. This paper proposes FR-SFCO, a flow-rate-aware SFC offloading framework on programmable data planes for delay-sensitive flows. Specifically, we design a TCAM-friendly table matching method for FR-SFCO that reduces the flow entries needed for SFC offloading in programmable switches and supports larger numbers of offloaded SFCs. We then propose a dual-threshold-based offloading trigger mechanism that, according to the real-time traffic arrival rate, can quickly offload SFC flows before they fall back to default processing on servers. Building on this, we propose DQN-AOTA, an adaptive offloading-threshold adjustment algorithm based on Deep Q-Learning, which adjusts the offloading thresholds by interacting with a dynamic network traffic environment to minimize packet loss and long-term energy consumption. Finally, we build a testbed using BMv2 software switches and Docker containers for extensive evaluation. The experimental results demonstrate the effectiveness of our solution, which not only meets the latency constraints of delay-sensitive SFC flows but also reduces energy expenditure by at least 14.6%.
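A dual-threshold trigger of the kind described amounts to hysteresis on the flow's arrival rate: offload when the rate crosses a high-water mark, release when it drops below a low-water mark, and otherwise keep the current placement so the flow does not flap. The concrete threshold values below are illustrative, not the DQN-tuned ones from the paper.

```python
# Hedged sketch of a dual-threshold (hysteresis) offloading trigger;
# threshold values and the packets/ms trace are illustrative assumptions.

def next_state(state, rate, low, high):
    """Return 'switch' or 'server' for a flow given its arrival rate."""
    if state == "server" and rate > high:
        return "switch"    # hot flow: offload to the data plane
    if state == "switch" and rate < low:
        return "server"    # cooled down: release switch resources
    return state           # inside the band: keep the current placement

trace = [50, 120, 130, 90, 40, 30]   # arrival rates (packets/ms)
states, state = [], "server"
for rate in trace:
    state = next_state(state, rate, low=60, high=100)
    states.append(state)
```

Note how the rate 90, which sits between the two thresholds, does not trigger a move in either direction; a single threshold at, say, 100 would have bounced the flow back to the server there.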
Bo Pang, Deyun Gao, Xianchao Zhang, Chuan Heng Foh, Hongke Zhang, and Victor C. M. Leung, "FR-SFCO: Energy-Aware Offloading on Data Plane for Delay-Sensitive SFC," IEEE Transactions on Network and Service Management, vol. 22, no. 5, pp. 3823–3837. DOI: 10.1109/TNSM.2025.3582223
Network emulation is an important component of a digital twin for verifying network behavior without impacting the service systems. Although verification requires repeatedly changing network topologies and configuration settings as part of a trial-and-error process, it is not easy to apply such changes without failures, because even a change as simple as adding a device affects multiple devices. We present topology-driven configuration, an approach that separates the network topology from generalized configuration to make both easy to change. Based on this idea, we aim to realize a scalable, simple, and effective configuration platform for emulation networks. We design a configuration generation method using simple and deterministic config templates with a new network parameter data model, and implement it as dot2net. We evaluate the proposed method from three perspectives (scalability, simplicity, and efficacy) using dot2net, through measurements and user experiments on existing test network scenarios.
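The separation the abstract describes can be made concrete: the topology is declared once as a list of links, per-interface parameters are derived from it deterministically, and a generic template is expanded into each device's config. The template text, the /30-per-link addressing, and the link-index interface naming are illustrative assumptions, not dot2net's actual data model.

```python
# Minimal sketch of topology-driven configuration: one declarative
# topology, one deterministic template, per-node configs regenerated
# from derived parameters. All naming/addressing conventions here are
# illustrative simplifications.

TEMPLATE = "interface {ifname}\n ip address {addr}/30\n!"

def derive_params(links):
    """Assign each link a /30 subnet and each endpoint an interface
    (named by link index, a simplification) and an address."""
    params = []
    for i, (a, b) in enumerate(links):
        subnet = f"10.0.{i}."
        params.append({"node": a, "ifname": f"eth{i}", "addr": subnet + "1"})
        params.append({"node": b, "ifname": f"eth{i}", "addr": subnet + "2"})
    return params

def render_configs(links):
    """Expand the template once per derived interface, grouped by node."""
    configs = {}
    for p in derive_params(links):
        configs.setdefault(p["node"], []).append(
            TEMPLATE.format(ifname=p["ifname"], addr=p["addr"]))
    return {node: "\n".join(stanzas) for node, stanzas in configs.items()}

# Adding or removing a link only means editing this list; every affected
# device config is regenerated consistently from it.
configs = render_configs([("r1", "r2"), ("r2", "r3")])
```

Because the expansion is deterministic, the same topology always yields the same configs, which is what makes trial-and-error topology changes safe to repeat.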
Satoru Kobayashi, Ryusei Shiiba, Shinsuke Miwa, Toshiyuki Miyachi, and Kensuke Fukuda, "Topology-Driven Configuration of Emulation Networks With Deterministic Templating," IEEE Transactions on Network and Service Management, vol. 22, no. 5, pp. 3933–3946, published 2025-06-23. DOI: 10.1109/TNSM.2025.3582212
Network backbone black holes (BHs) pose significant challenges in the Internet, causing disruptions and data loss as routers silently drop packets without notification. These silent BH failures, stemming from issues such as hardware malfunctions or misconfigurations, uniquely affect point-to-point packet flows without disrupting the entire network. Unlike cyber attacks and network intrusions, BHs are often untraceable, making early detection vital and challenging. This study addresses the need for an effective forecasting solution for BH occurrences, especially in environments with unlabeled traffic data where traditional anomaly detection methods fall short. The Type-Independent Black Hole Forecasting Model is introduced to predict BH occurrences with high precision across various anomalies, including contextual and collective anomaly types. The three-stage methodology processes unlabeled time-series network data, in which samples are not pre-labeled as anomalous or normal, using machine learning and deep learning techniques to identify and forecast potential BH occurrences. The 'Point BH Identification and Segregation' stage segregates point BH traffic using Density-Based Spatial Clustering of Applications with Noise (DBSCAN), followed by reintegration and time-series smoothing. The final stage, Advanced Contextual and Collective BH Detection, leverages a Convolutional AutoEncoder (Conv-AE) with window sliding for advanced anomaly detection. Evaluation using a dual-dataset approach, including real backbone network traffic and a time-series-adapted public dataset, demonstrates the adaptability of the model to real backbone BH detection systems. Experimental results show superior performance compared to state-of-the-art unsupervised anomaly forecasting models, with a 98% detection rate and a 90% F1-score, outperforming models such as MultiHeadSelfAttention, the main building block of Transformers.
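The first two stages, segregating point BH samples and then smoothing the reintegrated series before windowed detection, can be illustrated with a deliberately simplified pipeline. A z-score outlier test stands in for DBSCAN here, and the mean-substitution reintegration and thresholds are assumptions for illustration only.

```python
# Toy stand-in for stages 1-2 of the pipeline: segregate point outliers
# from a univariate rate series (z-score instead of DBSCAN), reintegrate
# by mean substitution, then smooth with a trailing moving average.

def segregate_points(series, z=2.0):
    """Return (outlier indices, series with outliers replaced by the mean)."""
    mean = sum(series) / len(series)
    var = sum((x - mean) ** 2 for x in series) / len(series)
    std = var ** 0.5 or 1.0
    points = [i for i, x in enumerate(series) if abs(x - mean) / std > z]
    cleaned = [mean if i in points else x for i, x in enumerate(series)]
    return points, cleaned

def smooth(series, window=3):
    """Trailing moving average over up to `window` samples."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Illustrative packet-rate series: the 0 is a silent drop (a point BH).
series = [100, 102, 99, 101, 0, 100, 98]
points, cleaned = segregate_points(series)
smoothed = smooth(cleaned)
```

Removing the point outliers before smoothing matters: smoothed-over drops would otherwise bleed into neighboring windows and confuse the contextual/collective detector that runs afterwards.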
Kiymet Kaya, Elif Ak, Eren Ozaltun, Leandros Maglaras, Trung Q. Duong, Berk Canberk, and Sule Gunduz Oguducu, "Black Hole Prediction in Backbone Networks: A Comprehensive and Type-Independent Forecasting Model," IEEE Transactions on Network and Service Management, vol. 22, no. 5, pp. 4983–4997, published 2025-06-20. DOI: 10.1109/TNSM.2025.3581557
Pub Date: 2025-06-19 | DOI: 10.1109/TNSM.2025.3581463
Boubakr Nour;Makan Pourzandi;Mourad Debbabi
Ensuring cybersecurity in an ever-evolving threat landscape requires proactive identification and understanding of potential threats. Conventional detection and prediction solutions often fall short because they predominantly focus on known attack vectors. Advanced Persistent Threats (APTs) are becoming increasingly sophisticated and stealthy, resulting in new threat variants that are undetectable by these solutions. This paper introduces Threatify, a novel approach to predicting the most probable threat variants from existing APTs and previously seen attack campaigns. Our approach automates the generation of threat variants using graph-based machine learning, based on the attack definition, past attack campaigns, and the security context shared between different techniques. Threatify leverages a security knowledge base of realistic attack scenarios and cybersecurity expertise to model, generate, and predict new forms of potential future threats by combining intra-attack techniques (i.e., within the same APT attack) and inter-attack techniques (i.e., between different APTs) used by threat actors. It is crucial to emphasize that Threatify does not merely mix techniques from different APTs; rather, it constructs a logical and pragmatic kill chain based on their security context. Threatify can predict new attack steps, find relevant substitute techniques, and merge APTs' techniques in the current security context, thus creating previously unexplored threat variants. Our extensive experimental results demonstrate the efficacy of our approach in generating relevant and novel threat variants, including ones that have never occurred before, with a similarity score of 92%, uniqueness of 82%, validity of 95%, and a reduction rate of 96%.
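The substitution idea, swapping techniques only where the security context makes them interchangeable within an ordered kill chain, can be illustrated with a toy enumerator. The MITRE ATT&CK-style technique IDs and the equivalence map below are hypothetical examples, not Threatify's learned model.

```python
# Toy sketch of context-preserving technique substitution in a kill
# chain. The equivalence map (which techniques are interchangeable in
# the same tactical context) is a hand-written illustration of what the
# graph-based model would learn.

from itertools import product

EQUIVALENT = {
    "T1566": ["T1566", "T1190"],   # initial access: phishing or exploit
    "T1059": ["T1059"],            # execution: scripting (no substitute)
    "T1041": ["T1041", "T1567"],   # exfiltration: C2 channel or web service
}

def variants(kill_chain):
    """Enumerate every kill chain reachable by substituting each step
    with a context-equivalent technique (the original is included)."""
    options = [EQUIVALENT.get(t, [t]) for t in kill_chain]
    return [list(chain) for chain in product(*options)]

chains = variants(["T1566", "T1059", "T1041"])
```

Because substitution is restricted to the same tactical context, every enumerated chain stays a logically valid attack sequence rather than an arbitrary mix of techniques, which mirrors the paper's emphasis on pragmatic kill chains.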
Boubakr Nour, Makan Pourzandi, and Mourad Debbabi, "Threatify: APT Threat Variant Generation Using Graph-Based Machine Learning," IEEE Transactions on Network and Service Management, vol. 22, no. 5, pp. 3978–3994. DOI: 10.1109/TNSM.2025.3581463
In the past few years, network infrastructures have transitioned from predominantly hardware-based models to networks of functions, where software components provide the required functionalities with unprecedented scalability and flexibility. However, this new vision entails a completely new set of problems related to resource provisioning and network function operation, making network function lifecycle management difficult with traditional, human-in-the-loop approaches. Novel zero-touch management solutions promise autonomous network operation with limited human interaction. However, modeling network function behavior into meaningful variables and algorithms is an aspect that such solutions must take into account. In this paper, we propose AZTEC+, a data-driven solution for anticipatory resource provisioning in network slicing scenarios. By leveraging a hybrid and modular deep learning architecture, AZTEC+ not only forecasts the future demands of target services but also identifies the best trade-offs to balance the costs due to the instantiation and reconfiguration of such resources. Our experimental evaluation, based on real-world network data, shows how AZTEC+ can outperform state-of-the-art management solutions across a large set of metrics.
{"title":"AZTEC+: Long- and Short-Term Resource Provisioning for Zero-Touch Network Management","authors":"Sergi Alcalá-Marín;Dario Bega;Marco Gramaglia;Albert Banchs;Xavier Costa-Perez;Marco Fiore","doi":"10.1109/TNSM.2025.3580706","DOIUrl":"https://doi.org/10.1109/TNSM.2025.3580706","url":null,"abstract":"In the past few years, network infrastructures have transitioned from predominantly hardware-based models to networks of functions, where software components provide the required functionalities with unprecedented scalability and flexibility. However, this new vision entails a completely new set of problems related to resource provisioning and network function operation, making network function lifecycle management difficult with traditional, human-in-the-loop approaches. Novel zero-touch management solutions promise autonomous network operation with limited human interaction. However, modeling network function behavior into meaningful variables and algorithms is an aspect that such solutions must take into account. In this paper, we propose AZTEC+, a data-driven solution for anticipatory resource provisioning in network slicing scenarios. By leveraging a hybrid and modular deep learning architecture, AZTEC+ not only forecasts the future demands of target services but also identifies the best trade-offs to balance the costs due to the instantiation and reconfiguration of such resources. 
Our experimental evaluation, based on real-world network data, shows how AZTEC+ can outperform state-of-the-art management solutions across a large set of metrics.","PeriodicalId":13423,"journal":{"name":"IEEE Transactions on Network and Service Management","volume":"22 5","pages":"3809-3822"},"PeriodicalIF":5.4,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11039731","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145315491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
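The AZTEC+ abstract frames anticipatory provisioning as a trade-off between the cost of overprovisioning capacity and the cost of reconfiguring it. A minimal sketch of that trade-off follows; the function name, the linear overprovisioning cost, and the flat reconfiguration fee are illustrative assumptions, not the paper's actual cost model or deep learning forecaster.

```python
def provision(forecast, current_capacity, c_over, c_reconf):
    """Choose the next capacity allocation for a network slice.

    Hypothetical cost model: c_over is paid per unit of capacity held
    above the forecast demand; c_reconf is a flat fee charged whenever
    the allocation changes. The allocation is kept unchanged when the
    overprovisioning saving would not cover the reconfiguration fee.
    Returns (chosen_capacity, incurred_cost).
    """
    if current_capacity >= forecast:
        # Keeping the allocation only costs the slack above demand.
        keep_cost = c_over * (current_capacity - forecast)
    else:
        # Demand would go unmet: treat keeping as infeasible.
        keep_cost = float("inf")
    # Resizing exactly to the forecast costs only the flat fee.
    resize_cost = c_reconf
    if resize_cost < keep_cost:
        return forecast, resize_cost
    return current_capacity, keep_cost
```

Under this sketch, a large overprovisioning gap triggers a resize (the fee is cheaper than the slack), while a small gap is tolerated to avoid paying the reconfiguration cost — the same instantiation-versus-reconfiguration balance the abstract describes.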
Pub Date: 2025-06-17, DOI: 10.1109/TNSM.2025.3580467
Dan Tang;Chenguang Zuo;Jiliang Zhang;Keqin Li;Qiuwei Yang;Zheng Qin
The TCP protocol’s inherent lack of built-in security mechanisms has rendered it susceptible to various network attacks. Conventional defense approaches face dual challenges: insufficient line-rate processing capacity and impractical online deployment requirements. The emergence of P4-based programmable data planes now enables line-rate traffic processing at the hardware level, creating new opportunities for protocol protection. In this context, we present MARS, a data-plane-native TCP abuse detection and mitigation system that synergistically combines the Beaucoup traffic monitoring algorithm with artificial neural network (ANN) based anomaly detection, enhanced by adaptive heuristic mitigation rules. Through comprehensive benchmarking against existing TCP defense mechanisms, our solution maintains 12.95% higher throughput and achieves a 25.93% improvement in congestion window recovery ratio during attack scenarios. Furthermore, the proposed framework establishes several novel evaluation metrics specifically for TCP protocol protection systems.
{"title":"MARS: Defending TCP Protocol Abuses in Programmable Data Plane","authors":"Dan Tang;Chenguang Zuo;Jiliang Zhang;Keqin Li;Qiuwei Yang;Zheng Qin","doi":"10.1109/TNSM.2025.3580467","DOIUrl":"https://doi.org/10.1109/TNSM.2025.3580467","url":null,"abstract":"The TCP protocol’s inherent lack of built-in security mechanisms has rendered it susceptible to various network attacks. Conventional defense approaches face dual challenges: insufficient line-rate processing capacity and impractical online deployment requirements. The emergence of P4-based programmable data planes now enables line-rate traffic processing at the hardware level, creating new opportunities for protocol protection. In this context, we present MARS, a data-plane-native TCP abuse detection and mitigation system that synergistically combines the Beaucoup traffic monitoring algorithm with artificial neural network (ANN) based anomaly detection, enhanced by adaptive heuristic mitigation rules. Through comprehensive benchmarking against existing TCP defense mechanisms, our solution maintains 12.95% higher throughput and achieves a 25.93% improvement in congestion window recovery ratio during attack scenarios. Furthermore, the proposed framework establishes several novel evaluation metrics specifically for TCP protocol protection systems.","PeriodicalId":13423,"journal":{"name":"IEEE Transactions on Network and Service Management","volume":"22 5","pages":"4050-4060"},"PeriodicalIF":5.4,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145315476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
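The MARS abstract pairs ANN-based anomaly scoring of flow statistics with heuristic mitigation rules. The sketch below illustrates that two-stage pipeline in plain Python; the feature choices, weights, and thresholds are hypothetical stand-ins, and the paper's actual ANN runs over Beaucoup-collected counters inside a P4 data plane rather than in host software.

```python
import math


def mlp_anomaly_score(features, w1, b1, w2, b2):
    """Score a flow's feature vector with a one-hidden-layer MLP.

    features: per-flow statistics (e.g., SYN rate, ACK ratio,
    retransmission rate). Returns a value in (0, 1); values near 1
    flag likely TCP abuse.
    """
    # Hidden layer with ReLU activation.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(w1, b1)]
    # Single sigmoid output neuron.
    z = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-z))


def mitigation_action(score, drop_threshold=0.8, rate_limit_threshold=0.5):
    """Map an anomaly score to a heuristic data-plane action
    (thresholds are illustrative, not from the paper)."""
    if score >= drop_threshold:
        return "drop"
    if score >= rate_limit_threshold:
        return "rate-limit"
    return "forward"


# Illustrative fixed weights; a real deployment would train these.
W1 = [[1.0, -1.0, 1.0], [0.5, 0.5, 0.5]]
B1 = [0.0, 0.0]
W2 = [1.0, 0.1]
B2 = -1.0

benign = mlp_anomaly_score([0.1, 0.9, 0.0], W1, B1, W2, B2)
attack = mlp_anomaly_score([0.9, 0.1, 0.8], W1, B1, W2, B2)
```

Graduating from rate-limiting to dropping as the score rises mirrors the "adaptive heuristic mitigation rules" in the abstract: suspicious flows are throttled first, and only confidently classified abuse is discarded outright.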