Vitor G. da Silva Ruffo , Daniel M. Brandão Lent , Luiz F. Carvalho , Jaime Lloret , Mario Lemes Proença Jr.
Title: Generative adversarial networks to detect intrusion and anomaly in IP flow-based networks
Journal: Future Generation Computer Systems-The International Journal of Escience, volume 163, Article 107531
Publication date: 2024-09-16
Publication type: Journal Article
DOI: 10.1016/j.future.2024.107531
URL: https://www.sciencedirect.com/science/article/pii/S0167739X24004953
Impact factor: 6.2; JCR: Q1 (COMPUTER SCIENCE, THEORY & METHODS)
Citations: 0
Abstract
Computer networks support everyday tasks by providing services such as data streaming, online shopping, and digital communications. These applications demand ever more network capacity and adaptability. Networks may also be targeted by attacks and intrusions that compromise the applications relying on them and lead to potential losses. We propose a systematic, semi-supervised methodology for developing a system that detects traffic volume anomalies in IP flow-based networks. The system is implemented with a vanilla Generative Adversarial Network (GAN). Whenever an anomaly is detected, a mitigation module is triggered that automatically blocks the suspect IPs and restores normal network operation. We implemented three versions of the proposed solution by incorporating Long Short-Term Memory (LSTM), 1D-Convolutional Neural Network (1D-CNN), and Temporal Convolutional Network (TCN) into the GAN's internal structure. The experiments are conducted on three public benchmark datasets: Orion, CIC-DDoS2019, and CIC-IDS2017. The results show that the three deep learning models affect the GAN differently and, consequently, affect the overall system performance differently. The 1D-CNN-based GAN implementation performs best: it reasonably solves the mode collapse problem, has the lowest computational complexity, and achieves competitive Matthews Correlation Coefficient scores on the anomaly detection task. In addition, the mitigation module drops most anomalous flows while blocking only a small portion of legitimate traffic. For comparison with state-of-the-art models, we also implemented 1D-CNN, LSTM, and TCN as standalone models, separately from the GAN. The generative networks show better overall results on the considered performance metrics than these standalone models.
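The abstract reports Matthews Correlation Coefficient (MCC) scores for the anomaly detection task. As a reminder of how that metric is computed, here is a minimal sketch in plain Python. The function name, the label encoding (1 = anomalous, 0 = normal), and the toy label vectors are assumptions made for illustration only; they are not taken from the paper and are not its results.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Binary MCC. Assumed encoding: 1 = anomalous, 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal traffic passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed anomalies

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: return 0 when any marginal count is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy, made-up labels: 8 normal intervals followed by 2 anomalous ones.
    y_true = [0] * 8 + [1] * 2
    detector = [0] * 7 + [1] * 3      # catches both anomalies, one false alarm
    always_normal = [0] * 10          # never raises an alarm

    for name, y_pred in [("detector", detector), ("always_normal", always_normal)]:
        acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
        print(name, "accuracy:", acc, "MCC:", round(matthews_corrcoef(y_true, y_pred), 3))
```

In this toy case a detector that never raises an alarm still reaches 80% accuracy but gets an MCC of 0, which is one common reason MCC is preferred on imbalanced traffic data. The abstract itself does not state why the authors chose the metric.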
About the journal:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.