An ML Based Anomaly Detection System in real-time data streams

Javier Jose Diaz Rivera, Talha Ahmed Khan, Waleed Akbar, Muhammad Afaq, Wang-Cheol Song
Published in: 2021 International Conference on Computational Science and Computational Intelligence (CSCI), 2021-12-01
DOI: 10.1109/CSCI54926.2021.00270 (https://doi.org/10.1109/CSCI54926.2021.00270)
Citations: 2

Abstract

Advances in machine learning and artificial intelligence have moved network anomaly detection systems beyond traditional signature-based intrusion detection. Nonetheless, as security measures evolve, attackers keep developing more sophisticated attacks. A robust anomaly detection algorithm is not enough on its own; a real-time data feeding mechanism is also required to minimize reaction time. Moreover, DDoS attacks can flood network data channels with thousands of packets per second, which can overload most traditional monitoring systems that rely on data storage. For this reason, the research presented in this paper focuses on implementing a real-time data streaming system for network anomaly detection that can operate under a high volume of traffic data. The solution deploys a flow collector platform connected to Apache Kafka to receive NetFlow data from network switches. Real-time big data processing is applied through Apache Spark, where the ML anomaly detection is triggered. Anomalies are detected by combining the unsupervised clustering algorithm k-means with the supervised classifier KNN (k-nearest neighbors). Finally, a monitoring system consisting of an ELK stack collects historical data for further evolution of the ML algorithms.
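
The abstract does not say how k-means and KNN are combined, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that k-means first groups unlabeled flow records, that the smallest cluster is treated as anomalous to produce pseudo-labels, and that KNN then classifies new flows against those labeled records. The feature pair (packets per second, distinct destination ports), the toy values, and all function names are hypothetical, and the Kafka and Spark layers are left out.

```python
import math
from collections import Counter

def farthest_first_init(points, k):
    # Deterministic seeding (an assumption made for reproducibility):
    # start from the first point, then repeatedly add the point that is
    # farthest from every centroid chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, iters=10):
    # Plain Lloyd's algorithm; returns the cluster index of each point.
    centroids = farthest_first_init(points, k)
    assign = []
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign

def pseudo_label(assign):
    # Assumption: the least populated cluster holds the anomalies.
    sizes = Counter(assign)
    rare = min(sizes, key=sizes.get)
    return ["anomaly" if a == rare else "normal" for a in assign]

def knn_classify(train, labels, query, k=3):
    # Majority vote among the k nearest labeled flows.
    nearest = sorted(zip(train, labels), key=lambda tl: math.dist(tl[0], query))[:k]
    return Counter(lbl for _, lbl in nearest).most_common(1)[0][0]

# Hypothetical flow features: (packets per second, distinct destination ports).
flows = [(10, 2), (12, 3), (11, 2), (9, 1), (13, 3), (10, 1), (12, 2), (11, 3),
         (900, 1), (950, 2), (1000, 1)]
labels = pseudo_label(kmeans(flows, k=2))

print(knn_classify(flows, labels, (980, 1)))
print(knn_classify(flows, labels, (11, 2)))
```

In a streaming deployment the clustering step would presumably run on batches of historical flows while the KNN step scores each incoming record. The features here are left unscaled for brevity; real NetFlow fields would normally be normalized before any distance computation.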