Classification of Firewall Log Files with Different Algorithms and Performance Analysis of These Algorithms

IF 0.7 · Zone 4 (Computer Science) · Q4 (COMPUTER SCIENCE, SOFTWARE ENGINEERING) · Journal of Web Engineering · Pub Date: 2024-06-01 · DOI: 10.13052/jwe1540-9589.2344
Ebru Efeoğlu;Gurkan Tuna
Journal of Web Engineering, vol. 23, no. 4, pp. 561-593. Full text: https://ieeexplore.ieee.org/document/10634590/
Citation count: 0

Abstract

Classifying firewall log files makes it possible to analyse potential threats and decide on appropriate rules to prevent them. In this study, firewall log files are therefore classified using different classification algorithms, and the performance of the algorithms is evaluated using several performance metrics. The dataset was prepared from the log files of a firewall and filtered to remove any personal data. It consists of 12 attributes in total, of which the action attribute was selected as the class. In the performance evaluation, the Simple Cart and NB Tree algorithms made the best predictions, each achieving an accuracy of 99.84%. Decision Stump had the worst prediction performance, with an accuracy of 79.68%. Because the classes in the dataset did not contain equal numbers of instances, the Matthews correlation coefficient was also used as a performance metric. The Simple Cart, BF Tree, FT Tree, J48 and NB Tree algorithms achieved the highest average values for this metric. However, the reset-both class was not predicted successfully by the other algorithms, and the Simple Cart algorithm made the best predictions for it. The values of the other performance metrics used in this study also support this conclusion. The Simple Cart algorithm is therefore recommended for classifying firewall log files. Because each firewall brand creates and maintains log files in its own format, a prefiltering and parsing approach is needed to process different log files. This study therefore also proposes a novel prefiltering and parsing approach that processes log files with different structures and creates structured datasets from them.
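
The abstract's point about unequal class sizes can be illustrated with a small sketch. It is not the authors' code or data: the labels below are invented, the class names `allow` and `deny` are assumptions (only `reset-both` is named in the abstract), and the helper names `per_class_mcc` and `accuracy` are hypothetical. It computes a one-vs-rest Matthews correlation coefficient for each class and shows how a classifier can score high accuracy while never predicting a rare class.

```python
import math


def per_class_mcc(y_true, y_pred, label):
    """One-vs-rest Matthews correlation coefficient for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    tn = sum(t != label and p != label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when the denominator is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented, deliberately imbalanced labels (not taken from the paper's dataset).
y_true = ["allow"] * 90 + ["deny"] * 8 + ["reset-both"] * 2
# A hypothetical classifier that never predicts the rare reset-both class.
y_pred = ["allow"] * 90 + ["deny"] * 8 + ["allow"] * 2

acc = accuracy(y_true, y_pred)
mcc_by_class = {c: per_class_mcc(y_true, y_pred, c) for c in sorted(set(y_true))}

print(f"accuracy = {acc:.2f}")
for label, mcc in mcc_by_class.items():
    print(f"MCC[{label}] = {mcc:.3f}")
```

In this toy case the accuracy is 0.98 even though the MCC for `reset-both` is 0.0, which is the kind of gap that motivates reporting the coefficient alongside accuracy when classes are imbalanced.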
Source journal: Journal of Web Engineering (Engineering and Technology: Computer Science, Theory and Methods)
CiteScore: 1.80 · Self-citation rate: 12.50% · Articles published: 62 · Review time: 9 months
About the journal: The World Wide Web and its associated technologies have become a major implementation and delivery platform for a large variety of applications, ranging from simple institutional information Web sites to sophisticated supply-chain management systems, financial applications, e-government, distance learning, and entertainment, among others. Such applications, in addition to their intrinsic functionality, also exhibit the more complex behavior of distributed applications.