DaE2: Unmasking malicious URLs by leveraging diverse and efficient ensemble machine learning for online security

IF 4.8 2区 计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Computers & Security Pub Date : 2024-10-28 DOI:10.1016/j.cose.2024.104170
Abiodun Esther Omolara , Moatsum Alawida
{"title":"DaE2: Unmasking malicious URLs by leveraging diverse and efficient ensemble machine learning for online security","authors":"Abiodun Esther Omolara ,&nbsp;Moatsum Alawida","doi":"10.1016/j.cose.2024.104170","DOIUrl":null,"url":null,"abstract":"<div><div>Over 5.44 billion people now use the Internet, making it a vital part of daily life, enabling communication, e-commerce, education, and more. However, this huge Internet connectivity also raises concerns about online privacy and security, particularly with the rise of malicious Uniform Resource Locators (URLs). Recently, conventional ensemble models have attracted attention due to their notable benefits of reducing the variance in models, enhancing predictive performance, improving prediction accuracy, and demonstrating high generalization potential. But, its application in addressing the challenge of malicious URLs is still an open problem. These URLs often hide behind static links in emails or web pages, posing a threat to individuals and organizations. Despite blacklisting services, many harmful sites evade detection due to inadequate scrutiny or recent creation. Hence, to improve URL detection, a Diverse and Efficient Ensemble (DaE2) machine learning algorithm was developed using four ensemble models, that is, AdaBoost, Bagging, Stacking, and Voting to classify URLs. After preprocessing, the experimental result shown that all models achieved over 80 % accuracy, with AdaBoost reaching 98.5 % and Stacking offering the fastest runtime. AdaBoost and Bagging also delivered strong performance, with F1 scores of 0.980 and 0.976, respectively.</div></div>","PeriodicalId":51004,"journal":{"name":"Computers & Security","volume":"148 ","pages":"Article 104170"},"PeriodicalIF":4.8000,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Security","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167404824004759","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Over 5.44 billion people now use the Internet, making it a vital part of daily life, enabling communication, e-commerce, education, and more. However, this huge Internet connectivity also raises concerns about online privacy and security, particularly with the rise of malicious Uniform Resource Locators (URLs). Recently, conventional ensemble models have attracted attention due to their notable benefits of reducing the variance in models, enhancing predictive performance, improving prediction accuracy, and demonstrating high generalization potential. But, its application in addressing the challenge of malicious URLs is still an open problem. These URLs often hide behind static links in emails or web pages, posing a threat to individuals and organizations. Despite blacklisting services, many harmful sites evade detection due to inadequate scrutiny or recent creation. Hence, to improve URL detection, a Diverse and Efficient Ensemble (DaE2) machine learning algorithm was developed using four ensemble models, that is, AdaBoost, Bagging, Stacking, and Voting to classify URLs. After preprocessing, the experimental result shown that all models achieved over 80 % accuracy, with AdaBoost reaching 98.5 % and Stacking offering the fastest runtime. AdaBoost and Bagging also delivered strong performance, with F1 scores of 0.980 and 0.976, respectively.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
DaE2:利用多样化和高效的集合机器学习为在线安全揭开恶意 URL 的面纱
目前有超过 54.4 亿人使用互联网,互联网已成为日常生活的重要组成部分,使通信、电子商务、教育等成为可能。然而,巨大的互联网连接也引发了人们对网络隐私和安全的担忧,特别是随着恶意统一资源定位器(URL)的兴起。最近,传统的集合模型因其在减少模型方差、增强预测性能、提高预测准确性和展示高泛化潜力等方面的显著优势而备受关注。但是,它在应对恶意 URL 挑战方面的应用仍是一个未决问题。这些 URL 通常隐藏在电子邮件或网页的静态链接后面,对个人和组织构成威胁。尽管有黑名单服务,但许多有害网站由于审查不充分或最近才创建而逃避检测。因此,为了改进 URL 检测,我们开发了一种多样化高效集合(DaE2)机器学习算法,使用四种集合模型,即 AdaBoost、Bagging、Stacking 和 Voting 来对 URL 进行分类。预处理后的实验结果表明,所有模型的准确率都超过了 80%,其中 AdaBoost 的准确率达到了 98.5%,Stacking 的运行时间最快。AdaBoost 和 Bagging 的性能也很强,F1 分数分别为 0.980 和 0.976。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Computers & Security
Computers & Security 工程技术-计算机:信息系统
CiteScore
12.40
自引率
7.10%
发文量
365
审稿时长
10.7 months
期刊介绍: Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world. Computers & Security provides you with a unique blend of leading edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors - industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise it is your first step to fully secure systems.
期刊最新文献
Beyond the sandbox: Leveraging symbolic execution for evasive malware classification Trust my IDS: An explainable AI integrated deep learning-based transparent threat detection system for industrial networks PdGAT-ID: An intrusion detection method for industrial control systems based on periodic extraction and spatiotemporal graph attention Dynamic trigger-based attacks against next-generation IoT malware family classifiers Assessing cybersecurity awareness among bank employees: A multi-stage analytical approach using PLS-SEM, ANN, and fsQCA in a developing country context
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1