Unbalanced Web Phishing Classification through Deep Reinforcement Learning

Comput. Pub Date: 2023-06-09 DOI: 10.3390/computers12060118
Antonio Maci, Alessandro Santorsola, Anthony J. Coscia, Andrea Iannacone

Abstract

Web phishing is a form of cybercrime that tricks people into visiting malicious URLs in order to exfiltrate sensitive data. Because the structure of malicious URLs evolves over time, phishing detection mechanisms that can adapt to such variations are paramount. Furthermore, web phishing detection is an unbalanced classification task, as legitimate URLs outnumber malicious ones in real-life cases. Deep learning (DL) has emerged as a promising technique for mitigating the effects of concept drift and thereby enhancing web phishing detection. Deep reinforcement learning (DRL) combines DL with reinforcement learning (RL), a sequential decision-making paradigm in which the problem to be addressed is expressed as a Markov decision process (MDP). Recent studies have proposed an ad hoc MDP formulation for unbalanced classification tasks, called the imbalanced classification Markov decision process (ICMDP). In this paper, we exploit the ICMDP to present a double deep Q-Network (DDQN)-based classifier that addresses the unbalanced web phishing classification problem. The proposed algorithm is evaluated on a Mendeley web phishing dataset, from which three different data imbalance scenarios are generated. Despite a significant training time, it achieves a better geometric mean, index of balanced accuracy, F1 score, and area under the ROC curve than other DL-based classifiers combined with data-level sampling techniques in all test cases.
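
To make the ICMDP and DDQN ideas mentioned in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the reward scheme commonly associated with the ICMDP formulation: a reward of +1 or -1 on minority (phishing) samples, a smaller reward of +lam or -lam on majority (legitimate) samples, and an episode that ends when a minority sample is misclassified. The function names, the label encoding (1 = phishing, 0 = legitimate), and the default values of `lam` and `gamma` are hypothetical choices made for illustration; the abstract does not state them.

```python
def icmdp_reward(action, label, minority_label=1, lam=0.1):
    """Return (reward, terminal) for one classification step.

    Assumed scheme (not taken from the abstract): errors on the minority
    class cost more than errors on the majority class, and misclassifying
    a minority sample ends the episode.
    """
    correct = action == label
    if label == minority_label:
        return (1.0, False) if correct else (-1.0, True)
    return (lam, False) if correct else (-lam, False)


def double_q_target(reward, terminal, q_online_next, q_target_next, gamma=0.9):
    """Double DQN target for one transition.

    The online network's Q-values pick the next action; the target
    network's Q-values score that action. Both are plain lists here,
    standing in for network outputs on the next state.
    """
    if terminal:
        return reward
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]


# Example: a legitimate URL (label 0) classified correctly as legitimate.
reward, terminal = icmdp_reward(action=0, label=0)
target = double_q_target(reward, terminal, [0.2, 0.5], [0.3, 0.1])
```

Separating action selection (online values) from action evaluation (target values) is what distinguishes the double DQN target from the standard DQN target, which takes the maximum over the target network's values directly.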