Textual and visual content-based anti-phishing: a Bayesian approach.

IEEE transactions on neural networks Pub Date : 2011-10-01 Epub Date: 2011-08-04 DOI:10.1109/TNN.2011.2161999
Haijun Zhang, Gang Liu, Tommy W S Chow, Wenyin Liu
Volume 22, Issue 10, Pages 1532-46. Publication type: Journal Article.
Citations: 193

Abstract

A novel framework using a Bayesian approach for content-based phishing web page detection is presented. Our model takes into account textual and visual contents to measure the similarity between the protected web page and suspicious web pages. A text classifier, an image classifier, and an algorithm that fuses the results of the two classifiers are introduced. A distinguishing feature of this paper is the use of a Bayesian model to estimate the matching threshold, which the classifier requires to determine the class of a web page and decide whether it is phishing or not. In the text classifier, the naive Bayes rule is used to calculate the probability that a web page is phishing. In the image classifier, the earth mover's distance is employed to measure the visual similarity, and our Bayesian model is designed to determine the threshold. In the data fusion algorithm, Bayesian theory is used to synthesize the classification results from textual and visual content. The effectiveness of the proposed approach was examined on a large-scale dataset collected from real phishing cases. Experimental results demonstrated that the text classifier and the image classifier we designed deliver promising results, that the fusion algorithm outperforms either of the individual classifiers, and that our model can be adapted to different phishing cases.
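
The three components named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give their feature extraction, signature construction, or threshold model. The sketch assumes a multinomial naive Bayes model with Laplace smoothing for the text side, a one-dimensional earth mover's distance between histograms on the same bins for the visual side, and a fusion rule that treats the two classifier outputs as conditionally independent given the class. All function names, the `alpha` smoothing parameter, and the class labels `"phish"` and `"legit"` are hypothetical.

```python
import math
from collections import Counter


def train_naive_bayes(docs, labels, alpha=1.0):
    """Fit a multinomial naive Bayes model (assumed variant) with Laplace smoothing."""
    vocab = {w for d in docs for w in d}
    classes = sorted(set(labels))
    word_counts = {c: Counter() for c in classes}
    doc_counts = Counter(labels)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)
    model = {"vocab": vocab, "classes": classes, "log_prior": {}, "log_like": {}}
    for c in classes:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        model["log_prior"][c] = math.log(doc_counts[c] / len(docs))
        model["log_like"][c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return model


def posterior(model, doc, target="phish"):
    """Return P(target | doc); words outside the training vocabulary are ignored."""
    scores = {}
    for c in model["classes"]:
        score = model["log_prior"][c]
        for w in doc:
            if w in model["vocab"]:
                score += model["log_like"][c][w]
        scores[c] = score
    top = max(scores.values())  # subtract the max for numerical stability
    norm = sum(math.exp(s - top) for s in scores.values())
    return math.exp(scores[target] - top) / norm


def emd_1d(p, q):
    """Earth mover's distance between two histograms on the same unit-spaced bins.

    In one dimension, after normalising both histograms to unit mass, the EMD
    equals the sum of absolute differences between their cumulative sums.
    """
    total_p, total_q = sum(p), sum(q)
    carried, dist = 0.0, 0.0
    for a, b in zip(p, q):
        carried += a / total_p - b / total_q
        dist += abs(carried)
    return dist


def fuse(p_text, p_image, prior=0.5):
    """Combine two posteriors, assuming conditional independence given the class
    and that both posteriors were computed with the same class prior."""
    phish = p_text * p_image / prior
    legit = (1.0 - p_text) * (1.0 - p_image) / (1.0 - prior)
    return phish / (phish + legit)
```

In this sketch a page would be scored by calling `posterior` on its tokens, converting an `emd_1d` value into a visual phishing probability by some thresholding rule (the paper estimates that threshold with its Bayesian model, which is not reproduced here), and passing both probabilities to `fuse`.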
