On Effectiveness of Source Code and SSL Based Features for Phishing Website Detection

2019 1st International Conference on Advanced Technologies in Intelligent Control, Environment, Computing & Communication Engineering (ICATIECE). Pub Date: 2019-03-01. DOI: 10.1109/ICATIECE45860.2019.9063824

Roopak S, Athira P. Vijayaraghavan, Tony Thomas


Citations: 8

Abstract

Phishing is a social engineering method for stealing user credentials through data entry forms on malicious websites. Currently available anti-malware software can detect only blacklisted phishing websites. Similarity-based detection methods, such as visual similarity, can be easily evaded by making small changes to the textual and visual content of a phishing site. The phishing behavior of a web page can be identified from its URL, domain, and source-code-based features. However, URL- and domain-based features can be easily defeated by black hat SEO techniques. In this paper, we extract relevant rules based on webpage source code and Secure Sockets Layer (SSL) features from a training dataset using the Repeated Incremental Pruning to Produce Error Reduction (RIPPER) algorithm. We then check for the presence of these rules in a test dataset. Our implementation results show that the rules based on webpage source code can identify phishing websites with an accuracy of 0.92.
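The rule-checking step described in the abstract can be illustrated with a minimal sketch. It assumes that the learned rules are conjunctions of binary feature tests and that a page is labeled phishing when any rule fires, which is the usual shape of a RIPPER rule set with a default class. The feature names (`external_form_action`, `valid_ssl_cert`, `hidden_iframe`), the two rules, and the toy test set are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical binary features per page: 1 = present, 0 = absent.
Features = Dict[str, int]


@dataclass
class Rule:
    """A conjunctive IF-THEN rule: every listed feature must have the given value."""
    conditions: Features

    def matches(self, sample: Features) -> bool:
        return all(sample.get(name) == value
                   for name, value in self.conditions.items())


# Illustrative rules of the kind a rule learner such as RIPPER might output.
# These are assumptions for the sketch, NOT the rules reported in the paper.
PHISHING_RULES: List[Rule] = [
    Rule({"external_form_action": 1, "valid_ssl_cert": 0}),
    Rule({"hidden_iframe": 1}),
]


def predict(sample: Features, rules: List[Rule] = PHISHING_RULES) -> int:
    """Return 1 (phishing) if any rule fires, else 0 (default class: legitimate)."""
    return int(any(rule.matches(sample) for rule in rules))


def accuracy(samples: List[Features], labels: List[int],
             rules: List[Rule] = PHISHING_RULES) -> float:
    """Fraction of test samples whose predicted label equals the true label."""
    correct = sum(predict(s, rules) == y for s, y in zip(samples, labels))
    return correct / len(labels)


# Made-up test set with ground-truth labels (1 = phishing, 0 = legitimate).
TEST_SAMPLES: List[Features] = [
    {"external_form_action": 1, "valid_ssl_cert": 0, "hidden_iframe": 0},
    {"external_form_action": 0, "valid_ssl_cert": 1, "hidden_iframe": 0},
    {"external_form_action": 0, "valid_ssl_cert": 1, "hidden_iframe": 1},
    {"external_form_action": 1, "valid_ssl_cert": 1, "hidden_iframe": 0},
]
TEST_LABELS: List[int] = [1, 0, 1, 1]

if __name__ == "__main__":
    print(accuracy(TEST_SAMPLES, TEST_LABELS))
```

On this toy data the last sample is a phishing page that neither rule covers, so it falls through to the default class and counts as an error. The accuracy figure in the abstract would be computed the same way, over the authors' own rules and test dataset.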


