Uniform Resource Locator Classification Using Classical Machine Learning & Deep Learning Techniques

Aws Rayyan, Mohammad Ghassan Aburas, Amjed Al-mousa
Journal: Cloud Computing and Data Science (journal article)
Published: 2022-10-31
DOI: 10.37256/ccds.4120231847 (https://doi.org/10.37256/ccds.4120231847)

Abstract

The Internet has made it possible to communicate with anyone around the world. That said, some people misuse the technology for malicious purposes. Many attack vectors exist, but this work focuses on those that exploit the uniform resource locator (URL). The paper presents a method for extracting features from a raw URL and uses those features to predict whether the URL is safe for a user to visit. The full process of extracting the data and preparing it for a model is discussed in detail. Several machine learning (ML) models were trained using different algorithms, including CatBoost, random forest, and decision trees, and several feedforward deep neural network models were also explored. The best result, an accuracy of 95.61% on a test set, was achieved by a deep learning model.
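
The abstract does not list the features the authors extract, so the following is only a minimal sketch of what lexical feature extraction from a raw URL can look like. The function name `extract_url_features` and every feature in it (lengths, character counts, path depth, and so on) are illustrative assumptions, not the paper's feature set. It uses only Python's standard `urllib.parse` module.

```python
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Turn a raw URL string into a flat dict of simple lexical features.

    The feature choices here are hypothetical examples, not the
    features used in the paper.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""  # hostname is None when the URL has no netloc
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens": url.count("-"),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        # Number of non-empty segments in the path component.
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
        "has_query": bool(parsed.query),
    }


features = extract_url_features("http://example.com/login/verify?id=123")
```

A dict like this can be converted to a numeric row (booleans cast to 0/1) and stacked into a feature matrix for a tree-based classifier or a feedforward network, which is the general kind of pipeline the abstract describes.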