Permission-Based Classification of Android Malware Applications Using Random Forest

European Conference on Cyber Warfare and Security. Pub Date: 2023-06-19. DOI: 10.34190/eccws.22.1.1212

Nikolaos Chrysikos, P. Karampelas, Konstantinos F. Xylogiannopoulos


Citations: 0

Abstract

Android is arguably the most widely used mobile operating system in the world. Its widespread use has attracted considerable attention from cybercriminals, who attempt to exploit its architecture and trick unsuspecting users into installing malware applications. The number of such applications grows every day, either through alterations to a basic exploitation mechanism or through novel mechanisms for exfiltrating users' data. As a result, there is an increasing need for detection mechanisms that can classify these applications into families based on their characteristics. A significant amount of research has already been devoted to analysing and mitigating this growing problem; however, the situation demands more efficient methods with higher precision. The paper proposes such a framework for analysing malicious applications and classifying them into families based on the permissions they use. The proposed method involves pre-processing the applications to extract their permissions, tokenizing the permissions, cleansing the data, and finally applying a Random Forest classifier to assign the applications to families. The method is trained and tested on a dataset of 11,159 malicious applications categorized into 33 unique families. The precision, recall and F1-score achieved are each 98%. The results are promising, since the method works even on an unbalanced dataset and in many cases outperforms other state-of-the-art approaches.
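The steps named in the abstract (permission extraction, tokenization, cleansing, Random Forest classification) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify the tokenization rule, the cleansing step, or any hyperparameters, so the `tokenize` rule, the tree count, the toy applications, and the family labels `family_a` and `family_b` below are all assumptions made for illustration.

```python
import random
from collections import Counter

def tokenize(permissions):
    """Assumed rule: keep the last dotted component, upper-cased, dropping blanks."""
    return {p.strip().rsplit(".", 1)[-1].upper() for p in permissions if p.strip()}

def vectorize(apps, vocab):
    """One binary feature per permission token in the vocabulary."""
    return [[1 if tok in toks else 0 for tok in vocab] for toks in apps]

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, rng, n_try):
    """Grow a tree on binary features, trying n_try random features per node."""
    if len(set(y)) == 1:
        return y[0]
    best = None
    for f in rng.sample(range(len(X[0])), n_try):
        left = [i for i, row in enumerate(X) if row[f] == 0]
        right = [i for i, row in enumerate(X) if row[f] == 1]
        if not left or not right:
            continue  # this feature does not separate the samples
        score = (len(left) * gini([y[i] for i in left])
                 + len(right) * gini([y[i] for i in right])) / len(y)
        if best is None or score < best[0]:
            best = (score, f, left, right)
    if best is None:  # no sampled feature splits the node: majority-vote leaf
        return Counter(y).most_common(1)[0][0]
    _, f, left, right = best
    return (f,
            build_tree([X[i] for i in left], [y[i] for i in left], rng, n_try),
            build_tree([X[i] for i in right], [y[i] for i in right], rng, n_try))

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are (feature, left, right)."""
    while isinstance(node, tuple):
        node = node[2] if row[node[0]] else node[1]
    return node

def train_forest(X, y, n_trees=25, seed=0):
    """Bagging: each tree is trained on a bootstrap sample of the data."""
    rng = random.Random(seed)
    n_features = len(X[0])
    n_try = min(n_features, max(1, int(n_features ** 0.5)))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng, n_try))
    return forest

def predict(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

# Toy data: hypothetical apps and family labels, not taken from the paper's dataset.
raw_apps = [
    ["android.permission.SEND_SMS", "android.permission.READ_SMS"],
    ["android.permission.SEND_SMS", "android.permission.RECEIVE_SMS"],
    ["android.permission.RECORD_AUDIO", "android.permission.ACCESS_FINE_LOCATION"],
    ["android.permission.RECORD_AUDIO", "android.permission.READ_CONTACTS"],
]
y = ["family_a", "family_a", "family_b", "family_b"]

apps = [tokenize(p) for p in raw_apps]
vocab = sorted(set().union(*apps))
X = vectorize(apps, vocab)
forest = train_forest(X, y)

new_app = vectorize([tokenize(["android.permission.SEND_SMS"])], vocab)[0]
print(predict(forest, new_app))
```

In practice a library implementation of Random Forest would normally replace the hand-written trees; the sketch only makes the data flow from raw permission strings to a family label explicit.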



