R-MFDroid: Android Malware Detection using Ranked Manifest File Components
Kartik Khariwal, Rishabh Gupta, Jatin P. Singh, Anshul Arora
DOI: https://doi.org/10.35940/IJITEE.G8951.0510721
Published: 2021-05-30
Citations: 2
Abstract
With the growing popularity of Android OS over the past
few years, the number of malware attacks on Android has
also increased. In 2018, around 28 million
malicious applications were found on the Android platform and
these malicious apps were capable of causing huge financial
losses and information leakage. These threats call for an
effective detection system for Android malware. Some prior
research works study static
manifest components for malware detection. However, to the best
of our knowledge, none of the previous research works have
aimed to find the best set amongst different manifest file
components for malware detection. In this work, we focus on
identifying the best feature set from manifest file components
(Permissions, Intents, Hardware Components, Activities, Services,
Broadcast Receivers, and Content Providers) that could give better
detection accuracy. We apply Information Gain to rank the
manifest file components, aiming to find the set of
components that best distinguishes malware applications
from benign applications. We put forward a novel algorithm to find
the best feature set using machine learning classifiers
such as SVM, XGBoost, and Random Forest, along with deep learning
techniques such as neural network classification. The
experimental results show that the best set obtained from the
proposed algorithm consists of 25 features, i.e., 5 Permissions, 2
Intents, 9 Activities, 3 Content Providers, 4 Hardware
Components, 1 Service, and 1 Broadcast Receiver. The SVM
classifier gave the highest classification accuracy of 96.93% and
an F1-Score of 0.97 with this best set of 25 features.
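
The Information Gain ranking step mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes each manifest component is encoded as a binary feature (present or absent in an app), and the function names (`entropy`, `information_gain`, `rank_features`) and the toy samples are hypothetical and made up for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - sum over v of P(X = v) * H(Y | X = v)."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


def rank_features(samples, labels):
    """Return (feature, IG) pairs sorted by descending information gain.

    Assumption: each sample is the set of manifest components an app
    declares, so every feature is binary (1 = present, 0 = absent).
    """
    names = sorted({name for sample in samples for name in sample})
    scores = {
        name: information_gain([int(name in s) for s in samples], labels)
        for name in names
    }
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up toy data: 1 = malware, 0 = benign.
samples = [
    {"SEND_SMS", "INTERNET"},
    {"SEND_SMS", "INTERNET", "READ_CONTACTS"},
    {"INTERNET"},
    {"INTERNET", "CAMERA"},
]
labels = [1, 1, 0, 0]

ranking = rank_features(samples, labels)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

In this toy data, `SEND_SMS` separates the two classes perfectly and ranks first, while `INTERNET` appears in every app and carries no information. A selection procedure like the one the abstract describes would then evaluate classifiers on the top-ranked features; how the paper chooses the cut-off is not specified here.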