DCM-GIFT: An Android malware dynamic classification method based on gray-scale image and feature-selection tree
Jinfu Chen, Zian Zhao, Saihua Cai, Xiao Chen, Bilal Ahmad, Luo Song, Kun Wang
Information and Software Technology, volume 176, Article 107560. Published 2024-08-22.
DOI: 10.1016/j.infsof.2024.107560
https://www.sciencedirect.com/science/article/pii/S0950584924001654
Abstract
Context:
The rapid growth of the Android market has made mobile products more popular and convenient. However, given the complexity of the Android application market, identifying malware efficiently and accurately has become a research focus. New types of disguised malware lurk in web pages, links and major app stores. As a result, threats to users' privacy and property have become a major obstacle to the continued development of mobile devices.
Objective:
Most existing malware classification methods rely on a fixed set of one or several feature types, such as static, dynamic and traffic characteristics. Single-feature detection and fixed feature-fusion models limit the dimensions along which software can be examined, and they also produce imbalanced classification results. This paper proposes an Android Malware Dynamic Classification Method based on Gray-scale Image and Feature-selection Tree (DCM-GIFT), which aims to improve and stabilize the precision of Android software classification and to enhance the robustness of malware classification.
Method:
We construct gray-scale images from the raw Android network traffic to retain its time-series characteristics and spatial structure. We also use the dynamic and static information of Android software as auxiliary features to build a feature-selection tree. The feature-selection algorithm helps the classifier dynamically select the optimal feature-fusion scheme, and the resulting fused feature vector is then used to train machine learning clusters and make predictions.
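The traffic-to-image step can be illustrated with a minimal sketch. The abstract does not specify the image size, padding scheme or byte layout, so the function name `bytes_to_gray_image`, the 28 x 28 default and the zero-padding below are assumptions chosen for illustration, not the authors' settings.

```python
from typing import List


def bytes_to_gray_image(payload: bytes, side: int = 28) -> List[List[int]]:
    """Map raw traffic bytes to a side x side gray-scale image.

    Each byte (0-255) becomes one pixel intensity. The image size and
    the zero-padding are assumptions, not values taken from the paper.
    """
    size = side * side
    # Truncate long captures and zero-pad short ones so every image has the same shape.
    padded = payload[:size].ljust(size, b"\x00")
    # Row-major layout preserves the original byte order inside each row.
    return [list(padded[r * side:(r + 1) * side]) for r in range(side)]


# Toy payload of ten bytes rendered as a 4 x 4 image.
image = bytes_to_gray_image(bytes(range(10)), side=4)
for row in image:
    print(row)
```

Keeping the bytes in their original order is what lets the image preserve the sequential structure of the traffic; a different layout or a per-flow split would be equally plausible under the abstract's description.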
Results:
We evaluate DCM-GIFT on multiple datasets published by the Canadian Institute for Cybersecurity in terms of accuracy, precision, recall and F1-measure. The results show that the proposed DCM-GIFT model achieves significantly better prediction performance than other software classification models.
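For reference, the four evaluation metrics named in the Results (accuracy, precision, recall and the F1 measure) can be computed as in the sketch below. The abstract does not say how per-class scores are averaged; macro-averaging is assumed here because it weights every class equally, which matters when classes are imbalanced. The function name `macro_metrics` and the toy labels are hypothetical.

```python
from typing import Dict, Sequence


def macro_metrics(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall and F1 (averaging is an assumption)."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels, not data from the paper.
scores = macro_metrics(
    ["adware", "adware", "benign", "benign"],
    ["adware", "benign", "benign", "benign"],
)
print(scores)
```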
Conclusion:
It can be concluded that: (1) The DCM-GIFT model achieves higher average accuracy, precision, recall and F1-measure than the compared models. (2) The DCM-GIFT model effectively solves the problem of imbalanced classification results in Android software. (3) The DCM-GIFT model achieves the goal of dynamic feature fusion and significantly improves the utilization of system resources.
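The idea of dynamic feature fusion can be sketched as follows. The abstract does not describe how the feature-selection tree is built or traversed, so this example substitutes a plain exhaustive search: the traffic-image features are always kept, every combination of auxiliary feature groups is tried, and the combination with the best score wins. The function `select_fusion`, the group names and the toy scoring function are all hypothetical; in practice the score would be something like a classifier's validation accuracy.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple


def select_fusion(
    features: Dict[str, List[float]],
    primary: str,
    score: Callable[[List[float]], float],
) -> Tuple[Tuple[str, ...], List[float]]:
    """Pick the best-scoring fusion of the primary group with any auxiliary groups.

    Exhaustive search is a stand-in for the paper's feature-selection tree.
    """
    auxiliary = [name for name in features if name != primary]
    best_names: Tuple[str, ...] = (primary,)
    best_vector: List[float] = list(features[primary])
    best_score = float("-inf")
    for k in range(len(auxiliary) + 1):
        for combo in combinations(auxiliary, k):
            names = (primary, *combo)
            # Fuse by concatenating the selected groups in order.
            vector = [v for name in names for v in features[name]]
            current = score(vector)
            if current > best_score:
                best_names, best_vector, best_score = names, vector, current
    return best_names, best_vector


# Toy feature groups and a toy score that prefers vectors summing to 6.
groups = {"image": [1.0, 2.0], "dynamic": [3.0], "static": [10.0]}
chosen, fused = select_fusion(groups, "image", lambda v: -abs(sum(v) - 6.0))
print(chosen, fused)
```

Exhaustive search grows exponentially with the number of auxiliary groups, which is presumably one reason a tree-structured selection is attractive; with only a handful of groups, as here, the brute-force version is enough to show the selection principle.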
About the journal:
Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development
Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "Negative" results and much more. Read the Guide for authors for more information.
The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.