An Adaptive-Feature Centric XGBoost Ensemble Classifier Model for Improved Malware Detection and Classification
Authors: J. Pavithra, S. Selvakumarasamy
Journal: Journal of Cyber Security and Mobility
Published: 2022-12-31
DOI: 10.32604/jcs.2022.031889 (https://doi.org/10.32604/jcs.2022.031889)
Citations: 1
Abstract
Machine learning (ML) is widely used for malware detection and classification, and many ML approaches have been adapted to the problem, yet they still perform poorly because of weaknesses in feature selection and classification. To address this, the paper presents a novel algorithm, the Adaptive Feature Centric XGBoost Ensemble Learner Classifier (“AFC-XG Boost”). The proposed model is designed to handle varying malware detection data sets obtained from Kaggle. It splits the XGBoost classification process into several stages to optimize performance. In the preprocessing stage, the Feature Base Optimizer (“FBO”) algorithm removes noise and tampering from the data set and normalizes it; FBO normalizes the data points and performs noise removal according to the feature values and their base information. The performance of standard XGBoost is likewise improved by feature selection with the Class Based Principal Component Analysis (“CBPCA”) algorithm, which selects features according to the fitness of each feature for the different classes. From the selected features, the method generates a regression tree for each feature considered. From the generated trees, it performs classification by computing the Tree Level Ensemble Similarity (“TLES”) and the Class Level Ensemble Similarity (“CLES”). Using both, the method computes the Class Match Similarity (“CMS”), on which the malware classification is based. The proposed approach achieves 97% accuracy in malware detection and classification, with a lower running time of 34 seconds for 75,000 samples.
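The abstract does not define FBO or CBPCA in detail, so the following is only a minimal sketch of the general idea behind the first two stages (clean and normalize, then rank features by how well they separate the classes). It is not the authors' method: min-max scaling, a z-score filter, and a Fisher-style variance ratio are stand-in techniques chosen for illustration, and every function name and the `z_max` parameter are hypothetical.

```python
from statistics import mean, pstdev


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def drop_noisy_rows(rows, labels, z_max=3.0):
    """Assumed noise filter: drop samples with any feature value more than
    z_max population standard deviations from its column mean."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) for c in cols]
    kept = [
        (row, y) for row, y in zip(rows, labels)
        if all(s == 0 or abs(v - m) / s <= z_max
               for v, m, s in zip(row, mus, sigmas))
    ]
    return [r for r, _ in kept], [y for _, y in kept]


def class_fitness(rows, labels):
    """Assumed fitness measure: between-class over within-class variance,
    computed independently for each feature."""
    classes = sorted(set(labels))
    scores = []
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        overall = mean(col)
        between = 0.0
        within = 0.0
        for c in classes:
            vals = [v for v, y in zip(col, labels) if y == c]
            mu = mean(vals)
            between += len(vals) * (mu - overall) ** 2
            within += sum((v - mu) ** 2 for v in vals)
        if within > 0:
            scores.append(between / within)
        else:
            # Perfectly tight classes: infinitely separable, or no signal.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_features(rows, labels, k):
    """Return the indices of the k features with the highest fitness."""
    scores = class_fitness(rows, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])
```

In a fuller pipeline the selected columns would then be passed to a tree-based learner such as XGBoost. The TLES, CLES, and CMS similarity measures are not specified in the abstract and are therefore not sketched here.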
About the journal:
Journal of Cyber Security and Mobility is an international, open-access, peer-reviewed journal publishing original research, review/survey, and tutorial papers on all cyber security fields, including information, computer and network security, cryptography, and digital forensics, as well as interdisciplinary articles that cover privacy, ethical, legal, and economic aspects of cyber security or emerging solutions drawn from other branches of science, for example, nature-inspired approaches. The journal aims to become an international source of innovation and essential reading for IT security professionals around the world by providing an in-depth and holistic view of the full spectrum of security issues and solutions, ranging from practical to theoretical. Its goal is to bring together researchers and practitioners dealing with the diverse fields of cybersecurity and to cover topics that are equally valuable for professionals and for those new to the field, from all sectors: industry, commerce, and academia. This journal covers diverse security issues in cyberspace and solutions thereof. As cyberspace has moved towards the wireless/mobile world, issues in wireless/mobile communications and those involving mobility aspects will also be published.