Ensemble feature selection with discriminative and representative properties for malware detection
Xiaoyu Zhang, Shupeng Wang, Lei Zhang, Chunjie Zhang, Changsheng Li
2016 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 2016-04-10
DOI: 10.1109/INFCOMW.2016.7562161
Citations: 4
Abstract
Malware data are typically described by extremely high-dimensional features, which place an excessive computational burden on detection methods. For the sake of both effectiveness and efficiency, feature selection is an indispensable part of malware detection. In this paper, we propose an ensemble feature selection method that integrates discriminative and representative properties for malware detection. The most discriminative features are selected based on the labeled data, and the most representative features based on the unlabeled data. The former are the features most distinctive with respect to the classes, and the latter are the features that best represent the data. A comprehensive metric combining the two is subsequently obtained, which retains the most informative features.
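The abstract does not specify the two scoring criteria or how they are combined, so the following is only a minimal sketch of the general idea under stated assumptions. It uses the Fisher score as a stand-in for the discriminative criterion (labeled data), per-feature variance as a stand-in for the representative criterion (unlabeled data), and a weighted sum of min-max-normalized scores as a stand-in for the comprehensive metric. The function names and the `alpha` weight are hypothetical and are not taken from the paper.

```python
from statistics import mean, pvariance


def fisher_scores(X, y, eps=1e-12):
    """Discriminative stand-in: between-class over within-class scatter per feature."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall = mean(column)
        between, within = 0.0, 0.0
        for c in classes:
            vals = [row[j] for row, label in zip(X, y) if label == c]
            between += len(vals) * (mean(vals) - overall) ** 2
            within += len(vals) * pvariance(vals)
        scores.append(between / (within + eps))
    return scores


def variance_scores(X):
    """Representative stand-in: population variance of each feature."""
    return [pvariance([row[j] for row in X]) for j in range(len(X[0]))]


def min_max(scores):
    """Rescale scores to [0, 1] so the two criteria are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def select_features(X_labeled, y, X_unlabeled, k, alpha=0.5):
    """Return indices of the k features with the highest combined score.

    alpha (assumed) weights the discriminative score against the
    representative score; the paper's actual metric may differ.
    """
    disc = min_max(fisher_scores(X_labeled, y))
    rep = min_max(variance_scores(X_unlabeled))
    combined = [alpha * d + (1 - alpha) * r for d, r in zip(disc, rep)]
    ranked = sorted(range(len(combined)), key=lambda j: combined[j], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 2 varies most
    # in the unlabeled set, feature 1 carries little information.
    X_lab = [[0.0, 1.0, 5.0], [0.1, 1.1, 5.2], [1.0, 1.0, 5.1], [1.1, 1.1, 5.1]]
    y_lab = [0, 0, 1, 1]
    X_unl = [[0.5, 1.0, 0.0], [0.5, 1.0, 10.0], [0.6, 1.0, 0.0], [0.6, 1.0, 10.0]]
    print(select_features(X_lab, y_lab, X_unl, k=2))
```

Setting `alpha=1.0` in this sketch reduces it to purely supervised selection, and `alpha=0.0` to purely unsupervised selection; intermediate values trade the two criteria off against each other.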