Feature Selection with Weighted Ensemble Ranking for Improved Classification Performance on the CSE-CIC-IDS2018 Dataset
László Göcs, Z. Johanyák
Comput., p. 147. Published 2023-07-25. Journal Article.
DOI: https://doi.org/10.3390/computers12080147
Feature selection is a crucial step in machine learning. It aims to identify the most relevant features in high-dimensional data in order to reduce the computational complexity of model development and improve generalization performance. Ensemble feature-ranking methods combine the results of several feature-selection techniques to identify a subset of the most relevant features for a given task. In many cases, they produce a more comprehensive ranking of features than any of the individual methods used alone. This paper presents a novel approach to ensemble feature ranking that uses a weighted average of the ranking scores produced by the individual methods. The optimal weights are determined using a Taguchi-type design of experiments. The proposed methodology significantly improves classification performance on the CSE-CIC-IDS2018 dataset, particularly for attack types where traditional average-based combinations of feature-ranking scores result in low classification metrics.
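The difference between an average-based and a weighted combination of ranking scores can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy scores, the min-max normalization step, and the weight values are all assumptions made for illustration. In particular, the weights here are picked by hand, whereas the paper determines them with a Taguchi-type design of experiments, which is not shown.

```python
from typing import Dict, List


def min_max_normalize(scores: List[float]) -> List[float]:
    """Scale one method's scores to [0, 1] so methods are comparable.

    Min-max scaling is an assumption of this sketch, not taken from the paper.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All features tied for this method: it contributes nothing.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def weighted_ensemble_scores(
    scores_by_method: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted average of normalized per-feature scores across methods."""
    total_weight = sum(weights[m] for m in scores_by_method)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    n_features = len(next(iter(scores_by_method.values())))
    combined = [0.0] * n_features
    for method, scores in scores_by_method.items():
        if len(scores) != n_features:
            raise ValueError("every method must score the same features")
        for i, s in enumerate(min_max_normalize(scores)):
            combined[i] += weights[method] * s / total_weight
    return combined


def rank_features(names: List[str], combined: List[float]) -> List[str]:
    """Feature names sorted from highest to lowest combined score."""
    order = sorted(range(len(names)), key=lambda i: combined[i], reverse=True)
    return [names[i] for i in order]


# Hypothetical toy data: three features scored by three made-up methods.
FEATURES = ["f0", "f1", "f2"]
SCORES = {
    "method_a": [0.9, 0.1, 0.5],
    "method_b": [10.0, 30.0, 20.0],
    "method_c": [0.2, 0.8, 0.4],
}
EQUAL = {"method_a": 1.0, "method_b": 1.0, "method_c": 1.0}   # plain average
SKEWED = {"method_a": 4.0, "method_b": 1.0, "method_c": 1.0}  # hand-picked

if __name__ == "__main__":
    print(rank_features(FEATURES, weighted_ensemble_scores(SCORES, EQUAL)))
    print(rank_features(FEATURES, weighted_ensemble_scores(SCORES, SKEWED)))
```

With equal weights the function reduces to the average-based combination. Changing the weights can reorder the features, which is the degree of freedom a design-of-experiments search over weights would exploit.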