A hybrid feature selection technique using chi-square with genetic algorithm
Authors: Ammar Ismael Kadhim, Ahmed Ayad Abdalhameed
Published in: 2022 Muthanna International Conference on Engineering Science and Technology (MICEST), 2022-03-16
DOI: https://doi.org/10.1109/MICEST54286.2022.9790277
A huge amount of information is available in fields such as information technology and computer science, so an automatic text categorization mechanism is required to decide whether a text belongs to a specific category. This paper presents a new hybrid feature selection technique that combines chi-square with a genetic algorithm (GA). The technique separates important from unimportant features while the training model is developed. In the existing GA-based approach, terms and documents are used together as features in the training model in order to obtain optimal weights for the features. To evaluate the suggested approach, experiments are conducted with the Naïve Bayes (NB) and C4.5 decision tree classifiers on two text categorization datasets (BBC sport and BBC news). The empirical findings show that the hybrid technique achieves high categorization efficiency as measured by accuracy, precision, recall and F1-score.
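The abstract does not say how chi-square and the GA are combined, so the following is only a minimal sketch under stated assumptions: chi-square acts as a filter that ranks terms, and a GA then searches bit masks over the retained terms. The function names, the toy documents, the GA settings and the voting-based fitness are all hypothetical; in particular, the fitness is a simple stand-in, not the Naïve Bayes or C4.5 classifiers used in the paper.

```python
import random
from collections import Counter

def chi_square(docs, labels, term, category):
    """Chi-square statistic of the 2x2 table: term presence vs. category membership."""
    a = b = c = d = 0
    for words, label in zip(docs, labels):
        has_term, in_cat = term in words, label == category
        if has_term and in_cat:
            a += 1
        elif has_term:
            b += 1
        elif in_cat:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def chi_square_filter(docs, labels, top_k):
    """Keep the top_k terms ranked by their best chi-square score over all categories."""
    vocab = sorted(set().union(*docs))
    cats = sorted(set(labels))
    score = {t: max(chi_square(docs, labels, t, c) for c in cats) for t in vocab}
    return sorted(vocab, key=lambda t: (-score[t], t))[:top_k]

def vote_accuracy(docs, labels, terms):
    """Stand-in fitness (assumption): each term votes for the category it occurs in most."""
    home = {}
    for t in terms:
        counts = Counter(l for w, l in zip(docs, labels) if t in w)
        home[t] = counts.most_common(1)[0][0]
    correct = 0
    for words, label in zip(docs, labels):
        votes = Counter(home[t] for t in terms if t in words)
        if votes and votes.most_common(1)[0][0] == label:
            correct += 1
    return correct / len(docs)

def ga_select(docs, labels, candidates, generations=30, pop_size=20,
              mutation_rate=0.1, seed=0):
    """GA over bit masks of the candidate terms (requires at least two candidates)."""
    rng = random.Random(seed)
    n = len(candidates)

    def fitness(mask):
        chosen = [t for t, bit in zip(candidates, mask) if bit]
        if not chosen:
            return 0.0
        # Small penalty per selected term so that smaller subsets are preferred.
        return vote_accuracy(docs, labels, chosen) - 0.01 * len(chosen)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection; the best survive
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return [t for t, bit in zip(candidates, best) if bit]

# Toy corpus (hypothetical); each document is a set of terms.
docs = [{"goal", "match", "team"}, {"match", "score", "team"},
        {"election", "vote", "party"}, {"party", "vote", "policy"}]
labels = ["sport", "sport", "politics", "politics"]

candidates = chi_square_filter(docs, labels, top_k=4)
selected = ga_select(docs, labels, candidates)
print(candidates, selected)
```

Filtering first keeps the GA search space small, since the mask length equals the number of retained terms rather than the full vocabulary. Swapping `vote_accuracy` for the cross-validated accuracy of a real classifier would bring the sketch closer to the evaluation setup described above.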