Authors: Rozlini Mohamed, M. M. Yusof, Noorhaniza Wahid
Published in: 2016 6th IEEE International Conference on Control System, Computing and Engineering (ICCSCE), pp. 151-156, 2016
DOI: 10.1109/ICCSCE.2016.7893562 (https://doi.org/10.1109/ICCSCE.2016.7893562)
The effectiveness of Bat algorithm for data handling in various applications
Feature selection is a technique for removing irrelevant data and finding the most relevant features, with the aim of increasing classification accuracy. It is widely used in applications such as medicine, agriculture and information technology. To produce better classification results, feature selection has been applied in many classification studies as a preprocessing step, in which only a subset of the features in a dataset is used rather than all of them. This research aims to identify the appropriate data types according to the percentage of attribute reduction and the classification performance. In the experiments, the effectiveness of the Bat algorithm for data handling is tested with respect to the type of data and the number of attributes in generic datasets. Ten datasets from the UCI repository, drawn from various applications, are used. Features are selected using the Bat algorithm and evaluated with three classifiers: k-Nearest Neighbor (kNN), Naïve Bayes (NB) and Decision Tree (DT). The paper then analyzes the performance of all classifiers with and without feature selection in terms of accuracy, sensitivity, F-Measure and ROC. The research found that, although the percentage of reduction is high, it yields the lowest classification performance because the type of data and the number of attributes are not appropriate.
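
The abstract does not say which variant of the Bat algorithm was used or how it was configured, so the following is only a minimal sketch of wrapper-style feature selection with a generic binary Bat algorithm, not the authors' implementation. Everything specific in it is an assumption made for illustration: the leave-one-out 1-NN fitness, the per-feature penalty weight, the sigmoid transfer function, the velocity clamp, the single-bit-flip local search, the parameter defaults, and the toy data produced by the hypothetical `make_toy_data` helper (the paper itself uses UCI datasets).

```python
import math
import random

def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        best_d, best_label = float("inf"), None
        for k, xk in enumerate(X):
            if k == i:
                continue
            d = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    """Accuracy minus a small penalty per selected feature (assumed weighting)."""
    return loo_1nn_accuracy(X, y, mask) - penalty * sum(mask)

def binary_bat_select(X, y, n_bats=8, n_iter=20, f_min=0.0, f_max=2.0,
                      alpha=0.9, gamma=0.9, rate0=0.5, v_max=6.0, seed=0):
    rng = random.Random(seed)
    d = len(X[0])
    # One bat starts at the full feature set, so the result is never worse
    # than the "no feature selection" baseline under this fitness.
    pos = [[1] * d] + [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_bats - 1)]
    vel = [[0.0] * d for _ in range(n_bats)]
    loud = [1.0] * n_bats       # loudness A_i
    rate = [rate0] * n_bats     # pulse emission rate r_i
    fit = [fitness(X, y, p) for p in pos]
    b = max(range(n_bats), key=fit.__getitem__)
    best, best_fit = pos[b][:], fit[b]
    for t in range(1, n_iter + 1):
        for i in range(n_bats):
            freq = f_min + (f_max - f_min) * rng.random()
            cand = []
            for j in range(d):
                v = vel[i][j] + (pos[i][j] - best[j]) * freq
                vel[i][j] = max(-v_max, min(v_max, v))       # clamp (assumption)
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))    # sigmoid transfer
                cand.append(1 if rng.random() < prob else 0)
            if rng.random() > rate[i]:
                # Local search (assumption): flip one random bit of the current best.
                cand = best[:]
                j = rng.randrange(d)
                cand[j] = 1 - cand[j]
            cand_fit = fitness(X, y, cand)
            if rng.random() < loud[i] and cand_fit > fit[i]:
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= alpha                              # bat gets quieter
                rate[i] = rate0 * (1.0 - math.exp(-gamma * t))
            if fit[i] > best_fit:
                best, best_fit = pos[i][:], fit[i]
    return best, best_fit

def make_toy_data(n=40, n_noise=6, seed=1):
    """Hypothetical data: two informative features followed by pure-noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        informative = [label + rng.gauss(0, 0.3), -label + rng.gauss(0, 0.3)]
        X.append(informative + [rng.gauss(0, 1) for _ in range(n_noise)])
        y.append(label)
    return X, y

X, y = make_toy_data()
mask, score = binary_bat_select(X, y)
reduction = 100.0 * (len(mask) - sum(mask)) / len(mask)
print(mask, round(score, 3), f"{reduction:.1f}% of attributes removed")
```

The `reduction` value corresponds to the "percentage of attribute reduction" discussed in the abstract. To mirror the study's comparison, the selected columns would then be passed to kNN, NB and DT classifiers and scored with and without selection.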