The effectiveness of Bat algorithm for data handling in various applications

2016 6th IEEE International Conference on Control System, Computing and Engineering (ICCSCE), pp. 151-156 | Pub Date: 2016-01-01 | DOI: 10.1109/ICCSCE.2016.7893562

Rozlini Mohamed, M. M. Yusof, Noorhaniza Wahid


Citations: 2

Abstract

Feature selection is a technique for discarding irrelevant data and finding the most relevant features in order to increase classification accuracy. It is widely used in application areas such as medicine, agriculture and information technology. To produce better classification results, feature selection has been applied in many classification studies as part of the preprocessing step, where only a subset of the features is used rather than all the features of a particular dataset. This research aims to find the appropriate data types according to the percentage of attribute reduction and the classification performance. In the experiments, the effectiveness of the Bat algorithm's data handling is tested against the type of data and the number of attributes in generic datasets. Ten datasets from the UCI repository, drawn from various applications, are used. Features are selected using the Bat algorithm and evaluated with three classifiers: k-Nearest Neighbor (kNN), Naïve Bayes (NB) and Decision Tree (DT). The paper then analyzes the performance of all classifiers with and without feature selection in terms of accuracy, sensitivity, F-measure and ROC. The research found that even when the percentage of reduction is high, classification performance is lowest where the type of data and the number of attributes are not appropriate.
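The wrapper-style selection described in the abstract can be illustrated with a minimal sketch. The abstract does not state which Bat algorithm variant, parameters or fitness function the authors used, so everything below is an assumption: a simplified binary Bat-style search with a sigmoid transfer function, a leave-one-out 1-NN accuracy as fitness, and hypothetical names (`loo_1nn_accuracy`, `binary_bat_select`, `reduction_percentage`) and a toy dataset invented for illustration.

```python
import math
import random


def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        best_d, best_label = float("inf"), None
        for k, xk in enumerate(X):
            if k == i:
                continue
            d = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def reduction_percentage(mask):
    """Share of attributes discarded by the mask, in percent."""
    return 100.0 * (1.0 - sum(mask) / len(mask))


def binary_bat_select(X, y, n_bats=8, n_iter=20, f_min=0.0, f_max=2.0,
                      alpha=0.9, gamma=0.9, r0=0.5, seed=0):
    """Simplified binary Bat-style search (assumed variant); returns (mask, fitness)."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(n_bats)]
    vel = [[0.0] * n_feat for _ in range(n_bats)]
    fit = [loo_1nn_accuracy(X, y, p) for p in pos]
    loud = [1.0] * n_bats   # loudness A_i, decays when a bat accepts a move
    rate = [r0] * n_bats    # pulse emission rate r_i
    b = max(range(n_bats), key=fit.__getitem__)
    best, best_fit = pos[b][:], fit[b]
    for t in range(1, n_iter + 1):
        for i in range(n_bats):
            freq = f_min + (f_max - f_min) * rng.random()
            cand = []
            for j in range(n_feat):
                # Assumption of this sketch: the sign is chosen so that
                # velocity pulls each bit toward the current best mask.
                v = vel[i][j] + (best[j] - pos[i][j]) * freq
                vel[i][j] = max(-6.0, min(6.0, v))  # clamp keeps exp() stable
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                cand.append(1 if rng.random() < prob else 0)
            if rng.random() > rate[i]:
                # Local search: flip one random bit of the best mask.
                cand = best[:]
                j = rng.randrange(n_feat)
                cand[j] = 1 - cand[j]
            cand_fit = loo_1nn_accuracy(X, y, cand)
            if cand_fit >= fit[i] and rng.random() < loud[i]:
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= alpha
                rate[i] = r0 * (1.0 - math.exp(-gamma * t))
            if cand_fit > best_fit:
                best, best_fit = cand[:], cand_fit
    return best, best_fit


# Toy data (invented): feature 0 separates the classes, the rest is noise.
X_toy = [[0.0, 5, 1, 9], [0.1, 1, 8, 2], [0.2, 7, 3, 6],
         [1.0, 2, 9, 1], [1.1, 6, 2, 8], [1.2, 3, 7, 4]]
y_toy = [0, 0, 0, 1, 1, 1]
mask, acc = binary_bat_select(X_toy, y_toy)
print("mask:", mask, "accuracy:", acc,
      "reduction %:", reduction_percentage(mask),
      "all-features accuracy:", loo_1nn_accuracy(X_toy, y_toy, [1, 1, 1, 1]))
```

Reporting the reduction percentage next to the accuracy with and without selection mirrors the two criteria named in the abstract. A real replication would swap in the UCI datasets and the kNN, NB and DT classifiers, and would also report sensitivity, F-measure and ROC.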
