S. Sharmin, A. Ali, Muhammad Asif Hossain Khan, M. Shoyaib
Feature Selection and Discretization based on Mutual Information
2017 IEEE International Conference on Imaging, Vision & Pattern Recognition (icIVPR)
DOI: 10.1109/ICIVPR.2017.7890885
Citations: 19
Abstract
Feature selection and discretization are important research topics in pattern recognition and data mining. However, existing research rarely addresses both problems at the same time. This paper addresses them jointly with a heuristic called discretization and selection of features based on mutual information (DSM). Experimental results on 15 datasets show that the proposed DSM outperforms a number of state-of-the-art feature selection and discretization algorithms. On average, its accuracy with a Support Vector Machine surpasses that of the best-performing state-of-the-art algorithms by 5%. Moreover, for datasets with a large number of features, it achieves promising accuracy with far fewer features than the competing algorithms.
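The abstract does not describe how DSM works internally, so the following is only a minimal sketch of the general idea it builds on: discretize each continuous feature, then score it by its mutual information with the class label. The equal-width binning, the function names (`equal_width_bins`, `mutual_information`, `rank_features`), and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each continuous value to a bin index in [0, n_bins - 1].

    Equal-width binning is an assumption here, not the paper's discretizer.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # constant feature: everything in one bin
    width = (hi - lo) / n_bins
    # min(...) keeps the maximum value inside the last bin
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(columns, labels, n_bins=2):
    """Return feature indices sorted by decreasing mutual information, plus the scores."""
    scores = [
        mutual_information(equal_width_bins(col, n_bins), labels)
        for col in columns
    ]
    order = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
informative = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
uninformative = [1.0, 9.0, 2.0, 8.0, 1.0, 9.0, 2.0, 8.0]

order, scores = rank_features([informative, uninformative], labels)
print(order)   # [0, 1]
print(scores)  # [1.0, 0.0]
```

In this sketch the number of bins is fixed in advance and each feature is scored independently. A joint method such as the one the abstract describes would presumably choose the discretization and the selected features together, which this example does not attempt.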