A Discretization Algorithm Based on Clustering and CAIR Criterion
Authors: Chaoqun Yi, Jianping Li, Enming Dong
DOI: https://doi.org/10.1109/icnc.2011.6022517
Published in: International Conference on Computing, Networking, and Communications : [proceedings]
Publication date: 2011-01-01
Citation count: 2
Abstract
Discretization algorithms play an important role in machine learning. Traditionally, discretization methods based on the class-attribute contingency table take every boundary point as an initial partition point. This ignores the data distribution and produces a large number of initial partition points, which leads to heavy computation and unreasonable discretization schemes. To account for both the interdependence between class and attribute and the data distribution, a discretization algorithm based on clustering and the CAIR criterion is proposed. It uses NCL clustering to find the initial partition points and then uses the CAIR criterion as a threshold to reselect them. We feed the data discretized by our method into an SVM classifier. The experimental results show that the algorithm yields fewer rules and achieves higher classification accuracy.
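The abstract does not define the CAIR criterion, so the following is only a minimal sketch under an assumption: it uses the definition common in the discretization literature, where CAIR (class-attribute interdependence redundancy) is the mutual information between class and interval divided by their joint Shannon entropy, both computed from the class-attribute contingency table. The function names `contingency_counts` and `cair`, the sample data, and the candidate cut points are hypothetical. The NCL clustering step and the paper's threshold-based reselection are not reproduced here.

```python
import math
from bisect import bisect_right
from collections import Counter


def contingency_counts(values, labels, cut_points):
    """Count (class, interval) co-occurrences for one continuous attribute.

    A value v falls into interval k, where k is the number of cut points <= v.
    """
    cuts = sorted(cut_points)
    counts = Counter()
    for v, y in zip(values, labels):
        counts[(y, bisect_right(cuts, v))] += 1
    return counts


def cair(values, labels, cut_points):
    """Assumed CAIR definition: I(class; interval) / H(class, interval)."""
    counts = contingency_counts(values, labels, cut_points)
    n = sum(counts.values())
    class_total = Counter()
    interval_total = Counter()
    for (y, r), c in counts.items():
        class_total[y] += c
        interval_total[r] += c

    mutual_info = 0.0
    joint_entropy = 0.0
    for (y, r), c in counts.items():
        p = c / n
        # p(y, r) / (p(y) * p(r)) simplifies to c * n / (class_total * interval_total)
        mutual_info += p * math.log2(c * n / (class_total[y] * interval_total[r]))
        joint_entropy -= p * math.log2(p)
    return mutual_info / joint_entropy if joint_entropy > 0 else 0.0


# Hypothetical data: one attribute, two classes separated near 4.5.
values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Compare two hypothetical candidate partitions by their CAIR score.
for cuts in ([2.5], [4.5]):
    print(cuts, round(cair(values, labels, cuts), 4))
```

Under this definition, a partition whose intervals line up with the class labels scores higher than one that splits a class across intervals, which is the property a threshold on CAIR would exploit when deciding which candidate partition points to keep.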