CLoPAR: Classification based on Predictive Association Rules
M. N. Dehkordi, M. H. Shenassa
2006 3rd International IEEE Conference Intelligent Systems, 2006-09-01
DOI: 10.1109/IS.2006.348467 (https://doi.org/10.1109/IS.2006.348467)
Citations: 7
Abstract
Recent studies in data mining have proposed a new classification approach, called associative classification, which, according to several reports, such as Liu et al. (1998), achieves higher classification accuracy than traditional classification approaches such as C4.5. However, the approach also suffers from two major deficiencies: (1) it generates a very large number of association rules, which leads to high processing overhead; and (2) its confidence-based rule evaluation measure may lead to overfitting. In comparison with associative classification, traditional rule-based classifiers, such as C4.5, FOIL and RIPPER, are substantially faster, but their accuracy, in most cases, may not be as high. In this paper, we propose a new classification approach, CLoPAR (Classification based on Predictive Association Rules), which combines the advantages of both associative classification and traditional rule-based classification. Instead of generating a large number of candidate rules as in associative classification, CLoPAR adopts a greedy algorithm to generate rules directly from training data. Moreover, CLoPAR generates and tests more rules than traditional rule-based classifiers to avoid missing important rules. To avoid overfitting, CLoPAR uses expected accuracy to evaluate each rule and uses the best k rules in prediction.
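The abstract does not say how expected accuracy is computed or how the best k rules are combined, so the following is only a minimal sketch under stated assumptions. It assumes a Laplace-style estimate, (correct + 1) / (covered + number of classes), and assumes that each class is scored by the mean estimate of its best k matching rules. The `Rule` type, `laplace_accuracy`, `predict`, and the toy data are hypothetical names introduced for illustration and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: if every condition holds, predict `label`."""
    conditions: frozenset  # (attribute, value) pairs that must all match
    label: str
    covered: int           # training examples matching the conditions
    correct: int           # covered examples whose class equals `label`


def laplace_accuracy(rule: Rule, num_classes: int) -> float:
    # Assumed estimator, not confirmed by the abstract. Unlike raw
    # confidence (correct / covered), it shrinks rules with little support.
    return (rule.correct + 1) / (rule.covered + num_classes)


def predict(rules, example, num_classes, k=5):
    """Return the class whose best k matching rules have the highest mean estimate."""
    by_class = defaultdict(list)
    for rule in rules:
        if rule.conditions <= example:  # all conditions appear in the example
            by_class[rule.label].append(laplace_accuracy(rule, num_classes))
    if not by_class:
        return None  # no rule fires; a default class would be needed here

    def score(label):
        top = sorted(by_class[label], reverse=True)[:k]
        return sum(top) / len(top)

    return max(by_class, key=score)


# Toy data: the first rule has perfect confidence but covers only 2 examples.
rules = [
    Rule(frozenset({("outlook", "sunny")}), "no", covered=2, correct=2),
    Rule(frozenset({("humidity", "normal")}), "yes", covered=40, correct=36),
]
example = {("outlook", "sunny"), ("humidity", "normal")}
print(predict(rules, example, num_classes=2, k=1))
```

In this toy case, ranking by raw confidence would favour the first rule (2 of 2 correct), while the assumed estimate favours the better-supported second rule. That is the kind of overfitting the abstract attributes to confidence-based evaluation.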