{"title":"Addressing Missing Attributes during Data Mining Using Frequent Itemsets and Rough Set Based Predictions","authors":"Jiye Li, N. Cercone, R. Cohen","doi":"10.1109/GrC.2007.144","DOIUrl":null,"url":null,"abstract":"In this paper, we present an improved method for predicting missing attribute values in data sets. We make use of frequent itemsets, generated from the association rules algorithm, displaying the correlations between different items in a set of transactions. In particular, we consider a database as a set of transactions and each data instance as an itemset. Then frequent itemsets can be used as a knowledge base to predict missing attribute values. Our approach integrates the RSFit method based on rough sets theory that produces faster predictions by considering similarities of attribute value pairs, but only for those attributes contained in the core or reduct of the data set. Using empirical studies on UCI and other real world data sets, we demonstrate a significant increase in prediction accuracy obtained from our new integrated approach, referred to as ItemRSFit.","PeriodicalId":259430,"journal":{"name":"2007 IEEE International Conference on Granular Computing (GRC 2007)","volume":"120 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 IEEE International Conference on Granular Computing (GRC 2007)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/GrC.2007.144","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
In this paper, we present an improved method for predicting missing attribute values in data sets. We make use of frequent itemsets, generated from the association rules algorithm, displaying the correlations between different items in a set of transactions. In particular, we consider a database as a set of transactions and each data instance as an itemset. Then frequent itemsets can be used as a knowledge base to predict missing attribute values. Our approach integrates the RSFit method based on rough sets theory that produces faster predictions by considering similarities of attribute value pairs, but only for those attributes contained in the core or reduct of the data set. Using empirical studies on UCI and other real world data sets, we demonstrate a significant increase in prediction accuracy obtained from our new integrated approach, referred to as ItemRSFit.