Feature granularity for cardiac datasets using Rough Set

2011 IEEE International Conference on Computer Science and Automation Engineering. Pub Date: 2011-06-10. DOI: 10.1109/CSAE.2011.5952485

N. Sulaiman, S. Shamsuddin


Citations: 6

Abstract

Rough Set theory has been applied successfully in diverse domains, including medicine. It is an efficient tool for handling large datasets with missing values and for granulating features. However, the large numbers of reducts and rules it generates must be selected carefully to reduce the processing cost of handling many parameters during classification. Hence, the primary objective of this study is to identify the significant reducts and rules prior to the classification of cardiac datasets from the National Heart Institute (NHI), Malaysia. Comprehensive analyses are presented to eliminate insignificant attributes, reducts, and rules and thereby improve classification. Reducts that contain the core attributes and have minimal cardinality are preferred for constructing a new decision table, which subsequently yields high classification rates. In addition, rules with the highest support, shorter length, and a high Rule Importance Measure (RIM) are favored, since they indicate high-quality performance. The results are compared in terms of classification accuracy between the original decision table and the new decision table. They demonstrate that rules with the highest support values are more significant than rules with shorter length.
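The reduct-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy decision table, the attribute names `a` to `d`, and the helper functions are all hypothetical, and the brute-force search assumes a small, consistent table in which a reduct is a minimal attribute subset that still determines the decision.

```python
from itertools import combinations, product

# Hypothetical condition attributes; not taken from the NHI data.
ATTRS = ("a", "b", "c", "d")


def build_toy_table():
    """Build a small consistent decision table of (conditions, decision) rows."""
    rows = []
    for a, c, d in product((0, 1), repeat=3):
        b = c ^ d  # b is redundant given c and d
        rows.append(({"a": a, "b": b, "c": c, "d": d}, a ^ b))
    return rows


def is_consistent(attrs, rows):
    """True if rows that agree on attrs never disagree on the decision."""
    seen = {}
    for cond, decision in rows:
        key = tuple(cond[name] for name in attrs)
        if seen.setdefault(key, decision) != decision:
            return False
    return True


def find_reducts(attrs, rows):
    """Brute force: minimal attribute subsets that keep the table consistent."""
    reducts = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            candidate = frozenset(subset)
            if any(r <= candidate for r in reducts):
                continue  # contains a smaller reduct, so it is not minimal
            if is_consistent(subset, rows):
                reducts.append(candidate)
    return reducts


def core_of(reducts):
    """Core attributes: those shared by every reduct."""
    return frozenset.intersection(*reducts) if reducts else frozenset()


def preferred_reducts(reducts):
    """Keep the reducts of minimal cardinality.

    Every exact reduct already contains the core, so only size is compared.
    """
    if not reducts:
        return []
    smallest = min(len(r) for r in reducts)
    return [r for r in reducts if len(r) == smallest]


if __name__ == "__main__":
    found = find_reducts(ATTRS, build_toy_table())
    print("reducts:", [sorted(r) for r in found])
    print("core:", sorted(core_of(found)))
    print("preferred:", [sorted(r) for r in preferred_reducts(found)])
```

On this toy table the reducts are {a, b} and {a, c, d}, the core is {a}, and the minimal-cardinality reduct is {a, b}. The exhaustive search is exponential in the number of attributes, so it is suitable only for illustration; real datasets generally call for heuristic or approximate reduct computation.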

