Isolating critical data points from boundary region with feature selection

A. Anitha, E. Kannan
{"title":"Isolating critical data points from boundary region with feature selection","authors":"A. Anitha, E. Kannan","doi":"10.1109/ICCIC.2014.7238403","DOIUrl":null,"url":null,"abstract":"Immense databases may contain critical instances or chunks-a small heap of records or instances which has domain specific information. These chunks of information are useful in future decision making for improving classification accuracy for labeling of critical, unlabeled instances by reducing false positives and false negatives. Classification process may be assessed based on efficiency and effectiveness. Efficiency is concerned with the time to process the records by reducing attributes in the data set and effectiveness is the improvement in classification accuracy using crucial information. This work focuses on reducing the attributes in the large databases, put forwards an innovative procedure for computing criticality which isolates critical instances from the boundary region and are validated using real-world data set. This work also uses different attribute reduction technique used for fetching the critical instances to reduce the computational time. Results of the experiments show that only subsets of instances are isolated as critical nuggets. It is found that use of attribute reduction technique decreases the computational time. The data set with reduced attributes does not affect the classification accuracy and produces the same result as with the original data set. It also reveals that these critical records helps in improving classification accuracy substantially along with reduced computational time and are validated using real-life data sets.","PeriodicalId":187874,"journal":{"name":"2014 IEEE International Conference on Computational Intelligence and Computing Research","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE International Conference on Computational Intelligence and Computing Research","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCIC.2014.7238403","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3

Abstract

Immense databases may contain critical instances, or chunks: small collections of records that carry domain-specific information. These chunks are useful in future decision making, improving classification accuracy when labeling critical, unlabeled instances by reducing false positives and false negatives. The classification process can be assessed in terms of efficiency and effectiveness: efficiency concerns the time taken to process the records, which is reduced by removing attributes from the data set, while effectiveness is the improvement in classification accuracy gained from this crucial information. This work focuses on reducing the attributes in large databases and puts forward an innovative procedure for computing criticality that isolates critical instances from the boundary region; the procedure is validated on real-world data sets. The work also applies different attribute reduction techniques while fetching the critical instances, in order to reduce computational time. Experimental results show that only small subsets of instances are isolated as critical nuggets, and that attribute reduction decreases computational time. The data set with reduced attributes does not hurt classification accuracy and produces the same result as the original data set. The results also reveal that these critical records help improve classification accuracy substantially while reducing computational time, as validated on real-life data sets.
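
The abstract does not spell out how criticality is computed, but the pipeline it describes (attribute reduction followed by scoring instances near the class boundary) can be sketched. The following is a minimal illustration, not the authors' published procedure: the SelectKBest filter, the k-NN label-disagreement score, and the 0.4 threshold are all assumptions introduced here for exposition.

```python
# A hypothetical sketch of "attribute reduction, then boundary-region
# scoring". The paper's actual criticality formula is not given in the
# abstract; the k-NN disagreement score below is an illustrative stand-in.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neighbors import NearestNeighbors

# A real-world data set, standing in for the paper's evaluation data.
X, y = load_breast_cancer(return_X_y=True)

# Step 1: attribute reduction (feature selection) to cut processing time.
X_reduced = SelectKBest(f_classif, k=10).fit_transform(X, y)

# Step 2: a hypothetical criticality score -- the fraction of an instance's
# k nearest neighbours that carry a different class label. Instances whose
# neighbourhoods mix labels lie in the boundary region between classes.
k = 7
nn = NearestNeighbors(n_neighbors=k + 1).fit(X_reduced)
_, idx = nn.kneighbors(X_reduced)       # idx[:, 0] is the point itself
neighbour_labels = y[idx[:, 1:]]        # labels of the k true neighbours
criticality = (neighbour_labels != y[:, None]).mean(axis=1)

# Step 3: isolate the small subset of boundary-region instances
# ("critical nuggets") whose score exceeds an illustrative threshold.
critical_mask = criticality >= 0.4
print(f"{critical_mask.sum()} of {len(y)} instances flagged as critical")
```

Under this reading, an instance is critical when its local neighbourhood mixes class labels, which matches the boundary region the paper targets; scoring on the reduced attribute set rather than the full one is what would deliver the reported savings in computational time.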