Distributed noun attribute based on its first appearance for text document clustering
S. Vijayalakshmi, D. Manimegalai
2014 IEEE International Conference on Computational Intelligence and Computing Research (published 2014-12-01)
DOI: 10.1109/ICCIC.2014.7238544 (https://doi.org/10.1109/ICCIC.2014.7238544)
Citations: 0
Abstract
Attribute selection plays a vital role in improving the quality of clustering. It has been explored extensively in supervised learning but only occasionally in unsupervised learning. We present a comparative study of three attribute selection techniques; the study reveals combinations that have not yet been attempted and provides guidelines for selecting attributes. The suggested framework is primarily concerned with determining and selecting key distributional noun attributes: the original noun attributes are ranked by their importance measure scores, without class information, and the top-ranked attributes are selected. Experimental results on the Reuters, 20 Newsgroups, WebKB and SCJC (Specific Crime Judgment Corpus) datasets indicate that the algorithm, used with different importance scores, is able to identify the important attributes.
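The abstract does not specify the importance measures, so the following is only a minimal sketch of the general idea: noun attributes are weighted by the position of their first appearance and then ranked without class labels. The function names, the weighting `1 - i/n`, and the use of variance across documents as the importance score are all assumptions made for illustration, not the authors' method. The input is assumed to be documents already reduced to lists of nouns.

```python
from statistics import pvariance


def first_appearance_matrix(docs, vocabulary):
    """Build a document-by-noun matrix of first-appearance weights.

    Assumption (hypothetical weighting): a noun first seen at token
    index i in a document of n tokens gets weight 1 - i/n, so earlier
    first appearances weigh more. Absent nouns get 0.0.
    """
    rows = []
    for tokens in docs:
        n = len(tokens)
        first = {}
        for i, tok in enumerate(tokens):
            if tok not in first:
                first[tok] = 1.0 - i / n
        rows.append([first.get(term, 0.0) for term in vocabulary])
    return rows


def rank_attributes(docs, top_k):
    """Rank nouns by an unsupervised importance score and keep top_k.

    Assumption: the score is the variance of a noun's weight across
    documents (a stand-in measure; no class information is used).
    """
    vocabulary = sorted({tok for tokens in docs for tok in tokens})
    rows = first_appearance_matrix(docs, vocabulary)
    scores = {
        term: pvariance([row[j] for row in rows])
        for j, term in enumerate(vocabulary)
    }
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(vocabulary, key=lambda t: (-scores[t], t))
    return ranked[:top_k], scores


if __name__ == "__main__":
    # Toy documents, already reduced to nouns (hypothetical data).
    docs = [
        ["court", "judge", "theft"],
        ["court", "market", "price"],
        ["court", "team", "match"],
    ]
    top, scores = rank_attributes(docs, top_k=2)
    print(top)
```

In this toy input, "court" opens every document, so its weight is identical everywhere and its variance is zero; it is ranked last because it cannot help separate the documents. Any other unsupervised score could be substituted for the variance in `rank_attributes` without changing the rest of the sketch.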