An evolutionary algorithm for Feature Selective Double Clustering of text documents
Authors: Seyednaser Nourashrafeddin, E. Milios, D. Arnold
Published in: 2013 IEEE Congress on Evolutionary Computation, 2013-06-20
DOI: https://doi.org/10.1109/CEC.2013.6557603
Citation count: 3
Abstract
We propose FSDC, an evolutionary algorithm for Feature Selective Double Clustering of text documents. We first cluster the terms that occur in the document corpus. The term clusters are then fed into multiobjective genetic algorithms, which prune non-informative terms and form sets of keyterms representing topics. Based on the topic keyterms found, representative documents for each topic are extracted. These documents are then used as seeds to cluster all documents in the dataset. FSDC is compared with several well-known co-clustering algorithms on real text datasets, and the experimental results show that it can outperform them.
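
To make the last two steps of the pipeline concrete (extracting representative documents from topic keyterms, then using them as seeds to cluster every document), here is a minimal Python sketch. It is not the authors' implementation. The term clustering and the multiobjective genetic pruning are not reproduced, so the keyterm sets are simply written by hand. The toy corpus, the function names, the keyterm-count scoring used to pick seeds, and the cosine nearest-centroid assignment are all assumptions made for illustration.

```python
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector for one document."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def seed_documents(vectors, keyterms, n_seeds=1):
    """Indices of the n_seeds documents with the most keyterm occurrences.

    Assumption: a plain keyterm count stands in for whatever
    representativeness criterion the paper actually uses.
    """
    scores = [sum(vec[t] for t in keyterms) for vec in vectors]
    ranked = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_seeds]

def cluster_from_seeds(docs, topic_keyterms, n_seeds=1):
    """Assign each document to the topic whose seed centroid is most similar."""
    vectors = [tf_vector(d) for d in docs]
    centroids = []
    for keyterms in topic_keyterms:
        centroid = Counter()
        for i in seed_documents(vectors, keyterms, n_seeds):
            centroid.update(vectors[i])  # sum the seed documents' term counts
        centroids.append(centroid)
    return [
        max(range(len(centroids)), key=lambda k: cosine(vec, centroids[k]))
        for vec in vectors
    ]

# Hypothetical toy corpus and hand-written keyterm sets (one set per topic).
docs = [
    "stock market prices fell as investors sold shares",
    "the team won the football match with a late goal",
    "investors watch the market for rising stock prices",
    "the goal keeper saved the match for the football team",
    "shares and bonds trade on every market",
]
topic_keyterms = [
    {"stock", "market", "investors"},
    {"football", "match", "goal"},
]

labels = cluster_from_seeds(docs, topic_keyterms)
print(labels)
```

In this sketch the seed centroid carries the full vocabulary of the seed document, so a document such as the last one can be assigned to the finance topic through the shared term "shares" even though it contains only one of the hand-written keyterms.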