Multi-objectives-based text clustering technique using K-mean algorithm
L. Abualigah, A. Khader, M. Al-Betar
2016 7th International Conference on Computer Science and Information Technology (CSIT), 2016-07-13
DOI: 10.1109/CSIT.2016.7549464 (https://doi.org/10.1109/CSIT.2016.7549464)
Citations: 43
Abstract
Text document clustering is a popular unsupervised text mining tool. It partitions a collection of text documents into clusters of similar documents based on a distance or similarity measure, as decided by an objective function. Text clustering algorithms often make prior assumptions to satisfy the objective function, which is optimized through either traditional or meta-heuristic techniques. In text clustering, the decision about which cluster each document belongs to is made using the objective function. Clustering algorithms normally perform poorly when the configuration of the objective function is not sound and complete. Therefore, we propose a multi-objectives-based method that combines a distance measure and a similarity measure to improve text clustering. Combining these two evaluation criteria yields a robust alternative in several situations. Multi-objective functions are not yet popular in the text clustering domain, even though the choice of objective function is a core issue that affects clustering performance. The performance of the multi-objective function is investigated using the k-mean text clustering technique. The experiments were conducted on seven standard text datasets. The results show that the proposed multi-objectives-based method outperforms the other measures in terms of text clustering performance, as evaluated by two common clustering measures, Accuracy and F-measure.
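The abstract does not state how the distance and similarity measures are combined, so the following is only a minimal sketch under assumptions, not the authors' formulation. It assumes cosine similarity and Euclidean distance as the two criteria, and the function `combined_score` (similarity divided by one plus distance) is a hypothetical combination chosen purely for illustration. All function names are likewise illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def combined_score(doc, centroid):
    """Hypothetical combined criterion (an assumption, not from the paper):
    high similarity and low distance both raise the score."""
    return cosine_similarity(doc, centroid) / (1.0 + euclidean_distance(doc, centroid))


def kmeans_combined(docs, initial_centroids, iterations=10):
    """K-means-style loop that assigns each document to the centroid
    with the highest combined score, then recomputes centroids as means."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(docs)
    for _ in range(iterations):
        # Assignment step: pick the best-scoring centroid for each document.
        labels = [
            max(range(len(centroids)), key=lambda k: combined_score(d, centroids[k]))
            for d in docs
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [d for d, label in zip(docs, labels) if label == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy example: four two-term "documents" forming two obvious groups.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels, centroids = kmeans_combined(docs, [docs[0], docs[2]])
print(labels)
```

In practice the documents would be represented as term-weight vectors (for example TF-IDF), and the relative weighting of the two criteria would need to be tuned; the sketch fixes one arbitrary choice only to show where a combined criterion enters the k-means assignment step.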