A linguistic feature based text clustering method
Kansheng Shi, Lemin Li, Jie He, Haitao Liu, Naitong Zhang, Wentao Song
2011 IEEE International Conference on Cloud Computing and Intelligence Systems, published 2011-10-13
DOI: https://doi.org/10.1109/CCIS.2011.6045042
The traditional K-means algorithm is sensitive to the choice of initial centers and easily falls into a local optimum. To address this flaw, an improved K-means text clustering method, WIKTCM, is proposed. The new method introduces a new scheme for selecting the initial centers and accounts for the different contributions that features of different parts of speech make to a text. The impact of outliers is also considered. Experimental results show that the new method produces better clustering results.
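
The abstract does not give the details of WIKTCM, so the following is only a minimal sketch of the two ideas it names: weighting features by part of speech and choosing the initial centers deterministically instead of at random. The weights in `POS_WEIGHTS`, the toy documents, and all function names are hypothetical. The seeding step uses plain farthest-first selection as a stand-in, not the authors' selection method, and outlier handling is left out entirely.

```python
import math

# Hypothetical part-of-speech weights; the abstract gives no actual values.
POS_WEIGHTS = {"NOUN": 1.0, "VERB": 0.6, "ADJ": 0.3}


def vectorize(doc, vocab):
    """Map a list of (token, pos) pairs to a POS-weighted, L2-normalized vector."""
    vec = [0.0] * len(vocab)
    for token, pos in doc:
        vec[vocab[token]] += POS_WEIGHTS.get(pos, 0.1)
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first(vectors, k):
    """Stand-in seeding: start at vector 0, then repeatedly add the point
    whose distance to its nearest chosen center is largest."""
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(max(vectors, key=lambda v: min(dist(v, c) for c in centers)))
    return centers


def kmeans(vectors, k, iters=20):
    """Standard Lloyd iterations starting from the deterministic seeds."""
    centers = farthest_first(vectors, k)
    labels = []
    for _ in range(iters):
        # Assign each vector to its nearest center.
        labels = [min(range(k), key=lambda j: dist(v, centers[j])) for v in vectors]
        # Recompute each center as the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy corpus, already tagged with parts of speech (illustrative only).
docs = [
    [("stock", "NOUN"), ("market", "NOUN"), ("rise", "VERB")],
    [("market", "NOUN"), ("stock", "NOUN"), ("fall", "VERB")],
    [("team", "NOUN"), ("match", "NOUN"), ("win", "VERB")],
    [("match", "NOUN"), ("team", "NOUN"), ("lose", "VERB")],
]
vocab = {t: i for i, t in enumerate(sorted({t for d in docs for t, _ in d}))}
vectors = [vectorize(d, vocab) for d in docs]
labels = kmeans(vectors, k=2)
print(labels)
```

On this toy corpus the two finance documents share one label and the two sports documents share the other. Because nouns carry the highest weight here, the shared nouns dominate the distance while the differing verbs contribute less, which is the kind of effect a part-of-speech weighting is meant to produce.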