Graph-based Word Clustering Considering the Distance and the Connectivity of a Co-occurrence
Authors: Supaporn Simcharoen, H. Unger
Published in: 2022 Research, Invention, and Innovation Congress: Innovative Electricals and Electronics (RI2C), 2022-08-04
DOI: https://doi.org/10.1109/RI2C56397.2022.9910302
Citations: 0
Abstract
Word clustering is a common task in natural language processing, and several approaches have been developed that take different factors into account. This article presents two graph-based clustering algorithms, each built on one factor of a word co-occurrence: the closest distance and the connectivity. Classical clustering algorithms, including k-means and Chinese Whispers, serve as baselines for comparing cluster quality. The results show that the cluster quality of both proposed algorithms is close to that of k-means clustering.
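To make the graph-based setting concrete, the following is a minimal sketch of Chinese Whispers, one of the baseline algorithms named in the abstract, run on a small weighted co-occurrence graph. It is not the authors' proposed method. The words, the edge weights, the function name `chinese_whispers`, and the tie-breaking rule (smallest label wins) are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def chinese_whispers(edges, iterations=10, seed=0):
    """Cluster the nodes of a weighted undirected graph with Chinese Whispers.

    edges: iterable of (word_a, word_b, weight) tuples, where the weight
    stands in for a co-occurrence count (hypothetical data in this sketch).
    Returns a dict mapping each word to its cluster label.
    """
    graph = defaultdict(dict)
    for u, v, w in edges:
        graph[u][v] = w
        graph[v][u] = w

    # Every word starts in its own class, labelled by the word itself.
    labels = {node: node for node in graph}
    rng = random.Random(seed)
    nodes = sorted(graph)

    for _ in range(iterations):
        rng.shuffle(nodes)  # visit the nodes in random order
        changed = False
        for node in nodes:
            # Sum the edge weights per class over the node's neighbours.
            scores = defaultdict(float)
            for neighbour, weight in graph[node].items():
                scores[labels[neighbour]] += weight
            # The strongest class wins; ties go to the smallest label
            # (an assumption of this sketch, chosen for reproducibility).
            best = min(scores, key=lambda lab: (-scores[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break  # no label changed in a full pass
    return labels


# Hypothetical co-occurrence graph: two dense groups joined by one weak edge.
edges = [
    ("cat", "dog", 3), ("cat", "pet", 3), ("dog", "pet", 3),
    ("bank", "loan", 3), ("bank", "money", 3), ("loan", "money", 3),
    ("pet", "money", 1),
]
labels = chinese_whispers(edges)

clusters = defaultdict(set)
for word, label in labels.items():
    clusters[label].add(word)
print(sorted(sorted(group) for group in clusters.values()))
```

On this toy graph the weak `pet`/`money` edge is outweighed by the stronger edges inside each group, so the two groups end up in separate clusters. Unlike k-means, Chinese Whispers does not need the number of clusters to be fixed in advance.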