Ma. Shiela C. Sapul, T. Aung, Rachsuda Jiamthapthaksin
Trending topic discovery of Twitter Tweets using clustering and topic modeling algorithms
2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE), pp. 1-6
Published: 2017-07-12
DOI: 10.1109/JCSSE.2017.8025911 (https://doi.org/10.1109/JCSSE.2017.8025911)
Citations: 11
Abstract
No previous research has compared the results of k-means clustering, CLOPE clustering and Latent Dirichlet Allocation (LDA) topic modeling for detecting trending topics in tweets. Since not all tweets contain hashtags, we considered three training feature sets in this study: hashtags, keywords, and keywords + hashtags. Our proposed methodology showed that CLOPE can also be applied to a non-transactional data set such as Twitter data for trending topic discovery, and that it can provide more topic patterns than k-means and LDA. Adding features improved the results of k-means and LDA; thus, the keywords + hashtags feature set can identify more meaningful topics.
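As a rough illustration of the three feature sets named in the abstract, the following sketch derives hashtags, keywords, and keywords + hashtags from a single tweet. It is not the authors' preprocessing pipeline: the function name `feature_sets`, the stop-word list, the tokenization rules, and the minimum keyword length are all assumptions made for this example.

```python
import re

# Hypothetical stop-word list, for illustration only; the abstract does not specify one.
STOP_WORDS = {"the", "a", "an", "is", "in", "on", "to", "of", "and", "for", "at"}

HASHTAG_RE = re.compile(r"#(\w+)")
NOISE_RE = re.compile(r"(#\w+|@\w+|https?://\S+)")  # hashtags, mentions, URLs
WORD_RE = re.compile(r"[a-z']+")


def feature_sets(tweet):
    """Return the three feature sets for one tweet as lists of lowercase tokens."""
    text = tweet.lower()
    hashtags = HASHTAG_RE.findall(text)
    # Remove hashtags, mentions and URLs so they do not leak into the keywords.
    stripped = NOISE_RE.sub(" ", text)
    keywords = [
        w for w in WORD_RE.findall(stripped)
        if w not in STOP_WORDS and len(w) > 2  # assumed minimum keyword length
    ]
    return {
        "hashtags": hashtags,
        "keywords": keywords,
        "keywords+hashtags": keywords + hashtags,
    }


if __name__ == "__main__":
    for tweet in (
        "Watching the #WorldCup final in Bangkok http://t.co/x",
        "Traffic is terrible today",  # no hashtag: only the keyword features remain
    ):
        print(feature_sets(tweet))
```

The second tweet shows why the hashtag-only feature set is limiting: a tweet without hashtags yields an empty feature list, while the keyword-based sets still carry usable tokens for a clustering or topic modeling step.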