Ma. Shiela C. Sapul, T. Aung, Rachsuda Jiamthapthaksin
Published in: 2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE), pp. 1-6
Publication date: 2017-07-12
DOI: 10.1109/JCSSE.2017.8025911 (https://doi.org/10.1109/JCSSE.2017.8025911)
Citation count: 11 (Semantic Scholar)
Trending topic discovery of Twitter Tweets using clustering and topic modeling algorithms
To our knowledge, no previous research compares the results of k-means clustering, CLOPE clustering and Latent Dirichlet Allocation (LDA) topic modeling for detecting trending topics in tweets. Since not all tweets contain hashtags, this study considers three training feature sets: hashtags, keywords, and keywords + hashtags. Our methodology shows that CLOPE, which was designed for transactional data, can also be applied to non-transactional data such as a Twitter data set for trending topic discovery, and that it can yield more topic patterns than k-means and LDA. Adding features improved the results of k-means and LDA; the keywords + hashtags feature set therefore identifies more meaningful topics.
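The three feature sets and the idea of treating each tweet as a CLOPE transaction can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the stopword list, the regular expressions, and the function names `feature_sets` and `clope_first_pass` are all assumptions made for the example. The profit gain uses the CLOPE criterion as we understand it from the original CLOPE paper (cluster size S, width W, transaction count N, repulsion r), and only the first allocation pass is shown; the iterative refinement phase is omitted.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "is", "in", "on", "for", "to", "of", "and", "was"}


def feature_sets(tweet):
    """Build the three feature sets named in the abstract for one tweet."""
    text = tweet.lower()
    hashtags = set(re.findall(r"#\w+", text))
    # Drop URLs, mentions and hashtags before extracting plain keywords.
    stripped = re.sub(r"https?://\S+|[@#]\w+", " ", text)
    keywords = {w for w in re.findall(r"[a-z]+", stripped) if w not in STOPWORDS}
    return {
        "hashtags": hashtags,
        "keywords": keywords,
        "keywords+hashtags": keywords | hashtags,
    }


def _gain(n, occ, items, r):
    """Profit change from adding `items` to a cluster holding n transactions."""
    size_old = sum(occ.values())            # S: total item occurrences
    width_old = len(occ)                    # W: number of distinct items
    size_new = size_old + len(items)
    width_new = width_old + sum(1 for i in items if i not in occ)
    old = size_old * n / width_old ** r if n else 0.0
    return size_new * (n + 1) / width_new ** r - old


def clope_first_pass(transactions, r=2.0):
    """Assign each transaction to the existing or new cluster with the best gain."""
    clusters = []   # each entry is [transaction count, Counter of item occurrences]
    labels = []
    for items in transactions:
        if not items:
            # For example, a tweet with no hashtags under the hashtag feature set.
            labels.append(None)
            continue
        # Opening a new cluster is the baseline an existing cluster must beat.
        best, best_gain = len(clusters), _gain(0, Counter(), items, r)
        for idx, (n, occ) in enumerate(clusters):
            g = _gain(n, occ, items, r)
            if g > best_gain:
                best, best_gain = idx, g
        if best == len(clusters):
            clusters.append([0, Counter()])
        clusters[best][0] += 1
        clusters[best][1].update(items)
        labels.append(best)
    return labels


# Made-up tweets, not data from the study.
tweets = [
    "Earthquake hits the city #earthquake #news",
    "Strong earthquake in the city center #earthquake",
    "New phone launch today #tech #phone",
    "Phone launch event was great #tech",
    "Good morning everyone",
]
features = [feature_sets(t) for t in tweets]
print(clope_first_pass([f["hashtags"] for f in features], r=1.5))
# prints [0, 0, 1, 1, None]
```

The last tweet has no hashtags, so it receives no cluster under the hashtag feature set. This is the gap that the keywords and keywords + hashtags feature sets are meant to close. Passing `f["keywords+hashtags"]` instead clusters every tweet that has at least one token.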