Topic Detection from Large Scale of Microblog Stream with High Utility Pattern Clustering
Authors: Jiajia Huang, Min Peng, Hua Wang
DOI: 10.1145/2809890.2809894 (https://doi.org/10.1145/2809890.2809894)
Published: 2015-10-18
Publication type: Journal Article
Journal: 车间管理 (Workshop Management)
Citations: 35
With the popularity of social media, detecting topics from microblog streams has become an increasingly important task. It is also a challenging one, because microblog streams are high-dimensional, short and noisy, fast-changing, and very large in volume. In this paper, we propose a high utility pattern clustering (HUPC) framework over microblog streams. The framework first extracts a group of representative patterns from the microblog stream and then groups these patterns into topic clusters. The approach works well on large-scale microblog streams because it clusters the patterns that best describe topics, rather than clustering noise and individual microblogs directly. Furthermore, the proposed framework can detect coherent topics and newly emerging topics simultaneously. Extensive experimental results on Twitter streams and Sina Weibo streams show that the developed method outperforms other existing topic detection methods, making it a desirable solution for detecting events from microblog streams.
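The two-stage idea described above (extract patterns first, then cluster the patterns rather than the raw posts) can be illustrated with a toy sketch. This is not the HUPC algorithm itself: the abstract does not define the utility measure or the clustering method, so the sketch assumes that a pattern is a pair of words, that its utility is the number of posts containing both words, and that patterns sharing a word belong to the same topic. The names `extract_patterns`, `cluster_patterns`, and `min_utility` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def extract_patterns(posts, min_utility=2):
    """Return word-pair patterns whose toy utility meets a threshold.

    Assumption: utility = number of posts containing both words.
    The paper's actual utility measure is not given in the abstract.
    """
    utility = Counter()
    for post in posts:
        words = sorted(set(post.lower().split()))
        for pair in combinations(words, 2):
            utility[frozenset(pair)] += 1
    return {p: u for p, u in utility.items() if u >= min_utility}


def cluster_patterns(patterns):
    """Group patterns that share at least one word.

    Assumption: word overlap stands in for whatever similarity
    measure the real framework uses.
    """
    clusters = []  # each cluster is a set of words
    for pattern in patterns:
        merged = set(pattern)
        remaining = []
        for cluster in clusters:
            if cluster & merged:
                merged |= cluster  # absorb every overlapping cluster
            else:
                remaining.append(cluster)
        remaining.append(merged)
        clusters = remaining
    return clusters


# Made-up posts for illustration only.
posts = [
    "earthquake hits city",
    "city earthquake damage",
    "new phone launch",
    "phone launch event",
]
patterns = extract_patterns(posts)
topics = cluster_patterns(patterns)
print(sorted(sorted(topic) for topic in topics))
```

Words that appear in only one post never form a pattern that reaches the threshold, so they are dropped before clustering. This loosely mirrors the abstract's point that clustering patterns avoids clustering noise directly.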