{"title":"Mining evolutionary multi-branch trees from text streams","authors":"Xiting Wang, Shixia Liu, Yangqiu Song, B. Guo","doi":"10.1145/2487575.2487603","DOIUrl":null,"url":null,"abstract":"Understanding topic hierarchies in text streams and their evolution patterns over time is very important in many applications. In this paper, we propose an evolutionary multi-branch tree clustering method for streaming text data. We build evolutionary trees in a Bayesian online filtering framework. The tree construction is formulated as an online posterior estimation problem, which considers both the likelihood of the current tree and conditional prior given the previous tree. We also introduce a constraint model to compute the conditional prior of a tree in the multi-branch setting. Experiments on real world news data demonstrate that our algorithm can better incorporate historical tree information and is more efficient and effective than the traditional evolutionary hierarchical clustering algorithm.","PeriodicalId":20472,"journal":{"name":"Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"43 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2013-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"19","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2487575.2487603","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 19
Abstract
Understanding topic hierarchies in text streams and their evolution patterns over time is very important in many applications. In this paper, we propose an evolutionary multi-branch tree clustering method for streaming text data. We build evolutionary trees in a Bayesian online filtering framework. The tree construction is formulated as an online posterior estimation problem, which considers both the likelihood of the current tree and conditional prior given the previous tree. We also introduce a constraint model to compute the conditional prior of a tree in the multi-branch setting. Experiments on real world news data demonstrate that our algorithm can better incorporate historical tree information and is more efficient and effective than the traditional evolutionary hierarchical clustering algorithm.