The Joint Inference of Topic Diffusion and Evolution in Social Communities

C. Lin, Q. Mei, Jiawei Han, Yunliang Jiang, Marina Danilevsky
Published in: 2011 IEEE 11th International Conference on Data Mining
Publication date: 2011-12-11
DOI: 10.1109/ICDM.2011.144 (https://doi.org/10.1109/ICDM.2011.144)
Citations: 74

Abstract

The prevalence of Web 2.0 techniques has led to a boom in online communities, where topics spread ubiquitously among user-generated documents. Alongside this diffusion process, the content of a topic evolves: documents that adopt the topic introduce novel content. Unlike explicit user behavior (e.g., buying a DVD), both the diffusion paths and the evolutionary process of a topic are implicit, which makes them challenging to discover. In this paper, we track the evolution of an arbitrary topic and reveal the latent diffusion paths of that topic in a social community. We propose a novel and principled probabilistic model that casts the task as a joint inference problem, considering textual documents, social influences, and topic evolution in a unified way. Specifically, a mixture model describes the generation of text according to the diffusion and evolution of the topic, while the whole diffusion process is regularized with user-level social influences through a Gaussian Markov Random Field. Experiments on both synthetic and real-world data show that the discovery of topic diffusion and evolution benefits from this joint inference, and that the proposed probabilistic model performs significantly better than existing methods.
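The abstract gives no formulas, so the following is only a minimal sketch of the general idea of combining a word-level mixture likelihood with a Gaussian Markov Random Field style smoothness penalty over a social graph. It is not the authors' model. All names here (`mixture_log_likelihood`, `gmrf_penalty`, `regularized_objective`, the per-document adoption `weights`, the edge `influence` values, and the trade-off `lam`) are assumptions introduced for illustration.

```python
import math


def mixture_log_likelihood(doc_words, topic_dist, background_dist, weight):
    """Log-likelihood of a document under a two-component word mixture.

    `weight` is an assumed probability that a word is drawn from the topic
    component; the remainder comes from the background component.
    """
    total = 0.0
    for word in doc_words:
        p = (weight * topic_dist.get(word, 0.0)
             + (1.0 - weight) * background_dist.get(word, 0.0))
        if p <= 0.0:
            raise ValueError(f"word {word!r} has zero probability")
        total += math.log(p)
    return total


def gmrf_penalty(weights, edges):
    """Quadratic smoothness penalty over a social graph.

    `edges` is a list of (doc_i, doc_j, influence) triples; the penalty is
    the sum of influence * (weights[i] - weights[j]) ** 2, which is small
    when socially connected documents have similar topic weights.
    """
    return sum(infl * (weights[i] - weights[j]) ** 2 for i, j, infl in edges)


def regularized_objective(docs, weights, topic_dist, background_dist, edges, lam):
    """Total log-likelihood minus lam times the smoothness penalty."""
    log_lik = sum(
        mixture_log_likelihood(words, topic_dist, background_dist, weights[doc_id])
        for doc_id, words in docs.items()
    )
    return log_lik - lam * gmrf_penalty(weights, edges)
```

In this sketch, a larger `lam` pulls the topic weights of documents linked by strong social influence toward each other, while `lam = 0` reduces the objective to the plain mixture likelihood. How the paper actually parameterizes diffusion, evolution, and influence is not stated in the abstract.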