Socialized Language Model Smoothing via Bi-directional Influence Propagation on Social Networks

Rui Yan, Cheng-te Li, Hsun-Ping Hsieh, P. Hu, Xiaohua Hu, Tingting He
{"title":"Socialized Language Model Smoothing via Bi-directional Influence Propagation on Social Networks","authors":"Rui Yan, Cheng-te Li, Hsun-Ping Hsieh, P. Hu, Xiaohua Hu, Tingting He","doi":"10.1145/2872427.2874811","DOIUrl":null,"url":null,"abstract":"In recent years, online social networks are among the most popular websites with high PV (Page View) all over the world, as they have renewed the way for information discovery and distribution. Millions of users have registered on these websites and hence generate formidable amount of user-generated contents every day. The social networks become \"giants\", likely eligible to carry on any research tasks. However, we have pointed out that these giants still suffer from their \"Achilles Heel\", i.e., extreme sparsity. Compared with the extremely large data over the whole collection, individual posting documents such as microblogs seem to be too sparse to make a difference under various research scenarios, while actually these postings are different. In this paper we propose to tackle the Achilles Heel of social networks by smoothing the language model via influence propagation. To further our previously proposed work to tackle the sparsity issue, we extend the socialized language model smoothing with bi-directional influence learned from propagation. Intuitively, it is insufficient not to distinguish the influence propagated between information source and target without directions. Hence, we formulate a bi-directional socialized factor graph model, which utilizes both the textual correlations between document pairs and the socialized augmentation networks behind the documents, such as user relationships and social interactions. These factors are modeled as attributes and dependencies among documents and their corresponding users, and then are distinguished on the direction level. We propose an effective learning algorithm to learn the proposed factor graph model with directions. Finally we propagate term counts to smooth documents based on the estimated influence. We run experiments on two instinctive datasets of Twitter and Weibo. The results validate the effectiveness of the proposed model. By incorporating direction information into the socialized language model smoothing, our approach obtains improvement over several alternative methods on both intrinsic and extrinsic evaluations measured in terms of perplexity, nDCG and MAP measurements.","PeriodicalId":20455,"journal":{"name":"Proceedings of the 25th International Conference on World Wide Web","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 25th International Conference on World Wide Web","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2872427.2874811","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

In recent years, online social networks are among the most popular websites with high PV (Page View) all over the world, as they have renewed the way for information discovery and distribution. Millions of users have registered on these websites and hence generate formidable amount of user-generated contents every day. The social networks become "giants", likely eligible to carry on any research tasks. However, we have pointed out that these giants still suffer from their "Achilles Heel", i.e., extreme sparsity. Compared with the extremely large data over the whole collection, individual posting documents such as microblogs seem to be too sparse to make a difference under various research scenarios, while actually these postings are different. In this paper we propose to tackle the Achilles Heel of social networks by smoothing the language model via influence propagation. To further our previously proposed work to tackle the sparsity issue, we extend the socialized language model smoothing with bi-directional influence learned from propagation. Intuitively, it is insufficient not to distinguish the influence propagated between information source and target without directions. Hence, we formulate a bi-directional socialized factor graph model, which utilizes both the textual correlations between document pairs and the socialized augmentation networks behind the documents, such as user relationships and social interactions. These factors are modeled as attributes and dependencies among documents and their corresponding users, and then are distinguished on the direction level. We propose an effective learning algorithm to learn the proposed factor graph model with directions. Finally we propagate term counts to smooth documents based on the estimated influence. We run experiments on two instinctive datasets of Twitter and Weibo. The results validate the effectiveness of the proposed model. By incorporating direction information into the socialized language model smoothing, our approach obtains improvement over several alternative methods on both intrinsic and extrinsic evaluations measured in terms of perplexity, nDCG and MAP measurements.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
基于社交网络双向影响传播的社会化语言模型平滑
近年来,在线社交网络是全球最受欢迎的高PV (Page View)网站之一,因为它更新了信息发现和传播的方式。数以百万计的用户在这些网站上注册,因此每天产生大量的用户生成的内容。社交网络成为“巨人”,可能有资格进行任何研究任务。然而,我们已经指出,这些巨人仍然遭受他们的“阿喀琉斯之踵”,即极端稀疏。与整个集合的海量数据相比,微博等个人发布文档在各种研究场景下显得过于稀疏,无法发挥作用,而实际上这些帖子是不同的。在本文中,我们提出通过影响传播平滑语言模型来解决社交网络的阿喀琉斯之踵。为了进一步解决稀疏性问题,我们利用从传播中学习到的双向影响扩展了社会化语言模型平滑。从直观上看,如果没有方向,不能区分信息源和目标之间传播的影响是不够的。因此,我们制定了一个双向社会化因素图模型,该模型既利用了文档对之间的文本相关性,也利用了文档背后的社会化增强网络,如用户关系和社会互动。将这些因素建模为文档及其相应用户之间的属性和依赖关系,然后在方向级别上进行区分。我们提出了一种有效的学习算法来学习所提出的带方向的因子图模型。最后,我们根据估计的影响将术语计数传播到平滑文档。我们在推特和微博两个本能数据集上进行实验。实验结果验证了该模型的有效性。通过将方向信息整合到社会化语言模型平滑中,我们的方法在基于困惑度、nDCG和MAP测量的内在和外在评估方面都优于几种替代方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
MapWatch: Detecting and Monitoring International Border Personalization on Online Maps Automatic Discovery of Attribute Synonyms Using Query Logs and Table Corpora Learning Global Term Weights for Content-based Recommender Systems From Freebase to Wikidata: The Great Migration GoCAD: GPU-Assisted Online Content-Adaptive Display Power Saving for Mobile Devices in Internet Streaming
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1