Clusters of Trends Detection in Microblogging: Simple Natural Language Processing vs Hashtags – Which is More Informative?

T. Hachaj, M. Ogiela
DOI: 10.1109/CISIS.2016.44 (https://doi.org/10.1109/CISIS.2016.44)
Published in: 2016 10th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS)
Publication date: 2016-07-01
Citations: 4

Abstract

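In this paper we introduce an initial proposal and evaluation of a method that detects clusters of trends among microblogging posts gathered from a given social graph. By a cluster of trends we mean trending words that are popular among the same group of people and that describe their common interests. Information about the shared interests of a group of people in a social network is very important for business: knowing it, we can, for example, run a directed advertising campaign aimed at a single community. We validate our approach on a large dataset containing 22,030,252 tweets posted by 20,130 followers of a world-famous actress. We found that detecting clusters of trends in microblogging with simple natural language processing (namely lemmatization) did not give any valuable information for business. On the other hand, hashtag frequency filtering and conditional-probability graph clustering yielded valuable information about the structure of interests in the social network.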
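
The abstract names hashtag frequency filtering and conditional-probability graph clustering but gives no implementation details, so the following is only a minimal sketch of one way such a pipeline could look, not the authors' method. Everything in it is an assumption made for illustration: the function names, the `min_users` and `min_prob` thresholds, the per-user definition of P(b | a), and the use of connected components as a stand-in for the clustering step.

```python
import re
from collections import defaultdict
from itertools import permutations

HASHTAG = re.compile(r"#(\w+)")


def hashtags_by_user(tweets):
    """Map each user to the set of lowercased hashtags they posted."""
    tags = defaultdict(set)
    for user, text in tweets:
        tags[user].update(t.lower() for t in HASHTAG.findall(text))
    return tags


def frequent_tags(user_tags, min_users):
    """Frequency filter (assumed form): keep tags used by >= min_users users."""
    counts = defaultdict(int)
    for tags in user_tags.values():
        for tag in tags:
            counts[tag] += 1
    return {tag: n for tag, n in counts.items() if n >= min_users}


def conditional_edges(user_tags, counts, min_prob):
    """Return {(a, b): P(b | a)} for ordered pairs with P(b | a) >= min_prob.

    P(b | a) is estimated as (users who used both a and b) / (users who used a).
    """
    both = defaultdict(int)
    for tags in user_tags.values():
        kept = [t for t in tags if t in counts]
        for a, b in permutations(kept, 2):
            both[(a, b)] += 1
    edges = {}
    for (a, b), n in both.items():
        prob = n / counts[a]
        if prob >= min_prob:
            edges[(a, b)] = prob
    return edges


def clusters(counts, edges):
    """Group tags into connected components, ignoring edge direction.

    This is a hypothetical stand-in for the paper's graph clustering.
    """
    adjacent = {tag: set() for tag in counts}
    for a, b in edges:
        adjacent[a].add(b)
        adjacent[b].add(a)
    seen, result = set(), []
    for start in sorted(adjacent):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacent[node] - seen)
        result.append(sorted(component))
    return result


def detect_trend_clusters(tweets, min_users=2, min_prob=0.6):
    """Run the three sketched steps on an iterable of (user, text) pairs."""
    user_tags = hashtags_by_user(tweets)
    counts = frequent_tags(user_tags, min_users)
    edges = conditional_edges(user_tags, counts, min_prob)
    return clusters(counts, edges)
```

Calling `detect_trend_clusters` with a list of `(user, text)` pairs returns a list of hashtag groups. Counting users rather than tweets keeps one prolific account from dominating the estimate, which is a design choice of this sketch and may differ from the paper.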