Topic sentiment analysis in twitter: a graph-based hashtag sentiment classification approach

Xiaolong Wang, Furu Wei, Xiaohua Liu, M. Zhou, Ming Zhang
DOI: 10.1145/2063576.2063726 (https://doi.org/10.1145/2063576.2063726)
Published in: Proceedings of the ... ACM International Conference on Information and Knowledge Management, pp. 1031-1040
Publication date: 2011-10-24
Citations: 470

Abstract

Twitter is one of the largest platforms, with a massive number of instant messages (i.e., tweets) published every day. Users tend to express their real feelings freely on Twitter, which makes it an ideal source for capturing opinions on various topics of interest, such as brands, products, or celebrities. Naturally, people want a way to obtain the overall sentiment tendency towards these topics directly rather than by reading the huge number of tweets about them. Meanwhile, hashtags, which begin with the symbol "#" followed by keywords or phrases, are widely used in tweets as coarse-grained topics. In this paper, instead of presenting the sentiment polarity of each tweet relevant to a topic, we focus on hashtag-level sentiment classification. This task aims to automatically generate the overall sentiment polarity for a given hashtag in a certain time period, which differs markedly from conventional sentence-level and document-level sentiment analysis. Our investigation shows that three types of information are useful for the task: (1) the sentiment polarity of tweets containing the hashtag; (2) hashtag co-occurrence relationships; and (3) the literal meaning of hashtags. To incorporate the first two types of information into a classification framework in which hashtags can be classified collectively, we propose a novel graph model and investigate three approximate collective classification algorithms for inference. Going one step further, we show that performance can be remarkably improved using an enhanced boosting classification setting in which the literal meaning of hashtags serves as semi-supervised information. Experimental results on a real-life data set consisting of 29,195 tweets and 2,181 hashtags show the effectiveness of the proposed model and algorithms.
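
To make the first two information types concrete, the following is a minimal sketch, not the paper's model. It assumes each tweet already carries a polarity label of +1 or -1 from some upstream tweet-level classifier, averages those labels per hashtag, and then smooths each hashtag's score with the scores of hashtags it co-occurs with. The neighbor-averaging update, the function names, and the `alpha` and `iterations` parameters are all illustrative assumptions; they stand in for, and are not necessarily any of, the three collective classification algorithms the abstract mentions. The literal meaning of hashtags and the boosting setting are not modeled here.

```python
from collections import defaultdict


def hashtag_priors(tweets):
    """Mean tweet polarity per hashtag.

    tweets: list of (set_of_hashtags, polarity), polarity assumed in {+1, -1}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tags, polarity in tweets:
        for tag in tags:
            totals[tag] += polarity
            counts[tag] += 1
    return {tag: totals[tag] / counts[tag] for tag in totals}


def cooccurrence_graph(tweets):
    """Weighted undirected graph: edge weight = number of tweets sharing both hashtags."""
    graph = defaultdict(lambda: defaultdict(int))
    for tags, _ in tweets:
        for a in tags:
            for b in tags:
                if a != b:
                    graph[a][b] += 1
    return graph


def classify_collectively(tweets, alpha=0.5, iterations=10):
    """Blend each hashtag's own prior with its neighbors' current scores.

    alpha (hypothetical parameter) weights the hashtag's own tweet evidence;
    1 - alpha weights the co-occurrence neighborhood.
    """
    priors = hashtag_priors(tweets)
    graph = cooccurrence_graph(tweets)
    scores = dict(priors)
    for _ in range(iterations):
        updated = {}
        for tag, prior in priors.items():
            neighbors = graph.get(tag, {})
            weight = sum(neighbors.values())
            if weight:
                neighbor_avg = sum(scores[n] * w for n, w in neighbors.items()) / weight
                updated[tag] = alpha * prior + (1 - alpha) * neighbor_avg
            else:
                # Isolated hashtag: only its own tweets inform the score.
                updated[tag] = prior
        scores = updated
    # Ties at exactly zero are arbitrarily assigned to "positive" in this sketch.
    return {tag: "positive" if s >= 0 else "negative" for tag, s in scores.items()}


# Toy data, invented for illustration only.
example_tweets = [
    ({"#a", "#b"}, +1),
    ({"#a"}, +1),
    ({"#b", "#c"}, -1),
    ({"#c"}, -1),
    ({"#d"}, +1),
]
labels = classify_collectively(example_tweets)
```

Because all hashtags are updated together from one another's scores, a hashtag with little or mixed tweet evidence can borrow signal from the hashtags it appears with, which is the intuition behind classifying hashtags collectively rather than one at a time.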