Predicting the Political Polarity of Tweets Using Supervised Machine Learning

Michelle Voong, Keerthana Gunda, S. Gokhale
Published in: 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC)
Publication date: 2020-07-01
DOI: 10.1109/COMPSAC48688.2020.000-9 (https://doi.org/10.1109/COMPSAC48688.2020.000-9)
Citations: 1

Abstract

With the advent of social media, politicians, media outlets, and ordinary citizens alike routinely turn to Twitter to share their thoughts and feelings. Distinguishing politically biased tweets from neutral ones can help determine the propensity of an elected official or a media outlet to engage in political rhetoric. This paper presents a supervised machine learning approach to predict whether a tweet is politically biased or neutral. The approach uses a labeled data set available from Crowdflower, in which each tweet is tagged with a partisan/neutral label plus its message type and audience. It considers a combination of linguistic features, including Term Frequency-Inverse Document Frequency (TF-IDF), bigrams, and trigrams, along with metadata features, including mentions, retweets, and URLs, as well as the additional labels of message type and audience. It trains both simple and ensemble classifiers and assesses their performance using precision, recall, and F1-score. The results demonstrate that the classifiers can predict the polarity of a tweet accurately when trained on a combination of TF-IDF and metadata features that can be extracted automatically from the tweets, eliminating the need for additional tagging, which is manual, cumbersome, and error-prone.
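The feature types and evaluation metrics named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names, the regular expressions, the `log(N / df)` form of IDF, and the treatment of "retweets" as a simple `RT ` prefix flag are all assumptions made for illustration, and the example tweets are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text, drop URLs and @mentions, and return word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z0-9']+", text)


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def metadata_features(tweet):
    """Features readable from the raw tweet text, with no manual tagging.

    Assumption: a retweet is detected by a leading "RT " marker.
    """
    return {
        "n_mentions": len(re.findall(r"@\w+", tweet)),
        "n_urls": len(re.findall(r"https?://\S+", tweet)),
        "is_retweet": int(tweet.startswith("RT ")),
    }


def tfidf(docs, max_n=3):
    """TF-IDF over unigrams, bigrams, and trigrams (up to max_n).

    Uses tf = count / number of terms and idf = log(N / df),
    one common variant among several.
    """
    term_lists = []
    for doc in docs:
        tokens = tokenize(doc)
        terms = []
        for n in range(1, max_n + 1):
            terms.extend(ngrams(tokens, n))
        term_lists.append(terms)

    # Document frequency: in how many documents each term appears.
    df = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(docs)

    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def precision_recall_f1(y_true, y_pred, positive="partisan"):
    """Precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented example tweets, for illustration only.
tweets = ["RT @alice vote now https://example.com", "Town hall at noon today"]
features = [dict(vec, **metadata_features(tw)) for vec, tw in zip(tfidf(tweets), tweets)]
```

Each entry of `features` merges the sparse TF-IDF weights with the three metadata counts, which is the kind of combined, automatically extractable representation the abstract describes. In practice, a library vectorizer and classifier would replace these hand-written helpers.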