Comparison of Artificial Decision Techniques for Detection of Sarcastic News Headlines

Tarun Jain, Horesh Kumar, Payal Garg, Abhinav Pillai, Aditya Sinha, Vivek Kumar Verma
Journal: International Journal of Cyber Behavior, Psychology and Learning
DOI: 10.4018/ijcbpl.330131
Published: 2023-09-12
Citations: 0

Abstract

Newspapers are a rich source of information, and an article's headline is what first sparks a reader's interest. News agencies therefore craft catchy headlines to attract attention, which is how sarcasm finds its way into news headlines. Sarcasm uses words whose literal meaning is the opposite of what is actually intended, which creates a need for methods that can correctly predict whether a piece of text, or a news headline in particular, means what it says or is being sarcastic. The authors used a dataset of 55,329 news headlines from The Onion and the Huffington Post, taken from Kaggle, to which they applied feature extraction techniques including Count Vectorizer, TF-IDF, Hashing Vectorizer, and Global Vectors (GloVe). They then applied seven classifiers to the resulting features. The experimental results showed that the highest accuracies among the machine learning models were 81.39% for logistic regression (LR) with Count Vectorizer, 79.2% for LR with TF-IDF, and 78% for a support vector machine (SVM) with Count Vectorizer. The best overall accuracy, 90.7%, was obtained with a Bi-LSTM deep learning model. The seven models were trained and compared on their respective accuracies and F1-scores.
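To illustrate the two feature-extraction techniques that produced the paper's best classical results (Count Vectorizer and TF-IDF), here is a minimal pure-Python sketch of both steps applied to toy headlines. This is not the authors' code: the tokenization (lowercase, whitespace split), the IDF variant (log(N/df) + 1), and the sample headlines are all assumptions chosen for clarity; a real pipeline would typically use a library implementation such as scikit-learn's vectorizers.

```python
import math

def count_vectorize(docs):
    """Count Vectorizer sketch: build a sorted vocabulary and a
    term-count row per document (lowercase, whitespace tokens)."""
    vocab = sorted({tok for d in docs for tok in d.lower().split()})
    index = {t: i for i, t in enumerate(vocab)}
    rows = []
    for d in docs:
        row = [0] * len(vocab)
        for tok in d.lower().split():
            row[index[tok]] += 1
        rows.append(row)
    return vocab, rows

def tfidf(rows):
    """TF-IDF sketch: re-weight raw counts by inverse document
    frequency, idf = log(N / df) + 1 (one of several common variants)."""
    n_docs = len(rows)
    n_terms = len(rows[0])
    df = [sum(1 for r in rows if r[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) + 1 for d in df]
    return [[r[j] * idf[j] for j in range(n_terms)] for r in rows]

# Hypothetical headlines standing in for the Kaggle dataset.
headlines = [
    "local man wins award",
    "local man shocked to learn award is fake",
]
vocab, counts = count_vectorize(headlines)
weights = tfidf(counts)
```

The resulting `weights` rows can be fed to any classifier; terms appearing in every headline (such as "local" here) get a lower weight than terms unique to one headline, which is the property that lets TF-IDF features separate sarcastic from literal wording.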
About the journal: The mission of the International Journal of Cyber Behavior, Psychology and Learning (IJCBPL) is to identify learners' online behavior based on the theories in human psychology, define online education phenomena as explained by the social and cognitive learning theories and principles, and interpret the complexity of cyber learning. IJCBPL offers a multi-disciplinary approach that incorporates the findings from brain research, biology, psychology, human cognition, developmental theory, sociology, motivation theory, and social behavior. This journal welcomes both quantitative and qualitative studies using experimental design, as well as ethnographic methods to understand the dynamics of cyber learning. Impacting multiple areas of research and practices, including secondary and higher education, professional training, Web-based design and development, media learning, adolescent education, school and community, and social communication, IJCBPL targets school teachers, counselors, researchers, and online designers.