Linguistic Feature-based Classification for Anger and Anticipation using Machine Learning

K. Ramakrishnan, Vimala Balakrishnan, Kumanan Govaichelvan
DOI: 10.5220/0011289300003277
URL: https://doi.org/10.5220/0011289300003277
Journal (as listed): News. Phi Delta Epsilon
Volume/issue: 46 1
Pages: 140-147
Publication date: 2022-01-01
Publication type: Journal Article
Citations: 0

Abstract

The growing volume of online discourse enables the development of emotion mining models using natural language processing techniques. However, language diversity and cultural disparity alter the sentiment orientation of words depending on the community and context. This study therefore investigates the impact of linguistic features, namely lexical and syntactic features, on predicting the presence of two emotions, anger and anticipation, among Malaysian YouTube users. Term Frequency-Inverse Document Frequency (TF-IDF), unigrams, bigrams and part-of-speech tags were used as features to observe the classification performance. The dataset contains 2500 YouTube comments by Malaysian users on 46 Covid-19-related videos. Comments were extracted from three prominent Malaysian-centric English news channels: Channel News Asia (CNA), The Star News, and New Straits Times, covering 16 March 2020 to 30 April 2020 (i.e., the first lockdown phase). Six classification algorithms were tested: Random Forest, Support Vector Machine, Logistic Regression, Decision Tree, K-Nearest Neighbour and Multinomial Naive Bayes. Support Vector Machine with TF-IDF provided the best performance, achieving accuracies of 76% and 73% for anger and anticipation, respectively.
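The abstract names TF-IDF over unigrams and bigrams as the best-performing feature set but does not say how the weights were computed. The following is a minimal sketch of one common formulation (relative term frequency multiplied by an unsmoothed log inverse document frequency), not the authors' implementation. The function names `ngrams` and `tfidf`, the whitespace tokenisation, and the three sample comments are all assumptions made for illustration; they do not come from the study's dataset.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, max_n=2):
    """Compute TF-IDF weights over unigrams up to max_n-grams.

    Assumed formulation (illustrative only):
      tf(t, d)  = count of t in d / total n-gram count of d
      idf(t)    = log(number of documents / documents containing t)
    Returns one {term: weight} dict per input document.
    """
    doc_terms = []
    for text in docs:
        tokens = text.lower().split()  # naive whitespace tokenisation
        terms = []
        for n in range(1, max_n + 1):
            terms.extend(ngrams(tokens, n))
        doc_terms.append(Counter(terms))

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for counts in doc_terms:
        df.update(counts.keys())

    n_docs = len(docs)
    vectors = []
    for counts in doc_terms:
        total = sum(counts.values())
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


# Invented sample comments, used only to exercise the function.
comments = [
    "lockdown again why",
    "lockdown ends soon",
    "hope lockdown ends soon",
]
vectors = tfidf(comments)
```

In this sketch a term that occurs in every comment (here "lockdown") receives a weight of zero, while a term unique to one comment (here "why") receives a positive weight. Vectors of this kind would then be passed to a classifier such as a Support Vector Machine, with one binary model per emotion.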