Emotion Recognition in Reddit Comments Using Recurrent Neural Networks

Mahdi Rezapour
{"title":"Emotion Recognition in Reddit Comments Using Recurrent Neural\nNetworks","authors":"Mahdi Rezapour","doi":"10.2174/0126662558273325231201051141","DOIUrl":null,"url":null,"abstract":"\n\nReddit comments are a valuable source of natural language data\nwhere emotion plays a key role in human communication. However, emotion recognition is a\ndifficult task that requires understanding the context and sentiment of the texts. In this paper,\nwe aim to compare the effectiveness of four recurrent neural network (RNN) models for classifying the emotions of Reddit comments.\n\n\n\nWe use a small dataset of 4,922 comments labeled with four emotions: approval,\ndisapproval, love, and annoyance. We also use pre-trained Glove.840B.300d embeddings as\nthe input representation for all models. The models we compare are SimpleRNN, Long ShortTerm Memory (LSTM), bidirectional LSTM, and Gated Recurrent Unit (GRU). We experiment with different text preprocessing steps, such as removing stopwords and applying stemming, removing negation from stopwords, and the effect of setting the embedding layer as\ntrainable on the models.\n\n\n\nWe find that GRU outperforms all other models, achieving an accuracy of 74%. Bidirectional LSTM and LSTM are close behind, while SimpleRNN performs the worst. We observe that the low accuracy is likely due to the presence of sarcasm, irony, and complexity in\nthe texts. We also notice that setting the embedding layer as trainable improves the performance of LSTM but increases the computational cost and training time significantly. We analyze some examples of misclassified texts by GRU and identify the challenges and limitations\nof the dataset and the models\n\n\n\nIn our study GRU was found to be the best model for emotion classification of\nReddit comments among the four RNN models we compared. We also discuss some future directions for research to improve the emotion recognition task on Reddit comments. Furthermore, we provide an extensive discussion of the applications and methods behind each technique in the context of the paper.\n","PeriodicalId":36514,"journal":{"name":"Recent Advances in Computer Science and Communications","volume":"49 44","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Recent Advances in Computer Science and Communications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2174/0126662558273325231201051141","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Computer Science","Score":null,"Total":0}
引用次数: 0

Abstract

Reddit comments are a valuable source of natural language data in which emotion plays a key role in human communication. However, emotion recognition is a difficult task that requires understanding the context and sentiment of the texts. In this paper, we compare the effectiveness of four recurrent neural network (RNN) models for classifying the emotions of Reddit comments.

We use a small dataset of 4,922 comments labeled with four emotions: approval, disapproval, love, and annoyance. We use pre-trained Glove.840B.300d embeddings as the input representation for all models. The models we compare are SimpleRNN, Long Short-Term Memory (LSTM), bidirectional LSTM, and Gated Recurrent Unit (GRU). We experiment with different text preprocessing steps, such as removing stopwords and applying stemming, and removing negation words from the stopword list so that they are retained, and we examine the effect of setting the embedding layer as trainable.

We find that GRU outperforms all other models, achieving an accuracy of 74%. Bidirectional LSTM and LSTM are close behind, while SimpleRNN performs the worst. We observe that the low accuracy is likely due to the presence of sarcasm, irony, and complexity in the texts. We also notice that setting the embedding layer as trainable improves the performance of LSTM but significantly increases the computational cost and training time. We analyze some examples of texts misclassified by GRU and identify the challenges and limitations of the dataset and the models.

In our study, GRU was found to be the best of the four RNN models compared for emotion classification of Reddit comments. We also discuss future directions for research to improve emotion recognition on Reddit comments, and we provide an extensive discussion of the applications and methods behind each technique in the context of the paper.
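The abstract describes the pipeline only at a high level. Below is a minimal Keras sketch of what such a setup could look like: stopword removal that keeps negation words, Porter stemming, an embedding layer initialized from glove.840B.300d that can be frozen or set as trainable, and the four recurrent architectures being compared. All specifics here (vocabulary size, sequence length, 64 recurrent units, the negation list, file paths) are illustrative assumptions, not values reported in the paper.

```python
# Illustrative sketch only -- not the authors' released code. Vocabulary size,
# sequence length, unit counts, the negation list, and file paths are assumptions.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from nltk.corpus import stopwords        # requires: nltk.download("stopwords")
from nltk.stem import PorterStemmer

# --- Preprocessing: drop stopwords but keep negation cues, then stem ---------
NEGATIONS = {"no", "not", "nor", "never"}                 # assumed negation list
STOPWORDS = set(stopwords.words("english")) - NEGATIONS   # negations are retained
stemmer = PorterStemmer()

def preprocess(text: str) -> str:
    tokens = [t for t in text.lower().split() if t not in STOPWORDS]
    return " ".join(stemmer.stem(t) for t in tokens)

# --- GloVe embedding matrix from glove.840B.300d.txt (path assumed) ----------
EMBED_DIM = 300

def load_glove(path: str, word_index: dict, vocab_size: int) -> np.ndarray:
    matrix = np.zeros((vocab_size, EMBED_DIM), dtype="float32")
    with open(path, encoding="utf-8") as f:
        for line in f:
            values = line.rstrip().split(" ")
            word = " ".join(values[:-EMBED_DIM])   # 840B tokens may contain spaces
            idx = word_index.get(word)
            if idx is not None and idx < vocab_size:
                matrix[idx] = np.asarray(values[-EMBED_DIM:], dtype="float32")
    return matrix

# --- The four recurrent architectures compared in the paper ------------------
def build_model(cell: str, embedding_matrix: np.ndarray,
                max_len: int = 100, trainable_embeddings: bool = False):
    vocab_size, embed_dim = embedding_matrix.shape
    recurrent = {
        "simple_rnn": lambda: layers.SimpleRNN(64),
        "lstm":       lambda: layers.LSTM(64),
        "bilstm":     lambda: layers.Bidirectional(layers.LSTM(64)),
        "gru":        lambda: layers.GRU(64),
    }[cell]()
    model = models.Sequential([
        layers.Input(shape=(max_len,)),
        layers.Embedding(vocab_size, embed_dim,
                         embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
                         trainable=trainable_embeddings),  # frozen vs. fine-tuned GloVe
        recurrent,
        layers.Dense(4, activation="softmax"),  # approval, disapproval, love, annoyance
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",  # integer emotion labels
                  metrics=["accuracy"])
    return model

# Example: the configuration the abstract reports as best (GRU, frozen embeddings)
# gru_model = build_model("gru", embedding_matrix, trainable_embeddings=False)
```

Setting trainable_embeddings=True corresponds to the trainable embedding layer discussed in the abstract: it adds on the order of vocab_size × 300 extra trainable parameters, which is consistent with the reported increase in computational cost and training time, whereas a frozen layer leaves only the recurrent and dense weights trainable.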