The COVID-19 Tweets Classification Based on Recurrent Neural Network

A. Laksito, Nuruddin Wiranda, Shofiyati Nur Karimah, Mardhiya Hayaty
DOI: 10.18517/ijaseit.14.1.18832 (https://doi.org/10.18517/ijaseit.14.1.18832)
Journal: International Journal on Advanced Science, Engineering and Information Technology
Publication date: 2024-02-28
Publication type: Journal Article
Citations: 0

Abstract

Because of its extensive use in both public and commercial contexts, sentiment analysis on Twitter has recently received much attention, particularly for tweets about COVID-19. Information about COVID-19 has spread widely over social media, producing a range of views, opinions, and emotions about the pandemic and significantly affecting people's health. It is exceedingly challenging for authorities to find rumors on these public platforms manually. This paper proposes a framework for text classification using the RNN model and its variants, such as LSTM, BiLSTM, and GRU. The study aims to determine which recurrent network model best handles Twitter data classification. We used Twitter data related to COVID-19 and the lockdown, labeled with four classes (sad, joy, fear, and anger). The study also tests whether GloVe pre-trained word embeddings increase the accuracy of model predictions. The dataset was split into 80% for training and 20% for testing. Early stopping was used with a patience of 15 epochs and a minimum delta of 0.01, meaning that training stops if accuracy fails to improve by at least the minimum delta within 15 epochs. We used the average F1-score to evaluate the classification results. The test results show that the BiLSTM model with GloVe word embeddings yields the best F1-score among the models compared. Moreover, across all models tested, the 'fear' class achieves the highest F1-score of the four classes.
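The early stopping rule and the averaged F1-score mentioned in the abstract can be illustrated with a small dependency-free sketch. This is not the authors' code. The class `EarlyStopping`, the function `macro_f1`, and the sample accuracy values are hypothetical names and data introduced here for illustration. The abstract does not say which F1 average was used, so the unweighted (macro) average below is an assumption.

```python
# Class labels as listed in the abstract.
LABELS = ("sad", "joy", "fear", "anger")


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 scores (macro averaging is assumed)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


class EarlyStopping:
    """Signal a stop once the metric has gone `patience` epochs
    without improving on the best value by more than `min_delta`."""

    def __init__(self, patience=15, min_delta=0.01):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.wait = 0

    def should_stop(self, metric):
        if metric - self.best > self.min_delta:
            self.best = metric  # a real improvement resets the counter
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience


if __name__ == "__main__":
    # Made-up validation accuracies; patience shortened to 3 for the demo.
    stopper = EarlyStopping(patience=3, min_delta=0.01)
    for epoch, acc in enumerate([0.50, 0.60, 0.605, 0.607, 0.609, 0.70], start=1):
        if stopper.should_stop(acc):
            print(f"stopped after epoch {epoch}")
            break
```

With `min_delta=0.01` and accuracy on a 0 to 1 scale, a gain counts as an improvement only if it exceeds one percentage point. Smaller gains still use up patience.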
Source journal
International Journal on Advanced Science, Engineering and Information Technology
Subject area: Agricultural and Biological Sciences (all)
CiteScore: 1.40
Self-citation rate: 0.00%
Articles published: 272
About the journal: The International Journal on Advanced Science, Engineering and Information Technology (IJASEIT) is an international peer-reviewed journal dedicated to the exchange of high-quality research results in all aspects of science, engineering, and information technology. The journal publishes state-of-the-art papers on fundamental theory, experiments and simulation, and applications, each with a systematically proposed method, a sufficient review of previous work, an expanded discussion, and a concise conclusion. As part of its commitment to the advancement of science and technology, IJASEIT follows an open access policy that makes published articles freely available online without a subscription. The journal's scope includes (but is not limited to) the following:
- Science: Bioscience & Biotechnology, Chemistry & Food Technology, Environmental, Health Science, Mathematics & Statistics, Applied Physics
- Engineering: Architecture, Chemical & Process, Civil & Structural, Electrical, Electronic & Systems, Geological & Mining Engineering, Mechanical & Materials
- Information Science & Technology: Artificial Intelligence, Computer Science, E-Learning & Multimedia, Information System, Internet & Mobile Computing