The COVID-19 Tweets Classification Based on Recurrent Neural Network

International Journal on Advanced Science, Engineering and Information Technology (Q3, Agricultural and Biological Sciences). Pub Date: 2024-02-28. DOI: 10.18517/ijaseit.14.1.18832

A. Laksito, Nuruddin Wiranda, Shofiyati Nur Karimah, Mardhiya Hayaty


Citations: 0



The COVID-19 Tweets Classification Based on Recurrent Neural Network

Sentiment analysis on Twitter is widely used in both public and commercial contexts and has recently received much attention, particularly for tweets about COVID-19. Information about COVID-19 has spread widely over social media, resulting in various views, opinions, and emotions about the pandemic and significantly affecting people's health. It is exceedingly challenging for the authorities to find rumors on these public platforms manually. This paper proposes a framework for text classification using the RNN model and its variants, such as LSTM, BiLSTM, and GRU. The study aims to determine the best recurrent network model for classifying Twitter data. We used Twitter data relevant to COVID-19 and the lockdown, labeled with four classes (sad, joy, fear, and anger). In addition, the study tests whether GloVe pre-trained word embeddings can increase the accuracy of model predictions. The dataset was split into 80% for training and 20% for testing. Early stopping was used with a limit of 15 epochs and a minimum delta of 0.01, meaning that training stops if accuracy does not improve by 0.1% after 15 epochs. We used the average F1-score to measure the quality of the classification results. The test results show that the BiLSTM model with GloVe word embeddings yields the best F1-score among the models compared. Moreover, across all models tested, the 'fear' class has the highest F1-score of the four classes.
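
The early stopping rule described in the abstract can be sketched in a framework-independent way. The following is a minimal illustration, not the authors' code: the names `EarlyStopping` and `epochs_run` are hypothetical, and the sketch assumes accuracy is reported as a fraction between 0 and 1, so a minimum delta of 0.01 corresponds to one percentage point. The abstract gives both 0.01 and 0.1%, so the exact threshold used in the paper is not certain; the sketch simply takes 0.01 as the default.

```python
from typing import Iterable, Optional


class EarlyStopping:
    """Stop training when a monitored metric stops improving.

    patience:  number of epochs to wait without a sufficient improvement.
    min_delta: smallest increase over the best value that counts as an
               improvement (assumed here to be on a 0..1 accuracy scale).
    """

    def __init__(self, patience: int = 15, min_delta: float = 0.01) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best: Optional[float] = None  # best metric seen so far
        self.wait = 0                      # epochs since the last improvement

    def step(self, metric: float) -> bool:
        """Record one epoch's metric; return True if training should stop."""
        if self.best is None or metric - self.best > self.min_delta:
            # Sufficient improvement: remember it and reset the counter.
            self.best = metric
            self.wait = 0
            return False
        # No sufficient improvement this epoch.
        self.wait += 1
        return self.wait >= self.patience


def epochs_run(accuracies: Iterable[float], stopper: EarlyStopping) -> int:
    """Feed per-epoch accuracies to the stopper; return the epochs executed."""
    count = 0
    for count, acc in enumerate(accuracies, start=1):
        if stopper.step(acc):
            break
    return count
```

In a real training loop, `step` would be called once per epoch with the validation accuracy, and the loop would exit when it returns `True`. Deep learning frameworks typically ship an equivalent callback, which would normally be preferred over a hand-written one.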


Source journal

International Journal on Advanced Science, Engineering and Information Technology
Subject area: Agricultural and Biological Sciences (all)

CiteScore: 1.40
Self-citation rate: 0.00%
Articles published: 272

About the journal: International Journal on Advanced Science, Engineering and Information Technology (IJASEIT) is an international peer-reviewed journal dedicated to the exchange of high-quality research results in all aspects of science, engineering, and information technology. The journal publishes state-of-the-art papers in fundamental theory, experiments, and simulation, as well as applications, with a systematically proposed method, sufficient review of previous work, expanded discussion, and a concise conclusion. As part of its commitment to the advancement of science and technology, IJASEIT follows an open access policy that makes published articles freely available online without any subscription. The journal's scope includes (but is not limited to) the following:
-Science: Bioscience & Biotechnology, Chemistry & Food Technology, Environmental, Health Science, Mathematics & Statistics, Applied Physics
-Engineering: Architecture, Chemical & Process, Civil & Structural, Electrical, Electronic & Systems, Geological & Mining Engineering, Mechanical & Materials
-Information Science & Technology: Artificial Intelligence, Computer Science, E-Learning & Multimedia, Information System, Internet & Mobile Computing