A Sentiment Analysis Framework on COVID-19 in Major Cities of Malaysia based on Tweets using Machine Learning Classification Model

Raihah Aminuddin, Muhammad Akmal Bistamam, Shafaf Ibrahim, Nur Nabilah Abu Mangshor, S. Fesol, Normilah Wahab
DOI: 10.1109/ICSET53708.2021.9612527
Published in: 2021 IEEE 11th International Conference on System Engineering and Technology (ICSET)
Publication date: 2021-11-06
Link: https://doi.org/10.1109/ICSET53708.2021.9612527
Citations: 0

Abstract

Twitter is one of the best-known social media platforms on which people share their stories and opinions about any situation, such as the COVID-19 pandemic. Because tweets indirectly influence users and COVID-19 cases in Malaysia are rising, it is important to monitor pandemic-related information in order to avoid misinformation, panic, or confusion among the public. Since tweets are also a useful source of raw data for data visualization, this project aims to design and develop a web-based system that visualizes the status of the pandemic in Malaysia based on data collected from Twitter. The methodology has four phases: (i) Planning, (ii) Analysis, (iii) Design and Development, and (iv) Testing and Documentation. In the planning and analysis phases, tweets will be collected from March 2020 to March 2021 and filtered by keywords and hashtags, such as #COVID19 and #Coronavirus, as well as by the location tagged on each tweet. The collected data will then be pre-processed to remove unwanted text. The data are classified by sentiment using a machine learning model, the Support Vector Machine (SVM). The performance of the classification model will be evaluated using four metrics: (i) accuracy, (ii) recall, (iii) precision, and (iv) F1-measure. The final output of this project is a data visualization of sentiment on COVID-19 in Malaysia for two of its major cities: Kuala Lumpur and Klang.
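
The filtering, pre-processing, and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the exact cleaning rules, the case-insensitive hashtag match, and the two-class "positive"/"negative" labels are all assumptions made for illustration, and the SVM training step itself is left out.

```python
import re

# Hashtags and cities named in the abstract; case-insensitive matching is an assumption.
HASHTAGS = ("#covid19", "#coronavirus")
CITIES = ("kuala lumpur", "klang")


def matches_filter(text, location):
    """Keep a tweet only if it has a target hashtag and a target city tag."""
    lowered = text.lower()
    has_tag = any(tag in lowered for tag in HASHTAGS)
    in_city = location.strip().lower() in CITIES
    return has_tag and in_city


def clean_tweet(text):
    """Remove URLs, @mentions, '#' symbols, and other non-alphanumeric characters."""
    text = re.sub(r"https?://\S+", " ", text)    # drop links
    text = re.sub(r"@\w+", " ", text)            # drop user mentions
    text = text.replace("#", " ")                # keep the hashtag word, drop the symbol
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)  # drop punctuation and emoji
    return " ".join(text.lower().split())        # lowercase, collapse whitespace


def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

In a full pipeline, the cleaned text would be converted to features and passed to an SVM classifier, and the predicted labels would then be compared with held-out labels using `binary_metrics`.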