Comparison of Machine Learning Approaches for Detecting COVID-19-Lockdown-Related Discussions During Recovery and Lockdown Periods

Journal of Operations Intelligence. Pub Date: 2023-10-25. DOI: 10.31181/jopi1120233

Mohammed Rashad Baker, A.H. Alamoodi, O.S. Albahri, A.S. Albahri, Salem Garfan, Amneh Alamleh, Moceheb Lazam Shuwandy, Ibrahim Alshakhatreh

Abstract

Ever since COVID-19 was declared a pandemic, governments around the world have implemented numerous phases of lockdown measures to curb the spread of the virus. These lockdown tactics have manifested in widespread fear and panic driven by social media discussions. Because individuals hold diverse opinions about these measures during and after lockdown, positive and negative lockdown-related discussions should be differentiated to better understand the major related issues and to inform appropriate messaging and policy choices in the future. We conduct a sentiment analysis (SA) of COVID-19-lockdown-related tweets using different machine learning (ML) classifiers and evaluate their performance before and after applying the synthetic minority oversampling technique (SMOTE). The research is performed in five phases: data collection, pre-processing of the dataset, preparation of the dataset by annotation, application of SMOTE, and classification with ML classifiers. We observe an improvement in accuracy (Acc), as confirmed by the Matthews correlation coefficient (MCC), across most classifiers after SMOTE was applied. The exception is k-nearest neighbour (KNN), whose Acc decreased from 0.82 to 0.59 and whose MCC decreased from 0.544 to 0.279. Despite its potential with some classifiers, SMOTE cannot be considered an ultimate solution, especially with other classifiers and datasets. The study highlights the need to evaluate and benchmark the integration of data balancing approaches with ML classifiers and to consider additional metrics, such as MCC, for binary classification problems, especially in SA.
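The abstract reports both accuracy and MCC and applies SMOTE before classification. The following is a minimal Python sketch of those three ideas, not the authors' implementation: the abstract does not name the software used, so the function names (`accuracy`, `mcc`, `smote_like`), the neighbour count `k`, and the toy confusion-matrix counts are all assumptions chosen for illustration. The oversampler is a simplified version of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), written with the standard library only.

```python
import math
import random


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def smote_like(minority, n_new, k=2, seed=0):
    """Simplified SMOTE-style oversampling (illustrative assumption only).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical imbalanced test set: 90 negatives, 10 positives, and a
# classifier that predicts "negative" for every sample.
print(accuracy(tp=0, tn=90, fp=0, fn=10))  # 0.9
print(mcc(tp=0, tn=90, fp=0, fn=10))       # 0.0
```

In the toy case, accuracy looks strong at 0.9 while MCC is 0.0, which illustrates why the abstract recommends reporting MCC alongside accuracy on imbalanced binary data. In practice, library implementations such as `SMOTE` in imbalanced-learn and `matthews_corrcoef` in scikit-learn would typically replace these helpers.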
