S. Selva Birunda, R. Kanniga Devi, M. Muthukannan, M. Mahesh Babu
ACOVMD: Automatic COVID‐19 misinformation detection in Twitter using self‐trained semi‐supervised hybrid deep learning model
International Social Science Journal (JCR Q2, Social Sciences). Published 2023-11-13. DOI: 10.1111/issj.12475 (https://doi.org/10.1111/issj.12475)
Citations: 0
Abstract
During the COVID‐19 pandemic, online social networks were used 8.4% more than ever before, which contributed to the spread of false information related to COVID‐19. Although many fake news detection models exist, annotation inconsistency, memory consumption, and the lack of accurate, efficient, self‐trained algorithms remain challenges in detecting emerging COVID‐19 misinformation tweets. Hence, the main aim of this work is to develop a self‐trained semi‐supervised model that accurately and automatically assesses the reliability of emerging COVID‐19 tweets without delay. In this work, a COVID‐19 tweet dataset in English, covering January 2020 to January 2022, is created as a ground truth database. A self‐trained semi‐supervised hybrid deep learning model is then proposed that trains its supervised and unsupervised components simultaneously on this dataset. The model is self‐trained repeatedly and updated so that it can predict the reliability of upcoming COVID‐19 tweets that differ from the training tweets. We repeated the experiments while limiting the share of labelled tweets shown to the model to 80%, 50%, 40%, 30%, 20% and 10%. Experimental results show that the proposed model achieves 80.92% accuracy in the 10% label‐seen experiment and 98.15% accuracy in the 80% label‐seen experiment, so accuracy rises as more labels are shown. Therefore, this technique will be useful for effectively classifying the large volume of emerging tweets generated as part of the COVID‐19 infodemic. The proposed model can make efficient use of a large amount of unlabelled tweets, which may enhance its generalization performance.
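The self-training idea summarised in the abstract can be illustrated with a minimal sketch. This is not the authors' hybrid deep learning model: it assumes a simple nearest-centroid classifier as a stand-in, synthetic 2-D points in place of tweet features, and a hypothetical confidence threshold. All function names (`fit`, `predict_with_confidence`, `self_train`, `run_experiment`) are illustrative and do not come from the paper.

```python
import random


def fit(X, y):
    """Stand-in classifier (assumption): one mean vector per class."""
    model = {}
    for c in set(y):
        pts = [x for x, lab in zip(X, y) if lab == c]
        model[c] = [sum(p[i] for p in pts) / len(pts) for i in range(len(pts[0]))]
    return model


def predict_with_confidence(model, x):
    """Return the nearest class and a crude confidence score in [0, 1]."""
    dist = {c: sum((a - b) ** 2 for a, b in zip(x, m)) for c, m in model.items()}
    best = min(dist, key=dist.get)
    total = sum(dist.values())
    conf = 1.0 - dist[best] / total if total > 0 else 1.0
    return best, conf


def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
    """Repeatedly pseudo-label confident unlabelled points, then refit."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = fit(X_lab, y_lab)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            if conf >= threshold:  # hypothetical threshold, not from the paper
                X_lab.append(x)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:  # nothing confident enough is left, so stop
            break
        model = fit(X_lab, y_lab)
    return model


def make_data(n_per_class, rng):
    """Synthetic stand-in data: class 0 near (0, 0), class 1 near (4, 4)."""
    data = []
    for label, centre in ((0, 0.0), (1, 4.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], label))
    return data


def run_experiment(label_fraction, seed=0):
    """Show the model only `label_fraction` of the training labels."""
    rng = random.Random(seed)
    train, test = make_data(100, rng), make_data(100, rng)
    n_lab = max(1, int(100 * label_fraction))  # labelled points per class
    labelled = train[:n_lab] + train[100:100 + n_lab]
    unlabelled = [x for x, _ in train[n_lab:100] + train[100 + n_lab:]]
    model = self_train([x for x, _ in labelled], [y for _, y in labelled], unlabelled)
    correct = sum(predict_with_confidence(model, x)[0] == y for x, y in test)
    return correct / len(test)


if __name__ == "__main__":
    for fraction in (0.1, 0.2, 0.3, 0.4, 0.5, 0.8):
        print(f"{int(fraction * 100)}% labels seen: accuracy {run_experiment(fraction):.2f}")
```

The loop over label fractions mirrors the experimental protocol described above, in which the share of labelled tweets shown to the model is varied. The accuracies printed by this toy example say nothing about the figures reported in the paper.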
About the journal:
The International Social Science Journal bridges social science communities across disciplines and continents with a view to sharing information and debate with the widest possible audience. The ISSJ has a particular focus on interdisciplinary and transdisciplinary work that pushes the boundaries of current approaches, and welcomes both applied and theoretical research. Originally founded by UNESCO in 1949, ISSJ has since grown into a forum for innovative review, reflection and discussion informed by recent and ongoing international, social science research. It provides a home for work that asks questions in new ways and/or employs original methods to classic problems and whose insights have implications across the disciplines and beyond the academy. The journal publishes regular editions featuring rigorous, peer-reviewed research articles that reflect its international and heterodox scope.