Wei Kitt Wong, Filbert Hilman Juwono, Ing Ming Chew, Basil Andy Lease
Australian Journal of Telecommunications and the Digital Economy. Published 2023-09-30. DOI: 10.18080/jtde.v11n3.789 (https://doi.org/10.18080/jtde.v11n3.789)
Language Independent Models for COVID-19 Fake News Detection
In an era where massive amounts of information can be spread easily through social media, fake news detection is increasingly used to prevent widespread misinformation, especially fake news regarding COVID-19. Databases have been built and machine-learning algorithms have been used to identify patterns in news content and filter out false information. This paper provides a brief overview ranging from public-domain datasets to the deployment of several machine-learning models and feature extraction methods. As a case study, a mixed-language dataset is presented. The dataset consists of tweets about COVID-19 that have been labelled as fake or real news. To perform the detection task, a classification model is implemented using language-independent features. In particular, the features offer numerical inputs that are invariant to the language type; thus, they are suitable for investigation, as many regions in the world have similar linguistic structures. Furthermore, the classification task can be performed using black box or white box models, each having its own advantages and disadvantages. In this paper, we compare the performance of the two approaches. Simulation results show that the performance difference between black box models and white box models is not significant.
About the journal:
The Journal of Telecommunications and the Digital Economy (JTDE) is an international, open-access, high-quality, peer-reviewed journal, indexed by Scopus and Google Scholar, covering innovative research and practice in Telecommunications, Digital Economy and Applications. The mission of JTDE is to further, through publication, the objective of advancing learning, knowledge and research worldwide. The JTDE publishes peer-reviewed papers that may take the following forms:
*Research Paper - a paper making an original contribution to engineering knowledge.
*Special Interest Paper - a report on significant aspects of a major or notable project.
*Review Paper for specialists - an overview of a relevant area intended for specialists in the field covered.
*Review Paper for non-specialists - an overview of a relevant area suitable for a reader with an electrical/electronics background.
*Public Policy Discussion - a paper that identifies or discusses public policy and includes investigation of legislation, regulation and what is happening around the world, including best practice.
*Tutorial Paper - a paper that explains an important subject or clarifies the approach to an area of design or investigation.
*Technical Note - a technical note or letter to the Editors that is not sufficiently developed or extensive in scope to constitute a full paper.
*Industry Case Study - a paper that provides details of industry practices, utilising a case study to provide an understanding of what is occurring and how the outcomes have been achieved.
*Discussion - a contribution to discuss a published paper, to which the original author's response will be sought.
*Historical - a paper covering a historical topic related to telecommunications or the digital economy.