Language Independent Models for COVID-19 Fake News Detection

Wei Kitt Wong, Filbert Hilman Juwono, Ing Ming Chew, Basil Andy Lease
{"title":"Language Independent Models for COVID-19 Fake News Detection","authors":"Wei Kitt Wong, Filbert Hilman Juwono, Ing Ming Chew, Basil Andy Lease","doi":"10.18080/jtde.v11n3.789","DOIUrl":null,"url":null,"abstract":"In an era where massive information can be spread easily through social media, fake news detention is increasingly used to prevent widespread misinformation, especially fake news regarding COVID-19. Databases have been built and machine-learning algorithms have been used to identify patterns in news content and filter the false information. A brief overview, ranging from public domain datasets through the deployment of several machine learning models, as well as feature extraction methods, is provided in this paper. As a case study, a mixed language dataset is presented. The dataset consists of tweets of COVID-19 which have been labelled as fake or real news. To perform the detection task, a classification model is implemented using language-independent features. In particular, the features offer numerical inputs that are invariant to the language type; thus, they are suitable for investigation, as many regions in the world have similar linguistic structures. Furthermore, the classification task can be performed by using black box or white box models, each having its own advantages and disadvantages. In this paper, we compare the performance of the two approaches. Simulation results show that the performance difference between black box models and white box models is not significant.","PeriodicalId":37752,"journal":{"name":"Australian Journal of Telecommunications and the Digital Economy","volume":"40 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Australian Journal of Telecommunications and the Digital Economy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18080/jtde.v11n3.789","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Social Sciences","Score":null,"Total":0}
引用次数: 0

Abstract

In an era where massive information can be spread easily through social media, fake news detention is increasingly used to prevent widespread misinformation, especially fake news regarding COVID-19. Databases have been built and machine-learning algorithms have been used to identify patterns in news content and filter the false information. A brief overview, ranging from public domain datasets through the deployment of several machine learning models, as well as feature extraction methods, is provided in this paper. As a case study, a mixed language dataset is presented. The dataset consists of tweets of COVID-19 which have been labelled as fake or real news. To perform the detection task, a classification model is implemented using language-independent features. In particular, the features offer numerical inputs that are invariant to the language type; thus, they are suitable for investigation, as many regions in the world have similar linguistic structures. Furthermore, the classification task can be performed by using black box or white box models, each having its own advantages and disadvantages. In this paper, we compare the performance of the two approaches. Simulation results show that the performance difference between black box models and white box models is not significant.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
COVID-19假新闻检测的语言独立模型
在大量信息可以通过社交媒体轻松传播的时代,虚假新闻拘留越来越多地用于防止错误信息的广泛传播,特别是有关新冠肺炎的虚假新闻。已经建立了数据库,并使用机器学习算法来识别新闻内容中的模式并过滤虚假信息。本文简要概述了从公共领域数据集到几个机器学习模型的部署,以及特征提取方法。作为案例研究,给出了一个混合语言数据集。该数据集由COVID-19的推文组成,这些推文被标记为假新闻或真实新闻。为了执行检测任务,使用与语言无关的特征实现分类模型。特别是,这些功能提供了对语言类型不变的数值输入;因此,它们是适合研究的,因为世界上许多地区都有类似的语言结构。此外,可以使用黑盒模型或白盒模型来执行分类任务,每种模型都有自己的优点和缺点。在本文中,我们比较了两种方法的性能。仿真结果表明,黑盒模型与白盒模型的性能差异不显著。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
CiteScore
1.60
自引率
0.00%
发文量
37
期刊介绍: The Journal of Telecommunications and the Digital Economy (JTDE) is an international, open-access, high quality, peer reviewed journal, indexed by Scopus and Google Scholar, covering innovative research and practice in Telecommunications, Digital Economy and Applications. The mission of JTDE is to further through publication the objective of advancing learning, knowledge and research worldwide. The JTDE publishes peer reviewed papers that may take the following form: *Research Paper - a paper making an original contribution to engineering knowledge. *Special Interest Paper – a report on significant aspects of a major or notable project. *Review Paper for specialists – an overview of a relevant area intended for specialists in the field covered. *Review Paper for non-specialists – an overview of a relevant area suitable for a reader with an electrical/electronics background. *Public Policy Discussion - a paper that identifies or discusses public policy and includes investigation of legislation, regulation and what is happening around the world including best practice *Tutorial Paper – a paper that explains an important subject or clarifies the approach to an area of design or investigation. *Technical Note – a technical note or letter to the Editors that is not sufficiently developed or extensive in scope to constitute a full paper. *Industry Case Study - a paper that provides details of industry practices utilising a case study to provide an understanding of what is occurring and how the outcomes have been achieved. *Discussion – a contribution to discuss a published paper to which the original author''s response will be sought. Historical - a paper covering a historical topic related to telecommunications or the digital economy.
期刊最新文献
Blockchain Technology for Tourism Post COVID-19 ICT-driven Transparency: Empirical Evidence from Selected Asian Countries Big Data Analytics in Tracking COVID-19 Spread Utilizing Google Location Data Harry S. Wragge AM (1929-2023) Phishing Message Detection Based on Keyword Matching
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1