A Comparative Study of Machine Learning Algorithms for the Detection of Fake News on the Internet

V. N. Barbosa, L. C. E. Silva, F. M. M. Neto, Sebastiao Alves Filho
Proceedings of the XVIII Brazilian Symposium on Information Systems. Published 2022-05-16. DOI: 10.1145/3535511.3535550 (https://doi.org/10.1145/3535511.3535550)
Citations: 4

Abstract

Context: The growing proliferation of fake news on the Internet has significantly degraded the quality and veracity of the information that reaches society. Problem: The malicious use of information can compromise democracy by manipulating people's opinions. In addition, few mechanisms exist that classify news and help citizens determine whether a given news item is true. This problem has driven new research directions aimed at classifying and identifying such news. Methodology: This work compares machine learning algorithms as candidate intelligent solutions for detecting fake news in Portuguese. The dataset used for this analysis comprised about 12,000 news items. Pre-processing techniques were used to analyze the patterns in these news items, as well as to reduce noise and eliminate null information. The algorithms compared were Logistic Regression, Stochastic Gradient Descent, Support Vector Machine (SVM) and Multilayer Perceptron. Result: The models generated by all four algorithms obtained an accuracy greater than 90%. To choose the best algorithm, metrics such as precision, recall and f-measure were computed for each model. The SVM algorithm had the best performance, with 96.39% accuracy. Contribution: In addition to the analytical results, this work makes available a database of news in Portuguese and presents a grammatical and structural analysis of the news text, aimed at detecting the patterns that distinguish true news from false news.
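The abstract names accuracy, precision, recall and f-measure as the criteria for ranking the models but does not show how they are computed. The sketch below is one plain-Python way to derive them from true and predicted labels. It is an illustration only, not the authors' code: the `evaluate` function, the `Metrics` container, the choice of "fake" as the positive class, and the toy labels are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f_measure: float


def evaluate(y_true, y_pred, positive="fake"):
    """Compute binary-classification metrics, treating `positive` as the target class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts relative to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return Metrics(accuracy, precision, recall, f_measure)


# Toy labels invented for illustration; they are not drawn from the paper's dataset.
y_true = ["fake", "fake", "fake", "true", "true", "true", "true", "fake"]
y_pred = ["fake", "fake", "true", "true", "true", "fake", "true", "fake"]
m = evaluate(y_true, y_pred)
print(m)
```

Accuracy alone can hide an imbalance between missed fake news (low recall) and true news wrongly flagged as fake (low precision), which is presumably why the study reports the additional metrics for each model.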