印度新闻数据集的假新闻检测

Sudhanshu Kumar, Thoudam Doren Singh
{"title":"印度新闻数据集的假新闻检测","authors":"Sudhanshu Kumar,&nbsp;Thoudam Doren Singh","doi":"10.1016/j.gltp.2022.03.014","DOIUrl":null,"url":null,"abstract":"<div><p>With the increase in social networks, more number of people are creating and sharing information than ever before, many of them have no relevance to reality. Due to this, fake news for various political and commercial purposes are spreading quickly. Online newspaper has made it challenging to identify trustworthy news sources. In this work, Hindi news articles from various news sources are collected. Preprocessing, feature extraction, classification and prediction processes are discussed in detail. Different machine learning algorithms such as Naïve Bayes, logistic regression and Long Short-Term Memory (LSTM) are used to detect the fake news. The preprocessing step includes data cleaning, stop words removal, tokenizing and stemming. Term frequency inverse document frequency(TF-IDF) is used for feature extraction. Naïve Bayes, logistic regression and LSTM classifiers are used and compared for fake news detection with probability of truth. It is observed that among these three classifiers, LSTM achieved best accuracy of 92.36%.</p></div>","PeriodicalId":100588,"journal":{"name":"Global Transitions Proceedings","volume":"3 1","pages":"Pages 289-297"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666285X2200019X/pdfft?md5=158942440d14be3e63b882da17dba987&pid=1-s2.0-S2666285X2200019X-main.pdf","citationCount":"16","resultStr":"{\"title\":\"Fake news detection on Hindi news dataset\",\"authors\":\"Sudhanshu Kumar,&nbsp;Thoudam Doren Singh\",\"doi\":\"10.1016/j.gltp.2022.03.014\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>With the increase in social networks, more number of people are creating and sharing information than ever before, many of them have no relevance to reality. Due to this, fake news for various political and commercial purposes are spreading quickly. Online newspaper has made it challenging to identify trustworthy news sources. In this work, Hindi news articles from various news sources are collected. Preprocessing, feature extraction, classification and prediction processes are discussed in detail. Different machine learning algorithms such as Naïve Bayes, logistic regression and Long Short-Term Memory (LSTM) are used to detect the fake news. The preprocessing step includes data cleaning, stop words removal, tokenizing and stemming. Term frequency inverse document frequency(TF-IDF) is used for feature extraction. Naïve Bayes, logistic regression and LSTM classifiers are used and compared for fake news detection with probability of truth. It is observed that among these three classifiers, LSTM achieved best accuracy of 92.36%.</p></div>\",\"PeriodicalId\":100588,\"journal\":{\"name\":\"Global Transitions Proceedings\",\"volume\":\"3 1\",\"pages\":\"Pages 289-297\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S2666285X2200019X/pdfft?md5=158942440d14be3e63b882da17dba987&pid=1-s2.0-S2666285X2200019X-main.pdf\",\"citationCount\":\"16\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Global Transitions Proceedings\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2666285X2200019X\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Global Transitions Proceedings","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666285X2200019X","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 16

摘要

随着社交网络的增加,越来越多的人创造和分享信息,其中许多与现实无关。因此,出于各种政治和商业目的的假新闻正在迅速传播。在线报纸使人们很难找到值得信赖的新闻来源。在这项工作中,收集了来自各种新闻来源的印地语新闻文章。详细讨论了预处理、特征提取、分类和预测过程。利用Naïve贝叶斯、逻辑回归、长短期记忆(LSTM)等不同的机器学习算法来检测假新闻。预处理步骤包括数据清理、停止词删除、标记化和词干提取。使用词频逆文档频率(TF-IDF)进行特征提取。Naïve使用贝叶斯,逻辑回归和LSTM分类器对假新闻的真实概率检测进行了比较。观察到,在这三种分类器中,LSTM的准确率最高,为92.36%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Fake news detection on Hindi news dataset

With the increase in social networks, more number of people are creating and sharing information than ever before, many of them have no relevance to reality. Due to this, fake news for various political and commercial purposes are spreading quickly. Online newspaper has made it challenging to identify trustworthy news sources. In this work, Hindi news articles from various news sources are collected. Preprocessing, feature extraction, classification and prediction processes are discussed in detail. Different machine learning algorithms such as Naïve Bayes, logistic regression and Long Short-Term Memory (LSTM) are used to detect the fake news. The preprocessing step includes data cleaning, stop words removal, tokenizing and stemming. Term frequency inverse document frequency(TF-IDF) is used for feature extraction. Naïve Bayes, logistic regression and LSTM classifiers are used and compared for fake news detection with probability of truth. It is observed that among these three classifiers, LSTM achieved best accuracy of 92.36%.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Enhanced Energy Efficient Secure Routing Protocol for Mobile Ad-Hoc Network Grid interconnected H-bridge multilevel inverter for renewable power applications using repeating units and level boosting network Power Generation Using Ocean Waves: A Review Development of an Arabic HQAS-based ASAG to consider an ignored knowledge in misspelled multiple words short answers Smartphone assist deep neural network to detect the citrus diseases in Agri-informatics
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1