Learning Textual Representations from Multiple Modalities to Detect Fake News Through One-Class Learning

M. Gôlo, M. C. D. Souza, R. G. Rossi, S. O. Rezende, B. Nogueira, R. Marcacini
Published in: Proceedings of the Brazilian Symposium on Multimedia and the Web, 2021-09-27
DOI: 10.1145/3470482.3479634 (https://doi.org/10.1145/3470482.3479634)
Citations: 7

Abstract

Fake news can spread rapidly among internet users. Content classification approaches proposed in the literature usually learn models from textual and contextual features of both real and fake news in order to minimize the spread of disinformation. One of the prominent approaches to detecting fake news is One-Class Learning (OCL), as it minimizes the data labeling effort by requiring labels only for fake news documents. The performance of these algorithms depends on the structured representation of the documents used in the learning process. Generally, a text-based unimodal representation is used, such as bag-of-words or representations based on linguistic categories. We propose MVAE-FakeNews, a multimodal representation method for detecting fake news with OCL. The proposed approach uses a Multimodal Variational Autoencoder to learn a new representation from the combination of two modalities considered promising for fake news detection: text embeddings and topic information. In the experiments, we used three datasets covering Portuguese and English. Results show that MVAE-FakeNews obtained a better F1-Score for the class of interest, outperforming nine other methods in ten of the twelve evaluated scenarios. MVAE-FakeNews also achieved a better average ranking and a statistically significant difference from the other representation models. The proposed method proved promising for representing texts in the OCL scenario to detect fake news.
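To make the OCL setting described in the abstract concrete, the following is a minimal sketch, not the authors' MVAE-FakeNews. It assumes that plain concatenation stands in for the fusion that the Multimodal Variational Autoencoder would learn, and that a simple centroid-and-radius model stands in for whatever one-class algorithm is actually used. All names (fuse, CentroidOneClass, f1_interest) and the synthetic data are hypothetical. The point it illustrates is that the model is fitted on fake news vectors only, and that F1 is computed for the class of interest.

```python
import math
import random


def fuse(text_embedding, topic_vector):
    """Concatenate two modality vectors (assumed stand-in for learned fusion)."""
    return list(text_embedding) + list(topic_vector)


class CentroidOneClass:
    """Toy one-class model: a hypersphere around the mean of the training class."""

    def __init__(self, quantile=0.95):
        self.quantile = quantile
        self.center = None
        self.radius = None

    def fit(self, vectors):
        n, dim = len(vectors), len(vectors[0])
        self.center = [sum(v[i] for v in vectors) / n for i in range(dim)]
        # Radius is the chosen quantile of training distances to the center.
        dists = sorted(math.dist(v, self.center) for v in vectors)
        self.radius = dists[min(n - 1, int(self.quantile * n))]
        return self

    def predict(self, vectors):
        """Return 1 (class of interest, here fake news) inside the sphere, else 0."""
        return [1 if math.dist(v, self.center) <= self.radius else 0 for v in vectors]


def f1_interest(y_true, y_pred):
    """F1-Score for the class of interest (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Synthetic demo data (hypothetical): 3-d "text embedding" plus 2-d "topic" vector.
rng = random.Random(0)


def sample(mean, n):
    return [
        fuse([rng.gauss(mean, 0.1) for _ in range(3)],
             [rng.gauss(mean, 0.1) for _ in range(2)])
        for _ in range(n)
    ]


train_fake = sample(0.0, 40)   # only the class of interest is used for training
test_fake = sample(0.0, 10)
test_real = sample(3.0, 10)    # never seen during training

model = CentroidOneClass().fit(train_fake)
y_true = [1] * len(test_fake) + [0] * len(test_real)
y_pred = model.predict(test_fake + test_real)
print("F1 (class of interest):", f1_interest(y_true, y_pred))
```

In this sketch, the quantile controls how tightly the sphere wraps the training class: a lower value rejects more real news at the cost of missing more fake news. Swapping in a learned multimodal representation would only change what is passed to fit and predict.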