That is a Known Lie: Detecting Previously Fact-Checked Claims

Annual Meeting of the Association for Computational Linguistics
Pub Date: 2020-05-12
DOI: 10.18653/v1/2020.acl-main.332

Shaden Shaar, Giovanni Da San Martino, Nikolay Babulkov, Preslav Nakov


Citations: 117

Abstract

The recent proliferation of "fake news" has triggered a number of responses, most notably the emergence of several manual fact-checking initiatives. As a result and over time, a large number of fact-checked claims have been accumulated, which increases the likelihood that a new claim in social media or a new statement by a politician might have already been fact-checked by some trusted fact-checking organization, as viral claims often come back after a while in social media, and politicians like to repeat their favorite statements, true or false, over and over again. As manual fact-checking is very time-consuming (and fully automatic fact-checking has credibility issues), it is important to try to save this effort and to avoid wasting time on claims that have already been fact-checked. Interestingly, despite the importance of the task, it has been largely ignored by the research community so far. Here, we aim to bridge this gap. In particular, we formulate the task and we discuss how it relates to, but also differs from, previous work. We further create a specialized dataset, which we release to the research community. Finally, we present learning-to-rank experiments that demonstrate sizable improvements over state-of-the-art retrieval and textual similarity approaches.
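
The task described in the abstract can be read as a ranking problem: given an input claim, order a collection of previously fact-checked claims so that any that already cover it appear first. The following is a minimal sketch under that reading, assuming a plain TF-IDF cosine similarity scorer. It only illustrates a simple textual-similarity baseline and is not the authors' learning-to-rank model; the function names and the example claims are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_verified_claims(input_claim, verified_claims):
    """Return (index, score) pairs sorted by descending TF-IDF cosine similarity.

    Hypothetical helper: a simple similarity baseline, not the paper's method.
    """
    docs = [tokenize(c) for c in verified_claims]
    n = len(docs)
    # Document frequency of each term over the verified-claim collection.
    df = Counter(term for doc in docs for term in set(doc))

    def vectorize(tokens):
        tf = Counter(tokens)
        # Smoothed IDF, so terms unseen in the collection still get a finite weight.
        return {t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1.0) for t in tf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query = vectorize(tokenize(input_claim))
    scores = [(i, cosine(query, vectorize(doc))) for i, doc in enumerate(docs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up example claims, for illustration only.
    verified = [
        "The unemployment rate fell to a record low last year.",
        "Vaccines cause autism in children.",
        "The city built 500 new schools in a decade.",
    ]
    for index, score in rank_verified_claims("Unemployment is at a record low", verified):
        print(f"{score:.3f}  {verified[index]}")
```

A learned ranker would typically combine several such similarity scores as features rather than rely on a single one; the sketch shows only the retrieval framing.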


