Learning to generate and evaluate fact-checking explanations with transformers

Impact Factor: 8 | CAS Zone 2 (Computer Science) | JCR Q1 (Automation & Control Systems) | Engineering Applications of Artificial Intelligence | Pub Date: 2025-01-01 | Epub Date: 2024-10-28 | DOI: 10.1016/j.engappai.2024.109492

Darius Feher, Abdullah Khered, Hao Zhang, Riza Batista-Navarro, Viktor Schlegel

Engineering Applications of Artificial Intelligence, volume 139, Article 109492. Article URL: https://www.sciencedirect.com/science/article/pii/S0952197624016506

Citations: 0

Abstract

In an era increasingly dominated by digital platforms, the spread of misinformation poses a significant challenge, highlighting the need for solutions capable of assessing information veracity. Our research contributes to the field of Explainable Artificial Intelligence (XAI) by developing transformer-based fact-checking models that contextualise and justify their decisions by generating human-accessible explanations. Importantly, we also develop models for automatic evaluation of explanations for fact-checking verdicts across different dimensions such as (self)-contradiction, hallucination, convincingness and overall quality. By introducing human-centred evaluation methods and developing specialised datasets, we emphasise the need for aligning Artificial Intelligence (AI)-generated explanations with human judgements. This approach not only advances theoretical knowledge in XAI but also holds practical implications by enhancing the transparency and reliability of AI-driven fact-checking systems and users’ trust in them. Furthermore, the development of our metric learning models is a first step towards potentially increasing efficiency and reducing reliance on extensive manual assessment. Based on experimental results, our best-performing generative model achieved a Recall-Oriented Understudy for Gisting Evaluation-1 (ROUGE-1) score of 47.77, demonstrating superior performance in generating fact-checking explanations, particularly when provided with high-quality evidence. Additionally, the best-performing metric learning model showed a moderately strong correlation with human judgements on objective dimensions such as (self)-contradiction and hallucination, achieving a Matthews Correlation Coefficient (MCC) of around 0.7.
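For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how ROUGE-1 and MCC can be computed. It is not the authors' evaluation code: the whitespace tokenisation, the lowercasing, the binary 0/1 labels and the function names `rouge1` and `mcc` are all assumptions made for illustration. The abstract does not state which ROUGE-1 variant (precision, recall or F1) or which label scheme was used.

```python
from collections import Counter
from math import sqrt


def rouge1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall and F1.

    Assumption: lowercased whitespace tokens; real ROUGE toolkits
    may apply stemming or a different tokeniser.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


def mcc(y_true: list, y_pred: list) -> float:
    """Matthews Correlation Coefficient for binary labels (0 or 1).

    Assumption: a binary setting, e.g. 1 = "explanation hallucinates".
    """
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect agreement), so a value of around 0.7 indicates substantial but imperfect agreement with the human labels. ROUGE scores are often reported multiplied by 100, which would be consistent with the 47.77 figure, though the abstract does not say so explicitly.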


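Source journal

Engineering Applications of Artificial Intelligence (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 9.60

Self-citation rate: 10.00%

Articles published: 505

Review time: 68 days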

Journal description: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.