Entity-centric multi-domain transformer for improving generalization in fake news detection

Information Processing & Management · IF 7.4 · Zone 1 (Management) · JCR Q1 (Computer Science, Information Systems) · Pub Date: 2024-06-14 · DOI: 10.1016/j.ipm.2024.103807

Parisa Bazmi, Masoud Asadpour, Azadeh Shakery, Abbas Maazallahi

Article URL: https://www.sciencedirect.com/science/article/pii/S0306457324001663

Citations: 0

Abstract

Fake news has become a significant concern in recent times, particularly during the COVID-19 pandemic, as spreading false information can pose significant public health risks. Although many models have been proposed to detect fake news, they are often limited in their ability to extend to newly emerging domains because they are designed for a single domain. Previous studies on multi-domain fake news detection have focused on developing models that perform well on multiple domains, but they often lack the ability to generalize to new, unseen domains, which limits their effectiveness. To overcome this limitation, we propose the Entity-centric Multi-domain Transformer (EMT) model. EMT uses entities in the news as key components in learning domain-invariant and domain-specific news representations, which addresses the challenges of domain shift and incomplete domain labeling in multi-domain fake news detection. It incorporates entity background information from external knowledge sources to enhance fine-grained news domain representation. EMT consists of a Domain-Invariant (DI) encoder, a Domain-Specific (DS) encoder, and a Cross-Domain Transformer (CT) that facilitates investigation of domain relationships and knowledge interaction with input news, enabling effective generalization. We evaluate EMT's performance in multi-domain fake news detection across three settings: supervised multi-domain, zero-shot on a new unseen domain, and limited samples from a new domain. EMT demonstrates greater stability than state-of-the-art models when dealing with domain changes and varying training data. Specifically, in the zero-shot setting on new unseen domains, EMT achieves an F1 score of approximately 72%. The results highlight the effectiveness of EMT's entity-centric approach and its potential for real-world applications, as it adapts to various training settings and outperforms existing models in handling limited labeled data and adapting to previously unseen domains.
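The zero-shot setting described in the abstract can be illustrated with a small sketch: hold out one domain entirely, train on the rest, and score predictions on the held-out domain with F1. This is only an illustration under stated assumptions, not the authors' evaluation code. The function names (`leave_one_domain_out`, `f1_score`), the toy samples, the domain names, and the predictions are all hypothetical, and the abstract does not say whether the reported F1 is binary or macro-averaged; binary F1 on the "fake" class is assumed here.

```python
def leave_one_domain_out(samples, held_out):
    """Split (text, domain, label) tuples so the held-out domain is never seen in training."""
    train = [s for s in samples if s[1] != held_out]
    test = [s for s in samples if s[1] == held_out]
    return train, test


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: label 1 = fake, 0 = real.
samples = [
    ("claim a", "politics", 1),
    ("claim b", "politics", 0),
    ("claim c", "health", 1),
    ("claim d", "health", 0),
    ("claim e", "covid", 1),
    ("claim f", "covid", 1),
    ("claim g", "covid", 0),
]

# Zero-shot on "covid": the model would be trained on `train` only.
train, test = leave_one_domain_out(samples, held_out="covid")
y_true = [label for _, _, label in test]
y_pred = [1, 0, 0]  # hypothetical model outputs for the three held-out items
score = f1_score(y_true, y_pred)
print(f"held-out F1: {score:.3f}")
```

The limited-sample setting from the abstract would differ only in moving a few held-out items from `test` into `train` before training.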

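Source journal

Information Processing & Management (Engineering & Technology; Computer Science: Information Systems)

CiteScore: 17.00

Self-citation rate: 11.60%

Articles published: 276

Review time: 39 days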

About the journal: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.