{"title":"以实体为中心的多域转换器,用于提高假新闻检测的概括能力","authors":"Parisa Bazmi , Masoud Asadpour , Azadeh Shakery , Abbas Maazallahi","doi":"10.1016/j.ipm.2024.103807","DOIUrl":null,"url":null,"abstract":"<div><p>Fake news has become a significant concern in recent times, particularly during the COVID-19 pandemic, as spreading false information can pose significant public health risks. Although many models have been suggested to detect fake news, they are often limited in their ability to extend to new emerging domains since they are designed for a single domain. Previous studies on multidomain fake news detection have focused on developing models that can perform well on multiple domains, but they often lack the ability to generalize to new unseen domains, which limits their effectiveness. To overcome this limitation, in this paper, we propose the Entity-centric Multi-domain Transformer (EMT) model. EMT uses entities in the news as key components in learning domain-invariant and domain-specific news representations, which addresses the challenges of domain shift and incomplete domain labeling in multidomain fake news detection. It incorporates entity background information from external knowledge sources to enhance fine-grained news domain representation. EMT consists of a Domain-Invariant (DI) encoder, a Domain-Specific (DS) encoder, and a Cross-Domain Transformer (CT) that facilitates investigation of domain relationships and knowledge interaction with input news, enabling effective generalization. We evaluate the EMT's performance in multi-domain fake news detection across three settings: supervised multi-domain, zero-shot setting on new unseen domain, and limited samples from new domain. EMT demonstrates greater stability than state-of-the-art models when dealing with domain changes and varying training data. Specifically, in the zero-shot setting on new unseen domains, EMT achieves a good F1 score of approximately 72 %. 
The results highlight the effectiveness of EMT's entity-centric approach and its potential for real-world applications, as it demonstrates the ability to adapt to various training settings and outperform existing models in handling limited label data and adapting to previously unseen domains.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4000,"publicationDate":"2024-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Entity-centric multi-domain transformer for improving generalization in fake news detection\",\"authors\":\"Parisa Bazmi , Masoud Asadpour , Azadeh Shakery , Abbas Maazallahi\",\"doi\":\"10.1016/j.ipm.2024.103807\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Fake news has become a significant concern in recent times, particularly during the COVID-19 pandemic, as spreading false information can pose significant public health risks. Although many models have been suggested to detect fake news, they are often limited in their ability to extend to new emerging domains since they are designed for a single domain. Previous studies on multidomain fake news detection have focused on developing models that can perform well on multiple domains, but they often lack the ability to generalize to new unseen domains, which limits their effectiveness. To overcome this limitation, in this paper, we propose the Entity-centric Multi-domain Transformer (EMT) model. EMT uses entities in the news as key components in learning domain-invariant and domain-specific news representations, which addresses the challenges of domain shift and incomplete domain labeling in multidomain fake news detection. It incorporates entity background information from external knowledge sources to enhance fine-grained news domain representation. 
EMT consists of a Domain-Invariant (DI) encoder, a Domain-Specific (DS) encoder, and a Cross-Domain Transformer (CT) that facilitates investigation of domain relationships and knowledge interaction with input news, enabling effective generalization. We evaluate the EMT's performance in multi-domain fake news detection across three settings: supervised multi-domain, zero-shot setting on new unseen domain, and limited samples from new domain. EMT demonstrates greater stability than state-of-the-art models when dealing with domain changes and varying training data. Specifically, in the zero-shot setting on new unseen domains, EMT achieves a good F1 score of approximately 72 %. The results highlight the effectiveness of EMT's entity-centric approach and its potential for real-world applications, as it demonstrates the ability to adapt to various training settings and outperform existing models in handling limited label data and adapting to previously unseen domains.</p></div>\",\"PeriodicalId\":50365,\"journal\":{\"name\":\"Information Processing & Management\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":7.4000,\"publicationDate\":\"2024-06-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Processing & Management\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0306457324001663\",\"RegionNum\":1,\"RegionCategory\":\"管理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Processing & 
Management","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0306457324001663","RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Entity-centric multi-domain transformer for improving generalization in fake news detection
Fake news has become a significant concern in recent times, particularly during the COVID-19 pandemic, as spreading false information can pose significant public health risks. Although many models have been proposed to detect fake news, they are often limited in their ability to extend to newly emerging domains because they are designed for a single domain. Previous studies on multi-domain fake news detection have focused on developing models that perform well on multiple domains, but these models often lack the ability to generalize to new, unseen domains, which limits their effectiveness. To overcome this limitation, in this paper we propose the Entity-centric Multi-domain Transformer (EMT) model. EMT uses entities in the news as key components in learning domain-invariant and domain-specific news representations, which addresses the challenges of domain shift and incomplete domain labeling in multi-domain fake news detection. It incorporates entity background information from external knowledge sources to enhance fine-grained news domain representation. EMT consists of a Domain-Invariant (DI) encoder, a Domain-Specific (DS) encoder, and a Cross-Domain Transformer (CT) that facilitates investigation of domain relationships and knowledge interaction with input news, enabling effective generalization. We evaluate EMT's performance in multi-domain fake news detection across three settings: supervised multi-domain, zero-shot on a new unseen domain, and limited samples from a new domain. EMT demonstrates greater stability than state-of-the-art models when dealing with domain changes and varying amounts of training data. Specifically, in the zero-shot setting on new unseen domains, EMT achieves an F1 score of approximately 72%.
The results highlight the effectiveness of EMT's entity-centric approach and its potential for real-world applications, as it demonstrates the ability to adapt to various training settings and to outperform existing models in handling limited labeled data and adapting to previously unseen domains.
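The abstract describes EMT as decomposing a news item into a domain-invariant representation and a domain-specific representation before a cross-domain component combines them. The paper itself is not reproduced here, so the following is only an illustrative sketch of that decompose-and-combine idea: the function names, the per-dimension sigmoid gate, and the vector sizes are all assumptions, not the authors' implementation.

```python
# Illustrative sketch of the representation blending implied by the
# abstract: a domain-invariant (DI) vector and a domain-specific (DS)
# vector for the same news item are mixed into one final representation.
# The scalar-gate scheme below is a hypothetical stand-in for EMT's
# Cross-Domain Transformer, chosen only to make the idea concrete.
from math import exp


def sigmoid(x: float) -> float:
    """Standard logistic function, used to squash a gate weight into (0, 1)."""
    return 1.0 / (1.0 + exp(-x))


def combine(di_vec: list[float], ds_vec: list[float],
            gate_weights: list[float]) -> list[float]:
    """Blend domain-invariant and domain-specific features.

    Each dimension gets its own gate g in (0, 1); the output is
    g * DI + (1 - g) * DS, so g -> 1 favors the shared, transferable
    signal and g -> 0 favors the domain-specific signal.
    """
    assert len(di_vec) == len(ds_vec) == len(gate_weights)
    out = []
    for di, ds, w in zip(di_vec, ds_vec, gate_weights):
        g = sigmoid(w)
        out.append(g * di + (1.0 - g) * ds)
    return out


# With zero gate weights, sigmoid(0) = 0.5, so DI and DS contribute equally.
blended = combine([1.0, 0.0], [0.0, 1.0], [0.0, 0.0])
```

In a zero-shot setting on an unseen domain, pushing the gates toward the domain-invariant side is one intuition for why such a decomposition can generalize: the DS pathway has no reliable signal for a domain never seen in training.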
Journal introduction:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology, marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.