Sijia Yang, Xianyong Li, Yajun Du, Dong Huang, Xiaoliang Chen, Yongquan Fan, Shumin Wang
A hierarchical dual-view model for fake news detection guided by discriminative lexicons
International Journal of Machine Learning and Cybernetics (impact factor 3.1; JCR Q2, Computer Science, Artificial Intelligence). Published 2024-08-23. DOI: 10.1007/s13042-024-02322-0 (https://doi.org/10.1007/s13042-024-02322-0)
Citations: 0
Abstract
Fake news detection aims to automatically assess the credibility of source posts, mitigating potential societal harm and conserving human resources. Textual fake news detection methods fall into two categories: pattern-based and fact-based. Pattern-based models focus on identifying shared writing patterns in source posts, while fact-based models leverage auxiliary external knowledge. Researchers have recently attempted to merge these two views into a comprehensive detection system, achieving superior performance to single-view methods. However, existing dual-view methods often prioritize integrating single-view methods over exploring the nuanced characteristics of both perspectives. To address this, we propose a novel hierarchical dual-view model for fake news detection guided by discriminative lexicons. First, we construct two lexicons based on distinct word usage tendencies in fake and real news and further augment them with synonyms sourced from large language models. We then devise a hierarchical attention network to derive semantic representations for the source post, incorporating a lexicon attention loss that guides the network to prioritize words from the two lexicons. Subsequently, a lexicon-guided interaction network models the relations between the source post and its relevant articles, assigning an authenticity-aware weight to each article. Finally, the representations of the source post and the relevant articles are concatenated for joint detection. In experiments, our model outperforms many competitive baselines in macro F1 score, with margins ranging from 1.1% to 10.5% on Weibo and from 3.2% to 10.8% on Twitter.
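The abstract does not give the statistic used to build the lexicons or the exact form of the lexicon attention loss, so the following is only a minimal sketch under stated assumptions. It ranks words by a smoothed log-odds ratio between fake and real posts (an assumed scoring rule) and penalizes attention that falls outside a lexicon using the negative log of the attention mass on lexicon words (an assumed loss form). The names `build_lexicons` and `lexicon_attention_loss`, and the parameters `top_k`, `smoothing`, and `eps`, are hypothetical. The augmentation with synonyms from large language models is omitted.

```python
import math
from collections import Counter


def build_lexicons(fake_posts, real_posts, top_k=2, smoothing=1.0):
    """Return (fake_lexicon, real_lexicon) as lists of words.

    Assumption: words are scored by a smoothed log-odds ratio of their
    relative frequency in fake vs. real posts. The paper's actual
    statistic is not stated in the abstract.
    """
    fake_counts = Counter(w for post in fake_posts for w in post.lower().split())
    real_counts = Counter(w for post in real_posts for w in post.lower().split())
    vocab = set(fake_counts) | set(real_counts)

    # Additive smoothing keeps the ratio finite for words seen in one class only.
    fake_total = sum(fake_counts.values()) + smoothing * len(vocab)
    real_total = sum(real_counts.values()) + smoothing * len(vocab)

    scores = {}
    for word in vocab:
        p_fake = (fake_counts[word] + smoothing) / fake_total
        p_real = (real_counts[word] + smoothing) / real_total
        scores[word] = math.log(p_fake / p_real)

    # Highest positive scores lean fake; lowest negative scores lean real.
    # Ties are broken alphabetically so the result is deterministic.
    by_fake = sorted(vocab, key=lambda w: (-scores[w], w))
    by_real = sorted(vocab, key=lambda w: (scores[w], w))
    fake_lexicon = [w for w in by_fake[:top_k] if scores[w] > 0]
    real_lexicon = [w for w in by_real[:top_k] if scores[w] < 0]
    return fake_lexicon, real_lexicon


def lexicon_attention_loss(tokens, attention, lexicon, eps=1e-9):
    """Negative log of the attention mass placed on lexicon words.

    Assumption: `attention` is a normalized weight per token. The loss is
    small when most attention falls on lexicon words and grows as
    attention drifts to other tokens.
    """
    lexicon = set(lexicon)
    mass = sum(a for t, a in zip(tokens, attention) if t in lexicon)
    return -math.log(mass + eps)


if __name__ == "__main__":
    fake_posts = ["shocking secret cure exposed", "shocking miracle cure"]
    real_posts = ["officials confirmed the report", "the report was confirmed today"]
    fake_lex, real_lex = build_lexicons(fake_posts, real_posts)
    print("fake-leaning words:", fake_lex)
    print("real-leaning words:", real_lex)

    tokens = ["shocking", "cure", "found"]
    print("loss, attention on lexicon words:",
          lexicon_attention_loss(tokens, [0.5, 0.3, 0.2], fake_lex))
    print("loss, attention elsewhere:",
          lexicon_attention_loss(tokens, [0.1, 0.1, 0.8], fake_lex))
```

In a trained model, a term of this kind would presumably be added to the classification loss with a weighting coefficient. That weighting and the way the two lexicons are combined are also not specified in the abstract.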
About the journal:
Cybernetics is concerned with describing complex interactions and interrelationships between systems which are omnipresent in our daily life. Machine Learning discovers fundamental functional relationships between variables and ensembles of variables in systems. The merging of the disciplines of Machine Learning and Cybernetics is aimed at the discovery of various forms of interaction between systems through diverse mechanisms of learning from data.
The International Journal of Machine Learning and Cybernetics (IJMLC) focuses on the key research problems emerging at the junction of machine learning and cybernetics and serves as a broad forum for rapid dissemination of the latest advancements in the area. The emphasis of IJMLC is on the hybrid development of machine learning and cybernetics schemes inspired by different contributing disciplines such as engineering, mathematics, cognitive sciences, and applications. New ideas, design alternatives, implementations and case studies pertaining to all the aspects of machine learning and cybernetics fall within the scope of the IJMLC.
Key research areas to be covered by the journal include:
Machine Learning for modeling interactions between systems
Pattern Recognition technology to support discovery of system-environment interaction
Control of system-environment interactions
Biochemical interaction in biological and biologically-inspired systems
Learning for improvement of communication schemes between systems