Author: Mojgan Askarizade
Journal: Expert Systems with Applications, volume 262, Article 125649 (impact factor 7.5; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2024-11-05
Publication type: Journal Article
DOI: 10.1016/j.eswa.2024.125649
URL: https://www.sciencedirect.com/science/article/pii/S0957417424025168
Enhancing rumor detection with data augmentation and generative pre-trained transformer
The advent of social networks has facilitated the rapid dissemination of false information, including rumors, causing significant societal and individual damage. Extensive research has been dedicated to rumor detection, with approaches ranging from machine learning techniques to neural networks. However, existing methods fail to learn the deeper concepts of rumor text that are needed to detect rumors. In addition, imbalanced datasets in the rumor domain reduce the effectiveness of these algorithms. This study addresses the imbalance by using the Generative Pre-trained Transformer 2 (GPT-2) model to generate rumor-like texts, thus creating a balanced dataset. It then proposes a novel approach for classifying rumor texts by modifying the GPT-2 model. We compare our results with state-of-the-art machine learning and deep learning methods, as well as pre-trained models, on the PHEME, Twitter15, and Twitter16 datasets. Our findings demonstrate that the proposed model improves accuracy and F-measure in rumor detection compared to previous methods.
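The balancing step described in the abstract can be illustrated with a minimal sketch. It assumes the dataset is a list of (text, label) pairs and that synthetic texts come from a generator callable. The names `balance_with_generator` and `toy_generator` are hypothetical and do not come from the paper; the toy generator is only a placeholder for a fine-tuned language model such as GPT-2, and the paper's actual procedure may differ.

```python
from collections import Counter
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (text, label)


def balance_with_generator(
    data: List[Example],
    generate_fn: Callable[[str], str],
) -> List[Example]:
    """Add synthetic examples until every class matches the largest class.

    generate_fn is an assumption: any callable that maps a seed text to a
    new text of the same class (for example, a fine-tuned language model).
    """
    if not data:
        return []
    counts = Counter(label for _, label in data)
    target = max(counts.values())
    balanced = list(data)
    for label, n in counts.items():
        # Real texts of this class serve as seeds for generation.
        seeds = [text for text, lab in data if lab == label]
        for i in range(target - n):
            seed = seeds[i % len(seeds)]  # cycle through the seeds
            balanced.append((generate_fn(seed), label))
    return balanced


def toy_generator(seed: str) -> str:
    # Placeholder only: a real setup would sample text from a trained model.
    return seed + " (synthetic)"


data = [
    ("the bridge is closed", "non-rumor"),
    ("schools open on monday", "non-rumor"),
    ("the mayor confirmed the event", "non-rumor"),
    ("aliens landed downtown", "rumor"),
]
balanced = balance_with_generator(data, toy_generator)
class_counts = Counter(label for _, label in balanced)
```

Here the minority class "rumor" receives two synthetic examples so that both classes end up with three items. Only the training split should be balanced this way, so that evaluation still reflects the real class distribution.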
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.