Title: Predictive Modeling for Arabic Fake News Detection: Leveraging Language Model Embeddings and Stacked Ensemble
Authors: Muhammad Umer, Arwa A. Jamjoom, Shtwai Alsubai, Aisha Ahmed AlArfaj, E. Alabdulqader, I. Ashraf
Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Publication type: Journal Article
Published: 2024-07-08
DOI: 10.1145/3677016 (https://doi.org/10.1145/3677016)
Journal metrics: impact factor 1.8; JCR Q3 (Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
The proliferation of fake news threatens information integrity and calls for robust detection mechanisms. This study addresses Arabic fake news detection and aims to overcome the comparatively low accuracy reported for this task by combining word embeddings with a stacking classifier. The proposed model combines bagging, boosting, and baseline classifiers, harnessing the strengths of each to create a robust ensemble. Extensive experiments indicate that recall, F1 score, accuracy, and precision each reach 99%. Stacking, coupled with appropriate textual feature extraction, enables the model to detect Arabic fake news effectively. The results contribute to fake news detection, particularly in the Arabic context, and offer a tool for enhancing information veracity and fostering a more informed public discourse. The proposed model’s accuracy is also compared with that of other state-of-the-art models from the existing literature, and the authors report that it performs better.
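The abstract does not name the specific base classifiers, the meta-learner, or the embedding model, so the following is only a minimal, dependency-free sketch of the general idea of stacking bagging, boosting, and baseline learners, not the authors' implementation. Everything in it is an assumption made for illustration: decision stumps as weak learners, AdaBoost as the boosting method, logistic regression as the meta-learner, and all function names. The two-dimensional toy vectors stand in for document embeddings and are separable by construction, so the printed accuracy says nothing about the results reported in the paper.

```python
import math
import random

def stump_predict(stump, x):
    """Apply a decision stump (feature index, threshold, sign) to one vector."""
    j, t, sign = stump
    return 1 if sign * (x[j] - t) > 0 else 0

def fit_stump(X, y, w):
    """Return (weighted error, stump) for the best single-feature threshold rule."""
    best_err, best = float("inf"), None
    for j in range(len(X[0])):
        values = sorted({x[j] for x in X})
        for lo, hi in zip(values, values[1:]):
            for sign in (1, -1):
                stump = (j, (lo + hi) / 2, sign)
                err = sum(wi for wi, x, yi in zip(w, X, y) if stump_predict(stump, x) != yi)
                if err < best_err:
                    best_err, best = err, stump
    return best_err, best

def fit_baseline(X, y, rng):
    """Baseline learner: a single stump fit on the whole training set (rng unused)."""
    stump = fit_stump(X, y, [1 / len(X)] * len(X))[1]
    return lambda x: float(stump_predict(stump, x))

def fit_bagging(X, y, rng, n_models=9):
    """Bagging: stumps fit on bootstrap resamples; output is the vote share for class 1."""
    stumps = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in X]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        stumps.append(fit_stump(Xb, yb, [1 / len(Xb)] * len(Xb))[1])
    return lambda x: sum(stump_predict(s, x) for s in stumps) / n_models

def fit_boosting(X, y, rng, rounds=5):
    """Boosting: AdaBoost over stumps; output is the weighted vote share for class 1."""
    w, ensemble = [1 / len(X)] * len(X), []
    for _ in range(rounds):
        err, stump = fit_stump(X, y, w)
        err = min(max(err, 1e-9), 0.5)  # clip so alpha stays finite and non-negative
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        w = [wi * math.exp(alpha if stump_predict(stump, x) != yi else -alpha)
             for wi, x, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]  # renormalise the sample weights
    norm = sum(a for a, _ in ensemble) or 1.0
    return lambda x: sum(a for a, s in ensemble if stump_predict(s, x) == 1) / norm

def fit_logistic(Z, y, epochs=300, lr=0.5):
    """Meta-learner: logistic regression trained by batch gradient descent."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for z, yi in zip(Z, y):
            p = 1 / (1 + math.exp(-(sum(wk * zk for wk, zk in zip(w, z)) + b)))
            grad_b += p - yi
            grad_w = [g + (p - yi) * zk for g, zk in zip(grad_w, z)]
        w = [wk - lr * g / len(Z) for wk, g in zip(w, grad_w)]
        b -= lr * grad_b / len(Z)
    return lambda z: 1 if sum(wk * zk for wk, zk in zip(w, z)) + b > 0 else 0

def fit_stack(X, y, base_fitters, rng, k=5):
    """Stacking: train the meta-learner on out-of-fold base outputs, then refit bases on all data."""
    Z = [None] * len(X)
    for fold in range(k):
        held = [i for i in range(len(X)) if i % k == fold]
        kept = [i for i in range(len(X)) if i % k != fold]
        models = [f([X[i] for i in kept], [y[i] for i in kept], rng) for f in base_fitters]
        for i in held:
            Z[i] = [m(X[i]) for m in models]
    meta = fit_logistic(Z, y)
    bases = [f(X, y, rng) for f in base_fitters]
    return lambda x: meta([m(x) for m in bases])

def toy_vectors(n, rng):
    """Hypothetical stand-in for embeddings: dimension 0 separates the classes, dimension 1 is noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2  # 1 = fake, 0 = real (labels alternate, so classes are balanced)
        centre = 2.25 if label else -2.25
        X.append([centre + rng.uniform(-0.75, 0.75), rng.uniform(-1, 1)])
        y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = toy_vectors(40, rng)
X_test, y_test = toy_vectors(20, rng)
predict = fit_stack(X_train, y_train, [fit_baseline, fit_bagging, fit_boosting], rng)
predictions = [predict(x) for x in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

The design point worth noting is in `fit_stack`: the meta-learner is trained only on predictions that each base learner made for examples it did not see during fitting, which keeps the meta-learner from simply rewarding whichever base model overfits the training set most.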
About the journal:
The ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) publishes high quality original archival papers and technical notes in the areas of computation and processing of information in Asian languages, low-resource languages of Africa, Australasia, Oceania and the Americas, as well as related disciplines. The subject areas covered by TALLIP include, but are not limited to:
-Computational Linguistics: including computational phonology, computational morphology, computational syntax (e.g. parsing), computational semantics, computational pragmatics, etc.
-Linguistic Resources: including computational lexicography, terminology, electronic dictionaries, cross-lingual dictionaries, electronic thesauri, etc.
-Hardware and software algorithms and tools for Asian or low-resource language processing, e.g., handwritten character recognition.
-Information Understanding: including text understanding, speech understanding, character recognition, discourse processing, dialogue systems, etc.
-Machine Translation involving Asian or low-resource languages.
-Information Retrieval: including natural language processing (NLP) for concept-based indexing, natural language query interfaces, semantic relevance judgments, etc.
-Information Extraction and Filtering: including automatic abstraction, user profiling, etc.
-Speech processing: including text-to-speech synthesis and automatic speech recognition.
-Multimedia Asian Information Processing: including speech, image, video, image/text translation, etc.
-Cross-lingual information processing involving Asian or low-resource languages.
Papers that deal in theory, systems design, evaluation and applications in the aforesaid subjects are appropriate for TALLIP. Emphasis will be placed on the originality and the practical significance of the reported research.