A transformer-based architecture for the automatic detection of clickbait for Arabic headlines

Q3 Arts and Humanities, Icon. Pub Date: 2023-03-01. DOI: 10.1109/ICNLP58431.2023.00052

Jihad R’Baiti, R. Faizi, Youssef Hmamouche, A. E. Seghrouchni

Citations: 0

Abstract

As technology advances, everything is becoming digitized, including newspapers and magazines, and information is now easy and fast to access. However, some content creators exploit this opportunity by using unethical methods to attract users’ attention, aiming to increase their advertising income rather than to provide accurate information. In this research, we present a comparative study of various approaches based on natural language processing techniques and deep learning models to address this clickbait phenomenon and to detect this type of content in Arabic. A fine-tuned BERT model with an attached neural network layer achieved the best results, with an accuracy of 0.9103, a precision of 0.9111, and a recall of 0.9103, outperforming CNN, LSTM, BiLSTM, and FFNN models that used the TF-IDF, RoBERTa, and Embedding representation methods.
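The abstract does not say how precision and recall were averaged over the two classes. One reading consistent with the reported figures (recall equal to accuracy, both 0.9103) is support-weighted averaging, because weighted recall always equals accuracy. The following is a minimal sketch of that computation under this assumption; the function name `weighted_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall).

    Assumption: per-class scores are averaged with weights equal to each
    class's share of the true labels (support-weighted averaging).
    """
    n = len(y_true)
    support = Counter(y_true)    # number of true examples per class
    predicted = Counter(y_pred)  # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = sum(
        (support[c] / n) * (correct[c] / predicted[c] if predicted[c] else 0.0)
        for c in support
    )
    recall = sum((support[c] / n) * (correct[c] / support[c]) for c in support)
    return accuracy, precision, recall


# Hypothetical labels (1 = clickbait, 0 = not clickbait); not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
acc, prec, rec = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```

On this toy input, weighted recall matches accuracy while weighted precision differs slightly, the same pattern as in the reported results. Under macro averaging, recall would generally not coincide with accuracy.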
