Jihad R’Baiti, R. Faizi, Youssef Hmamouche, A. E. Seghrouchni
A transformer-based architecture for the automatic detection of clickbait for Arabic headlines
DOI: 10.1109/ICNLP58431.2023.00052 (https://doi.org/10.1109/ICNLP58431.2023.00052)
Published: 2023-03-01, pp. 248-252
Abstract
As technology advances, everything is becoming digitized, including newspapers and magazines, and information is now quick and easy to access. However, some content creators exploit this accessibility by using unethical methods to attract users’ attention, aiming to increase their advertising revenue rather than to provide accurate information. In this research, we present a comparative study of several approaches based on natural language processing techniques and deep learning models to address the clickbait phenomenon, with the goal of detecting this type of content in Arabic. A fine-tuned BERT model with an attached neural network layer achieved the best results, with an accuracy of 0.9103, a precision of 0.9111, and a recall of 0.9103, outperforming CNN, LSTM, BiLSTM, and FFNN models built on the TF-IDF, RoBERTa, and Embedding representation methods.
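The reported recall (0.9103) is identical to the reported accuracy. The abstract does not say how precision and recall were averaged across the two classes, but this equality is what support-weighted averaging always produces, since weighted recall reduces algebraically to accuracy. The following is a minimal sketch of that relationship, assuming weighted averaging; the function name `weighted_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall).

    Assumption: per-class scores are averaged with weights equal to each
    class's share of the true labels. The paper does not confirm this.
    """
    n = len(y_true)
    support = Counter(y_true)      # number of true examples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = 0.0
    recall = 0.0
    for label, count in support.items():
        weight = count / n
        # Per-class precision is taken as 0 when the class is never predicted.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        precision += weight * p
        recall += weight * r
    return accuracy, precision, recall


# Toy labels made up for illustration (1 = clickbait, 0 = not clickbait).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
acc, prec, rec = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```

On any label set, `rec` equals `acc` under this averaging, while `prec` can differ slightly, which matches the pattern in the reported figures. If the authors used macro or binary averaging instead, the equality in their results would be coincidental.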