A transformer-based architecture for the automatic detection of clickbait for Arabic headlines

Journal: Icon (Q3, Arts and Humanities) | Pub Date: 2023-03-01 | DOI: 10.1109/ICNLP58431.2023.00052
Jihad R’Baiti, R. Faizi, Youssef Hmamouche, A. E. Seghrouchni
Pages: 248-252
Citations: 0

Abstract

As technology advances, everything is becoming digitized, including newspapers and magazines. Information is now quick and easy to access. However, some content creators exploit this by using unethical methods to attract users' attention, aiming to increase their advertising income rather than provide accurate information. In this research, we present a comparative study of several approaches based on natural language processing techniques and deep learning models to address the clickbait phenomenon and to detect this type of content in Arabic. A fine-tuned BERT model with an attached neural network layer achieved the best results, with an accuracy of 0.9103, a precision of 0.9111, and a recall of 0.9103, outperforming CNN, LSTM, BiLSTM, and FFNN models that used the TF-IDF, RoBERTa, and Embedding representation methods.
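The abstract reports an accuracy and a recall that are both 0.9103. One reading consistent with this, and only an assumption since the abstract does not say how the scores were averaged, is support-weighted averaging over the classes, under which weighted recall always equals accuracy. The sketch below illustrates that arithmetic on invented toy labels; the function name `weighted_metrics` and the data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall) for two label lists.

    Assumption: per-class scores are averaged with weights equal to each
    class's share of the true labels (support-weighted averaging).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    n = len(y_true)
    support = Counter(y_true)      # number of true examples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = 0.0
    recall = 0.0
    for label, count in support.items():
        weight = count / n
        # Per-class precision is taken as 0 when the class is never predicted.
        if predicted[label]:
            class_precision = correct[label] / predicted[label]
        else:
            class_precision = 0.0
        class_recall = correct[label] / count
        precision += weight * class_precision
        recall += weight * class_recall
    return accuracy, precision, recall


# Toy labels invented for illustration: 1 = clickbait, 0 = not clickbait.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

accuracy, precision, recall = weighted_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```

On these toy labels, accuracy and weighted recall coincide while weighted precision differs slightly, which is the same pattern as the reported scores. Other averaging schemes (for example macro averaging, or scoring only the clickbait class) would not in general make recall equal accuracy.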
Source journal: Icon (Arts and Humanities, History and Philosophy of Science) | CiteScore: 0.30 | Self-citation rate: 0.00% | Articles published: 0