A Hybrid Technique for Detecting Extremism in Arabic Social Media Texts

Elektronika ir Elektrotechnika Pub Date : 2023-10-31 DOI:10.5755/j02.eie.34743

Israa Akram Alzuabidi, Layla Safwat Jamil, A. A. Ahmed, Shahrul Azman Mohd Noah, Mohammad Kamrul Hasan





A Hybrid Technique for Detecting Extremism in Arabic Social Media Texts

Social media sites such as Twitter let users share opinions and thoughts publicly with millions of other users. Opinions shared on these sites can influence large numbers of people, who may easily retweet them and accelerate their spread. Unfortunately, some of these opinions are expressed by extremists who promote hateful content. Because Arabic is one of the most widely spoken languages, it is crucial to automate the monitoring of Arabic content published on social sites. This study therefore proposes a hybrid technique for detecting extremism in Arabic social media texts and articles, so that published extremist content can be monitored. The proposed technique combines a lexicon-based approach with a rough set theory approach. Rough set theory is employed with two approximation strategies: lower approximation and accuracy approximation. The hybrid technique uses rough set theory as the classifier and the lexicon-based approach as the vector. The study also built three types of corpora (V1, V2, and V3) collected from Twitter. The experimental findings show that, among the proposed hybrid methods, the accuracy approximation was superior to the lower approximation with seed vector. Hybrid methods also outperformed machine learning techniques in terms of efficiency. The study recommends using the accuracy approximation method with seed vector to identify the polarity of a text.
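The abstract names the lower approximation and the accuracy of approximation but does not define them, so the following is only a minimal sketch of the standard rough set definitions, not the authors' implementation. The tweet identifiers, the two binary lexicon features, and the labels are all hypothetical toy values chosen for illustration. Items with identical feature tuples are treated as indiscernible. The lower approximation collects the classes that lie entirely inside the target set, the upper approximation collects the classes that overlap it, and the accuracy is the ratio of their sizes.

```python
from collections import defaultdict


def equivalence_classes(features):
    """Group item ids whose feature tuples are identical (indiscernibility classes)."""
    classes = defaultdict(set)
    for item_id, feature_tuple in features.items():
        classes[feature_tuple].add(item_id)
    return list(classes.values())


def lower_upper(classes, target):
    """Return the lower and upper approximations of `target`."""
    lower, upper = set(), set()
    for cls in classes:
        if cls <= target:   # class lies entirely inside the target set
            lower |= cls
        if cls & target:    # class overlaps the target set
            upper |= cls
    return lower, upper


def approximation_accuracy(lower, upper):
    """Size of the lower approximation divided by size of the upper one."""
    # Returning 1.0 for an empty upper approximation is a convention, assumed here.
    return len(lower) / len(upper) if upper else 1.0


# Hypothetical toy data: (contains_extremist_seed_word, contains_neutral_seed_word)
features = {
    "t1": (1, 0),
    "t2": (1, 0),
    "t3": (1, 1),
    "t4": (1, 1),
    "t5": (0, 1),
    "t6": (0, 0),
}
extremist = {"t1", "t2", "t3"}  # hypothetical labels

classes = equivalence_classes(features)
lower, upper = lower_upper(classes, extremist)
accuracy = approximation_accuracy(lower, upper)
print(sorted(lower), sorted(upper), accuracy)
```

In this toy data, t3 and t4 share the same features but carry different labels, so their class falls in the upper approximation only. That is the kind of ambiguity the two approximations are meant to separate from the clear-cut cases.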
