A Hybrid Technique for Detecting Extremism in Arabic Social Media Texts

Israa Akram Alzuabidi, Layla safwat Jamil, A. A. Ahmed, Shahrul Azman Mohd Noah, Mohammad Kamrul Hasan
Journal: Elektronika ir Elektrotechnika, volume 29 1
Published: 2023-10-31
DOI: 10.5755/j02.eie.34743 (https://doi.org/10.5755/j02.eie.34743)
Citations: 0

Abstract

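Today, social media sites such as Twitter provide effective platforms for sharing opinions and thoughts publicly with millions of other users. Opinions shared on these sites influence large numbers of people, who can easily retweet them and accelerate their spread. Unfortunately, some of these opinions are expressed by extremists who promote hateful content. Since Arabic is one of the most widely spoken languages, it is crucial to automate the monitoring of Arabic content published on social sites. This study therefore proposes a hybrid technique for detecting extremism in Arabic social media texts and articles, in order to monitor published extremist content. The proposed technique combines the lexicon-based approach with the rough set theory approach. Rough set theory is employed with two approximation strategies: lower approximation and accuracy approximation. The hybrid technique uses rough set theory as the classifier and the lexicon-based approach as the vector. In addition, the study built three corpora (V1, V2, and V3) collected from Twitter. The experimental findings show that, among the proposed hybrid methods, the accuracy approximation outperformed the lower approximation with a seed vector. The hybrid methods were also found to outperform machine learning techniques in terms of efficiency. The study therefore recommends using the accuracy approximation method with a seed vector to identify the polarity of a text.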
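The abstract does not spell out how the lower approximation and the accuracy approximation are computed, so the following is only a minimal sketch of the textbook rough set definitions, not the authors' implementation. It assumes that each text is reduced to a binary vector over a seed lexicon, that texts with identical vectors are indiscernible, and that accuracy is the ratio of the lower to the upper approximation size. The lexicon, the example texts, the labels, and all function names are hypothetical placeholders (English tokens stand in for Arabic ones).

```python
from collections import defaultdict

# Hypothetical seed lexicon (placeholder tokens, not the paper's lexicon).
SEED_LEXICON = ("threat", "destroy", "peace")


def seed_vector(text, lexicon=SEED_LEXICON):
    """Binary vector: 1 if the seed term occurs in the text, else 0."""
    tokens = set(text.lower().split())
    return tuple(int(term in tokens) for term in lexicon)


def equivalence_classes(texts):
    """Group text indices that share the same seed vector (indiscernible texts)."""
    classes = defaultdict(set)
    for index, text in enumerate(texts):
        classes[seed_vector(text)].add(index)
    return list(classes.values())


def approximations(texts, target):
    """Return the (lower, upper) approximations of the index set `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(texts):
        if block <= target:   # block lies entirely inside the target class
            lower |= block
        if block & target:    # block overlaps the target class
            upper |= block
    return lower, upper


def accuracy(texts, target):
    """Accuracy of approximation: |lower| / |upper| (1.0 if upper is empty)."""
    lower, upper = approximations(texts, target)
    return len(lower) / len(upper) if upper else 1.0


# Toy data with made-up labels; indices 0, 1 and 3 are the target class.
texts = [
    "threat to destroy them",
    "destroy every threat",
    "we want peace",
    "a threat to us all",
    "no threat was found",
]
extremist = {0, 1, 3}

lower, upper = approximations(texts, extremist)
print(lower, upper, accuracy(texts, extremist))
```

In this toy data, texts 3 and 4 share the same seed vector but carry different labels, so they fall in the upper approximation only. The lower approximation keeps just the texts whose vector is unambiguous for the class, while the accuracy ratio summarises how much of the class the lexicon can separate cleanly.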