A semi-supervised algorithm for detecting extremism propaganda diffusion on social media

IF 0.7; Zone 3 (Literature); LANGUAGE & LINGUISTICS; Pragmatics and Society; Pub Date: 2022-07-21; DOI: 10.1075/ps.21009.fra
M. Francisco, Miguel-Ángel Benítez-Castro, E. Hidalgo-Tenorio, J. Castro
Citations: 1

Abstract

Extremist online networks reportedly use Twitter and other Social Networking Sites (SNS) to issue propaganda and recruitment statements. Traditional machine learning models may encounter problems in such a context, owing to the peculiarities of microblogging sites and the manner in which these networks interact (both among themselves and with other networks). Moreover, state-of-the-art approaches have focused on non-transparent techniques that cannot be audited; so, although they are top-performing techniques, it is impossible to check whether the models are actually fair. In this paper, we present a semi-supervised methodology that uses our Discriminatory Expressions algorithm for feature selection to detect expressions that are biased towards extremist content (Francisco and Castro 2020). With the help of human experts, the relevant expressions are filtered and used to retrieve further extremist content, iteratively yielding a set of relevant and accurate expressions. These discriminatory expressions have been shown to produce less complex models that are easier to comprehend, and thus improve model transparency. In the following, we present close to 70 expressions discovered using this method, alongside validation tests of the algorithm in several different contexts.
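The iterative loop described in the abstract (select candidate expressions, have experts filter them, retrieve further content, repeat) can be illustrated with a minimal sketch. This is not the authors' Discriminatory Expressions algorithm: the frequency-ratio scoring below is a simple stand-in, and every name (`ngrams`, `candidate_expressions`, `expand`, `expert_filter`) and threshold is a hypothetical chosen for illustration. The human expert is modelled as a callback that returns the subset of candidates it approves.

```python
from __future__ import annotations

from collections import Counter
from typing import Callable, Iterable


def ngrams(text: str, n_max: int = 2) -> set[str]:
    """Return the set of lowercase word n-grams (n = 1..n_max) in a text."""
    tokens = text.lower().split()
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    }


def candidate_expressions(
    positive: list[str],
    negative: list[str],
    min_ratio: float = 2.0,
    min_count: int = 2,
) -> list[str]:
    """Stand-in scoring (an assumption, not the published algorithm):
    keep n-grams whose document rate in the positive class is at least
    min_ratio times their smoothed rate in the negative class."""
    pos_counts = Counter(g for doc in positive for g in ngrams(doc))
    neg_counts = Counter(g for doc in negative for g in ngrams(doc))
    kept = []
    for gram, count in pos_counts.items():
        if count < min_count:
            continue
        pos_rate = count / len(positive)
        # Add-one smoothing avoids division by zero for unseen n-grams.
        neg_rate = (neg_counts[gram] + 1) / (len(negative) + 1)
        if pos_rate / neg_rate >= min_ratio:
            kept.append(gram)
    return sorted(kept)


def expand(
    seed_positive: Iterable[str],
    negative: list[str],
    unlabeled: Iterable[str],
    expert_filter: Callable[[list[str]], Iterable[str]],
    rounds: int = 3,
) -> tuple[list[str], list[str]]:
    """Alternate between expression selection, expert filtering, and
    retrieval of further documents that contain an approved expression."""
    positive = list(seed_positive)
    pool = list(unlabeled)
    accepted: set[str] = set()
    for _ in range(rounds):
        candidates = candidate_expressions(positive, negative)
        new_expressions = set(expert_filter(candidates)) - accepted
        accepted |= new_expressions
        retrieved = [doc for doc in pool if ngrams(doc) & accepted]
        if not new_expressions and not retrieved:
            break  # nothing changed in this round, so stop iterating
        positive.extend(retrieved)
        pool = [doc for doc in pool if doc not in retrieved]
    return sorted(accepted), positive
```

In a real setting `expert_filter` would present the candidates to human annotators; keeping it as a plain callback makes the expert step explicit and auditable, in line with the transparency goal the abstract emphasises.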
Source journal (Pragmatics and Society): CiteScore 1.40; self-citation rate 0.00%; articles published: 42