Detecting Multipliers of Jihadism on Twitter

2015 IEEE International Conference on Data Mining Workshop (ICDMW), Pub Date: 2015-11-14, DOI: 10.1109/ICDMW.2015.9

Lisa Kaati, Enghin Omer, Nico Prucha, A. Shrestha

Citations: 59

Abstract

Detecting terrorist-related content on social media is a problem for law enforcement agencies because of the large amount of information available. This work aims to detect tweeps who are involved in the media mujahideen, the supporters of jihadist groups who disseminate propaganda content online. To do this, we use a machine learning approach with two sets of features: data-dependent features and data-independent features. The data-dependent features are heavily influenced by the specific dataset, while the data-independent features do not depend on the dataset and can be used on other datasets with similar results. We hope that this approach can serve as a baseline for classifying violent extremist content from different kinds of sources, since data-dependent features from various domains can be added. In our experiments we used the AdaBoost classifier. The results show that our approach works very well for classifying English tweeps and English tweets, but it does not perform as well on Arabic data.
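The abstract does not list the actual features or the AdaBoost configuration, so the following is only a minimal sketch of the general idea: concatenate a data-independent feature group with a data-dependent one and train AdaBoost over decision stumps. The feature names (`hashtags`, `mentions`, `keyword_hits`), the toy values, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math

def combine(independent, dependent):
    """Concatenate the two feature groups into one row (hypothetical layout)."""
    return list(independent) + list(dependent)

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign

def adaboost_fit(X, y, rounds=5):
    """Discrete AdaBoost with labels in {+1, -1}."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clip the error so alpha stays finite when a stump is perfect.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified rows, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy rows: (hashtags, mentions) are assumed data-independent features,
# (keyword_hits,) is an assumed data-dependent feature. Values are made up.
X = [
    combine((3, 1), (4,)),
    combine((2, 0), (5,)),
    combine((4, 2), (3,)),
    combine((1, 1), (0,)),
    combine((3, 0), (1,)),
    combine((0, 2), (0,)),
]
y = [1, 1, 1, -1, -1, -1]

model = adaboost_fit(X, y)
accuracy = sum(adaboost_predict(model, r) == t for r, t in zip(X, y)) / len(X)
print("training accuracy:", accuracy)
```

Keeping the two groups separate until `combine` means a different data-dependent group could be swapped in for another domain while the data-independent group is reused, which is the kind of reuse the abstract describes.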
