Authorship Authentication of Short Messages from Social Networks Using Recurrent Artificial Neural Networks

Southeast Europe Journal of Soft Computing, Pub Date: 2018-11-28, DOI: 10.21533/SCJOURNAL.V7I2.163

N. M. Demir

Citations: 0

Abstract

The dataset consists of 17,000 tweets collected from Twitter: 500 tweets for each of 34 authors who meet certain criteria. The raw data were collected with the NVivo software and preprocessed to extract the frequencies of 200 features. During data analysis, 128 of these features were eliminated because they occur rarely in tweets, leaving 72. For a progressive presentation, experiments were run with 5, 10, 15, 20, 30, and all 34 of these authors. Recurrent artificial neural networks were preferred in this work because they are more stable and their iterations converge more quickly. Since ANNs are generally more successful at distinguishing two classes, N×N neural networks are trained for pairwise classification of N authors. These N×N experts are then organized into N special teams (CANNT) that aggregate the experts' decisions. The number of authors has little effect on authentication accuracy: around 80% accuracy is achieved for any number of authors.
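The pairwise-expert arrangement described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how a team combines its experts' decisions, so the averaging rule below is an assumption, as are all names (`team_scores`, `authenticate`, `make_toy_expert`) and the toy nearest-centroid experts that stand in for the trained recurrent networks. The sketch also skips the self-pairs (a, a), so it uses N×(N−1) experts rather than N×N.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A pairwise expert for (a, b) returns a score in [0, 1]: the assumed
# confidence that the message was written by author a rather than b.
Expert = Callable[[Sequence[float]], float]


def team_scores(
    experts: Dict[Tuple[str, str], Expert],
    authors: List[str],
    x: Sequence[float],
) -> Dict[str, float]:
    """Average each author's team of pairwise experts (assumed rule)."""
    scores = {}
    for a in authors:
        votes = [experts[(a, b)](x) for b in authors if b != a]
        scores[a] = sum(votes) / len(votes)
    return scores


def authenticate(experts, authors, x) -> str:
    """Return the author whose team reports the highest average score."""
    scores = team_scores(experts, authors, x)
    return max(scores, key=scores.get)


# Toy stand-in for trained networks: each expert votes for the author
# whose (hypothetical) feature centroid is closer to the message.
CENTROIDS = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}


def make_toy_expert(a: str, b: str) -> Expert:
    def sq_dist(x, c):
        return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

    def expert(x):
        return 1.0 if sq_dist(x, CENTROIDS[a]) < sq_dist(x, CENTROIDS[b]) else 0.0

    return expert


AUTHORS = list(CENTROIDS)
EXPERTS = {(a, b): make_toy_expert(a, b) for a in AUTHORS for b in AUTHORS if a != b}

if __name__ == "__main__":
    print(authenticate(EXPERTS, AUTHORS, [0.9, 0.1]))
```

In this arrangement each author has a dedicated team, so adding an author adds one team and the pairwise experts involving that author, while the existing experts stay unchanged.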
