Authorship Authentication of Short Messages from Social Networks Using Recurrent Artificial Neural Networks
N. M. Demir
Southeast Europe Journal of Soft Computing, 2018-11-28
DOI: 10.21533/SCJOURNAL.V7I2.163 (https://doi.org/10.21533/SCJOURNAL.V7I2.163)
Abstract
The dataset consists of 17,000 tweets collected from Twitter: 500 tweets for each of 34 authors who meet certain criteria. The raw data were collected with the NVivo software and preprocessed to extract the frequencies of 200 features. During data analysis, 128 of these features were eliminated because they occur rarely in tweets, leaving 72. To present the results progressively, subsets of 5, 10, 15, 20, and 30 authors, and finally all 34, are selected in turn. A recurrent artificial neural network architecture is preferred in this work because such networks are more stable and their iterations converge more quickly. Because ANNs are generally more successful at distinguishing two classes, N×N neural networks are trained for pairwise classification of N authors. These N×N experts are then organized into N special teams (CANNT) that aggregate their decisions. The number of authors has little effect on authentication accuracy: around 80% accuracy is achieved for every number of authors tested.
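The pairwise-expert and team structure described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. Several parts are assumptions: each expert here is a nearest-centroid rule standing in for a recurrent neural network, the diagonal pairs (i, i) are skipped as trivial, and each team simply counts the votes its experts cast for its own author, since the abstract does not state the CANNT aggregation rule. The names `train_pairwise_experts`, `team_scores`, and `predict_author`, and the toy data, are hypothetical.

```python
def centroid(rows):
    """Mean feature vector of a list of equal-length feature rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_pairwise_experts(data):
    """Build one two-class expert per ordered author pair (i, j), i != j.

    Assumption: a nearest-centroid rule replaces the recurrent network,
    so an expert is just the pair of centroids it compares.
    """
    cents = {author: centroid(rows) for author, rows in data.items()}
    return {(i, j): (cents[i], cents[j])
            for i in data for j in data if i != j}


def team_scores(experts, authors, x):
    """Team i holds the experts (i, j); its score is the number of those
    experts that pick author i for the feature vector x (assumed rule)."""
    scores = {}
    for i in authors:
        votes = 0
        for j in authors:
            if i == j:
                continue
            cent_i, cent_j = experts[(i, j)]
            if sq_dist(x, cent_i) < sq_dist(x, cent_j):
                votes += 1
        scores[i] = votes
    return scores


def predict_author(experts, authors, x):
    """Return the author whose team collects the most votes."""
    scores = team_scores(experts, authors, x)
    return max(authors, key=lambda a: scores[a])


# Toy feature-frequency vectors for three hypothetical authors.
toy_data = {
    "a": [[0.0, 0.0], [0.0, 1.0]],
    "b": [[10.0, 10.0], [10.0, 11.0]],
    "c": [[-10.0, 5.0], [-10.0, 6.0]],
}
toy_authors = list(toy_data)
toy_experts = train_pairwise_experts(toy_data)
toy_message = [9.5, 10.0]

print(team_scores(toy_experts, toy_authors, toy_message))    # {'a': 1, 'b': 2, 'c': 0}
print(predict_author(toy_experts, toy_authors, toy_message))  # b
```

With three authors the sketch builds six non-trivial experts and three teams. With a nearest-centroid stand-in the team vote reduces to ordinary nearest-centroid classification; the point is only to show how two-class decisions are grouped per author and combined.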