Finding Participants in a Chat: Authorship Attribution for Conversational Documents
Giacomo Inches, Morgan Harvey, F. Crestani
2013 International Conference on Social Computing, published 2013-09-08
DOI: 10.1109/SOCIALCOM.2013.45 (https://doi.org/10.1109/SOCIALCOM.2013.45)
Citation count: 20
Abstract
In this work we study the problem of Authorship Attribution for a new type of document, namely online chats. Although Authorship Attribution has been extensively investigated for different document types, from books to letters and from emails to blog posts, to the best of our knowledge this is the first study of Authorship Attribution for conversational documents (IRC chat logs) using statistical models. We demonstrate experimentally that classical statistical models are unsuitable for conversational documents and propose a novel approach that achieves high accuracy (up to 95%) for hundreds of authors.