Multi-Platform Authorship Verification

Abdulaziz Altamimi, N. Clarke, S. Furnell, Fudong Li
Proceedings of the Third Central European Cybersecurity Conference, published 2019-11-14
DOI: 10.1145/3360664.3360677 (https://doi.org/10.1145/3360664.3360677)
Citations: 5

Abstract

Messaging systems such as social network messaging, text messages, email and Twitter have grown rapidly in variety and popularity, with users frequently exchanging messages across several platforms. Unfortunately, among the legitimate messages there is a host of illegitimate and inappropriate content, with cyberstalking, trolling and computer-assisted crime all taking place. There is therefore a need to identify individuals using messaging systems. Stylometry is the study of linguistic features in a text; applied to authorship verification, it checks whether a target text was written by a specific author, based on that author's writing style. Whilst much research has taken place within authorship verification, studies have focused upon single platforms and have often used limited datasets and restricted methodologies, making it difficult to appreciate the real-world value of the approach. This paper seeks to overcome these limitations by providing an analysis of authorship verification across four common messaging systems. This approach enables a direct comparison of recognition performance and provides a basis for analyzing the feature vectors across platforms, to better understand which features each platform capitalizes upon to achieve good classification. The experiments also investigate feature vector creation, comparing population-based and user-based techniques. The experiment involved 50 participants across four common platforms, with a total of 13,617, 106,359, 4,539 and 6,540 samples for Twitter, SMS, Facebook and Email, achieving Equal Error Rates (EER) of 20.16%, 7.97%, 25% and 13.11% respectively.
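The results above are reported as Equal Error Rates. As a rough illustration of what that metric measures, the sketch below computes an EER from two lists of verification scores: the rate at the threshold where the false acceptance rate (impostor texts accepted) and the false rejection rate (genuine texts rejected) are closest. The function names, the "accept when score >= threshold" convention and the example scores are all assumptions made for illustration; the abstract does not describe the paper's classifier or how its scores were produced.

```python
from typing import Sequence, Tuple


def far_frr(genuine: Sequence[float], impostor: Sequence[float],
            threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when a sample is accepted if score >= threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine: Sequence[float],
                     impostor: Sequence[float]) -> Tuple[float, float]:
    """Approximate the EER by sweeping every observed score as a threshold.

    Returns (eer, threshold), where eer is the mean of FAR and FRR at the
    threshold that minimises |FAR - FRR|.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    best_gap = None
    best = (1.0, 0.0)
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best = ((far + frr) / 2.0, t)
    return best


# Made-up similarity scores, higher meaning "more likely the claimed author".
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

A lower EER means the two score distributions overlap less, which is the sense in which the 7.97% reported for SMS is a stronger result than the 25% reported for Facebook.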