Shared Multi-Keyboard and Bilingual Datasets to Support Keystroke Dynamics Research

A. Wahab, Daqing Hou, M. Banavar, S. Schuckers, Kenneth Eaton, Jacob Baldwin, Robert Wright
Published in: Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy
Publication date: 2022-04-14
DOI: 10.1145/3508398.3511516 (https://doi.org/10.1145/3508398.3511516)
Citation count: 5

Abstract

Keystroke dynamics has been shown to be a promising method for user authentication based on a user's typing rhythms. Over the years, it has seen growing use in applications such as preventing transaction fraud, account takeovers, and identity theft. However, because keystroke dynamics is inherently variable, a user's typing patterns may differ on a different keyboard or in a different keyboard language setting, which may affect system accuracy. In other words, an algorithm modeled with data collected using a mechanical keyboard may perform significantly differently when tested with an ergonomic keyboard. Similarly, an algorithm modeled with data collected in one language may perform significantly differently when tested with another language. Hence, there is a need to study the impact of multiple keyboards and multiple languages on keystroke dynamics performance. This motivated us to develop two free-text keystroke dynamics datasets. The first is a multi-keyboard keystroke dataset covering four physical keyboards (mechanical, ergonomic, membrane, and laptop), and the second is a bilingual keystroke dataset in English and Chinese. Data were collected from a total of 86 participants using a non-intrusive web-based keylogger in a semi-controlled setting. To the best of our knowledge, these are the first multi-keyboard and bilingual keystroke datasets, together with the data collection software, to be made publicly available for research purposes. We demonstrate the usefulness of the datasets by evaluating the performance of two state-of-the-art free-text algorithms.
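
To make "typing rhythms" concrete, the sketch below computes two timing features that are commonly used in keystroke dynamics work: the hold time of each key and the press-to-press latency of each consecutive key pair (digraph). It is illustrative only. The KeyEvent record, its field names, and the millisecond timestamps are assumptions made for this example; they are not the format of the datasets described above, and the features are not necessarily those used by the two evaluated algorithms.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class KeyEvent:
    """Hypothetical record for one keystroke (not the datasets' actual schema)."""
    key: str           # key label, e.g. "t"
    press_ms: float    # time the key went down, in milliseconds
    release_ms: float  # time the key came up, in milliseconds


def hold_times(events: List[KeyEvent]) -> List[float]:
    """Hold time of each key: release time minus press time."""
    return [e.release_ms - e.press_ms for e in events]


def digraph_latencies(events: List[KeyEvent]) -> Dict[Tuple[str, str], List[float]]:
    """Press-to-press latency for each pair of consecutive keys.

    Latencies are grouped by the (first key, second key) pair, so repeated
    digraphs in free text accumulate several samples.
    """
    latencies: Dict[Tuple[str, str], List[float]] = {}
    for first, second in zip(events, events[1:]):
        pair = (first.key, second.key)
        latencies.setdefault(pair, []).append(second.press_ms - first.press_ms)
    return latencies


if __name__ == "__main__":
    # Made-up timings for typing "the"; not taken from any real participant.
    sample = [
        KeyEvent("t", 0.0, 80.0),
        KeyEvent("h", 120.0, 190.0),
        KeyEvent("e", 250.0, 330.0),
    ]
    print(hold_times(sample))
    print(digraph_latencies(sample))
```

Under these assumptions, a cross-keyboard or cross-language comparison would amount to computing such features separately per keyboard or per language setting and checking how far a user's distributions shift between conditions.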