Scribe

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IF 4.5, Q2, Computer Science, Information Systems). Pub Date: 2024-01-12. DOI: 10.1145/3631411

Yang Bai, Irtaza Shahid, Harshvardhan Takawale, Nirupam Roy


Citations: 0

Abstract

This paper presents the design and implementation of Scribe, a comprehensive voice processing and handwriting interface for voice assistants. Unlike prior work, Scribe is a precise tracking interface that can coexist with the voice interface on voice assistants with low sampling rates. Scribe can be used for 3D free-form drawing, writing, and motion tracking for gaming. Taking handwriting as a specific application, it can also capture natural strokes and individual writing style while occupying only a single frequency. The core technique includes an accurate acoustic ranging method called Cross Frequency Continuous Wave (CFCW) sonar, which lets voice assistants use ultrasound as the ranging signal while using their regular microphone system as the receiver. We also design a new optimization algorithm that requires only a single frequency to estimate time difference of arrival. The Scribe prototype achieves a median error of 73 μm for 1D ranging and 1.4 mm for 3D tracking of an acoustic beacon with the microphone array used in voice assistants. Our implementation of an in-air handwriting interface achieves 94.1% accuracy with automatic handwriting-to-text software, close to the accuracy of writing on paper (96.6%). At the same time, the error rate of voice-based user authentication increases only from 6.26% to 8.28%.
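To make the single-frequency time-difference-of-arrival idea concrete, the sketch below shows the textbook relation between the phase difference of one tone at two microphones and the corresponding arrival-time and path-length difference. It is not the paper's CFCW method or its optimization algorithm, which the abstract does not detail. The function names, the 20 kHz tone, and the 343 m/s speed of sound are assumptions chosen for illustration only.

```python
import math

# Assumed speed of sound in air at room temperature (not taken from the paper).
SPEED_OF_SOUND_M_S = 343.0


def phase_to_tdoa(phase_diff_rad: float, freq_hz: float) -> float:
    """Convert a phase difference between two microphones into a TDoA in seconds.

    A single tone only resolves delays within half a period either way;
    larger delays wrap around, so the phase is first wrapped into (-pi, pi].
    """
    wrapped = math.atan2(math.sin(phase_diff_rad), math.cos(phase_diff_rad))
    return wrapped / (2.0 * math.pi * freq_hz)


def tdoa_to_range_diff(tdoa_s: float, speed_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Convert a TDoA into the path-length difference between the two microphones."""
    return tdoa_s * speed_m_s


if __name__ == "__main__":
    tone_hz = 20_000.0  # hypothetical ultrasonic beacon frequency
    tdoa = phase_to_tdoa(math.pi / 2, tone_hz)
    print(f"TDoA: {tdoa * 1e6:.2f} us")
    print(f"Range difference: {tdoa_to_range_diff(tdoa) * 1e3:.3f} mm")
```

Because the phase wraps every period, a single tone on its own leaves the delay ambiguous by whole periods; resolving that ambiguity is the kind of problem an optimization step over several microphones would have to address.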
