Towards More Robust Speech Interactions for Deaf and Hard of Hearing Users

Raymond Fok, Harmanpreet Kaur, Skanda Palani, Martez E. Mott, Walter S. Lasecki
DOI: 10.1145/3234695.3236343
Published in: Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility
Publication date: 2018-10-08
Citations: 20

Abstract

Mobile, wearable, and other ubiquitous computing devices are increasingly creating a context in which conventional keyboard and screen-based inputs are being replaced in favor of more natural speech-based interactions. Digital personal assistants use speech to control a wide range of functionality, from environmental controls to information access. However, many deaf and hard-of-hearing users have speech patterns that vary from those of hearing users due to incomplete acoustic feedback from their own voices. Because automatic speech recognition (ASR) systems are largely trained using speech from hearing individuals, speech-controlled technologies are typically inaccessible to deaf users. Prior work has focused on providing deaf users access to aural output via real-time captioning or signing, but little has been done to improve users' ability to provide input to these systems' speech-based interfaces. Further, the vocalization patterns of deaf speech often make accurate recognition intractable for both automated systems and human listeners, making traditional approaches to mitigate ASR limitations, such as human captionists, less effective. To bridge this accessibility gap, we investigate the limitations of common speech recognition approaches and techniques---both automatic and human-powered---when applied to deaf speech. We then explore the effectiveness of an iterative crowdsourcing workflow, and characterize the potential for groups to collectively exceed the performance of individuals. This paper contributes a better understanding of the challenges of deaf speech recognition and provides insights for future system development in this space.
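The abstract's claim that groups can collectively exceed the performance of individual transcribers can be illustrated with a minimal sketch. The snippet below is a hypothetical, simplified version of that idea: it merges several crowd transcriptions of the same utterance by majority vote at each word position. It assumes equal-length, pre-aligned word sequences; real systems (such as ROVER-style combination) must first align transcripts of differing lengths, and the paper's actual iterative workflow is more involved than this.

```python
from collections import Counter

def merge_transcripts(transcripts):
    """Merge pre-aligned, equal-length crowd transcriptions by
    per-position majority vote (a toy illustration, not the
    paper's workflow)."""
    merged = []
    for words_at_position in zip(*(t.split() for t in transcripts)):
        # Keep the word most workers agreed on at this position.
        word, _count = Counter(words_at_position).most_common(1)[0]
        merged.append(word)
    return " ".join(merged)

# Three hypothetical workers transcribe the same spoken command;
# each makes a different single-word error.
guesses = [
    "turn on the lights",
    "turn on the nights",
    "burn on the lights",
]
print(merge_transcripts(guesses))  # → "turn on the lights"
```

Even though no individual transcript is fully correct, the vote recovers the intended command, which is the intuition behind the collective-exceeds-individual result the abstract characterizes.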