语音和免提交互:神话、挑战和机遇

Cosmin Munteanu, Gerald Penn
{"title":"语音和免提交互:神话、挑战和机遇","authors":"Cosmin Munteanu, Gerald Penn","doi":"10.1145/3098279.3119919","DOIUrl":null,"url":null,"abstract":"HCI research has for long been dedicated to better and more naturally facilitating information transfer between humans and machines. Unfortunately, humans' most natural form of communication, speech, is also one of the most difficult modalities to be understood by machines - despite, and perhaps, because it is the highest-bandwidth communication channel we possess. While significant research efforts, from engineering, to linguistic, and to cognitive sciences, have been spent on improving machines' ability to understand speech, the MobileHCI community (and the HCI field at large) has been relatively timid in embracing this modality as a central focus of research. This can be attributed in part to the unexpected variations in error rates when processing speech, in contrast with often-unfounded claims of success from industry, but also to the intrinsic difficulty of designing and especially evaluating speech and natural language interfaces. As such, the development of interactive speech-based systems is mostly driven by engineering efforts to improve such systems with respect to largely arbitrary performance metrics. Such developments have often been void of any user-centered design principles or consideration for usability or usefulness. The goal of this course is to inform the MobileHCI community of the current state of speech and natural language research, to dispel some of the myths surrounding speech-based interaction, as well as to provide an opportunity for researchers and practitioners to learn more about how speech recognition and speech synthesis work, what are their limitations, and how they could be used to enhance current interaction paradigms. Through this, we hope that HCI researchers and practitioners will learn how to combine recent advances in speech processing with user-centred principles in designing more usable and useful speech-based interactive systems.","PeriodicalId":120153,"journal":{"name":"Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2017-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Speech and Hands-free interaction: myths, challenges, and opportunities\",\"authors\":\"Cosmin Munteanu, Gerald Penn\",\"doi\":\"10.1145/3098279.3119919\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"HCI research has for long been dedicated to better and more naturally facilitating information transfer between humans and machines. Unfortunately, humans' most natural form of communication, speech, is also one of the most difficult modalities to be understood by machines - despite, and perhaps, because it is the highest-bandwidth communication channel we possess. While significant research efforts, from engineering, to linguistic, and to cognitive sciences, have been spent on improving machines' ability to understand speech, the MobileHCI community (and the HCI field at large) has been relatively timid in embracing this modality as a central focus of research. This can be attributed in part to the unexpected variations in error rates when processing speech, in contrast with often-unfounded claims of success from industry, but also to the intrinsic difficulty of designing and especially evaluating speech and natural language interfaces. As such, the development of interactive speech-based systems is mostly driven by engineering efforts to improve such systems with respect to largely arbitrary performance metrics. Such developments have often been void of any user-centered design principles or consideration for usability or usefulness. The goal of this course is to inform the MobileHCI community of the current state of speech and natural language research, to dispel some of the myths surrounding speech-based interaction, as well as to provide an opportunity for researchers and practitioners to learn more about how speech recognition and speech synthesis work, what are their limitations, and how they could be used to enhance current interaction paradigms. Through this, we hope that HCI researchers and practitioners will learn how to combine recent advances in speech processing with user-centred principles in designing more usable and useful speech-based interactive systems.\",\"PeriodicalId\":120153,\"journal\":{\"name\":\"Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-09-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3098279.3119919\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3098279.3119919","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

摘要

长期以来,人机交互研究一直致力于更好、更自然地促进人与机器之间的信息传递。不幸的是,人类最自然的交流形式——语言,也是机器最难理解的形式之一——尽管,也许,因为它是我们拥有的最高带宽的交流渠道。虽然从工程学、语言学到认知科学的重大研究努力都花在了提高机器理解语音的能力上,但MobileHCI社区(以及整个HCI领域)在将这种模式作为研究的中心焦点方面一直相对胆怯。这可以部分归因于处理语音时错误率的意外变化,这与工业界经常毫无根据的成功声明形成对比,但也归因于设计,特别是评估语音和自然语言界面的内在困难。因此,基于语音的交互式系统的开发主要是由工程努力驱动的,目的是根据很大程度上任意的性能指标来改进此类系统。这样的开发通常缺乏任何以用户为中心的设计原则或对可用性或有用性的考虑。本课程的目标是向MobileHCI社区介绍语音和自然语言研究的现状,消除围绕基于语音的交互的一些神话,并为研究人员和从业者提供机会,了解更多关于语音识别和语音合成如何工作,它们的局限性是什么,以及如何使用它们来增强当前的交互范式。通过这项研究,我们希望HCI研究人员和实践者能够学习如何将语音处理的最新进展与以用户为中心的原则结合起来,设计出更可用和有用的基于语音的交互系统。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Speech and Hands-free interaction: myths, challenges, and opportunities
HCI research has for long been dedicated to better and more naturally facilitating information transfer between humans and machines. Unfortunately, humans' most natural form of communication, speech, is also one of the most difficult modalities to be understood by machines - despite, and perhaps, because it is the highest-bandwidth communication channel we possess. While significant research efforts, from engineering, to linguistic, and to cognitive sciences, have been spent on improving machines' ability to understand speech, the MobileHCI community (and the HCI field at large) has been relatively timid in embracing this modality as a central focus of research. This can be attributed in part to the unexpected variations in error rates when processing speech, in contrast with often-unfounded claims of success from industry, but also to the intrinsic difficulty of designing and especially evaluating speech and natural language interfaces. As such, the development of interactive speech-based systems is mostly driven by engineering efforts to improve such systems with respect to largely arbitrary performance metrics. Such developments have often been void of any user-centered design principles or consideration for usability or usefulness. The goal of this course is to inform the MobileHCI community of the current state of speech and natural language research, to dispel some of the myths surrounding speech-based interaction, as well as to provide an opportunity for researchers and practitioners to learn more about how speech recognition and speech synthesis work, what are their limitations, and how they could be used to enhance current interaction paradigms. Through this, we hope that HCI researchers and practitioners will learn how to combine recent advances in speech processing with user-centred principles in designing more usable and useful speech-based interactive systems.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Improving pocket paint usability via material design compliance and internationalization & localization support on application level CapSoles: who is walking on what kind of floor? Exploring the feasibility of subliminal priming on smartphones Visual, auditory and haptic navigation feedbacks among older pedestrians Usability of different types of commercial selfie sticks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1