Utilizing Prior Knowledge to Improve Automatic Speech Recognition in Human-Robot Interactive Scenarios

ACM Transactions on Human-Robot Interaction (IF 4.2, JCR Q2, Robotics) | Pub Date: 2023-03-13 | DOI: 10.1145/3568294.3580129

Pradip Pramanick, Chayan Sarkar

Citations: 0

Abstract

The prolificacy of human-robot interaction not only depends on a robot's ability to understand the intent and content of the human utterance but also gets impacted by the automatic speech recognition (ASR) system. Modern ASR can provide highly accurate (grammatically and syntactically) translation. Yet, the general purpose ASR often misses out on the semantics of the translation by incorrect word prediction due to open-vocabulary modeling. ASR inaccuracy can have significant repercussions as this can lead to a completely different action by the robot in the real world. Can any prior knowledge be helpful in such a scenario? In this work, we explore how prior knowledge can be utilized in ASR decoding. Using our experiments, we demonstrate how our system can significantly improve ASR translation for robotic task instruction.
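
The abstract does not say how prior knowledge enters the decoder, so the following is only an illustrative sketch of one common idea, not the authors' method: re-ranking an n-best list so that hypotheses containing words the robot already knows (for example, objects in its environment) are preferred. The `Hypothesis` class, the `rescore` function, the `bonus` weight, and all scores and word lists are hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    text: str        # candidate transcription
    log_prob: float  # score from a hypothetical general-purpose ASR decoder


def rescore(hypotheses, known_words, bonus=1.0):
    """Sort hypotheses by ASR score plus a fixed bonus per known word."""
    def biased_score(h):
        matches = sum(1 for w in h.text.lower().split() if w in known_words)
        return h.log_prob + bonus * matches
    return sorted(hypotheses, key=biased_score, reverse=True)


# Hypothetical n-best list: the acoustically preferred candidate says "cap".
n_best = [
    Hypothesis("take the cap to the table", -4.1),
    Hypothesis("take the cup to the table", -4.3),
]

# Hypothetical prior knowledge: objects the robot knows are in the scene.
known_objects = {"cup", "table", "bottle"}

best = rescore(n_best, known_objects)[0]
print(best.text)
```

With an empty `known_objects` set the ranking falls back to the raw ASR scores, which makes the effect of the prior easy to isolate when experimenting with the `bonus` weight.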

Source journal

ACM Transactions on Human-Robot Interaction (Computer Science: Artificial Intelligence)

CiteScore: 7.70

Self-citation rate: 5.90%

Journal description: ACM Transactions on Human-Robot Interaction (THRI) is a prestigious Gold Open Access journal that aspires to lead the field of human-robot interaction as a top-tier, peer-reviewed, interdisciplinary publication. The journal prioritizes articles that significantly contribute to the current state of the art, enhance overall knowledge, have a broad appeal, and are accessible to a diverse audience. Submissions are expected to meet a high scholarly standard, and authors are encouraged to ensure their research is well-presented, advancing the understanding of human-robot interaction, adding cutting-edge or general insights to the field, or challenging current perspectives in this research domain. THRI warmly invites well-crafted paper submissions from a variety of disciplines, encompassing robotics, computer science, engineering, design, and the behavioral and social sciences. The scholarly articles published in THRI may cover a range of topics such as the nature of human interactions with robots and robotic technologies, methods to enhance or enable novel forms of interaction, and the societal or organizational impacts of these interactions. The editorial team is also keen on receiving proposals for special issues that focus on specific technical challenges or that apply human-robot interaction research to further areas like social computing, consumer behavior, health, and education.

Latest articles from this journal

Towards an Integrative Framework for Robot Personality Research
Effortless Polite Telepresence using Intention Recognition
Introduction to the Special Issue on Sound in Human-Robot Interaction
Variable Autonomy Through Responsible Robotics: Design Guidelines and Research Agenda
The Power of Robot-mediated Play: Forming Friendships and Expressing Identity