
2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN): Latest Publications

How Many Robots Do You Want? A Cross-Cultural Exploration on User Preference and Perception of an Assistive Multi-Robot System
Pub Date : 2021-08-08 DOI: 10.1109/RO-MAN50785.2021.9515396
Mengni Zhang, Tong Xu, Jackson Hardin, Jilly Jiaqi Cai, J. Brooks, K. Green
There has been increased development of assistive robots for the home, along with empirical studies assessing cultural differences in user perception. However, little attention has been paid to cultural differences with respect to non-humanoid, multi-robot interactions, in homes or otherwise. In this exploratory paper, we investigate how cultural differences may impact users’ preferences and perceived usefulness of a multi-robot system by creating an interactive online survey and considering variables often absent in HRI studies. We introduce our multi-robot design and survey construction, and report results evaluated across 191 young adult participants from China, India, and the USA. We find significant effects of culture on both participants’ preferences and perceived usefulness of the system between India and China or the USA, but not between China and the USA. We also find the effects of culture on perceived usefulness to be partially mediated by participant preferences. Our findings reinforce the importance of considering cultural differences in designing domestic multi-robotic assistants.
{"title":"How Many Robots Do You Want? A Cross-Cultural Exploration on User Preference and Perception of an Assistive Multi-Robot System","authors":"Mengni Zhang, Tong Xu, Jackson Hardin, Jilly Jiaqi Cai, J. Brooks, K. Green","doi":"10.1109/RO-MAN50785.2021.9515396","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515396","url":null,"abstract":"There has been increased development of assistive robots for the home, along with empirical studies assessing cultural differences on user perception. However, little attention has been paid to cultural differences with respect to non- humanoid, multi-robot interactions in homes or otherwise. In this exploratory paper, we investigate how cultural differences may impact users’ preferences and perceived usefulness of a multi-robot system by creating an interactive online survey and considering variables often absent in HRI studies. We introduce our multi-robot design and survey construction, and report results evaluated across 191 young adult participants from China, India, and the USA. We find significant effects of culture on both participants’ preferences and perceived usefulness of the system between India and China or the USA, but not between China and the USA. We also find effects of culture on perceived usefulness to be partially mediated by participant preferences. 
Our findings reinforce the importance of considering cultural differences in designing domestic multi-robotic assistants.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"60 1","pages":"580-585"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75167567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Design of Polycentric Assistive Device for Knee Joint*
Pub Date : 2021-08-08 DOI: 10.1109/RO-MAN50785.2021.9515529
T. Kikuchi, Takumi Nishimura, K. Fukuoka, Takeru Todaka, Isao Abe
Movements of knee joints are relative motions between the tibia and femur, which include rolling and sliding. Conventional wearable knee assistive devices utilize a hinge joint and generate non-negligible mismatched motions under deep flexion. To assist knee motion including deep flexion, we developed a polycentric assistive device for knee joints (PAD-KJ) using two gears with the same module and different radii. These radii were designed so that the center of the smaller gear moves along the trajectory of the knee joint with small error. According to the results of the parametric design, the minimum error relative to the model knee-joint trajectory was about 3 mm. We also propose a torque generator for this joint. The combination of a lever arm and a linear spring with the pair of gears successfully generated assistive torque for standing motions. According to an evaluation test, the maximum torque of a single unit was about 1.2 Nm, and the assistive torque in real usage is expected to be 4.8 Nm.
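The reported torque figures can be sanity-checked with a simple spring-lever model. The sketch below is our own illustration, not the authors' design calculation: the stiffness, deflection, and lever-arm values are hypothetical, chosen only so that one unit produces roughly the reported 1.2 Nm maximum, and the factor of four (matching the 4.8 Nm "in real usage") is our assumption about how many units act together.

```python
def assistive_torque(k_n_per_m: float, deflection_m: float, lever_arm_m: float) -> float:
    """Torque [Nm] from a linear-spring force (k * x) acting through a lever arm r."""
    return k_n_per_m * deflection_m * lever_arm_m

# Hypothetical values chosen so one unit yields the reported ~1.2 Nm maximum.
tau = assistive_torque(k_n_per_m=2000.0, deflection_m=0.02, lever_arm_m=0.03)
print(round(tau, 2))      # single-unit torque
# If four such units act together (our assumption), the total matches the
# reported 4.8 Nm:
print(round(4 * tau, 2))
```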
{"title":"Design of Polycentric Assistive Device for Knee Joint*","authors":"T. Kikuchi, Takumi Nishimura, K. Fukuoka, Takeru Todaka, Isao Abe","doi":"10.1109/RO-MAN50785.2021.9515529","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515529","url":null,"abstract":"Movements of knee joints are relative motions between tibia and femur, which include rolling and sliding. Conventional wearable knee assistive device utilized a hinge joint and generates nonnegligible mismatched motions under deep flexions. To assist the knee motion including the deep flexion, we developed a polycentric assistive device for knee joints (PAD-KJ) by using two gears with the same module and different radii. These radii were designed so that the center of the smaller gear moves on the trajectory of the knee joint with small error. According to the results of parametric design, the minimum error was about 3 mm to the model trajectory of knee joint. We also proposed a torque generator for this joint. The combination of a lever arm and a linear spring with the pair of gears successfully generated the assistive torque for standing motions. According to an evaluation test, the maximum torque of the single unit was about 1.2 Nm and the assistive torque in real usage will be 4.8 Nm.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"57 1","pages":"546-550"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74565457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Self-trainable 3D-printed prosthetic hands
Pub Date : 2021-08-08 DOI: 10.1109/RO-MAN50785.2021.9515506
Kyungho Nam, C. Crick
3D-printed prosthetics have narrowed the gap between the tens-of-thousands-of-dollars cost of traditional prosthetic designs and amputees’ needs. However, the World Health Organization estimates that only 5-15% of people can receive adequate prosthesis services [2]. To resolve the lack of prosthesis supply and reduce cost issues (for both materials and maintenance), this paper provides an overview of a self-trainable, user-customized system architecture for a 3D-printed prosthetic hand that minimizes the challenge of accessing and maintaining these supporting devices. In this paper, we develop and implement a customized behavior system that can generate any gesture users desire. The architecture provides upper-limb amputees with self-trainable software and can improve their prosthetic performance at almost no financial cost. Any unique gesture a user wants is trainable with the RBF network using 3-channel EMG sensor signals, with a 94% average success rate. This result demonstrates that applying user-customized training to the behavior of a prosthetic hand can satisfy individual user requirements in real-life activities with high performance.
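As a rough illustration of gesture classification from 3-channel EMG features, the sketch below uses scikit-learn's RBF-kernel SVM on synthetic data as a readily available stand-in for the paper's RBF network; the class structure, feature values, and noise level are all invented for the example.

```python
# Illustrative stand-in (not the authors' code): classify hand gestures from
# 3-channel EMG feature vectors using an RBF kernel on synthetic data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_gestures, samples_per_gesture = 4, 50

# Fake per-gesture EMG envelopes: each gesture has a characteristic mean
# activation level on each of the 3 channels.
centers = rng.uniform(0.1, 1.0, size=(n_gestures, 3))
X = np.vstack([c + 0.05 * rng.standard_normal((samples_per_gesture, 3)) for c in centers])
y = np.repeat(np.arange(n_gestures), samples_per_gesture)

clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
accuracy = clf.score(X, y)
print(f"training accuracy: {accuracy:.2f}")
```

With well-separated synthetic gestures the classifier fits easily; the paper's 94% figure, by contrast, is for user-defined gestures on real EMG signals.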
{"title":"Self-trainable 3D-printed prosthetic hands","authors":"Kyungho Nam, C. Crick","doi":"10.1109/RO-MAN50785.2021.9515506","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515506","url":null,"abstract":"3D printed prosthetics have narrowed the gap between the tens of thousands of dollars cost of traditional prosthetic designs and amputees’ needs. However, the World Health Organization estimates that only 5-15% of people can receive adequate prosthesis services [2]. To resolve the lack of prosthesis supply and reduce cost issues (for both materials and maintenance), this paper provides an overview of a self-trainable user-customized system architecture for a 3D printed prosthetic hand to minimize the challenge of accessing and maintaining these supporting devices. In this paper, we develop and implement a customized behavior system that can generate any gesture that users desire. The architecture provides upper limb amputees with self-trainable software and can improve their prosthetic performance at almost no financial cost. All kinds of unique gestures that users want are trainable with the RBF network using 3 channel EMG sensor signals with a 94% average success rate. 
This result demonstrates that applying user-customized training to the behavior of a prosthetic hand can satisfy individual user requirements in real-life activities with high performance.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"18 1","pages":"1196-1201"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77686946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
An interactive response strategy involving a robot avatar in a video conference system for reducing the stress of response time management in communication
Pub Date : 2021-08-08 DOI: 10.1109/RO-MAN50785.2021.9515503
Faisal Mehmood, Hamed Mahzoon, Y. Yoshikawa, H. Ishiguro
People with response time (RT) management issues usually have high situational communication apprehension and fear of negative evaluation (FNE) in face-to-face (FtF) and online interactions. Decreasing the stress related to RT management can reduce situational communication apprehension and FNE, ensuring successful communication. In this study, we propose an interactive response strategy involving a robot avatar in a video conference that can reduce the stress related to RT management in a person and make communication successful. Two types of robotic video conferencing systems (VCSs) were considered: 1) a conventional robotic VCS in which the interactive response of the robot avatar does not change with variation in the subject’s RT, and 2) a robotic VCS in which the interactive response of the robot avatar changes with variation in the subject’s RT. The situational communication apprehension measure (SCAM), FNE, sense of being attending (SoBA), and intention to use (ITU) are used as indexes for detecting a decrease in stress related to RT management of a subject in a communication scenario. A total of 51 subjects (M = 33.20, SD = 7.58 years) participated in the subjective evaluation of the web-based survey. The proposed interactive response of a robot avatar was found to have a significant effect on the stress related to RT management in a person. A significant decrease in the expected SCAM and FNE scores of subjects was observed. Furthermore, a significant increase in the expected SoBA and ITU scores of subjects was also observed. This study contributes to the literature in terms of the impact of the proposed interactive response of the robot avatar on the stress related to RT management of a person in a video conference, assessed by the SCAM, FNE, SoBA, and ITU indexes.
{"title":"An interactive response strategy involving a robot avatar in a video conference system for reducing the stress of response time management in communication","authors":"Faisal Mehmood, Hamed Mahzoon, Y. Yoshikawa, H. Ishiguro","doi":"10.1109/RO-MAN50785.2021.9515503","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515503","url":null,"abstract":"People with response time (RT) management issues usually have high situational communication apprehension, and fear of negative evaluation (FNE) in face-to-face (FtF) and online interactions. Decreasing the stress related to RT management can reduce situational communication apprehension and FNE, ensuring communication successful. In this study, we propose an interactive response strategy involving a robot avatar in a video conference that can reduce the stress related to RT management in a person and making communication successful. Two types of robotic video conferencing systems (VCSs) were considered: 1) a conventional robotic video conferencing system (VCS) where interactive response of robot avatar does not change concerning variation in RT of the subject and 2) a robotic VCS where interactive response of robot avatar changes concerning variation in RT of the subject. Situational communication apprehension measure (SCAM), FNE, sense of being attending (SoBA), and intention to use (ITU) are used as indexes for noticing the decrease in stress related to RT management of a subject in a communication scenario. A total of 51 subjects (M = 33.20, SD = 7.58 years) participated in the subjective evaluation of the web-based survey. The proposed interactive response of a robot avatar was found to have a significant effect on the stress related to RT management in a person. A significant decrease in expected SCAM and FNE scores of subjects was observed. Furthermore, a significant increase in expected SoBA and ITU scores of subjects was also observed. 
This study contributes to the literature in terms of the impact of the proposed interactive response of the robotic avatar on the stress related to RT management in a person in the video conference, assessed by SCAM, FNE, SoBA, and ITU indexes.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"22 1","pages":"969-974"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81520731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A Bilateral Dual-Arm Teleoperation Robot System with a Unified Control Architecture
Pub Date : 2021-08-08 DOI: 10.1109/RO-MAN50785.2021.9515523
Cheng Zhou, Longfei Zhao, Haitao Wang, Lipeng Chen, Yu Zheng
A teleoperation system transmits human intention to a remote robot, combining the robot’s operating performance with human intelligence. In this article, we establish a bilateral teleoperation system with force feedback from both the arm and the gripper. That is, the slave robot system provides force feedback at both the wrist and the fingers, while the master robot system renders the slave feedback force and the human interaction force and controls the slave robot accordingly. In addition, this paper proposes a framework for the robot’s four-channel bilateral teleoperation control system, covering two cases: impedance control and admittance control. Finally, single-arm/single-arm and dual-arm/dual-arm bilateral teleoperation experiments demonstrate the effectiveness of the bilateral teleoperation system and the four-channel controller architecture proposed in this paper.
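The impedance/admittance distinction the abstract draws comes down to two opposite control causalities. The sketch below is a generic textbook illustration, not the paper's four-channel controller, and the gains are arbitrary:

```python
# Two causalities for rendering contact in bilateral teleoperation:
#   impedance control:  motion in  -> force out   (F = K*dx + D*dv)
#   admittance control: force in   -> motion out  (dx = F / K, quasi-static)
def impedance_force(pos_error: float, vel_error: float, k: float = 100.0, d: float = 5.0) -> float:
    """Spring-damper force command from a motion error (impedance causality)."""
    return k * pos_error + d * vel_error

def admittance_displacement(measured_force: float, k: float = 100.0) -> float:
    """Quasi-static displacement command from a measured force (admittance causality)."""
    return measured_force / k

f = impedance_force(pos_error=0.01, vel_error=0.0)  # 1 cm error -> force command
x = admittance_displacement(measured_force=1.0)     # 1 N contact -> displacement command
print(f, x)
```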
{"title":"A Bilateral Dual-Arm Teleoperation Robot System with a Unified Control Architecture","authors":"Cheng Zhou, Longfei Zhao, Haitao Wang, Lipeng Chen, Yu Zheng","doi":"10.1109/RO-MAN50785.2021.9515523","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515523","url":null,"abstract":"The teleoperation system can transmit human intention to the remote robot, so that the system combines excellent robot operation performance and human intelligence. In this article, we have established a bilateral teleoperation system with force feedback from the arm and gripper. That is, the slave robot system can provide force feedback on both the wrist and the fingers, while the master robot system can render the slave feedback force and human interaction force, and control the slave robot accordingly. In addition, this paper also proposes the framework of the robot’s four-channel bilateral teleoperation control system, which is attributed to two situations: impedance control or admittance control. Finally, single-arm/single-arm, dual-arm/dual-arm bilateral teleoperation experiments prove the effectiveness of the bilateral teleoperation system and the four-channel controller architecture proposed in this paper.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"152 1","pages":"495-502"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81723748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Affective, Cognitive and Behavioural Engagement Detection for Human-robot Interaction in a Bartending Scenario
Pub Date : 2021-08-08 DOI: 10.1109/RO-MAN50785.2021.9515435
Alessandra Rossi, Mario Raiano, Silvia Rossi
Guaranteeing people’s engagement during an interaction is important for eliciting positive and effective emotions in public service scenarios. A robot should be able to detect its interlocutor’s level and mode of engagement and modulate its behaviours accordingly. However, there is no generally accepted model for annotating and classifying engagement during an interaction. In this work, we consider engagement as a multidimensional construct with three relevant dimensions: affective, cognitive and behavioural. To be automatically evaluated by a robot, such a complex construct requires the selection of the proper interaction features from a large set of possibilities. Moreover, manually collecting and annotating large datasets of real interactions is extremely time-consuming and costly. In this study, we collected recordings of human-robot interactions in a bartending scenario, and we compared different feature selection and regression models to find the features that characterise a user’s engagement in the interaction, and the model that can efficiently detect them. Results showed that characterising each dimension separately, in terms of features and regression, obtains better results than a model that directly combines the three dimensions.
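A minimal version of the feature-selection-plus-regression comparison could look like the following scikit-learn sketch on synthetic data. It is our illustration, not the authors' pipeline: the feature counts, the particular selector (univariate F-test) and regressor (ridge), and the idea that a few features drive one engagement dimension are all assumptions made for the example.

```python
# Toy pipeline: select the most informative interaction features, then regress
# a per-dimension engagement score (e.g. affective) on them, scoring by
# cross-validated R^2.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
n_samples, n_features = 200, 30
X = rng.standard_normal((n_samples, n_features))

# Suppose only a handful of features actually drive the score.
true_w = np.zeros(n_features)
true_w[:5] = [1.5, -2.0, 1.0, 0.5, -1.0]
y = X @ true_w + 0.1 * rng.standard_normal(n_samples)

model = make_pipeline(SelectKBest(f_regression, k=5), Ridge())
r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
print(f"mean cross-validated R^2: {r2:.3f}")
```

Repeating this per dimension (affective, cognitive, behavioural) versus once on a combined target is one way to frame the comparison the abstract reports.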
{"title":"Affective, Cognitive and Behavioural Engagement Detection for Human-robot Interaction in a Bartending Scenario","authors":"Alessandra Rossi, Mario Raiano, Silvia Rossi","doi":"10.1109/RO-MAN50785.2021.9515435","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515435","url":null,"abstract":"Guaranteeing people’s engagement during an interaction is very important to elicit positive and effective emotions in public service scenarios. A robot should be able to detect its interlocutor’s level and mode of engagement to accordingly modulate its behaviours. However, there is not a generally accepted model to annotate and classify engagement during an interaction. In this work, we consider engagement as a multidimensional construct with three relevant dimensions: affective, cognitive and behavioural. To be automatically evaluated by a robot, such a complex construct requires the selection of the proper interaction features among a large set of possibilities. Moreover, manually collecting and annotating large datasets of real interactions are extremely time-consuming and costly. In this study, we collected the recordings of human-robot interactions in a bartending scenario, and we compared different feature selection and regression models to find the features that characterise a user’s engagement in the interaction, and the model that can efficiently detect them. 
Results showed that the characterisation of each dimension separately in terms of features and regression obtains better results with respect to a model that directly combines the three dimensions.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"66 1","pages":"208-213"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80829616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Why talk to people when you can talk to robots? Far-field speaker identification in the wild
Pub Date : 2021-08-08 DOI: 10.1109/RO-MAN50785.2021.9515482
Galadrielle Humblot-Renaux, Chen Li, D. Chrysostomou
Equipping robots with the ability to identify who is talking to them is an important step towards natural and effective verbal interaction. However, speaker identification for voice control remains largely unexplored compared to recent progress in natural language instruction and speech recognition. This motivates us to tackle text-independent speaker identification for human-robot interaction applications in industrial environments. Representing audio segments as time-frequency spectrograms lets us formulate the problem as an image classification task, allowing us to apply state-of-the-art convolutional neural network (CNN) architectures. To achieve robust prediction in unconstrained, challenging acoustic conditions, we take a data-driven approach and collect a custom dataset with a far-field microphone array, featuring over 3 hours of "in the wild" audio recordings from six speakers, which are then encoded into spectral images for CNN-based classification. We propose a shallow 3-layer CNN, which we compare with the widely used ResNet-18 architecture: in addition to benchmarking these models in terms of accuracy, we visualize the features used by these two models to discriminate between classes, and investigate their reliability in unseen acoustic scenes. Although ResNet-18 reaches the highest raw accuracy, we achieve remarkable online speaker recognition performance with a much more lightweight model that learns lower-level vocal features and produces more reliable confidence scores. The proposed method is successfully integrated into a robotic dialogue system and showcased in a mock user localization and authentication scenario in a realistic industrial environment: https://youtu.be/IVtZ8LKJZ7A.
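The audio-to-spectrogram step the abstract describes can be sketched with SciPy on a synthetic tone. The paper's exact preprocessing parameters (sampling rate, window length, overlap) are not given, so the values below are assumptions:

```python
# Turn an audio segment into a log-scaled time-frequency "image" suitable as
# CNN input, using a synthetic 220 Hz tone plus noise in place of speech.
import numpy as np
from scipy import signal

fs = 16_000                                  # assumed 16 kHz sampling rate
t = np.arange(fs) / fs                       # 1 second of audio
rng = np.random.default_rng(0)
audio = np.sin(2 * np.pi * 220 * t) + 0.1 * rng.standard_normal(fs)

# STFT magnitude on a log scale: one such image per segment is what a CNN
# classifier (shallow 3-layer or ResNet-18) would consume.
freqs, times, sxx = signal.spectrogram(audio, fs=fs, nperseg=512, noverlap=256)
log_spec = 10 * np.log10(sxx + 1e-10)
print(log_spec.shape)                        # (freq_bins, time_frames)
```

The dominant row of the spectrogram sits at the tone's frequency, which is the kind of low-level spectral structure the lightweight CNN is said to exploit.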
{"title":"Why talk to people when you can talk to robots? Far-field speaker identification in the wild","authors":"Galadrielle Humblot-Renaux, Chen Li, D. Chrysostomou","doi":"10.1109/RO-MAN50785.2021.9515482","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515482","url":null,"abstract":"Equipping robots with the ability to identify who is talking to them is an important step towards natural and effective verbal interaction. However, speaker identification for voice control remains largely unexplored compared to recent progress in natural language instruction and speech recognition. This motivates us to tackle text-independent speaker identification for human-robot interaction applications in industrial environments. By representing audio segments as time-frequency spectrograms, this can be formulated as an image classification task, allowing us to apply state-of-the-art convolutional neural network (CNN) architectures. To achieve robust prediction in unconstrained, challenging acoustic conditions, we take a data-driven approach and collect a custom dataset with a far-field microphone array, featuring over 3 hours of \"in the wild\" audio recordings from six speakers, which are then encoded into spectral images for CNN-based classification. We propose a shallow 3-layer CNN, which we compare with the widely used ResNet-18 architecture: in addition to benchmarking these models in terms of accuracy, we visualize the features used by these two models to discriminate between classes, and investigate their reliability in unseen acoustic scenes. Although ResNet-18 reaches the highest raw accuracy, we are able to achieve remarkable online speaker recognition performance with a much more lightweight model which learns lower-level vocal features and produces more reliable confidence scores. 
The proposed method is successfully integrated into a robotic dialogue system and showcased in a mock user localization and authentication scenario in a realistic industrial environment: https://youtu.be/IVtZ8LKJZ7A.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"134 1","pages":"272-278"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76809821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Intent-Aware Predictive Haptic Guidance and its Application to Shared Control Teleoperation
Pub Date : 2021-08-08 DOI: 10.1109/RO-MAN50785.2021.9515326
Kim Tien Ly, Mithun Poozhiyil, Harit Pandya, G. Neumann, Ayse Kucukyilmaz
This paper presents a haptic shared control paradigm that modulates the level of robotic guidance, based on predictions of human motion intentions. The proposed method incorporates robot trajectories learned from human demonstrations and dynamically adjusts the level of robotic assistance based on how closely the detected intentions match these trajectories. An experimental study is conducted to demonstrate the paradigm on a teleoperated pick-and-place task using a Franka Emika Panda robot arm, controlled via a 3D Systems Touch X haptic interface. In the experiment, the human operator teleoperates a remote robot arm while observing the environment on a 2D screen. While the human teleoperates the robot arm, the objects are tracked, and the human’s motion intentions (e.g., which object will be picked or which bin will be approached) are predicted using a Deep Q-Network (DQN). The predictions are made considering the current robot state and baseline robot trajectories that are learned from human demonstrations using Probabilistic Movement Primitives (ProMPs). The detected intentions are then used to condition the ProMP trajectories to modulate the movement and accommodate changing object configurations. Consequently, the system generates adaptive force guidance as weighted virtual fixtures that are rendered on the haptic device. The outcomes of the user study, conducted with 12 participants, indicate that the proposed paradigm can successfully guide users to robust grasping configurations and brings better performance by reducing the number of grasp attempts and improving trajectory smoothness and length.
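Conditioning a ProMP on a detected intention (e.g. a via-point at the predicted target) is standard Gaussian conditioning on the weight distribution. The toy numerical sketch below uses invented numbers, not the authors' learned model:

```python
# ProMP conditioning sketch: weights w ~ N(mu, Sigma), trajectory value at
# time t is y_t = psi_t . w. Observing a desired via-point y* at t updates
# the weight distribution in closed form.
import numpy as np

rng = np.random.default_rng(0)
n_basis = 8
mu = rng.standard_normal(n_basis)        # toy "learned-from-demonstration" mean
Sigma = np.eye(n_basis) * 0.5            # toy weight covariance

psi_t = rng.random(n_basis)              # basis activations at time t
psi_t /= psi_t.sum()
sigma_y = 1e-4                           # via-point observation noise
y_star = 1.0                             # detected intention: pass through y* at t

# Gaussian conditioning: K = Sigma psi / (psi^T Sigma psi + sigma_y)
K = Sigma @ psi_t / (psi_t @ Sigma @ psi_t + sigma_y)
mu_new = mu + K * (y_star - psi_t @ mu)
Sigma_new = Sigma - np.outer(K, psi_t) @ Sigma

print(float(psi_t @ mu_new))             # conditioned mean now lies near y*
```

The shrunken posterior covariance is what lets the system render stiffer virtual fixtures where the predicted intention is confident.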
{"title":"Intent-Aware Predictive Haptic Guidance and its Application to Shared Control Teleoperation","authors":"Kim Tien Ly, Mithun Poozhiyil, Harit Pandya, G. Neumann, Ayse Kucukyilmaz","doi":"10.1109/RO-MAN50785.2021.9515326","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515326","url":null,"abstract":"This paper presents a haptic shared control paradigm that modulates the level of robotic guidance, based on predictions of human motion intentions. The proposed method incorporates robot trajectories learned from human demonstrations and dynamically adjusts the level of robotic assistance based on how closely the detected intentions match these trajectories. An experimental study is conducted to demonstrate the paradigm on a teleoperated pick-and-place task using a Franka Emika Panda robot arm, controlled via a 3D Systems Touch X haptic interface. In the experiment, the human operator teleoperates a remote robot arm while observing the environment on a 2D screen. While the human teleoperates the robot arm, the objects are tracked, and the human’s motion intentions (e.g., which object will be picked or which bin will be approached) are predicted using a Deep Q-Network (DQN). The predictions are made considering the current robot state and baseline robot trajectories that are learned from human demonstrations using Probabilistic Movement Primitives (ProMPs). The detected intentions are then used to condition the ProMP trajectories to modulate the movement and accommodate changing object configurations. Consequently, the system generates adaptive force guidance as weighted virtual fixtures that are rendered on the haptic device. 
The outcomes of the user study, conducted with 12 participants, indicate that the proposed paradigm can successfully guide users to robust grasping configurations and brings better performance by reducing the number of grasp attempts and improving trajectory smoothness and length.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"29 1","pages":"565-572"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78646370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
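The adaptive force guidance above amounts to a weighted virtual fixture: a spring-like pull toward the ProMP reference trajectory whose stiffness scales with the intent predictor's confidence. A minimal sketch of that weighting idea — the gains, the confidence-scaling rule, and all function names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def virtual_fixture_force(x, x_ref, intent_confidence,
                          k_max=60.0, d=2.0, x_dot=None):
    """Spring-damper attraction toward a reference trajectory point.

    Stiffness is scaled by the predicted-intent confidence in [0, 1], so
    the haptic guidance strengthens as the system grows more certain
    which object the operator is reaching for, and vanishes when the
    intent is ambiguous.
    """
    k = k_max * np.clip(intent_confidence, 0.0, 1.0)  # confidence-weighted stiffness
    force = k * (x_ref - x)                           # spring pull toward reference
    if x_dot is not None:
        force -= d * x_dot                            # damping for stability
    return force

# Example: operator 2 cm off the reference along x, prediction 80% confident.
f = virtual_fixture_force(np.array([0.50, 0.10, 0.30]),
                          np.array([0.52, 0.10, 0.30]),
                          intent_confidence=0.8)
```

With an 80%-confident prediction the effective stiffness is 48 (here in assumed N/m units), giving a 0.96 N pull along x; setting `intent_confidence=0` disables guidance entirely, mirroring how shared control can fall back to pure teleoperation when the predictor is unsure.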
HOIsim: Synthesizing Realistic 3D Human-Object Interaction Data for Human Activity Recognition
Pub Date : 2021-08-08 DOI: 10.1109/RO-MAN50785.2021.9515349
Marsil Zakour, Alaeddine Mellouli, R. Chaudhari
Correct understanding of human activities is critical for meaningful assistance by robots in daily life. The development of perception algorithms and Deep Learning models of human activity requires large-scale sensor datasets. Good real-world activity data is, however, difficult and time-consuming to acquire. Several precisely calibrated and time-synchronized sensors are required, and the annotation and labeling of the collected sensor data is extremely labor intensive. To address these challenges, we present a 3D activity simulator, "HOIsim", focusing on Human-Object Interactions (HOIs). Using HOIsim, we provide a procedurally generated synthetic dataset of two sample daily life activities, "lunch" and "breakfast". The dataset contains out-of-the-box ground truth annotations in the form of human and object poses, as well as ground truth activity labels. Furthermore, we introduce methods to meaningfully randomize activity flows and the environment topology. This allows us to generate a large number of random variants of these activities in very little time. Based on an abstraction of the low-level pose data in the form of spatiotemporal graphs of HOIs, we evaluate only the generated Lunch dataset, using two Deep Learning models for activity recognition.
The first model, based on recurrent neural networks, achieves an accuracy of 87%, whereas the other, based on transformers, achieves an accuracy of 94.7%.
2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN), vol. 1, pp. 1124-1131.
Citations: 3
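The spatiotemporal-graph abstraction of HOIs described in the abstract can be sketched as a per-frame spatial graph over human joints and objects, stacked over time. The proximity rule, the threshold value, and the function names below are illustrative assumptions, not the HOIsim authors' formulation:

```python
import numpy as np

def frame_graph(joint_pos, object_pos, contact_thresh=0.1):
    """One spatial graph for a single frame.

    Nodes are human joint positions followed by object positions; an edge
    links a joint and an object when they lie within contact_thresh
    metres of each other (a hypothetical proximity rule standing in for
    whatever interaction criterion the simulator's annotations support).
    """
    nodes = np.vstack([joint_pos, object_pos])
    n_j = len(joint_pos)
    edges = []
    for j, jp in enumerate(joint_pos):
        for o, op in enumerate(object_pos):
            if np.linalg.norm(jp - op) < contact_thresh:
                edges.append((j, n_j + o))  # joint-object interaction edge
    return nodes, edges

def spatiotemporal_graph(joint_seq, object_seq, **kw):
    """Stack per-frame graphs over time; the same node index across
    consecutive frames implicitly forms the temporal links."""
    return [frame_graph(j, o, **kw) for j, o in zip(joint_seq, object_seq)]

# Example: one frame, a hand joint at the origin and a cup 5 cm away.
nodes, edges = frame_graph(np.array([[0.0, 0.0, 0.0]]),
                           np.array([[0.05, 0.0, 0.0]]))
```

A sequence of such graphs (node features plus adjacency per frame) is the kind of input a recurrent or transformer-based activity classifier like the two models compared in the paper could consume.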
What’s in a face? The Effect of Faces in Human Robot Interaction
Pub Date : 2021-08-08 DOI: 10.1109/RO-MAN50785.2021.9515378
Neelu Gurung, J. B. Grant, D. Herath
The face is the most influential feature in any interaction. This study investigated the effect of robot faces in embodied interactions on a range of subjective and objective factors. The platform used to answer the research question was an interactive robotic art installation, incorporating a robot arm that could be presented with or without a face displayed on an attached screen. Participants were exposed to one of three conditions – the robot arm only, the robot arm with a static face displayed, and the robot arm with a dynamic face displayed. We used the Godspeed Questionnaire to determine whether the different embodiments would be perceived differently on measures of likeability, animacy, and safety before and after the interaction. We also measured how close participants stood to the robot and how much time they spent interacting with the robot. We found that the three embodiments did not significantly differ in time spent, distance, animacy, likeability, or safety before and after the interaction. This surprising result hints at other possible reasons that influence the success of a robot interaction and advances the understanding of the effect of faces in human-robot interaction.
This surprising result hints at other possible reasons that influence the success of a robot interaction and advances the understanding of the effect of faces in human-robot interaction.
2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN), vol. 1, pp. 1024-1030.
Citations: 0
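The Godspeed Questionnaire used in this study scores robots on 5-point semantic-differential items grouped into subscales such as animacy, likeability, and perceived safety. A minimal scoring sketch — the item names and groupings below are abbreviated, illustrative placeholders rather than the full published instrument:

```python
def godspeed_scores(ratings):
    """Average 5-point semantic-differential items into subscale scores.

    `ratings` maps an item name (e.g. "dead_alive" for the dead-alive
    anchor pair) to a rating in 1..5. The groupings below are a reduced,
    illustrative subset of the Godspeed subscales.
    """
    subscales = {
        "animacy": ["dead_alive", "stagnant_lively", "mechanical_organic"],
        "likeability": ["dislike_like", "unfriendly_friendly", "unkind_kind"],
        "perceived_safety": ["anxious_relaxed", "agitated_calm"],
    }
    return {name: sum(ratings[item] for item in items) / len(items)
            for name, items in subscales.items()}

# Example: a participant who rated every item 4 out of 5.
scores = godspeed_scores({item: 4 for item in [
    "dead_alive", "stagnant_lively", "mechanical_organic",
    "dislike_like", "unfriendly_friendly", "unkind_kind",
    "anxious_relaxed", "agitated_calm"]})
```

Comparing such subscale means before and after the interaction, across the three embodiment conditions, is the kind of analysis the study reports.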