Pub Date: 2021-08-08 | DOI: 10.1109/RO-MAN50785.2021.9515396
Mengni Zhang, Tong Xu, Jackson Hardin, Jilly Jiaqi Cai, J. Brooks, K. Green
There has been increased development of assistive robots for the home, along with empirical studies assessing cultural differences in user perception. However, little attention has been paid to cultural differences with respect to non-humanoid, multi-robot interactions, in homes or otherwise. In this exploratory paper, we investigate how cultural differences may impact users’ preferences and perceived usefulness of a multi-robot system by creating an interactive online survey and considering variables often absent in HRI studies. We introduce our multi-robot design and survey construction, and report results evaluated across 191 young adult participants from China, India, and the USA. We find significant effects of culture on both participants’ preferences and perceived usefulness of the system between India and either China or the USA, but not between China and the USA. We also find the effect of culture on perceived usefulness to be partially mediated by participant preferences. Our findings reinforce the importance of considering cultural differences in designing domestic multi-robotic assistants.
{"title":"How Many Robots Do You Want? A Cross-Cultural Exploration on User Preference and Perception of an Assistive Multi-Robot System","authors":"Mengni Zhang, Tong Xu, Jackson Hardin, Jilly Jiaqi Cai, J. Brooks, K. Green","doi":"10.1109/RO-MAN50785.2021.9515396","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515396","url":null,"abstract":"There has been increased development of assistive robots for the home, along with empirical studies assessing cultural differences on user perception. However, little attention has been paid to cultural differences with respect to non- humanoid, multi-robot interactions in homes or otherwise. In this exploratory paper, we investigate how cultural differences may impact users’ preferences and perceived usefulness of a multi-robot system by creating an interactive online survey and considering variables often absent in HRI studies. We introduce our multi-robot design and survey construction, and report results evaluated across 191 young adult participants from China, India, and the USA. We find significant effects of culture on both participants’ preferences and perceived usefulness of the system between India and China or the USA, but not between China and the USA. We also find effects of culture on perceived usefulness to be partially mediated by participant preferences. Our findings reinforce the importance of considering cultural differences in designing domestic multi-robotic assistants.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"60 1","pages":"580-585"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75167567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-08-08 | DOI: 10.1109/RO-MAN50785.2021.9515529
T. Kikuchi, Takumi Nishimura, K. Fukuoka, Takeru Todaka, Isao Abe
Movements of the knee joint are relative motions between the tibia and femur, which include rolling and sliding. Conventional wearable knee assistive devices use a hinge joint and generate non-negligible mismatched motions under deep flexion. To assist knee motion including deep flexion, we developed a polycentric assistive device for knee joints (PAD-KJ) using two gears with the same module and different radii. These radii were designed so that the center of the smaller gear moves along the trajectory of the knee joint with small error. According to the results of the parametric design, the minimum error was about 3 mm relative to the model trajectory of the knee joint. We also proposed a torque generator for this joint. The combination of a lever arm and a linear spring with the pair of gears successfully generated assistive torque for standing motions. According to an evaluation test, the maximum torque of a single unit was about 1.2 Nm, and the assistive torque in real usage is expected to be about 4.8 Nm.
{"title":"Design of Polycentric Assistive Device for Knee Joint*","authors":"T. Kikuchi, Takumi Nishimura, K. Fukuoka, Takeru Todaka, Isao Abe","doi":"10.1109/RO-MAN50785.2021.9515529","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515529","url":null,"abstract":"Movements of knee joints are relative motions between tibia and femur, which include rolling and sliding. Conventional wearable knee assistive device utilized a hinge joint and generates nonnegligible mismatched motions under deep flexions. To assist the knee motion including the deep flexion, we developed a polycentric assistive device for knee joints (PAD-KJ) by using two gears with the same module and different radii. These radii were designed so that the center of the smaller gear moves on the trajectory of the knee joint with small error. According to the results of parametric design, the minimum error was about 3 mm to the model trajectory of knee joint. We also proposed a torque generator for this joint. The combination of a lever arm and a linear spring with the pair of gears successfully generated the assistive torque for standing motions. According to an evaluation test, the maximum torque of the single unit was about 1.2 Nm and the assistive torque in real usage will be 4.8 Nm.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"57 1","pages":"546-550"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74565457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-08-08 | DOI: 10.1109/RO-MAN50785.2021.9515506
Kyungho Nam, C. Crick
3D-printed prosthetics have narrowed the gap between the tens-of-thousands-of-dollars cost of traditional prosthetic designs and amputees’ needs. However, the World Health Organization estimates that only 5-15% of people can receive adequate prosthesis services [2]. To address the lack of prosthesis supply and reduce costs (for both materials and maintenance), this paper provides an overview of a self-trainable, user-customized system architecture for a 3D-printed prosthetic hand that minimizes the challenge of accessing and maintaining these supporting devices. We develop and implement a customized behavior system that can generate any gesture a user desires. The architecture provides upper-limb amputees with self-trainable software and can improve their prosthetic performance at almost no financial cost. Any unique gesture a user wants is trainable with the RBF network using 3-channel EMG sensor signals, with a 94% average success rate. This result demonstrates that applying user-customized training to the behavior of a prosthetic hand can satisfy individual user requirements in real-life activities with high performance.
{"title":"Self-trainable 3D-printed prosthetic hands","authors":"Kyungho Nam, C. Crick","doi":"10.1109/RO-MAN50785.2021.9515506","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515506","url":null,"abstract":"3D printed prosthetics have narrowed the gap between the tens of thousands of dollars cost of traditional prosthetic designs and amputees’ needs. However, the World Health Organization estimates that only 5-15% of people can receive adequate prosthesis services [2]. To resolve the lack of prosthesis supply and reduce cost issues (for both materials and maintenance), this paper provides an overview of a self-trainable user-customized system architecture for a 3D printed prosthetic hand to minimize the challenge of accessing and maintaining these supporting devices. In this paper, we develop and implement a customized behavior system that can generate any gesture that users desire. The architecture provides upper limb amputees with self-trainable software and can improve their prosthetic performance at almost no financial cost. All kinds of unique gestures that users want are trainable with the RBF network using 3 channel EMG sensor signals with a 94% average success rate. This result demonstrates that applying user-customized training to the behavior of a prosthetic hand can satisfy individual user requirements in real-life activities with high performance.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"18 1","pages":"1196-1201"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77686946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-08-08 | DOI: 10.1109/RO-MAN50785.2021.9515503
Faisal Mehmood, Hamed Mahzoon, Y. Yoshikawa, H. Ishiguro
People with response time (RT) management issues usually have high situational communication apprehension and fear of negative evaluation (FNE) in face-to-face (FtF) and online interactions. Decreasing the stress related to RT management can reduce situational communication apprehension and FNE, helping make communication successful. In this study, we propose an interactive response strategy involving a robot avatar in a video conference that can reduce a person’s stress related to RT management and make communication successful. Two types of robotic video conferencing systems (VCSs) were considered: 1) a conventional robotic VCS, in which the interactive response of the robot avatar does not change with variation in the subject’s RT, and 2) a robotic VCS in which the interactive response of the robot avatar changes with variation in the subject’s RT. The situational communication apprehension measure (SCAM), FNE, sense of being attending (SoBA), and intention to use (ITU) are used as indexes for assessing the decrease in stress related to RT management in a communication scenario. A total of 51 subjects (M = 33.20, SD = 7.58 years) participated in the subjective evaluation via a web-based survey. The proposed interactive response of the robot avatar was found to have a significant effect on the stress related to RT management. A significant decrease in subjects’ expected SCAM and FNE scores was observed, along with a significant increase in their expected SoBA and ITU scores. This study contributes to the literature on the impact of the proposed interactive response of a robotic avatar on the stress related to RT management in video conferencing, as assessed by the SCAM, FNE, SoBA, and ITU indexes.
{"title":"An interactive response strategy involving a robot avatar in a video conference system for reducing the stress of response time management in communication","authors":"Faisal Mehmood, Hamed Mahzoon, Y. Yoshikawa, H. Ishiguro","doi":"10.1109/RO-MAN50785.2021.9515503","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515503","url":null,"abstract":"People with response time (RT) management issues usually have high situational communication apprehension, and fear of negative evaluation (FNE) in face-to-face (FtF) and online interactions. Decreasing the stress related to RT management can reduce situational communication apprehension and FNE, ensuring communication successful. In this study, we propose an interactive response strategy involving a robot avatar in a video conference that can reduce the stress related to RT management in a person and making communication successful. Two types of robotic video conferencing systems (VCSs) were considered: 1) a conventional robotic video conferencing system (VCS) where interactive response of robot avatar does not change concerning variation in RT of the subject and 2) a robotic VCS where interactive response of robot avatar changes concerning variation in RT of the subject. Situational communication apprehension measure (SCAM), FNE, sense of being attending (SoBA), and intention to use (ITU) are used as indexes for noticing the decrease in stress related to RT management of a subject in a communication scenario. A total of 51 subjects (M = 33.20, SD = 7.58 years) participated in the subjective evaluation of the web-based survey. The proposed interactive response of a robot avatar was found to have a significant effect on the stress related to RT management in a person. A significant decrease in expected SCAM and FNE scores of subjects was observed. Furthermore, a significant increase in expected SoBA and ITU scores of subjects was also observed. This study contributes to the literature in terms of the impact of the proposed interactive response of the robotic avatar on the stress related to RT management in a person in the video conference, assessed by SCAM, FNE, SoBA, and ITU indexes.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"22 1","pages":"969-974"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81520731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-08-08 | DOI: 10.1109/RO-MAN50785.2021.9515523
Cheng Zhou, Longfei Zhao, Haitao Wang, Lipeng Chen, Yu Zheng
A teleoperation system transmits human intention to a remote robot, so that the system combines excellent robot operating performance with human intelligence. In this article, we establish a bilateral teleoperation system with force feedback from both the arm and the gripper. That is, the slave robot system provides force feedback on both the wrist and the fingers, while the master robot system renders the slave feedback force and the human interaction force, and controls the slave robot accordingly. In addition, this paper proposes a framework for the robot’s four-channel bilateral teleoperation control system, which accommodates two situations: impedance control and admittance control. Finally, single-arm/single-arm and dual-arm/dual-arm bilateral teleoperation experiments demonstrate the effectiveness of the bilateral teleoperation system and the four-channel controller architecture proposed in this paper.
{"title":"A Bilateral Dual-Arm Teleoperation Robot System with a Unified Control Architecture","authors":"Cheng Zhou, Longfei Zhao, Haitao Wang, Lipeng Chen, Yu Zheng","doi":"10.1109/RO-MAN50785.2021.9515523","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515523","url":null,"abstract":"The teleoperation system can transmit human intention to the remote robot, so that the system combines excellent robot operation performance and human intelligence. In this article, we have established a bilateral teleoperation system with force feedback from the arm and gripper. That is, the slave robot system can provide force feedback on both the wrist and the fingers, while the master robot system can render the slave feedback force and human interaction force, and control the slave robot accordingly. In addition, this paper also proposes the framework of the robot’s four-channel bilateral teleoperation control system, which is attributed to two situations: impedance control or admittance control. Finally, single-arm/single-arm, dual-arm/dual-arm bilateral teleoperation experiments prove the effectiveness of the bilateral teleoperation system and the four-channel controller architecture proposed in this paper.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"152 1","pages":"495-502"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81723748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-08-08 | DOI: 10.1109/RO-MAN50785.2021.9515435
Alessandra Rossi, Mario Raiano, Silvia Rossi
Guaranteeing people’s engagement during an interaction is very important for eliciting positive and effective emotions in public service scenarios. A robot should be able to detect its interlocutor’s level and mode of engagement so as to modulate its behaviours accordingly. However, there is no generally accepted model for annotating and classifying engagement during an interaction. In this work, we consider engagement as a multidimensional construct with three relevant dimensions: affective, cognitive and behavioural. To be automatically evaluated by a robot, such a complex construct requires selecting the proper interaction features from a large set of possibilities. Moreover, manually collecting and annotating large datasets of real interactions is extremely time-consuming and costly. In this study, we collected recordings of human-robot interactions in a bartending scenario, and we compared different feature selection and regression models to find the features that characterise a user’s engagement in the interaction and the model that can efficiently detect them. Results showed that characterising each dimension separately, in terms of both features and regression, obtains better results than a model that directly combines the three dimensions.
{"title":"Affective, Cognitive and Behavioural Engagement Detection for Human-robot Interaction in a Bartending Scenario","authors":"Alessandra Rossi, Mario Raiano, Silvia Rossi","doi":"10.1109/RO-MAN50785.2021.9515435","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515435","url":null,"abstract":"Guaranteeing people’s engagement during an interaction is very important to elicit positive and effective emotions in public service scenarios. A robot should be able to detect its interlocutor’s level and mode of engagement to accordingly modulate its behaviours. However, there is not a generally accepted model to annotate and classify engagement during an interaction. In this work, we consider engagement as a multidimensional construct with three relevant dimensions: affective, cognitive and behavioural. To be automatically evaluated by a robot, such a complex construct requires the selection of the proper interaction features among a large set of possibilities. Moreover, manually collecting and annotating large datasets of real interactions are extremely time-consuming and costly. In this study, we collected the recordings of human-robot interactions in a bartending scenario, and we compared different feature selection and regression models to find the features that characterise a user’s engagement in the interaction, and the model that can efficiently detect them. Results showed that the characterisation of each dimension separately in terms of features and regression obtains better results with respect to a model that directly combines the three dimensions.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"66 1","pages":"208-213"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80829616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-08-08 | DOI: 10.1109/RO-MAN50785.2021.9515482
Galadrielle Humblot-Renaux, Chen Li, D. Chrysostomou
Equipping robots with the ability to identify who is talking to them is an important step towards natural and effective verbal interaction. However, speaker identification for voice control remains largely unexplored compared to recent progress in natural language instruction and speech recognition. This motivates us to tackle text-independent speaker identification for human-robot interaction applications in industrial environments. By representing audio segments as time-frequency spectrograms, this can be formulated as an image classification task, allowing us to apply state-of-the-art convolutional neural network (CNN) architectures. To achieve robust prediction in unconstrained, challenging acoustic conditions, we take a data-driven approach and collect a custom dataset with a far-field microphone array, featuring over 3 hours of "in the wild" audio recordings from six speakers, which are then encoded into spectral images for CNN-based classification. We propose a shallow 3-layer CNN, which we compare with the widely used ResNet-18 architecture: in addition to benchmarking these models in terms of accuracy, we visualize the features used by these two models to discriminate between classes, and investigate their reliability in unseen acoustic scenes. Although ResNet-18 reaches the highest raw accuracy, we are able to achieve remarkable online speaker recognition performance with a much more lightweight model which learns lower-level vocal features and produces more reliable confidence scores. The proposed method is successfully integrated into a robotic dialogue system and showcased in a mock user localization and authentication scenario in a realistic industrial environment: https://youtu.be/IVtZ8LKJZ7A.
{"title":"Why talk to people when you can talk to robots? Far-field speaker identification in the wild","authors":"Galadrielle Humblot-Renaux, Chen Li, D. Chrysostomou","doi":"10.1109/RO-MAN50785.2021.9515482","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515482","url":null,"abstract":"Equipping robots with the ability to identify who is talking to them is an important step towards natural and effective verbal interaction. However, speaker identification for voice control remains largely unexplored compared to recent progress in natural language instruction and speech recognition. This motivates us to tackle text-independent speaker identification for human-robot interaction applications in industrial environments. By representing audio segments as time-frequency spectrograms, this can be formulated as an image classification task, allowing us to apply state-of-the-art convolutional neural network (CNN) architectures. To achieve robust prediction in unconstrained, challenging acoustic conditions, we take a data-driven approach and collect a custom dataset with a far-field microphone array, featuring over 3 hours of \"in the wild\" audio recordings from six speakers, which are then encoded into spectral images for CNN-based classification. We propose a shallow 3-layer CNN, which we compare with the widely used ResNet-18 architecture: in addition to benchmarking these models in terms of accuracy, we visualize the features used by these two models to discriminate between classes, and investigate their reliability in unseen acoustic scenes. Although ResNet-18 reaches the highest raw accuracy, we are able to achieve remarkable online speaker recognition performance with a much more lightweight model which learns lower-level vocal features and produces more reliable confidence scores. The proposed method is successfully integrated into a robotic dialogue system and showcased in a mock user localization and authentication scenario in a realistic industrial environment: https://youtu.be/IVtZ8LKJZ7A.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"134 1","pages":"272-278"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76809821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-08-08 | DOI: 10.1109/RO-MAN50785.2021.9515326
Kim Tien Ly, Mithun Poozhiyil, Harit Pandya, G. Neumann, Ayse Kucukyilmaz
This paper presents a haptic shared control paradigm that modulates the level of robotic guidance, based on predictions of human motion intentions. The proposed method incorporates robot trajectories learned from human demonstrations and dynamically adjusts the level of robotic assistance based on how closely the detected intentions match these trajectories. An experimental study is conducted to demonstrate the paradigm on a teleoperated pick-and-place task using a Franka Emika Panda robot arm, controlled via a 3D Systems Touch X haptic interface. In the experiment, the human operator teleoperates a remote robot arm while observing the environment on a 2D screen. While the human teleoperates the robot arm, the objects are tracked, and the human’s motion intentions (e.g., which object will be picked or which bin will be approached) are predicted using a Deep Q-Network (DQN). The predictions are made considering the current robot state and baseline robot trajectories that are learned from human demonstrations using Probabilistic Movement Primitives (ProMPs). The detected intentions are then used to condition the ProMP trajectories to modulate the movement and accommodate changing object configurations. Consequently, the system generates adaptive force guidance as weighted virtual fixtures that are rendered on the haptic device. The outcomes of the user study, conducted with 12 participants, indicate that the proposed paradigm can successfully guide users to robust grasping configurations and brings better performance by reducing the number of grasp attempts and improving trajectory smoothness and length.
{"title":"Intent-Aware Predictive Haptic Guidance and its Application to Shared Control Teleoperation","authors":"Kim Tien Ly, Mithun Poozhiyil, Harit Pandya, G. Neumann, Ayse Kucukyilmaz","doi":"10.1109/RO-MAN50785.2021.9515326","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515326","url":null,"abstract":"This paper presents a haptic shared control paradigm that modulates the level of robotic guidance, based on predictions of human motion intentions. The proposed method incorporates robot trajectories learned from human demonstrations and dynamically adjusts the level of robotic assistance based on how closely the detected intentions match these trajectories. An experimental study is conducted to demonstrate the paradigm on a teleoperated pick-and-place task using a Franka Emika Panda robot arm, controlled via a 3D Systems Touch X haptic interface. In the experiment, the human operator teleoperates a remote robot arm while observing the environment on a 2D screen. While the human teleoperates the robot arm, the objects are tracked, and the human’s motion intentions (e.g., which object will be picked or which bin will be approached) are predicted using a Deep Q-Network (DQN). The predictions are made considering the current robot state and baseline robot trajectories that are learned from human demonstrations using Probabilistic Movement Primitives (ProMPs). The detected intentions are then used to condition the ProMP trajectories to modulate the movement and accommodate changing object configurations. Consequently, the system generates adaptive force guidance as weighted virtual fixtures that are rendered on the haptic device. The outcomes of the user study, conducted with 12 participants, indicate that the proposed paradigm can successfully guide users to robust grasping configurations and brings better performance by reducing the number of grasp attempts and improving trajectory smoothness and length.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"29 1","pages":"565-572"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78646370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-08-08 | DOI: 10.1109/RO-MAN50785.2021.9515349
Marsil Zakour, Alaeddine Mellouli, R. Chaudhari
A correct understanding of human activities is critical for meaningful assistance by robots in daily life. The development of perception algorithms and Deep Learning models of human activity requires large-scale sensor datasets. Good real-world activity data is, however, difficult and time-consuming to acquire. Several precisely calibrated and time-synchronized sensors are required, and the annotation and labeling of the collected sensor data is extremely labor-intensive. To address these challenges, we present a 3D activity simulator, "HOIsim", focusing on Human-Object Interactions (HOIs). Using HOIsim, we provide a procedurally generated synthetic dataset of two sample daily life activities, "lunch" and "breakfast". The dataset contains out-of-the-box ground-truth annotations in the form of human and object poses, as well as ground-truth activity labels. Furthermore, we introduce methods to meaningfully randomize activity flows and the environment topology, which allows us to generate a large number of random variants of these activities in very little time. Based on an abstraction of the low-level pose data in the form of spatiotemporal graphs of HOIs, we evaluate only the generated Lunch dataset, using two Deep Learning models for activity recognition. The first model, based on recurrent neural networks, achieves an accuracy of 87%, whereas the other, based on transformers, achieves an accuracy of 94.7%.
{"title":"HOIsim: Synthesizing Realistic 3D Human-Object Interaction Data for Human Activity Recognition","authors":"Marsil Zakour, Alaeddine Mellouli, R. Chaudhari","doi":"10.1109/RO-MAN50785.2021.9515349","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515349","url":null,"abstract":"Correct understanding of human activities is critical for meaningful assistance by robots in daily life. The development of perception algorithms and Deep Learning models of human activity requires large-scale sensor datasets. Good real-world activity data is, however, difficult and time- consuming to acquire. Several precisely calibrated and time- synchronized sensors are required, and the annotation and labeling of the collected sensor data is extremely labor intensive.To address these challenges, we present a 3D activity simulator, \"HOIsim\", focusing on Human-Object Interactions (HOIs). Using HOIsim, we provide a procedurally generated synthetic dataset of two sample daily life activities \"lunch\" and \"breakfast\". The dataset contains out-of-the-box ground truth annotations in the form of human and object poses, as well as ground truth activity labels. Furthermore, we introduce methods to meaningfully randomize activity flows and the environment topology. This allows us to generate a large number of random variants of these activities in very less time.Based on an abstraction of the low-level pose data in the form of spatiotemporal graphs of HOIs, we evaluate the generated Lunch dataset only with two Deep Learning models for activity recognition. The first model, based on recurrent neural networks achieves an accuracy of 87%, whereas the other, based on transformers, achieves an accuracy of 94.7%.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"1 1","pages":"1124-1131"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78813350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-08-08 | DOI: 10.1109/RO-MAN50785.2021.9515378
Neelu Gurung, J. B. Grant, D. Herath
The face is the most influential feature in any interaction. This study investigated the effect of robot faces in embodied interactions on a range of subjective and objective factors. The platform used to answer the research question was an interactive robotic art installation, incorporating a robot arm that could be presented with or without a face displayed on an attached screen. Participants were exposed to one of three conditions – the robot arm only, the robot arm with a static face displayed, and the robot arm with a dynamic face displayed. We used the Godspeed Questionnaire to determine whether the different embodiments would be perceived differently on measures of likeability, animacy, and safety before and after the interaction. We also measured how close participants stood to the robot and how much time they spent interacting with the robot. We found that the three embodiments did not significantly differ in time spent, distance, animacy, likeability, or safety before and after the interaction. This surprising result hints at other possible reasons that influence the success of a robot interaction and advances the understanding of the effect of faces in human-robot interaction.
{"title":"What’s in a face? The Effect of Faces in Human Robot Interaction","authors":"Neelu Gurung, J. B. Grant, D. Herath","doi":"10.1109/RO-MAN50785.2021.9515378","DOIUrl":"https://doi.org/10.1109/RO-MAN50785.2021.9515378","url":null,"abstract":"The face is the most influential feature in any interaction. This study investigated the effect of robot faces in embodied interactions on a range of subjective and objective factors. The platform used to answer the research question was an interactive robotic art installation, incorporating a robot arm that could be presented with or without a face displayed on an attached screen. Participants were exposed to one of three conditions – the robot arm only, the robot arm with a static face displayed, and the robot arm with a dynamic face displayed. We used the Godspeed Questionnaire to determine whether the different embodiments would be perceived differently on measures of likeability, animacy, and safety before and after the interaction. We also measured how close participants stood to the robot and how much time they spent interacting with the robot. We found that the three embodiments did not significantly differ in time spent, distance, animacy, likeability, or safety before and after the interaction. This surprising result hints at other possible reasons that influence the success of a robot interaction and advances the understanding of the effect of faces in human-robot interaction.","PeriodicalId":6854,"journal":{"name":"2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN)","volume":"1 1","pages":"1024-1030"},"PeriodicalIF":0.0,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78815947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}