Robotic surgery involves significant task switching between tool control and camera control, which can be a source of distraction and error. This study evaluated the performance of a voice-enabled autonomous camera control system compared to a human-operated camera for the da Vinci surgical robot. Twenty subjects performed a series of tasks that required them to instruct the camera to move to specific locations. The subjects performed the tasks (1) using an automated camera system that could be tailored based on keywords and (2) directing a human camera operator using voice commands. The data were analyzed using task completion measures and the NASA Task Load Index (TLX) human performance metrics. The human-operated camera control method outperformed the automated algorithm in terms of task completion (6.96 vs. 7.71 correct insertions; p-value = 0.044). However, subjective feedback suggests that the voice-enabled autonomous camera control system is comparable to the human-operated one. Thirteen of the twenty subjects, including the surgeon, preferred the voice-enabled autonomous camera control system. This study is a step towards a more natural language interface for surgical robotics as these systems become better partners during surgery.
Title: Evaluation of a Voice-Enabled Autonomous Camera Control System for the da Vinci Surgical Robot
Authors: Reenu Arikkat Paul, Luay Jawad, Abhishek Shankar, Maitreyee Majumdar, Troy Herrick-Thomason, Abhilash Pandya
Journal: Robotics
Pub Date: 2024-01-01; DOI: 10.3390/robotics13010010
Pub Date: 2023-12-31; DOI: 10.3390/robotics13010009
Michele Folgheraiter, Sharafatdin Yessirkepov, T. Umurzakov
This paper presents the design of a new lightweight, full-size bipedal robot developed in the Humanoid Robotics Laboratory at Nazarbayev University. The robot, equipped with 12 degrees of freedom (DOFs), stands at 1.1 m tall and weighs only 15 kg (excluding the battery). Through the implementation of a simple mechanical design and the utilization of off-the-shelf components, the overall prototype cost remained under USD 5000. The incorporation of high-performance in-house-developed servomotors enables the robot’s actuation system to generate up to 2400 W of mechanical power, resulting in a power-to-weight ratio of 160 W/kg. The details of the mechanical and electrical design are presented alongside the formalization of the forward kinematic model using the successive screw displacement method and the solution of the inverse kinematics. Tests conducted in both a simulation environment and on the real prototype demonstrate that the robot is capable of accurately following the reference joint trajectories to execute a quasi-static gait, achieving an average power consumption of 496 W.
Title: NU-Biped-4.5: A Lightweight and Low-Prototyping-Cost Full-Size Bipedal Robot
Journal: Robotics
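The power-to-weight ratio quoted in the NU-Biped-4.5 abstract follows from the two figures it gives. A quick check, assuming the ratio is taken against the 15 kg mass without the battery (the symbols P_max and m are introduced here for illustration only):

```latex
\[
\frac{P_{\max}}{m} = \frac{2400\ \mathrm{W}}{15\ \mathrm{kg}} = 160\ \mathrm{W/kg}
\]
```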
Pub Date: 2023-12-23; DOI: 10.3390/robotics13010005
António Fernando Alcântara Ribeiro, Ana Carolina Coelho Lopes, Tiago Alcântara Ribeiro, Nino Sancho Sampaio Martins Pereira, Gil Teixeira Lopes, António Fernando Alcântara Ribeiro
The strategy for a team of multiple autonomous cooperative robots in a football game can be designed in several ways, but the most common is the “Skills, Tactics and Plays (STP)” architecture, developed so that robots could easily cooperate based on a group of predefined plays, called the playbook. The new strategy algorithm presented in this paper, used by the RoboCup Middle Size League LAR@MSL team, takes a completely different approach from most other teams. Contrary to the typical STP architecture, this strategy, called the Probability-Based Strategy (PBS), uses only skills and decides the outcome of the tactics and plays in real-time based on arbitrarily assigned probability values for the possible actions in each situation. The action probability values also affect the robot’s positioning in a way that optimizes the overall probability of scoring a goal. It uses a centralized decision-making strategy rather than the robot’s self-control. The robot is still fully autonomous in the skills assigned to it and uses a communication system with the main computer to synchronize all robots. Also, calibration or any strategy improvements are independent of the robots themselves. The robots’ performance affects the results but does not interfere with the strategy outcome. Moreover, the strategy outcome depends primarily on the opponent team and the probability calibration for each action. The strategy presented has been fully implemented on the team and tested in multiple scenarios, such as simulators, a controlled environment, against humans in a simulator, and in the RoboCup competition.
Title: Probability-Based Strategy for a Football Multi-Agent Autonomous Robot System
Journal: Robotics
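The core idea in the Probability-Based Strategy abstract, choosing among possible actions by their calibrated probability values, can be illustrated with a minimal sketch. Everything here is hypothetical: the `Action` type, the `choose_action` helper, the action names, and the numbers are assumptions for illustration, not the LAR@MSL team's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A candidate skill with a hand-calibrated probability value (hypothetical)."""
    name: str
    probability: float  # assumed to lie in [0, 1]


def choose_action(candidates: list[Action]) -> Action:
    """Return the candidate with the highest calibrated probability."""
    if not candidates:
        raise ValueError("at least one candidate action is required")
    return max(candidates, key=lambda action: action.probability)


# Hypothetical calibration values for one game situation.
situation = [
    Action("shoot", 0.35),
    Action("pass_to_teammate", 0.55),
    Action("dribble", 0.20),
]

best = choose_action(situation)
print(best.name)
```

In a centralized setup like the one the abstract describes, a selection of this kind would run on the main computer, and recalibrating the values would change team behavior without touching the robots' skills.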
To attain precise and robust transformation estimation in simultaneous localization and mapping (SLAM) tasks, the integration of multiple sensors has demonstrated effectiveness and significant potential in robotics applications. This work presents a fast, tightly coupled LIDAR-inertial-visual SLAM system comprising three components: the LIDAR-inertial odometry (LIO) module, the visual-inertial odometry (VIO) module, and the loop closure detection module. The LIO module directly constructs raw scanning point increments into a point cloud map for matching. The VIO module performs image alignment by aligning the observed points, and the loop closure detection module provides real-time cumulative error correction through factor graph optimization using the iSAM2 optimizer. The three components are integrated via an error state iterative Kalman filter (ESIKF). To reduce the computational cost of loop closure detection, a coarse-to-fine point cloud matching approach is employed, using Quatro to derive an a priori state for keyframe point clouds and NanoGICP for the detailed transformation computation. Experimental evaluations conducted on both open and private datasets substantiate the superior performance of the proposed method compared to similar approaches. The results indicate the adaptability of this method to various challenging situations.
Title: An Enhanced Multi-Sensor Simultaneous Localization and Mapping (SLAM) Framework with Coarse-to-Fine Loop Closure Detection Based on a Tightly Coupled Error State Iterative Kalman Filter
Authors: Changhao Yu, Zichen Chao, Haoran Xie, Yue Hua, Weitao Wu
Journal: Robotics
Pub Date: 2023-12-21; DOI: 10.3390/robotics13010002
Pub Date: 2023-12-21; DOI: 10.3390/robotics13010004
Giuliano Fabris, Lorenzo Scalera, Alessandro Gasparetto
Collaborative robotics represents a modern and efficient framework in which machines can safely interact with humans. Coupled with artificial intelligence (AI) systems, collaborative robots can solve problems that require a certain degree of intelligence not only in industry but also in the entertainment and educational fields. Board games like chess or checkers are a good example. When playing these games, a robotic system has to recognize the board and pieces and estimate their position in the robot reference frame, decide autonomously which is the best move to make (respecting the game rules), and physically execute it. In this paper, an intelligent and collaborative robotic system is presented to play Italian checkers. The system is able to acquire the game state using a camera, select the best move among all the possible ones through a decision-making algorithm, and physically manipulate the game pieces on the board, performing pick-and-place operations. Minimum-time trajectories are optimized online for each pick-and-place operation of the robot so as to make the game more fluent and interactive while meeting the kinematic constraints of the manipulator. The developed system is tested in a real-world setup using a Franka Emika arm with seven degrees of freedom. The experimental results demonstrate the feasibility and performance of the proposed approach.
Title: Playing Checkers with an Intelligent and Collaborative Robotic System
Journal: Robotics
Pub Date: 2023-12-19; DOI: 10.3390/robotics13010001
Erik Schuetz, Fabian B. Flohr
Predicting the trajectory of other road users, especially vulnerable road users (VRUs), is an important aspect of safety and planning efficiency for autonomous vehicles. With recent advances in Deep-Learning-based approaches in this field, physics-based and classical Machine-Learning-based methods no longer achieve competitive results. Hence, this paper provides an extensive review of recent Deep-Learning-based methods in trajectory prediction for VRUs and autonomous driving in general. We review the state and context representations and architectural insights of selected methods, divided into categories according to their primary prediction scheme. Additionally, we summarize reported results on popular datasets for all methods presented in this review. The results show that conditional variational autoencoders achieve the best overall results on both pedestrian and autonomous driving datasets. Finally, we outline possible future research directions for the field of trajectory prediction in autonomous driving.
Title: A Review of Trajectory Prediction Methods for the Vulnerable Road User
Journal: Robotics
Pub Date: 2023-12-16; DOI: 10.3390/robotics12060170
Rodrigo Bernardo, João M. C. Sousa, M. Botto, Paulo J. S. Gonçalves
Robotic systems are increasingly present in dynamic environments. This paper proposes a hierarchical control structure wherein a behavior tree (BT) is used to improve the flexibility and adaptability of an omni-directional mobile robot for point stabilization. Flexibility and adaptability are crucial at each level of the sense–plan–act loop to implement robust and effective robotic solutions in dynamic environments. The proposed BT combines high-level decision making and continuous execution monitoring while applying non-linear model predictive control (NMPC) for the point stabilization of an omni-directional mobile robot. The proposed control architecture can guide the mobile robot to any configuration within the workspace while satisfying state constraints (e.g., obstacle avoidance) and input constraints (e.g., motor limits). The effectiveness of the controller was validated through a set of realistic simulation scenarios and experiments in a real environment, where an industrial omni-directional mobile robot performed a point stabilization task with obstacle avoidance in a workspace.
Title: A Novel Control Architecture Based on Behavior Trees for an Omni-Directional Mobile Robot
Journal: Robotics
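The behavior tree abstract pairs high-level decision making with continuous execution monitoring. A minimal sketch of how a tree with standard Sequence and Fallback nodes might be ticked is given below. The node classes, the `state` dictionary, and the leaf names are hypothetical, and `run_controller` is only a placeholder standing in for one NMPC step; this is not the authors' architecture.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    RUNNING = "running"


class Leaf:
    """Wraps a callable that returns a Status each time it is ticked."""

    def __init__(self, name: str, fn: Callable[[], Status]):
        self.name = name
        self.fn = fn

    def tick(self) -> Status:
        return self.fn()


class Sequence:
    """Ticks children in order; stops at the first child that is not SUCCESS."""

    def __init__(self, children: List):
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            status = child.tick()
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS


class Fallback:
    """Ticks children in order; stops at the first child that is not FAILURE."""

    def __init__(self, children: List):
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            status = child.tick()
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE


# Hypothetical world state that the condition leaves read on every tick.
state = {"at_goal": False, "path_clear": True}


def at_goal() -> Status:
    return Status.SUCCESS if state["at_goal"] else Status.FAILURE


def path_clear() -> Status:
    return Status.SUCCESS if state["path_clear"] else Status.FAILURE


def run_controller() -> Status:
    # Placeholder for one control step; a real controller would command the robot.
    return Status.RUNNING


tree = Fallback([
    Leaf("at_goal?", at_goal),
    Sequence([Leaf("path_clear?", path_clear), Leaf("controller", run_controller)]),
])

print(tree.tick())
```

Because the conditions are re-evaluated on every tick, a change in the environment (for example, a blocked path) alters the result on the next tick, which is the monitoring property behavior trees are generally chosen for.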
Pub Date: 2023-12-08; DOI: 10.3390/robotics12060168
Franziska Legler, Jonas Trezl, Dorothea Langer, Max Bernhagen, A. Dettmann, A. Bullinger
Today’s research on fenceless human–robot collaboration (HRC) is challenged by a relatively slow development of safety features. Simultaneously, design recommendations for HRC are requested by the industry. Virtual reality (VR) technology can be used to simulate HRC scenarios in advance while ensuring safety. VR also allows researchers to study the effects of safety-restricted features like close distance during movements and robotic malfunctions. In this paper, we present a VR experiment with 40 participants collaborating with a heavy-load robot and compare the results to a similar real-world experiment to study transferability and validity. The participants’ proximity to the robot, interaction level, and occurring system failures were varied. State anxiety, trust, and intention to use were used as dependent variables, and valence and arousal values were assessed over time. Overall, state anxiety was low and trust and intention to use were high. Only simulated failures significantly increased state anxiety, reduced trust, and resulted in reduced valence and increased arousal. In comparison with the real-world experiment, non-significant differences in all dependent variables and similar progression of valence and arousal were found during scenarios without system failures. Therefore, the suitability of applying VR in HRC research to study safety-restricted features can be supported; however, further research should examine transferability for high-intensity emotional experiences.
Title: Emotional Experience in Human–Robot Collaboration: Suitability of Virtual Reality Scenarios to Study Interactions beyond Safety Restrictions
Journal: Robotics
Pub Date: 2023-12-08; DOI: 10.3390/robotics12060167
Stavros N. Moutsis, Konstantinos A. Tsintotas, Ioannis Kansizoglou, Antonios Gasteratos
Human action recognition is a computer vision task that identifies how a person or a group acts in a video sequence. Various methods that rely on deep-learning techniques, such as two- or three-dimensional convolutional neural networks (2D-CNNs, 3D-CNNs), recurrent neural networks (RNNs), and vision transformers (ViT), have been proposed to address this problem over the years. Motivated by the high complexity of most CNNs used in human action recognition and by the need for implementations on mobile platforms with restricted computational resources, in this article we conduct an extensive evaluation of the performance metrics of five lightweight architectures. In particular, we examine how these mobile-oriented CNNs (viz., ShuffleNet-v2, EfficientNet-b0, MobileNet-v3, and GhostNet) perform in spatial analysis compared to a recent tiny ViT, namely EVA-02-Ti, and a more computationally demanding model, ResNet-50. Our models, previously trained on ImageNet and BU101, are measured for their classification accuracy on HMDB51, UCF101, and six classes of the NTU dataset. The average and max scores, as well as the voting approaches, are generated through three and fifteen RGB frames of each video, while two different rates for the dropout layers were assessed during the training. Last, a temporal analysis via multiple types of RNNs that employ features extracted by the trained networks is examined. Our results reveal that EfficientNet-b0 and EVA-02-Ti surpass the other mobile-CNNs, achieving comparable or superior performance to ResNet-50.
Title: Evaluating the Performance of Mobile-Convolutional Neural Networks for Spatial and Temporal Human Action Recognition Analysis
Journal: Robotics
Pub Date: 2023-12-08, DOI: 10.3390/robotics12060169
Ebenezer Raj Selvaraj Mercyshalinie, A. Ghadge, N. Ifejika, Yonas T. Tadesse
Rehabilitation after a stroke primarily aims to help patients regain mobility, communication skills, swallowing function, and activities of daily living (ADLs). The course of rehabilitation depends on the specific regions of the brain affected by the stroke. Through repetitive exercises and therapeutic interventions, patients can learn to use adaptive equipment, regain movement, and reduce muscle spasticity. These exercises can be performed by wearing a soft robotic glove on the impaired extremity. For post-stroke rehabilitation, we have designed and characterized an interactive hand orthosis with tendon-driven finger actuation mechanisms actuated by servo motors, consisting of a fabric glove and force-sensitive resistors (FSRs) at the fingertips. The robotic device moves the user’s hand, when operated from a mobile phone, to replicate normal gripping behavior. In this paper, finger movements in response to step input commands from a mobile app were characterized for each finger at the proximal interphalangeal (PIP), distal interphalangeal (DIP), and metacarpophalangeal (MCP) joints. In general, servo motor-based hand orthoses are energy-efficient; however, they generate noise during actuation. Here, we quantified the noise generated by servo motor actuation for each finger individually and for a group of fingers activated simultaneously. To test ADL ability, we evaluated the device’s effectiveness in holding different objects from the Action Research Arm Test (ARAT) kit. Our device, the novel hand orthosis actuated by servo motors (NOHAS), was tested on ten healthy human subjects and showed an average success rate of 90% in grasping tasks. Our orthotic hand shows promise for helping post-stroke subjects recover because of its simplicity of use, lightweight construction, and carefully designed components.
{"title":"NOHAS: A Novel Orthotic Hand Actuated by Servo Motors and Mobile App for Stroke Rehabilitation","authors":"Ebenezer Raj Selvaraj Mercyshalinie, A. Ghadge, N. Ifejika, Yonas T. Tadesse","doi":"10.3390/robotics12060169","DOIUrl":"https://doi.org/10.3390/robotics12060169","url":null,"PeriodicalId":37568,"journal":{"name":"Robotics","volume":"92 ","pages":""},"PeriodicalIF":3.7,"publicationDate":"2023-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139010915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}