Perception Engine Using a Multi-Sensor Head to Enable High-level Humanoid Robot Behaviors

Bhavyansh Mishra, Duncan Calvert, Brendon Ortolano, M. Asselmeier, Luke Fina, Stephen McCrory, H. Sevil, Robert J. Griffin
{"title":"使用多传感器头的感知引擎实现高级人形机器人行为","authors":"Bhavyansh Mishra, Duncan Calvert, Brendon Ortolano, M. Asselmeier, Luke Fina, Stephen McCrory, H. Sevil, Robert J. Griffin","doi":"10.1109/icra46639.2022.9812178","DOIUrl":null,"url":null,"abstract":"For achieving significant levels of autonomy, legged robot behaviors require perceptual awareness of both the terrain for traversal, as well as structures and objects in their surroundings for planning, obstacle avoidance, and high-level decision making. In this work, we present a perception engine for legged robots that extracts the necessary information for developing semantic, contextual, and metric awareness of their surroundings. Our custom sensor configuration consists of (1) an active depth sensor, (2) two monocular cameras looking sideways, (3) a passive stereo sensor observing the terrain, (4) a forward facing active depth camera, and (5) a rotating 3D LIDAR with a large vertical field-of-view (FOV). The mutual overlap in the sensors' FOVs allows us to redundantly detect and track objects of both dynamic and static types. We fuse class masks generated by a semantic segmentation model with LIDAR and depth data to accurately identify and track individual instances of dynamically moving objects. In parallel, active depth and passive stereo streams of the terrain are also fused to map the terrain using the on-board GPU. We evaluate the engine using two different humanoid behaviors, (1) look-and-step and (2) track-and-follow, on the Boston Dynamics Atlas.","PeriodicalId":341244,"journal":{"name":"2022 International Conference on Robotics and Automation (ICRA)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Perception Engine Using a Multi-Sensor Head to Enable High-level Humanoid Robot Behaviors\",\"authors\":\"Bhavyansh Mishra, Duncan Calvert, Brendon Ortolano, M. Asselmeier, Luke Fina, Stephen McCrory, H. Sevil, Robert J. Griffin\",\"doi\":\"10.1109/icra46639.2022.9812178\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"For achieving significant levels of autonomy, legged robot behaviors require perceptual awareness of both the terrain for traversal, as well as structures and objects in their surroundings for planning, obstacle avoidance, and high-level decision making. In this work, we present a perception engine for legged robots that extracts the necessary information for developing semantic, contextual, and metric awareness of their surroundings. Our custom sensor configuration consists of (1) an active depth sensor, (2) two monocular cameras looking sideways, (3) a passive stereo sensor observing the terrain, (4) a forward facing active depth camera, and (5) a rotating 3D LIDAR with a large vertical field-of-view (FOV). The mutual overlap in the sensors' FOVs allows us to redundantly detect and track objects of both dynamic and static types. We fuse class masks generated by a semantic segmentation model with LIDAR and depth data to accurately identify and track individual instances of dynamically moving objects. In parallel, active depth and passive stereo streams of the terrain are also fused to map the terrain using the on-board GPU. 
We evaluate the engine using two different humanoid behaviors, (1) look-and-step and (2) track-and-follow, on the Boston Dynamics Atlas.\",\"PeriodicalId\":341244,\"journal\":{\"name\":\"2022 International Conference on Robotics and Automation (ICRA)\",\"volume\":\"25 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 International Conference on Robotics and Automation (ICRA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/icra46639.2022.9812178\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Robotics and Automation (ICRA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/icra46639.2022.9812178","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4

Abstract

To achieve significant levels of autonomy, legged robots require perceptual awareness both of the terrain they traverse and of the structures and objects in their surroundings, for planning, obstacle avoidance, and high-level decision making. In this work, we present a perception engine for legged robots that extracts the information necessary to develop semantic, contextual, and metric awareness of the surroundings. Our custom sensor configuration consists of (1) an active depth sensor, (2) two monocular cameras looking sideways, (3) a passive stereo sensor observing the terrain, (4) a forward-facing active depth camera, and (5) a rotating 3D LIDAR with a large vertical field-of-view (FOV). The mutual overlap of the sensors' FOVs allows us to redundantly detect and track both dynamic and static objects. We fuse class masks generated by a semantic segmentation model with LIDAR and depth data to accurately identify and track individual instances of dynamically moving objects. In parallel, active depth and passive stereo streams of the terrain are fused to map the terrain on the on-board GPU. We evaluate the engine using two different humanoid behaviors, (1) look-and-step and (2) track-and-follow, on the Boston Dynamics Atlas.
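The core fusion step described in the abstract, attaching semantic class labels from a camera segmentation mask to 3D LIDAR points so that moving objects can be identified and tracked, can be illustrated with a minimal sketch. This is not the authors' implementation: the intrinsic matrix `K`, the LIDAR-to-camera extrinsics `R`, `t`, the function name `label_lidar_points`, and the class id used for "person" are all hypothetical placeholders chosen for the example.

```python
# Minimal sketch (not the paper's code): label LIDAR points with semantic
# classes by projecting them into a camera's segmentation mask.
# K, R, t, and the class ids below are made-up placeholders.

import numpy as np

def label_lidar_points(points_lidar, seg_mask, K, R, t):
    """Project LIDAR points into the camera image and read the class id
    of the pixel each point lands on.

    points_lidar: (N, 3) XYZ points in the LIDAR frame.
    seg_mask:     (H, W) integer class ids from the segmentation model.
    K:            (3, 3) camera intrinsic matrix.
    R, t:         rotation (3, 3) and translation (3,) mapping LIDAR -> camera.

    Returns (N,) class ids, with -1 for points outside or behind the image.
    """
    # Transform points into the camera frame.
    pts_cam = points_lidar @ R.T + t

    # Keep only points in front of the camera.
    in_front = pts_cam[:, 2] > 0.1

    # Perspective projection to pixel coordinates.
    uv = pts_cam @ K.T
    uv = uv[:, :2] / uv[:, 2:3]

    h, w = seg_mask.shape
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    in_image = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    valid = in_front & in_image

    labels = np.full(len(points_lidar), -1, dtype=int)
    labels[valid] = seg_mask[v[valid], u[valid]]
    return labels


if __name__ == "__main__":
    # Toy example with a flat "person" region in the middle of the mask.
    K = np.array([[500.0, 0.0, 320.0],
                  [0.0, 500.0, 240.0],
                  [0.0, 0.0, 1.0]])
    R, t = np.eye(3), np.zeros(3)          # assume LIDAR and camera coincide
    mask = np.zeros((480, 640), dtype=int)
    mask[200:280, 280:360] = 1             # hypothetical class id 1 = person

    pts = np.array([[0.0, 0.0, 2.0],       # projects near the image center
                    [5.0, 0.0, 2.0]])      # projects outside the image
    print(label_lidar_points(pts, mask, K, R, t))   # -> [ 1 -1]
```

On the real system, points labeled this way would presumably be grouped into per-object instances and tracked over time, while the separate terrain branch fuses the active depth and passive stereo streams into a terrain map on the GPU; the exact clustering, tracking, and mapping methods are described in the paper rather than in this sketch.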