Conservative Quantization of Fast Covariance Intersection
Christopher Funk, B. Noack, U. Hanebeck
Pub Date: 2020-09-14
DOI: 10.1109/MFI49285.2020.9235249
Sensor data fusion in wireless sensor networks poses challenges with respect to both theory and implementation. Unknown cross-correlations between estimates distributed across the network need to be addressed carefully as neglecting them leads to overconfident fusion results. In addition, limited processing power and energy supply of the sensor nodes prohibit the use of complex algorithms and high-bandwidth communication. In this work, fast covariance intersection using both quantized estimates and quantized covariance matrices is considered. The proposed method is computationally efficient and significantly reduces the bandwidth required for data transmission while retaining unbiasedness and conservativeness of fast covariance intersection. The performance of the proposed method is evaluated with respect to that of fast covariance intersection, which proves its effectiveness even in the case of substantial data reduction.
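For context, fast covariance intersection replaces the weight optimization of standard covariance intersection with a closed-form choice, which is what makes it attractive for resource-constrained sensor nodes. The sketch below is a minimal Python/NumPy illustration; the trace-based weight is one common fast-CI rule and the uniform quantizer is a placeholder, both assumptions of ours rather than the exact conservative quantization scheme proposed in the paper.

```python
import numpy as np

def fast_ci(x_a, P_a, x_b, P_b):
    """Fuse two estimates with unknown cross-correlation via fast covariance
    intersection, using a trace-based weight (one common closed-form choice)."""
    w = np.trace(P_b) / (np.trace(P_a) + np.trace(P_b))
    I_a, I_b = np.linalg.inv(P_a), np.linalg.inv(P_b)
    P = np.linalg.inv(w * I_a + (1.0 - w) * I_b)
    x = P @ (w * I_a @ x_a + (1.0 - w) * I_b @ x_b)
    return x, P

def quantize_uniform(values, step=0.1):
    """Toy uniform quantizer for transmitted estimates; the paper's conservative
    scheme additionally ensures the quantized covariance stays an upper bound
    of the true one, which this placeholder does not."""
    return step * np.round(np.asarray(values, dtype=float) / step)
```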
{"title":"Conservative Quantization of Fast Covariance Intersection","authors":"Christopher Funk, B. Noack, U. Hanebeck","doi":"10.1109/MFI49285.2020.9235249","DOIUrl":"https://doi.org/10.1109/MFI49285.2020.9235249","url":null,"abstract":"Sensor data fusion in wireless sensor networks poses challenges with respect to both theory and implementation. Unknown cross-correlations between estimates distributed across the network need to be addressed carefully as neglecting them leads to overconfident fusion results. In addition, limited processing power and energy supply of the sensor nodes prohibit the use of complex algorithms and high-bandwidth communication. In this work, fast covariance intersection using both quantized estimates and quantized covariance matrices is considered. The proposed method is computationally efficient and significantly reduces the bandwidth required for data transmission while retaining unbiasedness and conservativeness of fast covariance intersection. The performance of the proposed method is evaluated with respect to that of fast covariance intersection, which proves its effectiveness even in the case of substantial data reduction.","PeriodicalId":446154,"journal":{"name":"2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128752587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine Assisted Video Tagging of Elderly Activities in K-Log Centre
Chanwoong Lee, Hyorim Choi, Shapna Muralidharan, H. Ko, Byounghyun Yoo, G. Kim
Pub Date: 2020-09-14
DOI: 10.1109/MFI49285.2020.9235269
In a rapidly aging society such as South Korea, the growing number of Alzheimer’s Disease (AD) patients is a significant public health problem, and specialized healthcare centers are in high demand. Healthcare providers generally rely on caregivers (CGs) to monitor elderly persons with AD and help them in their daily activities. The K-Log Centre is a healthcare provider located in Korea that helps AD patients meet their daily needs with assistance from CGs in the center. The CGs in the K-Log Centre need to attend to the patients’ individual demands and everyday essentials for long-term care. In addition, the CGs describe and log the day-to-day activities in an Activities of Daily Living (ADL) log, which records various events in detail. This logging can overburden the CGs, degrading the quality of elderly care, requiring additional CGs to be hired to maintain that quality, and creating a negative feedback cycle. In this paper, we analyze this pressing issue in the K-Log Centre and propose a method to facilitate machine-assisted human tagging of videos for logging elderly activities using Human Activity Recognition (HAR). To enable this scenario, we use a You Only Look Once (YOLO-v3)-based deep learning method for object detection and build on it for HAR, creating a multi-modal, machine-assisted human tagging of videos. The proposed algorithm performs HAR with a precision of 98.4%. After designing the HAR model, we tested it on a live video feed from the K-Log Centre. The model achieved an accuracy of 81.4% on live data, reducing the logging workload of the CGs.
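As a rough illustration of machine-assisted tagging, the sketch below turns per-frame object detections into ranked activity suggestions for a caregiver to confirm; detect_objects is a hypothetical stand-in for the YOLO-v3 detector and the rule table is an assumption for illustration, not the paper's actual HAR model.

```python
from collections import Counter

# Hypothetical mapping from co-occurring object classes to ADL tags
# (illustrative only; the paper uses its own HAR model).
ACTIVITY_RULES = {
    frozenset({"person", "spoon", "bowl"}): "eating",
    frozenset({"person", "toothbrush"}): "brushing teeth",
    frozenset({"person", "book"}): "reading",
}

def suggest_tags(frames, detect_objects):
    """Aggregate per-frame detections into ranked activity suggestions.

    detect_objects(frame) is assumed to return the set of class labels
    detected in that frame, e.g. by a YOLO-v3 model.
    """
    votes = Counter()
    for frame in frames:
        detected = frozenset(detect_objects(frame))
        for required_objects, tag in ACTIVITY_RULES.items():
            if required_objects <= detected:
                votes[tag] += 1
    # Ranked suggestions for the caregiver to confirm or correct.
    return [tag for tag, _ in votes.most_common()]
```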
{"title":"Machine Assisted Video Tagging of Elderly Activities in K-Log Centre","authors":"Chanwoong Lee, Hyorim Choi, Shapna Muralidharan, H. Ko, Byounghyun Yoo, G. Kim","doi":"10.1109/MFI49285.2020.9235269","DOIUrl":"https://doi.org/10.1109/MFI49285.2020.9235269","url":null,"abstract":"In a rapidly aging society, like in South Korea, the number of Alzheimer’s Disease (AD) patients is a significant public health problem, and the need for specialized healthcare centers is in high demand. Healthcare providers generally rely on caregivers (CG) for elderly persons with AD to monitor and help them in their daily activities. K-Log Centre is a healthcare provider located in Korea to help AD patients meet their daily needs with assistance from CG in the center. The CG’S in the K-Log Centre need to attend the patients’ unique demands and everyday essentials for long-term care. Moreover, the CG also describes and logs the day-to-day activities in Activities of Daily Living (ADL) log, which comprises various events in detail. The CG’s logging activities can overburden their work, leading to appalling results like suffering quality of elderly care and hiring additional CG’s to maintain the quality of care and a negative feedback cycle. In this paper, we have analyzed this impending issue in K-Log Centre and propose a method to facilitate machine-assisted human tagging of videos for logging of the elderly activities using Human Activity Recognition (HAR). To enable the scenario, we use a You Only Look Once (YOLO-v3)-based deep learning method for object detection and use it for HAR creating a multi-modal machine-assisted human tagging of videos. The proposed algorithm detects the HAR with a precision of 98.4%. After designing the HAR model, we have tested it in a live video feed from the K-Log Centre to test the proposed method. The model showed an accuracy of 81.4% in live data, reducing the logging activities of the CG’s.","PeriodicalId":446154,"journal":{"name":"2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116776040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pushing ROS towards the Dark Side: A ROS-based Co-Simulation Architecture for Mixed-Reality Test Systems for Autonomous Vehicles
M. Zofka, Lars Töttel, Maximilian Zipfl, Marc Heinrich, Tobias Fleck, P. Schulz, Johann Marius Zöllner
Pub Date: 2020-09-14
DOI: 10.1109/MFI49285.2020.9235238
Validation and verification of autonomous vehicles is still an unsolved problem. Although virtual approaches promise a cost-efficient and reproducible solution, a comprehensive and realistic representation of the real-world traffic domain is required in order to make meaningful statements about the performance of a highly automated driving (HAD) function. Models from different domain experts offer a repository of such representations. However, these models must be linked together to obtain an extensive and uniform mapping of the real-world traffic domain for HAD performance assessment. We therefore propose a co-simulation architecture built upon the Robot Operating System (ROS) that couples and integrates different domain expert models and immerses and stimulates real pedestrians as well as AD systems within a common test system. This enables a unified way of generating ground truth for the performance assessment of multi-sensorial AD systems. We demonstrate the applicability of the ROS-powered co-simulation by coupling behavior models in our mixed-reality environment.
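To make the coupling idea concrete, here is a minimal ROS node in the spirit of such a co-simulation bridge: it subscribes to a pose topic produced by one expert model and republishes it on a common ground-truth topic for the system under test. The topic names and message type are assumptions for illustration, not the architecture's actual interfaces.

```python
#!/usr/bin/env python
# Minimal coupling node sketch; /sim/pedestrian_pose and
# /ground_truth/pedestrian_pose are illustrative topic names.
import rospy
from geometry_msgs.msg import PoseStamped

class PedestrianBridge(object):
    """Relays pedestrian poses from one simulation model to a common
    ground-truth topic that the AD system under test can consume."""

    def __init__(self):
        self.pub = rospy.Publisher("/ground_truth/pedestrian_pose",
                                   PoseStamped, queue_size=10)
        rospy.Subscriber("/sim/pedestrian_pose", PoseStamped, self.callback)

    def callback(self, msg):
        # Re-stamp into the common reference frame expected by the test system.
        msg.header.frame_id = "map"
        self.pub.publish(msg)

if __name__ == "__main__":
    rospy.init_node("pedestrian_bridge")
    PedestrianBridge()
    rospy.spin()
```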
{"title":"Pushing ROS towards the Dark Side: A ROS-based Co-Simulation Architecture for Mixed-Reality Test Systems for Autonomous Vehicles","authors":"M. Zofka, Lars Töttel, Maximilian Zipfl, Marc Heinrich, Tobias Fleck, P. Schulz, Johann Marius Zöllner","doi":"10.1109/MFI49285.2020.9235238","DOIUrl":"https://doi.org/10.1109/MFI49285.2020.9235238","url":null,"abstract":"Validation and verification of autonomous vehicles is still an unsolved problem. Although virtual approaches promise a cost efficient and reproducible solution, a most comprehensive and realistic representation of the real world traffic domain is required in order to make valuable statements about the performance of a highly automated driving (HAD) function. Models from different domain experts offer a repository of such representations. However, these models must be linked together for an extensive and uniform mapping of real world traffic domain for HAD performance assessment.Hereby, we propose the concept of a co-simulation architecture built upon the Robot Operating System (ROS) for both coupling and for integration of different domain expert models, immersion and stimulation of real pedestrians as well as AD systems into a common test system. This enables a unified way of generating ground truth for the performance assessment of multi-sensorial AD systems. We demonstrate the applicability of the ROS powered co-simulation by coupling behavior models in our mixed reality environment.","PeriodicalId":446154,"journal":{"name":"2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128340412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic Discovery of Motion Patterns that Improve Learning Rate in Communication-Limited Multi-Robot Systems
Taeyeong Choi, Theodore P. Pavlic
Pub Date: 2020-09-14
DOI: 10.1109/MFI49285.2020.9235218
Learning in robotic systems is largely constrained by the quality of the training data available to a robot learner. Robots may have to make repeated, expensive excursions to gather this data or have humans in the loop to perform demonstrations to ensure reliable performance. The cost can be much higher when a robot embedded within a multi-robot system must learn from the complex aggregate of the many robots that surround it and may react to the learner’s motions. In our previous work [1], [2], we considered the problem of Remote Teammate Localization (ReTLo), where a single robot in a team uses passive observations of a nearby neighbor to accurately infer the position of robots outside of its sensory range even when robot-to-robot communication is not allowed in the system. We demonstrated a communication-free approach to show that the rearmost robot can use motion information of a single robot within its sensory range to predict the positions of all robots in the convoy. Here, we expand on that work with Selective Random Sampling (SRS), a framework that improves the ReTLo learning process by enabling the learner to actively deviate from its trajectory in ways that are likely to lead to better training samples and consequently gain accurate localization ability with fewer observations. By adding diversity to the learner’s motion, SRS simultaneously improves the learner’s predictions of all other teammates and thus can achieve similar performance as prior methods with less data.
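One schematic reading of the SRS idea, under assumptions of ours rather than the authors' exact criterion, is that the learner scores candidate deviations from its nominal motion by how novel the resulting training sample is expected to be and executes the most promising one:

```python
def select_deviation(nominal_action, candidate_deviations, predict_observation,
                     training_set, distance):
    """Pick the motion deviation expected to yield the most novel training sample.

    predict_observation(action) and distance(a, b) are hypothetical hooks
    standing in for the learner's forward model and sample-similarity measure;
    the actual SRS criterion in the paper may differ.
    """
    best_action, best_novelty = nominal_action, 0.0
    for deviation in candidate_deviations:
        action = nominal_action + deviation
        predicted = predict_observation(action)
        # Novelty: distance to the closest sample already collected.
        if training_set:
            novelty = min(distance(predicted, sample) for sample in training_set)
        else:
            novelty = float("inf")
        if novelty > best_novelty:
            best_action, best_novelty = action, novelty
    return best_action
```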
{"title":"Automatic Discovery of Motion Patterns that Improve Learning Rate in Communication-Limited Multi-Robot Systems","authors":"Taeyeong Choi, Theodore P. Pavlic","doi":"10.1109/MFI49285.2020.9235218","DOIUrl":"https://doi.org/10.1109/MFI49285.2020.9235218","url":null,"abstract":"Learning in robotic systems is largely constrained by the quality of the training data available to a robot learner. Robots may have to make multiple, repeated expensive excursions to gather this data or have humans in the loop to perform demonstrations to ensure reliable performance. The cost can be much higher when a robot embedded within a multi-robot system must learn from the complex aggregate of the many robots that surround it and may react to the learner’s motions. In our previous work [1], [2], we considered the problem of Remote Teammate Localization (ReTLo), where a single robot in a team uses passive observations of a nearby neighbor to accurately infer the position of robots outside of its sensory range even when robot-to-robot communication is not allowed in the system. We demonstrated a communication-free approach to show that the rearmost robot can use motion information of a single robot within its sensory range to predict the positions of all robots in the convoy. Here, we expand on that work with Selective Random Sampling (SRS), a framework that improves the ReTLo learning process by enabling the learner to actively deviate from its trajectory in ways that are likely to lead to better training samples and consequently gain accurate localization ability with fewer observations. By adding diversity to the learner’s motion, SRS simultaneously improves the learner’s predictions of all other teammates and thus can achieve similar performance as prior methods with less data.","PeriodicalId":446154,"journal":{"name":"2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125935358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Continuous Fusion of IMU and Pose Data using Uniform B-Spline
Haohao Hu, Johannes Beck, M. Lauer, C. Stiller
Pub Date: 2020-09-14
DOI: 10.1109/MFI49285.2020.9235248
In this work, we present a uniform B-spline based continuous fusion approach that fuses the motion data from an inertial measurement unit and the pose data from a visual localization system accurately, efficiently and continuously. Currently, in the domain of robotics and autonomous driving, most ego-motion fusion approaches are filter based or pose-graph based. Filter-based approaches such as the Kalman filter or the particle filter usually require many parameters to be set carefully, which is a considerable overhead. Moreover, filter-based approaches can only fuse data forwards in time, which is a significant disadvantage when processing asynchronous data. Since pose-graph-based approaches only fuse pose data, the inertial measurement unit data must first be integrated to estimate the corresponding poses, which introduces accumulated error into the fusion system. Additionally, both filter-based and pose-graph-based approaches only provide discrete fusion results, which may decrease the accuracy of subsequent data processing steps. Since ego-motion fusion is generally needed for robots and automated driving vehicles, making it more accurate, robust, efficient and continuous is a major goal. Therefore, in this work, we address this problem and apply the axis-angle rotation representation, Rodrigues’ formula and a uniform B-spline implementation to solve the ego-motion fusion problem continuously. Evaluation results on real-world data show that our approach provides accurate, robust and continuous fusion results.
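For reference, evaluating a uniform cubic B-spline trajectory from its control points reduces to a small matrix product per query time. The sketch below covers only the translational part with the standard basis matrix; the paper additionally handles rotations via the axis-angle representation and Rodrigues' formula, which this illustration omits.

```python
import numpy as np

# Standard basis matrix of the uniform cubic B-spline.
M = (1.0 / 6.0) * np.array([[ 1.0,  4.0,  1.0, 0.0],
                            [-3.0,  0.0,  3.0, 0.0],
                            [ 3.0, -6.0,  3.0, 0.0],
                            [-1.0,  3.0, -3.0, 1.0]])

def eval_uniform_cubic_bspline(control_points, t, dt, t0=0.0):
    """Evaluate the translational part of a uniform cubic B-spline at time t.

    control_points: (N, d) array of control points spaced dt apart in time.
    Valid for queries whose segment index i satisfies 1 <= i <= N - 3.
    """
    s = (t - t0) / dt
    i = int(np.floor(s))                     # segment index
    u = s - i                                # normalized time within the segment, in [0, 1)
    pts = control_points[i - 1:i + 3]        # the four control points affecting time t
    weights = np.array([1.0, u, u ** 2, u ** 3]) @ M
    return weights @ pts
```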
{"title":"Continuous Fusion of IMU and Pose Data using Uniform B-Spline","authors":"Haohao Hu, Johannes Beck, M. Lauer, C. Stiller","doi":"10.1109/MFI49285.2020.9235248","DOIUrl":"https://doi.org/10.1109/MFI49285.2020.9235248","url":null,"abstract":"In this work, we present an uniform B-spline based continuous fusion approach, which fuses the motion data from an inertial measurement unit and the pose data from a visual localization system accurately, efficiently and continu-ously. Currently, in the domain of robotics and autonomous driving, most of the ego motion fusion approaches are filter based or pose graph based. By using the filter based approaches like the Kalman Filter or the Particle Filter, usually, many parameters should be set carefully, which is a big overhead. Besides that, the filter based approaches can only fuse data in a time forwards direction, which is a big disadvantage in processing async data. Since the pose graph based approaches only fuse the pose data, the inertial measurement unit data should be integrated to estimate the corresponding pose data firstly, which can however bring accumulated error into the fusion system. Additionally, the filter based approaches and the pose graph based approaches only provide discrete fusion results, which may decrease the accuracy of the data processing steps afterwards. Since the fusion approach is generally needed for robots and automated driving vehicles, it is a major goal to make it more accurate, robust, efficient and continuous. Therefore, in this work, we address this problem and apply the axis-angle rotation representation method, the Rodrigues’ formula and the uniform B-spline implementation to solve the ego motion fusion problem continuously. Evaluation results performed on the real world data show that our approach provides accurate, robust and continuous fusion results.","PeriodicalId":446154,"journal":{"name":"2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129840726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sparse Magnetometer-Free Real-Time Inertial Hand Motion Tracking
Aaron Grapentin, Dustin Lehmann, Ardjola Zhupa, T. Seel
Pub Date: 2020-09-14
DOI: 10.1109/MFI49285.2020.9235262
Hand motion tracking is a key technology in several applications including ergonomic workplace assessment, human-machine interaction and neurological rehabilitation. Recent technological solutions are based on inertial measurement units (IMUs). They are less obtrusive than exoskeleton-based solutions and overcome the line-of-sight restrictions of optical systems. The number of sensors is crucial for usability, unobtrusiveness, and hardware cost. In this paper, we present a real-time capable, sparse solution for hand motion tracking that requires only five IMUs, one on each of the distal finger segments and one on the back of the hand, in contrast to a recently proposed full-setup solution with 16 IMUs. The method only uses gyroscope and accelerometer readings and avoids magnetometer readings, which enables unrestricted use in indoor environments, near ferromagnetic materials and electronic devices. We use a moving horizon estimation (MHE) approach that exploits kinematic constraints to track motions and performs long-term stable heading estimation. The proposed method is validated experimentally using a recently developed sensor system. It is found that the proposed method yields qualitatively good agreement between the estimated and the actual hand motion and that the estimates are long-term stable. The root-mean-square deviation between the fingertip position estimates of the sparse and the full setup is found to be in the range of 1 cm. The method is hence highly suitable for unobtrusive and non-restrictive motion tracking in a range of applications.
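As background on the estimation machinery, a moving horizon estimator solves a small optimization over a sliding window of recent measurements instead of a recursive update. The sketch below is a generic, unconstrained toy version; the hand-tracking MHE in the paper additionally enforces the kinematic constraints between finger segments and runs in real time.

```python
import numpy as np
from scipy.optimize import minimize

def mhe_window(measurements, f, h, x_init, q_weight=1.0, r_weight=1.0):
    """Generic moving horizon estimate over one window of measurements.

    f(x): process model, h(x): measurement model (both user supplied and
    returning NumPy arrays). This toy version omits the kinematic constraints
    and arrival cost used in a full MHE formulation.
    """
    horizon, nx = len(measurements), len(x_init)

    def cost(flat_states):
        states = flat_states.reshape(horizon, nx)
        c = 0.0
        for k in range(horizon):
            r = measurements[k] - h(states[k])       # measurement residual
            c += r_weight * float(r @ r)
            if k > 0:
                w = states[k] - f(states[k - 1])     # process residual
                c += q_weight * float(w @ w)
        return c

    x0 = np.tile(np.asarray(x_init, dtype=float), (horizon, 1)).ravel()
    res = minimize(cost, x0, method="L-BFGS-B")
    return res.x.reshape(horizon, nx)[-1]            # most recent state estimate
```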
{"title":"Sparse Magnetometer-Free Real-Time Inertial Hand Motion Tracking","authors":"Aaron Grapentin, Dustin Lehmann, Ardjola Zhupa, T. Seel","doi":"10.1109/MFI49285.2020.9235262","DOIUrl":"https://doi.org/10.1109/MFI49285.2020.9235262","url":null,"abstract":"Hand motion tracking is a key technology in several applications including ergonomic workplace assessment, human-machine interaction and neurological rehabilitation. Recent technological solutions are based on inertial measurement units (IMUs). They are less obtrusive than exoskeleton-based solutions and overcome the line-of-sight restrictions of optical systems. The number of sensors is crucial for usability, unobtrusiveness, and hardware cost. In this paper, we present a real-time capable, sparse motion tracking solution for hand motion tracking that requires only five IMUs, one on each of the distal finger segments and one on the back of the hand, in contrast to recently proposed full-setup solution with 16 IMUs. The method only uses gyroscope and accelerometer readings and avoids magnetometer readings, which enables unrestricted use in indoor environments, near ferromagnetic materials and electronic devices. We use a moving horizon estimation (MHE) approach that exploits kinematic constraints to track motions and performs long-term stable heading estimation. The proposed method is validated experimentally using a recently developed sensor system. It is found that the proposed method yields qualitatively good agreement of the estimated and the actual hand motion and that the estimates are long-term stable. The root-mean-square deviation between the fingertip position estimates of the sparse and the full setup are found to be in the range of 1 cm. The method is hence highly suitable for unobtrusive and non-restrictive motion tracking in a range of applications.","PeriodicalId":446154,"journal":{"name":"2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132705754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nonlinear von Mises–Fisher Filtering Based on Isotropic Deterministic Sampling
Kailai Li, F. Pfaff, U. Hanebeck
Pub Date: 2020-09-14
DOI: 10.1109/MFI49285.2020.9235260
We present a novel deterministic sampling approach for von Mises–Fisher distributions of arbitrary dimensions. Following the idea of the unscented transform, samples of configurable size are drawn isotropically on the hypersphere while preserving the mean resultant vector of the underlying distribution. Based on these samples, a von Mises–Fisher filter is proposed for nonlinear estimation of hyperspherical states. Compared with existing von Mises–Fisher-based filtering schemes, the proposed filter exhibits superior hyperspherical tracking performance.
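For reference, the von Mises–Fisher density on the unit hypersphere and the mean resultant vector preserved by the proposed sampling can be written as follows (standard definitions, not specific to this paper):

```latex
% von Mises--Fisher density on the unit hypersphere S^{d-1}, with mean
% direction \mu (\|\mu\| = 1) and concentration \kappa \ge 0:
\[
  f(\mathbf{x};\boldsymbol{\mu},\kappa)
    = C_d(\kappa)\,\exp\!\bigl(\kappa\,\boldsymbol{\mu}^{\top}\mathbf{x}\bigr),
  \qquad
  C_d(\kappa) = \frac{\kappa^{d/2-1}}{(2\pi)^{d/2}\, I_{d/2-1}(\kappa)} .
\]
% A weighted deterministic sample set \{(w_i, \mathbf{x}_i)\}_{i=1}^{n} on S^{d-1}
% preserves the mean resultant vector of the distribution if
\[
  \sum_{i=1}^{n} w_i\,\mathbf{x}_i
    = \mathbb{E}[\mathbf{x}]
    = A_d(\kappa)\,\boldsymbol{\mu},
  \qquad
  A_d(\kappa) = \frac{I_{d/2}(\kappa)}{I_{d/2-1}(\kappa)},
\]
% where I_\nu denotes the modified Bessel function of the first kind.
```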
{"title":"Nonlinear von Mises–Fisher Filtering Based on Isotropic Deterministic Sampling","authors":"Kailai Li, F. Pfaff, U. Hanebeck","doi":"10.1109/MFI49285.2020.9235260","DOIUrl":"https://doi.org/10.1109/MFI49285.2020.9235260","url":null,"abstract":"We present a novel deterministic sampling approach for von Mises–Fisher distributions of arbitrary dimensions. Following the idea of the unscented transform, samples of configurable size are drawn isotropically on the hypersphere while preserving the mean resultant vector of the underlying distribution. Based on these samples, a von Mises–Fisher filter is proposed for nonlinear estimation of hyperspherical states. Compared with existing von Mises–Fisher-based filtering schemes, the proposed filter exhibits superior hyperspherical tracking performance.","PeriodicalId":446154,"journal":{"name":"2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"16 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133487595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Online 3D Frontier-Based UGV and UAV Exploration Using Direct Point Cloud Visibility
Jason L. Williams, Shu Jiang, M. O'Brien, Glenn Wagner, E. Hernández, Mark Cox, Alex Pitt, R. Arkin, N. Hudson
Pub Date: 2020-09-14
DOI: 10.1109/MFI49285.2020.9235268
While robots have long been proposed as a tool to reduce human personnel’s exposure to danger in subterranean environments, these environments also present significant challenges to the development of these robots. Fundamental to this challenge is the problem of autonomous exploration. Frontier-based methods have been a powerful and successful approach to exploration, but complex 3D environments remain a challenge when online employment is required. This paper presents a new approach that addresses the complexity of operating in 3D by directly modelling the boundary between observed free and unobserved space (the frontier), rather than utilising dense 3D volumetric representations. By avoiding a representation involving a single map, it also achieves scalability to problems where Simultaneous Localisation and Mapping (SLAM) loop closures are essential. The approach enabled a team of seven ground and air robots to autonomously explore the DARPA Subterranean Challenge Urban Circuit, jointly traversing over 8 km in a complex and communication-denied environment.
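For background, classic frontier-based exploration marks observed free space that borders unobserved space as candidate goals. The grid-based sketch below illustrates that definition; the paper's contribution is to model this boundary directly from point cloud visibility rather than from such a dense volumetric representation.

```python
import numpy as np

FREE, UNKNOWN, OCCUPIED = 0, -1, 1

def find_frontier_cells(grid):
    """Classic grid-based frontier detection: free cells bordering unknown space.

    grid: 2D NumPy array with values FREE, UNKNOWN or OCCUPIED. Shown only as
    background for the frontier concept; the paper avoids dense maps like this.
    """
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != FREE:
                continue
            # 8-neighbourhood, clipped at the grid boundary.
            neighbours = grid[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
            if np.any(neighbours == UNKNOWN):
                frontiers.append((r, c))
    return frontiers
```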
{"title":"Online 3D Frontier-Based UGV and UAV Exploration Using Direct Point Cloud Visibility","authors":"Jason L. Williams, Shu Jiang, M. O'Brien, Glenn Wagner, E. Hernández, Mark Cox, Alex Pitt, R. Arkin, N. Hudson","doi":"10.1109/MFI49285.2020.9235268","DOIUrl":"https://doi.org/10.1109/MFI49285.2020.9235268","url":null,"abstract":"While robots have long been proposed as a tool to reduce human personnel’s exposure to danger in subterranean environments, these environments also present significant challenges to the development of these robots. Fundamental to this challenge is the problem of autonomous exploration. Frontier-based methods have been a powerful and successful approach to exploration, but complex 3D environments remain a challenge when online employment is required. This paper presents a new approach that addresses the complexity of operating in 3D by directly modelling the boundary between observed free and unobserved space (the frontier), rather than utilising dense 3D volumetric representations. By avoiding a representation involving a single map, it also achieves scalability to problems where Simultaneous Localisation and Matching (SLAM) loop closures are essential. The approach enabled a team of seven ground and air robots to autonomously explore the DARPA Subterranean Challenge Urban Circuit, jointly traversing over 8 km in a complex and communication denied environment.","PeriodicalId":446154,"journal":{"name":"2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"492 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116030226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SemanticVoxels: Sequential Fusion for 3D Pedestrian Detection using LiDAR Point Cloud and Semantic Segmentation
Juncong Fei, Wenbo Chen, Philipp Heidenreich, Sascha Wirges, C. Stiller
Pub Date: 2020-09-14
DOI: 10.1109/MFI49285.2020.9235240
3D pedestrian detection is a challenging task in automated driving because pedestrians are relatively small, frequently occluded and easily confused with narrow vertical objects. LiDAR and camera are two commonly used sensor modalities for this task, which should provide complementary information. Unexpectedly, LiDAR-only detection methods tend to outperform multisensor fusion methods in public benchmarks. Recently, PointPainting has been presented to eliminate this performance drop by effectively fusing the output of a semantic segmentation network instead of the raw image information. In this paper, we propose a generalization of PointPainting that is able to apply fusion at different levels. After the semantic augmentation of the point cloud, we encode raw point data in pillars to obtain geometric features, encode semantic point data in voxels to obtain semantic features, and fuse them in an effective way. Experimental results on the KITTI test set show that SemanticVoxels achieves state-of-the-art performance in both 3D and bird’s eye view pedestrian detection benchmarks. In particular, our approach demonstrates its strength in detecting challenging pedestrian cases and outperforms current state-of-the-art approaches.
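The semantic augmentation step follows the PointPainting idea of decorating each LiDAR point with the segmentation scores of the pixel it projects to. A simplified NumPy sketch is shown below; the variable names, the 0.1 m near-plane cutoff and the omission of timing synchronization are our simplifications, not details from the paper.

```python
import numpy as np

def paint_points(points_xyz, seg_scores, T_cam_lidar, K):
    """PointPainting-style semantic augmentation (simplified sketch).

    points_xyz: (N, 3) LiDAR points, seg_scores: (H, W, C) per-pixel class
    scores from a semantic segmentation network, T_cam_lidar: (4, 4)
    extrinsics, K: (3, 3) camera intrinsics. Returns (M, 3 + C) painted
    points that project inside the image.
    """
    ones = np.ones((points_xyz.shape[0], 1))
    pts_cam = (T_cam_lidar @ np.hstack([points_xyz, ones]).T).T[:, :3]
    in_front = pts_cam[:, 2] > 0.1                     # keep points ahead of the camera
    pts_cam = pts_cam[in_front]
    uvw = (K @ pts_cam.T).T
    uv = (uvw[:, :2] / uvw[:, 2:3]).astype(int)        # pixel coordinates
    H, W, _ = seg_scores.shape
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < W) & (uv[:, 1] >= 0) & (uv[:, 1] < H)
    painted = np.hstack([points_xyz[in_front][valid],
                         seg_scores[uv[valid, 1], uv[valid, 0]]])
    return painted
```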
{"title":"SemanticVoxels: Sequential Fusion for 3D Pedestrian Detection using LiDAR Point Cloud and Semantic Segmentation","authors":"Juncong Fei, Wenbo Chen, Philipp Heidenreich, Sascha Wirges, C. Stiller","doi":"10.1109/MFI49285.2020.9235240","DOIUrl":"https://doi.org/10.1109/MFI49285.2020.9235240","url":null,"abstract":"3D pedestrian detection is a challenging task in automated driving because pedestrians are relatively small, frequently occluded and easily confused with narrow vertical objects. LiDAR and camera are two commonly used sensor modalities for this task, which should provide complementary information. Unexpectedly, LiDAR-only detection methods tend to outperform multisensor fusion methods in public benchmarks. Recently, PointPainting has been presented to eliminate this performance drop by effectively fusing the output of a semantic segmentation network instead of the raw image information. In this paper, we propose a generalization of PointPainting to be able to apply fusion at different levels. After the semantic augmentation of the point cloud, we encode raw point data in pillars to get geometric features and semantic point data in voxels to get semantic features and fuse them in an effective way. Experimental results on the KITTI test set show that SemanticVoxels achieves state-of-the-art performance in both 3D and bird’s eye view pedestrian detection benchmarks. In particular, our approach demonstrates its strength in detecting challenging pedestrian cases and outperforms current state-of-the-art approaches.","PeriodicalId":446154,"journal":{"name":"2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132944957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AirMuseum: a heterogeneous multi-robot dataset for stereo-visual and inertial Simultaneous Localization And Mapping
Rodolphe Dubois, A. Eudes, V. Fremont
Pub Date: 2020-09-14
DOI: 10.1109/MFI49285.2020.9235257
This paper introduces a new dataset dedicated to multi-robot stereo-visual and inertial Simultaneous Localization And Mapping (SLAM). The dataset consists of five indoor multi-robot scenarios acquired with ground and aerial robots in a former Air Museum at ONERA Meudon, France. These scenarios were designed to exhibit specific opportunities and challenges associated with collaborative SLAM. Each scenario includes synchronized sequences between multiple robots with stereo images and inertial measurements. They also exhibit explicit direct interactions between robots through the detection of mounted AprilTag markers [1]. Ground-truth trajectories for each robot were computed using Structure-from-Motion algorithms and constrained with detections of fixed AprilTag markers placed as beacons in the experimental area. The scenarios have been benchmarked with state-of-the-art monocular, stereo and visual-inertial SLAM algorithms to provide a baseline of single-robot performance to be enhanced in collaborative frameworks.
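As a generic illustration of anchoring estimated trajectories to fixed beacons, the sketch below computes a rigid (Kabsch) alignment between beacon positions recovered by Structure-from-Motion and their fixed reference positions; this is a standard building block and an assumption of ours, not the authors' exact ground-truthing pipeline.

```python
import numpy as np

def align_to_beacons(estimated, surveyed):
    """Rigid (Kabsch) alignment of estimated beacon positions to surveyed ones.

    estimated, surveyed: (N, 3) arrays of corresponding 3D points, e.g. AprilTag
    beacon positions recovered by Structure-from-Motion and their fixed
    reference positions. Returns R, t such that R @ p_est + t maps an estimate
    into the beacon frame; the same transform can then anchor the trajectory.
    """
    mu_e, mu_s = estimated.mean(axis=0), surveyed.mean(axis=0)
    H = (estimated - mu_e).T @ (surveyed - mu_s)      # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # reflection guard
    R = Vt.T @ D @ U.T
    t = mu_s - R @ mu_e
    return R, t
```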
{"title":"AirMuseum: a heterogeneous multi-robot dataset for stereo-visual and inertial Simultaneous Localization And Mapping","authors":"Rodolphe Dubois, A. Eudes, V. Fremont","doi":"10.1109/MFI49285.2020.9235257","DOIUrl":"https://doi.org/10.1109/MFI49285.2020.9235257","url":null,"abstract":"This paper introduces a new dataset dedicated to multi-robot stereo-visual and inertial Simultaneous Localization And Mapping (SLAM). This dataset consists in five indoor multi-robot scenarios acquired with ground and aerial robots in a former Air Museum at ONERA Meudon, France. Those scenarios were designed to exhibit some specific opportunities and challenges associated to collaborative SLAM. Each scenario includes synchronized sequences between multiple robots with stereo images and inertial measurements. They also exhibit explicit direct interactions between robots through the detection of mounted AprilTag markers [1]. Ground-truth trajectories for each robot were computed using Structure-from-Motion algorithms and constrained with the detection of fixed AprilTag markers placed as beacons on the experimental area. Those scenarios have been benchmarked on state-of-the-art monocular, stereo and visual-inertial SLAM algorithms to provide a baseline of the single-robot performances to be enhanced in collaborative frameworks.","PeriodicalId":446154,"journal":{"name":"2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125201428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}