Latest Publications — 2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)

Orientation-aware People Detection and Counting Method based on Overhead Fisheye Camera
Hu Cao, Boyang Peng, Linxuan Jia, Bin Li, Alois Knoll, Guang Chen
The rise of intelligent vision-based people detection and counting methods will have a significant impact on the future security and space management of intelligent buildings. Current deep learning-based people detection algorithms achieve state-of-the-art performance on images collected by standard cameras. Nevertheless, standard vision approaches do not perform well on fisheye cameras because they are not suited to fisheye images with radial geometry and barrel distortion. For people detection and counting tasks, overhead fisheye cameras provide a larger field of view than standard cameras. In this paper, we propose an orientation-aware people detection and counting method based on an overhead fisheye camera. Specifically, an orientation-aware deep convolutional neural network with a simultaneous attention refinement module (SARM) is introduced for people detection in arbitrary directions. Based on the attention mechanism, SARM suppresses noise features and highlights object features, improving the network's ability to focus on people with different poses and orientations. After detection results are collected, an Internet of Things (IoT) system based on the Real Time Streaming Protocol (RTSP) distributes the results to different devices. Experiments on three common fisheye image datasets show that our method generalizes well under low-light conditions and outperforms state-of-the-art methods.
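The abstract does not give SARM's internals; purely as a hypothetical sketch of attention-based feature refinement (the pooling, sigmoid gating, and toy feature values below are all invented for illustration, not the authors' module), a channel-gating step might look like:

```python
import math

def attention_refine(feature_maps):
    """Channel-attention sketch: reweight each channel of a C x H x W
    feature map by a sigmoid gate of its global average, so strongly
    responding (object-like) channels are kept and weakly responding
    (noise-like) channels are damped."""
    gated = []
    for channel in feature_maps:
        mean = sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        gate = 1.0 / (1.0 + math.exp(-mean))  # sigmoid of the pooled response
        gated.append([[v * gate for v in row] for row in channel])
    return gated

feats = [[[2.0, 2.0], [2.0, 2.0]],      # strongly responding channel
         [[-2.0, -2.0], [-2.0, -2.0]]]  # weakly responding channel
out = attention_refine(feats)  # first channel keeps ~88% of its magnitude, second only ~12%
```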
DOI: 10.1109/MFI55806.2022.9913868 (published 2022-09-20)
Citations: 0
Direct Position Determination Using Direct Signals and First-Order Reflections by Exploiting the Multipath Environment
Devanand Palur Palanivelu, M. Oispuu, W. Koch
This paper introduces a novel subspace-based Direct Position Determination (DPD) approach named Multipath-DPD that passively localizes the source from the raw array data representing the direct signals and the first-order reflections in a multipath environment. The multipath propagation is modeled with the Image-Source Method (ISM). The method exploits the known urban environment and avoids the ambiguity, modeling, and measurement-assignment issues inherent in bearing-based localization approaches. Simulation results show that the proposed Multipath-DPD outperforms the classical DPD in the considered scenarios and approaches the derived Cramér-Rao Bound (CRB) asymptotically.
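To make the ISM idea concrete, here is a minimal 2-D sketch of computing a first-order image source by mirroring the emitter across a wall; the geometry helper and the example wall are illustrative, not the paper's implementation:

```python
def image_source(source, wall_a, wall_b):
    """First-order image source (ISM sketch): mirror `source` across the
    infinite 2-D line through the wall endpoints `wall_a` and `wall_b`."""
    (sx, sy), (ax, ay), (bx, by) = source, wall_a, wall_b
    dx, dy = bx - ax, by - ay
    # project the source onto the wall line, then reflect through the foot point
    t = ((sx - ax) * dx + (sy - ay) * dy) / (dx * dx + dy * dy)
    px, py = ax + t * dx, ay + t * dy
    return (2 * px - sx, 2 * py - sy)

# A reflected path receiver -> wall -> source has the same length as the
# straight path receiver -> image source, which is what lets a multipath-aware
# DPD treat first-order reflections like direct signals from virtual emitters.
img = image_source((1.0, 2.0), (0.0, 0.0), (4.0, 0.0))  # wall along the x-axis
```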
DOI: 10.1109/MFI55806.2022.9913869 (published 2022-09-20)
Citations: 1
Self-supervised 3D Object Detection from Monocular Pseudo-LiDAR
Curie Kim, Ue-Hwan Kim, Jong-Hwan Kim
There have been attempts to detect 3D objects by fusing stereo camera images with LiDAR sensor data, or by using LiDAR for pre-training and only monocular images at test time, but far fewer attempts to use only monocular image sequences, owing to their low accuracy. In addition, only scale-inconsistent depth can be predicted from monocular images alone, which is why researchers are reluctant to use them by themselves. We therefore propose a method for predicting absolute depth and detecting 3D objects from monocular image sequences alone by training the detection network and the depth prediction network end to end. As a result, the proposed method surpasses existing methods on the KITTI 3D dataset. Even against methods that use both monocular images and 3D LiDAR during training to improve performance, ours achieves the best results for the same input. Moreover, end-to-end learning not only improves depth prediction performance but also enables absolute depth prediction, because our network exploits the fact that a 3D object of a known class, such as a car, has an approximately known physical size.
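The scale-recovery intuition the abstract appeals to can be sketched with plain pinhole geometry; the focal length, car height, and box height below are made-up numbers, and the paper itself learns this end to end rather than applying a closed-form rule:

```python
def absolute_depth(focal_px, real_height_m, bbox_height_px):
    """Pinhole sketch: an object of known physical height that spans
    `bbox_height_px` pixels under focal length `focal_px` must lie at
    depth f * H / h, which pins down the otherwise free monocular scale."""
    return focal_px * real_height_m / bbox_height_px

# Hypothetical numbers: a ~1.5 m tall car imaged 70 px tall by a camera
# with a 700 px focal length sits about 15 m away.
depth = absolute_depth(focal_px=700.0, real_height_m=1.5, bbox_height_px=70.0)
```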
DOI: 10.1109/MFI55806.2022.9913846 (published 2022-09-20)
Citations: 1
A Spatio-Temporal-Semantic Environment Representation for Autonomous Mobile Robots equipped with various Sensor Systems
Mark Niemeyer, Sebastian Pütz, J. Hertzberg
The large amount of high-resolution sensor data, both temporal and spatial, that autonomous mobile robots collect in today's systems requires structured and efficient management and storage during the robot mission. In response, we present SEEREP: a Spatio-Temporal-Semantic Environment Representation for Autonomous Mobile Robots. SEEREP handles various types of data at once and provides an efficient query interface for all three modalities, which can be combined for high-level analyses. It supports common robotic sensor data types such as images and point clouds, as well as sensor and robot coordinate frames that change over time. Furthermore, SEEREP provides an efficient HDF5-based storage system that runs on the robot during operation, compatible with ROS and the corresponding sensor message definitions. The compressed HDF5 data backend can be transferred efficiently to an application server running a SEEREP query server that provides gRPC interfaces with Protobuf and Flatbuffers message types. The query server can support high-level planning and reasoning systems in, e.g., agricultural environments or other partially unstructured environments that change over time. In this paper we show that SEEREP is much better suited for these tasks than a traditional GIS, which cannot handle the different types of robotic sensor data.
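SEEREP's actual interface is a gRPC query server over an HDF5 backend; as a toy illustration of what a combined spatio-temporal-semantic query means, consider this pure-Python filter (the `Record` type and its field names are invented for the example):

```python
from dataclasses import dataclass

@dataclass
class Record:
    stamp: float       # acquisition time in seconds (temporal modality)
    x: float           # measurement position (spatial modality)
    y: float
    labels: frozenset  # semantic labels attached to the data

def query(records, t_min, t_max, bbox, label):
    """Combined filter: a record matches only if its time, position,
    and semantics all satisfy the query."""
    x0, y0, x1, y1 = bbox
    return [r for r in records
            if t_min <= r.stamp <= t_max
            and x0 <= r.x <= x1 and y0 <= r.y <= y1
            and label in r.labels]

recs = [Record(1.0, 2.0, 3.0, frozenset({"crop"})),   # matches all three modalities
        Record(5.0, 2.0, 3.0, frozenset({"crop"})),   # too late
        Record(1.5, 9.0, 9.0, frozenset({"weed"}))]   # wrong place and label
hits = query(recs, 0.0, 2.0, (0.0, 0.0, 5.0, 5.0), "crop")  # -> only the first
```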
DOI: 10.1109/MFI55806.2022.9913873 (published 2022-09-20)
Citations: 1
Optimal Sensor Placement for Multilateration Using Alternating Greedy Removal and Placement
Daniel Frisch, Kailai Li, U. Hanebeck
We present a novel algorithm for optimal sensor placement in multilateration problems. Our goal is to design a sensor network that achieves optimal localization accuracy anywhere in the covered region. We consider the discrete placement problem, where the possible locations of the sensors are selected from a discrete set. Thus, we obtain a combinatorial optimization problem instead of a continuous one. While combinatorial optimization may at first sound like more effort, we present an algorithm that finds a globally optimal solution surprisingly quickly.
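The paper's algorithm alternates greedy removal and placement; the sketch below shows only the plain greedy-placement half on a toy instance, and it uses worst-case target-to-sensor distance as a stand-in objective for localization accuracy (the paper's objective is multilateration accuracy, not coverage):

```python
def greedy_placement(candidates, targets, k):
    """Greedy half of the idea: repeatedly add the candidate site that
    most reduces the worst-case squared distance from any target point
    to its nearest placed sensor (a coverage proxy for accuracy)."""
    def worst_case(placed):
        return max(min((tx - sx) ** 2 + (ty - sy) ** 2
                       for sx, sy in placed)
                   for tx, ty in targets)
    placed = []
    for _ in range(k):
        best = min((c for c in candidates if c not in placed),
                   key=lambda c: worst_case(placed + [c]))
        placed.append(best)
    return placed

cands = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
targs = [(0.0, 1.0), (10.0, 1.0)]
picked = greedy_placement(cands, targs, 2)  # central site first, then a corner
```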
DOI: 10.1109/MFI55806.2022.9913847 (published 2022-09-20)
Citations: 1
Automated Road Asset Data Collection and Classification using Consumer Dashcams
Michael Sieverts, Yoshihiro Obata, Mohammad Farhadmanesh, D. Sacharny, T. Henderson
A growing remote sensing network of consumer dashcams presents Departments of Transportation (DOTs) worldwide with opportunities to dramatically reduce the cost and effort of monitoring and maintaining hundreds of thousands of sign assets on public roadways. However, many technical challenges confront the applications and technologies that will enable this transformation of roadway maintenance. This paper highlights an efficient approach to detecting and classifying the more than 600 classes of traffic signs in the United States defined in the Manual on Uniform Traffic Control Devices (MUTCD). Given the variability of specifications and the quality of images and metadata collected from consumer dashcams, a deep learning approach offers an efficient development tool for small organizations that want to leverage this data type for detection and classification. The paper presents a two-step process: a detection network that locates signs in dashcam images, and a classification network that crops the detected bounding box and assigns one of the more than 600 sign classes. The detection network is trained on labeled dashcam data from Nashville, Tennessee, and a combination of real and synthetic data is used to train the classification network. The architecture was applied to real-world image data provided by the Utah Department of Transportation and Blyncsy, Inc., and achieved modest results (test accuracy of 0.47) with relatively low development time.
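The hand-off between the two networks is just a crop of the detector's box before classification; a minimal sketch (the list-of-rows image and the `(x0, y0, x1, y1)` box convention are assumptions of the example, not the paper's code):

```python
def crop(image, bbox):
    """Cut the detector's bounding box out of the full frame before the
    classifier sees it. `image` is a row-major grid (image[y][x]) and
    `bbox` is (x0, y0, x1, y1) with exclusive upper bounds."""
    x0, y0, x1, y1 = bbox
    return [row[x0:x1] for row in image[y0:y1]]

# 6 x 4 dummy frame whose "pixels" record their own (x, y) coordinates
frame = [[(x, y) for x in range(6)] for y in range(4)]
patch = crop(frame, (1, 1, 4, 3))  # 3 px wide, 2 px tall
```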
DOI: 10.1109/MFI55806.2022.9913859 (published 2022-09-20)
Citations: 0
Patch-NetVLAD+: Learned patch descriptor and weighted matching strategy for place recognition
Yingfeng Cai, Junqiao Zhao, Jiafeng Cui, Fenglin Zhang, Chen Ye, T. Feng
Visual Place Recognition (VPR) in areas with similar scenes, such as urban or indoor scenarios, is a major challenge. Existing VPR methods that use global descriptors have difficulty capturing local specific regions (LSRs) in the scene and are therefore prone to localization confusion in such scenarios. Finding the LSRs that are critical for place recognition thus becomes key. To address this challenge, we introduce Patch-NetVLAD+, inspired by patch-based VPR research. Our method uses a fine-tuning strategy with triplet loss to make NetVLAD suitable for extracting patch-level descriptors. Moreover, unlike existing methods that treat all patches in an image equally, our method extracts LSR patches, which appear less frequently throughout the dataset, and lets them play an important role in VPR by assigning them proper weights. Experiments on the Pittsburgh30k and Tokyo247 datasets show that our approach achieves up to 9.3% performance improvement over existing patch-based methods.
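The triplet loss used for fine-tuning has a standard form: pull the anchor descriptor toward a positive (same place) and push it away from a negative (different place) by a margin. A minimal sketch with invented 2-D descriptors:

```python
def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet hinge on squared distances: zero once the
    positive is closer to the anchor than the negative by `margin`."""
    def d2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return max(0.0, d2(anchor, positive) - d2(anchor, negative) + margin)

well_separated = triplet_loss([1.0, 0.0], [0.9, 0.1], [0.0, 1.0])  # no loss
violating = triplet_loss([1.0, 0.0], [0.0, 1.0], [0.9, 0.1])       # penalized
```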
DOI: 10.1109/MFI55806.2022.9913860 (published 2022-02-11)
Citations: 2
DSC: Deep Scan Context Descriptor for Large-Scale Place Recognition
Jiafeng Cui, Teng Huang, Yingfeng Cai, Junqiao Zhao, Lu Xiong, Zhuoping Yu
LiDAR-based place recognition is an essential and challenging task both in loop closure detection and global relocalization. We propose Deep Scan Context (DSC), a general and discriminative global descriptor that captures the relationship among segments of a point cloud. Unlike previous methods that utilize either semantics or a sequence of adjacent point clouds for better place recognition, we only use the raw point clouds to get competitive results. Concretely, we first segment the point cloud egocentrically to divide the point cloud into several segments and extract the features of the segments from both spatial distribution and shape differences. Then, we introduce a graph neural network to aggregate these features into an embedding representation. Extensive experiments conducted on the KITTI dataset show that DSC is robust to scene variants and outperforms existing methods.
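The egocentric segmentation step can be pictured as range/bearing binning around the sensor, in the spirit of scan-context-style descriptors; the ring and sector counts below are arbitrary example values, not the paper's configuration:

```python
import math

def egocentric_segments(points, n_rings=2, n_sectors=4, max_range=10.0):
    """Bin 2-D points around the sensor into (ring, sector) cells by
    range and bearing -- the kind of egocentric partition that
    scan-context-style descriptors are built on."""
    cells = {}
    for x, y in points:
        r = math.hypot(x, y)
        if r >= max_range:
            continue  # beyond the descriptor's range
        ring = int(r / (max_range / n_rings))
        sector = int((math.atan2(y, x) % (2 * math.pi))
                     / (2 * math.pi / n_sectors))
        cells.setdefault((ring, sector), []).append((x, y))
    return cells

cells = egocentric_segments([(1.0, 1.0), (6.0, 0.5), (-1.0, -1.0)])
```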
DOI: 10.1109/MFI55806.2022.9913850 (published 2021-11-27)
Citations: 4