Deep Learning Off-the-shelf Holistic Feature Descriptors for Visual Place Recognition in Challenging Conditions

2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP), Pub Date: 2020-09-21, DOI: 10.1109/MMSP48831.2020.9287063

Farid Aliajni, Esa Rahtu


Citations: 2

Abstract


In this paper, we present a comprehensive study of the utility of deep learning feature extraction methods for the visual place recognition task under three challenging conditions: appearance variation, viewpoint variation, and a combination of both. We extensively compare the performance of convolutional neural network architectures with batch normalization layers in terms of the fraction of correct matches. These architectures are primarily trained for image classification and object detection and are used as holistic feature descriptors for visual place recognition. To verify the validity of our results, we use four real-world place recognition datasets. Our investigation demonstrates that convolutional neural network architectures coupled with batch normalization and trained for other computer vision tasks outperform architectures specifically designed for place recognition.
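
The evaluation idea in the abstract (one holistic descriptor per image, scored by the fraction of correct matches) can be illustrated with a minimal sketch. The abstract does not specify the matching procedure, so everything below is an assumption made for illustration: descriptors are compared by cosine similarity, each query is matched to its single nearest reference image, and a match counts as correct when it lies within a hypothetical `tolerance` of the ground-truth index. The toy vectors stand in for features that would, in practice, come from a pre-trained CNN; the function names are illustrative and not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a descriptor to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def match_places(query_descs, ref_descs):
    """Return, for each query descriptor, the index of the reference
    descriptor with the highest cosine similarity (assumed matching rule)."""
    refs = [l2_normalize(r) for r in ref_descs]
    matches = []
    for q in query_descs:
        qn = l2_normalize(q)
        sims = [sum(a * b for a, b in zip(qn, r)) for r in refs]
        matches.append(max(range(len(sims)), key=sims.__getitem__))
    return matches


def fraction_correct(matches, ground_truth, tolerance=0):
    """Fraction of queries whose matched index is within `tolerance`
    positions of the ground-truth index (tolerance is a hypothetical knob)."""
    if not matches:
        return 0.0
    correct = sum(
        1 for m, g in zip(matches, ground_truth) if abs(m - g) <= tolerance
    )
    return correct / len(matches)


# Toy descriptors standing in for CNN features of three reference places.
reference = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Three query images; the third is ambiguous and gets matched to place 0.
query = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.6, 0.0, 0.5]]

matches = match_places(query, reference)
score = fraction_correct(matches, ground_truth=[0, 1, 2])
print(matches, score)
```

In this toy setup, two of the three queries are matched to the correct place. Swapping the toy vectors for pooled activations from any pre-trained network leaves the scoring code unchanged, which is what makes an off-the-shelf comparison across architectures straightforward.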

