仅使用稀疏语义对象特征的鲁棒无人机视觉教学和重复

A. Toudeshki, Faraz Shamshirdar, R. Vaughan
{"title":"仅使用稀疏语义对象特征的鲁棒无人机视觉教学和重复","authors":"A. Toudeshki, Faraz Shamshirdar, R. Vaughan","doi":"10.1109/CRV.2018.00034","DOIUrl":null,"url":null,"abstract":"We demonstrate the use of semantic object detections as robust features for Visual Teach and Repeat (VTR). Recent CNN-based object detectors are able to reliably detect objects of tens or hundreds of categories in video at frame rates. We show that such detections are repeatable enough to use as landmarks for VTR, without any low-level image features. Since object detections are highly invariant to lighting and surface appearance changes, our VTR can cope with global lighting changes and local movements of the landmark objects. In the teaching phase we build extremely compact scene descriptors: a list of detected object labels and their image-plane locations. In the repeating phase, we use Seq-SLAM-like relocalization to identify the most similar learned scene, then use a motion control algorithm based on the funnel lane theory to navigate the robot along the previously piloted trajectory. We evaluate the method on a commodity UAV, examining the robustness of the algorithm to new viewpoints, lighting conditions, and movements of landmark objects. The results suggest that semantic object features could be useful due to their invariance to superficial appearance changes compared to low-level image features.","PeriodicalId":281779,"journal":{"name":"2018 15th Conference on Computer and Robot Vision (CRV)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":"{\"title\":\"Robust UAV Visual Teach and Repeat Using Only Sparse Semantic Object Features\",\"authors\":\"A. Toudeshki, Faraz Shamshirdar, R. Vaughan\",\"doi\":\"10.1109/CRV.2018.00034\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We demonstrate the use of semantic object detections as robust features for Visual Teach and Repeat (VTR). Recent CNN-based object detectors are able to reliably detect objects of tens or hundreds of categories in video at frame rates. We show that such detections are repeatable enough to use as landmarks for VTR, without any low-level image features. Since object detections are highly invariant to lighting and surface appearance changes, our VTR can cope with global lighting changes and local movements of the landmark objects. In the teaching phase we build extremely compact scene descriptors: a list of detected object labels and their image-plane locations. In the repeating phase, we use Seq-SLAM-like relocalization to identify the most similar learned scene, then use a motion control algorithm based on the funnel lane theory to navigate the robot along the previously piloted trajectory. We evaluate the method on a commodity UAV, examining the robustness of the algorithm to new viewpoints, lighting conditions, and movements of landmark objects. The results suggest that semantic object features could be useful due to their invariance to superficial appearance changes compared to low-level image features.\",\"PeriodicalId\":281779,\"journal\":{\"name\":\"2018 15th Conference on Computer and Robot Vision (CRV)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 15th Conference on Computer and Robot Vision (CRV)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CRV.2018.00034\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 15th Conference on Computer and Robot Vision (CRV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CRV.2018.00034","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

摘要

我们演示了使用语义对象检测作为视觉教学和重复(VTR)的鲁棒特征。最近基于cnn的目标检测器能够以帧率可靠地检测视频中数十或数百个类别的对象。我们证明这种检测是可重复的,足以用作VTR的地标,没有任何低级图像特征。由于物体检测对照明和表面外观变化具有高度的不变性,因此我们的VTR可以处理全局照明变化和地标物体的局部运动。在教学阶段,我们构建了非常紧凑的场景描述符:检测到的对象标签及其图像平面位置的列表。在重复阶段,我们使用类似seq - slam的重新定位方法来识别最相似的学习场景,然后使用基于漏斗车道理论的运动控制算法来沿着先前的驾驶轨迹导航机器人。我们在一架商用无人机上评估了该方法,检查了算法对新视点、光照条件和地标物体运动的鲁棒性。结果表明,与低级图像特征相比,语义对象特征对表面外观变化的不变性可能是有用的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Robust UAV Visual Teach and Repeat Using Only Sparse Semantic Object Features
We demonstrate the use of semantic object detections as robust features for Visual Teach and Repeat (VTR). Recent CNN-based object detectors are able to reliably detect objects of tens or hundreds of categories in video at frame rates. We show that such detections are repeatable enough to use as landmarks for VTR, without any low-level image features. Since object detections are highly invariant to lighting and surface appearance changes, our VTR can cope with global lighting changes and local movements of the landmark objects. In the teaching phase we build extremely compact scene descriptors: a list of detected object labels and their image-plane locations. In the repeating phase, we use Seq-SLAM-like relocalization to identify the most similar learned scene, then use a motion control algorithm based on the funnel lane theory to navigate the robot along the previously piloted trajectory. We evaluate the method on a commodity UAV, examining the robustness of the algorithm to new viewpoints, lighting conditions, and movements of landmark objects. The results suggest that semantic object features could be useful due to their invariance to superficial appearance changes compared to low-level image features.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Systematic Street View Sampling: High Quality Annotation of Power Infrastructure in Rural Ontario Deep Learning-Driven Depth from Defocus via Active Multispectral Quasi-Random Projections with Complex Subpatterns De-noising of Lidar Point Clouds Corrupted by Snowfall Grading Prenatal Hydronephrosis from Ultrasound Imaging Using Deep Convolutional Neural Networks Automotive Semi-specular Surface Defect Detection System
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1