Unsupervised Active Visual Search With Monte Carlo Planning Under Uncertain Detections

IEEE Transactions on Pattern Analysis and Machine Intelligence. Pub Date: 2024-08-29. DOI: 10.1109/TPAMI.2024.3451994

Francesco Taioli; Francesco Giuliari; Yiming Wang; Riccardo Berra; Alberto Castellini; Alessio Del Bue; Alessandro Farinelli; Marco Cristani; Francesco Setti

Volume 46, Issue 12, pp. 11047-11058. Full text: https://ieeexplore.ieee.org/document/10659171/


Abstract

We propose a solution for Active Visual Search of objects in an environment, whose 2D floor map is the only known information. Our solution has three key features that make it more plausible and robust to detector failures compared to state-of-the-art methods: i) it is unsupervised as it does not need any training sessions. ii) During the exploration, a probability distribution on the 2D floor map is updated according to an intuitive mechanism, while an improved belief update increases the effectiveness of the agent's exploration. iii) We incorporate the awareness that an object detector may fail into the aforementioned probability modelling by exploiting the success statistics of a specific detector. Our solution is dubbed POMP-BE-PD (Pomcp-based Online Motion Planning with Belief by Exploration and Probabilistic Detection). It uses the current pose of an agent and an RGB-D observation to learn an optimal search policy, exploiting a POMDP solved by a Monte-Carlo planning approach. On the Active Vision Dataset Benchmark, we increase the average success rate over all the environments by a significant 35% while decreasing the average path length by 4% with respect to competing methods. Thus, our results are state-of-the-art, even without any training procedure.
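
To make the idea of a belief update under an unreliable detector more concrete, here is a minimal sketch of a generic Bayes update over floor-map cells. It is not the POMP-BE-PD update from the paper. The function name `update_belief`, the grid layout, and the rates `tpr` and `fpr` are all assumptions chosen for illustration; in practice the rates would come from the measured success statistics of the specific detector.

```python
import numpy as np


def update_belief(belief, visible, detected, tpr=0.8, fpr=0.05):
    """Bayes update of a target-location belief over floor-map cells.

    belief   : probabilities that the target is in each cell (sums to 1)
    visible  : boolean array, True for cells inside the current field of view
    detected : True if the detector fired on the current observation
    tpr, fpr : hypothetical true-positive and false-positive rates
    """
    belief = np.asarray(belief, dtype=float)
    visible = np.asarray(visible, dtype=bool)

    # Likelihood of the observation, assuming the target sits in each cell:
    # a visible target triggers the detector with probability tpr, while a
    # target outside the view can only yield a false alarm (probability fpr).
    if detected:
        likelihood = np.where(visible, tpr, fpr)
    else:
        likelihood = np.where(visible, 1.0 - tpr, 1.0 - fpr)

    posterior = belief * likelihood
    return posterior / posterior.sum()


# Four cells with a uniform prior; the agent sees the first two and detects nothing.
prior = np.full(4, 0.25)
in_view = np.array([True, True, False, False])
print(update_belief(prior, in_view, detected=False))
```

Because the detector is allowed to miss, a negative observation lowers the probability of the visible cells without setting it to zero, so the agent may still revisit them later. A planner such as POMCP would sample from a belief of this kind when simulating future observations.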


