Decision-Making for Multi-View Single Object Detection with Graph Convolutional Networks

Journal of multimedia information system, Pub Date: 2023-09-30, DOI: 10.33851/jmis.2023.10.3.207

Ren Wang, Tae Sung Kim, Tae-Ho Lee, Jin-Sung Kim, Hyuk-Jae Lee


Citations: 0

Abstract

Aggregating the predicted outputs from multiple views helps boost multi-view single object detection performance. Decision-making strategies offer a flexible way to perform this result-level aggregation. However, existing strategies do not exploit the relationship among the views during aggregation. This study proposes a novel decision-making model with graph convolutional networks (DM-GCN) that addresses this issue by using graph convolutional networks to establish a relationship among the predicted outputs. Through training, DM-GCN learns to make a correct decision by enhancing the contributions of informative views. DM-GCN is lightweight and fast, and it can be applied to any object detector at negligible computational cost. Moreover, a real captured dataset named Yogurt10, together with a new metric, is proposed to investigate the performance of DM-GCN in the multi-view single object detection task. Experimental results show that DM-GCN achieves superior performance compared to classical decision-making strategies. A visual explanation is also provided to interpret how DM-GCN makes a correct decision.
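To make result-level aggregation concrete, the following is a minimal Python sketch, not the authors' DM-GCN. The abstract does not name the classical strategies or describe the graph structure, so everything here is an assumption for illustration: majority voting and score averaging stand in for the classical strategies, the view graph is taken to be fully connected, the weight matrix is an untrained identity, and the function names (`majority_vote`, `average_scores`, `gcn_layer`) are hypothetical.

```python
import math
from collections import Counter


def majority_vote(view_scores):
    """Return the class most views pick as their top class (ties: lowest index)."""
    votes = [max(range(len(s)), key=s.__getitem__) for s in view_scores]
    counts = Counter(votes)
    best = max(counts.values())
    return min(c for c, n in counts.items() if n == best)


def average_scores(view_scores):
    """Return the class with the highest mean score across views."""
    n_views = len(view_scores)
    n_classes = len(view_scores[0])
    mean = [sum(s[c] for s in view_scores) / n_views for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


def gcn_layer(features, adjacency, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    features:  one row per view (node), e.g. that view's class scores.
    adjacency: square 0/1 matrix without self-loops (assumed fully connected below).
    weight:    feature-mixing matrix; learned in a real model, fixed here.
    """
    n = len(features)
    n_in, n_out = len(weight), len(weight[0])
    # Add self-loops so each view keeps part of its own prediction.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalisation.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbouring views, then mix features and apply ReLU.
    agg = [[sum(norm[i][k] * features[k][f] for k in range(n))
            for f in range(n_in)] for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(n_in)))
             for o in range(n_out)] for i in range(n)]


# Hypothetical class scores from three views of the same object.
view_scores = [
    [0.50, 0.40, 0.10],  # view 1 weakly prefers class 0
    [0.45, 0.40, 0.15],  # view 2 weakly prefers class 0
    [0.05, 0.90, 0.05],  # view 3 strongly prefers class 1
]
adjacency = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # assumed: every view linked to every other
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # untrained placeholder

print(majority_vote(view_scores))   # 0
print(average_scores(view_scores))  # 1
print(gcn_layer(view_scores, adjacency, identity))
```

The two fixed rules disagree on this input: voting follows the two weakly confident views, while averaging follows the single confident one. With identity weights the graph step simply averages the views; a trained model would replace those weights with learned ones, which is the kind of view-dependent weighting the abstract attributes to DM-GCN.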

