Ren Wang, Tae Sung Kim, Tae-Ho Lee, Jin-Sung Kim, Hyuk-Jae Lee
Journal of multimedia information system, published 2023-09-30. DOI: https://doi.org/10.33851/jmis.2023.10.3.207
Decision-Making for Multi-View Single Object Detection with Graph Convolutional Networks
Aggregating the outputs predicted from multiple views improves multi-view single object detection performance. Decision-making strategies offer a flexible way to perform this result-level aggregation, but the relationships among the views are not exploited during aggregation. This study proposes a decision-making model with graph convolutional networks (DM-GCN) that addresses this issue by modeling the relationships among predicted outputs with graph convolutional networks. Through training, DM-GCN learns to make a correct decision by increasing the contributions of informative views. DM-GCN is lightweight and fast, and it can be applied to any object detector at negligible computational cost. In addition, a dataset captured in the real world, named Yogurt10, and a new metric are proposed to evaluate DM-GCN on the multi-view single object detection task. Experimental results show that DM-GCN outperforms classical decision-making strategies. A visual explanation is also provided to interpret how DM-GCN reaches a correct decision.
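
To make the idea of result-level aggregation concrete, the following is a minimal sketch, not the authors' DM-GCN. It assumes each view yields a vector of per-class confidence scores, and it contrasts two classical strategies (majority voting and score averaging) with a single graph convolution over the views. The abstract does not specify the graph construction, node features, layer count, or trained weights, so the adjacency matrix `adj`, the weight matrix `weight`, the mean pooling, and all function names here are hypothetical.

```python
from collections import Counter
from math import sqrt


def majority_vote(view_scores):
    """Classical strategy: each view votes for its top class."""
    votes = [max(range(len(s)), key=s.__getitem__) for s in view_scores]
    return Counter(votes).most_common(1)[0][0]


def average_scores(view_scores):
    """Classical strategy: average class scores over views, then argmax."""
    n_views = len(view_scores)
    n_classes = len(view_scores[0])
    mean = [sum(s[c] for s in view_scores) / n_views for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, a common GCN propagation matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(features, adj, weight):
    """One graph convolution: ReLU(A_norm @ features @ weight)."""
    out = matmul(matmul(normalize_adjacency(adj), features), weight)
    return [[max(0.0, v) for v in row] for row in out]


def gcn_decision(view_scores, adj, weight):
    """Assumed readout: propagate scores between views, mean-pool, argmax."""
    hidden = gcn_layer(view_scores, adj, weight)
    n_classes = len(hidden[0])
    pooled = [sum(row[c] for row in hidden) / len(hidden)
              for c in range(n_classes)]
    return max(range(n_classes), key=pooled.__getitem__)


# Hypothetical scores: 3 views (rows) x 2 classes (columns).
views = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]
fully_connected = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # untrained placeholder weights

vote_choice = majority_vote(views)
mean_choice = average_scores(views)
gcn_choice = gcn_decision(views, fully_connected, identity)
```

In this toy input the two classical strategies disagree: two weakly confident views outvote one strongly confident view under majority voting, while averaging favors the confident view. With a fully connected graph and identity weights, the graph convolution reduces to averaging. A trained weight matrix and a learned or structured graph are what would let such a model weight informative views differently, which is the behavior the abstract attributes to DM-GCN.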