Knowledge-Based Reasoning Network For Object Detection
Huigang Zhang, Liuan Wang, Jun Sun
2021 IEEE International Conference on Image Processing (ICIP), 2021-09-19
DOI: https://doi.org/10.1109/ICIP42928.2021.9506228
Mainstream object detection algorithms recognize object instances individually and do not consider the high-level relationships among objects in context. This inevitably leads to biased detection results, because such algorithms lack the commonsense knowledge that humans often use to assist object identification. In this paper, we present a novel reasoning module that endows current detection systems with commonsense knowledge. Specifically, we use a graph attention network (GAT) to represent the knowledge among objects, which covers both visual and semantic relations. Through iterative updates of the GAT, the object features are enriched. Experiments on the COCO detection benchmark indicate that our knowledge-based reasoning network achieves consistent improvements over various CNN detectors. With a ResNet50-FPN backbone, it achieves 1.9 and 1.8 points higher Average Precision (AP) than Faster-RCNN and Mask-RCNN, respectively.
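The abstract does not spell out how the GAT update is formulated, so the following is only a minimal sketch of the general idea, assuming the standard single-head graph attention layer applied to per-object feature vectors. The function names, the toy adjacency matrix, the residual connection, and all numeric values are illustrative assumptions, not the authors' implementation.

```python
import math

def matvec(W, x):
    # Multiply matrix W (list of rows) by vector x.
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z

def elu(z):
    return z if z > 0 else math.exp(z) - 1.0

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gat_layer(features, adjacency, W, a):
    """One single-head graph attention update.

    features:  N object feature vectors (hypothetical region features)
    adjacency: N x N 0/1 matrix, adjacency[i][j] = 1 if j is related to i
    W:         square projection matrix (square so the layer can be iterated)
    a:         attention vector of length 2 * dim
    """
    h = [matvec(W, x) for x in features]
    n = len(h)
    updated, attention = [], []
    for i in range(n):
        # Each node attends to its neighbors and to itself.
        neighbors = [j for j in range(n) if adjacency[i][j] or j == i]
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, h[i] + h[j])))
            for j in neighbors
        ]
        alphas = softmax(scores)
        row = [0.0] * n
        for j, alpha in zip(neighbors, alphas):
            row[j] = alpha
        attention.append(row)
        aggregated = [
            sum(alpha * h[j][k] for j, alpha in zip(neighbors, alphas))
            for k in range(len(h[i]))
        ]
        # Assumption: a residual connection keeps the original feature
        # and adds the aggregated context to it.
        updated.append([x + elu(m) for x, m in zip(features[i], aggregated)])
    return updated, attention

def reason(features, adjacency, W, a, steps=2):
    # Iterate the update so context can propagate beyond direct neighbors.
    attention = []
    for _ in range(steps):
        features, attention = gat_layer(features, adjacency, W, a)
    return features, attention

# Toy example: three detected objects with 2-d features (made-up values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
W = [[0.5, 0.1], [-0.2, 0.4]]
a = [0.3, -0.1, 0.2, 0.4]

enriched, attention = reason(features, adjacency, W, a, steps=2)
```

In a detector, the enriched vectors would presumably replace or augment the region features passed to the classification and box-regression heads. How the visual and semantic relations define the graph edges is not stated in the abstract, so the adjacency matrix here is a placeholder.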