RON: Reverse Connection with Objectness Prior Networks for Object Detection

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2017-07-06 DOI:10.1109/CVPR.2017.557

Tao Kong, F. Sun, Anbang Yao, Huaping Liu, Ming Lu, Yurong Chen

Pages: 5244-5252

Citations: 384

Abstract

We present RON, an efficient and effective framework for generic object detection. Our motivation is to combine the strengths of the region-based (e.g., Faster R-CNN) and region-free (e.g., SSD) methodologies. Under a fully convolutional architecture, RON focuses on two fundamental problems: (a) multi-scale object localization and (b) negative sample mining. To address (a), we design the reverse connection, which enables the network to detect objects at multiple levels of the CNN. To deal with (b), we propose the objectness prior, which significantly reduces the search space of objects. We optimize the reverse connection, objectness prior and object detector jointly with a multi-task loss function, so RON can directly predict final detection results from all locations of various feature maps. Extensive experiments on the challenging PASCAL VOC 2007, PASCAL VOC 2012 and MS COCO benchmarks demonstrate the competitive performance of RON. Specifically, with VGG-16 and a low-resolution 384×384 input, the network achieves 81.3% mAP on PASCAL VOC 2007 and 80.7% mAP on PASCAL VOC 2012. Its advantage grows as datasets become larger and more difficult, as demonstrated by the results on MS COCO. With 1.5G of GPU memory at test time, the network runs at 15 FPS, 3 times faster than its Faster R-CNN counterpart. Code will be made publicly available.
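
The idea behind the objectness prior, as the abstract describes it, is to shrink the set of candidate locations before classification. The following is only a minimal sketch of that gating step, not the authors' implementation: the function names, the 0.5 threshold, and the flat list representation of scores are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def objectness_gate(objectness: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return indices of candidate boxes whose objectness score meets the threshold.

    The threshold value is a hypothetical choice for this sketch.
    """
    return [i for i, score in enumerate(objectness) if score >= threshold]


def detect(
    objectness: Sequence[float],
    class_scores: Sequence[Sequence[float]],
    threshold: float = 0.5,
) -> List[Tuple[int, int, float]]:
    """Classify only the candidates that pass the objectness gate.

    class_scores[i] holds the per-class scores for candidate i.
    Returns (candidate_index, best_class_index, best_class_score) tuples.
    """
    results = []
    for i in objectness_gate(objectness, threshold):
        scores = class_scores[i]
        # Pick the class with the highest score for this surviving candidate.
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((i, best, scores[best]))
    return results


if __name__ == "__main__":
    # Four hypothetical candidates, two classes each.
    objectness = [0.9, 0.1, 0.6, 0.05]
    class_scores = [[0.2, 0.8], [0.5, 0.5], [0.7, 0.3], [0.1, 0.9]]
    print(detect(objectness, class_scores))
```

In this toy input, the two candidates with low objectness are never classified, which is the sense in which a prior of this kind reduces the search space.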


