PSRR-MaxpoolNMS++: Fast Non-Maximum Suppression With Discretization and Pooling
Tianyi Zhang; Chunyun Chen; Yun Liu; Xue Geng; Mohamed M. Sabry Aly; Jie Lin
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 2, pp. 978-993. Published 2024-10-28. DOI: 10.1109/TPAMI.2024.3485898
https://ieeexplore.ieee.org/document/10736991/
Abstract
Non-maximum suppression (NMS) is an essential post-processing step for object detection. The de facto standard for NMS, GreedyNMS, is not parallelizable and can therefore become the performance bottleneck in object detection pipelines. MaxpoolNMS was introduced as a fast and parallelizable alternative to GreedyNMS. However, MaxpoolNMS can only replace GreedyNMS at the first stage of two-stage detectors such as Faster R-CNN. To address this issue, we observe that MaxpoolNMS employs box coordinate discretization followed by local score argmax calculation, discarding the nested-loop pipeline of GreedyNMS and thereby enabling parallelizable implementations. In this paper, we introduce a simple Relationship Recovery module and a Pyramid Shifted MaxpoolNMS module to improve these two stages, respectively. With these two modules, our PSRR-MaxpoolNMS is a generic and parallelizable approach that can completely replace GreedyNMS at all stages in all detectors. Furthermore, we extend PSRR-MaxpoolNMS to the more powerful PSRR-MaxpoolNMS++. For box coordinate discretization, we propose Density-based Discretization, which adheres better to the target density of the suppression. For local score argmax calculation, we propose an Adjacent Scale Pooling scheme that mines duplicated box pairs more accurately and efficiently. Extensive experiments demonstrate that both PSRR-MaxpoolNMS and PSRR-MaxpoolNMS++ outperform MaxpoolNMS by a large margin. Additionally, PSRR-MaxpoolNMS++ not only surpasses PSRR-MaxpoolNMS but also attains competitive accuracy and much better efficiency compared with GreedyNMS. Therefore, PSRR-MaxpoolNMS++ is a parallelizable NMS solution that can effectively replace GreedyNMS at all stages in all detectors.
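To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. `greedy_nms` shows the sequential nested-loop structure of GreedyNMS. `grid_maxpool_nms` is a deliberately simplified, hypothetical illustration of the two-step idea of box coordinate discretization followed by a local score argmax. The function names, the single-scale grid of box centres, and the `cell` and `radius` parameters are assumptions made for illustration; the Relationship Recovery, Pyramid Shifted MaxpoolNMS, Density-based Discretization, and Adjacent Scale Pooling components described above are not reproduced here.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float],
               iou_thresh: float = 0.5) -> List[int]:
    """GreedyNMS: visit boxes by descending score and keep a box only if it
    does not overlap an already-kept box. Each decision depends on earlier
    ones, which is why the loop is hard to parallelize."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:  # outer loop over candidates
        # inner loop over boxes kept so far
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


def grid_maxpool_nms(boxes: List[Box], scores: List[float],
                     cell: float = 32.0, radius: int = 1) -> List[int]:
    """Toy discretize-then-pool suppression (illustrative assumption only).

    Step 1, discretization: map each box centre to an integer grid cell and
    keep the best score per cell.
    Step 2, local argmax: keep a cell's box only if its score is the maximum
    within the (2 * radius + 1)^2 neighbourhood. Every cell is checked
    independently, so this step could run in parallel.
    """
    best: Dict[Tuple[int, int], Tuple[float, int]] = {}
    for i, (b, s) in enumerate(zip(boxes, scores)):
        key = (int((b[0] + b[2]) / 2 // cell), int((b[1] + b[3]) / 2 // cell))
        if key not in best or s > best[key][0]:
            best[key] = (s, i)

    keep: List[int] = []
    for (cx, cy), (s, i) in best.items():
        neighbours = [
            best[(cx + dx, cy + dy)][0]
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)
            if (cx + dx, cy + dy) in best
        ]
        if s >= max(neighbours):  # ties are kept in this sketch
            keep.append(i)
    return sorted(keep, key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (200, 200, 240, 240)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))        # [0, 2]
    print(grid_maxpool_nms(boxes, scores))  # [0, 2]
```

On this small input both functions suppress the near-duplicate second box. In general the grid version only approximates IoU-based suppression, since it ignores box size and aspect ratio; that gap is the kind of limitation the modules named in the abstract are meant to address.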